FDS Record

This document is a certificate certifying that Sreya Reddy Addula completed practical work for the Fundamentals of Data Science lab during the 2020-2021 academic year at Chaitanya Bharathi Institute of Technology. The certificate is signed by the internal and external examiners as well as the Head of the Department of Computer Science and is dated February 2nd, 2022. The attached index lists topics covered in the lab including installing Python, NumPy commands, data visualization techniques, and data analysis methods with and without Scipy.

Uploaded by

tbhumuytj

DEPARTMENT OF COMPUTER SCIENCE

B.E – III SEMESTER

Fundamentals of Data Science Lab
Course Code: 20CAC02

Academic Year
2021-22
CHAITANYA BHARATHI INSTITUTE OF
TECHNOLOGY
Gandipet, Hyderabad-500075

Certificate
Certified that this is the bonafide record of the practical work done during the academic year
2020-2021 by Sreya Reddy Addula
Roll Number _ 160120748017 Section CSE-4
in the Laboratory of Fundamentals of Data Science of the Department of Computer
Science.

Internal examiner External examiner

Head of the Department

Date : 02-02-2022
INDEX

S.No TOPICS

1. Installation process for Python in Windows
2. NumPy
• NumPy Commands
• Array Slicing and Dimensions in NumPy
• User Defined Datatypes using NumPy
3. Pandas
4. Data Visualization
• Bar Graphs
• Pie Charts
• Box Plots
• Frequency Polygons
• Histograms
• Scatter Plots
5. Data Analysis and Distribution, with and without SciPy
• 1-sample t-test
• Unpaired unequal variance t-test theory
• Unpaired equal variance t-test
• Paired t-test
• ANOVA test
CSE-4 FDS RECORD 160120748017

Fundamentals of Data Science Lab

INSTALLATION PROCEDURE FOR PYTHON IN WINDOWS

STEP 1: SELECT VERSION OF PYTHON TO INSTALL:

The installation procedure involves downloading the official Python .exe
installer and running it on the system.
STEP 2: DOWNLOAD PYTHON EXECUTABLE INSTALLER:
Open the browser and navigate to the official Python website. Search for the
desired version of Python.
E.g.: 3.9.7
STEP 3: RUN EXECUTABLE INSTALLER:
Run the Python installer once it has downloaded. Make sure the "Install
launcher for all users" and "Add Python 3.9 to PATH" checkboxes are selected,
then select "Install Now".
STEP 4: VERIFY PYTHON WAS INSTALLED ON WINDOWS
Navigate to the directory in which Python was installed on the system, e.g.
C:/Programfile/Python/Python3.9.7
After finding that folder, double-click python.exe. The interpreter opens in a
Python terminal window.
STEP 5: VERIFY WHETHER pip WAS INSTALLED:
CASE 1: If pip was not installed:
How to install pip:
pip is a package management system used to install and manage software
packages written in Python.
The name pip is often expanded as "Preferred Installer Program".
Step 1: Download get-pip.py:
Download the file from the official website, or run the following command
from the Command Prompt to fetch get-pip.py:
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
Step 2: Install pip on Windows:
python get-pip.py
Step 3: Once pip is installed, test it by typing the following command in
the Command Prompt: "pip help"

CASE 2: If pip is already installed:

Step 1: Open the Start menu, type cmd, and select the Command Prompt
application. Enter the command "pip -V". If pip was installed successfully,
you should see the pip version along with the Python version it belongs to.
STEP 6: ADD PYTHON PATH TO ENVIRONMENT VARIABLES:
Open the Start menu, right-click My Computer, and choose Properties. Navigate
to Advanced system settings and choose Environment Variables. Under System
variables, select Path and add the Python installation directory to it.
For data science we need additional packages and libraries, namely SciPy,
pandas, and NumPy.
NUMPY:
NumPy stands for Numerical Python. It is an open-source library for the Python
programming language, used for scientific computing and working with
arrays. Apart from its multidimensional array object, it also provides
high-level functions for working with arrays.
How to install NumPy:
Note: The prerequisite is that Python is installed on your system.
To install NumPy, type the following in the Command Prompt:
pip install numpy
PANDAS:
pandas is an open-source Python package that is widely used for data science,
data analysis, and machine learning tasks. It is built on top of another package
named NumPy. pandas works well with many other data science modules in the
Python ecosystem. It simplifies many of the time-consuming, repetitive tasks
associated with working with data, including data cleaning, normalization,
visualization, and statistics.
How to install pandas:
pip install pandas
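Once the installation steps are done, the interpreter and packages can also be checked from Python itself. The following is only an illustrative sketch and not part of the original record; the helper name installed_version is hypothetical, and it relies on importlib.metadata, which assumes Python 3.8 or later.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the interpreter version and the packages used in this record.
print("Python", sys.version.split()[0])
for name in ("pip", "numpy", "pandas", "scipy"):
    version = installed_version(name)
    print(name, version if version is not None else "not installed")
```

Any package reported as "not installed" can then be added with pip install followed by the package name.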

WEEK-1:

PROGRAM 1:
AIM: To access various type of commands from the numpy array
PROCEDURE: In this code we have used type, shape commands. A numpy array
is a grid of values, all of the same type, and is indexed by a tuple of
nonnegative integers. The number of dimensions is the rank of the array;
the shape of an array is a tuple of integers giving the size of the array along
each dimension.
CODE:
import numpy as np
a = np.array([5, 12, 23, 40])
print(type(a))
print(a.shape)
print(a[3], a[1], a[0])
a[0] = 6
print(a)
b = np.array([[1,2,43],[14,5,6]])
print(b.shape)
print(b[0, 0], b[0, 1], b[1, 0])
OUTPUT:

PROGRAM 2:
AIM: To create arrays using various functions
PROCEDURE: Here we have used different functions to create an array:
zeros((m,n)) creates an array of all zeros with m rows and n columns
ones((m,n)) creates an array of all ones with m rows and n columns
full((m,n), value) creates a constant array with m rows and n columns, every
element equal to value
eye(n) creates an n x n identity matrix
random.random((m,n)) creates an array of random values in [0, 1) with m rows
and n columns
CODE:
import numpy as np

a = np.zeros((2,5))
print(a)
b = np.ones((2,3))
print(b)
c = np.full((2,2), 12)
print(c)
d = np.eye(2)
print(d)
e = np.random.random((3,2))
print(e)
OUTPUT:

PROGRAM 3:
AIM: To implement slicing.
PROCEDURE: Similar to Python lists, numpy arrays can be sliced. Since arrays
may be multidimensional, you must specify a slice for each dimension of the
array
CODE:
import numpy as np
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
b = a[:3, 1:2]
print(a[0, 1])
b[0, 0] = 23
print(a[0, 1])
OUTPUT:
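Note: the program above changes a through b because basic slicing returns a view of the same data. A minimal sketch of the difference between a view and an independent copy follows; the variable names view and copy are chosen here for illustration only.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

view = a[:3, 1:2]          # basic slicing returns a view onto a's data
copy = a[:3, 1:2].copy()   # .copy() allocates independent storage

view[0, 0] = 23            # also changes a[0, 1]
copy[1, 0] = -1            # leaves a untouched

print(np.shares_memory(a, view))  # True
print(np.shares_memory(a, copy))  # False
print(a[0, 1], a[1, 1])           # 23 6
```

Calling .copy() is the usual choice when the slice will be modified and the original array must stay intact.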

PROGRAM 4:
AIM: To create an array with different dimensions
PROCEDURE: In this program we mix integer indexing with slicing to obtain
arrays of different dimensions from the same data. Indexing with an integer
drops that dimension, while slicing keeps it. The shape attribute returns a
tuple with the size of each dimension of a NumPy array.
CODE:
import numpy as np
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
row_r1 = a[1, :]
row_r2 = a[1:2, :]
print(row_r1, row_r1.shape)
print(row_r2, row_r2.shape)
col_r1 = a[:, 1]
col_r2 = a[:, 1:2]
print(col_r1, col_r1.shape)
print(col_r2, col_r2.shape)
OUTPUT:

PROGRAM 5:
AIM: To implement integer array indexing
PROCEDURE: When you index into numpy arrays using slicing, the resulting
array view will always be a subarray of the original array. In contrast, integer
array indexing allows you to construct arbitrary arrays using the data from
another array.
CODE:
import numpy as np
a = np.array([[1,2], [3, 4], [5, 6]])
print(a[[0, 1, 2], [0, 1, 0]])
print(np.array([a[0, 0], a[1, 1], a[2, 0]]))
print(a[[0, 0], [1, 1]])
print(np.array([a[0, 1], a[0, 1]]))
OUTPUT:

PROGRAM 6:
AIM: To implement Boolean array indexing
PROCEDURE: Boolean array indexing lets you pick out arbitrary elements of an
array. Frequently this type of indexing is used to select the elements of an
array that satisfy some condition.
CODE:
import numpy as np
a = np.array([[1,2], [3, 4], [5, 6]])
bool_idx = (a > 2)
print(bool_idx)
print(a[bool_idx])
print(a[a > 2])
OUTPUT:
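Note: boolean masks can also be combined and used for assignment. The sketch below is an illustrative extension of the program above, not part of the original record; the names mask and b are assumptions made for the example.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

# Combine conditions with & (and) or | (or); each condition needs its own parentheses.
mask = (a > 2) & (a % 2 == 0)
print(a[mask])        # [4 6]

# A mask can also select the elements to overwrite.
b = a.copy()
b[b > 2] = 0
print(b)
```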

PROGRAM 7:
AIM: To implement data types
PROCEDURE: Numpy provides a large set of numeric datatypes that you can
use to construct arrays. Numpy tries to guess a datatype when you create an
array, but functions that construct arrays usually also include an optional
argument to explicitly specify the datatype
CODE:
import numpy as np
x = np.array([1, 2])
print(x.dtype)
x = np.array([1.0, 2.0])
print(x.dtype)
x = np.array([1, 2], dtype=np.int64)
print(x.dtype)
OUTPUT:

PROGRAM 8:
AIM: To implement math in arrays
PROCEDURE: Basic mathematical functions operate elementwise on arrays,
and are available both as operator overloads and as functions in the numpy
module
CODE:
import numpy as np
x = np.array([[12,23],[39,40]], dtype=np.float64)
y = np.array([[6,21],[5,17]], dtype=np.float64)
print(x + y)
print(np.add(x, y))
print(x - y)
print(np.subtract(x, y))
print(x * y)
print(np.multiply(x, y))
print(x / y)
print(np.divide(x, y))
print(np.sqrt(x))

OUTPUT:

PROGRAM 9:
AIM: To implement inner product and vector product
PROCEDURE: We use the dot function to compute inner products of vectors, to
multiply a vector by a matrix, and to multiply matrices. dot is available both as
a function in the numpy module and as an instance method of array objects
CODE:

import numpy as np
x = np.array([[1,2],[3,4]])
y = np.array([[5,6],[7,8]])
v = np.array([9,10])
w = np.array([11, 12])
print(v.dot(w))
print(np.dot(v, w))
print(x.dot(v))
print(np.dot(x, v))

OUTPUT:
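Note: the procedure also mentions multiplying matrices, which the program above does not show. A short sketch of the matrix-matrix case, using the same x and y, is given here for illustration; the @ operator is used as an equivalent of dot for 2-D arrays.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

product = x @ y        # matrix product, same result as np.dot(x, y) for 2-D arrays
elementwise = x * y    # elementwise product, shown for contrast

print(product)         # [[19 22]
                       #  [43 50]]
print(elementwise)     # [[ 5 12]
                       #  [21 32]]
```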

PROGRAM 10
AIM: To implement computation functions.
PROCEDURE: Here is the sum command which is used to find the sum of
elements in the array.
CODE:
import numpy as np
x = np.array([[1,2],[3,4]])
print(np.sum(x))
print(np.sum(x, axis=0))
print(np.sum(x, axis=1))
OUTPUT:

PROGRAM 11
AIM: To display transpose of a matrix
PROCEDURE: The transpose of a matrix is obtained with arrayname.T. Taking the transpose of a one-dimensional array leaves it unchanged.
CODE:
import numpy as np
x = np.array([[1,2], [3,4]])
print(x)
print(x.T)
v = np.array([1,2,3])
print(v)
print(v.T)

OUTPUT:

WEEK-1 Dt: 21.10.21

Aim: To demonstrate how the built-in numpy function arange works
Procedure: arange is a built-in numpy function which returns an array of
evenly spaced elements over a given interval. arange(start,stop) returns the
elements from start up to, but not including, stop. arange(start,stop,step)
returns the elements spaced by the given step.
Code:
import numpy as np
b=np.arange(1,10)
print(list(b))

Output:

Code:
import numpy as np
b=np.arange(1,9,2)
print(list(b))

Output:

Code:
#arange([start,] stop[, step], [, dtype=None])
x = np.arange(19.8)
print(x)
x = np.arange(0.8, 19.8,1.0 )
print(x)

Output:

Code:

# 8 values between 1 and 100:

print(np.linspace(1, 100, 8))

Output:
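Note: arange and linspace differ in what is fixed. With arange the step is given and the stop value is excluded; with linspace the number of points is given and the stop value is included by default. A minimal sketch, with variable names chosen here for illustration:

```python
import numpy as np

by_step = np.arange(0.0, 1.0, 0.25)    # step fixed, stop excluded
by_count = np.linspace(0.0, 1.0, 5)    # number of points fixed, stop included

print(by_step)     # [0.   0.25 0.5  0.75]
print(by_count)    # [0.   0.25 0.5  0.75 1.  ]
```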

Aim: To demonstrate numpy arrays of various dimensions.

Procedure:
A numpy array is a grid of values, all of the same type, and is indexed by a tuple
of nonnegative integers. The number of dimensions is the rank of the array;
the shape of an array is a tuple of integers giving the size of the array along
each dimension.
Code:
#zero Dimensional Arrays
import numpy as np
l = np.array(89)
print("l: ", l)
print("The type of l: ", type(l))
print("The dimension of l:", np.ndim(l))

Output:

Code:

#one dimensional Arrays

A = np.array([1,3,4,6,10,13,15,19])
B = np.array([2.2,5.9,4.5,1.9,12.8,19.5])
print("A: ", A)
print("B: ", B)
print("Type of A: ", A.dtype)
print("Type of B: ", B.dtype)
print("Dimension of A: ", np.ndim(A))
print("Dimension of B: ", np.ndim(B))

Output:

Code:

M = np.array([ [[-12, 100, -903,901], [-156,-34,123,392]],

[[39,278,890,456], [-12,-279,125,580]],
[[190,-19,-78,90], [-292,70,109,-18]]])

print(M.shape)
print(M)

Output:

Aim: To perform array indexing and slicing operations

Procedure:

Indexing: Array indexing refers to the accessing of elements in the given array.
Slicing: Similar to Python lists, numpy arrays can be sliced. Since arrays may
be multidimensional, you must specify a slice for each dimension of the array.

Code:
Q = np.array([1,5,14,6,87,24,84])
# print the first element of Q
print(Q[0])
# print the last but one element of Q
print(Q[-2])

Output:

Code:
#slicing ( Single Dimensional Array)
S = np.array([ 1, 2, 3, 4, 5, 6, 7, 8, 9])
print(S[2:4])
print(S[:2])
print(S[3:])
print(S[:]) #prints entire array

Output:

Code:
L = np.array([ [[-12, 100, -903,901], [-156,-34,123,392]],
[[39,278,890,456], [-12,-279,125,580]],
[[190,-19,-78,90], [-292,70,109,-18]]])
print(L[1:3, 0:1, 1:4]) # blocks 1 and 2, row 0, columns 1 to 3

Output:

Dt:21.10.2021

PROGRAM 1:
AIM : To write a program using numpy in python to create an array using dtype .
PROCEDURE: In this program, dtype is used to set the type (and hence the byte
size) of the elements in the array. i4 is declared as np.dtype(np.int32), and
arr is then created with np.array(list_a, dtype=i4), so all the elements of arr
are of the int32 data type. The fractional parts of the floats are truncated.
CODE:
import numpy as np
i4 = np.dtype(np.int32)
print(i4)
list_a = [ [1.2,2.3,4.5,9.0],[2.4,7.8,4.7,5],[7.9,-5.3,7, 5.9],[4.6,7,9,-6.8]]
arr= np.array(list_a, dtype=i4)
print(arr)
OUTPUT:

PROGRAM 2:
AIM : To write a program to create an array using dtype and to show repr()
function.
PROCEDURE: In this program, dtype is used to set the layout for the array .dtype
can set different datatypes(different byte size ) to different columns in the multi
dimensional array.
CODE:
import numpy as np
dt = np.dtype([('area', np.int32)])
arr = np.array([(2357,), (1456,), (6789,)], dtype=dt)  # one-element tuples, one per record
print(arr)
print("Internal representation:")
print(repr(arr))
OUTPUT:

PROGRAM 3:
AIM : To write a program to create an array which shows different datatypes in
different columns of the array.
PROCEDURE: In this program, dtype is used to create the layout for the array.
dtype can assign different datatypes (of different byte sizes) to different
columns of the array. Some slicing and indexing operations are then done on
the array arr1.
CODE:
d=np.dtype([('product','S20'),('productId','i4'),('Price',np.float64)])
arr1= np.array([('Pen',245,20.4),
('Pencil',304,35.8),
('Book',498,57),
('Mask',268,10),
('Sanitiser',468,59.9)],dtype=d)
print(arr1)
print(repr(arr1))
print(arr1[1])
print(arr1[1][2])
print(arr1[1:])
OUTPUT:

PROGRAM 4:
AIM : To write a program to save the array to a file using savetxt and print data
from the file.
PROCEDURE: savetxt is used to save an array to a text file in the required
format. genfromtxt is one of the functions provided by the numpy library that
reads tabular data from a file and returns it as an array.
CODE:
np.savetxt("products.csv",
arr1,
fmt="%s;%d;%d",
delimiter=";")
d=np.dtype([('product','S20'),('productId','i4'),('Price','i4')])
a7 = np.genfromtxt("products.csv",
dtype=d,
delimiter=";")
print(a7)
OUTPUT:
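Note: the save-and-reload round trip can be checked on a small example. The sketch below is illustrative and not part of the original record: it assumes a Unicode string field ('U20') instead of 'S20' so that names are written without a b'' prefix, writes to a temporary directory, and passes encoding="utf-8" to genfromtxt. The names dt, items, path and loaded are chosen for this example.

```python
import os
import tempfile

import numpy as np

# Structured dtype: a text field, an integer id and a float price.
dt = np.dtype([("product", "U20"), ("productId", "i4"), ("price", "f8")])
items = np.array([("Pen", 245, 20.4), ("Pencil", 304, 35.8)], dtype=dt)

path = os.path.join(tempfile.mkdtemp(), "products.csv")

# One format specifier per field; the separators are part of the format string.
np.savetxt(path, items, fmt="%s;%d;%.2f")

# Read the file back into an array with the same layout.
loaded = np.genfromtxt(path, dtype=dt, delimiter=";", encoding="utf-8")
print(loaded)
```

Keeping "%.2f" for the price field preserves the fractional part, which a "%d" specifier would drop.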

Dt: 28.10.21
PROGRAM 1
AIM: To demonstrate pandas series
PROCEDURE: A pandas Series is a one-dimensional labeled array capable of
holding data of any type (integer, string, float, Python objects, etc.). The axis
labels are collectively called the index. A Series is comparable to a column in
an Excel sheet. Labels need not be unique but must be of a hashable type.
CODE:
import pandas as pd
A=pd.Series([12,40,23,17])
A
OUTPUT:

PROGRAM 2:
AIM: To access single values from pandas series
PROCEDURE: A Series can be given an explicit index when it is created. A
single value is then accessed by its label, in the same way as a key in a
dictionary.
PROGRAM CODE:
colors=['blue','red','black','white']
codes=[12,40,23,17]
I=pd.Series(codes,index=colors)
print(I)
print(I['red'])  # access a single value by its label
OUTPUT:

PROGRAM 3
AIM: To demonstrate addition on pandas series
PROCEDURE: When two Series are added, the values are aligned by index label
rather than by position. Labels present in only one of the Series produce NaN
in the result. The built-in sum() adds up the values of a single Series.
CODE:
colors=['blue','red','black','white']
colors1=['blue','orange','black','green']
T=pd.Series([12,23,40,17],index=colors)
Y=pd.Series([5,12,16,39],index=colors1)
print(T+Y)
print(sum(T))
OUTPUT:

PROGRAM 4
AIM: To demonstrate how to handle missing values in pandas series
PROCEDURE: Here the two Series have no index labels in common, so every
entry of T+Y is missing and is shown as NaN. pandas uses NaN to mark such
missing values.
CODE:
colors=['blue','red','black','white']
colors1=['pink','orange','yellow','green']
T=pd.Series([12,23,40,17],index=colors)
Y=pd.Series([5,12,16,39],index=colors1)
print(T+Y)
print(sum(T))

OUTPUT:
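Note: when the NaN results of label alignment are not wanted, Series.add accepts a fill_value that stands in for the missing partner. This is an illustrative sketch, not part of the original record; the names t, y, plain and filled are chosen for the example.

```python
import pandas as pd

t = pd.Series([12, 23, 40, 17], index=["blue", "red", "black", "white"])
y = pd.Series([5, 12, 16, 39], index=["blue", "orange", "black", "green"])

plain = t + y                      # labels found in only one Series become NaN
filled = t.add(y, fill_value=0)    # a missing partner is treated as 0

print(plain)
print(filled)
```

Only "blue" and "black" occur in both Series, so plain has four NaN entries while filled has none.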

PROGRAM 5
AIM: To demonstrate pandas isnull() and notnull() function
PROCEDURE: isnull() returns a boolean object of the same size indicating
which values are NA. NA values, such as None or numpy.nan, get mapped to True.
Everything else gets mapped to False. Values such as empty strings
'' or numpy.inf are not considered NA. notnull() returns the inverse.
CODE:
# values keyed by label; labels in my_cities that are missing here become NaN
cities={"Australia":123456,"China":9324,"Russia":683506,
        "USA":56897,"Cambodia":896764}
my_cities=["USA","Poland","Berlin","China"]
my_city_series=pd.Series(cities,index=my_cities)
print(my_city_series.isnull())
print(my_city_series.notnull())
OUTPUT:

PROGRAM 7
AIM: To demonstrate pandas dropna () function
PROCEDURE: The dropna() function returns a new Series with missing values
removed. A Series has only one axis to drop values from. If inplace=True, the
operation is done in place and None is returned. fillna(value) returns a new
Series with the missing values replaced by value.
CODE:
cities={"Australia":123456,
"China":9324,
"Russia":683506,
"USA":56897,
"Cambodia":896764}
city_series=pd.Series(cities)
print(city_series)
print(my_city_series.dropna())
print(my_city_series.fillna(0))
OUTPUT:
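Note: the behaviour of isnull(), dropna() and fillna() can be seen on a small Series with known gaps. The data below is made up for illustration and is not part of the original record.

```python
import numpy as np
import pandas as pd

# Both np.nan and None are treated as missing in a float Series.
s = pd.Series([1.5, np.nan, 3.0, None], index=["a", "b", "c", "d"])

print(s.isnull().sum())   # number of missing entries

cleaned = s.dropna()      # new Series without the missing entries
zeroed = s.fillna(0)      # new Series with the missing entries replaced by 0

print(cleaned)
print(zeroed)
print(len(s))             # the original Series is unchanged
```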

Dt: 11.11.21

Program 1:
AIM : To plot a 2-d graph using matplotlib.
PROCEDURE: A line plot displays data as points joined by straight line
segments. matplotlib.pyplot is a collection of functions that make matplotlib
work like MATLAB and help to visualise the data. In this program, the plot()
function is used to plot the 2-d graph, and xlabel() and ylabel() are
used to label the axes.
CODE:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
x=[100,200,300]
y=[400,500,600]
plt.plot(x,y)

OUTPUT:

CODE:
plt.title("Graph")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.plot(x,y)

OUTPUT :

PROGRAM 2:
AIM : To plot a 2-d graph and demonstrate the use of title(), fontdict, xticks() and yticks().
PROCEDURE : In this program, title() is used to give the graph a title. fontdict is
used to style the title, for example its fontname and fontsize. xticks() and
yticks() are used to set the tick locations on the axes.
CODE:
x=[100,200,300]
y=[400,500,600]
plt.plot(x,y)
plt.title("Graph",fontdict={'fontname':'FreeSerif','fontsize':20})
plt.xlabel("X")
plt.ylabel("Y")
plt.xticks([60,100,140,180,220,260,300])
plt.yticks([400,500,600,700,800])
plt.show()

OUTPUT:

PROGRAM 3:

AIM : To plot a 2-d graph and demonstrate the options of the plot() function.

PROCEDURE : In this program, the keyword arguments of plot() set the color,
linewidth, linestyle, marker, markersize and markeredgecolor of the line.
CODE :
x=[100,200,300]
y=[400,500,600]
plt.plot(x,y,label='x+300',color="blue",linewidth=3,linestyle='--'
,marker="*",markersize=12,markeredgecolor="red")
plt.title("Graph",fontdict={'fontname':'FreeSerif','fontsize':20})
plt.xlabel("X")
plt.ylabel("Y")
plt.xticks([100,140,180,220,260,300])
plt.yticks([300,350,400,450,500,550,600])
plt.legend()
plt.show()
OUTPUT:

PROGRAM 4:
AIM : To draw multiple plots using plot() function and save the figure.
PROCEDURE : In this program, plots of x+1, x^2 and x^3 are plotted using plot(). To save
the figure, the savefig() function is used. The figure is saved as linegraph.png with a dpi
of 300 by using savefig('linegraph.png',dpi=300).
CODE:
x=[1,1.2,1.4,1.6]
y=[2,2.2,2.4,2.6]
plt.plot(x,y,'b*--',label='x+1')
plt.title("Graph",fontdict={'fontname':'FreeSerif','fontsize':20})
x2=np.arange(0,2.5,0.5)
plt.plot(x2,x2**2,'g^--',label='x^2')
plt.plot(x2,x2**3,'r',label='x^3')
plt.xlabel("X")
plt.ylabel("Y")
plt.legend()  # add the legend before saving so it appears in the file
plt.savefig('linegraph.png',dpi=300)
plt.show()

OUTPUT:

PROGRAM 5:

AIM : To plot a bar graph.

PROCEDURE : A bar plot presents categorical data with rectangular bars
with lengths proportional to the values that they represent. A bar plot shows
comparisons among discrete categories. One axis of the plot shows the specific
categories being compared, and the other axis represents a measured value.
set_hatch() is used to give each bar a different hatch pattern.
CODE:
labels=['a','b','c']
values=[10,20,30]
b=plt.bar(labels,values)
OUTPUT:

CODE:
labels=['a','b','c']
values=[10,20,30]
b=plt.bar(labels,values)
b[0].set_hatch('/')
b[1].set_hatch('*')
b[2].set_hatch('.')

OUTPUT:

CODE:
labels=['a','b','c']
values=[10,20,30]
b=plt.bar(labels,values)
patterns=['.','/',"*"]
for i in b:
    i.set_hatch(patterns.pop(0))  # take the next pattern for each bar

OUTPUT:

Dt:18.11.2021
Program 1:
Aim: To demonstrate plots for gas prices datasets
Procedure:
A line plot displays data as points joined by straight line segments, which
makes trends over time easy to compare.
A legend is an area describing the elements of the graph. In the matplotlib
library, the legend() function is used to place a legend on
the axes.
The optional format string of plot() combines color, marker and line style, e.g. 'b*--'.
Program Code:

import matplotlib.pyplot as plt

import numpy as np
import pandas as pd
plt.title('Gas Prices (in USD)',fontdict={'fontweight':'bold','fontsize':10})
gas=pd.read_csv('gasprices.csv')
plt.plot(gas.Year,gas.USA,label='United States')
plt.plot(gas.Year,gas.Canada,label='Canada')
plt.plot(gas.Year,gas['South Korea'],label='S K')
plt.legend()
plt.show()

Output:

Code:
for country in gas:
    print(country)  # column names of the DataFrame
for country in gas:
    if country!='Year':
        plt.plot(gas.Year,gas[country],marker='.',label=country)
print(gas.Year[::3])
plt.xticks(gas.Year[::3])
plt.xlabel('Year')
plt.ylabel('US Dollars')
plt.legend()
plt.show()

Output:

Program 2:
Aim: To read data from fifa dataset
Procedure:
The pandas read_csv() function reads a CSV file into a DataFrame. head(n)
returns the first n rows, which gives a quick look at the columns and
values of the dataset.

Program Code:
fifa=pd.read_csv('fifa_data.csv')
fifa.head(5)

Output:

Program 3:
Aim: To represent data from fifa dataset using histograms
Procedure:
A histogram is a bar-style representation of the distribution of numerical
data. The range of values is divided into intervals (bins) along the x-axis,
and the height of each bar on the y-axis gives the number of data values
that fall in that bin.
Program Code:
plt.hist(fifa.Overall)
plt.show()

Output:

Program Code:

bins=[40,50,60,70,80,90,100]
plt.figure(figsize=(6,5))
plt.hist(fifa.Overall,bins=bins,color='blue')
plt.xticks(bins)
plt.ylabel('Number of Players')
plt.xlabel('Skill Level')
plt.title('Distribution of Player Skills in FIFA 2018')
plt.savefig('histogram.png',dpi=300)
plt.show()
Output:

Program 4:
Aim: To represent data from fifa dataset using pie charts for preferred foot
Procedure:
A pie chart (or a circle chart) is a circular statistical graphic, which is divided
into slices to illustrate numerical proportion. In a pie chart, the arc length of
each slice (and consequently its central angle and area), is proportional to the
quantity it represents.

Program Code:

l=fifa.loc[fifa['Preferred Foot']=='Left'].count().iloc[0]
r=fifa.loc[fifa['Preferred Foot']=='Right'].count().iloc[0]
labels=['Left','Right']
colors=['y','g']
plt.pie([l,r],labels=labels,colors=colors,autopct='%.1f%%')
plt.title('Foot Preference of FIFA Players')
plt.show()

Output:

Program 5:

Aim: To represent data from fifa dataset using pie charts for weights
Procedure:
A pie chart (or a circle chart) is a circular statistical graphic, which is divided
into slices to illustrate numerical proportion. In a pie chart, the arc length of
each slice (and consequently its central angle and area), is proportional to the
quantity it represents.

Program Code:

light=fifa.loc[fifa.Weight<125].count().iloc[0]
light_medium=fifa[(fifa.Weight>=125)&(fifa.Weight<150)].count().iloc[0]
medium=fifa[(fifa.Weight>=150)&(fifa.Weight<175)].count().iloc[0]
medium_heavy=fifa[(fifa.Weight>=175)&(fifa.Weight<200)].count().iloc[0]  # 175-200 lbs
heavy=fifa[(fifa.Weight>=200)].count().iloc[0]
labels=['Under 125','125-150','150-175','175-200','Over 200']
weights=[light,light_medium,medium,medium_heavy,heavy]
plt.pie(weights,labels=labels)
plt.title('Weight of professional Soccer Players(lbs)')
plt.show()

Output:
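Note: the same weight classes can also be formed in one step with pandas.cut, which avoids writing each range by hand. The sketch below uses made-up weights rather than the FIFA dataset, and the names weights, edges, groups and counts are assumptions made for the example.

```python
import pandas as pd

# Made-up weights in lbs, for illustration only.
weights = pd.Series([118, 130, 149, 150, 174, 175, 199, 200, 215])

edges = [0, 125, 150, 175, 200, float("inf")]
labels = ["Under 125", "125-150", "150-175", "175-200", "Over 200"]

# right=False makes each bin closed on the left: [125, 150), [150, 175), ...
groups = pd.cut(weights, bins=edges, labels=labels, right=False)
counts = groups.value_counts(sort=False)
print(counts)
```

The resulting counts can be passed to a pie chart in place of the individually computed class sizes.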

Program 6:
Aim: To demonstrate box plots for fifa dataset

Procedure:
Boxplots are a standardized way of displaying the distribution of data based on
a five number summary (“minimum”, first quartile (Q1), median, third quartile
(Q3), and “maximum”).

Program Code:
barcelona=fifa.loc[fifa.Club=='FC Barcelona']['Overall']

madrid=fifa.loc[fifa.Club=='Real Madrid']['Overall']
bp=plt.boxplot([barcelona,madrid])
plt.title('Professional Soccer Team Comparison')
plt.ylabel('FIFA Overall Rating')
plt.show()

Output:

Program 7:
Aim: To demonstrate box plots for fifa dataset

Procedure:
Boxplots are a standardized way of displaying the distribution of data based on
a five number summary (“minimum”, first quartile (Q1), median, third quartile
(Q3), and “maximum”).

Program Code:

barcelona=fifa.loc[fifa.Club=='FC Barcelona']['Overall']
madrid=fifa.loc[fifa.Club=='Real Madrid']['Overall']
rev=fifa.loc[fifa.Club=='New England Revolution']['Overall']
labels=['FC Barcelona','Real Madrid','New England Revolution']
bp=plt.boxplot([barcelona,madrid,rev],labels=labels,patch_artist=True)
for box in bp['boxes']:
    box.set(color='b',linewidth=2)  # outline of each box
    box.set(facecolor='y')          # fill colour (needs patch_artist=True)
plt.title('Professional Soccer Team Comparison')
plt.ylabel('FIFA Overall Rating')
plt.show()
Output:

Dt:25.11.21
PROGRAM 1
AIM: To implement scatter plot
PROCEDURE: With Pyplot, you can use the scatter() function to draw a scatter plot.
The scatter() function plots one dot for each observation. It needs two arrays of the same
length, one for the values of the x-axis, and one for values on the y-axis
CODE:
import matplotlib.pyplot as plt
import numpy as np
price=np.asarray([23.3,23.20,20.3,10.3,3.2])
sales_per_day=np.asarray([10,20,30,40,50])
profit_margin=np.asarray([5,10,15,20,25])
low=(0,1,0)
medium=(0,0,1)
high=(1,0,0)
sugar_cont=[low,high,high,medium,high]
plt.scatter(x=price,y=sales_per_day,s=profit_margin*10,c=sugar_cont)
plt.show()
OUTPUT:

CODE:
import matplotlib.pyplot as plt
import numpy as np
low=(0,1,0)
medium=(0,0,1)
high=(1,0,0)
price_orange=np.asarray([23.3,23.20,20.3,10.3,3.2])
sales_per_day_orange=np.asarray([10,20,30,40,50])
profit_margin_orange=np.asarray([5,10,15,20,25])
sugar_cont_orange=[low,high,high,medium,high]
price_cereal = np.asarray([1.50, 2.50, 1.15, 1.95])
sales_per_day_cereal = np.asarray([67, 34, 36, 12])
profit_margin_cereal = np.asarray([20,12,7,9])
sugar_cont_cereal = [low, high, medium, low]
plt.scatter(x=price_orange,y=sales_per_day_orange,s=profit_margin_orange*10,
            c=sugar_cont_orange,marker="X")
plt.scatter(x=price_cereal,y=sales_per_day_cereal,s=profit_margin_cereal*10,
            c=sugar_cont_cereal,marker="D")
plt.show()
OUTPUT:

PROGRAM 2
AIM: To demonstrate plot function
DESCRIPTION: In this program, the kind parameter of DataFrame.plot() determines the type of plot
required, and figsize gives the size of the figure in inches as (width, height).
CODE:
import pandas as pd
plotdata = pd.DataFrame({
"2018":[57,67,77,83],
"2019":[68,73,80,79],
"2020":[73,78,80,85]},
index=["Django", "Gafur", "Tommy", "Ronnie"])
plotdata.plot(kind="bar",figsize=(15, 8))
plt.title("FIFA ratings")
plt.xlabel("Footballer")
plt.ylabel("Ratings")
OUTPUT:

PROGRAM 3
AIM: To implement a stacked bar plot
DESCRIPTION: Here the stacked parameter plots the data series one above the other, like a pile
CODE:
import pandas as pd
plotdata = pd.DataFrame({
"2018":[57,67,77,83],
"2019":[68,73,80,79],
"2020":[73,78,80,85]},
index=["Django", "Gafur", "Tommy", "Ronnie"])
plotdata.plot(kind="bar",figsize=(15, 8),stacked=True)
plt.title("FIFA ratings")
plt.xlabel("Footballer")
plt.ylabel("Ratings")
OUTPUT:

PROGRAM 4:
AIM: To demonstrate the first n number of observations from the csv file
PROCEDURE: value_counts() counts how often each distinct value occurs in a column and
returns the counts sorted in descending order; slicing with [:20] keeps the first 20 of them.
Here df is the DataFrame that holds the dataset.
CODE:
top_20 = df['Country'].value_counts()[:20]
top_20.plot(kind='bar',figsize=(10,8))
plt.title('All Time Medals of top 20 countries')
plt.show()
OUTPUT:
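Note: the effect of value_counts() is easiest to see on a few rows. The sketch below uses a made-up column of country names instead of the original CSV file; the names medals, counts and top_2 are chosen for this example.

```python
import pandas as pd

# Made-up column, one entry per medal won.
medals = pd.Series(["USA", "India", "USA", "Kenya", "India", "USA"])

counts = medals.value_counts()   # counts per distinct value, largest first
top_2 = counts.head(2)           # keep the two most frequent values

print(counts)
print(top_2)
```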

Dt:16.12.21
Program 1:
AIM: To implement box plot without using inbuilt function
PROCEDURE:
Boxplots are a standardized way of displaying the distribution of data based on a five
number summary (“minimum”, first quartile (Q1), median, third quartile (Q3), and
“maximum”).
CODE:
#box plot
import matplotlib.pyplot as plt
import numpy as np
data=[199,201,236,269,271,278,283,291,301,303,341]  # already sorted
n=len(data)
# positions (n+1)/4, (n+1)/2 and 3(n+1)/4 are 1-based, list indices are 0-based
q2=data[(n+1)//2-1]
q1=data[(n+1)//4-1]
q3=data[(n+1)*3//4-1]
iqr= q3-q1
# whisker limits at 1.5 times the IQR; names avoid shadowing built-in min and max
lower= q1-(1.5*iqr)
upper= q3+(1.5*iqr)
y= [lower,q1,q2,q3,upper]
plt.boxplot(y)
print(y)
plt.show()
OUTPUT:
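Note: for comparison, the quartiles can also be obtained with numpy.percentile. This is an illustrative sketch and not part of the original record. numpy interpolates linearly between neighbouring values by default, so its quartiles can differ from those found by picking list positions by hand. The 1.5 x IQR fences used here are the common convention for flagging outliers.

```python
import numpy as np

data = np.array([199, 201, 236, 269, 271, 278, 283, 291, 301, 303, 341])

# Quartiles with numpy's default linear interpolation.
q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1

# Values beyond these fences are usually drawn as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = data[(data < lower_fence) | (data > upper_fence)]

print(q1, median, q3)            # 252.5 278.0 296.0
print(lower_fence, upper_fence)  # 187.25 361.25
print(outliers)                  # empty: no value lies outside the fences
```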

Program 2:
AIM: To plot frequency polygons using input frequency.
PROCEDURE:
A frequency polygon is a line graph of class frequency plotted against class midpoint. It can
be obtained by joining the midpoints of the tops of the rectangles in the histogram
CODE:
#q2 frequency polygons using frequency
import matplotlib.pyplot as plt
import numpy as np
range_bin=[5.5,10.5,15.5,20.5,25.5,30.5,35.5,40.5]
freq=[1,3,2,4,5,3,2]
l= len(range_bin)
r=[]
for i in range(l-1):
    x= (range_bin[i]+range_bin[i+1])/2  # class midpoint
    r.append(x)
plt.plot(r,freq,marker="*")
plt.xticks([0,5.5,10.5,15.5,20.5,25.5,30.5,35.5,40.5,45.5])
plt.xlabel("Bin range")
plt.ylabel("Frequency")
plt.show()
OUTPUT:

Program 3:
AIM: To plot relative frequency polygon with given input frequencies.
PROCEDURE:
A relative frequency polygon has peaks that represent the percentage of total data points
falling within the interval.
CODE:
#q3 frequency polygons using relative frequency
import matplotlib.pyplot as plt
import numpy as np
range_bin=[5.5,10.5,15.5,20.5,25.5,30.5,35.5,40.5]
freq=[1,3,2,4,5,3,2]
p=len(freq)
c_freq=0    # total frequency
for i in range(p):
    c_freq=c_freq+freq[i]
print(c_freq)
c_f=[]      # relative frequencies
for i in range(p):
    c_f.append(freq[i]/c_freq)
l= len(range_bin)
r=[]        # class midpoints
for i in range(l-1):
    x= (range_bin[i]+range_bin[i+1])/2
    r.append(x)
plt.plot(r,c_f,marker="*")
plt.xticks([0,5.5,10.5,15.5,20.5,25.5,30.5,35.5,40.5,45.5])
plt.xlabel("Bin range")
plt.ylabel("Relative frequency")
plt.show()
OUTPUT:

Program 4:
AIM: To demonstrate stem and leaf plot
PROCEDURE:
Stem and leaf plot is a way of organizing data into a form that makes it easy to observe the
frequency of different types of values.
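One possible way to build the plot for integers of any length is to split each value with divmod(value, 10): the quotient is the stem and the remainder is the leaf. The helper name stem_and_leaf is hypothetical and is not part of this record; the sample is the one used in the program.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group the last digit (leaf) of each value under its leading digits (stem)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)

# Sample used in this record.
data = [143, 163, 154, 159, 172, 165, 162, 171, 146, 165,
        176, 145, 165, 182, 175, 186, 160, 158, 167, 172]

for stem, leaves in stem_and_leaf(data).items():
    print(stem, "|", " ".join(map(str, leaves)))
```

Sorting first keeps both the stems and the leaves in increasing order, which is what makes the frequencies easy to read off.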
CODE:
#stem and leaf plot
x=[143,163,154,159,172,165,162,171,146,165,176,145,165,182,175,186,160,158,167,172]
x=sorted(x)
dict_a={}
for i in x :
    s= str(i)
    y=int(s[0:2])    # stem: first two digits
    dict_a[y]=[]
for i in x:
    s= str(i)
    y= int(s[0:2])
    z= int(s[2])     # leaf: last digit
    dict_a[y].append(z)
# print one row per stem
for stem in dict_a:
    print(stem,"|",*dict_a[stem])
OUTPUT:


Dt: 06-01-2022
Program-1
AIM: To implement one sample t test
PROCEDURE:
1. Identify Null hypothesis for the given problem.
2. Calculate mean of the given data set.

3. Calculate s value by using the formula:
$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$
where $\bar{x}$ is the mean of the given data set and
n is the size of the data set.
4. Find degrees of freedom i.e., n-1.
5. By using degrees of freedom find t critical value.
6. Calculate t value by using the formula:
t=(x-mu)/(s/n**0.5)
7. If t calculated is less than t critical then Null hypothesis is accepted or else rejected.
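Step 5 is usually done with a printed t table. As a sketch, the critical value can also be read from the t distribution in scipy.stats; the significance level alpha = 0.05 and the sample size n = 10 are assumptions chosen here to match the program that follows.

```python
from scipy import stats

alpha = 0.05      # assumed significance level
n = 10            # sample size of the data set used in this record
dof = n - 1       # degrees of freedom

# ppf is the inverse of the cumulative distribution function.
t_crit_one_tailed = stats.t.ppf(1 - alpha, dof)
t_crit_two_tailed = stats.t.ppf(1 - alpha / 2, dof)

print(round(t_crit_one_tailed, 3), round(t_crit_two_tailed, 3))
```

The one-tailed value rounds to the 1.83 used in this record; the two-tailed value is larger because the same alpha is split between both tails.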
CODE:
#t test
#one sample
x=[90,98,110,150,200,91,82,80,110,96]
su=0
for j in x:
    su=su+j
n=len(x)
avg=su/n
b=90    # hypothesised population mean
sd=0
for i in x:
    sd=sd+(i-avg)**2
s=(sd/(n-1))**(0.5)
tstat=(avg-b)/(s/(n)**(0.5))
print(tstat)
tcritic=1.83    # one-tailed critical value for df = n-1 = 9
if tstat<tcritic:
    print("accept NH")
else:
    print("reject NH")

OUTPUT:

USING SCIPY
#one sample t-test
from scipy import stats
data=[90,98,110,150,200,91,82,80,110,96]
t,p=stats.ttest_1samp(data,90)
print("tstat: ",t)
tcr=1.83
print("tcritical: ",tcr)
if(t<tcr):
    print("Null hypothesis is accepted")
else:
    print("Null hypothesis is rejected")
OUTPUT:

Program 2
AIM: To implement the unpaired unequal variance t-test
PROCEDURE:
1. Identify Null hypothesis for the given problem.
2. Calculate the first sample mean and the second sample mean for the given data set.
3. Calculate s1 and s2 values by using the formula:
$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$
4. Find degrees of freedom:
$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$
5. t can be calculated using the formula:
$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
where $s_1$ and $s_2$ are the standard deviations of the first and second samples,
$\bar{x}_1$ and $\bar{x}_2$ are the means of the first and second samples, and
$n_1$ and $n_2$ are the sizes of the first and second samples.
6. If the magnitude of t calculated is less than t critical then Null hypothesis is accepted or else rejected.
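The degrees of freedom in step 4 follow the Welch-Satterthwaite approximation. The sketch below computes it with the standard library only; the function name welch_df and the two small samples are hypothetical and are not taken from this record.

```python
import statistics

def welch_df(sample1, sample2):
    """Welch-Satterthwaite degrees of freedom for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    v1 = statistics.variance(sample1) / n1   # s1^2 / n1
    v2 = statistics.variance(sample2) / n2   # s2^2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Hypothetical samples with equal size and equal variance.
a = [1, 2, 3, 4, 5]
b = [2, 3, 4, 5, 6]
print(welch_df(a, b))
```

When both samples have the same size and the same variance, the result equals n1 + n2 - 2, the degrees of freedom of the equal variance test. In general it lies between the smaller of n1 - 1 and n2 - 1 and n1 + n2 - 2, and it is usually not a whole number.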
CODE:
import numpy as np
x=[13.5,23,13.2,12.7,22.1,17.5,20.1,22.5,19.0,21.9,13.2,11,12.8,13.1,11.6,23.0,13.2,22.9,13.1]
y=[10.1,27.6,13.8,13.1,25.6,26.7,28.9,30.1,25.4,21.9,12.1,13.4,12.3,11.9,22.2,12.3,22.2]
t_critic=2.052    # two-tailed critical value used in this record
nx=len(x)
ny=len(y)
meanx=sum(x)/nx
meany=sum(y)/ny
def standard_deviation(a,mean):
    total=0
    n=len(a)
    for i in a:
        c= (i-mean)**2
        total= total+ c
    variance= total/(n-1)
    sd=(variance)**(0.5)
    return sd
sdx= standard_deviation(x,meanx)
sdy= standard_deviation(y,meany)
# Welch-Satterthwaite degrees of freedom
vx= (sdx**2)/nx
vy= (sdy**2)/ny
df=((vx+vy)**2)/(((vx**2)/(nx-1)) +((vy**2)/(ny-1)))
t_stat= (meanx-meany)/(np.sqrt(vx+vy))
print("Degree of freedom :",df)
print("t-statistic value :",t_stat)
f_stat= (sdy**2)/(sdx**2)
f_critic=2.23
if f_stat>f_critic :
    print("Unequal variances")
else:
    print("Equal variances")
# t_stat can be negative, so compare its magnitude with the critical value
if abs(t_stat)<t_critic:
    print("Null hypothesis is accepted")
else:
    print("Null hypothesis is rejected")
OUTPUT:

USING SCIPY
from scipy import stats
x=[13.5,23,13.2,12.7,22.1,17.5,20.1,22.5,19.0,21.9,13.2,11,12.8,13.1,11.6,23.0,13.2,22.9,13.1]
y=[10.1,27.6,13.8,13.1,25.6,26.7,28.9,30.1,25.4,21.9,12.1,13.4,12.3,11.9,22.2,12.3,22.2]
res=stats.ttest_ind(x,y,equal_var=False)
print(res)
t_crit=2.052
if abs(res[0]) > t_crit:
    print("Null hypothesis is rejected.")
else:
    print("Null hypothesis is accepted.")

OUTPUT:

Program 3
AIM: To implement unpaired equal variance t test
PROCEDURE:
1. Identify Null hypothesis for the given problem.
2. Calculate first sample mean and second sample mean as x1 bar and x2 bar.
3. Calculate s1 and s2 values by using the formula:
$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$
4. f can be calculated by using
$F = \frac{s_2^2}{s_1^2}$
5. If f calculated is less than f critical, then it denotes that the variances of the samples are
equal.
6. Find the degrees of freedom:
$df = n_1 + n_2 - 2$
7. By using degrees of freedom find t critical value.
8. Calculate the Pooled Sample Variance and t value by using the formulae
$s_p^2 = \frac{\sum (x_{1i} - \bar{x}_1)^2 + \sum (x_{2i} - \bar{x}_2)^2}{n_1 + n_2 - 2}$
$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
9. If t calculated is less than t critical then Null hypothesis is accepted or else rejected.
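The critical values in steps 5 and 7 are normally taken from F and t tables. As a sketch, they can also be computed with scipy.stats; alpha = 0.05, one-tailed critical values, and two samples of 12 observations each are assumptions made here to match the data in the program that follows.

```python
from scipy import stats

alpha = 0.05        # assumed significance level
n1, n2 = 12, 12     # sizes of the two samples in this record

# F critical value; the numerator degrees of freedom belong to the
# sample whose variance is in the numerator of F.
f_crit = stats.f.ppf(1 - alpha, n2 - 1, n1 - 1)

# One-tailed t critical value for the pooled test, df = n1 + n2 - 2.
t_crit = stats.t.ppf(1 - alpha, n1 + n2 - 2)

print(round(f_crit, 2), round(t_crit, 3))
```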
CODE:
import numpy as np
a=[23,15,16,25,20,17,18,14,12,19,21,22]
b= [16,21,16,11,24,21,18,15,19,22,13,24]
f_critic= 2.82
na=len(a)
sa=sum(a)
mean_a=sa/na
nb= len(b)
sb= sum(b)
mean_b= sb/nb
def standard_deviation(a,mean):
    x=0
    n=len(a)
    for i in a:
        c= (i-mean)**2
        x= x+ c
    variance= x/(n-1)
    sd=(variance)**(0.5)
    return sd
sda= standard_deviation(a,mean_a)
sdb= standard_deviation(b,mean_b)
f_stat=(sdb/sda)**2
print("F-stat:",f_stat)
if f_stat>f_critic:
    print("Variances are unequal")
else:
    print("Equal variances")
def pooledSV(X,Y):
    n1, n2 = len(X), len(Y)
    xbar, ybar = np.mean(X), np.mean(Y)
    sum1 = sum([(x - xbar)**2 for x in X])
    sum2 = sum([(y - ybar)**2 for y in Y])
    return (sum1+sum2)/ (n1+n2-2)
S2 = pooledSV(a,b)
print("Pooled Sample Variance:{}".format(S2))
t_value = (mean_a-mean_b)/ (np.sqrt(S2 * (1/na + 1/nb)))
print("t-value :",t_value)
tcritic=1.717    # one-tailed critical value for df = na+nb-2 = 22
if t_value<tcritic:
    print("Null hypothesis is accepted")
else:
    print("Null hypothesis is rejected")
OUTPUT:


USING SCIPY
from scipy import stats
x= [23, 15, 16, 25, 20, 17, 18, 14, 12, 19, 21, 22]
y= [16, 21, 16, 11, 24, 21, 18, 15, 19, 22, 30, 24]
res = stats.ttest_ind(x,y, equal_var=True)
print(res)
t_crit = 1.717
if abs(res[0]) > t_crit:
    print("Null hypothesis is rejected.")
else:
    print("Null hypothesis is accepted.")
OUTPUT:

Program 4:
AIM: To implement paired t-test
PROCEDURE:
1. Identify Null hypothesis for the given problem.

2. Find the difference between the samples i.e. d and d^2

3. Find degrees of freedom i.e., n-1.

4. By using degrees of freedom find t critical value.

5. Calculate t value by using the formula:
$t = \frac{\sum d}{\sqrt{\frac{n \sum d^2 - (\sum d)^2}{n - 1}}}$

6. If the magnitude of t calculated is less than t critical, NH is accepted, or else it is rejected.
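The sum-based formula in step 5 is algebraically the same as dividing the mean of the differences by its standard error. The sketch below checks this numerically with the standard library, using the pretest and posttest scores from this record; the names t_sums and t_mean are chosen here for illustration.

```python
import math
import statistics

pretest = [23, 25, 28, 25, 25, 26, 25, 22, 30, 35, 40, 35, 30, 30]
posttest = [35, 40, 30, 40, 45, 30, 30, 55, 40, 40, 35, 38, 41, 35]

d = [a - b for a, b in zip(pretest, posttest)]   # paired differences
n = len(d)

# Form 1: sums of d and d^2, as in the procedure.
t_sums = sum(d) / math.sqrt((n * sum(v * v for v in d) - sum(d) ** 2) / (n - 1))

# Form 2: mean difference divided by its standard error.
t_mean = statistics.mean(d) / (statistics.stdev(d) / math.sqrt(n))

print(t_sums, t_mean)
```

Both forms give a negative t here because most posttest scores are higher than the pretest scores, which is why the magnitude of t is what should be compared with a two-tailed critical value.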

CODE:
pretest=[23,25,28,25,25,26,25,22,30,35,40,35,30,30]
posttest=[35,40,30,40,45,30,30,55,40,40,35,38,41,35]
D=[]
D2=[]
n=len(pretest)
for i in range(n):
    d= pretest[i]-posttest[i]
    D.append(d)
    D2.append(d**2)
sD=sum(D)
sD2=sum(D2)
tnum=sD
tden=((n*sD2-(sD*sD))/(n-1))**(1/2)
tstat=tnum/tden
print("t_stat value :",tstat)
t_critic=2.160    # two-tailed critical value for df = n-1 = 13
# tstat is negative here, so compare its magnitude with the critical value
if abs(tstat)<t_critic:
    print("Null hypothesis is accepted")
else:
    print("Null hypothesis is rejected")
OUTPUT:

USING SCIPY
from scipy import stats
#paired t test
pretest=[23,25,28,25,25,26,25,22,30,35,40,35,30,30]
posttest=[35,40,30,40,45,30,30,55,40,40,35,38,41,35]
ttest,pvalue=stats.ttest_rel(pretest,posttest)
print("ttest:",ttest)
print("pvalue:",pvalue)
if pvalue<0.05:
    print("Reject Null hypothesis")
else:
    print("Accept Null hypothesis")
OUTPUT:

Program 5:

AIM: To perform one way ANOVA test.

PROCEDURE: Analysis of variance (ANOVA) is a statistical technique that is used to check if
the means of two or more groups are significantly different from each other. The one-way
ANOVA tests the null hypothesis that two or more groups have the same population mean.
The test is applied to samples from two or more groups, possibly with differing sizes.

For k groups with n observations in total, where group j has $n_j$ observations and mean
$\bar{x}_j$, and $\bar{x}$ is the mean of all observations:
$ssw = \sum_j \sum_i (x_{ij} - \bar{x}_j)^2$, $mssw = ssw / (n - k)$
$ssb = \sum_j n_j (\bar{x}_j - \bar{x})^2$, $mssb = ssb / (k - 1)$
fstat=mssb/mssw
In the program without scipy, ssw, mssw, ssb, mssb and fstat are calculated in plain Python
with the help of the above formulae. If the F statistic is less than F critical then the Null
hypothesis is accepted or else rejected.
In the program with scipy, stats is imported and, with the help of f_oneway(),
the F statistic and the p value are generated.
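The calculation described above can be sketched for any number of groups with the standard library only. The function name one_way_anova_f and the three small groups are hypothetical; they are chosen so that the result is easy to verify by hand.

```python
def one_way_anova_f(*groups):
    """Return the one-way ANOVA F statistic (mssb / mssw) for the given groups."""
    k = len(groups)                          # number of groups
    n = sum(len(g) for g in groups)          # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # between-group and within-group sums of squares
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ssb / (k - 1)) / (ssw / (n - k))

# Hypothetical groups: means 2, 3 and 4, grand mean 3.
# ssb = 3*1 + 0 + 3*1 = 6, mssb = 6/2 = 3; ssw = 2 + 2 + 2 = 6, mssw = 6/6 = 1.
print(one_way_anova_f([1, 2, 3], [2, 3, 4], [3, 4, 5]))
```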

CODE:
1) Without using scipy
#one way anova
g1=[7,7,6,9,7,7,6,7,8,9]
g2=[5,6,3,5,4,6,5,4,5,5,6,7,6]
g3=[1,3,4,3,1,1,2,6,5,4,3,4,5]
mg1=sum(g1)/len(g1)
mg2=sum(g2)/len(g2)
mg3= sum(g3)/len(g3)
x=0    # within-group sum of squares (ssw)
n1=len(g1)
n2=len(g2)
n3=len(g3)
n= len(g1) +len(g2) +len(g3)
k=3    # number of groups
for i in g1:
    y= (i-mg1)**2
    x = x+ y
for i in g2:
    y= (i-mg2)**2
    x = x+ y
for i in g3:
    y= (i-mg3)**2
    x = x+ y
mssw= x/(n-k)
G=(sum(g1)+sum(g2)+sum(g3))/n    # grand mean
mssb= ((n1*(mg1-G)**2) + (n2*(mg2-G)**2) +(n3*(mg3-G)**2))/(k-1)
fstat=mssb/mssw
print("Mssb and Mssw are :",mssb,mssw)
fcritic=3.32
if fstat>fcritic :
    print("Reject Null Hypothesis")
else:
    print("Accept null hypothesis")
print("f-stat:",fstat)
OUTPUT:


2) using scipy
from scipy import stats
import numpy as np
g1=[7,7,6,9,7,7,6,7,8,9]
g2=[5,6,3,5,4,6,5,4,5,5,6,7,6]
g3=[1,3,4,3,1,1,2,6,5,4,3,4,5]
# print the result so that it is also shown when run as a script
print(stats.f_oneway(g1,g2,g3))
OUTPUT:
