AD3411 DATA SCIENCE AND ANALYTICS LAB

The document outlines a series of experiments in Python programming using various libraries such as pandas, matplotlib, numpy, and scipy. Each experiment includes an aim, algorithm, program code, output, and a result indicating successful execution. Topics covered include data frames, basic plots, frequency distributions, normal curves, regression, z-tests, t-tests, ANOVA, and building linear models.

EXP NO:1 WORKING WITH PANDAS DATA FRAMES

DATE:

AIM:

To Write a Python Program for working with data frames using pandas.

ALGORITHM:

Step 1: Start
Step 2: Import the pandas modules as pd
Step 3: Declare the data as a dictionary of column names and values
Step 4: Pass the dictionary to the pandas DataFrame constructor
Step 5: Print the data frame
Step 6: Stop

PROGRAM:
CREATE A SIMPLE PANDAS DATA FRAME:

import pandas as pd
data = {
"calories": [420, 380, 390],
"duration": [50, 40, 45]
}
#load data into a DataFrame object:
df = pd.DataFrame(data)
print(df)

OUTPUT:

   calories  duration
0       420        50
1       380        40
2       390        45
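
A single row can also be selected with the loc indexer. A minimal sketch, reusing the same df as above:

# refer to the row with index label 0; returns a pandas Series
print(df.loc[0])

This prints the row vertically, for example:
calories    420
duration     50
Name: 0, dtype: int64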

RESULT:
Thus the Python Program for working with data frames using pandas has been
executed successfully.
EXP NO:2 BASIC PLOTS USING MATPLOTLIB

DATE:

AIM:

To Write a Python Program for working with Basic Plots Using Matplotlib

ALGORITHM:

Step 1: Start
Step 2: Import the pyplot in matplotlib modules as plt
Step 3: Declare the array in row and column
Step 4: Give the necessary x and y plot values
Step 5: Print the basic plots
Step 6: Stop

PROGRAM:

import matplotlib.pyplot as plt

a = [1, 2, 3, 4, 5]
b = [0, 0.6, 0.2, 15, 10, 8, 16, 21]
plt.plot(a)
plt.plot(b, "or")
plt.plot(list(range(0, 22, 3)))
plt.xlabel('Day ->')
plt.ylabel('Temp ->')
c = [4, 2, 6, 8, 3, 20, 13, 15]
plt.plot(c, label='4th Rep')
ax = plt.gca()
ax.spines['right'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.spines['left'].set_bounds(-3, 40)
plt.xticks(list(range(-3, 10)))
plt.yticks(list(range(-3, 20, 3)))
ax.legend(['1st Rep', '2nd Rep', '3rd Rep', '4th Rep'])
plt.annotate('Temperature V / s Days', xy=(1.01, -2.15))
plt.title('BASIC PLOTS')
plt.show()
OUTPUT:

(A line chart titled 'BASIC PLOTS' showing the four series with a legend, axis labels 'Day ->' and 'Temp ->', and the annotation 'Temperature V / s Days'.)
RESULT:
Thus the Python Program for working with Basic Plots Using Matplotlib
has been executed successfully.
EXP NO:3 FREQUENCY DISTRIBUTIONS, AVERAGES, VARIABILITY
DATE:

AIM:

To Write a Python Program for working with Frequency Distributions, Averages, Variability

ALGORITHM:

Step 1: Import the necessary library such as numpy
Step 2: Calculate the average for the given data using the function np.average()
Step 3: Calculate the standard deviation for the data using the function np.std()
Step 4: Calculate the variance for the data using np.var()
Step 5: Print the averages, variances and standard deviations
Step 6: Stop

PROGRAM:
Python program to get the average of a list

import numpy as np
# avoid shadowing the built-in list type
values = [2, 40, 2, 502, 177, 7, 9]
print(np.average(values))

Output:
105.57142857142857

Python program to get the variance of a list

import numpy as np
values = [2, 4, 4, 4, 5, 5, 7, 9]
print(np.var(values))
Output:
4.0

Python program to get the standard deviation of a list

import numpy as np
values = [290, 124, 127, 899]
print(np.std(values))
Output:
318.35750344541907
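
The experiment title also covers frequency distributions, which the snippets above do not compute. A minimal sketch using numpy's unique function (with hypothetical sample data):

import numpy as np
data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample
# return each distinct value together with how often it occurs
values, counts = np.unique(data, return_counts=True)
for v, c in zip(values, counts):
    print(v, c)

Output:
2 1
4 3
5 2
7 1
9 1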

RESULT:
Thus the Python Program for working with Frequency Distributions,
Averages, Variability has been executed successfully.
EXP NO:4 NORMAL CURVES, CORRELATION AND SCATTER
PLOTS, CORRELATION COEFFICIENT
DATE:

AIM:
To Write a Python Program for Normal Curves, Correlation And Scatter Plots,
Correlation Coefficient

ALGORITHM:

Step 1: Start the Program


Step 2: Import packages scipy and call function scipy.stats
Step 3: Import packages numpy, matplotlib
Step 4: Create the distribution
Step 5: Visualizing the distribution
Step 6: Stop the process

PROGRAM:
#Normal curves
import matplotlib.pyplot as plt
import numpy as np

mu, sigma = 0.5, 0.1
s = np.random.normal(mu, sigma, 1000)
# Create the bins and histogram (the normed= argument was removed in
# newer Matplotlib; density=True is the current equivalent)
count, bins, ignored = plt.hist(s, 20, density=True)
plt.show()

Output:

(A 20-bin density histogram of the 1,000 samples, approximating a normal curve centred at 0.5.)
#Correlation and scatter plots
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

y = pd.Series([1, 2, 3, 4, 3, 5, 4])
x = pd.Series([1, 2, 3, 4, 5, 6, 7])
correlation = y.corr(x)
print(correlation)

Output:
0.8603090020146067
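
The aim also calls for a scatter plot. A minimal sketch, reusing x and y from above:

plt.scatter(x, y)
plt.xlabel('x')
plt.ylabel('y')
plt.show()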

Correlation coefficient
import math

def correlationCoefficient(X, Y, n):
    sum_X = 0
    sum_Y = 0
    sum_XY = 0
    squareSum_X = 0
    squareSum_Y = 0
    i = 0
    while i < n:
        # sum of elements of array X
        sum_X = sum_X + X[i]
        # sum of elements of array Y
        sum_Y = sum_Y + Y[i]
        # sum of X[i] * Y[i]
        sum_XY = sum_XY + X[i] * Y[i]
        # sum of squares of array elements
        squareSum_X = squareSum_X + X[i] * X[i]
        squareSum_Y = squareSum_Y + Y[i] * Y[i]
        i = i + 1
    # use the formula for the correlation coefficient
    corr = (n * sum_XY - sum_X * sum_Y) / math.sqrt(
        (n * squareSum_X - sum_X * sum_X) * (n * squareSum_Y - sum_Y * sum_Y))
    return corr

X = [15, 18, 21, 24, 27]
Y = [25, 25, 27, 31, 32]
# Find the size of the arrays.
n = len(X)
# Function call to correlationCoefficient.
print('{0:.6f}'.format(correlationCoefficient(X, Y, n)))

OUTPUT:
0.953463

RESULT:
Thus the Python Program for Normal Curves, Correlation And Scatter Plots,
Correlation Coefficient has been executed successfully.
EXP NO:5 REGRESSION
DATE:

AIM:
To Write a Python Program for Regression concept.

ALGORITHM

Step 1: Start the Program


Step 2: Import the numpy and matplotlib packages
Step 3: Define the coefficient function
Step 4: Calculate cross-deviation and deviation about x
Step 5: Calculate regression coefficients
Step 6: Plot the Linear regression and define main function
Step 7: Print the result
Step 8: Stop the process

PROGRAM:
import numpy as np
import matplotlib.pyplot as plt

def estimate_coef(x, y):
    # number of observations/points
    n = np.size(x)
    # mean of x and y vectors
    m_x = np.mean(x)
    m_y = np.mean(y)
    # calculating cross-deviation and deviation about x
    SS_xy = np.sum(y * x) - n * m_y * m_x
    SS_xx = np.sum(x * x) - n * m_x * m_x
    # calculating regression coefficients
    b_1 = SS_xy / SS_xx
    b_0 = m_y - b_1 * m_x
    return (b_0, b_1)

def plot_regression_line(x, y, b):
    # plotting the actual points as a scatter plot
    plt.scatter(x, y, color="m", marker="o", s=30)
    # predicted response vector
    y_pred = b[0] + b[1] * x
    # plotting the regression line
    plt.plot(x, y_pred, color="g")
    # putting labels
    plt.xlabel('x')
    plt.ylabel('y')
    # function to show plot
    plt.show()

def main():
    # observations / data
    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])
    # estimating coefficients
    b = estimate_coef(x, y)
    print("Estimated coefficients:\nb_0 = {}\nb_1 = {}".format(b[0], b[1]))
    # plotting regression line
    plot_regression_line(x, y, b)

if __name__ == "__main__":
    main()

OUTPUT:

Estimated coefficients:
b_0 = 1.2363636363636363
b_1 = 1.1696969696969697
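
The coefficients can be cross-checked against numpy's built-in least-squares fit. A minimal sketch using the same data:

import numpy as np
x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])
# polyfit returns the highest-degree coefficient first: [b_1, b_0]
b_1, b_0 = np.polyfit(x, y, 1)
print(b_0, b_1)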
RESULT:
Thus the Python Program for Regression has been executed successfully
EXP NO:6 Z-TEST
DATE:

AIM:
To Write a Python Program for z-test concept.

ALGORITHM:
Step 1: Evaluate the data distribution.
Step 2: Formulate Hypothesis statement symbolically
Step 3: Define the level of significance (alpha)
Step 4: Calculate Z test statistic or Z score.
Step 5: Derive P-value for the Z score calculated.
Step 6: Make decision:
Step 6.1: P-Value <= alpha, then we reject H0.
Step 6.2: If P-Value > alpha, Fail to reject H0

PROGRAM:
import math
import numpy as np
from numpy.random import randn
from statsmodels.stats.weightstats import ztest

# Generate a random array of 50 numbers centred at 110; the spread is
# scaled to 15/sqrt(50), i.e. the standard error of the assumed IQ data
mean_iq = 110
sd_iq = 15 / math.sqrt(50)
alpha = 0.05
null_mean = 100
data = sd_iq * randn(50) + mean_iq
# print mean and sd
print('mean=%.2f stdv=%.2f' % (np.mean(data), np.std(data)))
ztest_Score, p_value = ztest(data, value=null_mean, alternative='larger')
if p_value < alpha:
    print("Reject Null Hypothesis")
else:
    print("Fail to Reject Null Hypothesis")

OUTPUT:
Reject Null Hypothesis
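
The z statistic can also be computed by hand as a cross-check. A minimal sketch, reusing data and null_mean from above:

import numpy as np
from scipy import stats
# z = (sample mean - hypothesised mean) / standard error
z = (np.mean(data) - null_mean) / (np.std(data, ddof=1) / np.sqrt(len(data)))
# one-sided p-value for the 'larger' alternative
p = 1 - stats.norm.cdf(z)
print(z, p)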

RESULT:
Thus the Python Program for Z TEST has been executed successfully
EXP NO:7 T-TEST
DATE:

AIM:
To Write a Python Program for T-test concept.

ALGORITHM:
Step 1: Create some dummy age data for the population of voters in the entire country
Step 2: Create a sample of voters in Minnesota and test whether the average age of voters in Minnesota differs from the population
Step 3: Conduct a t-test at a 95% confidence level and see if it correctly rejects the null hypothesis that the sample comes from the same distribution as the population
Step 4: If the t-statistic lies outside the quantiles of the t-distribution corresponding to our confidence level and degrees of freedom, we reject the null hypothesis
Step 5: Calculate the chances of seeing a result as extreme as the one being observed (known as the p-value) by passing the t-statistic in as the quantile to the stats.t.cdf() function

PROGRAM:
import numpy as np
from scipy import stats
# Defining two random distributions
# Sample Size
N = 10
# Gaussian distributed data with mean = 2 and var = 1
x = np.random.randn(N) + 2
# Gaussian distributed data with mean = 0 and var = 1
y = np.random.randn(N)
# Calculating the Standard Deviation
# Calculating the variance to get the standard deviation
var_x = x.var(ddof = 1)
var_y = y.var(ddof = 1)
# Standard Deviation
SD = np.sqrt((var_x + var_y) / 2)
print("Standard Deviation =", SD)
# Calculating the T-Statistics
tval = (x.mean() - y.mean()) / (SD * np.sqrt(2 / N))
# Comparing with the critical T-Value
# Degrees of freedom
dof = 2 * N - 2
# p-value after comparison with the T-Statistics
pval = 1 - stats.t.cdf(tval, df = dof)
print("t = " + str(tval))
print("p = " + str(2 * pval))
## Cross checking using the built-in function from the SciPy package
tval2, pval2 = stats.ttest_ind(x, y)
print("t = " + str(tval2))
print("p = " + str(pval2))

OUTPUT:
Standard Deviation = 0.7642398582227466
t = 4.87688162540348
p = 0.0001212767169695983
t = 4.876881625403479
p = 0.00012127671696957205
(The exact values vary between runs because the data are randomly generated.)

RESULT:
Thus the Python Program for T TEST has been executed successfully
EXP NO:8 ANOVA TEST
DATE:

AIM:
To Write a Python Program for ANOVA.

ALGORITHM:

Step 1: Input the values
Step 2: Determine whether the null hypothesis or the alternate hypothesis is acceptable
Step 3: Rows are grouped according to their value in the category column
Step 4: The total mean value of the value column is computed
Step 5: The mean within each group is computed
Step 6: The difference between each value and the mean value for the group is calculated and squared
Step 7: Calculate the F critical value and find the acceptance of the hypothesis

PROGRAM:
# Installing the package install.packages("dplyr") # Loading the package
library(dplyr)
# Variance in mean within group and between group
bo×plot(mtcars$disp~factor(mtcars$gear),
×lab = "gear", ylab = "disp")
# Step 1: Setup Null Hypothesis and Alternate Hypothesis # HO = mu = muO1
= muO2 (There is no difference
# between average displacement for different gear) # H1 = Not all means are
equal
# Step 2: Calculate test statistics using aov function mtcars_aov <-
aov(mtcars$disp~factor(mtcars$gear)) summary(mtcars_aov)
# Step 3: Calculate F-Critical Value
# For O.O5 Significant value, critical value = alpha = O.O5 # Step 4: Compare
test statistics with F-Critical value
# and conclude test p <alpha, Reject Null Hypothesis
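
The program above is written in R, although the aim asks for Python. A minimal Python equivalent using scipy.stats.f_oneway, with hypothetical sample groups standing in for the gear categories:

from scipy import stats
# hypothetical displacement values for three gear groups
group1 = [160.0, 258.0, 225.0, 240.0]
group2 = [108.0, 140.8, 167.6, 121.0]
group3 = [120.3, 351.0, 145.0, 301.0]
f_stat, p_value = stats.f_oneway(group1, group2, group3)
print("F =", f_stat, "p =", p_value)
alpha = 0.05
if p_value < alpha:
    print("Reject Null Hypothesis")
else:
    print("Fail to Reject Null Hypothesis")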

OUTPUT:

(A box plot of disp grouped by gear, followed by the aov summary table.)
RESULT:

Thus the Python Program for ANOVA has been executed successfully
EXP NO:9 BUILDING AND VALIDATING LINEAR MODELS
DATE:

AIM:
To Write a Python Program to build and validate linear models

ALGORITHM:
Step1: Consider a set of values x, y.
Step2: Take the linear equation y = a + bx.
Step3: Compute the values of a and b for the given data:
b = (nΣxy − (Σx)(Σy)) / (nΣx² − (Σx)²), a = (Σy − b(Σx)) / n.
Step4: Substitute the values of a and b into the equation y = a + bx.
Step5: Regress the value of y for any x.

PROGRAM:
# Importing the necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_boston   # removed in scikit-learn 1.2

sns.set(style="ticks", color_codes=True)
plt.rcParams['figure.figsize'] = (8, 5)
plt.rcParams['figure.dpi'] = 150

# loading the data
boston = load_boston()

You can check the keys with the following code:
print(boston.keys())

The output will be as follows:
dict_keys(['data', 'target', 'feature_names', 'DESCR', 'filename'])

print(boston.DESCR)

You will find these details in the output: Attribute Information (in order):
— CRIM per capita crime rate by town
— ZN proportion of residential land zoned for lots over 25,000 sq.ft.
— INDUS proportion of non-retail business acres per town
— CHAS Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
— NOX nitric oxides concentration (parts per 10 million)
— RM average number of rooms per dwelling
— AGE proportion of owner-occupied units built prior to 1940
— DIS weighted distances to five Boston employment centres
— RAD index of accessibility to radial highways
— TAX full-value property-tax rate per $10,000
— PTRATIO pupil-teacher ratio by town
— B 1000(Bk − 0.63)² where Bk is the proportion of blacks by town
— LSTAT % lower status of the population
— MEDV Median value of owner-occupied homes in $1000's
:Missing Attribute Values: None

df = pd.DataFrame(boston.data, columns=boston.feature_names)
df.head()
# print the columns present in the dataset
print(df.columns)
# print the top 5 rows in the dataset
print(df.head())

(First five records from the data set)

# plotting a heatmap for the overall data set
sns.heatmap(df.corr(), square=True, cmap='RdYlGn')

(Heat map of the overall data set)

So let's plot a regression plot to see the correlation between RM and MEDV.

# add the target (median home value) as a column so it can be plotted
df['MEDV'] = boston.target
sns.lmplot(x='RM', y='MEDV', data=df)

(Regression plot of RM against MEDV)
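
The aim also asks to validate the model. A minimal validation sketch, assuming the df built above (including the MEDV column), using scikit-learn's LinearRegression and a train/test split:

from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

X = df[['RM']]          # single predictor for simplicity
y = df['MEDV']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)
model = LinearRegression().fit(X_train, y_train)
# score the fitted model on the held-out data
print('R^2 on held-out data:', r2_score(y_test, model.predict(X_test)))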

RESULT:
Thus the Python Program for building and validating linear models
has been executed successfully
EXP NO:10 BUILDING AND VALIDATING LOGISTIC MODELS
DATE:

AIM:
To Write a Python Program to build and validate logistic models

ALGORITHM:
Step1: Initialize the variables
Step2: Set the Data frame
Step3: Split the data set into training and testing.
Step4: Fit the data into logistic regression function.
Step5: Predict the test data set.
Step6: Print the results.

PROGRAM:
Building the Logistic Regression model:
# importing libraries
import statsmodels.api as sm
import pandas as pd

# loading the training dataset
df = pd.read_csv('logit_train1.csv', index_col=0)
# defining the dependent and independent variables
Xtrain = df[['gmat', 'gpa', 'work_experience']]
ytrain = df[['admitted']]
# building the model and fitting the data
log_reg = sm.Logit(ytrain, Xtrain).fit()
Output :
Optimization terminated successfully.
Current function value: 0.352707
Iterations 8

# printing the summary table
print(log_reg.summary())
Output :
Logit Regression Results
=============================================================
Dep. Variable: admitted      No. Observations: 30
Model: Logit                 Df Residuals: 27
Method: MLE                  Df Model: 2
Date: Wed, 15 Jul 2020       Pseudo R-squ.: 0.4912
Time: 16:09:17               Log-Likelihood: -10.581
converged: True              LL-Null: -20.794
Covariance Type: nonrobust   LLR p-value: 3.668e-05
=============================================================
                    coef   std err       z   P>|z|   [0.025   0.975]
-------------------------------------------------------------
gmat             -0.0262     0.011  -2.383   0.017   -0.048   -0.005
gpa               3.9422     1.964   2.007   0.045    0.092    7.792
work_experience   1.1983     0.482   2.487   0.013    0.254    2.143
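
Note that sm.Logit does not add an intercept term automatically. A minimal sketch of a variant with an explicit constant, assuming the same Xtrain and ytrain as above:

# prepend a 'const' column of ones so the model fits an intercept
X_const = sm.add_constant(Xtrain)
log_reg_const = sm.Logit(ytrain, X_const).fit()
print(log_reg_const.summary())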

Predicting on New Data :

# loading the testing dataset
df = pd.read_csv('logit_test1.csv', index_col=0)
# defining the dependent and independent variables
Xtest = df[['gmat', 'gpa', 'work_experience']]
ytest = df['admitted']
# performing predictions on the test dataset
yhat = log_reg.predict(Xtest)
prediction = list(map(round, yhat))
# comparing original and predicted values of y
print('Actual values', list(ytest.values))
print('Predictions :', prediction)

Output :
Optimization terminated successfully. Current function value: 0.352707 Iterations 8
Actual values [0, 0, 0, 0, 0, 1, 1, 0, 1, 1]
Predictions : [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

Testing the accuracy of the model :

from sklearn.metrics import confusion_matrix, accuracy_score

# confusion matrix
cm = confusion_matrix(ytest, prediction)
print("Confusion Matrix :\n", cm)
# accuracy score of the model
print('Test accuracy = ', accuracy_score(ytest, prediction))

Output :
Confusion Matrix :
[[6 0]
 [2 2]]
Test accuracy = 0.8

RESULT:
Thus the Python Program for building and validating logistic models
has been executed successfully
EXP NO:11 TIME SERIES ANALYSIS
DATE:

AIM:
To Write a Python Program for Time Series Analysis

ALGORITHM:
Step1: Load the time series dataset correctly in Pandas
Step2: Index the time-series data
Step3: Resample the time series using Pandas
Step4: Apply rolling windows to the time series
Step5: Plot the time-series data using Pandas

PROGRAM:

import warnings
import itertools
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import statsmodels.api as sm
import matplotlib

warnings.filterwarnings("ignore")
plt.style.use('fivethirtyeight')
matplotlib.rcParams['axes.labelsize'] = 14
matplotlib.rcParams['xtick.labelsize'] = 12
matplotlib.rcParams['ytick.labelsize'] = 12
matplotlib.rcParams['text.color'] = 'k'
We start with time series analysis and forecasting for furniture sales.

df = pd.read_excel("Superstore.xls")
furniture = df.loc[df['Category'] == 'Furniture']

We have a good four years of furniture sales data.

furniture['Order Date'].min(), furniture['Order Date'].max()
(Timestamp('2014-01-06 00:00:00'), Timestamp('2017-12-30 00:00:00'))

Data Preprocessing
This step includes removing columns we do not need, checking for missing values, aggregating sales by date, and so on.

cols = ['Row ID', 'Order ID', 'Ship Date', 'Ship Mode', 'Customer ID',
        'Customer Name', 'Segment', 'Country', 'City', 'State',
        'Postal Code', 'Region', 'Product ID', 'Category',
        'Sub-Category', 'Product Name', 'Quantity', 'Discount', 'Profit']
furniture.drop(cols, axis=1, inplace=True)
furniture = furniture.sort_values('Order Date')
furniture.isnull().sum()
furniture = furniture.groupby('Order Date')['Sales'].sum().reset_index()

Order Date    0
Sales         0
dtype: int64

Figure 1

Indexing with Time Series Data


furniture = furniture.set_index('Order Date')
furniture.index

Figure 2
We will use the average daily sales value for each month instead, and we are using the start of each month as the timestamp.

y = furniture['Sales'].resample('MS').mean()
# Have a quick peek at the 2017 furniture sales data.
y['2017':]
Figure 3

Visualizing Furniture Sales Time Series Data

y.plot(figsize=(15, 6))
plt.show()
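
Step4 of the algorithm mentions rolling windows, which the program above does not show. A minimal sketch, reusing the monthly series y from above (the 12-month window size is an assumption):

# a 12-month rolling mean smooths out seasonality
y.rolling(window=12).mean().plot(figsize=(15, 6))
plt.show()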

RESULT:

Thus the Python Program for Time Series Analysis has been executed successfully
