DA Lab ANSWERS

Uploaded by

sakthi

AD8412 - DATA ANALYTICS LAB

1. Apply the following functions to the list of BMI values for people living in a rural area:

bmi_list = [29, 18, 20, 22, 19, 25, 30, 28, 22, 21, 18, 19, 20, 20, 22, 23]

(i) random.choice()
(ii) random.sample()
(iii) random.randint()
PROGRAM:

import random
from random import sample

def BMI(height, weight):
    # BMI = weight (kg) divided by height (m) squared
    bmi = weight/(height**2)
    return bmi

bmi_list = [29, 18, 20, 22, 19, 25, 30, 28, 22, 21, 18, 19, 20, 20, 22, 23]
height = 1.79832
weight = 70
bmi = BMI(height, weight)
print("The BMI is", format(bmi), "so ", end='')
if (bmi < 18.5):
    print("underweight")
elif (bmi >= 18.5 and bmi < 24.9):
    print("Healthy")
elif (bmi >= 24.9 and bmi < 30):
    print("overweight")
elif (bmi >= 30):
    print("Suffering from Obesity")

output:
The BMI is 21.64532402096181 so Healthy

(i) random.choice()

print(random.choice(bmi_list))

output:
30

(ii) random.sample()

print(sample(bmi_list,3))

output: [18, 25, 22]

(iii) random.randint()

print(random.randint(0, 12))

output: 9
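
Because all three functions draw from a pseudo-random generator, every run prints different values. A minimal sketch, assuming reproducible output is wanted for the lab record: seed the generator before drawing. The seed value 42 and the use of randint() to pick a list index are illustrative choices, not part of the original answer.

```python
import random

bmi_list = [29, 18, 20, 22, 19, 25, 30, 28, 22, 21, 18, 19, 20, 20, 22, 23]

# Assumption: a fixed seed (42 is arbitrary) so repeated runs give the same draws
random.seed(42)

one_value = random.choice(bmi_list)          # one element of the list
three_values = random.sample(bmi_list, 3)    # three elements from distinct positions

# randint(a, b) includes both end points, so the last valid index is len - 1
index = random.randint(0, len(bmi_list) - 1)
value_at_index = bmi_list[index]

print(one_value, three_values, index, value_at_index)
```

Using randint() as an index ties the drawn integer to the BMI list, whereas randint(0, 12) on its own only returns a number between 0 and 12.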

2. Use the random.choices() function to select multiple random items from a sequence with
repetition.

For example, you have a list of names and want to choose four random names from it,
and it is acceptable if one of the names repeats.

names = ["Roger", "Nadal", "Novac", "Andre", "Sarena", "Mariya", "Martina", "Kumar"]

PROGRAM:

import random

names=["Roger", "Nadal", "Novac", "Andre", "Sarena", "Mariya", "Martina","Kumar"]

# choose four random items with replacement, so a name may repeat

sample_list3 = random.choices(names, k=4)

print(sample_list3)

Output:
['Novac', 'Novac', 'Martina', 'Sarena']
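
The difference between drawing with and without replacement can be sketched as follows. random.choices() may return the same name more than once and accepts a k larger than the list, while random.sample() never reuses a position and raises ValueError when k exceeds the list length. The value k=10 is only an illustration.

```python
import random

names = ["Roger", "Nadal", "Novac", "Andre", "Sarena", "Mariya", "Martina", "Kumar"]

# With replacement: k may exceed len(names), so repeats are unavoidable here
with_repeats = random.choices(names, k=10)
print(with_repeats)

# Without replacement: every drawn name is distinct
without_repeats = random.sample(names, 4)
print(without_repeats)

# Asking sample() for more items than the list holds is an error
try:
    random.sample(names, 10)
    sample_failed = False
except ValueError:
    sample_failed = True
print("sample(names, 10) raised ValueError:", sample_failed)
```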

3. Write a Python program to demonstrate the use of sample() function for string and tuple
types.

import random

string = "Welcome World"

print("With string:", random.sample(string, 4))


output: With string: ['r', 'm', 'W', 'W']

tuple1 = ("Selshia", "AI", "computer", "science", "Jansons", "Engineering", "btech")

print("With tuple:", random.sample(tuple1, 4))

output:
With tuple: ['Jansons', 'Selshia', 'btech', 'Engineering']
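
As both outputs show, sample() returns a list whatever the type of the input sequence. A short sketch, assuming the result is wanted back in the original type: join the characters into a string, or wrap the list in tuple(). The variable names are illustrative.

```python
import random

string = "Welcome World"
tuple1 = ("Selshia", "AI", "computer", "science", "Jansons", "Engineering", "btech")

picked_chars = random.sample(string, 4)    # list of 4 characters
picked_items = random.sample(tuple1, 4)    # list of 4 tuple elements

as_string = "".join(picked_chars)          # back to str
as_tuple = tuple(picked_items)             # back to tuple

print(type(picked_chars).__name__, as_string)
print(type(picked_items).__name__, as_tuple)
```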

4. Write a Python script to implement the Z-Test for the following problem:

A school claims that its students are more intelligent than those of an average school.
The IQ scores of 50 students are measured and the sample mean turns out to be 110. The mean of
the population IQ is 100 and the standard deviation is 15. Check whether the principal's
claim is right at a 5% significance level.

PROGRAM:

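import math
import numpy as np
from numpy.random import randn
from statsmodels.stats.weightstats import ztest

mean_iq = 110
# 15/sqrt(50) is the standard error of the mean (about 2.12), not the
# population SD of 15, so the simulated scores are tightly clustered
sd_iq = 15/math.sqrt(50)
alpha = 0.05
null_mean = 100

# simulate 50 IQ scores centred on the sample mean
data = sd_iq*randn(50)+mean_iq
print('mean=%.2f stdv=%.2f' % (np.mean(data), np.std(data)))

# one-sided test: H0 mean = 100 against H1 mean > 100
ztest_Score, p_value = ztest(data, value=null_mean, alternative='larger')
if (p_value < alpha):
    print("Reject Null Hypothesis")
else:
    print("Fail to Reject Null Hypothesis")

OUTPUT:
mean=109.65 stdv=2.06
Reject Null Hypothesis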
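
The same decision can be reached without simulating data, because the problem already gives the sample mean, the population standard deviation and the sample size. A minimal sketch using only the standard library; the variable names are illustrative, and the one-sided p-value is taken from the standard normal distribution through math.erfc.

```python
import math

sample_mean = 110      # mean IQ of the 50 students
null_mean = 100        # population mean IQ under H0
population_sd = 15     # population standard deviation
n = 50
alpha = 0.05

# z = (sample mean - hypothesised mean) / standard error
standard_error = population_sd / math.sqrt(n)
z_score = (sample_mean - null_mean) / standard_error

# Upper-tail p-value of the standard normal: P(Z > z) = 0.5 * erfc(z / sqrt(2))
p_value = 0.5 * math.erfc(z_score / math.sqrt(2))

print("z = %.3f, p = %.2e" % (z_score, p_value))
if p_value < alpha:
    print("Reject Null Hypothesis")
else:
    print("Fail to Reject Null Hypothesis")
```

Here z is about 4.71, well above the one-sided critical value of 1.645, so the principal's claim is supported at the 5% significance level.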

5. Write a Python program to demonstrate the ‘T-Test’ with suitable libraries for a sample
student’s data. (Create and use dataset of your own)

import pandas as pd
from scipy import stats

df = pd.read_csv("paired_ttest - paired_ttest.csv")

# paired (dependent samples) t-test on the two columns
tscore, pvalue = stats.ttest_rel(df['Brand 1'], df['Brand 2'])
alpha = 0.20
print(tscore, pvalue)
if (pvalue > alpha):
    print("Failed to reject or do not reject null hypothesis")
else:
    print("Reject null hypothesis")

output:
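
Since the CSV file is not included, the paired t-test can also be sketched on a small self-made student dataset. The marks below are hypothetical, and the statistic is computed by hand with the standard library and compared against the two-sided table value t(0.025, 9) = 2.262 instead of a p-value. Calling stats.ttest_rel(after, before) on the same lists should give the same t statistic.

```python
import math
from statistics import mean, stdev

# Hypothetical marks of ten students before and after a revision class
before = [62, 70, 55, 68, 74, 60, 58, 66, 72, 65]
after = [68, 74, 60, 70, 80, 63, 61, 72, 75, 70]

# A paired test works on the per-student differences
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)

# t = mean difference / (sample SD of differences / sqrt(n))
t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))

t_critical = 2.262   # two-sided 5% critical value for n - 1 = 9 degrees of freedom
reject = abs(t_stat) > t_critical

print("t = %.3f (df = %d)" % (t_stat, n - 1))
print("Reject null hypothesis" if reject else "Fail to reject null hypothesis")
```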

6. Import the necessary libraries in Python for implementing ‘One-Way ANOVA Test’ in a
sample dataset. (Create and use dataset of your own)

Program:

import pandas as pd

# load data file

df = pd.read_csv("https://reneshbedre.github.io/assets/posts/anova/onewayanova.txt",
sep="\t")

# reshape the dataframe into the long format expected by the statsmodels package

df_melt = pd.melt(df.reset_index(), id_vars=['index'], value_vars=['A', 'B', 'C', 'D'])

# replace column names

df_melt.columns = ['index', 'treatments', 'value']

# generate a boxplot to see the data distribution by treatments. Using boxplot, we can

# easily detect the differences between different treatments

import matplotlib.pyplot as plt

import seaborn as sns

ax = sns.boxplot(x='treatments', y='value', data=df_melt)

ax = sns.swarmplot(x="treatments", y="value", data=df_melt)

plt.show()

output:
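
The program above only plots the treatments; the ANOVA test itself still has to be computed. A minimal sketch of the one-way F statistic, worked by hand on three hypothetical groups (the values are made up for illustration and are not the contents of the file loaded above). The critical value F(0.05; 2, 12) = 3.885 is taken from an F table; scipy.stats.f_oneway(*groups.values()) should report the same F together with a p-value.

```python
from statistics import mean

# Hypothetical scores for three treatments, five observations each
groups = {
    "A": [20, 22, 19, 24, 25],
    "B": [28, 30, 27, 26, 29],
    "C": [18, 20, 22, 19, 21],
}

all_values = [v for g in groups.values() for v in g]
grand_mean = mean(all_values)

# Between-group variation: how far each group mean is from the grand mean
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
# Within-group variation: spread of observations around their own group mean
ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

f_stat = (ss_between / df_between) / (ss_within / df_within)
f_critical = 3.885   # F(0.05; 2, 12) from an F table

print("F = %.3f (df = %d, %d)" % (f_stat, df_between, df_within))
print("Reject null hypothesis" if f_stat > f_critical else "Fail to reject null hypothesis")
```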

7. Import the necessary libraries in Python for implementing ‘Two-Way ANOVA Test’ in a
sample dataset. (Create and use dataset of your own)
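
No program is given for this question. A minimal sketch, assuming a balanced design with two factors and three observations per cell: the sums of squares are computed by hand with the standard library. The marks, the factor names (method, session) and the table value F(0.05; 1, 8) = 5.318 are assumptions made for illustration. With statsmodels, fitting an OLS model and passing it to anova_lm is the usual library route.

```python
from statistics import mean

# Hypothetical marks: data[(method, session)] = scores of three students
data = {
    ("M1", "morning"): [70, 72, 74],
    ("M1", "evening"): [65, 67, 69],
    ("M2", "morning"): [80, 82, 84],
    ("M2", "evening"): [78, 79, 83],
}
levels_a = sorted({a for a, _ in data})
levels_b = sorted({b for _, b in data})
reps = 3  # observations per cell (balanced design)

all_values = [v for cell in data.values() for v in cell]
grand = mean(all_values)

def level_mean(position, level):
    # Mean of every observation whose factor at `position` equals `level`
    return mean(v for key, cell in data.items() if key[position] == level for v in cell)

# Sums of squares for the two main effects
ss_a = reps * len(levels_b) * sum((level_mean(0, a) - grand) ** 2 for a in levels_a)
ss_b = reps * len(levels_a) * sum((level_mean(1, b) - grand) ** 2 for b in levels_b)
# Between-cell variation; the interaction is what the main effects leave over
ss_cells = reps * sum((mean(cell) - grand) ** 2 for cell in data.values())
ss_ab = ss_cells - ss_a - ss_b
# Within-cell (error) variation and total variation
ss_error = sum((v - mean(cell)) ** 2 for cell in data.values() for v in cell)
ss_total = sum((v - grand) ** 2 for v in all_values)

df_a, df_b = len(levels_a) - 1, len(levels_b) - 1
df_ab = df_a * df_b
df_error = len(all_values) - len(data)
ms_error = ss_error / df_error

f_a = (ss_a / df_a) / ms_error
f_b = (ss_b / df_b) / ms_error
f_ab = (ss_ab / df_ab) / ms_error

f_critical = 5.318  # F(0.05; 1, 8) from an F table
for name, f in [("method", f_a), ("session", f_b), ("interaction", f_ab)]:
    print("%-12s F = %7.3f  significant: %s" % (name, f, f > f_critical))
```

For this made-up data both main effects exceed the critical value while the interaction does not.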

8. Let us consider a dataset where we have a value of response y for every feature x:

Generate a regression line for this sample data using Python.

PROGRAM:

import numpy as np
import matplotlib.pyplot as plt

def estimate_coef(x, y):
    # number of observations/points
    n = np.size(x)
    # mean of x and y vector
    m_x = np.mean(x)
    m_y = np.mean(y)
    # calculating cross-deviation and deviation about x
    SS_xy = np.sum(y*x) - n*m_y*m_x
    SS_xx = np.sum(x*x) - n*m_x*m_x
    # calculating regression coefficients
    b_1 = SS_xy / SS_xx
    b_0 = m_y - b_1*m_x
    return (b_0, b_1)

def plot_regression_line(x, y, b):
    # actual points as a scatter plot
    plt.scatter(x, y, color="m", marker="o", s=30)
    # predicted response vector
    y_pred = b[0] + b[1]*x
    # regression line
    plt.plot(x, y_pred, color="g")
    plt.xlabel('x')
    plt.ylabel('y')
    plt.show()

def main():
    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])
    b = estimate_coef(x, y)
    print("Estimated coefficients:\nb_0 = {}\nb_1 = {}".format(b[0], b[1]))
    plot_regression_line(x, y, b)

if __name__ == "__main__":
    main()

OUTPUT:
Estimated coefficients:
b_0 = 1.2363636363636363
b_1 = 1.1696969696969697
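
How well the fitted line describes the data can be checked with the coefficient of determination. A minimal sketch, assuming the coefficients printed above and plain Python lists in place of NumPy arrays; the variable names are illustrative.

```python
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [1, 3, 2, 5, 7, 8, 8, 9, 10, 12]
b_0, b_1 = 1.2363636363636363, 1.1696969696969697   # coefficients printed above

y_pred = [b_0 + b_1 * xi for xi in x]
m_y = sum(y) / len(y)

ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))   # unexplained variation
ss_tot = sum((yi - m_y) ** 2 for yi in y)                   # total variation
r_squared = 1 - ss_res / ss_tot

print("R^2 = %.4f" % r_squared)
```

An R^2 of roughly 0.95 means the line accounts for about 95% of the variation in y.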

9. Import scipy and draw the line of Linear Regression for the following data:

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]

y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

Where the x-axis represents age, and the y-axis represents speed. We have registered the
age and speed of 13 cars as they were passing a tollbooth.

PROGRAM:

import matplotlib.pyplot as plt

from scipy import stats

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]

y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

slope, intercept, r, p, std_err = stats.linregress(x, y)

def myfunc(x):
    # predicted speed for a car of age x
    return slope * x + intercept

mymodel = list(map(myfunc, x))

plt.scatter(x, y)

plt.plot(x, mymodel)

plt.xlabel('Age')

plt.ylabel('Speed Of Cars')

plt.show()

OUTPUT:
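
Beyond drawing the line, the fit can be used to predict a speed and to judge the strength of the relationship. A minimal sketch that recomputes the slope, intercept and correlation coefficient by hand so it runs without SciPy; these correspond to the slope, intercept and r values returned by stats.linregress. The age of 10 years is a hypothetical input chosen for illustration.

```python
import math

x = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6]               # age of each car
y = [99, 86, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86]   # speed of each car

n = len(x)
m_x, m_y = sum(x) / n, sum(y) / n
ss_xy = sum((xi - m_x) * (yi - m_y) for xi, yi in zip(x, y))
ss_xx = sum((xi - m_x) ** 2 for xi in x)
ss_yy = sum((yi - m_y) ** 2 for yi in y)

slope = ss_xy / ss_xx
intercept = m_y - slope * m_x
r = ss_xy / math.sqrt(ss_xx * ss_yy)   # Pearson correlation coefficient

age = 10   # hypothetical car age
predicted_speed = slope * age + intercept

print("slope = %.3f, intercept = %.3f, r = %.3f" % (slope, intercept, r))
print("Predicted speed of a %d-year-old car: %.2f" % (age, predicted_speed))
```

The negative r of about -0.76 indicates that older cars in this sample tend to pass the tollbooth more slowly.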

10. Implement the time series analysis concept for a sample dataset using Pandas.
(Create and use dataset of your own)

Refer to the 12th program.

11. Write a Python program to visualize the time series concepts using Matplotlib.
(Create and use dataset of your own)

Refer to the 12th program.

12. Demonstrate various time series models using Python.(Create and use dataset of your own)

PROGRAM:

import pandas as pd
import matplotlib.pyplot as plt

# load the series with the date column parsed and used as the index
df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/a10.csv',
                 parse_dates=['date'], index_col='date')

# Draw Plot
def plot_df(df, x, y, title="", xlabel='Date', ylabel='Value', dpi=100):
    plt.figure(figsize=(16,5), dpi=dpi)
    plt.plot(x, y, color='tab:red')
    plt.gca().set(title=title, xlabel=xlabel, ylabel=ylabel)
    plt.show()

plot_df(df, x=df.index, y=df.value, title='Monthly anti-diabetic drug sales in Australia from 1992 to 2008.')

OUTPUT:
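
The plot shows the raw series only. Two simple time series models, a moving average and simple exponential smoothing, can be sketched on a small self-made dataset as follows. The monthly figures, the window of 3 and the smoothing factor 0.5 are hypothetical choices; pandas offers Series.rolling(window=3).mean() for the same trailing average.

```python
# Hypothetical monthly sales figures (not taken from the a10 dataset)
sales = [12.0, 13.5, 12.8, 15.1, 16.0, 15.4, 17.2, 18.9, 18.1, 20.3, 21.0, 22.4]

def moving_average(values, window):
    # Mean of each run of `window` consecutive observations
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

def exponential_smoothing(values, alpha):
    # s[0] = y[0]; s[t] = alpha * y[t] + (1 - alpha) * s[t-1]
    smoothed = [values[0]]
    for value in values[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

ma3 = moving_average(sales, window=3)
ses = exponential_smoothing(sales, alpha=0.5)

print("3-month moving average:", [round(v, 2) for v in ma3])
print("Exponential smoothing :", [round(v, 2) for v in ses])
# Simple one-step-ahead forecast: carry the last smoothed level forward
print("Forecast for next month:", round(ses[-1], 2))
```

A larger window or a smaller alpha smooths more heavily, at the cost of reacting more slowly to recent changes.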
