Data Analytics Lab

The document outlines the index of a data analytics lab file submitted by a student, Amit Singh, to their professors at NOIDA INSTITUTE OF ENGINEERING & TECHNOLOGY. It lists topics such as numerical operations, data import/export, matrix operations, statistical analysis, data pre-processing, and simple linear and logistic regression using R/Python. The aims demonstrate how to handle data pre-processing tasks, fit regression models, and evaluate their performance on test data.


NOIDA INSTITUTE OF ENGINEERING & TECHNOLOGY,

GREATER NOIDA

Department of Information Technology

LAB FILE
ON
DATA ANALYTICS LAB
KIT-651
(6th Semester)
(2020 – 2021)

Submitted To:
Ms. Tanya
Dr. Vivek Kumar

Submitted by:
Name: Amit Singh
Roll: 1813313019

Affiliated to Dr. A.P.J Abdul Kalam Technical University, Uttar Pradesh, Lucknow.
DATA ANALYTICS LAB
KIT-651
INDEX

S.NO | TOPIC | DATE | GRADE | SIGNATURE
1 | To get the input from user and perform numerical operations (MAX, MIN, AVG, SUM, SQRT, ROUND) using R/Python. | | |
2 | To perform data import/export (.CSV, .XLS, TXT) operations using data frames in R/Python. | | |
3 | To get the input matrix from user and perform Matrix addition, subtraction, multiplication, inverse, transpose and division operations using vector concept in R/Python. | | |
4 | To perform statistical operations (Mean, Median, Mode and Standard deviation) using R/Python. | | |
5 | To perform data pre-processing operations i) Handling Missing data ii) Min-Max normalization. | | |
6 | To perform Simple Linear Regression with R/Python. | | |
7 | To perform Simple Logistic Regression with R/Python. | | |

Aim - 1. To get the input from user and perform numerical operations (MAX,
MIN, AVG, SUM, SQRT, ROUND) using R/Python.

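import math

# Read n integers from the user into a list
list1 = []
n = int(input("Enter number of elements : "))
for i in range(n):
    ele = int(input())
    list1.append(ele)

print("Sum =", sum(list1))
print("Maximum element =", max(list1))
print("Minimum element =", min(list1))
# SQRT is demonstrated on the second element entered, so at least two elements are needed
print("Square root =", math.sqrt(list1[1]))
# ROUND is demonstrated on a fixed literal, not on the user's input
print("Round =", round(5.56))
print("Average =", sum(list1) / len(list1))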

OUTPUT: -
Enter number of elements : 5
1
6
2
8
7
Sum = 24
Maximum element = 8
Minimum element = 1
Square root = 2.449489742783178
Round = 6
Average = 4.8
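
The listing above applies SQRT to a single element and ROUND to a fixed literal. A minimal sketch of one way to apply all six operations to the entered values themselves is given here. The function name `summarize` and the choice to round the average to one decimal place are assumptions made for illustration and are not part of the original lab file.

```python
import math

def summarize(values):
    """Return the six lab operations for a non-empty list of non-negative numbers."""
    avg = sum(values) / len(values)
    return {
        "max": max(values),
        "min": min(values),
        "sum": sum(values),
        "avg": avg,
        "sqrt": [math.sqrt(v) for v in values],  # square root of every element
        "round_avg": round(avg, 1),              # assumption: round the average to 1 decimal
    }

# Same five values as in the recorded output
result = summarize([1, 6, 2, 8, 7])
print(result["sum"], result["max"], result["min"], result["avg"])
```

Returning a dictionary keeps the computation separate from the printing, so the same function can be reused on values read with `input()`.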
Aim - 2. To perform data import/export (.CSV, .XLS, TXT) operations using
data frames in R/Python.

from google.colab import drive


drive.mount("/content/drive")

import pandas as pd
df = pd.read_csv('/content/drive/MyDrive/Da-Lab/ITUR_rain1.csv')

print(df.Frequency)

OUTPUT: -

0 1.0
1 1.5
2 2.0
3 2.5
4 3.0
...
99 96.0
100 97.0
101 98.0
102 99.0
103 100.0
Name: Frequency, Length: 104, dtype: float64
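
The aim also covers export, while the listing above only reads a CSV file. A minimal sketch of a round trip (export, then import) with pandas could look like the following. The sample data, the file names and the temporary directory are all hypothetical; the lab's ITUR_rain1.csv is not reproduced here.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data standing in for the lab's CSV file
df = pd.DataFrame({"Frequency": [1.0, 1.5, 2.0], "Value": [10, 20, 30]})

out_dir = tempfile.mkdtemp()
csv_path = os.path.join(out_dir, "sample.csv")
txt_path = os.path.join(out_dir, "sample.txt")

# Export: CSV (comma-separated) and TXT (tab-separated), without the index column
df.to_csv(csv_path, index=False)
df.to_csv(txt_path, sep="\t", index=False)

# Import the files back into data frames
df_csv = pd.read_csv(csv_path)
df_txt = pd.read_csv(txt_path, sep="\t")

print(df_csv.equals(df))
print(df_txt.equals(df))
```

Excel files follow the same pattern with `df.to_excel(...)` and `pd.read_excel(...)`, which additionally require an Excel engine such as openpyxl to be installed.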
Aim - 3. To get the input matrix from user and perform Matrix addition,
subtraction, multiplication, inverse, transpose and division operations using
vector concept in R/Python.

import numpy

# Read matrix 1 row by row
r = int(input("Enter no of rows of matrix1 "))
c = int(input("Enter no of columns of matrix1 "))
m = []
print("Enter elements")
for i in range(r):
    a = []
    for j in range(c):
        a.append(int(input()))
    m.append(a)

# Read matrix 2 row by row
r1 = int(input("Enter the number of rows of matrix 2 "))
c1 = int(input("Enter the number of columns of matrix 2 "))
m1 = []
print("Enter elements")
for i in range(r1):
    a1 = []
    for j in range(c1):
        a1.append(int(input()))
    m1.append(a1)

# Element-wise sum (assumes both matrices have the same shape)
m2 = []
for i in range(r):
    a3 = []
    for j in range(c):
        a3.append(m[i][j] + m1[i][j])
    m2.append(a3)
print("Sum of matrix is:")
for i in range(r):
    for j in range(c):
        print(m2[i][j], end=" ")
    print()

# Matrix product (assumes columns of matrix 1 == rows of matrix 2); the result is r x c1
pm = []
for i in range(r):
    sm = []
    for j in range(c1):
        s = 0
        for k in range(c):
            s = s + m[i][k] * m1[k][j]
        sm.append(s)
    pm.append(sm)
print("Product of matrix:")
for i in range(r):
    for j in range(c1):
        print(pm[i][j], end=" ")
    print()

print("Transpose of multiplication matrix is :")
print(numpy.transpose(pm))

OUTPUT: -

Enter no of rows of matrix1 2
Enter no of columns of matrix1 2
Enter elements
1
2
3
4
Enter the number of rows of matrix 2 2
Enter the number of columns of matrix 2 2
Enter elements
4
5
6
7
Sum of matrix is:
5 7
9 11
Product of matrix:
16 19
36 43
Transpose of multiplication matrix is :
[[16 36]
 [19 43]]
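
The aim also lists subtraction, inverse and division, which the listing above does not cover. A minimal sketch of these operations with NumPy arrays follows, using the same two matrices as the recorded run. Treating matrix "division" as multiplication by the inverse is an assumption here, since the lab file does not say which meaning is intended; element-wise division is shown as well.

```python
import numpy as np

# Same example matrices as in the recorded run
A = np.array([[1, 2], [3, 4]], dtype=float)
B = np.array([[4, 5], [6, 7]], dtype=float)

difference = A - B            # element-wise subtraction
quotient = A / B              # element-wise division (B has no zero entries)
transpose_A = A.T             # transpose
inverse_A = np.linalg.inv(A)  # inverse; A must be square and non-singular

# Assumption: "division" in the matrix sense, taken as A times the inverse of B
A_times_B_inv = A @ np.linalg.inv(B)

print(difference)
print(inverse_A)
```

`np.linalg.inv` raises `LinAlgError` for a singular matrix, so a real program would check the determinant or catch that exception before dividing.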
Aim - 4. To perform statistical operations (Mean, Median, Mode and Standard
deviation) using R/Python.

import statistics as st

# Read n integers from the user into a list
lst = []
n = int(input("Enter number of elements : "))
for i in range(n):
    ele = int(input())
    lst.append(ele)

print("Mean value is:", st.mean(lst))
print("Median is:", st.median(lst))
print("Mode value is :", st.mode(lst))
# stdev is the sample standard deviation; st.pstdev gives the population value
print("Standard deviation is :", st.stdev(lst))

OUTPUT: -

Enter number of elements : 5
1
2
3
4
5
Mean value is: 3
Median is: 3
Mode value is : 1
Standard deviation is : 1.5811388300841898
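
Python's `statistics` module has two standard deviation functions, and they give different results on small samples. The short sketch below compares them on the values 1 to 5; it is an illustration only, and rounding to three decimals is an arbitrary choice.

```python
import statistics as st

data = [1, 2, 3, 4, 5]

sample_sd = st.stdev(data)       # sample standard deviation, divides by n - 1
population_sd = st.pstdev(data)  # population standard deviation, divides by n

print(round(sample_sd, 3), round(population_sd, 3))  # prints: 1.581 1.414
```

Which one is appropriate depends on whether the entered values are treated as a sample or as the whole population.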
Aim - 5. To perform data pre-processing operations i) Handling Missing data
ii) Min-Max normalization.

import pandas as pd
import numpy as np
df = pd.read_csv("/content/drive/MyDrive/Da-Lab/titanic.csv")
df.head()

df.drop(['PassengerId','Name','SibSp','Parch','Ticket','Cabin','Embarked'],axis='columns',inplace=True)
df.head()
target = df.Survived
inputs = df.drop('Survived',axis='columns')

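# One-hot encode the Sex column
dummies = pd.get_dummies(inputs.Sex)
dummies.head(3)

inputs = pd.concat([inputs,dummies],axis='columns')
inputs.head(3)

# Drop the original Sex column and one dummy column, since the other dummy carries the same information
inputs.drop(['Sex','male'],axis='columns',inplace=True)
inputs.head(3)

# List the columns that still contain missing values
inputs.columns[inputs.isna().any()]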

OUTPUT: -

Index(['Age'], dtype='object')

# Handle missing data: fill missing Age values with the column mean
inputs.Age = inputs.Age.fillna(inputs.Age.mean())
inputs.head()

inputs.Age[:10]

OUTPUT: -

0 22.000000
1 38.000000
2 26.000000
3 35.000000
4 35.000000
5 29.699118
6 54.000000
7 2.000000
8 27.000000
9 14.000000
Name: Age, dtype: float64
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(inputs,target,test_size=0.3)

from sklearn.naive_bayes import GaussianNB
model = GaussianNB()

model.fit(X_train,y_train)

OUTPUT: -
GaussianNB(priors=None, var_smoothing=1e-09)

model.score(X_test,y_test)

OUTPUT: -

0.7574626865671642

model.predict(X_test[0:10])

OUTPUT: -

array([0, 1, 1, 1, 0, 1, 1, 0, 0, 1])
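
Part ii) of the aim, Min-Max normalization, rescales each numeric column to the range [0, 1] with x' = (x - min) / (max - min). A minimal sketch with pandas follows. The values are made up to stand in for the Age and Fare columns and are not taken from titanic.csv.

```python
import pandas as pd

# Hypothetical values standing in for the Age and Fare columns
df = pd.DataFrame({
    "Age": [22.0, 38.0, 26.0, 35.0, 54.0],
    "Fare": [7.25, 71.28, 7.92, 53.10, 51.86],
})

# Min-Max normalization, applied column by column: x' = (x - min) / (max - min)
normalized = (df - df.min()) / (df.max() - df.min())

print(normalized)
```

A column whose values are all equal would make the denominator zero and produce NaN, so such columns need separate handling. scikit-learn's `MinMaxScaler` performs the same rescaling and is the usual choice when the scaling learned on the training split must be reused on the test split.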
Aim - 6. To perform Simple Linear Regression with R/Python.

import numpy as np 

import pandas as pd
import matplotlib.pyplot as plt

from google.colab import files
uploaded = files.upload()

data = pd.read_csv("area.csv")
X = data.Area.values.astype(float)

y = data.Price.values.astype(float)

plt.scatter(X,y)
plt.xlabel("Area")
plt.ylabel("Price")
plt.show()
from sklearn import linear_model

# Fit Price as a linear function of Area
reg = linear_model.LinearRegression()
reg.fit(data[['Area']],data.Price)

OUTPUT: -

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)

reg.predict([[100]])

OUTPUT: -

array([9229.8328887])

reg.coef_

OUTPUT: -

array([40.46056658])

reg.intercept_

OUTPUT: -

5183.7762302371

# Check by hand: coefficient * 100 + intercept should match reg.predict([[100]])
round(40.46056658*100+5183.7762302371, 2)

OUTPUT: -

9229.83
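
For simple linear regression the coefficient and intercept have a closed form: the slope is the covariance of x and y divided by the variance of x, and the intercept is mean(y) minus slope times mean(x). A minimal sketch in plain Python follows. The function name `fit_line` and the data are hypothetical; area.csv is not reproduced here, and the points are chosen to lie exactly on price = 40 * area + 5000.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on price = 40 * area + 5000
areas = [50.0, 100.0, 150.0, 200.0]
prices = [7000.0, 9000.0, 11000.0, 13000.0]

slope, intercept = fit_line(areas, prices)
prediction = slope * 100 + intercept  # same form as coef_ * x + intercept_

print(slope, intercept, prediction)
```

The last line mirrors the manual check on the fitted model: a prediction is the coefficient times the input plus the intercept.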
Aim - 7. To perform Simple Logistic Regression with R/Python.
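
A minimal sketch of simple logistic regression follows, assuming scikit-learn's `LogisticRegression` with default settings, in the same style as the linear regression aim. The data set (hours studied against pass/fail) and all variable names are made up for illustration and do not come from the lab file.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: hours studied (single feature) and pass (1) / fail (0)
hours = np.array([[1], [2], [3], [4], [5], [6], [7], [8]], dtype=float)
passed = np.array([0, 0, 0, 0, 1, 1, 1, 1])

model = LogisticRegression()
model.fit(hours, passed)

# Predicted class labels and the probability of class 1 for new inputs
new_hours = np.array([[1.0], [8.0]])
labels = model.predict(new_hours)
probabilities = model.predict_proba(new_hours)[:, 1]

print(labels)
print(model.coef_, model.intercept_)
```

With a real data set the rows would first be split with `train_test_split`, as in Aim 5, so that `model.score` can be evaluated on data the model has not seen.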
