Iot Da1
ECE3502
IoT Domain Analyst Lab
TASK-1
Theory:
Exploratory Data Analysis (EDA) is the stage of data analysis used to gain
a better understanding of aspects of the data such as:
1. the main features of the data
2. the variables and the relationships that hold between them
3. which variables are important for our problem
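The three aspects listed above map onto a few standard pandas calls. The sketch below is only illustrative: the small frame `toy` and its values are made up for the example and are not taken from the lab dataset, and ranking by absolute correlation is just one simple way to judge which variables matter.

```python
import pandas as pd

# Hypothetical toy data (made-up values, not the lab dataset)
toy = pd.DataFrame({
    "Glucose": [85, 120, 99, 150, 110],
    "BMI": [22.0, 30.5, 25.1, 35.2, 28.4],
    "Outcome": [0, 1, 0, 1, 1],
})

# 1. Main features of the data: size, column types, summary statistics
print(toy.shape)
print(toy.dtypes)
summary = toy.describe()
print(summary)

# 2. Relationships between variables: pairwise Pearson correlation
corr = toy.corr()
print(corr)

# 3. Which variables matter for the problem: rank the features by the
#    absolute value of their correlation with the target column
importance = corr["Outcome"].drop("Outcome").abs().sort_values(ascending=False)
print(importance)
```

On a real dataset the same calls apply unchanged once the CSV has been read into a DataFrame.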
Program:
#20BEC0174 - Keerthi Uppalapati
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn import preprocessing
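# Forward slashes avoid invalid escape sequences such as '\k' in the Windows path
df = pd.read_csv('C:/Users/keert/Downloads/pima-indians-diabetes.csv')
threshold=90
# Binarize Glucose: 1 where the value exceeds the threshold, 0 otherwise
df2 = (df['Glucose'] > threshold).astype(int)
print(df2)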
Output:
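The thresholding step in the program above can be checked on a handful of values without the CSV file. This is a minimal sketch: the readings in `glucose` are made up, and only the threshold of 90 is taken from the program.

```python
import pandas as pd

# Hypothetical glucose readings (made-up values, not the lab dataset)
glucose = pd.Series([85, 90, 91, 148, 0], name="Glucose")
threshold = 90  # same threshold as in the program above

# Strict comparison: a value equal to the threshold maps to 0
binary = (glucose > threshold).astype(int)
print(binary.tolist())  # [0, 0, 1, 1, 0]

# Number of rows that fall into each class
counts = binary.value_counts().sort_index()
print(counts)
```

The reading of exactly 90 maps to 0 because the comparison is strictly greater-than. scikit-learn's `preprocessing.Binarizer` follows the same convention, which may be why `preprocessing` is imported in the program.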
(c) Standardization (Dataset: pima-indians-diabetes.csv)
Program:
#20BEC0174 - Keerthi Uppalapati
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn import preprocessing
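# Forward slashes avoid invalid escape sequences such as '\k' in the Windows path
df = pd.read_csv('C:/Users/keert/Downloads/pima-indians-diabetes.csv')
df2=df.copy()
m=df['Glucose'].mean()
s=df['Glucose'].std()  # sample standard deviation (pandas default, ddof=1)
# z-score: df['column'] = (df['column'] - df['column'].mean()) / df['column'].std()
df2['Glucose']=(df['Glucose']-m)/s
print(df['Glucose'])   # original values
print(df2['Glucose'])  # standardized values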
Output:
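A quick way to confirm that the standardization worked is to check that the result has mean 0 and standard deviation 1. The sketch below does this on made-up readings (the values in `glucose` are hypothetical) and also shows the population standard deviation for comparison.

```python
import pandas as pd

# Hypothetical glucose readings (made-up values, not the lab dataset)
glucose = pd.Series([80.0, 100.0, 120.0, 140.0, 160.0], name="Glucose")

m = glucose.mean()
s = glucose.std()             # sample standard deviation (ddof=1, pandas default)
s_pop = glucose.std(ddof=0)   # population standard deviation, for comparison

# z-score: subtract the mean, divide by the standard deviation
z = (glucose - m) / s
print(z.round(3).tolist())

# After standardization the column should have mean ~0 and std ~1
print(abs(z.mean()) < 1e-9, abs(z.std() - 1.0) < 1e-9)
```

Dividing by `s_pop` instead, as scikit-learn's `StandardScaler` does, gives slightly larger z-scores on small samples. On a dataset with several hundred rows the difference is negligible.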
Program:
#20BEC0174 - Keerthi Uppalapati
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn import preprocessing
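# Forward slashes avoid invalid escape sequences such as '\k' in the Windows path
df = pd.read_csv('C:/Users/keert/Downloads/pima-indians-diabetes.csv')
# Rows with a missing Glucose value keep the default label
df["label"] = "default_label"
df.loc[df["Glucose"] > 100, "label"] = "diabetic"
df.loc[df["Glucose"] <= 100, "label"] = "Not diabetic"
print(df)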
Output:
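The labelling rule in the program above can also be written in one step with `numpy.where`. The sketch below compares the two approaches on made-up readings; the frame `toy` and the column name `label_where` are hypothetical, introduced only for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical readings, including one missing value (made-up data)
toy = pd.DataFrame({"Glucose": [85.0, 100.0, 101.0, 148.0, np.nan]})

# Approach used in the program above: default label, then two masks
toy["label"] = "default_label"
toy.loc[toy["Glucose"] > 100, "label"] = "diabetic"
toy.loc[toy["Glucose"] <= 100, "label"] = "Not diabetic"

# One-step alternative: NaN > 100 is False, so a missing reading
# falls into the "Not diabetic" branch here
toy["label_where"] = np.where(toy["Glucose"] > 100, "diabetic", "Not diabetic")
print(toy)
```

The two columns agree except on the missing reading: the mask-based version leaves it as `default_label`, which makes missing values easier to spot, while `np.where` silently assigns it to the second class.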
Conclusion:
Exploratory Data Analysis (EDA), as carried out above, helps us look
beyond the raw values in the data. The more we explore the data, the more
insights we can draw from it. As data analysts, a large share of our time,
often quoted as about 80%, is spent understanding data and solving business
problems through EDA.
Signature of student