Aids Lab
AIM :
INSTALLATION PROCEDURE:
Python is a great language for doing data analysis, primarily because of the fantastic
ecosystem of data-centric Python packages. Pandas is one of those packages, and
makes importing and analyzing data much easier.
Step 1: To ensure that the system is updated and the necessary packages are installed, open a
terminal window and type the following commands:
Step 2: Install Python 3 and check the installed version:
$ sudo apt install python3
$ python3 --version
RESULT:
Downloaded by hanipriya nirmal ([email protected])
lOMoARcPSD|57176504
AIM:
To perform exploratory data analysis (EDA) on datasets such as an email dataset. Export all
your emails as a dataset, import them into a pandas DataFrame, visualize them, and get
different insights from the data.
ALGORITHM:
Step 1: Import necessary libraries
Step 2: Create a simple example dataset
Step 3: Display the dataset
Step 4: Check for missing values
Step 5: Visualize the distribution of email lengths, counts, and common words
Step 6: Stop
PROGRAM:
print(email_data.isnull().sum())
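Only one line of the original program survives above. The algorithm steps can be sketched as follows; this is a minimal illustration, not the original listing. The four sample emails and the column names `sender`, `subject`, and `body` are assumptions made up for the example, and the visualization step is replaced by printed summaries so that the sketch needs only pandas.

```python
from collections import Counter

import pandas as pd

# Step 2: a small, made-up example dataset (hypothetical emails)
email_data = pd.DataFrame({
    "sender": ["alice@example.com", "bob@example.com",
               "alice@example.com", "carol@example.com"],
    "subject": ["Meeting today", "Lab report", None, "Meeting notes"],
    "body": ["Please join the meeting at noon",
             "The lab report is attached",
             "Reminder about the meeting",
             "Notes from the meeting are attached"],
})

# Step 3: display the dataset
print(email_data)

# Step 4: check for missing values in each column
missing_counts = email_data.isnull().sum()
print(missing_counts)

# Step 5: email lengths, emails per sender, and most common words
email_data["length"] = email_data["body"].str.len()
emails_per_sender = email_data["sender"].value_counts()
words = " ".join(email_data["body"]).lower().split()
common_words = Counter(words).most_common(3)

print(email_data["length"].describe())
print(emails_per_sender)
print(common_words)
```

With Matplotlib installed, `email_data["length"].plot(kind="hist")` would turn the length summary into a histogram.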
OUTPUT:
RESULT:
Thus, exploratory data analysis (EDA) on an email dataset was performed successfully.
AIM:
To work with NumPy arrays, Pandas data frames, and create basic plots using
Matplotlib.
PROGRAM
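import numpy as np
# Create two NumPy arrays
arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([6, 7, 8, 9, 10])
# Basic operations on arrays
sum_array = arr1 + arr2
product_array = arr1 * arr2
mean_value = np.mean(arr1)
# Displaying array contents and results
print("Array 1:", arr1)
print("Array 2:", arr2)
print("Sum of arrays:", sum_array)
print("Product of arrays:", product_array)
print("Mean of Array 1:", mean_value)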
OUTPUT
Array 1: [1 2 3 4 5]
Array 2: [ 6  7  8  9 10]
Sum of arrays: [ 7  9 11 13 15]
Product of arrays: [ 6 14 24 36 50]
Mean of Array 1: 3.0
RESULT:
Thus, we worked with NumPy arrays, Pandas data frames, and basic plots using Matplotlib.
AIM:
To work with Pandas data frames, using Matplotlib.
PROGRAM
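import pandas as pd
# Creating a Pandas DataFrame from a dictionary
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emily'],
        'Age': [25, 30, 22, 35, 28],
        'City': ['New York', 'San Francisco', 'Los Angeles', 'Chicago', 'Boston']}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
# Average age and the row(s) with the minimum age
average_age = df['Age'].mean()
youngest_person = df[df['Age'] == df['Age'].min()]
print("\nAverage Age:", average_age)
print("\nYoungest Person:")
print(youngest_person)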
OUTPUT:
Original DataFrame:
      Name  Age           City
0    Alice   25       New York
1      Bob   30  San Francisco
2  Charlie   22    Los Angeles
3    David   35        Chicago
4    Emily   28         Boston

Average Age: 28.0

Youngest Person:
      Name  Age         City
2  Charlie   22  Los Angeles
RESULT:
Thus, working with Pandas data frames was performed successfully.
AIM:
To create basic plots using Matplotlib.
PROGRAM
import matplotlib.pyplot as plt
# Example data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
# Line plot
plt.figure(figsize=(8, 6))
plt.plot(x, y, label='Line Plot', marker='o', linestyle='-', color='blue')
plt.title('Simple Line Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.legend()
plt.grid(True)
plt.show()
# Scatter plot
plt.figure(figsize=(8, 6))
plt.scatter(x, y, label='Scatter Plot', color='red', marker='x')
plt.title('Simple Scatter Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.legend()
plt.grid(True)
plt.show()
OUTPUT:
1. Line Plot:
2. Scatter Plot:
RESULT:
Thus, creating basic plots using Matplotlib was performed successfully.
AIM:
To customize plots using Matplotlib.
PROGRAM
import numpy as np
import matplotlib.pyplot as plt
# Generate data
x = np.linspace(0, 10, 100)
y1 = np.sin(x)
y2 = np.cos(x)
# Create a figure and axis
fig, ax = plt.subplots(figsize=(8, 6))
# Plot the data with custom styles
ax.plot(x, y1, label='Sine Wave', color='blue', linestyle='--', linewidth=2)
ax.plot(x, y2, label='Cosine Wave', color='red', linestyle='-', linewidth=2)
# Customize axes labels and title
ax.set_xlabel('X-axis')
ax.set_ylabel('Y-axis')
ax.set_title('Customized Sine and Cosine Waves')
# Add a legend
ax.legend()
# Add grid
ax.grid(True, linestyle='--', alpha=0.7)
# Show the plot
plt.show()
OUTPUT:
RESULT:
Thus, customizing plots using Matplotlib was performed successfully.
AIM:
To explore various variable and row filtering techniques in R for cleaning data, and then
apply different plot features on a sample dataset.
PROGRAM
import pandas as pd
# Load a sample dataset (e.g., Iris dataset)
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
column_names = ["sepal_length", "sepal_width", "petal_length", "petal_width", "class"]
df = pd.read_csv(url, header=None, names=column_names)
# Display the first few rows of the dataset
print("Sample Dataset:")
print(df.head())
OUTPUT:
Sample Dataset:
   sepal_length  sepal_width  petal_length  petal_width  class
RESULT:
Thus, various variable and row filtering techniques in R for cleaning data were explored,
and different plot features were applied on a sample dataset successfully.
AIM:
To explore and understand the underlying patterns, trends, and relationships within the
dataset.
PROGRAM
import pandas as pd
# Load a sample dataset (e.g., Iris dataset)
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
column_names = ["sepal_length", "sepal_width", "petal_length", "petal_width", "class"]
df = pd.read_csv(url, header=None, names=column_names)
# Display basic information about the dataset
print("Dataset Information:")
print(df.info())
# Display summary statistics
print("\nSummary Statistics:")
print(df.describe())
# Display the first few rows of the dataset
print("\nFirst Few Rows of the Dataset:")
print(df.head())
# Display unique classes in the 'class' column
print("\nUnique Classes:")
print(df['class'].unique())
OUTPUT:
Dataset Information:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 5 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   sepal_length  150 non-null    float64
 1   sepal_width   150 non-null    float64
 2   petal_length  150 non-null    float64
 3   petal_width   150 non-null    float64
 4   class         150 non-null    object
dtypes: float64(4), object(1)
memory usage: 6.0+ KB
None

Summary Statistics:
       sepal_length  sepal_width  petal_length  petal_width
count    150.000000   150.000000    150.000000   150.000000
mean       5.843333     3.054000      3.758667     1.198667
std        0.828066     0.433594      1.764420     0.763161
min        4.300000     2.000000      1.000000     0.100000
25%        5.100000     2.800000      1.600000     0.300000
50%        5.800000     3.000000      4.350000     1.300000
75%        6.400000     3.300000      5.100000     1.800000
max        7.900000     4.400000      6.900000     2.500000

First Few Rows of the Dataset:
   sepal_length  sepal_width  petal_length  petal_width  class

Unique Classes:
['Iris-setosa' 'Iris-versicolor' 'Iris-virginica']
RESULT:
Thus, the underlying patterns, trends, and relationships within the dataset were explored
and understood successfully.
AIM:
To apply variable filters in order to refine data based on specific conditions or criteria.
PROGRAM
import pandas as pd
# Create a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 22], 'City': ['New York', 'San Francisco', 'Los Angeles']}
df = pd.DataFrame(data)
# Display the original DataFrame
print("Original DataFrame:")
print(df)
# Filter specific variables (columns)
selected_columns = ['Name', 'City']
filtered_df = df[selected_columns]
# Display the filtered DataFrame
print("\nFiltered DataFrame:")
print(filtered_df)
OUTPUT:
Original DataFrame:
      Name  Age           City
0    Alice   25       New York
1      Bob   30  San Francisco
2  Charlie   22    Los Angeles

Filtered DataFrame:
      Name           City
0    Alice       New York
1      Bob  San Francisco
2  Charlie    Los Angeles
RESULT:
Filtered data will display only the values or records that meet the defined criteria, improving
data analysis and decision-making.
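Besides naming columns explicitly, variables can also be filtered by a condition on their type or name. The following is a minimal sketch on the same sample data; the variable names `numeric_df`, `text_df`, and `pattern_df` and the pattern `e$` are chosen for illustration only.

```python
import pandas as pd

# Same sample data as in the program above
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 22],
        'City': ['New York', 'San Francisco', 'Los Angeles']}
df = pd.DataFrame(data)

# Keep only the numeric variables
numeric_df = df.select_dtypes(include='number')

# Keep only the non-numeric variables
text_df = df.select_dtypes(exclude='number')

# Keep variables whose name matches a pattern (here: names ending in 'e')
pattern_df = df.filter(regex='e$')

print(numeric_df.columns.tolist())
print(text_df.columns.tolist())
print(pattern_df.columns.tolist())
```

Selecting by type or pattern avoids hard-coding column names, which helps when a dataset has many variables.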
PROGRAM
import pandas as pd
# Sample data
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'], 'Age': [20, 22, 21, 23], 'Grade': [85, 92, 78, 95]}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
OUTPUT:
Original DataFrame:
1      Bob   22     92
2  Charlie   21     78
3    David   23     95
Result:
The dataset will display only the rows that meet the filter criteria, improving focus and analysis
efficiency.
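The listing above ends after printing the original DataFrame, so the filter itself is not shown. A minimal sketch of row filtering on the same sample data follows; the conditions `Age > 20` and `Grade >= 90` are assumptions chosen for illustration, not taken from the original program.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [20, 22, 21, 23],
        'Grade': [85, 92, 78, 95]}
df = pd.DataFrame(data)

# Keep rows where Age is greater than 20 (boolean mask)
older_df = df[df['Age'] > 20]

# Combine two conditions with & (each condition needs its own parentheses)
top_older_df = df[(df['Age'] > 20) & (df['Grade'] >= 90)]

# The same combined filter written with DataFrame.query
query_df = df.query('Age > 20 and Grade >= 90')

print(older_df)
print(top_older_df)
```

Boolean masks keep the original row index, so the filtered rows can still be traced back to the source DataFrame.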
EX.NO:4.4 DATA CLEANING
Aim:
To identify and correct errors, inconsistencies, and inaccuracies in the dataset to ensure its quality
and reliability.
PROGRAM
import pandas as pd
# Sample data with missing values and duplicates
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Alice'], 'Age': [20, None, 21, 23, 22], 'Grade': [85, 92, 78, 95, 92]}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
OUTPUT:
Original DataFrame:
In this example, we have a DataFrame with a missing value in the 'Age' column and a duplicate entry for the
name 'Alice'.
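One way to handle both problems is sketched below. Filling the missing age with the column mean and keeping the first row per name are assumptions made for illustration; the original program does not show which strategy was used.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Alice'],
        'Age': [20, None, 21, 23, 22],
        'Grade': [85, 92, 78, 95, 92]}
df = pd.DataFrame(data)

# Count missing values per column
print(df.isnull().sum())

# Fill the missing Age with the mean of the known ages
cleaned_df = df.copy()
cleaned_df['Age'] = cleaned_df['Age'].fillna(cleaned_df['Age'].mean())

# Drop repeated names, keeping the first occurrence
cleaned_df = cleaned_df.drop_duplicates(subset='Name', keep='first')

print(cleaned_df)
```

Dropping rows with `df.dropna()` instead of filling them is an equally valid choice when losing a few records is acceptable.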
RESULT:
Thus, the various variable and row filtering techniques in R for cleaning data were explored,
and different plot features were applied on a sample dataset successfully.
EX.NO:5 TIME SERIES
AIM:
To perform time series analysis and apply the various visualization techniques.
DEFINITION
Time series analysis involves analyzing and modeling data collected over time to identify patterns and trends
and to make predictions. Here, I'll demonstrate time series analysis using Python with the pandas,
matplotlib, and seaborn libraries. I'll use a simple example with a synthetic time series dataset.
Firstly, ensure you have the required libraries installed:
bash
pip install pandas matplotlib seaborn
Now, let's create a synthetic time series dataset and perform basic time series analysis:
PROGRAM
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta
# Generate a synthetic time series dataset
np.random.seed(42)
date_today = datetime.now()
days = pd.date_range(date_today, date_today + timedelta(9), freq='D')
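The listing breaks off after the date range is built. A minimal sketch of how such a dataset might be completed is given below. The random-walk values and the column names `Date` and `Value` are assumptions (borrowed from the later listings in this exercise), and a fixed start date replaces `datetime.now()` so the result is reproducible. Plotting is left out so that only pandas and NumPy are needed.

```python
import numpy as np
import pandas as pd

# Ten consecutive days from a fixed (assumed) start date
np.random.seed(42)
days = pd.date_range('2023-01-01', periods=10, freq='D')

# Assumed values: a random walk built from cumulative random steps
values = np.cumsum(np.random.randn(len(days)))
df = pd.DataFrame({'Date': days, 'Value': values}).set_index('Date')

# Basic time series summaries
print(df.head())
print("Day-to-day change:")
print(df['Value'].diff().dropna())
```

With Matplotlib installed, calling `df['Value'].plot()` followed by `plt.show()` would draw the series as a line chart.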
This will generate a synthetic time series dataset and plot it using a line chart.
Now, let's perform some basic time series analysis and visualization techniques:
EX.NO:5.1 TREND ANALYSIS
DEFINITION:
Trend analysis examines data over time to identify long-term patterns, helping discern upward,
downward, or stable directional changes.
PROGRAM
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
plt.xlabel('Date')
plt.ylabel('Value')
plt.legend()
plt.show()
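Most of the listing above is missing. One common way to expose a trend is a moving average, sketched here under stated assumptions: the synthetic series (a linear drift plus noise), the 30-day window, and the names `Value` and `Trend` are all chosen for illustration and are not taken from the original program.

```python
import numpy as np
import pandas as pd

# Assumed synthetic data: upward drift of 0.1 per day plus random noise
np.random.seed(42)
days = pd.date_range('2023-01-01', periods=365, freq='D')
values = 0.1 * np.arange(len(days)) + np.random.randn(len(days))
df = pd.DataFrame({'Date': days, 'Value': values}).set_index('Date')

# A 30-day moving average smooths the noise and reveals the trend
df['Trend'] = df['Value'].rolling(window=30).mean()

# Compare the smoothed value early and late in the year
start_level = df['Trend'].dropna().iloc[0]
end_level = df['Trend'].iloc[-1]
direction = 'upward' if end_level > start_level else 'downward or flat'
print(f"Trend moved from {start_level:.2f} to {end_level:.2f} ({direction})")
```

The first 29 entries of `Trend` are empty because a full 30-day window is not yet available; a shorter window reacts faster but smooths less.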
OUTPUT:
PROGRAM
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Generate a synthetic time series dataset with a clear seasonal component
np.random.seed(42)
date_today = pd.to_datetime('2023-01-01')
days = pd.date_range(date_today, date_today + pd.to_timedelta(365, unit='D'), freq='D')
# Creating a seasonal component
seasonal_component = np.sin(2 * np.pi * np.arange(len(days)) / 365 * 7)
# Generating synthetic data with trend and seasonal components
trend_component = np.cumsum(np.random.randn(len(days)))
values = trend_component + 10 * seasonal_component
df = pd.DataFrame({'Date': days, 'Value': values})
# Plot the original time series data
plt.figure(figsize=(10, 6))
plt.plot(df['Date'], df['Value'], label='Original Data')
plt.title('Original Time Series Data with Seasonal Component')
plt.xlabel('Date')
plt.ylabel('Value')
plt.legend()
plt.show()
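To separate the seasonal swing from the random-walk trend, the series can be detrended with a centered moving average. The sketch below repeats the data generation from the program above; the 52-day window (roughly one seasonal cycle, since the sine completes 7 cycles in 365 days) and the column names `Trend` and `Detrended` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Same synthetic data as in the program above (366 daily values)
np.random.seed(42)
days = pd.date_range('2023-01-01', periods=366, freq='D')
seasonal_component = np.sin(2 * np.pi * np.arange(len(days)) / 365 * 7)
trend_component = np.cumsum(np.random.randn(len(days)))
df = pd.DataFrame({'Date': days,
                   'Value': trend_component + 10 * seasonal_component})

# A centered moving average over about one cycle estimates the trend
df['Trend'] = df['Value'].rolling(window=52, center=True).mean()

# Subtracting the trend leaves mostly the seasonal swing
df['Detrended'] = df['Value'] - df['Trend']

# Correlation between the detrended series and the known seasonal component
corr = df['Detrended'].corr(pd.Series(seasonal_component))
print(f"Correlation with seasonal component: {corr:.2f}")
```

A high correlation suggests that the detrended series is dominated by the seasonal pattern; on real data the true seasonal component is unknown, so the detrended series would be inspected visually instead.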
OUTPUT:
Result:
Thus, time series analysis was performed and the various visualization techniques were applied
successfully.
AIM:
To perform data analysis and representation on a map using various map datasets
with mouse rollover effect, user interaction, etc.
DEFINITION:
Performing data analysis and representation on a map involves visualizing geographic
data in a way that provides insights into spatial patterns. A popular tool for this is the
`folium` library in Python, which allows you to create interactive maps. Below is a basic
example using `folium` with mouse rollover effect and user interaction.
PROGRAM
import folium
# Load a GeoJSON dataset (you can use your own GeoJSON file or find one online)
geojson_data = "https://raw.githubusercontent.com/nvkelso/natural-earth-vector/master/geojson/ne_110m_admin_0_countries.geojson"
# Create a folium map centered at latitude 0, longitude 0
m = folium.Map(location=[0, 0], zoom_start=2)
# Add GeoJSON data with mouse rollover effect
folium.GeoJson(
    geojson_data,
    name='geojson',
    style_function=lambda x: {'fillColor': 'green', 'color': 'black'},
    highlight_function=lambda x: {'fillColor': 'yellow', 'color': 'blue'},
    # 'name' must match a property key in the GeoJSON features; adjust it if your file uses a different key
    tooltip=folium.features.GeoJsonTooltip(fields=['name'], labels=False, sticky=False)
).add_to(m)
# Add layer control for user interaction
folium.LayerControl().add_to(m)
# Save the map to an HTML file
m.save("interactive_map_with_interaction.html")
OUTPUT:
When you run the script, it will create an HTML file (in this case,
"interactive_map_with_interaction.html") in the same directory where you run the script.
Open this HTML file in a web browser to visualize the map with mouse rollover effects.
RESULT:
Thus, data analysis and representation on a map using various map datasets with
mouse rollover effect, user interaction, etc. were performed successfully.
DEFINITION:
Creating cartographic visualizations for multiple datasets involving various countries,
states, or districts often involves combining data with geographical boundaries. Here's a
simplified Python script that uses `folium` to create a visualization with markers for a few
world countries and states in India.
PROGRAM
import folium
# Create a folium map centered around a specific location
m = folium.Map(location=[20.5937, 78.9629], zoom_start=5)
# Add markers for a few countries
folium.Marker([37.7749, -122.4194], popup='USA').add_to(m)
folium.Marker([35.8617, 104.1954], popup='China').add_to(m)
folium.Marker([20.5937, 78.9629], popup='India').add_to(m)
# Add markers for a few Indian states
folium.Marker([19.7515, 75.7139], popup='Maharashtra').add_to(m)
folium.Marker([27.0238, 74.2179], popup='Rajasthan').add_to(m)
# Save the map
m.save("world_and_india_visualization_simple.html")
PROCEDURE:
You need to run the provided code on your local machine to see the output. When you run the script,
it will generate an HTML file named "world_and_india_visualization_simple.html" in
the same directory where you saved the script.
OUTPUT:
RESULT:
DEFINITION:
Exploratory Data Analysis (EDA) is a crucial step in understanding the characteristics of a
dataset. Let's perform EDA on a wine quality dataset. For this example, I'll use the Wine
Quality dataset available in the UCI Machine Learning Repository.
PROGRAM:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Load the Wine Quality dataset
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-white.csv"
wine_data = pd.read_csv(url, sep=';')
# Display the first few rows of the dataset
print(wine_data.head())
# Summary statistics
print(wine_data.describe())
# Correlation heatmap
correlation_matrix = wine_data.corr()
plt.figure(figsize=(10, 8))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', linewidths=0.5)
plt.title('Correlation Heatmap')
plt.show()
# Pairplot for selected features
selected_features = ['fixed acidity', 'volatile acidity', 'citric acid', 'residual sugar', 'chlorides',
                     'quality']
sns.pairplot(wine_data[selected_features], hue='quality', markers='o')
# suptitle places the title above the whole grid rather than on the last subplot
plt.suptitle('Pairplot of Selected Features', y=1.02)
plt.show()
OUTPUT:
RESULT:
AIM:
To use a case study on a dataset, apply the various EDA and visualization
techniques, and present an analysis report.
DEFINITION:
PROGRAM
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Load the dataset (replace 'college_students.csv' with your actual file path)
df = pd.read_csv('college_students.csv')
print(df.head())
# Summary statistics
print(df.describe())
# Univariate Analysis
# Histograms for numerical variables
df.plot(kind='hist', subplots=True, layout=(2, 2), figsize=(12, 10), bins=20,
        title='Histograms')
plt.show()
# Bivariate Analysis
# Pairplot for numerical variables
sns.pairplot(df, hue='Gender', markers=['o', 's'], height=3)
plt.suptitle('Pair Plot of Numerical Variables by Gender', y=1.02)
plt.show()
# Correlation heatmap (numeric columns only, since Gender and Grade are text)
correlation_matrix = df.corr(numeric_only=True)
plt.figure(figsize=(8, 6))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', linewidths=0.5)
plt.title('Correlation Heatmap')
plt.show()
# Insights and Analysis
# You can print and analyze key insights based on the EDA performed
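As one possible way to fill in the insights step, the sketch below computes a few group summaries with pandas. It inlines only the first five data rows of the CSV listing so that it is self-contained; the grouping choices and variable names are assumptions for illustration, and on the full file `pd.read_csv('college_students.csv')` would replace the inline sample.

```python
import io

import pandas as pd

# First five data rows of the CSV listing, inlined so the sketch is self-contained
csv_text = """Age,Gender,GPA,StudyHours,Grade
24,Female,2.76,24,A
21,Male,2.53,25,D
22,Male,3.24,14,D
24,Female,2.77,12,F
20,Female,3.05,21,B
"""
df = pd.read_csv(io.StringIO(csv_text))

# Average GPA and study hours by gender
gender_summary = df.groupby('Gender')[['GPA', 'StudyHours']].mean()
print(gender_summary)

# How many students received each grade
grade_counts = df['Grade'].value_counts().sort_index()
print(grade_counts)

# Correlation between study hours and GPA
gpa_hours_corr = df['StudyHours'].corr(df['GPA'])
print(f"Correlation between StudyHours and GPA: {gpa_hours_corr:.2f}")
```

Five rows are far too few to draw conclusions; the numbers only show the shape of the summaries an analysis report could include.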
CSV file:
Age,Gender,GPA,StudyHours,Grade
24,Female,2.76,24,A
21,Male,2.53,25,D
22,Male,3.24,14,D
24,Female,2.77,12,F
20,Female,3.05,21,B
22,Female,3.62,29,F
22,Female,3.58,13,B
24,Female,2.96,25,F
19,Female,3.31,16,C
20,Female,3.26,22,C
24,Female,3.45,19,C
20,Male,2.88,16,C
20,Female,3.38,23,A
22,Female,3.97,14,C
21,Male,3.23,12,D
20,Female,3.86,20,D
23,Male,3.15,20,A
22,Male,3.03,27,C
19,Female,3.47,24,C
21,Male,3.5,21,C
23,Male,3.8,18,F
23,Male,2.85,19,B
19,Male,3.25,21,F
21,Female,3.36,26,B
22,Male,3.65,15,C
18,Female,2.57,16,C
21,Male,3.99,23,F
19,Male,3.2,22,F
23,Male,2.92,17,B
22,Male,3.83,19,D
21,Female,3.62,18,B
18,Female,3.93,27,F
18,Male,3.0,11,F
20,Male,3.33,14,A
20,Female,3.36,14,F
24,Male,3.97,15,A
19,Male,2.61,28,D
21,Male,2.96,17,B
21,Female,2.79,25,B
24,Female,2.9,22,A
23,Female,3.23,10,B
23,Male,3.06,29,F
24,Male,3.09,26,C
23,Female,3.77,16,A
20,Female,3.9,22,B
21,Female,2.61,13,A
24,Female,2.81,13,A
21,Male,3.51,15,C
18,Female,3.04,28,F
20,Male,2.88,21,A
22,Female,2.94,16,B
20,Male,2.98,19,D
24,Female,3.77,28,A
22,Female,2.7,16,A
18,Female,3.56,12,C
24,Female,3.33,22,F
19,Male,2.94,22,D
21,Female,3.13,27,B
18,Male,2.88,29,D
21,Male,3.42,17,B
23,Male,2.62,18,F
19,Male,2.51,16,B
19,Female,3.44,10,C
18,Male,2.79,12,C
19,Male,2.61,22,C
22,Male,3.1,26,C
19,Female,2.58,10,D
21,Female,3.83,15,F
21,Female,2.54,15,B
24,Female,3.37,21,B
21,Male,3.16,22,C
24,Male,3.51,22,C
21,Female,2.99,24,A
22,Male,2.73,25,F
24,Male,3.97,20,D
20,Male,3.76,14,B
23,Female,3.79,13,A
18,Female,2.88,12,A
21,Male,2.56,28,B
19,Female,2.95,29,D
21,Female,3.31,27,A
19,Female,2.99,24,A
23,Female,3.74,18,F
23,Female,2.91,26,D
23,Male,3.95,23,A
19,Female,3.19,24,D
21,Male,3.76,10,B
23,Male,2.79,12,C
22,Female,3.12,25,A
24,Male,3.55,20,F
19,Male,2.71,21,B
19,Male,2.7,19,D
21,Female,3.95,25,B
19,Female,3.57,17,A
19,Male,2.56,15,D
23,Female,3.1,21,C
21,Female,3.15,17,B
23,Female,3.62,13,A
24,Female,2.88,17,F
24,Male,2.78,27,D
23,Male,2.62,14,B
24,Female,3.14,18,B
21,Female,3.53,13,C
18,Male,2.59,26,C
23,Male,3.87,18,F
22,Female,3.16,10,F
22,Male,2.86,29,A
19,Female,2.64,22,A
24,Female,2.77,25,F
22,Male,3.9,22,F
19,Male,3.46,23,D
18,Female,3.28,12,C
21,Female,3.49,15,A
21,Male,3.15,27,C
21,Female,3.6,28,C
22,Male,2.57,14,F
18,Female,3.35,24,D
22,Male,2.74,11,B
24,Male,2.68,19,D
22,Male,3.01,27,D
18,Female,2.64,22,C
18,Female,2.64,14,D
24,Male,2.97,10,A
18,Female,3.97,10,C
18,Male,2.76,27,A
21,Male,2.53,24,B
24,Female,3.65,26,C
20,Female,3.71,20,B
20,Male,3.02,26,C
18,Female,3.2,22,F
20,Female,3.47,10,D
20,Male,2.57,11,F
18,Male,3.92,18,B
20,Female,3.83,12,D
22,Male,2.89,10,C
19,Male,2.52,25,D
24,Female,3.9,15,A
19,Male,3.25,26,D
18,Male,3.31,14,A
21,Female,3.53,14,D
24,Female,3.42,15,A
18,Male,3.92,12,B
21,Male,3.92,14,F
19,Female,3.8,14,C
18,Female,3.45,19,D
24,Female,3.7,19,F
24,Female,3.52,28,C
23,Male,3.36,26,C
22,Female,2.69,23,A
20,Female,3.72,18,B
21,Male,3.73,23,B
23,Female,3.44,10,F
20,Male,3.73,28,B
20,Female,3.48,22,D
18,Female,2.81,22,B
20,Female,2.91,13,F
22,Male,2.82,10,B
24,Male,3.07,26,D
23,Female,2.56,17,A
20,Male,3.43,11,F
18,Male,3.0,17,A
22,Male,3.48,16,A
19,Male,3.08,11,A
24,Male,3.52,12,C
24,Male,3.01,27,C
23,Female,2.89,21,A
24,Female,3.24,10,F
20,Male,3.54,21,D
18,Female,3.02,14,D
24,Female,3.9,26,B
24,Female,2.56,25,F
19,Female,3.13,24,C
19,Male,3.95,24,A
21,Female,3.32,14,B
22,Female,3.14,23,D
20,Male,3.35,11,C
24,Male,3.36,20,C
24,Female,3.6,28,A
18,Female,2.69,16,D
21,Female,2.88,15,F
22,Female,3.37,11,C
21,Female,3.8,15,A
23,Female,3.34,27,F
22,Male,2.86,11,D
24,Female,3.52,27,C
24,Female,3.61,24,F
22,Male,2.86,28,F
24,Male,3.07,11,F
20,Male,3.3,29,C
22,Male,3.24,15,C
21,Female,3.08,10,B
22,Female,2.95,24,D
24,Female,2.65,19,A
20,Female,2.58,28,F
20,Female,3.94,26,B
23,Female,3.77,14,A
21,Male,3.03,13,B
19,Female,3.94,19,C
19,Male,3.52,26,F
22,Male,3.22,19,A
OUTPUT:
RESULT:
Thus, a case study on a dataset was carried out, the various EDA and visualization
techniques were applied, and an analysis report was presented successfully.