
Dav Practicals

The document outlines various practical exercises involving data analysis and visualization using Python and D3.js. It includes tasks such as calculating correlations, finding means and standard deviations for student scores, visualizing COVID cases, and applying machine learning algorithms like Random Forest and KNN. Additionally, it emphasizes the importance of data visualization techniques such as bar charts, pie charts, and bubble plots to represent data effectively.

Uploaded by

darshandarji1222

Practical-1(A)

Aim: Prepare a synthetic data set of student data consisting of enrollment
number, name, gender, semester-wise and subject-wise marks, difficulty level
of each subject, SPI (Semester Performance Index), and address with
geographical location.

(i) Write a program to find the correlation between gender and semester
marks.
(ii) Write a program to find the correlation between geographical location
and semester marks. Analyze which two are highly correlated.
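
The aim asks for a synthetic data set, but the listing below only reads an existing Data.csv. One possible way to generate such a file is sketched here. The column names Gender, Address, CPI, PDS, ADA, CN, SE and IPDC and the city names are taken from the programs in this document; the Enrollment and Name columns, the mark range, and the rule that derives CPI from the average mark are assumptions made purely for illustration. Semester-wise SPI and per-subject difficulty are left out to keep the sketch short.

```python
import csv
import random

SUBJECTS = ["PDS", "ADA", "CN", "SE", "IPDC"]
CITIES = ["surat", "bardoli", "vyara", "navsari", "bilimora"]


def make_students(count, seed=0):
    """Return a list of dicts, one synthetic student record per dict."""
    rng = random.Random(seed)  # fixed seed so the data is reproducible
    rows = []
    for i in range(count):
        # Assumed mark range of 20 to 100 for every subject
        marks = {subject: rng.randint(20, 100) for subject in SUBJECTS}
        rows.append({
            "Enrollment": 1000 + i,            # hypothetical numbering scheme
            "Name": "Student_%d" % (i + 1),    # placeholder names
            "Gender": rng.choice(["M", "F"]),
            "Address": rng.choice(CITIES),
            **marks,
            # Assumption: CPI on a 0-10 scale, derived from the average mark
            "CPI": round(sum(marks.values()) / len(SUBJECTS) / 10, 2),
        })
    return rows


def write_csv(rows, path):
    """Write the records to a CSV file with a header row."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


if __name__ == "__main__":
    write_csv(make_students(50), "Data.csv")
```

Because gender and address are drawn independently of the marks here, any correlation measured on this file should be close to zero; a real data set may behave differently.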

Program:

import pandas as pd
from scipy.stats import spearmanr

df = pd.read_csv("Data.csv")
cpi = df['CPI']

# i) correlation between gender and semester marks
# Gender is categorical, so encode it as integer codes before ranking
gender = df['Gender'].astype('category').cat.codes
corr1, p1 = spearmanr(gender, cpi)
print('correlation between gender and Semester marks: %f' % corr1)

# ii) correlation between geographical location and semester marks
# Address codes are arbitrary labels, so interpret the sign with care
geo_location = df['Address'].astype('category').cat.codes
corr2, p2 = spearmanr(geo_location, cpi)
print('correlation between geographical location and semester marks: %f' % corr2)

Output:
Practical-1(B)

Aim: Write a program to calculate the correlation between difficulty level
and subject marks. The higher the difficulty level, the lower the marks
should be, so the two should be negatively correlated. Analyze the
correlation.

Program:
import pandas as pd
df = pd.read_csv("Data.csv")

pds = df["PDS"]
easy, medium, hard = 0, 1, 2

# Difficulty level per student, derived from the PDS marks
n = pd.Series(0, index=pds.index)
for i in range(0, len(pds)):
    if pds[i] <= 33:
        n[i] = hard
    elif pds[i] < 66:
        n[i] = medium
    else:
        n[i] = easy
print("Correlation between difficulty level and PDS marks", n.corr(pds))

Output:
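
The calculation above can also be checked by hand. The sketch below repeats the same binning rule (marks up to 33 are hard, marks below 66 are medium, the rest are easy) and computes the Pearson coefficient directly, which is the default method of the pandas Series.corr() call used in the program. The marks list is hypothetical and the function names are chosen only for this illustration.

```python
from math import sqrt


def difficulty_level(mark):
    """Map a mark to a difficulty code: 2 = hard, 1 = medium, 0 = easy."""
    if mark <= 33:
        return 2
    elif mark < 66:
        return 1
    return 0


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


marks = [25, 30, 45, 50, 60, 70, 85, 90]  # hypothetical PDS marks
levels = [difficulty_level(m) for m in marks]
r = pearson(levels, marks)
print("Correlation between difficulty level and marks:", r)
```

Since the difficulty level is derived from the marks themselves, the negative sign follows from the construction; an independently assigned difficulty rating would give a more informative test.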
Practical-2

Aim: Consider a sample of 50 students. Gather their university exam scores
across all semesters of engineering for one college. Write a program to
find the mean and standard deviation for this college. Now consider a
sample of students from different colleges of Gujarat and their university
exam scores. Write a program to find the mean and standard deviation.
Write the observations.
Clg-1

Program:

import pandas as pd
col_1=pd.read_csv("Clg_1.csv")
print(col_1.head())

Output:

Program:

average_marks=col_1.loc[:,"Average"]
print("The mean of college 1 is: "+str(average_marks.mean()))
print("The Standard Deviation of college 1 is: "+str(average_marks.std()))

Output:
Clg-2

Program:

import pandas as pd
col_2=pd.read_csv("Clg_2.csv")
print(col_2.head())

Output:

Program:

average_marks=col_2.loc[:,"Average"]
print("The mean of college 2 is: "+str(average_marks.mean()))
print("The Standard Deviation of college 2 is: "+str(average_marks.std()))

Output:
Clg-3

Program:

import pandas as pd
col_3=pd.read_csv("Clg_3.csv")
print(col_3.head())

Output:

Program:

average_marks=col_3.loc[:,"Average"]
print("The mean of college 3 is: "+str(average_marks.mean()))
print("The Standard Deviation of college 3 is: "+str(average_marks.std()))

Output:
Clg-4

Program:

import pandas as pd
col_4=pd.read_csv("Clg_4.csv")
print(col_4.head())

Output:

Program:

average_marks=col_4.loc[:,"Average"]
print("The mean of college 4 is: "+str(average_marks.mean()))
print("The Standard Deviation of college 4 is: "+str(average_marks.std()))

Output:
Clg-5

Program:

import pandas as pd
col_5=pd.read_csv("Clg_5.csv")
print(col_5.head())

Output:

Program:

average_marks=col_5.loc[:,"Average"]
print("The mean of college 5 is: "+str(average_marks.mean()))
print("The Standard Deviation of college 5 is: "+str(average_marks.std()))

Output:
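
To write the observations, it may help to place the per-college figures side by side. The following is a minimal sketch of that comparison using only the standard library; the scores are hypothetical stand-ins for the "Average" column of each Clg_N.csv file, and only two colleges are shown. statistics.stdev computes the sample standard deviation, which matches the default of the pandas std() call used above.

```python
from statistics import mean, stdev

# Hypothetical average marks per student; real values would come from
# the "Average" column of each college's CSV file
colleges = {
    "Clg-1": [62.0, 71.5, 58.0, 80.0, 66.5],
    "Clg-2": [55.0, 60.5, 49.0, 72.0, 63.5],
}


def summarize(scores_by_college):
    """Return {college: (mean, sample standard deviation)}."""
    return {
        name: (mean(scores), stdev(scores))
        for name, scores in scores_by_college.items()
    }


summary = summarize(colleges)
for name, (m, s) in summary.items():
    print("%s: mean=%.2f, std=%.2f" % (name, m, s))

# The college with the highest mean score in this sample
best = max(summary, key=lambda name: summary[name][0])
print("Highest mean:", best)
```

A higher mean with a smaller standard deviation suggests consistently better results, while a large standard deviation points to a wide spread between students.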
Practical-3

Aim: Collect the month-wise COVID case data for the cities Ahmedabad,
Vadodara, Rajkot, and Surat. Plot this time series data. Analyze the trend
over time.

Surat
Program:

import pandas as pd
surat=pd.read_csv("Surat.csv")
print(surat)

Output:

import matplotlib.pyplot as plt


plt.plot(surat['Month'],surat['Cases'],c='b')
plt.xlabel('Month')
plt.ylabel('Number of Cases')
plt.xticks(rotation=30)
plt.show()
Ahmedabad

Program:

import pandas as pd
Ahmedabad =pd.read_csv("Ahmedabad.csv")
print(Ahmedabad)

Output:

import matplotlib.pyplot as plt


plt.plot(Ahmedabad['Month'],Ahmedabad['Cases'],c='m')
plt.xlabel('Month')
plt.ylabel('Number of Cases')
plt.xticks(rotation=30)
plt.show()
Vadodara

Program:

import pandas as pd
Vadodara =pd.read_csv("Vadodara.csv")
print(Vadodara)

Output:

import matplotlib.pyplot as plt


plt.plot(Vadodara['Month'],Vadodara['Cases'],c='y')
plt.xlabel('Month')
plt.ylabel('Number of Cases')
plt.xticks(rotation=30)
plt.show()
Rajkot

Program:

import pandas as pd
Rajkot =pd.read_csv("Rajkot.csv")
print(Rajkot)

Output:

import matplotlib.pyplot as plt


plt.plot(Rajkot['Month'],Rajkot['Cases'],c='g')
plt.xlabel('Month')
plt.ylabel('Number of Cases')
plt.xticks(rotation=30)
plt.show()

# Combined plot of all four cities for comparison
plt.plot(Ahmedabad['Month'],Ahmedabad['Cases'],c='m')
plt.plot(surat['Month'],surat['Cases'],c='b')
plt.plot(Vadodara['Month'],Vadodara['Cases'],c='y')
plt.plot(Rajkot['Month'],Rajkot['Cases'],c='g')
plt.xlabel('Month')
plt.ylabel('Number of Cases')
plt.xticks(rotation=30)
plt.legend(['Ahmedabad','Surat','Vadodara','Rajkot'])
plt.show()
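
The plots show the trend visually. A numerical summary can support the analysis, for example the month-over-month change and a moving average. The sketch below is one possible approach and is not part of the original practical; the case counts are hypothetical and the helper names are chosen for illustration.

```python
def month_over_month(cases):
    """Return the change in cases from each month to the next."""
    return [later - earlier for earlier, later in zip(cases, cases[1:])]


def moving_average(cases, window=3):
    """Return the average of each run of `window` consecutive months."""
    return [
        sum(cases[i:i + window]) / window
        for i in range(len(cases) - window + 1)
    ]


cases = [120, 340, 910, 1500, 1100, 600]  # hypothetical monthly counts
changes = month_over_month(cases)
smoothed = moving_average(cases)
peak_month = cases.index(max(cases))  # zero-based position of the peak

print("Month-over-month change:", changes)
print("3-month moving average:", smoothed)
print("Peak at month index:", peak_month)
```

Positive changes followed by negative ones mark a wave that has passed its peak, and the moving average smooths out single-month spikes.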
Practical-4
Aim: 12th standard students need advice on which college to choose for
engineering education. Decide the features to use for grading an
engineering college. Prepare the data set. Write a program that applies
the random forest algorithm and suggests the best-suited college for 12th
standard students.
Program:

import pandas as pd
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

college_path = ('/content/College_Data.csv')
college_data = pd.read_csv(college_path)
y = college_data.Grad_Rate
features = ['Apps', 'Accept', 'Enroll', 'Top10perc', 'Top25perc', 'F.Undergrad'
, 'P.Undergrad','Outstate','Room.Board','Books','Personal','PhD'
,'Terminal','S.F.Ratio','perc.alumni','Expend']
X = college_data[features]

train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=1)

college_model = DecisionTreeRegressor(random_state=1)
college_model.fit(train_X, train_y)

val_predictions = college_model.predict(val_X)
val_mae = mean_absolute_error(val_y, val_predictions)
print("Validation MAE when not specifying max_leaf_nodes: {:,.0f}".format(val_mae))

college_model = DecisionTreeRegressor(max_leaf_nodes=100, random_state=1)
college_model.fit(train_X, train_y)
val_predictions = college_model.predict(val_X)
val_mae = mean_absolute_error(val_y, val_predictions)
print("Validation MAE for best value of max_leaf_nodes: {:,.0f}".format(val_mae))

from sklearn.ensemble import RandomForestRegressor

rf_model = RandomForestRegressor(random_state=3)
rf_model.fit(train_X,train_y)
predictions=rf_model.predict(val_X)
rf_val_mae = mean_absolute_error(predictions,val_y)
print("Validation MAE for Random Forest Model: {}".format(rf_val_mae))

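# Keep the original row labels so each prediction maps back to its college
result = pd.Series(predictions, index=val_X.index)
result.sort_values()

# Row label of the college with the highest predicted graduation rate
best_index = result.idxmax()
print(result.max())
print(college_data.loc[best_index])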
Practical-5

Aim: Consider the following data set. Write a program for KNN algorithm
to find out weight lifting category for height 161cm and weight 61kg.

Program:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
df=pd.read_csv("/content/data.csv")
df.head(3)

df.dtypes

df['Weightlifting'] = df.Weightlifting.astype('category')
df.dtypes

x=df.iloc[:,0:2]
x.head(3)

y=df.iloc[:,2]
y.head(3)

from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test=train_test_split(x, y,test_size=0.3,random_state=0)
from sklearn.neighbors import KNeighborsClassifier
knn=KNeighborsClassifier(n_neighbors=2,weights="distance",metric="euclidean")
knn.fit(x_train,y_train)

from sklearn.metrics import accuracy_score


y_pred=knn.predict(x_test)
print("Accuracy of test set=",accuracy_score(y_test, y_pred)*100)

knn.predict([[161,61]])
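
To make the classifier's behaviour concrete, the sketch below implements a distance-weighted nearest-neighbour vote by hand with the same settings as above (two neighbours, Euclidean distance, inverse-distance weights). The data set from the aim is not reproduced in this document, so the sample points and the category names "Light", "Medium" and "Heavy" are hypothetical.

```python
from collections import defaultdict
from math import dist


def knn_predict(samples, query, k=2):
    """Classify `query` by an inverse-distance-weighted vote of its k
    nearest samples. `samples` is a list of ((height, weight), label)."""
    nearest = sorted(samples, key=lambda s: dist(s[0], query))[:k]
    votes = defaultdict(float)
    for point, label in nearest:
        d = dist(point, query)
        if d == 0:
            return label  # an exact match decides the class outright
        votes[label] += 1.0 / d  # closer neighbours count for more
    return max(votes, key=votes.get)


# Hypothetical training data: (height in cm, weight in kg) -> category
samples = [
    ((150, 50), "Light"),
    ((155, 55), "Light"),
    ((158, 58), "Light"),
    ((165, 65), "Medium"),
    ((170, 70), "Medium"),
    ((180, 85), "Heavy"),
]

print(knn_predict(samples, (161, 61), k=2))
```

With these sample points the two nearest neighbours of (161, 61) are (158, 58) and (165, 65), and the closer one carries the larger weight.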
Practical-6

Aim: Take the data of the students prepared in exercise 1. Visualize the
data to show region wise results, branch wise results, subject wise results.
Decide the visualization technique to show appropriate data. Display 1)
bar chart, 2) pie chart, 3) maps, 4) scatter plot

1. Region wise Results

Program:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_excel('/content/DAV_1_Data (1).xlsx')
# Mean CPI per city; the result is indexed by Address in sorted order
mean2 = df.groupby('Address')['CPI'].mean()
mean2

df['Address'].value_counts()

# Take the bar labels from the groupby index so each bar matches its city
plt.bar(mean2.index, mean2.values, color='r')
plt.title("Region Wise Results")
plt.xlabel("Address")
plt.ylabel("Average CPI")
plt.show()
2. Subject wise Results

Program:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_excel('/content/DAV_1_Data (1).xlsx')

# Mean marks for each subject, in the same order as the labels
Subjects = ['PDS','ADA','CN','SE','IPDC']
mean = [df[subject].mean() for subject in Subjects]
mean

plt.pie(mean, labels = Subjects,autopct='%.0f%%')
plt.title("Subject Wise Results")
plt.legend(Subjects,title="Subject",loc="upper right", bbox_to_anchor =(1, 0, 0.5, 1))
plt.show()
Practical-7

Aim: Use D3.js to show following. (i) Take year wise population. (ii) Show
appropriate size circle for population as per year. (iii) Fill color in circle.
(iv) Prepare bar chart and pie chart. (v) Explore other functionality of
D3.js
(i) Take year wise population.

We have used a year wise population dataset which is shown below.

(ii) Show appropriate size circle for population as per year.

Program:

<!DOCTYPE html>
<html>
<head>
<title>Bubble plot</title>
<style>
svg {font: 10px "Times New Roman";}
</style>
<script src="https://d3js.org/d3.v4.min.js"></script>
</head>
<body>
<svg width="500" height="500" font-family="Times New Roman" font-size="15" text-anchor="middle"></svg>
<script>
var svg = d3.select("svg"), width = +svg.attr("width"), height = +svg.attr("height");
svg.append("text")
.attr("x", 100)
.attr("y", -20 )
.attr("dy", "3.5em" )
.attr("text-anchor", "start")
.style("font-size", "20px")
.style("font-weight", "bold")
.text("Yearwise Population")

var pack = d3.pack()


.size([width-150, height])
.padding(1.5);

d3.csv("year_wise_population.csv", function(d)
{
// Convert Population to a number and keep Year as the label
d.value = +d["Population"];
d.Call_Type = d["Year"];
return d;
},
function(error, data)
{
if (error) throw error;
var color = d3.scaleOrdinal()
.domain(data.map(function(d){ return d.Call_Type;}))
.range(['#728FCE']);
var root = d3.hierarchy({children: data})
.sum(function(d) { return d.value; })

var node = svg.selectAll(".node")


.data(pack(root).leaves())
.enter().append("g")
.attr("class", "node")
.attr("transform", function(d) { return "translate(" + d.x + "," + d.y + ")"; });

node.append("circle")
.attr("id", function(d) { return d.id; })
.attr("r", function(d) { return d.r; })
.style("fill", function(d) { return color(d.data.Call_Type); })

node.append("text")
.text(function(d)
{
if (d.data.value > 2000){return d.data.Call_Type;} return "";
});
var legend = svg.selectAll(".legend")
.data(data).enter()
.append("g")
.attr("class","legend")
.attr("transform", "translate(" + 380 + "," + 20+ ")");

legend.append("rect")
.attr("x", 7)
.attr("y", function(d, i) { return 20 * i; })
.attr ("width", 10)
.attr("height", 10)
.style("fill", function(d) { return color(d.Call_Type)});
legend.append("text")
.attr("x", 25)
.attr("text-anchor", "start")
.attr("dy", "1.001em")
.attr("y", function(d, i) { return 20 * i; })
.text(function(d) {return d.Call_Type;})
.attr("font-size", "11px");
legend.append("text")
.attr("x",30)
.attr("dy", "-.5em")
.attr("y",0)
.text("Years")
.attr("font-size", "15px");
});
</script>
</body>
</html>

Output:
(iii) Fill color in circle.

Program:

<!DOCTYPE html>
<html>
<head>
<title>Bubble plot</title>
<style>
svg {font: 10px "Times New Roman";}
</style>
<script src="https://d3js.org/d3.v4.min.js"></script>
</head>
<body>
<svg width="500" height="500" font-family="Times New Roman" font-size="15" text-anchor="middle"></svg>
<script>

var svg = d3.select("svg"),


width = +svg.attr("width"),
height = +svg.attr("height");
svg.append("text")
.attr("x", 100)
.attr("y", -20 )
.attr("dy", "3.5em" )
.attr("text-anchor", "start")
.style("font-size", "20px")
.style("font-weight", "bold")
.text("Yearwise Population")
var pack = d3.pack()
.size([width-150, height])
.padding(1.5);

d3.csv("year_wise_population.csv",
function(d)
{
d.value = +d["Population"];
d.Call_Type = d["Year"]
return d;
},
function(error, data)
{
if (error) throw error;
var color = d3.scaleOrdinal()
.domain(data.map(function(d){ return d.Call_Type;}))
.range(['#728FCE','#98AFC7','#57FEFF','#8EEBEC','#F4A460',
'#FA8072','#F75D59','#E799A3','#F8B88B','#E38AAE',
'#DA70D6','#F433FF','#6960EC','#B666D2','#C6AEC7',
'#DDA0DD','#CCCCFF','#9172EC','#E45E9D','#C25283',
'#F8B88B','#E8ADAA']);

var root = d3.hierarchy({children: data})


.sum(function(d) { return d.value; })
var node = svg.selectAll(".node")
.data(pack(root).leaves())
.enter().append("g")
.attr("class", "node")
.attr("transform", function(d) { return "translate(" + d.x + "," + d.y + ")"; });
node.append("circle")
.attr("id", function(d) { return d.id; })
.attr("r", function(d) { return d.r; })
.style("fill", function(d) { return color(d.data.Call_Type); })
node.append("text")
.text(function(d)
{
if (d.data.value > 2000){return d.data.Call_Type;} return "";
});
var legend = svg.selectAll(".legend")
.data(data).enter()
.append("g")
.attr("class","legend")
.attr("transform", "translate(" + 380 + "," + 20+ ")");
legend.append("rect")
.attr("x", 7)
.attr("y", function(d, i) { return 20 * i; })
.attr("width", 10)
.attr("height", 10)
.style("fill", function(d) { return color(d.Call_Type)});
legend.append("text")
.attr("x", 25)
.attr("text-anchor", "start")
.attr("dy", "1.001em")
.attr("y", function(d, i) { return 20 * i; })
.text(function(d) {return d.Call_Type;})
.attr("font-size", "11px");
legend.append("text")
.attr("x",30)
.attr("dy", "-.5em")
.attr("y",0)
.text("Years")
.attr("font-size", "15px");
});
</script>
</body>
</html>
Output:
(iv) Prepare bar chart and pie chart.

Bar chart

Program:

<!doctype html>
<html>
<head>
<style>
.bar {
fill: rgb(174, 0, 255);
}
</style>
<script src="https://d3js.org/d3.v4.min.js"></script>
</head>
<body>
<svg width="1000" height="700"></svg>
<script>

var svg = d3.select("svg"), margin = 200,


width = svg.attr("width") - margin,
height = svg.attr("height") - margin

svg.append("text")
.attr("transform", "translate(100,0)")
.attr("x", 250)
.attr("y", 70)
.attr("font-size", "24px")
.text("Yearwise Population")

var xScale = d3.scaleBand().range([0, width]).padding(0.4),
yScale = d3.scaleLinear().range([height, 0]);

var g = svg.append("g")
.attr("transform", "translate(" + 100 + "," + 100 + ")");

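d3.csv("year_wise_population.csv", function(error, data) {
if (error) {
throw error;
}

// CSV values are read as strings; convert Population to a number
data.forEach(function(d) { d.Population = +d.Population; });

xScale.domain(data.map(function(d) { return d.Year; }));
yScale.domain([0, d3.max(data, function(d) { return d.Population; })]);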
g.append("g")
.attr("transform", "translate(0," + height + ")")
.call(d3.axisBottom(xScale))
.selectAll("text")
.style("text-anchor", "end")
.attr("dx", "-.8em")
.attr("dy", ".15em")
.attr("transform", "rotate(-45)" )
.append("text")
.attr("y", height - 450)
.attr("x", width - 200)
.attr("text-anchor", "end")
.attr("stroke", "black")
.text("Years");

g.append("g")
.call(d3.axisLeft(yScale).tickFormat(function(d){ return d;
})
.ticks(10))
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dx", "-20.1em")
.attr("dy", "-9.1em")
.attr("text-anchor", "end")
.attr("stroke", "black")
.text("Population");

g.selectAll(".bar")
.data(data)
.enter().append("rect")
.attr("class", "bar")
.attr("x", function(d) { return xScale(d.Year); })
.attr("y", function(d) { return yScale(d.Population); })
.attr("width", xScale.bandwidth())
.attr("height", function(d) { return height - yScale(d.Population); });
});

</script>
</body>
</html>

Output:

(v) Pie chart

Program:

<!DOCTYPE html>
<html>
<head>
<style>
.arc text {
font: 15px Times New Roman; text-anchor: middle;
}
.arc path {
stroke: rgb(255, 255, 255);
}
.title {
font: bold 20px Times New Roman;
fill: rgb(0, 0, 0); text-anchor: start;
}
</style>
<script src="https://d3js.org/d3.v4.min.js"></script>
</head>
<body>
<div>
<svg width="400" height="400" style="margin-top: 0%;"></svg>
<script>
var svg = d3.select("svg"), width = svg.attr("width"), height = svg.attr("height"),
radius = Math.min(width, height) / 2;

var g = svg.append("g")
.attr("transform", "translate(" + width / 2 + "," + height / 2 + ")");

var color = d3.scaleOrdinal(['#FC6C85',
'#DC143C','#728FCE','#98AFC7','#57FEFF','#8EEBEC','#F4A460',
'#FA8072','#F75D59','#E799A3','#F8B88B','#E38AAE',
'#DA70D6','#F433FF','#6960EC','#B666D2','#C6AEC7',
'#DDA0DD','#CCCCFF','#9172EC','#E45E9D','#C25283']);

var pie = d3.pie().value(function (d) { return d.Population; });

var path = d3.arc()
.outerRadius(radius - 30)
.innerRadius(0);

var label = d3.arc()
.outerRadius(radius - 30)
.innerRadius(radius - 80);

d3.csv("year_wise_population.csv", function (error, data) {
if (error) {
throw error;
}
var arc = g.selectAll(".arc")
.data(pie(data))
.enter().append("g")
.attr("class", "arc");

arc.append("path")
.attr("d", path)
.attr("fill", function (d) { return color(d.data.Year); });
console.log(arc)

arc.append("text")
.attr("transform", function (d) {
var midAngle = d.endAngle < Math.PI ? d.startAngle/2 + d.endAngle/2 :
d.startAngle/2 + d.endAngle/2 + Math.PI;
return "translate(" + label.centroid(d)[0] + "," + label.centroid(d)[1] + ") rotate(-90) rotate(" + (midAngle * 180/Math.PI) + ")";
})
.text(function (d) { return d.data.Year; });
});

svg.append("g")
.attr("transform", "translate(" + (width / 2 - 70) + "," + 20 + ")")
.append("text")
.attr("class", "title")
.text("Yearwise Population")

</script>
</div>
</body>
</html>

Output:
(vi) Explore other functionality of D3.js
1. event handling

Like other DOM libraries, D3 supports both built-in and custom events. An
event listener can be bound to any DOM element with the selection.on() method.

Program:

<html>
<head>
<style> div {
height: 100px; width: 100px;
background-color: steelblue; margin:5px;
}
</style>
<script src="https://d3js.org/d3.v4.min.js"></script>
</head>
<body>
<div></div>
<script>

d3.selectAll("div")
.on("mouseover", function(){
d3.select(this)
.style("background-color", "orange");

// Get current event info
console.log(d3.event);

// Get x & y co-ordinates
console.log(d3.mouse(this));
})
.on("mouseout", function(){
d3.select(this)
.style("background-color", "steelblue");
});
</script>
</body>
</html>
Output:

2. Animation:

D3 simplifies the process of animations with transitions. Transitions are made on DOM
selections using <selection>.transition() method. The following table lists important methods
for animation in D3.

Program:
<html>
<head>
<script src="https://d3js.org/d3.v4.min.js"></script>
</head>
<body>
<script>
var svg = d3.select("body")
.append("svg")
.attr("width", 500)
.attr("height", 500);

var bar1 = svg.append("rect")


.attr("fill", "blue")
.attr("x", 100)
.attr("y", 20)
.attr("height", 20)
.attr("width", 10)

var bar2 = svg.append("rect")


.attr("fill", "blue")
.attr("x", 120)
.attr("y", 20)
.attr("height", 20)
.attr("width", 10)
update();

function update() {
// Grow the first bar over 2 seconds
bar1.transition()
.ease(d3.easeLinear)
.duration(2000)
.attr("height", 100);

// Grow the second bar after a 2 second delay
bar2.transition()
.ease(d3.easeLinear)
.duration(2000)
.delay(2000)
.attr("height", 100);
}
</script>
</body>
</html>

Output:

3. data binding
Program:

<html>
<head>
<script src="https://d3js.org/d3.v4.min.js"></script>
</head>
<body>

<script>
var matrix = [
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16]
];

var tr = d3.select("body")
.append("table") // adds <table>
.selectAll("tr") // selects all <tr>
.data(matrix) // joins matrix array
.enter()// create placeholders for each row in the array
.append("tr");// create <tr> in each placeholder

var td = tr.selectAll("td")
.data(function (d) { // joins inner array of each row
console.log(d);
return d;
})
.enter()// create placeholders for each element in an inner array
.append("td") // creates <td> in each placeholder
.text(function (d) {
console.log(d);
return d; // add value of each inner array as a text in <td>
});
</script>
</body>
</html>

Output:
