ML Lab Manual
#1
The probability that it is Friday and that a student is absent is 3%. Since there are 5 school days in
a week, the probability that it is Friday is 20%. What is the probability that a student is absent
given that today is Friday? Apply Bayes' rule in Python to get the result.
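Applying Bayes' rule with A = student absent and F = Friday:
P(A | F) = P(F and A) / P(F) = 0.03 / 0.20 = 0.15, i.e. 15%.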
Source Code:
# P(Friday and Absent) = 3% and P(Friday) = 20%
p_friday_and_absent = 0.03
p_friday = 0.20
# Bayes' rule: P(Absent | Friday) = P(Friday and Absent) / P(Friday)
posterior_prob = (p_friday_and_absent / p_friday) * 100
print('posterior_prob', posterior_prob)
Output: posterior_prob 15.0
#2
Extracting data from a text file, an Excel file, and a file at a remote location.
Source Code:
import pandas as pd
# read a CSV file from local disk into a DataFrame
mydata = pd.read_csv("D:\\iris.csv")
print(mydata)
Output:
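The task also covers Excel and remote files. A minimal sketch using pandas; the workbook path and the URL below are illustrative placeholders, and reading .xlsx files additionally requires an engine such as openpyxl to be installed:
import pandas as pd
# Excel file (path is illustrative)
exceldata = pd.read_excel("D:\\iris.xlsx")
print(exceldata)
# Remote location: pandas can read directly from a URL (URL is illustrative)
remotedata = pd.read_csv("https://example.com/iris.csv")
print(remotedata)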
#3
Source Code1:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
# generate sample data with four natural groupings
X, y_true = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)
kmeans = KMeans(n_clusters=4)
kmeans.fit(X)
y_kmeans = kmeans.predict(X)
centers = kmeans.cluster_centers_
Output:
Scatter plots of the clustered points (panels: 4 clusters and 3 clusters).
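To reproduce such plots, a short sketch (using the matplotlib import above) that colours each point by its assigned cluster and marks the centroids:
plt.scatter(X[:, 0], X[:, 1], c=y_kmeans, s=50, cmap='viridis')
plt.scatter(centers[:, 0], centers[:, 1], c='black', s=200, alpha=0.5)  # centroids
plt.show()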
Source Code2:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
# Loading data
irisData = load_iris()
X = irisData.data
y = irisData.target
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
knn = KNeighborsClassifier(n_neighbors=7)
knn.fit(X_train, y_train)
print(knn.predict(X_test))        # predicted classes for the test set
print(knn.score(X_test, y_test))  # accuracy on the test set
import numpy as np
import matplotlib.pyplot as plt
irisData = load_iris()
X = irisData.data
y = irisData.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
neighbors = np.arange(1, 9)
train_accuracy = np.empty(len(neighbors))
test_accuracy = np.empty(len(neighbors))
# Fit a classifier for each value of k and record the accuracy
for i, k in enumerate(neighbors):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    train_accuracy[i] = knn.score(X_train, y_train)
    test_accuracy[i] = knn.score(X_test, y_test)
# Generate plot
plt.plot(neighbors, test_accuracy, label='Testing dataset accuracy')
plt.plot(neighbors, train_accuracy, label='Training dataset accuracy')
plt.legend()
plt.xlabel('n_neighbors')
plt.ylabel('Accuracy')
plt.show()
Output:
Predicted classes: [1 0 2 1 1 0 1 2 2 1 2 0 0 0 0 1 2 1 1 2 0 2 0 2 2 2 2 2 0 0]
Accuracy: 0.9666666666666667
Accuracy plot
#4
Given the following data, which specifies classifications for nine combinations of VAR1 and VAR2,
predict a classification for a case where VAR1=0.906 and VAR2=0.606, using the result of k-means
clustering with 3 means (i.e., 3 centroids).
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
# load the nine (VAR1, VAR2) cases; the file name is illustrative
dataset = pd.read_csv('kmeansdata.csv')
print(dataset)
x = dataset.iloc[:, [0, 1]].values
print(x)
# within-cluster sum of squares for k = 1..9
wcss = []
for i in range(1, 10):
    kmeans = KMeans(n_clusters=i, random_state=0)
    kmeans.fit(x)
    wcss.append(kmeans.inertia_)
print(wcss)
#Plotting the results onto a line graph, allowing us to observe 'The elbow'
plt.plot(range(1, 10), wcss)
plt.xlabel('Number of clusters')
plt.ylabel('WCSS')
plt.show()
# fit k-means with 3 centroids and label each of the nine cases
kmeans = KMeans(n_clusters=3, random_state=0)
print(kmeans)
y_kmeans = kmeans.fit_predict(x)
#z_kmeans = kmeans.fit_predict(test)
print(y_kmeans)
plt.scatter(x[:, 0], x[:, 1], c=y_kmeans, label='clusters')
plt.legend()
plt.show()
Output:
[0 1 1 0 2 0 2 1 2]
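To classify the queried case VAR1=0.906, VAR2=0.606, the fitted model above can be applied directly; which of the three cluster labels it returns depends on the clustering run:
print(kmeans.predict(np.array([[0.906, 0.606]])))  # cluster label for the new case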
#5
The following training examples map descriptions of individuals onto high, medium and low
creditworthiness.
Income   Recreation  Job         Status    Age group  Home owner  Risk level
medium   skiing      design      single    twenties   no          highRisk
high     golf        trading     married   forties    yes         lowRisk
low      speedway    transport   married   thirties   yes         medRisk
medium   football    banking     single    thirties   yes         lowRisk
high     flying      media       married   fifties    yes         highRisk
low      football    security    single    twenties   no          medRisk
medium   golf        media       single    thirties   yes         medRisk
medium   golf        transport   married   forties    yes         lowRisk
high     skiing      banking     single    thirties   yes         highRisk
low      golf        unemployed  married   forties    yes         highRisk
Find the unconditional probability of 'golf' and the conditional probability of 'single' given
'medRisk' in the dataset.
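Worked by hand from the table: 'golf' appears in 4 of the 10 records, so P(golf) = 4/10 = 0.4. There are 3 'medRisk' records, of which 2 have status 'single', so P(single | medRisk) = 2/3 ≈ 0.667.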
import numpy as np
import pandas as pd
import os
dataset = pd.read_csv('credit.csv')
total = len(dataset)
#print(os.getcwd())
# Unconditional probability of 'golf'
k = dataset.recreation.value_counts().golf
Unconditional_probability = k / total * 100
print('Unconditional probability:', Unconditional_probability)
# Conditional probability of 'single' given 'medRisk': count rows that are
# both 'single' and 'medRisk' (the column names 'status' and 'risklevel'
# are assumed for credit.csv)
count = 0
def cond_prob(row):
    global count
    if row['status'] == 'single' and row['risklevel'] == 'medRisk':
        count = count + 1
count_value = dataset.apply(cond_prob, axis=1)
medrisk_total = dataset.risklevel.value_counts().medRisk
conditional_probability = (count / medrisk_total) * 100
print('conditional probability:', round(conditional_probability, 2))
Output:
Unconditional probability: 40.0
conditional probability: 66.67
#6
Source Code:
import numpy as np
import matplotlib.pyplot as plt
def estimate_coef(x, y):
    # number of observations/points
    n = np.size(x)
    # means of x and y vectors
    m_x = np.mean(x)
    m_y = np.mean(y)
    # cross-deviation and deviation about x
    SS_xy = np.sum(y * x) - n * m_y * m_x
    SS_xx = np.sum(x * x) - n * m_x * m_x
    # regression coefficients
    b_1 = SS_xy / SS_xx
    b_0 = m_y - b_1 * m_x
    return (b_0, b_1)
def plot_regression_line(x, y, b):
    # plot the actual points and the fitted line
    plt.scatter(x, y)
    y_pred = b[0] + b[1] * x
    plt.plot(x, y_pred)
    # putting labels
    plt.xlabel('x')
    plt.ylabel('y')
    plt.show()
def main():
    # observations / data
    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])
    # estimating coefficients
    b = estimate_coef(x, y)
    print("Estimated coefficients:\nb_0 = {}\nb_1 = {}".format(b[0], b[1]))
    plot_regression_line(x, y, b)
main()
Input:
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [1, 3, 2, 5, 7, 8, 8, 9, 10, 12]
Output:
Estimated coefficients:
b_0 = 1.2363636363636363
b_1 = 1.1696969696969697
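As a cross-check (a sketch assuming scikit-learn is available; not part of the original listing), the same coefficients can be recovered with LinearRegression:
import numpy as np
from sklearn.linear_model import LinearRegression
x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]).reshape(-1, 1)
y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])
reg = LinearRegression().fit(x, y)
print(reg.intercept_, reg.coef_[0])  # matches b_0 and b_1 above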
#7
Source Code:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.naive_bayes import GaussianNB
X, y = make_blobs(100, 2, centers=2, random_state=2, cluster_std=1.5)
model = GaussianNB()
model.fit(X, y)
# generate new points and classify them
rng = np.random.RandomState(0)
Xnew = [-6, -14] + [14, 18] * rng.rand(2000, 2)
ynew = model.predict(Xnew)
# plot the training data, then overlay predictions for the new points
plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='RdBu')
lim = plt.axis()
plt.scatter(Xnew[:, 0], Xnew[:, 1], c=ynew, s=20, cmap='RdBu', alpha=0.1)
plt.axis(lim)
# posterior probabilities for the last eight new points
yprob = model.predict_proba(Xnew)
yprob[-8:].round(2)
Output Plot:
Gaussian naïve Bayes Classification Example-3
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn import metrics
iris = load_iris()
X = iris.data
y = iris.target
# train on 60% of the data, test on the remaining 40%
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=1)
gnb = GaussianNB()
gnb.fit(X_train, y_train)
y_pred = gnb.predict(X_test)
# comparing actual response values (y_test) with predicted response values (y_pred)
print("Gaussian Naive Bayes model accuracy (in %):", metrics.accuracy_score(y_test, y_pred) * 100)
Output:
#8
Genetic algorithm to evaluate a binary string based on the number of 1's in the string. Example:
a bit string with a length of 20 bits will have a score of 20 for a string of all 1's
(11111111111111111111 = 20, 11111111110000000000 = 10).
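The fitness is simply the number of set bits; a one-line check of the scoring rule:
print(sum([1] * 10 + [0] * 10))  # a 20-bit string with ten 1's scores 10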
Source Code:
from numpy.random import randint, rand
import matplotlib.pyplot as plt
# objective function: negated count of 1's (the GA minimizes)
def onemax(x):
    return -sum(x)
# tournament selection
def selection(pop, scores, k=3):
    selection_ix = randint(len(pop))
    for ix in randint(0, len(pop), k - 1):
        if scores[ix] < scores[selection_ix]:
            selection_ix = ix
    return pop[selection_ix]
# crossover two parents to create two children
def crossover(p1, p2, r_cross):
    c1, c2 = p1.copy(), p2.copy()
    if rand() < r_cross:
        # pick a crossover point that is not on the end of the string
        pt = randint(1, len(p1) - 2)
        # perform crossover
        c1 = p1[:pt] + p2[pt:]
        c2 = p2[:pt] + p1[pt:]
    return [c1, c2]
# mutation operator: flip each bit with probability r_mut
def mutation(bitstring, r_mut):
    for i in range(len(bitstring)):
        if rand() < r_mut:
            bitstring[i] = 1 - bitstring[i]
# genetic algorithm
def genetic_algorithm(objective, n_bits, n_iter, n_pop, r_cross, r_mut):
    pop = [randint(0, 2, n_bits).tolist() for _ in range(n_pop)]
    best, best_eval = pop[0], objective(pop[0])
    # enumerate generations
    for gen in range(n_iter):
        scores = [objective(c) for c in pop]
        for i in range(n_pop):
            if scores[i] < best_eval:
                best, best_eval = pop[i], scores[i]
        # select parents
        selected = [selection(pop, scores) for _ in range(n_pop)]
        children = list()
        for i in range(0, n_pop, 2):
            p1, p2 = selected[i], selected[i + 1]
            for c in crossover(p1, p2, r_cross):
                # mutation
                mutation(c, r_mut)
                children.append(c)
        # replace population
        pop = children
    return [best, best_eval]
# total iterations
n_iter = 100
# bits
n_bits = 20
# population size
n_pop = 100
# crossover rate
r_cross = 0.9
# mutation rate
r_mut = 1.0 / float(n_bits)
best, score = genetic_algorithm(onemax, n_bits, n_iter, n_pop, r_cross, r_mut)
print('Done!')
print('f(%s) = %f' % (best, score))
plt.plot(best)  # bits of the best string found
plt.show()
Output:
#9
Implement the finite words classification system using the Backpropagation algorithm.
Source Code:
from random import seed, random
from math import exp
# Initialize a network
def initialize_network(n_inputs, n_hidden, n_outputs):
    network = list()
    hidden_layer = [{'weights': [random() for i in range(n_inputs + 1)]} for i in range(n_hidden)]
    network.append(hidden_layer)
    output_layer = [{'weights': [random() for i in range(n_hidden + 1)]} for i in range(n_outputs)]
    network.append(output_layer)
    return network
# Calculate neuron activation for an input (the last weight is the bias)
def activate(weights, inputs):
    activation = weights[-1]
    for i in range(len(weights) - 1):
        activation += weights[i] * inputs[i]
    return activation
# Transfer neuron activation (sigmoid)
def transfer(activation):
    return 1.0 / (1.0 + exp(-activation))
# Forward propagate input to a network output
def forward_propagate(network, row):
    inputs = row
    for layer in network:
        new_inputs = []
        for neuron in layer:
            activation = activate(neuron['weights'], inputs)
            neuron['output'] = transfer(activation)
            new_inputs.append(neuron['output'])
        inputs = new_inputs
    return inputs
# Derivative of the sigmoid transfer function
def transfer_derivative(output):
    return output * (1.0 - output)
# Backpropagate error and store deltas in the neurons
def backward_propagate_error(network, expected):
    for i in reversed(range(len(network))):
        layer = network[i]
        errors = list()
        if i != len(network)-1:
            for j in range(len(layer)):
                error = 0.0
                for neuron in network[i + 1]:
                    error += (neuron['weights'][j] * neuron['delta'])
                errors.append(error)
        else:
            for j in range(len(layer)):
                neuron = layer[j]
                errors.append(expected[j] - neuron['output'])
        for j in range(len(layer)):
            neuron = layer[j]
            neuron['delta'] = errors[j] * transfer_derivative(neuron['output'])
# Update network weights with error
def update_weights(network, row, l_rate):
    for i in range(len(network)):
        inputs = row[:-1]
        if i != 0:
            inputs = [neuron['output'] for neuron in network[i - 1]]
        for neuron in network[i]:
            for j in range(len(inputs)):
                neuron['weights'][j] += l_rate * neuron['delta'] * inputs[j]
            neuron['weights'][-1] += l_rate * neuron['delta']
# Train a network for a fixed number of epochs
def train_network(network, train, l_rate, n_epoch, n_outputs):
    for epoch in range(n_epoch):
        sum_error = 0
        for row in train:
            outputs = forward_propagate(network, row)
            expected = [0 for i in range(n_outputs)]
            expected[row[-1]] = 1
            sum_error += sum([(expected[i] - outputs[i]) ** 2 for i in range(len(expected))])
            backward_propagate_error(network, expected)
            update_weights(network, row, l_rate)
        print('>epoch=%d, lrate=%.3f, error=%.3f' % (epoch, l_rate, sum_error))
# Test training the backpropagation algorithm
seed(1)
dataset = [[2.7810836,2.550537003,0],
[1.465489372,2.362125076,0],
[3.396561688,4.400293529,0],
[1.38807019,1.850220317,0],
[3.06407232,3.005305973,0],
[7.627531214,2.759262235,1],
[5.332441248,2.088626775,1],
[6.922596716,1.77106367,1],
[8.675418651,-0.242068655,1],
[7.673756466,3.508563011,1]]
n_inputs = len(dataset[0]) - 1
n_outputs = len(set([row[-1] for row in dataset]))
network = initialize_network(n_inputs, 2, n_outputs)
train_network(network, dataset, 0.5, 20, n_outputs)
for layer in network:
    print(layer)
Output:
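Once trained, the network classifies a row by taking the output neuron with the largest activation; a small sketch following the same structure as the listing above:
# Make a class prediction with the trained network
def predict(network, row):
    outputs = forward_propagate(network, row)
    return outputs.index(max(outputs))
for row in dataset:
    print('Expected=%d, Got=%d' % (row[-1], predict(network, row)))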