
Machine Learning Lab Manual
1. The probability that it is Friday and that a student is absent is 3%. Since there are 5 school days
in a week, the probability that it is Friday is 20%. What is the probability that a student is absent
given that today is Friday? Apply Bayes' rule in Python to get the result. (Ans: 15%)

Ans:
# Bayes' rule: P(A|F) = P(F|A) * P(A) / P(F)
# Here P(F|A) * P(A) equals the joint probability P(Friday and absent) = 0.03 given in the problem.
def bayes_theorem(p_f, p_a, p_f_given_a):
    p_a_given_f = (p_f_given_a * p_a) / p_f
    return p_a_given_f

p_a = 1             # chosen so that p_f_given_a * p_a = P(Friday and absent)
p_f = 0.2           # P(Friday)
p_f_given_a = 0.03  # P(Friday and absent)
result = bayes_theorem(p_f, p_a, p_f_given_a)
print("The result is:", result * 100)

Ans: 15
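That is, P(Absent | Friday) = P(Friday and Absent) / P(Friday) = 0.03 / 0.20 = 0.15, i.e. 15%.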

2. Extract the data from a database using Python
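No code is given for this exercise; below is a minimal sketch using Python's built-in sqlite3 module. The database file name, table name, and column layout are assumptions for illustration only.

import sqlite3

# connect to a (hypothetical) SQLite database file
conn = sqlite3.connect("students.db")
cursor = conn.cursor()

# fetch all rows from a hypothetical 'students' table
cursor.execute("SELECT * FROM students")
rows = cursor.fetchall()

for row in rows:
    print(row)

conn.close()

Alternatively, pandas.read_sql_query can load the result of such a query directly into a DataFrame for further processing.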

3. Implement k-nearest neighbours classification using Python


import math

def classifyAPoint(points, p, k=3):
    # compute the Euclidean distance from p to every labelled point
    distance = []
    for group in points:
        for feature in points[group]:
            euclidean_distance = math.sqrt((feature[0] - p[0]) ** 2 + (feature[1] - p[1]) ** 2)
            distance.append((euclidean_distance, group))

    # keep only the k nearest neighbours
    distance = sorted(distance)[:k]

    # count how many of the k neighbours belong to each group
    freq1 = 0
    freq2 = 0
    for d in distance:
        if d[1] == 0:
            freq1 += 1
        elif d[1] == 1:
            freq2 += 1

    return 0 if freq1 > freq2 else 1

def main():
    # training points keyed by their class label (0 or 1)
    points = {
        0: [(1, 12), (2, 5), (3, 6), (3, 10), (3.5, 8), (2, 11), (2, 9), (1, 7)],
        1: [(5, 3), (3, 2), (1.5, 9), (7, 2), (6, 1), (3.8, 1), (5.6, 4), (4, 2), (2, 5)]}

    p = (1, 12)   # unknown point to classify
    k = 2         # number of neighbours to consider
    print("The value classified to unknown point is: {}".format(classifyAPoint(points, p, k)))

if __name__ == '__main__':
    main()
4. Given the following data, which specify classifications for nine combinations of
VAR1 and VAR2, predict a classification for a case where VAR1=0.906 and
VAR2=0.606, using the result of k-means clustering with 3 means (i.e., 3 centroids).

VAR1 VAR2 CLASS


1.713 1.586 0
0.180 1.786 1
0.353 1.240 1
0.940 1.566 0
1.486 0.759 1
1.266 1.106 0
1.540 0.419 1
0.459 1.799 1
0.773 0.186 1

from sklearn.cluster import KMeans
import numpy as np

# training data: the nine (VAR1, VAR2) pairs from the table above
X = np.array([[1.713,1.586], [0.180,1.786], [0.353,1.240],
              [0.940,1.566], [1.486,0.759], [1.266,1.106],
              [1.540,0.419], [0.459,1.799], [0.773,0.186]])
y = np.array([0,1,1,0,1,0,1,1,1])   # CLASS labels (not used by k-means, which is unsupervised)

kmeans = KMeans(n_clusters=3, random_state=0).fit(X)
kmeans.predict([[0.906, 0.606]])

Output:
array([0])

To print cluster centres:


kmeans.cluster_centers_

array([[1.26633333, 0.45466667], [0.33066667, 1.60833333], [1.30633333, 1.41933333]])
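The prediction above is a cluster index, not the CLASS column itself. One way to turn it into a class prediction (an assumption about how the lab intends the mapping) is to label each cluster with the majority CLASS of its members:

labels = kmeans.labels_                                      # cluster assigned to each training point
cluster_of_query = kmeans.predict([[0.906, 0.606]])[0]       # cluster of the query point
classes_in_cluster = y[labels == cluster_of_query]           # CLASS values of points in that cluster
predicted_class = np.bincount(classes_in_cluster).argmax()   # majority CLASS in that cluster
print("Predicted class:", predicted_class)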


5. The following training examples map descriptions of individuals onto high, medium and
low credit-worthiness.
medium skiing design single twenties no -> highRisk
high golf trading married forties yes -> lowRisk
low speedway transport married thirties yes -> medRisk
medium football banking single thirties yes -> lowRisk
high flying media married fifties yes -> highRisk
low football security single twenties no -> medRisk
medium golf media single thirties yes -> medRisk
medium golf transport married forties yes -> lowRisk
high skiing banking single thirties yes -> highRisk
low golf unemployed married forties yes -> highRisk
Input attributes are (from left to right) income, recreation, job, status, age-group, home-owner.
Find the unconditional probability of 'golf' and the conditional probability of 'single' given
'medRisk' in the dataset.

import numpy as np
import pandas as pd

# the CSV holds the table above, with columns including recreation, status and risk
df = pd.read_csv(r'E:\test\e5.csv')
df.head()
print(len(df))
print(df)

# P(A|B) = P(A and B) / P(B)
# joint distribution of status and risk
data = pd.crosstab(df.status, df.risk, normalize=True, margins=True)
print(data)
a = data.medrisk.single                            # P(single and medRisk)
print(a)
b = data.medrisk.single + data.medrisk.married     # P(medRisk)
print("The conditional Prob is")
print(a / b)

# joint distribution of recreation and risk
data1 = pd.crosstab(df.recreation, df.risk, normalize=True, margins=True)
print(data1)

# unconditional probability of golf = P(golf, highRisk) + P(golf, medRisk) + P(golf, lowRisk)
y = data1.highrisk.golf + data1.medrisk.golf + data1.lowrisk.golf
print("The unconditional Prob is")
print(y)

Output:
Unconditional probability of golf: = 0.4
Conditional probability of single given medRisk: = 0.6666666666666667
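If the CSV file is not available, the same two probabilities can be computed from an inline copy of the table. This is a sketch; the column names and value spellings below are assumptions taken from the problem statement:

import pandas as pd

rows = [
    ("medium","skiing","design","single","twenties","no","highRisk"),
    ("high","golf","trading","married","forties","yes","lowRisk"),
    ("low","speedway","transport","married","thirties","yes","medRisk"),
    ("medium","football","banking","single","thirties","yes","lowRisk"),
    ("high","flying","media","married","fifties","yes","highRisk"),
    ("low","football","security","single","twenties","no","medRisk"),
    ("medium","golf","media","single","thirties","yes","medRisk"),
    ("medium","golf","transport","married","forties","yes","lowRisk"),
    ("high","skiing","banking","single","thirties","yes","highRisk"),
    ("low","golf","unemployed","married","forties","yes","highRisk"),
]
df = pd.DataFrame(rows, columns=["income","recreation","job","status","agegroup","homeowner","risk"])

# unconditional probability: fraction of rows whose recreation is golf
p_golf = (df.recreation == "golf").mean()

# conditional probability: P(single and medRisk) / P(medRisk)
med = df[df.risk == "medRisk"]
p_single_given_med = (med.status == "single").mean()

print("P(golf) =", p_golf)                           # 0.4
print("P(single | medRisk) =", p_single_given_med)   # 0.666...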
6. Implement linear regression using Python.

import matplotlib.pyplot as plt
from scipy import stats

# sample data: x is the independent variable, y the observed response
x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

# fit a least-squares line y = slope * x + intercept
slope, intercept, r, p, std_err = stats.linregress(x, y)

def myfunc(x):
    return slope * x + intercept

mymodel = list(map(myfunc, x))   # predicted y values for each x

plt.scatter(x, y)
plt.plot(x, mymodel)
plt.show()
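To gauge the quality of the fit, the slope, intercept, and correlation coefficient r returned by linregress can also be printed:

print("slope =", slope, "intercept =", intercept, "r =", r)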

7. Implement the Naïve Bayes classifier to classify English text


import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn import metrics

# the CSV contains one text document and one yes/no label per row
data = pd.read_csv(r'E:\test\naivebayes.csv', names=['text', 'label'])
print("\n the dataset is:\n", data)
print("\n the dimensions of the dataset", data.shape)

data['labelnum'] = data.label.map({'yes': 1, 'no': 0})
x = data.text
y = data.labelnum
print(x)
print(y)

# convert the text into TF-IDF feature vectors
vectorizer = TfidfVectorizer()
data = vectorizer.fit_transform(x)
print('\n the Tfidf features of Dataset:\n')
df = pd.DataFrame(data.toarray(), columns=vectorizer.get_feature_names_out())
df.head()

print("\n Train test split:\n")
xtrain, xtest, ytrain, ytest = train_test_split(data, y, test_size=0.3, random_state=2)
print("\n the total number of training data", ytrain.shape)
print("\n the total number of test data", ytest.shape)

# train a multinomial Naive Bayes classifier and evaluate it on the test split
clf = MultinomialNB().fit(xtrain, ytrain)
predicted = clf.predict(xtest)
print("\n the accuracy of classifier is", metrics.accuracy_score(ytest, predicted))
print("\n the confusion matrix is\n", metrics.confusion_matrix(ytest, predicted))
print("\n the value of precision is", metrics.precision_score(ytest, predicted))
print("\n the value of recall is", metrics.recall_score(ytest, predicted))

OUTPUT:
The Tfidf features of Dataset:
Train test split:
the total number of training data (12,)
the total number of test data (6,)
the accuracy of classifier is 0.6666666666666666
8. Implement an algorithm to demonstrate the significance of genetic algorithm
# genetic algorithm search of the onemax optimization problem
from numpy.random import randint
from numpy.random import rand

# objective function (negated so that minimizing it maximizes the number of 1s)
def onemax(x):
    return -sum(x)

# tournament selection
def selection(pop, scores, k=3):
    # first random selection
    selection_ix = randint(len(pop))
    for ix in randint(0, len(pop), k-1):
        # check if better (e.g. perform a tournament)
        if scores[ix] < scores[selection_ix]:
            selection_ix = ix
    return pop[selection_ix]

# crossover two parents to create two children
def crossover(p1, p2, r_cross):
    # children are copies of parents by default
    c1, c2 = p1.copy(), p2.copy()
    # check for recombination
    if rand() < r_cross:
        # select crossover point that is not on the end of the string
        pt = randint(1, len(p1)-2)
        # perform crossover
        c1 = p1[:pt] + p2[pt:]
        c2 = p2[:pt] + p1[pt:]
    return [c1, c2]

# mutation operator
def mutation(bitstring, r_mut):
    for i in range(len(bitstring)):
        # check for a mutation
        if rand() < r_mut:
            # flip the bit
            bitstring[i] = 1 - bitstring[i]

# genetic algorithm
def genetic_algorithm(objective, n_bits, n_iter, n_pop, r_cross, r_mut):
    # initial population of random bitstrings
    pop = [randint(0, 2, n_bits).tolist() for _ in range(n_pop)]
    # keep track of best solution
    best, best_eval = 0, objective(pop[0])
    # enumerate generations
    for gen in range(n_iter):
        # evaluate all candidates in the population
        scores = [objective(c) for c in pop]
        # check for new best solution
        for i in range(n_pop):
            if scores[i] < best_eval:
                best, best_eval = pop[i], scores[i]
                print(">%d, new best f(%s) = %.3f" % (gen, pop[i], scores[i]))
        # select parents
        selected = [selection(pop, scores) for _ in range(n_pop)]
        # create the next generation
        children = list()
        for i in range(0, n_pop, 2):
            # get selected parents in pairs
            p1, p2 = selected[i], selected[i+1]
            # crossover and mutation
            for c in crossover(p1, p2, r_cross):
                # mutation
                mutation(c, r_mut)
                # store for next generation
                children.append(c)
        # replace population
        pop = children
    return [best, best_eval]

# define the total iterations
n_iter = 100
# bits
n_bits = 20
# define the population size
n_pop = 100
# crossover rate
r_cross = 0.9
# mutation rate (the value is missing in the original listing; a typical choice is one expected bit flip per string)
r_mut = 1.0 / float(n_bits)

# run the search and report the best bitstring found
best, score = genetic_algorithm(onemax, n_bits, n_iter, n_pop, r_cross, r_mut)
print('Done!')
print('f(%s) = %f' % (best, score))

9. Implement the finite words classification system using the Back-propagation algorithm

import numpy as np

# training data: two input features and one target value per example
X = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)
y = np.array(([92], [86], [89]), dtype=float)

# normalize inputs and outputs to the range [0, 1]
X = X / np.amax(X, axis=0)
y = y / 100

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def derivatives_sigmoid(x):
    return x * (1 - x)

# hyperparameters and network shape
epoch = 7000
lr = 0.1
inputlayer_neurons = 2
hiddenlayer_neurons = 3
output_neurons = 1

# random initialization of weights and biases
wh = np.random.uniform(size=(inputlayer_neurons, hiddenlayer_neurons))
bh = np.random.uniform(size=(1, hiddenlayer_neurons))
wout = np.random.uniform(size=(hiddenlayer_neurons, output_neurons))
bout = np.random.uniform(size=(1, output_neurons))

for i in range(epoch):
    # forward propagation
    hinp1 = np.dot(X, wh)
    hinp = hinp1 + bh
    hlayer_act = sigmoid(hinp)
    outinp1 = np.dot(hlayer_act, wout)
    outinp = outinp1 + bout
    output = sigmoid(outinp)

    # back propagation of error
    EO = y - output
    outgrad = derivatives_sigmoid(output)
    d_output = EO * outgrad
    EH = d_output.dot(wout.T)
    hiddengrad = derivatives_sigmoid(hlayer_act)
    d_hiddenlayer = EH * hiddengrad

    # weight updates (the hidden-layer update was missing in the original listing)
    wout += hlayer_act.T.dot(d_output) * lr
    wh += X.T.dot(d_hiddenlayer) * lr

print("Input: \n" + str(X))
print("Actual Output: \n" + str(y))
print("Predicted Output: \n", output)

OUTPUT:

Input:

[[0.66666667 1. ]

[0.33333333 0.55555556]

[1. 0.66666667]]

Actual Output: [[0.92]

[0.86]

[0.89]]

Predicted Output:

[[0.82425907]

[0.81420906]

[0.82360573]]
