6CS4-22 - ML Lab Manual
TABLE OF CONTENTS
S.No.  Topic/Name of Experiment
GENERAL DETAILS
1      Vision & Mission of Institute and Department
A      Zero Lecture
1      Write a program to demonstrate basic data types in Python.
2      Write a program to compute the distance between two points, taking input from the user. Write a program add.py that takes 2 numbers as command line arguments and prints their sum.
VISION & MISSION
INSTITUTE VISION & MISSION
VISION
To create knowledge based society with scientific temper, team spirit and dignity of
labor to face the global competitive challenges
MISSION
To evolve and develop skill based systems for effective delivery of knowledge so as
to equip young professionals with dedication & commitment to excellence in all
spheres of life.
DEPARTMENT VISION & MISSION
VISION
Evolve as a center of excellence with wider recognition and to adapt the rapid innovation
in Computer Engineering.
MISSION
To contribute significantly to the research and the discovery of new arenas of knowledge
and methods in the rapid developing field of Computer Engineering.
To support society through participation and transfer of advanced technology from one
sector to another.
ML Lab (6CS4-22) manual Department of Advance Computing
Poornima College of Engineering, Jaipur ML Lab Manual
EVALUATION SCHEME
Component                    Experiment   Viva          Total
I+II Mid Term Examination    20           10            30
End Term Examination         20           10            30

Component                    Attendance   Performance   Total
Attendance and performance   10           5             15

Total Marks: 75
LAB OUTCOMES
LO1 To choose basic Python libraries and commands used in Machine Learning
LO2 To apply knowledge of machine learning algorithms for problem statements provided
PO5 Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern engineering and IT tools including prediction and modeling to complex engineering activities with an understanding of the limitations.
PO6 The engineer and society: Apply reasoning informed by the contextual knowledge to assess societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the professional engineering practice.
PO8 Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of the engineering practice.
PO9 Individual and teamwork: Function effectively as an individual, and as a member or leader in diverse teams, and in multidisciplinary settings.
PO10 Communication: Communicate effectively on complex engineering activities with the engineering community and with society at large, such as being able to comprehend and write effective reports and design documentation, make effective presentations, and give and receive clear instructions.
PO11 Project management and finance: Demonstrate knowledge and understanding of the engineering and management principles and apply these to one’s own work, as a member and leader in a team, to manage projects and in multidisciplinary environments.
PO12 Life-long learning: Recognize the need for, and have the preparation and ability to engage in independent and life-long learning in the broadest context of technological change.
PSO2 Exhibit the knowledge and skills in the field of Mechanical & Allied engineering concepts.
PSO3 Apply the knowledge of skills in HVAC&R and Automobile engineering.
DO’S
Enter the lab on time and leave at proper time.
Wait for the previous class to leave before the next class enters.
Keep the bag outside in the respective racks.
Utilize lab hours in the corresponding experiment.
Turn off the machine before leaving the lab unless a member of lab staff has specifically
told you not to do so.
Leave the labs at least as nice as you found them.
If you notice a problem with a piece of equipment (e.g. a computer doesn't respond) or the
room in general (e.g. cooling, heating, lighting) please report it to lab staff immediately.
Do not attempt to fix the problem yourself.
DON’TS
Don't abuse the equipment.
Do not adjust the heat or air conditioners. If you feel the temperature is not properly set,
inform lab staff; we will attempt to maintain a balance that is healthy for people and
machines.
Do not attempt to reboot a computer. Report problems to lab staff.
Do not remove or modify any software or file without permission.
Do not remove printers and machines from the network without being explicitly told to do
so by lab staff.
Don't monopolize equipment. If you're going to be away from your machine for more than
10 or 15 minutes, log out before leaving. This is both for the security of your account, and
to ensure that others are able to use the lab resources while you are not.
Don’t use the internet or internet chat of any kind during your regular lab schedule.
Do not download or upload MP3, JPG or MPEG files.
No games are allowed in the lab sessions.
No hardware including USB drives can be connected or disconnected in the labs without
prior permission of the lab in-charge.
No food or drink is allowed in the lab or near any of the equipment. Aside from the fact that
it leaves a mess and attracts pests, spilling anything on a keyboard or other piece of computer
equipment could cause (and in fact has caused) permanent, irreparable, and costly damage. If you
need to eat or drink, take a break and do so in the canteen.
Don’t bring any external material into the lab, except your lab record, copy and books.
Don’t bring mobile phones into the lab. If necessary, keep them in silent mode.
Please be considerate of those around you, especially in terms of noise level. While labs are
a natural place for conversations of all types, kindly keep the volume turned down.
If you are having problems or questions, please go to either the faculty, lab in-charge or the lab
supporting staff. They will help you. We need your full support and cooperation for smooth
functioning of the lab.
All the students are supposed to prepare the theory regarding the next experiment/
Program.
Students are supposed to bring their lab records as per their lab schedule.
Previous experiment/program should be written in the lab record.
If applicable trace paper/graph paper must be pasted in lab record with proper labeling.
All students must follow the instructions; those who fail to do so may not be allowed in
the lab.
Zero Lecture
4. Deep learning
Falling hardware prices and the development of GPUs for personal use in the last few years have
contributed to the development of deep learning, which uses artificial neural networks with multiple
hidden layers. This approach tries to model the way the human brain processes light and sound into
vision and hearing. Some successful applications of deep learning are computer vision and speech
recognition.
7. Clustering
Cluster analysis is the assignment of a set of observations into subsets (called clusters) so that
observations within the same cluster are similar according to some predesignated criterion or criteria,
while observations drawn from different clusters are dissimilar. Different clustering techniques make
different assumptions on the structure of the data, often defined by some similarity metric and evaluated
for example by internal compactness (similarity between members of the same cluster) and separation
between different clusters. Other methods are based on estimated density and graph connectivity.
Clustering is a method of unsupervised learning, and a common technique for statistical data analysis.
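The idea of grouping observations so that members of one cluster lie close together can be illustrated with a minimal k-means sketch. The six points, the choice of two clusters and the starting centroids are made-up values for illustration, and k-means is only one of the clustering techniques described above.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: alternate an assignment step and a centroid-update step."""
    for _ in range(iterations):
        # assignment step: attach every point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [math.dist(p, c) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # update step: move each centroid to the mean of its cluster
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(tuple(sum(v) / len(cluster) for v in zip(*cluster)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid unchanged
        if new_centroids == centroids:
            break  # assignments no longer change
        centroids = new_centroids
    return centroids, clusters

# hypothetical data: two visually separated groups
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```

With this data the first three points end up in one cluster and the last three in the other, which is the "internal compactness and separation" criterion in its simplest form.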
8. Bayesian networks
A Bayesian network, belief network or directed acyclic graphical model is a probabilistic graphical
model that represents a set of random variables and their conditional independencies via a directed
acyclic graph (DAG). For example, a Bayesian network could represent the probabilistic relationships
between diseases and symptoms. Given symptoms, the network can be used to compute the probabilities
of the presence of various diseases. Efficient algorithms exist that perform inference and learning.
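The disease and symptom example can be worked through by hand for the smallest possible network, a single edge Disease -> Symptom. The sketch below applies Bayes' rule by enumerating the joint distribution; the probability values are invented for illustration and are not taken from any data set.

```python
# Two-node network: Disease -> Symptom (all numbers are illustrative assumptions)
p_disease = {True: 0.01, False: 0.99}                 # prior P(Disease)
p_symptom_given_disease = {True: 0.90, False: 0.05}   # P(Symptom=True | Disease)

def posterior_disease(symptom_present):
    """Return P(Disease | Symptom) by enumerating the joint distribution."""
    joint = {}
    for d in (True, False):
        p_s = p_symptom_given_disease[d]
        likelihood = p_s if symptom_present else 1.0 - p_s
        joint[d] = p_disease[d] * likelihood
    evidence = sum(joint.values())  # P(Symptom = symptom_present)
    return {d: joint[d] / evidence for d in joint}

posterior = posterior_disease(True)
print(posterior)
```

Even with a symptom that is seen in 90% of patients, the posterior probability of the disease stays near 15% here, because the prior is low. Larger networks follow the same principle but need dedicated inference algorithms such as variable elimination.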
9. Reinforcement learning
Reinforcement learning is concerned with how an agent ought to take actions in an environment so as
to maximize some notion of long-term reward. Reinforcement learning algorithms attempt to find a
policy that maps states of the world to the actions the agent ought to take in those states. Reinforcement
learning differs from the supervised learning problem in that correct input/output pairs are never
presented, nor sub-optimal actions explicitly corrected.
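A policy that maps states to actions can be learned from reward alone, as the following minimal tabular Q-learning sketch suggests. Q-learning is one possible reinforcement learning algorithm, chosen here for brevity; the five-state corridor, the reward of 1 at the right end, and the values of ALPHA and GAMMA are all assumptions made for illustration.

```python
import random

N_STATES = 5                # states 0..4; state 4 is the goal (hypothetical corridor)
ACTIONS = ("left", "right")
ALPHA, GAMMA = 0.5, 0.9     # learning rate and discount factor (assumed values)

def step(state, action):
    """Move one cell; reaching the goal yields reward 1, everything else 0."""
    nxt = max(state - 1, 0) if action == "left" else state + 1
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)          # explore with random actions
            nxt, reward = step(state, action)
            best_next = max(q[(nxt, a)] for a in ACTIONS)
            # move the estimate toward reward + discounted best future value
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = nxt
    return q

q = train()
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

No correct action is ever shown to the agent; the greedy policy read from the learned table nevertheless moves right in every non-goal state.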
can thus be removed to reduce calculation cost without incurring much loss of information. Common
optimality criteria include accuracy, similarity and information measures.
Experiment–1
AIM: Implement and demonstrate the FIND-S algorithm for finding the most specific
hypothesis based on a given set of training data samples. Read the training data from a
.CSV file.
Program:
import csv

def read_data(filename):
    # read every row of the CSV file into a list of lists
    with open(filename, 'r') as csvfile:
        datareader = csv.reader(csvfile, delimiter=',')
        traindata = []
        for row in datareader:
            traindata.append(row)
    return traindata

def isConsistent(h, d):
    # True if hypothesis h covers example d (the last column of d is the label)
    if len(h) != len(d) - 1:
        print('Number of attributes are not same in hypothesis.')
        return False
    matched = 0
    for i in range(len(h)):
        if h[i] == d[i] or h[i] == 'any':
            matched = matched + 1
    return matched == len(h)

def makeConsistent(h, d):
    # minimally generalise h ('phi' = empty, 'any' = wildcard) so that it covers d
    for i in range(len(h)):
        if h[i] == 'phi':
            h[i] = d[i]
        elif h[i] != d[i]:
            h[i] = 'any'
    return h

filename = "finds.csv"
dataset = read_data(filename)
print(dataset)
num_attributes = len(dataset[0]) - 1   # every column except the target
hypothesis = ['0'] * num_attributes    # '0' marks an attribute with no value seen yet
print("Initial Hypothesis")
print(hypothesis)
print("The Hypothesis are")
for i in range(len(dataset)):
    target = dataset[i][-1]
    if target == 'Yes':                # FIND-S only looks at positive examples
        for j in range(num_attributes):
            if hypothesis[j] == '0':
                hypothesis[j] = dataset[i][j]
            if hypothesis[j] != dataset[i][j]:
                hypothesis[j] = '?'
    print(i+1, '=', hypothesis)
print("Final Hypothesis")
print(hypothesis)
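To see what FIND-S produces without needing a CSV file, the following sketch runs the same generalisation rule on four in-memory examples. The rows are the textbook EnjoySport examples, used here as an assumed stand-in for the contents of finds.csv, which are not shown in this manual.

```python
def find_s(examples):
    """Return the most specific hypothesis consistent with the positive examples."""
    hypothesis = None
    for *attributes, label in examples:
        if label != "Yes":
            continue  # negative examples are ignored by FIND-S
        if hypothesis is None:
            hypothesis = list(attributes)  # first positive example, taken as is
        else:
            # keep matching values, replace mismatches with the wildcard '?'
            hypothesis = [h if h == a else "?" for h, a in zip(hypothesis, attributes)]
    return hypothesis

# assumed sample data (EnjoySport); the last column is the target
examples = [
    ["Sunny", "Warm", "Normal", "Strong", "Warm", "Same", "Yes"],
    ["Sunny", "Warm", "High", "Strong", "Warm", "Same", "Yes"],
    ["Rainy", "Cold", "High", "Strong", "Warm", "Change", "No"],
    ["Sunny", "Warm", "High", "Strong", "Cool", "Change", "Yes"],
]
result = find_s(examples)
print(result)  # ['Sunny', 'Warm', '?', 'Strong', '?', '?']
```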
Output:
Experiment–2
AIM: For a given set of training data examples stored in a .CSV file, implement and demonstrate the
Candidate-Elimination algorithm to output a description of the set of all hypotheses consistent with the
training examples.
Program:
import numpy as np
import pandas as pd
data = pd.read_csv('finds1.csv')
concepts = np.array(data.iloc[:,0:-1])   # attribute columns
target = np.array(data.iloc[:,-1])       # class label column
def learn(concepts, target):
    # start with the first example as the most specific hypothesis
    specific_h = concepts[0].copy()
    print("initialization of specific_h and general_h")
    print(specific_h)
    general_h = [["?" for i in range(len(specific_h))] for i in range(len(specific_h))]
    print(general_h)
    for i, h in enumerate(concepts):
        if target[i] == "Yes":
            # positive example: generalise specific_h where it disagrees
            for x in range(len(specific_h)):
                if h[x] != specific_h[x]:
                    specific_h[x] = '?'
                    general_h[x][x] = '?'
        if target[i] == "No":
            # negative example: specialise general_h where the example differs
            for x in range(len(specific_h)):
                if h[x] != specific_h[x]:
                    general_h[x][x] = specific_h[x]
                else:
                    general_h[x][x] = '?'
        print(" steps of Candidate Elimination Algorithm",i+1)
        print("Specific_h ",i+1,"\n ")
        print(specific_h)
        print("general_h ", i+1, "\n ")
        print(general_h)
    # drop the rows of general_h that are still fully general
    all_general = ['?'] * len(specific_h)
    general_h = [g for g in general_h if g != all_general]
    return specific_h, general_h
s_final, g_final = learn(concepts, target)
print("Final Specific_h:", s_final, sep="\n")
print("Final General_h:", g_final, sep="\n")
Output:
Experiment–3
AIM: Write a program to demonstrate the working of the decision tree based ID3 algorithm.
Use an appropriate data set for building the decision tree and apply this knowledge to
classify a new sample.
Program:
import pandas as pd
import numpy as np
dataset = pd.read_csv('playtennis.csv', names=['outlook','temperature','humidity','wind','class'])
def entropy(target_col):
    # entropy of the class distribution in target_col
    elements, counts = np.unique(target_col, return_counts=True)
    entropy = np.sum([(-counts[i]/np.sum(counts))*np.log2(counts[i]/np.sum(counts))
                      for i in range(len(elements))])
    return entropy
def InfoGain(data, split_attribute_name, target_name="class"):
    total_entropy = entropy(data[target_name])
    vals, counts = np.unique(data[split_attribute_name], return_counts=True)
    # entropy of each subset, weighted by the fraction of rows it contains
    Weighted_Entropy = np.sum([(counts[i]/np.sum(counts))*entropy(data.where(data[split_attribute_name]==vals[i]).dropna()[target_name]) for i in range(len(vals))])
    # information gain = entropy before the split minus weighted entropy after it
    Information_Gain = total_entropy - Weighted_Entropy
    return Information_Gain
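The recursive step of ID3, which repeatedly picks the attribute with the highest information gain and splits on it, can be sketched in plain Python as follows. This is an illustrative version, not the manual's own implementation: the function names id3 and classify are hypothetical, and the 14 rows are the textbook PlayTennis examples, assumed to match playtennis.csv.

```python
import math
from collections import Counter

COLUMNS = ["outlook", "temperature", "humidity", "wind", "class"]
ROWS = [  # textbook PlayTennis data (assumed contents of playtennis.csv)
    ("Sunny", "Hot", "High", "Weak", "No"), ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"), ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"), ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"), ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"), ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"), ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"), ("Rain", "Mild", "High", "Strong", "No"),
]
DATA = [dict(zip(COLUMNS, row)) for row in ROWS]

def entropy(rows, target="class"):
    counts = Counter(r[target] for r in rows)
    return -sum((c / len(rows)) * math.log2(c / len(rows)) for c in counts.values())

def info_gain(rows, attr, target="class"):
    remainder = 0.0
    for value in set(r[attr] for r in rows):
        subset = [r for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset, target)
    return entropy(rows, target) - remainder

def id3(rows, attrs, target="class"):
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:
        return labels[0]                             # pure node becomes a leaf
    if not attrs:
        return Counter(labels).most_common(1)[0][0]  # no attributes left: majority vote
    best = max(attrs, key=lambda a: info_gain(rows, a, target))
    tree = {best: {}}
    for value in sorted(set(r[best] for r in rows)):
        subset = [r for r in rows if r[best] == value]
        tree[best][value] = id3(subset, [a for a in attrs if a != best], target)
    return tree

def classify(tree, sample):
    """Walk the nested dict until a leaf label is reached."""
    while isinstance(tree, dict):
        attr = next(iter(tree))
        tree = tree[attr][sample[attr]]
    return tree

tree = id3(DATA, COLUMNS[:-1])
print(tree)
new_sample = {"outlook": "Sunny", "temperature": "Cool", "humidity": "High", "wind": "Strong"}
print(classify(tree, new_sample))
```

The nested dictionary uses the same layout as the tree shown in the Output section, so a new sample is classified by following one key per level.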
Output:
Display Tree
{'outlook': {'Overcast': 'Yes', 'Rain': {'wind': {'Strong': 'No', 'Weak': 'Yes'}}, 'Sunny':
{'humidity': {'High': 'No', 'Normal': 'Yes'}}}}
Experiment–4
AIM: Build an Artificial Neural Network by implementing the Back propagation Algorithm
and test the same using appropriate data sets.
Program:
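import numpy as np
X = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)
y = np.array(([92], [86], [89]), dtype=float)
X = X/np.amax(X,axis=0) # scale each input column by its maximum
y = y/100 # scale the target to the range 0-1
#Sigmoid Function
def sigmoid (x):
    return (1/(1 + np.exp(-x)))
#Derivative of the Sigmoid Function, written in terms of the sigmoid output x
def derivatives_sigmoid(x):
    return x * (1 - x)
#Variable initialization (assumed values; the listing does not state them)
epoch = 7000 # number of training iterations
lr = 0.1 # learning rate
inputlayer_neurons = 2 # number of features in the data set
hiddenlayer_neurons = 3 # number of hidden-layer neurons
output_neurons = 1 # number of neurons in the output layer
#Random weight and bias initialization
wh = np.random.uniform(size=(inputlayer_neurons,hiddenlayer_neurons))
bh = np.random.uniform(size=(1,hiddenlayer_neurons))
wout = np.random.uniform(size=(hiddenlayer_neurons,output_neurons))
bout = np.random.uniform(size=(1,output_neurons))
for i in range(epoch):
    #Forward Propagation
    hinp1=np.dot(X,wh)
    hinp=hinp1 + bh
    hlayer_act = sigmoid(hinp)
    outinp1=np.dot(hlayer_act,wout)
    outinp= outinp1+ bout
    output = sigmoid(outinp)
    #Backpropagation
    EO = y-output # error at the output layer
    outgrad = derivatives_sigmoid(output)
    d_output = EO* outgrad
    EH = d_output.dot(wout.T) # error propagated back to the hidden layer
    hiddengrad = derivatives_sigmoid(hlayer_act)
    #how much hidden layer wts contributed to error
    d_hiddenlayer = EH * hiddengrad
    # dotproduct of nextlayererror and currentlayerop
    wout += hlayer_act.T.dot(d_output) *lr
    bout += np.sum(d_output, axis=0,keepdims=True) *lr
    wh += X.T.dot(d_hiddenlayer) *lr
    bh += np.sum(d_hiddenlayer, axis=0,keepdims=True) *lr
print("Input: \n" + str(X))
print("Actual Output: \n" + str(y))
print("Predicted Output: \n" ,output)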
Output:
Input:
[[ 0.66666667 1. ]
[ 0.33333333 0.55555556]
[ 1. 0.66666667]]
Actual Output:
[[ 0.92]
[ 0.86]
[ 0.89]]
Predicted Output:
[[ 0.89559591]
[ 0.88142069]
[ 0.8928407 ]]
Experiment–5
AIM: Write a program to implement the naïve Bayesian classifier for a sample training data
set stored as a .CSV file. Compute the accuracy of the classifier, considering few test data sets.
Program:
import csv
import random
import math

def loadCsv(filename):
    # read the CSV file and convert every value to float
    with open(filename, "r") as csvfile:
        dataset = list(csv.reader(csvfile))
    for i in range(len(dataset)):
        dataset[i] = [float(x) for x in dataset[i]]
    return dataset

def splitDataset(dataset, splitRatio):
    # randomly move splitRatio of the rows into the training set
    trainSize = int(len(dataset) * splitRatio)
    trainSet = []
    copy = list(dataset)
    while len(trainSet) < trainSize:
        index = random.randrange(len(copy))
        trainSet.append(copy.pop(index))
    return [trainSet, copy]

def separateByClass(dataset):
    # group the rows by their class value (last column)
    separated = {}
    for i in range(len(dataset)):
        vector = dataset[i]
        if (vector[-1] not in separated):
            separated[vector[-1]] = []
        separated[vector[-1]].append(vector)
    return separated

def mean(numbers):
    return sum(numbers)/float(len(numbers))

def stdev(numbers):
    avg = mean(numbers)
    variance = sum([pow(x-avg,2) for x in numbers])/float(len(numbers)-1)
    return math.sqrt(variance)

def summarize(dataset):
    # (mean, stdev) of every attribute; the class column is dropped
    summaries = [(mean(attribute), stdev(attribute)) for attribute in zip(*dataset)]
    del summaries[-1]
    return summaries

def summarizeByClass(dataset):
    separated = separateByClass(dataset)
    summaries = {}
    for classValue, instances in separated.items():
        summaries[classValue] = summarize(instances)
    return summaries

def calculateProbability(x, mean, stdev):
    # Gaussian probability density of x
    exponent = math.exp(-(math.pow(x-mean,2)/(2*math.pow(stdev,2))))
    return (1 / (math.sqrt(2*math.pi) * stdev)) * exponent

def calculateClassProbabilities(summaries, inputVector):
    # product of the per-attribute densities for every class
    probabilities = {}
    for classValue, classSummaries in summaries.items():
        probabilities[classValue] = 1
        for i in range(len(classSummaries)):
            mean, stdev = classSummaries[i]
            x = inputVector[i]
            probabilities[classValue] *= calculateProbability(x, mean, stdev)
    return probabilities

def predict(summaries, inputVector):
    # pick the class with the highest probability
    probabilities = calculateClassProbabilities(summaries, inputVector)
    bestLabel, bestProb = None, -1
    for classValue, probability in probabilities.items():
        if bestLabel is None or probability > bestProb:
            bestProb = probability
            bestLabel = classValue
    return bestLabel

def getPredictions(summaries, testSet):
    predictions = []
    for i in range(len(testSet)):
        result = predict(summaries, testSet[i])
        predictions.append(result)
    return predictions

def getAccuracy(testSet, predictions):
    correct = 0
    for i in range(len(testSet)):
        if testSet[i][-1] == predictions[i]:
            correct += 1
    return (correct/float(len(testSet))) * 100.0

def main():
    filename = 'data.csv'
    splitRatio = 0.67
    dataset = loadCsv(filename)
    trainingSet, testSet = splitDataset(dataset, splitRatio)
    print('Split {0} rows into train={1} and test={2} rows'.format(len(dataset),
          len(trainingSet), len(testSet)))
    # prepare model
    summaries = summarizeByClass(trainingSet)
    # test model
    predictions = getPredictions(summaries, testSet)
    accuracy = getAccuracy(testSet, predictions)
    print('Accuracy: {0}%'.format(accuracy))

main()
OUTPUT :
Split 306 rows into train=205 and test=101 rows
Accuracy: 72.27722772277228%
Experiment-6
AIM: Assuming a set of documents that need to be classified, use the naïve Bayesian Classifier
model to perform this task. Built-in Java classes/API can be used to write the program. Calculate
the accuracy, precision, and recall for your data set.
Program:
import pandas as pd
msg=pd.read_csv('naivetext1.csv',names=['message','label'])
print('The dimensions of the dataset',msg.shape)
msg['labelnum']=msg.label.map({'pos':1,'neg':0})
X=msg.message
y=msg.labelnum
print(X)
print(y)
from sklearn.model_selection import train_test_split
xtrain,xtest,ytrain,ytest=train_test_split(X,y)
print(xtest.shape)
print(xtrain.shape)
print(ytest.shape)
print(ytrain.shape)
from sklearn.feature_extraction.text import CountVectorizer
count_vect = CountVectorizer()
xtrain_dtm = count_vect.fit_transform(xtrain)
xtest_dtm=count_vect.transform(xtest)
from sklearn.naive_bayes import MultinomialNB
clf = MultinomialNB().fit(xtrain_dtm,ytrain)
predicted = clf.predict(xtest_dtm)
from sklearn import metrics
print('Accuracy metrics')
print('Accuracy of the classifier is',metrics.accuracy_score(ytest,predicted))
print('Confusion matrix')
print(metrics.confusion_matrix(ytest,predicted))
print('Recall and Precision ')
print(metrics.recall_score(ytest,predicted))
print(metrics.precision_score(ytest,predicted))
Output:
17 0
Accuracy metrics
Confusion matrix
[[3 1]
[0 1]]
Experiment-7
AIM: Write a program to construct a Bayesian network considering medical data. Use this
model to demonstrate the diagnosis of heart patients using standard Heart Disease Data
Set. You can use Java/Python ML library classes/API.
Program:
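import numpy as np
import pandas as pd
from pgmpy.inference import VariableElimination
from pgmpy.models import BayesianModel # newer pgmpy releases rename this class to BayesianNetwork
from pgmpy.estimators import MaximumLikelihoodEstimator
names = ['age', 'sex', 'cp', 'trestbps', 'chol', 'fbs', 'restecg', 'thalach', 'exang', 'oldpeak', 'slope', 'ca',
         'thal', 'heartdisease']
heartDisease = pd.read_csv('heart.csv', names = names)
heartDisease = heartDisease.replace('?', np.nan)
# Network structure: an assumed set of edges for illustration; the listing does not show the structure
model = BayesianModel([('age', 'trestbps'), ('age', 'fbs'), ('sex', 'trestbps'),
                       ('exang', 'trestbps'), ('trestbps', 'heartdisease'), ('fbs', 'heartdisease'),
                       ('heartdisease', 'restecg'), ('heartdisease', 'thalach'), ('heartdisease', 'chol')])
# learn the conditional probability tables from the data
model.fit(heartDisease, estimator=MaximumLikelihoodEstimator)
HeartDisease_infer = VariableElimination(model)
# probability of heart disease given age and sex
q = HeartDisease_infer.query(variables=['heartdisease'], evidence={'age': 37, 'sex' :0})
print(q['heartdisease']) # older pgmpy returns a dict; if query returns the factor itself, use print(q)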
Output:
╒════════════════╤═════════════════════╕
│ heartdisease   │ phi(heartdisease)   │
╞════════════════╪═════════════════════╡
│ heartdisease_0 │              0.5593 │
├────────────────┼─────────────────────┤
│ heartdisease_1 │              0.4407 │
╘════════════════╧═════════════════════╛
Experiment-8
AIM: Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set for
clustering using k-Means algorithm. Compare the results of these two algorithms and comment
on the quality of clustering. You can add Java/Python ML library classes/API in the program.
Program:
import numpy as np
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt
from sklearn.mixture import GaussianMixture
import pandas as pd
X=pd.read_csv("kmeansdata.csv")
x1 = X['Distance_Feature'].values
x2 = X['Speeding_Feature'].values
X = np.array(list(zip(x1, x2))).reshape(len(x1), 2)
plt.plot()
plt.xlim([0, 100])
plt.ylim([0, 50])
plt.title('Dataset')
plt.scatter(x1, x2)
plt.show()
#code for EM
gmm = GaussianMixture(n_components=3)
gmm.fit(X)
em_predictions = gmm.predict(X)
print("\nEM predictions")
print(em_predictions)
print("mean:\n",gmm.means_)
print('\n')
print("Covariances\n",gmm.covariances_)
print(X)
plt.title('Expectation Maximization')
plt.scatter(X[:,0], X[:,1],c=em_predictions,s=50)
plt.show()
#code for Kmeans
kmeans = KMeans(n_clusters=3)
kmeans.fit(X)
print(kmeans.cluster_centers_)
print(kmeans.labels_)
plt.title('KMEANS')
plt.scatter(X[:,0], X[:,1], c=kmeans.labels_, cmap='rainbow')
plt.scatter(kmeans.cluster_centers_[:,0] ,kmeans.cluster_centers_[:,1], color='black')
plt.show()
Output:
EM predictions
[0 0 0 1 0 1 1 1 2 1 2 2 1 1 2 1 2 1 0 1 0 1 1]
mean:
[[57.70629058 25.73574491]
[52.12044022 22.46250453]
[46.4364858 39.43288647]]
Covariances
[[[83.51878796 14.926902 ]
[14.926902 2.70846907]]
[[29.95910352 15.83416554]
[15.83416554 67.01175729]]
[[79.34811849 29.55835938]
[29.55835938 18.17157304]]]
[[71.24 28. ]
[52.53 25. ]
[64.54 27. ]
[55.69 22. ]
[54.58 25. ]
[41.91 10. ]
[58.64 20. ]
[52.02 8. ]
[31.25 34. ]
[44.31 19. ]
[49.35 40. ]
[58.07 45. ]
[44.22 22. ]
[55.73 19. ]
[46.63 43. ]
[52.97 32. ]
[46.25 35. ]
[51.55 27. ]
[57.05 26. ]
[58.45 30. ]
[43.42 23. ]
[55.68 37. ]
[55.15 18. ]
[[57.74090909 24.27272727]
[48.6 38. ]
[45.176 16.4 ]]
[0 0 0 0 0 2 0 2 1 2 1 1 2 0 1 1 1 0 0 0 2 1 0]
Experiment-9
AIM: Write a program to implement k-Nearest Neighbour algorithm to classify the iris data set.
Print both correct and wrong predictions. Java/Python ML library classes can be used for this
problem.
Program:
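from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
import pandas as pd
dataset=pd.read_csv("iris.csv")
# assumption: the last column holds the class label, all other columns are features
X=dataset.iloc[:,:-1]
y=dataset.iloc[:,-1]
X_train,X_test,y_train,y_test=train_test_split(X,y,random_state=0,test_size=0.25)
classifier=KNeighborsClassifier(n_neighbors=8,p=3,metric='euclidean')
classifier.fit(X_train,y_train)
y_pred=classifier.predict(X_test)
# print every test sample with its actual and predicted class
for actual,predicted in zip(y_test,y_pred):
    print(actual,predicted,'Correct' if actual==predicted else 'Wrong')
print(confusion_matrix(y_test,y_pred))
print(classification_report(y_test,y_pred))
print('Accuracy:',accuracy_score(y_test,y_pred))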
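What the library classifier does internally can be shown with a small from-scratch sketch: measure the distance from the query to every training point, take the k closest, and let them vote. The two-dimensional points and labels below are made up for illustration (they are not iris measurements), and the last test point is deliberately mislabelled so that a wrong prediction appears in the output.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority vote among the k training points closest to query."""
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# hypothetical training data: (features, label)
train = [((1.0, 1.1), "A"), ((1.2, 0.9), "A"), ((0.8, 1.0), "A"),
         ((3.0, 3.2), "B"), ((3.1, 2.9), "B"), ((2.9, 3.0), "B")]
# hypothetical test data; the third label is intentionally inconsistent
test = [((1.1, 1.0), "A"), ((3.0, 3.0), "B"), ((2.8, 3.1), "A")]

results = []
for features, actual in test:
    predicted = knn_predict(train, features)
    results.append((features, actual, predicted, predicted == actual))
    print(features, "actual:", actual, "predicted:", predicted,
          "CORRECT" if predicted == actual else "WRONG")
```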
Output:
Experiment-10
AIM: Implement the non-parametric Locally Weighted Regression algorithm in order to fit data
points. Select appropriate data set for your experiment and draw graphs.
Program:
import numpy as np
from bokeh.plotting import figure, show, output_notebook
from bokeh.layouts import gridplot
from bokeh.io import push_notebook

def local_regression(x0, X, Y, tau):
    # add bias term (a leading 1) to the query point and to every sample
    x0 = np.r_[1, x0]
    X = np.c_[np.ones(len(X)), X]
    # fit model: normal equations with kernel
    xw = X.T * radial_kernel(x0, X, tau) # XTranspose * W
    beta = np.linalg.pinv(xw @ X) @ xw @ Y # @ is matrix multiplication
    # predict value
    return x0 @ beta

def radial_kernel(x0, X, tau):
    # Gaussian weight of every sample; tau is the bandwidth
    return np.exp(np.sum((X - x0) ** 2, axis=1) / (-2 * tau * tau))

n = 1000
# generate dataset
X = np.linspace(-3, 3, num=n)
print("The Data Set ( 10 Samples) X :\n",X[:10])
Y = np.log(np.abs(X ** 2 - 1) + .5)
print("The Fitting Curve Data Set (10 Samples) Y :\n",Y[:10])
# jitter X with Gaussian noise
X += np.random.normal(scale=.1, size=n)
print("Noisy (10 Samples) X :\n",X[:10])
domain = np.linspace(-3, 3, num=300)
print(" Xo Domain Space(10 Samples) :\n",domain[:10])

def plot_lwr(tau):
    # prediction through regression
    prediction = [local_regression(x0, X, Y, tau) for x0 in domain]
    # older Bokeh releases name these arguments plot_width and plot_height
    plot = figure(width=400, height=400)
    plot.title.text='tau=%g' % tau
    plot.scatter(X, Y, alpha=.3)
    plot.line(domain, prediction, line_width=2, color='red')
    return plot

# Plotting the curves with different tau
show(gridplot([
    [plot_lwr(10.), plot_lwr(1.)],
    [plot_lwr(0.1), plot_lwr(0.01)]
]))
Output:
Program:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
data = pd.read_csv('Salaries.csv')
print(data)
from sklearn.ensemble import RandomForestRegressor
Program:
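import numpy as np
import pandas as pd
import xgboost as xg
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error as MSE
# X (feature matrix) and y (target) are assumed to have been loaded from the data set beforehand
# Splitting
train_X, test_X, train_y, test_y = train_test_split(X, y,
                      test_size = 0.3, random_state = 123)
# Instantiation ('reg:squarederror' is the current name of the deprecated 'reg:linear' objective)
xgb_r = xg.XGBRegressor(objective ='reg:squarederror',
                  n_estimators = 10, seed = 123)
# Fitting the model
xgb_r.fit(train_X, train_y)
# Predicting on the test set
pred = xgb_r.predict(test_X)
# RMSE Computation
rmse = np.sqrt(MSE(test_y, pred))
print("RMSE : % f" %(rmse))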
Output:
129043.2314