01 Pima End To End Script

The document outlines an end-to-end machine learning example using the mlr package in R, detailing the process of reading training and testing data, checking for missing values, and ensuring the response variable is categorical. It describes the implementation of various classification methods including Naive Bayes, Quadratic Discriminant Analysis, Logistic Regression, CART, Random Forest, SVM, GBM, XGBoost, and KNN, along with the calculation of confusion matrices for model evaluation. Each method is systematically introduced with code snippets for training and prediction.

## End to End Machine Learning Example 1

## Learning Points: Process

library(mlr)

## Reading the Train and Test data

train = read.csv("01_pima.csv")
test = read.csv("01_pimaTest.csv")

## Check for missing values and outliers

summarizeColumns(train)
summarizeColumns(test)

## Ensuring that the dependent variable is categorical

train$Y = as.factor(train$Y)
test$Y = as.factor(test$Y)

summary(train)
summary(test)

###

## Essential step in the mlr package: make sure the response variable is
## identified correctly

trainTask = makeClassifTask(data = train,target = "Y", positive = "1")


testTask = makeClassifTask(data = test, target = "Y", positive = "1")

trainTask ## To check the details

## Method 1: Naive Bayes

## SOP
## 1. Make Learner
## 2. Train Learner with Task
## 3. Predict

nb.learner = makeLearner("classif.naiveBayes")

nb.model = train(nb.learner, trainTask)


nb.predict = predict(nb.model, testTask)

calculateConfusionMatrix(nb.predict)
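## As a sanity check, overall accuracy can be recovered by hand from the
## confusion matrix counts. The sketch below uses made-up counts (not the
## actual Pima results) just to show the arithmetic; mlr can also report
## accuracy directly via performance(nb.predict, measures = list(acc, mmce)).

```r
## Hypothetical confusion matrix: rows = truth, columns = prediction
cm = matrix(c(45, 10,
              12, 33),
            nrow = 2, byrow = TRUE,
            dimnames = list(truth = c("0", "1"), pred = c("0", "1")))

accuracy = sum(diag(cm)) / sum(cm)  ## correct predictions / total
accuracy                            ## (45 + 33) / 100 = 0.78
```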

## Method 2: Quadratic Discriminant Analysis


qda.learner = makeLearner("classif.qda", predict.type = "response")
qda.model = train(qda.learner, trainTask)
qda.predict = predict(qda.model, testTask)

calculateConfusionMatrix(qda.predict)

## Method 3: Logistic Regression


logistic.learner = makeLearner("classif.logreg",predict.type = "response")
logistic.model = train(logistic.learner,trainTask)
logistic.predict = predict(logistic.model, testTask)

calculateConfusionMatrix(logistic.predict)

## Method 4: CART

cart.learner = makeLearner("classif.rpart", predict.type = "response")


cart.model = train(cart.learner, trainTask)

## Something Extra
cartModel = getLearnerModel(cart.model) ## Extract the underlying rpart object
library(rpart.plot)
prp(cartModel, extra = 2, roundint = FALSE) ## Plot the tree; requires rpart.plot

## Make predictions
cart.predict = predict(cart.model, testTask)

calculateConfusionMatrix(cart.predict)

## Method 5: Random Forest

rf.learner = makeLearner("classif.randomForest", predict.type = "response")


rf.model = train(rf.learner, trainTask)

rf.predict = predict(rf.model, testTask)

calculateConfusionMatrix(rf.predict)

## Method 6: SVM

ksvm.learner = makeLearner("classif.ksvm", predict.type = "response")


ksvm.model = train(ksvm.learner, trainTask)
ksvm.predict = predict(ksvm.model, testTask)

calculateConfusionMatrix(ksvm.predict)

## Method 7: GBM
gbm.learner = makeLearner("classif.gbm", predict.type = "response",
                          distribution = "bernoulli")
gbm.model = train(gbm.learner, trainTask)
gbm.predict = predict(gbm.model, testTask)

calculateConfusionMatrix(gbm.predict)

## Method 8: XGBoost

## Make learner with initial (default) parameters


xgb.learner = makeLearner("classif.xgboost", predict.type = "response")
xgb.model = train(xgb.learner, trainTask)
xgb.predict = predict(xgb.model, testTask)

calculateConfusionMatrix(xgb.predict)

## Method 9: KNN

knn.learner = makeLearner("classif.knn", predict.type = "response")
knn.model = train(knn.learner, trainTask)
knn.predict = predict(knn.model, testTask)

calculateConfusionMatrix(knn.predict)
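## Rather than eyeballing nine confusion matrices, mlr can compare all the
## learners in a single call to benchmark(). This is a sketch that assumes
## the learner objects defined above; the 5-fold cross-validation setting
## is illustrative, not something the script above prescribes.

```r
## Compare all nine learners on the training task with 5-fold CV
learners = list(nb.learner, qda.learner, logistic.learner, cart.learner,
                rf.learner, ksvm.learner, gbm.learner, xgb.learner, knn.learner)
rdesc = makeResampleDesc("CV", iters = 5)
bmr = benchmark(learners, trainTask, rdesc, measures = list(acc, mmce))
getBMRAggrPerformances(bmr, as.df = TRUE)  ## one row of aggregated scores per learner
```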
