Unit 3 Classification - Dr. Vidyut D
CLASSIFICATION
Classification:
1. Classification Tree
3. Display the top five rows from the data set using the head()
function.
4. Separate the independent and dependent variables using the slicing
method.
7. Predict the test data set values using the model above.
8. Calculate the accuracy of the model using the accuracy score function.
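The numbered steps above can be sketched in Python with pandas and scikit-learn. This is a minimal sketch under stated assumptions, not the course's own notebook: the Iris data set stands in for the unnamed course data set, and the train/test split and model fitting are assumed intermediate steps, since only steps 3, 4, 7 and 8 are given here.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: Iris stands in for the course data set, which is not named here.
df = load_iris(as_frame=True).frame  # four feature columns plus a final "target" column

# Step 3: display the top five rows with head().
print(df.head())

# Step 4: separate independent (X) and dependent (y) variables by slicing.
X = df.iloc[:, :-1]  # every column except the last
y = df.iloc[:, -1]   # the last column only

# Assumed intermediate steps: hold out test data and fit a classification tree.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

# Step 7: predict the test data set values using the model.
y_pred = model.predict(X_test)

# Step 8: calculate the accuracy with accuracy_score.
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
```

The slicing in step 4 assumes the dependent variable is the last column; for a data set laid out differently, the column positions in `iloc` would need to change.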
Advantages of a decision tree
[Figure: decision tree; branches: <60, 60 to 80, >80; node labels: G, A, E]
E – Excellent
G – Good
A – Average
Tree algorithms: ID3, C4.5, C5.0 and CART
https://youtu.be/1qzrTvHGPow
Summary
• Support Vectors
• Support vectors are the data points nearest to
the hyperplane. They are the points of a data set
that, if removed, would alter the position of the
dividing hyperplane. Because of this, they can
be considered the critical elements of a data
set.
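The claim that only the support vectors fix the hyperplane can be checked on a toy example. The sketch below assumes scikit-learn's `SVC` with a linear kernel; the six points, the labels and the large `C` value (to approximate a hard margin) are invented for illustration.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: two linearly separable classes of three points each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [3.0, 3.0], [3.0, 4.0], [4.0, 3.0]])
y = np.array([0, 0, 0, 1, 1, 1])


def fit_linear_svm(features, labels):
    # A large C approximates a hard-margin SVM on separable data.
    return SVC(kernel="linear", C=1000.0).fit(features, labels)


clf = fit_linear_svm(X, y)
print("Support vector indices:", clf.support_)
print("Support vectors:")
print(clf.support_vectors_)

# Point 0, at (0, 0), lies farther from the boundary than its classmates.
# Refitting without it should leave the hyperplane essentially unchanged.
keep = [i for i in range(len(X)) if i != 0]
clf_without_far_point = fit_linear_svm(X[keep], y[keep])

# Point 3, at (3, 3), is the class-1 point nearest the boundary.
# Refitting without it should move the hyperplane.
keep = [i for i in range(len(X)) if i != 3]
clf_without_support_vector = fit_linear_svm(X[keep], y[keep])

print("Original weights:          ", clf.coef_, clf.intercept_)
print("Without a far point:       ", clf_without_far_point.coef_,
      clf_without_far_point.intercept_)
print("Without a support vector:  ", clf_without_support_vector.coef_,
      clf_without_support_vector.intercept_)
```

Comparing the three printed weight vectors shows which removal alters the dividing hyperplane and which does not.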
What is a hyperplane?
• Intuitively, the further from the hyperplane our data points lie, the
more confident we are that they have been correctly classified.
We therefore want our data points to be as far away from the
hyperplane as possible, while still being on the correct side of it.
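The intuition about distance can be made concrete. For a hyperplane written as w·x + b = 0, the signed distance of a point x is (w·x + b) / ||w||: the sign says which side the point is on, and the magnitude says how far away it is. The sketch below uses a hypothetical hyperplane and hypothetical points chosen only so the arithmetic is easy to follow.

```python
import math


def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


# Hypothetical hyperplane 3*x1 + 4*x2 - 5 = 0, so ||w|| = 5.
w, b = (3.0, 4.0), -5.0

# Hypothetical points: two on the positive side, one on the negative side.
points = {
    "far from the hyperplane": (3.0, 4.0),
    "close to the hyperplane": (1.0, 1.0),
    "on the other side": (0.0, 0.0),
}

for name, x in points.items():
    print(f"{name}: {signed_distance(w, b, x):+.2f}")
```

A point with a large positive distance is classified with more confidence than one just past the boundary, and a negative distance means the point falls on the opposite side.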
• Pros
• Accuracy
• Works well on smaller cleaner datasets
• It can be memory efficient because the decision function uses only a subset of the training points (the support vectors)
• Cons
• Isn’t suited to larger datasets as the training time with SVMs can be
high
• Less effective on noisier datasets with overlapping classes
Ensemble Learning
Meaning
A group of things or people acting or taken
together as a whole, especially a group of
musicians who regularly play together.