DSUP Exp6
Experiment No. 6
AIM:
Implement and compare a case study using an SVM classifier and a
Random Forest classifier, with web deployment.
Theory:
Machine Learning is the field of study that gives computers the capability to
learn without being explicitly programmed. ML is one of the most exciting
technologies one could come across. As is evident from the name, it gives
the computer what makes it more similar to humans: the
ability to learn. Machine learning is actively being used today, perhaps in
many more places than one would expect.
Machine Learning is an essential skill for any aspiring data analyst and data
scientist, and also for those who wish to transform a massive amount of raw
data into trends and predictions.
Supervised learning is a type of machine learning in which machines are
trained using well-labelled training data, and on the basis of that data the
machines predict the output. Labelled data means that some input data is
already tagged with the correct output. In supervised learning, the training
data provided to the machine works as a supervisor that teaches the
machine to predict the output correctly.
Supervised learning is a process of providing input data as well as correct
output data to the machine learning model. The aim of a supervised
learning algorithm is to find a mapping function that maps the input
variable (x) to the output variable (y).
Fortunately, with libraries such as scikit-learn, it’s now easy to study
structured or unstructured data using scientific methods, algorithms and
systems to extract knowledge.
Here we are going to discuss two of the most popular algorithms: Support
Vector Machines (SVMs), which separate classes with a maximum-margin
hyperplane and use kernels to handle non-linear data, and Random Forests.
RANDOM FOREST:
Random Forest is also one of the most used algorithms in machine learning.
It can be used for both classification and regression tasks. The “forest” it
builds is an ensemble of decision trees, usually trained with the “bagging”
method. The general idea of bagging is to combine several learning models
so that the overall result improves. In short, a Random Forest trains
multiple decision trees and merges their predictions to obtain a more
accurate and stable prediction.
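To make the bagging idea concrete, here is a short sketch (in Python, using scikit-learn, which the rest of this experiment relies on) that builds a toy forest by hand: each tree is fitted on a bootstrap sample of the data and the final prediction is a majority vote across the trees. It is only an illustration of the principle; the RandomForestClassifier used later handles all of this internally.

# Toy illustration of bagging: bootstrap samples + majority vote
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(42)

trees = []
for _ in range(10):                           # ten trees in this toy forest
    idx = rng.integers(0, len(X), len(X))     # bootstrap sample (with replacement)
    # max_features="sqrt" mirrors the per-split feature sampling a Random Forest uses
    tree = DecisionTreeClassifier(max_features="sqrt", random_state=0)
    trees.append(tree.fit(X[idx], y[idx]))

# Each column of "votes" holds the ten trees' predictions for one flower
votes = np.stack([t.predict(X) for t in trees])
majority = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
print("Toy forest training accuracy:", (majority == y).mean())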
Implementation:
Dataset:
We compare SVM and Random Forest on the Iris dataset (measurements of
flowers). The task is to predict the species of a flower from four features:
sepal length, sepal width, petal length and petal width.
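The Iris dataset ships with scikit-learn, so no separate download is needed. A quick sketch of loading it and inspecting the features:

from sklearn.datasets import load_iris

iris = load_iris()
print(iris.feature_names)   # sepal length, sepal width, petal length, petal width (cm)
print(iris.target_names)    # setosa, versicolor, virginica
print(iris.data.shape)      # (150, 4): 150 flowers, 4 measurements each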
Code:
SVM Classifier:
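The code for this experiment was attached as a screenshot, so the listing below is a minimal reconstruction of a typical scikit-learn SVM pipeline on Iris (train/test split, an SVC with the default RBF kernel, accuracy and a classification report); the exact hyperparameters and numbers may differ from the original run.

# SVM classifier on the Iris dataset (minimal reconstruction)
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report
import joblib

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

svm_model = SVC(kernel="rbf", C=1.0, gamma="scale")   # default RBF kernel
svm_model.fit(X_train, y_train)

y_pred = svm_model.predict(X_test)
print("SVM accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))

# Save the trained model so the web app below can load it
joblib.dump(svm_model, "svm_iris.pkl")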
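Random Forest Classifier:
For comparison, the same split is fed to a RandomForestClassifier. Again, this is a reconstruction of the usual workflow rather than the exact submitted code, so the reported numbers depend on the split and hyperparameters.

# Random Forest classifier on the same Iris split (minimal reconstruction)
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
import joblib

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

rf_model = RandomForestClassifier(n_estimators=100, random_state=42)
rf_model.fit(X_train, y_train)

y_pred = rf_model.predict(X_test)
print("Random Forest accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))

joblib.dump(rf_model, "rf_iris.pkl")

Web Deployment:
The aim also asks for web deployment. One common approach, assumed here because the report does not name a framework, is a small Flask app that loads a saved model and exposes a /predict endpoint; the file names svm_iris.pkl and rf_iris.pkl come from the training scripts above.

# app.py: minimal Flask deployment sketch (Flask is an assumed choice of framework)
from flask import Flask, request, jsonify
import joblib
import numpy as np

app = Flask(__name__)
model = joblib.load("svm_iris.pkl")   # or "rf_iris.pkl" to serve the Random Forest
class_names = ["setosa", "versicolor", "virginica"]

@app.route("/predict", methods=["POST"])
def predict():
    # Expects JSON such as {"features": [5.1, 3.5, 1.4, 0.2]}
    data = request.get_json()
    features = np.array(data["features"]).reshape(1, -1)
    prediction = model.predict(features)[0]
    return jsonify({"species": class_names[int(prediction)]})

if __name__ == "__main__":
    app.run(debug=True)

Running python app.py and sending a POST request with the four measurements (for example with curl or Postman) returns the predicted species; the same app can serve either saved model, which makes a side-by-side comparison straightforward.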
Output:
Conclusion:
For this dataset the data is sparse and easy to classify, so the SVM works
faster and provides better results; the Random Forest also gives good
results but does not quite match the SVM here. The choice of algorithm
depends on the desired outcome: both models are good in their own right,
but their performance depends heavily on the quality of the data. The
choice between SVM and Random Forest therefore depends on the specific
requirements of the project, including the nature of the dataset, the
importance of interpretability, and the computational resources available.
In some cases SVMs may outperform Random Forests, especially in tasks
where the classes are clearly separable. Conversely, Random Forests may be
more suitable for tasks with a large number of features or when dealing
with complex, non-linearly separable data.