Prediction On Iris
Andhra University
Visakhapatnam-530001, AP
Internship Organization
Grafx IT Solutions
Introduction to SVMs: In machine learning, support vector machines (SVMs, also called support vector
networks) are supervised learning models with associated learning algorithms that analyse data for
classification and regression. An SVM is a discriminative classifier formally defined by a separating
hyperplane: given labelled training data, the algorithm outputs an optimal hyperplane that categorizes
new examples.
An SVM model represents the examples as points in space, mapped so that the examples of the
separate categories are divided by a clear gap that is as wide as possible. In addition to performing linear
classification, SVMs can efficiently perform non-linear classification by implicitly mapping their inputs into
high-dimensional feature spaces.
Index
1. Introduction
1.1. Purpose
1.2. Scope
4. Requirement Analysis
4.1 Functional requirements
4.2 Non-Functional requirements
4.3 System Models
4.3.1 Class Diagram
4.3.2 Activity Diagram
5. System Design
5.1 Technologies Used
5.2 Software Requirements
5.3 Hardware Requirements
6. Object Design
6.1 Architecture
6.2 Pros & Cons
6.3 Components
8. Conclusion
Introduction
What are Support Vector Machines? A Support Vector Machine (SVM) is a relatively simple supervised
machine learning algorithm used for classification and regression. It is used more often for classification,
but can be very useful for regression as well. An SVM finds a hyperplane that forms a boundary between
the classes of data. In 2-dimensional space, this hyperplane is simply a line. Each data item in the dataset
is treated as a point in an N-dimensional space, where N is the number of features/attributes in the data,
and the algorithm then finds the optimal hyperplane to separate the classes. Because a single hyperplane
splits the space into two sides, an SVM inherently performs only binary classification (i.e., it chooses
between two classes). However, there are various techniques for applying it to multi-class problems.
For example, to classify fruits into several classes, we can create one binary classifier per fruit. For the
‘mango’ class, say, there will be a binary classifier that predicts whether an item IS a mango or is NOT a
mango. The class whose classifier gives the highest score is chosen as the output of the SVM.
SVM for complex (non-linearly separable) data: SVM works very well without any modifications on linearly
separable data. Linearly separable data is any data that can be plotted in a graph and separated into
classes using a straight line. For data that is not linearly separable, SVMs rely on kernels that implicitly
map the inputs into a higher-dimensional space.
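The one-classifier-per-class idea can be sketched with scikit-learn's OneVsRestClassifier wrapped around a linear SVC. This is a minimal illustration on the Iris data bundled with scikit-learn, not necessarily the exact setup used in this project; the variable names are chosen only for this example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# Load the three-class Iris data that ships with scikit-learn.
iris = load_iris()
X, y = iris.data, iris.target

# Train one binary linear SVM per class ("this species" vs "not this species").
ovr = OneVsRestClassifier(SVC(kernel="linear"))
ovr.fit(X, y)

# One score per class for each sample; the class with the highest score wins.
scores = ovr.decision_function(X[:5])
predicted = ovr.classes_[np.argmax(scores, axis=1)]

print(scores.shape)                  # rows are samples, columns are classes
print(iris.target_names[predicted])  # predicted species names
```

Taking the argmax over the per-class scores is the same rule the wrapper applies internally when `predict` is called.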
1.1 Purpose
a. SVMs are used in applications such as handwriting recognition, intrusion detection, face detection, email
classification, gene classification, and web page classification. This breadth is one of the reasons we use
SVMs in machine learning: they can handle both classification and regression on linear and non-linear data.
b. Another reason we use SVMs is that they can find complex relationships in the data without requiring
many manual transformations. They are a good option when working with smaller datasets that have
tens to hundreds of thousands of features. On such small, complex datasets they often give more
accurate results than other algorithms.
1.2 Scope
The Support Vector Machines algorithm is a worthwhile algorithm to learn. It offers several benefits,
including high accuracy on many classification problems. The algorithm can also be applied to many
different use cases, including facial detection, classification of websites or emails, and handwriting
recognition.
Another key benefit of the algorithm is that it is intuitive. Being able to understand the mechanics
behind an algorithm is important, even when the full mathematics is out of scope.
4. Requirement Analysis:
4.1. Functional Requirements:
User
Load Data
Data Analysis
Data Pre-processing
Prediction
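The four functional steps above (load data, data analysis, data pre-processing, prediction) can be sketched as follows. This is a hedged outline that assumes the Iris data bundled with scikit-learn rather than a project-specific CSV file; the split ratio, the scaler, and the RBF kernel are illustrative choices, not requirements stated in this report.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Load Data: the Iris measurements as a pandas DataFrame.
iris = load_iris(as_frame=True)
df = iris.frame  # four feature columns plus a numeric "target" column

# Data Analysis: shape, summary statistics, and class balance.
print(df.shape)
print(df.describe())
print(df["target"].value_counts())

# Data Pre-processing: split first, then scale using training statistics only.
X = df.drop(columns="target")
y = df["target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Prediction: fit an SVM and predict the species of the held-out flowers.
model = SVC(kernel="rbf")
model.fit(X_train_s, y_train)
predictions = model.predict(X_test_s)
print(iris.target_names[predictions])
```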
4.2. Non-Functional Requirements:
1. Secure access of confidential data (user’s details).
2. 24 X 7 availability.
3. Reliability.
4. Maintainability.
4.3 System Models
4.3.1 Class Diagram
Class diagram
4.3.2 Activity Diagram
For SVM, an activity diagram can depict the major steps involved in training and testing the SVM
model. This includes actions such as inputting training and testing datasets, preprocessing the
data, training the model by optimizing parameters and finding the decision boundary, testing the
model on new data, and evaluating its performance.
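The activity flow described above (input data, preprocess, train while optimizing parameters, test, evaluate) might look like the sketch below. The parameter grid, the 75/25 split, and the use of GridSearchCV are assumptions made for illustration; the report does not state which values or search method were used.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Input: split the data into training and testing sets.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Preprocess + model in one pipeline so scaling is fitted on training folds only.
pipeline = make_pipeline(StandardScaler(), SVC())

# Train: search a small, illustrative grid of SVM parameters with 5-fold CV.
param_grid = {"svc__C": [0.1, 1, 10], "svc__kernel": ["linear", "rbf"]}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

# Test and evaluate on data the model has not seen.
y_pred = search.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
print(search.best_params_)
print(test_accuracy)
print(classification_report(y_test, y_pred))
```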
5. System Design
6. Object Design
6.1 Architecture:
As a first approximation, what an SVM does is find a separating line (or hyperplane) between data of two
classes. It takes the data as input and outputs a line that separates those classes, if possible.
Let’s begin with a problem. Suppose you have a dataset of red rectangles and blue ellipses and you need
to separate the rectangles from the ellipses (say, the positives from the negatives). Your task is to find an
ideal line that separates this dataset into two classes (red and blue).
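To make the separating line concrete, the sketch below fits a linear SVM to two Iris species using only the two petal measurements, then reads the line's coefficients back from the model. Using Iris setosa and versicolor in place of the red and blue shapes is an assumption made purely for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC

# Keep two classes (setosa = 0, versicolor = 1) and two features
# (petal length, petal width) so the boundary is a line in the plane.
X, y = load_iris(return_X_y=True)
mask = y < 2
X2 = X[mask][:, 2:4]
y2 = y[mask]

clf = SVC(kernel="linear", C=1.0)
clf.fit(X2, y2)

# The separating line is w[0]*x1 + w[1]*x2 + b = 0.
w = clf.coef_[0]
b = clf.intercept_[0]
margin_width = 2 / np.linalg.norm(w)  # distance between the two margin lines

# Points on the positive side of the line are assigned to class 1.
side = (X2 @ w + b > 0).astype(int)

print("w =", w, "b =", b)
print("margin width =", margin_width)
print("support vectors per class:", clf.n_support_)
```

The support vectors are the training points closest to the line; they alone determine where it lies.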
6.3 Components
1. Pandas
2. Matplotlib
3. Seaborn
4. CSV (Comma-Separated Values)
5. Scikit-learn
1. Pandas:
Pandas is a Python library used for working with data sets. It has functions for analysing, cleaning,
exploring, and manipulating data. The name "Pandas" refers to both "Panel Data" and "Python Data
Analysis"; the library was created by Wes McKinney in 2008. Pandas allows us to analyse large data sets
and draw conclusions based on statistical theory. It can clean messy data sets and make them
readable and relevant, which matters because relevant data is very important in data science.
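As a small illustration of the cleaning and analysis functions mentioned above, the sketch below reads a few CSV rows and tidies them. The rows and column names are made up for this example (shaped like Iris records) and are not taken from the project's data file.

```python
import io
import pandas as pd

# Hypothetical CSV content: one missing value and one duplicated row.
csv_text = """sepal_length,petal_length,species
5.1,1.4,setosa
4.9,1.4,setosa
,4.7,versicolor
6.3,6.0,virginica
6.3,6.0,virginica
"""

# Read the CSV text into a DataFrame (a file path works the same way).
raw = pd.read_csv(io.StringIO(csv_text))

# Explore: count missing values per column.
print(raw.isna().sum())

# Clean: drop the duplicated row, then drop rows with missing values.
clean = raw.drop_duplicates().dropna().reset_index(drop=True)

# Analyse: summary statistics and a per-species mean.
print(clean.describe())
means = clean.groupby("species")["petal_length"].mean()
print(means)
```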
2. Matplotlib:
Matplotlib is a low-level graph plotting library in Python that serves as a visualization utility. It was
created by John D. Hunter, is open source, and can be used freely. It is mostly written in Python; a few
segments are written in C, Objective-C and JavaScript for platform compatibility.
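A minimal plotting sketch with Matplotlib is shown below: a scatter plot of the two petal measurements, one colour per species. The output file name `iris_petals.png` and the choice of the non-interactive "Agg" backend are assumptions made so the script can run without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no window is opened
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

fig, ax = plt.subplots(figsize=(6, 4))

# Draw one scatter series per species so each gets its own legend entry.
for label, name in enumerate(iris.target_names):
    points = X[y == label]
    ax.scatter(points[:, 2], points[:, 3], label=name)

ax.set_xlabel(iris.feature_names[2])  # petal length (cm)
ax.set_ylabel(iris.feature_names[3])  # petal width (cm)
ax.set_title("Iris petal measurements")
ax.legend()

fig.savefig("iris_petals.png")  # hypothetical output file name
```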
3. Seaborn:
Seaborn is a library for making statistical graphics in Python. It builds on top of matplotlib and integrates
closely with pandas data structures. Seaborn helps you explore and understand your data. Its plotting
functions operate on data frames and arrays containing whole datasets and internally perform the necessary
semantic mapping and statistical aggregation to produce informative plots.
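The sketch below illustrates the statistical aggregation described above: seaborn's `barplot` takes a whole DataFrame and computes the mean petal length per species itself. The added `species` column and the output file name are choices made for this example only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no window is opened
import seaborn as sns
from sklearn.datasets import load_iris

# Build a DataFrame with a readable species column next to the numeric target.
iris = load_iris(as_frame=True)
df = iris.frame.copy()
df["species"] = df["target"].map(dict(enumerate(iris.target_names)))

# seaborn groups the rows by species and plots the mean of each group,
# with an error bar showing the uncertainty of that mean.
ax = sns.barplot(data=df, x="species", y="petal length (cm)")
ax.set_title("Mean petal length per species")

ax.figure.savefig("iris_petal_means.png")  # hypothetical output file name
```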
8. Conclusion
The Support Vector Machine (SVM) is a supervised machine learning algorithm used for both
classification and regression. The SVM algorithm’s objective is to find a hyperplane in an N-dimensional space
that distinctly classifies data points. This report covered how SVMs handle non-linear data through kernels
and the terminology associated with Support Vector Machines.
The method of support vector machines was studied as an alternative to the conservative logistic regression
models, and its performance was compared on real credit data sets. Especially in combination with a
non-linear kernel, SVM proved itself a competitive approach and provided a slight edge over the logistic
regression model.
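The effect of a non-linear kernel can be illustrated on synthetic data. The sketch below does not reproduce the credit data comparison mentioned above; it uses scikit-learn's `make_circles` generator (an assumption chosen for this example) to show a case where a straight-line model such as logistic regression struggles and an RBF-kernel SVM does not.

```python
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two concentric rings of points: no straight line can separate them.
X, y = make_circles(n_samples=400, factor=0.4, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# A linear model and a kernel SVM trained on the same data.
logreg = LogisticRegression().fit(X_train, y_train)
svm_rbf = SVC(kernel="rbf").fit(X_train, y_train)

logreg_acc = logreg.score(X_test, y_test)
svm_acc = svm_rbf.score(X_test, y_test)

print("logistic regression accuracy:", logreg_acc)
print("RBF-kernel SVM accuracy:", svm_acc)
```

On real data the gap is usually far smaller than on this deliberately non-linear example.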