
Prediction On Iris

The document discusses using an SVM algorithm to perform prediction on Iris data. It describes the process of building an SVM model, including data collection, preprocessing, training the model, and evaluating performance. It also provides details about the various components and technologies used in the process.

Prediction on Iris data using SVM algorithm

Bachelor of Computer Applications


By
Reemali Jayasree (Reg No: 120127206095)
Under the guidance of
Mr. Santhosh Mahapatro

Department Of Computer Science

Aditya Degree College

Andhra University

Visakhapatnam-530001, AP
Internship Organization

Grafx IT Solutions

I, Reemali Jayasree, a student of the BCA program,
Reg. No: 120127206095, of the Department of BCA,
do hereby declare that I have completed the
mandatory internship
from 17-04-2023 to 30-06-2023 at Grafx IT Solutions
under the faculty guidance of Santosh Kumar,
Department of BCA, Aditya Degree College,
Asilmetta, Visakhapatnam.
Abstract

Introduction to SVMs: In machine learning, support vector machines (SVMs, also support vector networks)
are supervised learning models with associated learning algorithms that analyse data used for classification
and regression analysis. A Support Vector Machine (SVM) is a discriminative classifier formally defined by a
separating hyperplane. In other words, given labelled training data (supervised learning), the algorithm
outputs an optimal hyperplane which categorizes new examples.

An SVM model is a representation of the examples as points in space, mapped so that the examples of the
separate categories are divided by a clear gap that is as wide as possible. In addition to performing linear
classification, SVMs can efficiently perform non-linear classification by implicitly mapping their inputs into
high-dimensional feature spaces.
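
The "gap that is as wide as possible" can be written as an optimization problem. The block below is the standard hard-margin formulation for linearly separable data, given as a sketch only. The symbols w, b, x_i and y_i (with y_i equal to +1 or -1) are notation introduced here for illustration and do not appear in the report.

```latex
\begin{aligned}
&\text{Separating hyperplane:} && w^\top x + b = 0 \\
&\text{Training objective:} && \min_{w,\,b}\ \tfrac{1}{2}\lVert w \rVert^2
  \quad \text{subject to} \quad y_i\,(w^\top x_i + b) \ge 1,\qquad i = 1,\dots,n \\
&\text{Margin width:} && \frac{2}{\lVert w \rVert}
\end{aligned}
```

Minimizing the norm of w under these constraints is the same as maximizing the margin width, which is why the resulting hyperplane is called optimal.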
Index
1. Introduction
1.1. Purpose
1.2. Scope
4. Requirement Analysis
4.1 Functional requirements
4.2 Non-Functional requirements
4.3 System Models
4.3.1 Class Diagram
4.3.2 Activity Diagram
5. System Design
5.1 Technologies Used
5.2 Software Requirements
5.3 Hardware Requirements
6. Object Design
6.1 Architecture
6.2 Pros & Cons
6.3 Components
8. Conclusion
Introduction
What are Support Vector Machines? A Support Vector Machine (SVM) is a relatively simple supervised
machine learning algorithm used for classification and regression. It is used mostly for classification
but can also be very useful for regression. An SVM finds a hyperplane that forms a
boundary between the classes of data. In 2-dimensional space, this hyperplane is simply a line. In
an SVM, we plot each data item in the dataset in an N-dimensional space, where N is the number of
features/attributes in the data, and then find the optimal hyperplane that separates the data. Because a
single hyperplane splits the space into two sides, an SVM inherently performs only binary classification
(i.e., it chooses between two classes). However, there are various techniques for extending it to
multi-class problems.
For example, to perform multi-class classification on a set of fruits, we can create a binary classifier for
each fruit. For the 'mango' class, say, there will be a binary classifier that predicts whether an item IS a
mango or is NOT a mango. The class whose classifier gives the highest score is chosen as the output of the SVM.
SVM for complex (non-linearly separable) data: SVM works very well without any modifications for linearly
separable data. Linearly separable data is any data that can be plotted in a graph and separated into
classes using a straight line. Data that cannot be separated this way needs the non-linear approach
mentioned in the abstract, in which the inputs are implicitly mapped into a higher-dimensional feature space.
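
The one-classifier-per-class idea described above can be sketched on the Iris data with scikit-learn. This is a minimal illustration, not the report's actual code: the use of `OneVsRestClassifier` with a linear `SVC`, and the variable names, are assumptions made here.

```python
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

iris = load_iris()
X, y = iris.data, iris.target

# Assumption: one binary linear SVM per species ("is this species" vs "is not").
ovr = OneVsRestClassifier(SVC(kernel="linear"))
ovr.fit(X, y)

# Each binary classifier returns a score; the highest score decides the class.
scores = ovr.decision_function(X[:1])
predicted = ovr.predict(X[:1])

print("number of binary classifiers:", len(ovr.estimators_))
print("scores for the first sample:", scores[0])
print("predicted species:", iris.target_names[predicted[0]])
```

Iris has three species, so three binary classifiers are trained, which mirrors the "mango or not mango" description.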
1.1 Purpose

a. SVMs are used in applications such as handwriting recognition, intrusion detection, face detection, email
classification, gene classification, and web page classification. This breadth is one of the reasons we use
SVMs in machine learning: they can handle both classification and regression on linear and non-linear data.
b. Another reason we use SVMs is that they can find complex relationships in the data
without requiring many manual transformations. They are a good option when working
with smaller datasets that have tens to hundreds of thousands of features, and on such small, complex
datasets they often give more accurate results than other algorithms.

1.2 Scope
The Support Vector Machine algorithm is well worth learning. It offers several benefits,
including high accuracy on many classification problems. The algorithm can also be applied to many
different use cases, including facial detection, classification of websites or emails, and handwriting
recognition.
Another key benefit of the algorithm is that it is intuitive. Being able to understand the mechanics
behind an algorithm is important, even when the full math is out of scope.

4. Requirement Analysis:
4.1. Functional Requirements:
User
Load Data
Data Analysis
Data Pre-processing
Prediction

4.2. Non-Functional Requirements:
1. Secure access to confidential data (user’s details).
2. 24 x 7 availability.
3. Reliability.
4. Maintainability.
4.3 System Models
4.3.1 Class Diagram

Class diagram
4.3.2 Activity Diagram
For SVM, an activity diagram can depict the major steps involved in training and testing the SVM
model. This includes actions such as inputting training and testing datasets, preprocessing the
data, training the model by optimizing parameters and finding the decision boundary, testing the
model on new data, and evaluating its performance.
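
The steps listed above (input, preprocessing, training with parameter optimization, testing, evaluation) can be sketched as follows. This is one possible workflow under stated assumptions, not the project's actual code: the 70/30 split, the `StandardScaler` preprocessing, the parameter grid and the random seed are all choices made here for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# 1. Input the data and split it into training and testing sets (assumed 70/30).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# 2. Preprocessing (feature scaling) and the SVM model in one pipeline.
pipeline = make_pipeline(StandardScaler(), SVC())

# 3. Train while searching a small, assumed grid of parameters.
param_grid = {"svc__C": [0.1, 1, 10], "svc__kernel": ["linear", "rbf"]}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

# 4. Test the best model on data it has not seen, then evaluate it.
y_pred = search.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)

print("best parameters:", search.best_params_)
print("test accuracy:", round(test_accuracy, 3))
```

Keeping the scaler inside the pipeline means it is fitted only on the training folds, so no information from the test data leaks into training.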
System Design

5.1 Technologies Used:

Machine Learning using Python

5.2 Software Requirements:

Jupyter Notebook

5.3 Hardware Requirements:

• RAM: 4GB or higher
• Processor: Intel i3 or above
• Hard Disk: 500GB minimum

Object Design
6.1 Architecture: -

As a first approximation, what an SVM does is find a separating line (or hyperplane) between data of two
classes. SVM is an algorithm that takes the data as input and outputs a line that separates those
classes, if possible.

Let’s begin with a problem. Suppose you have a dataset of red rectangles and blue ellipses, and you need to
separate the red rectangles from the blue ellipses (let’s say positives from negatives). Your task is to find
an ideal line that separates this dataset into two classes (say red and blue).
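
The two-class problem above can be tried on real data. The sketch below stands in two Iris species for the red and blue classes and uses two petal measurements so that the boundary is a line. The choice of species, features and the large value of `C` (to approximate a hard margin) are assumptions made for this illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC

iris = load_iris()

# Keep two classes (setosa = 0, versicolor = 1) and two features
# (petal length and petal width) so the separator is a line in 2-D.
mask = iris.target < 2
X = iris.data[mask][:, 2:4]
y = iris.target[mask]

# Assumption: a large C makes the soft-margin SVM behave like a hard-margin one.
clf = SVC(kernel="linear", C=1000.0)
clf.fit(X, y)

# The separating line is w[0] * x1 + w[1] * x2 + b = 0.
w = clf.coef_[0]
b = clf.intercept_[0]
margin_width = 2.0 / np.linalg.norm(w)
train_accuracy = clf.score(X, y)

print("w =", w, "b =", b)
print("margin width:", margin_width)
print("support vectors per class:", clf.n_support_)
print("training accuracy:", train_accuracy)
```

Only the support vectors, the points closest to the line, determine where the line ends up; the remaining points could move without changing it, as long as they stay outside the margin.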
6.3 Components
1. Pandas
2. Matplotlib
3. Seaborn
4. CSV (Comma-Separated Values)
5. Scikit-learn

1. Pandas:
Pandas is a Python library used for working with data sets. It has functions for analysing, cleaning,
exploring, and manipulating data. The name "Pandas" refers to both "Panel Data" and "Python
Data Analysis"; the library was created by Wes McKinney in 2008. Pandas allows us to analyse large data sets
and draw conclusions based on statistical theories. Pandas can clean messy data sets and make them
readable and relevant. Relevant data is very important in data science.
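
As a small illustration of the exploring and analysing functions described above, the sketch below puts the Iris data into a pandas DataFrame. Loading the data through scikit-learn's `load_iris` (instead of a file) and the `species` column name are assumptions made here.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()

# Build a DataFrame: one row per flower, one column per measurement.
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df["species"] = iris.target_names[iris.target]  # hypothetical column name

# Explore: size, missing values, and a per-species summary.
print(df.shape)
missing_values = int(df.isna().sum().sum())
print("missing values:", missing_values)

summary = df.groupby("species")["petal length (cm)"].mean()
print(summary)
```

A check for missing values like this one is a typical first cleaning step before any model is trained.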

2. Matplotlib:
Matplotlib is a low-level graph plotting library in Python that serves as a visualization utility. Matplotlib was
created by John D. Hunter. Matplotlib is open source and we can use it freely. It is mostly written in
Python; a few segments are written in C, Objective-C and JavaScript for platform compatibility.
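
A minimal plotting sketch, assuming the Iris data is loaded through scikit-learn: it draws petal length against petal width, one colour per species. The non-interactive "Agg" backend and the in-memory buffer are choices made here so the script runs without a display; in a Jupyter notebook the figure would normally just be shown inline.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: no display available, render off-screen
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

iris = load_iris()

fig, ax = plt.subplots()
# One scatter call per species, so each gets its own colour and legend entry.
for class_index, name in enumerate(iris.target_names):
    points = iris.data[iris.target == class_index]
    ax.scatter(points[:, 2], points[:, 3], label=name)

ax.set_xlabel(iris.feature_names[2])  # petal length (cm)
ax.set_ylabel(iris.feature_names[3])  # petal width (cm)
ax.legend()

# Render the figure to an in-memory PNG instead of a file on disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```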
3. Seaborn:
Seaborn is a library for making statistical graphics in Python. It builds on top of matplotlib and integrates
closely with pandas data structures. Seaborn helps you explore and understand your data. Its plotting
functions operate on data frames and arrays containing whole datasets and internally perform the necessary
semantic mapping and statistical aggregation to produce informative plots.
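
The semantic mapping described above can be seen in a single call: Seaborn takes a whole DataFrame and maps a column to colour. This is a sketch under assumptions: the data comes from scikit-learn's `load_iris`, the `species` column is added here, and the "Agg" backend is used so that no display is needed.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no display needed
import seaborn as sns
from sklearn.datasets import load_iris

iris = load_iris(as_frame=True)
df = iris.frame.copy()
# Hypothetical column holding the species name for each row.
df["species"] = iris.target_names[df["target"].to_numpy()]

# Seaborn maps the "species" column to colour and builds the legend itself.
ax = sns.scatterplot(
    data=df, x="petal length (cm)", y="petal width (cm)", hue="species"
)
```

Compared with plain Matplotlib, there is no explicit loop over the classes; the grouping is taken from the DataFrame column.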

4. CSV (Comma-Separated Values):

CSV stands for comma-separated values. The csv module was incorporated into Python's standard library as
a result of PEP 305. It provides classes and functions to perform read/write operations on CSV files as per
the recommendations of PEP 305. CSV is a common export format for spreadsheet software such as Microsoft
Excel. It is often used in industry to store data because it
keeps the data in a plain text file while maintaining a tabular format.
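
A short sketch of the read/write operations mentioned above, using only the standard library. The column names are hypothetical, and an in-memory buffer stands in for a file on disk.

```python
import csv
import io

# Two Iris measurements with a header row (column names are hypothetical).
rows = [
    ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"],
    [5.1, 3.5, 1.4, 0.2, "setosa"],
    [7.0, 3.2, 4.7, 1.4, "versicolor"],
]

# Write the rows as CSV text into an in-memory buffer.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerows(rows)

# Read them back; DictReader uses the header row as the keys.
buffer.seek(0)
records = list(csv.DictReader(buffer))

for record in records:
    # Every value comes back as a string, so numbers must be converted.
    print(record["species"], float(record["petal_length"]))
```

Because CSV is plain text, numeric columns have to be converted after reading, which is one reason pandas is usually preferred for analysis.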
5. Scikit-learn:
Scikit-learn, also known as sklearn, is a Python library for implementing machine learning models and statistical
modelling. Through scikit-learn, we can implement various machine learning models for regression,
classification, and clustering, along with statistical tools for analysing these models. It is one of the most
widely used libraries for machine learning in Python.
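
Because scikit-learn models share the same interface, different algorithms can be compared with very little code. The sketch below compares an SVM with logistic regression on Iris using 5-fold cross-validation. The choice of models, the scaling step and the number of folds are assumptions made for illustration, and no particular winner is implied.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Two classifiers wrapped in the same preprocessing pipeline.
models = {
    "SVM (RBF kernel)": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "Logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
}

# Mean accuracy over 5 cross-validation folds for each model.
mean_scores = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    mean_scores[name] = scores.mean()
    print(f"{name}: {scores.mean():.3f}")
```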
Conclusion

The Support Vector Machine (SVM) is a supervised machine learning algorithm used for both
classification and regression. The SVM algorithm’s objective is to find a hyperplane in an N-dimensional space
that distinctly classifies data points. In this blog, we went over end-to-end questions that will be asked in an
interview. We got to know about the Kernel trick and understood the various terminology associated with
Support Vector Machines.

The method of support vector machines as an alternative to the conservative logistic regression models was
studied and its performance compared on the real credit data sets. Especially in combination with the non-
linear kernel, SVM proved itself as a competitive approach and provided a slight edge on top of the logistic
regression model.
