Linear Regression & SVM

This document discusses prediction and regression models. Prediction involves estimating continuous values, unlike classification, which predicts categories. The major method for prediction is regression, which models the relationship between predictor and response variables. Simple linear regression is described, followed by support vector machines (SVM).


What Is Prediction?

 (Numerical) prediction is similar to classification
  construct a model
  use the model to predict a continuous or ordered value for a given input
 Prediction is different from classification
  Classification predicts a categorical class label
  Prediction models continuous-valued functions
What Is Prediction?
 Major method for prediction: regression
  model the relationship between one or more independent (predictor) variables and a dependent (response) variable
 Regression analysis
  Linear and multiple regression
  Non-linear regression
  Other regression methods: generalized linear model, Poisson regression, log-linear models, regression trees
Simple Linear Regression Model
 Only one independent variable, x
 The relationship between x and y is described by a linear function
 Changes in y are assumed to be caused by changes in x
Simple Linear Regression Model
 Linear regression involves a response variable y and a single predictor variable x
 y = b + mx (a line)
 y = w0 + w1 x
  where w0 (y-intercept) and w1 (slope) are the regression coefficients
What is “Linear”?
 Remember this: Y = mX + B?
What’s Slope?
 A slope of 2 means that every 1-unit change in X yields a 2-unit change in Y.
Linear Regression
 Method of least squares: estimates the best-fitting straight line

$$w_1 = \frac{\sum_{i=1}^{|D|} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{|D|} (x_i - \bar{x})^2} = \frac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}}$$

$$w_0 = \bar{y} - w_1 \bar{x}$$
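
As an illustration (not part of the original slides), the closed-form estimates above translate directly into a few lines of Python with NumPy; the names w0 and w1 follow the slides' notation:

```python
import numpy as np

def fit_simple_linear_regression(x, y):
    """Least-squares estimates for the line y = w0 + w1 * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    # w1 = sum((xi - x_bar)(yi - y_bar)) / sum((xi - x_bar)^2)
    w1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    # w0 = y_bar - w1 * x_bar
    w0 = y_bar - w1 * x_bar
    return w0, w1
```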
Example
 x is the number of years of work experience of a college graduate
 y is the corresponding salary of the graduate
Example
$$w_1 = \frac{\sum_{i=1}^{|D|} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{|D|} (x_i - \bar{x})^2}$$
Example
 We can predict that the salary of a college graduate with 10 years of experience is $58,600.
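
The stated figure is consistent with the well-known salary example from Han & Kamber's Data Mining: Concepts and Techniques, where the fitted coefficients come out to roughly w1 = 3.5 and w0 = 23.6 with salary measured in $1000s; since the slide does not reproduce the fitted coefficients, treat these values as assumptions. A quick arithmetic check:

```python
# Assumed coefficients (classic Han & Kamber salary example, salary in $1000s)
w0, w1 = 23.6, 3.5

years = 10
salary = w0 + w1 * years  # 23.6 + 3.5 * 10 = 58.6
print(f"Predicted salary: ${salary * 1000:,.0f}")  # $58,600
```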
Application
 Consider the following dataset [dataset figure not preserved]
SVM—Support Vector Machines
 A new classification method for both linear and nonlinear data
 It uses a nonlinear mapping to transform the original training data into a higher dimension
 With the new dimension, it searches for the linear optimal separating hyperplane (i.e., the "decision boundary")



SVM—Support Vector Machines
 With an appropriate nonlinear mapping to a sufficiently high dimension, data from two classes can always be separated by a hyperplane
 SVM finds this hyperplane using support vectors ("essential" training tuples) and margins (defined by the support vectors)
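
A minimal scikit-learn sketch (an illustration, not from the slides) showing that a fitted linear SVM exposes exactly these ingredients, the support vectors and the margin:

```python
import numpy as np
from sklearn.svm import SVC

# Two small, linearly separable clusters
X = np.array([[1, 1], [2, 1], [1, 2], [5, 5], [6, 5], [5, 6]])
y = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

print(clf.support_vectors_)           # the "essential" training tuples
print(2 / np.linalg.norm(clf.coef_))  # margin width = 2 / ||w||
```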
SVM—History and Applications
 Vapnik and colleagues (1992); groundwork from Vapnik & Chervonenkis' statistical learning theory in the 1960s
 Used for both classification and prediction
 Applications:
  handwritten digit recognition, object recognition, speaker identification, benchmarking time-series prediction tests
Support Vector Machines

 Find a linear hyperplane (decision boundary) that will separate the data
Support Vector Machines
[Figure: candidate decision boundary B1]
 One possible solution
Support Vector Machines

[Figure: candidate decision boundary B2]
 Another possible solution
Support Vector Machines

[Figure: additional candidate decision boundaries]
 Other possible solutions
Support Vector Machines
[Figure: boundaries B1 and B2 together]
 Which one is better, B1 or B2?
 How do you define "better"?
SVM (cont…)
 SVM works by mapping data to a high-dimensional feature space so that data points can be categorized, even when the data are not otherwise linearly separable.
 A separator between the categories is found, then the data are transformed in such a way that the separator can be drawn as a hyperplane.
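
To make this concrete, here is a small sketch using scikit-learn's make_circles toy data (an assumption for illustration; this dataset is not in the slides). The two concentric rings cannot be separated by a line in the original 2-D space, but an RBF-kernel SVM, which implicitly maps the points to a higher-dimensional space, separates them almost perfectly:

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric rings: not linearly separable in 2-D
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

print("linear:", SVC(kernel="linear").fit(X, y).score(X, y))  # near 0.5
print("rbf:   ", SVC(kernel="rbf").fit(X, y).score(X, y))     # near 1.0
```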
SVM (cont…)
 How can you draw a straight line to separate the data points?
Parameters of SVM: Kernel, Regularization, Gamma, and Margin
 In machine learning, "kernel" usually refers to the kernel trick, a method of using a linear classifier to solve a non-linear problem
  It transforms the non-linear data points in such a way that they become separable
 The kernel function is applied to each data instance to map the original non-linear observations into a higher-dimensional space in which they become separable
 Kernel types (see the sketch after this list):
  Linear
  Polynomial
  Radial basis function (RBF)
  Sigmoid
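
In scikit-learn, one common implementation (used here purely for illustration), these four kernel types map directly onto the kernel argument of SVC:

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# A small synthetic classification problem
X, y = make_classification(n_samples=100, n_features=4, random_state=0)

# Fit one SVM per kernel type and compare training accuracy
for kernel in ["linear", "poly", "rbf", "sigmoid"]:
    clf = SVC(kernel=kernel).fit(X, y)
    print(kernel, clf.score(X, y))
```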
Regularization
 The regularization parameter (called C in many implementations) tells the SVM optimization how much you want to avoid misclassifying each training example.
Gamma parameter
 The gamma parameter defines how far the influence of a single training example reaches, with low values meaning "far" and high values meaning "close"; a sketch of both parameters follows.
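
A sketch of how both parameters described above appear in scikit-learn's SVC; the numeric values are arbitrary, chosen only to illustrate the two extremes:

```python
from sklearn.svm import SVC

# Small C tolerates some misclassified training points (wider, smoother margin);
# large C penalizes every misclassification heavily (tighter fit).
# Small gamma lets each example's influence reach far; large gamma keeps it local.
smooth_model = SVC(kernel="rbf", C=0.1, gamma=0.01)   # far-reaching, forgiving
tight_model = SVC(kernel="rbf", C=100.0, gamma=10.0)  # local, strict
```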
Margin

 Margin is the distance between the left hyperplane and the right hyperplane
 A good margin is one where this separation is larger for both classes
  A good margin allows the points to be in their respective classes without crossing into the other class
Margin (cont…)
[Figure: boundaries B1 and B2 with their margin hyperplanes b11, b12 and b21, b22; B1's margin is wider]
 Find the hyperplane that maximizes the margin => B1 is better than B2
Support Vector Machines

 Let the data D be (X1, y1), …, (X|D|, y|D|), where Xi is a training tuple with associated class label yi
 There are infinitely many lines (hyperplanes) separating the two classes, but we want to find the best one (the one that minimizes classification error on unseen data)
 SVM searches for the hyperplane with the largest margin, i.e., the maximum marginal hyperplane (MMH)
Margin (cont…)
[Figure: boundary B1 with its support vectors highlighted; B2 and the margin hyperplanes b11, b12, b21, b22 shown for comparison]
Margin (cont…)
$$\vec{w} \cdot \vec{x} + b = 0 \quad \text{(separating hyperplane } B_1\text{)}$$

$$\vec{w} \cdot \vec{x} + b = 1 \quad (b_{11}), \qquad \vec{w} \cdot \vec{x} + b = -1 \quad (b_{12})$$

$$f(\vec{x}) = \begin{cases} 1 & \text{if } \vec{w} \cdot \vec{x} + b \ge 1 \\ -1 & \text{if } \vec{w} \cdot \vec{x} + b \le -1 \end{cases}$$

$$\text{Margin} = \frac{2}{\|\vec{w}\|}$$
Support Vector Machines
 A separating hyperplane can be written as
  W · X + b = 0
  where W = {w1, w2, …, wn} is a weight vector and b is a scalar (bias)
 For 2-D it can be written as
  w0 + w1 x1 + w2 x2 = 0
 The hyperplanes defining the sides of the margin:
  H1: w0 + w1 x1 + w2 x2 ≥ 1 for yi = +1
  H2: w0 + w1 x1 + w2 x2 ≤ –1 for yi = –1
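
A small NumPy illustration of these conditions; the weight vector and bias are made-up values, not taken from the slides:

```python
import numpy as np

w = np.array([1.0, 1.0])  # assumed weight vector W = {w1, w2}
b = -3.0                  # assumed bias

def classify(x):
    """The sign of W·X + b decides which side of the hyperplane x falls on."""
    return 1 if np.dot(w, x) + b >= 0 else -1

print(classify(np.array([4.0, 4.0])))          # +1: satisfies H1 (w·x + b >= 1)
print(classify(np.array([1.0, 1.0])))          # -1: satisfies H2 (w·x + b <= -1)
print("margin width:", 2 / np.linalg.norm(w))  # 2 / ||w|| ≈ 1.414
```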
Thank you
