
Linear Regression:

Problem: You are given the following data for a simple linear regression model:
x (Hours Studied)   y (Exam Score)
1                   50
2                   55
3                   65
4                   70
5                   80
Fit a linear regression model using the formula y = β0 + β1x. Find the best fit line and predict the exam
score for a student who studies for 6 hours.
Steps to Solve:
1. Understand Linear Regression:
Linear regression models the relationship between two variables by fitting a linear equation to
observed data. The equation of a simple linear regression model is:
y = β0 + β1x
Where:
• y is the dependent variable (exam score).
• x is the independent variable (hours studied).
• β0 is the y-intercept (the value of y when x = 0).
• β1 is the slope (the change in y for each one-unit change in x).

2. Compute the Means:
x̄ = (1 + 2 + 3 + 4 + 5) / 5 = 3 and ȳ = (50 + 55 + 65 + 70 + 80) / 5 = 64.
3. Write the Least-Squares Formulas:
β1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² and β0 = ȳ − β1x̄.
4. Calculate β1 (Slope):
We now calculate the slope β1 from the data:
β1 = [(−2)(−14) + (−1)(−9) + (0)(1) + (1)(6) + (2)(16)] / [(−2)² + (−1)² + 0² + 1² + 2²] = 75 / 10 = 7.5
5. Calculate β0 (Intercept) and Predict:
β0 = ȳ − β1x̄ = 64 − 7.5 × 3 = 41.5, so the best fit line is y = 41.5 + 7.5x. For a student who studies 6
hours, the predicted exam score is y = 41.5 + 7.5 × 6 = 86.5.
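The same numbers can be checked programmatically. Below is a minimal Python sketch (assuming NumPy is
available) that reproduces the closed-form least-squares estimates and the prediction:

import numpy as np

# Data from the table: hours studied (x) and exam scores (y)
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([50, 55, 65, 70, 80], dtype=float)

# Closed-form least-squares estimates for y = beta0 + beta1 * x
beta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta0 = y.mean() - beta1 * x.mean()

print(beta0, beta1)        # expected: 41.5 7.5
print(beta0 + beta1 * 6)   # predicted score for 6 hours of study: 86.5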
Naive Bayes Problem: Spam Detection
Let’s consider the problem of classifying emails as spam or not spam (binary classification). You are given
the following dataset showing whether each email contains the words “Free” and “Win”, along with a label
indicating whether the email is spam.
Email No.   Word "Free"   Word "Win"   Spam/Not Spam
1           Yes           Yes          Spam
2           Yes           No           Not Spam
3           No            Yes          Not Spam
4           Yes           Yes          Spam
5           No            No           Not Spam
Now, we want to classify a new email with the following properties:
 The email contains the word "Free".
 The email contains the word "Win".
We will use the Naive Bayes classifier to predict if this new email is spam or not spam.
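Naive Bayes scores each class as P(class) × Π P(feature | class), treating the word features as conditionally
independent given the class, and picks the class with the higher score. A minimal Python sketch of this
calculation for the table above (with no Laplace smoothing, a simplifying assumption) might look like:

# Dataset from the table above: (contains "Free", contains "Win", label)
emails = [
    (True,  True,  "Spam"),
    (True,  False, "Not Spam"),
    (False, True,  "Not Spam"),
    (True,  True,  "Spam"),
    (False, False, "Not Spam"),
]

def score(label, free, win):
    rows = [e for e in emails if e[2] == label]
    prior = len(rows) / len(emails)                       # P(label)
    p_free = sum(e[0] == free for e in rows) / len(rows)  # P(Free=free | label)
    p_win = sum(e[1] == win for e in rows) / len(rows)    # P(Win=win | label)
    return prior * p_free * p_win

# New email contains both "Free" and "Win"
for label in ("Spam", "Not Spam"):
    print(label, score(label, True, True))
# Spam: (2/5)*1*1 = 0.4; Not Spam: (3/5)*(1/3)*(1/3) ≈ 0.067 -> classified as Spam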
Decision Tree using ID3 Algorithm
The ID3 algorithm (Iterative Dichotomiser 3) is a popular algorithm used to build a decision tree for
classification tasks. It uses information gain based on entropy to select the best feature for splitting the
dataset at each step.
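For reference (these are the standard definitions, not stated in the original problem): for a set S with class
proportions p1, …, pk, the entropy is
H(S) = −Σ pi log2(pi)
and the information gain of a feature A is
Gain(S, A) = H(S) − Σv (|Sv| / |S|) H(Sv),
where Sv is the subset of S on which A takes the value v.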
Problem: Build a Decision Tree for Tennis Prediction
Consider the dataset:
Weather    Humidity   Play Tennis
Sunny      High       No
Sunny      Normal     Yes
Overcast   High       Yes
Rainy      High       No
Rainy      Normal     Yes
Overcast   Normal     Yes
Sunny      High       No
We need to use the ID3 algorithm to decide which feature should be chosen as the root node based on
information gain.
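Working through the arithmetic by hand: the 7 examples contain 4 Yes and 3 No, so H(S) ≈ 0.985; splitting
on Weather gives a gain of about 0.31, while splitting on Humidity gives about 0.52, so Humidity would be
chosen as the root. A minimal Python sketch of this computation (the function names are illustrative, not
from any library) is:

from math import log2

# Rows: (Weather, Humidity, PlayTennis) from the table above
rows = [
    ("Sunny",    "High",   "No"),
    ("Sunny",    "Normal", "Yes"),
    ("Overcast", "High",   "Yes"),
    ("Rainy",    "High",   "No"),
    ("Rainy",    "Normal", "Yes"),
    ("Overcast", "Normal", "Yes"),
    ("Sunny",    "High",   "No"),
]

def entropy(labels):
    # H(S) = -sum over classes of p * log2(p)
    total = len(labels)
    return -sum((labels.count(c) / total) * log2(labels.count(c) / total)
                for c in set(labels))

def gain(feature_index):
    labels = [r[2] for r in rows]
    base = entropy(labels)
    remainder = 0.0
    for v in set(r[feature_index] for r in rows):
        subset = [r[2] for r in rows if r[feature_index] == v]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

print("Gain(Weather)  =", gain(0))   # ~0.306
print("Gain(Humidity) =", gain(1))   # ~0.522 -> Humidity becomes the root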
Support Vector Machines (SVM) Problem: Linear SVM
The problem is to find the separating hyperplane (or decision boundary) between two classes using linear
SVM. The dataset contains four 2D points with their labels:
Point (x, y)   Label
(2, 3)         +1
(3, 3)         +1
(6, 6)         -1
(7, 8)         -1
Goal:
We need to compute the equation of the hyperplane that separates the two classes (+1 and -1) using Linear
SVM.
Step 1: Understanding SVM
A linear SVM tries to find the best possible hyperplane (or decision boundary) that maximizes the margin
between the two classes. The hyperplane is defined by the equation:
w1x1 + w2x2 + b = 0
Where:
• w1 and w2 are the weights associated with the features (coordinates) x1 and x2,
• b is the bias term (or intercept),
• (x1, x2) are the coordinates of a point in the dataset.
The goal is to find the values of w1, w2, and b such that the hyperplane correctly classifies the points.
Step 2: Support Vectors
The support vectors are the data points that lie closest to the hyperplane. The decision boundary is
equidistant from the support vectors of both classes, and the margin is the distance between the support
vectors and the hyperplane.
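A standard fact worth recording here: with the usual normalization where the support vectors satisfy
|w1x1 + w2x2 + b| = 1, the distance from a support vector to the hyperplane is 1/‖w‖, so the total separation
between the two classes is 2/‖w‖. Maximizing the margin is therefore equivalent to minimizing ‖w‖.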
From a plot of the dataset, we can assume that the points (3, 3) with label +1 and (6, 6) with label −1 are the
support vectors, as they are the points closest to the separating boundary.
Step 3: Equation of the Hyperplane
For linear SVM, the equation of the hyperplane is derived using the following constraints:
w1x1 + w2x2 + b ≥ +1 for points with label +1
w1x1 + w2x2 + b ≤ −1 for points with label −1
with equality holding exactly for the support vectors. These constraints ensure that points from the two
classes lie on either side of the decision boundary with a margin of 1.
Step 4: Finding the Hyperplane
We will use the support vectors (3, 3) and (6, 6) to find the hyperplane.
For the point (3, 3) with label +1:
3w1 + 3w2 + b = +1
For the point (6, 6) with label −1:
6w1 + 6w2 + b = −1
Subtracting the first equation from the second gives 3w1 + 3w2 = −2, so w1 + w2 = −2/3. The weight vector
is perpendicular to the boundary and parallel to the segment joining the two support vectors, direction (1, 1),
so w1 = w2; together with w1 + w2 = −2/3 this gives w1 = w2 = −1/3, and substituting back gives b = 3.
The separating hyperplane is therefore −(1/3)x1 − (1/3)x2 + 3 = 0, or equivalently x1 + x2 = 9.

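As a sanity check, a hard-margin linear SVM can be approximated with scikit-learn (assuming it is
available) by using a very large C; it should recover essentially the same hyperplane:

import numpy as np
from sklearn.svm import SVC

# The four labeled points from the problem
X = np.array([[2, 3], [3, 3], [6, 6], [7, 8]], dtype=float)
y = np.array([1, 1, -1, -1])

# A very large C makes the soft-margin SVM behave like a hard-margin one
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

print(clf.coef_, clf.intercept_)   # expected: roughly [[-1/3, -1/3]] and [3.]
print(clf.support_vectors_)        # expected: (3, 3) and (6, 6)
print(clf.predict([[4, 4]]))       # x1 + x2 < 9, so class +1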