ML Algorithms — Unit-II
Linear Regression
Before we start Linear Regression, we have to perform cleaning and initial data analysis:
● Look for data errors with a data sanity check.

Example dataset:

Country   Age   Salary   Purchased
France    44    72000    No
France    48    79000    Yes
Spain     50    52000    No
Germany   50    83000    No
Positive Linear Relationship
Case-02: β1 = 0
• It indicates that variable X has no impact on Y.
• If X changes, there will be no change in Y.
Simple Linear Regression
Case-03: β1 > 0
• It indicates that variable X has a positive impact on Y.
• If X increases, Y will increase, and vice-versa.
• Ex: the weight of a person depends on the height of the person.
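The β1 > 0 case can be sketched numerically. This is a minimal simple-linear-regression fit on assumed toy height/weight data (the sample values are illustrative, not from the slide), using the closed-form least-squares estimates for slope and intercept:

```python
# Minimal sketch of simple linear regression Y = b0 + b1*X,
# for the slide's height/weight example. Data values are assumed.
import numpy as np

height = np.array([150.0, 160.0, 170.0, 180.0, 190.0])  # assumed sample data (cm)
weight = np.array([50.0, 56.0, 63.0, 71.0, 78.0])        # assumed sample data (kg)

# Closed-form least-squares estimates:
# b1 = cov(X, Y) / var(X), b0 = mean(Y) - b1 * mean(X)
b1 = np.cov(height, weight, bias=True)[0, 1] / np.var(height)  # slope
b0 = weight.mean() - b1 * height.mean()                        # intercept

print(b0, b1)  # b1 > 0: weight increases with height (Case-03)
```

With this data b1 comes out positive, matching Case-03: as X (height) increases, Y (weight) increases.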
Multiple Linear Regression
• In multiple linear regression, the dependent variable depends on more than one independent variable.
• For multiple linear regression, the form of the model is:
Y = β0 + β1X1 + β2X2 + β3X3 + …… + βnXn
Here,
• Y is the dependent variable.
• X1, X2, …, Xn are independent variables.
• β0, β1, …, βn are the regression coefficients.
• βj (1 ≤ j ≤ n) is the slope or weight that specifies the factor by which Xj has an impact on Y.
Multiple Linear Regression
Y = β0 + β1X1 + β2X2 + β3X3 + …… + βnXn
Example: the price of a flat depends on the size of the flat, floor, location, modular kitchen, etc.
Y = 0.9 + 1.2·X1 + 2·X2 + 4·X3 + 1·X4
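The flat-price model above can be evaluated directly. This sketch uses the slide's coefficients (0.9, 1.2, 2, 4, 1); the feature values and their units (size, floor, location score, modular kitchen flag) are assumptions for illustration:

```python
# Sketch of the slide's flat-price model Y = 0.9 + 1.2*X1 + 2*X2 + 4*X3 + 1*X4.
# X1..X4 stand for size, floor, location, modular kitchen (assumed meanings/units).
import numpy as np

beta = np.array([0.9, 1.2, 2.0, 4.0, 1.0])  # [b0, b1, b2, b3, b4] from the slide

def predict(x):
    """Evaluate Y = b0 + b1*X1 + ... + b4*X4 for feature vector x = [X1, X2, X3, X4]."""
    return beta[0] + beta[1:] @ np.asarray(x, dtype=float)

print(predict([10, 2, 1, 1]))  # 0.9 + 12 + 4 + 4 + 1 = 21.9
```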
● Polynomial regression fits a nonlinear relationship between the value of x and the corresponding conditional mean of y, denoted E(y|x).
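A minimal polynomial-regression sketch: the toy data below is assumed (generated from y = x², so a degree-2 fit should recover the quadratic exactly):

```python
# Fit E(y|x) with a degree-2 polynomial on assumed toy data from y = x^2.
import numpy as np

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = x**2  # exact quadratic, so the fit should recover coefficients [1, 0, 0]

coeffs = np.polyfit(x, y, deg=2)  # returns [a2, a1, a0] for a2*x^2 + a1*x + a0
print(np.round(coeffs, 6))
```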
Finding the Minimum Loss
• repeat:
  ■ pick a dimension
  ■ move a small amount in that dimension towards decreasing loss (using the derivative)

Update rule (single weight):
w_new = w_old − η (dL/dw)

Update rule (per dimension j):
w_j = w_j − η (∂/∂w_j) loss(w)
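The update rule above can be sketched in a few lines. The loss L(w) = (w − 3)² is an assumed one-dimensional example (its minimum is at w = 3), with η = 0.1:

```python
# Gradient descent sketch for the update rule w = w - eta * dL/dw,
# minimizing the assumed loss L(w) = (w - 3)^2, whose minimum is at w = 3.
eta = 0.1  # learning rate (eta)

def dloss(w):
    return 2.0 * (w - 3.0)  # derivative of (w - 3)^2

w = 0.0
for _ in range(200):        # repeat: move a small amount against the gradient
    w = w - eta * dloss(w)

print(round(w, 4))  # converges to 3.0
```

Each step multiplies the error (w − 3) by (1 − 2η), so for small η the weight shrinks steadily toward the minimum.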
Sr.No  Age  Competition  Type  Profit
2      Old  No           SW    Down
3      Old  No           HW    Down
6      Mid  No           HW    Up
7      Mid  No           SW    Up
8      New  Yes          SW    Up
9      New  No           HW    Up
10     New  No           SW    Up

● Now, find the Information Gain (entropy) of the target attribute, where P = Down and N = Up.
● Finally, find the Gain for each attribute (here, 3 attributes); the attribute whose Gain is greatest becomes the Root of the Tree.
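The two steps above can be sketched in code. This uses only the table rows shown here (a subset of the full slide dataset), so the exact numbers are illustrative:

```python
# Sketch: entropy of the target (Profit, P = Down / N = Up) and Gain(Age),
# computed over the subset of rows shown on the slide.
from math import log2
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

# Age values and Profit labels for the rows shown above
age    = ["Old", "Old", "Mid", "Mid", "New", "New", "New"]
profit = ["Down", "Down", "Up", "Up", "Up", "Up", "Up"]

ig = entropy(profit)  # Information Gain (entropy) of the target attribute

# expected entropy after splitting on Age, weighted by subset size
e_age = sum(
    (age.count(v) / len(age)) * entropy([p for a, p in zip(age, profit) if a == v])
    for v in set(age)
)
print(round(ig - e_age, 4))  # Gain(Age) = IG - E(Age)
```

The same Gain computation is repeated for the other attributes (Competition, Type), and the attribute with the greatest Gain is chosen as the root.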
Decision Tree
Now, find the Entropy of each attribute. Let's start with attribute Age:

Age   Down  Up
Old   3     0
Mid   2     2
New   0     3
Decision Tree
Now, find the Gain of each attribute, where Gain(Age) = IG − E(Age):

Age   Down  Up
Old   3     0
Mid   2     2
New   0     3

The Old and New subsets are pure (all Down and all Up respectively), so their entropy is 0.

Sr.No  Age  Competition  Type  Profit
1      Old  Yes          SW    Down
6      Mid  No           HW    Up
8      New  Yes          SW    Up
9      New  No           HW    Up
10     New  No           SW    Up
Implementing the Decision Tree Algorithm
Decision Tree

Sr.No  Age  Competition  Type  Profit
2      Old  No           SW    Down
3      Old  No           HW    Down
4      Mid  Yes          SW    Down
5      Mid  Yes          HW    Down
6      Mid  No           HW    Up
7      Mid  No           SW    Up
8      New  Yes          SW    Up
9      New  No           HW    Up
10     New  No           SW    Up

Resulting tree:

              Age
        /      |      \
     OLD      MID      NEW
      |        |        |
    Down  Competition   Up
             /    \
           YES     NO
            |       |
          Down     Up
Decision Tree: Overfitting
Here, we are providing only one attribute to identify the object, i.e. Shape = Sphere.
Example: Let us consider that you have provided a large number of attributes, like:
• Sphere
• Play
• Not Eat
• Radius = 5 cm
● The KNN algorithm can be used for both classification and regression predictive problems. However, it is more widely used for classification problems in industry.
● At the training phase, the KNN algorithm just stores the dataset; when it gets new data, it classifies that data into the category most similar to the new data.
● KNN works by finding the distances between a query and all the examples in the data, selecting the specified number of examples (K) closest to the query, then voting for the most frequent label (in the case of classification) or averaging the labels (in the case of regression).
Instance-Based Learning / Lazy Algorithm
K-Nearest Neighbor Algorithm
Sr.No  X1  X2  Result
1      4   3   Fail
2      6   7   Pass
3      7   8   Pass
4      5   5   Fail
5      8   8   Pass
X      6   8   ?

Euclidean distance from the query X = (6, 8):
d = √( (XQ1 − XA1)² + (XQ2 − XA2)² )

d1 = √( (6 − 4)² + (8 − 3)² ) = √29 = 5.38
d2 = √( (6 − 6)² + (8 − 7)² ) = 1
d3 = √( (6 − 7)² + (8 − 8)² ) = 1
d4 = √( (6 − 5)² + (8 − 5)² ) = √10 = 3.16
d5 = √( (6 − 8)² + (8 − 8)² ) = 2

With K = 3, the nearest neighbors are samples 2, 3, and 5, all labelled Pass:
3 Pass and 0 Fail, and 3 > 0, so X = (6, 8) is classified as Pass.
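The worked example above can be sketched directly as code: compute the Euclidean distances, keep the K = 3 closest samples, and take a majority vote:

```python
# Sketch of the slide's KNN example: classify query X = (6, 8) with K = 3,
# using Euclidean distance and a majority vote over the nearest labels.
from math import dist
from collections import Counter

data = [((4, 3), "Fail"), ((6, 7), "Pass"), ((7, 8), "Pass"),
        ((5, 5), "Fail"), ((8, 8), "Pass")]
query, k = (6, 8), 3

# sort all samples by distance to the query and keep the k closest
neighbors = sorted(data, key=lambda row: dist(query, row[0]))[:k]

# majority vote over the neighbor labels
label = Counter(lbl for _, lbl in neighbors).most_common(1)[0][0]
print(label)  # the 3 nearest (d = 1, 1, 2) are all Pass -> "Pass"
```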
How do we choose the factor 'K'? When do we use KNN?