
Advanced Regression

Probabilistic interpretation of LR. Classification algorithms for regression
Contents
1. Bayesian explanation of regularized regression
2. Classification algorithms for regression
Bayesian regression
Classical probabilistic view on linear regression
Consider we have n points Y drawn i.i.d. from a normal distribution. The
probability of observing those points defines the likelihood function, which
is just the product of their densities at each point. A good estimate of the
mean maximizes that likelihood.
Classical probabilistic view on linear regression
● Assume our mean is a function of the predictors X
● Thus our target is distributed normally around that mean
● The regression parameters are then estimated by maximizing the resulting
likelihood (see the sketch below)

A probabilistic interpretation of regularization
● Using Bayes' theorem we can estimate the probability distribution of the
parameters θ given the data Y that we observe
● That gives us the opportunity to set a prior distribution on the model
parameters
● Compare that with the classical method, where we instead try to find the
parameters that maximize the likelihood of the data given the parameters
A probabilistic interpretation of regularization
● We maximize the posterior probability of the parameters using Bayes'
theorem (Maximum A Posteriori estimation)
● Compare that to the MLE estimate (see the sketch below)
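Sketched side by side (standard form, using the notation above):

```latex
% MAP: the prior p(theta) enters the objective
\hat{\theta}_{\mathrm{MAP}} = \arg\max_{\theta}\, p(\theta \mid Y)
  = \arg\max_{\theta}\, p(Y \mid \theta)\, p(\theta)
% MLE: only the likelihood of the data given theta
\hat{\theta}_{\mathrm{MLE}} = \arg\max_{\theta}\, p(Y \mid \theta)
```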


A probabilistic interpretation of L2 regularization
● Assume our model parameters are zero-mean normally distributed with
variance τ² (prior knowledge)
● A small variance τ² (large λ) strongly shrinks the coefficients; a large
variance (small λ) leaves them almost unaffected (see the sketch below)
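A sketch of how the Gaussian prior turns into the L2 penalty, assuming noise variance σ² (so that λ = σ²/τ²):

```latex
% Prior on each coefficient: theta_j ~ N(0, tau^2)
\hat{\theta}_{\mathrm{MAP}}
  = \arg\max_{\theta}\Big[\log p(Y \mid \theta) + \log p(\theta)\Big]
  = \arg\min_{\theta}\Big[\sum_{i=1}^{n}(y_i - \theta^{\top}x_i)^2
    + \frac{\sigma^2}{\tau^2}\,\lVert\theta\rVert_2^2\Big]
% i.e. ridge regression with lambda = sigma^2 / tau^2
```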
A probabilistic interpretation of L1 regularization
● The Laplace distribution with mean μ and scale (diversity) b is defined by
its probability density function
● Compare it to the normal density (both are written out below)
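The two densities written out for comparison (standard forms, since the slide formulas are not reproduced here):

```latex
% Laplace density: sharp peak at the mean
f(x \mid \mu, b) = \frac{1}{2b}\exp\!\left(-\frac{|x-\mu|}{b}\right)
% Normal density: smooth at the mean
f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma}
  \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
```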
A probabilistic interpretation of L1 regularization
● Assume our model parameters are zero-mean Laplace-distributed with scale b
● L1 regularization promotes sparsity, in contrast to the "just shrink the
coefficients" behaviour of L2. That makes sense if you look at the Laplace
density, which has a sharp peak at x = μ (see the sketch below).
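By analogy with the L2 case, a sketch of the MAP objective under the Laplace prior (again assuming noise variance σ²):

```latex
% Prior on each coefficient: theta_j ~ Laplace(0, b)
\hat{\theta}_{\mathrm{MAP}}
  = \arg\min_{\theta}\Big[\sum_{i=1}^{n}(y_i - \theta^{\top}x_i)^2
    + \frac{2\sigma^2}{b}\,\lVert\theta\rVert_1\Big]
% i.e. lasso regression with lambda = 2*sigma^2 / b
```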
Why does L1 zero out coefficients whereas L2 does not?
Why does L1 zero out coefficients whereas L2 does not?
● Laplace distribution (sharp peak at x = μ) vs. normal
distribution (smooth at the mean)
● Intuitive understanding through gradient descent:
https://developers.google.com/machine-learning/crash-course/regularization-for-sparsity/l1-regularization
● Intuitive understanding through visualization in the 2D case:
https://explained.ai/regularization/L1vsL2.html
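One quick way to see the difference in practice is to compare coefficient sparsity; a minimal sketch using scikit-learn (dataset and penalty strengths are arbitrary choices):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# Regression problem where only a few of the 50 features are informative
X, y = make_regression(n_samples=200, n_features=50, n_informative=5,
                       noise=10.0, random_state=0)

lasso = Lasso(alpha=1.0).fit(X, y)   # L1 penalty
ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty

# L1 drives most coefficients exactly to zero; L2 only shrinks them
print("Lasso zero coefficients:", np.sum(lasso.coef_ == 0.0))
print("Ridge zero coefficients:", np.sum(ridge.coef_ == 0.0))
```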
Generalized linear models
● What if we change the hypothesis about how the target is distributed and
how its mean is linked to the predictors?
● https://scikit-learn.org/stable/modules/linear_model.html#generalized-linear-regression
Real world examples
● Insurance cost (Tweedie distribution)
● Number of calls arriving in a call center per hour (Poisson distribution)
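As an illustration of the call-center example, a minimal sketch with synthetic count data (the features and coefficients are made up):

```python
import numpy as np
from sklearn.linear_model import PoissonRegressor

rng = np.random.default_rng(0)

# Synthetic features: hour of day and number of active marketing campaigns
X = np.column_stack([rng.integers(0, 24, 500), rng.integers(0, 5, 500)])
# Synthetic Poisson-distributed call counts whose rate depends on the features
rate = np.exp(0.05 * X[:, 0] + 0.3 * X[:, 1])
y = rng.poisson(rate)

# GLM with a log link and Poisson deviance loss
model = PoissonRegressor(alpha=1e-3).fit(X, y)
print(model.predict([[12, 2]]))  # expected calls at noon with 2 campaigns
```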
Classification algorithms for regression
KNN regressor
How do we calculate a continuous target with KNN?
KNN regressor
How do we calculate a continuous target with KNN?
● Intuitive – each object in the training set has a known target value
● We find k neighbors for a prediction – let's average their target values!
● We can also weight the neighbors by distance, so closer neighbors
contribute more (see the sketch below)
Pros:
● Simple, not many changes from the classifier
Cons:
● All the cons of KNN
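A minimal sketch of both options in scikit-learn (uniform averaging vs. distance weighting); the data here is synthetic:

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.1, 100)

# Plain averaging of the k nearest targets
knn_uniform = KNeighborsRegressor(n_neighbors=5, weights="uniform").fit(X, y)
# Closer neighbors get larger weights (inverse distance)
knn_weighted = KNeighborsRegressor(n_neighbors=5, weights="distance").fit(X, y)

X_test = [[2.5]]
print(knn_uniform.predict(X_test), knn_weighted.predict(X_test))
```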
Decision tree regressor
How can we change a decision tree to solve regression tasks?
Decision tree regressor
How can we change a decision tree to solve regression tasks?
● Every leaf now contains a set of objects; the average of their target
values is the prediction we are looking for.
● We have to use other, continuous measures of split quality in place of
information gain:
○ Variance (standard deviation) reduction (see the sketch below)
Pros:
● Simplicity and interpretability of decision trees
Cons:
● Limited set of predicted values (only the leaf averages)
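A small sketch of the variance-based split criterion the bullet refers to (my own helper function, not taken from the slides):

```python
import numpy as np

def variance_reduction(y_parent, y_left, y_right):
    """Decrease in target variance achieved by a candidate split.

    The best split maximizes this value, playing the role that
    information gain plays in classification trees.
    """
    n = len(y_parent)
    weighted_child_var = (len(y_left) / n) * np.var(y_left) \
                       + (len(y_right) / n) * np.var(y_right)
    return np.var(y_parent) - weighted_child_var

# Toy example: splitting [1, 2, 10, 11] into [1, 2] and [10, 11]
y = np.array([1.0, 2.0, 10.0, 11.0])
print(variance_reduction(y, y[:2], y[2:]))  # large reduction -> good split
```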
Random forest regressor
How can we change a random forest to solve regression tasks?
Random forest regressor
How can we change a random forest to solve regression tasks?
● Nothing really changes: take the decision tree regressor as the base
learner and average the results across the estimators (see the sketch below)
Pros & cons:
● Everything is the same as in the random forest classifier
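For completeness, a minimal sketch in scikit-learn (synthetic data, arbitrary hyperparameters):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=300, n_features=10, noise=5.0, random_state=0)

# Each tree is a decision tree regressor; the forest averages their predictions
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print(forest.predict(X[:3]))
```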


Support Vector Machine
How is this going to work?
Support Vector Machine
How is this going to work?
● We reverse the SVM task: we build a tube around the regression plane, as
narrow as possible, that keeps as many points as possible inside it
● The objective ("Minimize") and the constraints are sketched below
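The standard ε-insensitive SVR formulation (textbook form; the slide's own formulas are not reproduced here):

```latex
% Minimize: a flat function plus a penalty for points outside the epsilon-tube
\min_{w,\,b,\,\xi,\,\xi^*}\;
  \frac{1}{2}\lVert w\rVert^2 + C\sum_{i=1}^{n}(\xi_i + \xi_i^*)
% Constraints: every point lies within epsilon of the prediction,
% up to slack variables xi, xi*
\text{s.t.}\quad
  y_i - w^{\top}x_i - b \le \varepsilon + \xi_i,\qquad
  w^{\top}x_i + b - y_i \le \varepsilon + \xi_i^*,\qquad
  \xi_i,\,\xi_i^* \ge 0
```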
Gradient boosting
How do we use GB for regression tasks?
Gradient boosting
How do we use GB for regression tasks?

● Every new learner is fitted on the gradient of the loss with respect to
the predictions of the ensemble of previous learners
● With squared error loss, that means we fit every new tree on the residuals
from the previous step (see the sketch below)
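A minimal from-scratch sketch for squared error loss (the learning rate, tree depth, and number of rounds are arbitrary choices of mine):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=300, n_features=5, noise=5.0, random_state=0)

learning_rate = 0.1
prediction = np.full_like(y, y.mean(), dtype=float)  # start from the mean
trees = []

for _ in range(100):
    residuals = y - prediction              # negative gradient of squared loss
    tree = DecisionTreeRegressor(max_depth=3).fit(X, residuals)
    prediction += learning_rate * tree.predict(X)
    trees.append(tree)

print("train MSE:", np.mean((y - prediction) ** 2))
```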
Advanced Hyperparameter Tuning
Advanced Hyperparameter Tuning
Which techniques do you already know?
Advanced Hyperparameter Tuning
Which techniques do you already know?

● Blind pick
● Grid Search
● Random Search
Advanced Hyperparameter Tuning
● HyperOpt (http://hyperopt.github.io/hyperopt/). The idea behind it can be
explained through Bayesian optimization
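A minimal sketch of HyperOpt's TPE search wrapped around cross-validation (the model and search space are my own illustrative choices):

```python
import numpy as np
from hyperopt import fmin, tpe, hp, Trials
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=300, n_features=10, noise=5.0, random_state=0)

def objective(params):
    model = GradientBoostingRegressor(
        max_depth=int(params["max_depth"]),
        learning_rate=params["learning_rate"],
        random_state=0,
    )
    # hyperopt minimizes, so return the negative cross-validated R^2
    return -cross_val_score(model, X, y, cv=3).mean()

space = {
    "max_depth": hp.quniform("max_depth", 2, 8, 1),
    "learning_rate": hp.loguniform("learning_rate", np.log(0.01), np.log(0.3)),
}

best = fmin(fn=objective, space=space, algo=tpe.suggest,
            max_evals=30, trials=Trials())
print(best)
```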
