
CSC380: Principles of Data Science

Linear Models

Prof. Jason Pacheco


TAs: Enfa Rose George, Saiful Islam Salim
Outline

• Linear Regression
• Least Squares Estimation
• Regularized Least Squares
• Logistic Regression
Linear Regression

Regression: Learn a function that predicts outputs from inputs,

    y = f(x)

Outputs y are real-valued.

Linear Regression: As the name suggests, uses a linear function:

    y = w x + b

We will add noise later…

[Figure: scatter of input X vs. output Y with a fitted line]
Linear Regression

Where is linear regression useful?

[Figures: trendlines, stock prediction, climate models (Massie and Rose, 1997)]

Used anywhere a linear relationship is assumed between continuous inputs / outputs.
Line Equation

Recall that the equation for a line has a slope and an intercept,

    y = w x + b    (slope w, intercept b)

• Intercept (b) indicates where the line crosses the y-axis
• Slope (w) controls the angle of the line
  • Positive slope: line goes up left-to-right
  • Negative slope: line goes down left-to-right
Moving to higher dimensions…

In higher dimensions: Line → Plane

There are multiple ways to define a plane; we will use:

• a normal vector (controls orientation)
• an in-plane vector (handles the offset)

The regression weights will take the place of the normal vector.

Source: http://www.songho.ca/math/plane/plane.html
Inner Products

Recall the definition of an inner product:

    w^T x = sum_d w_d x_d

Equivalently, the projection of one vector onto another,

    w^T x = ||w|| ||x|| cos(θ)

where the vector norm is ||w|| = sqrt(w^T w).
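A quick numerical check of these identities (a small NumPy sketch; the vectors are made up for illustration):

import numpy as np

w = np.array([1.0, 2.0, -0.5])
x = np.array([0.5, 1.0, 4.0])

dot = w @ x                          # inner product: sum_d w_d x_d
norm_w = np.linalg.norm(w)           # ||w|| = sqrt(w^T w)
norm_x = np.linalg.norm(x)
cos_theta = dot / (norm_w * norm_x)  # projection form: w^T x = ||w|| ||x|| cos(theta)

print(dot, norm_w * norm_x * cos_theta)  # both print the same value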
Linear Regression
[ Image: Murphy, K. (2012) ]

For a D-dimensional input vector x = (x_1, …, x_D), the plane equation is

    y = w_1 x_1 + … + w_D x_D + w_0 = w^T x + w_0

Often we simplify this by including the intercept in the weight vector: prepend a constant 1 to the input, so x = (1, x_1, …, x_D) and w = (w_0, w_1, …, w_D).

Since:  w^T x = w_0 + w_1 x_1 + … + w_D x_D
Linear Regression

The input-output mapping is not exact, so we add zero-mean Gaussian noise:

    y = w^T x + ε,   where ε ~ N(0, σ²)   (uncorrelated noise)

This is equivalent to the likelihood function,

    p(y | x, w) = N(y | w^T x, σ²)

because adding a constant to a Normal RV is still a Normal RV: if z ~ N(μ, σ²) then z + c ~ N(μ + c, σ²). In the case of linear regression, μ = 0 and c = w^T x.
Great, we're done right?

The model is y = w^T x + ε:

• Data (x, y) – we have this
• Noise ε – random; can't do anything about it
• Weights w – don't know these; need to learn them

We need to fit the model to data by learning the regression weights. How do we do this? What makes good weights?
Learning Linear Regression Models

There are several ways to think about fitting regression:

• Intuitive: Find a plane/line that is close to the data

• Functional: Find a line that minimizes the least squares loss

• Estimation: Find the maximum likelihood estimate of the parameters

They are all the same thing…


Fitting Linear Regression

Intuition: Find a line that is as close as possible to every training data point.

The distance from each point to the line is the residual.

[Figure: training outputs vs. predictions, with residuals shown]

https://www.activestate.com/resources/quick-reads/how-to-run-linear-regressions-in-python-scikit-learn/
Outline

• Linear Regression
• Least Squares Estimation
• Regularized Least Squares
• Logistic Regression
Least Squares Solution

Functional: Find a line that minimizes the sum of squared residuals.

Over all the training data,

    L(w) = sum_i ( y_i - w^T x_i )²

This is least squares regression.

https://www.activestate.com/resources/quick-reads/how-to-run-linear-regressions-in-python-scikit-learn/
Least Squares

This is just a quadratic function…

• Convex, unique minimum
• Minimum given by zero derivative
• Can find a closed-form solution

Let's see the scalar case with no bias:  L(w) = sum_i ( y_i - w x_i )²

Least Squares : Simple Case

Derivative (+ chain rule):   dL/dw = -2 sum_i x_i ( y_i - w x_i ) = 0

Distributive property:       sum_i x_i y_i = w sum_i x_i²

Algebra:                     w = ( sum_i x_i y_i ) / ( sum_i x_i² )
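A small numerical sketch of this scalar solution (NumPy; the data values are made up for illustration):

import numpy as np

# toy 1-D data with no bias term
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.1, 0.9, 2.1, 2.9, 4.2])

# closed-form scalar least squares: w = sum(x*y) / sum(x^2)
w = np.sum(x * y) / np.sum(x ** 2)
print(w)  # close to 1.0 for this toy data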
Least Squares in Higher Dimensions
[ Image: Murphy, K. (2012) ]

Things are a bit more complicated in higher dimensions and involve more linear algebra:

• Design matrix X (one training input per row)
• Vector of training labels y

We can write regression over all training data more compactly as an N x 1 vector of predictions,

    y_hat = X w
Least Squares in Higher Dimensions
[ Image: Murphy, K. (2012) ]

Least squares can also be written more compactly,

    L(w) = || y - X w ||²

Some slightly more advanced linear algebra gives us a solution,

    w_hat = ( X^T X )^{-1} X^T y

The derivation is a bit advanced for this class, but…
• We know it has a closed form and why
• We can evaluate it
• We generally know where it comes from

This is the Ordinary Least Squares (OLS) solution.
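A minimal sketch of the OLS solution in NumPy (synthetic data; np.linalg.solve is used rather than an explicit matrix inverse for numerical stability):

import numpy as np

rng = np.random.default_rng(0)
N, D = 100, 3
X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, D))])  # prepend a column of 1s for the intercept
w_true = np.array([0.5, 1.0, -2.0, 0.3])
y = X @ w_true + 0.1 * rng.normal(size=N)

# OLS: w_hat = (X^T X)^{-1} X^T y, solved as a linear system
w_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(w_hat)  # close to w_true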
Learning Linear Regression Models

There are several ways to think about fitting regression:

• Intuitive: Find a plane/line that is close to the data

• Functional: Find a line that minimizes the least squares loss

• Estimation: Find the maximum likelihood estimate of the parameters

They are all the same thing…


MLE for Linear Regression

Given training data (x_1, y_1), …, (x_N, y_N), the likelihood function is given by

    p(y | X, w) = prod_i p(y_i | x_i, w)

Recall that the likelihood is Gaussian:  p(y_i | x_i, w) = N(y_i | w^T x_i, σ²)

So the MLE maximizes the log-likelihood over the whole data,

    w_MLE = argmax_w sum_i log N(y_i | w^T x_i, σ²)


Univariate Gaussian (Normal) Distribution

Gaussian (a.k.a. Normal) distribution with mean (location) μ and variance (scale) σ² parameters,

PDF:      N(x | μ, σ²) = (1 / sqrt(2π σ²)) exp( -(x - μ)² / (2σ²) )

The logarithm of the PDF is just a negative quadratic,

Log-PDF:  log N(x | μ, σ²) = -½ log(2π σ²) - (x - μ)² / (2σ²)
                             (constant in mean)  (quadratic function of mean)

Notation

The likelihood of the basic linear regression model is N(y | w^T x, σ²)…

…we will just look at learning the mean parameter for now.


MLE of Gaussian Mean

Assume the data y_1, …, y_N are i.i.d. univariate Gaussian with known variance,  y_i ~ N(μ, σ²).

Log-likelihood function:

    log p(y | μ) = -(N/2) log(2π σ²) - (1 / 2σ²) sum_i (y_i - μ)²
                   (the constant doesn't depend on the mean)

The MLE doesn't change when we:
1) Drop constant terms (in μ)
2) Minimize the negative log-likelihood

So the MLE estimate is the least squares estimator:  μ_hat = argmin_μ sum_i (y_i - μ)²
MLE of Linear Regression

Substitute the linear regression prediction μ_i = w^T x_i into the MLE solution and we have,

    w_MLE = argmin_w sum_i ( y_i - w^T x_i )²

So for Linear Regression,

    MLE = Least Squares Estimation

https://www.activestate.com/resources/quick-reads/how-to-run-linear-regressions-in-python-scikit-learn/
Multivariate Gaussian Distribution

We have only seen scalar (1-dimensional) X, but MLE is still least squares for higher-dimensional X…

Let x ∈ R^D with mean μ and positive semidefinite covariance matrix Σ; then the PDF is

    N(x | μ, Σ) = (2π)^{-D/2} |Σ|^{-1/2} exp( -½ (x - μ)^T Σ^{-1} (x - μ) )

Again, the logarithm is a negative quadratic form,

    log N(x | μ, Σ) = -½ log( (2π)^D |Σ| ) - ½ (x - μ)^T Σ^{-1} (x - μ)
                      (constant in mean)     (quadratic function of mean)


Multivariate Quadratic Form

The quadratic form for vectors is given by an inner product,

    (x - μ)^T Σ^{-1} (x - μ)

For iid data the MLE of the Gaussian mean is once again least squares,

    μ_hat = argmin_μ sum_i (x_i - μ)^T Σ^{-1} (x_i - μ)

• Strongly convex
• Differentiable
• Unique optimizer at zero gradient
Notation

Substituting the multi-dimensional linear regression mean μ_i = w^T x_i…

…brings us back to the least squares solution.


MLE of Linear Regression
[ Image: Murphy, K. (2012) ]

Using the previous results, MLE is equivalent to minimizing the squared residuals,

    w_MLE = argmin_w || y - X w ||²

Some slightly more advanced linear algebra gives us a solution,

    w_hat = ( X^T X )^{-1} X^T y

The derivation is a bit advanced for this class, but…
• We know it has a closed form and why
• We can evaluate it
• We generally know where it comes from

This is the Ordinary Least Squares (OLS) solution.
Linear Regression Summary

1. Definition of the linear regression model,

       y = w^T x + ε,   where ε ~ N(0, σ²)

2. For N iid training data, fit using least squares,

       w_hat = argmin_w sum_i ( y_i - w^T x_i )²

3. Equivalent to the maximum likelihood solution

Linear Regression Summary

The ordinary least squares solution is solved in closed form using the Normal equations,

    w_hat = ( X^T X )^{-1} X^T y

with design matrix X (one training input per row) and vector of training labels y.

QUESTIONS?
A word on matrix inverses…

The least squares solution requires inversion of the term X^T X.

What are some issues with this?

1. Requires O(D³) time for D input features

2. May be numerically unstable (or even non-invertible):
   small numerical errors in the input can lead to large errors in the solution
Pseudoinverse

The Moore-Penrose pseudoinverse of X is denoted X⁺.

• Generalization of the standard matrix inverse
• Exists even for non-invertible X^T X
• Directly computable in most libraries
• In NumPy it is: numpy.linalg.pinv
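A short sketch of using the pseudoinverse for least squares (NumPy; synthetic data for illustration):

import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))
y = X @ np.array([1.0, 0.0, -1.0, 2.0]) + 0.1 * rng.normal(size=50)

# w_hat = X^+ y  -- works even when X^T X is (near-)singular
w_hat = np.linalg.pinv(X) @ y
print(w_hat)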
Linear Regression in Scikit-Learn

Load your libraries, load the data, and make a train / test split (the test split is held out for evaluation).
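The slide's code is not reproduced here; a minimal sketch of these steps (using the bundled diabetes dataset as a stand-in) might look like:

from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split   # for evaluation on held-out data

# Load data
X, y = load_diabetes(return_X_y=True)

# Train / test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)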


Linear Regression in Scikit-Learn

Train (fit) the model, predict, and plot the regression line against the test set.
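A self-contained sketch of this step (a single feature is used so the regression line is easy to plot; dataset and names are illustrative):

import matplotlib.pyplot as plt
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X = X[:, [2]]                                     # single feature, so the line is easy to plot
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = LinearRegression().fit(X_train, y_train)  # train (fit)
y_pred = model.predict(X_test)                    # predict

plt.scatter(X_test, y_test, label="test data")    # plot regression line with the test set
plt.plot(X_test, y_pred, color="red", label="fitted line")
plt.legend()
plt.show()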


Outline

• Linear Regression
• Least Squares Estimation
• Regularized Least Squares
• Logistic Regression
Outliers

How does an outlier affect the estimator?

[Figure: squared error of each data point]
Outliers in Linear Regression

An outlier "pulls" the regression line away from the inlier data.

We need a way to ignore, or to down-weight, the impact of the outlier.

[Figure: regression line dragged toward a single outlier]

https://www.jmp.com/en_us/statistics-knowledge-portal/what-is-multiple-regression/mlr-residual-analysis-and-outliers.html
Dealing with Outliers

Too many outliers can indicate many things: non-Gaussian (heavy-tailed) data, corrupt data, bad data collection, …

A few ways to handle outliers:

1. Use a heavy-tailed noise distribution (Student's t)
   Fitting the regression becomes difficult
2. Identify outliers and discard them
   NP-hard, and throwing away data is generally bad
3. Penalize large weights to avoid overfitting (Regularization)


Regularization

Recall, regularization helps avoid overfitting the training data…

    min_w  L(w) + λ R(w)
           (λ: regularization strength, R(w): regularization penalty)

[Figure: red model is without regularization; green model includes regularization]
Regularized Least Squares

Ordinary least-squares estimation (no regularizer) — we already know how to solve this:

    min_w || y - X w ||²

L2-regularized least-squares (Ridge) — quadratic penalty:

    min_w || y - X w ||² + λ ||w||₂²

L1-regularized least-squares (LASSO) — absolute value (L1) penalty:

    min_w || y - X w ||² + λ ||w||₁


A word on vector norms…

The L2-norm (Euclidean norm) of a vector w is,

    ||w||₂ = sqrt( sum_d w_d² )

The L1-norm (absolute value) of a vector w is,

    ||w||₁ = sum_d |w_d|

They are not the same function…


Other Regularization Terms

A more general regularization penalty,

    R(w) = sum_d |w_d|^q

[Figure: contours of the penalty for several q — q < 1 is not a norm and thus not convex; L1 (q = 1) is non-differentiable; q = 2 is L2 regularization]


Administrative Items

• HW7 out Thursday (Due next Thursday)

• HW6 due tonight



Regularized Least Squares

A couple of regularizers are so common that they have specific names:

L2 Regularized Linear Regression


• Ridge Regression
• Tikhonov Regularization

L1 Regularized Linear Regression


• LASSO
• Stands for: Least Absolute Shrinkage and Selection Operator
L2 Regularized Least Squares

    min_w || y - X w ||² + λ ||w||₂²
          (quadratic)      (quadratic)

Quadratic + Quadratic = Quadratic

• Differentiable
• Convex
• Unique optimum
• Closed-form solution
L2 Regularized Least Squares : Simple Case

Scalar case with no bias:  L(w) = sum_i ( y_i - w x_i )² + λ w²

Derivative (+ chain rule):   dL/dw = -2 sum_i x_i ( y_i - w x_i ) + 2 λ w = 0

Distributive property:       sum_i x_i y_i = w ( sum_i x_i² + λ )

Algebra:                     w = ( sum_i x_i y_i ) / ( sum_i x_i² + λ )
L2 Regularized Linear Regression – Ridge Regression
Source: Kevin Murphy's Textbook

After some algebra…

    w_ridge = ( X^T X + λ I )^{-1} X^T y

Compare to ordinary least squares:

    w_OLS = ( X^T X )^{-1} X^T y

Regularized least squares includes a pseudocount λ in the weighting, similar to the Gaussian mean estimator.
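A minimal NumPy sketch of the ridge closed form (synthetic data; lam stands in for the regularization strength λ):

import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=60)

lam = 1.0
D = X.shape[1]
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(D), X.T @ y)  # (X^T X + lam*I)^{-1} X^T y
w_ols = np.linalg.solve(X.T @ X, X.T @ y)
print(np.linalg.norm(w_ridge), np.linalg.norm(w_ols))  # ridge weights are shrunk toward zero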
Notes on L2 Regularization

• Feature weights are "shrunk" towards zero (and each other) – statisticians often call this a "shrinkage" method
• Typically we do not penalize the bias (y-intercept, w0) parameter
• Penalizing w0 would make the solution depend on the origin chosen for Y – adding a constant c to Y would not simply shift the predictions by the same constant
• Can fit the bias in a two-step procedure: center the features, then the bias estimate is w0 = mean(y)
• Solutions are not invariant to scaling, so typically we standardize (e.g. Z-score) features before fitting the model ( Sklearn StandardScaler )
Scikit-Learn : L2 Regularized Regression

The alpha parameter is what we have been calling λ.

Scikit-Learn : L2 Regularized Regression

Define and fit OLS and L2 (Ridge) regression, then plot the results.

L2 (Ridge) reduces the impact of any single data point.
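The slide's code is not reproduced; a sketch of defining and fitting OLS and Ridge side by side (synthetic data, illustrative values):

import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(3)
X = rng.normal(size=(30, 1))
y = 2.0 * X[:, 0] + 0.3 * rng.normal(size=30)
y[0] += 10.0                        # inject a single outlier

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=5.0).fit(X, y)  # alpha plays the role of lambda

print(ols.coef_, ridge.coef_)       # the ridge coefficient is shrunk relative to OLS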


Choosing Regularization Strength

We need to tune the regularization strength λ to avoid over/under fitting…

Recall the bias/variance tradeoff:

    Error = Irreducible error + Bias² + Variance

High regularization reduces model complexity: it increases bias / decreases variance.

How should we properly tune λ?


Cross-Validation

N-fold Cross Validation: Partition the training data into N "chunks" and for each run select one chunk to be the validation data.

For each run, fit to the training data (N-1 chunks) and measure accuracy on the validation set. Average the model error across all runs.

Drawback: Need to perform training N times.

Source: Bishop, C. PRML
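A sketch of tuning alpha by cross-validation with scikit-learn (the grid of alphas and the dataset are arbitrary stand-ins):

import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

X, y = load_diabetes(return_X_y=True)

alphas = np.logspace(-3, 3, 13)
mean_scores = []
for a in alphas:
    # 10-fold cross-validation score (R^2 by default for regressors)
    scores = cross_val_score(Ridge(alpha=a), X, y, cv=10)
    mean_scores.append(scores.mean())

best_alpha = alphas[int(np.argmax(mean_scores))]
print(best_alpha)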


Model Selection for Linear Regression

A couple of common metrics for model selection…

Residual Sum-of-Squared Errors: The total squared residual error on the held-out validation set,

    RSS = sum_i ( y_i - y_hat_i )²

Coefficient of Determination: Also called R-squared or R². The fraction of variation explained by the model.

Model selection metrics are known as "goodness of fit" measures.


Coefficient of Determination R²

    R² = 1 - RSS / TSS
       = 1 - sum_i ( y_i - y_hat_i )² / sum_i ( y_i - y_bar )²

The numerator is the residual sum-of-squares; the denominator is the total variance in the dataset (the variance when using the average prediction), where y_bar is the average output.
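A quick sketch checking the R² definition against scikit-learn's implementation (toy numbers chosen arbitrarily):

import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([3.0, 1.0, 4.0, 1.5, 5.0])
y_pred = np.array([2.8, 1.2, 3.5, 1.9, 4.6])

rss = np.sum((y_true - y_pred) ** 2)
tss = np.sum((y_true - y_true.mean()) ** 2)
print(1 - rss / tss, r2_score(y_true, y_pred))  # the two values agree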


Coefficient of Determination R²

The maximum value R² = 1.0 means the model explains all variation in the data.

R² = 0 means the model is only as good as predicting the average response.

R² < 0 means the model is worse than predicting the average output.

[Figure: example fits with R² > 0 and R² = 0]
"Shrinkage" Feature Selection

Down-weight features that are not useful for prediction…

The quadratic penalty down-weights (shrinks) features that are not useful for prediction.

Example: The Prostate Cancer Dataset measures prostate-specific antigen, with features: age, log prostate weight (lweight), log benign prostate hyperplasia (lbph), Gleason score (gleason), seminal vesicle invasion (svi), etc.

L2 regularization learns a (near-)zero weight for log capsular penetration (lcp).

[ Source: Hastie et al. (2001) ]


Constrained Optimization Perspective

Intuition: Find the best model (lowest RSS) given a constraint on the total feature weight norm,

    min_w || y - X w ||²   subject to   ||w||₂² ≤ t

For some t(λ) this is a mathematically equivalent formulation of the penalized problem.

L2 penalized regression rarely learns feature weights that are exactly zero…

[Figure: squared-error contours around the optimal model, with the circular L2 constraint region]
[ Source: Hastie et al. (2001) ]
Regularized Least Squares

Ordinary least-squares estimation (no regularizer):

    min_w || y - X w ||²

L2-regularized least-squares (Ridge) — quadratic penalty:

    min_w || y - X w ||² + λ ||w||₂²

L1-regularized least-squares (LASSO) — absolute value (L1) penalty:

    min_w || y - X w ||² + λ ||w||₁

L1 Regularized Least-Squares

[Figure: squared-error contours around the optimal model with the diamond-shaped L1 constraint region; the constrained solution learns w2 = 0]

L1 is able to zero out weights that are not predictive…


Feature Weight Profiles

Varying the regularization parameter moderates the shrinkage factor.

For moderate regularization strength, the weights for many features go to zero.

• Induces feature sparsity
• Ideal for high-dimensional settings
• Gracefully handles the p > N case, for p features and N training data

Feature Weight Profiles

[Figure: feature weight profiles as a function of regularization strength, L1 penalty vs. L2 penalty]
Learning L1 Regularized Least-Squares

The absolute value penalty is not differentiable…

…its derivative doesn't exist at x = 0.

We can't set derivatives to zero as in the L2 case!
Learning L1 Regularized Least-Squares

• Not differentiable, no closed-form solution

• But it is convex! Can be solved by quadratic programming (beyond the scope of this class…)

• Efficient optimization algorithms exist

• Least Angle Regression (LAR) computes the full solution path for a range of λ values

• Can be solved as efficiently as L2 regression

Specialized methods for cross-validation exist:

• one computes the solution using coordinate descent
• another uses least angle regression (LARS) to compute the solution path


L1 Regression Cross-Validation

Perform L1 least squares (LASSO) 20-fold cross-validation, using either the coordinate-descent or the LARS-based estimator.

Plot the solution path for the range of alphas:

• all candidate values are stored in alphas_
• the learned value is alpha_ (no "s"… annoying…)
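The slide's code is not shown; a sketch with scikit-learn's cross-validated LASSO estimators (LassoCV uses coordinate descent, LassoLarsCV uses LARS; the dataset here is just a stand-in):

import matplotlib.pyplot as plt
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LassoCV, LassoLarsCV

X, y = load_diabetes(return_X_y=True)

lasso = LassoCV(cv=20).fit(X, y)          # coordinate descent
# lasso = LassoLarsCV(cv=20).fit(X, y)    # or: least angle regression (LARS)

print(lasso.alphas_.shape)                # all alphas_ tried
print(lasso.alpha_)                       # learned alpha_ (no "s")

# mean cross-validated error along the regularization path
plt.plot(lasso.alphas_, lasso.mse_path_.mean(axis=1))
plt.xscale("log")
plt.xlabel("alpha")
plt.ylabel("mean CV MSE")
plt.show()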


Example: Prostate Cancer Dataset

The best LASSO model learns to ignore several features (age, lcp, gleason, pgg45).

Wait… is age really not a significant predictor of prostate cancer? What's going on here?

Age is highly correlated with other factors, and thus not significant in the presence of those factors.
Administrative Items

HW7 will be posted tonight


• Ordinary least squares regression
• Ridge regression
• Lasso
• Feature selection

Due next Thursday (11/11)


• A bit more is left up to the student compared to HW5 / HW6
Best-Subset Selection

L1 / L2 shrinkage offers approximate feature selection…

The optimal strategy for p features looks at models over all possible combinations of features (see the sketch after the pseudocode):

For k in 1,…,p:
    subsets = all subsets of k features (p-choose-k of them)
    For kfeat in subsets:
        model = Train model on the features in kfeat
        score = Evaluate model using cross-validation
Choose the model with the best cross-validation score
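A runnable sketch of this search using itertools and scikit-learn (only feasible for small p; the dataset and estimator are illustrative choices):

from itertools import combinations

import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X, y = load_diabetes(return_X_y=True)
p = X.shape[1]

best_score, best_subset = -np.inf, None
for k in range(1, p + 1):
    for subset in combinations(range(p), k):          # all p-choose-k subsets of size k
        cols = list(subset)
        score = cross_val_score(LinearRegression(), X[:, cols], y, cv=10).mean()
        if score > best_score:
            best_score, best_subset = score, subset

print(best_subset, best_score)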
Best-Subset Selection : Prostate Cancer Dataset

Each marker is the cross-validation R² score of a trained model for one subset of features.

The data have 8 features; there are 8-choose-k subsets for each k = 1,…,8, for a total of 255 models.

Using 10-fold cross-validation requires 10 x 255 = 2,550 training runs!
Feature Selection: Prostate Cancer Dataset
Best subset has highest test accuracy (lowest
variance) with just 2 features

[ Source: Hastie et al. (2001) ]


Comparing Feature Selection Methods

Notation change: in the figures from Hastie et al., the least squares weights are written β rather than w.
rather than .
Forward Sequential Selection

An efficient method adds the most predictive feature one at a time:

featSel = empty
featUnsel = all features
For iter in 1,…,p:
    For kfeat in featUnsel:
        thisFeat = featSel + kfeat
        model = Train model on thisFeat features
        score = Evaluate model using cross-validation
    featSel = featSel + best scoring feature
    featUnsel = featUnsel - best scoring feature
Choose the model with the best cross-validation score
Backward Sequential Selection

The backwards approach starts with all features and removes them one at a time:

featSel = all features
For iter in 1,…,p:
    For kfeat in featSel:
        thisFeat = featSel - kfeat
        model = Train model on thisFeat features
        score = Evaluate model using cross-validation
    featSel = featSel - worst scoring feature (the one whose removal hurts the score least)
Choose the model with the best cross-validation score
Comparing Feature Selection Methods

Sequential selection is greedy, but often performs well…

Example: Feature selection on a synthetic model with p=30 features with pairwise correlations of 0.85. True feature weights are all zero except for 10 features, with weights drawn from N(0, 6.25).

Sequential selection with p features takes O(p²) time (model fits), compared to exponential time for best subset.

Sequential feature selection is available in Scikit-Learn under:

feature_selection.SequentialFeatureSelector
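A short sketch of forward selection with this class (the dataset and parameter values are illustrative):

from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

X, y = load_diabetes(return_X_y=True)

sfs = SequentialFeatureSelector(
    LinearRegression(),
    n_features_to_select=4,   # how many features to keep
    direction="forward",      # or "backward"
    cv=10,
)
sfs.fit(X, y)
print(sfs.get_support())      # boolean mask of the selected features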
Outline

• Linear Regression
• Least Squares Estimation
• Regularized Least Squares
• Logistic Regression
Classification as Regression
Suppose our response variables are binary y={0,1}. How can we use
linear regression ideas to solve this classification problem?

https://towardsdatascience.com/why-linear-regression-is-not-suitable-for-binary-classification-c64457be8e28
Classification as Regression

Idea: Fit a regression function to the data (red). Classify points based on whether they are above or below the midpoint (green).

• This is a discriminant function, since it discriminates between classes
• It is a linear function and so is a linear discriminant
• The green line is the decision boundary (also linear)

https://towardsdatascience.com/why-linear-regression-is-not-suitable-for-binary-classification-c64457be8e28
Multiclass Classification as Regression

Suppose we have K classes. The training output for each example is an indicator vector, with Y_k = 1 if the example is in class k, e.g. Y = (0,0,…,1,0,0).

For N training inputs, create the N x K matrix of outputs Y and solve

    W_hat = ( X^T X )^{-1} X^T Y

W collects K linear regression models, one for each class. This is an instance of multi-output linear regression.

• Compute the fitted output, a K-vector:  f_hat(x) = W_hat^T x
• Identify the largest component and classify as:  y_hat(x) = argmax_k f_hat_k(x)

[ Image: Hastie et al. (2001) ]


Linear Probability Models

Binary Classification: The linear model approximates the probability of class assignment,

    p(y = 1 | x) ≈ w^T x

Multiclass Classification: Multiple decision boundaries, each approximated by a class-specific linear model,

    f_k(x) = w_k^T x,   where w_k is the weight vector for class k

which approximates the probability of class assignment,

    p(y = k | x) ≈ w_k^T x


What's the rationale?

Recall the linear regression model, y = w^T x + ε with E[ε] = 0.

So linear regression models the expected value,

    E[ y | x ] = w^T x

For discrete values y ∈ {0,1} we have that,

    E[ y | x ] = p(y = 1 | x)

We can call this approach least squares classification. One can easily verify that the class "probabilities" sum to 1, but they are not guaranteed to be positive!


Logistic Regression

Idea: Distort the response variable in some way to map it to [0,1] so that it is actually a probability.

Uses the logistic function,

    σ(a) = 1 / (1 + e^{-a}),    p(y = 1 | x) = σ( w^T x )

• The logistic function is a type of sigmoid or squashing function, since it maps any value to the range [0,1]

• The predictor now actually maps to a valid probability mass function (PMF), with p(y=0|x) = 1 - p(y=1|x)

https://towardsdatascience.com/why-linear-regression-is-not-suitable-for-binary-classification-c64457be8e28
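A tiny sketch of squashing a linear score with the logistic function (NumPy; arbitrary weights):

import numpy as np

def sigmoid(a):
    # logistic function: maps any real value into (0, 1)
    return 1.0 / (1.0 + np.exp(-a))

w = np.array([0.5, -1.0])
x = np.array([2.0, 1.0])

p1 = sigmoid(w @ x)      # P(y = 1 | x)
print(p1, 1.0 - p1)      # the two class probabilities sum to 1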
Logistic Regression : Decision Boundary

Binary classification decisions are based on the posterior odds ratio,

    p(C=1 | x) / p(C=0 | x)

If this ratio is greater than 1.0 then classify as C=1, otherwise C=0.

In practice, we use the (natural) logarithm of the posterior odds ratio,

    log [ p(C=1 | x) / p(C=0 | x) ] = w^T x

This is a linear decision boundary, so logistic regression is a linear classifier.


Logistic vs. Logit Transformations

Logistic function: maps ℝ to [0,1].   Logit function: maps [0,1] to ℝ.

The logistic is also widely used in neural networks – for classification, the last layer is typically just a logistic regression.
Logistic vs. Logit Transformations

The logistic function maps the linear regression to the interval [0,1],

    p = σ( w^T x ) = 1 / (1 + e^{-w^T x})

The logit function is defined for probability values p in [0,1] as,

    logit(p) = log( p / (1 - p) )

Logit is the inverse of the logistic function. Logit is also the log posterior odds ratio, and thus the decision boundary for our binary classifier.
Multiclass Logistic Regression

The classification decision is based on the log-ratio compared to the final class,

    log [ p(y = k | x) / p(y = K | x) ] = w_k^T x,   k = 1, …, K-1

The K-1 log-odds (or logit) transformations ensure the probabilities sum to 1.

The choice of denominator class is arbitrary, but we use K by convention.


Least Squares vs. Logistic Regression

[Figure: decision boundaries from least squares vs. logistic regression]

• Both models learn a linear decision boundary
• Least squares can be solved in closed form (convex objective)
• Least squares is sensitive to outliers (need to do regularization)

[Source: Bishop "PRML"]
Least Squares vs. Logistic Regression

Similar results in 1-dimension

https://towardsdatascience.com/why-linear-regression-is-not-suitable-for-binary-classification-c64457be8e28
Least Squares vs. Logistic Regression

[Figure: least squares (left) vs. logistic regression (right)]

[Source: Bishop "PRML"]


Fitting Logistic Regression

Fit by maximum likelihood—start with the binary case. The posterior probability of class assignment is Bernoulli,

    p(y | x, w) = σ(w^T x)^y ( 1 - σ(w^T x) )^{1-y}

Given N iid training data pairs (x_i, y_i), the log-likelihood function is,

    ℓ(w) = sum_i [ y_i log σ(w^T x_i) + (1 - y_i) log( 1 - σ(w^T x_i) ) ]


Fitting Logistic Regression

Computing the derivatives with respect to each element w_d,

    ∂ℓ/∂w_d = sum_i x_{id} ( y_i - σ(w^T x_i) )

• For D features this gives us D equations and D unknowns
• But the equations are nonlinear and can't be solved in closed form
• Need to use gradient-based optimization (e.g. Newton's method)
• Beyond the scope of this class; but know that it is an iterative process
Iteratively Reweighted Least Squares

• Given some estimate w_old of the weights, update by solving,

      w_new = ( X^T W X )^{-1} X^T W z

  with design matrix X (N x D) and N x N diagonal weight matrix W = diag( p_i (1 - p_i) ),
  where z = X w_old + W^{-1} ( y - p ) is the working response (a step in the gradient direction)
  and p_i = P(y = 1 | x_i) for each training point.

• Essentially solving a reweighted version of least squares; each iteration changes W and p, so we need to re-solve.
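A compact sketch of the IRLS update in NumPy (a didactic version, not a production solver; a small eps term is added for numerical stability):

import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def irls_logistic(X, y, n_iter=20, eps=1e-6):
    N, D = X.shape
    w = np.zeros(D)
    for _ in range(n_iter):
        p = sigmoid(X @ w)                         # P(y=1|x) for each training point
        W = np.diag(p * (1 - p))                   # N x N diagonal weight matrix
        z = X @ w + (y - p) / (p * (1 - p) + eps)  # working response z = Xw + W^{-1}(y - p)
        # reweighted least squares update: w = (X^T W X)^{-1} X^T W z
        w = np.linalg.solve(X.T @ W @ X + eps * np.eye(D), X.T @ W @ z)
    return w

# toy usage on synthetic binary data
rng = np.random.default_rng(4)
X = np.hstack([np.ones((200, 1)), rng.normal(size=(200, 2))])
y = (rng.uniform(size=200) < sigmoid(X @ np.array([-0.5, 2.0, -1.0]))).astype(float)
print(irls_logistic(X, y))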
Choice of Optimizer

Since logistic regression requires an optimizer, there are more parameters to consider.

The choice of optimizer and its parameters can affect the time to fit the model (especially if there are many features).

https://www.datasciencecentral.com/profiles/blogs/an-overview-of-gradient-descent-optimization-algorithms
Scikit-Learn Logistic Regression

The function predict_proba(X) returns the predicted class assignment probabilities (effectively a single probability per point in the binary case).

https://towardsdatascience.com/why-linear-regression-is-not-suitable-for-binary-classification-c64457be8e28
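A minimal sketch (synthetic binary data; default solver settings):

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
X = rng.normal(size=(100, 2))
y = (X[:, 0] - 0.5 * X[:, 1] > 0).astype(int)

clf = LogisticRegression().fit(X, y)
probs = clf.predict_proba(X[:5])   # shape (5, 2): columns are P(y=0|x) and P(y=1|x)
labels = clf.predict(X[:5])        # thresholded class labels
print(probs, labels)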
Using Logistic Regression

The role of Logistic Regression differs in ML and Data Science,


• In Machine Learning we use Logistic Regression for building predictive classification models
• In Data Science we use it for understanding how features relate to data classes / categories

Example: South African Heart Disease (Hastie et al. 2001)

The data come from the Coronary Risk-Factor Study in 3 rural areas of South Africa. Data are from white men aged 15-64 years, and the response is presence/absence of myocardial infarction (MI). How predictive is each of the features?
Looking at Data

Each scatterplot shows a pair of risk factors, with cases with MI (red) and without (cyan).

Features:
• Systolic blood pressure
• Tobacco use
• Low density lipoprotein (ldl)
• Family history (discrete)
• Obesity
• Alcohol use
• Age

[Source: Hastie et al. (2001)]


Example: African Heart Disease

Fit logistic regression to the data using the MLE, estimated via iteratively reweighted least squares.

The standard error is the estimated standard deviation of the learned coefficients.

Recall, the Z-score of a weight (coefficient / standard error) is approximately standard Normal under the null hypothesis,

so anything with |Z-score| > 2 is significant at the 5% level.


Example: African Heart Disease

Finding: Systolic blood pressure (sbp) is not a significant predictor.

Obesity is not significant, and is negatively correlated with heart disease in the model.

Remember: All correlations / significance of features are conditional on the presence of the other features. We must always consider that features may be strongly correlated.
Example: African Heart Disease

Doing some feature selection, we find a model with 4 features: tobacco, ldl, family history, and age.

How do we interpret coefficients? (e.g. tobacco → 0.081)

• Tobacco is measured in total lifetime usage (in kg)
• Thus, an increase of 1 kg of lifetime tobacco multiplies the odds by exp(0.081) ≈ 1.084, or an 8.4% increase in the odds of coronary heart disease
• The 95% CI is 3% to 14%, since exp(0.081 ± 2 × SE) ≈ (1.03, 1.14)
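A quick check of this odds-ratio arithmetic (the standard error here is a placeholder chosen to reproduce the quoted interval, not a value taken from the slide):

import numpy as np

beta = 0.081          # fitted tobacco coefficient (log-odds per kg)
se = 0.026            # placeholder standard error, roughly consistent with the quoted CI

odds_ratio = np.exp(beta)
ci = np.exp([beta - 2 * se, beta + 2 * se])
print(odds_ratio)     # ~1.084 -> ~8.4% increase in odds per kg
print(ci)             # ~[1.03, 1.14] -> 3% to 14%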
