
Lecture 3

This document discusses parametric and non-parametric machine learning methods. Parametric methods make assumptions about the functional form of the model and estimate parameters. Non-parametric methods make no assumptions about the functional form. Parametric methods are simpler but may not match the true model, while non-parametric methods can fit a wider range of functions but require more data. There is a tradeoff between model flexibility/accuracy and interpretability.

Uploaded by

ABHILASH MS

AIL 7310: MACHINE LEARNING FOR ECONOMICS

Lecture 3

4th August, 2023

AIL 7310: ML for Econ Lecture 3 1 / 11


Parametric Methods

Parametric methods involve a two-step, model-based approach.

First, we make an assumption about the functional form, or shape, of f. For example, one very simple assumption is that f is linear in X:

$$f(X) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p \qquad (1)$$

This is a linear model.

Once we have assumed that f is linear, the problem of estimating f is greatly simplified. Instead of having to estimate an entirely arbitrary p-dimensional function f(X), one only needs to estimate the p + 1 coefficients $\beta_0, \beta_1, \dots, \beta_p$.


Second, after a model has been selected, we need a procedure that uses the training data to fit, or train, the model. In this case, we need to estimate $\beta_0, \beta_1, \dots, \beta_p$.

The most common approach to fitting this model is referred to as (ordinary) least squares.

Assuming a parametric form for f simplifies the problem of estimating f because it is generally much easier to estimate a set of parameters, such as $\beta_0, \beta_1, \dots, \beta_p$ in the linear model, than it is to fit an entirely arbitrary function f.
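The two steps (assume a linear form, then estimate its coefficients by least squares) can be sketched in a few lines. This is a minimal illustration, not part of the lecture: the synthetic data, the seed, and the names `X_design`, `true_beta`, and `beta_hat` are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: n observations on p predictors.
n, p = 200, 3
X = rng.normal(size=(n, p))
true_beta = np.array([1.0, 2.0, -0.5, 0.3])  # [beta_0, beta_1, ..., beta_p]
y = true_beta[0] + X @ true_beta[1:] + rng.normal(scale=0.1, size=n)

# Step 1: assume f is linear in X. A column of ones carries the intercept beta_0.
X_design = np.column_stack([np.ones(n), X])

# Step 2: ordinary least squares, i.e. choose beta to minimise
# the residual sum of squares ||y - X_design @ beta||^2.
beta_hat, *_ = np.linalg.lstsq(X_design, y, rcond=None)

print(beta_hat.shape)          # p + 1 estimated coefficients
print(np.round(beta_hat, 2))   # close to true_beta when the linear form is correct
```

The whole estimation problem has collapsed to p + 1 numbers, which is why the estimates settle near the assumed `true_beta` with only a few hundred observations.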


Parametric Estimation

The potential disadvantage of a parametric approach is that the model we choose will usually not match the true unknown form of f. If the chosen model is too far from the true f, then our estimate will be poor.

We can try to address this problem by choosing more flexible models, e.g. by incorporating interaction terms. But in general, fitting a more flexible model requires estimating a greater number of parameters.

These more complex models can lead to a phenomenon known as overfitting the data, which essentially means that they follow the errors, or noise, too closely.
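A rough way to see overfitting is to fit increasingly flexible parametric models to the same small sample and compare training error with error on fresh data. The sketch below uses polynomial degree as the flexibility knob rather than interaction terms; the "true" function, noise level, sample sizes, and degrees are all assumptions chosen for illustration.

```python
import numpy as np
from numpy.polynomial import polynomial as P

rng = np.random.default_rng(1)

def true_f(x):
    # Hypothetical true regression function; unknown in practice.
    return np.sin(2 * x)

# A small noisy training sample and a larger test sample from the same process.
x_train = rng.uniform(-1, 1, size=20)
y_train = true_f(x_train) + rng.normal(scale=0.3, size=20)
x_test = rng.uniform(-1, 1, size=500)
y_test = true_f(x_test) + rng.normal(scale=0.3, size=500)

def mse(y, y_hat):
    return float(np.mean((y - y_hat) ** 2))

results = {}
for degree in (1, 3, 10):
    # A degree-d polynomial has d + 1 coefficients, fitted here by least squares.
    coefs = P.polyfit(x_train, y_train, degree)
    train_mse = mse(y_train, P.polyval(x_train, coefs))
    test_mse = mse(y_test, P.polyval(x_test, coefs))
    results[degree] = (train_mse, test_mse)
    print(f"degree={degree:2d}  parameters={degree + 1:2d}  "
          f"train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```

Because the models are nested, training error can only fall as the degree grows. Test error typically stops improving and often rises once the extra parameters start tracking the noise in the 20 training points.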


Non-Parametric Methods

Non-parametric methods do not make explicit assumptions about the functional form of f.

Such approaches can have a major advantage over parametric approaches: by avoiding the assumption of a particular functional form for f, they have the potential to accurately fit a wider range of possible shapes for f. An example is a thin-plate spline.

In many non-parametric methods, we compute the fit over small ranges of the data and then combine these local fits using some smoothing technique.
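The idea of fitting locally and then combining can be illustrated with k-nearest-neighbour averaging. Note that this is a simpler stand-in for the thin-plate spline mentioned above, not an implementation of it; the function name `knn_regress`, the choice k = 15, and the synthetic data are assumptions made for the example.

```python
import numpy as np

def knn_regress(x_train, y_train, x_query, k=5):
    """Predict at each query point by averaging the responses
    of the k nearest training points (no functional form assumed)."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_query = np.atleast_1d(np.asarray(x_query, dtype=float))
    # Distances between query points (rows) and training points (columns).
    dist = np.abs(x_query[:, None] - x_train[None, :])
    # Indices of the k closest training points for every query point.
    nearest = np.argsort(dist, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)

rng = np.random.default_rng(2)
x = rng.uniform(0, 2 * np.pi, size=300)
y = np.sin(x) + rng.normal(scale=0.2, size=300)

# Evaluate the local-average fit on a grid away from the boundaries.
grid = np.linspace(0.5, 2 * np.pi - 0.5, 50)
f_hat = knn_regress(x, y, grid, k=15)
max_error = float(np.max(np.abs(f_hat - np.sin(grid))))
print(f"largest gap between fit and true curve on the grid: {max_error:.3f}")
```

Nothing in `knn_regress` knows that the underlying curve is a sine; the shape emerges from the data alone. The price is that every prediction depends on having enough observations nearby.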


Non-Parametric Estimation

Non-parametric approaches, however, suffer from a major disadvantage: since they do not reduce the problem of estimating f to a small number of parameters, a very large number of observations (far more than is typically needed for a parametric approach) is required in order to obtain an accurate estimate for f.

There are advantages and disadvantages to both parametric and non-parametric methods for statistical learning.


Trade-off between Prediction Accuracy and Interpretability

In general, there is a negative relationship between model flexibility and interpretability.


When do we use a more restrictive model over a flexible one? When inference is our goal.

Least squares linear regression is relatively inflexible but quite interpretable.

Fully non-linear methods such as bagging, boosting, and support vector machines are highly flexible approaches that are harder to interpret.

When inference is the goal, there are clear advantages to using simple and relatively inflexible statistical learning methods. When prediction is the only goal, the interpretability of the predictive model is of little interest.


Regression vs Classification

Variables can be either quantitative or qualitative.

Quantitative variables take on numerical values, e.g. a person's age, height, or income, the value of a house, and the price of a stock. Problems where the response variable is quantitative are called regression problems.

In contrast, qualitative variables take on values in one of K different classes, or categories. Examples of qualitative variables include a person's gender (male or female), the brand of product purchased (brand A, B, or C), and whether a person defaults on a debt (yes or no). Problems where the response variable is qualitative are called classification problems.

Note: "classification" is largely a term from the ML space. Econometrics has many different models for handling qualitative response variables, but they are conventionally called regressions (logit, probit, etc.), not classification.
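The naming point can be made concrete: a logit "regression" estimates the probability of a class, and thresholding that probability turns it into a classifier. The sketch below fits a logit by plain gradient ascent on synthetic default data. The predictor `income`, the coefficients in `true_beta`, the learning rate, and the 0.5 cut-off are all hypothetical choices for illustration; in practice one would normally use an established library routine.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logit(X, y, lr=0.1, n_iter=5000):
    """Estimate logit coefficients by gradient ascent on the mean log-likelihood."""
    X_design = np.column_stack([np.ones(len(X)), X])  # intercept column
    beta = np.zeros(X_design.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X_design @ beta)
        # Gradient of the mean log-likelihood with respect to beta.
        beta += lr * X_design.T @ (y - p) / len(y)
    return beta

rng = np.random.default_rng(3)
n = 500
income = rng.normal(size=n)            # hypothetical standardised predictor
true_beta = np.array([-0.5, -2.0])     # assumed: higher income lowers default risk
p_default = sigmoid(true_beta[0] + true_beta[1] * income)
default = (rng.uniform(size=n) < p_default).astype(float)  # qualitative response: 1 = yes, 0 = no

# "Regression" view: estimate coefficients and fitted probabilities.
beta_hat = fit_logit(income.reshape(-1, 1), default)
prob_hat = sigmoid(beta_hat[0] + beta_hat[1] * income)

# "Classification" view: assign each person to the more likely class.
predicted_class = (prob_hat > 0.5).astype(float)
accuracy = float(np.mean(predicted_class == default))
print(np.round(beta_hat, 2), round(accuracy, 3))
```

The same fitted object serves both purposes: `beta_hat` is what an econometrician would interpret, while `predicted_class` is what an ML practitioner would evaluate.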
