1 - Intro To Machine Learning
Machine Learning
Machine Learning Concepts
Lecture Overview
[Figure panels: "Low Prediction Variance, High Error" and "High Prediction Variance, Low Error"]
Underfitting and Overfitting
Underfitting
• Model is too simple to fit the data well (it fails to capture the underlying patterns).
• Underfit models do not sufficiently explain changes in the output feature.
• Underfit models have high bias.
Overfitting
• Model is too complicated and fits the training data too closely.
• Overfit models do not generalize well to new data.
• Can be avoided by using cross-validation (e.g. k-fold) or other model-specific methods.
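The contrast between underfitting and overfitting can be sketched with polynomial fits of different degrees. This is a minimal illustration, not part of the original slides: the synthetic data (a noisy quadratic), the degrees 1, 2, and 9, and the helper names `mse` and `kfold_mse` are all assumptions chosen for the example. It compares the error on the training data with the average held-out error from a simple k-fold cross-validation.

```python
import numpy as np
from numpy.polynomial import Polynomial


def mse(y_true, y_pred):
    """Mean squared error between two arrays."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))


def kfold_mse(x, y, degree, k=5):
    """Average held-out MSE of a polynomial of the given degree over k folds."""
    indices = np.arange(len(x))
    scores = []
    for held_out in np.array_split(indices, k):
        train = np.setdiff1d(indices, held_out)
        model = Polynomial.fit(x[train], y[train], degree)
        scores.append(mse(y[held_out], model(x[held_out])))
    return float(np.mean(scores))


# Hypothetical data: a quadratic relationship plus random noise.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=40)
y = x**2 + rng.normal(0, 0.5, size=40)

train_mse = {}
cv_mse = {}
for degree in (1, 2, 9):
    train_mse[degree] = mse(y, Polynomial.fit(x, y, degree)(x))
    cv_mse[degree] = kfold_mse(x, y, degree)
    print(f"degree {degree}: train MSE {train_mse[degree]:.3f}, "
          f"cross-validated MSE {cv_mse[degree]:.3f}")
```

The degree-1 line is too simple for curved data, so its error is high on both the training data and the held-out folds (underfitting). Training error can only go down as the degree grows, so a low training error alone does not show that a model generalizes; the held-out error is the number to compare when choosing between models.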
Overfitting and Underfitting Comparison
https://www.geeksforgeeks.org/underfitting-and-overfitting-in-machine-learning/
Bias-Variance Tradeoff
• Mean Squared Error (MSE): the average of the squared prediction errors.
• Composed of squared bias and prediction variance (plus any irreducible noise in the data).
• Models with accurate predictions will have low MSE.
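The way MSE splits into bias and variance can be checked numerically. The sketch below is an illustration under stated assumptions, not material from the slides: it imagines many models, each trained on a different sample, all predicting the same fixed, noise-free target. The target value and the simulated predictions are made up.

```python
import numpy as np

# Hypothetical setup: 1000 models each predict the same true value.
true_value = 10.0
rng = np.random.default_rng(1)
predictions = rng.normal(loc=10.8, scale=1.5, size=1000)

# MSE: average of the squared prediction errors.
mse = np.mean((predictions - true_value) ** 2)

# Bias: how far the average prediction is from the truth.
bias = np.mean(predictions) - true_value

# Variance: how much the predictions scatter around their own average.
variance = np.var(predictions)

print(f"MSE             = {mse:.4f}")
print(f"bias^2 + var    = {bias**2 + variance:.4f}")
```

For a fixed target, the two printed numbers agree: the MSE equals the squared bias plus the prediction variance. Lowering one term often raises the other, which is the tradeoff named in the slide title.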
Linear Regression
• Before we end today, we will learn about our first machine learning model.
• Linear Regression uses numeric data to show a relationship between an independent variable (x) and a dependent variable (y).
• The independent variable is the variable that we change; we then see how the dependent variable is affected.
• Goal: to predict future values of the dependent variable (one of the attributes, or features) from the independent variable.
Linear Regression Chart
y = mx + b
m is the slope of the line
b is the y-intercept
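As a minimal sketch of how the slope and y-intercept can be found from data, the example below uses the ordinary least-squares formulas and cross-checks them against NumPy's `np.polyfit`. The five data points are made up for illustration, and the helper name `predict` is hypothetical.

```python
import numpy as np

# Hypothetical data: x is the independent variable, y the dependent variable.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Least-squares slope and intercept for the line y = m*x + b.
m = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b = y.mean() - m * x.mean()


def predict(x_new):
    """Predict y for a new x using the fitted line."""
    return m * x_new + b


# Cross-check: np.polyfit with degree 1 returns [slope, intercept].
m_np, b_np = np.polyfit(x, y, 1)

print(f"slope m = {m:.2f}, intercept b = {b:.2f}")
print(f"predicted y at x = 6: {predict(6.0):.2f}")
```

Once m and b are known, predicting a future value is just a matter of plugging a new x into the line.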
Correlation
• Evaluation Metric:
• Linear Regression's metric needs to tell us the strength of
the correlation between the two variables.
• There is a measure for this called the Correlation Coefficient, R, but we often use a more useful metric instead: the Coefficient of Determination, or R².
• So just what is the Coefficient of Determination?
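As a rough preview, here is one common way to compute both metrics. This is a sketch under assumptions: the data points are made up, and the formula used for R² (one minus the ratio of the residual sum of squares to the total sum of squares) is the standard textbook definition rather than something stated on these slides.

```python
import numpy as np

# Hypothetical data for a simple linear regression.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Correlation coefficient R between x and y.
r = np.corrcoef(x, y)[0, 1]

# Fit the line y = m*x + b, then compare its errors to a "predict the mean" baseline.
m, b = np.polyfit(x, y, 1)
y_hat = m * x + b
ss_res = np.sum((y - y_hat) ** 2)      # error left over after fitting the line
ss_tot = np.sum((y - y.mean()) ** 2)   # total variation in y
r_squared = 1 - ss_res / ss_tot

print(f"R   = {r:.4f}")
print(f"R^2 = {r_squared:.4f}")
```

For a straight-line fit with one independent variable, R² is simply the square of R. It can be read as the fraction of the variation in y that the line explains.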
Evaluating the Regression:
Coefficient of Determination 2