0% found this document useful (0 votes)
46 views33 pages

Introduction To ML Linear Regression

Machine learning is a process that enables computers to learn from data without being explicitly programmed. It involves using algorithms to search for patterns in large amounts of data and then using the patterns to predict or classify new data. Machine learning is useful when algorithms cannot be written to identify patterns or when there is too much data or complexity. It is used in areas like fraud detection, recommendation systems, and image recognition. For machine learning to work, rich data representing the real world as well as skills in programming, statistics, and domain knowledge are required. Machine learning algorithms treat data as mathematical points in a feature space defined by attribute dimensions. They generate models, such as planes or hyperplanes, that classify or predict new data based on its position in

Uploaded by

Srinidhi Adya
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
46 views33 pages

Introduction To ML Linear Regression

Machine learning is a process that enables computers to learn from data without being explicitly programmed. It involves using algorithms to search for patterns in large amounts of data and then using the patterns to predict or classify new data. Machine learning is useful when algorithms cannot be written to identify patterns or when there is too much data or complexity. It is used in areas like fraud detection, recommendation systems, and image recognition. For machine learning to work, rich data representing the real world as well as skills in programming, statistics, and domain knowledge are required. Machine learning algorithms treat data as mathematical points in a feature space defined by attribute dimensions. They generate models, such as planes or hyperplanes, that classify or predict new data based on its position in

Uploaded by

Srinidhi Adya
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 33

Introduction to machine learning

Machine Learning

1
Introduction to machine learning

“[Machine Learning is the] field of study that gives computers the ability to
learn without being explicitly programmed.” Arthur Samuel 1959:

2
Introduction to machine learning

What is machine learning?

1. Is a process of enabling a computer based system to learn to do tasks based on


well defined statistical and mathematical methods

2. The ability to do the tasks come from the underlying model which is the result of
the learning process. Sometimes the ability comes from an mathematical
algorithm

3. The model generated represents behaviour of the processes that were earlier
performed before machine learning

4. The model is generated from huge volume of data, huge both in breadth and
depth reflecting the real world in which the processes are performed

5. The more representative data is of the real world, the better the model would be.
The challenge is how to make it a true representative

3
Introduction to machine learning

What do machine learning algorithms do?

1. Search through data to look for patterns

2. Patterns in form of trends, cycles, associations, classes etc.

3. Express these patterns as mathematical structures such as probability equations


or polynomial equations

4
Introduction to machine learning
When is machine learning useful ?

1. Cannot express our knowledge about patterns as a program. For e.g. Character
recognition or natural language processing

2. Do not have an algorithm to identify a pattern of interest. For e.g. In spam mail detection

3. Too complex and dynamic. For e.g. Weather forecasting

4. Too many permutations and combinations possible. For e.g. Genetic code mapping

5. No prior experience or knowledge. For e.g. Mars rover

6. Patterns hidden in humongous data. For e.g. Recommendation system

5
Introduction to machine learning

Where are machine learning based systems used (examples only)

1. Fraud detection

2. Sentiment analysis

3. Credit risk management

4. Prediction of equipment failures

5. New pricing models / strategies

6. Network intrusion detection

7. Pattern and image recognition

8. Email spam filtering

6
Introduction to machine learning
Machine Learning & Data Science

1. Machine learning is part of a larger discipline called Data Science

2. Data science is the process of applying science and domain expertise to data to
extract useful information from data.

3. It includes application of all the statistical and mathematical tools and techniques to
glean out the useful information from data using machine learning

7
Introduction to machine learning
Machine Learning Pre-requisites

1. Rich set of data representing the real world

2. Knowledge and skills in


a. Maths and statistics
b. Programming (Python, R, Java, Go)
c. Tools / frameworks such as Keras / TensorFlow
d. Domain knowledge

8
Introduction to machine learning

Real World as Mathematical Space

9
Introduction to machine learning
Machine learning happens in mathematical space / feature space:

1. A data set representing the real world, is a collection attributes that define an
entity

2. Each entity is represented as one record / line in the data set

Attributes / Dimensions

10
Introduction to machine learning
Machine learning happens in mathematical space / feature space:

1. Each attribute becomes a


dimension

2. Each record becomes a


point in the space

Sugar

BP level
Heart healthy
Potential heart ailments

11
Introduction to machine learning
Machine learning happens in mathematical space / feature space:

1. Position of a point in
space is defined with
respect to the origin

2. The position is decided by


the values of the attributes
for a point

Sugar

BP level

Heart healthy
Potential heart ailments

12
Introduction to machine learning
Machine learning happens in mathematical space / feature space:

3. A model represents the real


world process that
generated the different set
of data points

4. The model could be a simple


plane, complex plane, hyper
plane

Sugar
5. But multiple planes can do
the job. Each representing
an alternate hypothesis

6. The learning algorithm


selects that hypothesis
which minimizes errors in
the test data BP level
Heart healthy
Erroneous classification Potential heart ailments

13
Introduction to machine learning
Machine learning happens in mathematical space / feature space:

7. In the figure, since the


separator is a plane, the
model will be the equation
representing the plane

ax + by + cz = d

8. x , y, z represent the three


dimensions i.e. BP, Age,
Sugar while d represents

Sugar
the color i.e. healthy or
ailing heart

BP level
Heart healthy
Potential heart ailments

14
Introduction to machine learning
Machine learning happens in mathematical space / feature space:

9. A new data point enters


the system

10. It’s x,y and z values will be


fed into the model to get
value of d (healthy or
ailing)

Sugar
11. The data point will be
placed above or below the
plane based on d

ax + by + cz = d, BP level

Heart healthy
Potential heart ailments

15
Introduction to machine learning
Machine learning happens in mathematical space / feature space:

12. Whether the new data point


is correctly placed (above
or below the plane) i.e.
correctly classified as ailing
or healthy hear will be
known only after direct
observation

Sugar

ax + by + cz = d, BP level

Heart healthy
Potential heart ailments

16
Introduction to machine learning
Machine learning happens in mathematical space / feature space:

13. Only direct test on the


object of interest will tell
whether the classification is
correct or not

Sugar
ax + by + cz = d,

14. If majority of new data


points are correctly
classified, the model is
good else not
BP level
Heart healthy
Potential heart ailments

17
Introduction to machine learning

Introduction to Supervised
Machine Learning

18
Introduction to machine learning
Characteristics of Supervised Machine Learning -

a. Class of machine learning algorithms that work on externally supplied instances


(data) in form of predictor attributes and associated target values

b. They produce a model representing alternate hypothesis i.e. distribution of


class labels in terms of predictor variables in the feature space

c. The model thus generated is used to make predictions about future instances
where the predictor feature values are known but the target / class value is
unknown
a. E.g.-1 building model to predict the re-sale value of a car based on its current mileage,
age, color etc.
b. E.g.-2 Predicting the final year scores based on student performance in previous
years.

19
Introduction to machine learning
Data Science Machine Learning Steps -
Identify Data Identify what type of data, source of data and how to ingest data into
your system. Need domain expertise and lateral thinking
Required

Pre-process Address data quality issues such as missing values, outliers, data
Data pollution etc. Establish veracity of the data. Select attributes for model,
Need domain expertise
Create
Split the data into training set and test set. Generally
training & 70:30 ratio is used
test set
Select
Select appropriate algorithm/s to model. For e.g. Random
appropriate Forest, K Nearest Neighbors etc. Depends on data
algorithm/s

Train & build Build the model in Python or Spark or R


the model
Evaluate the model on test data
Evaluate ensure it is not overfit or underfit
with test data and likely to generalize well

Deploy at scale
OK?
No Yes

Productionize
& calibrate
20
Introduction to machine learning

Linear Regression

21
Introduction to machine learning
Linear Regression Models -

a. The term "regression" generally refers to predicting a real number. However, it


can also be used for classification (predicting a category or class.)

b. The term "linear" in the name “linear regression” refers to the fact that the
method models data with linear combination of the explanatory variables.

c. A linear combination is an expression where one or more variables are scaled


by a constant factor and added together.

d. In the case of linear regression with a single explanatory variable, the linear
combination used in linear regression can be expressed as:

response = intercept + constant ∗ explanatory

e. In its most basic form fits a straight line to the response variable. The model is
designed to fit a line that minimizes the squared differences (also called errors
or residuals.).

22
Introduction to machine learning
Linear Regression Models -

a. Before we generate a model, we need to understand the degree of relationship


between the attributes Y and X

b. Mathematically correlation between two variables indicates how closely their


relationship follows a straight line. By default we use Pearson’s correlation which
ranges between -1 and +1.

c. Correlation of extreme possible values of -1 and +1 indicate a perfectly linear


relationship between X and Y whereas a correlation of 0 indicates absence of linear
relationship
I. When r value is small, one needs to test whether it is statistically significant or not to
believe that there is correlation or not

23
Introduction to machine learning
Linear Regression Models -

d. Coefficient of relation - Pearson’s coefficient p(x,y) = Cov(x,y) / ( stnd Dev (x) X stnd
Dev (y) )

r is near 0 r is near -1 r is near +1

e. Generating linear model for cases where r is near 0, makes no sense. The model will
not be reliable. For a given value of X, there can be many values of Y! Nonlinear
models may be better in such cases

24
Introduction to machine learning
Linear Regression Models (Recap) -

f. Coefficient of relation - Pearson’s coefficient p(x,y) = Cov(x,y) / ( stnd Dev (x) X stnd
Dev (y) )

- ve +ve
quad quad

+ve - ve
quad quad

=0
>0

http://www.socscistatistics.com/tests/pearson/Default2.aspx

25
Introduction to machine learning
Linear Regression Models -
g. Given Y = f(x) and the scatter plot shows apparent correlation between X and Y
Let’s fit a line into the scatter which shall be our model

h. But there are infinite number of lines that can be fit in the scatter. Which one
should we consider as the model?

i. This and many other


algorithms use gradient
descent or variants of
gradient descent method
for finding the best
model

j. Gradient descent
methods use partial
derivatives on the
parameters (slope and
intercept) to minimize
sum of squared errors

26
Introduction to machine learning
Linear Regression Models (Recap) -
k. Whichever line we consider as the model, it will not pass through all the points.
l. The distance between a point and the line (drop a line vertically (shown in
yellow)) is the error in prediction
m. That line which gives least sum of squared errors is considered as the best line

Error = (T – (mx + C)
Sum of all errors can cancel
out and give 0

We square all the errors and


sum it up. That line which
gives us least sum of squared
errors is the best fit

27
Introduction to machine learning
Linear Regression Models -
n. Coefficient of determinant – determines the fitness of a linear model. The closer the
points get to the line, the R^2 (coeff of determinant) tends to 1, the better the model is

Model line always passes


through Xbar and Ybar

Ybar

Xbar

28
Introduction to machine learning
Linear Regression Models -
o. Coefficient of determinant (Contd…)
I. There are a variety of errors for all those points that don’t fall exactly on the line.
II. It is important to understand these errors to judge the goodness of fit of the model i.e. How
representative the model is likely to be in general
III. Let us look at point P1 which is one of the given data points and associated errors due to
the model
1. P1 – Original y data point for given x

2. P2 - Estimated y value for given x

y P1 3. Ybar – Average of all Y values in data set


SSE
SST 4. SST – Sum of Square error Total (SST)
P2 Variance of P1 from Ybar (Y – Ybar)^2
SSR
Ybar 5. SSR - Regression error (p2 – ybar)^2 (portion
SST captured by regression model)

6. SSE - Residual error (p1 – p2)^2

Xbar x
29
Introduction to machine learning
Linear Regression Models -

p. Coefficient of determinant (Contd…)


1. That model is the most fit where every data
point lies on the line. i.e. SSE = 0 for all
y P1 data points

SSE
2. Hence SSR should be equal to SST i.e.
SST SSR/SST should be 1.
P2
SSR
Ybar 3. Poor fit will mean large SSE. SSR/SST will
be close to 0

4. SSR / SST is called as r^2 (r square) or


coefficient of determination

Xbar 5. r^2 is always between 0 and 1 and is a


x
measure of utility of the regression model

30
Introduction to machine learning
Linear Regression Models -

q. Coefficient of determinant (Contd…) -

Point B
Point B

Point A Point A

In case of point “A”, the line explains the variance of the point

Whereas point “B” the is a small area (light grey) which the line does not represent.

%age of total variance that is represented by the line is coeff of determinant

31
Introduction to machine learning
Linear Regression Model -

Advantages –
1. Simple to implement and easier to interpret the outputs coefficients

Disadvantages -
1. Assumes a linear relationships between dependent and independent variables. That
is, it assumes there is a straight-line relationship between them
2. Outliers can have huge effects on the regression
3. Linear regression assume independence between attributes
4. Linear regression looks at a relationship between the mean of the dependent variable
and the independent variables.
5. Just as the mean is not a complete description of a single variable, linear regression
is not a complete description of relationships among variables
6. Boundaries are linear

32
Introduction to machine learning
Linear Regression Model -

Lab- 1- Estimating mileage based on features of a second hand car

Description – Sample data is available at


https://archive.ics.uci.edu/ml/datasets/Auto+MPG

The dataset has 9 attributes listed below that define the quality
1. mpg: continuous
2. cylinders: multi-valued discrete
3. displacement: continuous
4. horsepower: continuous
5. weight: continuous
6. acceleration: continuous
7. model year: multi-valued discrete
8. origin: multi-valued discrete
9. car name: string (unique for each instance)

Sol : mpg-linear regression.ipynb

33

You might also like