ML Intro
In this module we’ll discuss ideas. No code, no Python [only for a while though ;)]. We’ll not get too technical or mathematical; we’ll keep our discussion focused on concepts. Let’s begin.
At the end of the day, whatever we are doing here is about money. To put it more appropriately, we are developing these techniques to solve business problems. Real business problems generally do not come conveniently dressed up as data problems. For example, consider this:
• A bank is making too many losses because of defaulters on retail loans
It’s a genuine business problem, but on the face of it, it isn’t a data problem yet. So what is a data problem then? A data problem is a business problem expressed in terms of the data [potentially] available in the business process pipeline.
It has two main components:
• Response/Goal/Outcome
• A set of factors/features which affect our goal/response/outcome
Let’s look at the loan default problem and find these components. The outcome is loan default, which we would like to predict when considering giving a loan to a prospect. What factors could help us in doing that? Banks collect a lot of information on a loan application, such as financial data and personal information. In addition, they also make queries regarding credit history to various agencies. We could use all these features to predict whether a customer is going to default on their loan or not, and then reconsider our decision depending on the result.
Here are a few more business problems which you can try converting into data problems:
* A marketing campaign is causing spam complaints from existing customers
* Hospitals cannot afford to have additional staff all year round, who are needed only when patient intake rises above a certain level
* An ecommerce company wants to know how it should plan its budget for cloud servers
* How many different election campaign strategies should a party opt for?
If you went through the business problems mentioned above, you’d have realized there are mainly three kinds of problems, which can then be clubbed into two categories:
1. Supervised
2. Unsupervised
Supervised problems are those which have an explicit outcome, such as default on loan, required number of staff, server load, etc. Within these, you can see two separate kinds:
1. Regression
2. Classification
Regression problems are those where the outcome is a continuous numeric value. Classification problems, on the other hand, have their outcome as categories [e.g. good/bad/worse; yes/no; 1/0].
Unsupervised problems are those where there is no explicit, measurable outcome associated with the problem. You just need to find general patterns hidden in the measured factors. Finding different customer segments or electoral segments, or finding latent factors in the data, comes under such problems. Now the categories of problems can be formally grouped like this:
1. Supervised
   1. Regression
   2. Classification
2. Unsupervised
In solving a business problem, our primary goal remains to solve it as well as possible. When it comes to predicting an outcome or achieving a goal, we need to be as accurate as possible with the given information/measured features.
To measure accuracy, we must first define error. To take a simple example, let’s consider a scenario where we have information on today’s temperature and humidity and we are trying to predict how much it will rain tomorrow.
Say we come up with a linear equation for prediction which looks like this:

$$\text{Rain}_{\text{predicted}} = \beta_0 + \beta_1 T + \beta_2 H$$

where $T$ and $H$ are temperature and humidity respectively, and the $\beta$s are constants estimated from the data.
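To make this concrete, here is a minimal Python sketch of the prediction equation; the $\beta$ values and the input readings below are made up purely for illustration, not estimated from any real data.

```python
# A minimal sketch of the linear prediction equation above.
# The beta values are hypothetical; in practice they would be
# estimated from historical data.
b0, b1, b2 = 4.0, 0.3, 0.1

def predict_rain(temperature, humidity):
    """Predicted rainfall from today's temperature and humidity."""
    return b0 + b1 * temperature + b2 * humidity

print(predict_rain(temperature=30.0, humidity=70.0))  # 20.0
```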
Now, there can be many other factors, as well as general randomness, affecting rain. Our predictions with the equation given above will never be spot on; there will always be some error associated with them. To calculate this error, we first write the prediction for day $i$:
$$\widehat{\text{Rain}}_i = \beta_0 + \beta_1 T_i + \beta_2 H_i$$
where $i$ can be taken as the day number and the hat symbol on top of $\text{Rain}_i$ means it is predicted. The error for day $i$ is then the difference $\text{Rain}_i - \widehat{\text{Rain}}_i$. Now, to calculate the overall error we can sum these errors over all days:
$$\text{TotalError} = \sum_i \left(\text{Rain}_i - \widehat{\text{Rain}}_i\right)$$
This is a bad way to calculate the overall error, because many positive and negative errors will cancel each other out, and simple summation will not give a good idea of the true error. To avoid this issue we can either square the errors and then sum them, or take their absolute values and then sum. The former is known as the error sum of squares, or SSE for short:
$$SSE = \sum_i \left(\text{Rain}_i - \widehat{\text{Rain}}_i\right)^2 = \sum_i \left(\text{Rain}_i - (\beta_0 + \beta_1 T_i + \beta_2 H_i)\right)^2$$
Cost Functions
You can see here that if I give different values of the $\beta$s, I will get different values of SSE. In machine learning lingo, SSE here is a cost function which we are trying to minimize, and the $\beta$s are the parameters with respect to which we are trying to minimize it.
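To see this concretely, here is a hedged sketch in Python: the handful of observations below are invented, and we simply evaluate the SSE for two different choices of the $\beta$s to show that different parameters give different costs.

```python
# Evaluating the SSE cost for two different parameter choices.
# The observations below are invented for illustration only.
temps  = [30.0, 25.0, 28.0]   # T_i
humids = [70.0, 80.0, 60.0]   # H_i
rains  = [20.0, 16.0, 17.0]   # observed Rain_i

def sse(b0, b1, b2):
    """Error sum of squares for the linear rain model."""
    total = 0.0
    for t, h, r in zip(temps, humids, rains):
        predicted = b0 + b1 * t + b2 * h
        total += (r - predicted) ** 2
    return total

print(sse(4.0, 0.3, 0.1))    # one choice of betas
print(sse(0.0, 0.5, 0.05))   # a different choice gives a different cost
```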
In many of the coming modules we’ll discuss different algorithms, where we’ll define different cost functions as per the goal of each algorithm.
Now, mathematically optimizing the cost function in order to obtain the best parameters is out of the scope of this course. We’ll instead focus on understanding algorithms which can be used to solve different kinds of problems. The underlying optimization routines for parameter estimation are implemented well in the software packages, and we’ll leave it at that.
However, leaving that totally as a black box will not be good for our curious minds. I will discuss a very simple yet powerful optimization idea known as gradient descent. Many underlying optimization algorithms are nothing but variations of the same idea.
Gradient Descent
Before we begin, here is a quick look back at one idea from basic calculus. Consider a function $y = f(x)$. In the picture below we are looking at a very small segment of this function.

[Figure 1: Gradient and Change in Function]

For the triangle formed in the picture, we can write
$$\tan(\theta) = \frac{\Delta y}{\Delta x} \tag{1}$$

For very small $\Delta x$ and $\Delta y$ we can write $\tan(\theta)$ in terms of the gradient as follows:

$$\tan(\theta) = \frac{\partial y}{\partial x} \tag{2}$$

$$\Delta y = \frac{\partial y}{\partial x} \Delta x \tag{3}$$
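Relation (3) is easy to verify numerically. The sketch below uses $y = x^2$ as an arbitrary example function (so $dy/dx = 2x$) and compares the actual change $\Delta y$ with the gradient-based approximation for a small $\Delta x$.

```python
# Numerically checking relation (3): delta_y ≈ (dy/dx) * delta_x.
# We use y = x**2 as an arbitrary example, so dy/dx = 2x.

def f(x):
    return x ** 2

x = 3.0
dx = 0.001                      # a very small change in x

actual_dy = f(x + dx) - f(x)    # true change in y
approx_dy = (2 * x) * dx        # gradient * delta_x, from relation (3)

print(actual_dy)   # approximately 0.006001
print(approx_dy)   # 0.006
```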
The idea in (3) can be generalized to higher dimensions as well. Now consider a cost function $C$ dependent on parameters $v_1$ and $v_2$:

$$C = f(v_1, v_2)$$
Let’s say we start with some default values of these parameters, say 1 and 2. Now I would like to change these parameters to new values in such a way that the change in my cost function is negative. Let’s write the change in the cost function using the relation between change and gradient seen in (3), extended to higher dimensions:
$$\Delta C = \frac{\partial C}{\partial v_1} \Delta v_1 + \frac{\partial C}{\partial v_2} \Delta v_2 \tag{4}$$

$$\nabla C = \left(\frac{\partial C}{\partial v_1}, \frac{\partial C}{\partial v_2}\right) \tag{5}$$
If we write the change in the parameters as a vector,

$$\Delta v = (\Delta v_1, \Delta v_2) \tag{6}$$

then (4) can be written compactly as

$$\Delta C = \nabla C \cdot \Delta v \tag{7}$$
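Relation (7) can be checked numerically just like (3). Below is a small sketch using $C = v_1^2 + v_2^2$ as an arbitrary cost function (so $\nabla C = (2v_1, 2v_2)$), starting from the default values 1 and 2 mentioned above.

```python
# Numerically checking relation (7): delta_C ≈ grad C · delta_v.
# We use C = v1**2 + v2**2 as an arbitrary cost function,
# so grad C = (2*v1, 2*v2).

def cost(v1, v2):
    return v1 ** 2 + v2 ** 2

v1, v2 = 1.0, 2.0             # current parameter values
dv1, dv2 = 0.001, -0.002      # a small change in the parameters

actual_dC = cost(v1 + dv1, v2 + dv2) - cost(v1, v2)
approx_dC = (2 * v1) * dv1 + (2 * v2) * dv2   # grad C · delta_v

print(actual_dC)   # approximately -0.005995
print(approx_dC)   # -0.006
```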
Now we need to figure out how to change our parameters; that is, what should the value of $\Delta v$ be so that the change in the cost function, $\Delta C$, is always negative?
Consider:

$$\Delta v = -\eta \nabla C$$

Substituting this into (7) gives

$$\Delta C = \nabla C \cdot (-\eta \nabla C) = -\eta \|\nabla C\|^2 \tag{8}$$
which is always negative as long as $\eta$ (the learning rate) is positive. The result in (8) simply means that by changing our parameters in the direction of the negative gradient of the cost function, we can always reduce the cost function. At the optimal point the gradient becomes zero, so we can no longer update the parameters; those values of the parameters are the optimal values.
This idea is known as gradient descent: given the cost function, we start with some default values of the parameters and change them in the manner shown above in order to obtain the optimal values which minimise the cost function.
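To tie this together, here is a minimal gradient descent sketch for the SSE cost of our rain model. The tiny dataset, the starting values, the learning rate $\eta$ and the number of steps are all chosen arbitrarily for illustration; this is a bare-bones version of the idea, not how production libraries implement it.

```python
# A bare-bones gradient descent for the rain model's SSE cost.
# Data, starting values, learning rate and step count are illustrative.
temps  = [30.0, 25.0, 28.0]   # T_i
humids = [70.0, 80.0, 60.0]   # H_i
rains  = [20.0, 16.0, 17.0]   # observed Rain_i

eta = 1e-5                    # learning rate (the eta in delta_v = -eta * grad C)
b0, b1, b2 = 0.0, 0.0, 0.0    # arbitrary starting values

for step in range(50000):
    # Gradient of SSE with respect to each beta, summed over observations.
    g0 = g1 = g2 = 0.0
    for t, h, r in zip(temps, humids, rains):
        residual = r - (b0 + b1 * t + b2 * h)
        g0 += -2 * residual        # dSSE/d b0
        g1 += -2 * residual * t    # dSSE/d b1
        g2 += -2 * residual * h    # dSSE/d b2
    # Move the parameters in the direction of the negative gradient.
    b0 -= eta * g0
    b1 -= eta * g1
    b2 -= eta * g2

print(b0, b1, b2)  # parameters that approximately minimise the SSE
```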
In the coming discussions of multiple algorithms we will encounter many kinds of cost functions, starting with SSE in linear models in the next module.
Note: Keep in mind that the underlying parameter estimation implemented in various software packages does not use gradient descent in exactly the form we just discussed, but variations of the same idea.