Machine Learning and Data Mining
Introduction to Artificial Intelligence
CBAS
School of Physical and Mathematical Sciences
2020/2021 – 2022/2023
Subsets of Artificial Intelligence
Lesson Goals:
• Understand the basic concepts of the learning problem and why/how machine
learning methods are used to learn from data to find underlying patterns for
prediction and decision-making.
• Understand the basic concepts of assessing model accuracy and the bias-variance
trade-off.
Introduction
• Machine learning is making great strides
– Large, good data sets
– Compute power
– Progress in algorithms
• Many interesting applications
– commercial
– scientific
• Links with artificial intelligence
– However, AI ≠ machine learning: machine learning is one subset of AI
Big Data is Everywhere
• We are in the era of big data!
– 40 billion indexed web pages
– 100 hours of video are uploaded
to YouTube every minute
• The deluge of data calls for
automated methods of data
analysis, which is what
machine learning provides!
What is Machine Learning?
• Machine learning is a set of methods that can
automatically detect patterns in data.
• Communication Systems
– Speech recognition, image analysis
The Learning Problem
• Learning from data is used in situations where we
don’t have any analytic solution, but we do have data
that we can use to construct an empirical solution
[Figure: training data, y plotted against x]
The Learning Problem: Example (cont.)
• First, we observe a set of n training data points:
$\{(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)\}$
• Second, we use the training data and a machine learning
method to estimate f.
– Parametric or non-parametric methods
Parametric Methods
• Step 1:
– We make an assumption about the functional form of f. A common
assumption is that f is linear in the predictors:
$f(X_i) \approx \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_p X_{ip}$
– This reduces the learning problem of estimating the target
function f down to a problem of estimating a set of parameters.
– In this course, we will examine far more complicated and
flexible models for f.
Parametric Methods (cont.)
• Step 2:
– We use the training data to fit the model (i.e. estimate the
unknown parameters of f).
– The most common approach for estimating the parameters in a
linear model is via ordinary least squares (OLS) linear
regression.
– However, there are superior approaches, as we will see in this
course.
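As a minimal sketch of Step 2 (the synthetic data, variable names, and use of NumPy's least-squares solver are illustrative, not from the slides), OLS estimates the parameters by minimizing the residual sum of squares:

```python
import numpy as np

# Synthetic training data: n observations, p = 2 predictors (illustrative only)
rng = np.random.default_rng(0)
n, p = 100, 2
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, 2.0, -0.5])              # intercept + two slopes
y = beta_true[0] + X @ beta_true[1:] + rng.normal(scale=0.1, size=n)

# OLS: append a column of ones for the intercept, then solve least squares
X1 = np.column_stack([np.ones(n), X])
beta_hat, *_ = np.linalg.lstsq(X1, y, rcond=None)
print(beta_hat)  # estimates should be close to beta_true
```

The same computation is wrapped by higher-level tools such as scikit-learn's LinearRegression; the closed-form solve above is just the most transparent version.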
Example: Income vs. Education and Seniority
Example: OLS Regression Estimate
• Even if the error standard deviation is low, we will still get a
poor estimate if we assume the incorrect model form.
Non-Parametric Methods
• As opposed to parametric methods, these do not make
explicit assumptions about the functional form of f.
• Advantages:
– They can accurately fit a wider range of possible shapes of f.
• Disadvantages:
– They require a very large number of observations to obtain an
accurate estimate of f.
Example: Thin-Plate Spline Estimate
• Non-linear regression
methods are more flexible
and can potentially provide
more accurate estimates.
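A hedged sketch of a thin-plate spline fit (synthetic data; scipy's RBFInterpolator is one implementation of this idea, not necessarily the one behind the slide's figure):

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

# Synthetic stand-in for (Education, Seniority) -> Income (illustrative)
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(200, 2))                 # two predictors
y = np.sin(2 * np.pi * X[:, 0]) + X[:, 1] ** 2 + rng.normal(scale=0.1, size=200)

# Thin-plate spline fit; `smoothing` controls flexibility (0 = interpolate exactly)
tps = RBFInterpolator(X, y, kernel="thin_plate_spline", smoothing=1.0)

# Predict at new input points
X_new = rng.uniform(0, 1, size=(5, 2))
print(tps(X_new))
```

Note that no functional form for f is assumed; flexibility is governed by the smoothing parameter rather than by a fixed set of coefficients.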
Predictive Accuracy vs. Interpretability
• Conceptual Question:
– Why not just use a more flexible method if it is more realistic?
• Reason 1:
– A simple method (such as OLS regression) produces a model
that is easier to interpret (especially for inference purposes).
Predictive Accuracy vs. Interpretability (cont.)
• Reason 2:
– Even if the primary
purpose of learning from
the data is for prediction, it
is often possible to get
more accurate predictions
with a simple rather than a
complicated model.
Learning Algorithm Trade-off
• There are always two aspects to consider when designing
a learning algorithm:
– Try to fit the data well
– Be as robust as possible
Supervised vs. Unsupervised Learning
• Supervised Learning:
– All the predictors, Xi, and the response, Yi, are observed.
• Many regression and classification methods
• Unsupervised Learning:
– Here, only the Xi’s are observed (not Yi’s).
– We need to use the Xi's to guess what Y would have been, and
then build a model from there.
• Clustering and principal components analysis
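As a minimal unsupervised-learning sketch (synthetic, unlabeled data; scikit-learn's KMeans is used here purely for illustration):

```python
import numpy as np
from sklearn.cluster import KMeans

# Unlabeled data: two blobs in 2-D (synthetic, for illustration)
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 0.5, size=(50, 2)),
               rng.normal(3, 0.5, size=(50, 2))])

# K-means looks for structure in the X's alone -- no Y is used anywhere
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_[:5])        # cluster assignment for the first 5 points
print(km.cluster_centers_)   # estimated cluster centers
```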
Terminology
• Notation
– Input X: feature, predictor, or independent variable
– Output Y: response, dependent variable
• Categorization
– Supervised learning vs. unsupervised learning
• Key question: Is Y available in the training data?
– Regression vs. Classification
• Key question: Is Y quantitative or qualitative?
Terminology (cont.)
• Regression
– Covers situations where Y is quantitative: measurements or
counts, recorded as numerical values (e.g. height, temperature, etc.)
• Classification
– Covers situations where Y is categorical (qualitative)
– E.g. Will the Dow be up or down in 6 months? Is this email spam
or not?
Supervised Learning: Examples
• Email Spam:
– Predict whether an email is a junk email (i.e. spam)
• Handwritten Digit Recognition:
– Identify single digits 0–9 based on images
• Face Detection/Recognition:
– Identify human faces
• Speech Recognition:
– Identify words spoken according to speech signals
– E.g. automatic voice recognition systems used by airline
companies, automatic stock price reporting, etc.
Supervised Learning: Methods
• Linear Regression
• Linear/Quadratic Discriminant Analysis
• Logistic Regression
• K Nearest Neighbors
• Decision Trees / CART
• Support Vector Machines
Unsupervised Learning
• The training data does not contain any output information
at all (i.e. unlabeled data).
Overfitting
Different Levels of Flexibility
[Figures: training vs. test MSE for models with different levels of flexibility]
Bias-Variance Trade-off
• The previous graphs of test versus training MSE
illustrate a very important trade-off that governs the
choice of machine learning methods.
• Note that the expected test MSE can never lie below the
irreducible error, Var(ε).
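For a test observation $x_0$, this is the standard bias-variance decomposition of the expected test MSE:

$$\mathbb{E}\left[\left(y_0 - \hat{f}(x_0)\right)^2\right] = \operatorname{Var}\left(\hat{f}(x_0)\right) + \left[\operatorname{Bias}\left(\hat{f}(x_0)\right)\right]^2 + \operatorname{Var}(\varepsilon)$$

Since the variance and squared bias terms are both non-negative, the expected test MSE is bounded below by $\operatorname{Var}(\varepsilon)$, the irreducible error.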
Test MSE, Bias and Variance (cont.)
The Classification Setting
• For a classification problem, we can use the
misclassification error rate to assess the accuracy of the
machine learning method.
$$\text{Error Rate} = \frac{1}{n}\sum_{i=1}^{n} I(y_i \neq \hat{y}_i),$$
which represents the fraction of misclassifications.
• On test data, no classifier can get lower error rates than the
Bayes error rate.
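In code, this error rate is simply the mean of the indicator of disagreement (a minimal sketch; the label arrays are illustrative):

```python
import numpy as np

y_true = np.array([0, 1, 1, 0, 1])   # illustrative true labels
y_pred = np.array([0, 1, 0, 0, 1])   # illustrative predictions

# Fraction of misclassified observations: mean of the indicator I(y_i != yhat_i)
error_rate = np.mean(y_true != y_pred)
print(error_rate)  # 0.2
```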
• For the K-nearest neighbors (KNN) classifier, the smaller K is,
the more flexible the method will be.
[Figure: KNN decision boundary for K = 10 versus the Bayes decision boundary]
[Figure: KNN decision boundaries for K = 1 and K = 100]
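A closing sketch of KNN flexibility (synthetic two-class data; scikit-learn's KNeighborsClassifier, with K values matching the figures above): K = 1 fits the training data perfectly, while large K gives a smoother, less flexible fit.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Synthetic two-class training data (illustrative)
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 1, size=(100, 2)),
               rng.normal(2, 1, size=(100, 2))])
y = np.repeat([0, 1], 100)

for k in (1, 10, 100):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X, y)
    train_err = np.mean(knn.predict(X) != y)
    # K = 1 gives zero training error (most flexible); large K smooths the boundary
    print(f"K={k:>3}  training error = {train_err:.3f}")
```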