
NPTEL

Video Course on Machine Learning

Professor Carl Gustaf Jansson, KTH

Week 2: Characterization of Learning Problems

Video 2.4: Scenarios for Concept Learning
Basic Distinctions: Supervised vs Unsupervised Learning

In supervised learning, the input data are always pre-classified with unique concept labels. The goal of supervised learning is to learn, from example input-output pairs, a concept definition that best approximates the relationship between the input data and the concept labels. In the optimal scenario, the algorithm correctly determines the concept labels for unseen data items.

In unsupervised learning, the input data are NOT classified, i.e. the data set contains only input data and lacks concept labels. Unsupervised learning algorithms therefore have to identify commonalities and structures in the data set and group the input based on similarity. They have to decide on an optimal portfolio of concepts that best matches the data set and arrange groupings of subsets of the data set so that they match this portfolio of concepts. A minimal sketch contrasting the two settings follows below.
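The contrast can be made concrete with a small sketch. The toy data are assumptions, not from the lecture; scikit-learn's LogisticRegression and KMeans merely stand in for any supervised and unsupervised learner.

```python
# Sketch (assumed toy data): the same inputs X, handled with labels
# (supervised) and without labels (unsupervised).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

X = np.array([[0.1, 0.2], [0.2, 0.1], [0.9, 1.0], [1.0, 0.8]])
y = np.array([0, 0, 1, 1])  # concept labels, available only in the supervised case

# Supervised: learn a mapping from inputs to the given concept labels.
clf = LogisticRegression().fit(X, y)
print(clf.predict([[0.15, 0.15]]))   # predicted concept label for an unseen item

# Unsupervised: no labels; the algorithm must group inputs by similarity.
km = KMeans(n_clusters=2, n_init=10).fit(X)
print(km.labels_)                    # discovered groupings, not pre-given labels
```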
Basic Distinctions: Off-line (Batch) vs On-line (Incremental) Learning

This distinction is relevant for supervised as well as for unsupervised learning.

Off-line learning refers to situations where the system is not operating in a real-time environment but handles pre-harvested data in static and complete batch form. Most traditional machine learning algorithms are well adapted to off-line learning, and the parallel access to the whole data set gives full flexibility to use data items in all kinds of variations during the learning process.

On-line learning is a learning scenario where data is processed in real time in an incremental fashion. Input data items are incrementally, gradually and continuously used to inductively extend the existing model. Results of earlier learning are typically maintained as still being valid. Incremental algorithms are frequently applied to data streams or big data; stock trend prediction and user profiling are examples of settings where new data becomes continuously available. Many traditional machine learning algorithms inherently support incremental learning, but may have to be adapted to facilitate it.

A middle way is to handle data in so-called mini-batches. The sketch below illustrates the three regimes.

Scenario 1
Learning a single concept off-line from pre-classified positive examples

[Slide figure: data items with Label 1 mapped onto a single concept, Concept 1.]
Scenario 2: The Presence of Noise
Learning a single concept off-line from pre-classified positive examples

[Slide figure: labelled examples forming Concept 1, with a noise item among them.]

Noise is a fundamental underlying phenomenon that is present in all data sets. Noise is a distortion in data that is unwanted by the perceiver: anything that is spurious and extraneous to the true data, typically due to a faulty capturing process.

Noise can occur in all subsequent scenarios as well as in this first, simplest one.
Scenario 3: Outliers
Learning a single concept off-line from pre-classified positive examples

[Slide figure: labelled examples forming Concept 1, with one outlier far from the rest.]

An outlier is a data item that is distant from other observations. An outlier may be due to natural but extreme variation, or it may indicate an experimental error or other noise. Outliers can cause serious problems for analysis. It is crucial to distinguish between the measurement-error cases and the cases where the population has a heavy-tailed or skewed distribution.

Outliers can occur in all subsequent scenarios as well as in this first, simplest one. A minimal distance-based check is sketched below.
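A common minimal check (an illustrative sketch, not part of the lecture) flags items whose distance from the sample mean exceeds a chosen number of standard deviations; the toy data and the threshold of 2 are assumptions.

```python
# Sketch: flag data items that lie far from the rest (z-score rule; the
# threshold of 2 standard deviations is an assumed convention).
import numpy as np

data = np.array([9.8, 10.1, 10.0, 9.9, 10.2, 25.0])  # toy sample with one outlier
z = (data - data.mean()) / data.std()
outliers = data[np.abs(z) > 2]
print(outliers)  # -> [25.]
```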
Scenario 4: Negative Examples
Learning a single concept off-line from pre-classified positive examples

[Slide figure: a positive (+) example and several negative (-) examples relative to Concept 1.]

For many situations and many machine learning algorithms, faster convergence towards a concept definition can be achieved by using a mix of positive and negative examples.

Negative examples can be made available in a variety of ways, either by using examples from other labelled categories or by artificial generation guided by available domain knowledge.
Scenario 5: Near Misses
Learning a single concept off-line from pre-classified positive examples

[Slide figure: positive (+) examples within Concept 1 and near-miss negative (-) examples just outside its boundary.]

In a scenario where negative examples are used, it is not obvious what type of objects we should use as negative examples. Arbitrary negative examples will differ considerably from the positive examples, which might allow the learner unwanted flexibility in determining the classification boundary.

Near misses are negative examples that differ from the learned concept in only a small number of significant points. Such examples do not necessarily belong to known concepts. One way of generating them is sketched below.
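One simple way to obtain a near miss (an illustrative sketch; the attribute names, values and perturbation rule are assumptions, not from the lecture) is to copy a positive example and change exactly one significant attribute.

```python
# Sketch: generate a near miss by perturbing one significant attribute of a
# positive example (attribute names and values are hypothetical).
positive_example = {"has_top": True, "n_legs": 4, "material": "wood"}

def make_near_miss(example, attribute, new_value):
    """Copy a positive example and flip a single significant attribute."""
    near_miss = dict(example)
    near_miss[attribute] = new_value
    return near_miss

# Differs from the concept in only one significant point: the top is missing.
print(make_near_miss(positive_example, "has_top", False))
```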
Scenario 6: Internal Structure and Topology of the Data Set
Learning a single concept off-line from pre-classified positive examples

[Slide figure: examples of Concept 1, with the most typical items (P) marked.]

In the earlier scenarios we have assumed no internal structure of the data set. All data items have been regarded as having the same status and importance, and no structure or metric has been assumed among the data items in the set.

In contrast, this scenario introduces the concepts of typicality of objects and similarity metrics within the data set:
- naively, more typical objects are more advantageous to use early in a learning process
- naively, it makes sense to use the similarity metrics to guide the order in which training examples are considered (as in the sketch after this list).
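A small sketch of such an ordering (the approximation of typicality as closeness to the data set's centroid is an assumption, not the lecture's definition):

```python
# Sketch: order training examples by typicality, approximated as
# closeness to the data set's centroid (assumed toy data).
import numpy as np

X = np.array([[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [3.0, 3.0], [1.05, 1.0]])
centroid = X.mean(axis=0)
distances = np.linalg.norm(X - centroid, axis=1)

# Most typical (closest to the centroid) first; atypical items come last.
order = np.argsort(distances)
print(X[order])
```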
Scenario 7: Instance-based Learning
Learning a single concept off-line from pre-classified positive examples

[Slide figure: stored instances (P) of Concept 1 used directly for comparison.]

The 6th scenario, which implies a well-defined structure and similarity metric for the data set, opens up a new scenario where an explicit generalization (concept definition) is no longer needed. By instance-based learning (memory-based learning) we mean learning algorithms that, instead of creating explicit generalizations, compare new problem instances with instances that have already been stored in memory.

One advantage that instance-based learning has over other methods of machine learning is its ability to more easily adapt its model to previously unseen data. Instance-based learners may simply store a new instance or may also throw old instances away. A minimal nearest-neighbour sketch follows below.
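A minimal memory-based learner (a 1-nearest-neighbour sketch with assumed toy data; the lecture does not prescribe this particular variant):

```python
# Sketch of instance-based learning: classify by comparing a new instance
# with stored instances, with no explicit generalization.
import numpy as np

memory_X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
memory_y = np.array([0, 0, 1, 1])  # stored, pre-classified instances

def classify(x):
    """Return the label of the stored instance nearest to x."""
    distances = np.linalg.norm(memory_X - x, axis=1)
    return memory_y[np.argmin(distances)]

print(classify(np.array([0.05, 0.1])))  # -> 0, label of the nearest stored instance
```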
Scenario 8: On-line Learning
Learning a single concept on-line from pre-classified positive examples

[Slide figure: examples of Concept 1 arriving incrementally.]

On-line learning is a learning scenario where data is processed in real time in an incremental fashion. Input data items are incrementally, gradually and continuously used to inductively extend the existing model.

Results of earlier learning are typically maintained as still being valid.
Scenario 9: Throwing Away Data Items That Have Become Irrelevant
Learning a single concept on-line from pre-classified positive examples

[Slide figure: examples of Concept 1, some of which are retracted over time.]

In the case of on-line learning, normally all data items that have been encountered and analyzed are kept, resulting in a monotonically growing data set.

However, there are many reasons why 'older' data items may become irrelevant. This means, of course, that the concept definitions may also have to be revised due to these 'retractions'.

Machine learning algorithms that handle the on-line case must also be able to handle this kind of situation, for example via a sliding window as sketched below.
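A minimal sketch of retracting stale items (the sliding-window policy and its size of 100 are assumptions, not from the lecture):

```python
# Sketch: an on-line learner that retracts stale items with a sliding window.
from collections import deque

window = deque(maxlen=100)  # oldest items are dropped automatically

def observe(item):
    """Add a new data item; items older than the window become irrelevant."""
    window.append(item)
    # Any concept definition derived from `window` must be recomputed here,
    # since retracted items may have supported the previous definition.
    return len(window)

for t in range(250):
    observe(t)
print(list(window)[:3])  # -> [150, 151, 152]: items 0-149 have been retracted
```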
Scenario 10
Learning multiple concepts off-line from pre-classified positive examples

[Slide figure: examples with Label 1, Label 2 and Label 3 mapped onto Concept 1, Concept 2 and Concept 3.]

This scenario does not introduce any big differences or surprises. All aspects introduced in scenarios 1-9 are still relevant to consider, but now in parallel:
- Noise
- Outliers
- Negative examples
- Near misses
- Internal structure and metrics of data sets
- Instance-based learning
- On-line learning
- Throwing away data items that have become irrelevant.
Scenario 11: Unsupervised Concept Learning
Learning multiple concepts from unsorted examples

All aspects introduced in scenarios 1-10 are still relevant to consider.

The input data are NOT classified, i.e. the data set contains only input data observables and lacks output data observables (concept labels).

Unsupervised learning algorithms therefore have to identify commonalities and structures in the data set and group the input based on similarity.

The main category of techniques that tackles the unsupervised case is called Cluster Analysis.
Cluster Analysis

Cluster analysis is the assignment of a set of observations into subsets (called clusters) so that observations within the same cluster are similar according to one or more pre-designated criteria, while observations drawn from different clusters are dissimilar.

Important aspects of clustering techniques are:
- similarity metrics
- internal compactness or density of clusters
- degree of separation, i.e. the difference between clusters.

A minimal clustering sketch follows below.
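As an illustration (a sketch: the choice of k-means, the value k=2 and the toy data are assumptions; the lecture does not prescribe a particular clustering algorithm):

```python
# Sketch: cluster analysis with k-means on two well-separated groups.
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[0.1, 0.0], [0.2, 0.1], [0.0, 0.2],   # one compact group
              [2.0, 2.1], [2.1, 1.9], [1.9, 2.0]])  # a second, distant group

km = KMeans(n_clusters=2, n_init=10).fit(X)
print(km.labels_)           # cluster assignment per observation
print(km.cluster_centers_)  # centers; the distance between them indicates separation
```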
Clustering analysis can potentially also learn Concept Hierarchies.

[Slide figure: a concept hierarchy in which Category 1 generalizes Categories 2-4, which in turn generalize Categories 5-10.]

The End-to-end Process for Concept Learning

In a typical concept learning task, the following steps need to be considered:

• Data harvesting from potentially heterogeneous sources

• Pre-processing of data (e.g. from analogue to digital form)

• Establishment of model or theory support

• Feature engineering

• Algorithm selection

• Tailoring conditions for algorithms (hyper-parameter settings, language biases, complexity)

• Core data analysis phase

• Post-processing of acquired concept definitions, including validation

A skeletal sketch of some of these steps follows below.
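As a rough illustration (a sketch under assumptions: the scikit-learn components, toy data and parameter grid are stand-ins, not the lecture's prescription), several of the later steps map onto a standard pipeline:

```python
# Sketch: feature engineering, algorithm selection, hyper-parameter tailoring
# and validation expressed with scikit-learn (assumed toy data).
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import GridSearchCV

np.random.seed(0)
X = np.random.rand(60, 4)                  # harvested and pre-processed data
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)  # concept labels

pipe = Pipeline([
    ("features", StandardScaler()),        # a stand-in for feature engineering
    ("learner", KNeighborsClassifier()),   # algorithm selection
])

# Tailoring conditions: hyper-parameter settings explored by cross-validation.
search = GridSearchCV(pipe, {"learner__n_neighbors": [1, 3, 5]}, cv=3)
search.fit(X, y)                                # core data analysis phase
print(search.best_params_, search.best_score_)  # post-processing / validation
```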


NPTEL

Video Course on Machine Learning

Professor Carl Gustaf Jansson, KTH

Thanks for your attention!

The next lecture, 2.5, will be on the topic:

Tutorial for Week 2
