Learning
Chapter 18.1-18.3
Why Learn?
• Understand and improve efficiency of human learning
– Use to improve methods for teaching and tutoring people (e.g.,
better computer-aided instruction)
• Discover new things or structure that were previously
unknown to humans
– Examples: data mining, scientific discovery
• Fill in skeletal or incomplete specifications about a domain
– Large, complex AI systems cannot be completely derived by hand
and require dynamic updating to incorporate new information.
– Learning new characteristics expands the domain of expertise and
lessens the “brittleness” of the system
• Build software agents that can adapt to their users or to
other software agents
Major Paradigms of Machine Learning
• Rote learning – One-to-one mapping from inputs to stored
representation. “Learning by memorization.” Association-based
storage and retrieval.
• Induction – Use specific examples to reach general conclusions
• Clustering – Unsupervised identification of natural groups in data
• Analogy – Determine correspondence between two different
representations
• Discovery – Unsupervised, specific goal not given
• Genetic algorithms – “Evolutionary” search techniques, based on
an analogy to “survival of the fittest”
• Reinforcement – Feedback (positive or negative reward) given at
the end of a sequence of steps
Classification Learning: Definition
• Given a collection of records (training set)
– Each record contains a set of attributes; one of the
attributes is the class
• Find a model for the class attribute as a function of the
values of the other attributes
• Goal: previously unseen records should be assigned a class
as accurately as possible
– Use a test set to estimate the accuracy of the model
– Often, the given data set is divided into training and test
sets, with the training set used to build the model and the
test set used to validate it
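A rough sketch of this train/test workflow, assuming scikit-learn; the names X, y and the choice of classifier are illustrative, not part of the slides:

from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

def estimate_accuracy(X, y):
    # Hold out 30% of the records as a test set (illustrative split)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0)
    model = DecisionTreeClassifier().fit(X_train, y_train)  # build model on training set
    y_pred = model.predict(X_test)                          # apply model to test set
    return accuracy_score(y_test, y_pred)                   # estimate accuracy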
Illustrating Classification Learning
[Figure: records in a Training Set (attributes Attrib1–Attrib3 plus a Class label)
are fed to a learning algorithm, which induces a model; the model is then applied
to a Test Set whose class values are unknown.]
Examples of Classification Task
• Predicting tumor cells as benign or malignant
Model Spaces
• Decision trees
– Partition the instance space into axis-parallel regions, labeled with class
value
• Nearest-neighbor classifiers
– Partition the instance space into regions defined by the stored training
instances (or, for k-NN, by the k nearest instances)
• Bayesian networks (probabilistic dependencies of class on attributes)
– Naïve Bayes: special case of BNs where each attribute is conditionally
independent given the class
• Neural networks
– Nonlinear feed-forward functions of attribute values
• Support vector machines
– Find a separating plane in a high-dimensional feature space
• Association rules (feature values → class)
• First-order logical rules
Learning Decision Trees
• Goal: Build a decision tree to classify
examples as positive or negative
instances of a concept using supervised
learning from a training set
• A decision tree is a tree where
– each non-leaf node has associated with it
an attribute (feature)
– each leaf node has associated with it a
classification (+ or -)
– each arc has associated with it one of the
possible values of the attribute at the node
from which the arc is directed
• Generalization: allow for >2 classes
– e.g., {sell, hold, buy}
Example of a Decision Tree
Training data (Tid, Refund, Marital Status, Taxable Income, Cheat):
1  Yes  Single    125K  No
2  No   Married   100K  No
3  No   Single    70K   No
4  Yes  Married   120K  No
5  No   Divorced  95K   Yes
6  No   Married   60K   No
7  Yes  Divorced  220K  No
8  No   Single    85K   Yes
9  No   Married   75K   No
10 No   Single    90K   Yes
Induced tree: split on Refund (Yes → NO); if Refund = No, split on MarSt
(Married → NO); if Single or Divorced, split on TaxInc (< 80K → NO, > 80K → YES)
Apply Model to Test Data
Test data: Refund = No, Marital Status = Married, Taxable Income = 80K, Cheat = ?
• Start at the root: Refund = No, so follow the No branch to MarSt
• MarSt = Married, so follow the Married branch to the leaf NO
• Assign Cheat to “No”
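The traversal above can be written directly as nested conditionals. A sketch that hard-codes the example tree and applies it to the test record (the dictionary keys are illustrative names):

def classify_cheat(record):
    # Root split: Refund = Yes -> NO
    if record["Refund"] == "Yes":
        return "No"
    # Refund = No: split on marital status; Married -> NO
    if record["MaritalStatus"] == "Married":
        return "No"
    # Single or Divorced: split on taxable income (< 80K -> NO, otherwise YES)
    return "No" if record["TaxableIncome"] < 80_000 else "Yes"

# Test record from the slides: Refund = No, Married, Taxable Income = 80K
print(classify_cheat({"Refund": "No", "MaritalStatus": "Married",
                      "TaxableIncome": 80_000}))  # -> "No"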
Information Theory
• Information is measured in bits
• Information conveyed by a message depends on its probability
• With n equally probable possible messages, the probability p
of each is 1/n
• Information conveyed by a message is log2(n) = -log2(p) bits
– e.g., with 16 equally likely messages, log2(16) = 4, so we need 4
bits to identify/send each message
• Given probability distribution for n messages P = (p1,p2…pn),
the information conveyed by distribution (aka entropy of P) is:
I(P) = -(p1*log2 (p1) + p2*log2 (p2) + .. + pn*log2 (pn))
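A minimal sketch of this computation in Python (nothing assumed beyond the formula above):

import math

def entropy(probs):
    # I(P) = -(p1*log2(p1) + ... + pn*log2(pn)), measured in bits
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([1/16] * 16))  # 16 equally likely messages -> 4.0 bits
print(entropy([0.5, 0.5]))   # fair coin -> 1.0 bit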
Using Gain Ratios
• The information gain criterion favors attributes that have a large
number of values
– If we have an attribute D that has a distinct value for each
record, then Info(D,T) is 0, thus Gain(D,T) is maximal
• To compensate for this Quinlan suggests using the following
ratio instead of Gain:
GainRatio(D,T) = Gain(D,T) / SplitInfo(D,T)
• SplitInfo(D,T) is the information due to the split of T on the
basis of the value of the categorical attribute D
SplitInfo(D,T) = I(|T1|/|T|, |T2|/|T|, .., |Tm|/|T|)
where {T1, T2, .. Tm} is the partition of T induced by the values of D
Computing Gain Ratio
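The worked example from the original slides is omitted here, but the quantities above are short to compute. A sketch, assuming class labels and an attribute's values are given as parallel lists (the helper names are mine); the example reuses the Refund and Cheat columns from the earlier table:

import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(attr_values, labels):
    # Gain(D,T) = Info(T) - sum_i |Ti|/|T| * Info(Ti), Ti = records with D's i-th value
    n = len(labels)
    groups = {}
    for v, y in zip(attr_values, labels):
        groups.setdefault(v, []).append(y)
    return entropy(labels) - sum(len(g) / n * entropy(g) for g in groups.values())

def split_info(attr_values):
    # SplitInfo(D,T) = I(|T1|/|T|, ..., |Tm|/|T|)
    n = len(attr_values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(attr_values).values())

def gain_ratio(attr_values, labels):
    return info_gain(attr_values, labels) / split_info(attr_values)

# Refund attribute vs. Cheat class from the decision-tree example table
refund = ["Yes", "No", "No", "Yes", "No", "No", "Yes", "No", "No", "No"]
cheat  = ["No", "No", "No", "No", "Yes", "No", "No", "Yes", "No", "Yes"]
print(info_gain(refund, cheat), gain_ratio(refund, cheat))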
Choosing the Best Attribute
• The key problem is choosing which attribute to split a given
set of examples
• Some possibilities are:
– Random: Select any attribute at random
– Least-Values: Choose the attribute with the smallest number of
possible values
– Most-Values: Choose the attribute with the largest number of
possible values
– Max-Gain: Choose the attribute that has the largest expected
information gain, i.e., the attribute that will result in the smallest
expected size of the subtrees rooted at its children
• The ID3 algorithm uses the Max-Gain method of selecting
the best attribute
Measuring Model Quality
• How good is a model?
– Predictive accuracy
– False positives / false negatives for a given cutoff threshold
• Loss function (accounts for cost of different types of errors)
– Area under the (ROC) curve
– Minimizing loss can lead to problems with overfitting
• Training error
– Train on all data; measure error on all data
– Subject to overfitting (of course we’ll make good predictions on the data on
which we trained!)
• Regularization
– Attempt to avoid overfitting
– Explicitly minimize the complexity of the function while minimizing loss.
Tradeoff is modeled with a regularization parameter
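As a sketch of the loss-plus-complexity tradeoff described above, using an L2 penalty on the weights as one common choice (function and parameter names are illustrative):

import numpy as np

def regularized_loss(w, X, y, lam):
    # data-fit term (mean squared error) + lam * complexity term (L2 norm of weights)
    residual = X @ w - y
    return np.mean(residual ** 2) + lam * np.sum(w ** 2)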
Cross-Validation
• Holdout cross-validation:
– Divide data into training set and test set
– Train on training set; measure error on test set
– Better than training error, since we are measuring generalization to
new data
– To get a good estimate, we need a reasonably large test set
– But this gives less data to train on, reducing our model quality!
Cross-Validation, cont.
• k-fold cross-validation:
– Divide data into k folds
– Train on k-1 folds, use the kth fold to measure error
– Repeat k times; use average error to measure generalization
accuracy
– Statistically valid and gives good accuracy estimates
• Leave-one-out cross-validation (LOOCV)
– k-fold cross validation where k=N (test data = 1 instance!)
– Quite accurate, but also quite expensive, since it requires building N
models
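A minimal sketch of k-fold cross-validation, assuming scikit-learn and NumPy arrays X and y (illustrative names, not from the slides):

import numpy as np
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeClassifier

def kfold_accuracy(X, y, k=10):
    errors = []
    for train_idx, test_idx in KFold(n_splits=k, shuffle=True, random_state=0).split(X):
        model = DecisionTreeClassifier().fit(X[train_idx], y[train_idx])   # train on k-1 folds
        errors.append(np.mean(model.predict(X[test_idx]) != y[test_idx]))  # error on held-out fold
    return 1 - np.mean(errors)  # average over the k folds; k = len(y) gives LOOCV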
Summary: Decision Tree Learning
• Inducing decision trees is one of the most widely used
learning methods in practice
• Can out-perform human experts in many problems
• Strengths include
– Fast
– Simple to implement
– Can convert result to a set of easily interpretable rules
– Empirically valid in many commercial products
– Handles noisy data
• Weaknesses include:
– Univariate splits/partitioning use only one attribute at a time, which limits
the types of possible trees
– Large decision trees may be hard to understand
– Requires fixed-length feature vectors
– Non-incremental (i.e., batch method)