
Overview and A Machine Learning Algorithm

This document discusses machine learning and decision trees. It provides an overview of machine learning algorithms that improve performance with experience. It also discusses decision tree learning, how decision trees represent data, and concepts like entropy, sample entropy, and overfitting. The trend of machine learning accelerating and being applied to areas like speech recognition, natural language processing, computer vision, and medical analysis is also mentioned.

Uploaded by: mrt5000
Copyright: © Attribution Non-Commercial (BY-NC)

Machine Learning, Decision Trees, Overfitting

Recommended reading: Mitchell, Chapter 3

Machine Learning 10-701 Tom M. Mitchell Center for Automated Learning and Discovery Carnegie Mellon University September 13, 2005

Machine Learning:
Study of algorithms that improve their performance at some task with experience

Learning to Predict Emergency C-Sections


[Sims et al., 2000]

9714 patient records, each with 215 features

Object Detection
(Prof. H. Schneiderman)

Example training images for each orientation

Text Classification

Company home page vs. personal home page vs. university home page vs. …

Reading a noun (vs verb)


[Rustandi et al., 2005]

Growth of Machine Learning


Machine learning is the preferred approach to:
- Speech recognition, natural language processing
- Computer vision
- Medical outcomes analysis
- Robot control

Drivers of this trend:
- Improved machine learning algorithms
- Improved data capture, networking, faster computers
- Software too complex to write by hand
- New sensors / IO devices
- Demand for self-customization to user, environment

This trend is accelerating

Decision tree learning

How would you represent A∧B ∨ C∧D∧(¬E)?

- Each internal node: tests one attribute Xi
- Each branch from a node: selects one value for Xi
- Each leaf node: predicts Y (or P(Y | x reaching that leaf))
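As a concrete sketch of this representation, a tree can be written as nested dictionaries, with a prediction made by walking from the root to a leaf. The attribute names here are illustrative only, in the spirit of the PlayTennis example from Mitchell, Chapter 3:

```python
# Illustrative decision tree as nested dicts (attribute names are hypothetical).
# Internal nodes test one attribute; branches select a value; leaves predict Y.
tree = {
    "attribute": "Outlook",
    "branches": {
        "Sunny":    {"attribute": "Humidity",
                     "branches": {"High": "No", "Normal": "Yes"}},
        "Overcast": "Yes",  # leaf node: predict Y directly
        "Rain":     {"attribute": "Wind",
                     "branches": {"Strong": "No", "Weak": "Yes"}},
    },
}

def predict(node, x):
    """Walk from the root: test the node's attribute, follow the branch
    matching x's value for that attribute, return the leaf label."""
    while isinstance(node, dict):
        node = node["branches"][x[node["attribute"]]]
    return node

print(predict(tree, {"Outlook": "Sunny", "Humidity": "Normal"}))  # -> Yes
```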

Top-down induction of decision trees [ID3, C4.5, …]:

node = Root. Main loop:
1. A ← the "best" decision attribute for the next node
2. Assign A as the decision attribute for node
3. For each value of A, create a new descendant of node
4. Sort training examples to the leaf nodes
5. If the training examples are perfectly classified, stop; else iterate over the new leaf nodes
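A minimal runnable sketch of this greedy top-down loop, assuming categorical attributes and using information gain (entropy reduction; entropy is defined in the next section) as the "best attribute" heuristic. The function and variable names are ours, not from any particular ID3 implementation:

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of the empirical label distribution (in bits)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(examples, labels, attr):
    """Entropy reduction from splitting the sample on attr."""
    gain = entropy(labels)
    for v in set(e[attr] for e in examples):
        idx = [i for i, e in enumerate(examples) if e[attr] == v]
        gain -= len(idx) / len(examples) * entropy([labels[i] for i in idx])
    return gain

def id3(examples, labels, attrs):
    """Greedy top-down induction: pick the best attribute, split, recurse."""
    if len(set(labels)) == 1:
        return labels[0]  # perfectly classified: make a leaf
    if not attrs:
        return Counter(labels).most_common(1)[0][0]  # majority-vote leaf
    best = max(attrs, key=lambda a: info_gain(examples, labels, a))
    tree = {"attribute": best, "branches": {}}
    for v in set(e[best] for e in examples):
        idx = [i for i, e in enumerate(examples) if e[best] == v]
        tree["branches"][v] = id3([examples[i] for i in idx],
                                  [labels[i] for i in idx],
                                  [a for a in attrs if a != best])
    return tree
```

Note the greedy, non-backtracking structure: once an attribute is chosen for a node, that choice is never revisited.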

Entropy

Entropy H(X) of a random variable X:

H(X) = - Σi P(X = i) log2 P(X = i)

H(X) is the expected number of bits needed to encode a randomly drawn value of X (under the most efficient code). Why? Information theory: the most efficient code assigns -log2 P(X = i) bits to encode the message X = i. So the expected number of bits is Σi P(X = i) · (-log2 P(X = i)) = H(X).
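A direct Python translation of this definition, taking the convention 0 · log2 0 = 0:

```python
import math

def entropy(probs):
    """H(X) = -sum_i P(X=i) * log2 P(X=i), in bits; 0*log0 taken as 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))      # fair coin: 1.0 bit
print(entropy([1.0]))           # deterministic outcome: 0.0 bits
print(entropy([0.25] * 4))      # uniform over 4 values: 2.0 bits
```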

Sample Entropy

Sample entropy of a set S of labeled training examples (the X values are assumed known; we are encoding the labels Y). With p⊕ the fraction of positive examples in S and p⊖ the fraction of negative examples:

Entropy(S) ≡ - p⊕ log2 p⊕ - p⊖ log2 p⊖
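This can be computed directly from the label multiset; a small sketch (generalized to any number of label values, which reduces to the two-class formula above):

```python
import math
from collections import Counter

def sample_entropy(labels):
    """Entropy of the empirical label distribution in a sample S:
    Entropy(S) = -sum_y p_y log2 p_y, where p_y is the fraction labeled y."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

# 9 positive / 5 negative examples gives approximately 0.940 bits
print(sample_entropy(["+"] * 9 + ["-"] * 5))
```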

What you should know:

Well-posed function approximation problems:
- Instance space, X
- Sample of labeled training data, D = { <xi, yi> }
- Hypothesis space, H = { f : X → Y }

Learning is a search/optimization problem over H:
- Various objective functions
- Today: minimize training error (0-1 loss)

Decision tree learning:
- Greedy top-down learning of decision trees (ID3, C4.5, …)
- Overfitting and tree/rule post-pruning
- Extensions
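As a quick illustration of the objective used today, training error under 0-1 loss is simply the fraction of training examples a hypothesis misclassifies. The hypothesis and data below are hypothetical:

```python
def zero_one_loss(f, data):
    """Training error under 0-1 loss: fraction of (x, y) pairs with f(x) != y."""
    return sum(f(x) != y for x, y in data) / len(data)

# Hypothetical hypothesis: predict positive whenever x > 0
f = lambda x: x > 0
data = [(-2, False), (-1, False), (1, True), (3, False)]
print(zero_one_loss(f, data))  # 1 mistake out of 4 -> 0.25
```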
