
History of Machine Learning

• 1960’s and 70’s: Models of human learning


– High-level symbolic descriptions of knowledge, such as logical
expressions or graphs/networks, e.g., (Karpinski & Michalski,
1966), (Simon & Lea, 1974).
– Winston's (1975) structural learning system learned logic-based
structural descriptions from examples.

• Minsky & Papert (1969): Perceptrons, an analysis of the limits of
single-layer perceptrons.

• 1970’s: Genetic algorithms
– Developed by Holland (1975)
• 1970’s - present: Knowledge-intensive learning
– A tabula rasa approach typically fares poorly. “To acquire new
knowledge a system must already possess a great deal of initial
knowledge.” Lenat’s CYC project is a good example.
History of Machine Learning (cont'd)
• 1970’s - present: Alternative modes of learning
(besides examples)
– Learning from instruction, e.g., (Mostow, 1983) (Gordon &
Subramanian, 1993)
– Learning by analogy, e.g., (Veloso, 1990)
– Learning from cases, e.g., (Aha, 1991)
– Discovery (Lenat, 1977)
– 1991: The first of a series of workshops on Multistrategy
Learning (Michalski)
• 1970’s – present: Meta-learning
– Heuristics for focusing attention, e.g., (Gordon &
Subramanian, 1996)
– Active selection of examples for learning, e.g., (Angluin,
1987), (Gasarch & Smith, 1988), (Gordon, 1991)
– Learning how to learn, e.g., (Schmidhuber, 1996)
History of Machine Learning (cont'd)
• 1980 – The First Machine Learning Workshop was held at Carnegie-
Mellon University in Pittsburgh.
• 1980 – Three consecutive issues of the International Journal of Policy
Analysis and Information Systems were specially devoted to
machine learning.
• 1981 – Hinton, Jordan, Sejnowski, Rumelhart, McClelland at
UCSD
– The Back Propagation algorithm; the PDP book
• 1986 – The establishment of the Machine Learning journal.
• 1987 – The beginning of annual international conferences on
machine learning (ICML); also the Snowbird ML conference
• 1988 – The beginning of regular workshops on computational
learning theory (COLT).
• 1990's – Explosive growth in the field of data mining, which involves
the application of machine learning techniques to large databases.
Bottom line from History

• 1960 – The Perceptron (Rosenblatt, 1958; limits analyzed by
Minsky & Papert, 1969)
• 1960 – Bellman's "curse of dimensionality"
• 1980 – Bounds on statistical estimators (C. Stone)
• 1990 – Beginning of high-dimensional data (hundreds of variables)
• 2000 – High-dimensional data (thousands of variables)
A Glimpse into the Future

• Status today:
– First-generation algorithms: neural nets, decision trees, etc.
• Future:
– Smart remote controls, phones, cars
– Data and communication networks, software
Types of models

• Supervised learning
– Given access to classified (labeled) data
• Unsupervised learning
– Given access to data, but no classification
– Important for data reduction
• Control learning
– Selects actions and observes consequences
– Maximizes long-term cumulative return
Some Issues in Machine Learning

• What algorithms can approximate functions well, and when?
• How does the number of training examples influence accuracy?
• How does the complexity of the hypothesis representation impact accuracy?
• How does noisy data influence accuracy?
More Issues in Machine Learning

• What are the theoretical limits of learnability?
• How can prior knowledge of the learner help?
• What clues can we get from biological learning systems?
• How can systems alter their own representations?
Complexity vs. Generalization

• Hypothesis complexity versus observed error.
• More complex hypotheses have lower observed error on the training set,
• but might have higher true error (on the test set).
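
A minimal numerical sketch of this trade-off, assuming numpy is available; the sine target, noise level, and polynomial degrees are illustrative choices, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples of an underlying sine curve (illustrative target).
x_train = rng.uniform(-1, 1, 20)
y_train = np.sin(3 * x_train) + rng.normal(0, 0.2, 20)
x_test = rng.uniform(-1, 1, 200)
y_test = np.sin(3 * x_test) + rng.normal(0, 0.2, 200)

for degree in (1, 3, 9, 12):
    # Higher degree = more complex hypothesis class.
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```

The training error typically keeps falling as the degree grows, while the held-out error eventually rises: the gap between observed and true error described above.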
Nearest Neighbor Methods

Classify using nearby examples.

Assumes a "structured space" and a "metric".

[Figure: labeled + and − examples in the plane, with an unlabeled query point "?"]
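
A minimal k-nearest-neighbor sketch, assuming Euclidean distance as the metric and numpy as a dependency; the toy points and k = 3 are illustrative:

```python
import numpy as np

def knn_classify(query, points, labels, k=3):
    """Label a query point by majority vote among its k nearest neighbors."""
    dists = np.linalg.norm(points - query, axis=1)   # the assumed Euclidean metric
    nearest = np.argsort(dists)[:k]                  # indices of the k closest examples
    return 1 if labels[nearest].sum() > 0 else -1    # majority vote over +1/-1 labels

points = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]])
labels = np.array([-1, -1, +1, +1])
print(knn_classify(np.array([0.8, 0.9]), points, labels))   # -> 1
```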
Separating Hyperplane

Perceptron: output = sign( Σ xi wi )

Find the weights w1, ..., wn.

Limited representation: only linearly separable functions.

[Figure: single-layer network with inputs x1 ... xn and weights w1 ... wn]
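
A minimal perceptron sketch. The classic mistake-driven update rule (w += y·x on each error) is assumed here, since the slide only shows the decision rule; the toy data are illustrative:

```python
import numpy as np

def train_perceptron(X, y, epochs=20):
    """Find weights w so that sign(sum_i x_i w_i) matches the labels y."""
    Xb = np.hstack([X, np.ones((len(X), 1))])   # fold a bias term into the weights
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(Xb, y):
            if np.sign(w @ xi) != yi:           # on a mistake, nudge w toward the example
                w += yi * xi
    return w

# Linearly separable toy data (the only case a perceptron can represent).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [2, 2]], dtype=float)
y = np.array([-1, -1, -1, +1, +1])
w = train_perceptron(X, y)
print(np.sign(np.hstack([X, np.ones((5, 1))]) @ w))   # matches y
```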
Neural Networks

Sigmoidal gates: a = Σ xi wi and output = 1/(1 + e^(-a))

[Figure: network with inputs x1 ... xn]

Learning by "Back Propagation" of errors
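
A minimal sketch of a single sigmoidal gate trained by gradient descent, i.e., back propagation reduced to one unit; the squared-error loss, learning rate, and OR-function toy data are illustrative assumptions:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))    # output = 1 / (1 + e^-a)

# Toy task: learn the OR of two binary inputs, targets in {0, 1}.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0.0, 1.0, 1.0, 1.0])

w, b, lr = np.zeros(2), 0.0, 1.0
for _ in range(2000):
    a = X @ w + b                         # a = sum_i x_i w_i (plus a bias)
    out = sigmoid(a)
    delta = (out - t) * out * (1 - out)   # error propagated back through the sigmoid
    w -= lr * X.T @ delta
    b -= lr * delta.sum()

print(np.round(sigmoid(X @ w + b), 2))    # close to [0, 1, 1, 1]
```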


Decision Trees

[Figure: example decision tree; root test "x1 > 5", one branch a leaf +1,
the other a test "x6 > 2" with leaves +1 and −1]
Decision Trees

Top-down construction:

Construct the tree greedily, using a local index function,
as sketched below.
Gini index: G(x) = x(1 − x), entropy H(x), ...

Bottom-up model selection:

Prune the decision tree
while maintaining low observed error.
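
A minimal sketch of the greedy split choice, using the slide's index G(x) = x(1 − x) on the fraction of positive labels in each branch; the candidate thresholds and toy data are illustrative:

```python
import numpy as np

def gini(labels):
    """Local index G(p) = p(1 - p), p = fraction of +1 labels."""
    if len(labels) == 0:
        return 0.0
    p = np.mean(labels == 1)
    return p * (1 - p)

def best_split(X, y):
    """Greedily pick the (feature, threshold) with the lowest weighted index."""
    best = (None, None, float("inf"))
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            left, right = y[X[:, j] <= thr], y[X[:, j] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best[2]:
                best = (j, thr, score)
    return best

X = np.array([[6, 1], [7, 3], [2, 5], [3, 1]], dtype=float)
y = np.array([+1, +1, -1, -1])
print(best_split(X, y))   # -> (0, 3.0, 0.0): splitting on feature 0 at 3 is pure
```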
Decision Trees

• Limited representation

• Highly interpretable

• Efficient training and retrieval algorithms

• Smart cost/complexity pruning

• Aim: find a small decision tree with low observed error
Support Vector Machine

[Figure: data projected from n dimensions to m dimensions]
Support Vector Machine

Project data to a high dimensional space.

Use a hyperplane in the LARGE space.

Choose a hyperplane with a large MARGIN.

[Figure: + and − examples separated by a large-margin hyperplane]
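
A minimal sketch of these three steps using scikit-learn's SVC, an assumed dependency not named on the slide; the RBF kernel plays the role of the projection to a high-dimensional space, and the fitted hyperplane there maximizes the margin. The ring-shaped toy data are illustrative:

```python
import numpy as np
from sklearn.svm import SVC

# Toy data that no hyperplane separates in the original 2-D space:
# +1 near the origin, -1 on a surrounding ring.
rng = np.random.default_rng(0)
inner = rng.normal(0, 0.5, (20, 2))
angles = rng.uniform(0, 2 * np.pi, 20)
outer = np.c_[3 * np.cos(angles), 3 * np.sin(angles)]
X = np.vstack([inner, outer])
y = np.r_[np.ones(20), -np.ones(20)]

# The RBF kernel implicitly projects into a very high-dimensional space,
# where SVC chooses the separating hyperplane with the largest margin.
clf = SVC(kernel="rbf", C=1.0).fit(X, y)
print(clf.predict([[0.1, 0.2], [3.0, 0.0]]))            # -> [ 1. -1.]
print(len(clf.support_vectors_), "support vectors define the margin")
```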
Reinforcement Learning

• Main idea: learning with a delayed reward

• Uses dynamic programming and supervised learning

• Addresses problems that cannot be addressed by regular
supervised methods

• E.g., useful for control problems
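
A minimal tabular Q-learning sketch on a toy chain of states, where the reward arrives only at the final state, i.e., it is delayed. Q-learning is one standard algorithm of this kind (the slide does not name one), and the chain layout, learning rate, and discount are illustrative:

```python
import numpy as np

# Chain MDP: states 0..4, actions 0 = left, 1 = right.
# Only reaching state 4 pays reward 1, so the reward is delayed.
n_states, n_actions, goal = 5, 2, 4

def step(s, a):
    s2 = min(s + 1, goal) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == goal else 0.0), s2 == goal

Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.5, 0.9, 0.2
rng = np.random.default_rng(0)

for _ in range(500):                                   # episodes
    s, done = 0, False
    while not done:
        a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[s]))
        s2, r, done = step(s, a)
        # Dynamic-programming style update toward r + gamma * max_a' Q(s', a').
        Q[s, a] += alpha * (r + gamma * np.max(Q[s2]) - Q[s, a])
        s = s2

print(np.argmax(Q[:goal], axis=1))   # learned policy: move right, i.e. [1 1 1 1]
```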
Genetic Programming

A search method. Example: decision trees.

Local mutation operations: change a node in a tree.

Cross-over operations: replace a subtree by another tree.

Keep trees with low observed error, i.e., keep the "best" candidates.
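
A minimal genetic-programming sketch over decision trees, matching the operations listed above; the tree encoding, mutation and cross-over probabilities, and population sizes are all illustrative choices (cross-over here splices one parent in as a subtree of the other, a simplification):

```python
import random

# A candidate is ("leaf", label) or ("split", feature, threshold, left, right).

def predict(tree, x):
    if tree[0] == "leaf":
        return tree[1]
    _, j, thr, left, right = tree
    return predict(left, x) if x[j] <= thr else predict(right, x)

def random_tree(depth=2):
    if depth == 0 or random.random() < 0.3:
        return ("leaf", random.choice([-1, +1]))
    return ("split", random.randrange(2), random.uniform(0, 10),
            random_tree(depth - 1), random_tree(depth - 1))

def mutate(tree):
    """Local mutation: regrow one randomly chosen node of the tree."""
    if tree[0] == "leaf" or random.random() < 0.3:
        return random_tree(depth=1)
    _, j, thr, left, right = tree
    if random.random() < 0.5:
        return ("split", j, thr, mutate(left), right)
    return ("split", j, thr, left, mutate(right))

def crossover(a, b):
    """Cross-over: replace a random subtree of a by (a piece of) b."""
    if a[0] == "leaf" or random.random() < 0.3:
        return b
    _, j, thr, left, right = a
    if random.random() < 0.5:
        return ("split", j, thr, crossover(left, b), right)
    return ("split", j, thr, left, crossover(right, b))

def error(tree, data):
    return sum(predict(tree, x) != y for x, y in data)

# Toy target: +1 iff the first feature exceeds 5.
data = [((x0, x1), +1 if x0 > 5 else -1) for x0 in range(11) for x1 in (0, 5)]

random.seed(0)
pop = [random_tree() for _ in range(50)]
for _ in range(30):                                    # generations
    pop.sort(key=lambda t: error(t, data))             # keep the "best" candidates
    survivors = pop[:20]
    pop = (survivors
           + [mutate(random.choice(survivors)) for _ in range(15)]
           + [crossover(random.choice(survivors), random.choice(survivors))
              for _ in range(15)])

best = min(pop, key=lambda t: error(t, data))
print(error(best, data), "errors on the toy data")     # typically 0 or close to it
```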
Unsupervised learning: Clustering
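
The slides illustrate clustering only with figures. As one concrete instance, here is a minimal k-means sketch, assuming numpy; k-means is a common clustering algorithm chosen here for illustration (the slides do not name one), and the two-blob data are made up:

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Alternate between assigning points to the nearest center and re-averaging."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        # Move each center to the mean of its assigned points.
        centers = np.array([X[assign == j].mean(axis=0) for j in range(k)])
    return assign, centers

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (20, 2)), rng.normal(5, 0.5, (20, 2))])
assign, centers = kmeans(X, k=2)
print(centers.round(1))   # typically near the true group means (0,0) and (5,5)
```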
Data Science vs. Machine Learning and AI

| Artificial Intelligence | Machine Learning | Data Science |
| --- | --- | --- |
| Includes Machine Learning. | Subset of Artificial Intelligence. | Includes various data operations. |
| Combines large amounts of data through iterative processing and intelligent algorithms to help computers learn automatically. | Uses efficient programs that can use data without being explicitly told to do so. | Works by sourcing, cleaning, and processing data to extract meaning from it for analytical purposes. |
| Popular tools: TensorFlow, Scikit-learn, Keras. | Popular tools: Amazon Lex, IBM Watson Studio, Microsoft Azure ML Studio. | Popular tools: SAS, Tableau, Apache Spark, MATLAB. |
| Uses logic and decision trees. | Uses statistical models. | Deals with structured and unstructured data. |
| Chatbots and voice assistants are popular applications. | Recommendation systems (e.g., Spotify) and facial recognition are popular examples. | Fraud detection and healthcare analysis are popular examples. |
