Machine Learning: Joaquín F Sánchez

1) The document discusses machine learning processes including model learning, induction from data, approaches to learning such as probabilistic and geometrical methods, and optimization techniques. 2) It also covers topics like supervised vs unsupervised learning, evaluation of training vs generalization error, overfitting and underfitting, and the use of regularization to control model complexity. 3) The intended audience appears to be students or others new to machine learning, as it provides a broad overview of foundational concepts in clear, non-technical language.
Machine Learning

Joaquín F Sánchez

Software Engineering Program

Faculty of Engineering
Universidad Manuela Beltrán
Page to review
shorturl.at/dkETX
Data and models

Machine Learning (ML)
Fabio González, PhD, An Introduction to Machine Learning
Machine Learning with Text Data



The fourth paradigm




Machine Learning (ML)

Construction and study of systems that can learn from data


Main problem: to find patterns, relationships, and regularities in data that make it possible to build descriptive and predictive models.
Related fields:
Statistics
Pattern recognition and computer vision
Data mining and knowledge discovery
Data analytics



Brief history
Fisher’s linear discriminant (Fisher, 1936)
Artificial neuron model (McCulloch & Pitts, 1943)
Perceptron (Rosenblatt, 1957; Minsky & Papert, 1969)
Probably approximately correct learning (Valiant, 1984)
Multilayer perceptron and backpropagation (Rumelhart et al., 1986)
Decision trees (Quinlan, 1987)
Bayesian networks (Pearl, 1988)
Support vector machines (Cortes & Vapnik, 1995)
Efficient MLP learning, deep learning (Hinton et al., 2007)


Supervised learning

Fundamental problem: to find a function that relates a set of inputs to a set of outputs
Typical problems:
Classification
Regression
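As a minimal sketch of the two typical problems, the Python snippet below learns a function from labeled (input, output) pairs in both settings. The data, the helper names `fit_line` and `nearest_neighbor_label`, and the choice of least squares and 1-nearest-neighbor are assumptions made for illustration only; the slides do not prescribe a particular method.

```python
# Toy supervised learning: learn a function from labeled (input, output) pairs.
# All data below is made up for illustration.

def fit_line(xs, ys):
    """Regression: least-squares fit of y = a * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

def nearest_neighbor_label(train, x):
    """Classification: return the label of the closest training input."""
    _, closest_label = min(train, key=lambda pair: abs(pair[0] - x))
    return closest_label

# Regression: outputs are real numbers.
a, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
prediction = a * 4.0 + b  # predicted output for the unseen input 4.0

# Classification: outputs are discrete labels.
train = [(1.0, "small"), (2.0, "small"), (8.0, "large"), (9.0, "large")]
label = nearest_neighbor_label(train, 7.5)
```

In both cases the learned function is then applied to an input that was not in the training set, which is the point of supervised learning.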



Unsupervised learning
There are no labels for the training samples
Fundamental problem: to find the underlying structure of a training data set
Typical problems: clustering, segmentation, dimensionality reduction, latent topic analysis
When only some samples have labels, the setting is called semi-supervised learning
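Clustering is perhaps the easiest of these problems to illustrate. The sketch below runs Lloyd's k-means algorithm on unlabeled one-dimensional points; the data, the fixed initial centers, and the function name `kmeans_1d` are assumptions chosen to keep the example small and deterministic.

```python
def kmeans_1d(points, centers, iterations=10):
    """Lloyd's algorithm in one dimension with fixed initial centers."""
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda k: abs(p - centers[k]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else centers[k]
            for k, c in enumerate(clusters)
        ]
    return centers, clusters

# Unlabeled, made-up data with two visible groups.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
centers, clusters = kmeans_1d(points, centers=[0.0, 6.0])
```

No labels are used anywhere: the two groups are recovered from the geometry of the points alone.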



The machine learning process
Model learning


Model induction from data

Learning is an ill-posed problem: the same problem can have more than one solution, and solutions are sensitive to small changes in the problem
It is necessary to make additional assumptions about the kind of pattern that we want to learn
Hypothesis space: the set of valid patterns that the learning algorithm can learn
Occam’s razor: "All things being equal, the simplest solution tends to be the best one."


Approaches to learning

Probabilistic:
Generative models: model P(Y, X)
Discriminative models: model P(Y | X)
Geometrical:
Manifold learning: model the geometry of the space where the data lives
Max-margin learning: model the separation between the classes
Optimization:
Energy/loss/risk minimization
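The two probabilistic approaches are connected by a standard identity, written out here as a reminder rather than as part of the original slides. A generative model of the joint distribution can always be turned into the conditional that a discriminative model estimates directly, by normalizing over the possible outputs (the sum becomes an integral for continuous outputs):

```latex
P(Y = y \mid X = x) = \frac{P(Y = y, X = x)}{\sum_{y'} P(Y = y', X = x)}
```

The converse does not hold: P(Y | X) alone says nothing about how the inputs X are distributed.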

Learning as optimization
General optimization problem:

$$\min_{f \in H} L(f, D)$$

with H: hypothesis space, D: training data, L: loss/error

Example, logistic regression:
Hypothesis space:

$$y(x) = P(C_+ \mid x) = \sigma(w^T x)$$

Cross-entropy error:

$$E(w) = -\ln p(t \mid w) = -\sum_{n=1}^{\ell} \left[ t_n \ln y_n + (1 - t_n) \ln(1 - y_n) \right]$$
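The logistic regression example can be sketched in a few lines of Python: the cross-entropy error is minimized over the weight vector w by batch gradient descent. The data set, the learning rate, the number of steps, and the helper names are assumptions for illustration. The gradient used, the sum over n of (y_n - t_n) x_n, is the standard gradient of the cross-entropy error for this model.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def cross_entropy(w, X, t):
    """E(w) = -sum_n [t_n ln y_n + (1 - t_n) ln(1 - y_n)]."""
    total = 0.0
    for x, target in zip(X, t):
        y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        total -= target * math.log(y) + (1 - target) * math.log(1 - y)
    return total

def gradient_step(w, X, t, learning_rate):
    """One batch gradient-descent step; the gradient is sum_n (y_n - t_n) x_n."""
    grad = [0.0] * len(w)
    for x, target in zip(X, t):
        y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        for j, xj in enumerate(x):
            grad[j] += (y - target) * xj
    return [wj - learning_rate * gj for wj, gj in zip(w, grad)]

# Made-up 1-D data; the first component of each input is a constant 1 (bias).
X = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]]
t = [0, 0, 1, 1]

w = [0.0, 0.0]
initial_error = cross_entropy(w, X, t)
for _ in range(100):
    w = gradient_step(w, X, t, learning_rate=0.1)
final_error = cross_entropy(w, X, t)
```

Here the hypothesis space H is the set of functions σ(wᵀx) indexed by w, D is the pair (X, t), and L is the cross-entropy error, so the loop is one concrete instance of the general optimization problem.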

Methods

Strategies
Optimization (non-linear, convex, etc)
Stochastic gradient descent
Kernel methods
Maximum likelihood estimation
Maximum a posteriori estimation
Bayesian estimation (variational learning, Gaussian processes)
Expectation maximization
Maximum entropy models
Sampling (Markov Chain Monte Carlo, particle filtering)
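To make the difference between two of these strategies concrete, the sketch below compares maximum likelihood and maximum a posteriori estimation for the bias of a coin. The Bernoulli model, the Beta(2, 2) prior, and the data are assumptions chosen for illustration; the MAP formula is the mode of the Beta posterior and is valid when both prior parameters are greater than 1.

```python
def bernoulli_mle(heads, tosses):
    """Maximum likelihood estimate of P(heads)."""
    return heads / tosses

def bernoulli_map(heads, tosses, alpha, beta):
    """MAP estimate under a Beta(alpha, beta) prior (mode of the posterior)."""
    return (heads + alpha - 1) / (tosses + alpha + beta - 2)

heads, tosses = 3, 3  # made-up data: three heads in three tosses
mle = bernoulli_mle(heads, tosses)
map_estimate = bernoulli_map(heads, tosses, alpha=2, beta=2)
```

With so little data the maximum likelihood estimate jumps to 1.0, while the prior pulls the MAP estimate back to 0.8.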



Evaluation


Training error vs generalization error

Training error:

$$\sum_{i=1}^{\ell} L(f_w, S_i)$$

Generalization error:

$$E[L(f_w, S)]$$

Cross validation
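Cross validation estimates the generalization error by repeatedly holding out part of the data. The following is a minimal k-fold sketch under stated assumptions: the targets are made up, the "model" is just the mean of the training targets, the loss is squared error, and the folds are contiguous (in practice the samples are usually shuffled first).

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, validation
        start += size

def mean_squared_error(prediction, ys):
    return sum((prediction - y) ** 2 for y in ys) / len(ys)

# Made-up targets; the "model" is simply the mean of the training targets.
ys = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]

# Training error: the model is evaluated on the same data it was fitted to.
model = sum(ys) / len(ys)
training_error = mean_squared_error(model, ys)

# Cross-validation error: each fold is evaluated by a model that never saw it.
validation_errors = []
for train, validation in k_fold_splits(len(ys), k=3):
    fold_model = sum(ys[i] for i in train) / len(train)
    validation_errors.append(
        mean_squared_error(fold_model, [ys[i] for i in validation])
    )
cv_error = sum(validation_errors) / len(validation_errors)
```

On this data the cross-validation error comes out larger than the training error, which illustrates why the training error tends to be an optimistic estimate of the generalization error.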

Overfitting and underfitting



Regularization

Controls the complexity of a learned model
Usually, the regularization term corresponds to a norm of the parameter vector (L1 and L2 are the most common)
In some cases, it is equivalent to including a prior and finding a MAP solution.
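A one-parameter example shows the effect of an L2 term. The sketch below fits a line through the origin with a squared-error loss plus a penalty lam * w^2, for which the minimizer has a closed form. The data, the value of `lam`, and the function name `ridge_slope` are assumptions for illustration. For this loss, the L2 penalty corresponds to a zero-mean Gaussian prior on w, which is one instance of the MAP equivalence mentioned above.

```python
def ridge_slope(xs, ys, lam):
    """Minimize sum_n (y_n - w * x_n)^2 + lam * w^2 over the scalar w."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]  # made-up data, roughly y = 2x

w_unregularized = ridge_slope(xs, ys, lam=0.0)
w_regularized = ridge_slope(xs, ys, lam=10.0)
```

Increasing `lam` shrinks the learned parameter toward zero, trading a worse fit on the training data for a simpler model.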

Feature extraction
Features
Features represent our prior knowledge of the problem
They depend on the type of data
Specialized features exist for practically any kind of data (images, video, sound, speech, text, web pages, etc.)
Medical imaging:
Standard computer vision features (color, shape, texture, edges, local-global, etc.)
Specialized features tailored to the problem at hand
New trend: learning features from data
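For text data, one of the simplest hand-designed features is the bag of words, sketched below. The corpus, the whitespace tokenization, and the function name `bag_of_words` are assumptions for illustration; they are not taken from the slides.

```python
from collections import Counter

def bag_of_words(document, vocabulary):
    """Map a text to a vector of word counts over a fixed vocabulary."""
    counts = Counter(document.lower().split())
    return [counts[word] for word in vocabulary]

documents = ["the cat sat", "the dog sat on the cat"]  # made-up corpus
vocabulary = sorted({word for doc in documents for word in doc.lower().split()})
vectors = [bag_of_words(doc, vocabulary) for doc in documents]
```

Each document becomes a fixed-length numeric vector that a learning algorithm can consume. The choice to ignore word order is exactly the kind of prior knowledge that a feature encodes.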

Feature learning

Unsupervised feature learning


Model application
High-throughput data analytics

Large scale machine learning (big-data):


Large number of samples
Large samples (whole-slide images, 4D high-resolution
volumes)
Scalable learning algorithms (on-line learning)
Distributed computing architectures (Hadoop, Spark)
GPGPU computing and multicore architectures
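On-line learning is what makes an algorithm scalable in the number of samples: the model is updated one sample at a time, so the full data set never has to fit in memory. The sketch below uses the perceptron update rule as an assumed example of such an algorithm; the data stream and the function names are hypothetical.

```python
def online_perceptron(stream, n_features):
    """Process (x, label) pairs one at a time; labels are +1 or -1."""
    w = [0.0] * n_features
    mistakes = 0
    for x, label in stream:
        score = sum(wi * xi for wi, xi in zip(w, x))
        if label * score <= 0:  # misclassified: update immediately
            w = [wi + label * xi for wi, xi in zip(w, x)]
            mistakes += 1
    return w, mistakes

def sample_stream():
    """Hypothetical data source; the first component of each input is a bias."""
    data = [([1.0, 2.0], 1), ([1.0, -1.5], -1), ([1.0, 3.0], 1), ([1.0, -2.0], -1)]
    for _ in range(5):
        yield from data

w, mistakes = online_perceptron(sample_stream(), n_features=2)
```

Because `stream` can be any iterator, the same function works unchanged whether the samples come from a list, a file read line by line, or a network source.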
Questions?
