Introduction to ML

This document introduces machine learning concepts. It defines machine learning as a field that gives computers the ability to improve automatically through experience and use of data. Machine learning uses data to discover patterns and make predictions without being explicitly programmed. There are three main types of machine learning: supervised learning, unsupervised learning, and reinforcement learning, which differ based on the type of feedback available to the learning system. The document discusses various applications of machine learning such as classification, prediction, and pattern recognition.

Introduction to Machine Learning
TC3002B
Basic Concepts
Introduction

2023FJ - [email protected] 2
What is learning?
– The ability to use percepts from the outside world
  – not only for reacting,
  – but also for improving actions in future events.
– Implies that we know when and how to use this new knowledge.
  – When: a pattern is detected.
  – How: an algorithm is created.

What is machine learning?
– Example:
  – Imagine a supermarket chain with a hundred stores selling groceries to millions of customers.
  – Each sale produces a lot of data that can be analyzed and converted into information.
  – This information can be used to give people suggestions when buying.
– If we knew who would buy an item, we would just write code for the computer to remind them.
– Because we do not know, we collect data and hope to extract enough information to recommend articles to people.

What is machine learning?
– Example:
  – In RoboCup, agents play soccer.
  – There are 11 players against 11 players.
  – Each team has its own strategy for playing soccer.
– If we knew which strategy a team is using, we would play a counter-attack strategy to stop them.
– Because we do not know their strategy, we collect data and try to extract enough information to detect their strategies.
– Once strategies are detected and classified, we could select the best strategy to exploit this knowledge.

What is ML? …
– The computer algorithm should be able to:
  – Identify patterns in the data (When)
  – Construct a good and useful approximation of the solution to the problem (How)

What is ML? …
– “Machine learning uses data and answers to discover rules behind a problem” Chollet (2017)
– “Machine learning is programming computers to optimize a performance criterion using example data or past experience.” Alpaydin, E. (2004)
  – There is a model defined up to some parameters.
  – Learning is the execution of a computer program to optimize the parameters of the model using training data or past experience.
– Two types of models:
  – Predictive model: makes predictions about the future.
  – Descriptive model: gains knowledge from data.
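
As a minimal sketch of "a model defined up to some parameters", the Python below fits a straight line y ≈ w*x + b by least squares. The data points and the names `fit_line`, `w` and `b` are assumptions made for illustration; they do not come from the slides.

```python
# Hypothetical training data: inputs xs and observed outputs ys (here exactly y = 2x + 1).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def fit_line(xs, ys):
    """Return the parameters (w, b) that minimize the squared error of y ≈ w*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


# "Learning" here is the computation of the parameters from the training data.
w, b = fit_line(xs, ys)

# Predictive use of the model: estimate the output for an unseen input.
prediction = w * 5.0 + b
```

The same fitted parameters also serve a descriptive purpose: `w` summarizes how much the output changes per unit of input.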

What is ML? …
– “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.” Mitchell, T. (1997)
– Example: handwriting recognition
  – Task T: recognizing and classifying handwritten words within images.
  – Performance measure P: percent of words correctly classified.
  – Training experience E: a database of handwritten words with given classifications.

Learning Agent

Design of a learning element
1. Which components of the performance element should be learned?
2. What feedback is available to learn these components?
3. What representation is used for the components?

Components of the performance element
– What can be learned?
  – A direct mapping from conditions on the current state to actions.
  – A means to infer relevant properties of the world from the percept sequence.
  – Information about the way the world evolves and the results of possible actions the agent can take.
  – Utility information indicating the desirability of world states.
  – Action-value information indicating the desirability of actions.
  – Goals that describe classes of states whose achievement maximizes the agent’s utility.

Feedback
– Components can be learned from appropriate feedback.
  – Example: training Tae Kwon Do, driving a taxi.
– Type of feedback:
  – The most important factor in determining the nature of the learning problem.
– Three cases:
  1. Supervised learning
  2. Unsupervised learning
  3. Reinforcement learning

Supervised Learning
– Learning a function from examples of its inputs and outputs.
  – There is an input X, an output Y, and the task is to learn the mapping from input to output.
– Output values can be provided
  – By a supervisor: someone feeds in the output.
  – By the environment: detected by sensors.
– Examples:
  – Learn a condition-action rule for punching.
  – Learn to differentiate between a dog and a cat.
  – Regression.
  – Classification.
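
One way to picture learning a mapping from input X to output Y is a 1-nearest-neighbour classifier, sketched below in Python. The features (height and weight), the example values and the name `predict` are hypothetical; the slides do not prescribe any particular algorithm.

```python
import math

# Hypothetical labelled examples provided by a supervisor:
# input X = (height_cm, weight_kg), output Y = label.
training_set = [
    ((25.0, 4.0), "cat"),
    ((23.0, 3.5), "cat"),
    ((60.0, 30.0), "dog"),
    ((55.0, 25.0), "dog"),
]


def predict(features):
    """Return the label of the closest training example (1-nearest neighbour)."""
    nearest = min(training_set, key=lambda example: math.dist(example[0], features))
    return nearest[1]


label_small = predict((24.0, 4.2))    # close to the cat examples
label_large = predict((58.0, 28.0))   # close to the dog examples
```

Because the outputs are discrete labels this is a classification task; with numeric outputs the same input-output setting would be a regression task.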

Unsupervised learning
– Learning patterns in the input when no specific output values are supplied.
– Aim: to find regularities in the input.
– Example:
  – Learn to separate colors.
  – Learn when it might rain.
  – Learn how to detect people that will not pay their credit cards.
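
Finding regularities without output values can be sketched with a tiny clustering routine. The Python below runs a two-cluster k-means on one-dimensional points; the data, the function name `two_means` and the choice of k-means itself are assumptions for illustration only.

```python
def two_means(points, iterations=10):
    """Group 1-D points into two clusters; no labels are used."""
    # Deterministic start: one centroid at each end of the data range.
    centroids = [min(points), max(points)]
    clusters = [[], []]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[], []]
        for p in points:
            nearest = 0 if abs(p - centroids[0]) <= abs(p - centroids[1]) else 1
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical unlabelled measurements that happen to form two groups.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
centroids, clusters = two_means(points)
```

The routine is never told which group a point belongs to; the grouping emerges from the input alone.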

Semi-supervised learning
– A mix between supervised and unsupervised learning.
– Some data is labelled, usually a very small part.
– Labelled data is used to create more data.
– The learner learns to:
  – Generate labelled data, and
  – Detect regularities in the input.

Reinforcement learning
– The output of the system is a sequence of actions.
– Uses rewards to guide the sequence of actions.
– These actions are part of a policy.
  – A single action is not important.
  – The policy is what must be learned.
– The agent must learn from reinforcement which actions are best, i.e. the policy.
– Examples:
  – Playing chess.
  – Driving politely.
  – Robot navigation.
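
Learning a policy from rewards can be sketched with tabular Q-learning on a toy navigation task. Everything below is an assumption for illustration: the five-cell corridor, the reward of 1.0 at the goal, the constants `ALPHA`, `GAMMA` and `EPSILON`, and the choice of Q-learning (the slides do not name an algorithm).

```python
import random

N_STATES = 5            # corridor cells 0..4; cell 4 is the goal
GOAL = N_STATES - 1
ACTIONS = (-1, +1)      # index 0 = move left, index 1 = move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2


def step(state, action_index):
    """Apply an action; the reward is 1.0 only when the goal is reached."""
    next_state = min(max(state + ACTIONS[action_index], 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def choose_action(q, state, rng):
    """Epsilon-greedy choice with random tie-breaking."""
    if rng.random() < EPSILON:
        return rng.randrange(len(ACTIONS))
    best = max(q[state])
    return rng.choice([a for a, value in enumerate(q[state]) if value == best])


def train(episodes=200, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            action = choose_action(q, state, rng)
            next_state, reward = step(state, action)
            # Move the estimate toward reward + discounted best future value.
            target = reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# The learned policy: the preferred action in every non-goal cell.
policy = ["left" if row[0] > row[1] else "right" for row in q_table[:GOAL]]
```

No single move is rewarded except the last one; the policy for the earlier cells is learned because the reward propagates backwards through the value estimates.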

Representation of the learned information
– Polynomials
– Propositional logic
– Predicate calculus
– Bayesian networks
– Neural networks
– Etc.

Applications of machine learning
– Learning associations
  – Learn how people associate elements (e.g. buying groceries)
– Classification
  – Learn to classify elements in different categories
– Prediction
  – Learn to predict if some action will happen
– Pattern recognition
  – Learn to find familiar patterns (characters, faces, objects, etc.)
– Knowledge extraction
  – Learning a rule from data: it explains the data
  – Rules are a form of data compression
– Outlier detection
  – Data that does not belong to a class
– Regression problems
  – Learn the curve that best fits a function to a set of points

ML is multidisciplinary
– Artificial Intelligence
– Bayesian methods
– Computational complexity theory
– Control theory
– Information theory
– Philosophy
– Psychology and neurobiology
– Statistics

Designing a learning system
Introduction

ML Process
– Data Collection
– Data Preparation
– Model Fitting
– Model Evaluation
– Hyper-parameter tuning

Designing a learning system
1. Choosing the training experience
   1. Feedback
   2. Control of sequence of examples
   3. Distribution of examples
2. Choosing the target function
   1. Function that is operational
3. Choosing a representation for the target function
   1. Expressive representation
4. Choosing a function approximation algorithm
   1. Estimating training values
   2. Adjusting the weights

How to build a dataset

Structured Data
ML models learn from examples.
Each example is called an instance or pattern.
A dataset is formed from multiple examples.
Structured data is organized in rows and columns.
A column is called a feature.
Images, videos and text are called unstructured data.
https://machinelearningmastery.com/wp-content/uploads/2013/12/Table-of-Data-Showing-an-Instance-Feature-and-Train-Test-Datasets.png

Dataset organization and division
The training set is usually 80% of the original set.
The test set is usually 20%.
The validation set is usually 20% of the training set.

https://miro.medium.com/max/585/0*lbveKaL-MGRgppD8.png
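
The usual proportions above can be sketched as a small Python helper. The function name `split_dataset`, its parameters and the fixed seed are assumptions; only the 80/20 and 20%-of-training proportions come from the slide.

```python
import random


def split_dataset(examples, test_fraction=0.2, validation_fraction=0.2, seed=42):
    """Shuffle the examples, hold out a test set, then carve a validation set out of the training part."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)

    n_test = round(len(shuffled) * test_fraction)
    test = shuffled[:n_test]
    remaining = shuffled[n_test:]          # the 80% training portion

    n_validation = round(len(remaining) * validation_fraction)
    validation = remaining[:n_validation]  # 20% of the training portion
    train = remaining[n_validation:]
    return train, validation, test


# 100 hypothetical examples, identified here just by their index.
train, validation, test = split_dataset(range(100))
```

With 100 examples this yields 64 for training, 16 for validation and 20 for testing, and no example appears in more than one set.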

Epoch, batch & iteration
– These terms matter when the data is too big to pass to the computer all at once.
– One epoch is when the entire dataset is passed forward and backward through the learning model exactly once.
– Batch size: the number of training examples in one batch; the dataset is divided into a number of batches (sets, or parts).
– Iterations: the number of batches needed to complete one epoch.
– The number of batches is equal to the number of iterations for one epoch.

https://towardsdatascience.com/epoch-vs-iterations-vs-batch-size-4dfb9c7ce9c9
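
The relationship between epochs, batch size and iterations can be checked with a few lines of Python. The dataset size, the batch size and the helper names `iterations_per_epoch` and `batches` are illustrative assumptions.

```python
import math


def iterations_per_epoch(n_examples, batch_size):
    """Number of batches (iterations) needed to see every example once."""
    return math.ceil(n_examples / batch_size)


def batches(dataset, batch_size):
    """Yield consecutive slices of the dataset, each with at most batch_size examples."""
    for start in range(0, len(dataset), batch_size):
        yield dataset[start:start + batch_size]


dataset = list(range(2000))   # 2000 hypothetical training examples
batch_size = 500
n_iterations = iterations_per_epoch(len(dataset), batch_size)

epochs = 3
total_updates = 0
for _ in range(epochs):
    for batch in batches(dataset, batch_size):
        total_updates += 1    # one iteration = one batch processed
```

With 2000 examples and a batch size of 500, one epoch takes 4 iterations, so 3 epochs take 12.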
Data Preprocessing
– Data Wrangling
  – Data might be in different files
  – Cleaning, structuring, enriching raw data
  – Assure quality and useful data
– Data Cleansing
  – Missing values (delete?)
  – Unwanted characters
  – Unwanted elements
– Data Preparation
  – Analysis and optimization of features
  – Select/remove features
  – Consider prediction needs and computation time

https://miro.medium.com/max/666/0*ScsuON73dMJDC9XO.png
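
A minimal sketch of the data cleansing step, assuming a small list of records with hypothetical `name` and `age` fields. Deleting rows with missing values is only one possible policy, as the question mark on the slide suggests.

```python
import re

# Hypothetical raw records with a missing value and unwanted characters.
raw_rows = [
    {"name": " Alice! ", "age": "34"},
    {"name": "Bob", "age": None},
    {"name": "Car#ol", "age": "29"},
]


def clean(rows):
    """Drop rows with missing values and strip unwanted characters from names."""
    cleaned = []
    for row in rows:
        # Missing values: this sketch simply deletes the row.
        if any(value is None or value == "" for value in row.values()):
            continue
        cleaned.append({
            # Unwanted characters: keep only letters and spaces, then trim.
            "name": re.sub(r"[^A-Za-z ]", "", row["name"]).strip(),
            "age": int(row["age"]),
        })
    return cleaned


clean_rows = clean(raw_rows)
```

Other policies, such as filling a missing value with the column mean, keep more data but can introduce bias; the right choice depends on the prediction needs.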

Complete data science pipeline

https://developer.ibm.com/articles/ba-intro-data-science-1/

Model Evaluation

Confusion Matrix
– N x N matrix
  – N = number of classes
– Evaluates the performance of a classification model
– Compares actual target values with predicted values
Figure axes: Actual values, Predicted values
https://www.python-course.eu/images/confusion.matrix_image.png
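
An N x N confusion matrix can be built by counting (actual, predicted) pairs. In the Python sketch below, rows are actual classes and columns are predicted classes; that orientation, the labels and the function name are assumptions, since conventions differ between sources.

```python
def confusion_matrix(actual, predicted):
    """Return (labels, matrix) where matrix[i][j] counts actual labels[i] predicted as labels[j]."""
    labels = sorted(set(actual) | set(predicted))
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return labels, matrix


# Hypothetical three-class example.
actual = ["cat", "dog", "bird", "cat", "dog", "bird"]
predicted = ["cat", "cat", "bird", "cat", "dog", "dog"]
labels, matrix = confusion_matrix(actual, predicted)
```

Correct predictions land on the main diagonal, so a good classifier produces a matrix whose off-diagonal entries are small.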

Binary confusion matrix
– True Positive (TP)
  – Predicted value matches the actual value
  – Both are positive
– True Negative (TN)
  – Predicted value matches the actual value
  – Both are negative
– False Positive (FP)
  – Type I error
  – Predicted value is wrong: predicted positive
  – Actual value is negative
– False Negative (FN)
  – Type II error
  – Predicted value is wrong: predicted negative
  – Actual value is positive
https://cdn.analyticsvidhya.com/wp-content/uploads/2020/04/Basic-Confusion-matrix.png
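
The four cells can be counted directly from paired labels. The Python below is a sketch with made-up labels, where 1 stands for the positive class; the function name is hypothetical.

```python
def binary_confusion_counts(actual, predicted, positive=1):
    """Count TP, TN, FP and FN for a binary classifier."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            # Predicted positive: correct (TP) or a Type I error (FP).
            counts["TP" if a == positive else "FP"] += 1
        else:
            # Predicted negative: a Type II error (FN) or correct (TN).
            counts["FN" if a == positive else "TN"] += 1
    return counts


actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]
counts = binary_confusion_counts(actual, predicted)
```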

Confusion matrix metrics
– Accuracy
  – Fraction of predictions the model classified correctly
  – $\text{Accuracy} = \frac{\text{Correct predictions}}{\text{Total number of predictions}}$
– For binary classification
  – $\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$
– Very simple, but does not take into consideration class imbalances and unevenly distributed data
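
The class-imbalance caveat is easy to see with numbers. In the hypothetical Python example below, only 5 of 100 cases are positive and a useless model predicts "negative" for everything.

```python
# Hypothetical imbalanced dataset: 5 positives, 95 negatives.
actual = [1] * 5 + [0] * 95
# A trivial model that always predicts the majority class.
predicted = [0] * 100

correct = sum(1 for a, p in zip(actual, predicted) if a == p)
accuracy = correct / len(actual)

# How many of the actual positives did the model find?
positives_found = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
```

The accuracy is 0.95 even though the model never identifies a single positive case, which is why accuracy alone can mislead on imbalanced data.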

Confusion matrix metrics…
– Precision
  – Proportion of predicted positives that are identified correctly
  – $\text{Precision} = \frac{TP}{TP + FP}$
– Recall (Sensitivity)
  – Proportion of actual positives that are identified correctly
  – $\text{Recall} = \frac{TP}{TP + FN}$
– Specificity
  – Proportion of actual negatives that are identified correctly
  – $\text{Specificity} = \frac{TN}{TN + FP}$
– Precision is used when FP is a higher concern than FN
  – Of the predicted positives, how many are really positive?
– Recall is used when there is a high cost associated with FN
  – How many positives were correctly classified?
  – A higher recall means more of the actual positive values are being identified
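
The three formulas translate directly into code. The counts used below (TP = 8, FP = 2, FN = 4, TN = 86) are made up for illustration.

```python
def precision(tp, fp):
    """Of the predicted positives, the fraction that are really positive."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Of the actual positives, the fraction that were identified (sensitivity)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Of the actual negatives, the fraction that were identified."""
    return tn / (tn + fp)


# Hypothetical confusion-matrix counts.
tp, fp, fn, tn = 8, 2, 4, 86

p = precision(tp, fp)      # 8 / 10
r = recall(tp, fn)         # 8 / 12
s = specificity(tn, fp)    # 86 / 88
```

Here precision is higher than recall: most predicted positives are right, but a third of the actual positives are missed.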

Confusion matrix metrics…
– F1 Score
  – Helps understand the balance between Precision and Recall
  – $F_1 = \frac{2}{\frac{1}{\text{Recall}} + \frac{1}{\text{Precision}}} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} = \frac{TP}{TP + \frac{1}{2}(FP + FN)}$
– Values range from 0 to 1
– A value close to 1 means a better model
– Used when
  – there is a need to balance these two metrics
  – it is not easy to decide whether Type I or Type II errors are preferable
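
The three equivalent forms of the F1 score can be checked numerically. The counts below are the same kind of made-up example (TP = 8, FP = 2, FN = 4).

```python
# Hypothetical confusion-matrix counts.
tp, fp, fn = 8, 2, 4

precision = tp / (tp + fp)
recall = tp / (tp + fn)

# Harmonic mean of precision and recall.
f1_harmonic = 2 / (1 / recall + 1 / precision)
# Product-over-sum form.
f1_product = 2 * precision * recall / (precision + recall)
# Form written directly in terms of the counts.
f1_counts = tp / (tp + 0.5 * (fp + fn))
```

All three expressions give 8/11, roughly 0.727, which lies between the recall (about 0.667) and the precision (0.8) and is pulled toward the lower of the two.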

ROC & AUC
– When a classifier is not reporting the values we desire, we can move the threshold for classification
– Moving the threshold can increase or decrease the recall, precision and specificity values
– ROC and AUC can help us determine the best threshold
  – Receiver Operating Characteristic (ROC)
  – Area Under the Curve (AUC)
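
The effect of moving the threshold can be sketched with a classifier that outputs a score per example. The scores, labels and the rule "predict positive when score >= threshold" are assumptions for illustration.

```python
def recall_and_specificity(scores, labels, threshold):
    """Classify with the given threshold and return (recall, specificity)."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 1)
    fn = sum(1 for p, y in zip(predictions, labels) if p == 0 and y == 1)
    tn = sum(1 for p, y in zip(predictions, labels) if p == 0 and y == 0)
    fp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 0)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical classifier scores and the true labels.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]

strict = recall_and_specificity(scores, labels, threshold=0.5)
lenient = recall_and_specificity(scores, labels, threshold=0.3)
```

Lowering the threshold from 0.5 to 0.3 raises recall from 0.5 to 1.0 but lowers specificity from 1.0 to 0.5, which is the trade-off the threshold controls.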

ROC
– Summarizes all confusion matrices produced with different thresholds
– The diagonal line is where the TP rate is equal to the FP rate
– Points above the diagonal represent a good classifier
– The best classifier would be at the top-left corner, where the FP rate is 0 and the TP rate is 1
– Each point on the graph corresponds to the threshold value used by the classifier to produce it
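
A ROC graph can be sketched by sweeping the threshold and recording one (FP rate, TP rate) point per confusion matrix. The scores and labels below are hypothetical, and the function name `roc_points` is an assumption.

```python
def roc_points(scores, labels):
    """Return one (false positive rate, true positive rate) point per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    # Start above every score (nothing predicted positive), then lower the threshold.
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical classifier scores and true labels.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
points = roc_points(scores, labels)
```

The curve runs from (0, 0), where nothing is predicted positive, to (1, 1), where everything is; points closer to (0, 1) correspond to better thresholds.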

AUC
– Helps to compare different ROC graphs
– The greater the value of the AUC, the better the model is at classifying that data
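
The area under a ROC curve can be approximated with the trapezoid rule over its points. The two sets of (FP rate, TP rate) points below are hypothetical: one for a reasonable classifier and one lying on the diagonal.

```python
def auc(points):
    """Area under a ROC curve given as (fpr, tpr) points, using the trapezoid rule."""
    ordered = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(ordered, ordered[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical ROC points for two classifiers.
classifier_a = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
classifier_b = [(0.0, 0.0), (1.0, 1.0)]   # on the diagonal: TP rate equals FP rate

auc_a = auc(classifier_a)
auc_b = auc(classifier_b)
```

Classifier A scores 0.75 and the diagonal classifier scores 0.5, so A would be preferred on this data.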

Issues in machine learning

Issues in machine learning
– What algorithms exist for learning general target functions from specific training examples?
– How much training data is sufficient?
– When and how can prior knowledge guide the process of generalizing from examples?
– What is the best strategy for choosing a useful training experience?
– What is the best way to reduce the learning task to one or more function approximation problems?
– Can the learner learn to represent the target function?

References
– Alpaydin, Ethem (2004). Introduction to Machine Learning. The MIT Press.
– Mitchell, Tom (1997). Machine Learning. WCB McGraw-Hill.
– Edwards, Gavin (2018). Machine Learning, an introduction. Towards Data Science (https://towardsdatascience.com/machine-learning-an-introduction-23b84d51e6d0)
– Starmer, Josh (2019). ROC and AUC clearly explained! StatQuest with Josh Starmer, YouTube Channel
