Session 01 - Introduction
In this session
• We will cover introductory aspects of Machine Learning.
25/06/2019
Machine Learning and Data Mining
• Data mining uses techniques developed in machine learning and statistics, but is put to different ends.
• It is carried out by a person, on a particular data set, with a goal in mind.
• Various techniques can be tested and validated.
2019 - ECI - International Summer School/Machine Learning - Dr Ivan Olier 3
• A system is a group of interdependent units such that they form a whole.
• A signal is any kind of measurable variable that carries information.
• From a STEM view, a system is an entity that makes operations over signals.
[Diagram: input and output signals crossing the boundary between a SYSTEM and its SURROUNDINGS]
Modelling
• Modelling is about building representations of systems.
[Diagram: a hypothesis 𝒴 = ℱ(𝒳), together with assumptions and data, yields a model 𝑌 = 𝑓(𝑋)]
• A model of a system is built by observing its input and output signals, which are collected in
the form of data.
• Then, the data is used to find a set of operations or rules that relates inputs and outputs.
• Model assumptions are always needed (e.g. linearity, data correlation, etc). There is no free
lunch! (No free lunch theorem, Wolpert and Macready, 1997)
• Data:
  • Variables (2): “yellowness” and “asymmetry”
  • Classes (2): “Banana” and “Pear”
  • Observations: ~ 100
[Diagram: fruit properties are fed to MACHINE LEARNING ALGORITHMS, which build a model that makes predictions on data]

Model complexity
[Plot: fitted curves through noisy data points, annotated “Just right”, “Low bias”, and “High variance”]
1. Exploration
• Preparation and collection of data
• Data cleaning
• Data transformation

3. Application/deployment
• Application to new instances/observations to generate predictions or estimates of the expected outcome
• Online repositories
• Different formats
• Databases
Data cleaning
• Check data consistency
• Handle missing values
https://archive.ics.uci.edu/ml/datasets/Automobile
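One common way to handle missing values is mean imputation. A minimal pure-Python sketch with invented data and column names (the Automobile dataset linked above encodes its missing entries differently):

```python
# Hedged sketch of one cleaning step: imputing missing numeric values
# (represented here as None) with the column mean. Data are invented.

def impute_mean(rows, column):
    """Replace None in `column` with the mean of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    mean = sum(observed) / len(observed)
    for r in rows:
        if r[column] is None:
            r[column] = mean
    return rows

cars = [
    {"horsepower": 111.0},
    {"horsepower": None},     # missing value to be imputed
    {"horsepower": 115.0},
]
impute_mean(cars, "horsepower")
print(cars[1]["horsepower"])  # → 113.0
```

Mean imputation is only one option; dropping incomplete rows or using a model-based imputer are common alternatives.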
[Diagram of dimensionality reduction: the original data matrix (data points × features) is transformed into a reduced matrix (data points × new features)]
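Dimensionality reduction can be sketched in its simplest form as feature selection: keep only the most informative columns. The variance criterion and the data below are illustrative choices, not the method in the diagram (which, like PCA, builds new features rather than selecting existing ones):

```python
# Minimal sketch of dimensionality reduction by feature selection:
# keep the k columns with the highest variance. Data are invented.

def variance(col):
    m = sum(col) / len(col)
    return sum((v - m) ** 2 for v in col) / len(col)

def reduce_features(data, k):
    """data: list of equal-length rows. Returns rows keeping only the
    k highest-variance feature columns (original order preserved)."""
    n_features = len(data[0])
    cols = [[row[j] for row in data] for j in range(n_features)]
    keep = sorted(range(n_features), key=lambda j: variance(cols[j]),
                  reverse=True)[:k]
    keep.sort()
    return [[row[j] for j in keep] for row in data]

original = [[1.0, 100.0, 5.0],
            [1.0, 200.0, 6.0],
            [1.0, 300.0, 7.0]]   # first column is constant → dropped
reduced = reduce_features(original, 2)
print(reduced)  # → [[100.0, 5.0], [200.0, 6.0], [300.0, 7.0]]
```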
Learning tasks
• Supervised learning
• Unsupervised learning
Supervised learning
• Each training case consists of an input vector 𝑥 and a target output (response) variable 𝑡.
• There are two types of supervised learning:
Regression
e.g. housing price prediction
[Plot: price (£) in 1000’s (0–400) against size in feet² (0–2500), with data points and a fitted curve]
Learning task: supervised learning
Regression analysis: to predict continuous valued output (the price in the current example)
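The regression task above can be sketched with a least-squares line fit; the (size, price) pairs are invented for illustration:

```python
# Hedged sketch of regression: fit a line price ≈ a*size + b by ordinary
# least squares, then predict a continuous output for an unseen size.

def fit_line(pairs):
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    a = sum((x - mx) * (y - my) for x, y in pairs) / \
        sum((x - mx) ** 2 for x, _ in pairs)
    return a, my - a * mx

# (size in feet², price in £1000s) — illustrative values only
houses = [(500, 100.0), (1000, 200.0), (1500, 300.0), (2000, 400.0)]
a, b = fit_line(houses)
print(a * 1250 + b)  # predicted price for a 1250 ft² house → 250.0
```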
Classification
e.g. with 2 variables
[Scatter plot: tumour type 1 and tumour type 2 instances plotted against age and a second variable]
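A minimal sketch of a classifier for a two-variable task like the one above, using a nearest-centroid rule; the points, labels, and scales are invented:

```python
# Nearest-centroid classification on two invented variables
# (e.g. age and tumour size). Training points are made up.

def centroid(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(2))

def classify(point, centroids):
    """Assign `point` to the class with the closest centroid."""
    def dist2(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return min(centroids, key=lambda label: dist2(point, centroids[label]))

training = {
    "type 1": [(30, 1.0), (35, 1.2), (40, 0.8)],
    "type 2": [(60, 3.0), (65, 3.5), (70, 2.8)],
}
centroids = {label: centroid(pts) for label, pts in training.items()}
print(classify((33, 1.1), centroids))  # → type 1
```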
Market segmentation
Aim: create subsets of consumers with common needs, interests, spending habits, etc., and then design and implement strategies to target them.
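Segmentation of this kind is typically done with clustering. A tiny k-means sketch on one invented feature (monthly spend); the numbers are illustrative:

```python
# Hedged sketch of unsupervised segmentation: k-means on a single
# invented feature. Two clearly separated spend groups are recovered.
import random

def kmeans_1d(values, k, iters=20, seed=0):
    rng = random.Random(seed)
    centres = rng.sample(values, k)      # initialise from the data
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # move each centre to its cluster mean (keep it if cluster empty)
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return sorted(centres)

spend = [20, 25, 30, 22, 480, 500, 510, 495]
print(kmeans_1d(spend, 2))  # → [24.25, 496.25]: low and high spenders
```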
Model selection
Each point comprises a sample of the input variable 𝑥 along with the corresponding target variable 𝑡 [*].
[*] Bishop, Pattern Recognition and Machine Learning. Springer, 2006. [page 6]
Model selection
• Example: Polynomial Curve Fitting
Plots of polynomials having various orders 𝑀 (red curves) fitted to the previous dataset [*]:
• 𝑀 = 0 and 𝑀 = 1 give rather poor fits (underfitting).
• 𝑀 = 3 seems to give the best fit.
• 𝑀 = 9 fits the data perfectly; however, the fitted curve gives a very poor representation of the function sin 2𝜋𝑥 (overfitting).
[*] Bishop, Pattern Recognition and Machine Learning. Springer, 2006. [page 7]
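The underfitting/overfitting contrast can be reproduced in a few lines: a constant model (𝑀 = 0) versus a degree N−1 interpolating polynomial, both fitted to invented samples of sin(2πx) with a fixed offset standing in for noise:

```python
# Sketch of under- vs overfitting. M = 0 (a constant) underfits; an
# interpolating polynomial of degree N-1 hits every training point
# exactly but can behave wildly between them.
import math

xs = [0.0, 0.15, 0.35, 0.55, 0.75, 0.95]
ys = [math.sin(2 * math.pi * x) + 0.1 for x in xs]

def constant_model(x):                 # "M = 0": always predict the mean
    return sum(ys) / len(ys)

def interpolating_model(x):            # degree N-1 Lagrange polynomial
    total = 0.0
    for i, xi in enumerate(xs):
        term = ys[i]
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def train_error(model):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys))

print(train_error(constant_model))       # large: underfits
print(train_error(interpolating_model))  # ~0: fits training data exactly
```

Zero training error for the high-order fit is precisely the 𝑀 = 9 situation on the slide: perfect on the data, poor between the points.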
Model selection
Plots using the 𝑀 = 9 polynomial for different numbers of data points (𝑁) [*]:
[*] Bishop, Pattern Recognition and Machine Learning. Springer, 2006. [page 9]
[*] Mainly for classification tasks. Regression model performance will be studied in more detail later on.
Model performance
• Confusion matrix
• useful when we know the true response values.
• It is used for classification tasks.

                 Predicted HIV               Predicted Healthy
Actual HIV       TP (True Positive) = 50     FN (False Negative) = 10
Actual Healthy   FP (False Positive) = 5     TN (True Negative) = 35
Model performance
• Sensitivity, true positive rate or recall: measures the ability of a test to correctly identify those with the disease (positive cases).
  Sensitivity = TP / (TP + FN)
• Precision: proportion of the predicted cases with the disease (positive cases) that were correct.
  Precision = TP / (TP + FP)
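The two formulas above, applied to the confusion-matrix counts from the previous slide (TP = 50, FN = 10, FP = 5, TN = 35):

```python
# Sensitivity and precision from raw confusion-matrix counts.

def sensitivity(tp, fn):
    return tp / (tp + fn)   # fraction of actual positives found

def precision(tp, fp):
    return tp / (tp + fp)   # fraction of predicted positives correct

print(sensitivity(50, 10))  # → 0.8333...
print(precision(50, 5))     # → 0.9090...
```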
Model performance
• Accuracy: indicates how correct a classifier is.
  Accuracy = (TP + TN) / (TP + FN + FP + TN)
• Balanced accuracy: same as before but takes into account imbalanced classes.
  Balanced accuracy = (1/2) · (TP / (TP + FN) + TN / (FP + TN))
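Both measures computed for the same counts (TP = 50, FN = 10, FP = 5, TN = 35); note how balanced accuracy averages the per-class rates rather than pooling all counts:

```python
# Accuracy vs balanced accuracy on the confusion-matrix counts above.

def accuracy(tp, fn, fp, tn):
    return (tp + tn) / (tp + fn + fp + tn)

def balanced_accuracy(tp, fn, fp, tn):
    # average of sensitivity (positive class) and specificity (negative)
    return 0.5 * (tp / (tp + fn) + tn / (fp + tn))

print(accuracy(50, 10, 5, 35))           # → 0.85
print(balanced_accuracy(50, 10, 5, 35))  # → 0.8541...
```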
Model performance
• Receiver Operating Characteristic (ROC) curve: illustrates the performance of a binary classifier as its discrimination threshold is varied.
Model performance
• ROC analysis provides tools to select possibly optimal models and to discard suboptimal ones independently from (and prior to specifying) the cost context or the class distribution.
[Plot: ROC space; the diagonal marks random guessing, points above it are better than guessing, points below it worse]
Model performance
• The Area Under the ROC curve (AUROC), or simply AUC, measures the probability that a classifier will rank a randomly chosen positive instance higher than a randomly chosen negative one (assuming 'positive' ranks higher than 'negative').
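This ranking interpretation can be computed directly by comparing every (positive, negative) score pair; the scores below are invented:

```python
# AUC via its ranking definition: the fraction of (positive, negative)
# pairs where the positive instance scores higher (ties count a half).

def auc(pos_scores, neg_scores):
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

pos = [0.9, 0.8, 0.6]   # classifier scores for positive instances
neg = [0.7, 0.4, 0.3]   # classifier scores for negative instances
print(auc(pos, neg))    # → 0.888... (8 of 9 pairs ranked correctly)
```

The O(n²) pairwise form is for clarity; practical implementations compute the same quantity from sorted ranks.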
Model performance
• ROC curve demonstrations [*]
[*] http://www.anaesthetist.com/mnm/stats/roc/Findex.htm
Model performance
• ROC curve demonstrations: bad and good models
[Diagram: the training dataset is used to fit the MODEL, which is then assessed on the test dataset]
Large dataset?
Randomly select a great number of instances for each of the following groups:
1. Training set – used to fit the models.
2. Validation set – used to estimate prediction error for model selection.
3. Test set – used for assessment of the generalization error of the final chosen model.
Ideally, the test set should be kept in a “vault,” and be brought out only at the end of the data analysis [*].
[*] Hastie et al. The Elements of Statistical Learning. Springer. [page 222, 5th print edition]
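A sketch of the three-way split using random index shuffling; the 60/20/20 proportions are only an example:

```python
# Hedged sketch of a random train/validation/test split over indices.
import random

def split_indices(n, seed=0):
    idx = list(range(n))
    random.Random(seed).shuffle(idx)        # randomise assignment
    n_train, n_val = int(0.6 * n), int(0.2 * n)
    return (idx[:n_train],                  # training set
            idx[n_train:n_train + n_val],   # validation set
            idx[n_train + n_val:])          # test set (the "vault")

train, val, test = split_indices(100)
print(len(train), len(val), len(test))  # → 60 20 20
```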
One repetition approach
[Diagram of leave-one-out cross-validation: in each iteration a single instance is held out for testing and all the rest are used for training]
• LOOCV is sometimes useful, but typically doesn’t shake up the data enough.
• The estimates from each fold are highly correlated and hence their average can have high
variance.
• Since each training set is only (𝐾 − 1)/𝐾 as big as the original training set, the estimates of prediction error will typically be biased upward.
• This bias is minimized when 𝐾 = 𝑁 (LOOCV), but this estimate has high variance, as noted
before.
• 𝐾 = 5 or 10 provides a good compromise for this bias-variance tradeoff.
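K-fold splitting can be sketched as follows; with 𝐾 = N it reduces to LOOCV:

```python
# Sketch of K-fold cross-validation splits: each of the K folds is held
# out once as the test set while the remaining folds form the training set.

def kfold(n, k):
    folds = [list(range(n))[i::k] for i in range(k)]  # k striped folds
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

for train, test in kfold(10, 5):
    print(len(train), len(test))  # each fold: 8 train, 2 test
```

Striped assignment is used here for brevity; in practice the indices are usually shuffled first.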
          iter1   iter2   …   iter1k
training    19      18    …     20
test         1       2    …      0
Acc.       95%     90%    …   100%

Diagram of bootstrapping
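A sketch of one bootstrap split: sample n instances with replacement for training and test on the instances left out (about 1/e ≈ 37% of the data on average):

```python
# Hedged sketch of a single bootstrap iteration over dataset indices.
import random

def bootstrap_split(n, seed):
    rng = random.Random(seed)
    train = [rng.randrange(n) for _ in range(n)]      # with replacement
    test = [i for i in range(n) if i not in set(train)]  # left out
    return train, test

train, test = bootstrap_split(20, seed=0)
print(len(train), len(test))  # 20 training draws; the rest form the test set
```

Repeating this over many seeds and averaging the per-iteration accuracies gives the bootstrap performance estimate shown in the table above.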
[Figure: candidate models are compared and the BEST MODEL is selected]
• The prediction is then explained by the symptoms that are most important to the model.
• With this information about the rationale behind the model, the doctor is now
empowered to trust the model – or not.
[*] https://www.oreilly.com/learning/introduction-to-local-interpretable-model-agnostic-explanations-lime
However, restricting machine learning to interpretable models poses an important limitation.
[Diagram of the machine learning workflow: data cleaning → data transformation (pre-processing, dimensionality reduction) → model creation and model selection → model evaluation → application (new instances)]
• Digit/letter recognition
  ̶ Given post codes handwritten on envelopes, identify the digit/letter for each handwritten character.
  ̶ A model of this problem would allow a computer program to read and understand handwritten post codes and sort envelopes by geographic region.
[*] http://machinelearningmastery.com/practical-machine-learning-problems/
• Face detection
  ̶ Given a digital photo album of many hundreds of digital photographs, identify those photos that include a given person.
  ̶ A model of this decision process would allow a program to organize photos by person.
• Product recommendation
  ̶ Given a purchase history for a customer and a large inventory of products, identify those products in which that customer will be interested and likely to purchase.
  ̶ A model of this would allow a program to make recommendations to a customer and motivate product purchases.
• Stock trading
  ̶ Given the current and past price movements for a stock, determine whether the stock should be bought, held or sold.
  ̶ A model of this decision problem could provide decision support to financial analysts.
• Customer segmentation
  ̶ Given the pattern of behaviour by a user during a trial period and the past behaviours of all users, identify those users that will convert to the paid version of the product and those that will not.
  ̶ A model of this decision problem would allow a program to trigger customer interventions to persuade the customer to convert early or better engage in the trial.
Summary
1. Explained the term Machine Learning, its relation to Statistics and Data Mining, and other associated terms
4. Explained how to apply different methods for model evaluation for assessing both predictive performance and interpretability