AI-Lecture 8 (Machine Learning Overview)

This document provides an overview of machine learning. It discusses machine learning approaches, including supervised and unsupervised learning, and the three components of a machine learning algorithm: representation, evaluation, and optimization. Example algorithms for regression, classification, and clustering are listed, and common reasons why machine learning projects fail as well as popular frameworks such as scikit-learn and Keras are summarized.


Artificial Intelligence

Lecture 8

Bicol University College of Science


1st Semester 2021-2022
Machine Learning Overview
Paradigm
• Traditional approach
• Machine learning approach
Traditional Programming
Data + Program → Computer → Output

Machine Learning
Data + Output → Computer → Program
Machine Learning (ML)
• ML is a branch of artificial intelligence:
  • Uses computing-based systems to make sense out of data
  • Extracting patterns, fitting data to functions, classifying data, etc.
• ML systems can learn and improve
  • With historical data, time, and experience
• Bridges theoretical computer science and real, noisy data.
ML in real-life

ML in a Nutshell
• Tens of thousands of machine learning algorithms
• Hundreds new every year
• Every machine learning algorithm has three components:
  – Representation
  – Evaluation
  – Optimization
ML Components
• Representation
  – Numerical functions
    ▪ Linear regression
    ▪ Neural networks
    ▪ Support vector machines
  – Symbolic functions
    ▪ Decision trees
    ▪ Sets of rules / Logic programs
  – Instance-based functions
    ▪ Nearest-neighbor
    ▪ Case-based
  – Probabilistic Graphical Models
    ▪ Naïve Bayes
    ▪ Bayesian networks
    ▪ Hidden Markov Models (HMMs)
    ▪ Probabilistic Context-Free Grammars (PCFGs)
    ▪ Markov networks
ML Components
• Various Search/Optimization Algorithms
  – Gradient descent
    ▪ Perceptron
    ▪ Backpropagation
  – Dynamic programming
    ▪ HMM learning
    ▪ PCFG learning
  – Divide and conquer
    ▪ Decision tree induction
    ▪ Rule learning
  – Evolutionary computation
    ▪ Genetic Algorithms (GAs)
    ▪ Genetic Programming (GP)
    ▪ Neuro-evolution
ML Components
• Evaluation
– Accuracy
– Precision and recall
– Squared error
– Likelihood
– Posterior probability
– Cost / Utility
– Margin
– Entropy
– K-L divergence
– Etc.
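
Many of these measures are available off the shelf. As a minimal sketch (the label vectors below are invented purely for illustration, not from the slides), scikit-learn computes accuracy, precision, and recall directly from true vs. predicted labels:

# Minimal sketch: common evaluation measures with scikit-learn
from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # ground-truth labels (invented)
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # model predictions (invented)

print("accuracy: ", accuracy_score(y_true, y_pred))   # 0.75
print("precision:", precision_score(y_true, y_pred))  # 0.75
print("recall:   ", recall_score(y_true, y_pred))     # 0.75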
Types of Learning
• Supervised (inductive) learning
  – Training data includes desired outputs
  – regression: predict numerical values
  – classification: predict categorical values, i.e., labels
• Unsupervised learning
  – Training data does not include desired outputs
  – clustering: group data according to "distance"
  – association: find frequent co-occurrences
  – link prediction: discover relationships in data
  – data reduction: project features to fewer features
• Reinforcement learning
  – Rewards from a sequence of actions
(The sketch below contrasts the first two types in code.)
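
As a small illustration of the supervised/unsupervised distinction, here is a hedged sketch (using scikit-learn and its bundled iris data, which the original slides do not show in code): a classifier is fit on features plus labels, while a clustering algorithm sees only the features.

from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.cluster import KMeans

X, y = load_iris(return_X_y=True)

# Supervised: the training data includes the desired outputs y
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(clf.predict(X[:5]))      # predicted class labels

# Unsupervised: only the features X are given; groups form by "distance"
km = KMeans(n_clusters=3, n_init=10).fit(X)
print(km.labels_[:5])          # cluster assignments (ids are arbitrary)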
Classification
Object recognition
https://ai.googleblog.com/2014/09/building-deeper-understanding-of-images.html
Reinforcement learning
Learning to play Breakout
https://www.youtube.com/watch?v=V1eYniJ0Rnk
Clustering
Crime prediction using k-means clustering
http://www.grdjournals.com/uploads/article/GRDJE/V02/I05/0176/GRDJEV02I050176.pdf
Machine learning algorithms
• Regression:
  Ridge regression, Support Vector Machines, Random Forest, Multilayer Neural Networks, Deep Neural Networks, ...

• Classification:
  Naive Bayes, Support Vector Machines, Random Forest, Multilayer Neural Networks, Deep Neural Networks, ...

• Clustering:
  k-Means, Hierarchical Clustering, ...
Issues
• Many machine learning/AI projects fail (Gartner claims 85%)
• Ethics, e.g., Amazon has/had sub-par employees fired automatically by an AI
Reasons for failure
• Asking the wrong question
• Trying to solve the wrong problem
• Not having enough data
• Not having the right data
• Having too much data
• Hiring the wrong people
• Using the wrong tools
• Not having the right model
• Not having the right yardstick

Frameworks
• Programming languages (a fast-evolving ecosystem!)
  – Python
  – R
  – C++
  – ...
• Many libraries
  – scikit-learn (classic machine learning)
  – PyTorch (deep learning framework)
  – TensorFlow (deep learning framework)
  – Keras (deep learning framework)
  – …
scikit-learn
• Nice end-to-end framework
  – data exploration (+ pandas + holoviews)
  – data preprocessing (+ pandas)
    ▪ cleaning/missing values
    ▪ normalization
  – training
  – testing
  – application
• "Classic" machine learning only
• https://scikit-learn.org/stable/
(A minimal end-to-end sketch follows.)
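
As a hedged sketch of such an end-to-end flow (the dataset, model choice, and split ratio are illustrative assumptions, not from the slides), a scikit-learn pipeline chains preprocessing and training, and the fitted model is then tested on held-out data:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Preprocessing (normalization) and training chained in one pipeline
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Testing on data the model has never seen
print("test accuracy:", model.score(X_test, y_test))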
Keras
• High-level framework for deep learning
• TensorFlow backend
• Layer types
  – dense
  – convolutional
  – pooling
  – embedding
  – recurrent
  – activation
  – …
• https://keras.io/
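
A minimal sketch of the Keras API (the layer sizes and input shape are invented for illustration; this also previews the digit-recognition task later in the lecture): a model is built by stacking layers, then compiled for training.

from tensorflow import keras
from tensorflow.keras import layers

# Small dense network for 10-class classification of 28x28 grayscale images
model = keras.Sequential([
    layers.Flatten(input_shape=(28, 28)),    # image -> flat vector
    layers.Dense(128, activation="relu"),    # dense layer
    layers.Dropout(0.2),                     # drop-out regularization
    layers.Dense(10, activation="softmax"),  # one output per class
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# Training would then be, e.g.: model.fit(x_train, y_train, epochs=5)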
Supervised and Unsupervised Learning
• Unsupervised Learning
  • There is no predefined and known set of outcomes
  • Look for hidden patterns and relations in the data
  • A typical example: clustering
[Figure: k-means clustering of the iris data set, plotting Petal.Width against Petal.Length with points colored by cluster (irisCluster$cluster)]
Supervised and Unsupervised Learning
• Supervised Learning
  • For every example in the data there is always a predefined outcome
  • Models the relations between a set of descriptive features and a target (fits data to a function)
  • Two groups of problems:
    • Classification
    • Regression
Supervised Learning
• Classification
  • Predicts which class a given sample of data (sample of descriptive features) belongs to (a discrete value).

[Figure: confusion matrix for iris classification, values in percent]

                       Actual
Predicted     setosa   versicolor   virginica
setosa         100.0        0.0         0.0
versicolor       0.0       96.0         4.0
virginica        0.0        4.0        96.0

• Regression
  • Predicts continuous values.
Machine Learning as a Process
An iterative cycle:
• Define Objectives
  – Define measurable and quantifiable goals
  – Use this stage to learn about the problem
• Data Preparation
  – Normalization
  – Transformation
  – Missing values
  – Outliers
• Model Building
  – Data splitting
  – Feature engineering
  – Estimating performance
  – Evaluation and model selection
• Model Evaluation
  – Study the model's accuracy
  – Does it work better than the naïve approach or the previous system?
  – Do the results make sense in the context of the problem?
• Model Deployment
ML as a Process: Data Preparation
• Needed for several reasons
  • Some models have strict data requirements
    ▪ scale of the data, data point intervals, etc.
  • Some characteristics of the data may dramatically impact model performance
• Time spent on data preparation should not be underestimated

[Diagram: Raw Data (missing values, error values, different scales, skewness, type problems, outliers, many others) → Data Transformation (scaling, centering, dimensionality, missing values, errors) → data ready for the Modeling phase]

(A small preparation sketch follows.)
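
As a hedged sketch of two of these steps (the toy table is invented for illustration), pandas can impute missing values and then center and scale each column:

import pandas as pd

# Toy raw data with a missing value and very different column scales
raw = pd.DataFrame({
    "age":    [25, 32, None, 51],
    "income": [30_000, 48_000, 52_000, 90_000],
})

# Missing values: impute with the column mean
data = raw.fillna(raw.mean())

# Centering and scaling: zero mean, unit standard deviation per column
data = (data - data.mean()) / data.std()
print(data)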
ML as a Process: Feature Engineering
• Determining the predictors (features) to use is one of the most critical questions
• Sometimes we need to add predictors
• Reduce their number:
  • Fewer predictors: a more interpretable model, and less costly
  • Most models are affected by high dimensionality, especially by non-informative predictors
• Two common selection strategies:
  – Wrappers: algorithms that take multiple models (adding and removing predictors) as input and a performance parameter as output, e.g., genetic algorithms
  – Filters: evaluate the relevance of each predictor, normally based on correlations
• Binning predictors
View of Std ML Datasets: a Single Table (2D array)

             Feature 1   Feature 2   ...   Feature N   Output Category
Example 1       0.0        small            red          true
Example 2       9.3        medium           red          false
Example 3       8.2        small            blue         false
...
Example M       5.7        medium           green        true


ML as a Process: Model Building
• Data splitting (see the sketch after this list)
  • Allocate data to different tasks
    ▪ model training
    ▪ performance evaluation
  • Define training, validation, and test sets
• Feature selection (review the decisions made previously)
• Estimating performance
  • Visualization of results: discovering interesting areas of the problem space
  • Statistics and performance measures
• Evaluation and model selection
  • The 'no free lunch' theorem: no a priori assumptions can be made
  • Avoid the use of favorite models if not warranted
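
A hedged sketch of the splitting step (the 60%/20%/20% proportions and the iris data are illustrative assumptions): two chained splits produce training, validation, and test sets.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# First carve off a held-out test set, then split the rest into train/validation
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))   # 90 30 30, i.e., 60%/20%/20%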
Nearest Neighbors: Basic Algorithm for Classification
• Find the K nearest neighbors to the test-set example
  – Or find all examples within radius R
• Combine their 'votes'
  – Most common category
  – Average value (real-valued prediction)
  – Can also weight votes by distance
  – Lots of variations on the basic theme

[Figure: a query point '?' surrounded by + and - training examples]
Simple Example: 1-NN
(1-NN ≡ one nearest neighbor)

Training Set
1. a=0, b=0, c=1  →  +
2. a=0, b=0, c=0  →  −
3. a=1, b=1, c=1  →  −

Test Example
a=0, b=1, c=0  →  ?

"Hamming distance" (# of different bits):
Ex 1 = 2
Ex 2 = 1
Ex 3 = 2
Ex 2 is nearest, so output −
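
The slide's example, transcribed into a short sketch (plain Python, no libraries): compute the Hamming distance from the test example to each training example and output the nearest neighbor's label.

# 1-NN with Hamming distance, reproducing the example above
train = [((0, 0, 1), "+"),
         ((0, 0, 0), "-"),
         ((1, 1, 1), "-")]
test = (0, 1, 0)

def hamming(u, v):
    # number of positions where the bits differ
    return sum(a != b for a, b in zip(u, v))

dists = [(hamming(x, test), label) for x, label in train]
print(dists)          # [(2, '+'), (1, '-'), (2, '-')]
print(min(dists)[1])  # nearest neighbor's label: '-'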
From neurons to ANNs

An artificial neuron (taking its inspiration from biological neurons) combines inputs $x_1, \dots, x_N$ with weights $w_1, \dots, w_N$ and a bias $b$, and passes the sum through an activation function $\sigma$:

$y = \sigma\left(\sum_{i=1}^{N} w_i x_i + b\right)$
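
The formula above written out in numpy (the input and weight values are invented for illustration; the sigmoid is one common choice of activation function σ):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.0, 2.0])   # inputs  x_1..x_N
w = np.array([0.1,  0.4, 0.3])   # weights w_1..w_N
b = -0.2                         # bias

# y = sigma( sum_i w_i * x_i + b )
y = sigmoid(np.dot(w, x) + b)
print(y)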
Multilayer network

How to determine
weights?

Training: backpropagation
• Initialize weights "randomly"
• For all training epochs
• for all input-output in training set
• using input, compute output
(forward)
• compare computed output with
training output
• adapt weights (backward) to
improve output
• if accuracy is good enough, stop
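
A deliberately tiny sketch of this loop: a single sigmoid neuron trained by gradient descent on a toy dataset (logical OR, invented for illustration). Real backpropagation applies the same forward/compare/adapt idea layer by layer via the chain rule.

import numpy as np

rng = np.random.default_rng(0)

# Toy training set: inputs and desired outputs (logical OR)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0, 1, 1, 1], dtype=float)

w = rng.normal(size=2)   # initialize weights "randomly"
b = 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(2000):            # for all training epochs
    y = sigmoid(X @ w + b)           # forward: compute output
    error = y - t                    # compare with training output
    grad = error * y * (1 - y)       # gradient of the squared error
    w -= 0.5 * X.T @ grad            # backward: adapt weights
    b -= 0.5 * grad.sum()

print(np.round(sigmoid(X @ w + b)))  # should approach [0. 1. 1. 1.]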
Task: handwritten digit recognition
• Input data
• grayscale image
• Output data
• digit 0, 1, ..., 9
• Training examples
• Test examples

Deep neural networks
• Many layers
• Features are learned, not given
• Low-level features combined into high-level features
• Special types of layers
  • convolutional
  • drop-out
  • recurrent
  • ...
Convolutional neural networks

[Figure: convolution of an image with a main-diagonal kernel $\begin{bmatrix} 1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 1 \end{bmatrix}$]
Convolution examples

[Figure: example images convolved with a main-diagonal kernel $\begin{bmatrix} 1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 1 \end{bmatrix}$ and an anti-diagonal kernel $\begin{bmatrix} 0 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \cdots & 0 \end{bmatrix}$]
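
A hedged sketch of what such kernels do (the 4x4 image is invented for illustration), using scipy's 2-D convolution; each kernel responds most strongly where the image matches its diagonal pattern:

import numpy as np
from scipy.signal import convolve2d

image = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)

diag = np.eye(3)              # main-diagonal kernel
anti = np.fliplr(np.eye(3))   # anti-diagonal kernel

# 'valid' keeps only positions where the kernel fully overlaps the image
print(convolve2d(image, diag, mode="valid"))
print(convolve2d(image, anti, mode="valid"))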
Task: sentiment classification
• Input data
  • movie review (English)
• Output data
  • positive / negative
• Training examples
• Test examples

Sample review (preprocessed): "<start> this film was just brilliant casting location scenery story direction everyone's really suited the part they played and you could just imagine being there Robert redford's is an amazing actor and now the same being director norman's father came from the same scottish island as myself so i loved the fact there was a real connection with this film the witty remarks throughout the film were great it was just brilliant so much that i bought the film as soon as it"
Word embedding
• Represent words as one-hot vectors
  • length = vocabulary size
  • Issues:
    ▪ unwieldy
    ▪ no semantics
• Word embeddings
  • dense vector
  • vector distance ≈ semantic distance
• Training
  • use context
  • discover relations with surrounding words
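
A sketch of the contrast (the tiny vocabulary and hand-picked vectors are invented for illustration; real embeddings are learned from context): one-hot vectors carry no similarity information, while dense vectors can place related words close together.

import numpy as np

vocab = ["film", "movie", "banana"]

# One-hot: length = vocabulary size; every pair is equally (dis)similar
one_hot = np.eye(len(vocab))
print(one_hot[0] @ one_hot[1])   # "film" vs "movie": 0.0, no shared signal

# Dense embeddings (hand-picked here; in practice learned from context)
emb = {
    "film":   np.array([0.90, 0.10]),
    "movie":  np.array([0.85, 0.15]),
    "banana": np.array([-0.70, 0.60]),
}

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

print(cosine(emb["film"], emb["movie"]))   # close to 1: related words
print(cosine(emb["film"], emb["banana"]))  # negative: unrelated words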
End
