
Machine Learning Overview

Course: Artificial Intelligence Fundamentals
Instructor: Marco Bonzanini


Machine Learning vs Programming

Programming:      Rules + Data → Answers
Machine Learning: Data + Answers → Rules

Ref: Deep Learning with Python, F. Chollet, 2017.


Examples of ML Applications
• Filtering Emails (Spam Detection)

• Automatic Trading

• Fraud Detection

• Self-driving cars

• Playing chess/poker/go

• Recommending products / items / services


Machine Learning Tasks

                  Supervised                   Unsupervised
Discrete data     Classification               Clustering
                  (predict a label)            (group similar items)
Continuous data   Regression                   Dimensionality Reduction
                  (predict a quantity)         (reduce the number of variables)

Machine Learning Process

• Exercise:
  — Search “machine learning stages” (or steps, or process) on Google
  — Find dozens of “The X stages of Machine Learning” articles

• No standard process?!
Recap: CRISP-DM
(diagram of the CRISP-DM cycle)
Machine Learning Process

• What’s the problem you’re trying to solve? (identify the ML task)

• What ML algorithms are available for this task?

• What does the data set look like? (enough data? need labelled data? need pre-processing?)
ML Modelling

• Step 1: Learning (a.k.a. Training)
  — Batch process (could take hours/days)
  — “Learn” from the data
  — Output: your “model”

• Step 2: Prediction (a.k.a. Testing)
  — Given a trained model, make a prediction on new, unseen data
  — Output: depends on the task

Example: classification task

Ref: Mastering Social Media Mining with Python, M. Bonzanini, 2016.
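
The figure from the book is not reproduced here; as a stand-in, this is a minimal sketch of the two modelling steps on a classification task, using scikit-learn. The iris data set and the Naive Bayes classifier are illustrative assumptions, not taken from the slides:

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    # Step 1: Learning (a.k.a. Training) -- output: the "model"
    model = GaussianNB()
    model.fit(X_train, y_train)

    # Step 2: Prediction (a.k.a. Testing) -- for classification,
    # the output is a predicted label for each new, unseen item
    predictions = model.predict(X_test)
    print(predictions[:5])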


ML Terminology

• Item or Sample: the “objects” we’re dealing with

• Item representation (e.g. a vector)

• Features: the attributes of an item (e.g. the elements of a vector)
Item Representation

• We can use any type of attribute

• Numerical features

• Categorical features → one-hot encoding

• Text → bag-of-words (see the sketch below)
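To make the last bullet concrete, here is a minimal bag-of-words sketch; scikit-learn’s CountVectorizer and the toy corpus are my assumptions, not part of the slides:

    # Bag-of-words: each document becomes a vector of word counts.
    from sklearn.feature_extraction.text import CountVectorizer

    corpus = ["Rome is in Italy", "Paris is in France"]
    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(corpus)        # sparse matrix: one row per document

    print(vectorizer.get_feature_names_out())   # the vocabulary = the features
    print(X.toarray())                          # word counts per document
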
One-hot Encoding

Rome = [1, 0, 0, 0, 0, 0, …, 0]
Paris = [0, 1, 0, 0, 0, 0, …, 0]
Italy = [0, 0, 1, 0, 0, 0, …, 0]
France = [0, 0, 0, 1, 0, 0, …, 0]
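
The same encoding in code, as a minimal sketch; scikit-learn’s OneHotEncoder is one common implementation (an assumption here, the slides don’t name a library):

    # One-hot encoding: each categorical value maps to a vector with a single 1.
    from sklearn.preprocessing import OneHotEncoder

    cities = [["Rome"], ["Paris"], ["Italy"], ["France"]]
    encoder = OneHotEncoder()
    vectors = encoder.fit_transform(cities).toarray()

    print(encoder.categories_)  # column order is alphabetical here,
    print(vectors)              # so the 1s sit in different positions than above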
Feature Engineering

• Using domain knowledge of the data to create features that make ML algorithms work (see the sketch below)

• Fundamental, difficult, expensive, time-consuming

• Quality and quantity of features can have a big impact on the final result
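
As a concrete illustration of using domain knowledge (my example, not the slides’): deriving features from a raw timestamp that a model could actually exploit, sketched with pandas:

    # Feature engineering sketch: turn a raw timestamp into usable features.
    import pandas as pd

    df = pd.DataFrame({"timestamp": pd.to_datetime(["2017-01-06 09:30",
                                                    "2017-01-07 23:10"])})
    df["hour"] = df["timestamp"].dt.hour                   # time of day
    df["is_weekend"] = df["timestamp"].dt.dayofweek >= 5   # Saturday/Sunday flag
    print(df)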
Feature Selection

• Dimensionality! How many words are in the English vocabulary? How many unique tokens on the Web?

• Using millions of features is not feasible for some classifiers

• Reduces training time

• Can improve generalisation, e.g. eliminate noise, avoid overfitting
Feature Selection

• Define a utility function A(f, c): for a given class c, compute A(f, c) for every feature f, and keep only the k features with the highest utility

• Example: Term Frequency
  — Discard words that appear in many documents
  — Discard words that appear in a very small number of documents
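
The slides leave A(f, c) abstract; one common concrete choice is the chi-squared statistic. A sketch of “keep the k features with the highest utility” using scikit-learn’s SelectKBest (the utility function and the data set are assumptions):

    # Keep only the k features with the highest utility A(f, c).
    from sklearn.datasets import load_iris
    from sklearn.feature_selection import SelectKBest, chi2

    X, y = load_iris(return_X_y=True)
    selector = SelectKBest(score_func=chi2, k=2)  # chi2 plays the role of A(f, c)
    X_reduced = selector.fit_transform(X, y)

    print(selector.scores_)   # one utility score per feature
    print(X_reduced.shape)    # (150, 2): only the top-k features remain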
Feature Scaling

• a.k.a. data normalisation

• Different features may have different ranges of values

• Many algorithms use a concept of “distance”, so features with a broad range will dominate

• After scaling, features will contribute equally to the distance
Feature Scaling (2)

• Many options for scaling

• “Standardisation”: zero mean and unit variance
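
A minimal sketch of standardisation, i.e. rescaling each feature to x' = (x - mean) / std; scikit-learn’s StandardScaler is one implementation (the tiny data set is made up):

    import numpy as np
    from sklearn.preprocessing import StandardScaler

    # Two features with very different ranges: without scaling, the second
    # one would dominate any distance computation.
    X = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
    X_scaled = StandardScaler().fit_transform(X)

    print(X_scaled.mean(axis=0))  # ~0 for each feature
    print(X_scaled.std(axis=0))   # ~1 for each feature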


Overfitting and Underfitting

• Symptom: your ML model doesn’t perform well outside of your test environment

• Possible cause: generalisation is hard!

• More precisely:
  — Overfitting
  — Underfitting
Overfitting

• Your model learns the details of the training data set “too well”

• Good performance on the given data set, but not on new data sets

• Noise and random fluctuations in your training data are treated as important information

• Possible solution: cross-validation (see the sketch below)
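
A sketch of cross-validation as a check on overfitting: estimate generalisation by averaging performance over k train/test splits instead of trusting a single one. The data set and classifier are illustrative assumptions:

    from sklearn.datasets import load_iris
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    # 5-fold cross-validation: train on 4/5 of the data, test on the rest,
    # rotating the held-out fold each time.
    scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5)

    print(scores)         # one accuracy score per fold
    print(scores.mean())  # a less optimistic estimate than training accuracy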


Underfitting

• Less discussed (it is obvious from the start)

• Your model performs badly on the given data set, and doesn’t generalise to new data

• Possible solution: move on (change the feature engineering, the feature selection, or the ML algorithm altogether)
Questions?
