1-Introduction To Machine Learning

This document provides an introduction to machine learning, covering key concepts such as its definition, types, modeling flow, and performance measures. It discusses the importance of data, training and testing phases, and various algorithms used in machine learning. Additionally, it highlights the advantages, limitations, and applications of machine learning in different fields.

Uploaded by

yashw609


Supervised Learning
Introduction to Machine Learning

LEARNING OBJECTIVES

At the end of this session, you will be able to understand:
• Introduction to Machine Learning
• Machine Learning Modelling Flow
• Parametric and Non-parametric Algorithms
• Types of Machine Learning
• Performance Measures
• Bias-Variance Tradeoff
• Data Inconsistencies

Introduction to Machine Learning

Machine Learning is…

Machine learning is programming computers to optimize a performance criterion using example data or past
experience.

-- Ethem Alpaydin

The goal of machine learning is to develop methods that can automatically detect patterns in data, and then to use the
uncovered patterns to predict future data or other outcomes of interest.

-- Kevin P. Murphy

The field of pattern recognition is concerned with the automatic discovery of regularities in data through the use of
computer algorithms and with the use of these regularities to take actions.

-- Christopher M. Bishop

Machine learning is about predicting the future based on the past.

-- Hal Daume III
What is This Image?

[Image: Humayun's Tomb, located in Delhi, India]

[Figure: "past" is paired with training data and a model/predictor; "future" is paired with testing data and a model/predictor]

Relationship of AI, ML and DL

● Artificial Intelligence (AI) is anything about man-made intelligence exhibited by machines.
● Machine Learning (ML) is an approach to achieve AI.
● Deep Learning (DL) is one technique to implement ML.

[Figure labels: Artificial Intelligence, Machine Learning, Deep Learning]

WHAT IS MACHINE LEARNING?

Machine learning is a subset of artificial intelligence that often uses statistical techniques to give computers the ability to "learn" from data

Considerations when defining a problem:

• Problem definition
• Define the data requirements and their source
• Decide whether the whole dataset is needed or a subset will do

Data and its division strategies

Data everywhere! Big Data Statistics 2023: How Much Data is in The World?

• Global big data analytics market annual revenue is estimated to reach $68.09 billion
by 2025.
• There were 79 zettabytes of data generated worldwide in 2021.
• 90% of the data in the global datasphere is replicated data.
• In 2020, every person generated 1.7 megabytes of data every second.
• Global IoT connections already generated 13.6 zettabytes of data in 2019 alone.
• By 2025, more than 150 zettabytes of big data will need analysis.
• In 2021, 24% of big data revenue came from software, 16% from hardware, and
another 24% from services.
• The COVID-19 pandemic increased the rate of data breaches by more than 400%.
• By 2027, the use of big data application database solutions and analytics is predicted
to grow to $12 billion.
• 97.2% of organizations are investing in big data and AI.
• Using big data, Netflix saves $1 billion per year on customer retention.
Data types
Data comes in different sizes and different flavors (types):

• Texts
• Numbers
• Clickstreams
• Graphs
• Tables
• Images
• Transactions
• Videos
• Some or all of the above!

We are "DATAFIED"!

• Wherever we go, we are “datafied”.

• Smartphones are tracking our locations.
• We leave a data trail in our web browsing.
• Our interactions in social networks are logged.
• Privacy is an important issue in Data Science.

Training and testing

[Figure: a universal set (unobserved) feeds a training set (observed) through data acquisition and a testing set (unobserved) through practical usage]

• Training is the process of making the system able to learn.

• No free lunch rule:

• Training set and testing set come from the same distribution
• Need to make some assumptions or bias

Cross-validation: A better way to choose meta parameters

Divide the total dataset into three subsets:

Training data is used for learning the parameters of the model.
Validation data is not used for learning but is used for deciding what
settings of the meta parameters work best.
Test data is used to get a final, unbiased estimate of how well the
ML model works. We expect this estimate to be worse than on the
validation data.
We could divide the total dataset into one final test set and N other
subsets and train on all but one of those subsets to get N different
estimates of the validation error rate.
This is called N-fold cross-validation.
The N estimates are not independent.
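
The three-way division described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function name `three_way_split` and the 60/20/20 proportions are assumptions chosen for the example, not values given in the slides.

```python
import random

def three_way_split(items, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle items and split them into train, validation and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]                  # touched once, for the final estimate
    val = shuffled[n_test:n_test + n_val]     # used to compare meta parameter settings
    train = shuffled[n_test + n_val:]         # used to learn the model parameters
    return train, val, test

# Hypothetical dataset of 100 example IDs.
train, val, test = three_way_split(range(100))
print(len(train), len(val), len(test))
```

Shuffling before slicing keeps the three subsets statistically similar, which matches the assumption that training and testing data come from the same distribution.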

k-Fold Cross-Validation

• Using k-fold cross-validation for hyper-parameter tuning is common when the size of the training data is small
– It also leads to a better and less noisy estimate of the model performance by
averaging the results across several folds
• E.g., 5-fold cross-validation (see the figure on the next slide)
1. Split the train data into 5 equal folds
2. First use folds 2-5 for training and fold 1 for validation
3. Repeat by using fold 2 for validation, then fold 3, fold 4, and fold 5
4. Average the results over the 5 runs (for reporting purposes)
5. Once the best hyper-parameters are determined, evaluate the model on the test data
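
Steps 1 to 4 can be sketched as follows. This is a rough illustration under stated assumptions: the model is a toy 1-D k-nearest-neighbour classifier, the data is synthetic, and the helper names (`k_fold_indices`, `knn_predict`, `cross_val_accuracy`) are made up for this example. The hyper-parameter being tuned is the number of neighbours `k`.

```python
import random

def k_fold_indices(n_samples, n_folds):
    """Split indices 0..n_samples-1 into n_folds nearly equal, contiguous folds."""
    fold_sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
                  for i in range(n_folds)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def knn_predict(train_x, train_y, x, k):
    """Majority vote among the k training points closest to x (1-D inputs)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > k else 0

def cross_val_accuracy(xs, ys, k, n_folds=5):
    """Average validation accuracy of k-NN over n_folds folds."""
    scores = []
    for fold in k_fold_indices(len(xs), n_folds):
        held_out = set(fold)
        tr_x = [xs[i] for i in range(len(xs)) if i not in held_out]
        tr_y = [ys[i] for i in range(len(xs)) if i not in held_out]
        correct = sum(knn_predict(tr_x, tr_y, xs[i], k) == ys[i] for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / n_folds

# Hypothetical training data: label is 1 when x > 0.5, with 10% of labels flipped.
rng = random.Random(0)
xs = [rng.random() for _ in range(100)]
ys = [int(x > 0.5) ^ int(rng.random() < 0.1) for x in xs]

candidates = [1, 3, 5, 7]          # odd values of k avoid tied votes
cv_scores = {k: cross_val_accuracy(xs, ys, k) for k in candidates}
best_k = max(cv_scores, key=cv_scores.get)
print(cv_scores, best_k)
```

Step 5 would then fit the model with `best_k` on all of the training data and score it once on a test set that played no part in this loop.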

• Illustration of a 5-fold cross-validation

[Figure: 5-fold cross-validation diagram. Picture from: https://scikit-learn.org/stable/modules/cross_validation.html]

Do validation strategies matter?

SDE vs SIE

Difference between Scene Dependent Evaluation (SDE) and Scene Independent Evaluation (SIE) data
division schemes. In SDE setup, training and testing video frames share the same background, leading to
high similarity between them. However, in SIE, completely unseen videos are tested for evaluation.
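
One way to obtain a scene-independent division is to split by video rather than by frame, so that no video contributes frames to both sets. The sketch below assumes each frame is stored as a hypothetical `(video_id, frame_id)` pair; the function name and the 25% test fraction are illustrative choices, not part of the slides.

```python
import random

def scene_independent_split(frames, test_frac=0.25, seed=0):
    """Split (video_id, frame_id) pairs so no video appears in both sets."""
    video_ids = sorted({video for video, _ in frames})
    random.Random(seed).shuffle(video_ids)
    n_test = max(1, int(len(video_ids) * test_frac))
    test_videos = set(video_ids[:n_test])       # whole videos are held out
    train = [f for f in frames if f[0] not in test_videos]
    test = [f for f in frames if f[0] in test_videos]
    return train, test

# Hypothetical dataset: 8 videos with 5 frames each.
frames = [(f"video{v}", i) for v in range(8) for i in range(5)]
train, test = scene_independent_split(frames)
print(len(train), len(test))
```

A scene-dependent split would instead shuffle the individual frames, which lets near-identical frames from the same video land on both sides and tends to inflate the measured accuracy.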

PHASES OF MACHINE LEARNING

[Figure: how learning can be applied to predict behavior]

Sample Questions

1) Machine learning is _________________

A) Machine Learning is a type of artificial intelligence that allows a system to learn from data.

B) The science of getting computers to operate without being explicitly programmed is known as machine learning.

C) A&B

D) None of the above

2) What are the phases of machine learning?

A) Training Phase

B) Validation Phase

C) Testing Phase

D) All of the above


What is the role of training data in machine learning?

A) To evaluate the final performance of the model.

B) To fine-tune the model's hyperparameters.

C) To learn the model parameters.

D) To estimate the validation error rate

Why is validation data important in the machine learning process?

A) It is used to train the model initially.

B) It helps in deciding the best settings for the model's hyperparameters.

C) It provides the final estimate of the model's performance.


What is the purpose of test data in machine learning?

A. To learn the parameters of the model.
B. To optimize the hyperparameters.
C. To provide an unbiased performance estimate of the model after
training.
D. To continuously improve the model post-deployment.

ADVANTAGES OF MACHINE LEARNING

• Easily identifies trends and patterns

• No human intervention needed

• Cheap and flexible — can apply to any learning task

• Automatic method to search for hypotheses explaining data

• Continuous Improvement

LIMITATIONS OF MACHINE LEARNING

• Needs a massive amount of data to train

• Error prone: it is usually impossible to get perfect accuracy

Performance of ML

• There are several factors affecting the performance:

• Types of training provided
• The form and extent of any initial background knowledge
• The type of feedback provided
• The learning algorithms used
• Type of data provided

• Two important factors:

• Modeling
• Optimization

Algorithms

• The success of a machine learning system also depends on the algorithms.

• The algorithms control the search to find and build the knowledge structures.

• The learning algorithms should extract useful information from training examples.

COMPLEMENTING FIELDS OF MACHINE LEARNING

APPLICATIONS OF MACHINE LEARNING

• Spam Detection

• Speech Recognition

• Language translation

• Fraud detection

• Product Recommendation

• Classification

• Machine Translation: Point your camera at the menu and the restaurant's selections will magically appear in English via the Google Translate app

• Image Captioning

Every day people are finding more and more applications of machine learning.
Some more applications of machine learning:
▪ Driverless vehicles
▪ Email Spam and Malware Filtering
▪ Online Customer Support
▪ Product Recommendations
▪ Search Engine Result Refining
▪ Online Fraud Detection
▪ Sentiment Analysis
▪ Action Recognition
▪ Anomaly Detection
▪ Intelligent Video Surveillance
▪ Depression Analysis
▪ Traffic Prediction (varieties of prediction tasks)

Did You Know?

ML is a subset of artificial intelligence that automates data mining:
Machine learning can be described as a more automated and continuous version of data mining. Data mining can often detect patterns in data sets that no human would be able to find. Machine learning is capable of generalizing information from large and dynamically changing data sets, and then detecting and extrapolating patterns in order to apply that information to new solutions and actions.

Machine Learning Modeling Flow

• Get data: Gather data from different sources

• Clean, prepare and manipulate data: Check for null values and outliers and clean them up

• Train model: Build a model using the training data

• Evaluation: Measure the model's performance on held-out data; tune on validation data and keep the test data for the final estimate

• Improve: Optimize the model to increase its accuracy
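
The flow above can be walked through end to end on a toy problem. Everything in this sketch is assumed for illustration: the data is synthetic, the plausible range used to drop outliers is arbitrary, and the "model" is a single threshold on one feature.

```python
import random

# Get data: hypothetical (feature, label) rows; None marks a missing value.
rng = random.Random(0)
raw = [(rng.gauss(2.0, 1.0), 0) for _ in range(50)]
raw += [(rng.gauss(5.0, 1.0), 1) for _ in range(50)]
raw += [(None, 1), (1e6, 0)]            # one null value and one obvious outlier

# Clean: drop nulls, then drop values outside an assumed plausible range.
clean = [(x, y) for x, y in raw if x is not None and -10.0 <= x <= 20.0]

# Split: hold out 30% of the cleaned rows for evaluation.
rng.shuffle(clean)
n_test = len(clean) * 3 // 10
test, train = clean[:n_test], clean[n_test:]

def accuracy(rows, threshold):
    """Fraction of rows where the rule 'x > threshold means label 1' is right."""
    return sum((x > threshold) == bool(y) for x, y in rows) / len(rows)

# Train: pick the threshold with the best accuracy on the training rows.
threshold = max((x for x, _ in train), key=lambda t: accuracy(train, t))

# Evaluate: score the fitted rule on rows it never saw during training.
train_acc = accuracy(train, threshold)
test_acc = accuracy(test, threshold)
print(f"train acc={train_acc:.2f}, test acc={test_acc:.2f}")
```

The "Improve" step would loop back from here, for example by trying a richer model or more features and comparing scores on validation data rather than on the final test rows.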

DATA IN MACHINE LEARNING

• Data forms the main source of learning in Machine Learning

• The data that is being referenced here can be in any format, can be received at any frequency and can be of any size

• Data, in the ML context, can be either labeled or unlabeled

LABELED DATA

Labeled data is a set of unlabeled data in which each piece has been augmented with some sort of meaningful "label"

For example, a label for an unlabeled photo might be whether it contains a cat or a dog

UNLABELED DATA

Unlabeled data consists of samples of natural or human-created artifacts that you can obtain relatively easily from the world

Some examples of unlabeled data are photos, audio recordings, videos, news articles, tweets, etc.

DATA TERMINOLOGY

• Column: Describes data of a single type. For example, a column of weights or heights or prices. All the data in one column will have the same scale

• Row: A row describes a single entity or observation whose properties are described by columns.

• Cell: A cell is a single value in a row and column. It is represented as
Cell = dataset[row][column]
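
As a small illustration of the row, column and cell vocabulary, here is a sketch with a list-of-lists dataset. The column names and values are hypothetical, and indices are 0-based as is usual in Python.

```python
# Hypothetical dataset: each row is one person, each column one property.
columns = ["weight_kg", "height_cm", "age_years"]
dataset = [
    [70.5, 175.0, 31],   # row 0
    [82.0, 180.5, 45],   # row 1
    [64.3, 162.0, 27],   # row 2
]

row = 1
column = columns.index("height_cm")
cell = dataset[row][column]          # a single value: row 1, height column
print(cell)

# A whole column is the same position taken from every row.
heights = [r[column] for r in dataset]
print(heights)
```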
Some Notation/Nomenclature/Convention

Supervised Learning requires training data given as a set of input-output pairs $\{(x_n, y_n)\}_{n=1}^{N}$
Unsupervised Learning requires training data given as a set of inputs $\{x_n\}_{n=1}^{N}$
Each input $x_n$ is (usually) a vector containing the values of the features or attributes or covariates that encode properties of the data it represents, e.g.,
Representing a 7 × 7 image: $x_n$ can be a 49 × 1 vector of pixel intensities

Note: Good features can also be learned from data (feature learning) or extracted using hand-crafted rules defined by a domain expert. Having a good set of features is half the battle won!
Each $y_n$ is the output or response or label associated with input $x_n$
We will assume each input $x_n$ to be a D × 1 column vector (its transpose $x_n^T$ will be a row vector)
$x_{nd}$ will denote the d-th feature of the n-th input
We will use X (N × D feature matrix) to collectively denote all the N inputs
We will use y (N × 1 output/response/label vector) to collectively denote all the N outputs

[Figure: the feature matrix X has N rows and D columns, each column being a feature. Row n is $x_n^T = (x_{n1}, x_{n2}, \ldots, x_{nD})$, and entry $y_n$ of the output vector y is the output for input n.]
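
The notation can be made concrete with a tiny example. The sketch below builds a feature matrix X and an output vector y from N = 3 made-up 7 × 7 images; the pixel values and labels are arbitrary, and indices are 0-based here whereas the slides count from 1.

```python
# Hypothetical data: N = 3 images of size 7 x 7, with pixel intensities in 0..255.
N, H, W = 3, 7, 7
images = [[[(n + r * W + c) % 256 for c in range(W)] for r in range(H)]
          for n in range(N)]
labels = [0, 1, 0]                      # y_n: one label per image

def flatten(image):
    """Turn an H x W image into a single feature vector of length H * W."""
    return [pixel for row in image for pixel in row]

X = [flatten(img) for img in images]    # feature matrix: N rows, D columns
y = labels                              # output vector of length N
D = len(X[0])

n, d = 2, 10
x_nd = X[n][d]                          # d-th feature of the n-th input
print(len(X), D, x_nd)
```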

VISUALIZATION DATA TERMINOLOGY

         Column 1   Column 2   Column 3
Row 1    Cell11     Cell12     Cell13
Row 2    Cell21     Cell22     Cell23
Row 3    Cell31     Cell32     Cell33

STATISTICAL LEARNING PERSPECTIVE

The statistical perspective frames data in the context of a hypothetical function (f) that the machine learning algorithm is trying to learn

The columns that are the inputs are referred to as input variables

The column of data we would like to predict is called the output variable or response variable

• The most common type of machine learning is to learn the mapping Y = f(X) to make predictions of Y for new X

• This is called predictive modeling or predictive analytics and the goal is to make the most accurate predictions possible

• If there is more than one input variable, together they are referred to as the input vector

• For example, a statistics text may talk about the input variables as independent variables and the output variable as the dependent variable.
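
As one concrete instance of learning a mapping Y = f(X), the sketch below fits a straight line by ordinary least squares and uses it to predict Y for a new X. The choice of a linear f, the helper names and the training values are all assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Least-squares estimates of slope and intercept for y = a * x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical training data generated from y = 2x + 1.
X = [0.0, 1.0, 2.0, 3.0, 4.0]   # input (independent) variable
Y = [1.0, 3.0, 5.0, 7.0, 9.0]   # output (dependent) variable

a, b = fit_line(X, Y)

def f(x):
    """The learned mapping from input to predicted output."""
    return a * x + b

prediction = f(10.0)            # prediction of Y for a new X
print(a, b, prediction)
```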

A model usually gives its best results when tested on the same data on which it was trained, so that score overstates how well it will do on new data. If you don't have much data, you should stick to simple models.
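
To illustrate the gap between a score on the training data and a score on held-out data, here is a minimal sketch with a 1-nearest-neighbour classifier, which simply memorizes its training set. The noisy synthetic data and the function names are assumptions for this example only.

```python
import random

def nearest_neighbor_predict(train, x):
    """Return the label of the training point closest to x (1-NN)."""
    return min(train, key=lambda row: abs(row[0] - x))[1]

def accuracy(train, rows):
    """Fraction of rows whose label matches the 1-NN prediction."""
    return sum(nearest_neighbor_predict(train, x) == y for x, y in rows) / len(rows)

# Hypothetical noisy data: label is 1 when x > 0.5, with 20% of labels flipped.
rng = random.Random(0)
data = []
for _ in range(200):
    x = rng.random()
    data.append((x, int(x > 0.5) ^ int(rng.random() < 0.2)))
train, held_out = data[:100], data[100:]

train_acc = accuracy(train, train)        # each point is its own nearest neighbour
held_out_acc = accuracy(train, held_out)  # points the model has never seen
print(train_acc, held_out_acc)
```

Because every training point is its own nearest neighbour, the training score is perfect even though a fifth of the labels are noise, which is why only the held-out score says anything about generalization.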

Sample Question
Why is it not advisable to test a model on the same data used for training?
A. It can lead to underfitting.
B. It does not provide a true measure of the model's performance.
C. It increases the computational cost unnecessarily.
D. It reduces the model's ability to generalize to new data

What is a potential risk when a model is trained and tested on the same dataset?
A. The model may not perform well on unseen data due to lack of exposure.
B. The model will require more data to be validated accurately.
C. The computational time for training the model will increase.
D. The model will be too complex to understand.

Next class: Parametric and Non-parametric Machine Learning
