1-Introduction To Machine Learning
Supervised Learning
Introduction to Machine Learning
LEARNING OBJECTIVES
Introduction to Machine Learning
Machine Learning is…
Machine learning is programming computers to optimize a performance criterion using example data or past experience.
-- Ethem Alpaydin
The goal of machine learning is to develop methods that can automatically detect patterns in data, and then to use the
uncovered patterns to predict future data or other outcomes of interest.
-- Kevin P. Murphy
The field of pattern recognition is concerned with the automatic discovery of regularities in data through the use of
computer algorithms and with the use of these regularities to take actions.
-- Christopher M. Bishop
What is This Image?
Humayun's Tomb,
located in Delhi,
India
[Figure labels: past, future]
• Define the problem
• Define the data requirements and their sources
• Decide whether the whole dataset is needed or a subset will do
Data everywhere! Big Data Statistics 2023: How Much Data is in The World?
• Global big data analytics market annual revenue is estimated to reach $68.09 billion by 2025.
• There were 79 zettabytes of data generated worldwide in 2021.
• 90% of the data in the global datasphere is replicated data.
• In 2020, every person generated 1.7 megabytes of data every second.
• Global IoT connections already generated 13.6 zettabytes of data in 2019 alone.
• By 2025, more than 150 zettabytes of big data will need analysis.
• In 2021, 24% of big data revenue came from software, 16% from hardware, and another 24% from services.
• The COVID-19 pandemic increased the rate of data breaches by more than 400%.
• By 2027, the use of big data application database solutions and analytics is predicted to grow to $12 billion.
• 97.2% of organizations are investing in big data and AI.
• Using big data, Netflix saves $1 billion per year on customer retention.
Data types
Data comes in different sizes and different flavors (types):
• Texts
• Numbers
• Clickstreams
• Graphs
• Tables
• Images
• Transactions
• Videos
[Figure labels: data acquisition; practical usage; universal set (unobserved)]
k-Fold Cross-Validation
Picture from: https://scikit-learn.org/stable/modules/cross_validation.html
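The slide names k-fold cross-validation without spelling out the procedure, so here is a minimal sketch of the index bookkeeping in plain Python. The function name `k_fold_indices` and the toy sizes are illustrative assumptions, not part of the original slides; the data is split into k consecutive folds, and each fold serves once as the validation set while the remaining folds form the training set.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder so that fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


# Illustrative usage: 10 samples, 5 folds.
folds = list(k_fold_indices(n_samples=10, k=5))
for i, (train_idx, val_idx) in enumerate(folds):
    print(f"fold {i}: validate on {val_idx}, train on {len(train_idx)} samples")
```

This sketch does not shuffle the data; in practice the samples are usually shuffled before splitting unless their order carries meaning, for example in time series.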
Do validation strategies matter?
SDE vs SIE
Difference between Scene Dependent Evaluation (SDE) and Scene Independent Evaluation (SIE) data
division schemes. In SDE setup, training and testing video frames share the same background, leading to
high similarity between them. However, in SIE, completely unseen videos are tested for evaluation.
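The difference between the two division schemes can be sketched on a toy dataset. Everything below is a hypothetical illustration: the `(video_id, frame_id)` tuples, the function names, and the rule for choosing test frames are assumptions made for the example, not the protocol of any particular benchmark.

```python
# Hypothetical toy dataset: 4 videos with 5 frames each, stored as (video_id, frame_id).
frames = [(video, frame) for video in range(4) for frame in range(5)]


def sde_split(frames, test_every=4):
    """Scene-dependent: every video contributes frames to both train and test."""
    train = [f for f in frames if f[1] % test_every != 0]
    test = [f for f in frames if f[1] % test_every == 0]
    return train, test


def sie_split(frames, test_videos):
    """Scene-independent: whole videos are held out, so test scenes are unseen."""
    train = [f for f in frames if f[0] not in test_videos]
    test = [f for f in frames if f[0] in test_videos]
    return train, test


def shared_videos(train, test):
    """Return the set of video ids that appear in both splits."""
    return {v for v, _ in train} & {v for v, _ in test}


sde_train, sde_test = sde_split(frames)
sie_train, sie_test = sie_split(frames, test_videos={3})

print("SDE videos shared by train and test:", sorted(shared_videos(sde_train, sde_test)))
print("SIE videos shared by train and test:", sorted(shared_videos(sie_train, sie_test)))
```

Under the SDE split every video shows up on both sides, which is why test frames look so similar to training frames; under the SIE split the overlap is empty.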
PHASES OF MACHINE LEARNING
The figure shows how learning can be applied to predict the behavior
Sample Questions
A) Machine Learning is a type of artificial intelligence that allows a system to learn from data.
B) The science of getting computers to operate without being explicitly programmed is known as machine learning.
C) A&B
A) Training Phase
B) Validation Phase
C) Testing Phase
• Continuous Improvement
LIMITATIONS OF MACHINE LEARNING
Performance of ML
Algorithms
• The algorithms control the search to find and build the knowledge
structures.
• Spam Detection
• Speech Recognition
• Language translation
• Fraud detection
• Product Recommendation
Applications of Machine Learning
Classification
Machine Translation
Image Captioning
Every day people are finding more and more applications of machine learning.
Some more applications of machine learning:
▪ Driverless vehicles
▪ Email Spam and Malware Filtering
▪ Online Customer Support
▪ Product Recommendations
▪ Search Engine Result Refining
▪ Online Fraud Detection
▪ Sentiment Analysis
▪ Action Recognition
▪ Anomaly Detection
▪ Intelligent Video Surveillance
▪ Depression Analysis
▪ Traffic Prediction (varieties of prediction tasks)
Did You Know?
ML is a subset of artificial intelligence that automates data mining:
Machine learning can be described as a more automated and continuous version of data mining. Data mining can often detect patterns in data sets that no human would be able to find. Machine learning is capable of generalizing information from large and dynamically changing data sets, and then detecting and extrapolating patterns in order to apply that information to new solutions and actions.
Machine Learning Modeling Flow
MODELING FLOW
Labeled Data takes a set of unlabeled data and augments each piece of it with some sort of meaningful "label".
For example, the label for an unlabeled photo might be whether it contains a cat or a dog.
UNLABELED DATA
Unlabeled Data consists of samples of natural or human-created artifacts that you can obtain relatively easily from the world.
Supervised Learning requires training data given as a set of input-output pairs $\{(x_n, y_n)\}_{n=1}^{N}$
Unsupervised Learning requires training data given as a set of inputs $\{x_n\}_{n=1}^{N}$
Each input $x_n$ is (usually) a vector containing the values of the features or attributes or covariates that encode properties of the data it represents, e.g.,
Representing a 7 × 7 image: $x_n$ can be a 49 × 1 vector of pixel intensities
Note: Good features can also be learned from data (feature learning) or extracted using hand-crafted rules defined by a domain expert. Having a good set of features is half the battle won!
Each $y_n$ is the output or response or label associated with input $x_n$
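The 7 × 7 image example can be made concrete with a short sketch. The pixel values below are invented for illustration, and row-major flattening is one common choice among several possible orderings.

```python
# Hypothetical 7 x 7 grayscale image: 7 rows of 7 pixel intensities in [0, 255].
image = [[(row * 7 + col) % 256 for col in range(7)] for row in range(7)]

# Row-major flattening: concatenate the rows into a single feature vector x_n.
x_n = [pixel for row in image for pixel in row]

# The number of features D equals the number of pixels.
print(len(x_n))
```

The pixel at row r and column c (both counted from zero) ends up at position 7 * r + c of the flattened vector.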
Some Notation/Nomenclature/Convention
We will assume each input $x_n$ to be a D × 1 column vector (its transpose $x_n^T$ will be a row vector)
$x_{nd}$ will denote the d-th feature of the n-th input
We will use X (N × D feature matrix) to collectively denote all the N inputs
We will use y (N × 1 output/response/label vector) to collectively denote all the N outputs
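As a small illustration of this notation, the sketch below stores a toy dataset as nested Python lists. The numbers and the helper `feature` are assumptions made for the example; the helper uses the 1-based indices of the notation, whereas Python lists are 0-based.

```python
# Hypothetical toy dataset with N = 3 inputs and D = 2 features per input.
X = [
    [5.1, 3.5],  # row 1 holds the transpose of x_1
    [4.9, 3.0],  # row 2 holds the transpose of x_2
    [6.2, 2.9],  # row 3 holds the transpose of x_3
]
y = [0, 0, 1]    # one output per input

N, D = len(X), len(X[0])


def feature(X, n, d):
    """Return x_nd, the d-th feature of the n-th input (1-based indices)."""
    return X[n - 1][d - 1]


print(N, D)
print(feature(X, 3, 2))
```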
STATISTICAL LEARNING PERSPECTIVE
The statistical perspective frames data in the context of a hypothetical function (f) that the machine learning algorithm is trying to learn.
The columns that are the inputs are referred to as input variables.
• If there is more than one input variable, they are collectively referred to as the input vector.
• For example, a statistics text may talk about the input variables as independent
variables and the output variable as the dependent variable.
A model gives its best results when tested on the same data on which it was trained, so that score overstates how well it will do on new data. If you don't have much data, you should stick to simple models.
Sample Question
Why is it not advisable to test a model on the same data used for training?
A. It can lead to underfitting.
B. It does not provide a true measure of the model's performance.
C. It increases the computational cost unnecessarily.
D. It reduces the model's ability to generalize to new data.
What is a potential risk when a model is trained and tested on the same dataset?
A. The model may not perform well on unseen data due to lack of exposure.
B. The model will require more data to be validated accurately.
C. The computational time for training the model will increase.
D. The model will be too complex to understand.