Lecture 01: Introduction to Machine Learning (Chapter 1)
Fall 2024
Outline
Fundamental concepts:
What is Machine Learning? Why do we use Machine Learning?
Types of Learning
Supervised Learning
Unsupervised Learning
Recommendation
Reinforcement Learning
Design cycle
Data collection
Feature choice
Model choice
Training
Evaluation
Demo: SK-learn example, Linear Regression
The main goal for this course
Get a deeper view into the process.
the theory of machine learning
why, when, which, what…
What is Machine Learning?
Definition 1: Machine Learning is the science (and art) of
programming computers so they can learn from data.
Why Use Machine Learning?
Consider how you would write a spam filter using traditional programming techniques
- Image recognition
y = f(image) => “dog”
Machine Learning approach
Why Is Machine Learning Possible?
Mass Storage
More data available
online learning
Types of Learning
Supervised learning (Classification)
Training data includes desired outputs
Unsupervised learning (Clustering)
Training data does not include desired outputs
Recommendation (Collaborative Learning)
Semi-supervised learning
Training data includes a few desired outputs
Reinforcement learning
Rewards from sequence of actions
Learning from delayed feedback by interacting with the environment
Supervised Learning
In supervised learning, the training data you feed to the
algorithm includes the desired solutions, called labels.
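A minimal sketch of this idea, assuming scikit-learn is installed. The tiny fruit dataset (weight in grams, smoothness score) and its labels are made up for illustration and are not from the lecture; k-Nearest Neighbors is just one possible choice of algorithm.

```python
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical training data: [weight_g, smoothness] for each fruit.
X_train = [[150, 0.9], [170, 0.8], [140, 0.95], [120, 0.3], [130, 0.2], [110, 0.25]]
# Labels: the desired solutions supplied together with the training data.
y_train = ["apple", "apple", "apple", "orange", "orange", "orange"]

clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X_train, y_train)

# Predict the label of a new, unseen example.
prediction = clf.predict([[160, 0.85]])[0]
print(prediction)
```

The three nearest training examples to the new point are all labeled "apple", so that is the label the classifier returns.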
Supervised Learning Algorithms
k-Nearest Neighbors
Linear Regression/Generalized Regression
Logistic Regression
Support Vector Machines (SVMs)
Decision Trees
Random Forests
Neural Networks
Naïve Bayes
Unsupervised Learning
In unsupervised learning, the training data you feed to the
algorithm is unlabeled.
Visualization: https://projector.tensorflow.org/
Unsupervised Machine Learning Algorithms
Clustering
k-Means
Hierarchical Cluster Analysis (HCA)
Expectation Maximization
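As a rough illustration of clustering, the sketch below runs k-Means on unlabeled points, assuming scikit-learn and NumPy are installed. The two synthetic blobs are invented for this example; no labels are passed to the algorithm.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical unlabeled data: two well-separated blobs in 2-D.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(50, 2))
X = np.vstack([blob_a, blob_b])

# Only X is given: the algorithm has to discover the groups by itself.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

print(kmeans.cluster_centers_)
```

The cluster ids (0 or 1) are arbitrary names; what matters is that points from the same blob end up with the same id.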
Reinforcement Learning
The learning system, called an agent, can observe the
environment, select and perform actions, and get rewards
in return (or penalties in the form of negative rewards).
It must then learn by itself the best strategy, called a policy,
to get the most reward over time.
Reinforcement Learning
Learning a policy: A sequence of outputs
No supervised output but delayed reward
Credit assignment problem
Game playing
Robot in a maze
Multiple agents, partial observability, ...
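To make delayed reward and credit assignment concrete, here is a small sketch using tabular Q-learning, one common reinforcement learning algorithm (the slides do not name a specific one). The environment is a made-up five-cell corridor standing in for the maze: the robot is rewarded only on reaching the last cell. The values of alpha, gamma and epsilon are assumed for illustration.

```python
import numpy as np

# Hypothetical environment: a corridor of 5 cells. The robot starts in cell 0
# and receives a reward of +1 only when it reaches cell 4 (delayed reward).
N_STATES, GOAL = 5, 4
ACTIONS = (-1, +1)  # move left, move right

def step(state, action_idx):
    next_state = min(max(state + ACTIONS[action_idx], 0), N_STATES - 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

rng = np.random.default_rng(0)
q = np.zeros((N_STATES, len(ACTIONS)))      # one value per (state, action)
alpha, gamma, epsilon = 0.5, 0.9, 0.2       # assumed hyperparameters

for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit, sometimes explore (and break ties randomly).
        if rng.random() < epsilon or q[state, 0] == q[state, 1]:
            action = int(rng.integers(len(ACTIONS)))
        else:
            action = int(np.argmax(q[state]))
        next_state, reward, done = step(state, action)
        # Q-learning update: the discounted future value carries the final
        # reward back to the earlier actions that led to it.
        target = reward + (0.0 if done else gamma * q[next_state].max())
        q[state, action] += alpha * (target - q[state, action])
        state = next_state

# Greedy policy for the non-goal cells: the preferred move in each cell.
policy = [ACTIONS[int(np.argmax(q[s]))] for s in range(GOAL)]
print(policy)
```

After training, the greedy policy moves right (+1) in every cell, even though only the very last move is ever rewarded directly.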
Another criterion used to
classify ML systems: Batch and
Online Learning
- Whether or not they can learn incrementally
(1) Batch learning (offline) (2) Online learning (on the fly)
Example of Online Learning (such as stock price)
A model is trained and launched into production, and then it keeps learning as
new data comes in
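One way to sketch online learning with scikit-learn is `SGDRegressor.partial_fit`, which updates a model one mini-batch at a time. The data stream below is synthetic (an assumed relation y = 3x + 2 with small noise) and only stands in for something like incoming price data.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
model = SGDRegressor(random_state=0)

# Hypothetical data stream: mini-batches arrive one at a time.
for _ in range(200):
    X_batch = rng.uniform(0, 1, size=(10, 1))
    y_batch = 3.0 * X_batch[:, 0] + 2.0 + rng.normal(0, 0.01, size=10)
    # Incremental update: the model is not retrained from scratch.
    model.partial_fit(X_batch, y_batch)

print(model.coef_, model.intercept_)
```

A batch learner would instead call `fit` once on the full dataset and would have to be retrained on everything to absorb new data.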
Instance-Based Versus Model-Based
Learning
A third way to categorize Machine Learning systems:
Whether they work by simply comparing new data points to known
data points, or instead detect patterns in the training data and build a
predictive model, much like scientists do
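The contrast can be sketched with two scikit-learn estimators on the same made-up data (y = 2x exactly): a 1-nearest-neighbor regressor as the instance-based learner and linear regression as the model-based learner. The data and the query point are assumptions chosen to make the difference visible.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

# Hypothetical training data following y = 2x exactly.
X_train = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y_train = 2.0 * X_train[:, 0]

# Instance-based: keeps the training points and compares new points to them.
knn = KNeighborsRegressor(n_neighbors=1).fit(X_train, y_train)
# Model-based: fits the parameters of a line that summarizes the data.
lin = LinearRegression().fit(X_train, y_train)

X_new = np.array([[10.0]])
knn_pred = knn.predict(X_new)[0]   # value of the closest known point (x = 4)
lin_pred = lin.predict(X_new)[0]   # value on the fitted line at x = 10
print(knn_pred, lin_pred)
```

Far from the training data the two approaches disagree: the nearest-neighbor model returns 8.0, the target of its closest known point, while the fitted line extrapolates to about 20.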
Applications
Application: Character recognition
Automated mail sorting, processing bank checks
Scanner captures an image of the text
Image is converted into constituent characters
Different Algorithms
Application: Fingerprint recognition
Application:
Image Segmentation
Application: Brain Tissue Segmentation
More Applications (Book, p. 5)
Analyzing images of products on a production line to automatically
classify them
Detecting tumors in brain scans
Automatically classifying news articles
Automatically flagging offensive comments on discussion forums
Summarizing long documents automatically
Creating a chatbot or a personal assistant
Forecasting your company’s revenue next year, based on many
performance metrics
Making your app react to voice commands
Detecting credit card fraud
Segmenting clients based on their purchases so that you can design
a different marketing
strategy for each segment
Representing a complex, high-dimensional dataset in a clear and
insightful diagram
Recommending a product that a client may be interested in, based
ChatGPT (Nov. 2022)
ChatGPT is a chatbot launched by OpenAI
- Reinforcement learning
A typical machine learning system
A machine learning system contains
A sensor
A preprocessing mechanism
A feature extraction mechanism (manual or automated)
A classification algorithm
A set of examples (training set) already classified or described
Common Machine Learning Algorithms
1. Supervised:
Regression: Linear Regression and more
Classification
Decision Trees
Nearest Neighbors
Logistic Regression
2. Unsupervised:
PCA ...
The design cycle
Data collection
Probably the most time-intensive component of a pattern recognition (PR) project
How many examples are enough?
Feature Selection/Engineering
Critical to the success of the PR problem
“Garbage in, garbage out”
Requires basic prior knowledge
Model choice
Statistical, neural and structural approaches
Parameter settings
Training
Given a feature set and a “blank” model, adapt the model to explain
the data
Supervised, unsupervised and reinforcement learning
Evaluation
How well does the trained model do?
Overfitting vs. generalization
Features
The combination of d features is represented as a d-
dimensional column vector called a feature vector
The d-dimensional space defined by the feature vector is
called the feature space
Objects are represented as points in feature space. This
representation is called a scatter plot
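A small NumPy sketch of these definitions. The three objects and their two features (length and average intensity) are hypothetical values chosen only to show the shapes involved.

```python
import numpy as np

# Hypothetical measurements for three objects with d = 2 features each:
# [length_cm, average_intensity].
X = np.array([
    [42.0, 0.31],
    [55.0, 0.72],
    [47.5, 0.40],
])

n_objects, d = X.shape

# One object's feature vector as a d-dimensional column vector.
x0 = X[0].reshape(d, 1)

print(n_objects, d, x0.shape)
```

Each row of `X` is one point in the 2-dimensional feature space; plotting the first column against the second (for example with matplotlib's `plt.scatter(X[:, 0], X[:, 1])`) would give the scatter plot.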
Feature extraction
Consider the following scenario:
Classification
A fish processing plant wants to automate the process of sorting
incoming fish according to species (salmon or sea bass)
The automation system consists of
a conveyor belt for incoming products
two conveyor belts for sorted products
a pick-and-place robotic arm
a vision system with an overhead CCD camera
a computer to analyze images and control the robot arm
Improving the performance of our ML system
We combine “length” and “average intensity of the
scales” to improve class separability
We compute a linear discriminant function to separate
the two classes, and obtain a classification rate of 95.7%
Task: maximization of
classification accuracy.
Task: minimization of
classification error.
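A sketch of a linear discriminant on two features, assuming scikit-learn's `LinearDiscriminantAnalysis` as the classifier (the slides do not say how the discriminant was computed). The fish measurements are synthetic stand-ins, so the resulting numbers will not match the 95.7% quoted above.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
# Synthetic stand-in data: columns are [length, average scale intensity].
salmon = rng.normal(loc=[4.0, 3.0], scale=0.7, size=(200, 2))
sea_bass = rng.normal(loc=[7.0, 6.0], scale=0.7, size=(200, 2))
X = np.vstack([salmon, sea_bass])
y = np.array([0] * 200 + [1] * 200)  # 0 = salmon, 1 = sea bass

lda = LinearDiscriminantAnalysis().fit(X, y)
accuracy = lda.score(X, y)   # classification rate on the training data
error = 1.0 - accuracy       # classification error

print(f"accuracy={accuracy:.3f}, error={error:.3f}")
```

Maximizing accuracy and minimizing error are the same task, since error = 1 - accuracy. Note that this accuracy is measured on the training data itself.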
Cost versus Classification rate
Our linear classifier was designed to minimize the overall
misclassification rate
Is this the best objective function for our fish processing plant?
The cost of misclassifying salmon as sea bass is that the end customer will
occasionally find a tasty piece of salmon when he purchases sea bass
The cost of misclassifying sea bass as salmon is an end customer upset
when he finds a piece of sea bass purchased at the price of salmon
Intuitively, we could adjust the decision boundary to minimize this
cost function
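The sketch below illustrates adjusting a decision boundary to minimize cost rather than error count. Everything numeric in it is assumed: the one-dimensional discriminant scores are synthetic, and the 1:5 cost ratio is a made-up example of the asymmetry described above.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in: one discriminant score per fish; higher means "more like sea bass".
salmon_scores = rng.normal(loc=-1.0, scale=1.0, size=500)
sea_bass_scores = rng.normal(loc=1.0, scale=1.0, size=500)

# Hypothetical costs: selling sea bass as salmon upsets the customer more.
COST_SALMON_AS_BASS = 1.0
COST_BASS_AS_SALMON = 5.0

def total_cost(threshold):
    # Rule: score >= threshold -> classify as sea bass, otherwise salmon.
    salmon_as_bass = np.sum(salmon_scores >= threshold)
    bass_as_salmon = np.sum(sea_bass_scores < threshold)
    return COST_SALMON_AS_BASS * salmon_as_bass + COST_BASS_AS_SALMON * bass_as_salmon

# Candidate thresholds from -3.0 to 3.0 in steps of 0.05; 0.0 is the midpoint
# between the two class means.
candidates = np.arange(-60, 61) * 0.05
costs = np.array([total_cost(t) for t in candidates])
best_threshold = candidates[np.argmin(costs)]

print(best_threshold, total_cost(best_threshold), total_cost(0.0))
```

Because labeling sea bass as salmon is the more expensive mistake here, the cost-minimizing threshold would be expected to move so that fewer fish are labeled salmon, at the price of more misclassifications overall.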
The issue of generalization
The recognition rate of our linear classifier (95.7%) met the design
specs, but we still think we can improve the performance of the
system
We then design an artificial neural network with five hidden layers, a
combination of logistic and hyperbolic tangent activation functions, train
it with the Levenberg-Marquardt algorithm and obtain an impressive
classification rate of 99.9975% with the following decision boundary
The issue of generalization
Satisfied with our classifier, we integrate the system and
deploy it to the fish processing plant
After a few days, the plant manager calls to complain
that the system is misclassifying an average of 25% of
the fish
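The gap between accuracy on the training data and accuracy on new data can be reproduced in a few lines. This sketch uses an unconstrained decision tree instead of the neural network from the story, and synthetic overlapping classes instead of fish data; both are assumptions made to keep the example small.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Synthetic stand-in data with overlapping classes, so no model can be perfect.
X = rng.normal(size=(400, 2))
y = (X[:, 0] + X[:, 1] + rng.normal(scale=1.0, size=400) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)

# An unconstrained tree keeps splitting until it has memorized the training set.
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
train_acc = tree.score(X_train, y_train)
test_acc = tree.score(X_test, y_test)

print(f"train accuracy={train_acc:.3f}, test accuracy={test_acc:.3f}")
```

The tree scores perfectly on the examples it was trained on but noticeably worse on the held-out half, which is the same pattern the plant manager observed after deployment.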
Overfitting and underfitting
Avoid overfitting/underfitting
Dataset splitting: split your data into two sets, a training set and a test set
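A minimal sketch of the split using scikit-learn's `train_test_split`. The dataset is a placeholder array, and the 80/20 ratio and `random_state` value are assumed choices rather than values from the lecture.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical dataset: 100 examples with 3 features each.
X = np.arange(300).reshape(100, 3)
y = np.arange(100)

# Hold out 20% of the examples for testing; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(X_train.shape, X_test.shape)
```

The model is fitted only on the training part; the test part is kept aside to estimate how well it generalizes to unseen data.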
DA 515 vs. DA 516 will cover:
DA 515:
Basic Knowledge
Traditional algorithms
SK-learn
Transformer, GPT
Reinforcement Learning
Keras/Tensorflow
DEMO: Linear Regression
Example: Linear Regression
Linear Regression
Formula
Gradient descent
Sk-Learn
Regression-1: Evaluation
How to measure your model
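Two common ways to measure a regression model are the mean squared error (MSE) and the R² score; the slide does not say which metric the course uses, so treat these as typical examples. The target and prediction values below are made up.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical true targets and model predictions.
y_true = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.0, 5.0, 9.0])

mse = mean_squared_error(y_true, y_pred)   # mean of the squared errors
rmse = np.sqrt(mse)                        # same units as the target
r2 = r2_score(y_true, y_pred)              # 1.0 would be a perfect fit

print(mse, rmse, r2)
```

Here the squared errors are 1, 0 and 4, so the MSE is 5/3, and R² is 1 - 5/8 = 0.375.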
Regression-2:
Find the best-fitting function
Optimization problem
Under the hood:
Gradient Descent
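A sketch of batch gradient descent for linear regression with a mean squared error cost, written with NumPy only. The data (y = 1 + 2x, noise free), the learning rate and the number of iterations are all assumptions for illustration; note that scikit-learn's `LinearRegression` itself solves the least-squares problem directly rather than by gradient descent.

```python
import numpy as np

# Hypothetical noise-free data generated from y = 1 + 2x.
x = np.linspace(0.0, 1.0, 50)
y = 1.0 + 2.0 * x

X_b = np.column_stack([np.ones_like(x), x])  # add a bias column of ones
theta = np.zeros(2)                          # [intercept, slope], starting at zero
learning_rate = 0.1                          # assumed value
m = len(y)

for _ in range(5000):
    residuals = X_b @ theta - y
    gradient = (2.0 / m) * X_b.T @ residuals   # gradient of the MSE w.r.t. theta
    theta -= learning_rate * gradient          # step downhill

print(theta)
```

Each iteration moves the parameters a small step against the gradient of the cost; on this data the parameters approach the intercept 1 and slope 2 used to generate it.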
Demo: 1. A Linear Regression.ipynb
1. B Decision Tree.ipynb
Data preparation
X_train, y_train, X_test, y_test
Scikit-learn library
from sklearn.linear_model import LinearRegression
1. Model representation
lin_reg = LinearRegression()
2. Training (Optimization)
lin_reg.fit(X_train, y_train)
3. Testing
predictions = lin_reg.predict(X_test)
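The three steps above can be put together into a runnable sketch. The notebook's dataset is not shown in the slides, so synthetic data (an assumed relation y = 3x + 4 with small noise) stands in for it; the split ratio and the MSE check at the end are also additions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (not the notebook's dataset): y = 3x + 4 plus small noise.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 2.0, size=(100, 1))
y = 3.0 * X[:, 0] + 4.0 + rng.normal(scale=0.1, size=100)

# Data preparation
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# 1. Model representation
lin_reg = LinearRegression()
# 2. Training (optimization)
lin_reg.fit(X_train, y_train)
# 3. Testing
predictions = lin_reg.predict(X_test)

test_mse = mean_squared_error(y_test, predictions)
print(lin_reg.coef_, lin_reg.intercept_, test_mse)
```

The learned slope and intercept should land close to the values used to generate the data, and the test MSE gives a first measure of how well the model generalizes.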
Summary of Chapter 1
Data collection
Probably the most time-intensive component of a PR project
How many examples are enough?
Feature choice
Critical to the success of the PR problem
“Garbage in, garbage out”
Requires basic prior knowledge
Model choice
Statistical, neural and structural approaches
Parameter settings
Training
Given a feature set and a “blank” model, adapt the model to explain
the data
Supervised, unsupervised and reinforcement learning
Evaluation
How well does the trained model do?
Overfitting vs. generalization
Setup Python Environment
Software
Learning Python
Google Developer Python Tutorial
https://developers.google.com/edu/python/
NumPy Tutorial
https://www.tutorialspoint.com/numpy/
Python tutorial
http://docs.python.org/tutorial/
Software
Python Library
Scikit-learn -- machine learning in Python
https://scikit-learn.org/stable/
Resources
Kaggle Competition
https://www.kaggle.com/
Web pages:
https://machinelearningmastery.com/
https://www.geeksforgeeks.org/machine-learning/
Students:
Read Chapter 1
Next Lecture: Chapter 2 (ML pipeline)
Last Several Slides
Pattern Recognition
Machine Learning
data and is still a little more towards math than programming, but uses
both.
What is the difference?
Artificial Intelligence uses models built by Machine
Learning and other ways to reason about the world and
give rise to intelligent behavior whether this is playing a
game or driving a robot/car.
Summary
Statistics quantifies numbers
Pattern Recognition finds patterns
Data Mining explains patterns
Machine Learning predicts with models
Artificial Intelligence behaves and reasons