Lecture04 - Machine Learning Landscape


Machine Learning

LECTURE 04
JENNIFER JOYCE M. MONTEMAYOR - MAULANA

Department of Computer Science


College of Computer Studies
MSU - Iligan Institute of Technology
Machine Learning
■ science and art of programming computers so that they can learn from data (Geron, 2019)

■ field of study that gives computers the ability to learn without being explicitly programmed
(Samuel, 1959)

■ a machine learning system is trained rather than explicitly programmed.

[Figure: Artificial Intelligence, Machine Learning, and Deep Learning]
■ ARTIFICIAL INTELLIGENCE - any technique that enables computers to mimic human behavior
■ MACHINE LEARNING - ability to learn without being explicitly programmed
■ DEEP LEARNING - extract patterns from data using neural networks

2 Jennifer Joyce M. Montemayor / CSC172 / Lecture 04


An Example: Spam Filter using classic approach
1. Observe what spam typically looks like.
○ You might notice that some words or
phrases (such as “4U,” “credit card,”
“free,” and “amazing”) tend to come up
a lot in the subject. Perhaps you would
also notice a few other patterns in the
sender’s name, the email’s body, and
so on.
2. Write a detection algorithm for each of the
patterns that you noticed, and your program
would flag emails as spam if a number of
these patterns are detected.
3. Test your program, and repeat steps 1 and 2
until it is good enough.
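
The classic approach above can be sketched as a hand-written rule list. This is only an illustration: the phrases are the slide's examples, while the function name `is_spam` and the threshold of 2 are assumptions made for the sketch.

```python
# Hand-written rules, as in the classic approach. The phrases come from the
# slide's examples; the threshold of 2 is an assumption for illustration.
SUSPICIOUS_PHRASES = ["4u", "credit card", "free", "amazing"]


def is_spam(subject, threshold=2):
    """Flag a subject line when at least `threshold` suspicious phrases appear."""
    text = subject.lower()
    hits = sum(1 for phrase in SUSPICIOUS_PHRASES if phrase in text)
    return hits >= threshold


print(is_spam("Amazing FREE credit card 4U"))  # several phrases match
print(is_spam("Meeting notes for Monday"))     # no phrase matches
```

Every new spam pattern means another manual rule, which is why steps 1 and 2 have to be repeated.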



An Example: Spam Filter using Machine Learning approach

Automatically learns which words and phrases are good predictors of spam by detecting
unusually frequent patterns of words in the spam examples compared to the ham examples.
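
As a rough sketch of this idea, the code below compares how often each word appears in spam versus ham subjects and keeps the words that are much more frequent in spam. The function name `learn_spam_words`, the `min_ratio` cutoff, the add-one smoothing, and the tiny example dataset are all assumptions for illustration, not the method used by any particular spam filter.

```python
from collections import Counter


def learn_spam_words(spam_subjects, ham_subjects, min_ratio=2.0):
    """Return words whose smoothed frequency in spam is at least
    `min_ratio` times their frequency in ham."""
    spam_counts = Counter(w for s in spam_subjects for w in s.lower().split())
    ham_counts = Counter(w for s in ham_subjects for w in s.lower().split())
    spam_total = sum(spam_counts.values())
    ham_total = sum(ham_counts.values())
    vocab = set(spam_counts) | set(ham_counts)
    predictors = set()
    for word in vocab:
        # Add-one smoothing so words unseen in one class do not divide by zero
        spam_rate = (spam_counts[word] + 1) / (spam_total + len(vocab))
        ham_rate = (ham_counts[word] + 1) / (ham_total + len(vocab))
        if spam_rate / ham_rate >= min_ratio:
            predictors.add(word)
    return predictors


# Made-up training examples
spam = ["free credit card offer", "amazing free prize", "free money 4u"]
ham = ["meeting agenda for monday", "lunch on friday", "project status report"]
spam_words = learn_spam_words(spam, ham)
print(sorted(spam_words))
```

Nobody wrote a rule for "free" here; the word is selected because the data shows it far more often in spam than in ham.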



Why Machine Learning?
■ great for problems for which existing solutions require a lot of hand-tuning or long lists of rules
■ complex problems for which there is no known solution at all using classic approach
■ fluctuating environments
■ getting insights about complex problems and large amounts of data - “Machine Learning can help
humans learn”



Types of Machine Learning Systems
There are so many types of Machine Learning systems that it is useful to classify them in broad
categories:
■ whether or not they are trained with human supervision
○ supervised, unsupervised, semisupervised, and Reinforcement Learning
■ whether or not they can learn incrementally on the fly
○ online versus batch learning
■ whether they work by simply comparing new data points to known data points or instead detect
patterns in the training data and build a predictive model
○ instance-based versus model-based learning

Machine learning systems can be classified according to the amount and type of supervision they get
during training.
■ Supervised Learning
■ Unsupervised Learning
■ Semisupervised Learning
■ Reinforcement Learning



TYPES OF MACHINE LEARNING SYSTEMS

Supervised Learning
The training data you feed to the algorithm includes the desired solutions, called labels.

The goal is to learn the relationship/mapping between the input variables and their corresponding
labels, allowing the algorithm to make predictions or decisions when given new or unseen data.


■ Classification - the system is trained with many examples along with their class,
e.g., spam classification


■ Regression - predict a target numeric value given a set of features called predictors

e.g., train a system to predict the price of a car given a set of features (mileage, age, brand)
by giving it many examples of cars, including their prices
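
A minimal sketch of the regression idea, assuming a single feature (mileage) and ordinary least squares as the fitting method. The numbers are made up, and `fit_line` is a name chosen only for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training examples: mileage (thousands of km) -> price (thousands)
mileage = [10, 20, 30, 40]
price = [18, 16, 14, 12]

slope, intercept = fit_line(mileage, price)
predicted = slope * 25 + intercept  # predict the price of an unseen car
print(slope, intercept, predicted)
```

The labels (prices) are what make this supervised: the fitted line is chosen to match them as closely as possible.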

Some supervised learning algorithms:
■ K-Nearest Neighbors
○ Regression, Classification
○ Predicting class or value of an instance based on the majority class or average of its
k-nearest neighbors
• Example: handwriting recognition
■ Linear Regression
○ Regression
○ Predicting continuous output variable based on one or more input features
• Example: predicting house prices based on features like size, number of bedrooms,
location
■ Logistic Regression
○ Classification
○ Predicting the probability of an instance belonging to a class
• Example: binary classification type of problem like spam detection

■ Support Vector Machines (SVMs)
○ Classification, Regression
○ Separating instances into different classes or predicting continuous variable
■ Decision Trees and Random Forests
○ Classification / Regression type of task
○ Making decisions by recursively splitting the dataset based on the most significant features
○ Example: credit scoring, image recognition
■ Neural networks
○ Classification, Regression
○ Learning complex patterns using multiple layers of interconnected nodes / neurons
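
Of the algorithms listed above, k-nearest neighbors is the easiest to sketch in a few lines. The version below handles classification only (majority vote), uses Euclidean distance, and relies on made-up 2-D points; `knn_predict` is a hypothetical helper name.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up labeled examples forming two groups
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["ham", "ham", "ham", "spam", "spam", "spam"]

print(knn_predict(points, labels, (1.1, 1.0)))
print(knn_predict(points, labels, (5.1, 5.0)))
```

For regression, the vote would be replaced by the average of the neighbors' target values.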

Key concepts:
■ Features and Labels
• Feature - an input variable or attribute used to describe the data
• Label - output or target variable that the model is trying to predict
■ Training and Test data
• Training data - labeled dataset used to train the model
• Test data - unseen data used to evaluate the performance of the model
■ Loss Function
• A function that measures the difference between the predicted output and the actual label.
The goal is to minimize the difference.
■ Model Parameters
• Internal variables that are adjusted during training to minimize the loss function
■ Overfitting and Underfitting
• Overfitting - occurs when model learns the training data too well but fails to generalize to
new or unseen data
• Underfitting - occurs when model is too simple and cannot capture the underlying patterns
in the data.
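
The key concepts above can be tied together in one small sketch: a loss function (mean squared error is assumed here), two model parameters `w` and `b` adjusted by gradient descent, and a held-out test set. The data, learning rate, and epoch count are illustrative assumptions.

```python
def mse(w, b, xs, ys):
    """Loss function: mean squared error of the line y = w*x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, learning_rate=0.01, epochs=5000):
    """Adjust the parameters w and b by gradient descent to reduce the MSE."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Made-up data following y = 2x + 1, split into training and test sets
train_x, train_y = [0, 1, 2, 3, 4], [1, 3, 5, 7, 9]  # features and labels
test_x, test_y = [5, 6], [11, 13]                    # unseen data

w, b = train(train_x, train_y)
test_loss = mse(w, b, test_x, test_y)
print(w, b, test_loss)
```

A low training loss paired with a high test loss would point to overfitting; a high loss on both would point to underfitting.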

Unsupervised Learning
The training data you feed to the algorithm is not labeled. The algorithm must find patterns,
relationships or structures within the data without explicit guidance or labeled outcomes.

The system tries to learn without a teacher.

The algorithm tries to learn the inherent structure of the data without being provided with
explicit targets or labels for each data point.

■ Clustering - goal is to group similar data points together based on certain
characteristics/features

Use cases
● Customer segmentation
● Document clustering
● Image segmentation
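
As one possible illustration of clustering, here is a sketch of k-means (Lloyd's algorithm) on one-dimensional data. The slides do not name a specific clustering algorithm, so k-means is an assumption, as are the starting centroids and the made-up customer spending figures.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Lloyd's algorithm on 1-D data: assign points, then recompute centroids."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Made-up customer spending figures with two apparent groups (no labels given)
spending = [10, 12, 11, 50, 52, 49]
centroids, clusters = kmeans_1d(spending, centroids=[0, 100])
print(centroids)
print(clusters)
```

No labels are supplied anywhere; the two customer segments emerge only from how close the values are to each other.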

■ Dimensionality Reduction - goal is to reduce the number of features in the data while retaining
as much information as possible

Use cases
● Feature extraction
● Visualization of high-dimensional data
● Noise reduction
● Anomaly Detection
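
One common way to reduce dimensionality is principal component analysis (PCA); the slides do not name a method, so PCA is an assumption here. The sketch below reduces made-up 2-D points to 1-D by projecting them onto the direction of greatest variance, found with power iteration. It assumes the points are not all identical.

```python
import math


def first_principal_direction(points, iterations=100):
    """Estimate the direction of greatest variance of 2-D points
    using power iteration on the covariance matrix."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(p[0] - mean_x, p[1] - mean_y) for p in points]
    # Entries of the 2x2 covariance matrix
    cxx = sum(x * x for x, _ in centered) / n
    cxy = sum(x * y for x, y in centered) / n
    cyy = sum(y * y for _, y in centered) / n
    vx, vy = 1.0, 0.0  # arbitrary starting vector
    for _ in range(iterations):
        vx, vy = cxx * vx + cxy * vy, cxy * vx + cyy * vy
        norm = math.hypot(vx, vy)
        vx, vy = vx / norm, vy / norm
    return (vx, vy), centered


def project(centered, direction):
    """Reduce each 2-D point to a single coordinate along `direction`."""
    return [x * direction[0] + y * direction[1] for x, y in centered]


# Made-up points lying close to a straight line
points = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.8)]
direction, centered = first_principal_direction(points)
reduced = project(centered, direction)
print(direction)
print(reduced)
```

Each point is now described by one number instead of two, while most of the spread in the data is retained.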

Unsupervised learning is particularly useful in scenarios where labeled data is scarce or expensive to
obtain. It can help discover hidden patterns, associations, or underlying structures in the data that may
not be immediately apparent.

Applications
■ Anomaly Detection
○ detect outliers in a dataset by identifying patterns that deviate from the norm
■ Recommendation Systems
○ Analyze user behavior and provide personalized recommendations based on patterns
discovered on data
■ Generative Models
○ Generating new data samples that resemble training data
■ Market Basket Analysis
○ Analyze purchasing patterns to identify products frequently bought together, aiding product
placement or targeted marketing
■ Image Segmentation
○ Segment images into different regions based on pixel similarities
■ Natural Language Processing (NLP)
○ Tasks like topic modeling where documents are grouped based on the topics they discuss

Semisupervised Learning
In semisupervised learning, the algorithm is trained on a dataset that contains both labeled and
unlabeled examples. The presence of labeled data helps guide the learning process, while the
unlabeled data allows the algorithm to explore and learn patterns from a broader set of examples.

The typical scenario in semi-supervised learning is that labeled examples are scarce or expensive
to obtain, while there is a larger pool of unlabeled data available.

The goal is to leverage the combination of both labeled and unlabeled data to improve the model's
performance.

Approaches

■ Self-Training
○ Idea: Train a model on the labeled data and use it to predict labels for unlabeled data. The
confident predictions on unlabeled data are then added to the labeled dataset for further
training.
○ Example: Training a model on a small set of labeled images and using it to predict labels
for a larger set of unlabeled images.

■ Co-Training
○ Idea: Train multiple models on different sets of features or representations. Each model is
then used to predict labels for the unlabeled data, and instances with consistent
predictions across models are added to the labeled dataset.
○ Example: Training a model on textual features and another model on visual features for
image classification.

■ Semi-Supervised Generative Models:
○ Idea: Use generative models (machine learning models designed to generate new data samples that
resemble a given training dataset) to create realistic synthetic data that can be combined with
the labeled data for training.
○ Example: Using Generative Adversarial Networks (GANs) to generate additional labeled
examples for training.

■ Bootstrapping:
○ Idea: Initialize the model with a small set of labeled examples and iteratively expand the
labeled dataset by selecting instances with high-confidence predictions on unlabeled
data.
○ Example: Classifying web pages as relevant or non-relevant using an initial set of labeled
examples and expanding the labeled dataset over time.
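
Self-training and bootstrapping, as described above, share the same loop: train on the labeled data, pseudo-label the confident predictions, and repeat. The sketch below assumes a deliberately simple model (nearest class centroid on 1-D values, two classes) and defines "confident" as being at least `margin` closer to one centroid than the other; both choices are illustrative assumptions.

```python
def centroids(labeled):
    """Mean feature value per class from (value, label) pairs."""
    sums, counts = {}, {}
    for value, label in labeled:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def self_train(labeled, unlabeled, margin=2.0, rounds=5):
    """Repeatedly pseudo-label the unlabeled points the model is confident about."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(rounds):
        model = centroids(labeled)  # retrain on the current labeled set
        still_unlabeled = []
        for value in unlabeled:
            # Distance to each class centroid, nearest first (two classes assumed)
            ranked = sorted((abs(value - c), label) for label, c in model.items())
            if ranked[1][0] - ranked[0][0] >= margin:
                labeled.append((value, ranked[0][1]))  # confident: add pseudo-label
            else:
                still_unlabeled.append(value)
        if len(still_unlabeled) == len(unlabeled):
            break  # nothing new was labeled in this round
        unlabeled = still_unlabeled
    return labeled, unlabeled


# Two labeled examples and a larger pool of unlabeled values (all made up)
labeled = [(1.0, "A"), (9.0, "B")]
unlabeled = [2.0, 8.0, 5.0, 1.5, 8.5]
final_labeled, leftover = self_train(labeled, unlabeled)
print(final_labeled)
print(leftover)
```

The value 5.0 sits exactly between the two classes, so it never receives a pseudo-label; low-confidence points are left out rather than guessed.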

Semisupervised learning is particularly useful in scenarios where obtaining a large amount of labeled
data is challenging or costly, but there is a significant pool of unlabeled data that can still contribute
to model improvement.

Applications of Semi-Supervised Learning:

■ Text Classification
○ Leveraging a small labeled dataset to improve the classification of large amounts of unlabeled text data.
■ Speech Recognition
○ Using a combination of labeled and unlabeled audio data to improve the accuracy of speech
recognition models.
■ Image Classification
○ Enhancing image classification models by incorporating labeled and unlabeled images.
■ Drug Discovery
○ Combining labeled data on the biological activity of certain compounds with large-scale unlabeled
chemical data to identify potential drug candidates.


Reinforcement Learning
The learning system, called an agent, can observe the environment, select and perform actions,
and get rewards or penalties in return.

The agent must learn by itself the best strategy, called a policy, to get the most reward over time.

A policy defines what action the agent should choose when it is in a given situation.

“An agent learns to make decisions by interacting with an environment. The agent takes actions in
the environment, and based on those actions, it receives feedback in the form of rewards or
punishments. The goal of the agent is to learn a policy, which is a mapping from states to actions,
that maximizes the cumulative reward over time.”

Key components:

1. Agent: The learner or decision-maker that interacts with the environment.
2. Environment: The external system or process with which the agent interacts.
3. State (s): A representation of the current situation or configuration of the environment.
4. Action (a): The decision or move made by the agent in a particular state.
5. Reward (r): The feedback from the environment indicating the immediate benefit or cost of the
agent's action.
6. Policy (π): A strategy or mapping from states to actions that the agent follows to make decisions.
7. Value Function (V or Q): An estimate of the expected cumulative reward that an agent can obtain
from a given state (V) or a state-action pair (Q).

Learning process:

1. Observation: The agent observes the current state of the environment.
2. Action: The agent selects an action based on
its current policy.
3. Transition: The environment transitions to a
new state based on the chosen action.
4. Reward: The agent receives a reward or
punishment based on the transition.
5. Update: The agent updates its policy or value
function based on the observed reward and
the new state.
6. Repeat: Steps 1-5 are repeated over multiple
iterations, allowing the agent to learn and
improve its decision-making strategy.
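
The learning process above can be sketched with tabular Q-learning, one of several possible update rules (the slides do not prescribe one). The environment is a hypothetical corridor of five states where only reaching the last state gives a reward; the hyperparameters `alpha`, `gamma`, and `epsilon` are assumed values.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical corridor: states 0..n_states-1,
    actions 0 (left) and 1 (right), reward 1 for reaching the last state."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):  # 6. Repeat over many episodes
        state = 0
        while state != goal:
            # 1-2. Observe the state and select an action (epsilon-greedy policy,
            # choosing at random when both estimates are equal)
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            # 3. The environment transitions to a new state
            next_state = max(0, state - 1) if action == 0 else state + 1
            # 4. The agent receives a reward
            reward = 1.0 if next_state == goal else 0.0
            # 5. Update the value estimate from the reward and the new state
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
# Greedy policy for every non-terminal state: 0 = left, 1 = right
policy = [0 if left > right else 1 for left, right in q_table[:-1]]
print(policy)
```

The learned policy is simply the action with the highest Q-value in each state, which here means moving right toward the reward.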

Exploration and Exploitation:

Reinforcement learning often involves a trade-off between exploration and exploitation.

The agent needs to explore different actions to discover their effects and, at the same time,
exploit its current knowledge to maximize short-term rewards.
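
One simple way to balance the two is an epsilon-greedy rule: with probability epsilon the agent explores a random action, otherwise it exploits the action with the best estimated reward. The sketch below applies it to a hypothetical three-action problem with noisy rewards; the true means, noise level, and epsilon value are assumptions for illustration.

```python
import random


def choose_action(estimates, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit the best estimate."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))  # explore
    return max(range(len(estimates)), key=lambda a: estimates[a])  # exploit


def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Estimate each action's average reward while acting epsilon-greedily."""
    rng = random.Random(seed)
    estimates = [0.0] * len(true_means)
    counts = [0] * len(true_means)
    for _ in range(steps):
        action = choose_action(estimates, epsilon, rng)
        reward = rng.gauss(true_means[action], 1.0)  # noisy reward
        counts[action] += 1
        # Incremental average of the rewards seen for this action
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


estimates, counts = run_bandit([0.2, 0.5, 1.5])
print(estimates)
print(counts)
```

With epsilon set to 0 the agent never explores and can lock onto a poor action; with epsilon set to 1 it never uses what it has learned.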

Reinforcement learning is a powerful approach for solving problems where an agent needs to make
sequential decisions in an environment with feedback. It has been particularly successful in domains
with complex and dynamic decision-making challenges.

Some Applications:

■ Game Playing: RL has been successful in training agents to play complex games, such as chess, Go, and video games.
■ Robotics: Teaching robots to perform tasks in the real world, such as grasping objects or navigating environments.
■ Autonomous Vehicles: Training autonomous vehicles to make decisions in complex traffic scenarios.
■ Natural Language Processing (NLP): Reinforcement learning is used for dialogue systems and language generation.
■ Finance: RL is applied in algorithmic trading to optimize trading strategies.
■ Healthcare: Personalized treatment recommendation systems and optimizing patient care.
■ Resource Management: Optimal resource allocation in scenarios like energy management and supply chain optimization.
■ Education: Adaptive learning systems that adjust the difficulty of educational content based on student performance.



Main Challenges of Machine Learning
■ Insufficient quantity of training data
○ it takes a lot of data for most ML algorithms to work properly
○ even simple problems typically require thousands of examples
○ complex problems (e.g., image or speech recognition) may need millions of examples

■ Nonrepresentative training data
○ in order to generalize well, it is crucial that your training data be representative of
the new cases you want to generalize to

■ Poor-Quality Data
○ if training data is full of errors, outliers, and noise it will make it harder for the system to
detect the underlying patterns
○ often worth the effort to spend time cleaning up training data

■ Irrelevant Features
○ system will only learn if training data contains enough relevant features and not too many
irrelevant ones
○ a critical part of the success of a Machine Learning project is coming up with a good set of
features to train on

■ Overfitting the training data
○ model performs well on training data but it does not generalize well on new data
○ happens when the model is too complex relative to the amount and noisiness of the
training data

■ Underfitting the training data
○ model is too simple to learn the underlying structure of the data
