Lecture04 - Machine Learning Landscape
LECTURE 04
JENNIFER JOYCE M. MONTEMAYOR - MAULANA
■ field of study that gives computers the ability to learn without being explicitly programmed
(Samuel, 1959)
■ Artificial Intelligence - any technique that enables computers to mimic human behavior
■ Machine Learning - ability to learn without being explicitly programmed
■ Deep Learning - extract patterns from data using neural networks
A spam filter based on machine learning automatically learns which words and phrases are good
predictors of spam by detecting patterns of words that are unusually frequent in the spam examples
compared to the ham (non-spam) examples.
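The idea can be sketched in a few lines. This is only an illustration, assuming made-up messages, a hypothetical `spam_indicators` helper, and an arbitrary `min_gap` threshold; a real spam filter would use a trained classifier rather than a fixed cut-off.

```python
from collections import Counter

def word_rates(messages):
    """Fraction of messages in which each word appears at least once."""
    counts = Counter()
    for text in messages:
        counts.update(set(text.lower().split()))
    return {word: n / len(messages) for word, n in counts.items()}

def spam_indicators(spam, ham, min_gap=0.5):
    """Words that appear in spam far more often than in ham (gap >= min_gap)."""
    spam_rates, ham_rates = word_rates(spam), word_rates(ham)
    return sorted(
        word for word, rate in spam_rates.items()
        if rate - ham_rates.get(word, 0.0) >= min_gap
    )

# Toy, made-up training examples (not from the lecture).
spam = ["free prize click now", "claim your free prize", "free offer click here"]
ham = ["lunch at noon", "see you at the meeting", "your report is here"]

print(spam_indicators(spam, ham))
```

Words such as "your" and "here" occur in both groups, so they are not flagged; only words that are clearly more frequent in spam pass the threshold.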
Supervised Learning
The training data you feed to the algorithm includes the desired solutions, called labels
The goal is to learn the relationship/mapping between the input variables and their corresponding
labels, allowing the algorithm to make predictions or decisions when given new or unseen data.
■ Classification - the system is trained with many examples along with their class,
e.g. spam classification
■ Regression - predict a target numeric value given a set of features called predictors
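The difference between the two tasks shows up in how a prediction is formed. Below is a minimal sketch, assuming a nearest-neighbor rule as the learner and made-up toy data: classification returns the majority class among the closest labeled examples, while regression returns the average of their numeric targets. The function names are hypothetical.

```python
from collections import Counter
import math

def k_nearest(train, query, k):
    """Targets of the k training examples whose features are closest to query."""
    by_distance = sorted(train, key=lambda pair: math.dist(pair[0], query))
    return [target for _, target in by_distance[:k]]

def knn_classify(train, query, k=3):
    """Classification: majority class among the k nearest neighbors."""
    return Counter(k_nearest(train, query, k)).most_common(1)[0][0]

def knn_regress(train, query, k=3):
    """Regression: average numeric target of the k nearest neighbors."""
    neighbors = k_nearest(train, query, k)
    return sum(neighbors) / len(neighbors)

# Each training example is (features, label); the data is made up.
labeled_points = [
    ((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
    ((6, 6), "B"), ((7, 6), "B"), ((6, 7), "B"),
]
numeric_points = [((1,), 10), ((2,), 20), ((3,), 30), ((10,), 100)]

print(knn_classify(labeled_points, (2, 2)))
print(knn_regress(numeric_points, (2,)))
```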
Supervised Learning
Some supervised learning algorithms:
■ K-Nearest Neighbors
○ Regression, Classification
○ Predicting class or value of an instance based on the majority class or average of its
k-nearest neighbors
• Example: handwriting recognition
■ Linear Regression
○ Regression
○ Predicting a continuous output variable based on one or more input features
• Example: predicting house prices based on features like size, number of bedrooms,
location
■ Logistic Regression
○ Classification
○ Predicting the probability of an instance belonging to a class
• Example: binary classification problems such as spam detection
■ Support Vector Machines (SVMs)
○ Classification, Regression
○ Separating instances into different classes or predicting a continuous variable
■ Decision Trees and Random Forests
○ Classification, Regression
○ Making decisions by recursively splitting the dataset based on the most significant features
○ Example: credit scoring, image recognition
■ Neural networks
○ Classification, Regression
○ Learning complex patterns using multiple layers of interconnected nodes / neurons
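The splitting step used by decision trees can be sketched as a search for the single best split. This is an illustration only: it assumes Gini impurity as the measure of how good a split is (the slides do not name a criterion), uses made-up credit-scoring data, and stops after one split, whereas a full tree would repeat the search recursively on each side.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0 when all labels agree, higher when classes are mixed."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def best_split(rows, labels):
    """Try every (feature, threshold) pair; return (impurity, feature, threshold) of the best."""
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue  # a split must put examples on both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, feature, threshold)
    return best

# Made-up applicants: (income in thousands, age) and the lending decision.
rows = [(20, 25), (25, 40), (30, 35), (60, 30), (70, 45), (80, 28)]
labels = ["reject", "reject", "reject", "approve", "approve", "approve"]

score, feature, threshold = best_split(rows, labels)
print(feature, threshold, score)
```

On this toy data, income (feature 0) separates the classes perfectly while age does not, so the search selects income as the most significant feature.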
Supervised Learning
Key concepts:
■ Features and Labels
• Feature - an input variable or attribute used to describe the data
• Label - output or target variable that the model is trying to predict
■ Training and Test data
• Training data - labeled dataset used to train the model
• Test data - unseen data used to evaluate the performance of the model
■ Loss Function
• A function that measures the difference between the predicted output and the actual label.
The goal is to minimize the difference.
■ Model Parameters
• Internal variables that are adjusted during training to minimize the loss function
■ Overfitting and Underfitting
• Overfitting - occurs when the model learns the training data too well but fails to generalize to
new or unseen data
• Underfitting - occurs when the model is too simple and cannot capture the underlying patterns
in the data.
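The concepts above fit together in one small training loop. Here is a minimal sketch, assuming a one-feature linear model, mean squared error as the loss function, plain gradient descent, and made-up data; the learning rate and epoch count are arbitrary choices.

```python
def predict(w, b, x):
    """Model: a line with two parameters, weight w and bias b."""
    return w * x + b

def mse(w, b, data):
    """Loss function: mean squared difference between prediction and label."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in data) / len(data)

def train(data, lr=0.05, epochs=2000):
    """Adjust the model parameters by gradient descent to minimize the loss."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in data) / n
        grad_b = sum(2 * (predict(w, b, x) - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Each pair is (feature, label); the made-up data follows y = 2x + 1.
train_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
test_data = [(4.0, 9.0), (5.0, 11.0)]  # held out, never used for training

w, b = train(train_data)
print("parameters:", round(w, 3), round(b, 3))
print("training loss:", mse(w, b, train_data))
print("test loss:", mse(w, b, test_data))
```

A test loss much larger than the training loss would point to overfitting; a large loss on both would point to underfitting.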
Jennifer Joyce M. Montemayor / CSC172 / Lecture 04
TYPES OF MACHINE LEARNING SYSTEMS
Unsupervised Learning
The training data you feed to the algorithm is not labeled. The algorithm must find patterns,
relationships or structures within the data without explicit guidance or labeled outcomes.
Unsupervised Learning
The training data you feed to the algorithm is not labeled. The system tries to learn without a
teacher.
Use cases
● Customer segmentation
● Document clustering
● Image segmentation
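Use cases like these are commonly handled by clustering. The sketch below uses k-means, which the slides do not name and which is assumed here purely for illustration; the customer data and starting centroids are made up.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Repeat: assign each point to its nearest centroid, then move each centroid to its cluster mean."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Made-up customers: (annual spend, visits per month). No labels are given.
customers = [(100, 2), (120, 3), (110, 2), (900, 12), (950, 15), (880, 11)]

centroids, clusters = kmeans(customers, centroids=[customers[0], customers[-1]])
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

The algorithm is never told which customers belong together; the two segments emerge from distances between the data points alone.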
■ Dimensionality Reduction - goal is to reduce the number of features in the data while retaining
as much information as possible
Use cases
● Feature extraction
● Visualization of high-dimensional data
● Noise reduction
● Anomaly Detection
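A small example of reducing two features to one: the sketch below assumes principal component analysis (PCA), which the slides do not name, and finds the direction of greatest variance by power iteration on the covariance matrix. The data points are made up, and the code handles only the two-feature case.

```python
import math

def first_principal_component(points, steps=100):
    """Direction of greatest variance in 2-D data, via power iteration on the covariance matrix."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 covariance matrix.
    cxx = sum(x * x for x, _ in centered) / n
    cxy = sum(x * y for x, y in centered) / n
    cyy = sum(y * y for _, y in centered) / n
    vx, vy = 1.0, 0.0  # arbitrary starting direction
    for _ in range(steps):
        vx, vy = cxx * vx + cxy * vy, cxy * vx + cyy * vy
        norm = math.hypot(vx, vy)
        vx, vy = vx / norm, vy / norm
    return (vx, vy), centered

def project(points):
    """Replace each 2-D point by a single coordinate along the principal direction."""
    (vx, vy), centered = first_principal_component(points)
    return [x * vx + y * vy for x, y in centered]

# Made-up points lying close to the line y = x, so one coordinate carries most of the information.
points = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.9), (5, 5.1)]

direction, _ = first_principal_component(points)
print(direction)
print(project(points))
```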
Unsupervised Learning
Unsupervised learning is particularly useful in scenarios where labeled data is scarce or expensive to
obtain. It can help discover hidden patterns, associations, or underlying structures in the data that may
not be immediately apparent.
Applications
■ Anomaly Detection
○ detect outliers in a dataset by identifying patterns that deviate from the norm
■ Recommendation Systems
○ Analyze user behavior and provide personalized recommendations based on patterns
discovered in the data
■ Generative Models
○ Generating new data samples that resemble training data
■ Market Basket Analysis
○ Analyze purchasing patterns to identify products frequently bought together, aiding product
placement or targeted marketing
■ Image Segmentation
○ Segment images into different regions based on pixel similarities
■ Natural Language Processing (NLP)
○ Tasks like topic modeling where documents are grouped based on the topics they discuss
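As one concrete case from the list above, market basket analysis can be sketched by counting which pairs of products share a basket. The baskets, the `frequent_pairs` helper, and the `min_support` threshold are all made up for illustration; production systems typically use dedicated association-rule algorithms.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(baskets, min_support=0.5):
    """Pairs of items that appear together in at least min_support of all baskets."""
    pair_counts = Counter()
    for basket in baskets:
        # Sorting makes ("bread", "butter") and ("butter", "bread") the same pair.
        pair_counts.update(combinations(sorted(set(basket)), 2))
    return {
        pair: count / len(baskets)
        for pair, count in pair_counts.items()
        if count / len(baskets) >= min_support
    }

# Made-up purchase records; no labels are involved.
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
    ["bread", "butter", "eggs"],
]

print(frequent_pairs(baskets))
```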
Semisupervised Learning
In semisupervised learning, the algorithm is trained on a dataset that contains both labeled and
unlabeled examples. The presence of labeled data helps guide the learning process, while the
unlabeled data allows the algorithm to explore and learn patterns from a broader set of examples.
Semisupervised Learning
Approaches
■ Self-Training
○ Idea: Train a model on the labeled data and use it to predict labels for unlabeled data. The
confident predictions on unlabeled data are then added to the labeled dataset for further
training.
○ Example: Training a model on a small set of labeled images and using it to predict labels
for a larger set of unlabeled images.
■ Co-Training
○ Idea: Train multiple models on different sets of features or representations. Each model is
then used to predict labels for the unlabeled data, and instances with consistent
predictions across models are added to the labeled dataset.
○ Example: Training a model on textual features and another model on visual features for
image classification.
■ Bootstrapping
○ Idea: Initialize the model with a small set of labeled examples and iteratively expand the
labeled dataset by selecting instances with high-confidence predictions on unlabeled
data.
○ Example: Classifying web pages as relevant or non-relevant using an initial set of labeled
examples and expanding the labeled dataset over time.
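The loop shared by self-training and bootstrapping (train, predict, keep only the confident predictions, retrain) can be sketched as follows. The classifier is a deliberately simple stand-in that assigns a one-feature example to the class with the nearest mean; the confidence measure, the threshold, and the data are assumptions made for illustration.

```python
def centroids(labeled):
    """Train the stand-in model: mean feature value per class."""
    sums = {}
    for x, label in labeled:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + x, count + 1)
    return {label: total / count for label, (total, count) in sums.items()}

def predict_with_confidence(model, x):
    """Return (label, confidence); confidence is how much closer the best class mean is than the runner-up."""
    ranked = sorted(model.items(), key=lambda item: abs(x - item[1]))
    (best_label, best_c), (_, second_c) = ranked[0], ranked[1]
    return best_label, abs(x - second_c) - abs(x - best_c)

def self_train(labeled, unlabeled, threshold=3.0, rounds=5):
    """Iteratively move confidently predicted examples into the labeled set."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(rounds):
        model = centroids(labeled)
        confident, remaining = [], []
        for x in unlabeled:
            label, confidence = predict_with_confidence(model, x)
            (confident if confidence >= threshold else remaining).append((x, label))
        if not confident:
            break  # nothing new to learn from
        labeled.extend(confident)
        unlabeled = [x for x, _ in remaining]
    return centroids(labeled), labeled, unlabeled

# Two labeled examples and a larger pool of unlabeled ones (made-up values).
labeled = [(1.0, "low"), (9.0, "high")]
unlabeled = [2.0, 3.0, 8.0, 7.0, 5.2]

model, final_labeled, still_unlabeled = self_train(labeled, unlabeled)
print(model)
print(still_unlabeled)
```

The ambiguous example near the middle never reaches the confidence threshold, so it stays unlabeled instead of adding a possibly wrong label to the training set.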
Semisupervised Learning
Semisupervised learning is particularly useful in scenarios where obtaining a large amount of labeled
data is challenging or costly, but there is a significant pool of unlabeled data that can still contribute
to model improvement.
■ Text Classification
○ Leveraging a small labeled dataset to improve the classification of large amounts of unlabeled text data.
■ Speech Recognition
○ Using a combination of labeled and unlabeled audio data to improve the accuracy of speech
recognition models.
■ Image Classification
○ Enhancing image classification models by incorporating labeled and unlabeled images.
■ Drug Discovery
○ Combining labeled data on the biological activity of certain compounds with large-scale unlabeled
chemical data to identify potential drug candidates.
Reinforcement Learning
The learning system, called an agent, can
observe the environment, select and perform
actions, and get rewards or penalties in return.
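The observe, act, and reward cycle can be sketched with a tiny environment. The example assumes tabular Q-learning with epsilon-greedy exploration, which the slides do not name, and a made-up five-cell corridor in which the agent is rewarded for reaching the last cell; all names and hyperparameters are illustrative.

```python
import random

N_STATES = 5          # corridor cells 0..4; the agent starts in cell 0
GOAL = N_STATES - 1   # reaching the last cell ends the episode
ACTIONS = (-1, +1)    # step left or step right

def step(state, action):
    """Environment: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), GOAL)
    if next_state == GOAL:
        return next_state, 1.0, True    # reward for reaching the goal
    return next_state, -0.01, False     # small penalty for every other move

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Agent: learns a value for every (state, action) pair from rewards and penalties."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)                         # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])   # exploit
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
            if done:
                break
    return q

q = train()
# Greedy action in each non-goal cell after training.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]
print(policy)
```

No one tells the agent that moving right is correct; it infers this from the rewards and penalties it receives while acting.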
[Figure: Key components]
[Figure: Learning process]
Reinforcement learning is a powerful approach for solving problems where an agent needs to make
sequential decisions in an environment with feedback. It has been particularly successful in domains
with complex and dynamic decision-making challenges.
Some Applications:
■ Game Playing: RL has been successful in training agents to play complex games, such as chess, Go, and video games.
■ Robotics: Teaching robots to perform tasks in the real world, such as grasping objects or navigating environments.
■ Autonomous Vehicles: Training autonomous vehicles to make decisions in complex traffic scenarios.
■ Natural Language Processing (NLP): Reinforcement learning is used for dialogue systems and language generation.
■ Finance: RL is applied in algorithmic trading to optimize trading strategies.
■ Healthcare: Personalized treatment recommendation systems and optimizing patient care.
■ Resource Management: Optimal resource allocation in scenarios like energy management and supply chain optimization.
■ Education: Adaptive learning systems that adjust the difficulty of educational content based on student performance.