
Module 2 Part 1

Basics of Learning Theory


Prepared by:
Mrs. Sonaxi Bhagawan Raikar
Assistant Professor
Dept. of EEE
NIE Mysuru
Introduction
Learning is the process of acquiring new knowledge, skills, behaviors, or attitudes through experience, instruction, or study. It involves a change in behavior or mental processes that occurs as a result of new information or experiences.

To make machines learn, we simulate human learning strategies in machines.
3/27/2023 Department of EEE, NIE Mysuru 2


Learning Process
1 Acquisition of new declarative knowledge

2 Development of motor and cognitive skills through instruction or practice

3 Organization of new knowledge into general, effective representations

4 Discovery of new facts and theories through observation and experimentation



What Sort of Tasks Can Computers Learn?
A problem is considered well-posed if it satisfies three conditions: existence, uniqueness, and stability.

• Existence means that the problem has at least one solution.


• Uniqueness means that there is only one solution to the problem.
• Stability means that small changes to the input of the problem result in small changes to the output.
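The stability condition can be made concrete with a small numerical sketch: solve a linear system, perturb the input slightly, and compare how much the output moves. The matrices below are invented for illustration; the nearly singular one stands in for an ill-posed (unstable) problem.

```python
import numpy as np

# Hedged illustration of the stability condition: solve A x = b, then perturb
# b slightly and measure how much the solution changes. Matrices are made up.
b = np.array([1.0, 1.0])
db = np.array([0.0, 1e-4])          # tiny change to the input

systems = {
    "well-conditioned": np.array([[2.0, 0.0], [0.0, 1.0]]),
    "ill-conditioned":  np.array([[1.0, 1.0], [1.0, 1.0001]]),  # nearly singular
}

change = {}
for name, A in systems.items():
    x = np.linalg.solve(A, b)
    x_pert = np.linalg.solve(A, b + db)
    change[name] = float(np.linalg.norm(x_pert - x))   # size of the output change
    print(name, change[name])
```

For the well-conditioned system the output moves about as much as the input did; for the nearly singular one, a 0.0001 change in the input swings the solution by more than 1, which is exactly the failure of stability.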



Well Posed Problems



Learning Theory
Learning theory is a subfield of machine learning that studies
how machine learning algorithms can improve their
performance through experience.

It is concerned with understanding the theoretical foundations of learning algorithms and their capabilities and limitations.

Let x be an input and X be the input space, the set of all possible inputs, and let Y be the output space, the set of all possible outputs (the output is dependent on the input).

Let D be the input dataset with examples (x1, y1), (x2, y2), ..., (xn, yn) for n inputs.

Learning Model = Hypothesis Set + Learning Algorithm
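The notation above can be sketched in a few lines of code. Everything here is invented for illustration: a toy input space, a toy dataset, a hypothesis set of simple threshold functions, and a learning algorithm that picks the hypothesis with the fewest training errors.

```python
# Minimal sketch of "Learning Model = Hypothesis Set + Learning Algorithm".
# X_space, Y_space, and D are made-up toy data for illustration only.
X_space = [0, 1, 2, 3, 4]                      # input space X
Y_space = {0, 1}                               # output space Y
D = [(0, 0), (1, 0), (2, 1), (3, 1), (4, 1)]   # dataset (x1,y1), ..., (xn,yn)

# Hypothesis set H: threshold functions h_t(x) = 1 if x >= t else 0.
H = [lambda x, t=t: int(x >= t) for t in X_space]

# Learning algorithm: choose the hypothesis with the fewest training errors.
def errors(h):
    return sum(h(x) != y for x, y in D)

best = min(H, key=errors)
print([best(x) for x in X_space], errors(best))  # learned labels, error count
```

Here the learner finds the threshold t = 2, which classifies every training example correctly.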



Learning Types



Introduction to Computational Learning
Theory
Questions that arise when we think about machine learning:

• What makes a learning problem solvable?

• How can we measure the complexity of a learning problem?

• What is the minimum amount of data required to learn a particular concept?

• What is the best way to choose a learning algorithm given a particular learning problem?

• How can we ensure that a learning algorithm generalizes well to new data?



Introduction to Computational Learning
Theory
Computational Learning Theory is a field of computer science and theoretical statistics that focuses on
studying the principles and algorithms behind machine learning. This field aims to understand the computational
and statistical properties of various learning algorithms and to identify the conditions under which they can be
expected to succeed or fail.

Another concept that is similar to CoLT is Statistical Learning Theory.

Statistical learning theory is a framework for machine learning that draws from statistics and functional
analysis. It deals with finding a predictive function based on the data presented. The main idea in statistical
learning theory is to build a model that can draw conclusions from data and make predictions.



Introduction to Computational Learning Theory
ML Problem: Predict whether a person is likely to develop a particular disease based on their medical history and lifestyle factors.

Computational Learning Theory in action might be a study of the accuracy and efficiency of various machine learning algorithms (say, Algorithm 1, Algorithm 2, and Algorithm 3): evaluate the accuracy and efficiency of each algorithm and compare them.

CoLT evaluates the learnability of any algorithm.
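A small sketch of this kind of comparison, with everything synthetic: the "patients" are random vectors, the disease label is a made-up rule, and two simple algorithms (a majority-class baseline and 1-nearest-neighbour) stand in for Algorithm 1 and Algorithm 2.

```python
import time
import numpy as np

rng = np.random.default_rng(0)

# Hedged sketch of "CoLT in action": compare accuracy and running time of two
# simple algorithms on a hypothetical disease-prediction dataset.
X = rng.normal(size=(200, 3))                       # 200 patients, 3 features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)       # synthetic disease label
X_tr, X_te, y_tr, y_te = X[:150], X[150:], y[:150], y[150:]

def majority(X_tr, y_tr, X_te):
    """Baseline: predict the most common training label for everyone."""
    return np.full(len(X_te), np.bincount(y_tr).argmax())

def one_nn(X_tr, y_tr, X_te):
    """1-nearest neighbour: copy the label of the closest training patient."""
    d = np.linalg.norm(X_te[:, None, :] - X_tr[None, :, :], axis=2)
    return y_tr[d.argmin(axis=1)]

accuracy = {}
for name, algo in [("majority", majority), ("1-NN", one_nn)]:
    t0 = time.perf_counter()
    accuracy[name] = float((algo(X_tr, y_tr, X_te) == y_te).mean())
    print(f"{name}: accuracy={accuracy[name]:.2f}, "
          f"time={time.perf_counter() - t0:.4f}s")
```

The comparison reports, for each algorithm, both a statistical quantity (accuracy on held-out patients) and a computational one (running time), which is exactly the pairing CoLT is concerned with.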



Introduction to Computational Learning Theory
Statistical Learning Theory focuses on the mathematical foundations of learning algorithms. Its emphasis is on:
• the ability to learn from finite datasets,
• the ability to generalize to new, unseen data, and
• performance guarantees.

For example, statistical learning theory can be used to analyze the performance of linear regression or support
vector machines on a given dataset and to prove theorems about their convergence rates and error bounds.



Introduction to Computational Learning Theory

CoLT focuses on computational complexity: the computational resources required to learn from data.

For example, computational learning theory can be used to design efficient algorithms for clustering, classification, or dimensionality reduction, and to analyze their worst-case running time or sample complexity.



Computational Learning Theory (CoLT) vs. Statistical Learning Theory (SLT)

Approach: CoLT is complexity-based; SLT is probability-based.
Goal: CoLT studies the efficiency of learning algorithms; SLT studies generalization error estimation.
Learning Models: CoLT emphasizes PAC learning; SLT emphasizes Vapnik-Chervonenkis (VC) learning.
Algorithm Analysis: CoLT focuses on algorithmic performance; SLT focuses on the statistical properties of data.
Learning Theory: CoLT employs computational models for learning; SLT applies statistical models to learning.
Focus: CoLT is concerned with the learnability of classes of functions; SLT is concerned with fitting the data to models.
Sample Complexity: CoLT often involves determining the minimum number of examples needed to learn a function; SLT focuses on the rate at which the error decreases with sample size.
Key Challenge: CoLT seeks efficient algorithms that can learn from a limited amount of data; SLT seeks to avoid overfitting while achieving high accuracy.
Application: CoLT is often applied in the field of artificial intelligence and machine learning; SLT is widely used in various fields such as statistics, data science, and machine learning.
Design of Learning System
According to Arthur Samuel, "Machine Learning enables a machine to automatically learn from data, improve performance from experience, and predict things without being explicitly programmed."

[Diagram: Training Data feeds an ML Algorithm, which produces a Mathematical Model used to predict and decide.]



Design of Learning System

[Diagram: Self-Driving Car example; a logical and mathematical model whose efficiency increases as data increases.]



Step 1 Choosing the Training Experience
Attributes of training data: Feedback, Degree, Diversity.

The data or experience that we feed to the algorithm has a significant impact on the success or failure of the model.



Step 1 Choosing the Training Experience
Attributes of training data: Feedback, Degree, Diversity.

Feedback can be direct or indirect.



Step 1 Choosing the Training Experience
Attributes of training data: Feedback, Degree, Diversity.

Consider a machine learning algorithm that is being trained to recognize handwritten digits. With poorly chosen training experience, the algorithm may learn to recognize digits based on their position in the sequence rather than their inherent features; with well-chosen training data, the algorithm is more likely to learn the underlying patterns in the data and generalize well to new data.



Step 1 Choosing the Training Experience
Attributes of training data: Feedback, Degree, Diversity.

Diversity: if the training dataset does not accurately represent the distribution of images that the algorithm is expected to encounter in the real world, the model is unlikely to generalize.
Step 2 Choosing the Target Function
The target function is the function that the algorithm is trying to approximate or learn from the input-output examples
in the training dataset. The target function should accurately capture the relationship between the input features and
the output, and it should be flexible enough to handle variations in the data.

Example: Predict the price of a house based on its features, such as the number of bedrooms, bathrooms, and square footage. The inputs (x) are the number of bedrooms, bathrooms, and square footage; the target function f(x) maps them to the house price (output Y).

Choosing the right target function is important because it determines the nature of the machine learning problem and the type of algorithm that should be used to solve it.
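The mapping above can be sketched as a tiny function. The coefficients (50,000 per bedroom, 30,000 per bathroom, 120 per square foot) are assumptions invented for illustration; they are not learned from any real housing data.

```python
# Hypothetical target function f(x) for the house-price example.
# The weights are made-up illustration values, not learned coefficients.
def target_function(bedrooms, bathrooms, square_footage):
    return 50_000 * bedrooms + 30_000 * bathrooms + 120 * square_footage

print(target_function(3, 2, 1500))  # price of a 3-bed, 2-bath, 1500 sq ft house
```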



Step 3: Choosing Representation for Target function
Selecting the type of function that will be used to model the target function, for example a linear regression model, a neural network, or a decision tree.

The choice of representation of the target function depends on:
• the nature of the problem,
• the type of data,
• the model architecture or learning algorithm,
• the computational resources available,
• the size of the dataset, and
• the complexity of the target function.


Step 4: Choosing the function approximation
Selecting a mathematical function that approximates the relationship between the input variables and the output variables of a learning system. For the house-price example, the inputs (x) are the number of bedrooms, bathrooms, and square footage, and the target function Y = f(x) outputs the house price. A linear approximation is

Y = w1·x1 + w2·x2 + w3·x3

The goal of function approximation is to find an approximation f̂ of f that can generalize well to new data that the system has not seen before.

Choosing the right function for approximation involves selecting a model that can capture the underlying patterns in the data
and generalize well to new data. This process often involves experimentation and evaluation of different models, until the
best-performing model is identified for use in the machine learning system.



Step 4: Choosing the function approximation
The main focus of this step is to choose weights that fit the given training samples effectively. The aim is to reduce the error

E = Σ_{b ∈ training samples} (V_train(b) − V̂(b))²

where b is a training sample, V_train(b) is its training value, and V̂(b) is the value predicted by the current hypothesis. Based on the error, the weights are updated.
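A common concrete form of this update is the least-mean-squares (LMS) rule: after each sample, nudge each weight in proportion to the error times the corresponding feature. The sketch below uses an invented linear dataset and an assumed learning rate; it illustrates the rule rather than any specific system from the slides.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hedged sketch of the LMS weight-update rule: minimize
# E = sum_b (V_train(b) - V_hat(b))^2 by per-sample gradient steps,
# where V_hat(b) = w . x(b). Data and learning rate are illustrative.
X = rng.normal(size=(100, 3))                  # features x1, x2, x3 per sample b
true_w = np.array([2.0, -1.0, 0.5])            # hidden "true" weights (made up)
V_train = X @ true_w                           # training values V_train(b)

w = np.zeros(3)                                # initial weights
eta = 0.01                                     # learning rate (assumed)
for _ in range(200):                           # repeated passes over the samples
    for x, v in zip(X, V_train):
        error = v - w @ x                      # V_train(b) - V_hat(b)
        w += eta * error * x                   # LMS update for each weight
print(np.round(w, 3))
```

On this noiseless data the weights converge to the hidden true values, i.e., the error E is driven to essentially zero.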



Step 5: Final Design
The final design step in the design of learning systems is to evaluate the performance of the system and fine-tune
it to improve its accuracy and generalization.

This step involves testing the system on a separate dataset, known as the test set, to measure its performance on
unseen data.

Data Splitting → Evaluation Metrics → Performance Analysis → Fine-Tuning → Deployment



Introduction to Concept Learning
A concept is a set of objects, events, or situations that share common features or characteristics.

Training data: each example is described by features such as growing on a plant, being edible, containing seeds, and being sweet or tart. From positive examples ("so this is a fruit") and negative examples ("so this is not a fruit"), the learner forms the concept Fruit.



Introduction to Concept Learning
Concept learning is a subfield of machine learning that focuses on learning concepts from data.
In other words, it involves developing algorithms that can recognize and understand different concepts based on
examples. The Learner tries to simplify by observing the common features from the training samples and then
apply this simplified model to the future samples. This task is also known as learning from experience.

The problem of inducing general functions from specific training examples is central to learning. Concept learning
can be formulated as a problem of searching through a predefined space of potential hypotheses for the hypothesis
that best fits the training examples.
Each concept or category obtained by learning is a Boolean valued function that takes true or false value.



Introduction to Concept Learning
Assume that we have collected data for some attributes/features of the day, like Sky, Air Temperature, Humidity, Wind, Water, and Forecast.
Let this set of instances be denoted by X; many concepts can be defined over X.
For example, the concepts can be:
- Days on which my friend Sachin enjoys his favorite water sport
- Days on which my friend Sachin will not go outside of his house
- Days on which my friend Sachin will have a night drink

Target concept: The concept or function to be learned is called


the target concept and denoted by c. It can be seen as a boolean
valued function defined over X and can be represented as
c: X → {0, 1}





Introduction to Concept Learning
For the target concept c , “Days on which my friend Sachin enjoys his favorite water sport”, an attribute
EnjoySport is included in the below dataset X and it indicates whether or not my friend Sachin enjoys his favorite
water sport on that day.

Now the target concept is EnjoySport: X → {0, 1}. With this, the learner's task is to predict the value of EnjoySport for an arbitrary day, based on the values of its attributes.

When a new sample with the values for attributes <Sky, Air Temperature, Humidity, Wind, Water, Forecast> is given, the
value for EnjoySport (ie. 0 or 1) is predicted based on the previous learning.



Hypothesis
Let H denote the set of all possible hypotheses that the learner may consider regarding the identity of the target
concept. So H = {h1, h2, …. }.
Let each hypothesis be a vector of six constraints, specifying the values of the six attributes <Sky, Air
Temperature, Humidity, Wind, Water, Forecast>.
In the hypothesis representation, the value of each attribute can be:
• "?", meaning any value is acceptable for this attribute,
• a single required value (e.g., Warm) for the attribute, or
• "0", meaning no value is acceptable.
For example:
• The hypothesis that my friend Sachin enjoys his favorite sport only on cold days with high humidity (independent of the values of the other attributes) is represented by the expression <?, Cold, High, ?, ?, ?>
• The most general hypothesis, <?, ?, ?, ?, ?, ?>, represents that every day is a positive example.
• The most specific hypothesis, <0, 0, 0, 0, 0, 0>, represents that no day is a positive example.
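This representation is easy to encode directly. In the sketch below, a hypothesis is a 6-tuple over <Sky, AirTemp, Humidity, Wind, Water, Forecast>; the sample day and hypothesis names are illustrative.

```python
# Hedged sketch of the hypothesis representation above: '?' accepts any value,
# a literal (e.g. 'Cold') requires exactly that value, and '0' accepts nothing
# ('0' never equals a real attribute value, so it fails every instance).
def satisfies(instance, hypothesis):
    """True if the hypothesis classifies the instance as positive."""
    return all(h == '?' or h == a for h, a in zip(hypothesis, instance))

day = ('Sunny', 'Cold', 'High', 'Strong', 'Warm', 'Same')

h_cold_humid = ('?', 'Cold', 'High', '?', '?', '?')   # cold, high-humidity days
h_general    = ('?', '?', '?', '?', '?', '?')         # every day is positive
h_specific   = ('0', '0', '0', '0', '0', '0')         # no day is positive

print(satisfies(day, h_cold_humid),
      satisfies(day, h_general),
      satisfies(day, h_specific))
```

The cold/high-humidity hypothesis and the most general hypothesis both accept the sample day; the most specific hypothesis rejects it, as it rejects everything.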



EnjoySport Concept Learning Task – (Ex)
EnjoySport concept learning task requires
• learning the sets of days for which EnjoySport=yes and then
• describing this set by a conjunction of constraints over the instance attributes.

We know that an inductive learning algorithm tries to induce a "general rule" from a set of observed instances. So the above case is the same as inductive learning, where a learning algorithm is trying to find a hypothesis h (a general rule) in H such that h(x) = c(x) for all x in D.
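The slides do not name a particular search procedure here, but the classic Find-S algorithm is one standard way to carry out this search: start from the most specific hypothesis and generalize it just enough to cover each positive example. The training days below are illustrative, in the spirit of the EnjoySport dataset.

```python
# Hedged sketch of Find-S over <Sky, AirTemp, Humidity, Wind, Water, Forecast>.
def find_s(examples):
    h = None
    for instance, label in examples:
        if label != 1:
            continue                          # Find-S ignores negative examples
        if h is None:
            h = list(instance)                # first positive example taken as-is
        else:
            h = [hi if hi == ai else '?'      # generalize mismatched attributes
                 for hi, ai in zip(h, instance)]
    return tuple(h)

train = [
    (('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'),   1),
    (('Sunny', 'Warm', 'High',   'Strong', 'Warm', 'Same'),   1),
    (('Rainy', 'Cold', 'High',   'Strong', 'Warm', 'Change'), 0),
    (('Sunny', 'Warm', 'High',   'Strong', 'Cool', 'Change'), 1),
]
print(find_s(train))  # ('Sunny', 'Warm', '?', 'Strong', '?', '?')
```

The result is the most specific hypothesis in H consistent with the positive examples: sunny, warm, strong-wind days, with the other attributes unconstrained.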



Inductive Learning
For a given collection of examples, in reality, a learning algorithm returns a function h (hypothesis) that approximates c (the target concept). But the expectation is that the learning algorithm returns a function h that equals c, i.e., h(x) = c(x) for all x in D.

So we can state the inductive learning hypothesis: any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples.



Hypothesis Space
The number of combinations: 5 × 4 × 4 × 4 × 4 × 4 = 5120 syntactically distinct hypotheses. They are syntactically distinct but not all semantically distinct: any hypothesis containing one or more "0" symbols represents the empty set of instances; that is, it classifies every instance as negative.

All such hypotheses having the same semantics are counted as 1, so the total number of combinations is:
1 (hypotheses with one or more "0") + 4 × 3 × 3 × 3 × 3 × 3 (each attribute's values plus "?") = 973 semantically distinct hypotheses
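These counts follow mechanically from the attribute value counts (Sky has 3 values; the other five attributes have 2 each, per the EnjoySport dataset):

```python
from math import prod

# Value counts for <Sky, AirTemp, Humidity, Wind, Water, Forecast>,
# assuming Sky has 3 values and the other five attributes have 2 each.
values = [3, 2, 2, 2, 2, 2]

syntactic = prod(v + 2 for v in values)      # each attribute: values + '?' + '0'
semantic = 1 + prod(v + 1 for v in values)   # all '0'-hypotheses collapse to one

print(syntactic, semantic)  # 5120 973
```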



Hypothesis Space

The concept learning can be viewed as the task of searching through a large space of hypotheses. The goal of this search
is to find the hypothesis that best fits the training examples. By selecting a particular hypothesis representation, the
designer of the learning algorithm implicitly defines the space of all hypotheses that the program can ever represent and
therefore can ever learn.



Heuristic Space Search
Finding the correct hypothesis can be computationally expensive and time-consuming.

Heuristic search algorithms are used to explore this search space more efficiently by guiding the search towards promising regions of the space, where the correct hypothesis is more likely to be found. These algorithms use various heuristics to evaluate the fitness or quality of a hypothesis and to determine which hypotheses are most likely to be correct.

By using heuristic search, the concept learning process can be made more efficient and effective, allowing the learner to discover the underlying concepts or rules governing the data with fewer iterations and fewer computational resources.

[Figure: a possible hypothesis hₙ located within the hypothesis space]



Heuristic Space Search
The heuristic function provides an estimate of the
quality or fitness of a potential solution or
hypothesis based on some measure of similarity or
distance to the true solution. By using this estimate
to guide the search, the algorithm can focus on
promising regions of the space, rather than blindly
exploring every possible solution.

Examples of heuristic search algorithms used in machine learning and AI include hill-climbing, simulated annealing,
genetic algorithms, and particle swarm optimization. Each of these algorithms uses a different heuristic function and
search strategy to efficiently explore the space of possible solutions and find the best fit to the available data.
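Hill-climbing, the simplest of these, can be sketched in a few lines. The search space and fitness function below are a made-up toy (integers with a single peak), chosen only to show the mechanic of moving to the best-scoring neighbour until no neighbour improves.

```python
import random

random.seed(3)

# Hedged sketch of hill-climbing over a toy hypothesis space: find the
# integer x in [0, 100] maximizing an invented, single-peaked fitness.
def fitness(x):
    return -(x - 42) ** 2   # heuristic score; best "hypothesis" is x = 42

x = random.randint(0, 100)          # start from a random hypothesis
while True:
    neighbours = [x - 1, x + 1]
    best = max(neighbours, key=fitness)
    if fitness(best) <= fitness(x): # no neighbour improves: stop at an optimum
        break
    x = best
print(x)  # 42
```

Because this toy fitness has a single peak, hill-climbing always reaches the global optimum; on real, multi-peaked hypothesis spaces it can stall at a local optimum, which is why simulated annealing and genetic algorithms add mechanisms to escape such traps.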



End
