DEEP LEARNING
LEC-01
TEAM NO.1
Team members:
1. Hirdesh Viikram (PES1UG21EE001)
2. Akshatha K (PES1UG21EE009)
3. Ananya Venkatesh (PES1UG21EE012)
4. Aryan Aggarwal (PES1UG21EE018)
5. Bharathi T (PES1UG21EE022)
TOPICS
1. What is Learning
2. What is Intelligence
3. Plato
4. Machine Learning
5. Processing
Aryan Aggarwal
What is Learning? PES1UG21EE018
Learning is the process of acquiring knowledge, skills, behaviors, or values
through experiences, study, or teaching.
Importance:
1) Facilitates growth and adaptability.
2) Improves problem-solving and critical
thinking.
3) Encourages lifelong development.
Aryan Aggarwal
What is Machine Learning? PES1UG21EE018
Machine Learning (ML) is a subset of artificial intelligence (AI) that enables
machines to learn and improve from data without being explicitly
programmed.
How It Works:
ML models use algorithms to identify patterns and make predictions or
decisions based on data.
Aryan Aggarwal
Types of Machine Learning and Its Applications PES1UG21EE018
Types of Machine Learning:
1) Supervised Learning: Learning from labeled data.
2) Unsupervised Learning: Identifying patterns in unlabeled
data.
3) Reinforcement Learning: Learning through trial and error to
achieve a goal.
Applications:
1) Voice recognition (e.g., Siri, Alexa).
2) Autonomous vehicles.
3) Fraud detection in banking.
Aryan Aggarwal
MCQs PES1UG21EE018
1. What is the primary purpose of learning?
A. Entertainment
B. Acquiring skills and knowledge
C. Making money
D. Socializing
2. Which type of learning occurs through everyday experiences and interactions?
A. Formal Learning
B. Informal Learning
C. Self-directed Learning
D. Experimental Learning
3. Which of the following is NOT an application of machine learning?
A. Fraud detection
B. Playing a musical instrument
C. Autonomous vehicles
D. Voice recognition (answers are highlighted in orange)
Aryan Aggarwal
Assignment questions PES1UG21EE018
Comparison Task
• Compare and contrast human learning with machine learning in
250-300 words.
• Include:
⚬ Definitions of both.
⚬ How humans and machines learn differently.
⚬ Examples to illustrate the differences.
Akshatha K
What is Intelligence? PES1UG21EE009
Definition: Intelligence refers to the capacity to learn, understand,
and apply knowledge to adapt to new situations and solve problems.
Key Components:
-Learning: Acquiring knowledge or skills through experience or
education.
- Reasoning: Processing information to make decisions or solve
problems.
- Adaptation: Adjusting to new environments or challenges
effectively.
Akshatha K
Natural vs. Artificial Intelligence PES1UG21EE009
Natural Intelligence:
- Exhibited by humans and animals.
- Involves cognitive functions like perception, memory, and
consciousness.
Artificial Intelligence (AI):
- Simulation of human intelligence in machines.
- Enables machines to perform tasks like learning, reasoning, and
self-correction.
Akshatha K
Role of Deep Learning in AI PES1UG21EE009
Deep Learning: A subset of AI that uses neural networks with
multiple layers to model complex patterns in data.
Capabilities:
- Image and speech recognition.
- Natural language processing.
- Autonomous decision-making.
Akshatha K
MCQs PES1UG21EE009
1. Which component is NOT considered a key aspect of intelligence?
a) Learning
b) Reasoning
c) Adaptation
d) Imitation
2. What distinguishes deep learning from other forms of machine learning?
a) Use of decision trees
b) Implementation of neural networks with multiple layers
c) Reliance on manual feature extraction
d) Application of linear regression models
3. In the context of AI, what is the primary function of neural networks?
a) Storing large datasets
b) Simulating human brain processes to analyze and interpret data
c) Performing arithmetic calculations
d) Managing database transactions
(answers are highlighted in orange)
Akshatha K
Assignment questions PES1UG21EE009
1. How does artificial intelligence differ from natural intelligence, and
what role does deep learning play in AI?
2. What are neural networks, and how do they contribute to the
development of intelligent systems?
Bharathi T
Plato PES1UG21EE022
Plato (427-347 BC)
Plato is one of the world’s best known and most widely read and
studied philosophers.
He was the student of Socrates and the teacher of Aristotle, and he
wrote in the middle of the fourth century B.C.E. in ancient Greece.
Bharathi T
Plato PES1UG21EE022
• The concept of abstract ideas is known to us a priori, through a mystic connection with the world.
• He concluded that the ability to think is found in a priori knowledge of the concepts.
Bharathi T
Plato’s Pupil.... PES1UG21EE022
• Aristotle (384-322 BC)
• Criticized his teacher's theory, as it does not take into account an important aspect:
the ability to learn or adapt to a changing world.
MCQs Bharathi T
PES1UG21EE022
1) "The state is a national institution" was maintained by
a) T.H Green
b) Herbert Spencer
c) Aristotle
d) Plato
2) Which one of the following is Plato’s work?
(a) The Lyceum
(b) The Prince
(c) The Republic
(d) None of the above
(answers are highlighted in orange)
Bharathi T
Assignment questions PES1UG21EE022
1) Examine Plato's theory of forms.
2) Explain Aristotle's critique of Plato's idealism.
Hirdesh Viikram
Machine Learning PES1UG21EE001
What is Machine Learning?
Machine Learning is a subfield of AI that focuses on developing
algorithms that enable computers to learn from and make
predictions based on data.
Instead of being explicitly programmed to perform tasks, ML
algorithms use statistical methods to identify patterns in data,
allowing them to improve over time.
Hirdesh Viikram
Machine Learning PES1UG21EE001
How a computer looks at an image before feeding its pixel data into the machine learning model.
Hirdesh Viikram
Machine Learning PES1UG21EE001
Types of Machine Learning
• Supervised Learning:
⚬ Algorithms are trained on labeled datasets, meaning each
training example is paired with an output label.
• Unsupervised Learning:
⚬ Algorithms work with unlabeled data and seek to identify
patterns or structures within the data.
• Reinforcement Learning:
⚬ Algorithms learn to make decisions by performing certain
actions and receiving feedback in the form of rewards or
penalties.
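As a concrete illustration of supervised learning, here is a minimal sketch; scikit-learn and the Iris dataset are illustrative choices, not something the slides prescribe.

```python
# Minimal supervised-learning sketch (illustrative library and dataset).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)            # labeled data: features X, labels y
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = KNeighborsClassifier(n_neighbors=3)  # learns patterns from labeled examples
model.fit(X_train, y_train)                  # supervised training on (X, y) pairs
print("test accuracy:", model.score(X_test, y_test))
```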
Hirdesh Viikram
MCQs PES1UG21EE001
1. Example of supervised
learning:
A. Clustering customers
B. Predicting house prices
C. Identifying patterns
D. Training to play games
2. Reinforcement learning learns
by:
A. Analyzing labeled data
B. Identifying patterns
C. Rewards and penalties
D. Predefined rules
3. What is the main goal of ML?
A. Store data
B. Make predictions
C. Execute rules
D. Replace hardware
(answers are highlighted in orange)
Hirdesh Viikram
Assignment questions PES1UG21EE001
Compare supervised and unsupervised learning using examples.
a) Explain their differences.
b) Provide a real-world use case for each.
Processing
Ananya Venkatesh
PES1UG21EE012
To process any image, there are two main properties/types of information:
1. Shape - the boundary of the image
2. Region - the colour and texture within the boundary
Processing
Ananya Venkatesh
PES1UG21EE012
An image can be converted to a feature vector for simplicity/pre-processing: an M×N image (M rows, N columns) is flattened into a vector with M·N elements.
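A minimal numpy sketch of this flattening, with hypothetical pixel values:

```python
import numpy as np

# Hypothetical 4x3 grayscale image (M = 4 rows, N = 3 columns)
image = np.array([[ 10,  20,  30],
                  [ 40,  50,  60],
                  [ 70,  80,  90],
                  [100, 110, 120]])

feature_vector = image.flatten()   # vector of length M*N = 12
print(feature_vector.shape)        # (12,)
```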
Machine Learning v/s Deep Learning
Ananya Venkatesh
PES1UG21EE012
Feature Extraction:
• Machine Learning: features are manually selected and extracted from the input before training.
• Deep Learning: no manual feature extraction; the model automatically learns features from raw input data.
Preprocessing:
• Machine Learning: significant preprocessing is required to prepare data for the model.
• Deep Learning: minimal preprocessing is needed; raw data is directly provided to the model.
Learning Approach:
• Machine Learning: the model relies on pre-defined features for learning and classification.
• Deep Learning: the model learns both features and classification simultaneously through additional layers.
Challenges in deep learning algorithms
Ananya Venkatesh
PES1UG21EE012
1. Viewing angles - the same object can appear at different viewing angles
2. Pose - objects can have different poses
3. Illumination - lighting can differ across images of the same object
4. Intraclass variation - objects belonging to the same class can have variations
5. Distortion and occlusion
MCQs
Ananya Venkatesh
PES1UG21EE012
1. Why is an image converted to a feature vector during preprocessing?
a) To represent the image in a simplified numerical form for analysis
b) To improve the resolution of the image
c) To directly reduce noise in the image
d) To make the image visually more appealing
2. Pick the correct statement
a) ML automatically learns features, while DL requires manual feature extraction.
b) ML requires manual feature extraction, while DL automatically learns features.
c) Both ML and DL rely on extensive manual preprocessing for feature extraction.
d) Neither ML nor DL involve feature extraction during the training process.
3. Which of the following is not a common challenge faced by deep
learning models in object recognition tasks?
a) Variation in viewing angles of the same object
b) Differences in object poses
c) Consistent lighting across all images
d) Intraclass variations among objects
(answers are highlighted in orange)
Assignment questions
Ananya Venkatesh
PES1UG21EE012
Q1) Mention the differences between Machine learning and Deep learning
Q2) List the challenges of Deep learning
DEEP LEARNING
LECTURE-02
TEAM NO. 2
Department of Electrical and Electronics Engineering
TEAM MEMBERS:
1. C M SAMARTHA - PES1UG21EE023
2. GOURI GUDDAKAYU - PES1UG21EE031
3. JYOTHI - PES1UG21EE037
4. NAMAN PANJETA - PES1UG21EE049
JYOTHI - PES1UG21EE037
INTRODUCTION
To differentiate between the two images, a horse and a zebra, i.e., to identify which is the horse and which is the zebra, we need descriptors.
JYOTHI - PES1UG21EE037
TWO TYPES OF DESCRIPTORS
A descriptor is a characteristic or feature that describes an object or
signal. In the context of visual signals, descriptors can be used to
identify and differentiate between objects.
We can extract two types of descriptors:
1. Shape Descriptor
2. Region Descriptor
Shape Descriptor: Differentiates based on the shape of the horse
and shape of the zebra.
Region Descriptor: Differentiates based on the content, color
intensity and texture of the body of horse and body of zebra.
JYOTHI - PES1UG21EE037
SHAPE DESCRIPTOR
• If we look at the shape of these two animals, we can find that the shape boundary
is more or less the same.
• This means that, shape information or the boundary information isn’t sufficient.
This does not give us enough information/description by which we can
differentiate between these two animals.
JYOTHI - PES1UG21EE037
REGION DESCRIPTOR
• But when we look at the entire image along with considering the color,
intensity and texture, only then we can differentiate between these two
animals.
• This is known as the region descriptor.
JYOTHI - PES1UG21EE037
DESCRIPTOR
For images or objects in the real world, we can have two types of information:
• Shape information or the boundary information
• region information which includes color, intensity, texture etc.
We can obtain descriptors or features from the signals that we obtain from the real
world.
These signals can be visual signals in the form of images which we can see through
our eyes (Applications: image recognition, object detection etc)
(or) these signals can also be audio signals that we can hear. (Applications: speaker identification, speech-to-text conversion, etc.)
JYOTHI - PES1UG21EE037
BOUNDARY DESCRIPTOR/FEATURE
SHAPE FEATURE/ POLYGONAL
REPRESENTATION
• Consider a shape that forms a closed boundary, as shown in Fig. 1.
• Although Fig. 1 shows a continuous curve with a closed boundary, we are dealing with discrete or digital signals.
• So this curve is not really a continuous closed curve; rather, it consists of a set of discrete points, as shown in Fig. 2.
JYOTHI - PES1UG21EE037
BOUNDARY DESCRIPTOR/FEATURE
To extract boundary features, an arbitrary shape can be
represented as a polygon. This is done by recursively
subdividing the shape into segments. The process involves:
1.Identifying two points on the boundary that are at
maximum distance (P & Q)
2.Drawing a straight line passing through these points,
which subdivides the boundary into two sub-boundaries
A and B
3.Computing the perpendicular distance of different points
on these boundary segments from the straight line.
JYOTHI - PES1UG21EE037
BOUNDARY DESCRIPTOR/FEATURE
4. Identifying the point with the maximum distance (S in
the upper sub boundary and R in the lower)
5. Subdivide the boundary/segment PS further into PX and
XS in the upper sub boundary and segment PR into PY and
YR in the lower sub boundary.
The resulting polygonal representation can be used to
extract different types of boundary features, such as the
properties of the polygon and the properties of the
boundary segments.
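A sketch of this recursive subdivision in numpy is given below. It is a Douglas-Peucker-style procedure; the function names and the stopping tolerance `tol` are illustrative assumptions, not from the slides.

```python
import numpy as np

def point_line_distance(pts, a, b):
    """Perpendicular distance of each point in pts from the line through a and b."""
    d = b - a
    n = np.array([-d[1], d[0]]) / (np.linalg.norm(d) + 1e-12)  # unit normal
    return np.abs((pts - a) @ n)

def subdivide(points, i, j, tol, keep):
    """Recursively split the sub-boundary points[i..j] at its farthest point."""
    if j <= i + 1:
        return
    dist = point_line_distance(points[i + 1:j], points[i], points[j])
    if dist.max() > tol:
        k = i + 1 + int(np.argmax(dist))   # point of maximum distance (e.g. S or R)
        keep.add(k)
        subdivide(points, i, k, tol, keep)
        subdivide(points, k, j, tol, keep)

def polygonal_approximation(boundary, tol=1.0):
    """Approximate a closed discrete boundary (n x 2 array) by a polygon."""
    n = len(boundary)
    # Step 1: the two boundary points at maximum mutual distance (P and Q)
    diff = boundary[:, None, :] - boundary[None, :, :]
    i, j = np.unravel_index(np.argmax((diff ** 2).sum(-1)), (n, n))
    i, j = min(i, j), max(i, j)
    keep = {i, j}
    # Steps 2-5: subdivide sub-boundary A (i..j) and sub-boundary B (j..i, wrapped)
    subdivide(boundary, i, j, tol, keep)
    wrapped = np.vstack([boundary[j:], boundary[:i + 1]])
    keep_b = set()
    subdivide(wrapped, 0, len(wrapped) - 1, tol, keep_b)
    keep |= {(j + k) % n for k in keep_b}
    return boundary[sorted(keep)]          # polygon vertices in boundary order
```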
JYOTHI - PES1UG21EE037
MCQs
1. A characteristic or a feature that describes an object or a signal is called
a) boundary b) region c) descriptor d) shape
Solution: c) descriptor
2. ____ refers to the boundary or shape of an object
a) shape descriptor b) region descriptor
c) boundary feature d) polygonal representation
Solution: a) shape descriptor
3. ____ refers to the content, colour intensity and texture of an object
a) shape descriptor b) region descriptor
c) boundary feature d) polygonal representation
Solution: b) region descriptor
JYOTHI - PES1UG21EE037
Assignment Questions
1. Mention two applications each of extracting descriptors from visual signals and audio signals.
2. Define: shape descriptor, region descriptor, boundary feature, polygonal representation.
GOURI-PES1UG21EE031
SIGNATURE
A signature in shape analysis refers to a graphical representation that
characterizes a shape by measuring the distance between its
boundary points and the centroid. This distance is evaluated at
various orientations, covering a full circular range of angles.
This method provides a way to quantify and analyze the shape’s
boundary geometry by capturing essential features and variations.
GOURI-PES1UG21EE031
ILLUSTRATIVE EXAMPLE
To better understand the concept of a signature, consider a simple
geometric shape—a square.
(Fig. 1: a square, with a reference line passing through its centroid)
GOURI-PES1UG21EE031
ILLUSTRATIVE EXAMPLE
The signature is obtained by plotting the distance from the square's
centroid to various boundary points.
This is done by analyzing the shape in relation to a reference line that
passes through the centroid.
Measurements are taken at different angles (θ) as the reference line
rotates around the centroid, completing a full 360-degree cycle.
The distance from the centroid to a boundary point, oriented at an
angle θ from the reference line, is denoted as d(θ).
GOURI-PES1UG21EE031
COMPUTING THE SIGNATURE
Method to compute the Signature.
1. Select a Reference Line:
• Start with a line passing through the centroid of the shape.
• This line serves as the baseline for measuring angular orientation.
2. Measure Distances d(θ):
• At each angle θ, calculate the distance from the centroid to a boundary point.
• This distance is referred to as d(θ), where θ represents the angle of rotation from
the reference line.
3. Plot d(θ) vs θ
GOURI-PES1UG21EE031
COMPUTING THE SIGNATURE
Record the values of d(θ) for angles θ ranging from 0° to 360°.
Create a plot with θ on the x-axis and d(θ) on the y-axis to visualize the variations
in distance.
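A minimal numpy sketch of this computation, assuming the boundary is available as sampled (x, y) points; the densely sampled square below is a hypothetical example.

```python
import numpy as np

def signature(boundary, n_angles=360):
    """d(theta): centroid-to-boundary distance, sampled over a full rotation."""
    boundary = np.asarray(boundary, dtype=float)
    c = boundary.mean(axis=0)                       # centroid
    rel = boundary - c
    angles = np.arctan2(rel[:, 1], rel[:, 0]) % (2 * np.pi)
    dists = np.hypot(rel[:, 0], rel[:, 1])
    thetas = np.linspace(0, 2 * np.pi, n_angles, endpoint=False)
    # for each theta, pick the boundary sample whose direction is closest
    diff = np.abs((angles[None, :] - thetas[:, None] + np.pi) % (2 * np.pi) - np.pi)
    return thetas, dists[diff.argmin(axis=1)]

# Hypothetical unit square boundary sampled along its four sides
t = np.linspace(-1, 1, 100, endpoint=False)
square = np.vstack([np.column_stack([t, np.full_like(t, -1)]),   # bottom
                    np.column_stack([np.full_like(t, 1), t]),    # right
                    np.column_stack([-t, np.full_like(t, 1)]),   # top
                    np.column_stack([np.full_like(t, -1), -t])]) # left
thetas, d = signature(square)
# d alternates between minima (theta = 0, 90, ... degrees) and maxima (45, ...)
```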
GOURI-PES1UG21EE031
OBSERVATIONS FROM PLOT
When θ = 0°:
• The reference line aligns with one of the square's sides.
• The distance d(θ) is minimal since the boundary point is closest to the centroid.
When θ = 45°:
• The reference line aligns with one of the square's vertices.
• The distance d(θ) reaches its maximum as the vertex is farthest from the centroid.
When θ = 90°:
• The reference line aligns with another side of the square.
• The distance d(θ) becomes minimal again, as the side is closer than the vertex.
GOURI-PES1UG21EE031
CONCLUSION
The plot alternates between minima and maxima, repeating every
90°, reflecting the symmetry and boundary features of the square.
By analyzing the signature, important boundary characteristics can
be extracted, making it a valuable boundary descriptor.
GOURI-PES1UG21EE031
MCQs
1. What is a signature in shape analysis?
A. The outline of a shape's boundary.
B. A plot of the distance between boundary points and the centroid in various
orientations.
C. The centroid of a shape in 2D space.
D. The measurement of angles between edges of a shape.
ANSWER: B
2. In the signature plot for a square, at which angle θ does the distance d(θ) reach its maximum?
A. θ=0° B. θ=45° C. θ=90° D. θ=180°
ANSWER: B
GOURI-PES1UG21EE031
MCQs
3. What is the significance of the signature plot in shape analysis?
A. It calculates the area of a shape.
B. It determines the color and texture of a shape.
C. It provides insights into the boundary's geometry and helps identify
unique features.
D. It measures the perimeter of a shape.
ANSWER: C
GOURI-PES1UG21EE031
Assignment Questions
1. What does a minimum in the signature plot of a square represent?
2. What is the significance of the plot?
C M SAMARTHA - PES1UG21EE023
FOURIER DESCRIPTORS - Introduction
What are Fourier Descriptors?
• Mathematical representation of the shape of an object using
Fourier series.
• Used to encode boundary or contour information of images.
Why Are They Important in Image Processing?
• Compact and invariant to transformations (scaling, rotation,
translation).
• Useful for shape analysis and feature extraction.
C M SAMARTHA - PES1UG21EE023
FOURIER DESCRIPTORS
How Fourier Descriptors Work?
•Basic Idea:
◦ Represent a closed contour (boundary) of an image in a parametric
form.
◦ Apply the Fourier transform to describe the shape as coefficients in
the frequency domain.
• Steps:
• Extract boundary coordinates (x, y).
• Represent as a complex number
• Perform Fourier Transform
C M SAMARTHA - PES1UG21EE023
FOURIER TRANSFORM
Fourier Transform:
Z_k = Σ_{n=0}^{N−1} s(n) e^(−j2πkn/N), k = 0, 1, …, N−1
• "Z_k" are the Fourier coefficients of the complex boundary sequence s(n).
Fourier Descriptor:
The Fourier coefficients (suitably normalized, as on the next slide) form the Fourier descriptor of the boundary.
C M SAMARTHA - PES1UG21EE023
BOUNDARY REPRESENTATION
Parametric Boundary Representation:
s(k) = x(k) + j·y(k), k = 0, 1, …, N−1 (boundary coordinates expressed as complex numbers)
Properties:
•Translation invariance: Subtract the centroid.
•Scale invariance: Normalize magnitudes of descriptors.
•Rotation invariance: Use magnitude of Fourier coefficients.
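A minimal numpy sketch combining these steps; normalizing by |Z_1| and keeping a fixed number of low-order magnitudes are common conventions assumed here, not prescribed by the slides.

```python
import numpy as np

def fourier_descriptors(x, y, keep=16):
    """Translation-, scale- and rotation-robust descriptors from a boundary."""
    s = x + 1j * y                  # parametric boundary as complex numbers
    s = s - s.mean()                # translation invariance: subtract the centroid
    Z = np.fft.fft(s)               # Fourier coefficients Z_k
    mag = np.abs(Z)                 # rotation invariance: use magnitudes only
    mag = mag / (mag[1] + 1e-12)    # scale invariance: normalize the magnitudes
    return mag[1:keep + 1]          # low-order descriptors (DC term dropped)
```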
C M SAMARTHA - PES1UG21EE023
SHAPE ANALYSIS
Fourier Descriptors for Shape Analysis:
Shape Reconstruction:
• Use low-frequency Fourier coefficients to approximate the shape.
• High-frequency coefficients capture finer details.
Visualization:
• Original boundary vs. reconstructed boundary using a subset of Fourier coefficients.
C M SAMARTHA - PES1UG21EE023
FOURIER DESCRIPTORS IN DEEP LEARNING
Role in Deep Learning:
• Fourier descriptors are used for pre-processing or feature extraction.
• Compress contour data into meaningful representations.
• Reduce dimensionality while preserving shape information.
How Deep Learning Integrates Fourier Descriptors:
• Use Fourier descriptors as features for training CNNs or RNNs.
• Feed Fourier-transformed data into neural networks for robust shape recognition.
C M SAMARTHA - PES1UG21EE023
APPLICATIONS IN IMAGE PROCESSING
Applications in Image Processing
Use Cases:
• Medical Imaging:
Analyze tumor shapes or organ boundaries.
• Autonomous Vehicles:
Detect and classify road signs based on contours.
• Handwriting Recognition:
Use Fourier descriptors to analyze handwritten text contours.
C M SAMARTHA - PES1UG21EE023
MCQs
1.) What is the primary purpose of using Fourier Descriptors in image
processing?
A. To enhance the brightness of an image.
B. To represent and analyze the shape of an object in an image.
C. To extract color information from an image.
D. To improve texture details in an image.
Correct Answer: B
C M SAMARTHA - PES1UG21EE023
MCQs
2.)In Fourier descriptors, how is the contour of a shape typically represented
mathematically?
A. Using the pixel intensity values of the image.
B. Using the histogram of the image's gradient magnitudes.
C. Using a complex function to evaluate boundary coordinates.
D. Using a convolution kernel applied to the image.
Correct Answer: C
C M SAMARTHA - PES1UG21EE023
MCQs
3.)Which of the following properties of Fourier Descriptors makes them robust for
shape-based image analysis?
A. They are invariant to transformations such as scaling, rotation, and translation.
B. They are sensitive to high-frequency noise in the image.
C. They directly analyze the texture and color of the image.
D. They increase the dimensionality of shape representation.
Correct Answer: A
C M SAMARTHA - PES1UG21EE023
Assignment Questions
1.) What is the role of normalization in improving classification accuracy?
2.) How could Fourier descriptors be used in real-world applications like handwriting
recognition?
GOURI-PES1UG21EE031
FOURIER CO-EFFICIENTS
When we truncate the Fourier coefficients, we lose information about the shape.
The effect of truncation can be seen by:
1. Considering only a few Fourier coefficients.
2. Taking the inverse Fourier transform to reconstruct the boundary.
GOURI-PES1UG21EE031
FOURIER CO-EFFICIENTS
Impact of Coefficient Truncation
Full Reconstruction:
To fully reconstruct the original sequence, all N coefficients are required.
Using all a(u) in the inverse Fourier transform:
s(k) = (1/N) Σ_{u=0}^{N−1} a(u) e^(j2πuk/N), k = 0, 1, …, N−1
a(u): Fourier coefficients for u = 0, 1, …, N−1.
s(k): original sequence or signal values.
N: total number of points in the sequence.
This ensures precise recovery of the original boundary or signal.
GOURI-PES1UG21EE031
FOURIER CO-EFFICIENTS
Truncated Reconstruction:
When only P coefficients are used (P < N), some information is lost.
Reconstruction formula:
ŝ(k) = (1/N) Σ_{u=0}^{P−1} a(u) e^(j2πuk/N), k = 0, 1, …, N−1
Here, P determines the level of detail retained in the reconstructed boundary.
GOURI-PES1UG21EE031
RECONSTRUCTION EXAMPLE
Case 1: Low P
Consider P=2, using only the first two coefficients (a(0) and a(1)):
• Reconstruction captures the general trend, resulting in a circular shape.
• Sharp details such as corners (e.g., of a square) are lost.
Case 2: Moderate P
With P=10, more coefficients are included:
• Reconstruction improves, capturing some boundary details.
• The shape is neither a perfect circle nor a fully detailed square, but an
intermediate approximation.
GOURI-PES1UG21EE031
RECONSTRUCTION EXAMPLE
Case 3: High P
As P approaches N (e.g., P=128 for N=128):
• Reconstruction becomes increasingly accurate.
• Sharp corners and detailed boundary features reappear, closely resembling
the original shape.
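A numpy sketch of this experiment, assuming a square boundary with N = 128 points as in the example; `reconstruct` is an illustrative helper that keeps only the first P coefficients.

```python
import numpy as np

# Hypothetical square boundary with N = 128 points, traced counterclockwise
N = 128
t = np.linspace(-1, 1, N // 4, endpoint=False)
square = np.concatenate([t - 1j, 1 + 1j * t, -t + 1j, -1 - 1j * t])

a = np.fft.fft(square)              # all N Fourier coefficients a(u)

def reconstruct(a, P):
    """Inverse transform using only the first P coefficients a(0..P-1)."""
    trunc = np.zeros_like(a)
    trunc[:P] = a[:P]
    return np.fft.ifft(trunc)

low  = reconstruct(a, 2)    # P = 2: general trend only, a near-circular shape
full = reconstruct(a, N)    # P = N: exact recovery, sharp corners reappear
```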
GOURI-PES1UG21EE031
OBSERVATIONS
Truncation Effects:
• Excluding high-order coefficients simplifies the signal, reducing detail in the
boundary.
• Low-order coefficients dominate overall shape, while high-order coefficients
add fine details.
Reconstruction Accuracy:
• Complete reconstruction is possible only when all N coefficients are used.
• Truncation may be useful in applications where only approximate boundary shapes are required.
GOURI-PES1UG21EE031
OBSERVATIONS
Through Fourier transformation, the reconstructed shape may approximate
a circular form. This occurs because lower-order coefficients primarily
capture the general trend of the signal, while higher-order coefficients are
responsible for detailing its finer features. In the case of a square shape,
these details include the sharp corners and vertices. When the higher-order
coefficients are truncated, the detailed information is lost, resulting in a
simplified reconstruction lacking the original shape's precise characteristics.
GOURI-PES1UG21EE031
VISUALIZATION OF RECONSTRUCTION
• Original Shape: A square boundary with N points.
• With P=2: Circular shape with no sharp corners.
• With P=10: An intermediate shape showing partial details.
• With P=N: Full recovery of the square with sharp edges.
GOURI-PES1UG21EE031
CONCLUSION
Fourier coefficients are crucial for understanding and reconstructing boundary
shapes.
Low-order coefficients provide general trends, while high-order coefficients reveal
finer details.
Truncation simplifies reconstruction but sacrifices detail, which may or may not be
desirable depending on the application.
The Inverse Fourier Transformation allows controlled reconstruction of boundaries,
demonstrating the balance between accuracy and computational simplicity.
GOURI-PES1UG21EE031
MCQs
1. What happens to the reconstructed shape when higher-order coefficients are
truncated?
A. It becomes more detailed.
B. It remains unchanged.
C. Detailed information, such as sharp corners, is lost.
D. The shape turns into a square.
ANSWER: C
2. In Fourier Transformation, what does P represent?
A) Number of original boundary points.
B) Number of truncated points.
C) Number of coefficients used for reconstruction.
D) Total number of vertices.
ANSWER: C
GOURI-PES1UG21EE031
MCQs
3. What happens when all higher-order coefficients are truncated in Fourier
reconstruction?
A. The reconstructed shape is identical to the original.
B. The shape becomes circular due to loss of details.
C. The shape turns into a straight line.
D. The shape remains unchanged.
ANSWER: B
GOURI-PES1UG21EE031
Assignment Questions
1. Why does a square shape become circular when P=2?
2.Describe the impact of low-order and high-order Fourier coefficients
on signal representation.
Deep Learning
Statistical Moments – PES1UG21EE049 (Naman)
Statistical moments are a powerful tool for analyzing shapes, going
beyond simple boundary descriptions. By normalizing a shape’s
boundary, we can calculate these moments to gain a deeper
understanding of its features and characteristics.
Definition: Statistical moments are quantitative measures that
describe the shape and characteristics of a data distribution.
Importance: They provide insights into various aspects like central
tendency, dispersion, and the shape of the data, which are crucial for
data analysis and modeling.
Deep Learning
Statistical Moments – PES1UG21EE049 (Naman)
Relevance of Moments in Deep Learning
• Data Preprocessing: Understanding moments helps in normalizing
and standardizing data, ensuring efficient training of models.
• Feature Engineering: Moments can be used to create features that
capture essential aspects of the data distribution, improving model
performance.
• Model Evaluation: Analyzing the moments of residuals can help in
assessing model accuracy and diagnosing issues like overfitting.
Deep Learning
Statistical Moments – PES1UG21EE049 (Naman)
The Four Primary Moments
1. First Moment – Mean
– Represents the average value of the data set.
– Indicates the central location of the data.
2. Second Moment – Variance
– Measures the dispersion or spread of the data around the mean.
– A higher variance signifies more spread-out data points.
3. Third Moment – Skewness
– Assesses the asymmetry of the data distribution.
– Positive skewness indicates a longer tail on the right; negative skewness indicates a longer tail on the left.
4. Fourth Moment – Kurtosis
– Evaluates the "tailedness" of the distribution.
– High kurtosis implies more data in the tails; low kurtosis indicates a flatter distribution.
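A short sketch computing the four moments on synthetic data; numpy and scipy are assumed library choices.

```python
import numpy as np
from scipy import stats

data = np.random.default_rng(0).normal(loc=5.0, scale=2.0, size=10_000)

mean     = np.mean(data)        # 1st moment: central location
variance = np.var(data)         # 2nd central moment: spread around the mean
skewness = stats.skew(data)     # 3rd standardized moment: asymmetry
kurtosis = stats.kurtosis(data) # 4th standardized moment: "tailedness"
                                # (scipy reports excess kurtosis: 0 for a normal)
print(mean, variance, skewness, kurtosis)
```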
Deep Learning
Statistical Moments – PES1UG21EE049 (Naman)
1. What does the second moment of a data distribution represent?
a) Mean
b) Skewness
c) Variance
d) Kurtosis
Answer: c) Variance
2. In deep learning, why are moments important in batch normalization?
a) To initialize weights properly
b) To normalize inputs using mean and variance
c) To reduce computational complexity
d) To calculate gradients directly
Answer: b) To normalize inputs using mean and variance
3. Which of the following moments evaluates the asymmetry of a data distribution?
a) Mean
b) Variance
c) Skewness
d) Kurtosis
Answer: c) Skewness
Deep Learning
Statistical Moments – PES1UG21EE049 (Naman)
Assignment Questions
1. Explain how statistical moments (mean, variance, skewness, and
kurtosis) are utilized in data preprocessing and model evaluation in
deep learning.
2. Discuss the significance of skewness and kurtosis in understanding
data distribution for machine learning models. How can these
moments affect model training and performance?
THANK YOU
Department of Electrical and Electronics Engineering
Enhancing Audio Signal Processing in Deep
Learning: The Role of Region Descriptors
Department of Electrical and Electronics Engineering
Introduction to Audio Signal Processing
In recent years, audio signal processing has gained
significant attention in deep learning. This
presentation explores the role of region descriptors
in enhancing audio analysis and recognition tasks,
focusing on their impact on performance and
accuracy.
Understanding Region Descriptors
Region descriptors are specialized components
that enhance the extraction of features from
audio signals. They focus on specific frequency
ranges and temporal patterns, improving the
model's ability to discern important
characteristics in audio data.
Understanding Region Descriptors
Various deep learning models such as CNNs and
RNNs are used for audio signal processing. By
integrating region descriptors, these models can
achieve higher accuracy and better robustness in
tasks like speech recognition and music
classification.
Benefits of Region Descriptors
Incorporating region descriptors leads to
significant benefits, including enhanced feature
representation, reduced computational costs,
and improved generalization across different
audio datasets. This makes them a valuable tool
in audio signal processing.
Challenges of Region Descriptors
Despite their advantages, implementing region
descriptors poses challenges such as complexity
in model design and the need for extensive
training data. Addressing these issues is crucial
for maximizing their potential in audio
processing applications.
Conclusion and Future Directions
In conclusion, region descriptors play a pivotal
role in enhancing audio signal processing within
deep learning frameworks. Continued research
and innovation in this area will likely lead to
further advancements in audio technology and
applications.
THANK YOU
Department of Electrical and Electronics Engineering
INTRODUCTION TO
DEEP LEARNING
Niharika S Chauhan
PES1UG21EE054
Department of Electrical and
Electronics Engineering
Deep Learning
Intensity Histogram
An intensity histogram is a graphical representation that shows the
distribution of intensity values in an image.
Deep Learning
Intensity Histogram
Each pixel in a black and white image is assigned an intensity value, which is typically quantized using an 8-bit binary number. This means intensity values can range from 0 (black) to 255 (white).
The histogram H(i) for an intensity value i indicates how many pixels in the image have that specific intensity.
To understand the probability of occurrence of each intensity level, the
histogram can be normalized. By normalizing the histogram, we can obtain
the intensity probability distribution, which highlights the occurrence of
intensity levels, often indicating characteristics of the image such as
brightness.
Deep Learning
Intensity Histogram
The normalized histogram is calculated as:
H(i) = N_i / N
where N_i is the number of pixels with intensity i, and N is the total number of pixels in the image.
This gives the probability of a pixel having intensity value i.
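A minimal numpy sketch of this normalization on a hypothetical 8-bit image:

```python
import numpy as np

# Hypothetical 8-bit grayscale image
image = np.random.default_rng(0).integers(0, 256, size=(64, 64), dtype=np.uint8)

counts = np.bincount(image.ravel(), minlength=256)  # N_i: pixels with intensity i
H = counts / image.size                             # normalized: H(i) = N_i / N

assert np.isclose(H.sum(), 1.0)     # a valid probability distribution
print("probability of intensity 128:", H[128])
```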
Deep Learning
Intensity Histogram – Example 1
A histogram skewed towards lower intensity values indicates that
most pixels are dark, resulting in a darker image.
Deep Learning
Intensity Histogram – Example 2
Conversely, if the histogram shows higher intensity values, the
image appears brighter.
For example, if most intensity values cluster around 70
(between 50 and 100), the image will be brighter compared to
one where most values are close to 0.
Deep Learning
Intensity Histogram
Applications:
• Contrast Enhancement: Improves the visibility of features in an image by
adjusting the intensity levels based on the histogram distribution.
• Image Segmentation: Aids in identifying and isolating specific regions of
interest within an image based on intensity values.
• Image Equalization: Enhances the overall contrast of an image by
redistributing intensity values to achieve a uniform histogram.
• Feature Extraction: Assists in identifying key features in an image for further
analysis or processing tasks.
Deep Learning
MCQs
Multiple Choice Questions:
Q1) What does an intensity histogram represent in an image?
A)The color distribution of the image
B)The distribution of intensity values of pixels
C)The shape of objects in the image
D)The size of the image
Answer: (B)
Q2) In an 8-bit image, what is the maximum intensity value a pixel can have?
A) 128 B) 255 C) 512 D) 1024
Answer: (B)
Deep Learning
MCQs
Multiple Choice Questions:
Q3) What is the purpose of normalizing an intensity histogram?
A)To change the color of the image
B)To reduce the size of the image
C)To determine the frequency of occurrence of each intensity level
D)To enhance the resolution of the image
Answer: (C)
Deep Learning
Assignment Questions
Assignment Questions:
Q1) What is the significance of intensity profiles in region descriptors?
Q2) What does a high frequency of low intensity values in a histogram indicate
about an image? What effect does a histogram with a peak around higher
intensity values have on an image's appearance?
THANK YOU
Niharika S Chauhan
PES1UG21EE054
Department of Electrical and
Electronics Engineering
[email protected]
INTRODUCTION TO
DEEP LEARNING
Neha Shivaraj
PES1UG21EE053
Department of Electrical and
Electronics Engineering
Deep Learning
Colour Histograms - Introduction
A colour histogram is a representation of the distribution of colours in an image.
Deep Learning
Colour Histograms of Basic Colours
(Figures: distributions of the strength of the red, green, and blue colours in the image)
Deep Learning
Statistics in Colour Histograms
Normalized histogram: h(r_i) is the frequency of occurrence of intensity value r_i.
Mean of intensity:
μ = Σ_{∀i} r_i h(r_i)
Statistical moment of order k:
σ_k = Σ_{∀i} (r_i − μ)^k h(r_i)
k = 2 gives the variance (spread of the histogram); k = 3 gives the skewness of the histogram.
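A short numpy sketch of these statistics, assuming a hypothetical normalized histogram h over intensity values r_i = 0, ..., 255:

```python
import numpy as np

r = np.arange(256)                            # intensity values r_i
counts = np.random.default_rng(1).integers(0, 50, size=256)
h = counts / counts.sum()                     # normalized histogram h(r_i)

mu = np.sum(r * h)                            # mean of intensity

def moment(k):
    """Statistical moment of order k about the mean."""
    return np.sum((r - mu) ** k * h)

variance = moment(2)                          # spread of the histogram
skewness = moment(3)                          # asymmetry of the histogram
```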
Deep Learning
Texture Descriptor
Pixel Domain / Co-occurrence Matrix
• An image is a two-dimensional array of integers.
• Intensity is represented by an eight-bit number and varies from 0 to 255.
• A rectangular or square area within the image gives a matrix of integer numbers.
• Given the matrix, texture is characterized by how pairs of intensity values co-occur at given relative positions; this is captured by the co-occurrence matrix.
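A minimal sketch of computing a gray-level co-occurrence matrix; scikit-image is an assumed library choice and the 4x4 patch is hypothetical.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops  # scikit-image

# Hypothetical patch quantized to 4 gray levels
patch = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 2, 2, 2],
                  [2, 2, 3, 3]], dtype=np.uint8)

# Co-occurrences of intensity pairs one pixel apart, horizontally
glcm = graycomatrix(patch, distances=[1], angles=[0], levels=4,
                    symmetric=True, normed=True)
print(graycoprops(glcm, "contrast"))   # one texture feature from the matrix
```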
Deep Learning
MCQs
Multiple Choice Questions:
Q1) What does a colour histogram represent?
A)The colour distribution of the image
B) The size of the pixels in image
C) Shape of different coloured objects
D)The distribution of intensity values.
Answer: (A)
Q2) Which of these colours are not basic in colour histogram
A) Blue B) Yellow
C) Green D) Red
Answer: (B)
Deep Learning
MCQs
Multiple Choice Questions:
Q3) What do σ_k and μ represent?
A) Mean and median of colour intensity
B) Median and mode of intensity
C) Mean of colour intensity and statistical moment of order k
D) Statistical moment of order k and mode of intensity
Answer: (C)
Deep Learning
Assignment Questions
Assignment Questions:
Q1) What is the difference between an intensity histogram and a colour histogram? What are the basic colours used for colour histograms?
Q2) What is a textured image? Describe a co-occurrence matrix.
THANK YOU
Neha Shivaraj
PES1UG21EE053
Department of Electrical and
Electronics Engineering
[email protected]
INTRODUCTION TO
DEEP LEARNING
Nathan Samuel
PES1UG21EE051
Department of Electrical and
Electronics Engineering
Deep Learning
Understanding the Concept
A concurrence matrix represents relationships or co-occurrences
between entities.
Commonly used in:
• Natural Language Processing (NLP): Word co-occurrence matrix.
• Computer Vision: Texture feature extraction with GLCM.
• Model Evaluation: Confusion matrix.
• Graph Neural Networks: Adjacency matrix.
• Attention Mechanisms: Transformers' attention matrix.
Deep Learning
Real-World Applications
• NLP: Word co-occurrence
matrices in GloVe for word
embeddings.
• Semantic relationships in text.
• Computer Vision: GLCM for
extracting texture features in
medical imaging and pattern
recognition.
• Model Evaluation: Confusion
matrix for accuracy, precision,
and recall metrics.
• Transformers (NLP and Vision):
Attention matrix for analysing
token-to-token relationships.
Deep Learning
Concurrence Matrix
Takeaways:
• Versatility: Concurrence matrices play a
vital role in analyzing and improving
models across domains.
• Integration with Deep Learning: These
matrices enable advanced techniques like
embeddings, attention mechanisms, and
GNNs.
• Future Directions: Enhanced
interpretability and scalability of
concurrence matrices in larger datasets
and complex models.
Deep Learning
Challenges and Future Directions
Challenges:
• Scalability: high computational cost for large datasets (e.g., large vocabularies in NLP or dense graphs).
• Sparsity: co-occurrence matrices can become sparse, leading to inefficiencies in representation and computation.
• Interpretability: complex matrices (e.g., attention scores) are hard to interpret in large-scale models.
Future Directions:
• Efficient Representations: use of compressed or low-rank approximations for scalability.
• Integration with Neural Architectures: advanced use in graph neural networks and hybrid models.
• Enhanced Interpretability: developing better visualization tools for analyzing matrix-based relationships (e.g., attention heatmaps).
Deep Learning
MCQs
Multiple Choice Questions:
Q1) What is the primary purpose of a co-occurrence matrix in Natural Language Processing (NLP)?
A) To evaluate the performance of a B) To capture the relationships or
classification model frequencies of word pairs in a given context
C) To extract texture features from an D) To calculate the adjacency relationships in
image a graph
Answer: (B)
Q2) Which of these techniques does not require manual feature extraction ?
A) Confusion Matrix B) Gray-Level Co-occurrence Matrix (GLCM)
C) Attention Matrix D) Adjacency Matrix
Answer: (B)
Deep Learning
MCQs
Multiple Choice Questions:
Q3) In transformers, the attention matrix represents:
A)The co-occurrence of words in a text corpus
B)The frequency of pixel intensity relationships in an image
C)The relationships between tokens based on their relevance to each
other
D) The classification results of a model
Answer: (C)
Deep Learning
Assignment Questions
Assignment Questions:
Q1) What is the significance of a co-occurrence matrix in Natural Language
Processing (NLP), and how is it constructed?
Q2) Describe the role of the Gray-Level Co-occurrence Matrix (GLCM) in texture
analysis for computer vision applications.
THANK YOU
Nathan Samuel
PES1UG21EE051
Department of Electrical and
Electronics Engineering
[email protected]
INTRODUCTION TO
DEEP LEARNING
Aditya K.N.
PES1UG20EE003
Department of Electrical and
Electronics Engineering
Deep Learning
Spectral Domain- MFCC
Mel Frequency Cepstral Coefficients (MFCC) are one of the most popular
vector representation techniques used in the field of deep learning.
• MFCCs are designed to capture the most relevant features of the
human auditory system. They approximate the way humans perceive
sound, particularly speech, by focusing on frequencies that are more
perceptible to the human ear.
• MFCCs are used to represent the short-term power spectrum of
sound, particularly for speech recognition, speaker identification,
and audio classification.
Deep Learning
Spectral Domain- MFCC
Steps for the computation of MFCC:
1. Take the Fourier transform of the speech samples.
2. Convert the obtained Fourier frequency coefficients into Mel frequency coefficients.
– Use triangular overlapping windows to map powers onto the Mel scale.
– This is required because our auditory system is less sensitive to higher-frequency components than to lower-frequency components.
3. Take the logarithm of these Mel frequency coefficients.
– This is necessary because our auditory system perceives loudness roughly logarithmically: it is less sensitive to changes among loud signals than among quieter ones.
4. Take the discrete cosine transform (DCT) of the log Mel frequency coefficients.
5. The resulting DCT coefficients are the MFCC coefficients.
– Discrimination between different speakers and identification of different spoken words are possible using MFCC coefficients.
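In practice these steps are bundled in audio toolkits; below is a minimal sketch using librosa, an assumed library choice not named in the slides.

```python
import librosa  # assumed library; it performs STFT -> Mel filter bank
                # (triangular overlapping windows) -> log -> DCT internally

y, sr = librosa.load(librosa.example("trumpet"))   # any mono audio signal works
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
print(mfcc.shape)   # (13, n_frames): 13 coefficients per short-time frame
```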
Deep Learning
Traditional Machine Learning vs Deep Learning
• Feature extraction: traditional ML often requires manual feature extraction; deep learning does not.
• Input handling: in traditional ML, the raw input signal during the training phase is fed to a feature extractor; in deep learning, the raw input signal is fed to the learning algorithm directly.
• Decision making: in traditional ML, the decision on the input raw signal is taken by the learning algorithm operating on the extracted features; in deep learning, the decision is produced directly from the raw input signal.
• Representation: in traditional ML, the input during training is typically represented as vectors; in deep learning, it can be represented in various forms, not necessarily as vectors.
• Feature design: in traditional ML, the feature vectors are decided by the user (a human being) and the feature extraction algorithm is designed accordingly; in deep learning, the features are decided and even learnt by the machine itself.
Deep Learning
Representation of the input raw signal as a set of vectors
Example: a 3×3 matrix of image pixels from the visual domain and its (column-wise) vector representation:
Matrix:
7 4 3
5 1 6
9 2 8
Vector: [7 5 9 4 1 2 3 6 8]ᵀ
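A one-line numpy sketch of this column-wise flattening:

```python
import numpy as np

A = np.array([[7, 4, 3],
              [5, 1, 6],
              [9, 2, 8]])

v = A.flatten(order="F")   # column-wise ("Fortran"-order) flattening
print(v)                   # [7 5 9 4 1 2 3 6 8]
```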
Deep Learning
MCQs
Multiple Choice Questions:
Q1) ____ is a technique used to capture the most relevant features of the human auditory system.
A)Wavelet Transform B)MFCC
C)SIFT D)Multimodal embeddings
Answer: (B)
Q2) Which of these techniques does not require manual feature extraction ?
A) Traditional Machine Learning B) Artificial Neural Network
C) Random Forest D) Deep Learning
Answer: (D)
Deep Learning
MCQs
Multiple Choice Questions:
Q3) Which of the following Transforms are taken during the computation of MFCC in audio samples ?
A) Mellin B) Laplace
C) Fourier D) Z
Answer: (C)
Deep Learning
Assignment Questions
Assignment Questions:
Q1) Mention the steps for the computation of Mel Frequency Cepstral Coefficients.
Q2) Mention the differences between Traditional Machine Learning and Deep
Learning.
THANK YOU
Aditya K.N.
PES1UG20EE003
Department of Electrical and
Electronics Engineering
[email protected]
Bayesian Learning - I
Introduction
Signals as Feature Vectors :
•Signals (e.g., voice or visual) can be transformed into feature representations.
•Features can be derived from shape, region, intensity, color, or texture.
Feature Vector Concept :
•Extracted features are organized in a vector format.
•Example: A signal with 3 features becomes a 1D vector with 3 components.
•This vector represents a point in a 3D feature space.
Bayesian Learning - I
Introduction – Transition to High Dimensional Feature
Space
Generalization to d Dimensions:
•Extract d features or descriptors.
•Arrange these descriptors to form a vector with d elements.
Mapping Signals to Feature Space:
•Each signal is represented by a unique vector in a d-dimensional space.
•Multiple signals create corresponding points/vectors in this space.
Bayesian Learning - I
Signal Comparison using Feature Vectors
Similarity and Dissimilarity:
•Compute the difference between two vectors to measure similarity.
•Low difference: Signals are similar.
•High difference: Signals are dissimilar.
Applications:
•Pattern recognition.
•Signal classification.
•Feature-based comparison and analysis.
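A minimal numpy sketch of such a comparison using the Euclidean distance between feature vectors (one common choice); the vectors are hypothetical.

```python
import numpy as np

a = np.array([0.9, 0.1, 0.4])   # hypothetical 3-component feature vectors
b = np.array([0.8, 0.2, 0.5])
c = np.array([0.1, 0.9, 0.9])

print(np.linalg.norm(a - b))    # small difference -> similar signals
print(np.linalg.norm(a - c))    # large difference -> dissimilar signals
```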
Bayesian Learning - I
MCQ’s
1. Which of the following best describes the concept of a "Feature Vector" as
presented in the image?
a) A mathematical representation of a signal's characteristics.
b) A graphical representation of a signal's waveform.
c) A numerical value representing the amplitude of a signal.
d) A physical measurement of a signal's frequency.
Answer: a
Bayesian Learning - I
MCQ’s
2. Which of the following best describes the concept of "Generalization to d Dimensions" as
presented in the image?
a) The process of reducing the dimensionality of a signal representation.
b) The ability to represent signals with varying complexity using a fixed number of features.
c) The process of extracting d features or descriptors from a signal and representing them as
a vector.
d) The ability of a machine learning model to generalize to unseen data.
Answer: c
Bayesian Learning - I
MCQ’s
3. Which of the following statements best describes the relationship between feature vector
differences and signal similarity, as depicted in the image?
a) Large differences between feature vectors indicate high similarity between signals.
b) Small differences between feature vectors indicate low similarity between signals.
c) The magnitude of the difference between feature vectors has no relation to signal
similarity.
d) Small differences between feature vectors indicate high similarity between signals.
Answer: d
Bayesian Learning - I
Assignment Questions
1.What is a feature vector, and how is it used in signal comparison?
A feature vector is a mathematical representation of a signal's characteristics. It is created by extracting relevant
features from the signal and organizing them into a vector format. In signal comparison, the difference between
the feature vectors of two signals is computed. Smaller differences indicate higher similarity between the signals.
2.What are some applications of feature vector-based signal comparison mentioned in the image?
The image lists several applications, including:
•Pattern recognition
•Signal classification
•Feature-based comparison and analysis
These applications utilize the concept of comparing feature vectors to differentiate between signals, group similar
signals, and extract meaningful information from the feature representations.
Bayesian Learning - I
Feature Space Representation
• What is Feature Space? : A multidimensional space where features of data
points are represented as vectors.
• Dimensionality:
• Each dimension corresponds to a feature of the data.
• Example: A 3D feature space can represent height, weight, and age of
individuals.
• Purpose: To analyze relationships and patterns among data points.
Bayesian Learning - I
Feature Space Representation
Understanding Feature Clustering
• Feature Vectors Representation: Signals are represented as vectors in a
multidimensional feature space, e.g., a 3D space or higher.
• Clusters Formation: Signals of similar types (e.g., images of birds, cars, or dogs) form
clusters in the feature space.
• Cluster Characteristics:
• Similar signals form tightly packed clusters.
• Dissimilar signals are represented by vectors that are farther apart in the feature
space.
Bayesian Learning - I
Probability Density Representation
Feature Distribution within Clusters
• Density at the Center: High density at the center of the cluster indicates the mean location
of feature vectors.
• Decreasing Density: Density reduces as the distance from the center increases.
• Probability Density Functions:
• Clusters follow specific probability density functions (PDFs), such as circular or elliptical in
2D and spherical or ellipsoidal in higher dimensions.
Bayesian Learning - I
Surface Plots
3D Surface Plots of Class Distributions
• Class Conditional Density Functions: Represents the distribution of feature vectors for a
given class.
• 3D Representation: Probability density shown as surface plots for visualizing cluster
distributions.
• Examples:
• Class “Birds” → One surface plot.
• Class “Cars” → Another surface plot.
Bayesian Learning - I
MCQ’s
1. What does a feature space represent in the context of data analysis?
a) A space where signals are converted into audio forms.
b) A multidimensional space where features of data points are represented as vectors.
c) A single-dimensional representation of signals.
d) A space limited to only 2D features.
Answer: b
Bayesian Learning - I
MCQ’s
2. What indicates that two signals are similar in a feature space?
a) The vectors representing the signals are far apart.
b) The signals have different clusters.
c) The distance between the feature vectors is small.
d) The signals are represented in a 3D space.
Answer: c
Bayesian Learning - I
MCQ’s
3. What does the density at the center of a cluster in feature space signify?
a) The number of classes in the dataset.
b) A high concentration of feature vectors around the mean location.
c) A low likelihood of feature vector occurrence.
d) The similarity between dissimilar signals.
Answer: b
Bayesian Learning - I
Assignment Questions
1. Explain the concept of clustering in feature space.
2. Define feature vectors and explain how they are used to represent data in a feature space.
3. What is the significance of density in a cluster?
Bayesian Learning - I
Bayes Rule
Bayesian Learning is a probabilistic approach to machine learning that is
based on Bayes' Theorem. It is used to update the probability estimate for
a hypothesis as new evidence or data becomes available. This method
provides a robust framework for dealing with uncertainty and
incorporating prior knowledge into the learning process.
Bayesian Learning - I
Bayes Rule
Bayes' Rule (or Bayes' Theorem) is a fundamental concept in probability theory and
statistics that describes how to update the probability of a hypothesis A given new
evidence B.
The rule is mathematically expressed as:
P(A|B) = P(B|A) · P(A) / P(B)
Where,
P(A|B) :The posterior probability of the hypothesis A after observing evidence B.
P(B|A) : The likelihood of observing evidence B given that A is true.
P(A) : The prior probability of the hypothesis A (before observing the evidence).
P(B) : The marginal probability of the evidence B (the probability of observing B
under all possible hypotheses).
Bayesian Learning - I
Bayes Rule – Steps Involved
1.Define the Prior: Begin with an initial belief about the
parameters of the model.
2.Collect Data: Gather new evidence or observations.
3.Compute the Likelihood: Evaluate how well the data fits the
hypothesis.
4.Update Beliefs: Use Bayes' Rule to compute the posterior
distribution.
5.Make Predictions: Use the posterior distribution to make
probabilistic predictions about new data.
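A minimal sketch of one such update for a single binary hypothesis; the function name and the numbers are illustrative only.

```python
def posterior(prior, likelihood, likelihood_not):
    """Bayes update: P(A|B) = P(B|A) P(A) / P(B),
    with the evidence P(B) expanded over A and not-A."""
    evidence = likelihood * prior + likelihood_not * (1 - prior)
    return likelihood * prior / evidence

# Illustrative numbers: prior belief 0.3, evidence twice as likely under A
print(posterior(prior=0.3, likelihood=0.8, likelihood_not=0.4))  # ~0.46
```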
Bayesian Learning - I
Bayes Rule – Applications
1. Spam Filtering: Classify emails as spam or not based on word
probabilities.
2. Medical Diagnosis: Update the probability of diseases given symptoms
or test results.
3. Anomaly Detection: Identify unusual patterns in data (e.g., fraud
detection).
4. Bayesian Neural Networks: Introduce uncertainty in deep learning
models.
5. Recommendation Systems: Adapt user preferences dynamically with
new data.
Bayesian Learning - I
Bayes Rule – Advantages
1.Incorporates Prior Knowledge: Leverages existing
information for informed decision-making.
2.Handles Uncertainty: Provides a full probability distribution
rather than point estimates.
3.Adaptive: Updates beliefs dynamically as new data arrives.
Bayesian Learning - I
Bayes Rule – Disadvantages
1. Computational Complexity: Calculating the posterior can be
expensive, especially with complex models.
2. Choosing a Prior: The choice of prior can significantly influence results,
especially with limited data.
3. Intractable Integrals: Exact computation of the evidence can be
challenging in high dimensions, requiring approximations like Markov
Chain Monte Carlo (MCMC) or Variational Inference.
Bayesian Learning - I
Bayes Rule – MCQ’s
1. What is the primary advantage of Bayesian learning compared
to other probabilistic methods?
A) It always guarantees the best hypothesis.
B) It updates the probability of hypotheses as new data is
observed.
C) It requires no prior assumptions about the data.
D) It works only for small datasets.
Answer: B
Bayesian Learning - I
Bayes Rule – MCQ’s
2. In Bayesian learning, the term posterior probability refers to:
A) The probability of the prior hypothesis.
B) The likelihood of the data given the hypothesis.
C) The probability of the hypothesis given the observed data.
D) The normalization factor in Bayes' theorem.
Answer: C
Bayesian Learning - I
Bayes Rule – MCQ’s
3. Which of the following is NOT a component of Bayes' theorem?
A) Prior probability
B) Posterior probability
C) Gradient descent
D) Likelihood
Answer: C
Bayesian Learning - I
Bayes Rule – Assignment
1.)You are tasked with diagnosing a rare disease. The probability of
having the disease in the general population is 0.01 (1%). A
diagnostic test has the following properties:
•If a person has the disease, the test correctly identifies it 95% of
the time (true positive rate).
•If a person does not have the disease, the test incorrectly indicates
the disease 5% of the time (false positive rate).
A patient tests positive. Using Bayes' theorem, calculate the
probability that the patient actually has the disease.
Bayesian Learning - I
Bayes Rule – Assignment
2. Suppose you have a biased coin where the probability of heads is
P(H)=0.7 and the probability of tails is P(T)=0.3.
You observe the following data: you flipped the coin 5 times and got 3
heads and 2 tails.
If you want to calculate the posterior probability of the coin being biased
towards heads given the observed data, use Bayes' Theorem. Assume a
uniform prior for the bias of the coin.
Bayesian Learning - I
Bayes Minimum Error Classifier
● The Bayes Minimum Error Classifier is based on Bayes' Theorem.
● It aims to minimize the probability of misclassification.
● It is optimal when class priors and likelihoods are known.
Bayesian Learning - I
Decision Rule
A decision rule is a guideline or criterion used to make a choice between alternatives based on available data or conditions. It defines how decisions are made in various contexts, such as statistics, machine learning, business, and everyday problem-solving.
The classifier assigns x to class ωi if:
P(ωi | x) > P(ωj | x) for all j ≠ i
This ensures minimal probability of misclassification.
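A minimal numpy sketch of this rule; the likelihoods and priors are illustrative.

```python
import numpy as np

def bayes_min_error(likelihoods, priors):
    """Assign x to the class with the highest posterior P(w_i | x).
    likelihoods: p(x | w_i) for each class; priors: P(w_i)."""
    unnorm = np.asarray(likelihoods) * np.asarray(priors)  # proportional to posterior
    return int(np.argmax(unnorm))   # the evidence p(x) cancels in the comparison

print(bayes_min_error(likelihoods=[0.6, 0.3], priors=[0.5, 0.5]))  # class 0
```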
Bayesian Learning - I
Decision Boundaries
The decision boundary is the surface that separates regions where different classes have the highest posterior probability.
For a two-class problem (ω1 and ω2), the decision boundary is where:
P(ω1 | x) = P(ω2 | x)
Bayesian Learning - I
Minimum Error Rate (Bayes Risk)
The minimum classification error is given by:
P(error | x) = min[P(ω1 | x), P(ω2 | x)], so P(error) = ∫ min[P(ω1 | x), P(ω2 | x)] p(x) dx
Bayesian Learning - I
Summary
● The Bayes Minimum Error Classifier assigns samples to the class with the highest
posterior probability.
● It is optimal but requires knowledge of priors and likelihoods.
● Used in pattern recognition, spam filtering, and medical diagnosis.
Bayesian Learning - I
MCQ’s
Which of the following correctly describes the decision rule in the Bayes
Minimum Error Classifier?
A.A sample x is assigned to the class with the highest likelihood P(x∣ωi).
B.A sample x is assigned to the class with the highest posterior probability
P(ωi∣x).
C.A sample x is assigned to the class with the highest prior probability P(ωi).
D.A sample x is assigned to the class with the lowest evidence P(x).
Answer: B
Bayesian Learning - I
MCQ’s
The decision boundary is the surface where:
A.P(ω1∣x)<P(ω2∣x)
B.P(ω1∣x)>P(ω2∣x)
C.P(ω1∣x)=P(ω2∣x)
D.P(ω1)=P(x)
Answer: C
Bayesian Learning - I
MCQ’s
What is the primary objective of Bayes' Minimum Risk
Classifier?
A.To maximize the probability of misclassification
B.To minimize the probability of misclassification
C.To ignore classification errors
D.To always choose the majority class
Answer: B
Bayesian Learning - I
Assignment Questions
1. Write Bayes’ theorem formula
2. What is a Decision Rule?
Bayesian Learning - I
Bayes Minimum Risk Classifier
What is Classification?
•Classification is the process of assigning a label or category to an
input based on some features.
Types of Classifiers:
•Bayes Classifier
•Support Vector Machine (SVM)
•Decision Trees, etc.
Bayesian Learning - I
Risk in Classification
Risk in Classification:
• In classification, risk refers to the expected loss when predicting a class label.
• Loss Function (L): defines the penalty for misclassification.
• Expected Risk (or Loss):
R(h) = Σ_y P(Y = y | X) · L(h(X), y)
• R(h) is the expected loss for classifier h.
• L(h(X), y) is the loss when the classifier predicts h(X) and the true label is y.
Bayesian Learning - I
BMR Classifier - Objective
Objective of the BMR Classifier:
The goal is to choose the class label Ŷ to minimize the conditional risk:
Ŷ = arg min_y Σ_{y′} P(Y = y′ | X) · L(y, y′)
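A minimal numpy sketch of this objective with an illustrative asymmetric loss matrix; all numbers are hypothetical.

```python
import numpy as np

def bayes_min_risk(posteriors, loss):
    """Pick the label minimizing R(y|x) = sum_y' P(Y=y'|X) L(y, y')."""
    risks = np.asarray(loss) @ np.asarray(posteriors)  # R(y|x) per candidate y
    return int(np.argmin(risks))

posteriors = [0.7, 0.3]          # P(Y=0|X), P(Y=1|X)
loss = [[0.0, 5.0],              # predicting 0 is very costly if the truth is 1
        [1.0, 0.0]]              # predicting 1 costs little if the truth is 0
print(bayes_min_risk(posteriors, loss))  # picks class 1 despite the lower posterior
```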
Bayesian Learning - I
Summary
• The Bayes Minimum Risk Classifier is an optimal decision rule that minimizes expected
loss.
• It is highly flexible, as it can work with various loss functions.
• Understanding the underlying Bayes theorem and risk concepts is key to using BMR
effectively.
Bayesian Learning - I
MCQs
Which of the following represents the Bayes decision rule for a classifier?
A. Ŷ = arg max_y P(Y=y∣X)
B. Ŷ = arg min_y ∑_{y′} P(Y=y′∣X) L(y, y′)
C. Ŷ = arg max_X P(Y=y∣X)
D. Ŷ = arg min_X P(Y=y∣X)
Answer: B
Bayesian Learning - I
MCQs
In the context of the Bayes Minimum Risk (BMR) Classifier, what is the
primary objective of minimizing risk?
A. To maximize the posterior probability of the predicted class
B. To minimize the expected loss associated with misclassification
C. To maximize the likelihood of observing the features
D. To reduce the number of false positives in the classification
Answer: B
Bayesian Learning - I
MCQs
If you are using a 0-1 loss function, which decision rule should the Bayes
Minimum Risk Classifier follow?
A. Choose the class with the maximum prior probability
B. Choose the class with the maximum posterior probability
C. Choose the class with the minimum likelihood
D. Choose the class with the highest loss
Answer: B
Bayesian Learning - I
Assignment Questions
1) State the difference between the Bayes Minimum Error Classifier and the Bayes Minimum Risk Classifier.
2) What is the role of minimizing the classification error in the context of this classifier?
Thank You
Sharannya
Rishabh
Sreekar
Jahirkhan
Priya
Bayesian Learning II
Department of Electrical and Electronics
Engineering
Team 5:
PES1UG21EE006 - Adeeba Moghis
PES1UG21EE010 - Akshay Kumar
PES1UG21EE029 - Gaurav Kumar
PES1UG21EE036 - Jayanth S R
PES1UG21EE092 - Shravani G.
Deep Learning
Bayesian Learning II
Bayes' Theorem and
Feature Space Distribution
Jayanth S R
PES1UG21EE036
Deep Learning
Bayesian Learning II
Introduction to Feature Space
The feature space is the set of all possible values of the input variables (features); each sample is a point in this space, and the feature space distribution describes how those points are spread across it.
Deep Learning
Bayesian Learning II
Understanding Bayes’
Theorem
Bayes’ Theorem Formula:
P(ωi∣X) = P(X∣ωi) · P(ωi) / P(X)
• Posterior (P(ωi∣X)): Updated probability of hypothesis ωi after evidence X.
• Likelihood (P(X∣ωi)): How likely evidence X is if hypothesis ωi is true.
• Prior (P(ωi)): Initial belief about ωi.
• Evidence (P(X)): Total probability of observing X across all hypotheses.
Deep Learning
Bayesian Learning II
Bayes’ Theorem -
Fundamentals
H: Hypothesis
E: Evidence
Key Terms:
• Prior Probability (P(H)): Initial belief before evidence.
• Likelihood (P(E∣H)): Probability of evidence given the hypothesis.
• Posterior Probability (P(H∣E)): Updated belief after evidence.
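A small worked example of these terms; the hypothesis (a spam-filter style H) and all numbers are assumptions for illustration:

```python
# Bayes' theorem with assumed values: H = "message is spam",
# E = "message contains the word 'free'".
p_H = 0.3                # prior P(H), assumed
p_E_given_H = 0.8        # likelihood P(E | H), assumed
p_E_given_notH = 0.1     # likelihood P(E | not H), assumed

p_E = p_E_given_H * p_H + p_E_given_notH * (1 - p_H)   # total evidence P(E)
p_H_given_E = p_E_given_H * p_H / p_E                  # posterior P(H | E)
print(round(p_H_given_E, 3))   # ~0.774: belief in H rises after seeing E
```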
Deep Learning
Bayesian Learning II
Importance of Feature Space Distribution and Bayes' Theorem
• Feature Space Distribution helps visualize how features are spread, guiding model selection.
• Bayes' Theorem calculates class probabilities, enhancing prediction accuracy.
• Feature Space Distribution aids in feature selection and dimensionality reduction.
• Bayes' Theorem updates beliefs on class membership as new data arrives.
Deep Learning
Bayesian Learning II
Decision Making in Feature Space
• Classification Boundaries: Feature space defines regions for
different classes based on probabilities.
• Threshold Selection: Decision thresholds in feature space impact
the classifier's performance and accuracy.
• Data Separation: The choice of features determines the ease of
separating different classes.
• Model Optimization: Decision-making in feature space guides the
refinement of model parameters for better accuracy.
Deep Learning
Bayesian Learning II
What does Feature Space Distribution help to represent?
•a) Data points and their probabilities for classification
•b) The entire decision-making process
•c) The clustering of data points in the training set
•d) Linear boundaries between different classes
Answer: a) Data points and their probabilities for classification
Deep Learning
Bayesian Learning II
Which of the following is crucial when analyzing Feature Space Distribution?
•a) The relationship between class labels and data features
•b) Only the class distribution in the dataset
•c) The decision boundary creation
•d) None of the above
Answer: a) The relationship between class labels and data features
Deep Learning
Bayesian Learning II
Feature Space Distribution is used to:
•a) Minimize the loss function of a model
•b) Create a feature matrix for classification
•c) Establish the likelihood of data points belonging to a class
•d) All of the above
Answer: c) Establish the likelihood of data points belonging to a class
Deep Learning
Bayesian Learning II
Assignment Questions
1) Explain how Feature Space Distribution can be used to
improve the accuracy of a classifier.
2) Describe the importance of understanding feature space in
the context of Bayes’ Theorem. Discuss how it helps in the
decision-making process.
Deep Learning
Bayesian Learning II
Bayes Minimum Error Classifier (BMEC)
Gaurav Kumar
PES1UG21EE029
Deep Learning
Bayesian Learning II
Bayes Minimum Error Classifier
(BMEC)
•The Bayes Minimum Error Classifier (BMEC) is a statistical model used in classification
problems to minimize classification errors. It uses Bayes' theorem to calculate the posterior
probabilities for different classes and select the class with the highest probability. BMEC is a
fundamental approach in decision theory and machine learning, providing an optimal
framework for classification under uncertainty.
•Objective:
To ensure decisions are as accurate as possible by leveraging probabilities derived from
Bayes' theorem.
Deep Learning
Bayesian Learning II
Key Concepts
• Key Features:
• Optimal Classification: The classifier is optimal when the true class distributions are known.
•Decision Rule: It works by selecting the class with the highest posterior probability for a
given set of features.
•Probabilistic Nature: It uses probability distributions of the classes and data to make
decisions.
Deep Learning
Bayesian Learning II
Bayes' Theorem Overview
• Bayes' Theorem Formula:
P(C_k ∣ X) = P(X ∣ C_k) · P(C_k) / P(X)
• Bayes’ Theorem provides a way to update the probability estimate for a class C_k based on new evidence or data X.
• P(C_k ∣ X): Posterior probability of class C_k given feature X.
• P(X ∣ C_k): Likelihood of feature X given class C_k.
• P(C_k): Prior probability of class C_k, the probability of the class before observing X.
• P(X): Total probability of feature X across all classes.
•Explanation:
•Bayes' Theorem is fundamental to the BMEC, as it allows the classifier to compute the
probability of each class given the input features.
Deep Learning
Bayesian Learning II
BMEC Decision Rule
It does this by selecting the class 𝐶𝑘with the highest posterior probability 𝑃( 𝐶𝑘∣ 𝑋).
•Error Minimization:The goal of the BMEC is to minimize the expected classification error.
•The decision rule is as follows:
•This rule means that for any new data point 𝑋, the classifier will choose the class with the
maximum posterior probability.
•Decision Boundary:The decision boundary is the set of points where the classifier is
equally likely to predict one class over another, essentially where
Deep Learning
Bayesian Learning II
Deriving BMEC: Assumptions
•Assumptions for BMEC:
•Known Class Distributions: The model assumes that we have access to the class
distributions or can estimate them from data.
•Minimizing Error: BMEC seeks to minimize the total classification error over all possible
inputs.
•Independence of Features: In some applications, features might be assumed to be
conditionally independent, but BMEC does not require this assumption.
• Error Minimization: To minimize error, BMEC looks at the overall probability of misclassifying a data point. It does this by choosing the class that maximizes the posterior probability for the given input data.
Deep Learning
Bayesian Learning II
Types of Errors in BMEC
•False Positives and False Negatives:
•False Positives: These occur when the classifier incorrectly assigns a class to a data point
that does not belong to it.
•False Negatives: These occur when the classifier fails to assign the correct class to a data
point.
•BMEC aims to minimize these errors by using the posterior probabilities for accurate
decision-making. The error types are minimized through optimal class selection based on the
Bayesian rule.
Deep Learning
Bayesian Learning II
Application of BMEC
• Real-world Applications:
• Image Recognition: BMEC can be used to classify images based on features extracted from them, such as pixels, shapes, or textures.
•Speech Processing: In speech recognition, BMEC classifies audio features into different
phonetic or linguistic classes.
•Medical Diagnostics: BMEC helps classify patients’ medical conditions based on observed
symptoms and test results.
Deep Learning
Bayesian Learning II
Advantages
•Optimal Performance: BMEC provides the optimal classifier when the class distributions
are known, minimizing misclassification errors.
•Versatility: It can be applied across different domains by modeling the probabilistic
relationship between data and classes.
Deep Learning
Bayesian Learning II
Limitations of BMEC
• Data Availability: BMEC assumes that the class distributions are either known or can be accurately estimated. However, this requires large datasets or expert knowledge, which may not always be available.
• Computational Complexity: For complex datasets, calculating the posterior probabilities for each class may be computationally expensive, particularly when dealing with large amounts of data or many classes.
• Assumptions on Distributions: In practice, class distributions may not always be easy to estimate or may not match the assumptions used by BMEC, leading to suboptimal performance.
Deep Learning
Bayesian Learning II
Multiple-Choice Questions (MCQs)
1. What is the primary goal of the Bayes Minimum Error Classifier (BMEC)?
A) To minimize computational complexity
B) To maximize classification accuracy by minimizing errors
C) To equalize prior probabilities of all classes
D) To maximize the likelihood of the data
•Correct Answer: B
Deep Learning
Bayesian Learning II
Multiple-Choice Questions (MCQs)
2. Which of the following is a necessary condition for BMEC to function optimally?
•A) The class distributions must be unknown
•B) The class distributions must be known or estimable
•C) The features must be linearly separable
•D) The classifier must be deterministic
•Answer: B) The class distributions must be known or estimable
Deep Learning
Bayesian Learning II
Multiple-Choice Questions (MCQs)
3. In BMEC, the decision rule selects the class with the highest:
•A) Prior probability
•B) Posterior probability
•C) Likelihood probability
•D) Joint probability
•Answer: B) Posterior probability
Deep Learning
Bayesian Learning II
Assignment Questions
•Explain the concept of the Bayes Minimum Error Classifier (BMEC).
•Compare and contrast the Bayes Minimum Error Classifier with the Bayes
Minimum Risk Classifier. Highlight their objectives, decision rules, and use cases.
Deep Learning
Bayesian Learning II
Decision
Boundaries in
BMEC and BMRC
Shravani G
PES1UG21EE092
Deep Learning
Bayesian Learning II
Introduction to BMEC and
BMRC
• BMEC (Bayes Minimum Error Classifier): assigns each sample to the class with the highest posterior probability, minimizing the probability of classification error.
• BMRC (Bayes Minimum Risk Classifier): generalizes BMEC by weighting each decision with a loss function and minimizing the expected risk.
• Relevance of Decision Boundaries: Crucial for differentiating
between classes in both methods to enhance model performance
and interpretability.
Deep Learning
Bayesian Learning II
Introduction to Decision Boundaries
• Feature Space: Distribution of feature vectors from different
classes.
• Class Conditional Probability Density Function: p(X∣ωi),
where X is the feature vector and ωi is the class.
• Prior Probability: p(ωi), the probability of occurrence of class
ωi.
• Posterior Probability: Computed using Bayes' Theorem:
p(ωi∣X) = p(X∣ωi) · p(ωi) / p(X)
Deep Learning
Bayesian Learning II
Bayes Minimum Error Classifier (BMEC)
• Objective: Minimize classification error.
• Decision Rule: Assign X to ωi if p(ωi∣X) > p(ωj∣X) for all j ≠ i.
• Probability of Error (two classes): P(error) = ∫ min[p(ω1∣X), p(ω2∣X)] p(X) dX
• Decision boundaries are determined by comparing posterior probabilities of classes (a numeric estimate of this error is sketched below).
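A Monte Carlo sketch of this error integral for two assumed 1-D Gaussian classes (all distribution parameters are illustrative; for unit Gaussians at ±1 with equal priors the analytic Bayes error is Φ(−1) ≈ 0.159):

```python
import numpy as np
from scipy.stats import norm

# Estimate P(error) = E_x[ min_i P(wi | x) ] by sampling x from the mixture.
rng = np.random.default_rng(0)
priors = np.array([0.5, 0.5])                # assumed
means, sigma = np.array([-1.0, 1.0]), 1.0    # assumed class-conditionals

labels = rng.choice(2, size=100_000, p=priors)
x = rng.normal(means[labels], sigma)         # x ~ p(x)

joint = priors * norm.pdf(x[:, None], means, sigma)   # P(wi) p(x | wi)
post = joint / joint.sum(axis=1, keepdims=True)       # posteriors P(wi | x)

print("estimated Bayes error:", post.min(axis=1).mean())  # ~0.159
```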
Deep Learning
Bayesian Learning II
Bayes Minimum Risk Classifier (BMRC)
• Objective: Minimize classification risk considering the loss
function.
• Risk Function:
R(αi∣X) = ∑_{j=1}^{C} λ(αi∣ωj) p(ωj∣X)
• Where:
λ(αi∣ωj): Loss for taking action αi when the true class is ωj.
C: Number of classes.
• Decision Rule: Choose the action αi that minimizes R(αi∣X)
• Relation to BMEC: For a zero-one loss function, BMRC reduces
to BMEC.
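A quick numeric check of this reduction, with an assumed three-class posterior: under zero-one loss, the minimum-risk action coincides with the maximum-posterior class.

```python
import numpy as np

# With zero-one loss, R(ai | X) = 1 - P(wi | X), so argmin risk = argmax posterior.
posterior = np.array([0.2, 0.5, 0.3])        # P(wj | X), assumed
zero_one = 1.0 - np.eye(3)                   # lambda(ai | wj) = 0 if i == j else 1

risk = zero_one @ posterior                  # conditional risk of each action
assert risk.argmin() == posterior.argmax()   # same decision either way
print(risk)                                  # [0.8, 0.5, 0.7]
```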
Deep Learning
Bayesian Learning II
Comparing BMEC and BMRC
• BMEC:
1.Ignores variations in consequences of misclassification.
2.Optimal for minimizing classification error.
• BMRC:
1.Incorporates a loss function to weigh consequences of
decisions.
2. Generalizes BMEC.
• Key Formula:
For two classes, the decision boundary is where the conditional risks of the two actions are equal:
λ(α1∣ω1) p(ω1∣X) + λ(α1∣ω2) p(ω2∣X) = λ(α2∣ω1) p(ω1∣X) + λ(α2∣ω2) p(ω2∣X)
Deep Learning
Bayesian Learning II
MCQ Questions
1.What does the Bayes Minimum Error Classifier aim to
minimize?
• (A) Classification time
• (B) Classification error
• (C) Risk of misclassification
• (D) Feature computation time
Answer: B
Deep Learning
Bayesian Learning II
MCQ Questions
2. In BMRC, the loss function λ(αi∣ωj) represents:
• (A) Prior probabilities of classes.
• (B) Probability of correct classification.
• (C) Consequences of taking an action when the true state is
different.
• (D) Posterior probabilities.
Answer: C
Deep Learning
Bayesian Learning II
MCQ Questions
3. For BMEC and BMRC to be equivalent, the loss function
must be:
• (A) Non-linear.
• (B) Uniform across all actions.
• (C) Zero-one loss.
• (D) Proportional to the posterior probabilities.
Answer: C
Deep Learning
Bayesian Learning II
Assignment Questions
1.Derivation Task: Derive the decision boundary equation for a two-
class BMRC system assuming a zero-one loss function.
2. Application Task: Given a dataset with three classes and their
prior probabilities, compute the decision boundary using BMEC
and explain how incorporating a loss function in BMRC would
affect the decision.
Deep Learning
Bayesian Learning II
Bayes Minimum Risk
Classifier (BMRC)
Akshay Kumar
PES1UG21EE010
Deep Learning
Bayesian Learning II
Introduction to
BMRC
• Definition: A classification approach that minimizes the
expected risk (or loss) associated with a decision.
• Key Components:
1. States of Nature (Classes): the true classes ωj.
2. Actions: the decisions αi available to the classifier.
3. Loss Function: λ(αi∣ωj), the cost of taking action αi when the true class is ωj.
4. Posterior Probability: P(ωj∣X), computed via Bayes' theorem.
Deep Learning
Bayesian Learning II
Loss Function and
Risk
• Loss Function: Represents the cost of an incorrect decision, λ(αi∣ωj).
• Risk for Action αi: the expected loss of taking action αi given X:
R(αi∣X) = ∑_j λ(αi∣ωj) P(ωj∣X)
Deep Learning
Bayesian Learning II
Decision Rule in BMRC
• Goal: Minimize the risk by selecting the action with the smallest conditional risk.
• Relation to Bayes Minimum Error Classifier: when a zero-one loss function is used, BMRC simplifies to minimizing the classification error.
• Decision Rule:
α* = arg min_i R(αi∣X)
Deep Learning
Bayesian Learning II
Key Formulae
• Bayes Rule: P(ωj∣X) = p(X∣ωj) P(ωj) / p(X)
• Risk Computation: R(αi∣X) = ∑_j λ(αi∣ωj) P(ωj∣X)
• Total Risk: R = ∫ R(α(X)∣X) p(X) dX
Deep Learning
Bayesian Learning II
Applications and Advantages
• Applications:
1.Image classification
2.Medical diagnosis
3.Financial decision-making
• Advantages:
1.Accounts for varying costs of errors.
2.Provides a more general approach than Bayes Minimum Error
Classifier.
3.Can handle multi-class problems efficiently.
Deep Learning
Bayesian Learning II
What is the purpose of the loss function in the Bayes Minimum Risk
Classifier?
A) To calculate posterior probabilities
B) To penalize incorrect classifications
C) To normalize feature vectors
D) To estimate prior probabilities
Answer: B
Deep Learning
Bayesian Learning II
In BMRC, the action chosen minimizes which of the following?
A) Prior probability
B) Posterior probability
C) Expected risk
D) Classification error
Answer: C
Deep Learning
Bayesian Learning II
When does BMRC simplify to Bayes Minimum Error Classifier?
A) When the loss function is linear
B) When the loss function is zero-one
C) When the feature space is discrete
D) When the prior probabilities are equal
Answer: B
Deep Learning
Bayesian Learning II
Assignment Questions
Derive the expression for the total risk in BMRC and explain how it
differs from the total error in Bayes Minimum Error Classifier.
Implement a Bayes Minimum Risk Classifier in Python to classify two
classes of data, where the loss function for incorrect classification is
higher for one class than the other. Validate your implementation
using a synthetic dataset.
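One possible starting point for this assignment (a sketch under assumed 1-D Gaussian class-conditionals and an asymmetric loss matrix, not a reference solution):

```python
import numpy as np
from scipy.stats import norm

# Two-class BMRC on synthetic data where misclassifying class 1 is five
# times costlier than misclassifying class 0 (loss values are assumptions).
rng = np.random.default_rng(42)
priors = np.array([0.5, 0.5])
means, sigma = np.array([-1.0, 1.0]), 1.0
loss = np.array([[0.0, 5.0],      # loss[i, j] = lambda(decide i | true j)
                 [1.0, 0.0]])

y = rng.choice(2, size=5_000, p=priors)
x = rng.normal(means[y], sigma)

joint = priors * norm.pdf(x[:, None], means, sigma)   # P(wj) p(x | wj)
post = joint / joint.sum(axis=1, keepdims=True)       # posteriors P(wj | x)
risk = post @ loss.T                                  # R(ai | x) = sum_j loss[i,j] P(wj|x)
pred = risk.argmin(axis=1)                            # minimum-risk decision

# The asymmetric loss shifts the boundary toward class 0, accepting more
# cheap errors to avoid the costly ones on class 1.
print("empirical average loss:", loss[pred, y].mean())
print("min-error baseline accuracy:", (post.argmax(axis=1) == y).mean())
```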
Deep Learning
Bayesian Learning II
Practical
Implementation
and Real-World
Applications
Adeeba Moghis
Deep Learning
Bayesian Learning II
Introduction to Practical Implementation of Deep
Learning
Deep learning uses neural networks with multiple layers to analyze
and learn patterns from vast datasets.
Key Components of Implementation:
• Data Collection: Large, labeled datasets are essential.
• Model Architecture Selection: Choose between CNNs, RNNs,
Transformers, etc., based on the application.
• Training and Validation: Train the model with optimization
techniques like gradient descent and validate on unseen data.
• Hardware: Use GPUs/TPUs for faster computations.
• Software Frameworks: TensorFlow, PyTorch, Keras, and others simplify model development and deployment (a minimal sketch follows below).
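As a concrete illustration of the architecture-selection and training steps, here is a minimal PyTorch sketch; the layer sizes and hyperparameters are illustrative assumptions, and it runs one gradient step on a random MNIST-sized batch:

```python
import torch
import torch.nn as nn

# A small CNN for 28x28 grayscale images (MNIST-sized); sizes are assumed.
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                    # 28x28 -> 14x14
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                    # 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),          # 10 output classes
)

# One gradient-descent step on a random batch, standing in for the
# training-and-validation loop described above.
x, y = torch.randn(8, 1, 28, 28), torch.randint(0, 10, (8,))
opt = torch.optim.SGD(model.parameters(), lr=0.01)
loss = nn.CrossEntropyLoss()(model(x), y)
loss.backward()
opt.step()
print(float(loss))
```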
Deep Learning
Bayesian Learning II
Real-World Application 1: Computer Vision
Key Use Cases:
• Image Classification: Identifying objects in images (e.g.,
autonomous vehicles).
• Object Detection: Detecting multiple objects in a scene (e.g.,
security systems).
• Medical Imaging: Assisting in diagnostics, such as identifying
tumors in X-rays.
• Face Recognition: Used in authentication systems and security.
Case Study: Autonomous vehicles use CNNs for lane detection,
obstacle avoidance, and pedestrian recognition.
Deep Learning
Bayesian Learning II
Real-World Application 2: Natural Language Processing (NLP)
Key Use Cases:
• Chatbots and Virtual Assistants: Powering Alexa, Siri, and
customer service bots.
• Sentiment Analysis: Understanding customer feedback from text.
• Machine Translation: Translating languages in real time (e.g.,
Google Translate).
• Text Summarization: Automatically summarizing long documents.
Case Study: GPT models (like ChatGPT) revolutionizing
conversational AI in various industries.
Deep Learning
Bayesian Learning II
Real-World Application 3: Predictive
Analytics
Key Use Cases:
• Finance: Fraud detection, stock market predictions, and risk
assessment.
• Healthcare: Predicting patient outcomes and early diagnosis of
diseases.
• Retail: Demand forecasting, personalized recommendations, and
inventory management.
• Weather Forecasting: Predicting weather patterns with high
accuracy.
Case Study: Netflix uses deep learning to recommend content
based on user preferences.
Deep Learning
Bayesian Learning II
Challenges and Future Directions
Challenges in Practical Implementation:
• Data Requirements: Models need extensive, high-quality data.
• Computational Costs: High demand for computing resources.
• Interpretability: Explaining decisions made by deep learning
models remains difficult.
Future Trends:
• Edge AI: Bringing deep learning to smaller devices.
• Explainable AI: Improving transparency in decision-making.
• Integration: Seamless merging with IoT, robotics, and other
advanced technologies.
• Sustainability: Reducing energy consumption of large models.
Deep Learning
Bayesian Learning II
MCQ Questions
1. Which of the following is a key step in implementing a deep learning model?
a) Collecting labeled data
b) Selecting a suitable model architecture
c) Training and validating the model
d) All of the above
Answer: d) All of the above
Deep Learning
Bayesian Learning II
MCQ Questions
2. Which of these is a common application of computer vision in real-world
scenarios?
a) Speech recognition
b) Image classification
c) Natural language processing
d) Predictive analytics
Answer: b) Image classification
Deep Learning
Bayesian Learning II
MCQ Questions
3. What is a major challenge in deep learning applications?
a) Availability of GPUs
b) Lack of labeled data
c) Limited real-world applications
d) None of the above
Answer: b) Lack of labeled data
Deep Learning
Bayesian Learning II
Assignment
Questions
1. Practical Assignment:
Implement a basic image classification model using a deep learning framework
like TensorFlow or PyTorch. Use a dataset like MNIST or CIFAR-10 and train a
Convolutional Neural Network (CNN) to classify images. Provide a detailed
explanation of the architecture, training process, and evaluation metrics used.
2. Theoretical Assignment:
Select a real-world application of deep learning (e.g., healthcare, finance, or
autonomous vehicles). Research and write a report (1000–1500 words) detailing:
- The problem being addressed
- The deep learning techniques used
- Challenges faced during implementation
- The impact of the solution on the field or industry