Module 5 Notes

Fisher Discriminant Analysis (FDA) is a dimensionality reduction and classification technique that maximizes class separability by finding linear combinations of features. It is applied in various fields such as face recognition, speech recognition, bioinformatics, remote sensing, and quality control. The Parzen Window method is a non-parametric technique for estimating probability density functions, while K-Nearest Neighbors (K-NN) is a supervised learning algorithm that classifies data based on similarity to existing data points.


Fisher Discriminant Analysis (FDA)

Fisher Discriminant Analysis (FDA), also known as Linear Discriminant Analysis (LDA), is a
dimensionality reduction and classification technique used in pattern recognition and machine learning. It
was developed by Ronald A. Fisher in the 1930s and is based on the concept of finding linear
combinations of features that best separate different classes of data.
Here's how Fisher Discriminant Analysis works:
1. Data Preprocessing: FDA assumes that the input data is numerical and continuous. Categorical
variables may need to be converted into numerical form, and data normalization may be performed to
ensure that each feature contributes equally to the analysis.
2. Class Separability: FDA aims to find linear combinations of features that maximize the
separability between different classes in the data. It does this by maximizing the between-class scatter
while minimizing the within-class scatter.
3. Between-Class Scatter: The between-class scatter measures the spread between the means of
different classes. It quantifies how well the classes are separated from each other in the feature space.
4. Within-Class Scatter: The within-class scatter measures the spread within each class. It quantifies
the variability of data points within the same class.
5. Fisher Criterion: The Fisher criterion is defined as the ratio of between-class scatter to within-
class scatter. The goal of FDA is to find linear combinations of features that maximize this criterion.
6. Eigenvalue Decomposition: FDA typically involves eigenvalue decomposition of the scatter
matrices to find the directions (linear discriminants) in which the data is most discriminative. These linear
discriminants are the optimal projection axes for separating the classes.
7. Dimensionality Reduction: Once the linear discriminants are obtained, FDA can be used for
dimensionality reduction by projecting the data onto the subspace spanned by the discriminant vectors.
This reduces the dimensionality of the feature space while preserving the class-discriminatory information
as much as possible.
8. Classification: FDA can also be used for classification by applying a decision rule to the
projected data. For example, a simple decision rule might involve assigning a new data point to the class
with the nearest mean in the projected space (see the sketch after this list).
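To make steps 2 to 8 concrete, here is a minimal two-class sketch in Python with NumPy. It is only an illustration under assumed toy data: the random samples, the class means, and the variable names (X1, X2, S_W, w) are not part of the original notes, and the two-class shortcut (solving S_W w = m1 − m2) stands in for the full eigendecomposition used in the general multi-class case.

import numpy as np

# Toy two-class data: rows are samples, columns are features (illustrative only)
rng = np.random.default_rng(0)
X1 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))   # class 1
X2 = rng.normal(loc=[2.0, 1.5], scale=0.5, size=(50, 2))   # class 2

# Steps 3-4: class means and within-class scatter S_W
m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
S_W = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)

# Steps 5-6: the Fisher criterion J(w) = (w' S_B w) / (w' S_W w) is maximized,
# in the two-class case, by w proportional to S_W^{-1} (m1 - m2)
w = np.linalg.solve(S_W, m1 - m2)
w /= np.linalg.norm(w)

# Step 7: dimensionality reduction by projecting the data onto the discriminant direction
z1, z2 = X1 @ w, X2 @ w

# Step 8: classify a new point by the nearest projected class mean
def classify(x):
    z = x @ w
    return 1 if abs(z - z1.mean()) < abs(z - z2.mean()) else 2

print(classify(np.array([0.2, -0.1])))   # expected: class 1
print(classify(np.array([2.1, 1.4])))    # expected: class 2

For more than two classes, a library implementation such as scikit-learn's LinearDiscriminantAnalysis carries out the general eigendecomposition described in step 6.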

Applications
1. Face Recognition: In facial recognition systems, FDA can be used to extract discriminative
features from facial images and classify them into different individuals. By analyzing the patterns of pixel
intensities in facial images, FDA can find linear combinations of features that maximize the differences
between individuals while minimizing the variations within the same individual.
2. Speech Recognition: FDA can also be applied in speech recognition to classify spoken words or
phonemes. By extracting relevant acoustic features from speech signals, such as Mel-frequency cepstral
coefficients (MFCCs) or spectral features, FDA can help differentiate between different phonetic classes
or spoken words.
3. Bioinformatics: In bioinformatics, FDA can be used for tasks such as classifying gene expression
data into different disease subtypes or identifying biomarkers for disease diagnosis. By analyzing the
expression levels of genes across different samples, FDA can help identify gene sets that are most
discriminative for differentiating between healthy and diseased individuals.
4. Remote Sensing: In remote sensing applications, FDA can be used to classify land cover types or
detect environmental changes based on satellite imagery. By extracting spectral features from satellite
images, FDA can help classify different land cover classes such as vegetation, water bodies, and urban
areas, enabling applications in agriculture, environmental monitoring, and urban planning.
5. Quality Control: In manufacturing and industrial applications, FDA can be used for quality
control and defect detection. By analyzing sensor data or measurements from production processes, FDA
can help classify products into different quality categories or detect anomalies that indicate manufacturing
defects or deviations from desired specifications.

Parzen Window method


The Parzen Window method, also known as the kernel density estimation (KDE) method, is a non-
parametric technique used for estimating the probability density function (PDF) of a random variable. It's
particularly useful when the underlying distribution of the data is unknown or difficult to model
parametrically.
Here's how the Parzen Window method works:
1. Data Representation: Assume we have a dataset {𝑥1,𝑥2,...,𝑥𝑛} consisting of 𝑛 observations of a
random variable 𝑋. Each 𝑥𝑖 represents a data point in the dataset.
2. Window Function: The Parzen Window method uses a window function, often a kernel function
𝐾(⋅), to estimate the PDF. The window function defines the shape and size of the "window" or "kernel"
around each data point.
3. Density Estimation: For a given data point 𝑥, the Parzen Window estimate of the PDF at 𝑥 is
computed by averaging the contributions of all data points within the window centered at 𝑥. This is done
by applying the window function to each data point and summing up the results (see the sketch after this list):

f̂(x) = (1 / (n·h)) · Σᵢ₌₁ⁿ K((x − xᵢ) / h)

Where:
• 𝑓̂(𝑥) is the estimated PDF at 𝑥.
• 𝐾(⋅) is the kernel function.
• ℎ is the bandwidth or width of the window, which controls the smoothness of the
estimated PDF. It determines the size of the neighborhood around each data point that contributes to the
density estimate.
4. Choice of Kernel Function: Commonly used kernel functions include the Gaussian
(normal) kernel, the Epanechnikov kernel, and the uniform kernel. The choice of kernel function affects
the smoothness and shape of the estimated PDF.
5. Bandwidth Selection: The bandwidth ℎ is a crucial parameter in the Parzen Window
method. It controls the trade-off between bias and variance in the density estimate. Choosing an
appropriate bandwidth is important for obtaining an accurate estimate of the underlying PDF. Common
methods for bandwidth selection include cross-validation and Silverman's rule of thumb.
6. Normalization: To ensure that the estimated PDF integrates to 1 over the entire domain,
the summed kernel contributions are divided by the total number of data points and the volume of the
window (ℎ in one dimension), as in the formula above.
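A minimal one-dimensional sketch of the estimator above, in Python with NumPy, may help. It uses a Gaussian kernel and Silverman's rule-of-thumb bandwidth; the mixture data, the evaluation grid, and the function name parzen_window_estimate are illustrative assumptions, not from the notes.

import numpy as np

def parzen_window_estimate(x, data, h):
    """Parzen Window / KDE estimate of the PDF at the points x, using a Gaussian kernel:
    f_hat(x) = (1 / (n * h)) * sum_i K((x - x_i) / h)."""
    x = np.atleast_1d(x)
    u = (x[:, None] - data[None, :]) / h               # scaled distances to every data point
    K = np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)     # Gaussian kernel values
    return K.sum(axis=1) / (len(data) * h)             # average and normalize by n * h

# Illustrative data: a mixture of two normals (an assumption, not specified in the notes)
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(-2.0, 0.5, 200), rng.normal(1.0, 1.0, 300)])

# Silverman's rule of thumb for a Gaussian kernel in one dimension
h = 1.06 * data.std() * len(data) ** (-1 / 5)

grid = np.linspace(-5.0, 5.0, 11)
print(np.round(parzen_window_estimate(grid, data, h), 3))

Smaller values of h give a spikier estimate (low bias, high variance), while larger values oversmooth it; this is the trade-off mentioned in step 5.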

K-Nearest Neighbors (K-NN)

K-Nearest Neighbors (K-NN) is a straightforward yet effective Machine Learning algorithm that falls
under the Supervised Learning category. It operates on the principle of similarity, assigning new data
points to categories based on their resemblance to existing data. By storing all available data and gauging
similarity, K-NN facilitates easy classification of new data into relevant categories. It is versatile,
applicable to both Regression and Classification tasks, although it's predominantly used for the latter.
Notably, K-NN is non-parametric, making no assumptions about the data's underlying distribution. Due to
its deferred learning approach, where it stores the dataset during training and classifies new data when
prompted, K-NN is often referred to as a "lazy learner" algorithm.
Example:
Consider a scenario where you have a dataset of various fruits such as apples, oranges, and bananas. Each
fruit is characterized by its weight and color. Now, if you were to use the K-Nearest Neighbors (K-NN)
algorithm to classify a new fruit, say a small, red fruit, the algorithm would examine the characteristics of
nearby fruits in the dataset. If the majority of nearby fruits are apples, the algorithm would classify the
new fruit as an apple. However, if the majority are oranges, it would classify the new fruit as an orange.
The algorithm's decision depends on the similarity between the features of the new fruit and those of
existing fruits in the dataset.
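A short sketch of this fruit example in Python with scikit-learn is given below. The weights, the numeric colour encoding, and K = 3 are assumptions made up for illustration, since the notes do not give actual values.

from sklearn.neighbors import KNeighborsClassifier

# Illustrative fruit data: [weight in grams, colour code] with 0 = red, 1 = orange, 2 = yellow
X = [[150, 0], [170, 0], [140, 0],      # apples
     [160, 1], [180, 1], [155, 1],      # oranges
     [120, 2], [130, 2], [115, 2]]      # bananas
y = ["apple", "apple", "apple", "orange", "orange", "orange", "banana", "banana", "banana"]

knn = KNeighborsClassifier(n_neighbors=3)   # K = 3 nearest neighbours
knn.fit(X, y)

# A small, red fruit: its nearest neighbours in feature space decide the label
print(knn.predict([[145, 0]]))              # expected: ['apple']

In practice the features would be scaled first, otherwise the weight (in grams) dominates the distance calculation and the colour code barely matters.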
Steps:
1. Load the Data: Start by loading the dataset that contains the features (attributes) of the objects
you want to classify and their corresponding labels (class).
2. Choose the Value of K: Determine the value of K, which represents the number of nearest
neighbors to consider when classifying a new data point. This value is typically chosen empirically or
through techniques like cross-validation.
3. Calculate Distance: Compute the distance between the new data point and all other data points in
the dataset. The distance metric used (e.g., Euclidean distance, Manhattan distance) depends on the nature
of the data and the problem.
4. Find K Nearest Neighbors: Identify the K data points with the shortest distances to the new data
point. These data points are the "nearest neighbors."
5. Majority Vote: Determine the majority class among the K nearest neighbors. This is typically
done by counting the occurrences of each class label among the neighbors.
6. Assign Class: Assign the majority class as the predicted class for the new data point. If there is a
tie, additional rules (such as considering distances) may be used to break it (see the sketch after this list).
7. Evaluate Model: After classifying all data points, evaluate the performance of the model using
metrics such as accuracy, precision, recall, or F1 score. This step helps assess how well the model
generalizes to unseen data.
8. Adjust K: If necessary, iterate over different values of K and evaluate the model's performance to
find the optimal value.
9. Predict New Data: Once the model is trained and evaluated, you can use it to predict the classes
of new, unseen data points by repeating steps 3 to 6 for each new data point.
10. Finalize Model: Finally, once satisfied with the model's performance, finalize it for deployment
and use it to classify new data in real-world applications.
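Steps 3 to 6 can also be sketched from scratch in a few lines of Python; the Euclidean metric, the tie-handling rule, and the made-up fruit data below are illustrative choices, not requirements of the algorithm.

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    """Predict the class of x_new by majority vote among its k nearest neighbours."""
    # Step 3: Euclidean distance from the new point to every training point
    distances = np.linalg.norm(np.asarray(X_train, dtype=float) - np.asarray(x_new, dtype=float), axis=1)
    # Step 4: indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Steps 5-6: majority vote; because the neighbours are listed in distance order,
    # Counter.most_common breaks ties in favour of the class of the closer neighbour
    votes = Counter(np.asarray(y_train)[nearest])
    return votes.most_common(1)[0][0]

# Usage with made-up fruit data ([weight in grams, colour code])
X_train = [[150, 0], [160, 1], [120, 2], [170, 0], [180, 1]]
y_train = ["apple", "orange", "banana", "apple", "orange"]
print(knn_predict(X_train, y_train, [145, 0], k=3))   # expected: apple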
