
MAS 408 - Discriminant Analysis

Discriminant analysis is a classification technique used to predict group membership. It analyzes variables measured on known training data to develop rules for assigning new observations to one of the known groups. The method determines which variables best distinguish between the groups and uses these to build a model. New data can then be classified based on the model by predicting which group it is most likely to belong to.


DISCRIMINANT ANALYSIS

Adapted from PSU Online Notes

Discriminant analysis

– is a classification problem.

– assumes that two or more groups (clusters or populations) are known a priori, and classifies one or more new observations into one of the known populations based on some measured characteristics.

Example 1

Data were collected on two species of insects in the genus Chaetocnema: (a) Ch. concinna and (b) Ch. heikertlingeri. Three variables were measured on each insect:

– width of the 1st joint of the tarsus (legs)

– width of the 2nd joint of the tarsus

– width of the aedeagus (reproductive organ)

The objective is to obtain a classification rule for identifying the insect species based on these
three variables.
Let P(πi |x) denote the conditional probability that an observation came from population πi given the observed values of the multivariate vector of variables x.

– We classify an observation to the population for which the value of P(πi |x) is greatest.

– This is the most probable group given the observed values of x.

Notation

– Suppose that we have g populations (groups) and that the ith population is denoted as
πi .

– Let pi = P(πi ), be the probability that a randomly selected observation is in population


πi .

– Let f(x|πi ) be the conditional probability density function of the multivariate set of
variables, given that the observation came from population πi .

The probability of interest is
$$P(\text{member of } \pi_i \mid \text{we observe } \mathbf{x}) = \frac{P(\text{member of } \pi_i \text{ and we observe } \mathbf{x})}{P(\text{we observe } \mathbf{x})} \tag{1}$$

– The numerator in Equation (1) is the likelihood that a randomly selected observation is both from population πi and has the value x. This likelihood equals $p_i f(\mathbf{x}\mid\pi_i)$.

– The denominator in Equation (1) is the unconditional likelihood (over all populations) that we observe x. This likelihood equals $\sum_{j=1}^{g} p_j f(\mathbf{x}\mid\pi_j)$.

The posterior probability that an observation is a member of population πi is
$$p(\pi_i \mid \mathbf{x}) = \frac{p_i f(\mathbf{x}\mid\pi_i)}{\sum_{j=1}^{g} p_j f(\mathbf{x}\mid\pi_j)} \tag{2}$$

– The classification rule is to assign observation x to the population for which Equation (2)
is the greatest.

– The denominator in Equation (2) is the same for all posterior probabilities (for the various
populations) so it is equivalent to say that we will classify an observation to the population
for which pi f(x|πi ) is greatest.
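As a sketch of this rule, the following Python snippet classifies an observation by maximizing pi f(x|πi ), assuming multivariate normal class densities. The means, covariances, and priors here are hypothetical, purely for illustration:

```python
import numpy as np

def mvn_pdf(x, mu, sigma):
    """Multivariate normal density f(x | pi_i)."""
    p = len(mu)
    diff = x - mu
    quad = diff @ np.linalg.solve(sigma, diff)           # (x-mu)' Sigma^{-1} (x-mu)
    const = (2 * np.pi) ** (p / 2) * np.sqrt(np.linalg.det(sigma))
    return float(np.exp(-0.5 * quad) / const)

def classify(x, priors, means, covs):
    """Assign x to the population i maximizing p_i * f(x | pi_i)."""
    scores = [p * mvn_pdf(x, mu, S) for p, mu, S in zip(priors, means, covs)]
    return int(np.argmax(scores))

# Two hypothetical bivariate normal populations
means = [np.array([0.0, 0.0]), np.array([5.0, 5.0])]
covs = [np.eye(2), np.eye(2)]
priors = [0.5, 0.5]
print(classify(np.array([0.2, 0.1]), priors, means, covs))  # prints 0
```

Since the denominator of Equation (2) is common to all groups, comparing pi f(x|πi ) directly is enough; the snippet never normalizes.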

Case of two populations

In the case of two populations we express the classification rule in terms of the ratio of the two posterior probabilities. We classify to population 1 when
$$\frac{p_1 f(\mathbf{x}\mid\pi_1)}{p_2 f(\mathbf{x}\mid\pi_2)} > 1 \tag{3}$$
Equation (3) can be written as
$$\frac{f(\mathbf{x}\mid\pi_1)}{f(\mathbf{x}\mid\pi_2)} > \frac{p_2}{p_1} \tag{4}$$
The rule of assigning observation x to the population for which Equation (2) is greatest is equivalent to assigning it to the population that maximizes
$$\log\big(p_i f(\mathbf{x}\mid\pi_i)\big) \tag{5}$$

Steps in Discriminant Analysis

Discriminant analysis is a 7-step procedure. The steps are:

1. Collect training data.

– Training data are data with known group memberships.

– We know which population contains each subject.

2. Prior Probability.

– pi represents the expected proportion of observations that belong to population πi .

There are three common choices:

a) Equal priors: $\hat{p}_i = \frac{1}{g}$, useful if we believe that all of the population sizes are equal.

b) Arbitrary priors selected according to the investigator's beliefs regarding the relative population sizes, subject to $\hat{p}_1 + \hat{p}_2 + \cdots + \hat{p}_g = 1$.

c) Estimated priors:
$$\hat{p}_i = \frac{n_i}{N}$$
where $n_i$ is the number of observations from population πi in the training data, and $N = n_1 + n_2 + \cdots + n_g$.
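The estimated-priors choice in (c) is just each group count divided by the total training size; for instance (with hypothetical group counts):

```python
# Hypothetical training-data group sizes n_1, n_2, n_3
n = [11, 12, 27]
N = sum(n)                        # N = n_1 + n_2 + n_3 = 50
priors = [n_i / N for n_i in n]   # estimated priors p_i = n_i / N
print(priors)                     # sums to 1 by construction
```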

3. Bartlett's test.

– The population mean vectors must differ for there to be a case for Discriminant Analysis (DA).

– Bartlett's test is used to determine whether the variance-covariance matrices are homogeneous for all populations involved.

– The result of this test determines whether to use Linear or Quadratic DA.

– Linear DA is for homogeneous variance-covariance matrices:
$$\Sigma_1 = \Sigma_2 = \cdots = \Sigma_g = \Sigma$$

– Quadratic DA is for heterogeneous variance-covariance matrices:
$$\Sigma_i \neq \Sigma_j \text{ for some } i \neq j$$

4. Estimate the parameters of the conditional probability density functions f(X|πi ).


The following standard assumptions are made:

a) The data from group i have common mean vector µi .

b) The data from group i have common variance-covariance matrix Σ.

c) Independence: the subjects are independently sampled.

d) Normality: the data are multivariate normally distributed.

5. Compute discriminant functions - the rule to classify the new object into one of the known
populations.

6. Use cross validation to estimate misclassification probabilities.

– This is a diagnostic procedure to assess the efficacy of the discriminant analysis.

– You will have some prior rules about what constitutes an acceptable misclassification rate. These rules could include questions like, "What is the cost of misclassification?" For example, in a medical study to help diagnose cancer, there are two costs to consider:

– the cost of incorrectly labeling someone as having cancer when they do not. This
could result in some emotional distress!

– the cost of misclassifying someone as not having cancer when they actually do.
The cost is obviously higher if early detection improves cure rates.

– Cross-validation is used to assess the classification probability.

7. Classify observations with unknown group memberships.

Linear Discriminant Analysis

Assume that in population πi the probability density function of x is multivariate normal with
mean vector µi and variance-covariance matrix Σ (same for all populations). Then

$$f(\mathbf{x}\mid\pi_i) = \frac{1}{(2\pi)^{p/2}|\Sigma|^{1/2}} \exp\left[-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_i)' \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu}_i)\right] \tag{6}$$

Recall:

– We classify to the population with largest pi f(x|πi ).

– This is equivalent to the population with largest log pi f(x|πi ).

In this case, our decision rule is based on the Linear Score Function, a function of the population
means for each of our g populations, µi , as well as the pooled variance-covariance matrix.

Linear Score Function

The Linear Score Function is:
$$s_i^L(\mathbf{x}) = -\frac{1}{2}\boldsymbol{\mu}_i' \Sigma^{-1} \boldsymbol{\mu}_i + \boldsymbol{\mu}_i' \Sigma^{-1}\mathbf{x} + \log p_i = d_{i0} + \sum_{j=1}^{p} d_{ij} x_j + \log p_i \tag{7}$$
where
$$d_{i0} = -\frac{1}{2}\boldsymbol{\mu}_i' \Sigma^{-1} \boldsymbol{\mu}_i, \qquad d_{ij} = j\text{th element of } \boldsymbol{\mu}_i' \Sigma^{-1}$$

Linear Discriminant Function


$$d_i^L(\mathbf{x}) = -\frac{1}{2}\boldsymbol{\mu}_i' \Sigma^{-1} \boldsymbol{\mu}_i + \boldsymbol{\mu}_i' \Sigma^{-1}\mathbf{x} = d_{i0} + \sum_{j=1}^{p} d_{ij} x_j \tag{8}$$

Equation (7) is computed for each population; we then plug in the observed values and assign the unit to the population with the largest score. However, Equation (7) is in terms of population parameters, which must be estimated from training data in which the population membership is known.
Discriminant analysis requires estimates of:

– Prior probabilities: pi = P(πi ); i = 1, 2, . . . , g

– The population mean vectors, µi = E(x|πi ); i = 1, 2, . . . , g, estimated by the sample mean vectors

– The common variance-covariance matrix, Σ = var(x|πi ); i = 1, 2, . . . , g, estimated by the pooled variance-covariance matrix

Conditional Density Function Parameters

Population Means: µi is estimated by substituting in the sample means x̄i .


Variance-Covariance matrix: Let Si denote the sample variance-covariance matrix for population i. Then Σ is estimated by the pooled variance-covariance matrix
$$S_p = \frac{\sum_{i=1}^{g}(n_i - 1)S_i}{\sum_{i=1}^{g}(n_i - 1)} \tag{9}$$
Substituting Sp into the Linear Score Function gives the estimated linear score function:
$$\hat{s}_i^L(\mathbf{x}) = -\frac{1}{2}\bar{\mathbf{x}}_i' S_p^{-1}\bar{\mathbf{x}}_i + \bar{\mathbf{x}}_i' S_p^{-1}\mathbf{x} + \log \hat{p}_i = \hat{d}_{i0} + \sum_{j=1}^{p} \hat{d}_{ij} x_j + \log \hat{p}_i \tag{10}$$

where
$$\hat{d}_{i0} = -\frac{1}{2}\bar{\mathbf{x}}_i' S_p^{-1}\bar{\mathbf{x}}_i \qquad \text{and} \qquad \hat{d}_{ij} = j\text{th element of } \bar{\mathbf{x}}_i' S_p^{-1}$$
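A minimal Python sketch of Equations (9) and (10), computing the pooled covariance matrix and the estimated linear scores from training samples. The two training samples below are made-up illustrative data, not from any real study:

```python
import numpy as np

def pooled_cov(samples):
    """Pooled variance-covariance matrix S_p, Equation (9)."""
    num = sum((len(X) - 1) * np.cov(X, rowvar=False) for X in samples)
    den = sum(len(X) - 1 for X in samples)
    return num / den

def linear_scores(x, samples, priors):
    """Estimated linear score s^L_i(x) for each group, Equation (10)."""
    Sp_inv = np.linalg.inv(pooled_cov(samples))
    scores = []
    for X, p in zip(samples, priors):
        xbar = X.mean(axis=0)  # sample mean vector estimates mu_i
        s = -0.5 * xbar @ Sp_inv @ xbar + xbar @ Sp_inv @ x + np.log(p)
        scores.append(float(s))
    return scores

# Hypothetical training samples from two groups (rows = observations)
X1 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X2 = X1 + 5.0
scores = linear_scores(np.array([0.5, 0.5]), [X1, X2], [0.5, 0.5])
print(np.argmax(scores))  # prints 0: classified into group 0
```

The new unit is assigned to the group whose estimated score is largest, exactly as in step 5 of the procedure.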

Classification with Two Multivariate Normal Populations

Misclassification costs and prior probabilities

The criterion for classification is to minimize the expected cost of misclassification. It is minimized by classifying a unit with measurement x0 into population 1 if
$$\frac{f_1(\mathbf{x}_0)}{f_2(\mathbf{x}_0)} > \frac{c(1|2)\,p_2}{c(2|1)\,p_1} \tag{11}$$
Then the estimated minimum Expected Cost of Misclassification (ECM) rule for two normal populations allocates x0 to population 1 if
$$(\bar{\mathbf{x}}_1 - \bar{\mathbf{x}}_2)^T S_p^{-1}\mathbf{x}_0 - \frac{1}{2}(\bar{\mathbf{x}}_1 - \bar{\mathbf{x}}_2)^T S_p^{-1}(\bar{\mathbf{x}}_1 + \bar{\mathbf{x}}_2) > \log\left[\frac{c(1|2)\,p_2}{c(2|1)\,p_1}\right] \tag{12}$$
otherwise, it allocates x0 to population 2.


Note

– c(1|2) is the cost when a population 2 observation is incorrectly classified into population 1.

– c(2|1) is the cost when a population 1 observation is incorrectly classified into population 2.

– p1 and p2 are the prior probabilities for populations 1 and 2, respectively.
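The two-population ECM rule in Equation (12) can be sketched as follows; the sample means and pooled covariance in the example are hypothetical values chosen only to exercise the rule:

```python
import numpy as np

def ecm_allocate(x0, xbar1, xbar2, Sp, p1, p2, c12, c21):
    """Allocate x0 by the estimated minimum-ECM rule, Equation (12).

    c12 = c(1|2): cost of classifying a population-2 unit into population 1.
    c21 = c(2|1): cost of classifying a population-1 unit into population 2.
    Returns 1 or 2.
    """
    d = xbar1 - xbar2
    w = np.linalg.solve(Sp, d)                   # Sp^{-1} (xbar1 - xbar2)
    lhs = w @ x0 - 0.5 * w @ (xbar1 + xbar2)     # left side of Equation (12)
    rhs = np.log((c12 * p2) / (c21 * p1))        # right side of Equation (12)
    return 1 if lhs > rhs else 2

# Hypothetical sample means and pooled covariance
xbar1, xbar2 = np.array([0.0, 0.0]), np.array([2.0, 2.0])
Sp = np.eye(2)
print(ecm_allocate(np.array([0.0, 0.0]), xbar1, xbar2, Sp, 0.5, 0.5, 1.0, 1.0))  # prints 1
```

With equal costs and equal priors the right side is log(1) = 0, and the rule reduces to allocating by which sample mean x0 is closer to in the Sp metric.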

Exercise

1. Page 650, Exercise 11.1.

2. Suppose that n1 = 11 and n2 = 12 observations are sampled from two different bivariate
normal distributions that have a common covariance matrix Σ and possibly different mean
vectors µ1 and µ2 . The sample mean vectors and pooled covariance matrix are:

a) Report the estimate of the formula for Fisher's linear discriminant function. Explain how this function is used to classify.

b) Consider an observation x0 = [1, 2]T on a new experimental unit. Was this unit more
likely to have come from population 1 or population 2? (Assume equal misclassification
costs and equal prior probabilities).

c) Classify the unit in part (b) assuming prior probabilities .35 and .65 of observing a unit
from populations 1 and 2, respectively. Also, assume the cost of misclassifying a unit
from population 1 into population 2 is ten times greater than the cost of misclassifying
a unit from population 2 into population 1.
