Linear Models for Classification: Logistic Regression (logreg.pdf, May 4, 2010)

Classification problems involve predicting a target variable that can take on discrete, unordered values. The goal is to predict the target class given input values. There are two main approaches: discriminative models directly model the conditional probability of the target given inputs, while generative models estimate class conditional densities and apply Bayes' rule. Linear probability models and logistic regression are examples of discriminative models that use a linear function of the inputs to model the conditional probability. Logistic regression specifically uses a logistic response function to map the linear combination to a probability between 0 and 1.


Data Mining: Linear Models for Classification

Classification Problems

In a classification problem there is a target variable y that assumes values in an unordered discrete set.

The goal of a classification procedure is to predict the target value (class label) given a set of input values x = {x1, . . . , xd} measured on the same object.

An important special case is when there are only two classes, in which case we usually code them as y ∈ {0, 1}.

Examples of Classification Problems

- Credit scoring: will the applicant default or not?
- SPAM filter: is an e-mail message SPAM or not?
- Medical diagnosis: does the patient have breast cancer?
- Handwritten digit recognition.
- Music genre classification.

Classification Problems

At a particular point x the value of y is not uniquely determined. It can assume both its values with respective probabilities that depend on the location of the point x in the input space. We write

p(y = 1|x) = 1 − p(y = 0|x).

The goal of a classification procedure is to produce an estimate of p(y|x) at every input point x.

Two types of approaches to classification

- Discriminative Models (regression).
- Generative Models (density estimation).

Discriminative Models

Discriminative methods only model the conditional distribution of y given x. The probability distribution of x itself is not modeled. For the binary classification problem:

p(y = 1|x) = f(x, w)

where f(x, w) is some deterministic function of x.

Note that the linear regression model follows the same strategy.



Discriminative Models

Examples of discriminative classification methods:

- Linear probability model (this lecture)
- Logistic regression (this lecture)
- Classification Trees (Book: section 4.3)
- Feed-forward neural networks (Book: section 5.4)
- ...

Generative Models

An alternative paradigm for estimating p(y|x) is based on density estimation. Here Bayes' theorem

p(y = 1|x) = p(y = 1) p(x|y = 1) / [p(y = 1) p(x|y = 1) + p(y = 0) p(x|y = 0)]

is applied, where p(x|y) are the class conditional probability density functions and p(y) are the unconditional (prior) probabilities of each class.
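To make the generative recipe concrete, here is a minimal sketch that applies Bayes' theorem with assumed one-dimensional Gaussian class-conditional densities; the priors, densities, and test point are hypothetical, not taken from the lecture.

```python
from scipy.stats import norm

# Hypothetical priors p(y) and class-conditional densities p(x|y).
prior1, prior0 = 0.3, 0.7            # p(y = 1), p(y = 0)
dens1 = norm(loc=2.0, scale=1.0)     # p(x | y = 1)
dens0 = norm(loc=0.0, scale=1.0)     # p(x | y = 0)

def posterior_y1(x):
    """Bayes' theorem: p(y=1|x) = p(y=1)p(x|y=1) / [p(y=1)p(x|y=1) + p(y=0)p(x|y=0)]."""
    num = prior1 * dens1.pdf(x)
    return num / (num + prior0 * dens0.pdf(x))

print(posterior_y1(1.5))   # posterior probability of class 1 at the point x = 1.5
```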

Generative Models

Examples of density estimation based classification methods:

- Linear/Quadratic Discriminant Analysis (not discussed),
- Naive Bayes classifier (Book: section 5.3),
- ...

Discriminative Models: linear probability model

Consider the linear regression model

y = wᵀx + ε,   y ∈ {0, 1}

Note:

wᵀ = [w0 w1 . . . wd],   x = [1 x1 . . . xd]ᵀ,

so wᵀx = w0 + w1x1 + · · · + wdxd.

By assumption E[ε|x] = 0, so we have

E[y|x] = wᵀx

But

E[y|x] = 1 · p(y = 1|x) + 0 · p(y = 0|x) = p(y = 1|x)
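As an aside, the linear probability model can be fitted by ordinary least squares on the 0/1 targets. The sketch below uses hypothetical data (not from the lecture) and shows why the linear response is awkward for probabilities: the fitted values are not confined to the interval [0, 1].

```python
import numpy as np

# Hypothetical data: a single input x and a binary target y.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

# Design matrix with a leading column of ones, so w = [w0, w1].
X = np.column_stack([np.ones_like(x), x])

# Ordinary least squares estimate of w in E[y|x] = w0 + w1*x.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
print(w)        # fitted [w0, w1]
print(X @ w)    # fitted values: for this data they fall below 0 and above 1 at the ends
```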

Linear response function

[Figure: the linear response function E[y|x] = wᵀx plotted against wᵀx.]

Logistic regression

Logistic response function

E[y|x] = e^{wᵀx} / (1 + e^{wᵀx})

or (divide numerator and denominator by e^{wᵀx})

E[y|x] = 1 / (1 + e^{−wᵀx}) = (1 + e^{−wᵀx})^{−1}
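A minimal numpy sketch of the logistic response function; the weight vector and input point below are hypothetical.

```python
import numpy as np

def logistic_response(w, x):
    """E[y|x] = exp(w'x) / (1 + exp(w'x)) = 1 / (1 + exp(-w'x))."""
    z = np.dot(w, x)
    return 1.0 / (1.0 + np.exp(-z))

w = np.array([-3.0, 0.5])        # hypothetical [w0, w1]
x = np.array([1.0, 4.0])         # leading 1 for the intercept term
print(logistic_response(w, x))   # always a value strictly between 0 and 1
```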

logreg.pdf — May 4, 2010 — 2


Logistic Response Function

[Figure: the logistic response function E[y|x] plotted against wᵀx; an S-shaped curve rising from 0.0 through 0.5 to 1.0.]

Linearization: the logit transformation

Write z = wᵀx:

ln[ p(y = 1|x) / (1 − p(y = 1|x)) ]
  = ln[ (1 + e^{−z})^{−1} / (1 − (1 + e^{−z})^{−1}) ]
  = ln[ 1 / ((1 + e^{−z}) − 1) ]
  = ln(1 / e^{−z}) = ln e^z = z = wᵀx

In the second step, we divided the numerator and the denominator by (1 + e^{−z})^{−1}.

The ratio p(y = 1|x)/(1 − p(y = 1|x)) is called the odds.
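The linearization is easy to verify numerically: applying the log-odds (logit) transform to the logistic response recovers z = wᵀx exactly. A small check:

```python
import numpy as np

z = np.linspace(-5.0, 5.0, 11)    # a grid of values of z = w'x
p = 1.0 / (1.0 + np.exp(-z))      # p(y = 1|x) under the logistic model
logit = np.log(p / (1.0 - p))     # log-odds
print(np.allclose(logit, z))      # True: the log-odds is linear in x
```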

Linear Separation

Assign to class 1 if p(y = 1|x) > p(y = 0|x), i.e. if

p(y = 1|x) / (1 − p(y = 1|x)) > 1

This is true if

ln[ p(y = 1|x) / (1 − p(y = 1|x)) ] > 0

So assign to class 1 if wᵀx > 0, and to class 0 otherwise.

Linear Decision Boundary

[Figure: two classes, A and B, in the (x1, x2) plane separated by the line wᵀx = w0 + w1x1 + w2x2 = 0.]
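Because the logit is monotone, thresholding the probability at 1/2 gives exactly the same assignments as thresholding wᵀx at 0. A short sketch with hypothetical weights and points:

```python
import numpy as np

w = np.array([-1.0, 2.0, -0.5])        # hypothetical [w0, w1, w2]
X = np.array([[1.0, 0.2, 0.1],         # each row is [1, x1, x2]
              [1.0, 1.5, 0.3],
              [1.0, 0.4, 2.0]])

z = X @ w
p = 1.0 / (1.0 + np.exp(-z))
print(np.array_equal(p > 0.5, z > 0))  # True: identical decision rule
print((z > 0).astype(int))             # assigned class labels
```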

Maximum Likelihood Estimation

y = 1 if heads, y = 0 if tails. Let µ = p(y = 1).

One coin flip:

p(y) = µ^y (1 − µ)^{1−y}

Note that p(1) = µ, p(0) = 1 − µ as required.

Sequence of N independent coin flips:

p(y) = p(y1, y2, . . . , yN) = ∏_{i=1}^{N} µ^{yi} (1 − µ)^{1−yi}

which defines the likelihood function when viewed as a function of µ.

Maximum Likelihood Estimation

In a sequence of 10 coin flips we observe

y = (1, 0, 1, 1, 0, 1, 1, 1, 1, 0).

The corresponding likelihood function is

p(y|µ) = µ · (1 − µ) · µ · µ · (1 − µ) · µ · µ · µ · µ · (1 − µ) = µ^7 (1 − µ)^3

The corresponding loglikelihood function is

ln p(y|µ) = ln(µ^7 (1 − µ)^3) = 7 ln µ + 3 ln(1 − µ)

Note: log(ab) = log a + log b, log(a^b) = b log a.



Computing the maximum

To determine the maximum we take the derivative and equate it to zero

d ln p(y|µ)/dµ = 7/µ − 3/(1 − µ) = 0

which yields the maximum likelihood estimate µ_ML = 0.7.

This is just the relative frequency of heads in the sample.

Note: d ln x/dx = 1/x.

Loglikelihood function for y = (1, 0, 1, 1, 0, 1, 1, 1, 1, 0)

[Figure: the loglikelihood 7 ln µ + 3 ln(1 − µ) plotted against µ on [0, 1]; the maximum is at µ = 0.7.]
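The coin-flip result is easy to confirm numerically: minimizing the negative loglikelihood −(7 ln µ + 3 ln(1 − µ)) over µ recovers the relative frequency 0.7. A sketch using scipy's bounded scalar minimizer:

```python
import numpy as np
from scipy.optimize import minimize_scalar

y = np.array([1, 0, 1, 1, 0, 1, 1, 1, 1, 0])     # the observed coin flips

def neg_loglik(mu):
    # minus the loglikelihood: -(sum(y) ln(mu) + (N - sum(y)) ln(1 - mu))
    return -(y.sum() * np.log(mu) + (len(y) - y.sum()) * np.log(1.0 - mu))

res = minimize_scalar(neg_loglik, bounds=(1e-6, 1 - 1e-6), method="bounded")
print(res.x)   # approximately 0.7, the relative frequency of heads
```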

ML estimation for logistic regression

Now the probability of success depends on xi:

µi = p(y = 1|xi) = (1 + e^{−wᵀxi})^{−1}
1 − µi = p(y = 0|xi) = (1 + e^{wᵀxi})^{−1}

We can represent its probability distribution as follows

p(yi) = µi^{yi} (1 − µi)^{1−yi},   yi ∈ {0, 1}; i = 1, . . . , N

ML estimation for logistic regression

Example

 i   xi   yi   p(yi)
 1    8    0   (1 + e^{w0+8w1})^{−1}
 2   12    0   (1 + e^{w0+12w1})^{−1}
 3   15    1   (1 + e^{−w0−15w1})^{−1}
 4   10    1   (1 + e^{−w0−10w1})^{−1}

p(y|w) = (1 + e^{w0+8w1})^{−1} × (1 + e^{w0+12w1})^{−1} × (1 + e^{−w0−15w1})^{−1} × (1 + e^{−w0−10w1})^{−1}

LR: likelihood function

Since the yi observations are independent:

p(y|w) = ∏_{i=1}^{N} p(yi) = ∏_{i=1}^{N} µi^{yi} (1 − µi)^{1−yi}

Or, taking minus the natural log:

− ln p(y|w) = − ln ∏_{i=1}^{N} µi^{yi} (1 − µi)^{1−yi}
            = − Σ_{i=1}^{N} { yi ln µi + (1 − yi) ln(1 − µi) }

This is called the cross-entropy error function. It is comparable to the sum of squared errors for regression problems.

LR: error function

Since for the logistic regression model:

µi = (1 + e^{−wᵀxi})^{−1}
1 − µi = (1 + e^{wᵀxi})^{−1}

we get

E(w) = Σ_{i=1}^{N} { yi ln(1 + e^{−wᵀxi}) + (1 − yi) ln(1 + e^{wᵀxi}) }

- Non-linear function of the parameters.
- Likelihood function globally concave.
- Numerical optimization required.
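Since numerical optimization is required, here is a minimal sketch that minimizes the cross-entropy error E(w) with a general-purpose scipy optimizer, using the four observations (xi = 8, 12, 15, 10 with yi = 0, 0, 1, 1) from the example above. This only illustrates the objective; it is not the algorithm used in the lecture.

```python
import numpy as np
from scipy.optimize import minimize

# The four observations from the example above.
x = np.array([8.0, 12.0, 15.0, 10.0])
y = np.array([0.0, 0.0, 1.0, 1.0])
X = np.column_stack([np.ones_like(x), x])     # rows [1, xi], so w = [w0, w1]

def cross_entropy(w):
    """E(w) = sum_i { yi ln(1 + exp(-w'xi)) + (1 - yi) ln(1 + exp(w'xi)) }."""
    z = X @ w
    # np.logaddexp(0, t) computes ln(1 + exp(t)) in a numerically stable way.
    return np.sum(y * np.logaddexp(0.0, -z) + (1.0 - y) * np.logaddexp(0.0, z))

res = minimize(cross_entropy, x0=np.zeros(2), method="BFGS")
print(res.x)                             # estimated [w0, w1]
print(1.0 / (1.0 + np.exp(-X @ res.x)))  # fitted probabilities for the four points
```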

logreg.pdf — May 4, 2010 — 4


Fitted Response Function

Substitute the maximum likelihood estimates into the response function to obtain the fitted response function

p̂(y = 1|x) = e^{w_MLᵀ x} / (1 + e^{w_MLᵀ x})

Example: Programming Assignment

Model the probability of successfully completing a programming assignment.

Explanatory variable: "programming experience".

We find w0 = −3.0597 and w1 = 0.1615, so

p̂(y = 1|xi) = e^{−3.0597 + 0.1615 xi} / (1 + e^{−3.0597 + 0.1615 xi})

For 14 months of programming experience:

p̂(y = 1|x = 14) = e^{−3.0597 + 0.1615(14)} / (1 + e^{−3.0597 + 0.1615(14)}) ≈ 0.31
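Plugging the estimates w0 = −3.0597 and w1 = 0.1615 from the slide into the fitted response function reproduces the value for 14 months of experience:

```python
import numpy as np

w0, w1 = -3.0597, 0.1615     # maximum likelihood estimates from the slide

def fitted_prob(x):
    """Fitted response p_hat(y = 1|x) = 1 / (1 + exp(-(w0 + w1*x)))."""
    return 1.0 / (1.0 + np.exp(-(w0 + w1 * x)))

print(round(fitted_prob(14), 2))   # 0.31
```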

Example: Programming Assignment

  i  month.exp  success  fitted        i  month.exp  success  fitted
  1     14         0     0.310262     16     13         0     0.276802
  2     29         0     0.835263     17      9         0     0.167100
  3      6         0     0.109996     18     32         1     0.891664
  4     25         1     0.726602     19     24         0     0.693379
  5     18         1     0.461837     20     13         1     0.276802
  6      4         0     0.082130     21     19         0     0.502134
  7     18         0     0.461837     22      4         0     0.082130
  8     12         0     0.245666     23     28         1     0.811825
  9     22         1     0.620812     24     22         1     0.620812
 10      6         0     0.109996     25      8         1     0.145815
 11     30         1     0.856299
 12     11         0     0.216980
 13     30         1     0.856299
 14      5         0     0.095154
 15     20         1     0.542404

Allocation Rule

Probability of the classes is equal when

−3.0597 + 0.1615x = 0

Solving for x we get x ≈ 18.95.

Allocation Rule:
x ≥ 19: assign to class 1
x < 19: assign to class 0
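A sketch that applies the fitted model and the x ≥ 19 allocation rule to the 25 observations in the table above; the resulting cross table and the 6/25 error rate should match the confusion matrix on the next slide.

```python
import numpy as np

# month.exp and success for the 25 observations in the table above.
months  = np.array([14, 29, 6, 25, 18, 4, 18, 12, 22, 6, 30, 11, 30, 5, 20,
                    13, 9, 32, 24, 13, 19, 4, 28, 22, 8])
success = np.array([0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1,
                    0, 0, 1, 0, 1, 0, 0, 1, 1, 1])

w0, w1 = -3.0597, 0.1615
fitted = 1.0 / (1.0 + np.exp(-(w0 + w1 * months)))
pred = (months >= 19).astype(int)                        # allocation rule from the slide
assert np.array_equal(pred, (fitted > 0.5).astype(int))  # same rule as thresholding at 0.5

# Cross table: rows = observed class, columns = predicted class.
conf = np.array([[np.sum((success == 0) & (pred == 0)), np.sum((success == 0) & (pred == 1))],
                 [np.sum((success == 1) & (pred == 0)), np.sum((success == 1) & (pred == 1))]])
print(conf)                      # expected: [[11, 3], [3, 8]]
print(np.mean(pred != success))  # expected error rate: 6/25 = 0.24
```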

Programming Assignment: Confusion Matrix

Cross table of observed and predicted class label:

        0    1
   0   11    3
   1    3    8

Row: observed, Column: predicted

Error rate: 6/25 = 0.24
Default: 11/25 = 0.44

Example: Conn's syndrome

Two possible causes:
a) Benign tumor (adenoma) of the adrenal cortex.
b) More diffuse affection of the adrenal glands (bilateral hyperplasia).

Pre-operative diagnosis on basis of
1. Sodium concentration (mmol/l)
2. CO2 concentration (mmol/l)



Conn's syndrome: the data

a = 1, b = 0

  i  sodium   co2  cause       i  sodium   co2  cause
  1   140.6  30.3    0        16   139.0  31.4    0
  2   143.0  27.1    0        17   144.8  33.5    0
  3   140.0  27.0    0        18   145.7  27.4    0
  4   146.0  33.0    0        19   144.0  33.0    0
  5   138.7  24.1    0        20   143.5  27.5    0
  6   143.7  28.0    0        21   140.3  23.4    1
  7   137.3  29.6    0        22   141.2  25.8    1
  8   141.0  30.0    0        23   142.0  22.0    1
  9   143.8  32.2    0        24   143.5  27.8    1
 10   144.6  29.5    0        25   139.7  28.0    1
 11   139.5  26.0    0        26   141.1  25.0    1
 12   144.0  33.7    0        27   141.0  26.0    1
 13   145.0  33.0    0        28   140.5  27.0    1
 14   140.2  29.1    0        29   140.0  26.0    1
 15   144.7  27.4    0        30   140.0  25.6    1

Conn's Syndrome: Plot of Data

[Figure: scatter plot of the data, co2 against sodium, with each observation marked a or b according to its cause.]

Maximum Likelihood Estimation

The maximum likelihood estimates are:

w0 = 36.6874320
w1 = −0.1164658
w2 = −0.7626711

Assign to group a if

36.69 − 0.12 × sodium − 0.76 × CO2 > 0

and to group b otherwise.

Conn's Syndrome: Allocation Rule

[Figure: scatter plot of co2 against sodium with the linear decision boundary 36.69 − 0.12 × sodium − 0.76 × CO2 = 0 superimposed on the data.]
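A sketch of reproducing this fit by unpenalized maximum likelihood with statsmodels (an assumed choice of library, not the software used in the lecture); the estimates should come out close to the values above.

```python
import numpy as np
import statsmodels.api as sm

# The 30 observations from the "Conn's syndrome: the data" table (cause: a = 1, b = 0).
sodium = np.array([140.6, 143.0, 140.0, 146.0, 138.7, 143.7, 137.3, 141.0, 143.8, 144.6,
                   139.5, 144.0, 145.0, 140.2, 144.7, 139.0, 144.8, 145.7, 144.0, 143.5,
                   140.3, 141.2, 142.0, 143.5, 139.7, 141.1, 141.0, 140.5, 140.0, 140.0])
co2    = np.array([30.3, 27.1, 27.0, 33.0, 24.1, 28.0, 29.6, 30.0, 32.2, 29.5,
                   26.0, 33.7, 33.0, 29.1, 27.4, 31.4, 33.5, 27.4, 33.0, 27.5,
                   23.4, 25.8, 22.0, 27.8, 28.0, 25.0, 26.0, 27.0, 26.0, 25.6])
cause  = np.array([0]*20 + [1]*10)

# Maximum likelihood fit of p(cause = 1 | sodium, co2).
X = sm.add_constant(np.column_stack([sodium, co2]))   # columns: 1, sodium, co2
fit = sm.Logit(cause, X).fit(disp=0)
print(fit.params)      # should be close to [36.69, -0.12, -0.76]

# Allocation rule from the slide: assign to group a (cause = 1) if w'x > 0.
print((X @ fit.params > 0).astype(int))
```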

Conn's Syndrome: Confusion Matrix

Cross table of observed and predicted class label:

        a    b
   a    7    3
   b    2   18

Row: observed, Column: predicted

Error rate: 5/30 = 1/6
Default: 1/3

Conn's Syndrome: Line with lower empirical error

[Figure: scatter plot of co2 against sodium with a different separating line that attains a lower empirical error rate on these data.]



Likelihood and Error Rate

Likelihood maximization is not the same as error rate minimization!

  i   yi   p̂1(yi = 1)   p̂2(yi = 1)
  1    0      0.9           0.6
  2    0      0.4           0.1
  3    1      0.6           0.9
  4    1      0.55          0.4

Which model has the lower error rate?
Which one the higher likelihood?

Quadratic Model

  Coefficient      Value
  (Intercept)   -13100.69
  sodium            177.42
  CO2                41.36
  sodium^2           -0.60
  CO2^2              -0.12
  sodium × CO2       -0.25

Cross table of observed (row) and predicted class label:

        a    b
   a    8    2
   b    2   18
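The two questions on the "Likelihood and Error Rate" slide can be answered directly by computing both quantities for the four observations:

```python
import numpy as np

y  = np.array([0, 0, 1, 1])
p1 = np.array([0.9, 0.4, 0.6, 0.55])   # model 1: p_hat_1(yi = 1)
p2 = np.array([0.6, 0.1, 0.9, 0.4])    # model 2: p_hat_2(yi = 1)

for name, p in [("model 1", p1), ("model 2", p2)]:
    errors = np.sum((p > 0.5).astype(int) != y)       # 0/1 allocation at threshold 1/2
    likelihood = np.prod(np.where(y == 1, p, 1 - p))  # product of p(yi)
    print(name, "errors:", errors, "likelihood:", round(likelihood, 4))

# model 1: 1 error,  likelihood 0.0198
# model 2: 2 errors, likelihood 0.1296
# -> model 1 has the lower error rate, but model 2 has the higher likelihood.
```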

Conn's Syndrome: Quadratic Specification

[Figure: scatter plot of co2 against sodium illustrating the quadratic specification of the model.]
