
CS 4870: Machine Learning - Homework #4

Due on March 6, 2018 at 11:59 PM

Professor Kilian Weinberger, 8:40 AM

A,D,K,M,Z


Problem 1
a. From the question:
\[
P(R) = \frac{1}{2}, \qquad P(B) = \frac{1}{2}, \qquad P(H \mid R) = \frac{3}{5}, \qquad P(H \mid B) = \frac{7}{10}
\]

By Law of Total Probability, we have:

\[
P(H) = P(H \mid R)\,P(R) + P(H \mid B)\,P(B) = \frac{3}{5}\cdot\frac{1}{2} + \frac{7}{10}\cdot\frac{1}{2} = \frac{6}{20} + \frac{7}{20} = \frac{13}{20}
\]
By Bayes' Rule:
\[
P(R \mid H) = \frac{P(H \mid R)\,P(R)}{P(H)} = \frac{3}{5}\cdot\frac{1}{2}\cdot\frac{20}{13} = \frac{6}{13}
\]
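As a quick sanity check, the arithmetic can be reproduced with exact fractions in a short Python snippet (the variable names are just illustrative):

```python
from fractions import Fraction as F

# Quantities given in the problem statement
p_R, p_B = F(1, 2), F(1, 2)                    # prior over hat color
p_H_given_R, p_H_given_B = F(3, 5), F(7, 10)   # P(heads | hat color)

# Law of total probability, then Bayes' rule
p_H = p_H_given_R * p_R + p_H_given_B * p_B
p_R_given_H = p_H_given_R * p_R / p_H

print(p_H)          # 13/20
print(p_R_given_H)  # 6/13
```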

b. From the question (probabilities are $P(\text{[coin] is heads} \mid \text{hat color})$):
\[
\begin{aligned}
P(P \mid R) &= 3/5 & P(P \mid B) &= 7/10 \\
P(N \mid R) &= 3/10 & P(N \mid B) &= 1/5 \\
P(D \mid R) &= 1/2 & P(D \mid B) &= 1/10 \\
P(Q \mid R) &= 4/5 & P(Q \mid B) &= 2/5
\end{aligned}
\]

We make the naive Bayes assumption. By Bayes' Rule (notice the $1/2$ prior terms cancel; the third flip is tails, so the dime contributes $1 - P(D \mid \cdot)$):
\[
P(HHTH \mid R) = \frac{3}{5}\cdot\frac{3}{10}\cdot\frac{1}{2}\cdot\frac{4}{5} = \frac{9}{125}
\]
\[
P(HHTH \mid B) = \frac{7}{10}\cdot\frac{1}{5}\cdot\frac{9}{10}\cdot\frac{2}{5} = \frac{63}{1250}
\]
\[
P(R \mid HHTH) = \frac{P(HHTH \mid R)}{P(HHTH \mid R) + P(HHTH \mid B)} = \frac{10}{17}
\]
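The same kind of check can be scripted for the whole sequence. The helper below (a sketch; the function and argument names are made up) multiplies the per-coin likelihoods of the observed flips HHTH under each hat color and normalizes; with equal priors the 1/2 terms cancel exactly as noted above.

```python
from fractions import Fraction as F

def posterior_red(heads_given_R, heads_given_B, flips):
    """Naive Bayes posterior P(R | flips), assuming equal priors on R and B.

    heads_given_R, heads_given_B: per-coin P(heads | hat color), in flip order.
    flips: string of 'H'/'T' outcomes, one flip per coin.
    """
    like_R = like_B = F(1)
    for pR, pB, outcome in zip(heads_given_R, heads_given_B, flips):
        like_R *= pR if outcome == 'H' else 1 - pR
        like_B *= pB if outcome == 'H' else 1 - pB
    return like_R / (like_R + like_B)

# Part (b): penny, nickel, dime, quarter probabilities from the table above
print(posterior_red([F(3, 5), F(3, 10), F(1, 2), F(4, 5)],
                    [F(7, 10), F(1, 5), F(1, 10), F(2, 5)], "HHTH"))  # 10/17
```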

c. After examining the data table (probabilities are $P(\text{[coin] is heads} \mid \text{hat color})$):
\[
\begin{aligned}
P(P \mid R) &= 3/4 & P(P \mid B) &= 1/10 \\
P(N \mid R) &= 7/8 & P(N \mid B) &= 3/10 \\
P(D \mid R) &= 1/2 & P(D \mid B) &= 9/10 \\
P(Q \mid R) &= 1/8 & P(Q \mid B) &= 4/10
\end{aligned}
\]

By Bayes' Rule (notice the $1/2$ prior terms cancel):
\[
P(HHTH \mid R) = \frac{3}{4}\cdot\frac{7}{8}\cdot\frac{1}{2}\cdot\frac{1}{8} = \frac{21}{512}
\]
\[
P(HHTH \mid B) = \frac{1}{10}\cdot\frac{3}{10}\cdot\frac{1}{10}\cdot\frac{4}{10} = \frac{3}{2500}
\]
\[
P(R \mid HHTH) = \frac{P(HHTH \mid R)}{P(HHTH \mid R) + P(HHTH \mid B)} = \frac{4375}{4503}
\]
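Reusing the posterior_red helper sketched in part (b) (assuming those definitions are still in scope) with the part (c) table gives the same answer:

```python
# Part (c) heads probabilities for penny, nickel, dime, quarter
print(posterior_red([F(3, 4), F(7, 8), F(1, 2), F(1, 8)],
                    [F(1, 10), F(3, 10), F(9, 10), F(4, 10)], "HHTH"))  # 4375/4503
```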


d. $X$ = data (heads/tails outcomes) and $y$ = red/blue hat. The naive Bayes assumption holds because, conditioned on the hat color, the different coins are flipped independently of each other; hence the features can reasonably be assumed to be conditionally independent.

Problem 2
a.
\[
P(\text{Ham} \mid 0,0,1) = \frac{P(0,0,1 \mid \text{Ham})\,P(\text{Ham})}{P(0,0,1)} = 0
\]
\[
P(\text{Ham} \mid 1,1,1) = \frac{P(1,1,1 \mid \text{Ham})\,P(\text{Ham})}{P(1,1,1)} = \frac{1/5 \cdot 1/3}{1/5 \cdot 1/3 + 0 \cdot 2/3} = 1
\]
\[
P(\text{Ham} \mid 1,0,0) = \frac{P(1,0,0 \mid \text{Ham})\,P(\text{Ham})}{P(1,0,0)} = 0
\]
\[
P(\text{Ham} \mid 0,0,0) = \frac{P(0,0,0 \mid \text{Ham})\,P(\text{Ham})}{P(0,0,0)} = \frac{0}{0} = \text{undefined}
\]
Yes, the last one is undefined, and it is unreasonable for every prediction to be guaranteed to be exactly 0, exactly 1, or undefined. This is due to the fact that we do not use Laplace smoothing, so a feature value never seen with a class forces that class's estimate to zero.
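A small generic helper (a sketch; the names are made up, and the counts from the homework's data table are not reproduced here) makes this failure mode concrete: without smoothing, a feature value never observed with a class zeroes out that class's score, so the posterior collapses to 0 or 1, and to 0/0 when both scores vanish.

```python
from fractions import Fraction as F

def posterior_ham(cond_ham, cond_spam, p_ham, p_spam, x):
    """Naive Bayes posterior P(Ham | x) for binary features x.

    cond_ham, cond_spam: per-feature P(feature = 1 | class).
    Returns None when both class scores are zero (the 0/0 undefined case).
    """
    score_h, score_s = p_ham, p_spam
    for p_h, p_s, xi in zip(cond_ham, cond_spam, x):
        score_h *= p_h if xi else 1 - p_h
        score_s *= p_s if xi else 1 - p_s
    total = score_h + score_s
    return None if total == 0 else score_h / total
```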

b.

• Collecting more emails would help with our predictions, because a larger sample gives more reliable probability estimates.

• Extracting more features from each email would allow us to classify emails more accurately.

• Duplicating emails with uncommon features would not help; it only distorts the distribution of the emails.

• Making stronger assumptions is helpful; assuming our features are independent of each other would be more realistic for our data.

c.
\[
P(1,0,1 \mid \text{Ham}) = P(\text{bacon}=1 \mid \text{Ham})\,P(\text{ip}=0 \mid \text{Ham})\,P(\text{mispell}=1 \mid \text{Ham}) = 1 \cdot \frac{2}{5} \cdot \frac{3}{5} = \frac{6}{25}
\]
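This product can be verified directly (reusing the Fraction import from the sketch above):

```python
# P(bacon=1 | Ham), P(ip=0 | Ham), P(mispell=1 | Ham), read off as above
print(F(1) * F(2, 5) * F(3, 5))  # 6/25
```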

d.
\[
\begin{aligned}
P(\text{bacon}=1 \mid \text{Spam}) &= \frac{1}{10} & P(\text{bacon}=1 \mid \text{Ham}) &= \frac{5}{5} \\
P(\text{ip}=1 \mid \text{Spam}) &= \frac{3}{10} & P(\text{ip}=1 \mid \text{Ham}) &= \frac{3}{5} \\
P(\text{mispell}=1 \mid \text{Spam}) &= \frac{7}{10} & P(\text{mispell}=1 \mid \text{Ham}) &= \frac{3}{5} \\
P(\text{Spam}) &= \frac{2}{3} & P(\text{Ham}) &= \frac{1}{3}
\end{aligned}
\]


\[
P(\text{Ham} \mid 0,0,1) = \frac{P(0,0,1 \mid \text{Ham})\,P(\text{Ham})}{P(0,0,1)} = 0
\]
\[
P(\text{Ham} \mid 1,1,1) = \frac{P(1,1,1 \mid \text{Ham})\,P(\text{Ham})}{P(1,1,1)} = \frac{60}{67}
\]
\[
P(\text{Ham} \mid 1,0,0) = \frac{P(1,0,0 \mid \text{Ham})\,P(\text{Ham})}{P(1,0,0)} = \frac{80}{101}
\]
\[
P(\text{Ham} \mid 0,0,0) = \frac{P(0,0,0 \mid \text{Ham})\,P(\text{Ham})}{P(0,0,0)} = 0
\]
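Plugging the estimates listed above into the posterior_ham helper from part (a) (assuming it and the Fraction import are still in scope) reproduces all four posteriors:

```python
cond_ham  = [F(5, 5), F(3, 5), F(3, 5)]     # P(bacon/ip/mispell = 1 | Ham)
cond_spam = [F(1, 10), F(3, 10), F(7, 10)]  # P(bacon/ip/mispell = 1 | Spam)
p_ham, p_spam = F(1, 3), F(2, 3)

for x in [(0, 0, 1), (1, 1, 1), (1, 0, 0), (0, 0, 0)]:
    print(x, posterior_ham(cond_ham, cond_spam, p_ham, p_spam, x))
# (0,0,1) -> 0, (1,1,1) -> 60/67, (1,0,0) -> 80/101, (0,0,0) -> 0
```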

e.
\[
\begin{aligned}
P(\text{bacon}=1 \mid \text{Spam}) &= \frac{5}{18} & P(\text{bacon}=1 \mid \text{Ham}) &= \frac{9}{13} \\
P(\text{ip}=1 \mid \text{Spam}) &= \frac{7}{18} & P(\text{ip}=1 \mid \text{Ham}) &= \frac{7}{13} \\
P(\text{mispell}=1 \mid \text{Spam}) &= \frac{11}{18} & P(\text{mispell}=1 \mid \text{Ham}) &= \frac{7}{13} \\
P(\text{Spam}) &= \frac{18}{31} & P(\text{Ham}) &= \frac{13}{31}
\end{aligned}
\]

\[
P(1,0,1 \mid \text{Ham}) = P(\text{bacon}=1 \mid \text{Ham})\,P(\text{ip}=0 \mid \text{Ham})\,P(\text{mispell}=1 \mid \text{Ham}) = \frac{9}{13}\cdot\frac{6}{13}\cdot\frac{7}{13} \approx 0.172
\]

\[
P(\text{Ham} \mid 0,0,1) = \frac{P(0,0,1 \mid \text{Ham})\,P(\text{Ham})}{P(0,0,1)} \approx 0.170
\]
\[
P(\text{Ham} \mid 1,1,1) = \frac{P(1,1,1 \mid \text{Ham})\,P(\text{Ham})}{P(1,1,1)} \approx 0.687
\]
\[
P(\text{Ham} \mid 1,0,0) = \frac{P(1,0,0 \mid \text{Ham})\,P(\text{Ham})}{P(1,0,0)} \approx 0.617
\]
\[
P(\text{Ham} \mid 0,0,0) = \frac{P(0,0,0 \mid \text{Ham})\,P(\text{Ham})}{P(0,0,0)} \approx 0.216
\]
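Repeating the check with the smoothed estimates (same assumptions about the helper being in scope) gives the decimal posteriors above; in particular, the (1,1,1) case works out to roughly 0.687 with these numbers:

```python
cond_ham  = [F(9, 13), F(7, 13), F(7, 13)]   # smoothed P(feature = 1 | Ham)
cond_spam = [F(5, 18), F(7, 18), F(11, 18)]  # smoothed P(feature = 1 | Spam)
p_ham, p_spam = F(13, 31), F(18, 31)

for x in [(0, 0, 1), (1, 1, 1), (1, 0, 0), (0, 0, 0)]:
    print(x, float(posterior_ham(cond_ham, cond_spam, p_ham, p_spam, x)))
# ~0.170, ~0.687, ~0.617, ~0.216
```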


Problem 3
1.
\[
\begin{aligned}
p(y=1 \mid \vec{x}) &= \frac{\prod_{\alpha=1}^{d} p([\vec{x}]_\alpha \mid y=1)\,p(y=1)}{p(\vec{x})} \\
&= \frac{\prod_{\alpha=1}^{d} p([\vec{x}]_\alpha \mid y=1)\,p(y=1)}{p(\vec{x} \mid y=1)\,p(y=1) + p(\vec{x} \mid y=0)\,p(y=0)} && \text{(sum rule)} \\
&= \frac{\prod_{\alpha=1}^{d} p([\vec{x}]_\alpha \mid y=1)\,p(y=1)}{\prod_{\alpha=1}^{d} p([\vec{x}]_\alpha \mid y=1)\,p(y=1) + \prod_{\alpha=1}^{d} p([\vec{x}]_\alpha \mid y=0)\,p(y=0)} && \text{(naive Bayes assumption and product rule)}
\end{aligned}
\]

2. Dividing numerator and denominator by the numerator:
\[
\begin{aligned}
p(y=1 \mid \vec{x}) &= \frac{\prod_{\alpha=1}^{d} p([\vec{x}]_\alpha \mid y=1)\,p(y=1)}{\prod_{\alpha=1}^{d} p([\vec{x}]_\alpha \mid y=1)\,p(y=1) + \prod_{\alpha=1}^{d} p([\vec{x}]_\alpha \mid y=0)\,p(y=0)} \\
&= \frac{1}{1 + \dfrac{\prod_{\alpha=1}^{d} p([\vec{x}]_\alpha \mid y=0)\,p(y=0)}{\prod_{\alpha=1}^{d} p([\vec{x}]_\alpha \mid y=1)\,p(y=1)}} \\
&= \frac{1}{1 + \exp\!\left(\log \dfrac{\prod_{\alpha=1}^{d} p([\vec{x}]_\alpha \mid y=0)\,p(y=0)}{\prod_{\alpha=1}^{d} p([\vec{x}]_\alpha \mid y=1)\,p(y=1)}\right)} \\
&= \frac{1}{1 + \exp\!\left(-\log \dfrac{\prod_{\alpha=1}^{d} p([\vec{x}]_\alpha \mid y=1)\,p(y=1)}{\prod_{\alpha=1}^{d} p([\vec{x}]_\alpha \mid y=0)\,p(y=0)}\right)}
\end{aligned}
\]
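A quick numerical check of this identity (a sketch with arbitrary made-up per-feature likelihoods and priors) confirms that normalizing the two class scores is the same as applying a sigmoid to the log-odds:

```python
import math
import random

random.seed(0)
d = 5
# Arbitrary per-feature likelihoods p([x]_a | y) for a single fixed input x
like1 = [random.uniform(0.1, 0.9) for _ in range(d)]  # class y = 1
like0 = [random.uniform(0.1, 0.9) for _ in range(d)]  # class y = 0
prior1, prior0 = 0.3, 0.7

score1 = prior1 * math.prod(like1)
score0 = prior0 * math.prod(like0)

direct = score1 / (score1 + score0)            # normalized class scores
log_odds = math.log(score1) - math.log(score0)
via_sigmoid = 1.0 / (1.0 + math.exp(-log_odds))

print(abs(direct - via_sigmoid) < 1e-12)  # True
```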

3. Define $\vec{w}$ and $b$ as follows:
\[
w_\alpha = [\vec{w}]_\alpha = \frac{\mu_{\alpha 1} - \mu_{\alpha 0}}{\sigma_\alpha^2}
\]
\[
b = \log\!\left(\frac{p(y=1)}{p(y=0)}\right) - \sum_{\alpha=1}^{d} \frac{\mu_{\alpha 1}^2 - \mu_{\alpha 0}^2}{2\sigma_\alpha^2}
\]


Then, given that $p([\vec{x}]_\alpha \mid y) = \frac{1}{\sqrt{2\pi}\,\sigma_\alpha} \exp\!\left(-\frac{(x_\alpha - \mu_{\alpha y})^2}{2\sigma_\alpha^2}\right)$,
\[
\begin{aligned}
h(\vec{x}) = 1 &\iff \frac{P(y=1 \mid \vec{x})}{P(y=0 \mid \vec{x})} > 1 \\
&\iff \frac{\prod_{\alpha=1}^{d} p([\vec{x}]_\alpha \mid y=1)\,p(y=1)}{\prod_{\alpha=1}^{d} p([\vec{x}]_\alpha \mid y=0)\,p(y=0)} > 1 \\
&\iff \log \frac{\prod_{\alpha=1}^{d} p([\vec{x}]_\alpha \mid y=1)}{\prod_{\alpha=1}^{d} p([\vec{x}]_\alpha \mid y=0)} + \log \frac{p(y=1)}{p(y=0)} > 0 \\
&\iff \log \frac{\prod_{\alpha=1}^{d} \frac{1}{\sqrt{2\pi}\,\sigma_\alpha} \exp\!\left(-\frac{(x_\alpha - \mu_{\alpha 1})^2}{2\sigma_\alpha^2}\right)}{\prod_{\alpha=1}^{d} \frac{1}{\sqrt{2\pi}\,\sigma_\alpha} \exp\!\left(-\frac{(x_\alpha - \mu_{\alpha 0})^2}{2\sigma_\alpha^2}\right)} + \log \frac{p(y=1)}{p(y=0)} > 0 \\
&\iff \log \frac{\exp\!\left(-\sum_{\alpha=1}^{d} \frac{(x_\alpha - \mu_{\alpha 1})^2}{2\sigma_\alpha^2}\right)}{\exp\!\left(-\sum_{\alpha=1}^{d} \frac{(x_\alpha - \mu_{\alpha 0})^2}{2\sigma_\alpha^2}\right)} + \log \frac{p(y=1)}{p(y=0)} > 0 \\
&\iff \log\!\left(\exp\!\left(-\sum_{\alpha=1}^{d} \frac{(x_\alpha - \mu_{\alpha 1})^2 - (x_\alpha - \mu_{\alpha 0})^2}{2\sigma_\alpha^2}\right)\right) + \log \frac{p(y=1)}{p(y=0)} > 0 \\
&\iff -\sum_{\alpha=1}^{d} \frac{(x_\alpha - \mu_{\alpha 1})^2 - (x_\alpha - \mu_{\alpha 0})^2}{2\sigma_\alpha^2} + \log \frac{p(y=1)}{p(y=0)} > 0 \\
&\iff -\sum_{\alpha=1}^{d} \frac{-2x_\alpha \mu_{\alpha 1} + \mu_{\alpha 1}^2 + 2x_\alpha \mu_{\alpha 0} - \mu_{\alpha 0}^2}{2\sigma_\alpha^2} + \log \frac{p(y=1)}{p(y=0)} > 0 \\
&\iff \sum_{\alpha=1}^{d} \frac{\mu_{\alpha 1} - \mu_{\alpha 0}}{\sigma_\alpha^2}\,x_\alpha - \sum_{\alpha=1}^{d} \frac{\mu_{\alpha 1}^2 - \mu_{\alpha 0}^2}{2\sigma_\alpha^2} + \log \frac{p(y=1)}{p(y=0)} > 0 \\
&\iff \sum_{\alpha=1}^{d} w_\alpha x_\alpha + \left(\log\!\left(\frac{p(y=1)}{p(y=0)}\right) - \sum_{\alpha=1}^{d} \frac{\mu_{\alpha 1}^2 - \mu_{\alpha 0}^2}{2\sigma_\alpha^2}\right) > 0 \\
&\iff \vec{w} \cdot \vec{x} + b > 0
\end{aligned}
\]

Therefore, Gaussian naive Bayes with shared variance is a linear classifier.
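To sanity-check the algebra, the sketch below (with arbitrary made-up Gaussian parameters) compares the direct decision $P(y=1 \mid \vec{x}) > P(y=0 \mid \vec{x})$, evaluated in log space, against the sign of $\vec{w} \cdot \vec{x} + b$ as defined above; the two agree on every random input.

```python
import math
import random

random.seed(1)
d = 4
mu0 = [random.uniform(-1, 1) for _ in range(d)]        # class-0 means
mu1 = [random.uniform(-1, 1) for _ in range(d)]        # class-1 means
sigma = [random.uniform(0.5, 2.0) for _ in range(d)]   # shared per-feature std dev
prior1 = 0.4
prior0 = 1.0 - prior1

def gauss_log_pdf(x, mu, s):
    return -math.log(math.sqrt(2.0 * math.pi) * s) - (x - mu) ** 2 / (2.0 * s ** 2)

# Weight vector and bias exactly as defined in the derivation above
w = [(m1 - m0) / s ** 2 for m0, m1, s in zip(mu0, mu1, sigma)]
b = math.log(prior1 / prior0) - sum((m1 ** 2 - m0 ** 2) / (2.0 * s ** 2)
                                    for m0, m1, s in zip(mu0, mu1, sigma))

for _ in range(1000):
    x = [random.uniform(-3, 3) for _ in range(d)]
    log_score1 = math.log(prior1) + sum(gauss_log_pdf(xi, m, s)
                                        for xi, m, s in zip(x, mu1, sigma))
    log_score0 = math.log(prior0) + sum(gauss_log_pdf(xi, m, s)
                                        for xi, m, s in zip(x, mu0, sigma))
    assert (log_score1 > log_score0) == (sum(wi * xi for wi, xi in zip(w, x)) + b > 0)

print("Linear rule matches the Gaussian naive Bayes decisions.")
```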
