Lab 6

Machine Learning Classification Theory Examination

Time allowed: 120 minutes. Total points: 100.

Part 1: Classification Fundamentals (30 points)


1. (10 points) Given the following confusion matrix for a facial expression recognition system:

                Actual
Predicted    Happy   Sad
Happy           85    15
Sad             20    80

a) Calculate accuracy, precision, recall, and F1-score

b) Explain why each metric would or would not be appropriate for this application

c) If this system were to be used in mental health screening, which metric should be prioritized and why?

2. (10 points) The sigmoid function σ(z) = 1/(1 + e⁻ᶻ) is fundamental to logistic regression.

a) For z = ln(3), calculate the exact value of σ(z)

b) Prove that σ(z) + σ(-z) = 1 for any value of z

c) Explain why property (b) makes this function suitable for binary classification

3. (10 points) For a KNN classifier with k=3, consider the following five training points:

Point   Feature x   Feature y   Class
P1          1           2         A
P2          2           1         A
P3          5           5         B
P4          4           4         B
P5          2           3         A

What class would be assigned to a test point at (3,3)? Show all distance calculations and explain your reasoning.

Question 1. Confusion Matrix Analysis (10 points)


Given matrix:

                Actual
Predicted    Happy   Sad
Happy           85    15
Sad             20    80

a) Metric calculations (taking Happy as the positive class): TP = 85, TN = 80, FP = 15, FN = 20

1. Accuracy:

Formula: (TP + TN)/(TP + TN + FP + FN)


Numbers: (85 + 80)/(85 + 80 + 15 + 20)
Calculation: 165/200 = 0.825
Result: 82.5%

2. Precision:

Formula: TP/(TP + FP)


Numbers: 85/(85 + 15)
Calculation: 85/100 = 0.85
Result: 85%

3. Recall:

Formula: TP/(TP + FN)


Numbers: 85/(85 + 20)
Calculation: 85/105 ≈ 0.81
Result: 81%

4. F1-score:

Formula: 2 × (Precision × Recall)/(Precision + Recall)


Numbers: 2 × (0.85 × 0.81)/(0.85 + 0.81)
Calculation: 2 × 0.6885/1.66 ≈ 0.83
Result: 83%
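
As a quick sanity check on the arithmetic above, here is a minimal Python snippet (an addition, not part of the original exam) that recomputes the four metrics from the confusion-matrix counts, with Happy as the positive class:

# Recompute the Question 1 metrics from the confusion-matrix counts.
TP, TN, FP, FN = 85, 80, 15, 20

accuracy = (TP + TN) / (TP + TN + FP + FN)              # 165/200 = 0.825
precision = TP / (TP + FP)                              # 85/100 = 0.85
recall = TP / (TP + FN)                                 # 85/105 ≈ 0.8095
f1 = 2 * precision * recall / (precision + recall)      # ≈ 0.8293

print(f"accuracy={accuracy:.4f}  precision={precision:.4f}  "
      f"recall={recall:.4f}  f1={f1:.4f}")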

b) Metric Appropriateness:

Accuracy: Appropriate because:

Classes are balanced (similar numbers of Happy and Sad)


Both types of errors have similar importance
Gives good overall performance measure

Precision: Important because:

Measures reliability of "Happy" predictions


Crucial if false positives are problematic
Helps avoid wrongly labeling sad expressions as happy

Recall: Critical because:

Shows ability to detect all happy expressions


Important for not missing positive emotions
Helps ensure comprehensive emotion detection

F1-score: Very appropriate because:

Balances precision and recall


Provides single metric for comparison
Accounts for both types of errors

c) For mental health screening:

Recall (computed with Sad treated as the positive class for screening) should be prioritized because:

Missing a sad expression (a false negative) could be dangerous
Better to flag potential concerns for further review
Early intervention is crucial in mental health
Cost of missing a sad expression > cost of falsely flagging a happy one

Question 2. Sigmoid Function Analysis (10 points)


a) For z = ln(3): σ(ln(3)) = 1/(1 + e^-ln(3)) = 1/(1 + 1/3) = 1/(3/3 + 1/3) = 1/(4/3) = 3/4 = 0.75

b) Proof that σ(z) + σ(-z) = 1:

σ(z) + σ(-z) = 1/(1 + e^-z) + 1/(1 + e^z)

= (e^z)/(e^z + 1) + (1)/(1 + e^z)   (multiplying the first fraction by e^z/e^z)

= (e^z + 1)/(e^z + 1)

= 1

c) This property makes sigmoid suitable for binary classification because:

Outputs are complementary probabilities


P(class 1) + P(class 0) = 1 always holds
Natural interpretation as probability
Smooth transition between classes
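
A short Python check (added here, not exam content) that evaluates σ(ln 3) and spot-checks the identity from part (b) at a few values of z:

import math

def sigmoid(z: float) -> float:
    # Logistic sigmoid: 1 / (1 + e^-z)
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(math.log(3)))              # 0.75, matching part (a)

# Numerically verify sigmoid(z) + sigmoid(-z) = 1 at a few sample points.
for z in (-5.0, -1.0, 0.0, 2.0, 10.0):
    assert abs(sigmoid(z) + sigmoid(-z) - 1.0) < 1e-12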

Question 3. KNN Classification (10 points)


For test point (3,3), calculate distances to all training points:

P1(1,2) class A: d₁ = √((3-1)² + (3-2)²) = √(4 + 1) = √5 ≈ 2.236

P2(2,1) class A: d₂ = √((3-2)² + (3-1)²) = √(1 + 4) = √5 ≈ 2.236

P3(5,5) class B: d₃ = √((3-5)² + (3-5)²) = √(4 + 4) = √8 ≈ 2.828


P4(4,4) class B: d₄ = √((3-4)² + (3-4)²) = √(1 + 1) = √2 ≈ 1.414

P5(2,3) class A: d₅ = √((3-2)² + (3-3)²) = √(1 + 0) = 1.000

Three nearest neighbors:

1. P5: d = 1.000 (Class A)
2. P4: d ≈ 1.414 (Class B)
3. P1: d ≈ 2.236 (Class A); P2 is tied at the same distance and is also Class A, so the tie does not affect the vote

Result: Class A (2 votes for A, 1 vote for B)
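
The same result can be reproduced with a small Python sketch (an illustrative addition, not part of the original answer):

import math
from collections import Counter

# Training points from the question: ((x, y), class label).
train = [((1, 2), "A"), ((2, 1), "A"), ((5, 5), "B"), ((4, 4), "B"), ((2, 3), "A")]
query = (3, 3)
k = 3

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Keep the k training points closest to the query, then take a majority vote.
neighbours = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
votes = Counter(label for _, label in neighbours)

print(neighbours)            # [((2, 3), 'A'), ((4, 4), 'B'), ((1, 2), 'A')]
print(votes.most_common(1))  # [('A', 2)] -> Class A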

Part 2: Advanced Concepts (40 points)


4. (15 points) Softmax Regression: Given the following scores for an image classification task:

Cat: 2.0
Dog: 1.0
Bird: -1.0

a) Calculate the softmax probabilities for each class

b) Show all steps of your calculation

c) Prove that your probabilities sum to 1

d) Explain why we need the exponential function in softmax

5. (10 points) For logistic regression with two features x₁ and x₂, weights w₁=2, w₂=-1, and bias b=1:

a) Write the complete equation for P(y=1|x)

b) Find the equation of the decision boundary where P(y=1|x) = 0.5

c) Calculate P(y=1|x) for point (1,1)

Question 4. Softmax Regression (15 points)


Given scores: Cat: 2.0 Dog: 1.0 Bird: -1.0

a) & b) Detailed calculation steps:

1. Calculate exponentials:

Cat: e^2.0 = 7.389

Dog: e^1.0 = 2.718

Bird: e^(-1.0) = 0.368

2. Calculate sum (denominator):

Sum = 7.389 + 2.718 + 0.368 = 10.475

3. Calculate probabilities:

P(Cat) = 7.389/10.475 = 0.705 = 70.5%

P(Dog) = 2.718/10.475 = 0.259 = 25.9%

P(Bird) = 0.368/10.475 = 0.035 = 3.5%

c) Proof probabilities sum to 1:

0.705 + 0.259 + 0.035 = 0.999 ≈ 1.0

(Small difference due to rounding)

d) Exponential function necessity:

1. Non-negativity: Converts all scores to positive numbers

2. Relative scale preservation: Maintains ordering of original scores


3. Amplification: Enhances differences between scores

4. Mathematical properties: Derivative of exp(x) is exp(x), simplifying optimization
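
A compact Python sketch (added, not exam content) reproducing the softmax computation; it also applies the usual max-subtraction trick, which does not change the result but avoids overflow for large scores:

import math

scores = {"Cat": 2.0, "Dog": 1.0, "Bird": -1.0}

# Subtract the maximum score before exponentiating for numerical stability.
m = max(scores.values())
exps = {k: math.exp(v - m) for k, v in scores.items()}
total = sum(exps.values())
probs = {k: e / total for k, e in exps.items()}

print(probs)                # ~{'Cat': 0.705, 'Dog': 0.259, 'Bird': 0.035}
print(sum(probs.values()))  # 1.0 up to floating-point rounding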

Question 5. Logistic Regression (10 points)


Given: w₁=2, w₂=-1, b=1

a) Complete probability equation: P(y=1|x) = 1/(1 + e^-(2x₁ - x₂ + 1))

b) Decision boundary where P(y=1|x) = 0.5:

1. Start with P(y=1|x) = 0.5: 0.5 = 1/(1 + e^-(2x₁ - x₂ + 1))

2. Multiply both sides by denominator: 0.5(1 + e^-(2x₁ - x₂ + 1)) = 1

3. Simplify:

1 + e^-(2x₁ - x₂ + 1) = 2

e^-(2x₁ - x₂ + 1) = 1

-(2x₁ - x₂ + 1) = 0

2x₁ - x₂ + 1 = 0

4. Final equation: x₂ = 2x₁ + 1

c) Calculate P(y=1|x) for (1,1):

1. Calculate z = 2x₁ - x₂ + 1: z = 2(1) - 1 + 1 = 2

2. Apply sigmoid: P(y=1|x) = 1/(1 + e^-2) = 1/(1 + 0.135) = 0.881 = 88.1%
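
The decision boundary and the probability at (1,1) can be checked with a short Python sketch (an addition, using only the weights given in the question):

import math

w1, w2, b = 2.0, -1.0, 1.0

def p_y1(x1: float, x2: float) -> float:
    # P(y=1|x) = sigmoid(w1*x1 + w2*x2 + b)
    z = w1 * x1 + w2 * x2 + b
    return 1.0 / (1.0 + math.exp(-z))

print(p_y1(1, 1))  # ~0.8808, matching part (c)

# Points on the boundary x2 = 2*x1 + 1 should all give probability 0.5.
for x1 in (-1.0, 0.0, 2.5):
    assert abs(p_y1(x1, 2 * x1 + 1) - 0.5) < 1e-12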

Part 3: Theoretical Analysis (30 points)


7. (15 points) Compare and contrast:

a) The mathematical basis for why KNN can handle non-linear decision boundaries while logistic regression cannot

b) How the curse of dimensionality affects each algorithm

c) The computational complexity during training and prediction phases

8. (15 points) Consider a face recognition system that needs to classify expressions into: Happy, Sad, Angry, Surprised.

a) Write out the full One-vs-All formulation for this problem

b) Calculate the number of binary classifiers needed for both One-vs-All and One-vs-One approaches

c) Explain mathematically why softmax regression might be more appropriate than multiple logistic regressions

Question 7. Algorithm Comparison (15 points)


a) KNN vs. Logistic Regression Decision Boundaries:

KNN decision boundaries:

Forms boundaries based on local neighborhood majority


Mathematical formulation: D(x) = argmax_y Σ_{i ∈ N_k(x)} I(yᵢ = y) · w(||x - xᵢ||), where:
N_k(x) is the set of the k training points nearest to x
I(yᵢ = y) is the indicator function
w(||x - xᵢ||) is an optional distance-based weight (w ≡ 1 for a plain majority vote)
Can create complex, non-linear boundaries because:
1. Decision at each point depends only on nearby training points
2. No global functional form constraint
3. Boundary shape adapts to local data distribution

Logistic regression boundaries:


Creates single linear boundary
Mathematical form: z = w^T x + b, with P(y=1|x) = 1/(1 + e^-z)
Can only create linear boundaries because:
1. Decision function is linear combination of features
2. Sigmoid function is monotonic
3. Decision boundary occurs where w^T x + b = 0

b) Curse of Dimensionality Effects:

KNN affected severely:

1. Volume grows exponentially with dimensions


In d dimensions, unit hypersphere volume ∝ π^(d/2)/Γ(d/2 + 1)
2. Distance metrics become less meaningful
Ratio of distances (dmax/dmin) → 1 as d → ∞
3. Required sample size grows exponentially
To maintain same density, N ∝ 2^d
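
The distance-concentration claim above (d_max/d_min → 1) can be illustrated with a quick simulation; this is an added sketch and assumes NumPy is available:

import numpy as np

rng = np.random.default_rng(0)
n = 1000

for d in (2, 10, 100, 1000):
    X = rng.uniform(size=(n, d))   # n random points in the unit hypercube
    q = rng.uniform(size=d)        # one random query point
    dists = np.linalg.norm(X - q, axis=1)
    print(f"d={d:4d}  d_max/d_min = {dists.max() / dists.min():.2f}")
# Typical output: the ratio falls from double digits at d=2 toward ~1 as d grows.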

Logistic regression affected less:

1. Parameter count grows linearly


Number of parameters = d + 1
2. Model complexity independent of space volume
3. Still effective in high dimensions if:
Linear separation exists
Features are informative

c) Computational Complexity Analysis:

KNN: Training phase:

Time complexity: O(1)


Space complexity: O(nd)
Just stores training data
No actual training performed

Prediction phase:

Time complexity: O(nd + n log n) per query


nd for distance calculations
n log n for sorting distances
Space complexity: O(n)
Must compute distances to all training points

Logistic Regression: Training phase:

Time complexity: O(ndm)


n = samples
d = dimensions
m = iterations
Space complexity: O(d)
Gradient descent iterations

Prediction phase:

Time complexity: O(d)


Space complexity: O(d)
Single matrix multiplication
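
A rough sketch (an addition, assuming NumPy) contrasting the per-query prediction work of the two models: brute-force KNN must touch every stored training row, while logistic regression needs only one dot product with its d weights:

import numpy as np

rng = np.random.default_rng(1)
n, d = 10_000, 50
X_train = rng.normal(size=(n, d))     # KNN stores all of this: O(nd) space
y_train = rng.integers(0, 2, size=n)
w, b = rng.normal(size=d), 0.0        # logistic regression stores only O(d)

x_query = rng.normal(size=d)

# KNN prediction: O(nd) distance computations per query, then pick the k smallest.
dists = np.linalg.norm(X_train - x_query, axis=1)
nearest = np.argpartition(dists, 3)[:3]
knn_pred = int(np.bincount(y_train[nearest]).argmax())

# Logistic regression prediction: a single O(d) dot product.
lr_prob = 1.0 / (1.0 + np.exp(-(x_query @ w + b)))

print(knn_pred, float(lr_prob))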

Question 8. Multi-class Classification (15 points)


a) One-vs-All formulation for emotions:

For each emotion k ∈ {Happy, Sad, Angry, Surprised}:

Train binary classifier fₖ(x)


fₖ(x) = σ(wₖᵀx + bₖ)

Complete formulation:

1. Happy vs. rest: P(Happy|x) = σ(w₁ᵀx + b₁)


2. Sad vs. rest: P(Sad|x) = σ(w₂ᵀx + b₂)
3. Angry vs. rest: P(Angry|x) = σ(w₃ᵀx + b₃)
4. Surprised vs. rest: P(Surprised|x) = σ(w₄ᵀx + b₄)

Final prediction: ŷ = argmax_k fₖ(x)
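
A hedged sketch of this One-vs-All scheme (an addition; it assumes scikit-learn is available and uses synthetic stand-in data rather than real expression features):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Synthetic 4-class data standing in for Happy / Sad / Angry / Surprised features.
X, y = make_classification(n_samples=400, n_features=20, n_informative=10,
                           n_classes=4, random_state=0)

# One binary logistic classifier f_k(x) per class; prediction is argmax_k f_k(x).
ova = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)

print(len(ova.estimators_))   # 4 binary classifiers, one per emotion
print(ova.predict(X[:5]))     # class indices chosen by argmax over the four scores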

b) Number of Required Classifiers:

One-vs-All:

Number = K (number of classes)


Here: 4 classifiers

One-vs-One:

Number = K(K-1)/2
Here: 4(4-1)/2 = 6 classifiers
Pairs: (Happy-Sad), (Happy-Angry), (Happy-Surprised), (Sad-Angry), (Sad-Surprised), (Angry-Surprised)

c) Softmax Regression Advantages:

Mathematical explanation:

1. Joint probability modeling: softmax gives P(y=k|x) = exp(wₖᵀx)/Σⱼ exp(wⱼᵀx)

Naturally ensures Σₖ P(y=k|x) = 1 (see the numeric illustration after this list)


Models class probabilities simultaneously
2. Parameter sharing:

Features used by all classes simultaneously


Better feature utilization
More efficient learning
3. Training stability:

Single optimization objective


Avoids conflicting binary decisions
More numerically stable
4. Direct comparison:

Scores directly comparable


No need for calibration
Natural ranking of probabilities
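
A small numeric illustration (added, with made-up scores) of point 1: per-class sigmoids from independently trained binary classifiers need not sum to 1, while softmax sums to 1 by construction:

import math

scores = [2.0, 1.0, -1.0, 0.5]   # hypothetical scores for the four emotions

sigmoids = [1.0 / (1.0 + math.exp(-s)) for s in scores]   # One-vs-All style outputs
exps = [math.exp(s) for s in scores]
softmax = [e / sum(exps) for e in exps]

print(sum(sigmoids))  # ~2.50 -> not a probability distribution without calibration
print(sum(softmax))   # 1.0  -> a proper distribution over the four classes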
