Exercise 4
Exercise 4
Exercise-4
Prof. Dr. Dr. Lars Schmidt-Thieme, Umer Khan
Information Systems and Machine Learning Lab (ISMLL),
University of Hildesheim
Logistic Regression:
Problem-1:
Suppose the following problem: A doctor wants to find out whether a particular antibiotic has an
effect on the incidence of infection in women who have a caesarean section. He starts from a
simple linear regression model:
y =1 means that an infection occurs in the next 2 weeks, y = 0, that no infection occurs. x
encoding the administered amount of the antibiotic. When the target variable is a binomial
random variable with the following discrete distribution:
a) Why should you not use linear regression on data with binary outcome variable? Consider the
error 'ɛ' and the variance σ2. What value ɛ takes for y = 1 and y = 0? What value does σ2 takes?]
b) We assume that the variance for data x can be varying in different cases. What method can be
applied to solve the problem?
c) Suppose we use logistic regression. Describe the behavior of the logistic function with respect
to β
Problem 2:
Build a logistic regression model to predict whether a student gets admitted into a university
based on ‘Exam 1 Score’ and ‘Exam 2 Score’. You have to estimate the probability of a student
to get admitted. Implement ‘plotData.m’ so that the training data in ‘ex2data1.txt’ can be
visualized as in the following figure:
Impelemt ‘costFunction.m’ to return cost and gradient of Logistic Regression, which takes the
following form:
And the gradient of the cost is a vector ‘Θ’ (theta) where the jth element (for j= 1,2,….,n) is
defined as:
Test your costFunction through ‘ex2.m’ using the initial parameters of ‘Θ’
After learning the parameters, you can use the model to predict whether a particular student will
be admitted. For a student with an Exam 1 score of 45 and an Exam 2 score of 85, you should
expect to see an admission probability of 0.774. Complete the code in ‘predict.m’ which
produces ‘1’ or ‘0’ and takes data set and thetas as parameters. Using ‘predict.m’, ‘ex2.m’ will
report training accuracy of classifier based on percentage of examples it classified correctly.