hw1 1

hw

Uploaded by

Arshilgeni

COSC435 Homework 1

Due date: September 30, 2024 at 11:00AM.

Assignment: Building and Evaluating a Logistic Regression Model

Part 1: Tutorial Walkthrough

The first part of the assignment ensures that you work through the tutorial on logistic regression. The
tasks below refer to the relevant sections of the tutorial.

Task 1: Tutorial Review and Model Building

1. Review the Tutorial:

o Go through the Real Python tutorial on Logistic Regression. Pay special attention to the
sections on data preparation, logistic regression implementation, and model evaluation.

2. Dataset Selection:

o Use scikit-learn's Iris dataset or the Titanic dataset (or any other dataset of your choice that
is suitable for classification).

3. Logistic Regression Implementation:

o Write Python code that loads the dataset and applies logistic regression as explained in
the tutorial.

o Preprocess the data by splitting it into training and testing sets.

o Train the logistic regression model and evaluate it using common metrics like accuracy,
confusion matrix, precision, recall, and F1 score.

4. Code Submission:

o Ensure that your code is well-documented and clearly structured.

o Include a summary explaining each section of the code and what was learned from the
tutorial.
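
The workflow in item 3 above (load, split, train, evaluate) might look roughly like the following. This is a minimal sketch, not the tutorial's own code: it assumes the Iris dataset, a 25% test split, and macro-averaged precision, recall, and F1 (Iris has three classes, so an averaging mode has to be chosen). The `random_state` and `max_iter` values are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)
from sklearn.model_selection import train_test_split

# Load the Iris data: 150 samples, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out 25% of the rows for testing; stratify keeps the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Train the model (max_iter raised so the solver has room to converge).
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Common evaluation metrics; "macro" averages the per-class scores equally.
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
precision = precision_score(y_test, y_pred, average="macro")
recall = recall_score(y_test, y_pred, average="macro")
f1 = f1_score(y_test, y_pred, average="macro")

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
print(cm)  # rows are true classes, columns are predicted classes
```

For a binary dataset such as Titanic, the `average` argument can be dropped, and the metrics then refer to the positive class.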

Part 2: Expanding the Assignment

Task 2: Model Evaluation and Interpretation

This part expands the assignment to ensure you understand the fundamental aspects of logistic
regression, such as interpretation, feature importance, and model evaluation.

1. Coefficient Interpretation:
o After training the logistic regression model, interpret the coefficients (weights). Explain
what each coefficient means in the context of the data (e.g., what impact does each
feature have on the target variable?).

2. ROC Curve and AUC Score:

o Plot the ROC curve and compute the AUC score for your logistic regression model.

o Briefly describe what the AUC score represents and how the model’s performance can
be interpreted from it.

3. Hyperparameter Tuning:

o Use GridSearchCV or RandomizedSearchCV to tune the hyperparameters of the logistic
regression model (e.g., regularization strength, solver).

o Discuss the effect of different hyperparameters on model performance.

4. Error Analysis:

o Generate a confusion matrix and discuss the types of errors your model makes (false
positives, false negatives).

o Suggest strategies for improving the model (e.g., adjusting the decision threshold,
feature engineering, or collecting more data).
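
The four items of Task 2 can be sketched in one pass, as below. This is only an illustration under stated assumptions: scikit-learn's bundled breast cancer dataset stands in for your own binary dataset, features are standardized so that coefficient magnitudes are comparable, and the parameter grid and the 0.7 threshold are arbitrary example values rather than recommended settings.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score, roc_curve
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: a binary dataset bundled with scikit-learn stands in for yours.
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.25, stratify=data.target, random_state=0
)

# Standardizing puts all features on one scale before fitting.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000))

# Hyperparameter tuning: C is the inverse of the regularization strength.
param_grid = {
    "logisticregression__C": [0.01, 0.1, 1.0, 10.0],
    "logisticregression__solver": ["lbfgs", "liblinear"],
}
search = GridSearchCV(pipe, param_grid, cv=5, scoring="roc_auc")
search.fit(X_train, y_train)
best = search.best_estimator_
print("best parameters:", search.best_params_)

# Coefficient interpretation: exp(coef) is the factor by which the odds of
# class 1 change for a one-standard-deviation increase in that feature.
coefs = best.named_steps["logisticregression"].coef_[0]
odds_ratios = np.exp(coefs)
for i in np.argsort(np.abs(coefs))[::-1][:5]:
    print(f"{data.feature_names[i]:<25} coef={coefs[i]:+.2f} "
          f"odds ratio={odds_ratios[i]:.2f}")

# ROC curve and AUC, computed from predicted probabilities of class 1.
proba = best.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, proba)
auc = roc_auc_score(y_test, proba)
print(f"AUC = {auc:.3f}")

# Error analysis: rows are true classes, columns are predicted classes,
# so cm[0, 1] counts false positives and cm[1, 0] counts false negatives.
cm_default = confusion_matrix(y_test, (proba >= 0.5).astype(int))
cm_strict = confusion_matrix(y_test, (proba >= 0.7).astype(int))
print(cm_default)
print(cm_strict)
```

The `fpr` and `tpr` arrays can be passed to a plotting library to draw the ROC curve. Comparing `cm_default` with `cm_strict` shows the trade-off behind threshold adjustment: raising the threshold can only reduce false positives and can only increase false negatives.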

Part 3: Additional Verification Tasks

To ensure you have grasped key concepts, include the following tasks:

Task 3: Conceptual Understanding and Verbal Explanation

1. Explain Logistic Regression in Your Own Words:

o Write a 500-word summary explaining what logistic regression is, how it works
mathematically, and where it is best applied.

2. Comparison with Other Models:

o Compare logistic regression to another classification model (e.g., decision trees or
support vector machines). Highlight their differences in terms of interpretability,
performance, and real-world applications.

Task 4: Model Expansion (Optional)

1. Multiclass Logistic Regression (Optional):

o Expand your binary classification model to handle multiclass classification (using the
multinomial option in scikit-learn’s logistic regression).

o Explain how logistic regression can be adapted for multiclass problems.
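
As a starting point for the optional multiclass task, the sketch below contrasts the two usual adaptations: a single multinomial (softmax) model and a one-vs-rest ensemble of binary models. It assumes the Iris dataset and relies on recent scikit-learn versions fitting the multinomial model by default when the target has more than two classes and the solver is `lbfgs`; older versions exposed this through an explicit option, so check the documentation for your installed version.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Multinomial model: one weight vector per class, trained jointly.
# Assumption: the installed scikit-learn uses softmax here by default.
softmax_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# One-vs-rest: one independent binary classifier per class.
ovr_model = OneVsRestClassifier(LogisticRegression(max_iter=1000))
ovr_model.fit(X_train, y_train)

# Softmax by hand: exponentiate the per-class scores and normalize each row.
scores = softmax_model.decision_function(X_test)
shifted = np.exp(scores - scores.max(axis=1, keepdims=True))
manual_proba = shifted / shifted.sum(axis=1, keepdims=True)
proba = softmax_model.predict_proba(X_test)

print("coefficient matrix shape:", softmax_model.coef_.shape)
print("softmax accuracy:", softmax_model.score(X_test, y_test))
print("one-vs-rest accuracy:", ovr_model.score(X_test, y_test))
```

In the multinomial model the class probabilities come from a softmax over the linear scores, which is why `manual_proba` should match `predict_proba`; in one-vs-rest, each class gets its own sigmoid model and the results are combined afterwards.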


Submission Requirements

• Jupyter Notebook with well-commented Python code for each task.

• A PDF report that includes answers to the interpretive and conceptual questions (Task 3).

• Visualizations (e.g., ROC curve, confusion matrix) included in both the notebook and report.

Grading Rubric

1. Code Implementation (50%):

o Correct implementation of logistic regression and associated tasks.

o Clean and well-documented code.

2. Model Evaluation and Interpretation (30%):

o Correct interpretation of coefficients, ROC, AUC, and confusion matrix.

o Insightful hyperparameter tuning and error analysis.

3. Conceptual Understanding (20%):

o Well-explained summary of logistic regression and comparison to other models.

Extension Ideas (optional):

To further verify that you understand logistic regression deeply:

• Regularization Task: Fit the model with both L1 and L2 regularization and compare the
results.

• Cross-validation Task: Perform k-fold cross-validation to evaluate model
robustness.

• Ethical Considerations Task: Choose a case study where logistic regression is
applied (e.g., in credit scoring or medical diagnoses) and discuss the ethical implications
of the model's decisions.
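
The regularization and cross-validation ideas above can be combined in one short experiment. The sketch below is an illustration only: it assumes scikit-learn's breast cancer dataset, 5 stratified folds, the `liblinear` solver (which supports both penalties), and an arbitrary `C=0.1`. It uses the `penalty` argument; parameter names in this area have changed across scikit-learn releases, so check the documentation for your installed version.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# 5-fold cross-validation with shuffling; each fold keeps the class balance.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for penalty in ("l1", "l2"):
    clf = LogisticRegression(penalty=penalty, C=0.1, solver="liblinear")
    pipe = make_pipeline(StandardScaler(), clf)

    # Accuracy on each held-out fold; the spread indicates robustness.
    scores = cross_val_score(pipe, X, y, cv=cv, scoring="accuracy")

    # Refit on all data to count coefficients driven exactly to zero.
    pipe.fit(X, y)
    coef = pipe.named_steps["logisticregression"].coef_
    n_zero = int(np.sum(coef == 0))

    results[penalty] = (scores.mean(), scores.std(), n_zero)
    print(f"{penalty}: mean accuracy={scores.mean():.3f} "
          f"std={scores.std():.3f} zero coefficients={n_zero}")
```

L1 regularization tends to set some coefficients exactly to zero, which acts as a form of feature selection, while L2 shrinks coefficients toward zero without eliminating them.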

Submission
Submit your Jupyter notebook and PDF report as described above. This homework is to be done individually.
