CA One 2024
Assessment Number: 1
Individual/Group: Individual
1. Analyze a dataset from a problem domain in depth, and select appropriate statistical models,
tools, and techniques to derive insights regarding the dataset and domain.
3. Construct, refine, interpret, and critically evaluate predictive analytical and machine learning
models.
4. Critically evaluate and utilize hyperparameter search strategies for optimizing machine
learning models.
Supervised Machine Learning – Classification (100 Marks)
Dataset
Relevant information about this dataset is given below:
Task
The bank wants to use a classification model that can predict whether a client has subscribed
to a term deposit. Construct a suitable classification model for the bank by implementing both
random forest and support vector classification algorithms in Python.
In addition to providing the Python code file, you are required to provide a critical analysis of
your approach and results in a PDF report.
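As a starting point, the two required algorithms can be fitted side by side. The sketch below is only illustrative: it assumes scikit-learn is available and uses a synthetic, imbalanced dataset from make_classification as a stand-in for the bank data, which is not reproduced here. The sample size, split ratio, and default settings are assumptions, not requirements of this assessment.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the bank dataset (assumption: binary target, minority "yes" class).
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.85, 0.15], random_state=0
)

# Stratify so both splits keep the same class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    # Tree ensembles do not need feature scaling.
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    # SVC is distance-based, so features are standardised inside the pipeline.
    "svc": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}

test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)
    print(name, round(test_accuracy[name], 3))
```

Accuracy is printed here only because it is the default of score(); on an imbalanced target it can be misleading, which is why the choice of evaluation metric is assessed separately.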
Your code and analysis should cover the following points:
1. Data Preparation (What steps would you take to prepare your data? Discuss your approach)
[20]
2. Model Hyperparameter Tuning (Which hyperparameters would you tune and why?
How would you tune them?) [20]
3. Choice of Evaluation Metric (Which metric would be suitable for model evaluation and
why?) [20]
5. Results analysis
a) Which of the two models (random forest or support vector classifier) would you
recommend for deployment in the real world?
b) Is either model underfitting? If yes, what could be the possible reasons?
[20]
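The points above can be tied together in one workflow. The following is a minimal sketch, not a model answer: it assumes scikit-learn and pandas, and it uses a small synthetic table whose column names (age, balance, job) and target rule are hypothetical placeholders for the real dataset. The hyperparameter grids and the use of F1 as the scoring metric are example choices that would still need to be justified in the report.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC

# Hypothetical stand-in data: two numeric columns and one categorical column.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "age": rng.integers(18, 80, n),
    "balance": rng.normal(1000, 500, n),
    "job": rng.choice(["admin", "technician", "retired"], n),
})
# Hypothetical target rule, used only so the example has something to learn.
y = ((df["balance"] > 1300) | ((df["job"] == "retired") & (df["age"] > 60))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    df, y, test_size=0.25, stratify=y, random_state=0
)

# Data preparation: scale numeric features, one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["age", "balance"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["job"]),
])

# Example grids only; the values are assumptions, not recommended settings.
candidates = {
    "random_forest": (
        RandomForestClassifier(random_state=0),
        {"model__n_estimators": [50, 100], "model__max_depth": [3, None]},
    ),
    "svc": (
        SVC(),
        {"model__C": [0.1, 1, 10], "model__gamma": ["scale", 0.1]},
    ),
}

results = {}
for name, (estimator, grid) in candidates.items():
    pipe = Pipeline([("prep", preprocess), ("model", estimator)])
    # 5-fold cross-validated grid search, selecting on F1 of the positive class.
    search = GridSearchCV(pipe, grid, scoring="f1", cv=5)
    search.fit(X_train, y_train)
    results[name] = {
        "best_params": search.best_params_,
        "train_f1": f1_score(y_train, search.predict(X_train)),
        "test_f1": f1_score(y_test, search.predict(X_test)),
    }
    print(name, results[name])
```

Comparing train_f1 with test_f1 gives a first indication of fit: low scores on both suggest underfitting, while a high training score with a much lower test score suggests overfitting.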
Naming convention:
The report should be named as follows:
Report_Firstname_Surname.pdf
There is no prescribed word count for the report. It will be assessed on the quality, not the
quantity, of its content.
Assessment Criteria
Each part will be graded according to the following criteria:
1. Quality of code (correctness and completeness) [Weightage – 40%]