Assignment_2
Astana 2025
SUBMISSION: Submit your work to the Moodle platform as a single ZIP file containing the
report, the code, and the dataset.
COLLABORATION: Make certain that you understand the course collaboration policy,
described on the course website. You must complete this assignment individually; you are not
allowed to collaborate with anyone else. You may discuss the homework to understand the
problems and the mathematics behind the various learning algorithms, but you are not allowed to
share problem solutions or your code with any other students.
DESCRIPTION
This assignment focuses on improving the performance of classification models using
various machine learning techniques. It involves building, tuning, and evaluating models such as
Decision Trees, Random Forests, and K-Nearest Neighbors (KNN). You are expected to apply
preprocessing steps, including handling missing values, feature scaling, and encoding, and then
develop and optimize the models using tools from the Scikit-learn library.
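The preprocessing steps named above can be sketched as a single Scikit-learn pipeline. This is a minimal illustration, not a required solution: the toy DataFrame, its column names (`age`, `income`, `city`, `target`), and the choice of median and most-frequent imputation are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data; replace with your own dataset.
df = pd.DataFrame({
    "age": [25, 32, np.nan, 47, 51, 38, 29, np.nan],
    "income": [30.0, 42.5, 38.0, np.nan, 80.0, 55.0, 33.0, 61.0],
    "city": ["Astana", "Almaty", "Astana", "Astana",
             "Almaty", "Shymkent", "Astana", "Almaty"],
    "target": [0, 0, 0, 1, 1, 1, 0, 1],
})
X = df.drop(columns="target")
y = df["target"]

numeric_features = ["age", "income"]
categorical_features = ["city"]

# Numeric columns: fill missing values with the median, then standardize.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical columns: fill gaps with the most frequent value, then one-hot encode.
categorical_steps = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

preprocess = ColumnTransformer([
    ("num", numeric_steps, numeric_features),
    ("cat", categorical_steps, categorical_features),
])

# Chaining preprocessing and the classifier keeps both inside one estimator.
model = Pipeline([
    ("preprocess", preprocess),
    ("knn", KNeighborsClassifier(n_neighbors=3)),
])
model.fit(X, y)
preds = model.predict(X)

X_prepared = model.named_steps["preprocess"].transform(X)
print(X_prepared.shape)
```

Keeping the imputer, scaler, and encoder inside the pipeline means they are refitted on each training fold during cross-validation, which avoids leaking information from the validation fold.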
Key concepts such as hyperparameter tuning (using GridSearchCV or
RandomizedSearchCV), cross-validation, and ensemble learning are explored. A stacking
ensemble model is then constructed so that the performance of the combined model can be
compared against that of the individual learners.
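One possible way to set up that comparison is sketched below. It assumes synthetic data from `make_classification`, untuned base learners, and a logistic regression as the final estimator; none of these choices is prescribed by the assignment.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real dataset (an assumption for this sketch).
X, y = make_classification(n_samples=300, n_features=10,
                           n_informative=5, random_state=0)

# The three individual learners named in the assignment.
base_learners = {
    "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    # KNN is distance-based, so it is wrapped with a scaler.
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

# The stacking model trains a final estimator on the base learners'
# out-of-fold predictions (cv=5 controls those internal folds).
stack = StackingClassifier(
    estimators=list(base_learners.items()),
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

# Score every model with the same 5-fold cross-validation.
candidates = {**base_learners, "stacking": stack}
scores = {
    name: cross_val_score(est, X, y, cv=5, scoring="accuracy").mean()
    for name, est in candidates.items()
}

for name, score in scores.items():
    print(f"{name}: mean CV accuracy = {score:.3f}")
```

Stacking is not guaranteed to beat the best individual learner, so the report should state the observed scores rather than assume an improvement.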
EVALUATION CRITERIA:
The following evaluation criteria provide a detailed description of how the assignment will
be evaluated. Each section of the assignment will be evaluated based on its technical correctness,
completeness and clarity of presentation. The grading system below assigns weights to each
component, providing a balanced assessment of both the technical implementation and the
quality of the report.
General Instructions:
For each task, provide the code, visualizations (where appropriate), and a detailed report on the results.
Use the Scikit-learn library to implement models and evaluate their performance.
Use GridSearchCV or RandomizedSearchCV for hyperparameter tuning.
Use cross-validation to validate model results and to detect overfitting.
Evaluate the models with several metrics, such as accuracy, F1-score, the confusion matrix, and ROC-AUC.
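The tuning and evaluation instructions above can be combined as in the following sketch. The synthetic dataset, the parameter grid, the F1 scoring choice, and the 75/25 split are assumptions for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             f1_score, roc_auc_score)
from sklearn.model_selection import (GridSearchCV, StratifiedKFold,
                                     train_test_split)
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (an assumption for this sketch).
X, y = make_classification(n_samples=400, n_features=10,
                           n_informative=5, random_state=0)

# Hold out a test set that the search never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical grid; widen or narrow it for the real task.
param_grid = {
    "max_depth": [3, 5, None],
    "min_samples_leaf": [1, 5, 10],
}

# Stratified 5-fold cross-validation on the training portion only.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=cv,
    scoring="f1",
)
search.fit(X_train, y_train)

# Evaluate the refitted best model on the held-out test set.
best = search.best_estimator_
y_pred = best.predict(X_test)
y_proba = best.predict_proba(X_test)[:, 1]  # probability of class 1

acc = accuracy_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
auc = roc_auc_score(y_test, y_proba)

print("best params:", search.best_params_)
print("mean CV F1 of best params:", round(search.best_score_, 3))
print("test accuracy:", round(acc, 3))
print("test F1:", round(f1, 3))
print("test ROC-AUC:", round(auc, 3))
print("confusion matrix:\n", cm)
```

A large gap between the cross-validated score and the test score is one sign of overfitting worth discussing in the report. `RandomizedSearchCV` can be swapped in when the grid is too large to search exhaustively.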