AI Lab Report (Detailed)
1. Introduction
In recent years, the importance of machine learning (ML) techniques in solving real-world
problems has increased dramatically. One of the fundamental tasks in ML is classification,
particularly binary classification, which deals with categorizing data into two distinct
groups. This laboratory practice aims to introduce students to the basic workflow of solving
a classification problem using Python and the scikit-learn library. By performing a binary
classification task using two models — Multilayer Perceptron (MLP) and Decision Tree —
students gain practical experience in training, validating, and evaluating machine learning
models. The primary dataset used in this practice is the Breast Cancer Wisconsin dataset, a
widely used benchmark in biomedical ML research. Through this exercise, we analyze the
impact of different validation strategies (holdout vs. k-fold cross-validation), input feature
sets, and random seed values on model performance. In total, 16 experimental
configurations are tested, offering a comprehensive view of how each factor influences
accuracy, recall, precision, and F1-score.
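As a brief illustration of these four metrics, the following sketch computes them with scikit-learn's metric functions; the label vectors y_true and y_pred are hypothetical placeholders and do not come from the experiments reported here.

    # Sketch: computing the four evaluation metrics with scikit-learn.
    # y_true and y_pred are hypothetical label vectors, not results from this report.
    from sklearn.metrics import accuracy_score, recall_score, precision_score, f1_score

    y_true = [0, 1, 1, 0, 1, 0, 1, 1]
    y_pred = [0, 1, 0, 0, 1, 1, 1, 1]

    print("accuracy :", accuracy_score(y_true, y_pred))
    print("recall   :", recall_score(y_true, y_pred))
    print("precision:", precision_score(y_true, y_pred))
    print("f1-score :", f1_score(y_true, y_pred))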
2. Dataset Description
The dataset used for this practice is the Breast Cancer Wisconsin dataset, included in the
scikit-learn library. It contains 569 instances, each with 30 numerical features extracted
from digitized images of breast mass tissue. These features describe characteristics of the
cell nuclei present in the image. The target variable is binary: in scikit-learn's encoding, '0' indicates a malignant tumor and '1' indicates a benign tumor. The dataset is relatively clean, with no missing
values, making it ideal for introductory machine learning experiments. Among the 30
features, we focus on a subset considered informative for the classification task:
- mean perimeter
- mean smoothness
- mean concave points
These features are chosen based on domain knowledge and previous studies showing their
relevance in tumor classification. The dataset exhibits a slight class imbalance (357 benign versus 212 malignant instances), which needs to be considered during evaluation.
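The snippet below is a minimal sketch of how the dataset can be loaded and inspected with scikit-learn; the variable names are illustrative and not taken from the practice script.

    # Sketch: loading the Breast Cancer Wisconsin dataset and inspecting it.
    import numpy as np
    from sklearn.datasets import load_breast_cancer

    data = load_breast_cancer()
    print(data.data.shape)           # (569, 30): instances x numerical features
    print(data.target_names)         # ['malignant' 'benign'] -> labels 0 and 1
    print(np.bincount(data.target))  # [212 357]: the slight class imbalance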
3. Data Preparation
Data preparation is a crucial step in the machine learning pipeline. The dataset is first
loaded and converted into a pandas DataFrame for easier manipulation. Labels are stored in
a separate Series. Two different feature sets are considered in this study:
1. A single feature: 'mean perimeter'
2. A combination of three features: 'mean perimeter', 'mean smoothness', and 'mean
concave points'
This allows us to study the effect of feature richness on model performance. The labels
remain unchanged, representing the binary classification goal. During preprocessing, the
data is not normalized or scaled, though in practice this could benefit algorithms such as
MLP. For holdout validation, the data is split using a 90/10 train-test ratio. For cross-
validation, 5-fold (K=5) splitting is used. The same seed values are applied consistently
across experiments to ensure reproducibility.
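A minimal sketch of this preparation follows, assuming illustrative variable names and a placeholder seed of 42 (the concrete seed values used in the experiments are not fixed here).

    # Sketch: DataFrame conversion, the two feature sets, the 90/10 holdout
    # split, and a 5-fold splitter. Names and the seed value are illustrative.
    import pandas as pd
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split, KFold

    data = load_breast_cancer()
    X = pd.DataFrame(data.data, columns=data.feature_names)
    y = pd.Series(data.target, name="target")

    features_1 = ["mean perimeter"]
    features_3 = ["mean perimeter", "mean smoothness", "mean concave points"]

    seed = 42  # the same seed is reused across experiments for reproducibility

    # Holdout validation: 90/10 train-test split.
    X_train, X_test, y_train, y_test = train_test_split(
        X[features_3], y, test_size=0.10, random_state=seed)

    # Cross-validation: 5-fold (K=5) splitting.
    kfold = KFold(n_splits=5, shuffle=True, random_state=seed)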
4. Experiment Design
To thoroughly assess the effect of model choice, validation method, input features, and seed values on model performance, 16 configurations are tested: two models (MLP and Decision Tree), two validation methods (90/10 holdout and 5-fold cross-validation), two feature sets (one feature and three features), and two seed values, giving 2 × 2 × 2 × 2 = 16 combinations.
Cross-validation metrics are averaged across all folds, while holdout metrics are based on a
single test split. The script generates confusion matrices for visual assessment and outputs
metrics to the console or log file.
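One possible way to organize the 16 runs is sketched below; the model settings and seed values are assumptions, and X, y, features_1, and features_3 follow the preparation sketch in Section 3.

    # Sketch: iterating over models, feature sets, and seeds; each combination
    # is evaluated with both holdout validation and 5-fold cross-validation,
    # giving the 16 experimental configurations.
    from itertools import product
    from sklearn.model_selection import train_test_split, cross_validate
    from sklearn.neural_network import MLPClassifier
    from sklearn.tree import DecisionTreeClassifier

    models = {"MLP": MLPClassifier(max_iter=1000),
              "DecisionTree": DecisionTreeClassifier()}
    feature_sets = {"1 feature": features_1, "3 features": features_3}
    seeds = [0, 42]  # illustrative seed values
    scoring = ["accuracy", "recall", "precision", "f1"]

    for (m_name, model), (f_name, feats), seed in product(
            models.items(), feature_sets.items(), seeds):
        model.set_params(random_state=seed)
        # Holdout: metrics from a single 90/10 test split.
        X_tr, X_te, y_tr, y_te = train_test_split(
            X[feats], y, test_size=0.10, random_state=seed)
        holdout_acc = model.fit(X_tr, y_tr).score(X_te, y_te)
        # 5-fold CV: metrics averaged across all folds.
        cv = cross_validate(model, X[feats], y, cv=5, scoring=scoring)
        print(m_name, f_name, seed,
              round(holdout_acc, 3), round(cv["test_accuracy"].mean(), 3))

Confusion matrices for each split could then be rendered with sklearn.metrics.ConfusionMatrixDisplay and saved as described in the Appendix.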
5. Experimental Results
The results of the 16 experiments are summarized in the table below. Each row
corresponds to one unique combination of model, validation method, feature set, and seed
value. The performance metrics are recorded to analyze how each variable affects
classification quality.
Based on these results, several trends can be observed and are discussed in the next section.
6. Analysis and Discussion
From the experiments, it becomes clear that the choice of input features has a major impact
on model performance. Using only 'mean perimeter' often leads to poor recall values,
especially when the data is split unfavorably due to the random seed. Adding two more
features significantly improves the classifier's ability to distinguish malignant cases.
Model comparison shows that the MLP generally outperforms the Decision Tree in terms of
precision and F1-score, particularly when cross-validation is used. This suggests that MLP is
better at generalizing, though it is also more sensitive to data scaling and convergence
settings.
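As one way to address the scaling sensitivity noted above (scaling was not applied in these experiments), the MLP could be wrapped in a pipeline with a standardizer. The following is a hedged sketch, not the configuration used in the practice.

    # Sketch: standardizing features before the MLP, which is sensitive to scale.
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    scaled_mlp = make_pipeline(StandardScaler(),
                               MLPClassifier(max_iter=1000, random_state=42))
    # scaled_mlp can be used wherever a bare MLPClassifier was used, e.g.
    # scaled_mlp.fit(X_train, y_train) or passed to cross_validate.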
Regarding validation, 5-Fold Cross-Validation provides more stable and reliable results
compared to holdout validation. The variability introduced by different seed values is
mitigated in cross-validation, which helps in obtaining a more accurate performance
estimate.
Finally, different seed values affect both training/test splits and model initialization
(especially for MLP). While the effect is sometimes small, in borderline cases it can
significantly impact results.
7. Conclusions
This laboratory practice demonstrated the fundamental steps in approaching a binary
classification problem using machine learning. The impact of model selection, validation
strategy, feature richness, and random seed was studied through systematic
experimentation.
Future improvements may include feature scaling, hyperparameter tuning, and use of
ensemble methods.
8. Appendix
Additional outputs such as confusion matrices, learning curves, and logs can be found in the
output_images directory or captured via script output. The generated figures follow these naming patterns, where X denotes the fold index and Y the seed value:
- confusion_matrix_fold_X_seed_Y.png
- confusion_matrix_retention_seed_Y.png
These figures complement the numerical evaluation and help assess which combinations of
parameters produce more confident and accurate models.