Project Synopsis of Student Dropout Prediction
SIMULATION
Title: Predicting Student Dropout Rates Using Decision Tree Model
Project Synopsis
MUHAMMAD ALI
2022-KIU-BS2344
Contents
Project Synopsis
Title: Predicting Student Dropout Rates Using Decision Tree Model
Introduction
Objectives
Methodology
1. Data Collection
2. Data Preprocessing
3. Exploratory Data Analysis (EDA)
4. Model Building
5. Model Evaluation
Data Analysis
Scatter Plot
Bar Chart
Box Plot
Histogram
Results
Conclusion
Future Work
Practical Work
Table of Figures
Figure 1: Scatter Plot
Figure 2: Bar Chart
Figure 3: Box Plot
Figure 4: Histogram
Project Synopsis
Introduction
Objectives
To perform exploratory data analysis (EDA) on the dataset to identify patterns and
insights.
To preprocess the data by handling missing values and encoding categorical variables.
To build a decision tree model to predict student dropout rates.
To evaluate the model's performance using accuracy and other relevant metrics.
Methodology
1. Data Collection: The dataset, downloaded from Kaggle, contains various features
related to student performance and demographics.
2. Data Preprocessing:
o Handled missing values by dropping rows with missing data.
o Encoded categorical variables using label encoding.
o Split the data into training and testing sets with an 80-20 ratio.
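The three preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook code: the small DataFrame is invented toy data, only the column names "Marital status", "Curricular units 2nd sem (grade)" and "Target" come from this synopsis, and the list of categorical columns and the random_state value are assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Toy stand-in for the Kaggle data: every value here is invented for illustration.
df = pd.DataFrame({
    "Marital status": ["single", "married", "single", None, "single", "married",
                       "single", "single", "married", "single", "single", "married"],
    "Curricular units 2nd sem (grade)": [12.5, 0.0, 13.1, 11.0, None, 10.2,
                                         14.0, 0.0, 12.0, 11.5, 13.3, 9.8],
    "Target": ["Graduate", "Dropout", "Graduate", "Graduate", "Dropout", "Dropout",
               "Graduate", "Dropout", "Graduate", "Graduate", "Graduate", "Dropout"],
})

# Step 1: drop every row that has at least one missing value.
clean = df.dropna().reset_index(drop=True)

# Step 2: label-encode the categorical columns (listed by hand; an assumption).
categorical_cols = ["Marital status", "Target"]
encoders = {}
for col in categorical_cols:
    encoder = LabelEncoder()
    clean[col] = encoder.fit_transform(clean[col])
    encoders[col] = encoder  # kept so integer codes can be mapped back to labels

# Step 3: split features and target into training and testing sets (80-20).
X = clean.drop(columns="Target")
y = clean["Target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```

Dropping rows is the simplest way to handle missing data, but it discards information; imputation is a possible alternative if many rows are affected.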
3. Exploratory Data Analysis (EDA):
o Created scatter plots, bar charts, box plots, and histograms to understand the data
distribution and relationships between variables.
4. Model Building:
o Utilized a decision tree classifier to build the model.
o Trained the model on the training data.
5. Model Evaluation:
o Evaluated the model using accuracy, precision, recall, and F1-score.
o Compared the predicted values against the actual values in the test set.
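The model building and evaluation steps might look roughly like the sketch below. It is written under stated assumptions: the data is synthetic (generated with NumPy, not the Kaggle dataset), the target is treated as binary (1 = dropout, 0 = not dropout), and the feature names, max_depth=3 and random_state values are illustrative choices, not settings taken from the project.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): two numeric features and a binary label.
rng = np.random.default_rng(0)
n_students = 200
grade = rng.uniform(0, 20, size=n_students)    # hypothetical semester grade
units = rng.integers(0, 7, size=n_students)    # hypothetical units enrolled

# Invented rule: low grades mean dropout (1), with about 10% label noise.
dropout = (grade < 10).astype(int)
flip = rng.random(n_students) < 0.1
y = np.where(flip, 1 - dropout, dropout)
X = np.column_stack([grade, units])

# 80-20 split, stratified so both classes appear in train and test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Build and train the decision tree classifier on the training data.
model = DecisionTreeClassifier(max_depth=3, random_state=42)
model.fit(X_train, y_train)

# Compare predictions against the actual test labels.
y_pred = model.predict(X_test)
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

If the target has more than two classes, precision, recall and F1 need an averaging strategy (for example average="macro"), which is a design choice this sketch leaves open.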
Data Analysis
Scatter Plot: Visualized the relationship between "Curricular units 2nd sem (grade)"
and "Target" to observe trends. See Figure 1: Scatter Plot.
Bar Chart: Analyzed the distribution of "Marital status" to explore its relationship
with dropout rates. See Figure 2: Bar Chart.
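The scatter plot and bar chart described above could be produced along these lines. This is only a sketch: the DataFrame holds invented toy values, the column names are the ones quoted in this synopsis, and the figure layout is an assumption rather than a reproduction of Figures 1 and 2.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Toy stand-in data: values are invented for illustration only.
df = pd.DataFrame({
    "Marital status": ["single", "married", "single", "single",
                       "married", "single", "divorced", "single"],
    "Curricular units 2nd sem (grade)": [12.5, 0.0, 13.1, 11.0, 10.2, 14.0, 0.0, 12.0],
    "Target": ["Graduate", "Dropout", "Graduate", "Graduate",
               "Dropout", "Graduate", "Dropout", "Graduate"],
})

fig, (ax_scatter, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))

# Scatter plot: second-semester grade against the target label.
ax_scatter.scatter(
    df["Curricular units 2nd sem (grade)"].to_numpy(),
    df["Target"].tolist(),
)
ax_scatter.set_xlabel("Curricular units 2nd sem (grade)")
ax_scatter.set_ylabel("Target")
ax_scatter.set_title("Scatter Plot")

# Bar chart: number of students in each marital status category.
counts = df["Marital status"].value_counts()
ax_bar.bar(list(counts.index), counts.to_numpy())
ax_bar.set_xlabel("Marital status")
ax_bar.set_ylabel("Number of students")
ax_bar.set_title("Bar Chart")

fig.tight_layout()
```

In a notebook environment, the backend line can be dropped and plt.show() called to display the figure inline.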
Figure 4: Histogram
Results
Conclusion
The decision tree model demonstrated reasonable accuracy in predicting student dropout rates.
The model's performance can be further improved by refining the data preprocessing steps, using
more sophisticated modeling techniques, and incorporating additional relevant features. This
project highlights the importance of data analysis and machine learning in addressing educational
challenges.
Future Work
Practical Work
Google Colab Notebook on Student Dropout Prediction