Capstone Project Guidelines
A capstone project involves applying your knowledge to analyze a given dataset. You will
conduct extensive research, use critical thinking, and apply practical skills to derive meaningful
insights and solutions. This project will demonstrate your expertise in data analysis and your
ability to tackle real-world problems.
2. Data Visualization
- Task 6: Create visualizations to understand the distribution of numerical features (e.g.,
histograms, box plots).
- Task 7: Create visualizations for categorical features (e.g., bar charts, pie charts).
- Task 8: Generate correlation heatmaps to identify relationships between numerical features.
- Task 9: Use pair plots to visualize relationships between features.
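The visualization tasks above can be sketched as follows. This is a minimal illustration using only pandas and matplotlib on synthetic data; the column names (age, income, segment, spend) are hypothetical placeholders, not part of the project dataset. Libraries such as seaborn offer similar plots with less code.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs as a script; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in data with hypothetical columns; replace with the project dataset.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.integers(18, 70, size=200),
    "income": rng.normal(50_000, 12_000, size=200),
    "segment": rng.choice(["A", "B", "C"], size=200),
})
df["spend"] = 0.1 * df["income"] + rng.normal(0, 500, size=200)

numeric_cols = df.select_dtypes(include="number").columns.tolist()

# Task 6: histogram and box plot for one numerical feature.
fig_dist, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(10, 4))
df["income"].plot.hist(bins=20, ax=ax_hist, title="income: histogram")
df.boxplot(column="income", ax=ax_box)

# Task 7: bar chart of category counts for a categorical feature.
fig_cat, ax_cat = plt.subplots()
df["segment"].value_counts().plot.bar(ax=ax_cat, title="segment counts")

# Task 8: correlation heatmap of the numerical features.
corr = df[numeric_cols].corr()
fig_corr, ax_corr = plt.subplots()
im = ax_corr.imshow(corr.to_numpy(), vmin=-1, vmax=1, cmap="coolwarm")
ax_corr.set_xticks(range(len(numeric_cols)))
ax_corr.set_xticklabels(numeric_cols, rotation=45)
ax_corr.set_yticks(range(len(numeric_cols)))
ax_corr.set_yticklabels(numeric_cols)
fig_corr.colorbar(im, ax=ax_corr)

# Task 9: pair plot (scatter matrix) of the numerical features.
axes_pair = pd.plotting.scatter_matrix(df[numeric_cols], figsize=(8, 8), diagonal="hist")
```

In a notebook, each figure renders inline once the cell finishes; a pie chart for Task 7 could be drawn by replacing plot.bar with plot.pie on the same value counts.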
3. Feature Engineering
- Task 10: Create new features that might be useful for the analysis (e.g., date-related features
from timestamps, interaction terms).
- Task 11: Standardize or normalize numerical features if needed.
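As a rough sketch of the feature engineering tasks above, the following derives date-related features from a timestamp column, builds one interaction term, and standardizes the numerical columns with scikit-learn's StandardScaler. The tiny DataFrame and its column names (order_time, price, quantity) are assumptions made for illustration only.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical example data; replace with the project dataset.
df = pd.DataFrame({
    "order_time": pd.to_datetime([
        "2024-01-05 09:30", "2024-02-17 18:45",
        "2024-03-03 12:00", "2024-03-09 23:15",
    ]),
    "price": [10.0, 20.0, 15.0, 30.0],
    "quantity": [1, 3, 2, 4],
})

# Task 10: date-related features from the timestamp.
df["order_month"] = df["order_time"].dt.month
df["order_dayofweek"] = df["order_time"].dt.dayofweek  # Monday=0 ... Sunday=6
df["order_hour"] = df["order_time"].dt.hour
df["is_weekend"] = df["order_dayofweek"] >= 5

# Task 10: an interaction term between two numerical features.
df["price_x_quantity"] = df["price"] * df["quantity"]

# Task 11: standardize numerical features to zero mean and unit variance.
num_cols = ["price", "quantity", "price_x_quantity"]
scaler = StandardScaler()
scaled = pd.DataFrame(
    scaler.fit_transform(df[num_cols]),
    columns=[f"{c}_scaled" for c in num_cols],
    index=df.index,
)
df = pd.concat([df, scaled], axis=1)
```

When a model is trained afterwards, fitting the scaler on the training split only and then applying it to the test split avoids leaking test-set statistics into training.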
4. Model Building
- Task 12: Split the dataset into training and testing sets.
- Task 13: Train a simple linear regression model (if the task is regression) or a logistic
regression model (if the task is classification).
- Task 14: Evaluate the model performance using appropriate metrics (e.g., RMSE for
regression, accuracy/F1-score for classification).
- Task 15: Experiment with at least two other algorithms (e.g., decision tree, random forest,
k-nearest neighbors) and compare their performance.
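The model building tasks above might look like the following for a classification problem. This is a sketch under stated assumptions: the data is synthetic (generated with make_classification), and the 80/20 split, model settings, and the results dictionary are illustrative choices rather than requirements of the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data; replace with the project's features and target.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=42
)

# Task 12: hold out 20% of the rows for testing, preserving class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Task 13 (baseline) and Task 15 (alternative algorithms).
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=42),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "knn": KNeighborsClassifier(n_neighbors=5),
}

# Task 14: evaluate every model on the same held-out test set.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
    }

for name, scores in results.items():
    print(f"{name:20s} accuracy={scores['accuracy']:.3f} f1={scores['f1']:.3f}")
```

For a regression task, the same structure applies with LinearRegression as the baseline and RMSE (the square root of mean_squared_error) as the metric.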
5. Model Tuning
- Task 16: Perform hyperparameter tuning using GridSearchCV or RandomizedSearchCV.
- Task 17: Evaluate and compare the tuned models’ performance.
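One possible shape for the tuning tasks above, using GridSearchCV on a random forest. The parameter grid, the F1 scoring choice, and the synthetic data are assumptions for illustration; RandomizedSearchCV accepts a similar setup with param_distributions and n_iter in place of the grid.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data; replace with the project's features and target.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Untuned baseline for comparison.
baseline = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Task 16: exhaustive search over a small, illustrative grid with 3-fold CV.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [None, 5],
    "min_samples_leaf": [1, 3],
}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="f1",
    cv=3,
)
search.fit(X_train, y_train)

# Task 17: compare baseline and tuned models on the same held-out test set.
baseline_f1 = f1_score(y_test, baseline.predict(X_test))
tuned_f1 = f1_score(y_test, search.best_estimator_.predict(X_test))

print("best params:", search.best_params_)
print(f"baseline F1={baseline_f1:.3f}  tuned F1={tuned_f1:.3f}")
```

The search uses only the training split, so the test set remains an unbiased check; a tuned model is not guaranteed to beat the baseline on a small grid.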
7. Reporting
- Task 21: Summarize the findings and results in a Jupyter Notebook (.ipynb file), including
visualizations and explanations.
- Task 22: Create a final report or presentation summarizing the entire process and key insights,
integrating Power BI visualizations.
Note: Submit the project in .ipynb format along with the final report or presentation file.