Machine Learning For Data Science
(50 marks)
Assignment Description:
In this assignment, you will explore the House Prices dataset, perform data
preprocessing, and build predictive models using machine learning and artificial
intelligence techniques. Your goal is to predict house prices based on various features.
Exploratory Data Analysis (EDA):
1. Explore the dataset's structure and statistics using appropriate Pandas functions.
2. Visualize the distribution of house prices using a histogram.
3. Investigate the relationships between features and house prices using scatter plots or
correlation matrices.
4. Identify the most important features that may affect house prices.
5. Provide insights and observations based on your EDA.
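The EDA steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the real House Prices file is not loaded, so a small synthetic DataFrame stands in for it, and the column names (area_sqft, bedrooms, age_years, price) and the helper plot_price_histogram are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the House Prices data (assumption: the real
# dataset would be loaded with pd.read_csv; these column names are made up).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "area_sqft": rng.uniform(500, 3500, n),
    "bedrooms": rng.integers(1, 6, n),
    "age_years": rng.uniform(0, 60, n),
})
df["price"] = (
    150 * df["area_sqft"]
    + 10_000 * df["bedrooms"]
    - 800 * df["age_years"]
    + rng.normal(0, 20_000, n)
)

# Step 1: structure and summary statistics.
df.info()
print(df.describe())

# Steps 3-4: correlation of each feature with the target,
# ordered by absolute strength.
corr_with_price = (
    df.corr()["price"].drop("price").sort_values(key=abs, ascending=False)
)
print(corr_with_price)


def plot_price_histogram(frame, column="price", bins=30):
    """Step 2: histogram of the target column (requires matplotlib)."""
    import matplotlib.pyplot as plt  # imported here so the rest runs without it

    ax = frame[column].plot.hist(bins=bins)
    ax.set_xlabel(column)
    ax.set_ylabel("count")
    plt.show()
    return ax
```

Calling plot_price_histogram(df) draws the price distribution. Correlation only captures linear relationships, so scatter plots of individual features against price are still worth inspecting before drawing conclusions.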
Model Building:
1. Select at least three different machine learning algorithms suitable for regression (e.g.,
Linear Regression, Decision Tree, Random Forest, etc.).
2. Train and evaluate each model using appropriate evaluation metrics (e.g., Mean
Absolute Error, R-squared, Root Mean Squared Error).
3. Implement hyperparameter tuning for one of the models to improve its performance.
4. Compare the performance of the models and select the best-performing one.
5. Visualize the predictions of the selected model against the actual house prices.
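One possible shape for the model-building steps above, using scikit-learn. This is a sketch, not a prescribed solution: the data comes from make_regression rather than the actual House Prices dataset, and the choice of models, parameter grid, and split sizes are assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data standing in for the preprocessed features/target.
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Step 1: three candidate regressors.
models = {
    "LinearRegression": LinearRegression(),
    "DecisionTree": DecisionTreeRegressor(random_state=0),
    "RandomForest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Step 2: fit each model and score it on the held-out test set.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "MAE": mean_absolute_error(y_test, pred),
        "RMSE": float(np.sqrt(mean_squared_error(y_test, pred))),
        "R2": r2_score(y_test, pred),
    }

# Step 3: tune one model with a small, illustrative grid search.
grid = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [None, 5]},
    scoring="neg_root_mean_squared_error",
    cv=3,
)
grid.fit(X_train, y_train)

# Step 4: compare models by test RMSE (lower is better).
for name, scores in sorted(results.items(), key=lambda kv: kv[1]["RMSE"]):
    print(name, {k: round(v, 3) for k, v in scores.items()})
print("Best random forest parameters:", grid.best_params_)
```

For step 5, a scatter plot of y_test against the chosen model's predictions, with a diagonal reference line, is a common way to show where the model over- or under-predicts.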
Task 4: Model Interpretability (5 marks)
1. Use model interpretability techniques (e.g., feature importance, SHAP values, or partial
dependence plots) to explain the predictions of the selected model.
2. Provide insights into which features have the most significant impact on house prices
according to your model.
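As a rough illustration of the interpretability task, the sketch below compares a random forest's built-in impurity-based feature importances with permutation importance from scikit-learn. The data is synthetic and the column names (area_sqft, bedrooms, noise_feature) are hypothetical; SHAP values or partial dependence plots would be equally valid choices.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data (assumption): price depends strongly on area, weakly on
# bedrooms, and not at all on noise_feature.
rng = np.random.default_rng(0)
n = 300
X = pd.DataFrame({
    "area_sqft": rng.uniform(500, 3500, n),
    "bedrooms": rng.integers(1, 6, n),
    "noise_feature": rng.normal(size=n),
})
y = 150 * X["area_sqft"] + 10_000 * X["bedrooms"] + rng.normal(0, 10_000, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Impurity-based importances learned during training (they sum to 1).
impurity_importance = pd.Series(
    model.feature_importances_, index=X.columns
).sort_values(ascending=False)

# Permutation importance: drop in test-set score when a column is shuffled.
perm = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)
perm_importance = pd.Series(
    perm.importances_mean, index=X.columns
).sort_values(ascending=False)

print(impurity_importance)
print(perm_importance)
```

Permutation importance is computed on held-out data, so it reflects what the model relies on when predicting unseen houses. Impurity-based importances are derived from the training data and can overstate high-cardinality features.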
Conclusion and Recommendations:
1. Summarize your findings from data preprocessing, EDA, and model building.
2. Discuss the strengths and limitations of your predictive model.
3. Provide recommendations for potential homebuyers or real estate professionals based
on your model's insights.
Submission Guidelines: