Machine Learning For Data Science

Uploaded by

Rishu Raj

Machine Learning for Data Science

(50 marks)

Assignment Description:

In this assignment, you will explore the House Prices dataset, perform data
preprocessing, and build predictive models using machine learning and artificial
intelligence techniques. Your goal is to predict house prices based on various features.

Task 1: Data Preprocessing (10 marks)

1. Load the "house-prices.csv" dataset into a Pandas DataFrame.
2. Perform data cleaning, including handling missing values, outliers, and duplicates.
3. Encode categorical variables using one-hot encoding or label encoding.
4. Preprocess numerical features, for example by scaling or transforming them.
5. Split the dataset into training and testing sets (e.g., 80% training, 20% testing).
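
The preprocessing steps above could be sketched roughly as follows. This is a minimal illustration, not a required solution: the column names (`area`, `bedrooms`, `neighborhood`, `price`) are assumptions, and a small synthetic DataFrame stands in for the result of `pd.read_csv("house-prices.csv")`, whose actual columns are not given here.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for pd.read_csv("house-prices.csv").
# All column names below are hypothetical.
rng = np.random.default_rng(0)
n = 100
df = pd.DataFrame({
    "area": rng.normal(1500, 300, n),
    "bedrooms": rng.integers(1, 6, n).astype(float),
    "neighborhood": rng.choice(["north", "south", "east"], n),
})
df["price"] = 100 * df["area"] + 5000 * df["bedrooms"] + rng.normal(0, 10000, n)
df.loc[df.index[7::10], "bedrooms"] = np.nan            # inject missing values
df = pd.concat([df, df.iloc[:5]], ignore_index=True)    # inject duplicate rows

# Cleaning: drop duplicates, impute missing values with the median,
# and remove price outliers with the 1.5 * IQR rule.
df = df.drop_duplicates().reset_index(drop=True)
df["bedrooms"] = df["bedrooms"].fillna(df["bedrooms"].median())
q1, q3 = df["price"].quantile([0.25, 0.75])
iqr = q3 - q1
df = df[df["price"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)]

# Split 80/20 into training and testing sets.
X = df.drop(columns="price")
y = df["price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Scale numerical columns and one-hot encode the categorical column.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["area", "bedrooms"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["neighborhood"]),
])
X_train_t = preprocess.fit_transform(X_train)
X_test_t = preprocess.transform(X_test)
```

One design choice in this sketch is to fit the scaler and encoder on the training split only and then apply them to the test split, which keeps test-set statistics out of the preprocessing. Label encoding or a different outlier rule would be equally valid choices for the assignment.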

Task 2: Exploratory Data Analysis (EDA) (10 marks)

1. Explore the dataset's structure and statistics using appropriate Pandas functions.
2. Visualize the distribution of house prices using a histogram.
3. Investigate the relationships between features and house prices using scatter plots or
correlation matrices.
4. Identify the most important features that may affect house prices.
5. Provide insights and observations based on your EDA.
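
A possible starting point for the EDA steps is sketched below. It again uses a synthetic DataFrame with hypothetical columns (`area`, `age`, `price`) in place of the real dataset, and it ranks features by absolute Pearson correlation with price, which is only one of several reasonable ways to shortlist important features.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the house prices data; column names are hypothetical.
rng = np.random.default_rng(1)
n = 200
df = pd.DataFrame({
    "area": rng.normal(1500, 300, n),
    "age": rng.uniform(0, 50, n),
})
df["price"] = 120 * df["area"] - 800 * df["age"] + rng.normal(0, 15000, n)

# Structure and summary statistics.
print(df.dtypes)
print(df.describe())

# Distribution of house prices as a histogram.
fig, ax = plt.subplots()
counts, edges, _ = ax.hist(df["price"], bins=20)
ax.set_xlabel("price")
ax.set_ylabel("count")

# Correlation of each numeric feature with price, ranked by absolute value.
corr = df.select_dtypes("number").corr()["price"].drop("price")
ranked = corr.abs().sort_values(ascending=False)
print(ranked)
```

Correlation only captures linear relationships, so scatter plots of individual features against price are still worth inspecting before drawing conclusions.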

Task 3: Model Building (20 marks)

1. Select at least three different machine learning algorithms suitable for regression (e.g.,
Linear Regression, Decision Tree, Random Forest, etc.).
2. Train and evaluate each model using appropriate evaluation metrics (e.g., Mean
Absolute Error, R-squared, Root Mean Squared Error).
3. Implement hyperparameter tuning for one of the models to improve its performance.
4. Compare the performance of the models and select the best-performing one.
5. Visualize the predictions of the selected model against the actual house prices.
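
Steps 1 to 4 above might look roughly like the following sketch. The data is synthetic (four anonymous features and an invented price formula), the hyperparameter grid is an arbitrary small example, and the choice of Random Forest as the tuned model is an assumption made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data standing in for the preprocessed house prices.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 4))
y = 200000 + 50000 * X[:, 0] + 20000 * X[:, 1] ** 2 + rng.normal(0, 5000, 300)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

def evaluate(model):
    """Return MAE, RMSE and R-squared of a fitted model on the test split."""
    pred = model.predict(X_test)
    return {
        "MAE": mean_absolute_error(y_test, pred),
        "RMSE": float(np.sqrt(mean_squared_error(y_test, pred))),
        "R2": r2_score(y_test, pred),
    }

# Train and evaluate three regression algorithms.
models = {
    "Linear Regression": LinearRegression(),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(n_estimators=50, random_state=0),
}
results = {name: evaluate(m.fit(X_train, y_train)) for name, m in models.items()}

# Hyperparameter tuning for one model using cross-validated grid search.
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [None, 5]},
    cv=3,
    scoring="neg_root_mean_squared_error",
)
search.fit(X_train, y_train)
results["Random Forest (tuned)"] = evaluate(search.best_estimator_)

# Compare models and pick the one with the lowest test RMSE.
best_name = min(results, key=lambda name: results[name]["RMSE"])
for name, metrics in results.items():
    print(name, metrics)
print("Best model:", best_name)
```

For step 5, a scatter plot of `y_test` against the selected model's predictions, with a diagonal reference line, is a common way to show how closely predictions track actual prices.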

Task 4: Model Interpretability (5 marks)

1. Use model interpretability techniques (e.g., feature importance, SHAP values, or partial
dependence plots) to explain the predictions of the selected model.
2. Provide insights into which features have the most significant impact on house prices
according to your model.
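
As one way to approach this task, the sketch below compares a Random Forest's built-in impurity-based feature importances with permutation importance computed on held-out data. The feature names and the synthetic price formula are assumptions; SHAP values or partial dependence plots would be equally acceptable techniques.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data with hypothetical feature names; "noise" has no effect on y.
rng = np.random.default_rng(7)
n = 300
X = pd.DataFrame({
    "area": rng.normal(1500, 300, n),
    "age": rng.uniform(0, 50, n),
    "noise": rng.normal(size=n),
})
y = 120 * X["area"] - 800 * X["age"] + rng.normal(0, 5000, n)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Impurity-based importances come from the fitted trees and sum to 1.
impurity = pd.Series(model.feature_importances_, index=X.columns)

# Permutation importance: drop in test score when one feature is shuffled.
perm = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)
perm_importance = pd.Series(
    perm.importances_mean, index=X.columns
).sort_values(ascending=False)

print(impurity.sort_values(ascending=False))
print(perm_importance)
```

Permutation importance is measured on the test split here, so it reflects what the model relies on for unseen data rather than what it memorized during training.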

Task 5: Conclusion and Recommendations (5 marks)

1. Summarize your findings from data preprocessing, EDA, and model building.
2. Discuss the strengths and limitations of your predictive model.
3. Provide recommendations for potential homebuyers or real estate professionals based
on your model's insights.

Submission Guidelines:

• Submit your assignment report in PDF format.
• Include code snippets, visualizations, and explanations for each task.
• Clearly label and describe the sections in your report.
• Cite any external sources or libraries used.
• Submit your assignment by the specified deadline.
