NN - CCP

Uploaded by

Aleena Nadeem

COMPLEX COMPUTING PROBLEM

CSC – 449
NEURAL NETWORKS & FUZZY LOGIC

Submitted By:

Submitted to: Sir Shahid Khan

BSCS – 8A
1. Introduction

• Objective: Briefly state the objective of the assignment (to develop an ANN
model for predicting house prices).
• Background: Provide some context about why predicting house prices is
important and how machine learning can help.
• Scope: Mention the scope of the project (data acquisition, preprocessing, EDA,
model design, training, evaluation).

2. Data Acquisition and Preprocessing

• Dataset: Describe the dataset you chose (source, number of records, features
included).
• Loading Data: Show the code snippet for loading the data.
• Handling Missing Values: Explain how you dealt with missing values (e.g.,
imputation, removal).
• Encoding Categorical Variables: Describe how you encoded categorical
variables (e.g., one-hot encoding).
• Standardizing Numerical Features: Discuss why and how you standardized
numerical features.

3. Exploratory Data Analysis (EDA)

• Purpose of EDA: Explain the importance of EDA in understanding the data.
• Visualizations: Include key plots and visualizations (e.g., histograms, scatter
plots) with descriptions of what they show.
• Insights: Summarize the main insights gained from the EDA.

4. Model Architecture Design

• Data Partitioning: Describe how you split the data into training and testing
sets.
• ANN Architecture: Provide details of the ANN architecture you designed
(number of layers, types of layers, activation functions).
• Model Compilation: Explain the choice of loss function and optimizer.
• Code Snippets: Include code snippets for building and compiling the model.

5. Model Training

• Training Process: Describe the process of training the model (number of
epochs, batch size).
• Performance Monitoring: Explain how you monitored the training performance
(e.g., loss curves).
• Code Snippets: Include code snippets for training the model.

6. Model Evaluation

• Evaluation Metrics: Define the metrics used for evaluation (MSE, RMSE, R²).
• Results: Present the evaluation results and compare them.
• Analysis: Discuss the performance of the model and any patterns observed in
the results.
• Code Snippets: Include code snippets for model evaluation.

7. Conclusion

• Summary: Summarize the key findings and the performance of the ANN model.
• Challenges: Mention any challenges faced during the project.
• Future Work: Suggest possible improvements or future directions for the
project.

8. References

• Datasets: Provide references for the datasets used.
• Libraries and Tools: List any major libraries or tools used (e.g., TensorFlow,
Keras, scikit-learn).
• Other Sources: Cite any other references or resources you used in your project.

Example Structure for Documentation

Here's an example of how your documentation might start:

Predicting House Prices Using Artificial Neural Networks
Your Name

Course Name and Code

Bahria University, Karachi Campus

Date
Table of Contents
1. Introduction
2. Data Acquisition and Preprocessing
3. Exploratory Data Analysis (EDA)
4. Model Architecture Design
5. Model Training
6. Model Evaluation
7. Conclusion
8. References

1. Introduction
The objective of this assignment is to develop a robust Artificial Neural Network (ANN)
model for predicting house prices based on various features. Predicting house prices is
a significant task in real estate markets, helping buyers, sellers, and investors make
informed decisions. This project involves data acquisition, preprocessing, exploratory
data analysis, model design, training, and evaluation.

2. Data Acquisition and Preprocessing


Dataset

The dataset used for this project was sourced from Kaggle. It contains features such as
square footage, number of bedrooms, and location, along with the house price used as
the prediction target.

Loading Data
python
import pandas as pd

df = pd.read_csv('house_prices.csv')

Handling Missing Values

Missing values were handled by imputing with the mean for numerical features and
mode for categorical features.

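python
df = df.fillna(df.mean(numeric_only=True))  # numerical features: column mean
df['location'] = df['location'].fillna(df['location'].mode()[0])  # categorical feature: most frequent value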
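
The same rule (mean for numerical columns, mode for categorical ones) can be applied by column type instead of naming each column. The following is a minimal sketch on a small made-up frame; the values are illustrative assumptions, not the real dataset.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical values) with gaps in numerical and categorical columns
toy = pd.DataFrame({
    "square_footage": [1000.0, np.nan, 2000.0],
    "number_of_bedrooms": [2.0, 3.0, np.nan],
    "location": ["north", None, "north"],
})

for col in toy.columns:
    if pd.api.types.is_numeric_dtype(toy[col]):
        # Numerical feature: fill gaps with the column mean
        toy[col] = toy[col].fillna(toy[col].mean())
    else:
        # Categorical feature: fill gaps with the most frequent value
        toy[col] = toy[col].fillna(toy[col].mode().iloc[0])

print(toy)
```

Looping by dtype keeps the imputation correct if more columns are added later, at the cost of treating every non-numeric column as categorical.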

Encoding Categorical Variables

Categorical variables were encoded using one-hot encoding.

python
# dtype=float keeps the dummy columns numeric so they can be fed to the network
df = pd.get_dummies(df, columns=['location'], dtype=float)
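
To make the effect of one-hot encoding concrete, here is a small sketch on a made-up frame. The location values "north" and "south" are assumptions for illustration only.

```python
import pandas as pd

# Toy frame (hypothetical values); 'location' is the only categorical column
toy = pd.DataFrame({
    "square_footage": [1000, 1500, 2000],
    "location": ["north", "south", "north"],
})

# Each distinct location becomes its own 0/1 column; the original column is dropped
encoded = pd.get_dummies(toy, columns=["location"], dtype=float)

print(encoded.columns.tolist())
print(encoded)
```

Each row has exactly one 1 among the new location columns, so the network sees the category without any implied ordering between locations.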

Standardizing Numerical Features

Numerical features were standardized to have a mean of 0 and a standard deviation of 1.

python
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
num_cols = ['square_footage', 'number_of_bedrooms']
df[num_cols] = scaler.fit_transform(df[num_cols])
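
Standardization subtracts the column mean and divides by the column standard deviation. The snippet above computes those statistics on the full dataset. A common alternative, sketched below in NumPy with made-up numbers, is to compute them from the training rows only and reuse them for the test rows, so that no information from the test set influences preprocessing. The 4/2 split is an assumption for illustration.

```python
import numpy as np

# Hypothetical square-footage values; the last two rows play the role of the test set
values = np.array([[1000.0], [1500.0], [2000.0], [2500.0], [3000.0], [1200.0]])
train, test = values[:4], values[4:]

# Statistics come from the training rows only
mean = train.mean(axis=0)
std = train.std(axis=0)  # population standard deviation (ddof=0), as StandardScaler uses

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std  # reuse the training statistics

print(train_scaled.mean(axis=0), train_scaled.std(axis=0))
```

With scikit-learn the equivalent pattern is to call fit_transform on the training features and transform on the test features.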

Continue this structure for the remaining sections, detailing your approach and
including relevant code snippets. Make sure your explanations are clear and concise to
make your documentation easy to follow.
3. Exploratory Data Analysis (EDA)
Purpose of EDA

The purpose of EDA is to gain insights into the dataset, understand the distribution of
variables, and identify any patterns or anomalies. This helps in making informed
decisions during the data preprocessing and model building stages.

Visualizations

We used various visualizations to explore the dataset:

Distribution of House Prices

python
import matplotlib.pyplot as plt
import seaborn as sns

sns.histplot(df['price'], kde=True)
plt.title('Distribution of House Prices')
plt.xlabel('Price')
plt.ylabel('Frequency')
plt.show()

Description: The histogram above shows the distribution of house prices. The overlaid
kernel density estimate (KDE) curve is a smoothed estimate of that distribution, which
makes skew and multiple peaks easier to see.

Relationship between Square Footage and Price

python
sns.scatterplot(x='square_footage', y='price', data=df)
plt.title('Square Footage vs Price')
plt.xlabel('Square Footage')
plt.ylabel('Price')
plt.show()

Description: This scatter plot illustrates the relationship between the square footage of
a house and its price. There is a positive correlation, indicating that larger houses tend
to have higher prices.
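
The strength of this relationship can also be expressed as a single number. The sketch below computes the Pearson correlation coefficient with NumPy on made-up pairs; the values are assumptions for illustration, not figures from the dataset.

```python
import numpy as np

# Hypothetical (square footage, price) pairs for illustration only
square_footage = np.array([800, 1200, 1500, 2000, 2600], dtype=float)
price = np.array([90_000, 130_000, 155_000, 210_000, 250_000], dtype=float)

# Pearson correlation: covariance divided by the product of the standard deviations
r = np.corrcoef(square_footage, price)[0, 1]
print(f"Pearson r = {r:.3f}")
```

On the real data, the same value should be available from df['square_footage'].corr(df['price']).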

Insights

• Price Distribution: House prices are right-skewed, with most prices
concentrated on the lower end.
• Square Footage: There is a clear positive correlation between square footage
and house prices.
• Categorical Variables: Some locations have higher average prices than others.
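
Right skew can be checked numerically as well as visually. The sketch below computes a simple skewness statistic on made-up prices and shows that a log transform, one common option for skewed targets, reduces it. Both the data and the choice of transform are assumptions for illustration; the project itself does not state that a transform was applied.

```python
import numpy as np

def skewness(x):
    """Skewness as the mean cubed z-score (positive means a longer right tail)."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 3))

# Hypothetical prices: many modest values and a few very expensive houses
prices = np.array([100, 110, 120, 125, 130, 140, 150, 160, 400, 900], dtype=float) * 1000

raw_skew = skewness(prices)
log_skew = skewness(np.log1p(prices))  # log1p(x) = log(1 + x)

print(f"skew before: {raw_skew:.2f}, after log transform: {log_skew:.2f}")
```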
4. Model Architecture Design
Data Partitioning

The dataset was split into training (80%) and testing (20%) sets so that the model's
performance could be evaluated on unseen data.

python
from sklearn.model_selection import train_test_split

X = df.drop('price', axis=1)
y = df['price']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

ANN Architecture

The ANN has an input layer sized to the number of features, two fully connected hidden
layers with 64 and 32 ReLU units, and a single linear output unit that produces the
predicted price.

python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(64, activation='relu', input_shape=(X_train.shape[1],)),
    Dense(32, activation='relu'),
    Dense(1),  # Output layer for regression
])
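
A quick way to sanity-check an architecture like this is to count its trainable parameters by hand. The sketch below does so for the 64-32-1 stack; the figure of 10 input features is an assumption for illustration, since the real count depends on how many columns one-hot encoding produces.

```python
def dense_params(n_inputs, n_units):
    """Trainable parameters in a fully connected layer: weights plus one bias per unit."""
    return n_inputs * n_units + n_units

def count_params(n_features, layer_units=(64, 32, 1)):
    """Total parameters for a stack of Dense layers with the given unit counts."""
    total, n_inputs = 0, n_features
    for n_units in layer_units:
        total += dense_params(n_inputs, n_units)
        n_inputs = n_units
    return total

# Assuming, for illustration, 10 input columns after one-hot encoding
print(count_params(10))  # 704 + 2080 + 33 = 2817
```

In Keras, model.summary() reports the same per-layer and total counts, which makes it easy to confirm the hand calculation.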

Model Compilation

The model was compiled using the Mean Squared Error (MSE) loss function and the
Adam optimizer.

python
model.compile(optimizer='adam', loss='mean_squared_error')

5. Model Training
Training Process

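The model was trained for 50 epochs with a batch size of 32. Twenty percent of the
training data was held out as a validation set (validation_split=0.2), and the training
and validation losses were recorded after each epoch to track the model's performance.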

python
history = model.fit(X_train, y_train, epochs=50, batch_size=32, validation_split=0.2)
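
These settings determine how many weight updates the model receives. The sketch below works through the arithmetic, assuming (as Keras documents) that validation_split holds out the last fraction of the rows before training. The row count of 1,000 is a made-up figure for illustration.

```python
import math

def updates_per_epoch(n_samples, batch_size=32, validation_split=0.2):
    """Number of gradient updates in one epoch after holding out validation rows."""
    n_train = math.floor(n_samples * (1.0 - validation_split))
    return math.ceil(n_train / batch_size)

# Assuming, for illustration, 1,000 rows in X_train
steps = updates_per_epoch(1000)
print(steps, steps * 50)  # updates per epoch, updates over 50 epochs
```

With 1,000 training rows, 800 remain after the validation hold-out, giving 25 updates per epoch and 1,250 over the full run.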

Performance Monitoring

The training and validation loss were plotted to visualize the learning process.

python
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Model Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()
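
The loss curves are mainly useful for spotting overfitting: the point where validation loss stops improving while training loss keeps falling. The sketch below finds that point from a list of validation losses using a simple patience rule. The function name, the patience value, and the loss values are hypothetical and chosen for illustration.

```python
def best_epoch(val_losses, patience=5):
    """Return (best_epoch, stop_epoch), both 1-based.

    stop_epoch is the epoch at which training would halt once the validation
    loss has failed to improve for `patience` consecutive epochs, or None if
    that never happens.
    """
    best, best_idx, waited = float("inf"), -1, 0
    for i, loss in enumerate(val_losses):
        if loss < best:
            best, best_idx, waited = loss, i, 0
        else:
            waited += 1
            if waited >= patience:
                return best_idx + 1, i + 1
    return best_idx + 1, None

# Hypothetical validation losses: improving, then drifting upward (overfitting)
val_loss = [9.0, 6.5, 5.1, 4.8, 4.9, 5.0, 5.2, 5.3, 5.6]
print(best_epoch(val_loss, patience=3))
```

On a real run the input would be history.history['val_loss']. Keras also provides an EarlyStopping callback that applies a similar rule during training.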

6. Model Evaluation
Evaluation Metrics

We used Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R-squared
(R²) to evaluate the model's performance.

Results

The model's performance on the testing set was evaluated as follows:

python
from sklearn.metrics import mean_squared_error, r2_score
import numpy as np

y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
r2 = r2_score(y_test, y_pred)
print(f"MSE: {mse:.2f}, RMSE: {rmse:.2f}, R²: {r2:.2f}")

Example Output:

MSE: 40000.25, RMSE: 200.00, R²: 0.85
Analysis

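• MSE and RMSE: MSE is the average squared difference between actual and
predicted prices. RMSE is its square root, so it is expressed in the same units as
the price and can be read as a typical prediction error. Lower values are better.
• R²: This metric is the proportion of the variance in actual prices that the model's
predictions explain. A score close to 1 indicates a good fit, while a score near 0
means the model does no better than always predicting the mean price.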
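
To make the definitions concrete, the three metrics can be computed directly from their formulas. This is a minimal NumPy sketch on four made-up price pairs; the function name regression_metrics is hypothetical, and the scikit-learn functions used earlier should give the same values.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return (MSE, RMSE, R²) computed from their definitions."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    residuals = y_true - y_pred
    mse = float(np.mean(residuals ** 2))
    rmse = float(np.sqrt(mse))
    ss_res = float(np.sum(residuals ** 2))                  # unexplained variation
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))   # total variation
    r2 = 1.0 - ss_res / ss_tot
    return mse, rmse, r2

# Hypothetical actual and predicted prices
y_true = [200.0, 300.0, 400.0, 500.0]
y_pred = [210.0, 290.0, 420.0, 480.0]

mse, rmse, r2 = regression_metrics(y_true, y_pred)
print(f"MSE: {mse:.2f}, RMSE: {rmse:.2f}, R²: {r2:.2f}")
```

Here the squared errors are 100, 100, 400 and 400, so MSE is 250, RMSE is about 15.81, and R² is 1 - 1000/50000 = 0.98.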

7. Conclusion
Summary

In this project, we developed an ANN model to predict house prices. The model was
trained on a dataset with various features, including square footage, number of
bedrooms, and location. The model demonstrated good predictive performance with an
R² score of 0.85.

Challenges

• Data Quality: Handling missing values and encoding categorical variables were
crucial steps.
• Model Tuning: Selecting the right architecture and hyperparameters required
experimentation.

Future Work

• Feature Engineering: Adding more relevant features could improve model
performance.
• Hyperparameter Optimization: Using techniques like grid search or random
search to find optimal hyperparameters.
• Model Comparison: Comparing the ANN model with other machine learning
models (e.g., linear regression, random forests) could provide additional insights.
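
As a starting point for the model comparison suggested above, a linear regression baseline takes only a few lines. The sketch below fits ordinary least squares with NumPy on synthetic stand-in data, not the real dataset; the coefficients, noise level, and 160/40 split are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: price depends linearly on two features plus noise
X = rng.normal(size=(200, 2))
y = 50_000 * X[:, 0] + 20_000 * X[:, 1] + 300_000 + rng.normal(scale=10_000, size=200)

X_train, X_test = X[:160], X[160:]
y_train, y_test = y[:160], y[160:]

# Ordinary least squares with an intercept column appended to the features
A_train = np.column_stack([X_train, np.ones(len(X_train))])
coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)

A_test = np.column_stack([X_test, np.ones(len(X_test))])
y_pred = A_test @ coef

# R² on the held-out rows, for comparison with the ANN's test R²
ss_res = np.sum((y_test - y_pred) ** 2)
ss_tot = np.sum((y_test - y_test.mean()) ** 2)
baseline_r2 = 1.0 - ss_res / ss_tot
print(f"Baseline R²: {baseline_r2:.3f}")
```

If the ANN does not clearly beat a baseline like this on the same test split, the extra model complexity may not be justified.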
8. References
• Datasets:
  ◦ Kaggle: House Prices Dataset
  ◦ UCI Machine Learning Repository: Housing Dataset
• Libraries and Tools:
  ◦ TensorFlow: TensorFlow Documentation
  ◦ scikit-learn: scikit-learn Documentation
• Other Sources:
  ◦ Books, articles, or online resources you referenced.

This structured approach ensures that your documentation is comprehensive and easy
to follow. Each section should include relevant details and code snippets, making it clear
how you approached each part of the assignment.
