Sakhil Assignment 02
Name: Sakhil Pant
Roll no: 21319 (MBA Spring 2021 - KUSOM)
#Property data
str(csvdata)
#Scatter chart showing actual and predicted price, with both axes starting from the origin
plot(testData$Price, pricePredicted, xlab = "Actual Price", ylab = "Predicted price",
     xlim = c(0, max(testData$Price)), ylim = c(0, max(pricePredicted)))
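A scatter of actual against predicted values is easier to read when both axes share the same range and the line y = x is drawn for reference. The following is a minimal sketch under stated assumptions: the vectors `actual` and `predicted` and the helper variable `axis_max` are hypothetical toy values, not the property data.

```r
# Toy values for illustration only; not taken from the property data
actual    <- c(120, 250, 310, 480)
predicted <- c(130, 240, 330, 450)

# Use one shared upper limit so the y = x line runs corner to corner
axis_max <- max(actual, predicted)

plot(actual, predicted, xlab = "Actual Price", ylab = "Predicted price",
     xlim = c(0, axis_max), ylim = c(0, axis_max))

# Reference line with intercept 0 and slope 1 (perfect prediction)
abline(a = 0, b = 1, lty = 2)
```

Points on the dashed line are predicted exactly; points above it are over-predicted and points below it are under-predicted.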
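#Calculating accuracy (RMSE) - the square root of the mean squared difference between the actual and predicted values
rmse <- sqrt(mean((testData$Price - pricePredicted)^2))
rmse
#output = 1691254 (value reported from the original run, which squared the mean error instead of averaging the squared errors)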
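The placement of the parentheses in an RMSE expression matters: each error has to be squared before the average is taken. The sketch below uses small made-up vectors (`actual` and `predicted` are hypothetical, not the assignment's data) to show how averaging first lets positive and negative errors cancel.

```r
# Toy values for illustration only; not taken from the property data
actual    <- c(100, 200, 300)
predicted <- c(110, 190, 300)
errors    <- actual - predicted          # -10, 10, 0

# RMSE: square each error, average the squares, then take the square root
rmse_correct <- sqrt(mean(errors^2))

# Squaring after averaging gives the absolute mean error instead
abs_mean_error <- sqrt(mean(errors)^2)

print(rmse_correct)     # about 8.165
print(abs_mean_error)   # 0, because the errors cancel out
```

In this toy case the two expressions disagree completely, which is why the squared errors, not the mean error, should be averaged.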
Conclusion:
From the first plotted graph, we can see that the majority of the points lie close to the
line y = a + bx with a = 0 and b = 1. This means the actual and predicted prices are very close to
each other wherever the points cluster along that line. However, many outliers lie far from the
line, so we need more data to train the model for better predictions. Similarly, the second graph
shows a similar result, where the bulk of the data is accurately predicted. Lastly, the accuracy
measure based on root mean square error (RMSE) came out as 1691254. Compared with the prices in
the test data, this value appears low, which suggests that the model predicts most of the test
data accurately. To increase its prediction accuracy, the model should be trained on
substantially larger data sets, which would help it learn how prices change with the different
variables.
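One way to make the comparison between the RMSE and the test prices concrete is to divide the RMSE by the average price. This is only a sketch: `rmse_value`, `test_prices` and `relative_rmse` are hypothetical names with made-up numbers, and the real values from the test data would need to be substituted.

```r
# Hypothetical numbers for illustration; substitute the real test-set values
rmse_value  <- 50000
test_prices <- c(400000, 500000, 600000)

# RMSE expressed as a fraction of the average price
relative_rmse <- rmse_value / mean(test_prices)

print(relative_rmse)   # 0.1, i.e. the typical error is about 10% of the average price
```

A smaller ratio indicates that the typical prediction error is small relative to the prices being predicted.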