
Looking For A New Home?: Presented By: Vivek Behera 10/MBA/59

The document provides statistical analysis of home sale data from 5 townships. It includes descriptive statistics, frequency distributions, correlation and regression analysis, and hypothesis testing. Key findings are that most homes sold for $180,000-$250,000, and homes with garages or more bedrooms had higher prices, while homes with pools or further from city centers had lower prices. Hypothesis testing found the average sale price was significantly different between homes with and without pools.

Uploaded by

Vivek Behera
Copyright
© Attribution Non-Commercial (BY-NC)

Looking for a new home?

Presented By:
Vivek Behera
10/MBA/59
DESCRIPTIVE STATISTICS

Statistic                  Size (sq. ft)   Distance (miles)   Bedrooms   Baths     Price (in thousands)
Arithmetic Mean            2223.8095       14.6286            3.8        2.08095   221.1029
Geometric Mean             (not shown)     13.80765           3.5181     2.0461    216.2361
Harmonic Mean              2196.4177       12.9795            3.2573     2.0128    211.4588
Standard Deviation         248.6594        4.8739             1.5026     0.3930    47.1054
Sample Variance            61831.5018      23.7549            2.2577     0.1544    2218.9191
Median                     2200            15                 4          2         213.6
Mean Absolute Deviation    191.0204        3.9750             1.21905    0.2773    38.266
1st Quartile               2100            11                 3          2         187
3rd Quartile               2400            18                 5          2         251.4
Inter-quartile Range       300             7                  2          0         64.4
Range                      1300            22                 6          1.5       220.3
Minimum                    1600            6                  2          1.5       125
Maximum                    2900            28                 8          3         345.3
Mode                       2100            16                 4          2         188.3
Skewness                   0.3228          0.4019             0.6609     0.79435   0.4740
Kurtosis                   0.6033          -0.1736            -0.1998    0.6307    -0.2768


STATISTICAL COMPARISON (TOWNSHIP-WISE)

Statistic                      Township-1   Township-2   Township-3   Township-4   Township-5
Minimum Price                  125.9        154.3        155.4        125          173.1
Maximum Price                  245.4        307.8        327.2        345.3        326.3
Mean Price                     196.913333   227.45       228.792      216.927586   231.4
Median Price                   199          214.8        233          205.1        224.6
Standard Deviation in Price    35.784052    44.1933729   48.6546411   49.9771369   48.7983333
Mean Deviation in Price        28.0373333   37.795       39.74432     40.8135553   39.1
Variance in Price              1280.49838   1953.05421   2367.2741    2497.71421   2381.27733
Mode for the variable Bedroom  3            2            3            4            2
Minimum Size                   1700         1900         1600         1700         1900
Maximum Size                   2500         2400         2700         2900         2900
Mean Size                      2153.3333    2165         2228         2272.4138    2268.75
Median Size                    2200         2150         2200         2300         2250
Standard Deviation in Size     203.07165    153.12534    245.83192    310.4105     260.04807
Mean Deviation in Size         149.33333    125          181.12       256.12366    193.75
Variance in Size               41238.095    23447.368    60433.333    96354.68     67625

Prices are in thousands; sizes are in sq. ft. The Township-1 mean price is 2953.7 / 15 = 196.913333, consistent with the township sum and count reported in the ANOVA summary.



Township House Count Percentage
Township - 1 15 14.00%
Township - 2 20 19.00%
Township - 3 25 24.00%
Township - 4 29 28.00%
Township - 5 16 15.00%

Township Size (Pie Chart): share of houses by township (Township 1: 14%, Township 2: 19%, Township 3: 24%, Township 4: 28%, Township 5: 15%).
FREQUENCY DISTRIBUTION ANALYSIS

Selling Price (in thousands)   Mid Point   Frequency   Cumulative Frequency   Cumulative %
110 - 145                      127.5       3           3                      2.86%
145 - 180                      162.5       19          22                     20.95%
180 - 215                      197.5       31          53                     50.48%
215 - 250                      232.5       25          78                     74.29%
250 - 285                      267.5       14          92                     87.62%
285 - 320                      302.5       10          102                    97.14%
320 - 355                      337.5       3           105                    100.00%


Some Observations:
• Just over half of the homes (56 of 105, about 53%) sold in the 180,000 - 250,000 range.
• Prices range from 125,000 to 345,300, so the classes span 110,000 to 355,000.
• About 40-42 homes sold for less than 200,000.
• About 54% of the homes sold for less than 220,000; hence, about 46% sold for more.
• Less than 1% of the homes sold for 125,000 or less.
• The price distribution is roughly symmetric about 220,000. Similarly, the size (area) distribution is roughly symmetric about 2,200 sq. ft.
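
The cumulative columns of the frequency table, and the 53% figure in the first observation, can be re-derived from the class frequencies alone. The sketch below is one way to do that in Python; the function names (cumulative_table, share_between) are illustrative and are not part of the original analysis.

```python
# Class limits (in thousands) and frequencies copied from the frequency table.
CLASSES = [(110, 145), (145, 180), (180, 215), (215, 250),
           (250, 285), (285, 320), (320, 355)]
FREQS = [3, 19, 31, 25, 14, 10, 3]


def cumulative_table(classes, freqs):
    """Return one dict per class with midpoint, cumulative count and cumulative %."""
    total = sum(freqs)
    running = 0
    rows = []
    for (low, high), freq in zip(classes, freqs):
        running += freq
        rows.append({
            "class": (low, high),
            "midpoint": (low + high) / 2,
            "frequency": freq,
            "cumulative": running,
            "cumulative_pct": round(100 * running / total, 2),
        })
    return rows


def share_between(classes, freqs, low, high):
    """Percentage of homes in classes lying entirely inside [low, high]."""
    inside = sum(f for (lo, hi), f in zip(classes, freqs)
                 if lo >= low and hi <= high)
    return 100 * inside / sum(freqs)


if __name__ == "__main__":
    for row in cumulative_table(CLASSES, FREQS):
        print(row)
    # Share of homes in the 180 - 250 classes (31 + 25 out of 105).
    print(round(share_between(CLASSES, FREQS, 180, 250), 1))
```

Figures such as "about 54% below 220,000" fall inside a class and need interpolation or the raw data, so they are not reproduced here.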
CORRELATION AND REGRESSION ANALYSIS

CORRELATION MATRIX:

           Price          Bedrooms       Size           Pool           Distance       Garage        Baths
Price      1
Bedrooms   0.467377108    1
Size       0.371041595    0.383456103    1
Pool       -0.29406475    -0.005301227   -0.200590487   1
Distance   -0.347031166   -0.153355767   -0.117194504   0.139382435    1
Garage     0.526273941    0.234102158    0.083027319    -0.114153335   -0.359294882   1
Baths      0.382172576    0.328930238    0.024364862    -0.054532583   -0.194992972   0.221288914   1


SUMMARY OUTPUT

Regression Statistics
Multiple R          0.729108837
R Square            0.531599697
Adjusted R Square   0.502922127
Standard Error      33.21107646
Observations        105

ANOVA
             df    SS            MS            F             Significance F
Regression   6     122675.9804   20445.99673   18.53712516   2.70532E-14
Residual     98    108091.6087   1102.975599
Total        104   230767.5891

            Coefficients   Standard Error   t Stat         P-value
Intercept   57.03490072    39.98570991      1.426382096    0.156936069
Bedrooms    7.117974437    2.55133755       2.78989914     0.006336546
Size        0.038004975    0.014678993      2.589072335    0.011087924
Pool        -18.32144885   6.999270663      -2.617622569   0.010258152
Distance    -0.929496297   0.727874495      -1.277000779   0.204619375
Garage      35.80981529    7.637666294      4.688580767    8.89992E-06
Baths       23.31499554    9.024667458      2.5834742      0.011257556
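
The regression statistics in the summary output follow from the ANOVA sums of squares. As a rough check, the sketch below recomputes R Square, Adjusted R Square, the standard error and F from SS(Regression), SS(Residual), the number of observations and the number of predictors. The function name regression_summary is illustrative only.

```python
import math


def regression_summary(ss_regression, ss_residual, n_obs, n_predictors):
    """Recompute summary statistics of a least-squares fit from its ANOVA table."""
    ss_total = ss_regression + ss_residual
    df_residual = n_obs - n_predictors - 1
    r_square = ss_regression / ss_total
    # Adjusted R^2 penalises R^2 for the number of predictors used.
    adj_r_square = 1 - (1 - r_square) * (n_obs - 1) / df_residual
    ms_regression = ss_regression / n_predictors
    ms_residual = ss_residual / df_residual
    return {
        "r_square": r_square,
        "adj_r_square": adj_r_square,
        "standard_error": math.sqrt(ms_residual),
        "f": ms_regression / ms_residual,
    }


# Values copied from the six-predictor model (105 observations).
full_model = regression_summary(122675.9804, 108091.6087, 105, 6)

if __name__ == "__main__":
    for name, value in full_model.items():
        print(f"{name}: {value:.6f}")
```

Calling the same function with the sums of squares of a five-predictor model (n_predictors=5) gives the corresponding figures for a reduced model.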
AFTER REMOVING DISTANCE FROM THE LIST OF INDEPENDENT VARIABLES

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.723744059
R Square            0.523805463
Adjusted R Square   0.499755234
Standard Error      33.3167027
Observations        105

ANOVA
             df    SS            MS            F             Significance F
Regression   5     120877.3239   24175.46478   21.77964543   1.18921E-14
Residual     99    109890.2652   1110.002679
Total        104   230767.5891

            Coefficients   Standard Error   t Stat         P-value
Intercept   36.12297229    36.59463803      0.98711107     0.325994589
Bedrooms    7.168879641    2.559139529      2.801285181    0.00612243
Size        0.039187603    0.014696343      2.666486706    0.008953702
Pool        -19.11046686   6.994119786      -2.732361962   0.007447849
Garage      38.84719931    7.280944113      5.335461817    6.04527E-07
Baths       24.62355254    8.994819911      2.737525908    0.007340152
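Some Observations:
• Garage has the strongest correlation with price (r ≈ 0.53).
• Distance and pool are negatively correlated with price (r ≈ -0.35 and r ≈ -0.29).
• In the model without distance, an attached garage adds about 38,850 to the price, holding the other variables fixed.
• In the model with distance, each mile from the city center lowers the selling price by about 930, but this coefficient is not significant at the 5% level (p ≈ 0.20), which is why distance was dropped.
• Each additional bedroom adds about 7,170 to the price.
• A pool reduces the price by about 19,110.
• Regression equation (model without distance):
Price (in 1000s) = 36.1230 + (38.8472*G) + (24.62355*Ba) - (19.1105*P) + (0.0392*S) + (7.1689*Be)
where G = garage (1 if attached, 0 otherwise), Ba = number of baths, P = pool (1 if present, 0 otherwise), S = size in sq. ft and Be = number of bedrooms.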
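
To illustrate how the regression equation is used, the sketch below evaluates it for a hypothetical house. The function name predict_price and the example house are assumptions made for illustration. The coefficients are the rounded values from the equation, and garage and pool are assumed to be coded as 1 (yes) or 0 (no).

```python
# Rounded coefficients from the model without distance; price is in thousands.
INTERCEPT = 36.1230
COEF_GARAGE = 38.8472
COEF_BATHS = 24.62355
COEF_POOL = -19.1105
COEF_SIZE = 0.0392
COEF_BEDROOMS = 7.1689


def predict_price(garage, baths, pool, size_sqft, bedrooms):
    """Predicted selling price in thousands.

    garage and pool are assumed to be indicator variables (1 = yes, 0 = no).
    """
    return (INTERCEPT
            + COEF_GARAGE * garage
            + COEF_BATHS * baths
            + COEF_POOL * pool
            + COEF_SIZE * size_sqft
            + COEF_BEDROOMS * bedrooms)


if __name__ == "__main__":
    # Hypothetical house: 2,200 sq. ft, 4 bedrooms, 2 baths, garage, no pool.
    without_pool = predict_price(garage=1, baths=2, pool=0, size_sqft=2200, bedrooms=4)
    with_pool = predict_price(garage=1, baths=2, pool=1, size_sqft=2200, bedrooms=4)
    print(round(without_pool, 2), round(with_pool, 2))
```

The difference between the two predictions equals the pool coefficient, which is how the "pool reduces the price" observation should be read: it compares otherwise identical houses.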
HYPOTHESIS TESTING
• Are the mean selling prices of homes with a pool and homes without a pool the same?
Null hypothesis, H0: µp = µnp

Alternative hypothesis, H1: µp ≠ µnp

z-Test: Two Sample for Means

                               Selling Price with pool   Selling Price without pool
Mean                           202.7973684               231.4850746
Known Variance                 1136.031074               2557.258
Observations                   38                        67
Hypothesized Mean Difference   0
z                              -3.477270021
P(Z<=z) two-tail               0.000506547
z Critical two-tail            1.959963985

Conclusion: Since |z calculated| = 3.477 > z critical = 1.960 (p ≈ 0.0005), the null hypothesis is rejected at the 5% level. The data support the alternative hypothesis: the mean selling price differs between homes with and without a pool.
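
The z statistic and two-tailed p-value can be recomputed from the summary figures in the table. The sketch below uses only the Python standard library; the function name two_sample_z is illustrative. It treats the two variances as known, as the z-test output does.

```python
import math


def two_sample_z(mean1, var1, n1, mean2, var2, n2, hypothesized_diff=0.0):
    """Two-sample z-test for means with known variances.

    Returns (z, two_tailed_p).
    """
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    z = (mean1 - mean2 - hypothesized_diff) / standard_error
    # For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    p_two_tail = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_tail


# Homes with a pool (38) versus homes without a pool (67); prices in thousands.
z_stat, p_value = two_sample_z(202.7973684, 1136.031074, 38,
                               231.4850746, 2557.258, 67)

if __name__ == "__main__":
    print(f"z = {z_stat:.4f}, two-tailed p = {p_value:.6f}")
```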
• Is there a difference in the mean selling price among the five townships?
Null hypothesis, H0: µ1 = µ2 = µ3 = µ4 = µ5

Alternative hypothesis, H1: at least one township mean differs from the others

SUMMARY
Groups   Count   Sum      Average       Variance
Price1   15      2953.7   196.9133333   1280.498381
Price2   20      4549     227.45        1953.054211
Price3   25      5719.8   228.792       2367.2741
Price4   29      6290.9   216.9275862   2497.714212
Price5   16      3702.4   231.4         2381.277333

ANOVA
Source of Variation   SS          df    MS        F      P-value    F crit
Between Groups        13262.84    4     3315.71   1.52   0.200824   2.46
Within Groups         217504.74   100   2175.05
Total                 230767.6    104

Conclusion: Since F calculated (1.52) ≤ F critical (2.46), with p ≈ 0.20, the null hypothesis is not rejected. Thus, there is no significant difference in the mean selling price among the five townships.
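
The one-way ANOVA table can be rebuilt from the group counts, sums and variances in the SUMMARY block, without the raw prices. The sketch below shows the arithmetic; the function name one_way_anova is illustrative, and the critical value 2.46 is taken from the table rather than computed.

```python
# (count, sum, sample variance) per township, copied from the SUMMARY block.
GROUPS = [
    (15, 2953.7, 1280.498381),
    (20, 4549.0, 1953.054211),
    (25, 5719.8, 2367.2741),
    (29, 6290.9, 2497.714212),
    (16, 3702.4, 2381.277333),
]


def one_way_anova(groups):
    """Return SS, df and F for a one-way ANOVA from per-group summaries."""
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(total for _, total, _ in groups) / n_total
    # Between-groups SS: weighted squared distance of group means from the grand mean.
    ss_between = sum(n * (total / n - grand_mean) ** 2 for n, total, _ in groups)
    # Within-groups SS: each sample variance times its degrees of freedom.
    ss_within = sum((n - 1) * var for n, _, var in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "f": f_stat,
    }


anova = one_way_anova(GROUPS)

if __name__ == "__main__":
    print(anova)
    print("reject H0" if anova["f"] > 2.46 else "do not reject H0")
```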
• Is there an association between the variables garage and township?

Null hypothesis, H0: There is no relationship between attached garage and township.
Alternative hypothesis, H1: There is a relationship between attached garage and township.

The chi-square test is used here, with the following decision criterion:

H0 is rejected when chi-square (calculated) > 9.488 (the chi-square table value for df = 4 at the 0.05 level of significance).

Contingency table for the variable garage
Garage   Township 1   Township 2   Township 3   Township 4   Township 5   Total
No       6            5            10           9            4            34
Yes      9            15           15           20           12           71
Total    15           20           25           29           16           105

Chi-square calculated using a macro: Chi-Sq = 1.98, p = 0.739

Expected frequencies
Garage   Township 1    Township 2   Township 3   Township 4   Township 5
No       4.857142857   6.476        8.095        9.3905       5.180952
Yes      10.14285714   13.524       16.905       19.6095      10.81905

Conclusion: Since chi-square (calculated) = 1.98 < chi-square (table) = 9.488, the null hypothesis is not rejected. Thus, the data show no significant association between attached garage and township.
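
The expected frequencies and the chi-square statistic can be reproduced from the contingency table. The sketch below is a plain-Python version of that calculation; the function name chi_square_independence is illustrative, and the critical value 9.488 is taken from the text rather than computed.

```python
# Observed counts from the contingency table: rows are garage No / Yes,
# columns are Townships 1 to 5.
OBSERVED = [
    [6, 5, 10, 9, 4],
    [9, 15, 15, 20, 12],
]


def chi_square_independence(observed):
    """Return (chi_square, expected, df) for a test of independence."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    chi_square = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(observed, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi_square, expected, df


chi_sq, expected_counts, dof = chi_square_independence(OBSERVED)

if __name__ == "__main__":
    print(f"chi-square = {chi_sq:.2f}, df = {dof}")
    print("reject H0" if chi_sq > 9.488 else "do not reject H0")
```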
THANK YOU
