SMMD Assignment 1

This report summarizes key statistics about house prices and living areas: 1. House prices have a mean of $163,862, a median of $151,917, and a standard deviation of $67,652. Price increases with living area. 2. The price distribution is positively skewed, with outliers at very high prices, but is still treated as acceptable for statistical analysis. 3. Assuming a normal price distribution N(164K, (68K)^2), the normal calculation and the JMP analysis of the actual data agree for P(Price < 255,000) (90.96% vs. 90%) but differ by about 5 percentage points for P(Price > 92,800) (85.25% vs. 90%).

Uploaded by

Shubham Chauhan

SMMD Assignment-1 Name: Shubham Chauhan | PGID: 61910037

1. Summary report for House Prices:

1. Price Statistics  2. Price Histogram with Normal Plot  3. Price Versus Living Area

Observations/Comments:
• The mean house price is $163,862 and the median price is $151,917.
• The price distribution is positively (right) skewed, consistent with the mean exceeding the median.
• The standard deviation of prices is $67,652.
• Price tends to increase with the living area of the house. This can be clearly observed in Graph 3 (apart from a few exceptions).

2. Although the plot is not perfectly normal, the normal quantile plot still gives good statistical insight. The outliers in the plot below correspond to very high prices, and the deviation from the straight line indicates deviation from normal behaviour. The fit is nevertheless acceptable for applying statistical analysis.

Graphical Insight

• As can be seen from the graph, there is a deviation from normality, since the quantile plot doesn't follow a straight line.
• The outliers are clearly visible in the box plot. These values lie more than 1.5 × IQR beyond the quartiles.
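
The 1.5 × IQR rule can be checked by hand. The sketch below is illustrative only: it assumes the usual box-plot fences (Q1 - 1.5 × IQR and Q3 + 1.5 × IQR) and uses the quartiles, minimum, and maximum reported in the JMP quantile table in section 3; the variable names are made up for this example.

```python
# Quartiles and extremes copied from the JMP quantile table in this report.
q1, q3 = 111_875, 205_397
minimum, maximum = 16_858, 446_436

# Box-plot fences: points beyond these are flagged as outliers.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"IQR = {iqr}")
print(f"fences = ({lower_fence}, {upper_fence})")
print("high outliers present:", maximum > upper_fence)
print("low outliers present:", minimum < lower_fence)
```

Under these assumptions only the upper fence is crossed, which fits the observation that the outliers are very high prices.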

3. Assuming a normal model for Price ~ N(164K, (68K)^2)
A. Probabilities for two scenarios, P(Price>92,800) and P(Price<255,000), have been calculated and reported below.
• The probabilities below have been calculated using the Excel function NORM.DIST.

               Scenario 1                  Scenario 2
Mean           164,000                     164,000
SD             68,000                      68,000
xi             92,800                      255,000
Probability    P(Price>92800) = 85.25%     P(Price<255000) = 90.96%
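
The same normal-model probabilities can be reproduced outside Excel. This is a minimal sketch using Python's standard-library `statistics.NormalDist` in place of NORM.DIST; it is not the method used in the report, and the variable names are illustrative.

```python
from statistics import NormalDist

# Normal model assumed in the report: Price ~ N(164K, (68K)^2).
price = NormalDist(mu=164_000, sigma=68_000)

# Upper tail: 1 - CDF; lower tail: CDF.
p_above = 1 - price.cdf(92_800)
p_below = price.cdf(255_000)

print(f"P(Price > 92,800)  = {p_above:.2%}")
print(f"P(Price < 255,000) = {p_below:.2%}")
```

The results should agree with the 85.25% and 90.96% in the table above to within rounding.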

• The table below has been generated using the JMP software.

Quantiles (Price in Region $)
100.00%   maximum    446436
99.50%               385738
97.50%               329453
90.00%               255526
75.00%    quartile   205397
50.00%    median     151917
25.00%    quartile   111875
10.00%               92785
2.50%                62971
0.50%                41830
0.00%     minimum    16858

Tabular Insights

1. P(Price > 92,800)
• By perfectly normal model: 85.25%
• By JMP (actual data): 90%
The two methods differ by around 5 percentage points. This is because the Excel formula assumes a perfectly normal distribution, which is not the case for the real data, whereas JMP reflects the actual data. The model and the data do not agree here.

2. P(Price < 255,000)
• By perfectly normal model: 90.96%
• By JMP (actual data): 90%
The Excel calculation and the JMP result match almost exactly, so the model and the data agree here.
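
One way to make the model-versus-data comparison concrete is to read the empirical probability off the quantile table and set it beside the normal value. The sketch below assumes linear interpolation between tabulated quantiles, which is an approximation chosen for illustration and not what JMP does internally; `empirical_cdf` is a hypothetical helper.

```python
from statistics import NormalDist

# (cumulative probability, price) pairs copied from the JMP quantile table.
quantiles = [
    (0.000, 16_858), (0.005, 41_830), (0.025, 62_971), (0.100, 92_785),
    (0.250, 111_875), (0.500, 151_917), (0.750, 205_397), (0.900, 255_526),
    (0.975, 329_453), (0.995, 385_738), (1.000, 446_436),
]

def empirical_cdf(x):
    """Approximate P(Price < x) by interpolating between tabulated quantiles."""
    if x <= quantiles[0][1]:
        return 0.0
    for (p_lo, x_lo), (p_hi, x_hi) in zip(quantiles, quantiles[1:]):
        if x <= x_hi:
            return p_lo + (p_hi - p_lo) * (x - x_lo) / (x_hi - x_lo)
    return 1.0

normal = NormalDist(mu=164_000, sigma=68_000)
for label, x, upper_tail in [("P(Price > 92,800)", 92_800, True),
                             ("P(Price < 255,000)", 255_000, False)]:
    emp, mod = empirical_cdf(x), normal.cdf(x)
    if upper_tail:
        emp, mod = 1 - emp, 1 - mod
    print(f"{label}: empirical ~ {emp:.1%}, normal = {mod:.1%}")
```

The gap is larger in the lower tail because a right-skewed distribution has less mass far below the mean than the symmetric normal model predicts.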

B. Probability P(Price<232,000)

The probability from the Excel sheet comes out to be 84.13%.

Mean               164,000
SD                 68,000
xi                 232,000
P(Price<232000)    84.13%

The probability by JMP comes out to be almost 84%.

Smoothed Empirical Likelihood Quantiles
Quantile   Estimate   Lower 95%   Upper 95%
84%        232141     224528      238139

Yes, the model and the data agree with each other.
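
The 84.13% figure has a simple explanation: 232,000 lies exactly one standard deviation above the assumed mean. A short sketch of the standardization step, using the standard-library `NormalDist` as a stand-in for the Excel calculation:

```python
from statistics import NormalDist

mean, sd = 164_000, 68_000

# Standardize: how many standard deviations is 232,000 above the mean?
z = (232_000 - mean) / sd

# Probability below that point under the standard normal model.
p_below = NormalDist().cdf(z)

print(f"z = {z}")
print(f"P(Price < 232,000) = {p_below:.2%}")
print(f"P(Price > 232,000) = {1 - p_below:.2%}")
```

The complementary upper-tail probability is printed as well, since the two are easy to confuse.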

C. Based on the theoretical model, the price of the 75th-percentile house should be $209,865, while the actual price is $205,397. The two values are close to each other, so the model and the data agree.
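
The theoretical 75th percentile comes from inverting the normal CDF. A minimal sketch, assuming the same N(164K, (68K)^2) model and using `NormalDist.inv_cdf` from the Python standard library (the report itself does not say which tool produced the figure):

```python
from statistics import NormalDist

price = NormalDist(mu=164_000, sigma=68_000)

# Model-based 75th percentile versus the value in the JMP quantile table.
q75_model = price.inv_cdf(0.75)
q75_actual = 205_397

gap = (q75_model - q75_actual) / q75_actual
print(f"model 75th percentile  = {q75_model:,.0f}")
print(f"actual 75th percentile = {q75_actual:,}")
print(f"relative gap = {gap:.1%}")
```

A gap of roughly 2% supports the statement that the two values are close.
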
4. Below is the histogram for the living area data. As can be clearly seen in the histogram, the data appear to be right skewed, which is consistent with the summary statistics (skewness = 0.808).

5. Below is the graph for the log of the living area data. Yes, the graph looks almost perfectly normal when compared with the plot of the untransformed living area data.

Graphical Insights

One of the major reasons for the improvement in normality is that the log transform compresses the large values in the right tail, which reduces the spread in the data. The standard deviation falls from 641 to just 0.350 (on the log scale), and the skewness drops from the original 0.808 to 0.005.

This is why the log(Living Area) graph is almost perfectly normal, while the graph for Living Area is right skewed.
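
The effect of the log transform on skewness can be illustrated with synthetic data. The sketch below does not use the assignment's living area data: it draws hypothetical right-skewed values from a lognormal distribution (the parameters 7.4 and 0.35 are assumptions chosen only to give plausible areas) and computes a moment-based skewness before and after taking logs.

```python
import math
import random

def skewness(values):
    """Moment-based skewness: third central moment over variance^1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Synthetic, hypothetical "living area" sample (not the report's data).
random.seed(0)
raw = [random.lognormvariate(7.4, 0.35) for _ in range(5000)]
logged = [math.log(v) for v in raw]

print(f"skewness of raw values: {skewness(raw):.3f}")
print(f"skewness of log values: {skewness(logged):.3f}")
```

The raw sample is clearly right skewed, while its logarithm is close to symmetric, which mirrors the pattern reported for Living Area and log(Living Area).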
