Prac 1

Uploaded by

ankushstatsdu

Name: Nehal Dixit

Roll No.: 21026765019


Group: A
Practical 1

Objective: To carry out linear regression on the basis of given data.

Problem:

The following data give the house price in lakhs (Y) and the area in square yards (X) for properties of a realty firm.
Fit the simple linear regression model to the data and carry out the analysis.

Y X Y X Y X Y X
186 175 182 167 162 156 179 160
180 168 162 160 192 180 170 149
160 154 169 165 185 167 170 160
186 166 176 167 163 157 165 148
163 162 180 175 185 167 165 154
172 152 157 157 170 157 169 171
192 179 170 172 176 168 171 165
170 163 186 181 176 167 192 175
174 172 180 166 160 145 176 161
191 170 188 181 167 156 168 162
182 170 153 148 157 153 169 162
178 147 179 169 180 162 184 176
181 165 175 170 172 156 171 160
168 162 165 157 184 174 161 158
162 154 156 162 185 160 185 175
188 166 185 174 165 152 184 174
168 167 172 168 181 175 179 168
183 174 166 162 170 169 184 177
188 175 179 159 161 149 175 158
166 164 181 155 188 176 173 161
180 163 176 171 181 165 164 146
176 163 170 159 156 143 181 168
185 171 165 164 161 158 187 178
169 161 183 175 152 141 181 170
Theory :
A regression model describes the relationship between variables by fitting a line or curve to the observed data.
Linear regression models use a straight line, while logistic and nonlinear regression models use a curve. Regression
allows us to estimate how a dependent variable changes as the independent variable(s) change.

Simple linear regression is used to estimate the relationship between two quantitative variables. We can use simple
linear regression when we want to know:
i) How strong the relationship is between two variables (e.g., the relationship between quantity bought
and price per unit of commodity).
ii) The value of the dependent variable at a certain value of the independent variable (e.g., the amount of
quantity bought at a certain price per unit).

It is a model with a single regressor (x) that has a linear relationship with a response (y).

Model :
y = β0 + β1x + ε, where the intercept β0 and the slope β1 are unknown constants, called parameters, which are to be
estimated by the method of least squares, and ε is a random error component.
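
The least-squares estimates have a closed form: the slope estimate is b1 = Sxy / Sxx and the intercept estimate is b0 = ȳ − b1·x̄. A minimal Python sketch of this calculation follows. The function name fit_simple_ols is hypothetical, and it is applied only to the first four (X, Y) pairs of the given data for illustration, so its output is not the fit for the full data set.

```python
def fit_simple_ols(x, y):
    """Return (b0, b1), the least-squares intercept and slope for y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sxy: corrected sum of cross-products; Sxx: corrected sum of squares of x
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# First four (X, Y) pairs from the data table, used for illustration only
area = [175, 168, 154, 166]
price = [186, 180, 160, 186]
b0, b1 = fit_simple_ols(area, price)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
```

Passing all 96 pairs to the same function would give the estimates for the full data set.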

Assumptions :
i) There is a linear relationship between the response (y) and regressor (x).
ii) The errors are assumed to be normally distributed with mean 0 and unknown variance σ²,
i.e., εi ~ N(0, σ²) for all i.
iii) The error terms are uncorrelated, which implies the absence of autocorrelation,
i.e., Cov(εi, εj) = 0 for all i ≠ j.
iv) The errors are homoscedastic, i.e., Var(εi) = σ² is the same for all i. (Multicollinearity cannot arise here because the model has only one regressor.)

The model along with the above assumptions is known as Classical Linear Regression Model (CLRM).

R² is called the coefficient of determination. It tells us the proportion (or percentage) of the variation in y that is
explained by the regressor x. The value of R² lies between 0 and 1. Values of R² close to 1 imply that most of the
variability in y is explained by the regression model. R² never decreases when a new regressor variable is added,
even if that regressor is irrelevant.

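The Adjusted R² corrects R² for the number of regressors: Adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the
number of observations and k is the number of regressors. Unlike R², it increases only when a new regressor improves
the fit enough to offset the loss of a degree of freedom.

The Analysis of Variance (ANOVA) is based on partitioning the total variability in the response variable into a part
explained by the regression and a residual part, in order to draw inferences about the significance of the regression.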
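
As a rough illustration of the ANOVA partition, the sketch below recomputes R², Adjusted R² and the F statistic from the sums of squares reported in the ANOVA table of the output (Table 5). The variable names are hypothetical, and the inputs are the rounded figures from that table, so the results should agree with the reported values only up to rounding.

```python
# Sums of squares as reported in the ANOVA table (Table 5)
ss_regression = 5488.749
ss_residual = 3936.240
n = 96   # number of observations
k = 1    # number of regressors

# Total variability = explained part + residual part
ss_total = ss_regression + ss_residual

r2 = ss_regression / ss_total
adj_r2 = 1 - (ss_residual / (n - k - 1)) / (ss_total / (n - 1))

# F = mean square due to regression / residual mean square
f_stat = (ss_regression / k) / (ss_residual / (n - k - 1))

print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}, F = {f_stat:.3f}")
```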

Hypothesis : Here, we want to test whether the regressor is significant or not,
i.e., to test H0: β1 = 0 against H1: β1 ≠ 0.

Test criteria : If the p-value < 0.05, we reject H0 at the 5% level of significance and conclude on the basis of the given data
that the regressor is statistically significant. Otherwise, we fail to reject H0.
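
The same decision can be reached by comparing the t statistic for the slope with a critical value, as sketched below. The slope and its standard error are the rounded figures from the coefficients table (Table 6). The critical value 1.986 is an assumed approximate two-sided 5% point of the t distribution with 94 degrees of freedom, taken from standard tables rather than from the output.

```python
# Slope estimate and its standard error as reported in Table 6 (rounded)
b1 = 0.829
se_b1 = 0.072
n = 96

t_stat = b1 / se_b1   # test statistic for H0: beta1 = 0
df = n - 2            # residual degrees of freedom in simple linear regression
t_crit = 1.986        # assumed approximate two-sided 5% critical value for 94 df

# Rejecting when |t| exceeds the critical value is equivalent to p-value < 0.05
reject_h0 = abs(t_stat) > t_crit
print(f"t = {t_stat:.2f}, df = {df}, reject H0: {reject_h0}")
```

Because the inputs are rounded, the t value computed here (about 11.51) differs slightly from the 11.449 reported in Table 6, but the decision is unchanged.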

Steps :

Analyze → Regression → Linear → Dependent: House_price → Independent(s): Area_yards → Statistics →
Estimates, Model fit, Descriptives → Continue → OK

Output :
Table 1 : Table showing the given data
Table 2 : Descriptive Statistics

     Mean      Std. Deviation   N
Y    174.32    9.960            96
X    163.90    9.168            96

Table 3 : Correlations

                           Y        X
Pearson Correlation   Y    1.000    .763
                      X    .763     1.000
Sig. (1-tailed)       Y    .        .000
                      X    .000     .
N                     Y    96       96
                      X    96       96

Table 4 : Model Summary

Model   R        R Square   Adjusted R Square   Std. Error of the Estimate
1       .763a    .582       .578                6.471
a. Predictors: (Constant), X

Table 5 : ANOVA(b)

Model           Sum of Squares   df   Mean Square   F         Sig.
1  Regression   5488.749         1    5488.749      131.075   .000a
   Residual     3936.240         94   41.875
   Total        9424.990         95
a. Predictors: (Constant), X
b. Dependent Variable: Y

Table 6 : Coefficients(a)

                 Unstandardized Coefficients   Standardized Coefficients
Model            B         Std. Error          Beta                        t         Sig.
1  (Constant)    38.439    11.887                                          3.234     .002
   X             .829      .072                .763                        11.449    .000
a. Dependent Variable: Y
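
In simple linear regression the output tables are linked by a few identities: b1 = r·(sy/sx), b0 = ȳ − b1·x̄, R² = r², and F = t². The sketch below checks these using the rounded figures from Tables 2 to 6. The variable names are hypothetical, and small discrepancies are expected because the inputs are rounded.

```python
# Values copied from Tables 2, 3, 5 and 6 (all rounded as reported)
mean_y, sd_y = 174.32, 9.960
mean_x, sd_x = 163.90, 9.168
r = 0.763
b0, b1 = 38.439, 0.829
t_b1, f_stat = 11.449, 131.075

slope_from_r = r * sd_y / sd_x               # should be close to b1
intercept_from_means = mean_y - b1 * mean_x  # should be close to b0
r_squared = r ** 2                           # should be close to R Square in Table 4
t_squared = t_b1 ** 2                        # should be close to F in Table 5

print(f"slope from r       = {slope_from_r:.3f} (reported {b1})")
print(f"intercept from means = {intercept_from_means:.3f} (reported {b0})")
print(f"r^2 = {r_squared:.3f}, t^2 = {t_squared:.3f} (reported F = {f_stat})")
```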

Conclusion:

1. From Table 2, we observe that the mean and standard deviation of House_price are 174.32 lakhs and 9.960
lakhs respectively. The mean and standard deviation of Area_yards are 163.90 square yards and 9.168
square yards respectively. There are a total of 96 pairs of observations.

2. From Table 3, we observe that the correlation between House_price and Area_yards is 0.763, which is high.
Also, the p-value is reported as .000, i.e., it is less than 0.001 and hence less than 0.05. So, we reject H0: there is no
correlation between House_price and Area_yards, and conclude that there is a significant positive correlation
between House_price and Area_yards.

3. From Table 4, we see that R² = 0.582, which implies that this regression model explains 58.2% of
the total variation in the response (House_price). Also, Adjusted R² = 0.578 ≈ R², so the fit is not overstated by
the number of regressors. The model fits the data moderately well.

4. From the ANOVA table in Table 5, we see that the p-value for testing H0: β1 = 0 against H1: β1 ≠ 0 is reported
as .000, i.e., it is less than 0.001 and hence less than 0.05. So, we reject H0 at the 5% level of significance and conclude
on the basis of the given data that the regressor is significant.

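5. In Table 6, we see that the estimated coefficients of the regression model are β^0 = 38.439 and β^1 = 0.829. So, the fitted
regression model is: House_price = 38.439 + 0.829*Area_yards, i.e., each additional square yard of area is associated
with an estimated increase of 0.829 lakhs in house price. The p-value for testing H0: β1 = 0 against H1: β1 ≠ 0 is
less than 0.001, so we reject H0 at the 5% level of significance and conclude that the regressor (Area_yards) is
significant. The p-value for testing H0: β0 = 0 against H1: β0 ≠ 0 is 0.002 < 0.05, so on the
basis of the given data the intercept is also significant.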
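
The fitted equation can be used to predict the price for a given area. The sketch below is only an illustration: the function name predict_price and the input of 170 square yards are hypothetical, and the coefficients are the rounded estimates from Table 6.

```python
def predict_price(area_sq_yards, b0=38.439, b1=0.829):
    """Predicted house price (in lakhs) from the fitted line reported in Table 6."""
    return b0 + b1 * area_sq_yards


# 170 square yards is a hypothetical input inside the observed range of X (141 to 181)
predicted = predict_price(170)
print(f"Predicted price for 170 sq. yards: {predicted:.3f} lakhs")
```

Predictions are best restricted to areas within the observed range of X, since the linear relationship has only been checked there.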
