Regression Analysis in Minitab
The Minitab project file Heightweight.MPJ contains data on the heights of 22 individuals (both men and women) and their desired weights. These notes show, step by step, how to obtain the equation of the regression line that best fits the data. We use height as the explanatory variable and weight as the response. We first construct a scatterplot of the data.
[Figure: Scatterplot of weight vs height. Horizontal axis: height (60 to 78); vertical axis: weight (100 to 240).]
The scatterplot reveals a fairly strong, positive linear association, so we can proceed to find the equation of the regression line. To do that, select Stat -> Regression -> Fitted line plot. In the window that opens, select height as the predictor (the explanatory variable) and weight as the response, choose linear as the type of regression, and press OK.
The window that opens will show the scatterplot together with the regression line and the equation of the regression line.
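For readers who want to see what lies behind the fitted line, the following is a minimal Python sketch of an ordinary least-squares fit. It is not Minitab's own code, and the height and weight values in it are hypothetical illustration data, not the contents of Heightweight.MPJ; the function name fit_line is likewise an assumption made for this example.

```python
def fit_line(x, y):
    """Return (intercept, slope, r_squared) for the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a simple linear regression, R-squared is the squared correlation.
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


# Hypothetical illustration data, NOT the values stored in Heightweight.MPJ.
heights = [62, 64, 66, 68, 70, 72]        # inches
weights = [115, 128, 140, 155, 166, 182]  # pounds

intercept, slope, r_squared = fit_line(heights, weights)
print(f"weight = {intercept:.1f} + {slope:.2f} * height")
print(f"R-squared = {r_squared:.1%}")
```

Running the same function on the actual 22 observations should give values close to the equation and R-squared that Minitab reports, up to rounding.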
The slope of the equation is positive (as we anticipated from the scatterplot) and indicates that for each 1 in increase in height, the average weight increases by approximately 6.9 pounds. Notice that the negative y-intercept means that for a person of zero inches in height (nonsense!) the average weight is -315.3 pounds! Of course this does not make sense, so we cannot apply this model far outside the range of our data. For example, it would not make sense to predict the weights of toddlers based on this model: for a toddler measuring, say, 36 inches, the model would predict a weight of -66.8 pounds!
Notice also that we have a high R-square of 89.6%, meaning that 89.6% of the variation in weight is explained by the linear dependence on height.
A plot of the residuals can be obtained if, in the Fitted line plot window above, you select Graphs -> Residuals versus variable and select weight in the variable box. Then click OK and OK again. Minitab then produces a residual plot in addition to the fitted line plot shown above.
Since the points are scattered on both sides of the horizontal line with no apparent pattern, we can say that the linear model is a good fit for the given data.
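The fitted equation and the residuals discussed above can also be explored outside Minitab. The Python sketch below uses the rounded coefficients quoted in these notes (intercept -315.3, slope 6.9); the function names and the three sample observations are hypothetical and serve only as an illustration. Because the coefficients are rounded, the prediction for a 36 inch toddler comes out at about -66.9 pounds rather than the -66.8 quoted earlier, presumably because Minitab works with unrounded values.

```python
# Rounded coefficients of the regression line as quoted in these notes.
INTERCEPT = -315.3  # pounds
SLOPE = 6.9         # pounds per inch of height


def predict_weight(height_in):
    """Predicted average weight (pounds) for a height given in inches."""
    return INTERCEPT + SLOPE * height_in


def residuals(heights, weights):
    """Observed weight minus predicted weight for each observation."""
    return [w - predict_weight(h) for h, w in zip(heights, weights)]


# A height inside the plotted range (roughly 60 to 78 inches).
print(f"70 in -> {predict_weight(70):.1f} lb")

# Extrapolating far outside the data gives a meaningless negative weight.
print(f"36 in -> {predict_weight(36):.1f} lb")

# Hypothetical observations, NOT taken from Heightweight.MPJ.
sample_heights = [64, 68, 72]
sample_weights = [130, 150, 185]
for h, r in zip(sample_heights, residuals(sample_heights, sample_weights)):
    print(f"height {h} in: residual {r:+.1f} lb")
```

Residuals with mixed signs and no visible trend are what the residual plot is meant to confirm. A systematic curve or funnel shape would instead suggest that a straight line is not an adequate model.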