100% found this document useful (2 votes)
1K views

Linear Regression (Lecture)

The equation of the least-squares line for the ordered pairs in Table 13.11c is Ŷ = 3.1x - 5.5.

Uploaded by

yoyoyo
Copyright
© © All Rights Reserved
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd

Linear Regression

In many applications, scientists try to determine whether two variables are related. If they are related,
the scientists then try to find an equation that can be used to model the relationship. For instance, the
zoology professor R. McNeill Alexander wanted to determine whether the stride length of a dinosaur, as
shown by its fossilized footprints, could be used to estimate the speed of the dinosaur. Stride length for
an animal is defined as the distance x from a particular point on a footprint to that same point on the
next footprint of the same foot. (See the figure below.) Because no dinosaurs were available, Alexander
and fellow scientist A. S. Jayes carried out experiments with many types of animals, including adult men,
dogs, camels, ostriches, and elephants. The results of these experiments tended to support the idea that
the speed y of an animal is related to the animal’s stride length x. To better understand this relationship,
examine the data in Table 13.11, which are similar to, but less extensive than, the data collected by
Alexander and Jayes.

TABLE 13.11 Speed for Selected Stride Lengths

a. Adult men
   Stride length (m)   2.5   3.0   3.3   3.5   3.8   4.0   4.2   4.5
   Speed (m/s)         3.4   4.9   5.5   6.6   7.0   7.7   8.3   8.7

b. Dogs
   Stride length (m)   1.5   1.7   2.0   2.4   2.7   3.0   3.2   3.5
   Speed (m/s)         3.7   4.4   4.8   7.1   7.7   9.1   8.8   9.9

c. Camels
   Stride length (m)   2.5   3.0   3.2   3.4   3.5   3.8   4.0   4.2
   Speed (m/s)         2.3   3.9   4.4   5.0   5.5   6.2   7.1   7.6

A graph of the ordered pairs in Table 13.11 is shown in Figure 13.15. In this graph, which is called a
scatter diagram or scatter plot, the x-axis represents the stride lengths in meters and the y-axis
represents the average speeds in meters per second. The scatter diagram seems to indicate that for
each of the three species, a larger stride length generally produces a faster speed. Also note that for
each species, a straight line can be drawn such that all of the points for that species lie on or very
close to the line. Thus the relationship between speed and stride length appears to be a linear
relationship.

After a relationship between paired data, which are referred to as bivariate data, has been discovered, a
scientist tries to model the relationship with an equation. One method of determining a linear
relationship for bivariate data is called linear regression. To see how linear regression is carried out, let
us concentrate on the bivariate data for the dogs, which are shown by the green points in Figures 13.15
and 13.16. Many lines can be drawn such that the data points lie close to the line;
however, scientists are generally interested in the line called the line of best fit or the least-squares
regression line.

▼ The Least-Squares Regression Line
The least-squares regression line for a set of bivariate data is
the line that minimizes the sum of the squares of the vertical
deviations from each data point to the line.

The least-squares regression line is also called the least-squares line. The approximate equation of the
least-squares line for the bivariate data for the dogs is Ŷ = 3.2x - 1.1. Figure 13.16 shows the graph of
these data and the graph of Ŷ = 3.2x - 1.1. In Figure 13.16, the vertical deviations from the ordered
pairs to the graph of Ŷ = 3.2x - 1.1 are 0, -0.06, 0.5, -0.52, -0.16, -0.6, 0.34, and 0.2. It is traditional
to use the symbol Ŷ (pronounced y-hat) in place of y in the equation of a least-squares line. This also
helps us differentiate the line's y-values from the y-values of the given ordered pairs. The next formula
can be used to determine the equation of the least-squares line for a given set of ordered pairs.
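
As a rough check of the deviations listed above, the following Python sketch recomputes them for the dog data in Table 13.11b. It assumes each deviation is taken as the line's value minus the observed speed, which is the sign convention the listed values follow. The function names are illustrative, and the second line, Ŷ = 3.0x - 1.1, is a hypothetical alternative chosen only for comparison.

```python
# Dog data from Table 13.11b: stride length (m) and speed (m/s).
stride = [1.5, 1.7, 2.0, 2.4, 2.7, 3.0, 3.2, 3.5]
speed = [3.7, 4.4, 4.8, 7.1, 7.7, 9.1, 8.8, 9.9]


def vertical_deviations(xs, ys, a, b):
    """Deviation at each point: value on the line a*x + b minus observed y."""
    return [(a * x + b) - y for x, y in zip(xs, ys)]


def sum_of_squares(xs, ys, a, b):
    """Sum of the squared vertical deviations for the line a*x + b."""
    return sum(d ** 2 for d in vertical_deviations(xs, ys, a, b))


# Deviations for the approximate least-squares line Y-hat = 3.2x - 1.1.
devs = [round(d, 2) for d in vertical_deviations(stride, speed, 3.2, -1.1)]
print(devs)

# Compare the sum of squares with a hypothetical nearby line (slope 3.0).
ss_fit = sum_of_squares(stride, speed, 3.2, -1.1)
ss_alt = sum_of_squares(stride, speed, 3.0, -1.1)
print(round(ss_fit, 4), round(ss_alt, 4))
```

The sum of squares for Ŷ = 3.2x - 1.1 comes out smaller than for the alternative line, which is what the definition of the least-squares line leads us to expect.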

▼ The Formula for the Least-Squares Line
The equation of the least-squares line for the n ordered pairs (x1, y1), (x2, y2), (x3, y3), ..., (xn, yn) is

Ŷ = ax + b, where

$$a = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^2 - \left(\sum x\right)^2} \quad\text{and}\quad b = \bar{y} - a\bar{x}$$

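
The formula translates almost directly into code. The sketch below is one possible Python version; the function name least_squares_line and the error handling are assumptions made for illustration, not part of the lecture. It is applied to the dog data from Table 13.11b, for which the text gives the approximate line Ŷ = 3.2x - 1.1.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line Y-hat = a*x + b."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all x-values are equal; the slope is undefined")

    a = (n * sum_xy - sum_x * sum_y) / denominator  # slope
    b = sum_y / n - a * (sum_x / n)                 # b = y-bar - a * x-bar
    return a, b


# Dog data from Table 13.11b: stride length (m) and speed (m/s).
dog_stride = [1.5, 1.7, 2.0, 2.4, 2.7, 3.0, 3.2, 3.5]
dog_speed = [3.7, 4.4, 4.8, 7.1, 7.7, 9.1, 8.8, 9.9]

dog_a, dog_b = least_squares_line(dog_stride, dog_speed)
print(round(dog_a, 1), round(dog_b, 1))
```

Rounded to the nearest tenth, the slope and intercept agree with the line Ŷ = 3.2x - 1.1 quoted for the dogs.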
Example 1:
Find the equation of the least-squares line for the ordered
pairs in Table 13.11a.
Solution:
The ordered pairs are (2.5, 3.4), (3.0, 4.9), (3.3, 5.5), (3.5, 6.6), (3.8, 7.0), (4.0, 7.7), (4.2, 8.3),
and (4.5, 8.7). The number of ordered pairs is n = 8. Organize the data in four columns, as shown in
Table 13.12. Then find the sum of each column.
Table 13.12
    x       y       x²       xy
   2.5     3.4     6.25     8.50
   3.0     4.9     9.00    14.70
   3.3     5.5    10.89    18.15
   3.5     6.6    12.25    23.10
   3.8     7.0    14.44    26.60
   4.0     7.7    16.00    30.80
   4.2     8.3    17.64    34.86
   4.5     8.7    20.25    39.15
∑x = 28.8   ∑y = 52.1   ∑x² = 106.72   ∑xy = 195.86

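
The four columns and their sums can also be built programmatically. The following Python sketch is only an illustration of that bookkeeping step for the adult-men data in Table 13.11a; the variable names are assumptions.

```python
# Adult-men data from Table 13.11a: stride length (m) and speed (m/s).
x = [2.5, 3.0, 3.3, 3.5, 3.8, 4.0, 4.2, 4.5]
y = [3.4, 4.9, 5.5, 6.6, 7.0, 7.7, 8.3, 8.7]

# One row per ordered pair: x, y, x squared, and the product x*y.
rows = [(xi, yi, xi * xi, xi * yi) for xi, yi in zip(x, y)]

print(f"{'x':>5}{'y':>6}{'x^2':>8}{'xy':>8}")
for xi, yi, x2, xy in rows:
    print(f"{xi:>5.1f}{yi:>6.1f}{x2:>8.2f}{xy:>8.2f}")

# Column sums, obtained by transposing the rows into columns.
sum_x, sum_y, sum_x2, sum_xy = (sum(col) for col in zip(*rows))
print(f"{sum_x:>5.1f}{sum_y:>6.1f}{sum_x2:>8.2f}{sum_xy:>8.2f}")
```

The last printed row should match the column sums shown in Table 13.12.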
Find the slope a:

$$a = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^2 - \left(\sum x\right)^2} = \frac{8(195.86) - (28.8)(52.1)}{8(106.72) - (28.8)^2} \approx 2.7303$$

Find x̄ and ȳ:

$$\bar{x} = \frac{\sum x}{n} = \frac{28.8}{8} = 3.6 \qquad \bar{y} = \frac{\sum y}{n} = \frac{52.1}{8} = 6.5125$$

Find the y-intercept b:

$$b = \bar{y} - a\bar{x} = 6.5125 - (2.7303)(3.6) = -3.31658$$

If a and b are rounded to the nearest tenth, to reflect the accuracy of the original data, then we have
the equation of the least-squares line:

Ŷ = ax + b = 2.7x - 3.3
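
The arithmetic in Example 1 can be checked with a few lines of Python. This is a minimal sketch that starts from the column sums in Table 13.12; the variable names are assumptions. It also computes b both ways: with the slope already rounded to 2.7303, as in the worked solution, and with the unrounded slope.

```python
n = 8
sum_x, sum_y, sum_x2, sum_xy = 28.8, 52.1, 106.72, 195.86  # from Table 13.12

# Slope from the least-squares formula.
a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

# Means of the x-values and y-values.
x_bar = sum_x / n
y_bar = sum_y / n

# Intercept using the unrounded slope, and using the slope rounded to 2.7303.
b = y_bar - a * x_bar
b_with_rounded_a = y_bar - 2.7303 * x_bar

print(round(a, 4), round(x_bar, 4), round(y_bar, 4))
print(round(b, 4), round(b_with_rounded_a, 5))
print(f"Y-hat = {round(a, 1)}x + ({round(b, 1)})")
```

The two intercept values differ slightly in the fourth decimal place because of the intermediate rounding of a, but both round to -3.3, so the final equation is unaffected.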

Try this:
Find the equation of the least-squares line for the ordered pairs in Table 13.11c.
