Statistics Review

The document discusses topics related to descriptive statistics such as measures of central tendency, measures of spread, correlation, and the normal distribution.

The measures of central tendency discussed are median, mean, and mode. The measures of spread discussed are range, interquartile range, variance, and standard deviation.

An example of positive correlation provided is the height of corn and amount of precipitation. An example of negative correlation provided is resale value of computers and their age.

Sample standard deviation

\[ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \]
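Sample standard deviation for grouped data

\[ s = \sqrt{\frac{\sum_{i=1}^{k} f_i (m_i - \bar{x})^2}{n - 1}} \]

where $m_i$ and $f_i$ are the midpoint and frequency of the $i$th of $k$ intervals, and $n = \sum f_i$.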
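As a rough illustration of the two standard deviation formulas, the Python sketch below computes both. The function names and the sample numbers are made up for this example and are not part of the review sheet.

```python
import math

def sample_std(values):
    """Sample standard deviation: squared deviations divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_devs = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_devs / (n - 1))

def grouped_sample_std(midpoints, freqs):
    """Sample standard deviation for grouped data.

    midpoints: interval midpoints m_i; freqs: matching frequencies f_i.
    """
    n = sum(freqs)
    mean = sum(f * m for m, f in zip(midpoints, freqs)) / n
    squared_devs = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, freqs))
    return math.sqrt(squared_devs / (n - 1))

# Hypothetical data, chosen only to exercise the functions.
raw_data = [2, 4, 4, 4, 5, 5, 7, 9]
print(sample_std(raw_data))

# Hypothetical grouped data: three intervals with midpoints 5, 15, 25.
print(grouped_sample_std([5, 15, 25], [2, 5, 3]))
```

When every frequency is 1, the grouped formula reduces to the ungrouped one, which is a quick way to check a hand calculation.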

Covariance of two variables in a sample:

\[ s_{XY} = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \]

Correlation Coefficient
\[ r = \frac{s_{XY}}{s_X s_Y} \]

or

\[ r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} \]
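The two forms of the correlation coefficient should give the same value. The Python sketch below checks this on a small made-up data set; the function names and the data are assumptions for illustration only.

```python
import math

def sample_covariance(xs, ys):
    """Sample covariance s_XY, dividing by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def correlation_from_covariance(xs, ys):
    """r = s_XY / (s_X * s_Y); the variance is the covariance of a variable with itself."""
    s_x = math.sqrt(sample_covariance(xs, xs))
    s_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)

def correlation_from_sums(xs, ys):
    """r from the raw sums: n, sum x, sum y, sum xy, sum x^2, sum y^2."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical data, not taken from the problems in this review.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(correlation_from_covariance(xs, ys))
print(correlation_from_sums(xs, ys))
```

The raw-sums form is the one suited to a table with columns for x², y², and xy, since it needs only the column totals.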

The Equation of the Line of Best Fit

\[ y = ax + b \]

where

\[ a = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2} \quad \text{and} \quad b = \bar{y} - a\bar{x} \]
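A minimal Python sketch of the slope and intercept formulas follows. The function name `line_of_best_fit` and the data are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
def line_of_best_fit(xs, ys):
    """Return (a, b) for the least-squares line y = a*x + b."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = sum_y / n - a * (sum_x / n)  # b = y_bar - a * x_bar
    return a, b

# Hypothetical data: sum x = 15, sum y = 20, sum xy = 66, sum x^2 = 55.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = line_of_best_fit(xs, ys)
print(f"y = {a:.2f}x + {b:.2f}")
```

For these numbers the slope works out to 30/50 and the intercept to 4 minus 0.6 times 3, so the printed line can be compared against a hand calculation.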

Standardized Normal Distribution


If $X \sim N(\mu, \sigma^2)$, then $Z \sim N(0, 1)$, where

\[ Z = \frac{X - \mu}{\sigma} \]
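The standardization step can be sketched in Python as below. The helper names `z_score` and `normal_cdf` are made up for this example, and the cumulative probability is computed from the error function in the standard `math` module rather than read from a z-table.

```python
import math

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above the mean."""
    return (x - mu) / sigma

def normal_cdf(z):
    """P(Z <= z) for the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical example: X has mean 100 and standard deviation 15.
z = z_score(130, 100, 15)
print(z)                   # 2.0
print(normal_cdf(z))       # P(X <= 130)
print(1 - normal_cdf(z))   # P(X > 130)
```

Values from `normal_cdf` may differ from table lookups in the last digit because printed tables round z to two decimal places.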

Binomial Distribution

\[ P(X = x) = \binom{n}{x} p^x q^{n - x} \]
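A short Python sketch of the binomial probability formula, with q = 1 − p, is given below. The function name `binomial_pmf` and the numbers are illustrative assumptions; `math.comb` requires Python 3.8 or later.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials with success probability p."""
    q = 1 - p
    return math.comb(n, x) * p ** x * q ** (n - x)

# Hypothetical example: exactly 2 heads in 4 tosses of a fair coin.
print(binomial_pmf(2, 4, 0.5))  # 0.375

# The probabilities over x = 0, ..., n should add up to 1.
print(sum(binomial_pmf(x, 10, 0.3) for x in range(11)))
```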
One Variable Statistics Review
Section 2.5 Measures of Central Tendency
Section 2.6 Measures of Spread

MULTIPLE CHOICE

1. A box-and-whisker plot does not show the
   a) mean
   b) first quartile
   c) third quartile
   d) median

2. Which of the following is not a measure of dispersion in a set of data?
   a) mean
   b) interquartile range
   c) variance
   d) standard deviation

PROBLEM

1. The following table lists the approximate numbers of residents in 21 Canadian cities in 2002.

   City                Population    City                Population
   Calgary                864 700    Saskatoon               72 500
   Edmonton               693 800    Sault Sainte Marie     193 600
   Halifax                117 200    St. John's              97 500
   Hamilton               347 500    Sudbury                 99 200
   Kingston                60 300    Thunder Bay            122 500
   Kitchener/Waterloo     276 400    Toronto              2 571 700
   Lethbridge              71 200    Vancouver              534 600
   London                 350 900    Victoria                76 600
   Ottawa                 348 500    Windsor                213 100
   Regina                 182 800    Winnipeg               635 200
   Saint John              73 600

   a) Find the median, first quartile, and third quartile for these data.
   b) Determine the range and interquartile range.
   c) Calculate the mean, standard deviation, and variance.
   d) What is the z-score for the population of Windsor?
   e) What is the z-score for the population of Toronto?
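The quartile and spread calculations asked for above can be sketched in Python as follows. This is only one convention: it assumes quartiles are the medians of the lower and upper halves, leaving out the middle value when the number of data is odd. Other textbooks and software may use slightly different rules. The function names and the data are made up for illustration.

```python
def median(sorted_vals):
    """Median of a list that is already sorted."""
    n = len(sorted_vals)
    mid = n // 2
    if n % 2 == 1:
        return sorted_vals[mid]
    return (sorted_vals[mid - 1] + sorted_vals[mid]) / 2

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves convention (an assumption)."""
    s = sorted(values)
    n = len(s)
    half = n // 2
    lower = s[:half]
    # For odd n, skip the middle value so it belongs to neither half.
    upper = s[half + 1:] if n % 2 == 1 else s[half:]
    return median(lower), median(s), median(upper)

# Hypothetical data, not the city populations from the problem.
data = [13, 3, 21, 7, 5, 18, 12, 8, 14]
q1, med, q3 = quartiles(data)
print("Q1, median, Q3:", q1, med, q3)
print("range:", max(data) - min(data))
print("interquartile range:", q3 - q1)
```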

Two Variable Statistics - Review

3.1 Scatter plot and Correlation Coefficient


1. Which set of data would probably show a strong positive linear correlation?
   a) marks on a history test and the heights of the students
   b) the number of defective light bulbs produced and the time of the day when they were manufactured
   c) the colour of cars sold and the annual income of the car buyers
   d) the height of corn in a field and the amount of precipitation during the growing season

2. Which set of data would probably show a strong negative linear correlation?
   a) resale values of computers and their ages
   b) heights volleyball players can jump and the strength of their leg muscles
   c) numbers of people at a water park and the air temperature
   d) scores on a mathematics test and the number of hours spent studying for it

Problems
1. Hans has collected data to study the effect of the total winter snowfall on the height of his corn crop the following summer.

   a) Complete the table below and use the results to calculate the correlation coefficient, r.

   Year     Snowfall, x (cm)   Corn Height, y (cm)   x²   y²   xy
   1995     173                182
   1996     165                190
   1997     152                207
   1998     184                180
   1999     178                184
   Totals

   b) Explain what this correlation coefficient tells you about the relationship between the amount of winter snowfall and the height of corn plants the following summer.

Section 3.2 Linear Regression


PROBLEM

1. A hi-fi store kept track of the number of advertisements it placed in local newspapers and the number of stereo systems it sold each week.

   Week   Advertisements, x   Stereos Sold, y
   1      6                   20
   2      5                   15
   3      3                   12
   4      2                   8
   5      1                   6
   6      4                   7
   7      3                   9
   8      2                   7

   a) Determine the line of best fit, $y = ax + b$, using the formulas $a = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}$ and $b = \bar{y} - a\bar{x}$.
   b) Determine the equation of the line of best fit using a graphing calculator, a spreadsheet, or Fathom.
   c) Compare the equations you found in parts a) and b), and account for any differences.
   d) What is the correlation coefficient for the set of data?
   e) What does this correlation coefficient suggest about the effectiveness of the advertisements?

Normal Distribution - Review


Section 8.2 Properties of the Normal Distribution
1. The annual returns from a particular mutual fund are believed to be normally distributed. The following table lists the annual returns for this fund over a 20-year period.

   Year   Return (%)      Year   Return (%)
      1          6.2        11          5.4
      2         11.3        12         26.0
      3         16.1        13         13.5
      4         16.9        14         24.2
      5          9.8        15          1.5
      6         18.3        16          1.4
      7         11.2        17         15.7
      8         12.1        18         11.8
      9         19.2        19          1.9
     10         17.6        20         17.8

   a) Determine the mean and standard deviation of the annual returns for this fund.
   b) What is the probability that an annual return will be
      i) at least 9%?
      ii) negative?
   c) For how many of the next ten years would you expect the fund to have an annual return greater than 6%? What assumptions are necessary to answer this question?

Section 8.4 Normal Approximation to the Binomial Distribution


MULTIPLE CHOICE

1. Under what conditions is a normal probability distribution a good approximation for a discrete binomial distribution?
   a) np and nq greater than 5
   b) np and nq less than 5
   c)
   d) continuity correction applied

2. For which of the binomial distributions listed below is the normal distribution not a reasonable approximation?
   a) n = 50, p = 0.4
   b) n = 40, p = 0.12
   c) n = 75, p = 0.11
   d) n = 40, p = 0.8

SHORT ANSWER

1. Use the normal approximation to find $\mu$ and $\sigma$ in a binomial distribution with n = 1000 and p = 0.5.

2. QuenCola, a soft-drink company, knows that it has a 42% market share in one region of the province. QuenCola's marketing department conducts a blind taste test with 100 people at a mall in the region. Use a normal approximation to calculate the probability that fewer than 40 of these people will choose QuenCola.

3. QuenCola, a soft-drink company, knows that it has a 42% market share in one region of the province. QuenCola's marketing department conducts a blind taste test with 100 people at a mall in the region. Use a normal approximation to calculate the probability that exactly 40 of these people will choose QuenCola.

PROBLEM

1. The probability of an airline flight arriving on time is 90%. Use the normal approximation to find the probability that at least 300 of a random sample of 350 flights will arrive on time. Explain each step in the calculation.
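The steps of a normal approximation can be sketched in Python as below, using numbers that do not appear in any question in this review. The sketch assumes the usual approach: take the mean as np and the standard deviation as the square root of npq, then shift the boundary by 0.5 as a continuity correction. The function names are hypothetical, and the exact binomial sum is included only as a sanity check.

```python
import math

def normal_cdf(z):
    """P(Z <= z) for the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def binomial_cdf_exact(k, n, p):
    """Exact P(X <= k), summing the binomial probabilities."""
    return sum(math.comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(k + 1))

def binomial_cdf_normal_approx(k, n, p):
    """Approximate P(X <= k) with a normal curve and a continuity correction."""
    q = 1 - p
    # Rule of thumb assumed in this sketch: np and nq should both exceed 5.
    if n * p <= 5 or n * q <= 5:
        raise ValueError("normal approximation is unreliable: np or nq is 5 or less")
    mu = n * p
    sigma = math.sqrt(n * p * q)
    # The discrete event X <= k covers the continuous interval up to k + 0.5.
    return normal_cdf((k + 0.5 - mu) / sigma)

# Hypothetical example: n = 60 trials, p = 0.3, probability of at most 15 successes.
n, p, k = 60, 0.3, 15
print("normal approximation:", binomial_cdf_normal_approx(k, n, p))
print("exact binomial sum:  ", binomial_cdf_exact(k, n, p))
```

An "at least k" probability follows from the complement, 1 minus P(X ≤ k − 1), which places the corrected boundary at k − 0.5.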
