IAL Statistic 1 Revision Worksheet Month 4

This document provides a summary of revision worksheets for statistics topics covered in May 2021. It includes 13 practice problems spanning multiple concepts in statistics such as regression analysis, correlation, distributions, and summarizing data. The problems involve calculating measures of central tendency and variation, interpreting correlation coefficients, drawing and comparing distributions, and using statistical formulas and data to solve word problems related to business, agriculture, and other real-world scenarios.

Uploaded by

Le Jeu Life

JAN 22 BATCH

S1 Revision Worksheet
Month-04 (May 2021)
Syllabus: Chapters 2, 3, 5 (till Ex 5E)

1. Energy consumption is claimed to be a good predictor of Gross National Product. An economist recorded
the energy consumption (x) and the Gross National Product (y) for eight countries. The data are shown in
the table.
Energy consumption, x        3.4   7.7   12.0   75     58     67     113    131
Gross National Product, y    55    240   390    1100   1390   1330   1400   1900

a) Calculate $S_{xy}$ and $S_{xx}$.
b) Find the equation of the regression line of y on x in the form $y = a + bx$.
c) Estimate the Gross National Product of a country that has an energy consumption of 100.
d) Estimate the energy consumption of a country that has a Gross National Product of 3500.
e) Comment on the reliability of your answer to d.
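
A minimal Python sketch of parts a) to c), assuming the usual S1 definitions $S_{xy} = \sum xy - \frac{\sum x \sum y}{n}$ and $S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$. The variable names are illustrative and are not part of the worksheet.

```python
# Data from the table in question 1.
x = [3.4, 7.7, 12.0, 75, 58, 67, 113, 131]
y = [55, 240, 390, 1100, 1390, 1330, 1400, 1900]
n = len(x)

sum_x, sum_y = sum(x), sum(y)

# Summary statistics S_xy and S_xx.
s_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
s_xx = sum(xi * xi for xi in x) - sum_x ** 2 / n

# Regression line of y on x: y = a + b*x.
b = s_xy / s_xx
a = sum_y / n - b * sum_x / n

# Part c): x = 100 lies inside the observed range of x, so this is interpolation.
gnp_at_100 = a + b * 100
print(round(s_xy, 2), round(s_xx, 2), round(a, 1), round(b, 2), round(gnp_at_100))
```

For parts d) and e), note that a Gross National Product of 3500 is well outside the observed range of y (55 to 1900), and the line of y on x is designed to predict y from x rather than x from y.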

2. The following table shows the values of two variables v and m.


v    50    70    60    82    45    35    110   70    35    30
m    140   200   180   210   120   100   200   180   120   60


The results were coded using $x = v - 30$ and $y = \frac{m}{20}$.
a) Complete the table for x and y.
x 20 40 30 52 15 80 5 0

y 7 8 6 5 10 9 6 3

b) Calculate $S_{xx}$, $S_{yy}$ and $S_{xy}$.
(You may use $\sum x = 287$, $\sum x^2 = 13879$, $\sum y = 75.5$, $\sum y^2 = 627.25$, $\sum xy = 2661$.)
c) Using your answers to b, calculate the product moment correlation coefficient for x and y.
d) Write down the product moment correlation coefficient for v and m.
e) Describe and interpret your product moment correlation coefficient for v and m.
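
The sketch below, in Python, illustrates why part d) needs no new calculation: a linear coding with positive scale factors leaves the product moment correlation coefficient unchanged. It assumes the coding $x = v - 30$, $y = \frac{m}{20}$; the helper names `s` and `pmcc` are hypothetical.

```python
from math import sqrt

# Raw data from the table in question 2.
v = [50, 70, 60, 82, 45, 35, 110, 70, 35, 30]
m = [140, 200, 180, 210, 120, 100, 200, 180, 120, 60]

# Coded data.
x = [vi - 30 for vi in v]
y = [mi / 20 for mi in m]

def s(p, q):
    """Summary statistic S_pq = sum(pq) - sum(p) * sum(q) / n."""
    n = len(p)
    return sum(pi * qi for pi, qi in zip(p, q)) - sum(p) * sum(q) / n

def pmcc(p, q):
    """Product moment correlation coefficient r = S_pq / sqrt(S_pp * S_qq)."""
    return s(p, q) / sqrt(s(p, p) * s(q, q))

r_xy = pmcc(x, y)  # coded variables
r_vm = pmcc(v, m)  # original variables
print(round(r_xy, 3), round(r_vm, 3))
```

The two printed values agree, because subtracting a constant and dividing by a positive constant changes neither the direction nor the strength of a linear relationship.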

3. The table gives the distances travelled to school, in km, of the population of children in a particular region of the
United Kingdom.
Distance, km   0–1    1–2    2–3    3–5    5–10   10 and over
Number         2565   1784   1170   756    630    135

A histogram of this data was drawn with distance along the horizontal axis.
A bar of horizontal width 1.5 cm and height 5.7 cm represented the 0 - 1 km group.
Find the widths and heights, in cm to one decimal place, of the bars representing the following groups:
a. 2 - 3, b. 5 - 10.
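
One way to approach this, sketched in Python: in a histogram the bar width is proportional to the class width and the bar height is proportional to the frequency density (frequency divided by class width). The scale factors are taken from the 0 - 1 km bar; the function name `bar_size` is hypothetical.

```python
# Reference bar: the 0-1 km group.
ref_class_width_km = 1
ref_frequency = 2565
ref_bar_width_cm = 1.5
ref_bar_height_cm = 5.7

# Scale factors implied by the reference bar.
cm_per_km = ref_bar_width_cm / ref_class_width_km
cm_per_density = ref_bar_height_cm / (ref_frequency / ref_class_width_km)

def bar_size(lower_km, upper_km, frequency):
    """Return (width_cm, height_cm) of a bar, each rounded to 1 d.p."""
    class_width = upper_km - lower_km
    density = frequency / class_width
    return round(class_width * cm_per_km, 1), round(density * cm_per_density, 1)

print(bar_size(2, 3, 1170))
print(bar_size(5, 10, 630))
```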

4. Some data were coded using $y = \frac{i - 90}{100}$ and the following summations were obtained.
$\sum y = 131$, $\sum y^2 = 176.84$, $n = 100$
Work out an estimate for the mean and the standard deviation of $i$.
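
A short Python sketch of the decoding step, assuming the coding reads $y = \frac{i - 90}{100}$, so that $i = 100y + 90$. The mean is both scaled and shifted; the standard deviation is only scaled.

```python
from math import sqrt

n = 100
sum_y = 131
sum_y2 = 176.84

# Mean and standard deviation of the coded variable y.
mean_y = sum_y / n
sd_y = sqrt(sum_y2 / n - mean_y ** 2)

# Undo the coding: i = 100*y + 90.
mean_i = 100 * mean_y + 90
sd_i = 100 * sd_y  # adding 90 does not change the spread
print(round(mean_i, 2), round(sd_i, 2))
```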
5. The daily total sunshine, $s$, in Amman is recorded.
The data are coded using $x = 10s + 1$ and the following summary statistics are obtained:
$n = 30$, $\sum x = 947$, $S_{xx} = 33065.37$
Find the mean and the standard deviation of daily total sunshine. [mean = 3.06, s.d=3.32]
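
The bracketed answers can be checked with a few lines of Python. This is a sketch that assumes the standard deviation of the coded data is $\sqrt{S_{xx}/n}$ and that the coding is $x = 10s + 1$.

```python
from math import sqrt

n = 30
sum_x = 947
s_xx = 33065.37

# Statistics of the coded variable x.
mean_x = sum_x / n
sd_x = sqrt(s_xx / n)

# Undo the coding: s = (x - 1) / 10.
mean_s = (mean_x - 1) / 10
sd_s = sd_x / 10  # subtracting 1 does not change the spread
print(round(mean_s, 2), round(sd_s, 2))
```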

6. Some students take part in an obstacle race. The time it took each student to complete the race was noted. The results
are shown in the diagram.
a. Give a reason to justify the use of a histogram to represent these data.
The number of students who took between 60 and 70 seconds is 90.
b. Find the number of students who took between 40 and 60 seconds.
c. Find the number of students who took 80 seconds or less.
d. Calculate the total number of students who took part in the race.

7. Sophie and Jack do a survey every day for three weeks. Sophie counts the number of pedal cycles using Market
Street. Jack counts the number of pedal cycles using Strand Road. The data they collected are summarised in the
back-to-back stem and leaf diagram.
                 Sophie | Stem | Jack
                9 9 7 5 |  0   | 6 6
    7 6 5 3 3 2 2 2 1 1 |  1   | 1 1 5
              5 3 3 2 2 |  2   | 1 2 2 2 3 7 7 8 9
                    2 1 |  3   | 2 3 4 7 7 8
                        |  4   | 2
Key: 5|0|6 means Sophie counts 5 cycles and Jack counts 6 cycles

a. Write down the modal number of pedal cycles using Strand Road.
The quartiles for these data are summarised in the table below.
                 Sophie   Jack
Lower quartile   X        21
Median           13       Y
Upper quartile   Z        33
b. Find the values for X, Y and Z.
c. Write down the road you think has the most pedal cycles travelling along it overall. Give a reason for your answer.
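
A possible check for parts a and b, sketched in Python. The two lists are a reading of the stem and leaf diagram (21 days each) and should be verified against the printed diagram; the `quartile` helper is hypothetical and uses the position rule $nk/4$: round up when the position is not a whole number, otherwise average that value and the next.

```python
from statistics import mode

# Counts read from the back-to-back stem and leaf diagram (an assumption).
sophie = [5, 7, 9, 9, 11, 11, 12, 12, 12, 13, 13, 15, 16, 17,
          22, 22, 23, 23, 25, 31, 32]
jack = [6, 6, 11, 11, 15, 21, 22, 22, 22, 23, 27, 27, 28, 29,
        32, 33, 34, 37, 37, 38, 42]

def quartile(data, k):
    """k-th quartile (k = 1, 2, 3) using the n*k/4 position rule."""
    ordered = sorted(data)
    whole, remainder = divmod(len(ordered) * k, 4)
    if remainder == 0:
        # Whole-number position: average that value and the next one.
        return (ordered[whole - 1] + ordered[whole]) / 2
    # Otherwise round the position up; as a 0-based index this is `whole`.
    return ordered[whole]

print("Jack mode:", mode(jack))
print("X, Y, Z:", quartile(sophie, 1), quartile(jack, 2), quartile(sophie, 3))
```

With this reading, the quartiles already given in the table (Sophie's median of 13, Jack's quartiles of 21 and 33) are reproduced, which supports the way the leaves were read.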
8. A farm food supplier monitors the number of hens kept ($x$) against the weekly consumption of hen food ($y$ kg) for
a sample of 10 smallholders. He records the data and works out the regression line for $y$ on $x$ to be
$y = 0.16 + 0.79x$
a) Write down a practical interpretation of the figure 0.79.
b) Estimate the amount of food that is likely to be needed by a smallholder who has 30 hens. [23.9 kg]
c) If food costs £12 for a 10 kg bag, estimate the weekly cost of feeding 50 hens. [£47.59]
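
The bracketed answers follow directly from substituting into the regression line, as this Python sketch shows. The function name `weekly_food_kg` is illustrative, and the cost is assumed to be pro rata (£1.20 per kg) rather than rounded up to whole bags.

```python
def weekly_food_kg(hens):
    """Predicted weekly food consumption in kg from y = 0.16 + 0.79x."""
    return 0.16 + 0.79 * hens

# Part b): 30 hens.
food_30 = weekly_food_kg(30)

# Part c): 50 hens at 12 pounds per 10 kg bag, costed per kg.
cost_50 = weekly_food_kg(50) * 12 / 10
print(round(food_30, 1), round(cost_50, 2))
```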

9. Each of 10 cows was given an additive (x) every day for four weeks to see if it would improve their milk yield (y).
At the beginning the average milk yield per day was 4 gallons. The milk yield of each cow was measured on the
last day of the four weeks. The data collected is shown in the table.

Cow A B C D E F G H I J
Additive, x (25 gm units) 1 2 3 4 5 6 7 8 9 10
Yield, y (gallons) 4.0 4.2 4.3 4.5 4.5 4.7 5.2 5.2 5.1 5.1
a. Draw a scatter diagram of these data.
b. Write down any conclusion you can draw from the scatter diagram.
c. From the diagram write down, with a reason, the amount of additive that could be given to each cow to maximize yield
and minimize cost.
d. The product moment correlation coefficient is to be calculated for the first seven cows. Write down why you think
cows H, I and J are being left out for this calculation.
e. Use the values 𝑆𝑥𝑥 = 28, 𝑆𝑦𝑦 = 0.90857 and 𝑆𝑥𝑦 = 4.8 to calculate the product moment correlation coefficient
for the seven cows.
f. Write down, with a reason, how the product moment correlation coefficient for all 10 cows would differ from your
answer to e.
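
A Python sketch for parts e and f, using $r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$. It first confirms the quoted summary values from the first seven rows of the table and then repeats the calculation for all ten cows; the helper names are hypothetical.

```python
from math import sqrt

x = list(range(1, 11))  # additive, in 25 gm units
y = [4.0, 4.2, 4.3, 4.5, 4.5, 4.7, 5.2, 5.2, 5.1, 5.1]  # yield, gallons

def s(p, q):
    """Summary statistic S_pq = sum(pq) - sum(p) * sum(q) / n."""
    n = len(p)
    return sum(pi * qi for pi, qi in zip(p, q)) - sum(p) * sum(q) / n

def pmcc(p, q):
    return s(p, q) / sqrt(s(p, p) * s(q, q))

# First seven cows (A to G), as in part e.
s_xx, s_yy, s_xy = s(x[:7], x[:7]), s(y[:7], y[:7]), s(x[:7], y[:7])
r_seven = pmcc(x[:7], y[:7])

# All ten cows, for comparison in part f.
r_all = pmcc(x, y)
print(round(r_seven, 3), round(r_all, 3))
```

The coefficient for all ten cows comes out lower, which is consistent with the yield levelling off for cows H, I and J.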

10. a. Describe the main features and uses of a box plot.


Children from schools A and B took part in a fun run for charity. The times, to the nearest minute, taken by the
children from school A are summarized in figure 1.

b. i. Write down the time by which 75% of the children in school A had completed the run.
ii. State the name given to this value.
c. Explain what you understand by the two crosses (×) on figure 1.
For school B the least time taken by any of the children was 25 minutes and the longest time was 55 minutes. The
three quartiles were 30, 37 and 50 respectively.
d. On graph paper, draw a box plot to represent the data from school B.
e. Compare and contrast these two box plots.
11. The numbers of questions answered correctly by children taking a general knowledge test are shown in the
following frequency distribution.
Number of correct answers   0–5   6–10   11–15   16–20   21–60   61–70
Frequency                   4     15     5       2       0       1

a) Write down the class width for the first group.
b) Is the number of correct answers discrete or continuous? Explain your answer.
c) Write down the modal class for the number of correct answers.
d) Find the mean and the standard deviation of the number of correct answers.
e) Estimate the number of correct answers which were one standard deviation less than the mean
f) Find the median and the interquartile range of the number of correct answers
g) Write down the number of questions that at least 25% of the children answered correctly
h) State, giving a reason, whether the mean or the median is a better representation of the data.
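
For part d), a grouped-data estimate can be sketched in Python as below. It assumes each class is represented by the midpoint of its stated limits (for example 2.5 for 0–5) and uses $\sigma^2 = \frac{\sum f x^2}{\sum f} - \bar{x}^2$; a different midpoint convention would give slightly different values.

```python
from math import sqrt

# (lower limit, upper limit, frequency) for each class in the table.
classes = [(0, 5, 4), (6, 10, 15), (11, 15, 5), (16, 20, 2), (21, 60, 0), (61, 70, 1)]

midpoints = [(low + high) / 2 for low, high, _ in classes]
freqs = [f for _, _, f in classes]

total = sum(freqs)
sum_fx = sum(f * m for f, m in zip(freqs, midpoints))
sum_fx2 = sum(f * m * m for f, m in zip(freqs, midpoints))

mean = sum_fx / total
sd = sqrt(sum_fx2 / total - mean ** 2)
print(round(mean, 2), round(sd, 2))
```

The single value in the 61–70 class has a large effect on both statistics, which is relevant to part h).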

12. A farmer collected data on the annual rainfall, $x$ cm, and the annual yield of potatoes, $p$ tonnes per acre.
The data for annual rainfall were coded using $v = \frac{x - 4}{8}$ and the following statistics were found:
$S_{vv} = 10.21$, $S_{pv} = 15.26$, $S_{pp} = 23.39$, $\bar{p} = 9.88$, $\bar{v} = 4.58$
a) Find the equation of the regression line of 𝑝 on 𝑣 in the form 𝑝 = 𝑎 + 𝑏𝑣 [𝑝 = 3.03 + 1.49𝑣]
b) Using your regression line, estimate the annual yield of potatoes per acre when the annual rainfall is 42 cm.
[10.1 tonnes]
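
The bracketed answers can be reproduced with the Python sketch below, which uses $b = \frac{S_{pv}}{S_{vv}}$ and $a = \bar{p} - b\bar{v}$ and assumes the coding $v = \frac{x - 4}{8}$. Note that the rainfall must be coded before it is substituted into the line.

```python
s_vv = 10.21
s_pv = 15.26
p_bar = 9.88
v_bar = 4.58

# Regression line of p on v: p = a + b*v.
b = s_pv / s_vv
a = p_bar - b * v_bar

# Part b): code the rainfall first, then substitute.
rainfall_cm = 42
v = (rainfall_cm - 4) / 8
p_estimate = a + b * v
print(round(a, 2), round(b, 2), round(p_estimate, 1))
```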

13. Bilash, a market gardener, measures the amount of fertilizer, 𝑥 litres, that he adds to the compost for a random
sample of 7 chilli plant beds. He also measures the yield of chillies, 𝑦 kg.
The data are shown in the table below:
𝑥, litres 1.1 1.3 1.4 1.7 1.9 2.1 2.5
𝑦, kg 6.2 10.5 12 15 17 18 19

(∑ 𝑥 = 12 ∑ 𝑥 2 = 22.02 ∑ 𝑦 = 97.7 ∑ 𝑦 2 = 1491.69 ∑ 𝑥𝑦 = 180.37)

a) Show that the PMCC for these data is 0.946, correct to 3 significant figures

The equation of the regression line of 𝑦 on 𝑥 is given as 𝑦 = −1.2905 + 8.8945𝑥


b) Calculate the residuals [ ]

Bilash thinks that because the PMCC is close to 1, a linear relationship is a good model for these data.
c) With reference to the residuals, evaluate Bilash’s conclusion.
[Residuals are not randomly scattered about zero, they ‘rise and fall’, so this indicates that a linear
relationship is not a good model for the data]
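
A Python sketch for parts a) and b), assuming a residual is defined as the observed $y$ minus the value predicted by the given regression line. The PMCC is computed from the quoted summations.

```python
from math import sqrt

x = [1.1, 1.3, 1.4, 1.7, 1.9, 2.1, 2.5]   # fertilizer, litres
y = [6.2, 10.5, 12, 15, 17, 18, 19]       # yield, kg
n = len(x)

# Part a): PMCC from the summations quoted in the question.
sum_x, sum_x2, sum_y, sum_y2, sum_xy = 12, 22.02, 97.7, 1491.69, 180.37
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n
r = s_xy / sqrt(s_xx * s_yy)

# Part b): residual = observed y - predicted y, using the given line.
a, b = -1.2905, 8.8945
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

print(round(r, 3))
print([round(e, 3) for e in residuals])
```

The residuals are negative at both ends of the range of $x$ and positive in the middle, which is the 'rise and fall' pattern referred to in the answer to part c).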
