
Stats EQ

The document outlines various statistical sampling techniques and their applications, including random sampling, stratified sampling, and the importance of sample representativeness. It includes practical exercises related to sampling methods, data presentation, and interpretation, as well as probability calculations and hypothesis testing. The content is structured into sections covering statistical distributions, data interpretation, and hypothesis testing, providing a comprehensive overview of statistical concepts.

Uploaded by

besedab541
Copyright
© All Rights Reserved

1 Statistical sampling 12 marks

• Understand and use the terms ‘population’ and ‘sample’
• Use samples to make informal inferences about the population
• Understand and use sampling techniques, including simple random sampling and opportunity
sampling
• Select or critique sampling techniques in the context of solving a statistical problem, including
understanding that different samples can lead to different conclusions about the population

1. (a) Max wants to take a random sample of students from his year group.
(i) Explain what is meant by a random sample.
..............................................................................................................................................
..............................................................................................................................................

(ii) Describe a method Max could use to take his random sample.
..............................................................................................................................................
..............................................................................................................................................
..............................................................................................................................................
(2)

(b) The table below shows the numbers of students in 5 year groups at a school.
Year    Number of students
9       239
10      257
11      248
12      190
13      206

Lisa takes a stratified sample of 100 students by year group.

Work out the number of students from Year 9 she has in her sample.

..............................................................................................................................................
(2)
(Total for Question is 4 marks)
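
The proportional allocation behind a stratified sample, as in part (b), can be sketched as follows. This is a minimal sketch, assuming each stratum's share is rounded to the nearest whole student; the function name `stratified_allocation` is illustrative and not part of the original question.

```python
def stratified_allocation(strata_sizes, sample_size):
    """Return the proportional share of the sample for each stratum."""
    total = sum(strata_sizes.values())
    # Assumption: each share is rounded to the nearest whole number
    return {name: round(sample_size * size / total)
            for name, size in strata_sizes.items()}

year_groups = {9: 239, 10: 257, 11: 248, 12: 190, 13: 206}
allocation = stratified_allocation(year_groups, 100)
print(allocation[9])  # Year 9 share: 239 / 1140 * 100, rounded
```

Rounded shares need not add up exactly to the requested sample size, so a final adjustment may be needed in practice.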
2. The table shows information about 1065 students.

Elena takes a stratified sample of 120 students by year group and by gender.
Work out the number of Year 8 female students in her sample.

...........................................................

(Total for Question is 2 marks)


3. Nathan is doing a survey about DVDs.
He writes a questionnaire.
Nathan decides to hand out his questionnaire to the women who are inside a DVD store.
His sample is biased.
Give two possible reasons why.
..............................................................................................................................................
..............................................................................................................................................
..............................................................................................................................................
..............................................................................................................................................
(2)

(Total for Question is 2 marks)

4. For a school project, Neville is investigating the types of music people in the UK like to listen to.
He collects data by asking friends from his year group.
Is this sample likely to be representative of the population?
Give one way in which the sample could be improved.

..............................................................................................................................................
..............................................................................................................................................
..............................................................................................................................................

(Total for Question is 2 marks)

5. Sara is investigating the variation in daily maximum gust, t kn, for Camborne in June and July 1987.

She used the large data set to select a sample of size 20 from the June and July data for 1987. Sara selected the
first value using a random number from 1 to 4 and then selected every third value after that.
(a) State the sampling technique Sara used.
(1)

(b) From your knowledge of the large data set, explain why this process may not generate a sample of size
20.
(1)

(Total for Question is 2 marks)
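
The selection rule Sara describes can be sketched as below. This is an illustrative sketch only: it assumes the June and July data form 61 daily readings indexed 1 to 61 and that every reading is usable, which the large data set does not guarantee. The helper name `systematic_indices` is hypothetical.

```python
import random

def systematic_indices(n_items, start, step):
    """Positions picked by taking item `start`, then every `step`-th item after it."""
    return list(range(start, n_items + 1, step))

N_DAYS = 61  # assumption: 30 days in June plus 31 days in July

start = random.randint(1, 4)              # random first position from 1 to 4
picked = systematic_indices(N_DAYS, start, 3)
print(start, len(picked))                 # positions picked before any unusable readings are discarded
```

The number of positions picked depends on the random start, and any unavailable reading among them would reduce the usable sample further.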


2 Data presentation and interpretation 27 marks

• 2.1 Interpret diagrams for single-variable data, including understanding that area in a histogram
represents frequency
• Connect to probability distributions
• 2.2 Interpret scatter diagrams and regression lines for bivariate data, including recognition of scatter
diagrams which include distinct sections of the population (calculations involving regression lines are
excluded)
• Understand informal interpretation of correlation
• Understand that correlation does not imply causation
• 2.3 Interpret measures of central tendency and variation, extending to standard deviation
• Be able to calculate standard deviation, including from summary statistics
• 2.4 Recognise and interpret possible outliers in data sets and statistical diagrams
• Select or critique data presentation techniques in the context of a statistical problem
• Be able to clean data, including dealing with missing data, errors and outliers

2.1 and 2.2

1. The number of hours of sunshine each day, y, for the month of July at Heathrow are summarised in
the table below.

Hours       0 ≤ y < 5   5 ≤ y < 8   8 ≤ y < 11   11 ≤ y < 12   12 ≤ y < 14
Frequency   12          6           8            3             2

A histogram was drawn to represent these data. The 8 ≤ y < 11 group was represented by a bar of
width 1.5 cm and height 8 cm.

(a) Find the width and the height of the 0 ≤ y < 5 group.
(3)

(b) Use your calculator to estimate the mean and the standard deviation of the number of hours of
sunshine each day, for the month of July at Heathrow. Give your answers to 3 significant figures.
(3)

The mean and standard deviation for the number of hours of daily sunshine for the same month in Hurn
are 5.98 hours and 4.12 hours respectively. Thomas believes that the further south you are, the more
consistent the number of hours of daily sunshine should be.

(c) State, giving a reason, whether or not the calculations in part (b) support Thomas’ belief.
(2)
(d) Estimate the number of days in July at Heathrow where the number of hours of sunshine is more
than 1 standard deviation above the mean.
(2)
(Total for Question is 10 marks)
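
The histogram scaling and the grouped-data estimates in this question can be sketched as follows. This is a hedged sketch, not a mark scheme: it assumes bar height is proportional to frequency density, uses class midpoints, and divides by n (not n − 1) for the standard deviation. All variable names are illustrative.

```python
from math import sqrt

# Class boundaries (hours) and frequencies from the table
classes = [(0, 5), (5, 8), (8, 11), (11, 12), (12, 14)]
freqs = [12, 6, 8, 3, 2]

# Scale taken from the 8 <= y < 11 bar: 3 hours wide is 1.5 cm,
# and its frequency density 8/3 is drawn 8 cm high
cm_per_hour = 1.5 / (11 - 8)
cm_per_density = 8 / (8 / (11 - 8))

low, high = classes[0]
width_cm = (high - low) * cm_per_hour
height_cm = freqs[0] / (high - low) * cm_per_density

# Grouped-data estimates using class midpoints (assumption: divisor n)
n = sum(freqs)
mids = [(a + b) / 2 for a, b in classes]
mean = sum(f * m for f, m in zip(freqs, mids)) / n
sd = sqrt(sum(f * m * m for f, m in zip(freqs, mids)) / n - mean ** 2)

print(round(width_cm, 2), round(height_cm, 2))
print(round(mean, 2), round(sd, 2))
```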
2. The partially completed histogram and the partially completed table show the time, to the nearest
minute, that a random sample of motorists were delayed by roadworks on a stretch of motorway.

Delay (minutes) Number of motorists


4–6 6
7 – 8
9 17
10 – 12 45
13 – 15 9
16 – 20

Estimate the percentage of these motorists who were delayed by the roadworks for between 8.5 and
13.5 minutes.
(5)

(Total for Question is 5 marks)

3. The mark, x, scored by each student who sat a statistics examination is coded using y = 1.4x − 20
The coded marks have mean 60.8 and standard deviation 6.60
Find the mean and the standard deviation of x.

(Total 4 marks)
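
Undoing a linear coding for summary statistics can be sketched as below. This is a minimal sketch; the function name `decode_summary` and its parameters `a` and `b` (for a coding of the form y = ax + b) are illustrative assumptions.

```python
def decode_summary(mean_y, sd_y, a, b):
    """Undo the coding y = a*x + b for a mean and a standard deviation."""
    mean_x = (mean_y - b) / a   # the mean is both shifted and scaled
    sd_x = sd_y / abs(a)        # the standard deviation is only scaled
    return mean_x, sd_x

mean_x, sd_x = decode_summary(60.8, 6.60, a=1.4, b=-20)
print(round(mean_x, 1), round(sd_x, 2))
```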
4. Sara was studying the relationship between rainfall, r mm, and humidity, h %, in the UK. She takes
a random sample of 11 days from May 1987 for Leuchars from the large data set. She obtained the
following results.

h 93 86 95 97 86 94 97 97 87 97 86
r 1.1 0.3 3.7 20.6 0 0 2.4 1.1 0.1 0.9 0.1

Sara examined the rainfall figures and found

Q1 = 0.1 Q2 = 0.9 Q3 = 2.4

A value that is more than 1.5 times the interquartile range (IQR) above Q3 is called an outlier.

(a) Show that r = 20.6 is an outlier.


(1)
(b) Give a reason why Sara might
(i) include
(ii) exclude
this day’s reading.


(2)

Sara decided to exclude this day’s reading and drew the following scatter diagram for the remaining 10
days’ values of r and h.

(c) Give an interpretation of the correlation between rainfall and humidity.


(1)

The equation of the regression line of r on h for these 10 days is r = −12.8 + 0.15h.

(d) Give an interpretation of the gradient of this regression line.


(1)

(e) (i) Comment on the suitability of Sara’s sampling method for this study.

(ii) Suggest how Sara could make better use of the large data set for her study.
(2)
(Total for Question is 7 marks)
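
The outlier rule quoted in this question (more than 1.5 × IQR above Q3) can be checked with a short sketch. The variable names are illustrative; the quartiles and rainfall values are those given in the question.

```python
q1, q3 = 0.1, 2.4
upper_fence = q3 + 1.5 * (q3 - q1)   # values above this are flagged as outliers

rainfall = [1.1, 0.3, 3.7, 20.6, 0, 0, 2.4, 1.1, 0.1, 0.9, 0.1]
outliers = [r for r in rainfall if r > upper_fence]

print(round(upper_fence, 2), outliers)
```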
3 Probability 26 marks

• Understand and be able to use mutually exclusive and independent events when calculating probabilities
• Be able to make links to discrete and continuous distributions

1. The Venn diagram shows the probabilities for students at a college taking part in various sports.

A represents the event that a student takes part in Athletics.


T represents the event that a student takes part in Tennis.
C represents the event that a student takes part in Cricket.
p and q are probabilities.

The probability that a student selected at random takes part in Athletics or Tennis is 0.75.

(a) Find the value of p.


(1)

(b) State, giving a reason, whether or not the events A and T are statistically independent. Show your
working clearly.

(3)

(c) Find the probability that a student selected at random does not take part in Athletics or Cricket.

(1)

(Total for Question is 5 marks)


2. A manufacturer carried out a survey of the defects in their soft toys. It is found that the
probability of a toy having poor stitching is 0.03 and that a toy with poor stitching has a
probability of 0.7 of splitting open. A toy without poor stitching has a probability of 0.02 of
splitting open.
(a) Draw a tree diagram to represent this information.
(3)
(b) Find the probability that a randomly chosen soft toy has exactly one of the two defects, poor
stitching or splitting open.
(3)
The manufacturer also finds that soft toys can become faded with probability 0.05 and that this defect is
independent of poor stitching or splitting open. A soft toy is chosen at random.

(c) Find the probability that the soft toy has none of these 3 defects.
(2)
(d) Find the probability that the soft toy has exactly one of these 3 defects.
(4)

(Total 12 marks)
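
The branch probabilities of the tree diagram in this question can be sketched as follows. This is an illustrative sketch: the variable names are assumptions, and the fading defect is treated as independent of the other two, as the question states.

```python
p_stitch = 0.03               # P(poor stitching)
p_split_given_stitch = 0.7    # P(splits open | poor stitching)
p_split_given_ok = 0.02       # P(splits open | no poor stitching)
p_faded = 0.05                # P(faded), independent of the other two defects

# The four end branches of the tree diagram
both = p_stitch * p_split_given_stitch
stitch_only = p_stitch * (1 - p_split_given_stitch)
split_only = (1 - p_stitch) * p_split_given_ok
neither = (1 - p_stitch) * (1 - p_split_given_ok)

# Exactly one of poor stitching / splitting open
exactly_one_of_two = stitch_only + split_only
# None of the three defects
none_of_three = neither * (1 - p_faded)
# Exactly one of the three: one of the first two and not faded, or faded only
exactly_one_of_three = exactly_one_of_two * (1 - p_faded) + neither * p_faded

print(round(exactly_one_of_two, 4), round(none_of_three, 4), round(exactly_one_of_three, 4))
```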
3. (a) State in words the relationship between two events R and S when P(R∩S) = 0
(1)
The events A and B are independent with P(A) = and P(A∪B) =
Find
(b) P(B)
(4)

(c) P(A'∩B)
(2)

(d) P(B' | A)
(2)

(Total 9 marks)
4 Statistical distributions 34 marks

• Understand and use simple, discrete probability distributions (calculation of mean and variance of
discrete random variables is excluded), including the binomial distribution, as a model; calculate
probabilities using the binomial distribution

1. The discrete random variable X can take only the values 2, 3, 4 or 6. For these values the
probability distribution function is given by

where k is a positive integer.

(a) Show that k = 3

(2)
Find
(b) F(3)

(1)

(Total 3 marks)

2. The discrete random variable X can take only the values 1, 2 and 3. For these values the
cumulative distribution function is defined by

F (x) = x = 1,2, 3

(a) Show that k = 13

(2)

(b) Find the probability distribution of X.

(4)
(Total 6 marks)

3. In a large restaurant an average of 3 out of every 5 customers ask for water with their meal.
A random sample of 10 customers is selected.

(a) Find the probability that


(i) exactly 6 ask for water with their meal,
(ii) less than 9 ask for water with their meal.
(5)

A second random sample of 50 customers is selected.

(b) Find the smallest value of n such that

P(X < n) ≥ 0.9

where the random variable X represents the number of these customers who ask for water.

(3)
(Total 8 marks)
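
The binomial calculations in this question can be sketched with the standard library. This is a minimal sketch, assuming the number of customers asking for water follows B(10, 0.6) for the first sample and B(50, 0.6) for the second; the helper names `binom_pmf` and `binom_cdf` are illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p); zero when k < 0."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))

p = 3 / 5

p_exactly_6 = binom_pmf(6, 10, p)      # P(X = 6)
p_less_than_9 = binom_cdf(8, 10, p)    # P(X < 9) = P(X <= 8)

# Smallest n with P(X < n) = P(X <= n - 1) >= 0.9 for X ~ B(50, 0.6)
smallest_n = next(n for n in range(0, 52) if binom_cdf(n - 1, 50, p) >= 0.9)

print(round(p_exactly_6, 4), round(p_less_than_9, 4), smallest_n)
```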

4. The probability of a telesales representative making a sale on a customer call is 0.15

Find the probability that

(a) no sales are made in 10 calls,

(2)
(b) more than 3 sales are made in 20 calls.

(2)

Representatives are required to achieve a mean of at least 5 sales each day.

(c) Find the least number of calls each day a representative should make to achieve this requirement.

(2)
(d) Calculate the least number of calls that need to be made by a representative for the probability of at least
1 sale to exceed 0.95

(3)
(Total 9 marks)
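
The four parts of this question can be sketched as below, assuming calls are independent so that the number of sales in n calls follows B(n, 0.15). The helper `binom_cdf` and the variable names are illustrative, not part of the original question.

```python
from math import comb, ceil

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(0, k + 1))

p = 0.15

p_no_sales = (1 - p) ** 10                 # (a) no sales in 10 calls
p_more_than_3 = 1 - binom_cdf(3, 20, p)    # (b) P(X > 3) in 20 calls

# (c) least n with mean n * p >= 5
calls_for_mean = ceil(5 / p)

# (d) least n with P(at least 1 sale) = 1 - (1 - p)^n > 0.95
calls_for_sale = 1
while 1 - (1 - p) ** calls_for_sale <= 0.95:
    calls_for_sale += 1

print(round(p_no_sales, 4), round(p_more_than_3, 4), calls_for_mean, calls_for_sale)
```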

5. A manufacturer supplies DVD players to retailers in batches of 20. It has 5% of the players returned
because they are faulty.

(a) Write down a suitable model for the distribution of the number of faulty DVD players in a batch.
(2)

Find the probability that a batch contains

(b) no faulty DVD players,

(2)

(c) more than 4 faulty DVD players.

(2)

(d) Find the mean and variance of the number of faulty DVD players in a batch.

(2)
(Total 8 marks)
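
A sketch of the calculations for the DVD player question follows. It assumes, as a modelling choice not stated in the question, that faults occur independently so the number of faulty players in a batch follows B(20, 0.05). The names used are illustrative.

```python
from math import comb

n, p = 20, 0.05   # assumed model: X ~ B(20, 0.05)

def binom_pmf(k):
    """P(X = k) under the assumed binomial model."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

p_none_faulty = binom_pmf(0)
p_more_than_4 = 1 - sum(binom_pmf(k) for k in range(0, 5))

mean = n * p                  # binomial mean
variance = n * p * (1 - p)    # binomial variance

print(round(p_none_faulty, 4), round(p_more_than_4, 4), mean, variance)
```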

5 Statistical hypothesis testing 46 marks

• Understand and apply the language of statistical hypothesis testing, developed through a binomial
model: null hypothesis, alternative hypothesis, significance level, test statistic, 1-tail test, 2-tail test,
critical value, critical region, acceptance region, p-value
• Conduct a statistical hypothesis test for the proportion in the binomial distribution and interpret the
results in context
• Understand that a sample is being used to make an inference about the population
and appreciate that the significance level is the probability of incorrectly rejecting the null hypothesis

1. In a manufacturing process 25% of articles are thought to be defective. Articles are produced in batches
of 20

(a) A batch is selected at random. Using a 5% significance level, find the critical region for a two-tailed test
that the probability of an article chosen at random being defective is 0.25.
You should state the probability in each tail, which should be as close as possible to 0.025.

(5)

The manufacturer changes the production process to try to reduce the number of defective articles. She then
chooses a batch at random and discovers there are 3 defective articles.

(b) Test at the 5% level of significance whether or not there is evidence that the changes to the process have
reduced the percentage of defective articles. State your hypotheses clearly.

(5)
(Total 10 marks)
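
Finding a two-tailed critical region with tail probabilities as close as possible to 0.025, and then running the one-tailed test, can be sketched as follows. This is a hedged sketch under the null model X ~ B(20, 0.25); the helper `binom_cdf` and the variable names are illustrative.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p); zero when k < 0."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(0, k + 1))

n, p, target = 20, 0.25, 0.025

# Lower tail X <= c: choose c whose tail probability is closest to the target
lower = min(range(0, n + 1), key=lambda c: abs(binom_cdf(c, n, p) - target))
# Upper tail X >= c: choose c whose tail probability is closest to the target
upper = min(range(0, n + 1), key=lambda c: abs((1 - binom_cdf(c - 1, n, p)) - target))

lower_tail = binom_cdf(lower, n, p)
upper_tail = 1 - binom_cdf(upper - 1, n, p)

# One-tailed test of H0: p = 0.25 against H1: p < 0.25 with 3 defectives observed
p_value = binom_cdf(3, n, p)
reject_h0 = p_value < 0.05

print(lower, round(lower_tail, 4), upper, round(upper_tail, 4))
print(round(p_value, 4), reject_h0)
```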

2. Sammy manufactures wallpaper. She knows that defects occur randomly in the manufacturing process at
a rate of 1 every 8 metres. Once a week the machinery is cleaned and reset. Sammy then takes a random sample
of 40 metres of wallpaper from the next batch produced to test if there has been any change in the rate of defects.

(a) Stating your hypotheses clearly and using a 10% level of significance, find the critical region for this test.
You should choose your critical region so that the probability of rejection is less than 0.05 in each tail.
(4)

(b) State the actual significance level of this test.

(2)

Thomas claims that his new machine would reduce the rate of defects and invites Sammy to test it. Sammy takes
a random sample of 200 metres of wallpaper produced on Thomas' machine and finds 19 defects.

(c) Using a suitable approximation, test Thomas' claim. You should use a 5% level of significance and state your
hypotheses clearly.
(7)
(Total 13 marks)
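
The wallpaper question can be sketched under the assumption, not stated explicitly in the question, that the number of defects in a length of wallpaper follows a Poisson distribution with mean (length in metres) / 8. The helper names `poisson_cdf` and `normal_cdf` are illustrative, and the normal approximation uses a continuity correction.

```python
from math import exp, factorial, sqrt, erf

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam); zero when k < 0."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(0, k + 1))

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

lam = 40 / 8   # assumed Poisson mean for 40 metres

# Largest c with P(X <= c) < 0.05 and smallest c with P(X >= c) < 0.05
lower = max(c for c in range(0, 50) if poisson_cdf(c, lam) < 0.05)
upper = min(c for c in range(0, 50) if 1 - poisson_cdf(c - 1, lam) < 0.05)
actual_level = poisson_cdf(lower, lam) + (1 - poisson_cdf(upper - 1, lam))

# 200 metres: mean 25, approximated by N(25, 25); test P(X <= 19)
lam_new = 200 / 8
z = (19.5 - lam_new) / sqrt(lam_new)
p_value = normal_cdf(z)
reject_h0 = p_value < 0.05

print(lower, upper, round(actual_level, 4))
print(round(z, 2), round(p_value, 4), reject_h0)
```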

3. Sue throws a fair coin 15 times and records the number of times it shows a head.

(a) State the distribution to model the number of times the coin shows a head.

(2)

Find the probability that Sue records

(b) exactly 8 heads,

(2)

(c) at least 4 heads.

(2)

Sue has a different coin which she believes is biased in favour of heads. She throws the coin 15 times and
obtains 13 heads.

(d) Test Sue's belief at the 1% level of significance. State your hypotheses clearly.

(6)
(Total 12 marks)
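
The coin calculations and the one-tailed test can be sketched as below. This is a minimal sketch, assuming X ~ B(15, 0.5) for a fair coin and testing H0: p = 0.5 against H1: p > 0.5 for the second coin; the names are illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n = 15

p_exactly_8 = binom_pmf(8, n, 0.5)
p_at_least_4 = sum(binom_pmf(k, n, 0.5) for k in range(4, n + 1))

# Test of the second coin: probability of 13 or more heads if the coin were fair
p_value = sum(binom_pmf(k, n, 0.5) for k in range(13, n + 1))
reject_h0 = p_value < 0.01   # 1% significance level

print(round(p_exactly_8, 4), round(p_at_least_4, 4))
print(round(p_value, 4), reject_h0)
```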
4. (a) State the conditions under which the normal distribution may be used as an approximation to the
binomial distribution.

(2)

A company sells seeds and claims that 55% of its pea seeds germinate.

(b) Write down a reason why the company should not justify their claim by testing all the pea seeds they
produce.

(1)
To test the company's claim, a random sample of 220 pea seeds was planted.

(c) State the hypotheses for a two-tailed test of the company's claim.

(1)
Given that 135 of the 220 pea seeds germinated,

(d) use a normal approximation to test, at the 5% level of significance, whether or not the company's claim is
justified.
(7)
(Total 11 marks)
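
The normal approximation for the pea seed test can be sketched as follows. This is a hedged sketch: it assumes X ~ B(220, 0.55) under the null hypothesis, approximates it by a normal distribution with the same mean and variance, and applies a continuity correction. The helper `normal_cdf` is illustrative.

```python
from math import sqrt, erf

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

n, p0, observed = 220, 0.55, 135

mean = n * p0                       # mean under H0
sd = sqrt(n * p0 * (1 - p0))        # standard deviation under H0

# P(X >= 135) with a continuity correction
z = (observed - 0.5 - mean) / sd
upper_tail = 1 - normal_cdf(z)

# Two-tailed test at the 5% level: compare one tail with 2.5%
reject_h0 = upper_tail < 0.025

print(round(z, 2), round(upper_tail, 4), reject_h0)
```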
