
1) S - Data Collection

The document discusses different methods for collecting data including a census, sample, population, and sampling frame. It explains key statistical terms like simple random sampling, systematic sampling, and stratified sampling. Examples are provided to illustrate when different sampling methods should be used based on the characteristics of the population.

Uploaded by

layden.jp6932

Statistics

Data Collection
Twitter: @Owen134866

www.mathsfreeresourcelibrary.com
Prior Knowledge Check

1) Find the mean, median, mode and range of these data sets

a) 1, 3, 4, 4, 6, 7, 8, 9, 11
   Mean = 5.89, median = 6, mode = 4, range = 10

b) 20, 18, 17, 20, 14, 23, 19, 16
   Mean = 18.38, median = 18.5, mode = 20, range = 9

2) Here is a question from a questionnaire surveying TV viewing habits.

How much TV do you watch?
0-1 Hours
1-2 Hours
3-4 Hours

Give 2 criticisms and write an improved version of the question.

Criticisms: overlapping category, nothing above 4, no time period etc.

3) Rebecca records the shoe size, x, of the female students in her year. The results are in the table.

Shoe size, x    Number of students
35              3
36              17
37              29
38              34
39              12

Find the mean shoe size.

37.37
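The answers to questions 1 and 3 can be checked with a short script. This is a minimal sketch in Python using only the standard `statistics` module; the function and variable names are illustrative and are not part of the original slides.

```python
from statistics import mean, median, mode


def summarise(data):
    """Return (mean, median, mode, range) for a list of numbers."""
    return mean(data), median(data), mode(data), max(data) - min(data)


def frequency_mean(table):
    """Mean of a frequency table: sum of (value * frequency) / sum of frequencies."""
    total = sum(value * count for value, count in table.items())
    return total / sum(table.values())


set_a = [1, 3, 4, 4, 6, 7, 8, 9, 11]
set_b = [20, 18, 17, 20, 14, 23, 19, 16]

# Question 3: shoe size -> number of students
shoe_sizes = {35: 3, 36: 17, 37: 29, 38: 34, 39: 12}

print(summarise(set_a))
print(summarise(set_b))
print(round(frequency_mean(shoe_sizes), 2))
```

The means are rounded to 2 decimal places only when displayed, which matches the accuracy used in the answers above.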
Teachings for Exercise 1A
Data Collection
You need to understand a number of key statistical terms

Population
• A population is a set of items that are of interest
• For example, items manufactured in a factory, people working in a company

Census
• A census measures every member of a population

Sample
• A sample is a selection of the population used to estimate information about the population as a whole

Sampling Unit
• A sampling unit is an individual unit of the population

Sampling Frame
• A sampling frame is a list of all the sampling units, each given a number
Census
Advantages:
• Completely accurate result
Disadvantages:
• Time consuming
• Cannot be used if the sampling process would render the items unusable
• Processing a lot of data takes a long time

Sample
Advantages:
• Less time-consuming
• Fewer responses needed
• Less data to process
Disadvantages:
• Data might not be accurate
• Sample might not be properly representative of the population
A supermarket wants to test a delivery of avocados for ripeness by cutting them in half.

a) Suggest a reason why the supermarket should choose a sample rather than a census
• A census would involve testing all the avocados, so then none could be sold!

The supermarket tests a sample of 5 avocados and finds that 4 of them are ripe. They estimate that 80% of the total are ripe.

b) Suggest a way this estimate could be improved
• They could test more avocados!
Teachings for Exercise 1B
Data Collection
You need to know about simple, systematic and stratified sampling

When taking a sample, the key idea is that the sample reflects the population as a whole
• For example, if the population of a herd of cattle is 30% male, then a sample should contain 30% males
Simple random sampling
In a simple random sample, every element in the set has an equal chance of being selected
• Selecting members will involve assigning numbers to all members of the set, and generating numbers at random to choose them

Systematic sampling
In a systematic sample, members are chosen at regular intervals from an ordered list
• Choosing 20 people from a population of 100 might involve choosing every 5th person

Stratified sampling
In a stratified sample, the population is divided into groups (e.g. male and female)
• The quantities chosen randomly from each group should mean that the sample reflects the population as a whole
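Simple random and systematic sampling can be sketched in a few lines of Python. This is an illustration only: the function names are made up for this example, and starting the systematic sample at a random position within the first interval is an assumption, since the slide only says that every 5th person is chosen.

```python
import random


def simple_random_sample(frame, n, rng=random):
    """Choose n units so that every unit in the frame has an equal chance."""
    return rng.sample(frame, n)


def systematic_sample(frame, n, rng=random):
    """Choose every k-th unit from an ordered frame, where k = len(frame) // n.

    Assumption: the starting point is picked at random within the first k units.
    """
    k = len(frame) // n
    start = rng.randrange(k)
    return frame[start::k][:n]


frame = list(range(1, 101))           # sampling frame: units numbered 1 to 100
print(simple_random_sample(frame, 20))
print(systematic_sample(frame, 20))   # every 5th unit
```

Passing a seeded `random.Random` instance as `rng` makes the selection repeatable, which can be useful when checking work.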
The 100 members of a yacht club are listed alphabetically in the club's membership book. The committee wants to take a sample of 12 members to fill in a questionnaire.

a) Explain how they could use a random number generator to generate the sample
• They could assign each member a number from 1-100. Then they could generate 12 different random numbers and choose the members that were assigned those numbers.

b) Explain how they could use a lottery system to generate the sample
• They could write all the members' names on slips of paper, place them in a hat, and then draw out 12 names to make up the sample.
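Both answers can be mimicked in Python with the standard `random` module. This is a rough sketch: the member names are placeholders, because the real membership book is not given.

```python
import random

# Placeholder names standing in for the alphabetical membership book
members = [f"Member {i}" for i in range(1, 101)]

# a) Random number generator: 12 different numbers between 1 and 100
numbers = random.sample(range(1, 101), 12)
by_number = [members[n - 1] for n in numbers]

# b) Lottery: mix up all the names "in the hat", then draw the first 12
hat = members.copy()
random.shuffle(hat)
by_lottery = hat[:12]

print(sorted(numbers))
print(by_lottery)
```

`random.sample` never repeats a number, which corresponds to ignoring repeats when using a calculator's random number function.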
A factory manager wants to find out what his workers think of the canteen facilities. He decides to give a questionnaire to a sample of 80 workers. It is believed that different age groups will have different opinions.

The table shows the number of workers in each age bracket, with the number to be sampled in the final column.

Age      Quantity    Sample
18-32    75          20
33-47    140         37
48-62    85          23

Total workers = 300

a) What sampling method should be used?
• Stratified sampling

b) How many workers should be selected from each age bracket?
• As a fraction, $\frac{75}{300}$ of the workers are from 18-32
• We need the same fraction, but of 80 workers to be selected:

$\frac{75}{300} \times 80 = 20$

$\frac{140}{300} \times 80 = 37.3 \approx 37$

$\frac{85}{300} \times 80 = 22.7 \approx 23$
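The proportional calculation for each age bracket can be written as a small function. This is a minimal sketch in Python with an illustrative function name; it rounds each stratum to the nearest whole worker, as the worked answer does.

```python
def stratified_allocation(strata, sample_size):
    """Share sample_size between strata in proportion to their sizes."""
    total = sum(strata.values())
    return {
        group: round(size * sample_size / total)
        for group, size in strata.items()
    }


workers = {"18-32": 75, "33-47": 140, "48-62": 85}
allocation = stratified_allocation(workers, 80)
print(allocation)
print(sum(allocation.values()))
```

Rounding each stratum separately does not always give a total equal to the sample size. It does here, but with other numbers one stratum may need adjusting by hand.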
Simple random sampling
Advantages:
• Free of bias
• Easy and cheap to implement
• Every unit has an equal chance of selection
Disadvantages:
• Not suitable for a large population or sample size
• A sampling frame is needed

Systematic sampling
Advantages:
• Simple and quick to use
• Suitable for large samples and populations
Disadvantages:
• A sampling frame is needed
• Possible bias as units do not have an equal chance of selection

Stratified sampling
Advantages:
• Sample accurately reflects the population
• Guarantees proportional representation of groups
Disadvantages:
• Population must be classified into groups which can be time-consuming
• Selection within a group has the same issues as simple random sampling
Teachings for Exercise 1C
Data Collection
You need to know about non-random sampling

Quota sampling
In quota sampling, an interviewer or researcher selects a sample that reflects the characteristics of the group
• So an interviewer might meet with factory members to determine their characteristics, and choose the sample from that information

Opportunity sampling
In opportunity sampling, the sample is taken from people available at the time and who fit the criteria needed
• An example would be speaking with people who are leaving a supermarket, in order to get their opinions
Quota sampling
Advantages:
• Allows a small sample to represent the population
• No sampling frame required
• Quick and inexpensive
• Allows comparison between groups
Disadvantages:
• Potential for bias to be introduced
• Can take time to divide the population into groups
• A more in-depth study would require an increasing number of different groups
• Some people might not be willing to take part

Opportunity sampling
Advantages:
• Easy to carry out
• Inexpensive
Disadvantages:
• Unlikely to give a proportional sample
• Researcher's ability can affect the outcome
• People might not want to be interviewed/asked
Teachings for Exercise 1D
Data Collection
There are various different types of data which can be used in statistics

Quantitative data
• Data which is numerical, such as height

Qualitative data
• Data which is non-numerical, such as colour, or worded answers to questions

Discrete data
• Data which can only take certain values. For example, the number of people can only be an integer

Continuous data
• Data which can take any value, the only limitation being how accurately we can measure it, e.g. height
Length of wing (mm)    Number of butterflies, f
30-31                  2
32-33                  25
34-36                  30
37-39                  13

Class Boundaries
• These are the maximum and minimum values that belong in a group

Midpoint
• This is the mean of the class boundaries

Class width
• This is the difference between the upper and lower class boundaries
Using the table of butterfly wing lengths:

Is the length qualitative or quantitative?
• Quantitative

Is the length discrete or continuous?
• Continuous

Write down the class boundaries, midpoint and class width for the class 34-36
• Class boundaries are 33.5 and 36.5
• BE CAREFUL! For continuous data you need to take values from half-way between each group
• Midpoint = 35mm
• Class width = 3mm (using the boundaries above)
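The boundary, midpoint and width calculations follow the same pattern for every class, so they can be sketched as one function. This is a minimal illustration in Python; the function name and the `gap` parameter (the distance between the end of one class and the start of the next, 1mm in this table) are assumptions made for the example.

```python
def class_summary(label, gap=1):
    """Return (lower boundary, upper boundary, midpoint, width) for a class.

    label is written like "34-36"; gap is the distance between the upper
    limit of one class and the lower limit of the next.
    """
    lower, upper = (float(value) for value in label.split("-"))
    lower_boundary = lower - gap / 2   # half-way to the class below
    upper_boundary = upper + gap / 2   # half-way to the class above
    midpoint = (lower_boundary + upper_boundary) / 2
    width = upper_boundary - lower_boundary
    return lower_boundary, upper_boundary, midpoint, width


for label in ["30-31", "32-33", "34-36", "37-39"]:
    print(label, class_summary(label))
```

For the class 34-36 this gives boundaries of 33.5 and 36.5, a midpoint of 35 and a width of 3, in agreement with the answers above.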
Teachings for Exercise 1E
Data Collection
You need to be able to answer exam questions based on large amounts of real data that you will be given

• The large data set contains information recorded over a number of years at weather stations around the world (as shown)
• The data was recorded in both 1987 and 2015, and you will most likely be asked to draw comparisons between the two
• You do not need to memorise the data, but being familiar with it and the locations shown will be useful
The different sets of data recorded are as follows:

Daily mean temperature
• The mean temperature on that day

Daily total rainfall
• The amount of rainfall that day (including snow or hail that has been melted)
• Amounts less than 0.05mm are recorded as 'tr' (trace)

Daily total sunshine
• Recorded to the nearest tenth of an hour

Daily mean wind direction and windspeed
• This is measured in knots according to the Beaufort scale (more on the next slide)

Daily maximum gust
• The highest instantaneous wind speed recorded, in knots

Daily maximum relative humidity
• This is a percentage of air saturation with water. Above 95% leads to mist/fog

Daily mean cloud cover
• Measured in 'oktas', or eighths of the sky covered by cloud

Daily mean visibility
• This is measured in decametres (Dm). The greatest distance at which an object can be seen in daylight

Daily mean pressure
• Measured in hectopascals (1hPa = 100 Newtons per square metre)
Beaufort scale    Descriptive term    Average speed at 10 metres above ground
0                 Calm                Less than 1 knot
1-3               Light               1-10 knots
4                 Moderate            11-16 knots
5                 Fresh               17-21 knots

1 knot is a 'nautical mile per hour', and is equivalent to 1.15mph
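The table can be turned into a simple lookup. This is a sketch in Python under two assumptions: speeds are whole numbers of knots, and only the four rows shown on the slide are covered, so anything above 21 knots returns `None`. The names used are illustrative.

```python
KNOTS_TO_MPH = 1.15  # conversion quoted on the slide

# (Beaufort number(s), descriptive term, highest speed in knots for that row)
BEAUFORT_ROWS = [
    ("1-3", "Light", 10),
    ("4", "Moderate", 16),
    ("5", "Fresh", 21),
]


def describe_wind(knots):
    """Return (Beaufort number(s), descriptive term) for a speed in knots."""
    if knots < 1:
        return "0", "Calm"
    for number, term, highest in BEAUFORT_ROWS:
        if knots <= highest:
            return number, term
    return None  # stronger than the rows given on the slide


def knots_to_mph(knots):
    """Convert knots to miles per hour using 1 knot = 1.15mph."""
    return knots * KNOTS_TO_MPH


print(describe_wind(7), describe_wind(18))
print(knots_to_mph(10))
```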
Camborne
Hurn

Look at the extract given to the right.

a) Describe the type of data represented by daily total rainfall
• Continuous quantitative data
Hurn

Look at the extract given to the right.

Alison is investigating daily maximum gust. She wants to select a sample of size 5 from the first 20 days in Hurn in June 1987. She uses the first two digits of the date as the sampling frame and generates 5 random numbers from 1-20.

b) State the type of sampling method used
• Simple random sampling

c) Why might the method not generate a sample of size 5?
• Some of the values have n/a, meaning no data was recorded that day
Hurn

Using the extract to the right, calculate:

a) The mean daily mean temperature for the first 5 days of June in Hurn in 1987

$\frac{15.1 + 12.5 + 13.8 + 15.5 + 13.1}{5} = 14.0^{\circ}\text{C}$

The table is to 1dp, so maintain this level of accuracy!

b) The median daily total rainfall for the week of 14th June to 20th June inclusive

Values in date order: 0, 3.7, 5.6, 0.1, 7.4, tr, 0
Values in size order: 0, 0, tr, 0.1, 3.7, 5.6, 7.4

Median = 0.1mm

• Trace amounts are slightly larger than 0
• Treat them as 0 for any numerical calculations though!
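Both calculations can be reproduced with Python's standard `statistics` module. This is a minimal sketch: the values are the ones quoted in the working above, and the helper `to_number` is an illustrative name for the rule that trace readings count as 0.

```python
from statistics import mean, median


def to_number(value):
    """Treat a trace reading ('tr') as 0 for numerical calculations."""
    return 0.0 if value == "tr" else float(value)


# Daily mean temperature, 1st to 5th June 1987, Hurn
temperatures = [15.1, 12.5, 13.8, 15.5, 13.1]

# Daily total rainfall in mm, 14th to 20th June 1987, Hurn
rainfall = [0, 3.7, 5.6, 0.1, 7.4, "tr", 0]

mean_temperature = round(mean(temperatures), 1)   # keep 1dp, like the table
median_rainfall = median([to_number(v) for v in rainfall])

print(mean_temperature)
print(median_rainfall)
```

Converting 'tr' to 0 before sorting does not change the median here, because the trace reading already sits below the middle value.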
Hurn

b) The median daily total rainfall for the week of 14th June to 20th June inclusive
• Median = 0.1mm

The median daily total rainfall for the same week in Perth was 19.0mm. Karl states that more southerly countries experience higher rainfall during June.

c) State, with a reason, whether your answer to b) supports this statement
• Perth is in Australia, which is south of the UK, and its median rainfall was higher. However, taking a small sample from a single location in each country means there is not enough data to support the statement.
