1 - Practice Exercise 1 Data Descriptives
PRACTICE EXERCISE 1
DATA TYPE & DESCRIPTIVE STATISTICS
Question 1 – Data
Question 2 – Data
Which of the following describe discrete data?
a) The numbers of people surveyed in each of the next several National Health and
Nutrition Examination Surveys
Discrete
b) The exact foot lengths (cm) of a random sample of statistics students
Not Discrete (Continuous)
c) The exact times that randomly selected drivers spend texting while driving during the
past 7 days
Not Discrete (Continuous)
Question 3
In a survey of 1020 adults in the United States, 44% said that they wash
their hands after riding public transportation (based on data from KRC Research).
a) Identify the sample and the population.
The population consists of all adults in the United States and the sample is the 1020
adults who were surveyed.
b) Is the value of 44% a statistic or a parameter? Why?
Statistic, because it is calculated from the sample and not from the population.
c) What is the level of measurement of the value of 44%? (Nominal, ordinal, interval,
ratio)
Ratio level of measurement
d) Are the numbers of subjects in such surveys discrete or continuous?
Discrete
e) The responses are “yes,” “no,” “not sure,” or “refused to answer.” Are these responses
quantitative data or categorical data?
Categorical data
Total 6 marks
Question 4
The data below shows the ages of patients in an optometry clinic on a certain day.
24 27 32 23 35 34 28 40 28 29 45 51 24 33 42
22 34 21 34 56 38 29 41 44 27 30 63 30 39 49
2 | 1 2 3 4 4 7 7 8 8 9 9
3 | 0 0 2 3 4 4 4 5 8 9
4 | 0 1 2 4 5 9
5 | 1 6
6 | 3
Key: 2|1 = 21
(2marks)
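The stem-and-leaf plot above can be reproduced with a short script. This is a minimal sketch, assuming the tens digit is the stem and the units digit is the leaf (as the key 2|1 = 21 indicates); the function name `stem_and_leaf` is illustrative only.

```python
from collections import defaultdict

# Ages of the 30 patients, as listed in the question.
ages = [24, 27, 32, 23, 35, 34, 28, 40, 28, 29, 45, 51, 24, 33, 42,
        22, 34, 21, 34, 56, 38, 29, 41, 44, 27, 30, 63, 30, 39, 49]


def stem_and_leaf(values):
    """Group each value's units digit (leaf) under its tens digit (stem)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)


plot = stem_and_leaf(ages)
for stem in sorted(plot):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem} | {leaves}")
```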
b) Determine the
a. Median
$$\text{Median} = \frac{33 + 34}{2} = 33.5$$
(2marks)
b. Interquartile range
NOTE – ONE APPROACH
Finding the quartiles
• First arrange the data in ascending order
Case 1: An even number of data values
• Split the data into their upper half and lower half
• Then the median of the upper half is Q3, and the median of the lower
half is Q1.
Case 2: An odd number of data values
• Find the median, Q2, and delete it from the list.
• Split the remaining data into their upper half and lower half.
• Then the median of the upper half is Q3, and the median of the lower
half is Q1.
Interquartile range = upper quartile(Q3) – lower quartile(Q1)
Interquartile range = 41 – 28 = 13
(4marks)
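The split-halves approach described in the note can be sketched in code as follows. The helper name `quartiles` is hypothetical, and this implements only the one approach given here; other textbooks and software packages use slightly different quartile conventions and may give different values.

```python
from statistics import median

# Ages of the 30 patients, as listed in the question.
ages = [24, 27, 32, 23, 35, 34, 28, 40, 28, 29, 45, 51, 24, 33, 42,
        22, 34, 21, 34, 56, 38, 29, 41, 44, 27, 30, 63, 30, 39, 49]


def quartiles(values):
    """Return (Q1, Q2, Q3) using the split-halves approach.

    Even n: the lower and upper halves are the two halves of the sorted data.
    Odd n: the median is deleted before the data are split.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return median(lower), median(data), median(upper)


q1, q2, q3 = quartiles(ages)
iqr = q3 - q1
print(q1, q2, q3, iqr)  # 28 33.5 41 13
```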
c. Mean
$$\text{Mean } \mu = \frac{\sum x}{n} = \frac{1052}{30} = 35.067$$
(2marks)
d. 5% trimmed mean
This is the mean of the ordered data after approximately the lowest 5% and the
highest 5% of the values are excluded. With n = 30, 5% is 1.5 values, so remove the
lowest 2 and the highest 2 values (21, 22, 56 and 63), then find the mean of the
remaining 26 values.
21 22 23 24 24 27 27 28 28 29 29 30 30 32 33
34 34 34 35 38 39 40 41 42 44 45 49 51 56 63
$$\text{5\% trimmed mean} \approx \frac{\sum x}{n} = \frac{890}{26} \approx 34.23$$
(4marks)
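The trimmed-mean arithmetic can be checked with a few lines of code. This is a sketch that assumes, as in the solution above, that 5% of 30 is rounded up to 2 values removed from each end; the function name `trimmed_mean` and its parameter `k` are illustrative.

```python
# Ages of the 30 patients, as listed in the question.
ages = [24, 27, 32, 23, 35, 34, 28, 40, 28, 29, 45, 51, 24, 33, 42,
        22, 34, 21, 34, 56, 38, 29, 41, 44, 27, 30, 63, 30, 39, 49]


def trimmed_mean(values, k):
    """Mean after dropping the k smallest and the k largest values."""
    data = sorted(values)
    kept = data[k:len(data) - k]
    return sum(kept) / len(kept)


# k = 2 is the solution's rounding of 5% of n = 30 (1.5 values per tail).
print(round(trimmed_mean(ages, 2), 2))  # 34.23
```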
e. Standard Deviation
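∑x = 1052
∑x² = 39998
n = 30
$$\sigma = \sqrt{\frac{n\left(\sum x^2\right) - \left(\sum x\right)^2}{n(n-1)}} = \sqrt{\frac{30(39998) - (1052)^2}{30(29)}} = \sqrt{107.168} \approx 10.352$$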
(3marks)
f. Variance
Variance = $\sigma^2 = 10.35219^2 = 107.168$
(1mark)
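The sums and the shortcut formula used for the standard deviation and variance can be verified numerically. The sketch below recomputes ∑x and ∑x² from the raw ages and cross-checks the result against `statistics.stdev`, which also uses the n − 1 divisor; the variable names are illustrative.

```python
import math
from statistics import stdev

# Ages of the 30 patients, as listed in the question.
ages = [24, 27, 32, 23, 35, 34, 28, 40, 28, 29, 45, 51, 24, 33, 42,
        22, 34, 21, 34, 56, 38, 29, 41, 44, 27, 30, 63, 30, 39, 49]

n = len(ages)
sum_x = sum(ages)
sum_x2 = sum(x * x for x in ages)

# Shortcut formula with the n(n - 1) divisor, as in the worked solution.
variance = (n * sum_x2 - sum_x ** 2) / (n * (n - 1))
sd = math.sqrt(variance)

print(sum_x, sum_x2)        # 1052 39998
print(round(sd, 3))         # 10.352
print(round(variance, 3))   # 107.168
```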
OR
(4marks)
d) Hence describe the shape of the distribution by discussing its skewness and normality.
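The data are positively skewed (the mean, 35.067, exceeds the median, 33.5, and the
upper tail stretches out to 63) and do not follow a bell-shaped curve; hence the
distribution is not normal.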
(3marks)
Total 25 marks
Question 5 - Histogram
Answer the questions by referring to the following histogram, which represents the sepal
widths (mm) of a sample of irises.
a) Based on the histogram, what is the approximate number of irises in the sample?
Approximately 50
b) What is the class width? What are the approximate lower-class and upper-class limits
of the first class?
Class Width is the difference between two consecutive lower-class limits (or two
consecutive lower-class boundaries) in a frequency distribution.
That is, 2.5 – 2 = 0.5.
Lower-class limit = 2
Upper-class limit = 2.5
d) Does it appear that the sample is from a population having a normal distribution?
The sample does not seem to be normally distributed, since it does not follow a
bell-shaped curve.
$$\bar{x} = \frac{31}{10} = 3.10$$
Yes, the student did make the dean’s list.
(5marks)
Question 7 – z scores
Body Data
Females have pulse rates with a mean of 74.0 beats per minute (BPM) and a standard
deviation of 12.5 beats per minute, and the maximum is 104 BPM.
a) What is the difference between the maximum and the mean?
Difference = 104 – 74.0 = 30
(1mark)
b) How many standard deviations is that [the difference found in part (a)]?
That is, $\frac{\text{Difference}}{\sigma} = \frac{30}{12.5} = 2.4$ standard deviations.
c) Convert the maximum pulse rate to a z score.
$$z = \frac{X - \mu}{\sigma} = \frac{104 - 74}{12.5} = 2.4$$
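As a quick check of the z score conversion, the formula can be written as a small function. This is only a sketch; the name `z_score` and its parameters are illustrative.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# Maximum pulse rate of 104 BPM, mean 74.0 BPM, standard deviation 12.5 BPM.
z = z_score(104, 74.0, 12.5)
print(z)  # 2.4
```

The z score equals the answer to part (b) because both divide the same difference, 30 BPM, by the same standard deviation.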
a) Identify the class width, class midpoints, and class boundaries for the given frequency
distribution.
Class Width = 100
Class midpoints. x = 49.5, 149.5, 249.5, 349.5, 449.5, 549.5, 649.5
Class Boundaries = -0.5, 99.5, 199.5, 299.5, 399.5, 499.5, 599.5, 699.5.
(3marks)
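The class width, midpoints and boundaries can be derived mechanically from the class limits. The frequency table itself is not reproduced here, so the sketch below assumes class limits of 0–99, 100–199, ..., 600–699, which is what the listed midpoints imply; treat those limits as an assumption.

```python
# Assumed class limits (inferred from the midpoints 49.5, 149.5, ..., 649.5).
lower_limits = [0, 100, 200, 300, 400, 500, 600]
upper_limits = [99, 199, 299, 399, 499, 599, 699]

# Class width: difference between two consecutive lower class limits.
class_width = lower_limits[1] - lower_limits[0]

# Class midpoint: average of the lower and upper limits of each class.
midpoints = [(lo + hi) / 2 for lo, hi in zip(lower_limits, upper_limits)]

# Class boundaries: split the gap between one class's upper limit and the
# next class's lower limit, and extend half a gap beyond both ends.
gap = lower_limits[1] - upper_limits[0]
boundaries = [lo - gap / 2 for lo in lower_limits] + [upper_limits[-1] + gap / 2]

print(class_width)
print(midpoints)
print(boundaries)
```

Seven classes always have eight boundaries, one more than the number of classes.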
b) Construct a histogram and bar chart.
Histogram
Bar Graph
(4marks)
d) Ignore the given frequencies. Assume that the first three frequencies are 2, 12, and 18,
respectively. Assuming that the distribution of the 153 sample values is a normal
distribution, identify the remaining four frequencies.
$$\bar{x} = \frac{\sum xf}{\sum f} = \frac{49.5(1) + 149.5(51) + 249.5(90) + 349.5(10) + 449.5(0) + 549.5(0) + 649.5(1)}{1 + 51 + 90 + 10 + 0 + 0 + 1}$$
$$= \frac{49.5 + 7624.5 + 22455 + 3495 + 0 + 0 + 649.5}{153} = \frac{34273.5}{153}$$
$$\bar{x} = 224.0098$$
(3marks)
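The weighted sum in the grouped-mean calculation can be checked with a short script. The midpoints and frequencies below are the ones used in the working above; the variable names are illustrative.

```python
# Class midpoints and class frequencies taken from the worked solution.
midpoints = [49.5, 149.5, 249.5, 349.5, 449.5, 549.5, 649.5]
freqs = [1, 51, 90, 10, 0, 0, 1]

n = sum(freqs)                                           # sum of f
sum_xf = sum(x * f for x, f in zip(midpoints, freqs))    # sum of x * f
mean = sum_xf / n

print(n, sum_xf, round(mean, 4))  # 153 34273.5 224.0098
```

Because each value is replaced by its class midpoint, this mean is an approximation to the mean of the original 153 sample values.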
f) Find the standard deviation by using the formula below, where x represents the
class midpoint, f represents the class frequency, and n represents the total number
of sample values.
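n = ∑f = 153
$$s = \sqrt{\frac{n\left[\sum (f \cdot x^2)\right] - \left[\sum (f \cdot x)\right]^2}{n(n-1)}} = \sqrt{\frac{153(8388188.25) - (34273.5)^2}{153(153 - 1)}}$$
$$s = \sqrt{\frac{1283392802.25 - 1174672802.25}{23256}} = \sqrt{4674.9226}$$
s = 68.37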
(4marks)
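The grouped standard deviation can be verified the same way as the grouped mean. The sketch below recomputes ∑(f·x) and ∑(f·x²) from the midpoints and frequencies used in the working, then applies the shortcut formula with the n(n − 1) divisor; the variable names are illustrative.

```python
import math

# Class midpoints and class frequencies taken from the worked solution.
midpoints = [49.5, 149.5, 249.5, 349.5, 449.5, 549.5, 649.5]
freqs = [1, 51, 90, 10, 0, 0, 1]

n = sum(freqs)
sum_fx = sum(f * x for x, f in zip(midpoints, freqs))
sum_fx2 = sum(f * x * x for x, f in zip(midpoints, freqs))

# Shortcut formula for the standard deviation of a frequency distribution.
variance = (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))
s = math.sqrt(variance)

print(sum_fx2)              # 8388188.25
print(round(variance, 4))   # 4674.9226
print(round(s, 2))          # 68.37
```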
g) Hence, find the variance.
Variance = $s^2 = 4674.9226$ (the quantity under the square root above, before s is rounded to 68.37)
(1mark)
Total 20 marks