Exercise - 1 - On Lectures I and II - 2025

The document outlines an exercise for an introductory statistics and probability course, covering various topics including descriptive and inferential statistics, sampling methods, random variables, and data representation techniques. It includes specific questions and tasks related to statistical concepts, data analysis, and practical applications, such as constructing graphs and calculating frequencies. The exercise aims to enhance students' understanding of statistical principles and their application in real-world scenarios.

Uploaded by

Bless Tetteh

DEPARTMENT OF STATISTICS & ACTUARIAL SCIENCE

SCHOOL OF PHYSICAL & MATHEMATICAL SCIENCES


FIRST SEMESTER, 2024/2025 ACADEMIC YEAR
B.A/B.Sc. STATISTICS
STAT 111: INTRODUCTION TO STATISTICS AND PROBABILITY 1

EXERCISE 1 ON LECTURES 1 AND 2

1. Briefly explain the difference between the terms in each of the following pairs:


(a) Descriptive and Inferential Statistics.
(b) Population and Sample.
(c) Parameter and Statistic.
(d) Census and Sample survey.
(e) Primary data and Secondary data.

2. What is a sampling error?

3. Describe the four methods of collecting samples from a population.

4. (a) Define a random variable.


(b) State and explain the two classifications of variables, with examples.
(c) Differentiate between discrete and continuous random variables.

5. A sociologist wishes to estimate the proportion of all adults in a certain region who have
never been married. In a random sample of 1,320 adults, 145 have never married, hence
145∕1320 ≈ 0.11 or about 11% have never married.
(a) What is the population of interest?
(b) What is the parameter of interest?
(c) What is the statistic involved?

STAT 111 Exercise 1 2024/2025 Page 1 of 9


6. What is one of the distinctions between a population parameter and a sample statistic?
(a) A population parameter is only based on conceptual measurements, but a sample
statistic is based on a combination of real and conceptual measurements.
(b) A sample statistic changes each time you try to measure it, but a population
parameter remains fixed.
(c) A population parameter changes each time you try to measure it, but a sample
statistic remains fixed across samples.
(d) The true value of a sample statistic can never be known but the true value of a
population parameter can be known.
7. A magazine printed a survey in its monthly issue and asked readers to fill it out and send
it in. Over 1000 readers did so. This type of sample is called
(a) a cluster sample.
(b) a self-selected sample.
(c) a stratified sample.
(d) a simple random sample.
8. Which one of the following variables is not categorical?
(a) Age of a person.
(b) Gender of a person: male or female.
(c) Choice on a test item: true or false.
(d) Marital status of a person (single, married, divorced, other)
9. Which one of these statistics is unaffected by outliers?
(a) Mean
(b) Interquartile range
(c) Standard deviation
(d) Range
10. Which of the following would indicate that a dataset is not bell-shaped?
(a) The range is equal to 5 standard deviations.
(b) The range is larger than the interquartile range.
(c) The mean is much smaller than the median.
(d) There are no outliers

11. A scatter plot of number of teachers and number of people with college degrees for
institutions in Ghana reveals a positive association. The most likely explanation for this
positive association is:
(a) Teachers encourage people to get college degrees, so an increase in the number of
teachers is causing an increase in the number of people with college degrees.
(b) Larger institutions tend to have both more teachers and more people with college
degrees, so the association is explained by a third variable, the size of the
institution.
(c) Teaching is a common profession for people with college degrees, so an increase
in the number of people with college degrees causes an increase in the number of
teachers.
(d) Institutions with higher incomes tend to have more teachers and more people
going to college, so income is a confounding variable, making causation between
number of teachers and number of people with college degrees difficult to prove.
12. The scale of measurement that allows meaningful comparison of both the magnitude of
differences and the ratios (proportions) between values is called the
(a) satisfactory scale
(b) ratio scale
(c) goodness scale
(d) exponential scale
13. Give an example of a population and two different characteristics that may be of interest.
14. Median, mode, deciles and percentiles are all considered as measures of
(a) mathematical averages
(b) population averages
(c) sample averages
(d) averages of position

15. If the number of observations is 30 and the arithmetic mean is 15, then the sum of all
values is
(a) 15
(b) 450
(c) 200
(d) 45

16. If the most frequently repeated observations are outliers in the data, then the mode is considered a
(a) intended measure
(b) percentage measure
(c) best measure
(d) poor measure
17. Explain, with examples, the four levels of measurement.

18. Consider the set of all students enrolled in STAT 111. Suppose you are interested in
learning about the current grade point averages (GPAs) of this group.
(a) Define the population and variable of interest.
(b) Is the variable qualitative or quantitative?
(c) Suppose you determine the GPA of every member of the class. Would this represent
a census or a sample?
(d) Suppose you determine the GPA of 10 members of the class. Would this represent a
census or a sample?
(e) If you determine the GPA of every member of the class and then calculate the
average, how much reliability does this have as an "estimate" of the class average
GPA?
(f) What must be true in order for the sample of 10 students you select from your class to
be considered as a random sample?

19. Define the following:


(a) Relative frequency
(b) Cumulative frequency
(c) Percentage frequency

20. The ages (in years) of 30 employees of a certain company are given below

55 25 34 66 28
32 28 26 56 48
47 61 25 24 31
24 28 44 37 51
47 35 32 36 50
37 36 27 27 44
(a) Represent the above information on a stem and leaf diagram.

(b) Determine the number of classes required.


(c) Find the range.
(d) Determine the class width.
(e) Construct a frequency distribution, a relative frequency distribution and a cumulative
frequency distribution for the above data.
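
Parts (a) to (d) can be checked by machine with a short Python sketch such as the one below. The rule used for the number of classes (the smallest k with 2^k ≥ n) is an assumption; the lectures may prescribe a different rule, such as Sturges' formula, which can give a different answer. Rounding the class width up to a whole year is also an assumption.

```python
from collections import defaultdict
import math

ages = [55, 25, 34, 66, 28,
        32, 28, 26, 56, 48,
        47, 61, 25, 24, 31,
        24, 28, 44, 37, 51,
        47, 35, 32, 36, 50,
        37, 36, 27, 27, 44]

# (a) Stem-and-leaf: stem = tens digit, leaf = units digit.
stems = defaultdict(list)
for age in sorted(ages):
    stems[age // 10].append(age % 10)
for stem in sorted(stems):
    print(stem, "|", " ".join(str(leaf) for leaf in stems[stem]))

# (b) Number of classes: smallest k with 2**k >= n (assumed rule).
n = len(ages)
k = math.ceil(math.log2(n))

# (c) Range, and (d) class width rounded up to a whole year (assumed).
data_range = max(ages) - min(ages)
class_width = math.ceil(data_range / k)
print("n =", n, "classes =", k, "range =", data_range, "width =", class_width)
```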

21. For a random sample of 25 units selected from a target population, a qualitative variable with
four classes (A, B, C, D) is measured on each unit. The results are presented below:
A A C D B B B C C D A
C B A A D A B D A A B
D B B
(i) Find the frequency for each of the four classes.
(ii) Compute the relative frequency for each class.
(iii) Draw a bar chart to represent the information in (i) above.
(iv) From your bar chart, determine which class has the highest number of units.
(v) Draw a pie chart to display the information in (ii) above.
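
The tallies for parts (i) and (ii) can be verified with a minimal Python sketch like this one. The variable names are illustrative only, and the charts themselves are left to be drawn by hand.

```python
from collections import Counter

# The 25 observations, in the order listed in the question.
observations = (
    "A A C D B B B C C D A "
    "C B A A D A B D A A B "
    "D B B"
).split()

n = len(observations)

# (i) Frequency of each class.
freq = Counter(observations)

# (ii) Relative frequency = class frequency / total number of units.
rel_freq = {label: freq[label] / n for label in sorted(freq)}

for label in sorted(freq):
    print(label, freq[label], round(rel_freq[label], 2))
```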

22. Explain the difference between a bar graph and a histogram.

23. Consider the relative frequency table of 200 measurements below:

Class Interval     Relative Frequency
0.5 – < 4.5        0.05
4.5 – < 8.5        0.35
8.5 – < 12.5       0.15
12.5 – < 16.5      0.20
16.5 – < 20.5      0.10
20.5 – < 24.5      0.05
24.5 – < 28.5      0.10

(a) Draw a relative frequency histogram.


(b) Calculate the number of measurements falling into each of the measurement classes.
(c) Draw a frequency histogram for the number of measurements.
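
For part (b), each class frequency is the relative frequency multiplied by the 200 measurements. A small Python sketch of that arithmetic follows; the interval labels are plain strings chosen for illustration.

```python
total = 200

# Relative frequencies copied from the table in the question.
rel_freq = {
    "0.5 to <4.5": 0.05,
    "4.5 to <8.5": 0.35,
    "8.5 to <12.5": 0.15,
    "12.5 to <16.5": 0.20,
    "16.5 to <20.5": 0.10,
    "20.5 to <24.5": 0.05,
    "24.5 to <28.5": 0.10,
}

# Frequency = relative frequency x total; round() removes floating-point noise.
freq = {interval: round(r * total) for interval, r in rel_freq.items()}

for interval, count in freq.items():
    print(interval, count)
```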

24. How many classes are to be used for grouping a data set with 50 observations?

25. The International Rhino Federation estimates that there are 17,470 rhinoceroses living in the
wild of Africa and Asia. A breakdown of the number of rhinos of each species is reported in
the accompanying table:
Rhino Species Population estimate
African Black 3,610
African White 11,100
Asian Sumatran 300
Asian Javan 60
Asian Indian 2,400
Total 17,470

(a) Construct a table of relative and cumulative frequencies for the data.
(b) Construct a bar graph for the relative frequencies.

(c) What proportion of the rhinos are:
(i) African rhinos
(ii) Asian rhinos
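
The relative and cumulative frequencies in part (a), and the proportions in part (c), can be checked with a sketch along these lines. The cumulative column follows the row order of the table, which is an assumption because the species have no natural ordering.

```python
from itertools import accumulate

# Population estimates copied from the table in the question.
estimates = {
    "African Black": 3610,
    "African White": 11100,
    "Asian Sumatran": 300,
    "Asian Javan": 60,
    "Asian Indian": 2400,
}

total = sum(estimates.values())

# (a) Relative frequency and running (cumulative) relative frequency.
rel = {species: count / total for species, count in estimates.items()}
cum = dict(zip(rel, accumulate(rel.values())))

for species in estimates:
    print(species, round(rel[species], 4), round(cum[species], 4))

# (c) Proportions by continent, grouped on the species-name prefix.
african = sum(r for s, r in rel.items() if s.startswith("African"))
asian = sum(r for s, r in rel.items() if s.startswith("Asian"))
print("African:", round(african, 4), "Asian:", round(asian, 4))
```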

26. Fifteen students (named A-O) took classes in statistics and biology. The marks earned by
each of the students are shown below.

Student A B C D E F G H I J K L M N O
Statistics 74 53 67 63 77 57 60 47 76 54 80 92 53 52 80
Biology 60 68 64 66 71 66 55 71 82 73 84 59 63 55 79

(a) Summarize this information on stem and leaf displays.


(b) What are the highest and lowest scores in statistics? Biology?
(c) Find the sum of the scores earned by each student in the two subjects and represent
your answer on a stem and leaf diagram (Use a stem unit of tens and a leaf unit of
ones).

27. Chance (Spring 2000) reported on a study to estimate the number of pennies required to fill a
coin collectors’ album. The data used in the study were obtained by noting the mint date on
each in a sample of 2,000 pennies. The distribution of mint dates is summarized in the
following table:

Mint Date Number


Pre-1960 18
1960s 125
1970s 330
1980s 727
1990s 800
(a) Identify the experimental unit for the study.
(b) Identify the variable measured.
(c) What proportion of pennies in the sample have mint dates in the 1960s?
(d) Construct a pie chart to describe the distribution of mint dates for the 2,000 sampled
pennies.

28. The following data represent the annual income (in thousand cedis) for a sample of 12
households in Ghana:
23 , 17 , 32 , 60 , 22 , 52 , 29 , 38 , 42 , 92 , 27 , 46.
(a) Construct a box and whisker plot for the data.
(b) Find the lower and upper inner fences.
(c) Determine the smallest and largest value within the two inner fences.
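
A hedged Python sketch for checking parts (b) and (c) is given below. It assumes the quartiles are located at position (n + 1)p with linear interpolation, which is the default ("exclusive") method of `statistics.quantiles` in Python 3.8 and later, and it assumes the inner fences lie 1.5 × IQR beyond the quartiles. If the lectures use a different quartile convention, the fence values will differ slightly.

```python
import statistics

incomes = [23, 17, 32, 60, 22, 52, 29, 38, 42, 92, 27, 46]

# Quartiles by the (n + 1)p position rule (assumed convention).
q1, q2, q3 = statistics.quantiles(incomes, n=4)
iqr = q3 - q1

# (b) Inner fences: 1.5 x IQR below Q1 and above Q3 (assumed definition).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# (c) Most extreme observations that still lie inside the fences.
inside = [x for x in incomes if lower_fence <= x <= upper_fence]
smallest_inside = min(inside)
largest_inside = max(inside)

print("Q1, median, Q3:", q1, q2, q3)
print("Inner fences:", lower_fence, upper_fence)
print("Smallest and largest inside:", smallest_inside, largest_inside)
```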

29. Colleges and universities are requiring an increasing amount of information about applicants
before making acceptance and financial aid decisions. Classify each of the following types of
data required on a college application as quantitative or qualitative.
(a) High school Aggregated Score (WASSCE)
(b) High school class rank
(c) Applicant’s score on the Scholastic Assessment Test (SAT) or American College
Testing (ACT)
(d) Gender of applicant
(e) Parents’ income
(f) Age of applicant

30. Use the table below to construct an ogive.

Days to maturity     Cumulative relative frequency
30                   0.000
40                   0.075
50                   0.100
60                   0.300
70                   0.550
80                   0.725
90                   0.900
100                  1.000

31. Consider the stem-and-leaf display shown here:

Stem Leaf
5 1
4 4 5 7
3 0 0 0 3 6
2 1 1 3 4 5 9 9
1 2 2 4 8

(a) How many observations were in the original data set?


(b) Re-create all the numbers in the data set and construct a dot plot.
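
A stem-and-leaf display can be expanded back into raw values as sketched below. The sketch assumes a stem unit of tens and a leaf unit of ones, which the display itself does not state, and it prints a rough text version of a dot plot.

```python
from collections import Counter

# Stems and leaves copied from the display (stem unit assumed to be tens).
display = {
    5: [1],
    4: [4, 5, 7],
    3: [0, 0, 0, 3, 6],
    2: [1, 1, 3, 4, 5, 9, 9],
    1: [2, 2, 4, 8],
}

# Each observation = 10 * stem + leaf.
values = sorted(10 * stem + leaf
                for stem, leaves in display.items()
                for leaf in leaves)
print(len(values), "observations:", values)

# Text dot plot: one dot per occurrence of each distinct value.
counts = Counter(values)
for value in sorted(counts):
    print(f"{value:>3} " + "." * counts[value])
```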

32. The table below displays the population (in millions) and the number of violent crimes (in
millions) in the US from 1982 to 1989.

Year             1982  1983  1984  1985  1986  1987  1988  1989
US Population    231   234   239   241   243   246   248   249
Violent Crimes   1.32  1.26  1.33  1.49  1.48  1.57  1.65  1.82

(a) Draw a line graph representing the trend in violent crime over time.
(b) Draw a scatter plot of population versus violent crime, comment on the scatter plot.

33. You are being consulted by a level 400 psychology student on her final project. Her work entails
estimating the population of students in the University of Ghana who are claustrophobic, i.e.,
having an extreme or irrational fear of confined places.
(a) What is the population in her case?
(b) Would you advise a complete enumeration of the population or a survey? Give reasons.
(c) Describe a sampling method you think will help her obtain the best estimate of the
proportion of claustrophobic students in the population. Justify the selected sampling
method and enumerate its advantages over at least one other sampling method you
know.

34. The following are figures on an oil well’s daily production in barrels.
214 , 204 , 226 , 198 , 243 , 225 , 207 , 203 , 209 , 200 , 217 , 202 , 208 , 212 ,
205 , 220 , 200 , 208 , 191 , 202 , 201 , 208 , 200 , 198
(a) Construct a stem-and-leaf plot with stems 19, 20, … , 24.
(b) Use the stem-and-leaf plot to find the quartiles of the data.
(c) Find the modal production.
(d) Calculate the 95th and 64th percentiles.
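
Parts (a) and (c) can be checked with a short Python sketch such as this one. It builds the plot with two-digit stems 19 to 24 and unit-digit leaves, and it uses `statistics.multimode` (Python 3.8 and later) so that tied modes are all reported. Quartiles and percentiles are left out because their values depend on the convention taught in the lectures.

```python
import statistics

production = [214, 204, 226, 198, 243, 225, 207, 203, 209, 200, 217, 202, 208, 212,
              205, 220, 200, 208, 191, 202, 201, 208, 200, 198]

# (a) Stem = hundreds and tens digits (19 to 24), leaf = units digit.
plot = {stem: sorted(x % 10 for x in production if x // 10 == stem)
        for stem in range(19, 25)}
for stem, leaves in plot.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))

# (c) Mode(s): every value that attains the highest frequency.
modes = sorted(statistics.multimode(production))
print("Modal production:", modes)
```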

35. The table below shows how support by voters for the main political parties in England in 2015
varied with a range of demographic factors. Voters are classified by sex, age group, socio-
economic group, location, and ethnicity. Each figure given is an index showing support by a
group for a party. A figure of 100 indicates that support is at the national average level; a figure
of 105 shows that support is 5% higher than the national average level; a figure of 90 indicates
that support is 10% less than the national average level. (The socio-economic group AB is
upper middle class and middle class, C1 is lower middle class, C2 is skilled working class, DE
is working class and those at the lowest level of subsistence.)
(i) Summarise how sex is related to political party preference in the data. Draw a
suitable graph, or graphs, to illustrate your answer.
(ii) The political parties are shown in order from what are generally considered to be the
most left wing (Green) to most right wing (UKIP). Identify the main patterns in
political preferences as they vary by age. Draw a suitable graph, or graphs, to illustrate
your answer.
(iii) Of the sixteen categories in the table, identify the one with a profile most like the
national average, and the one with a profile least like the national average. Discuss
briefly why these groups might be expected to have such profiles.
(iv) Show that about 60% of Green supporters are female. Find the corresponding
percentage for UKIP supporters. State any assumptions that you have made in
answering those questions.
(v) A newspaper article noted the figures of 92 for the level of support among people in
the 18–24 age range for the Liberal Democrats and for the Conservatives. The article
stated that support in this age-group was evenly split between these two parties. Explain
why that is the wrong conclusion to draw from the data, and state the correct
conclusion.
