
STATISTICS FINAL EXAM (MPM) Answer Sheet

The document provides instructions for a statistics exam, including details about submission requirements and a sample answer sheet with multiple cases/questions. Case one defines sampling and complete enumeration, outlining key advantages of sampling such as reduced cost, greater accuracy, and feasibility. Case two discusses the five stages of statistical investigation: data collection, organization, presentation, analysis, and inference. Case three provides population and sample details to identify variables of interest, data sources, sampling techniques. Case four presents a probability question about drawing balls from two boxes.

Uploaded by

yetm

Program –Masters Project Management (MPM)

STATISTICS FINAL EXAM

(BRIDGING COURSE) FINAL EXAM

NAME: YETMGETA AMTATE ESHETE ID NO: MPM/484/15


SECTION: ONE(1)

Instructor’s Name ✓
Hussen

Instructions: Please carefully read the instructions below


1 The exam must be taken completely alone. Checking your exam answers or discussing any of the
materials or concepts with any other person is forbidden. If students' exam answers are found to be
the same, they shall be voided automatically.

2 You can work on the exam for 48 hrs. Please submit the exam answer file, typed in a Word document
or handwritten, to your course instructor at 2:30 AM local time on Saturday, January
28, 2023, at the CPU-3 instructor's office. Unequivocally no postponements: late submissions are totally
unacceptable. The only option for submission is hard copy; problems will not be accepted
in any case.

4 Once the exam answer file is submitted for grading, no requests for amendments or supplements will be
permitted.

Answer sheet– Write your answers below

CASE ONE:

#1: State briefly the relative importance of sampling over complete enumeration. ( 5 pts)
Sampling theory provides the tools and techniques for data collection, keeping in mind the
objectives to be fulfilled and the nature of the population.

There are two ways of obtaining the information.

1. Sample surveys

2. Complete enumeration or census.

Census: The complete count of the population is called a census. In a census, observations are
collected on all the sampling units in the population. For example, in India the census is
conducted every tenth year, and observations on all the persons staying in India are collected.
Sample: One or more sampling units are selected from the population according to some specified
procedure. Such a collection of units, which consists of only a portion of the population units,
is called the sample.

ADVANTAGES OF SAMPLING OVER COMPLETE ENUMERATION:

1. Reduced cost and enlarged scope.

Sampling involves the collection of data on a smaller number of units in comparison to the
complete enumeration, so the cost involved in the collection of information is reduced. Further,
additional information can be obtained at little cost in comparison to conducting another separate
survey. For example, when an interviewer is collecting information on health conditions, then
he/she can also ask some questions on health practices. This will provide additional information
on health practices, and the cost involved will be much less than conducting an entirely new survey
on health practices.

2. Organization of work:

It is easier to manage the collection of data on a smaller number of units than on all the units
in a census. For example, in order to draw a representative sample from a state, it is easier to
draw small samples from every city than to draw the sample from the whole state at one
time. Better organization provides better data, which in turn yields more accurate statistical
inferences.

3. Greater accuracy:

The persons involved in the collection of data are trained personnel. They can collect the data more
accurately when they have to cover a smaller number of units rather than a large number of units.

4. Urgent information required:

The data from a sample can be quickly summarized. For example, crop production can be forecast
more quickly from a sample of data than by first collecting all the observations.

5. Feasibility:

Conducting the experiment on a smaller number of units is more feasible, particularly when the
units are destroyed in the process. For example, in determining the life of bulbs, it is more
feasible to burn out only a minimum number of bulbs. Similarly, in any medical experiment, it is
more feasible to use fewer animals.

CASE TWO:
Briefly discuss the purpose and meaning of the different stages of Statistical investigation. (5 pts)

#2: Stages in Statistical Investigation


There are five stages or steps in any statistical investigation.

1. Collection of data: the process of measuring, gathering, and assembling the raw data upon
which the statistical investigation is to be based. It covers the methods that are to be
employed for obtaining the required information from the units under investigation.
Importance
i. Low cost and universal
ii. Free from biases.
iii. Respondents have adequate time to respond
iv. Fairly approachable

2. Organization of data: the summarization of data in some meaningful way. Data organization
is the practice of categorizing and classifying data to make it more usable. Similar to a file
folder, where we keep important documents, you'll need to arrange your data in the most
logical and orderly fashion, so that you and anyone else who accesses it can easily find
what they're looking for.

Why is data organization important?


Good data organization strategies are important because your data contains the keys to managing
your company’s most valuable assets. Getting insights out of this data could help you obtain better
business intelligence and play a major role in your company’s success.

3. Presentation of the data: the process of re-organization, classification, compilation,
and summarization of data to present it in a meaningful form. Data presentation also covers
comparing two or more data sets with visual aids, such as graphs. Using a graph, you
can represent how the information relates to other data. Presentation helps organise
information by visualising it and putting it into a more readable format.

Importance of Data Presentation


Data presentation tools are powerful communication tools: they simplify data by making it
easily understandable and readable, attract and keep the interest of readers, and
effectively showcase large amounts of complex data in a simplified manner.

4. Analysis of data: the process of extracting relevant information from the summarized
data, mainly using elementary mathematical operations. It is the process of systematically
applying statistical and/or logical techniques to describe and illustrate, condense and recap,
and evaluate data.

Why Is Data Analytics Important?

Data Analysis is essential as it helps businesses understand their customers better, improves sales,
improves customer targeting, reduces costs, and allows for the creation of better problem-solving
strategies.

5. Inference of data: the interpretation and further observation of the various statistical
measures obtained through the analysis of the data, using methods by which
conclusions are formed and inferences made. Statistical inference is the process of drawing
conclusions about an underlying population based on a sample or subset of the data. In most
cases, it is not practical to obtain all the measurements in a population.

Importance of Statistical Inference

Inferential statistics is important for examining the data properly. Proper data analysis is
needed to interpret the research results and reach an accurate conclusion. It is mainly used
for predicting future observations in different fields.

CASE THREE:
CPU College has registered 12,000 students for the last four years. The college administration
would like to know the number of students who have participated in co-curricular activities. For
the purpose of the study, the administrator collected the names of 400 students from the files by
taking proportional number of students from each of the years (batches) for interview. (5 pts)
Based on the above information, find
a. The variable of interest
b. The source of data (primary or secondary)
c. The population
d. The sample
e. The sampling technique used

#3:
a. The variable of interest
Participation in co-curricular activities, i.e. the number of students who have participated in
co-curricular activities.
b. The source of data
Secondary data, because the administrator of CPU College collected the names of the students from
the files rather than taking them directly from the students.
c. The population
All the students registered at CPU College in the last four years, i.e. the 12,000
students are the population.
d. The sample
The students whom the college administrator selected from all the students of
the college by taking a proportional number of students from each of the
years (batches) for interview,
i.e. the 400 students are the sample.
e. The sampling technique used
The sampling technique is stratified sampling, because the population, which is heterogeneous,
is first divided into groups (strata) according to batches, and a proportional number of
students is then taken from each stratum.
CASE FOUR:
Suppose one box contains 5 black and 3 white balls and a second box contains 4 black and 6 white
balls if one ball is drawn from each box, what is the probability that…………(3 pts)

a) Both are black.


b) both are white.
c) 1 white and 1 black
#4:
a) Both are black.
The first box contains 8 balls: 5 black and 3 white.
The second box contains 10 balls: 4 black and 6 white.
One ball is drawn from each box, so the two draws are independent and their probabilities multiply.
$P(\text{both are black}) = \frac{5}{8} \times \frac{4}{10} = \frac{1}{4}$
b) Both are white.
$P(\text{both are white}) = \frac{3}{8} \times \frac{6}{10} = \frac{9}{40}$
c) 1 white and 1 black.
The sample space of the first box has 8 balls:
$P_1(B) = 5/8$, $P_1(W) = 3/8$
The sample space of the second box has 10 balls:
$P_2(B) = 4/10 = 2/5$, $P_2(W) = 6/10 = 3/5$
One white and one black ball can be obtained in two mutually exclusive ways: 1 white from the
first box and 1 black from the second box, $P(W_1 \cap B_2)$, or 1 white from the second box and
1 black from the first box, $P(W_2 \cap B_1)$.
So that
$P(W \cap B) = P(W_1 \cap B_2) + P(W_2 \cap B_1)$
$= P(W_1)P(B_2) + P(W_2)P(B_1)$
$= \frac{3}{8} \times \frac{4}{10} + \frac{6}{10} \times \frac{5}{8}$
$= \frac{3}{20} + \frac{3}{8} = \frac{6 + 15}{40} = \frac{21}{40}$
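
The three probabilities in Case Four can be cross-checked with exact fractions. The following is a minimal sketch, assuming the two draws are independent; the names `box1`, `box2` and `prob` are illustrative and not part of the exam.

```python
from fractions import Fraction

# Box contents as stated in the question.
box1 = {"black": 5, "white": 3}
box2 = {"black": 4, "white": 6}


def prob(box, colour):
    """Probability of drawing `colour` in a single draw from `box`."""
    return Fraction(box[colour], sum(box.values()))


# Independent draws: multiply the per-box probabilities.
both_black = prob(box1, "black") * prob(box2, "black")
both_white = prob(box1, "white") * prob(box2, "white")
# One of each colour: two mutually exclusive ways, so the probabilities add.
one_each = (prob(box1, "white") * prob(box2, "black")
            + prob(box1, "black") * prob(box2, "white"))

print(both_black, both_white, one_each)  # 1/4 9/40 21/40
```

The three events cover every possible pair of draws, so their probabilities sum to 1, which is a quick sanity check on the hand calculation.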

CASE FIVE:
#5: The frequency distribution of the hourly wage rate of 60 employees of a paper mill is
as follows: (5 pts)
Wage rate (Rs.) 54-56 56-58 58-60 60-62 62-64
Number of workers 10 10 20 10 10

Calculate the
a. Mean

$\bar{x} = \frac{\sum f x}{N}$

Find the class mark:
$x = \frac{\text{upper limit} + \text{lower limit}}{2}$

N = the summation of the frequency

Wage     No. of workers (f)   Class mark (x)   fx
54-56    10                   55               550
56-58    10                   57               570
58-60    20                   59               1180
60-62    10                   61               610
62-64    10                   63               630
Total    60                                    3540

$N = \sum f = 60$, $\sum f x = 3540$

$\bar{x} = \frac{\sum f x}{N} = \frac{3540}{60} = 59$

Mean = 59
b. Range

Range of Grouped Data = UCLK – LCL1


UCLK= upper class limit of the last class
LCL1= lower class limit of the first class
Range of Grouped Data = UCLK – LCL1
Range of Grouped Data = 64 – 54
Range of Grouped Data = 10
c. Standard deviation and variance

$s^2 = \frac{\sum f (x - \bar{x})^2}{N - 1}$

Wage     f     x     fx     x̄     x - x̄   (x - x̄)^2   f(x - x̄)^2
54-56    10    55    550    59    -4       16           160
56-58    10    57    570    59    -2       4            40
58-60    20    59    1180   59    0        0            0
60-62    10    61    610    59    2        4            40
62-64    10    63    630    59    4        16           160
Total    60          3540                               400

$s^2 = \frac{400}{60 - 1} = 6.78$

$s = \sqrt{s^2} = \sqrt{6.78} = 2.60$

Standard deviation = 2.60 and variance = 6.78
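
The grouped mean, variance and standard deviation can be reproduced in a few lines. This is a sketch only, assuming the sample variance with divisor N - 1 as used in the answer; the variable names are illustrative.

```python
import math

# Class limits and frequencies from the wage table in the question.
classes = [(54, 56), (56, 58), (58, 60), (60, 62), (62, 64)]
freqs = [10, 10, 20, 10, 10]

# Class mark = midpoint of each class.
marks = [(lo + hi) / 2 for lo, hi in classes]
n = sum(freqs)

mean = sum(f * x for f, x in zip(freqs, marks)) / n
# Sum of squared deviations from the mean, weighted by frequency.
ss = sum(f * (x - mean) ** 2 for f, x in zip(freqs, marks))
variance = ss / (n - 1)
std_dev = math.sqrt(variance)

print(mean, round(variance, 2), round(std_dev, 2))  # 59.0 6.78 2.6
```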

d. Median

Wage     No. of workers (f)   Cumulative frequency (less than type)
54-56    10                   10
56-58    10                   20
58-60    20                   40
60-62    10                   50
62-64    10                   60

N/2 = 60/2 = 30
40 is the first cumulative frequency greater than or equal to 30, so
58-60 is the median class.

Lm = lower class boundary of the median class = 58 (the classes are continuous, 56-58 and 58-60
share the endpoint 58, so the class limits are also the class boundaries)
W = the size of the median class = 60 - 58 = 2
N = summation of the frequency = ∑f = 60
Cfcb = cumulative frequency of the class before the median class = 10 + 10 = 20
Fm = frequency of the median class = 20

$\text{Median} = L_m + \frac{W}{F_m}\left(\frac{N}{2} - Cf_{cb}\right) = 58 + \frac{2}{20}\left(\frac{60}{2} - 20\right) = 58 + 1 = 59$

Median = 59
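
The interpolation formula for the median of grouped data can be sketched as follows. This assumes the class limits are used directly as class boundaries (adjacent classes such as 56-58 and 58-60 share an endpoint), so the lower boundary of the median class is taken as 58; `grouped_median` is an illustrative helper, not something defined in the exam.

```python
# Class limits and frequencies from the wage table in the question.
classes = [(54, 56), (56, 58), (58, 60), (60, 62), (62, 64)]
freqs = [10, 10, 20, 10, 10]


def grouped_median(classes, freqs):
    """Median = L + (w / f_m) * (N/2 - cf_before), by linear interpolation."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("empty frequency table")
    cum = 0  # cumulative frequency of the classes before the current one
    for (lo, hi), f in zip(classes, freqs):
        # The median class is the first whose cumulative frequency reaches N/2.
        if f > 0 and cum + f >= n / 2:
            return lo + (hi - lo) / f * (n / 2 - cum)
        cum += f
    raise ValueError("median class not found")


median = grouped_median(classes, freqs)
print(median)
```

Under the stated boundary assumption, the distribution is symmetric about 59, so the interpolated median coincides with the mean.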
e. Mode
Solution

Wage     No. of workers (f)
54-56    10
56-58    10
58-60    20
60-62    10
62-64    10

58-60 is the modal class (highest frequency).

$\hat{x} = L_m + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times W$

Lm = lower boundary of the modal class = 58
W = class size = 60 - 58 = 2
Δ1 = the difference between the frequency of the modal class and the frequency of the class before it
= 20 - 10 = 10
Δ2 = the difference between the frequency of the modal class and the frequency of the class after
it = 20 - 10 = 10

$\hat{x} = 58 + \frac{10}{10 + 10} \times 2 = 59$

Mode = 59
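
The modal-class formula can be checked with a short sketch. It assumes a single modal class and treats a missing neighbouring class as having frequency 0; `grouped_mode` is an illustrative name.

```python
# Class limits and frequencies from the wage table in the question.
classes = [(54, 56), (56, 58), (58, 60), (60, 62), (62, 64)]
freqs = [10, 10, 20, 10, 10]


def grouped_mode(classes, freqs):
    """Mode = L + d1 / (d1 + d2) * w for the class with the highest frequency."""
    i = max(range(len(freqs)), key=lambda j: freqs[j])  # index of the modal class
    lo, hi = classes[i]
    before = freqs[i - 1] if i > 0 else 0
    after = freqs[i + 1] if i < len(freqs) - 1 else 0
    d1 = freqs[i] - before  # modal frequency minus the preceding frequency
    d2 = freqs[i] - after   # modal frequency minus the following frequency
    return lo + d1 / (d1 + d2) * (hi - lo)


mode = grouped_mode(classes, freqs)
print(mode)  # 59.0
```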

CASE SIX:
Suppose that a couple will have three children. Letting B denote a boy and G denote a girl. List
the sample space outcomes and probability that correspond to each of the following events. (8
pts)

i. All three children will have the same gender.


ii. Exactly two of the three children will be girls.
iii. Exactly one of the three children will be a girl.
iv. None of the three children will be a girl.

#6:
Sample space
BBB BBG BGG BGB GGG GGB GBB GBG

(Tree diagram: each of the three births branches into B or G, giving the 8 outcomes listed above.)

The sample space contains 8 outcomes, each taken to be equally likely.

i. All three children will have the same gender.
There are two outcomes in which all three children have the same gender (BBB and GGG).
P(all three children are the same gender) = 2/8 = 1/4

ii. Out of the 8 outcomes in the sample space, there are three ways to have exactly two girls (GGB, GBG
and BGG).
P(exactly two girls) = 3/8

iii. There are three ways to have exactly one girl (BBG, BGB, GBB).
P(exactly one girl) = 3/8
iv. There is only one way to have no girls (BBB).
P(none of the children is a girl) = 1/8
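
The sample space and the four event probabilities can be enumerated directly. This sketch assumes, as the answer does, that all 8 outcomes are equally likely; the helper `prob` is illustrative.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes for three children, e.g. "BBG".
outcomes = ["".join(p) for p in product("BG", repeat=3)]


def prob(event):
    """Probability of an event as favourable outcomes over total outcomes."""
    favourable = [o for o in outcomes if event(o)]
    return Fraction(len(favourable), len(outcomes))


same_gender = prob(lambda o: len(set(o)) == 1)  # BBB or GGG
two_girls = prob(lambda o: o.count("G") == 2)
one_girl = prob(lambda o: o.count("G") == 1)
no_girls = prob(lambda o: o.count("G") == 0)

print(len(outcomes), same_gender, two_girls, one_girl, no_girls)  # 8 1/4 3/8 3/8 1/8
```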

CASE SEVEN:
Suppose that the CPU college dean of students has generally assumed that the average age of a
student is no more than 20 years. However, lately the students have appeared to be somewhat older
than before, and the office believes that the average age now might be older. Suppose that 50
students are chosen from enrolment records randomly and the mean is found to be 20.76. If the
population standard deviation for the ages of these university students is 3.6 years, perform a
hypothesis test at 𝛼 = 0.05………..(8 pts)
a. State the hypothesis
b. State the decision rule
c. Compute the value of the test statistic (in this case the Z-value)
d. Accept or reject H0

#7:
a.
i. Null hypothesis H0: µ = 20 years
ii. Alternative hypothesis H1: µ > 20 years
b. The test is a right-tailed test. Reject the null hypothesis if the calculated value of Z
(Zcal) is greater than the tabulated value of Z (Ztab) at α = 0.05; do not reject the null
hypothesis if Zcal is less than or equal to Ztab.
Hence, at α = 0.05 (one-tailed):
Ztab = 1.65
If Zcal > Ztab (1.65), reject H0
If Zcal ≤ Ztab (1.65), do not reject H0

c. Calculate the value of Zcal

$Z_{cal} = \frac{\bar{x} - \mu}{\sigma / \sqrt{N}}$

where x̄ = sample mean = 20.76
µ = hypothesized population mean = 20
σ = population standard deviation = 3.6
N = sample size = 50

$Z_{cal} = \frac{20.76 - 20}{3.6 / \sqrt{50}} = \frac{0.76}{3.6 / 7.07} = \frac{0.76}{0.51} = 1.49$

Zcal = 1.49 and Ztab = 1.65, so Zcal < Ztab.

d. Accept or reject
In a right-tailed test, when the value of Zcal is less than the value of Ztab, the null hypothesis
is not rejected.
Our value of Zcal (1.49) is less than the value of Ztab (1.65), so the null hypothesis is accepted
(not rejected) at α = 0.05: the sample does not give sufficient evidence that the average age is
now more than 20 years.
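
The right-tailed Z test can be reproduced with the Python standard library. This is a minimal sketch, assuming a known population standard deviation and a normal sampling distribution of the mean; it uses the unrounded critical value (about 1.645) rather than the tabulated 1.65, which does not change the decision here.

```python
import math
from statistics import NormalDist

# Values given in the question.
x_bar, mu0, sigma, n, alpha = 20.76, 20.0, 3.6, 50, 0.05

# Test statistic: (sample mean - hypothesized mean) / standard error.
z_cal = (x_bar - mu0) / (sigma / math.sqrt(n))
# Right-tailed critical value at significance level alpha.
z_tab = NormalDist().inv_cdf(1 - alpha)
# One-sided p-value, shown as an alternative way to state the decision.
p_value = 1 - NormalDist().cdf(z_cal)

reject_h0 = z_cal > z_tab
print(round(z_cal, 2), round(z_tab, 3), reject_h0)  # 1.49 1.645 False
```

The p-value comes out above 0.05, which agrees with the critical-value decision not to reject H0.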

CASE EIGHT:
Write the difference between random and non-random sampling techniques and list different
sampling techniques under each category?

#8:
There are mainly two methods of sampling: random and non-random sampling.
Random sampling is a sampling technique where the probability of choosing
each sample is equal.
A sample that is chosen randomly is an unbiased representation of the total population. If
the sample chosen does not represent the population, the result is sampling error.
Non-random sampling is a sampling technique where the sample selection is based on factors
other than just random chance. In other words, non-random sampling is biased in nature.
Here, the sample is selected based on the convenience, experience or judgment of the
researcher.
The following are some of the points of difference between random sampling and non-random
sampling.
Random sampling | Non-random sampling
A sampling technique where each sample has an equal probability of getting selected | A sampling technique where the sample is selected based on factors such as the convenience, judgement and experience of the researcher, and not on probability
Unbiased in nature | Biased in nature
Based on probability | Based on other factors such as the convenience, judgement and experience of the researcher, not on probability
Representative of the entire population | Lacks representation of the entire population
Zero probability of selection never occurs | Zero probability of selection can occur
The simplest sampling technique | A somewhat complex sampling technique
Probability sampling methods
✓ Simple random sampling.
✓ Systematic sampling.
✓ Stratified sampling.
✓ Cluster sampling.
The commonly used non-probability sampling methods include the following.
✓ Convenience or haphazard sampling.
✓ Volunteer sampling.
✓ Judgement sampling.
✓ Quota sampling.
✓ Snowball or network sampling.
✓ Crowdsourcing.
✓ Web panels.
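
To make three of the probability sampling methods concrete, the sketch below draws a simple random, a systematic and a proportionally allocated stratified sample. The population of 1,000 unit IDs, the four strata "A" to "D" and the sample size of 100 are hypothetical values chosen purely for illustration; they do not come from the exam.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of unit IDs, split into four strata (illustration only).
population = list(range(1000))
strata = {
    "A": population[:400],
    "B": population[400:700],
    "C": population[700:900],
    "D": population[900:],
}
n = 100  # hypothetical sample size

# Simple random sampling: units drawn without replacement, purely by chance.
simple = rng.sample(population, n)

# Systematic sampling: random start, then every k-th unit.
k = len(population) // n
start = rng.randrange(k)
systematic = population[start::k][:n]

# Stratified sampling with proportional allocation: each stratum contributes
# a share of the sample equal to its share of the population.
stratified = []
for units in strata.values():
    share = round(n * len(units) / len(population))
    stratified.extend(rng.sample(units, share))

print(len(simple), len(systematic), len(stratified))  # 100 100 100
```

Non-probability methods such as convenience or judgement sampling have no comparable recipe, because selection depends on the researcher rather than on a random mechanism.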

CASE NINE:
A researcher wishes to estimate the number of days it takes an automobile dealer to sell a
Chevrolet Aveo. A random sample of 50 cars had a mean time on the dealer’s lot of 54 days.
Assume the population standard deviation to be 6.0 days. Find the best point estimate of the
population mean and the 95% confidence interval of the population mean. (6 points)

#9:
Given
x̄ = mean of the sample = 54 days
σ = standard deviation of the population = 6 days
N = sample size = 50
Confidence level = 95%

Required
a) The best point estimate of the population mean µ
b) The 95% confidence interval of the population mean

Solution:
a) The mean x̄ of a sample can be used to estimate the mean µ of the population.
Although the mean of a particular sample will generally not equal the population mean exactly,
it is the best point estimate of it. Hence the best point estimate of the population mean µ is
54 days.
b) The 95% confidence interval of the population mean is given by

$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{N}}$, or

$\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{N}} < \mu < \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{N}}$

where x̄ = sample mean = 54
µ = population mean
σ = standard deviation of the population = 6
N = sample size = 50
z_{α/2} = critical value of the standard normal distribution; for a 95% confidence interval z_{α/2} = 1.96
σ/√N = standard error of the mean
Then
$54 - 1.96\left(\frac{6}{\sqrt{50}}\right) < \mu < 54 + 1.96\left(\frac{6}{\sqrt{50}}\right)$
$54 - 1.96\left(\frac{6}{7.07}\right) < \mu < 54 + 1.96\left(\frac{6}{7.07}\right)$
$54 - 1.96(0.849) < \mu < 54 + 1.96(0.849)$
$54 - 1.66 < \mu < 54 + 1.66$
$52.34 < \mu < 55.66$
Or 54 ± 1.66

Hence we can be 95% confident that the interval between 52.34 and 55.66 days contains the
population mean.
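
The confidence interval can be computed without rounding the intermediate values, which is a useful check on the hand calculation. This is a sketch under the same assumptions as the answer (known population standard deviation, normal critical value); because it keeps full precision, the last digit may differ slightly from a calculation that rounds √50 or the standard error early.

```python
import math
from statistics import NormalDist

# Values given in the question.
x_bar, sigma, n, confidence = 54.0, 6.0, 50, 0.95

# Two-sided critical value: leaves (1 - confidence) / 2 in each tail.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
# Margin of error = critical value * standard error of the mean.
margin = z * sigma / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(round(z, 2), round(margin, 2), round(lower, 2), round(upper, 2))  # 1.96 1.66 52.34 55.66
```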
