Module 03 Inferential Statistics

Uploaded by

glenn.apon24

Module 3 – Inferential Statistics

Inferential statistics uses a sample (1) to estimate some
characteristic of a large population and (2) to test a research
hypothesis about a given population. To appropriately estimate a
population characteristic, or parameter, a random and unbiased sample
must be drawn from the population of interest.
Inferential statistical methods can help business experts analyze
data and evaluate the probability of specific outcomes. They help
professionals understand the larger population and draw
conclusions about it from the sample data.

Sampling
Gathering information about an entire population often costs too
much or is virtually impossible. Instead, we use a sample of the
population. A sample should have the same characteristics as the
population it represents. Most statisticians use some method of
random sampling in an attempt to achieve this goal, and this section
describes a few of the most common ones. In each form of random
sampling, each member of a population initially has an equal chance of
being selected for the sample. Each method has pros and cons. The
easiest method to describe is called a simple random sample. If the
simple random sampling technique is used, any group of n individuals
is as likely to be chosen as any other group of n individuals. In
other words, each sample of the same size has an equal chance of
being selected.
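
The simple random sample described above can be sketched in Python. This is only an illustration: the population of ID numbers 1 to 1,000, the function name, and the seed are assumptions made for the example, not part of the module.

```python
import random

# Hypothetical population: the ID numbers 1..1000 stand in for individuals.
population = list(range(1, 1001))

def simple_random_sample(items, n, seed=None):
    """Return n distinct items; every group of n is equally likely."""
    rng = random.Random(seed)    # a fixed seed makes the draw repeatable
    return rng.sample(items, n)  # draws without replacement

sample = simple_random_sample(population, 10, seed=1)
print(sample)
```

Because `random.sample` draws without replacement, no individual appears twice in the result.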
Besides simple random sampling, there are other forms of
sampling that involve a chance process for getting the sample. Other
well-known random sampling methods are the stratified sample, the
cluster sample, and the systematic sample.

To choose a stratified sample, divide the population into groups
called strata and then take a proportionate number from each stratum.
For example, you could stratify (group) your college population by
department and then choose a proportionate simple random sample
from each stratum (each department) to get a stratified random
sample. To choose a simple random sample from each department,
number each member of the first department, number each member of
the second department, and do the same for the remaining
departments. Then use simple random sampling to choose
proportionate numbers from the first department and do the same for
each of the remaining departments. Those numbers picked from the
first department, picked from the second department, and so on
represent the members who make up the stratified sample.
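
The department example can be sketched as follows. The department names and sizes are hypothetical, and rounding each stratum's share is an assumption of this sketch; with other sizes the rounded counts may not add up exactly to the requested total.

```python
import random

def stratified_sample(strata, total_n, seed=None):
    """Take a proportionate simple random sample from each stratum."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Each stratum's share of the sample matches its share of the population.
        k = round(total_n * len(members) / population_size)
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical college with three departments of 200, 300, and 500 members.
departments = {
    "Math": [f"M{i}" for i in range(1, 201)],
    "English": [f"E{i}" for i in range(1, 301)],
    "Biology": [f"B{i}" for i in range(1, 501)],
}

chosen = stratified_sample(departments, 50, seed=2)
sizes = {name: len(members) for name, members in chosen.items()}
print(sizes)  # {'Math': 10, 'English': 15, 'Biology': 25}
```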

To choose a cluster sample, divide the population into clusters
(groups) and then randomly select some of the clusters. All the
members from these clusters are in the cluster sample. For example, to
take a cluster sample of your college faculty, divide the faculty by
department. The departments are the clusters. Number each
department, and then choose four different numbers using simple
random sampling. All members of the four departments with those
numbers are the cluster sample.
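
A minimal sketch of the faculty example, assuming six hypothetical departments with five members each (the names and sizes are invented for illustration):

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly pick whole clusters and return every member of each one."""
    rng = random.Random(seed)
    picked = rng.sample(sorted(clusters), num_clusters)  # choose cluster names
    members = [person for name in picked for person in clusters[name]]
    return picked, members

# Hypothetical faculty list: six departments with five members each.
faculty = {
    f"Dept {d}": [f"Dept {d} member {i}" for i in range(1, 6)]
    for d in "ABCDEF"
}

picked, members = cluster_sample(faculty, 4, seed=4)
print(picked)
print(len(members))  # 4 departments x 5 members = 20
```

Note the contrast with a stratified sample: here only some groups are chosen, but everyone in a chosen group is included.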

To choose a systematic sample, randomly select a starting point
and take every nth piece of data from a listing of the population. For
example, suppose you have to do a phone survey. Your phone book
contains 20,000 residence listings. You must choose 400 names for the
sample. Number the population 1–20,000 and then use a simple
random sample to pick a number that represents the first name in the
sample. Because 20,000 / 400 = 50, choose every fiftieth name
thereafter until you have a total of 400 names (you might have to go
back to the beginning of your phone list). Systematic sampling is
frequently chosen because it is a simple method.
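
The phone book example can be sketched like this. The listing is represented by the numbers 1 to 20,000, and the function name and seed are assumptions of the sketch. The modulo operation plays the role of going back to the beginning of the list.

```python
import random

def systematic_sample(listing, n, seed=None):
    """Pick a random start, then take every k-th entry, wrapping around."""
    rng = random.Random(seed)
    step = len(listing) // n             # 20,000 // 400 = 50
    start = rng.randrange(len(listing))  # random starting position
    return [listing[(start + i * step) % len(listing)] for i in range(n)]

phone_book = list(range(1, 20001))  # listings numbered 1..20,000
names = systematic_sample(phone_book, 400, seed=3)
print(names[:5])
```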

A type of sampling that is non-random is convenience
sampling. Convenience sampling involves using results that are readily
available. For example, a computer software store conducts a
marketing study by interviewing potential customers who happen to be
in the store browsing through the available software. The results of
convenience sampling may be very good in some cases and highly
biased (favoring certain outcomes) in others.

Sampling data should be done very carefully. Collecting data
carelessly can have devastating results. Surveys mailed to households
and then returned may be very biased (they may favor a certain group).
It is better for the person conducting the survey to select the sample
respondents.

True random sampling is done with replacement. That is, once a
member is picked, that member goes back into the population and thus
may be chosen more than once. However, for practical reasons, in most
populations, simple random sampling is done without replacement.
Surveys are typically done without replacement. That is, a member of
the population may be chosen only once. Most samples are taken from
large populations and the sample tends to be small in comparison to
the population. Since this is the case, sampling without replacement is
approximately the same as sampling with replacement because the
chance of picking the same individual more than once with
replacement is very low.
Sampling without replacement instead of sampling with
replacement becomes a mathematical issue only when the population
is small.
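
To see why the population size matters, the sketch below computes the probability that a sample drawn with replacement contains no repeated individual. The population and sample sizes are illustrative assumptions.

```python
def prob_all_distinct(population_size, sample_size):
    """Chance that draws made WITH replacement never repeat an individual."""
    p = 1.0
    for i in range(sample_size):
        # After i distinct picks, population_size - i individuals are still new.
        p *= (population_size - i) / population_size
    return p

large = prob_all_distinct(10_000, 10)  # small sample from a large population
small = prob_all_distinct(20, 10)      # same sample size, small population
print(f"{large:.4f} {small:.4f}")
```

For the large population the probability is above 0.99, so the two sampling schemes behave almost identically. For the population of 20 it falls below 0.1, so repeats are nearly certain and the distinction matters.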

When you analyze data, it is important to be aware of sampling
errors and nonsampling errors. The actual process of sampling causes
sampling errors. For example, the sample may not be large enough.
Factors not related to the sampling process cause nonsampling errors.
A defective counting device can cause a nonsampling error.

In reality, a sample will never be exactly representative of the
population, so there will always be some sampling error. As a rule, the
larger the sample, the smaller the sampling error.
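
A small simulation can illustrate this rule. Everything here is an assumption made for the sketch: the population is 10,000 invented values between 0 and 100, and sampling error is measured as the average distance between a sample mean and the population mean over 200 repeated samples.

```python
import random

rng = random.Random(0)

# Hypothetical population of 10,000 values between 0 and 100.
population = [rng.uniform(0, 100) for _ in range(10_000)]
population_mean = sum(population) / len(population)

def average_sampling_error(n, repetitions=200):
    """Average distance between a sample mean and the population mean."""
    total = 0.0
    for _ in range(repetitions):
        sample = rng.sample(population, n)  # simple random sample of size n
        total += abs(sum(sample) / n - population_mean)
    return total / repetitions

errors = {n: average_sampling_error(n) for n in (10, 100, 1000)}
for n, err in errors.items():
    print(n, round(err, 2))
```

The printed error shrinks as n grows, but it never reaches zero.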

In statistics, a sampling bias is created when a sample is collected
from a population and some members of the population are not as
likely to be chosen as others (remember, each member of the
population should have an equal chance of being chosen). When
a sampling bias happens, incorrect conclusions can be drawn
about the population that is being studied.

Problem 1

A study is done to determine the average tuition that San Jose State
undergraduate students pay per semester. Each student in the
following samples is asked how much tuition they paid for the Fall
semester. What is the type of sampling in each case?

a. A sample of 100 undergraduate San Jose State students is taken
by organizing the students’ names by classification (first-year
student, sophomore, junior, or senior), and then selecting 25
students from each.
b. A random number generator is used to select a student from the
alphabetical listing of all undergraduate students in the Fall
semester. Starting with that student, every 50th student is chosen
until 75 students are included in the sample.
c. A completely random method is used to select 75 students. Each
undergraduate student in the fall semester has the same
probability of being chosen at any stage of the sampling process.
d. The first-year, sophomore, junior, and senior years are numbered
one, two, three, and four, respectively. A random number
generator is used to pick two of those years. All students in those
two years are in the sample.
e. An administrative assistant is asked to stand in front of the library
one Wednesday and to ask the first 100 undergraduate students
he encounters what they paid for tuition for the Fall semester. Those
100 students are the sample.

Problem 2

Suppose ABC College has 10,000 part-time students (the population).
We are interested in the average amount of money a part-time student
spends on books in the fall term. Asking all 10,000 students is an almost
impossible task.

Suppose we take two different samples.

First, we use convenience sampling and survey ten students from a
first-term organic chemistry class. Many of these students are taking
first-term calculus in addition to the organic chemistry class. The
amount of money they spend on books is as follows:

$128; $87; $173; $116; $130; $204; $147; $189; $93; $153
The second sample is taken using a list of senior citizens who take P.E.
classes and taking every fifth senior citizen on the list, for a total of ten
senior citizens. They spend:

$50; $40; $36; $15; $50; $100; $40; $53; $22; $22

It is unlikely that any student is in both samples.

a. Do you think that either of these samples is representative of (or is
characteristic of) the entire 10,000 part-time student population?

b. Since these samples are not representative of the entire population,
is it wise to use the results to describe the entire population?

Now, suppose we take a third sample. We choose ten different
part-time students from the disciplines of chemistry, math, English,
psychology, sociology, history, nursing, physical education, art, and
early childhood development. (We assume that these are the only
disciplines in which part-time students at ABC College are enrolled and
that an equal number of part-time students are enrolled in each of the
disciplines.) Each student is chosen using simple random sampling.
Using a calculator, random numbers are generated and a student from
a particular discipline is selected if they have a corresponding number.
The students spend the following amounts:

$180; $50; $150; $85; $260; $75; $180; $200; $200; $150

Problem

c. Is the sample biased?
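
One way to compare the three samples in Problem 2 is to compute each sample mean. The sketch below only does the arithmetic on the amounts listed in the problem; the variable names are illustrative.

```python
# Amounts (in dollars) copied from the three samples in Problem 2.
chemistry_class = [128, 87, 173, 116, 130, 204, 147, 189, 93, 153]
senior_citizens = [50, 40, 36, 15, 50, 100, 40, 53, 22, 22]
one_per_discipline = [180, 50, 150, 85, 260, 75, 180, 200, 200, 150]

def sample_mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

samples = {
    "Organic chemistry class": chemistry_class,
    "Senior citizens in P.E.": senior_citizens,
    "One student per discipline": one_per_discipline,
}
for label, values in samples.items():
    print(f"{label}: ${sample_mean(values):.2f}")
# Prints means of $142.00, $42.80, and $153.00 respectively.
```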


Variation in Data

Variation is present in any set of data. For example, 16-ounce
cans of beverage may contain more or less than 16 ounces of liquid. In
one study, eight 16-ounce cans were measured and produced the
following amounts (in ounces) of beverage:

15.8; 16.1; 15.2; 14.8; 15.8; 15.9; 16.0; 15.5

Measurements of the amount of beverage in a 16-ounce can may
vary because different people make the measurements or because the
exact amount, 16 ounces of liquid, was not put into the cans.
Manufacturers regularly run tests to determine if the amount of
beverage in a 16-ounce can falls within the desired range.
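
The variation in the eight measurements can be summarized numerically. The sketch below uses Python's standard `statistics` module. Using the sample standard deviation as the measure of spread is a choice made for this illustration, not something the module prescribes.

```python
import statistics

# The eight measurements (in ounces) from the study above.
ounces = [15.8, 16.1, 15.2, 14.8, 15.8, 15.9, 16.0, 15.5]

mean = statistics.mean(ounces)
spread = statistics.stdev(ounces)  # sample standard deviation (divides by n - 1)
low, high = min(ounces), max(ounces)

print(f"mean={mean:.2f} stdev={spread:.2f} range={low} to {high}")
```

The mean is about 15.64 ounces with a standard deviation of about 0.44 ounces, so the typical can in this study holds somewhat less than the labeled 16 ounces.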

Be aware that as you take data, your data may vary somewhat
from the data someone else is taking for the same purpose. This is
completely natural. However, if two or more of you are taking the same
data and get very different results, it is time for you and the others to
reevaluate your data-taking methods and your accuracy.

Variation in Samples
As mentioned previously, two or more samples taken randomly from
the same population, each having close to the same characteristics
as the population, will likely be different from each other.
Suppose Doreen and Jung both decide to study the average amount of
time students at their college sleep each night. Doreen and Jung each
take samples of 500 students. Doreen uses systematic sampling and
Jung uses cluster sampling. Doreen's sample will be different from
Jung's sample. Even if Doreen and Jung used the same sampling
method, in all likelihood their samples would be different. Neither
would be wrong, however.
Think about what contributes to making Doreen’s and Jung’s
samples different.

If Doreen and Jung took larger samples (i.e. the number of data
values is increased), their sample results (the average amount of time a
student sleeps) might be closer to the actual population average. But
still, their samples would be, in all likelihood, different from each other.
This variability in samples cannot be stressed enough.
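
The sketch below illustrates this variability under stated assumptions: the population is 10,000 invented sleep times (mean near 7 hours), and both Doreen and Jung use simple random sampling with 500 students each, which corresponds to the "same sampling method" case mentioned above rather than to systematic or cluster sampling.

```python
import random

# Hypothetical population: nightly sleep (in hours) for 10,000 students.
population_rng = random.Random(10)
sleep_hours = [population_rng.gauss(7.0, 1.5) for _ in range(10_000)]
population_mean = sum(sleep_hours) / len(sleep_hours)

def sample_mean(seed, n=500):
    """Mean of one simple random sample of n students."""
    sample = random.Random(seed).sample(sleep_hours, n)
    return sum(sample) / n

doreen_mean = sample_mean(seed=1)  # Doreen's sample
jung_mean = sample_mean(seed=2)    # Jung's sample, drawn independently

print(f"population: {population_mean:.3f}")
print(f"Doreen:     {doreen_mean:.3f}")
print(f"Jung:       {jung_mean:.3f}")
```

The two sample means differ from each other and from the population mean, yet both are close to it. Neither sample is wrong.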

Size of a Sample

The size of a sample (often called the number of observations,
usually given the symbol n) is important. The examples you have seen
in this book so far have been small. Samples of only a few hundred
observations, or even smaller, are sufficient for many purposes. In
polling, samples of 1,200 to 1,500 observations are considered large
enough if the survey is random and well done. Later we will find that
even much smaller sample sizes will give very good results. You will
learn why when you study confidence intervals.
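
As a preview, the sketch below computes the usual approximate margin of error for a polled proportion. It rests on assumptions that this module does not develop: a simple random sample, a 95% confidence level (z = 1.96), and the worst-case proportion p = 0.5. The function name and the sample sizes tried are illustrative.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a sample proportion."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (100, 400, 1200, 1500):
    print(n, f"{margin_of_error(n):.3f}")
```

Under these assumptions, samples of 1,200 to 1,500 give a margin of error between roughly 2.5 and 3 percentage points. Quadrupling the sample size only halves the margin, which is why pollsters rarely go much larger.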

Be aware that many large samples are biased. For example, call-in
surveys are invariably biased because people choose whether or not to
respond.
