Advanced Statistics
2. Inferential Statistics
- It is a group of statistical methods that aim to infer, or draw conclusions,
from data.
- It makes a concluding statement about the population based on the
results derived from the sample data.
- When you have collected data from a sample, you can use inferential
statistics to understand the larger population from which the sample is
taken.
(BA-STAT102) Advanced Statistics – CHAPTER 1: REVIEW OF ESTIMATES &
SAMPLE SIZES
Classification of Data
A. According to Nature
1. Quantitative Data – information that one obtains from variables which
are in the form of numbers.
Ex. Age, bills, financial ratios, supply capacity
2. Qualitative Data – information that one obtains from variables which
are in the form of categories, characteristics, names or labels.
Ex. Gender, type of business, socio-economic status
B. According to Source
1. Primary Data – first-hand information
Ex. Autobiography, financial statement
THE POPULATION:
In statistics, a population is the entire set of items from which you draw data for a
statistical study.
A population is the entire group that you want to draw conclusions about.
A population can be either finite or infinite. If there is an upper limit to the number
of observations it contains, the population is finite; if there is no limit
to its size, the population is infinite.
Examples of finite populations:
(1) The number of families living in a certain community
(2) The number of students enrolled in a particular subject in a particular semester
(3) The number of cards in a deck of playing cards
What is a Sample?
Sampling is the process of collecting data from a small subsection of the
population and then using it to generalize over the entire set.
A sample consists of a smaller group of entities, which are taken from the entire
population. This creates a subset group that is easier to manage and has the
characteristics of the larger population.
A sample is the specific group that you will collect data from. The size of the
sample is always less than the total size of the population.
This smaller subset is then surveyed to gain information and data. The sample
should reflect the population as a whole, without any bias towards a specific attribute or
characteristic.
Types of Sample
Probability sampling, also known as random sampling, is a kind of sample
selection where randomization is used instead of deliberate choice.
Simple random sampling - Every element in the population has an equal chance of
being selected as part of the sample.
Systematic sampling - Also known as systematic clustering. In this method,
random selection applies only to the first item chosen. A rule then applies so that
every nth item or person after that is picked.
Stratified random sampling - Sampling uses random selection within predefined
groups.
Cluster sampling - Groups rather than individual units of the target population are
selected at random.
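The four probability sampling methods above can be sketched in a few lines of Python. This is only an illustration: the population of 100 ID numbers, the two groups "A" and "B", and the function names are all hypothetical and are not part of the course material.

```python
import random

def simple_random_sample(population, n, rng):
    """Every element has an equal chance of being selected."""
    return rng.sample(population, n)

def systematic_sample(population, step, rng):
    """Random start within the first `step` items, then every step-th item."""
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, n_per_stratum, rng):
    """Random selection within each predefined group (stratum)."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Select whole groups at random and keep every unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

rng = random.Random(42)                 # fixed seed so the sketch is repeatable
population = list(range(1, 101))        # hypothetical population of 100 ID numbers
groups = {"A": population[:50], "B": population[50:]}  # hypothetical groups

srs = simple_random_sample(population, 10, rng)
sys_sample = systematic_sample(population, 10, rng)
strat = stratified_sample(groups, 5, rng)
clus = cluster_sample(groups, 1, rng)
```

Note the difference between the last two functions: stratified sampling draws a few units from every group, while cluster sampling keeps all units from a few randomly chosen groups.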
Non-probability sampling techniques involve the researcher deliberately picking
items or individuals for the sample based on their research goals or knowledge.
Convenience sampling – People or elements in a sample are selected based
on their availability.
Quota sampling – The sample is formed according to certain groups or criteria.
Purposive sampling – Also known as judgmental sampling. The sample is
formed by the researcher consciously choosing entities, based on the survey
goals.
Snowball sampling – Also known as referral sampling. The sample is formed by
sample participants recruiting connections.
Difference Between Population & Sample:
(1) Population: All residents of a country would constitute the population set.
    Sample: All residents who live above the poverty line would be the sample.
(2) Population: All residents above the poverty line in a country would be the population.
    Sample: All residents who are millionaires would make up the sample.
(3) Population: All employees in an office would be the population.
    Sample: Out of all the employees, all managers in the office would be the sample.
THE PARAMETER
When you collect data from a population or a sample, there are various
measurements and numbers you can calculate from the data.
A parameter is a measure that describes the whole population (for example, the
population mean).
Example: 25% of Philippine senators (6 of 24) voted for a specific measure. Since there
are only 24 senators, you can count how each of them voted.
Sampling error is the difference between a parameter and a corresponding
statistic. Since in most cases you don’t know the real population parameter, you can use
inferential statistics to estimate these parameters in a way that takes sampling error into
account.
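As a rough illustration of sampling error, the sketch below builds a small artificial population in which the parameter can be computed directly, then compares it with the statistic from one random sample. The population of 1,000 ages and the sample size of 50 are hypothetical values chosen only for the demonstration.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical finite population: ages of 1,000 people
population = [rng.randint(18, 65) for _ in range(1000)]
parameter = mean(population)          # population mean (usually unknown in practice)

sample = rng.sample(population, 50)   # simple random sample of 50 people
statistic = mean(sample)              # sample mean

# Sampling error: difference between the statistic and the parameter
sampling_error = statistic - parameter
print(f"parameter = {parameter:.2f}, statistic = {statistic:.2f}, "
      f"sampling error = {sampling_error:.2f}")
```

In a real study only the statistic is observed, which is why the sampling error has to be estimated rather than computed this way.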
There are two important types of estimates you can make about the population: point
estimates and interval estimates.
A Point Estimate is a single value estimate of a parameter. For instance, a
sample mean is a point estimate of a population mean.
An Interval Estimate gives you a range of values where the parameter is
expected to lie. A confidence interval is the most common type of interval
estimate.
Both types of estimates are important for gathering a clear idea of where a parameter is
likely to lie.
THE STATISTICS
A statistic is a measure that describes the sample (for example, the sample mean).
Example: 50% of the people surveyed in Cebu agree with the latest health care
proposal. Researchers can’t ask millions of people whether they agree, so they
take a sample, or part of the population, and use it to estimate the result for the whole
population.
The sample mean X̄ is generally the best point estimate of the population mean µ, for
two reasons:
1. For many populations, the distribution of the sample mean X̄ tends to be more
consistent (with less variation) than the distributions of other sample statistics.
2. For all populations, the sample mean is an unbiased estimator of the population
mean µ, meaning that the distribution of the sample mean tends to center about
the value of the population mean µ.
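Both properties can be explored with a small simulation. The sketch below assumes a hypothetical, roughly normal population with mean 50 and standard deviation 10, draws 2,000 samples of size 30, and compares the sample mean with the sample median as one example of another statistic. It illustrates the two statements for this one population and is not a proof.

```python
import random
from statistics import mean, median, pstdev

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 values, roughly normal (mean 50, sd 10)
population = [rng.gauss(50, 10) for _ in range(10_000)]
mu = mean(population)

sample_means = []
sample_medians = []
for _ in range(2000):
    sample = rng.sample(population, 30)
    sample_means.append(mean(sample))
    sample_medians.append(median(sample))

center_of_means = mean(sample_means)        # should sit close to mu (unbiasedness)
spread_of_means = pstdev(sample_means)      # variation of the sample mean
spread_of_medians = pstdev(sample_medians)  # variation of the sample median

print(f"population mean      = {mu:.2f}")
print(f"mean of sample means = {center_of_means:.2f}")
print(f"spread of means      = {spread_of_means:.2f}")
print(f"spread of medians    = {spread_of_medians:.2f}")
```

For this population the sample means cluster around µ and vary less than the sample medians, which matches points 1 and 2.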
CONFIDENCE INTERVAL
A confidence interval (or interval estimate) is a range (or an interval) of values
that is likely to contain the true value of the population parameter. A confidence interval
is associated with a degree of confidence, which is a measure of how certain you are
that the interval contains the population parameter. The definition of degree of
confidence uses α (lowercase Greek alpha) to describe a probability that corresponds to
an area. The degree of confidence is also called the level of confidence.
Common choices for the degree of confidence are 90% (with α = 0.10), 95%
(with α = 0.05), and 99% (with α = 0.01). The choice of 95% is most common because
it provides a good balance between precision and reliability.
CRITICAL VALUE
A critical value is the number on the borderline separating sample statistics that
are likely to occur from those that are unlikely to occur. The number Zα/2 is a critical
value: it is a Z score that separates an area of α/2 in the right tail of
the standard normal distribution. There is an area of 1 – α between the vertical
borderlines at –Zα/2 and Zα/2.
Example: Find the critical value Zα/2 corresponding to a 95% degree of confidence.
Solution: A 95% degree of confidence corresponds to α = 0.05 (5%). In Fig. 1.2, we
can see that the area in each of the shaded tails is α/2 = 0.025. The region between the
mean at Z = 0 and Zα/2 must therefore have an area of 0.5 – 0.025 = 0.4750. From
Table II, A, the area 0.4750 corresponds exactly to a Z value of 1.96. Therefore, for a
95% degree of confidence, the critical value is Zα/2 = 1.96 at the right tail and –1.96
at the left.
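The table lookup in the example can be cross-checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library. The helper name `critical_value` is a hypothetical choice for this illustration, not something defined in the course material.

```python
from statistics import NormalDist

def critical_value(confidence):
    """Return Z_(alpha/2) for a degree of confidence such as 0.95."""
    alpha = 1 - confidence
    # inv_cdf returns the Z score with the given area to its LEFT, so an
    # area of alpha/2 in the right tail means 1 - alpha/2 to the left.
    return NormalDist().inv_cdf(1 - alpha / 2)

z = critical_value(0.95)
print(round(z, 2))  # 1.96

# Area between the mean (Z = 0) and Z_(alpha/2), as read from the table
area = NormalDist().cdf(z) - 0.5
print(round(area, 4))  # 0.475
```

The same function can be called with other degrees of confidence to check answers obtained from the table.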
Solve the following problems:
1. Find the critical value Zα/2 that corresponds to the given degree of confidence.
a. 99% b. 98%
MARGIN OF ERROR
Margin of Error (E) is the maximum likely (with probability 1 – α) difference
between the observed sample mean X̄ and the true value of the population mean (µ).
The margin of error E, also called the maximum error of the estimate, can be found by
multiplying the critical value by the standard deviation of the sample mean.
In symbols,
E = Zα/2 · σ/√n
where σ is the population standard deviation and n is the sample size.
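As a minimal sketch, the formula for E can be written as a short Python function. The inputs below (σ = 15, n = 100, 95% confidence) are hypothetical values chosen for illustration and do not come from the text.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(confidence, sigma, n):
    """E = Z_(alpha/2) * sigma / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value Z_(alpha/2)
    return z * sigma / sqrt(n)

# Hypothetical inputs: population standard deviation 15, sample size 100
E = margin_of_error(0.95, sigma=15, n=100)
print(round(E, 2))  # 2.94
```

Because n appears under a square root, quadrupling the sample size only halves the margin of error.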
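The confidence interval for the population mean µ is
X̄ – E < µ < X̄ + E
For example, with X̄ = 2.4 and E = 0.61:
2.4 – 0.61 < µ < 2.4 + 0.61
1.79 < µ < 3.01
Similarly, for a population proportion p with sample proportion p̂:
p̂ – E < p < p̂ + E
For example, with p̂ = 0.650 and E = 0.03:
0.650 – 0.03 < p < 0.650 + 0.03
0.62 < p < 0.68
Confidence interval: p = 0.650 ± 0.03
Confidence interval limits: (0.62, 0.68)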
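The two intervals above follow the same pattern: point estimate minus E and point estimate plus E. A minimal sketch that reproduces the numbers in the text (the function name `confidence_interval` is a hypothetical choice for this illustration):

```python
def confidence_interval(estimate, margin):
    """Return (lower, upper) = (estimate - E, estimate + E)."""
    return estimate - margin, estimate + margin

# Values taken from the worked intervals above
mean_low, mean_high = confidence_interval(2.4, 0.61)
prop_low, prop_high = confidence_interval(0.650, 0.03)

print(f"{mean_low:.2f} < mu < {mean_high:.2f}")  # 1.79 < mu < 3.01
print(f"{prop_low:.2f} < p < {prop_high:.2f}")   # 0.62 < p < 0.68
```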
SEATWORK: