Stat Quick Overview

Chapter 1

Useful Decision-Making Steps


1.​ Determine whether the existing information is adequate or additional information is
required.
2.​ Gather additional information, if it is needed, in such a way that it does not provide
misleading results.
3.​ Summarize the information in a useful and informative manner.
4.​ Analyze the available information.
5.​ Draw conclusions and make inferences while assessing the risk of an incorrect
conclusion.

Descriptive Statistics
Methods of organizing, summarizing, and presenting data in an informative way.
Example: the United States government reports the population of the United
States was 179,323,000 in 1960; 203,302,000 in 1970; 226,542,000 in 1980;
248,709,000 in 1990; 265,000,000 in 2000; and 308,400,000 in 2010. This information is descriptive statistics.

Inferential Statistics
The methods used to estimate a property of a population on the basis of a sample.
POPULATION The entire set of individuals or objects of interest or the measurements obtained
from all individuals or objects of interest.
SAMPLE A portion, or part, of the population of interest.

Types of Variables
1.​ Quantitative (Numeric)
●​ Discrete (countable values, no decimals; e.g., number of children)
●​ Continuous (can take any value, including decimals; e.g., weight)
2.​ Qualitative (Nonnumeric)
Level of Measurement
Nominal:
1. The variable of interest is divided into categories or outcomes.
2. There is no natural order to the outcomes.
The classification of the six colors of M&M’s milk chocolate candies is an example of the
nominal level of measurement.
Ordinal:
2. Data classifications are represented by sets of labels or names (high, medium, low) that have relative values.
2. Because of the relative values, the data classified can be ranked or ordered.
One classification is “higher” or “better” than the next one. That is, “Superior” is better than
“Good,” “Good” is better than “Average,” and so on. However, we are not able to distinguish the
magnitude of the differences between groups.
Interval:
1. Data classifications are ordered according to the amount of the characteristic they possess.
2. Equal differences in the characteristic are represented by equal differences in the
measurements.
An example of the interval level of measurement is temperature. Equal differences between two
temperatures are the same, regardless of their position on the scale.
Ratio:
1. Data classifications are ordered according to the amount of the characteristics
they possess.
2. Equal differences in the characteristic are represented by equal differences in the numbers
assigned to the classifications.
3. The zero point is the absence of the characteristic and the ratio between two numbers is
meaningful.
Examples of the ratio scale of measurement include wages, units of production, weight, changes
in stock prices, distance between branch offices, and height.

Chapter 2

FREQUENCY TABLE A grouping of qualitative data into mutually exclusive classes


showing the number of observations in each class.
RELATIVE CLASS FREQUENCY A relative frequency captures the relationship between a
class total and the total number of observations.
BAR CHART A graph that shows qualitative classes on the horizontal axis and the class
frequencies on the vertical axis. The class frequencies are proportional to the heights of the bars.

Because the variable of interest is qualitative, gaps are left between the bars.

PIE CHART A chart that shows the proportion or percentage that each class represents of the total number of frequencies.

HISTOGRAM A graph in which the classes are marked on the horizontal axis and the class
frequencies on the vertical axis. The class frequencies are represented by the heights of the bars,
and the bars are drawn adjacent to each other.
FREQUENCY POLYGON It consists of line segments connecting the points formed by the
intersections of the class midpoints and the class frequencies.
HISTOGRAM & FREQUENCY POLYGON Both the histogram and the frequency polygon
allow us to get a quick picture of the main characteristics of the data (highs, lows, points of
concentration, etc.). Although the two representations are similar in purpose, the histogram has
the advantage of depicting each class as a rectangle, with the height of the rectangular bar
representing the number in each class. The frequency polygon, in turn, has an advantage over the
histogram. It allows us to compare directly two or more frequency distributions.

Constructing a Frequency Table:


Step 1: Decide on the number of classes. A useful recipe to determine the number of classes (k) is the “2 to the k rule.” This guide suggests you select the smallest number k such that 2^k (2 raised to the power of k) is greater than the number of observations (n).
Step 2: Determine the class interval or class width: i ≥ (H − L)/k, where i is the class interval, H is the highest observed value, L is the lowest observed value, and k is the number of classes.
Step 3: Set the individual class limits.
Step 4: Tally the raw data into the classes.
Step 5: Count the number of items in each class. The number of observations in each class is
called the class frequency.
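To make the five steps concrete, here is a minimal Python sketch (not from the original text; the data are hypothetical):

```python
import math

def frequency_table(data):
    """Build a frequency table following the five steps above."""
    n = len(data)
    # Step 1: "2 to the k rule" -- smallest k such that 2**k > n.
    k = 1
    while 2 ** k <= n:
        k += 1
    # Step 2: class interval i >= (H - L) / k, rounded up.
    H, L = max(data), min(data)
    i = math.ceil((H - L) / k)
    # Steps 3-5: set class limits, tally the data, count frequencies.
    counts = [0] * k
    for x in data:
        idx = min((x - L) // i, k - 1)
        counts[idx] += 1
    return [(L + j * i, L + (j + 1) * i, counts[j]) for j in range(k)]

# 12 observations -> 2**4 = 16 > 12, so k = 4 classes.
for lo, hi, f in frequency_table([4, 7, 9, 12, 15, 18, 21, 22, 25, 28, 31, 33]):
    print(f"{lo} up to {hi}: {f}")
```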
The major advantage to organizing the data into a frequency distribution is that we get a
quick visual picture of the shape of the distribution without doing any further calculation.
To put it another way, we can see where the data are concentrated and also determine
whether there are any extremely large or small values. There are two disadvantages,
however, to organizing the data into a frequency distribution: (1) we lose the exact identity
of each value and (2) we are not sure how the values within each class are distributed.

Relative Frequency Distribution
To convert a frequency distribution to a relative frequency distribution, each class frequency is divided by the total number of observations.


Chapter 3

Population Mean
The mean calculated from all of the values in the population: µ = ∑X / N
where:
µ represents the population mean. It is the Greek lowercase letter “mu.”
N is the number of values in the population.
X represents any particular value.
∑ is the Greek capital letter “sigma” and indicates the operation of adding.
∑X is the sum of the X values in the population.
PARAMETER A characteristic of a population.

Sample Mean
The mean of a sample: x̄ = ∑x / n, where n is the number of values in the sample.
STATISTIC A characteristic of a sample.

Properties of Arithmetic Mean:


1. Every set of interval- or ratio-level data has a mean.
2. All the values are included in computing the mean.
3. The mean is unique. That is, there is only one mean in a set of data.
4. The sum of the deviations of each value from the mean is zero.
Weighted Mean
Used when there are several observations of the same value. If x1, x2, x3 is a set of numbers and w1, w2, w3 are their frequency counts, the weighted mean is x̄w = (w1x1 + w2x2 + w3x3) / (w1 + w2 + w3).

Median:
The major properties of the median are:
1. It is not affected by extremely large or small values. Therefore, the median is a valuable
measure of location when such values do occur.
2. It can be computed for ordinal-level data or higher.
Mode:
Advantage: we can determine the mode for all levels of data—nominal, ordinal, interval, and
ratio. The mode also has the advantage of not being affected by extremely high or low values.
Disadvantage: For many sets of data, there is no mode because no value appears more than once. Conversely, for some data sets there is more than one mode (a distribution with two modes is called bimodal).
Zero Skewness:
●​ Distribution is symmetrical.
●​ Mean, Mode, Median are located in the center. All are equal.

Positive Skewness:
●​ The long tail of the distribution is to the right.
●​ The arithmetic mean is the highest of the three averages, because the mean is most affected by extreme values: Mean > Median > Mode.

Negative Skewness:
●​ The long tail of the distribution is to the left.
●​ The mean is the lowest: Mean < Median < Mode.
Geometric Mean:
Used to find the average of percentage changes: GM = (x1 · x2 · … · xn)^(1/n). It has wide application in business and economics because we are often interested in finding the percentage changes in sales, salaries, or economic figures.

Measures of Dispersion

RANGE The simplest measure of dispersion is the range. It is the difference between the largest
and the smallest values in a data set.

MEAN DEVIATION The arithmetic mean of the absolute values of the deviations
from the arithmetic mean.

VARIANCE The arithmetic mean of the squared deviations from the mean.
STANDARD DEVIATION The square root of the variance.

Population Standard Deviation: σ = √[ ∑(X − µ)² / N ]

Sample Standard Deviation: s = √[ ∑(x − x̄)² / (n − 1) ]

Interpretation of Standard Deviation: Standard deviation (SD) measures how spread out the
values in a dataset are from the mean (average).
Small SD → Data points are close to the mean (low variability).
Large SD → Data points are spread out from the mean (high variability).
A dataset with a higher SD is more spread out and less consistent.
A dataset with a lower SD is more tightly clustered around the mean.
CHEBYSHEV’S THEOREM
For any set of observations (sample or population), the proportion of the values that lie within k standard deviations of the mean is at least 1 − 1/k², where k is any constant greater than 1.
●​ k = 2 (two standard deviations): at least 75% of values lie within ±2 SDs of the mean.
●​ k = 3 (three standard deviations): at least 88.9% of values lie within ±3 SDs of the mean.
●​ k = 4 (four standard deviations): at least 93.75% of values lie within ±4 SDs of the mean.
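As a quick sanity check of these bounds, a minimal Python sketch:

```python
def chebyshev_bound(k: float) -> float:
    """Minimum proportion of values within k SDs of the mean: 1 - 1/k**2."""
    assert k > 1, "Chebyshev's theorem requires k > 1"
    return 1 - 1 / k**2

for k in (2, 3, 4):
    print(f"k={k}: at least {chebyshev_bound(k):.2%} within ±{k} SDs")
# k=2: at least 75.00%; k=3: at least 88.89%; k=4: at least 93.75%
```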

Why is Chebyshev’s Theorem Useful?

●​ Unlike the Empirical Rule, which only applies to normal distributions, Chebyshev’s
Theorem applies to all distributions.
●​ Useful when the shape of the distribution is unknown or skewed.

EMPIRICAL RULE (Normal Rule)


For a symmetrical, bell-shaped frequency distribution, approximately 68 percent of the
observations will lie within plus and minus one standard deviation of the mean; about 95 percent
of the observations will lie within plus and minus two standard deviations of the mean; and
practically all (99.7 percent) will lie within plus and minus three standard deviations of
the mean.

Comparison to Chebyshev’s Theorem:

●​ The Empirical Rule applies only to normal distributions.


●​ Chebyshev’s Theorem applies to all distributions but gives more conservative estimates.
Chapter 4

Dot Plot
A dot plot groups the data as little as possible, and we do not lose the identity of an individual
observation. To develop a dot plot, we simply display a dot for each observation along a
horizontal number line indicating the possible values of the data. If there are identical
observations or the observations are too close to be shown individually, the dots are “piled” on
top of each other. This allows us to see the shape of the distribution, the value about which the
data tend to cluster, and the largest and smallest observations.
Dot plots are most useful for smaller data sets, whereas histograms tend to be most useful
for large data sets.

Stem and Leaf Display


One technique that is used to display quantitative information in a condensed form is the
stem-and-leaf display.
An advantage of the stem-and-leaf display over a frequency distribution is that we do not lose the identity of each observation. The stem value is the leading digit or digits; the leaves are the trailing digits. The stem is placed to the left of a vertical line and the leaf values to the right.

Quartiles, Deciles, and Percentiles

The location of the pth percentile is Lp = (n + 1) · p/100, where n is the number of observations and p is the desired percentile.


Quartiles divide a dataset into four equal parts (25% each).
Q1 (First Quartile) → 25% of data falls below this point.
Q2 (Second Quartile / Median) → 50% of data falls below this point.
Q3 (Third Quartile) → 75% of data falls below this point.
Deciles divide a dataset into ten equal parts (10% each)
D1 (First Decile) → 10% of data falls below this point.
D2 (Second Decile) → 20% of data falls below this point.
D5 (Fifth Decile) → 50% of data falls below (Same as Median).
D9 (Ninth Decile) → 90% of data falls below this point.
Percentiles divide a dataset into 100 equal parts (1% each).

●​ P1 (1st Percentile) → 1% of data falls below this point.


●​ P50 (50th Percentile / Median) → 50% of data falls below this point.
●​ P99 (99th Percentile) → 99% of data falls below this point.
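A minimal Python sketch of the location formula Lp = (n + 1) · p/100 (the data here are hypothetical):

```python
def percentile(data, p):
    """Find the p-th percentile via L_p = (n + 1) * p / 100,
    interpolating when the location falls between two observations."""
    values = sorted(data)
    n = len(values)
    loc = (n + 1) * p / 100
    whole = int(loc)          # whole part of the location
    frac = loc - whole        # fractional part, used to interpolate
    if whole < 1:
        return values[0]
    if whole >= n:
        return values[-1]
    return values[whole - 1] + frac * (values[whole] - values[whole - 1])

times = [13, 15, 16, 18, 19, 21, 22, 24, 26]
print(percentile(times, 25))   # Q1 = 15.5
print(percentile(times, 50))   # median = 19
print(percentile(times, 75))   # Q3 = 23.0
```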

Box Plots

A box plot is a graphical display, based on quartiles, that helps us picture a set of data. To
construct a box plot, we need only five statistics: the minimum value, Q1 (the first quartile), the
median, Q3 (the third quartile), and the maximum value.

For example, a box plot of delivery times might show that the middle 50 percent of the deliveries take between 15 minutes and 22 minutes. The distance between the ends of the box, 7 minutes, is the interquartile range. The interquartile range is the distance between the first and the third quartile. It shows the spread or dispersion of the majority of deliveries.

Coefficient of Skewness:
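The formula itself did not survive extraction. A common textbook form, Pearson's coefficient of skewness (which I am assuming is the one intended here), is sk = 3(x̄ − Median) / s, where s is the sample standard deviation. A positive sk indicates positive skew, a negative sk indicates negative skew, and 0 indicates symmetry.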


Relationship Between Two Variables

When we study the relationship between two variables, we refer to the data as bivariate. One
graphical technique we use to show the relationship between variables is called a scatter
diagram. We scale one variable along the horizontal axis (X-axis) of a graph and the other
variable along the vertical axis (Y-axis). Usually one variable depends to some degree on the
other. A scatter diagram requires that both of the variables be at least interval scale.

CONTINGENCY TABLE A table used to classify observations according to two identifiable


characteristics. (Nominal or Ordinal Data)

| Histogram | Dot Plot |
| --- | --- |
| Groups data into ranges | Displays each data point individually |
| Bars represent frequency in intervals | Dots represent individual data points |
| Useful for large data sets | Useful for small data sets |

| Histogram | Stem-and-Leaf Display |
| --- | --- |
| Groups data into ranges | Displays each data point individually |
| Bars represent frequency in intervals | Data split into stems and leaves |
| Useful for large data sets | Useful for small data sets |
Chapter 5

PROBABILITY A value between zero and one, inclusive, describing the relative possibility
(chance or likelihood) an event will occur.
EXPERIMENT A process that leads to the occurrence of one and only one of several possible
observations.
OUTCOME A particular result of an experiment.
EVENT A collection of one or more outcomes of an experiment.
Objective probability is subdivided into
(1) classical probability and (2) empirical probability.
1. Classical Probability
Classical probability is based on the assumption that the outcomes of an experiment are equally likely:
P(event) = (number of favorable outcomes) / (total number of possible outcomes)

MUTUALLY EXCLUSIVE The occurrence of one event means that none of the other events
can occur at the same time. The variable “gender” presents mutually exclusive outcomes, male
and female. An employee selected at random is either male or female but cannot be both.
COLLECTIVELY EXHAUSTIVE At least one of the events must occur when an experiment
is conducted.
2. Empirical Probability
Empirical, or relative frequency, probability is the second type of objective probability. The probability of an event happening is the fraction of the time similar events happened in the past:
P(event) = (number of times the event occurred in the past) / (total number of observations)

The empirical approach to probability is based on what is called the law of large numbers. The
key to establishing probabilities empirically is that more observations will provide a more
accurate estimate of the probability.
LAW OF LARGE NUMBERS Over a large number of trials, the empirical probability of an
event will approach its true probability.
To explain the law of large numbers, suppose we toss a fair coin. Based on the classical
definition of probability, the likelihood of obtaining a head in a single toss of a fair coin is .5.
Based on the empirical or relative frequency approach to probability, the probability of the event
happening approaches the same value based on the classical definition of probability.
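A minimal simulation sketch of the law of large numbers (the trial counts are arbitrary):

```python
import random

# The empirical probability of heads approaches the classical value (0.5)
# as the number of tosses grows.
random.seed(1)
for n in (10, 100, 1_000, 10_000, 100_000):
    heads = sum(random.random() < 0.5 for _ in range(n))
    print(f"{n:>7} tosses: empirical P(head) = {heads / n:.4f}")
```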

Subjective Probability
The likelihood (probability) of a particular event happening that is assigned by an individual
based on whatever information is available. Basically, guessing the probability on the basis of
information available.

Rules of Addition
Special Rule of Addition: P(A or B) = P(A) + P(B). To apply the special rule of addition, the events must be mutually exclusive. Mutually exclusive means that when one event occurs, none of the other events can occur at the same time.

Complement Rule
P(A) = 1 − P(~A). For example, the probability that a selected bag of mixed vegetables is underweight, P(A), plus the probability that it is not an underweight bag, written P(~A) and read “not A,” must logically equal 1.
The General Rule of Addition
If the events are not mutually exclusive: P(A or B) = P(A) + P(B) − P(A and B)

What is the probability a selected person visited either Disney World or Busch Gardens? (1) Add the probability that a tourist visited Disney World and the probability he or she visited Busch Gardens, and (2) subtract the probability of visiting both. For the expression P(A or B), the word or suggests that A may occur or B may occur. This also includes the possibility that A and B may occur. This use of or is sometimes called an inclusive or.
JOINT PROBABILITY A probability that measures the likelihood two or more events will
happen concurrently.
Rules of Multiplication
Special Rule of Multiplication: P(A and B) = P(A) × P(B), which requires that the two events A and B be independent.
INDEPENDENCE The occurrence of one event has no effect on the probability of the occurrence of another event. For example, when event B occurs after event A occurs, does A have any effect on the likelihood that event B occurs? If the answer is no, then A and B are independent events.

General Rule of Multiplication:

If the events are dependent: P(A and B) = P(A) × P(B | A)

P(B | A) is the conditional probability of B given that A has already happened.


Suppose there are 10 cans of soda in a cooler, 7 are regular and 3 are diet. A can is selected from
the cooler. The probability of selecting a can of diet soda is 3/10, and the probability of selecting
a can of regular soda is 7/10. Then a second can is selected from the cooler, without returning the
first. The probability the second is diet depends on whether the first one selected was diet or not.
The probability that the second is diet is:
2/9, if the first can is diet. (Only two cans of diet soda remain in the cooler.)
3/9, if the first can selected is regular. (All three diet sodas are still in the cooler.)
The fraction 2/9 (or 3/9) is aptly called a conditional probability because its value is conditional
on (dependent on) whether a diet or regular soda was the first selection from the cooler.
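A minimal simulation sketch confirming these conditional probabilities (the cooler setup is taken from the example above):

```python
import random

random.seed(7)
trials = 100_000
first_diet = 0
second_diet_after_diet = 0
for _ in range(trials):
    cooler = ["regular"] * 7 + ["diet"] * 3
    random.shuffle(cooler)
    first, second = cooler[0], cooler[1]   # two draws without replacement
    if first == "diet":
        first_diet += 1
        second_diet_after_diet += (second == "diet")

print(first_diet / trials)                  # ≈ 0.30  = 3/10
print(second_diet_after_diet / first_diet)  # ≈ 0.222 = 2/9
```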
Tree Diagram

Bayes’ Theorem
It helps revise an initial probability (prior) based on new evidence (likelihood). The updated
probability (posterior) tells us how likely an event is given the new information.

PRIOR PROBABILITY The initial probability based on the present level of information.
POSTERIOR PROBABILITY A revised probability based on additional information.
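The formula block did not survive extraction. The standard two-event statement (assuming the usual textbook form is what was intended) is:

P(A1 | B) = P(A1) · P(B | A1) / [ P(A1) · P(B | A1) + P(A2) · P(B | A2) ]

where A1 and A2 are mutually exclusive and collectively exhaustive events, P(A1) is the prior, and P(A1 | B) is the posterior after observing B.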
Chapter 6

PROBABILITY DISTRIBUTION A listing of all the outcomes of an experiment and the


probability associated with each outcome.

CHARACTERISTICS OF A PROBABILITY DISTRIBUTION


●​ The probability of a particular outcome is between 0 and 1 inclusive.
●​ The outcomes are mutually exclusive events.
●​ The list is exhaustive. So the sum of the probabilities of the various events is equal to 1.

A random variable is a numerical value that represents the outcome of a random event. It
assigns numbers to different possible outcomes in a probability experiment.

BINOMIAL PROBABILITY DISTRIBUTION


The binomial probability distribution is used when an experiment consists of n independent
trials, where each trial has only two possible outcomes: success or failure.
Characteristics:
1. An outcome on each trial of an experiment is classified into one of two mutually exclusive
categories—a success or a failure.
2. The random variable counts the number of successes in a fixed number of trials.
3. The probability of success and failure stay the same for each trial.
4. The trials are independent, meaning that the outcome of one trial does not affect the outcome
of any other trial.
Cumulative Binomial
Instead of finding the probability of exactly k successes, cumulative probability finds the probability of at most, or at least, a certain number of successes.
For a binomially distributed random variable X with n trials and success probability p:

a) Cumulative Probability of "At Most" k Successes


P(X≤k)=P(X=0)+P(X=1)+...+P(X=k)

b) Cumulative Probability of "At Least" k Successes


P(X≥k)=1−P(X≤k−1)
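A minimal Python sketch of the binomial formula and both cumulative forms (n and p here are hypothetical):

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) = C(n, k) * p**k * (1 - p)**(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k): the 'at most k successes' cumulative probability."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n, p = 10, 0.3
print(binom_pmf(3, n, p))        # P(X = 3)
print(binom_cdf(3, n, p))        # P(X <= 3)
print(1 - binom_cdf(2, n, p))    # P(X >= 3) = 1 - P(X <= 2)
```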

Hypergeometric Probability Distribution


The hypergeometric distribution is used when selecting objects without replacement from a
finite population. Unlike the binomial distribution (where trials are independent), in
hypergeometric distributions, the probability of success changes after each selection because the
sample is drawn without replacement.
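A minimal sketch of the hypergeometric formula P(X = x) = C(S, x) · C(N − S, n − x) / C(N, n), reusing the soda-cooler numbers from Chapter 5 as hypothetical inputs:

```python
from math import comb

def hypergeom_pmf(x: int, N: int, S: int, n: int) -> float:
    """P(X = x) when n items are drawn without replacement from a
    population of N items that contains S successes."""
    return comb(S, x) * comb(N - S, n - x) / comb(N, n)

# 3 diet cans among 10; draw 2 without replacement.
print(hypergeom_pmf(2, N=10, S=3, n=2))   # 3/45 ≈ 0.0667 = (3/10)(2/9)
```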

Poisson Probability Distribution


The Poisson probability distribution describes the number of times some event occurs during a
specified interval. The interval may be time, distance, area, or volume.
The distribution is based on two assumptions. The first assumption is that the probability is
proportional to the length of the interval. The second assumption is that the intervals are
independent.
Characteristics:
1. The random variable is the number of times some event occurs during a defined interval.
2. The probability of the event is proportional to the size of the interval.
3. The intervals do not overlap and are independent.
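The formula did not survive extraction; the standard form is P(x) = µ^x · e^(−µ) / x!, where µ is the mean number of occurrences in the interval. A minimal sketch with hypothetical numbers:

```python
from math import exp, factorial

def poisson_pmf(x: int, mu: float) -> float:
    """P(X = x) = mu**x * exp(-mu) / x!"""
    return mu**x * exp(-mu) / factorial(x)

# If an average of 2 calls arrive per minute, P(exactly 3 in a minute):
print(poisson_pmf(3, mu=2.0))   # ≈ 0.1804
```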

Chapter 7
Uniform Distribution
The distribution’s shape is rectangular and has a minimum value of a and a maximum of b.
Why is this rectangular?
The uniform distribution is called rectangular because its probability is evenly spread across
all possible values. This means that every outcome in the range has the same probability,
creating a flat, rectangular shape when graphed.
Example: rolling a fair die, where every number has the same probability (the discrete analogue).

Area = height × base = [1/(b − a)] × (b − a) = 1

Normal Probability Distribution


The Normal Probability Distribution, also called the Gaussian Distribution, is a bell-shaped
curve that describes how data is distributed around the mean.

Characteristics of a Normal Distribution

●​ Symmetrical: The left and right sides are mirror images.


●​ Bell-shaped: Most values cluster around the mean.
●​ Asymptotic: The curve gets closer and closer to the X-axis but never actually touches it.
●​ Mean = Median = Mode: The peak of the curve is at the mean.
●​ Total area under the curve = 1 (100%).
●​ Empirical Rule applies
Z Value

The signed distance between a selected value, designated X, and the mean, divided by the standard deviation: z = (X − µ) / σ

Interpretation : The Z-value (Z-score) measures how far a data point (or sample mean) is from
the population mean, in terms of standard deviations. It helps us determine how unusual or
typical a value is in a normal distribution.
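A minimal sketch of computing a z-value and the corresponding tail probability, using only the standard library (the mean and SD are hypothetical):

```python
from math import erf, sqrt

def z_score(x: float, mu: float, sigma: float) -> float:
    """Signed distance from the mean, in standard deviations."""
    return (x - mu) / sigma

def std_normal_cdf(z: float) -> float:
    """P(Z <= z) for the standard normal, via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

z = z_score(130, mu=100, sigma=15)
print(z)                       # 2.0
print(1 - std_normal_cdf(z))   # P(Z > 2.0) ≈ 0.0228
```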
Application of Standard Normal Distribution

Finding Z Value from Z Table


Finding z critical values using a z table

Chapter 8
Reasons to Sample
1. To contact the whole population would be time consuming.
2. To cut costs
3. The physical impossibility of checking all items in the population.
4. The destructive nature of some tests.
5. The sample results are adequate.
SIMPLE RANDOM SAMPLE A sample selected so that each item or person in the population
has the same chance of being included.
SYSTEMATIC RANDOM SAMPLE A random starting point is selected, and then every kth
member of the population is selected. k is calculated as the population size divided by the sample
size.
STRATIFIED RANDOM SAMPLE A population is divided into subgroups, called strata, and
a sample is randomly selected from each stratum.
CLUSTER SAMPLE A population is divided into clusters using naturally occurring geographic
or other boundaries. Then, clusters are randomly selected and a sample is collected by randomly
selecting from each cluster.

Simple Random Sampling (SRS)
●​ Definition: Every individual in the population has an equal chance of being selected.
●​ How it works: Use a random method to select individuals from the entire population.
●​ When to use: When the population is homogeneous and a completely random selection is needed.
●​ Advantages: Reduces bias; easy to implement.
●​ Disadvantages: May not represent subgroups well; can be inefficient for large populations.

Systematic Random Sampling
●​ Definition: Individuals are selected at regular intervals from an ordered list.
●​ How it works: Choose a starting point randomly, then select every k-th individual (where k = population size / sample size).
●​ When to use: When a simple and quick method is needed for large populations.
●​ Advantages: Easy to execute; ensures good coverage of the population.
●​ Disadvantages: Can introduce bias if there is a hidden pattern in the population.

Stratified Random Sampling
●​ Definition: The population is divided into subgroups (strata) based on characteristics, and random samples are taken from each.
●​ How it works: Divide the population into homogeneous strata (e.g., age, gender), then apply SRS within each stratum.
●​ When to use: When specific subgroups need to be represented proportionally.
●​ Advantages: More precise representation of the population; reduces variability in estimates.
●​ Disadvantages: Requires detailed knowledge of population strata; can be complex and time-consuming.

Cluster Sampling
●​ Definition: The population is divided into clusters, and some clusters are randomly selected; individuals in the selected clusters are included.
●​ How it works: Divide the population into heterogeneous clusters (e.g., neighborhoods, schools), randomly select clusters, then sample individuals within those clusters.
●​ When to use: When surveying a large, geographically dispersed population.
●​ Advantages: Cost-effective for large populations; useful for geographically spread populations.
●​ Disadvantages: Higher sampling error if clusters are not diverse enough; less precise than stratified sampling.

SRS: When the population is uniform.


Systematic: When a list of the population is available.
Stratified: When you need to ensure subgroup representation.
Cluster: When working with a large, spread-out population.
SAMPLING ERROR The difference between a sample statistic and its corresponding
population parameter. The difference between sample mean and population mean can be
considered as sampling error.

SAMPLING DISTRIBUTION OF THE SAMPLE MEAN A probability distribution of all


possible sample means of a given sample size.

The number of possible samples of size n from a population of size N is given by the combination formula: NCn = N! / [n!(N − n)!].


Relationships between the population distribution and the sampling distribution of the
sample mean:
1. The mean of the sample means is exactly equal to the population mean.
2. The dispersion of the sampling distribution of sample means is narrower than the population
distribution.
3. The sampling distribution of sample means tends to become bell-shaped and to approximate
the normal probability distribution.
CENTRAL LIMIT THEOREM If all samples of a particular size are selected from any
population, the sampling distribution of the sample mean is approximately a normal distribution.
This approximation improves with larger samples.
The central limit theorem indicates that, regardless of the shape of the population distribution,
the sampling distribution of the sample mean will move toward the normal probability
distribution.
The larger the number of observations in each sample, the stronger the convergence.
Standard Error of the Mean
Its longer name is the standard deviation of the sampling distribution of the sample mean: σx̄ = σ / √n (the population standard deviation divided by the square root of the sample size).

1. The mean of the distribution of sample means will be exactly equal to the population mean if
we are able to select all possible samples of the same size from a given population.
2. There will be less dispersion in the sampling distribution of the sample mean than in the
population.
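A minimal simulation sketch of these two facts and of the central limit theorem, drawing samples from a skewed (exponential) population:

```python
import random
import statistics

random.seed(42)
n, trials = 36, 10_000   # sample size and number of samples
means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(trials)
]
# The exponential(1) population has mean 1 and SD 1.
print(statistics.fmean(means))   # ≈ 1.0, the population mean
print(statistics.stdev(means))   # ≈ 1/sqrt(36) ≈ 0.167, the standard error
```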

Usage of Sampling Distribution


The sampling distribution of the sample mean will follow the normal probability distribution
under two conditions:
1. When the samples are taken from populations known to follow the normal distribution.
2. When the shape of the population distribution is not known or the shape is known to be
nonnormal, but our sample contains at least 30 observations. We should point out that the
number 30 is a guideline that has evolved over the years. In this case, the central limit theorem
guarantees the sampling distribution of the mean follows a normal distribution.

The z values express the sampling error in standard units—in other words, the standard error.
Chapter 9
Point Estimates
A point estimate is a single value (point) derived from a sample and used to estimate a
population value. For example, suppose we select a sample of 50 junior executives and ask how
many hours they worked last week. Compute the mean of this sample of 50 and use the value of
the sample mean as a point estimate of the unknown population mean.
Confidence Intervals
A range of values constructed from sample data so that the population parameter is likely to
occur within that range at a specified probability. The specified probability is called the level of
confidence.

Population SD known: x̄ ± z · σ/√n

Population SD unknown: x̄ ± t · s/√n
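A minimal sketch of both intervals (assuming SciPy is available; the numbers are hypothetical):

```python
from math import sqrt
from scipy import stats

def ci_known_sigma(xbar, sigma, n, conf=0.95):
    """x̄ ± z·σ/√n when the population SD is known."""
    z = stats.norm.ppf(1 - (1 - conf) / 2)
    half = z * sigma / sqrt(n)
    return xbar - half, xbar + half

def ci_unknown_sigma(xbar, s, n, conf=0.95):
    """x̄ ± t·s/√n when only the sample SD is known (df = n - 1)."""
    t = stats.t.ppf(1 - (1 - conf) / 2, df=n - 1)
    half = t * s / sqrt(n)
    return xbar - half, xbar + half

print(ci_known_sigma(xbar=24.0, sigma=4.0, n=36))
print(ci_unknown_sigma(xbar=24.0, s=4.0, n=16))
```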

Difference between point estimate and confidence intervals

| Aspect | Point Estimate | Confidence Interval (CI) |
| --- | --- | --- |
| Definition | A single value used to estimate a population parameter. | A range of values within which the true population parameter is expected to lie, with a given confidence level. |
| Example | A sample mean used to estimate the population mean. | A 95% CI for the mean weight of apples: (150 g, 160 g). |
| Precision | Provides only a single value as an estimate. | Provides a range that accounts for variability. |
| Accuracy | May not always be close to the true population parameter. | More reliable since it accounts for sampling error. |
| Reliability | Less reliable because it does not show uncertainty. | More reliable as it quantifies uncertainty with probability. |
| Usage | Quick estimation but lacks certainty. | Used for decision-making in statistics, research, and quality control. |

Sample Proportion
p = X / n, where X is the number of successes and n is the sample size.

Use Z-Statistic when:

●​ The population standard deviation is known.


●​ The sample size is large (n≥30).
●​ The population follows a normal distribution.
●​ The Z-distribution has a fixed shape, with thinner tails than the t-distribution.

Use T-Statistic when:

●​ The population standard deviation (σ) is unknown.


●​ The sample size is small (n<30)
●​ The population follows a normal distribution or is approximately normal.
●​ The T-distribution has heavier tails, meaning it accounts for more variability in small
samples.

Characteristics of t stat

The t-distribution (or Student’s t-distribution) is a probability distribution used in statistical


inference when the population standard deviation (σ) is unknown and the sample size is small
(n<30). It is commonly used in t-tests and confidence intervals for small samples.
1. Bell-Shaped and Symmetric

●​ The t-distribution is symmetrical and bell-shaped, similar to the normal (Z) distribution.
●​ However, it has heavier tails, meaning it accounts for more variability.

2. Dependent on Degrees of Freedom (df=n−1)

●​ The shape of the t-distribution changes based on sample size.


●​ Smaller sample sizes (low df) → more spread out (fatter tails).
●​ Larger sample sizes (high df) → approaches the standard normal distribution
(Z-distribution).

3. Mean and Variance

●​ Mean = 0 (like the standard normal distribution).


●​ Variance is greater than 1 but approaches 1 as sample size increases.

4. Used When Population Standard Deviation (σ) is Unknown

●​ When σ is unknown, the sample standard deviation (s) is used instead.


●​ This introduces extra uncertainty, making the t-distribution more spread out than the
normal distribution.

5. More Probability in the Tails (Fat Tails)

●​ The t-distribution has thicker tails than the normal distribution.


●​ This means there is a higher chance of extreme values compared to the normal
distribution.

6. Approaches the Normal Distribution as n→∞

●​ As sample size increases, the t-distribution converges to the normal distribution.


●​ This is because the sample standard deviation (s) becomes a better estimate of σ with
larger samples.

Appropriate Sample Size Determination


Our decision is based on three variables:
1. The margin of error the researcher will tolerate. There is a trade-off between the margin of error and sample size. A small margin of error will require a larger sample and more money and time to collect the sample. A larger margin of error will permit a smaller sample and a wider confidence interval.
2. The level of confidence desired, for example, 95 percent. The level of confidence is normally set high; a higher level requires a larger sample.
3. The variation or dispersion of the population being studied. If the population is widely dispersed, a large sample is required. If the population is concentrated (homogeneous), the required sample size will be smaller.
Margin of error: E = z · σ/√n, which gives the required sample size for estimating a mean: n = (z · σ / E)².

For a proportion: n = π(1 − π) · (z / E)², where π is the population proportion.
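A minimal sketch of both sample-size formulas (assuming SciPy for the z critical value; the inputs are hypothetical):

```python
from math import ceil
from scipy import stats

def n_for_mean(sigma, E, conf=0.95):
    """n = (z * sigma / E)**2, rounded up to the next whole observation."""
    z = stats.norm.ppf(1 - (1 - conf) / 2)
    return ceil((z * sigma / E) ** 2)

def n_for_proportion(pi, E, conf=0.95):
    """n = pi * (1 - pi) * (z / E)**2, rounded up."""
    z = stats.norm.ppf(1 - (1 - conf) / 2)
    return ceil(pi * (1 - pi) * (z / E) ** 2)

print(n_for_mean(sigma=10, E=2))          # 97
print(n_for_proportion(pi=0.5, E=0.03))   # 1068 (pi = 0.5 is the worst case)
```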

Chapter 10

HYPOTHESIS TESTING A procedure based on sample evidence and probability theory to


determine whether the hypothesis is a reasonable statement.
Five-Step Procedure for Testing a Hypothesis

NULL HYPOTHESIS A statement about the value of a population parameter developed for the
purpose of testing numerical evidence. H0 and read “H sub zero.”
ALTERNATE HYPOTHESIS A statement that is accepted if the sample data provide
sufficient evidence that the null hypothesis is false. H1 and read “H sub one.”
The null hypothesis always contains a form of the equality sign: =, ≥, or ≤.
The alternate hypothesis must contain ≠, >, or <.
LEVEL OF SIGNIFICANCE The probability of rejecting the null hypothesis when it is true.
It is also sometimes called the level of risk.
| | Fail to Reject H0 | Reject H0 |
| --- | --- | --- |
| H0 is true | Correct decision | Type I error (α) |
| H0 is false | Type II error (β) | Correct decision |

TEST STATISTIC A value, determined from sample information, used to determine whether to
reject the null hypothesis.

CRITICAL VALUE The dividing point between the region where the null hypothesis is
rejected and the region where it is not rejected.
DECISION MAKING
Reject H0 when:
Left-tailed: Zcalc < Zcrit (the critical value is negative). Used when testing whether the mean is significantly less than a specified value. Example: a researcher wants to test if the average blood pressure of a group is lower than 120 mmHg.
Right-tailed: Zcalc > Zcrit. Used when testing whether the mean is significantly greater than a specified value. Example: a factory wants to know if a machine produces more than 500 units per hour on average.
Two-tailed: |Zcalc| > |Zcrit|. Used when we are concerned with deviations in both directions, not just one. We reject H0 if the test statistic falls into either of the two extreme tails (upper or lower). The rejection region is split into two parts (each at α/2).
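A minimal sketch of a left-tailed one-sample z-test, using the blood-pressure example above with hypothetical sample numbers:

```python
from math import erf, sqrt

def std_normal_cdf(z):
    """P(Z <= z) for the standard normal."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# H0: mu >= 120 vs H1: mu < 120 (left-tailed), alpha = 0.05
mu0, xbar, sigma, n = 120, 117, 10, 49
z_calc = (xbar - mu0) / (sigma / sqrt(n))   # test statistic
p_value = std_normal_cdf(z_calc)            # left-tail area
print(z_calc)    # -2.1
print(p_value)   # ≈ 0.0179 < 0.05 -> reject H0
```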
Differences between One-Tailed and Two-Tailed Tests

P Value
The probability of observing a sample value as extreme as, or more extreme than, the value
observed, given that the null hypothesis is true.
1.​ The p-value helps determine statistical significance.
2.​ A small p-value (≤0.05) suggests rejecting H0
3.​ The p-value must be compared to the significance level (α)
Type II Error
Chapter 11

Two-Sample Tests of Hypothesis: Independent Samples

Used when we want to test the difference between the means or proportions of two independent samples.

H1​: μ1≠μ2 (Two-tailed test)


or μ1>μ2 (Right-tailed test)
or μ1<μ2 (Left-tailed test)

Two-Sample Test for Proportions

Two-Sample Tests of Hypothesis: Dependent Samples


Paired t-Test: used for dependent (matched or before/after) samples; the test is performed on the differences between paired observations.
Pooled t-Test: used when two independent samples come from populations with equal variances.
Degrees of Freedom

Degrees of freedom (df) refer to the number of values in a calculation that are free to vary while still satisfying a given condition. For example, if you have 5 numbers with a known mean, only 4 numbers can be freely chosen because the last one is fixed by the mean.

In a t-test:

●​ For a one-sample t-test: df=n−1 (since the sample mean is already known).
●​ For an independent two-sample t-test (pooled): df=n1+n2−2
●​ For a paired t-test: df=n−1 (since differences are calculated first).
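A minimal sketch of the three tests (assuming SciPy; all data are hypothetical):

```python
from scipy import stats

before = [12.1, 11.4, 13.0, 12.7, 11.9]    # paired (dependent) samples
after = [11.2, 11.0, 12.1, 12.3, 11.5]
group_a = [5.1, 4.8, 5.6, 5.3, 4.9, 5.2]   # independent samples
group_b = [4.6, 4.9, 4.4, 4.8, 4.5, 4.7]

# Paired t-test (dependent samples): df = n - 1
print(stats.ttest_rel(before, after))

# Pooled t-test (independent samples, equal variances): df = n1 + n2 - 2
print(stats.ttest_ind(group_a, group_b, equal_var=True))

# Welch's t-test (unequal variances): equal_var=False
print(stats.ttest_ind(group_a, group_b, equal_var=False))
```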

Equal and Unequal Variance

| Feature | Equal variances (pooled t-test) | Unequal variances (Welch's t-test) |
| --- | --- | --- |
| Assumption | Population variances are equal | Population variances are not equal |
| Pooled variance | Used, for better precision | Not used |
| When to use | When sample variances are similar (check using Levene's test or an F-test) | When sample variances are significantly different |
| Advantages | More precise if the assumption holds | Works even if variances are different |
| Disadvantages | Can lead to errors if variances are actually unequal | Degrees of freedom are slightly more difficult to calculate |
