Reviewer in Statistics and Probability
Random Variable - a numerical quantity that is assigned to the outcome of an experiment. It is a variable
that assumes numerical values associated with the events of an experiment. A random variable is a
quantitative variable whose values depend on chance.
Sample Space - the set of all possible outcomes of a random experiment; denoted by a capital letter, usually
S.
Example: A coin is flipped three times. If 𝑋 represents the number of tails of the outcome, what are the
possible values of 𝑋?
Solution: 1. List the possible outcomes of the experiment. This can be done using a table or tree diagram.
Let 𝐻 represent heads and 𝑇 represent tails.
Outcome            HHH  HHT  HTH  THH  HTT  THT  TTH  TTT
Number of tails     0    1    1    1    2    2    2    3
Based on the table, the number of tails can be 0, 1, 2, or 3. Thus, the possible values of 𝑋 are 0, 1, 2, and 3.
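The listing step can also be checked by enumeration. The short Python sketch below is only an illustration (the variable names are chosen here and are not part of the reviewer); it builds all outcomes of three flips and counts the tails in each.

```python
from collections import Counter
from itertools import product

# All 2 * 2 * 2 = 8 outcomes of flipping a coin three times (H = heads, T = tails).
outcomes = ["".join(flips) for flips in product("HT", repeat=3)]

# X = number of tails in each outcome.
tails_per_outcome = {outcome: outcome.count("T") for outcome in outcomes}

# Distinct values X can take, and how many outcomes produce each value.
possible_values = sorted(set(tails_per_outcome.values()))
counts = Counter(tails_per_outcome.values())

for outcome, x in tails_per_outcome.items():
    print(outcome, x)
print("Possible values of X:", possible_values)
```

Dividing each count by 8 would give the probability of each value of X, assuming a fair coin.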
Probability Distribution - a mathematical function that gives the probability of each possible
value of a random variable. Probability distributions are often depicted using graphs or probability tables.
Continuous Random Variable - can assume an infinite number of values in one or more intervals
and can take fractional or decimal values. Example: A student’s weight is a continuous random variable
since its possible values can be represented by decimal numbers such as 80.3 kg and 57.12 kg. Its
possible values cannot be counted, because between any two weights there are infinitely many others.
Mean (µ) of a Discrete Random Variable - a measure of the central location of a random variable; the weighted
average of its possible values, with the probabilities as weights. µ = ∑ x p(x)
Variance and Standard Deviation - two values that describe how scattered or spread out the values of the
random variable are from its mean.
Variance: σ² = ∑(x − µ)² p(x)
Standard deviation: σ = √[∑(x − µ)² p(x)]
(σ² - variance, σ - standard deviation, µ - mean, p(x) - probability of the outcome x)
Example: The number of cars sold per day at a local car dealership, along with its corresponding
probabilities, is shown in the succeeding table. Compute the variance and the standard deviation of the
probability distribution by following the given steps.
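As a sketch of the computation, the Python block below applies the mean, variance, and standard deviation formulas to a probability table. The numbers in `distribution` are HYPOTHETICAL and are used only for illustration; they are not the car dealership data from the reviewer's table.

```python
from math import isclose, sqrt

# HYPOTHETICAL distribution (not the reviewer's table): x -> p(x).
distribution = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.3, 4: 0.1}

# A valid probability distribution must sum to 1.
assert isclose(sum(distribution.values()), 1.0)

# Mean: sum of x * p(x).
mean = sum(x * p for x, p in distribution.items())

# Variance: sum of (x - mean)^2 * p(x).
variance = sum((x - mean) ** 2 * p for x, p in distribution.items())

# Standard deviation: square root of the variance.
std_dev = sqrt(variance)

print(f"mean = {mean:.2f}, variance = {variance:.2f}, standard deviation = {std_dev:.2f}")
```

To use it with an actual table, replace the entries of `distribution` with the table's values and probabilities.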
Population - the entire group that you want to draw conclusions about.
Sample - the specific group, taken from the population, that you will collect data from.
Mean and Standard Deviation - the two parameters that determine a normal distribution.
Normal Distribution - also known as the Gaussian distribution; its graph is a symmetric, bell-shaped curve.
It is a probability distribution of continuous random variables. Many random variables are either normally
distributed or at least approximately normally distributed. Examples: heights, weights, and examination
scores.
Inflection points are the points that mark the change in the curve’s concavity.
On the normal curve, the inflection points are located at the mean minus one standard deviation (µ − σ)
and the mean plus one standard deviation (µ + σ).
• Note that each inflection point of the normal curve is one standard deviation away from the mean.
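The Z-Table - The leftmost column and the top row represent the z-values. The first two digits of the
z-value (ones and tenths places) are found in the leftmost column, and the last digit (hundredths place) is
found in the top row. The entry where the row and column meet is the area under the curve to the left of z.
Example: given z = 1.85, look up row 1.8 and column 0.05; the area is equal to 0.9678.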
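A table entry can be checked by computing the area to the left of z directly. The sketch below uses the error function from Python's standard `math` module; the function name `area_below` is chosen here for illustration.

```python
from math import erf, sqrt

def area_below(z: float) -> float:
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Same lookup as the z-table example: row 1.8, column 0.05.
print(round(area_below(1.85), 4))
```

The area to the right of z is 1 minus this value, and the area between two z-values is the difference of their areas.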
Percentile - a measure used in statistics indicating the value below which a given percentage of
the observations in a group fall.
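For a normally distributed variable, a percentile can be found by reversing the z-table lookup: start from the area and find the z-value. The sketch below uses `statistics.NormalDist` from the Python standard library. The mean of 75 and standard deviation of 10 are HYPOTHETICAL exam-score parameters assumed only for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# z-value below which 90% of the area lies (the 90th percentile).
z_90 = standard_normal.inv_cdf(0.90)

# HYPOTHETICAL exam scores: mean 75, standard deviation 10.
mean, sd = 75, 10

# Convert the z-value back to a raw score: x = mean + z * sd.
score_90 = mean + z_90 * sd

print(f"z for the 90th percentile = {z_90:.2f}")
print(f"90th percentile score = {score_90:.1f}")
```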
Random sampling - the selection of n elements from a population of size N, which is the subject of an
investigation or experiment, where each element of the population has an equal chance of being selected using
the appropriate sampling technique.
2. Systematic sampling - is a sampling technique in which members of the population are listed and
samples are selected at fixed intervals called sampling intervals. In this technique, every kth item in the list
is selected, beginning from a randomly selected starting point, where k = N / n.
For example, if we want to draw a sample of 200 from a population of 6,000, we can select every 30th person
in the list (6,000 / 200 = 30). In practice, a number between 1 and 30 is chosen randomly to act as the starting point.
3. Stratified random sampling - is a sampling procedure in which members of the population are
grouped on the basis of their homogeneity. This technique is used when there are a number of distinct
subgroups in the population within which full representation is required. The sample is constructed by
classifying the population into subpopulations or strata on the basis of certain characteristics of the
population, such as age, gender or socio-economic status. The selection of elements is then done
separately from within each stratum, usually by random or systematic sampling methods.
4. Cluster sampling - is sometimes referred to as area sampling and is applied on a geographical basis.
Generally, sampling is first performed at higher levels before going down to lower levels.
For example, samples are taken randomly from the provinces first, followed by cities, municipalities or
barangays, and then from households.
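Systematic and stratified random sampling can be sketched in a few lines of Python. Everything in this block is illustrative: the population is a HYPOTHETICAL numbered list of 6,000 members, the two age strata are made up, and proportional allocation (each stratum's share of the sample matches its share of the population) is an assumption, since the reviewer does not say how many elements to take from each stratum.

```python
import random

rng = random.Random(42)  # fixed seed so the illustration is repeatable

def systematic_sample(population, sample_size):
    """Pick every k-th member after a random start, where k = N // n."""
    interval = len(population) // sample_size
    start = rng.randint(1, interval)  # random start from 1 to k, inclusive
    return population[start - 1::interval][:sample_size]

def stratified_sample(strata, sample_size):
    """Simple random sample within each stratum, proportional to stratum size.

    Rounding can make the total differ slightly from sample_size when the
    proportions do not divide evenly.
    """
    total = sum(len(members) for members in strata.values())
    return {
        name: rng.sample(members, round(sample_size * len(members) / total))
        for name, members in strata.items()
    }

# HYPOTHETICAL population: 6,000 members numbered 1 to 6000.
population = list(range(1, 6001))
systematic = systematic_sample(population, 200)

# HYPOTHETICAL strata: 3,600 members in one age group, 2,400 in the other.
strata = {"ages 15-17": population[:3600], "ages 18-20": population[3600:]}
stratified = stratified_sample(strata, 200)

print("Systematic start and next pick:", systematic[0], systematic[1])
print({name: len(members) for name, members in stratified.items()})
```

With 6,000 members and a sample of 200, the sampling interval works out to 6,000 / 200 = 30, so consecutive picks in the systematic sample are 30 positions apart.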
B. Construct the sampling distribution of the sample means. List all the possible samples and get the mean
of every sample.
C. This time, let us make a probability distribution of the sample means. This probability distribution is
called the sampling distribution of the sample means.
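Steps B and C can be sketched in Python as follows. The population {1, 2, 3, 4, 5} and the sample size of 2 are HYPOTHETICAL values chosen only for illustration, and the samples are drawn without replacement, which is also an assumption.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [1, 2, 3, 4, 5]  # HYPOTHETICAL population
n = 2                         # HYPOTHETICAL sample size

# Step B: list all possible samples of size n and the mean of each sample.
samples = list(combinations(population, n))
sample_means = [Fraction(sum(sample), n) for sample in samples]

# Step C: probability of each sample mean = frequency / number of samples.
counts = Counter(sample_means)
sampling_distribution = {
    mean: Fraction(count, len(samples)) for mean, count in sorted(counts.items())
}

for mean, probability in sampling_distribution.items():
    print(f"sample mean = {float(mean):.1f}, probability = {probability}")

# Mean of the sampling distribution, computed like any discrete mean.
mean_of_means = sum(mean * p for mean, p in sampling_distribution.items())
print("Mean of the sample means:", mean_of_means)
```

In this sketch the mean of the sample means comes out equal to the population mean, which is 3 for the hypothetical population.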
Central Limit Theorem - states that the sampling distribution of the mean approaches a normal
distribution as the sample size increases.
Regardless of the initial shape of the population distribution, if samples of size n are randomly selected
from a population, the sampling distribution of the sample means will approach a normal distribution as
the sample size n gets larger.
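A small simulation can illustrate the theorem. In the sketch below, the population is assumed to be exponential with mean 1, a strongly right-skewed shape picked only for illustration, and skewness is used as a rough indicator of how far the distribution of sample means is from the symmetric normal curve. The sample sizes 2 and 30 and the 5,000 repetitions are arbitrary choices.

```python
import random
from statistics import fmean, pstdev

rng = random.Random(1)  # fixed seed so the illustration is repeatable

def skewness(values):
    """Skewness of a list of values; 0 for a perfectly symmetric shape."""
    m, s = fmean(values), pstdev(values)
    return fmean([((v - m) / s) ** 3 for v in values])

def simulate_sample_means(n, repetitions=5000):
    """Means of many random samples of size n from a skewed population."""
    return [
        fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(repetitions)
    ]

results = {}
for n in (2, 30):
    means = simulate_sample_means(n)
    results[n] = {
        "mean": fmean(means),        # stays close to the population mean (1)
        "sd": pstdev(means),         # shrinks as n grows
        "skewness": skewness(means), # moves toward 0 as n grows
    }
    print(n, results[n])
```

The skewness of the sample means should be noticeably smaller for n = 30 than for n = 2, which is the behavior the theorem describes.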