Statistics & Probability

The document provides an overview of statistics and probability, covering concepts such as probability, random variables, and measures of central tendency. It explains discrete and continuous random variables, probability distributions, and the normal distribution, including its properties and the use of z-scores. Additionally, it discusses the importance of random sampling techniques in data collection.

Uploaded by

24-12782
Copyright
© All Rights Reserved
STATISTICS & PROBABILITY

UNDERSTANDING OF PROBABILITY & EXPLORING
RANDOM VARIABLE

●Probability
★ a field of mathematics that deals with chance.
●Experiment
★ an activity whose results cannot be predicted with certainty. Each repetition of an experiment is called a trial.
●Outcome
★ a result of an experiment.
●Event
★ any collection of outcomes.
●Simple event
★ an event with only one possible outcome.
●Sample space (S)
★ contains all possible outcomes of the experiment.
★ For one roll of a die: Sample space = {1, 2, 3, 4, 5, 6} (MUST ALWAYS BE IN BRACES!)

●PROBABILITY
★ Probability of an event: P(event) = n(event) / n(sample space)
★ n(event) - no. of outcomes of the event
★ n(sample space) - no. of all possible outcomes

EXPLORING RANDOM VARIABLE

●Random variable
★ a result of a chance event that you can measure or count.
★ may be viewed as a way to map the outcomes of a statistical experiment, determined by chance, into numbers.
★ a set whose elements are the numbers assigned to the outcomes of an experiment, in experiments such as:
■ tossing a coin three times
■ rolling a die twice
■ drawing two balls from a box

DISCRETE AND CONTINUOUS RANDOM VARIABLE

●Discrete random variable
★ has a countable number of possible values
●Continuous random variable
★ can assume an infinite number of values in one or more intervals
★ needs measurement
●EXAMPLE

DISCRETE RANDOM VARIABLE                                  CONTINUOUS RANDOM VARIABLE
no. of pens in a box                                      amount of antibiotics in a vial
no. of ants in a colony                                   length of electric wires
no. of ripe bananas in a basket                           voltage of car batteries
no. of COVID 19 positive cases in Tanauan, Batangas       weight of newborn in the hospital
no. of defective batteries                                amount of sugar in a cup of coffee

PROBABILITY DISTRIBUTIONS OF DISCRETE
RANDOM VARIABLES

●Discrete probability distribution
★ consists of the values that a random variable can assume and the corresponding probabilities of those values.

●NOTE!!
★ ALWAYS SIMPLIFY THE ANSWER
★ HISTOGRAM SHOULD BE SHADED
★ THE SUM OF ALL PROBABILITIES SHOULD BE EQUAL TO 1
★ PROBABILITIES SHOULD BE CONFINED BETWEEN 0 AND 1 (NEGATIVE IS NOT ALLOWED)
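
The rules in the note can be checked on a small case. The sketch below is illustrative only: it assumes a fair coin tossed three times and takes X to be the number of heads, then builds the distribution by counting outcomes, P(X = x) = n(event) / n(sample space). The variable names are made up for this example.

```python
from fractions import Fraction
from itertools import product

# Sample space S: every outcome of tossing a coin three times (8 outcomes).
sample_space = list(product("HT", repeat=3))

# Random variable X: number of heads in each outcome.
counts = {}
for outcome in sample_space:
    x = outcome.count("H")
    counts[x] = counts.get(x, 0) + 1

# P(X = x) = n(event) / n(sample space); Fraction keeps the answer simplified.
distribution = {x: Fraction(n, len(sample_space)) for x, n in sorted(counts.items())}

for x, p in distribution.items():
    print(f"P(X = {x}) = {p}")

# Checks from the note: each probability is between 0 and 1, and they sum to 1.
assert all(0 <= p <= 1 for p in distribution.values())
assert sum(distribution.values()) == 1
```

Using `Fraction` instead of decimals keeps every probability in simplified form, which matches the first rule of the note.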

COMPUTING PROBABILITY CORRESPONDING TO A
GIVEN RANDOM VARIABLE

SYMBOL    MEANING                      WORD PHRASES
<         Less than                    Fewer than, below
>         Greater than                 More than, above
≤         Less than or equal to        At most, no more than
≥         Greater than or equal to     At least, no less than
≠         Not equal to                 Not equal to

MEASURE OF CENTRAL TENDENCY OF
UNGROUPED DATA

●Mean
★ most commonly used measure of central tendency. When we speak of an average, we usually refer to the mean. It is symbolized as (x̄) (read as "x-bar").
★ Formula: x̄ = Σx / N
●Median
★ middle value in an ordered set of data. It is symbolized as (x̃) (read as "x-tilde").
★ If there are two middle values, add them and then divide by two.
●Mode
★ the value that occurs most often in the data set. It is symbolized as (x̂) (read as "x-hat").
★ Types of modes: unimodal, bimodal, trimodal, & multimodal
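
As a quick illustration of the three measures for ungrouped data, here is a minimal sketch using Python's standard `statistics` module. The data set is hypothetical and chosen only so that the list has an even number of values and a single mode.

```python
from statistics import mean, median, multimode

# Hypothetical ungrouped data set (illustrative values only).
scores = [7, 8, 8, 9, 10, 12]

x_bar = mean(scores)        # sum of the values divided by N
x_tilde = median(scores)    # even count: average of the two middle values
x_hat = multimode(scores)   # list of every value tied for most frequent

print("mean:", x_bar)
print("median:", x_tilde)
print("mode(s):", x_hat)
```

`multimode` returns a list, so a bimodal or trimodal data set simply produces a list with two or three entries.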

MEASURE OF CENTRAL TENDENCY OF GROUPED
DATA

●Frequency Distribution Table
★ The number of pieces of data that fall into a particular class is called the frequency of the class. A table listing all classes and their frequencies is called a frequency distribution.
●Mean
★ Formula: x̄ = Σ(fx) / Σf, where f is the frequency of a class and x is its class mark (midpoint).
●Mode
●Median
★ the middle value in a set of quantities.
★ separates an ordered set of data into two equal parts. Half of the quantities are located above the median and the other half are found below it.
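
The grouped-data mean formula, x̄ = Σ(fx) / Σf, can be sketched as follows. The frequency table is hypothetical, and x is taken to be the class mark (midpoint) of each class, which is the usual convention but an assumption as far as these notes go.

```python
# Hypothetical frequency distribution: (lower limit, upper limit, frequency).
classes = [(10, 19, 3), (20, 29, 5), (30, 39, 2)]

# x is the class mark (midpoint) of each class, f is its frequency.
sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
sum_f = sum(f for _, _, f in classes)

grouped_mean = sum_fx / sum_f
print(grouped_mean)
```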

MEAN OF A DISCRETE RANDOM VARIABLE

●​Mean of a discrete random variable


★ The mean µ of a discrete random variable is the central value or average of its corresponding probability mass function. It is also called the expected value. It is computed using the formula: µ = Σ x·P(x)
★ where x is the outcome and P(x) is the probability of the outcome.
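
A minimal sketch of the formula µ = Σ x·P(x). The distribution used here is an assumption made for illustration (X = number of heads in three tosses of a fair coin), and `expected_value` is a hypothetical helper name, not something defined in these notes.

```python
from fractions import Fraction

# Assumed distribution: X = number of heads in three tosses of a fair coin.
distribution = {
    0: Fraction(1, 8),
    1: Fraction(3, 8),
    2: Fraction(3, 8),
    3: Fraction(1, 8),
}

def expected_value(dist):
    """Return the mean of a discrete random variable given as {x: P(x)}."""
    # Multiply each value by its probability, then add the products.
    return sum(x * p for x, p in dist.items())

mu = expected_value(distribution)
print(mu)  # 3/2
```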
●example 1
●example 2
●example 3
★ The probabilities that a customer will buy 1, 2, 3, 4, or 5 items in a grocery store are 3/10, 1/10, 1/10, 2/10, and 3/10, respectively. What is the average number of items that a customer will buy?
★ To solve the above problem, we will follow 3 steps.
■ Step 1: Construct the probability distribution for the random variable X representing the number of items that the customer will buy.
■ Step 2: Multiply each value of the random variable X by the corresponding probability.
■ Step 3: Add the results obtained in Step 2. The result is the mean of the probability distribution.
★ Following these steps: µ = 1(3/10) + 2(1/10) + 3(1/10) + 4(2/10) + 5(3/10) = 31/10 = 3.1, so a customer buys 3.1 items on average.

VARIANCE AND STANDARD DEVIATION OF A
RANDOM VARIABLE

●The variance and standard deviation are two values that describe how scattered or spread out the scores are from the mean value of the random variable. The variance, denoted as σ², is determined using the formula: σ² = Σ(x − µ)²·P(x)
★ The standard deviation σ is the square root of the variance, thus σ = √[Σ(x − µ)²·P(x)]
●Steps in finding the variance and standard deviation
★ 1. Find the mean of the probability distribution.
★ 2. Subtract the mean from each value of the random variable X.
★ 3. Square the results obtained in Step 2.
★ 4. Multiply the results obtained in Step 3 by the corresponding probability.
★ 5. Get the sum of the results obtained in Step 4.
★ 6. The result obtained is the variance of the probability distribution. Its square root is the standard deviation.
●example 1

THE NORMAL DISTRIBUTION AND ITS PROPERTIES

●The Normal Distribution
★ Normal Probability Distribution - a probability distribution of continuous random variables.

★ Many random variables are either normally distributed or at least approximately normally distributed. EXAMPLE: heights, weights, and examination scores
●The properties of normal distribution
★ 1. The graph is a continuous curve and has a domain -∞ < X < ∞. This means that X may increase or decrease without bound.
★ 2. The graph is asymptotic to the x-axis. The curve gets closer and closer to the x-axis but never touches it, so its height is never equal to 0.
★ 3. The highest point on the curve occurs at x = µ (mean). The mean (µ) indicates the highest peak of the curve and is found at the center.
■ The median and mode of the distribution are also found at the center of the graph. This indicates that in a normal distribution, the mean, median and mode are equal.
★ 4. The curve is symmetrical about the mean. This means that the curve will have balanced proportions when cut in halves.
★ 5. The total area in the normal distribution under the curve is equal to 1. Since the mean divides the curve into halves, 50% of the area is to the right and 50% to its left, having a total of 100% or 1.
★ 6. In general, the graph of a normal distribution is a bell-shaped curve with two inflection points, one on the left and another on the right. Inflection points are the points that mark the change in the curve's concavity.
■ On the normal curve, the inflection points occur at the mean minus one standard deviation and at the mean plus one standard deviation.
■ Note that each inflection point of the normal curve is one standard deviation away from the mean.
★ 7. Every normal curve corresponds to the "empirical rule" (also called the 68 - 95 - 99.7% rule):
■ About 68.3% of the area under the curve falls within 1 standard deviation of the mean (half = 34.15%)
■ About 95.4% of the area under the curve falls within 2 standard deviations of the mean (half = 47.7%)
■ About 99.7% of the area under the curve falls within 3 standard deviations of the mean (half = 49.85%)

THE STANDARD NORMAL DISTRIBUTION

●The standard normal distribution, which is denoted by Z, is also a normal distribution having a mean of 0 and a standard deviation of 1.
●Since the normal distribution can have different values for its mean and standard deviation, it can be standardized so that µ = 0 and σ = 1.
●The Z - Table
★ Let us get a closer look at the z-table. The outermost column and row represent the z-values. The first two digits of the z-value are found in the leftmost column and the last digit (hundredths place) is found on the first row.
●Identifying regions under the normal curve
★ Now that you already know how to use the z-table to find the corresponding area for the z-score, let us identify the regions under the normal curve that correspond to different standard normal values. In order to find the regions, a probability notation is used.
★ The probability notation P(a < Z < b) indicates that the z-value is between a and b, P(Z > a) means the z-value is above a, and P(Z < a) means the z-value is below a. It would not matter whether we are considering P(Z < a) or P(Z ≤ a), or P(Z > a) or P(Z ≥ a), because the area at a single point is zero.
●Steps
★ 1: Draw a normal curve, locate the z-scores, and shade
★ 2: Locate the corresponding area of the z-scores in the z-table.

★ 3: If you are looking for the area between two z-scores, simply subtract the corresponding areas to arrive at the answer (this applies when both z-scores are on the same side of the mean; when they are on opposite sides, add the two areas).
●The z-score
★ The z-score is an essential component of the standard normal distribution. It allows us to describe a given set of data by finding the z-scores. This leads us to the question of how z-scores are identified.
★ Given a normal random variable X with mean (µ) and standard deviation (σ), each value x of the variable can be transformed into a z-score using the formula: z = (x − µ) / σ
●Steps
★ 1: Write the formula
★ 2: Substitute the given values
★ 3: Perform the operation
★ 4: Write the corresponding z-score
●NOTE: the answer must be in percent form with 2 decimal places. If the answer is 95.5, make it 95.50.

LOCATING PERCENTILES UNDER THE NORMAL
CURVE

●percentile
★ A percentile is a measure of relative standing. It is a descriptive measure of the relationship of a measurement to the rest of the data.
●Remember
★ When we are given the area and we wish to find the corresponding z-value, we locate the given area in the body of the table.
★ If the exact area is not available, we take the nearest value. Then, we look up the corresponding z-value in the Table of Areas Under the Normal Curve or z-table.
●Three important things to remember
★ First, a probability value corresponds to an area under the normal curve.
★ Second, in the Table of Areas Under the Normal Curve, the numbers in the extreme left and across the top are z-scores, which are the distances along the horizontal scale. The numbers in the body of the table are areas or probabilities.
★ Third, the z-scores to the left of the mean are negative values.
●Example 1
★ Having obtained a score of 85 in a recently concluded unit test in Science, John wanted to know how he fared in comparison with his classmates. His teacher told him that he scored at the 90th percentile. What is the corresponding z-score of the 90th percentile?
★ Procedure:
■ STEP 1: Draw the appropriate normal curve
■ STEP 2: Split 90% or 0.9000 into 0.5000 + 0.4000
■ STEP 3: Shade 0.5000 of the sketch of the normal curve in Step 1
■ STEP 4: Find the area of 0.4000 in the body of the z-table. If it cannot be found in the table, get the area value nearest to it.
■ STEP 5: Identify the corresponding z-score of the found area.
★ The nearest value is 0.3997
★ The corresponding z-score is 1.28.
★ Therefore the z-score that corresponds to the 90th percentile on the normal curve is 1.28.
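
The table lookups in this section can be cross-checked numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module in place of a printed z-table. The helper names (`z_score`, `area_from_mean`, `percentile_to_z`) are hypothetical, and `area_from_mean` assumes the table lists the area between the mean and z, which is how the 0.5000 + 0.4000 split in Example 1 reads.

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)  # standard normal distribution

def z_score(x, mu, sigma):
    """Convert a raw value x to a z-score: z = (x - mu) / sigma."""
    return (x - mu) / sigma

def area_from_mean(z):
    """Area between the mean (z = 0) and z, as in the body of the z-table."""
    return abs(Z.cdf(z) - 0.5)

def percentile_to_z(percentile):
    """z-score below which the given percentage of the area lies."""
    return Z.inv_cdf(percentile / 100)

print(round(area_from_mean(1.28), 4))  # 0.3997, the table entry for z = 1.28
print(round(percentile_to_z(90), 2))   # 1.28, as in Example 1

# Hypothetical raw score: x = 90 with mu = 80 and sigma = 5.
print(z_score(90, 80, 5))

# Empirical rule: area within k standard deviations of the mean.
for k in (1, 2, 3):
    print(k, round(Z.cdf(k) - Z.cdf(-k), 4))
```

`inv_cdf` returns the exact z-value rather than the nearest table entry, so results can differ from a table lookup in the third decimal place.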

●Example 2
★ A score is in the 96th percentile. Where is the score under the normal curve?
★ Procedure:
■ STEP 1: Draw the appropriate normal curve
■ STEP 2: Split 96% or 0.9600 into 0.5000 + 0.4600
■ STEP 3: Shade 0.5000 of the sketch of the normal curve in Step 1
■ STEP 4: Find the area of 0.4600 in the body of the z-table. If it cannot be found in the table, get the area value nearest to it.
■ STEP 5: Identify the corresponding z-score of the found area.
★ The nearest value is 0.4599
★ The corresponding z-score of 0.4599 is 1.75.
★ Therefore the z-score that corresponds to the 96th percentile on the normal curve is 1.75.
●Example 3
★ Find the upper 10% of the normal curve.
★ Solution:
■ Express the given percentage as a probability: 10% or 0.1000
■ Using the upper side of the mean, find the remaining area: 0.5000 – 0.1000 = 0.4000
■ Find the area of 0.4000 in the body of the z-table. The nearest value is 0.3997
★ Therefore the upper 10% is above z = 1.28
●Example 4
★ The results of a nationwide aptitude test in mathematics are normally distributed with µ = 80 and σ = 15. What is the percentile rank of a score of 84?
★ Solution:
■ Convert the raw score of 84 to z-score form: z = (x − µ) / σ = (84 − 80) / 15 = 0.27
■ Find the area that corresponds to z = 0.27: 0.1064
■ Get the total area below z = 0.27: 0.5000 + 0.1064 = 0.6064
★ The percentile rank of the score 84 in the test is 60.64%.

IDENTIFYING THE DIFFERENT RANDOM SAMPLING
TECHNIQUE

●Definition of terms
★ Population – the set of all possible values of a variable.
★ Sample – consists of one or more data drawn from the population.
★ Random Sampling – a sampling method of choosing representatives from the population wherein every member of the population has an equal chance of being selected. Accurate data can be collected using random sampling techniques.
★ Probability Sampling – the sampling techniques that involve random selection.
★ Non–Probability Sampling – the sampling techniques that do not involve random selection of data.
●Different types of RANDOM sampling
★ SIMPLE RANDOM SAMPLING – the most basic random sampling, wherein each element in the population has an equal probability of being selected.
★ SYSTEMATIC RANDOM SAMPLING – this can be done by listing all the elements in the population and selecting every kth element in your population list.
★ STRATIFIED RANDOM SAMPLING – a random sampling wherein the population is divided into different strata or divisions. The number of samples is picked proportionately from each stratum, which is why all strata are represented in the samples.

★ CLUSTER SAMPLING – a random sampling wherein the population is divided into clusters or groups and then the clusters are randomly selected. All elements of the randomly selected clusters are considered the samples of the study.
●Different types of NON-PROBABILITY sampling
★ CONVENIENCE SAMPLING – the researchers gather data from nearby sources of information, exerting minimal effort. Convenience sampling is used by persons giving out questionnaires on the streets to passersby.
★ SNOWBALL SAMPLING – a non-probability sampling technique used when the samples have traits that are rare to find. Existing subjects provide referrals to recruit the samples required for a research study.
★ QUOTA SAMPLING – sample units are picked for convenience, but certain quotas are given to interviewers. This design is especially used in market research. Researchers choose these individuals according to specific traits or qualities.
★ VOLUNTEER SAMPLING – sample units are volunteers in studies wherein the measuring process is painful or troublesome to a respondent.
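
The four random (probability) sampling techniques described earlier, simple, systematic, stratified, and cluster, can be sketched in a few lines with Python's standard `random` module. Everything concrete here is an assumption made for illustration: a population of 100 numbered units, a sample size of 10, two strata of 60 and 40 units, and five clusters of 20 units.

```python
import random

rng = random.Random(0)            # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical population of 100 numbered units

# Simple random sampling: every element has the same chance of selection.
simple = rng.sample(population, 10)

# Systematic sampling: random start, then every k-th element of the list.
k = len(population) // 10
start = rng.randrange(k)
systematic = population[start::k]

# Stratified sampling: sample each stratum in proportion to its size.
strata = {"A": population[:60], "B": population[60:]}
stratified = []
for members in strata.values():
    n = round(10 * len(members) / len(population))
    stratified.extend(rng.sample(members, n))

# Cluster sampling: pick whole clusters at random and keep all their elements.
clusters = [population[i:i + 20] for i in range(0, 100, 20)]
cluster_sample = [unit for cluster in rng.sample(clusters, 2) for unit in cluster]

print(simple, systematic, stratified, cluster_sample, sep="\n")
```

The non-probability techniques are not sketched because, by definition, they do not rely on random selection.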

ESTIMATION OF PARAMETERS

●​Parameter
★​ It is a descriptive population measure. It is a
measure of the characteristics of the entire
population (a mass of all the units under
consideration that share common
characteristics) based on all the elements
within that population.
■​ ex. All people living in one city, all male
teenagers worldwide, all elements in a
shopping cart, and all students in a
classroom.

●​Statistic
★​ It is a number that describes the sample. It can
be calculated and observed directly.
★ A statistic is a characteristic of a sample. You get the sample statistic when you collect the sample and calculate, for example, its mean and standard deviation.
★​ ex. Researchers interviewed 70% of Covid 19
survivors.
●​Definition of Terms
★​ Parameter
■​ The measurement or quantity that
describes the population
★​ Statistic
■ The measurement or quantity that
describes the sample

COMPUTING FOR THE PARAMETER AND STATISTIC

●​Population Mean

●​Population Variance

●​Population Standard Deviation


●​Sample Mean

●​Sample Variance

●​Sample Standard Deviation
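
The formulas for these six measures are not reproduced in the text above, so the sketch below assumes the standard definitions: population variance divides the sum of squared deviations by N, while sample variance divides by n − 1. It uses Python's standard `statistics` module on a small hypothetical data set.

```python
from statistics import mean, pstdev, pvariance, stdev, variance

data = [4, 8, 6, 5, 7]  # hypothetical values

# Treating the data as an entire population (parameters): divide by N.
mu = mean(data)
sigma_sq = pvariance(data)
sigma = pstdev(data)

# Treating the same data as a sample (statistics): divide by n - 1.
x_bar = mean(data)
s_sq = variance(data)
s = stdev(data)

print("population:", mu, sigma_sq, round(sigma, 4))
print("sample:    ", x_bar, s_sq, round(s, 4))
```

The mean is computed the same way in both cases; only the variance and standard deviation change with the divisor.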
