Q3 Lectures STATS

Reviewer for stats in 3rd quarter

Uploaded by

Alexa Cunanan

STATISTICS AND PROBABILITY – 3RD QUARTER

INTRODUCTION TO STATISTICS AND PROBABILITY

Statistics - It is the science of conducting studies to (1) Collect, (2) Organize, (3) Present, (4) Analyze, and (5) Interpret data. [COPAI]

Data are values that the variables can assume.


- They are the raw materials which the statistician works on.
- Data can be found through surveys, experiments, numerical records, and other modes of research.
Variable - is a characteristic that is observable or measurable in every unit of the universe.
Population - is a set of all possible values of a variable.
Sample Group- is a subgroup of the population.
Statistician - is a person who simply collects information or one who prepares analysis or interpretations.
Categories of Statistics
1. Descriptive - is concerned with collecting, organizing, presenting, and analyzing numerical data.
2. Inferential - is concerned with analyzing the organized data leading to prediction or inferences.
Probability- is a body of knowledge that focuses on activities that involve predicting chances and quantifying
the randomness of events. It is primarily concerned with predicting chances, especially the occurrence of an
event.
An experiment is any probability activity that gives results which are known as outcomes.
The outcome is a result of a single trial of an experiment.
An event in probability can be defined as certain outcomes of a random experiment.
LESSON: RANDOM VARIABLE (DISCRETE AND CONTINUOUS)
Random Variable - is a variable whose value is unknown or a function that assigns values to each of an
experiment’s outcomes.
Types of Variables
1. Qualitative variables - Represent differences in quality, character, or kind but not in amount.
Example: Gender, Color, Size (S/M/L), Skin complexion, Status
Levels of Measurement:
a. Nominal (qualitative)
b. Ordinal (qualitative)
c. Interval (quantitative)
d. Ratio (quantitative)
2. Quantitative variables - Are numerical in nature and can be ordered or ranked.
Example: Height, Weight, Age, Distance, Days
Classification of Quantitative Random Variable:
a. Discrete random variable - A variable which can take only a finite or countable number of distinct values. It represents count data.
Examples:
- Number of students in a classroom
- Number of storms per year in the Philippines
- Price of a pen

b. Continuous random variable - A variable which takes an infinite number of possible values on a
continuous scale. It represents measured data.
Examples:
- Length of time it takes to go from Pampanga to Manila via bus.
- Weight of an infant
- Height of a grade 11 pupil

LESSON: FINDING THE POSSIBLE VALUES OF RANDOM VARIABLES


Algebra Variable
In algebra a variable, like x, is an unknown value:
Example:

MISS KATE S. ESGUERRA



x + 2 = 6 (To find the value of x, use the subtraction property of equality.)
Solution:
x + 2 = 6
x + 2 – 2 = 6 – 2
x = 4
Random Variable
A random variable has a whole set of values, and it could take on any of those values, randomly.
Example 1: throw a die once
(random variable) X = "the score shown on the top face"
X = 1, 2, 3, 4, 5 or 6
Sample space = { 1, 2, 3, 4, 5, 6 }
We can show the probability of any one value using this style:
P(X = value) = probability of that value
(random variable) X = { 1, 2, 3, 4, 5, 6 }
In this case the values are all equally likely, so the probability of any one is 1/6.
P(X = 1) = 1/6
P(X = 2) = 1/6
P(X = 3) = 1/6
P(X = 4) = 1/6
P(X = 5) = 1/6
P(X = 6) = 1/6
ΣP(X) = 6/6 = 1
Note that the sum of the probabilities = 1, as it should be.
Example 2: Drawing balls from a box
Two balls are drawn in succession without replacement from a box containing 5 red balls and 6 blue balls. Let A be the random variable representing the number of blue balls drawn. Find the values of the random variable A.
First determine the probabilities of the events.
Red-red = (5/11)(4/10) = 20/110 = 2/11
Red-blue = (5/11)(6/10) = 30/110 = 3/11
Blue-red = (6/11)(5/10) = 30/110 = 3/11
Blue-blue = (6/11)(5/10) = 30/110 = 3/11
(random variable) A = “the number of blue balls drawn”
(random variable) A = 0, 1, 2
Sample space = { 0, 1, 2 }
P(A = 0) = 2/11
P(A = 1) = 3/11 + 3/11 = 6/11
P(A = 2) = 3/11
ΣP(A) = 11/11 = 1
A    P(A)
0    2/11
1    6/11
2    3/11
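
The same probabilities can be checked by listing every possible draw. The sketch below is only an illustration: the names box, draws, and distribution are made up for this check, and it assumes every ordered pair of distinct balls is equally likely.

```python
from fractions import Fraction
from itertools import permutations
from collections import Counter

# Hypothetical labels: 5 red balls (R) and 6 blue balls (B), as in Example 2.
box = ["R"] * 5 + ["B"] * 6

# Every ordered draw of two different balls (11 * 10 = 110 equally likely draws).
draws = list(permutations(range(len(box)), 2))

# Count how many draws contain 0, 1, or 2 blue balls.
counts = Counter(sum(box[i] == "B" for i in draw) for draw in draws)

# Turn the counts into exact probabilities.
distribution = {a: Fraction(counts[a], len(draws)) for a in sorted(counts)}

for a, p in distribution.items():
    print(f"P(A = {a}) = {p}")
```

Using Fraction keeps the answers as exact fractions, so they can be compared directly with the hand computation.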

CHAPTER 1: PROBABILITY DISTRIBUTION


Probability distribution - A table which consists of the values of a random variable and the corresponding
probabilities of the values.

Properties of a Probability Distribution


1. The probability of each value of the random variable must be between or equal to 0 and 1. (0 ≤ P(X) ≤ 1)
2. The sum of the probabilities of all values of the random variable must be equal to 1. (ΣP(X) = 1)
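
The two properties can be turned into a quick check. This is a minimal sketch; the function name is_valid_distribution and the sample lists are assumptions made for illustration.

```python
from fractions import Fraction

def is_valid_distribution(probabilities):
    """Return True if every P(X) is between 0 and 1 and the probabilities sum to 1."""
    each_in_range = all(0 <= p <= 1 for p in probabilities)  # Property 1
    sums_to_one = sum(probabilities) == 1                    # Property 2
    return each_in_range and sums_to_one

fair_die = [Fraction(1, 6)] * 6                        # six equally likely values
not_a_distribution = [Fraction(1, 2), Fraction(3, 4)]  # sums to 5/4

print(is_valid_distribution(fair_die))
print(is_valid_distribution(not_a_distribution))
```

Exact fractions are used so that the sum can be compared with 1 directly; with decimal (float) values a small tolerance would be needed instead.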


Steps in Constructing a Probability Distribution


1. Determine the sample space.
2. Count the quantity of interest in each outcome and assign it as the value of the random variable X.
3. Assign a probability value P(X) to each value of the random variable.
Example: Tossing two coins
Suppose two coins are tossed. Let X be the random variable representing the number of heads that occur. Find the values of the random variable X.

1. Determine the sample space. Let H represent Head and T represent Tail.
The sample space for this experiment is:
S = {TT, TH, HT, HH}

2. Count the number of heads in each outcome in the sample space and assign this number to this outcome.
Possible outcomes    Value of the random variable X (number of heads)
TT                   0
TH                   1
HT                   1
HH                   2

3. There are three possible values of the random variable X representing the number of heads. These are 0, 1, and 2. Assign probability values P(X) to each value of the random variable.
The probability distribution of the discrete random variable X:
X (number of heads)    Probability P(X)
0                      1/4
1                      2/4
2                      1/4
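
The three steps can also be followed in code. The sketch below is only one possible way to do it; the names sample_space, heads, and distribution are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product
from collections import Counter

# Step 1: sample space for two coin tosses (H = Head, T = Tail).
sample_space = ["".join(toss) for toss in product("HT", repeat=2)]

# Step 2: value of X (number of heads) for each outcome.
heads = {outcome: outcome.count("H") for outcome in sample_space}

# Step 3: P(X = x) = (number of outcomes with x heads) / (total outcomes).
counts = Counter(heads.values())
distribution = {x: Fraction(counts[x], len(sample_space)) for x in sorted(counts)}

print(sample_space)
for x, p in distribution.items():
    print(f"P(X = {x}) = {p}")
```

Note that Fraction reduces 2/4 to 1/2, which is the same probability as in the table.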
LESSON: MEAN, VARIANCE, AND STANDARD DEVIATION OF PROBABILITY DISTRIBUTION
Mean (μ), Variance (σ²), and Standard Deviation (σ) of a Discrete Random Variable - are statistical measures used to describe the distribution of data.
Mean (μ) of a Probability Distribution - is the arithmetic average value of a random variable. The symbol μ is read as “mu” (pronounced “myu”).
Formula to find the mean of a probability distribution:
μ = Σ[X · P(X)]
Variance (σ²) of a Probability Distribution - measures how far the data values are dispersed from the mean. The symbol σ² is read as “sigma squared”.
Formula to find the variance of a probability distribution:
σ² = Σ[X² · P(X)] − μ²

Standard Deviation (σ) of a Probability Distribution - is the square root of the variance and measures the amount of dispersion of the values.
The symbol σ is read as “sigma”.
Formula to find the standard deviation of a probability distribution:
σ = √(Σ[X² · P(X)] − μ²)
where,
X – value of the random variable
P(X) – probability of X
Σ – summation symbol
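
As a worked check, the three formulas can be applied to the two-coin distribution from the previous lesson (X = number of heads). The variable names below are assumptions made for illustration.

```python
from math import sqrt

# Probability distribution of X = number of heads in two coin tosses.
distribution = {0: 0.25, 1: 0.50, 2: 0.25}

# Mean: sum of X * P(X).
mean = sum(x * p for x, p in distribution.items())

# Variance: sum of X^2 * P(X), minus the square of the mean.
variance = sum(x ** 2 * p for x, p in distribution.items()) - mean ** 2

# Standard deviation: square root of the variance.
std_dev = sqrt(variance)

print(mean, variance, round(std_dev, 4))
```

Here Σ[X · P(X)] = 0 + 0.5 + 0.5 = 1 and Σ[X² · P(X)] = 0 + 0.5 + 1 = 1.5, so the variance is 1.5 − 1² = 0.5.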


CHAPTER 2: NORMAL DISTRIBUTION

Lesson 2: Areas Under the Normal Curve


A standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1. Any normal distribution can be transformed into this type.

The figure above illustrates a standard normal distribution. The values on the horizontal axis are the values of the random variable Z, the transformed values of the random variable X.
The values of Z are computed using the formula:
Z = (X − μ) / σ
where: μ = mean and σ = standard deviation


Finding the Areas under the Normal Curve
The Standard Normal Distribution Table (z-Table) provides the area between the mean and some Z score. The z-table is
included in the next page for reference.

Four-Step Process in Finding the Areas Under the Normal Curve Given a z-Value
1. Express the given z-value in a four-digit form.
2. Using the z-Table, find the first two digits on the left column.
3. Match the third digit with the appropriate column on the right.
4. Read the area (or probability) at the intersection of the row and the column. This is the required area.
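
A table lookup can be checked with Python's standard library. This is a sketch only: the helper name area_mean_to_z is an assumption, and it computes the area from the cumulative distribution instead of reading a printed table.

```python
from statistics import NormalDist

def area_mean_to_z(z):
    """Area under the standard normal curve between the mean (0) and z."""
    return abs(NormalDist().cdf(z) - 0.5)

# z = 1.25: row 1.2 and column .05 in a z-table.
area = area_mean_to_z(1.25)
print(round(area, 4))
```

Because the curve is symmetric, a negative z-value gives the same area as the positive one, which is why the helper takes the absolute value.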

Lesson 3: Area under the Normal Curve and Probability Notation


• P(a < z < b) denotes the probability that the z-score is between a and b
• P(z > a) denotes the probability that the z-score is greater than a
• P(z < a) denotes the probability that the z-score is less than a
where a and b are z-score values.
Remember: An area under the curve can be expressed as a probability, a percentage, or a proportion.
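
Each notation corresponds to one area under the curve. The helper names p_less, p_greater, and p_between below are assumptions made for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def p_less(a):
    """P(z < a): area to the left of a."""
    return Z.cdf(a)

def p_greater(a):
    """P(z > a): area to the right of a."""
    return 1 - Z.cdf(a)

def p_between(a, b):
    """P(a < z < b): area between a and b."""
    return Z.cdf(b) - Z.cdf(a)

print(round(p_less(0), 4))
print(round(p_greater(1.96), 4))
print(round(p_between(-1, 1), 4))
```

Multiplying any of these results by 100 gives the same area as a percentage.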

Lesson 4: Application of Normal Distribution


Each of the values obtained using the given formula is called a z-score. Z-scores are applicable not only to normal distributions but to any type of distribution. A z-score is sometimes called a standard score.


Z-score or Standard Score tells how many standard deviations a value is away from the mean. A negative z-score tells that the value is below the mean, while a positive z-score tells that the value is above the mean. A z-score is unitless; thus, even values of different units could be compared relative to their groups.

For a population: z = (X − μ) / σ
For a sample: z = (X − x̄) / s

Where: X = given measurement (random variable)
μ = population mean
σ = population standard deviation
x̄ = sample mean
s = sample standard deviation
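
Since a z-score is unitless, values with different units can be compared. The sketch below uses made-up numbers (a test score and a height) purely for illustration; the function name z_score is also an assumption.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

# Hypothetical data: a test score in points and a height in centimeters.
z_test = z_score(85, mean=75, std_dev=5)        # 10 points above the mean
z_height = z_score(160, mean=165, std_dev=10)   # 5 cm below the mean

print(z_test, z_height)
```

With these numbers the test score is 2 standard deviations above its mean, while the height is half a standard deviation below its mean.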

Lesson 5: Locating Percentiles Under the Normal Curve


Example: Find the 38th percentile of a normal curve.
Analysis: By definition of P38, this means locating an area before (or below) the point. We want to know what z-value is at this
point.
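
The z-value for a percentile can be checked by working backward from the area. This is a minimal sketch; the function name percentile_z is an assumption, and it uses the inverse cumulative distribution instead of a printed z-table.

```python
from statistics import NormalDist

def percentile_z(k):
    """z-value with k percent of the area under the standard normal curve below it."""
    return NormalDist().inv_cdf(k / 100)

z38 = percentile_z(38)
print(round(z38, 2))
```

The result is negative because the 38th percentile lies below the mean, which is the 50th percentile.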

CHAPTER 3: SAMPLING AND SAMPLING DISTRIBUTION


Terms to remember:
Population is a group of phenomena that have something in common.
Sample is a smaller group of members of a population selected to represent the population.
Random is governed by or involving equal chances for each item. (Oxford dictionary)
Sampling is the action or process of taking samples of something for analysis. (Oxford dictionary)
Random Sampling is selecting samples from a population using chance methods or random numbers from the table of
random numbers.

Lesson 1: Type of Random Sampling


1. Simple Random Sampling – it is the most common random sampling method. Every member of the population has an equal
chance of being chosen.
2. Systematic Random Sampling - every member of the population is listed with a number, but instead of randomly generating
numbers, individuals are chosen at regular intervals.
3. Stratified Random Sampling – the population is divided into groups based on a shared characteristic. Each group is called a
stratum. Then, one or more choices are made at random from each stratum.
4. Cluster Sampling – is similar to stratified random sampling in that both begin by dividing the population into groups based
on a particular characteristic. But, while a stratified survey takes one or more samples from each of the strata, a
cluster sampling survey chooses clusters at random, then takes samples from them.
5. Multistage Random Sampling - a sampling process that uses more than one kind of sampling.
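
The first four methods can be sketched with Python's random module. Everything here is hypothetical: the population is simply the numbers 1 to 100, the strata are odd and even numbers, and the clusters are blocks of 20 members.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical population: members numbered 1 to 100

# 1. Simple random sampling: every member has an equal chance of being chosen.
simple = random.sample(population, 10)

# 2. Systematic random sampling: random start, then every k-th member on the list.
k = len(population) // 10
start = random.randrange(k)
systematic = population[start::k]

# 3. Stratified random sampling: divide into strata, then choose at random from EACH stratum.
strata = {
    "odd": [m for m in population if m % 2 == 1],
    "even": [m for m in population if m % 2 == 0],
}
stratified = [m for group in strata.values() for m in random.sample(group, 5)]

# 4. Cluster sampling: choose whole clusters at random, then sample from the chosen clusters only.
clusters = [population[i:i + 20] for i in range(0, len(population), 20)]
chosen_clusters = random.sample(clusters, 2)
cluster_sample = [m for c in chosen_clusters for m in random.sample(c, 5)]

print(simple, systematic, stratified, cluster_sample, sep="\n")
```

The key difference shows in the last two blocks: the stratified sample draws from every group, while the cluster sample draws only from the groups that were picked.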

Lesson 2: Parameter and Statistic


A parameter is a measure or characteristic obtained by using all the data values in the population. A parameter can be at the numerical or nominal level of measurement and is usually referred to as the true value of the population. For example, the average age of 67 years for a population of N = 1,200 senior citizens in a certain barangay is a parameter.
A statistic is a measure or characteristic obtained by using only the data values in a sample. A statistic is an estimate of the parameter. For example, if a random sample of 100 is drawn from the 1,200 senior citizens in the above example, then the average age of those 100 senior citizens is a statistic.

Lesson 3: Sampling Distribution of the Sample Means


A sampling distribution is the probability distribution for the values of the sample statistic obtained when random samples are
repeatedly drawn from a population.

Steps in constructing sampling distribution:


1. Using the combination formula from probability, find the number of possible samples.


2. We can now construct the frequency distribution of the sample means and its probability. This is the sampling distribution of
the sample means.
3. The histogram of the sampling distribution of the sample means is constructed by making a bar graph where the sample
means are plotted on the horizontal axis and the corresponding probabilities are shown in the vertical axis.
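
The steps above can be sketched for a small case. The population {1, 2, 3, 4, 5} and the sample size n = 2 are hypothetical values chosen for illustration, and samples are assumed to be drawn without replacement.

```python
from fractions import Fraction
from itertools import combinations
from collections import Counter

population = [1, 2, 3, 4, 5]  # hypothetical population, N = 5
n = 2                         # sample size

# Step 1: list every possible sample; the combination formula gives 5C2 = 10 samples.
samples = list(combinations(population, n))

# Step 2: frequency of each sample mean, then its probability.
means = [Fraction(sum(s), n) for s in samples]
frequency = Counter(means)
sampling_distribution = {
    m: Fraction(f, len(samples)) for m, f in sorted(frequency.items())
}

# Step 3: a text "histogram" (one # per sample that has this mean).
for m, p in sampling_distribution.items():
    print(f"{float(m):>4} | {'#' * frequency[m]} P = {p}")
```

The sample means 2.5, 3, and 3.5 each occur twice, so the histogram is tallest in the middle.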

Lesson 4: Finding the Mean, Variance and Standard Deviation of the Sampling Distribution of the Sample Means
Steps in finding the mean, variance and standard deviation of the sampling distribution of the sample means:
1. Compute the mean of the population μ.
2. Compute the mean of the sampling distribution of the sample means μx̄. Formula: μx̄ = Σ[x̄ ⋅ P(x̄)]
3. Compare μ and μx̄. Result: μx̄ = μ
4. Compute the variance σ² and standard deviation σ of the population.
5. Compute the variance (σx̄²) and standard deviation (σx̄) of the sampling distribution of the sample means.
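
The five steps can be traced on a small case. The population {1, 2, 3, 4, 5} and n = 2 are hypothetical. Two formulas used here are assumptions, since they are not written out in these notes: the population variance σ² = Σ(X − μ)² / N, and the variance of the sample means computed in the same way as the variance of any probability distribution.

```python
from fractions import Fraction
from itertools import combinations
from math import sqrt

population = [1, 2, 3, 4, 5]  # hypothetical population
N, n = len(population), 2

# Step 1: mean of the population.
mu = Fraction(sum(population), N)

# All possible samples (without replacement); each one is equally likely.
means = [Fraction(sum(s), n) for s in combinations(population, n)]
p = Fraction(1, len(means))

# Step 2: mean of the sampling distribution of the sample means.
mu_xbar = sum(m * p for m in means)

# Step 3: compare the two means.
print("mu =", mu, " mu_xbar =", mu_xbar, " equal:", mu == mu_xbar)

# Step 4: variance and standard deviation of the population (assumed formula).
var_pop = sum((x - mu) ** 2 for x in population) / N
sd_pop = sqrt(var_pop)

# Step 5: variance and standard deviation of the sampling distribution.
var_xbar = sum(m ** 2 * p for m in means) - mu_xbar ** 2
sd_xbar = sqrt(var_xbar)

print("var_pop =", var_pop, " var_xbar =", var_xbar, " sd_xbar =", round(sd_xbar, 4))
```

In this case the variance of the sample means (3/4) is smaller than the population variance (2): sample means are less spread out than individual values.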

Lesson 5: The Variance and the Standard Deviation of the Sampling Distribution of the Sample Means for Finite and Infinite Populations

CHAPTER 4: CENTRAL LIMIT THEOREM


The Central Limit Theorem (CLT) states that, given a large sample size, the sampling distribution of the mean for a variable will approximate a normal distribution regardless of that variable’s distribution in the population.
In other words, the sampling distribution of the mean approaches a normal distribution as the size of the sample (n) increases, regardless of the shape of the original population distribution.
As you increase the sample size, the graph of the sample means moves towards a normal distribution.

1) If x is normal, x̄ is normal. If x is not normally distributed, x̄ is approximately normal for a sufficiently large sample size.
2) The population mean is equal to the mean of the sampling distribution of the sample means (μ = μx̄).

3) We use σx̄ = σ / √n when the population is infinite.

4) We use σx̄ = (σ / √n) · √((N − n) / (N − 1)) when the population is finite.
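
A small simulation can illustrate the theorem. This is a sketch under stated assumptions: the population is a fair die (uniform, not normal), the sample size n = 30 and the 5,000 repetitions are arbitrary choices, and sampling is with replacement, so the population is treated as infinite and the standard deviation of the sample means is expected to be close to σ / √n.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(1)  # fixed seed so the sketch is repeatable

faces = [1, 2, 3, 4, 5, 6]  # a fair die: every face equally likely
mu = mean(faces)            # population mean
sigma = pstdev(faces)       # population standard deviation

n = 30          # sample size
trials = 5000   # number of samples drawn

# Draw many samples of size n (with replacement) and record each sample mean.
sample_means = [mean(random.choices(faces, k=n)) for _ in range(trials)]

print("mean of sample means:", round(mean(sample_means), 3), "vs mu:", mu)
print("sd of sample means:  ", round(pstdev(sample_means), 3),
      "vs sigma/sqrt(n):", round(sigma / sqrt(n), 3))
```

Plotting a histogram of sample_means would show a bell shape centered near 3.5, even though a single roll of the die is not bell-shaped.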

CHAPTER 5: INTERVAL ESTIMATE OF POPULATION MEAN


Lesson 1: Interval Estimate
Definitions:
1. Interval Estimate (Confidence Interval) - It is a range of values that is used to estimate an unknown population mean (μ). This estimate may contain the true value of the population mean.


2. Confidence Level (Degree of Confidence) - The confidence level of an interval estimate is the probability that the interval
estimate contains the unknown population mean. In determining an interval estimate, a degree of confidence (expressed as
percentage) must be set.
3. Margin of Error (E) - It is the maximum likely difference between the observed sample mean (x̄) and the true value of the population mean (μ). The length of the confidence interval is equal to twice the margin of error.

Lesson 2: Computing the Confidence Interval (With Known σ)
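
The formula for this lesson is not reproduced in these notes. The sketch below assumes the standard form, x̄ ± E with E = z · σ / √n, where z is the critical value for the chosen confidence level. The function name and the sample values (x̄ = 50, σ = 10, n = 25) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def confidence_interval_known_sigma(x_bar, sigma, n, confidence):
    """Interval estimate of the population mean when sigma is known (assumed z formula)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, about 1.96 for 95%
    margin = z * sigma / sqrt(n)             # margin of error E
    return x_bar - margin, x_bar + margin

low, high = confidence_interval_known_sigma(x_bar=50, sigma=10, n=25, confidence=0.95)
print(round(low, 2), round(high, 2))
```

The length of the interval, high − low, is twice the margin of error, as stated in Lesson 1.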

Lesson 3: T-Distribution
The t-distribution (aka, Student’s t-distribution) is a probability distribution that is used to estimate parameters when the sample
size is small and/or when the population variance/ standard deviation is unknown.
Properties of t-distribution
• Bell-shaped and symmetric about the mean which is 0 (same as standard normal curve)
• Standard deviation is a bit larger than 1 (slightly thicker tails than standard normal distribution)
• Gets narrower and more closely resembles standard normal distribution as df increases (nearly identical when df > 30)
• Precise shape depends on degrees of freedom
The degrees of freedom, denoted by df, refers to the number of scores in a
distribution that are free to vary without changing the mean of the distribution.
Formula: df = n – 1 (where n is the sample size)

Lesson 4: Reading the t-table


Step 1: Compute the degrees of freedom. df = n – 1
Step 2: Locate df at the left side of the t-table.
Step 3: Locate the confidence level (90%, 95%, 99%) at the bottom of the t-table.
Step 4: Get the value at the intersection of Steps 2 & 3.

Lesson 5: Computing the Confidence Interval (With Unknown σ)
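
The formula for this lesson is not reproduced in these notes. The sketch below assumes the standard form, x̄ ± E with E = t · s / √n, where t is read from the t-table using df = n − 1. The function name and the sample values (x̄ = 50, s = 10, n = 25) are hypothetical; the critical value 2.064 is the t-table entry for df = 24 at 95% confidence.

```python
from math import sqrt

def confidence_interval_unknown_sigma(x_bar, s, n, t_critical):
    """Interval estimate of the population mean when sigma is unknown (assumed t formula)."""
    margin = t_critical * s / sqrt(n)  # margin of error E
    return x_bar - margin, x_bar + margin

n = 25
df = n - 1          # degrees of freedom
t_critical = 2.064  # from a t-table: df = 24, 95% confidence

low, high = confidence_interval_unknown_sigma(x_bar=50, s=10, n=n, t_critical=t_critical)
print(df, round(low, 3), round(high, 3))
```

The critical value is passed in by hand because Python's standard library has no t-distribution; it still has to be looked up in the table.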
