LQ1 Notes

Basic Definitions and Concepts

• Population - the specific collection of objects of interest.
• Parameter - a number that summarizes some aspect of the population as a whole (a characteristic of the population).
• Sample - a group of subjects taken from the population (a subset of the population).
• Sample data - measurements of the sample elements (characteristics of a sample).
• Census - a sample that consists of the whole population.
• Measurement - a number or attribute recorded for each member of the population.
• Statistics - collecting, organizing, summarizing, analysing, and drawing conclusions from data.
  1. Descriptive Statistics - organizing, displaying, and describing the data.
  2. Inferential Statistics - inferring information about the population; generalizing from samples; hypothesis tests / making predictions.

Nature of Data and Variables

• Qualitative data - categorical; measurements for which there is no natural numerical scale (descriptive information).
• Quantitative data - numerical measurements/information.
  1. Discrete - countable, e.g. 0, 1, 2, 3, 4, ...
  2. Continuous - measurable, taking any value in a range, e.g. weight, age

Data Collection and Presentation

• Data list - an explicit listing (in set notation) of all individual measurements.
• Data frequency table - convenient when a data set is large and the number of distinct values is not too large.
Some Common Data Displays

• Stem-and-leaf diagram
  1. Let the digits in the tens place (from 2 to 9, plus the number 10) make up the "stems".
  2. Arrange the stems in numerical order.
  3. Let the digit in the units place of each measurement be its "leaf".
  4. Place the leaves in a row to the right of their corresponding stems.
  5. Arrange the leaves in numerical order.

• Frequency Histogram
  Purpose: to provide a graphical display that gives a sense of the data distribution across the range of values.

Measures of Central Tendency

• Mean (μ) - the sum of all observations divided by the total number of observations. Easily affected by the presence of extreme values.
• Median (Md) - the middle value when the data are arranged in ascending order.
• Mode (Mo) - the most common value; the highest point of the frequency histogram.
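To make these three measures concrete, here is a minimal Python sketch using only the standard-library statistics module; the data list is made up for illustration.

```python
import statistics

data = [2, 3, 3, 5, 7, 8, 8, 8, 10]   # made-up sample measurements

print(statistics.mean(data))          # sum of observations / number of observations -> 6.0
print(statistics.median(data))        # middle value of the sorted data -> 7
print(statistics.mode(data))          # most common value -> 8
```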
Measures of Variability

• Range - the difference between the largest and smallest values.
• Variance - the average squared deviation of the observations from the mean (s² for a sample, σ² for a population).
• Standard Deviation - the square root of the variance (s for a sample, σ for a population).

• Leptokurtic: slender shape, fatter tails
• Platykurtic: broad shape, thinner tails
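A matching Python sketch for the variability measures, again with made-up data; the statistics module distinguishes the sample versions (variance, stdev) from the population versions (pvariance, pstdev).

```python
import statistics

data = [2, 3, 3, 5, 7, 8, 8, 8, 10]   # same made-up sample as above

print(max(data) - min(data))          # range = largest value - smallest value
print(statistics.variance(data))      # sample variance s^2 (divides by n - 1)
print(statistics.stdev(data))         # sample standard deviation s
print(statistics.pvariance(data))     # population variance sigma^2 (divides by N)
print(statistics.pstdev(data))        # population standard deviation sigma
```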
Counting Rules

• Combination (order does not matter): $nCr = \frac{n!}{(n-r)!\,r!}$
• Permutation (order matters): $nPr = \frac{n!}{(n-r)!}$
• Circular permutation (for unique items/people): $(n-1)!$
• Circular permutation for repeated patterns (e.g. a bracelet Red/Blue/Red/Blue): $\frac{(n-1)!}{2}$
• Repeating items/terms: $\frac{n!}{a!\,b!\,c!\,\cdots}$
  where n is the total number of items and a, b, c, ... are the counts of the repeated items.
  Example: for "MATHEMATICS", the total number of letters is n = 11, with repeated M's [a] = 2, repeated A's [b] = 2, and repeated T's [c] = 2, so the count is $\frac{11!}{2!\,2!\,2!}$.

Addition Rules

• Mutually exclusive events ("Ace or King"): $P(A \text{ or } B) = P(A) + P(B)$
• Events that can overlap ("Ace or Heart", where the Ace of Hearts is counted in both): $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$

--------------------------------------------------

Example: "4 blue balls and 3 red balls, 7 balls in total. Pick 3 times; what is the probability of getting a blue ball on each pick?"

With replacement: every time we pick up a ball, it is replaced by another random ball, so
$\frac{4}{7} \cdot \frac{4}{7} \cdot \frac{4}{7}$

Without replacement: every time we pick up a ball, it is NOT replaced, so
$\frac{4}{7} \cdot \frac{3}{6} \cdot \frac{3}{5}$
For the first try there are still 4 blue balls and 3 red balls, so the probability of blue is 4/7. For the second try, provided that we picked up a blue ball during the first try, 3 blue balls and 3 red balls remain (3 out of 6). For the third try, provided that we picked up a red ball during the second try, three blue balls remain but only two red balls (3 out of 5).
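A short Python sketch of these counting rules and of the ball example; math.comb and math.perm (Python 3.8+) implement nCr and nPr directly, and the probabilities are just the products shown above.

```python
from math import comb, perm, factorial

# nCr and nPr from the formulas above
print(comb(5, 2))   # 5C2 = 10
print(perm(5, 2))   # 5P2 = 20

# Repeating items: "MATHEMATICS" has n = 11 letters with M, A, T each repeated twice
print(factorial(11) // (factorial(2) * factorial(2) * factorial(2)))  # 4989600

# 4 blue and 3 red balls (7 total), picking 3 times
with_replacement = (4 / 7) * (4 / 7) * (4 / 7)
without_replacement = (4 / 7) * (3 / 6) * (3 / 5)  # following the notes' pick-by-pick scenario
print(with_replacement, without_replacement)
```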
Intro to Discrete Probability Distributions

Random Variables
- Numerical outcomes of a random event, each with an associated probability.

Example:
  X     1    2    3
  P(x)  5/9  1/9  3/9

To be discrete, the values should be:
• Distinct
• Listable
• Countable

Discrete Uniform Distribution
- Every outcome of the event has an equal probability.
[Figure: bar chart of a discrete uniform distribution over outcomes A-F, with the probability axis running from 0 to 0.2.]

Binomial Distribution (orange bars in fig. 1)
- Trials here are independent.
- Gives the probability of outcomes in terms of the random variable x.
- You can perceive this as a "yes or no" probability; the probability of "yes" is mirrored by an equal probability of "no" on the other side of the curve.

Normal Distribution (green line in fig. 1)
- In statistics, when there is an infinite number of possible outcomes, the probabilities form a bell curve (normal distribution). This also applies in the sciences.

Example, referring to fig. 1 (n = 5 trials with p = 1/2, so 2⁵ = 32 equally likely outcomes):
  P(x = 0) = 5C0/32 = 1/32
  P(x = 1) = 5C1/32 = 5/32
  P(x = 2) = 5C2/32 = 10/32
  P(x = 3) = 5C3/32 = 10/32
  P(x = 4) = 5C4/32 = 5/32
  P(x = 5) = 5C5/32 = 1/32
Side Note:

Combination: $nCx = \frac{n!}{x!\,(n-x)!}$

Binomial Probability: $P(x) = \binom{n}{x} p^x q^{n-x}$

If looking for probabilities over a specific range of values, add the individual probabilities:
P(x < n) = P(x₁) + P(x₂) + P(x₃) + ⋯ for all values below n.

Binomial vs Hypergeometric
• Binomial: trials are independent. Can be used to approximate the hypergeometric.
• Hypergeometric: trials are dependent. The binomial is a reasonable approximation when the sample is about 5% (or less) of the total population.

Hypergeometric Distribution
- Trials are dependent
- Without replacement
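A minimal sketch of the binomial probability formula and the range rule above, reproducing the fig. 1 example with n = 5 and p = 1/2; the helper name binomial_pmf is just illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) = nCx * p**x * q**(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

# fig. 1 example: n = 5 trials, p = 1/2, so the denominators are 2**5 = 32
for x in range(6):
    print(x, binomial_pmf(x, 5, 0.5))       # 1/32, 5/32, 10/32, 10/32, 5/32, 1/32

# Range rule: add the individual probabilities, e.g. P(X < 3) = P(0) + P(1) + P(2)
print(sum(binomial_pmf(x, 5, 0.5) for x in range(3)))  # 16/32 = 0.5
```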

Success Combination Failure Combination


Same as Binomial

 a  N − a 
Probability  n 
 
 x
  

P( x) =   
x n x
N
Total   Random Sample Can also be used for more than two
n  groups:
[When people are being used,
automatically w/o replacement] P( x) =
( G1)( G 2 ) (G3)
[Success is the term used for what (TOTAL _ COMBI )
you’re “looking for”]
Random Variables is limited to the
values. It cannot be higher than the
Additional Notes: sample size or success total.

When looking for standard deviation: Additional Notes:


When looking for standard deviation:
 = npq  N −n
 =  npq
 N −1 
When looking for mean:

 = np When looking for mean:


 = np
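A minimal sketch of the hypergeometric formula and the mean/standard-deviation notes above; the population size, success count, and sample size are made-up numbers, and hypergeom_pmf is an illustrative helper name.

```python
from math import comb, sqrt

def hypergeom_pmf(x, N, a, n):
    """P(X = x) = C(a, x) * C(N - a, n - x) / C(N, n):
    success combination times failure combination over the total combination."""
    return comb(a, x) * comb(N - a, n - x) / comb(N, n)

# Made-up numbers: N = 20 items in the population, a = 7 "successes", sample n = 5
N, a, n = 20, 7, 5
print(hypergeom_pmf(2, N, a, n))     # probability of exactly 2 successes in the sample

p = a / N
mu = n * p                                            # mean = np
sigma = sqrt(n * p * (1 - p) * (N - n) / (N - 1))     # sd = sqrt(npq * (N - n)/(N - 1))
print(mu, sigma)
```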
Poisson Distribution
- Events are independent
- Event rates are constant
- Gives the probability of a number of occurrences in terms of time, distance, area, or volume

$P(x) = \frac{\lambda^x e^{-\lambda}}{x!}$

If a time interval t is involved: $P(x) = \frac{(\lambda t)^x e^{-\lambda t}}{x!}$

Additional Notes:
$\mu = \lambda$ (mean)
$\sigma^2 = \lambda$
$\sigma = \sqrt{\lambda}$

Poisson as an approximation to the Binomial:
$\lambda = np$
The Poisson is a precise approximation of the binomial when the sample is large and the proportion of success is small.

Use the Poisson if:
• X (the random variable) is countable
• occurrences are independent
• rates are constant

Continuous Probability Distributions
1. Standard Normal Distribution
2. Exponential Distribution

Standard Normal Distribution:
- Symmetrical bell curve

$Z = \frac{x - \mu}{\sigma}$

- The Z-value always measures how far a value lies from the center (the mean).
- A value of −2.00 means that the shaded area of the bell curve between the center and z = −2.00, to the left of the center, is .4772 (47.72%).
- A value of +2.00 means that the shaded area between the center and z = +2.00, to the right of the center (mean), is .4772 (47.72%).
- Each half of the bell curve has an area of .5 (50%), so both sides together equal 100%.
- If you are looking for a specific area in the far right tail of the curve, subtract the table area for your calculated z-value from 0.5. Do the same on the left side.
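A small sketch of the Poisson formula and of the z = 2.00 area mentioned above, using only the standard library (statistics.NormalDist is the standard normal when left at its defaults); the rate value is made up, and poisson_pmf is an illustrative helper name.

```python
from math import exp, factorial
from statistics import NormalDist

def poisson_pmf(x, lam):
    """P(X = x) = lam**x * e**(-lam) / x!"""
    return lam**x * exp(-lam) / factorial(x)

print(poisson_pmf(3, 2.0))            # P(exactly 3 occurrences) when the average rate is 2

# Standard normal: area between the center (z = 0) and z = 2.00 is about .4772
Z = NormalDist()                      # mean 0, standard deviation 1
print(Z.cdf(2.00) - Z.cdf(0.0))       # ~0.4772
print(1 - Z.cdf(2.00))                # far right-tail area = 0.5 - 0.4772 ~ 0.0228
```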
Central Limit Theorem
- Sums/averages of independent random variables (sampled with replacement) tend toward a normal distribution as the sample size approaches infinity.
- (This means that as the number of possible scenarios keeps adding up and continuing, the probability concentrates toward the center and becomes smaller and smaller toward the sides, giving a bell curve.)
- See fig. 1 for an example (reiterated).
- A sample size of 10-20 is large enough.

True even if:
• the probabilities are not normally distributed
• only the mean and standard deviation are given

Additional Notes:
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$ (standard deviation of the sample mean)
As the sample size increases, this standard deviation decreases.
A greater sample size gives a more precise bell curve (rather than the box-like shape noticeable in the binomial distribution).

Definition: by "sample", we mean that it is selected from a population.

Exponential Distribution
- A continuous probability distribution, since TIME is continuous.
- Questions are mostly about TIME:
  • "How long?"
  • "How much time?"

KEEP IN MIND:
$\mu = \frac{1}{\lambda}$
Lambda (the average rate) is equal to the reciprocal of the mean.
Another meaning: lambda = rate parameter; population mean = expected duration.

Formula mostly used (it is a Cumulative Distribution Function):
$F(x) = 1 - e^{-\lambda x}$

Complement Rule:
$P(X \le x_1) = 1 - e^{-\lambda x_1}$
$P(X \ge x_2) = 1 - [1 - e^{-\lambda x_2}] = e^{-\lambda x_2}$

Interval Approach (for $x_1 \le x_2$):
$P(x_1 \le X \le x_2) = 1 - [P(X \le x_1) + P(X \ge x_2)]$
(inside the interval = 1 − [too low + too high])

Variance: $V[x] = \frac{1}{\lambda^2}$
Test for Means (Two Samples)

• $H_0\colon \mu_1 - \mu_2 = d_0$
  $z = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
  Remarks: $\sigma_1$ and $\sigma_2$ are known (or sample sizes are large). Use the Z-distribution.

• $H_0\colon \mu_1 - \mu_2 = d_0$
  $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$, where $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
  Remarks: $\sigma_1 = \sigma_2$ but unknown. Use the t-distribution with $df = n_1 + n_2 - 2$.

• $H_0\colon \mu_1 - \mu_2 = d_0$
  $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
  Remarks: $\sigma_1 \neq \sigma_2$ and unknown. Use the t-distribution with
  $df = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$.

• $H_0\colon \mu_d = d_0$ (paired observations)
  $t = \dfrac{\bar{d} - d_0}{s_d/\sqrt{n}}$
  Remarks: Use the t-distribution with $df = n - 1$.

Test for Proportions (Two Samples)

• $H_0\colon p_1 - p_2 = 0$
  $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
  Remarks: numbers of "successes" in the samples: $x_1$ and $x_2$. Sample proportions: $\hat{p}_1 = x_1/n_1$ and $\hat{p}_2 = x_2/n_2$. Pooled sample proportion: $\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$ (with $\hat{q} = 1 - \hat{p}$). Use the Z-distribution.
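A minimal sketch of the two-proportion z statistic from the table above, with made-up counts; the two-sided p-value is taken from the Z-distribution via statistics.NormalDist.

```python
from math import sqrt
from statistics import NormalDist

# Made-up counts: x "successes" out of n trials in each sample
x1, n1 = 45, 100
x2, n2 = 30, 90

p1_hat, p2_hat = x1 / n1, x2 / n2
p_hat = (x1 + x2) / (n1 + n2)            # pooled sample proportion
q_hat = 1 - p_hat

z = (p1_hat - p2_hat) / sqrt(p_hat * q_hat * (1 / n1 + 1 / n2))
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))    # two-sided p-value from the Z-distribution
print(z, p_two_sided)
```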


Test for Means (One Sample)

• $H_0\colon \mu = \mu_0$
  $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
  Remarks: $\sigma$ is known (or the sample is large). Use the Z-distribution.

• $H_0\colon \mu = \mu_0$
  $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
  Remarks: $\sigma$ is unknown. Use the t-distribution with $df = n - 1$.

• $H_0\colon \mu_d = d_0$ (paired observations)
  $t = \dfrac{\bar{d} - d_0}{s_d/\sqrt{n}}$
  Remarks: Use the t-distribution with $df = n - 1$.

Test for Proportions (One Sample)

• $H_0\colon p = p_0$
  $z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 q_0}{n}}}$
  Remarks: $\hat{p} = x/n$, where $x$ is the number of "successes" in the sample; the sample size is $n \ge 30$. Use the Z-distribution.
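A minimal sketch of the one-sample t statistic (σ unknown) from the table above, with a made-up sample and hypothesized mean; it only computes the statistic and degrees of freedom, which would then be compared against a t-table.

```python
from math import sqrt
from statistics import mean, stdev

# Made-up sample and hypothesized mean (sigma unknown, so use the t statistic)
sample = [9.8, 10.2, 10.1, 9.9, 10.4, 10.0, 9.7]
mu0 = 10.0

n = len(sample)
t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
df = n - 1
print(t, df)    # compare against the t-distribution with df = n - 1
```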
Notes Prepared by:

Rafael E. Barnuevo (BS CpE)


Allan Emmanuel B. Umali (BS ECE)
Tests for Means

• $H_0\colon \mu = \mu_0$
  $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
  Remarks: $\sigma$ is known (or the sample is large). Use the Z-distribution.

• $H_0\colon \mu = \mu_0$
  $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
  Remarks: $\sigma$ is unknown. Use the t-distribution with $df = n - 1$.

Tests for Proportions

• $H_0\colon p = p_0$
  $z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 q_0}{n}}}$
  Remarks: $\hat{p} = x/n$, where $x$ is the number of "successes" in the sample; the sample size is $n \ge 30$. Use the Z-distribution.

Tests for Means (Two Samples)

• $H_0\colon \mu_1 - \mu_2 = d_0$
  $z = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
  Remarks: $\sigma_1$ and $\sigma_2$ are known (or sample sizes are large). Use the Z-distribution.

• $H_0\colon \mu_1 - \mu_2 = d_0$
  $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$, where $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
  Remarks: $\sigma_1 = \sigma_2$ but unknown. Use the t-distribution with $df = n_1 + n_2 - 2$.

• $H_0\colon \mu_1 - \mu_2 = d_0$
  $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
  Remarks: $\sigma_1 \neq \sigma_2$ and unknown. Use the t-distribution with
  $df = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$.

• $H_0\colon \mu_d = d_0$ (paired observations)
  $t = \dfrac{\bar{d} - d_0}{s_d/\sqrt{n}}$
  Remarks: Use the t-distribution with $df = n - 1$.

Tests for Proportions (Two Samples)

• $H_0\colon p_1 - p_2 = 0$
  $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
  Remarks: numbers of "successes" in the samples: $x_1$ and $x_2$; sample sizes: $n_1$ and $n_2$. Sample proportions: $\hat{p}_1 = x_1/n_1$ and $\hat{p}_2 = x_2/n_2$. Pooled sample proportion: $\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$ (with $\hat{q} = 1 - \hat{p}$). Use the Z-distribution.
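A minimal sketch of the pooled two-sample t statistic (σ₁ = σ₂ but unknown) from the table above, with made-up samples; as with the other sketches, it stops at the statistic and its degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

# Made-up samples; H0: mu1 - mu2 = 0 with sigma1 = sigma2 but unknown
x1 = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3]
x2 = [11.5, 11.9, 11.7, 11.6, 11.8]
d0 = 0

n1, n2 = len(x1), len(x2)
s1, s2 = stdev(x1), stdev(x2)

sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)   # pooled variance
t = (mean(x1) - mean(x2) - d0) / sqrt(sp2 * (1 / n1 + 1 / n2))
df = n1 + n2 - 2
print(t, df)    # compare against the t-distribution with df = n1 + n2 - 2
```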


Tests for Variances: One Population

• $H_0\colon \sigma^2 = \sigma_0^2$
  $\chi^2 = \dfrac{(n - 1)s^2}{\sigma_0^2}$
  Remarks: the underlying population is approximately normally distributed. Sample size: $n$; sample standard deviation: $s$. Use the $\chi^2$-distribution with degrees of freedom $\nu = n - 1$.

Critical regions:
  $H_1\colon \sigma^2 > \sigma_0^2$: reject if $\chi^2 > \chi^2_{\alpha}$
  $H_1\colon \sigma^2 < \sigma_0^2$: reject if $\chi^2 < \chi^2_{1-\alpha}$
  $H_1\colon \sigma^2 \neq \sigma_0^2$: reject if $\chi^2 < \chi^2_{1-\alpha/2}$ or $\chi^2 > \chi^2_{\alpha/2}$

Tests for Variances: Two Populations

• $H_0\colon \sigma_1^2 - \sigma_2^2 = 0$
  $f = \dfrac{s_1^2}{s_2^2}$
  Remarks: the underlying populations are approximately normally distributed. Sample sizes: $n_1$ and $n_2$; sample variances: $s_1^2$ and $s_2^2$. Use the F-distribution with $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$.

Critical regions:
  $H_1\colon \sigma_1^2 - \sigma_2^2 > 0$: reject if $f > f_{\alpha}(\nu_1, \nu_2)$
  $H_1\colon \sigma_1^2 - \sigma_2^2 < 0$: reject if $f < f_{1-\alpha}(\nu_1, \nu_2)$
  $H_1\colon \sigma_1^2 - \sigma_2^2 \neq 0$: reject if $f > f_{\alpha/2}(\nu_1, \nu_2)$ or $f < f_{1-\alpha/2}(\nu_1, \nu_2)$
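A minimal sketch of the two variance-test statistics above (χ² for one population, F for two), with made-up samples and a made-up hypothesized variance; the resulting values would be compared against χ² and F tables with the stated degrees of freedom.

```python
from statistics import variance

# Made-up samples and hypothesized population variance
sample1 = [4.1, 3.9, 4.4, 4.0, 4.2, 3.8, 4.3]
sample2 = [4.0, 4.5, 3.6, 4.8, 3.9, 4.4]
sigma0_sq = 0.05                      # hypothesized variance for the one-population test

# One population: chi-square statistic, nu = n - 1 degrees of freedom
n1 = len(sample1)
chi_sq = (n1 - 1) * variance(sample1) / sigma0_sq
print(chi_sq, n1 - 1)

# Two populations: F statistic, nu1 = n1 - 1 and nu2 = n2 - 1
f = variance(sample1) / variance(sample2)
print(f, n1 - 1, len(sample2) - 1)
```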
