
Statistics and Probability Reviewer

The document provides an overview of statistics, including inferential statistics, data collection methods, and famous statisticians such as Gertrude Cox and Ronald Fisher. It covers key concepts like random variables, probability distributions, binomial probability experiments, and sampling techniques. Additionally, it discusses the normal distribution and its properties, including the empirical rule and methods for calculating means and variances in statistical analysis.

Uploaded by

alfred.base.25

MODULE 1: INTRODUCTION TO STATISTICS AND PROBABILITY

FAMOUS STATISTICIANS

GERTRUDE COX
• First lady of statistics.
• Cox was a pioneer in the newly formed discipline of statistics and one of the first women in the field.

FLORENCE NIGHTINGALE
• Mother of Statistics and the founder of Modern Nursing.
• She was an innovator in the collection, tabulation, interpretation, and graphical display of descriptive statistics.

RONALD FISHER
• A British statistician and biologist who made significant contributions to experimental design and population genetics.
• Father of Modern Statistics and Experimental Design.

WHAT IS STATISTICS?
• Collecting
• Analyzing
• Interpreting
• Presenting
• Organizing

AREAS OF STATISTICS
• Descriptive statistics: Describing the data.
• Inferential statistics: Analyzing the data to draw conclusions about a population.

SOURCES OF DATA
• Primary: Surveys, interviews, direct observation.
• Secondary: Newspapers, journals, research papers.

METHODS OF DATA COLLECTION
• OBSERVATION
• EXPERIMENTATION
• SIMULATION
• INTERVIEWING

TOOLS FOR DATA COLLECTION
• Focus group discussion
• Interview
• Survey
• Observation

DATA SCIENCE
• A scientific discovery and practice that involves the collection, management, processing, analysis, visualization, and interpretation of vast amounts of data.
• Used to make decisions and predictions.
• The center of data science is data, especially Big Data.
• Considered an interdisciplinary discipline.
• Aims to obtain information or knowledge from the data that will help in making better decisions.

DATA SCIENTIST
• Data scientists are one-part mathematician, one-part computer scientist, and one-part trend-spotter because of their duties to collect
large amounts of unruly data and organize them.

TOOLS OF TRADE

PYTHON
• A programming language.

PROBABILITY
• A field of mathematics that deals with chance.

INTERPRETATIONS OF PROBABILITY

Classical
• A basic probability theory that deals with situations where all possible outcomes are equally likely.

Frequentist
• Probability is the long-run relative frequency of an event over many repeated trials.

Bayesian
• Probability is a degree of belief.

Subjective
• Results from intuition, educated guess, and estimate.

SAMPLE SPACE
• A set that contains all possible outcomes.

EXPERIMENT
• An activity in which the results cannot be predicted with certainty.

OUTCOME
• Result of an experiment.

MODULE 2: RANDOM VARIABLES

RANDOM VARIABLE
• A numerical quantity that is generated by a random experiment.
• It is denoted by a capital letter, usually X.

TYPES OF RANDOM VARIABLE
• DISCRETE: it has either a finite or a countable number of possible values.
• CONTINUOUS: it has an uncountable number of possible outcomes.

PROPERTIES OF DISCRETE PROBABILITY DISTRIBUTION
• The probability of each value of a discrete random variable is between 0 and 1 inclusive.
• The sum of all probabilities is 1.
• A discrete probability distribution is a listing of all possible values of a discrete random variable along with their corresponding probabilities. It is represented in tabular, graphical (histogram), or formula form.

STEPS IN GETTING THE PROBABILITY OF EACH VALUE OF THE RANDOM VARIABLE
1. Assign letters that will represent each outcome.
2. Determine the sample space (possible outcomes).
3. Count the value of the random variable for each outcome in the sample space.
4. Given the total possible values of the random variable, assign probability values to each value of the random variable.

PROBABILITY MASS FUNCTION (PMF)
• A probability distribution describes the probability of each specific value of a random variable. The probability distribution of a discrete random variable is called a probability mass function.

PROPERTIES OF PMF
• For every element x in the support S, all the probabilities must be positive.
• The sum of all the probabilities for all possible x values in the support S must be equal to 1.

MODULE 3: BINOMIAL PROBABILITY EXPERIMENT

BINOMIAL PROBABILITY EXPERIMENT
• The experiment has a fixed number of trials.
• Trials are independent.
• Each trial has 2 possible outcomes: success or failure.
• The probability of success is the same for each trial.

BINOMIAL PROBABILITY NOTATION
FORMULA:
$p(x) = \binom{n}{x} p^{x} q^{n-x}$
$\binom{n}{x} = \frac{n!}{(n-x)!\,x!}$
x = number of successes
n = number of trials
p = probability of success
q = probability of failure (q = 1 − p)

MEAN, VARIANCE, STANDARD DEVIATION OF A BINOMIAL PROBABILITY DISTRIBUTION

MEAN
$\mu = np$
VARIANCE
$\sigma^{2} = \mu q = npq$
STANDARD DEVIATION
$\sigma = \sqrt{\sigma^{2}}$

POISSON DISTRIBUTION
• Counts the number of rare events that occur in a specified time interval or a specified region.
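
Returning to the binomial formula, the following is a minimal Python sketch of $p(x) = \binom{n}{x} p^{x} q^{n-x}$ together with the mean, variance, and standard deviation formulas. The function name `binomial_pmf` and the values n = 10 and p = 0.5 are illustrative assumptions, not part of the reviewer.

```python
from math import comb, sqrt


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Return p(x) = C(n, x) * p^x * q^(n - x), where q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)


# Hypothetical experiment: n = 10 independent trials, p = 0.5 per trial.
n, p = 10, 0.5
q = 1 - p

# Probability of every possible number of successes, x = 0, 1, ..., n.
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

mean = n * p              # mu = np
variance = mean * q       # sigma^2 = mu * q
std_dev = sqrt(variance)  # sigma = sqrt(sigma^2)

print("P(X = 3):", binomial_pmf(3, n, p))
print("Sum of all probabilities:", sum(pmf))
print("Mean, variance, standard deviation:", mean, variance, std_dev)
```

The probabilities over x = 0 to n should add up to 1, which matches the second property of a PMF listed earlier.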
KEY CHARACTERISTICS OF POISSON DISTRIBUTION
1. Countable events: The problem involves counting the number of events (e.g., defects, accidents, phone calls) within a fixed interval (time, space, or volume).
2. Independent events: Each event occurs independently of the others.
3. Constant average rate: The average rate of events is constant within the interval.
4. Random occurrence: Events occur randomly and unpredictably.
5. No simultaneous events: Events cannot occur simultaneously.

FORMULA:
$p(x) = \frac{\mu^{x} e^{-\mu}}{x!}$
μ = average number of occurrences in a given time or interval
x = the number of occurrences

MODULE 4: NORMAL DISTRIBUTION

NORMAL DISTRIBUTION
• A normal distribution is a continuous, symmetric, bell-shaped distribution of a variable. The known characteristics of the normal curve make it possible to estimate the probability of occurrence of any value of a normally distributed variable.

PROPERTIES
• The distribution is bell-shaped.
• The mean, median, and mode are equal and are located at the center of the distribution.
• The normal distribution curve is symmetric about the mean (the shape is the same on both sides).
• The normal distribution is continuous.
• The normal curve is asymptotic (it never touches the x-axis).
• The area under the normal curve is 1 or 100%.
• A change in the value of the mean shifts the graph of the normal curve to the right or to the left.
• The standard deviation determines the shape of the graph. When the standard deviation is large, the normal curve is short and wide,
while a small value for the standard deviation yields a skinnier and taller graph.

EMPIRICAL RULE
• Approximately 68% of the data lie within 1 standard deviation of the mean.
$\Pr(\mu - \sigma < X < \mu + \sigma) \approx 0.68$
• Approximately 95% of the data lie within 2 standard deviations of the mean.
$\Pr(\mu - 2\sigma < X < \mu + 2\sigma) \approx 0.95$
• Approximately 99.7% of the data lie within 3 standard deviations of the mean.
$\Pr(\mu - 3\sigma < X < \mu + 3\sigma) \approx 0.997$

FIND THE Z-SCORE
• A normal distribution can be converted into a standard normal distribution by obtaining the z value. A z value is the signed distance between a selected value, designated x, and the mean μ, divided by the standard deviation. It is also called the z score, the z statistic, the standard normal deviate, or the standard normal value.
• FORMULA:
$z = \frac{x - \mu}{\sigma}$
z = z value
x = the value of any particular observation or measurement
μ = the mean of the distribution
σ = standard deviation of the distribution

HOW TO FIND THE X VALUE GIVEN A Z-SCORE?
• To convert a standard normal variable z to a random variable/raw score X, use the formula:
$X = \mu + z\sigma$

MODULE 5: SAMPLING TECHNIQUES

SAMPLING TECHNIQUES
• Population: The whole universe; it consists of all elements or the totality of things considered in a study.
• Sample: A part/portion/fraction/segment of the population being studied.
• Sampling refers to the process of selecting participants as part of the study.
• Random sampling is a process in which every member of the population has an equal chance of being selected.
• Non-random sampling is a sampling procedure where samples are selected in a deliberate manner with little or no attention to randomization.

RANDOM SAMPLING

SIMPLE RANDOM SAMPLING
• The most basic random sampling technique,
wherein each element in the population has an equal probability of being selected.

SYSTEMATIC SAMPLING
• This can be done by listing all the elements in the population and selecting every kth element in your population list. It is often used on long population lists.
$k = \frac{N}{n}$
N = population size
n = sample size

STRATIFIED SAMPLING
• A process of subdividing the population into subgroups or strata and drawing members at random from each subgroup or stratum.

CLUSTER SAMPLING
• A process of selecting clusters from a population which is very large or widely spread out over a wide geographical area.

NON-RANDOM SAMPLING

CONVENIENCE SAMPLING
• A process of selecting a group of individuals who are (conveniently) available for study.

PURPOSIVE SAMPLING
• A process of using judgment to select a sample which the researcher believes, based on prior information, will provide the data they need.

QUOTA SAMPLING
• Applied when an investigator collects information from an assigned number, or quota, of individuals from one of several sample units fulfilling certain prescribed criteria or belonging to one stratum.

SNOWBALL SAMPLING
• A technique in which one or more members of a population are located and used to lead the researchers to other members of the population.

PARAMETERS AND STATISTICS
• A parameter is a numerical index describing a characteristic of a population.
• A statistic describes a characteristic of a sample.

POPULATION MEAN (μ)
$\mu = \frac{\sum X}{N}$
∑X = sum of all data values
N = total number of values

POPULATION VARIANCE (σ²)
$\sigma^{2} = \frac{\sum (X - \mu)^{2}}{N}$
X = given data
N = total number of values
μ = the population mean

POPULATION STANDARD DEVIATION (σ)
$\sigma = \sqrt{\sigma^{2}}$
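
As a quick illustration of the population formulas, here is a short Python sketch that computes μ, σ², and σ step by step. The function names and the data set `scores` are hypothetical and chosen only so the arithmetic is easy to follow by hand.

```python
from math import sqrt


def population_mean(data):
    """mu = (sum of all data values) / N."""
    return sum(data) / len(data)


def population_variance(data):
    """sigma^2 = sum of (X - mu)^2, divided by N."""
    mu = population_mean(data)
    return sum((x - mu) ** 2 for x in data) / len(data)


def population_std_dev(data):
    """sigma = square root of the population variance."""
    return sqrt(population_variance(data))


# Hypothetical population of N = 8 values.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

print("Population mean:", population_mean(scores))
print("Population variance:", population_variance(scores))
print("Population standard deviation:", population_std_dev(scores))
```

Python's standard `statistics` module offers `pvariance` and `pstdev` for the same population quantities, which can serve as a cross-check on hand calculations.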
SAMPLE MEAN ($\bar{x}$)
$\bar{x} = \frac{\sum X}{n}$
∑X = sum of all data values
n = number of elements in the sample

SAMPLE VARIANCE ($s^{2}$)
$s^{2} = \frac{\sum (X - \bar{x})^{2}}{n}$
X = given data
n = number of elements in the sample
$\bar{x}$ = the sample mean

SAMPLE STANDARD DEVIATION (s)
$s = \sqrt{s^{2}}$

SAMPLING DISTRIBUTION OF THE SAMPLE MEANS
• A sampling distribution of sample means is a frequency distribution using the means computed from all possible random samples of a specific size from a population. The means of the samples may be less than, equal to, or greater than the mean of the population.

Steps in constructing the Sampling Distribution of the Means
1. Determine the number of sets of all possible random samples that can be drawn from the given population.
FORMULA: $_{N}C_{n}$, where N = population size and n = sample size (this is the combination formula; a scientific calculator has it as a single function)
2. List all the possible samples and compute the mean of each sample.
3. Construct the sampling distribution.

Finding the Mean and Variance of Sampling Distribution of Sample Means

Central Limit Theorem

(Just look at the instructor's PPT for this part. I got too lazy to put it in, and I do not understand it either.)
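
The three steps for constructing the sampling distribution of the means can be sketched in Python as follows. The population `[1, 2, 3, 4, 5]` and the sample size n = 2 are made-up values for illustration, and the samples are drawn without replacement, as the combination formula assumes.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations
from math import comb

# Hypothetical population (N = 5) and sample size (n = 2).
population = [1, 2, 3, 4, 5]
n = 2

# Step 1: number of possible samples, using the combination formula NCn.
num_samples = comb(len(population), n)

# Step 2: list all possible samples and compute the mean of each one.
samples = list(combinations(population, n))
sample_means = [Fraction(sum(s), n) for s in samples]

# Step 3: build the sampling distribution (frequency and probability
# of each distinct sample mean).
distribution = Counter(sample_means)

print("Number of possible samples:", num_samples)
for m in sorted(distribution):
    freq = distribution[m]
    print(float(m), freq, Fraction(freq, num_samples))

# Compare the mean of the sample means with the population mean.
mean_of_means = sum(sample_means) / len(sample_means)
population_mean = Fraction(sum(population), len(population))
print("Mean of sample means:", float(mean_of_means))
print("Population mean:", float(population_mean))
```

`Fraction` is used here only to keep the sample means exact, so equal means are grouped together reliably in step 3.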
