What Is Statistics?
• Definition of Statistics
– Statistics is the science of collecting, organizing, analyzing,
and interpreting data in order to make a decision.
• Branches of Statistics
– The study of statistics has two major branches:
descriptive (exploratory) statistics and inferential statistics.
• Descriptive statistics is the branch of statistics that
involves the organization, summarization, and display of
data.
• Inferential statistics is the branch of statistics that
involves using a sample to draw conclusions about a
population. A basic tool in the study of inferential statistics
is probability.
Scatterplots and Correlation
• Displaying relationships: Scatterplots
• Interpreting scatterplots
$$r = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)$$
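The formula for r can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `pearson_r` and the data values are assumptions made for the example, and the sample standard deviations use the n − 1 divisor to match the formula.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation: average product of standardized values (n - 1 divisor)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample standard deviations s_x and s_y, using the n - 1 divisor.
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    # Sum of products of standardized values, divided by n - 1.
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical data, chosen only for illustration.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
r = pearson_r(hours, score)
print(round(r, 4))
```

Swapping the two arguments gives the same value, which reflects the fact that r makes no distinction between explanatory and response variables.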
Facts about correlation
• What kind of variables do we use?
– 1. No distinction between explanatory and response variables.
– 2. Both variables should be quantitative
• Numerical properties
– 1. − 1 ≤ r ≤ 1
– 2. r>0: positive association between variables
– 3. r<0: negative association between variables
– 4. If r = 1 or r = −1, there is a perfect linear relationship
– 5. The closer |r| is to 1, the stronger the linear relationship
[Figure: scale of r from −1 to 1. Values below 0 indicate a negative relationship, values above 0 a positive relationship; the relationship is stronger toward either end.]
$$p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right]$$
[Figure: normal distribution PDF]
https://en.wikipedia.org/wiki/File:Normal_Distribution_PDF.svg
(2013)
The normal distribution p(x), with any mean μ and
any positive standard deviation σ, has the following properties:
• It is symmetric around the mean (μ) of the distribution.
• It is unimodal: its first derivative is positive for x < μ,
negative for x > μ, and zero only at x = μ.
• It has two inflection points (where the second
derivative of p is zero and changes sign), located one
standard deviation away from the mean, at x = μ − σ and x =
μ + σ.
• It is log-concave.
• It is infinitely differentiable, indeed supersmooth of
order 2.
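Some of the properties listed above can be checked numerically. The sketch below is only an illustration: the helper names and the values μ = 2, σ = 1.5 are assumptions, and the second derivative is approximated with a central finite difference rather than computed exactly.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density p(x) with mean mu and standard deviation sigma > 0."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def second_derivative(f, x, h=1e-4):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

MU, SIGMA = 2.0, 1.5  # illustrative values, not taken from the slides

def p(x):
    return normal_pdf(x, MU, SIGMA)

# Symmetry around the mean: p(mu + d) equals p(mu - d).
symmetric = all(abs(p(MU + d) - p(MU - d)) < 1e-12 for d in (0.1, 0.5, 1.0, 3.0))

# Unimodality: the density at the mean exceeds the density at nearby points.
peak_at_mean = p(MU) > p(MU - 0.01) and p(MU) > p(MU + 0.01)

# Inflection at mu + sigma: curvature is negative just inside, positive just outside.
curv_inside = second_derivative(p, MU + 0.9 * SIGMA)
curv_outside = second_derivative(p, MU + 1.1 * SIGMA)

print(symmetric, peak_at_mean, curv_inside < 0 < curv_outside)
```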
The standard normal distribution
p (with μ = 0 and σ = 1) also has the following properties:
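Correlation between x and y:
$$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y} = E\left[\left(\frac{x-\mu_x}{\sigma_x}\right)\left(\frac{y-\mu_y}{\sigma_y}\right)\right]$$
Property of the correlation coefficient: $-1 \le \rho_{xy} \le 1$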
For $z = ax + by$:
$$E[(z-\mu_z)^2] = a^2\sigma_x^2 + 2ab\,\sigma_{xy} + b^2\sigma_y^2$$
If $\sigma_{xy} = 0$:
$$\sigma_z^2 = a^2\sigma_x^2 + b^2\sigma_y^2$$
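The variance rule for z = ax + by can be verified on a small data set. The sketch below is illustrative only: the data and the coefficients a, b are made up, and all moments use equal weights 1/n, so the identity holds up to floating-point rounding.

```python
def mean(values):
    return sum(values) / len(values)

def cov(u, v):
    """Covariance with equal weights 1/n; cov(u, u) is the variance of u."""
    mu_u, mu_v = mean(u), mean(v)
    return sum((s - mu_u) * (t - mu_v) for s, t in zip(u, v)) / len(u)

# Hypothetical paired observations and coefficients.
x = [1.0, 2.0, 4.0, 7.0]
y = [3.0, 1.0, 5.0, 2.0]
a, b = 2.0, -3.0

z = [a * xi + b * yi for xi, yi in zip(x, y)]

var_z_direct = cov(z, z)
var_z_formula = a * a * cov(x, x) + 2 * a * b * cov(x, y) + b * b * cov(y, y)

print(var_z_direct, var_z_formula)
```

Dropping the middle term is only valid when the covariance of x and y is zero, which is not the case for this particular data set.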
[Figure: several sets of (x, y) points, with the correlation coefficient of x and y for each set.]
The correlation reflects the strength and direction of a linear relationship (top row),
but not the slope of that relationship (middle),
nor many aspects of nonlinear relationships (bottom).
$$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y} = E\left[\left(\frac{x-\mu_x}{\sigma_x}\right)\left(\frac{y-\mu_y}{\sigma_y}\right)\right]$$
The correlation coefficient can also be viewed as the cosine of the angle
between the two vectors of samples drawn from the two random variables,
i.e. between the two observed vectors in N-dimensional space (for N observations
of each variable). Source: http://www.hawaii.edu/powerkills/UC.HTM
This method only works with centered data, i.e., data which have been
shifted by the sample mean so as to have an average of zero.
https://people.math.harvard.edu/~knill/teaching/math19b_2011/handouts/lecture12.pdf
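The cosine interpretation, and the need to center the data first, can be illustrated as follows. This is a sketch with made-up data; `centered` and `cosine` are hypothetical helper names.

```python
import math

def centered(values):
    """Shift values by their sample mean so they average to zero."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def cosine(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(s * t for s, t in zip(u, v))
    norm_u = math.sqrt(sum(s * s for s in u))
    norm_v = math.sqrt(sum(t * t for t in v))
    return dot / (norm_u * norm_v)

# Hypothetical observations (N = 5 for each variable).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

cos_centered = cosine(centered(x), centered(y))  # equals the correlation coefficient
cos_raw = cosine(x, y)                           # not a correlation: data not centered

print(round(cos_centered, 4), round(cos_raw, 4))
```

The two values differ noticeably for this data, which is why the raw vectors cannot be used directly.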
Source: Wikipedia
Other PDFs:
Poisson
$$P(x) = \frac{e^{-\lambda}\lambda^{x}}{x!}, \quad \lambda > 0$$
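As a quick check of the Poisson formula, the sketch below evaluates the probabilities for an illustrative rate λ = 3 (an assumed value) and confirms that they sum to approximately 1 and have mean approximately λ. The sum is truncated at x = 59, which is an arbitrary cutoff far into the tail.

```python
import math

def poisson_pmf(x, lam):
    """P(x) = exp(-lam) * lam**x / x! for integer x >= 0 and lam > 0."""
    if lam <= 0:
        raise ValueError("lambda must be positive")
    if x < 0:
        return 0.0
    return math.exp(-lam) * lam ** x / math.factorial(x)

lam = 3.0  # illustrative rate
probs = [poisson_pmf(k, lam) for k in range(60)]  # truncated support

total = sum(probs)
mean = sum(k * pk for k, pk in enumerate(probs))

print(round(total, 6), round(mean, 6))
```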
Binomial
Cauchy
Laplace
Read about:
$E(X^2) = 25 + 16 = 41$
$\sigma_{xy} = E[(x - \mu_x)(y - \mu_y)]$
PROB. & STAT. - Revisited/Contd.
Sample mean is defined as:
$$\tilde{x} = \sum_{i=1}^{n} x_i P(x_i) = \frac{1}{n}\sum_{i=1}^{n} x_i, \quad \text{where } P(x_i) = 1/n.$$
Sample variance is:
$$\sigma_x^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \tilde{x})^2$$
Higher order moments may also be computed: $E(x_i - \tilde{x})^3$; $E(x_i - \tilde{x})^4$
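The sample moments above can be sketched directly from their definitions. The data set is made up for illustration, and the divisor is n, following the equal weights P(x_i) = 1/n used here. Python's `statistics.variance` uses n − 1 instead, while `statistics.pvariance` uses n.

```python
def sample_mean(xs):
    """Mean with equal weights P(x_i) = 1/n."""
    return sum(xs) / len(xs)

def central_moment(xs, k):
    """k-th central moment: average of (x_i - mean)**k with divisor n."""
    m = sample_mean(xs)
    return sum((x - m) ** k for x in xs) / len(xs)

# Hypothetical observations.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = sample_mean(data)
variance = central_moment(data, 2)
third = central_moment(data, 3)
fourth = central_moment(data, 4)

print(mean, variance, third, fourth)
```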
Thus, the second central moment (also called the variance) of a random variable x is
defined as:
$$\sigma_x^2 = E[\{x - E(x)\}^2] = E[(x - \mu_x)^2] = E(x^2) - 2\mu_x^2 + \mu_x^2 = E(x^2) - \mu_x^2$$
The S.D. of x is $\sigma_x$.
Thus
$$E(x^2) = \sigma_x^2 + \mu_x^2$$
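The step from the definition of the variance to the identity above uses only the linearity of expectation and the fact that $\mu_x = E(x)$ is a constant. Written out in full, as a sketch of the intermediate algebra:

```latex
\begin{aligned}
\sigma_x^2 &= E[(x - \mu_x)^2] \\
           &= E[x^2 - 2\mu_x x + \mu_x^2] \\
           &= E(x^2) - 2\mu_x E(x) + \mu_x^2 \\
           &= E(x^2) - 2\mu_x^2 + \mu_x^2 \\
           &= E(x^2) - \mu_x^2
\end{aligned}
```

For instance, assuming $\mu_x = 5$ and $\sigma_x = 4$ (illustrative values), the identity gives $E(x^2) = 16 + 25 = 41$.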
If z is a new variable, z = ax + by, then E(z) = E(ax + by) = aE(x) + bE(y).
Also, note that
MAXIMUM LIKELIHOOD ESTIMATE (MLE)
The ML estimate (MLE) of a parameter is that value which, when substituted
into the probability distribution (or density), produces that distribution for which
the probability of obtaining the entire observed set of samples is maximized.
http://grid.cs.gsu.edu/~skarmakar/math1070_slides.html
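The definition of the MLE can be made concrete with a small numerical search. The sketch below assumes a Poisson model and a made-up set of counts; it evaluates the log-likelihood on a grid of candidate λ values and picks the one that maximizes it. The grid search is chosen here only for transparency. For the Poisson model the maximizer is known to be the sample mean, which the grid result reproduces.

```python
import math

def poisson_log_likelihood(lam, samples):
    """Log of the joint probability of the samples under Poisson(lam)."""
    # log P(x) = -lam + x*log(lam) - log(x!), with log(x!) = lgamma(x + 1)
    return sum(-lam + x * math.log(lam) - math.lgamma(x + 1) for x in samples)

# Hypothetical observed counts.
samples = [2, 3, 1, 4, 2, 0, 3, 5]

# Candidate parameter values 0.01, 0.02, ..., 10.00 (an assumed search range).
grid = [k / 100 for k in range(1, 1001)]

# The MLE is the candidate that makes the observed samples most probable.
lam_hat = max(grid, key=lambda lam: poisson_log_likelihood(lam, samples))
sample_mean = sum(samples) / len(samples)

print(lam_hat, sample_mean)
```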
What are the main types of sampling and how is each done?
x      1    2    3    4    5    6
p(x)  1/6  1/6  1/6  1/6  1/6  1/6
$E(X) = 1(1/6) + 2(1/6) + 3(1/6) + \dots = 3.5$
$V(X) = (1 - 3.5)^2(1/6) + (2 - 3.5)^2(1/6) + \dots = 2.92$
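The expected value and variance of a single fair die can be reproduced with exact fractions. This is a minimal sketch of the arithmetic above; the variable names are arbitrary.

```python
from fractions import Fraction

faces = range(1, 7)
p = Fraction(1, 6)  # each face of a fair die is equally likely

# E(X) = sum of x * p(x)
expected = sum(x * p for x in faces)
# V(X) = sum of (x - E(X))^2 * p(x)
variance = sum((x - expected) ** 2 * p for x in faces)

print(expected, variance, round(float(variance), 2))
```

The exact variance is 35/12, which rounds to the 2.92 shown above.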
Throwing a die twice: sampling distribution of the sample mean
$E(\bar{x}) = 1.0(1/36) + 1.5(2/36) + \dots = 3.5$
$V(\bar{x}) = (1.0 - 3.5)^2(1/36) + (1.5 - 3.5)^2(2/36) + \dots = 1.46$
[Figure: probability of each value of the sample mean $\bar{x}$ (1.0, 1.5, ..., 6.0), with probabilities ranging from 1/36 to 6/36.]
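The sampling distribution of the mean of two throws can be built by enumerating all 36 equally likely outcomes. The sketch below uses exact fractions; names such as `dist` are arbitrary.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how often each sample mean occurs among the 36 ordered outcomes.
counts = Counter(Fraction(a + b, 2) for a, b in product(range(1, 7), repeat=2))

# Probability of each possible sample mean.
dist = {m: Fraction(c, 36) for m, c in sorted(counts.items())}

mean_of_means = sum(m * pr for m, pr in dist.items())
var_of_means = sum((m - mean_of_means) ** 2 * pr for m, pr in dist.items())

for m, pr in dist.items():
    print(float(m), pr)
print(mean_of_means, var_of_means, round(float(var_of_means), 2))
```

The variance comes out as 35/24, half the single-die variance of 35/12, and rounds to the 1.46 shown above.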
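Sampling Distribution of the Mean
n = 5: $\mu_{\bar{x}} = 3.5$, $\sigma_{\bar{x}}^2 = .5833 \; (= \sigma_x^2 / 5)$
n = 10: $\mu_{\bar{x}} = 3.5$, $\sigma_{\bar{x}}^2 = .2917 \; (= \sigma_x^2 / 10)$
n = 25: $\mu_{\bar{x}} = 3.5$, $\sigma_{\bar{x}}^2 = .1167 \; (= \sigma_x^2 / 25)$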
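The values for n = 5, 10 and 25 can be reproduced exactly, assuming the population is a fair die as in the preceding example. The sketch below builds the distribution of the sum of n dice by repeated convolution and then computes the mean and variance of the sample mean. The function names are hypothetical.

```python
from fractions import Fraction

def sum_distribution(n):
    """Exact distribution of the sum of n fair dice, by repeated convolution."""
    dist = {0: Fraction(1)}
    for _ in range(n):
        nxt = {}
        for total, prob in dist.items():
            for face in range(1, 7):
                key = total + face
                nxt[key] = nxt.get(key, Fraction(0)) + prob * Fraction(1, 6)
        dist = nxt
    return dist

def mean_and_variance_of_sample_mean(n):
    """Mean and variance of the average of n fair dice."""
    dist = sum_distribution(n)
    mean = sum(Fraction(s, n) * pr for s, pr in dist.items())
    var = sum((Fraction(s, n) - mean) ** 2 * pr for s, pr in dist.items())
    return mean, var

for n in (5, 10, 25):
    mean, var = mean_and_variance_of_sample_mean(n)
    print(n, float(mean), round(float(var), 4))
```

In each case the variance equals the single-die variance 35/12 divided by n.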
The expected value of the sample mean is equal to the population mean:
$$E(\bar{X}) = \mu_{\bar{X}} = \mu_X$$
The variance of the sample mean is equal to the population variance divided by
the sample size:
$$V(\bar{X}) = \sigma_{\bar{X}}^2 = \frac{\sigma_X^2}{n}$$
The standard deviation of the sample mean, known as the standard error of
the mean, is equal to the population standard deviation divided by the square
root of the sample size:
$$\text{s.e.} = SD(\bar{X}) = \sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{n}}$$
Law of Large Numbers
How sample means approach the population mean
(μ=25).
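A small simulation can illustrate how the running sample mean settles near the population mean. This is only a sketch: μ = 25 comes from the example above, but the normal shape of the population, the standard deviation of 5, and the fixed random seed are all assumptions made for the illustration.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

MU = 25.0    # population mean from the example
SIGMA = 5.0  # assumed population standard deviation

def running_means(n_draws):
    """Sample mean after each of n_draws independent draws."""
    total = 0.0
    means = []
    for i in range(1, n_draws + 1):
        total += random.gauss(MU, SIGMA)
        means.append(total / i)
    return means

means = running_means(100_000)

# The sample mean typically gets closer to MU as n grows.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, round(means[n - 1], 3))
```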
Example
- what would happen in many samples?
Recall Some Features of the Sampling Distribution
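When sampling from a population with mean μ and standard deviation σ, the sampling distribution of the sample mean approaches a normal distribution with mean μ and standard deviation $\sigma/\sqrt{n}$ as the sample size becomes large (n > 30).
[Figure: P(X) for the population and f(X̄) for the sampling distribution of the sample mean with n = 2, n = 30, and large n, each centered at μ.]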
Student’s t Distribution
If the population standard deviation, σ, is unknown, replace σ with
the sample standard deviation, s. If the population is normal, the
resulting statistic
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$
has a t distribution with (n − 1) degrees of freedom.
• The t is a family of bell-shaped and symmetric distributions, one for each
number of degrees of freedom.
• The expected value of t is 0.
• The variance of t is greater than 1, but approaches 1 as the number of
degrees of freedom increases.
• The t distribution approaches a standard normal as the number of degrees of
freedom increases.
• When the sample size is small (< 30), we use the t distribution.
[Figure: standard normal curve compared with t curves for df = 20 and df = 10, all centered at 0.]
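The t statistic defined above can be computed with the standard library alone. The sketch below uses a made-up sample and a hypothesized mean of 5.0; `t_statistic` is a hypothetical helper, and `statistics.stdev` is used because it applies the n − 1 divisor appropriate for the sample standard deviation s.

```python
import math
import statistics

def t_statistic(sample, mu0):
    """Return (t, degrees of freedom) for the sample against the mean mu0."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least 2 observations")
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    t = (x_bar - mu0) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical small sample (n = 5 < 30, so the t distribution applies).
sample = [4.8, 5.1, 5.3, 4.9, 5.4]
t, df = t_statistic(sample, mu0=5.0)

print(round(t, 4), df)
```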
Sampling Distributions
σN −n
σx = •
n N −1
Sampling Distribution of x
Standard Deviation of x
Finite Population Infinite Population
σN −n σ
σx = ( ) σx =
n N −1 n
• A finite population is treated as being
infinite if n/N < .05.
• $\sqrt{(N-n)/(N-1)}$ is the finite population correction factor.
• $\sigma_{\bar{x}}$ is referred to as the standard error of the
mean.
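The two standard-error formulas can be combined into one small function. This is a sketch under stated assumptions: the function name, its signature, and the example numbers are made up, and passing `N=None` is used here to mean an infinite population.

```python
import math

def standard_error(sigma, n, N=None):
    """Standard error of the mean; N=None treats the population as infinite."""
    if n < 1:
        raise ValueError("sample size must be at least 1")
    se = sigma / math.sqrt(n)
    if N is None:
        return se
    if N < 2 or n > N:
        raise ValueError("need N >= 2 and n <= N for a finite population")
    # Finite population correction factor sqrt((N - n) / (N - 1)).
    return se * math.sqrt((N - n) / (N - 1))

# Hypothetical numbers: sigma = 10, n = 30, N = 1000, so n/N = 0.03 < 0.05.
se_infinite = standard_error(10.0, 30)
se_finite = standard_error(10.0, 30, N=1000)

print(round(se_infinite, 4), round(se_finite, 4))
```

With n/N below 0.05 the two values are close, which is why the correction is usually ignored in that case.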
The Sampling Distribution of the Sample Proportion, $\hat{p}$
The sample proportion is the number of successes, X, divided by the number of trials, n: $\hat{p} = X/n$. Its standard deviation is $\sqrt{p(1-p)/n}$.
[Figure: binomial distributions P(X) for n = 2, p = 0.3; n = 10, p = 0.3; and a third panel with X = 0, ..., 15, whose horizontal axis is also labeled in units of $\hat{p} = X/15$.]
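The distribution of the sample proportion can be computed exactly from the binomial distribution. The sketch below assumes n = 10 and p = 0.3, as in one of the panels, and checks that the mean of p̂ is p and its standard deviation is √(p(1 − p)/n). The function name is hypothetical, and `math.comb` requires Python 3.8 or later.

```python
import math

def proportion_distribution(n, p):
    """Exact distribution of p_hat = X / n, where X ~ Binomial(n, p)."""
    return {
        x / n: math.comb(n, x) * p ** x * (1 - p) ** (n - x)
        for x in range(n + 1)
    }

n, p = 10, 0.3
dist = proportion_distribution(n, p)

mean = sum(p_hat * prob for p_hat, prob in dist.items())
sd = math.sqrt(sum((p_hat - mean) ** 2 * prob for p_hat, prob in dist.items()))

print(round(mean, 4), round(sd, 4), round(math.sqrt(p * (1 - p) / n), 4))
```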