Statistics Study Guide
Matthew Chesnes
The London School of Economics
September 22, 2001
1 Descriptive Statistics
Pictures of Data: histograms, pie charts, stem-and-leaf plots, scatter plots, ogives.
Ogive: a display of cumulative distribution percentages. Compute reasonable intervals for the data and determine their frequency, then their cumulative frequency, and finally their percentage frequency (percentile). Plot the percentages at the upper end of each interval. An easy way to display quartiles.
Stem and Leaf Plot: the stem is the major part of the data and the leaves are the minor part. Choose a reasonable unit for the stem and organize the leaves in order of magnitude. If the leaves are too large, split the stem into smaller intervals. Placing the leaves equidistant provides a histogram-like representation.
"A Diagram is Interacting the Eye" - B. Blight.
Measures of Data: mean, median, mode, standard deviation, quartiles.
Right skewed data - positively skewed - long right tail. Left skewed data - negatively skewed - long left tail.
Measures of Location and Spread:
Mean: the average (a stable value and useful in analysis, though sensitive to outliers).
Mode: the data value that occurs most often; the highest peak of the pdf in the continuous case.
Median: the 50th percentile (insensitive to outliers, though not as useful in statistical inference).
Range: max - min (crude and inaccurate).
Interquartile Range: 75th percentile - 25th percentile.
Sample standard deviation, s, an estimate of the population standard deviation, σ:
s = √(Corrected Sum of Squares / (n − 1)) = √( Σ_{i=1}^n (x_i − x̄)² / (n − 1) ).
The standard deviation is calculated on n − 1 degrees of freedom rather than n because dividing by n would yield a biased estimator. Alternative form of s:
CSS = Σ_{i=1}^n x_i² − n x̄²,
s = √( (Σ_{i=1}^n x_i² − n x̄²) / (n − 1) ).
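As a quick illustration (not from the original notes, data invented): both forms of the corrected sum of squares give the same sample standard deviation.

    import numpy as np

    x = np.array([4.0, 7.0, 6.0, 5.0, 9.0])       # hypothetical sample
    n = len(x)
    css = np.sum((x - x.mean()) ** 2)              # corrected sum of squares
    css_alt = np.sum(x ** 2) - n * x.mean() ** 2   # alternative form
    s = np.sqrt(css / (n - 1))                     # n - 1 degrees of freedom
    print(css, css_alt, s, np.std(x, ddof=1))      # the last two agree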
2 Probability
Additive Law: P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
Exclusive Events: P(A ∪ B) = P(A) + P(B).
Total Probability: P(A) + P(A^c) = 1.
De Morgan's Laws: P(A^c ∩ B^c) = P((A ∪ B)^c), and P(A^c ∪ B^c) = P((A ∩ B)^c).
Combinatorial: nCx = n! / (x!(n − x)!).
2.1
Exclusive events are VERY dependent: one happening completely excludes the possibility of the other occurring.
The Law of Total Probability: P(A) = P(A ∩ B) + P(A ∩ B^c) = P(A|B)P(B) + P(A|B^c)P(B^c).
So in general, Bayes' Law can be written
P(B_i|A) = P(A|B_i)P(B_i) / Σ_{j=1}^n P(A|B_j)P(B_j).
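A small numerical sketch of total probability and Bayes' Law (the two events B1, B2 and all probabilities below are invented for illustration):

    # Two sources B1, B2; A = "item is defective"
    p_B = [0.6, 0.4]            # P(B1), P(B2)
    p_A_given_B = [0.05, 0.20]  # P(A|B1), P(A|B2)

    p_A = sum(pa * pb for pa, pb in zip(p_A_given_B, p_B))   # law of total probability
    p_B1_given_A = p_A_given_B[0] * p_B[0] / p_A             # Bayes' law
    print(p_A, p_B1_given_A)                                 # 0.11 and about 0.27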
3
3.1
3.2
Hypergeometric: trials are not independent; sampling without replacement. As the population becomes large relative to the sample, the hypergeometric distribution tends to the binomial.
Negative Binomial: the distribution of the number of trials needed to get k successes.
Multinomial: a generalization of the binomial for more than 2 classifications.
3.3
Discrete Distribution Applications: measuring random arrival times, components that break down over time, defective items in a large batch.
Memoryless Property: at every point in time, there is always the same chance of the event occurring.
Arrival rate: λ. P(r arrivals in time t) = (λt)^r e^{−λt} / r!.
The rate, λ, is in terms of time, t. Use the Poisson approximation for the binomial when n is large AND p is either large (≈ 1) or small (≈ 0). Thus, use the Poisson approximation if np < 10. Then
P(R = r) ≈ (np)^r e^{−np} / r!.
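A brief check of the approximation (assumed values n = 500, p = 0.01, so np = 5 < 10):

    from scipy.stats import binom, poisson

    n, p = 500, 0.01
    for r in range(8):
        exact = binom.pmf(r, n, p)        # exact binomial probability
        approx = poisson.pmf(r, n * p)    # Poisson approximation with mean np
        print(r, round(exact, 4), round(approx, 4))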
Expectation of a discrete random variable: E[X] = Σ_i x_i p_i.
If X is distributed binomially, E[X] = np. If X is distributed Poisson, E[X] = λ.
Expectation is a linear operator: E[a + bX] = a + bE[X].
Variance and standard deviation: if R is a random variable with mean μ,
σ_R² = E[(R − μ)²] = Σ_r (r − μ)² p_r.
Alternate form: σ_R² = E[R²] − (E[R])².
Rearranging gives an important and useful result: σ_R² + μ² = E[R²].
If X is distributed binomially, σ_X² = npq. If X is distributed Poisson, σ_X² = λ.
For a continuous random variable, E[X] = ∫ x f(x) dx and
σ_X² = E[(X − μ)²] = E[X²] − (E[X])².
5.1
5.2
Consider the Poisson process with points occurring at random in time. λ is the average number of occurrences per unit of time. The time between occurrences is a continuous random variable, X, and it follows an exponential distribution.
1 − F(x) = P(X > x) = P(0 occurrences over the interval (0, x)) = e^{−λx}(λx)⁰/0! = e^{−λx}.
Thus F(x) = 1 − e^{−λx}, and f(x) = d/dx (1 − e^{−λx}) = λe^{−λx}.
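A simulation sketch of this fact (the rate λ = 2 is an arbitrary assumed value): the gaps between Poisson-process arrivals are exponential, so the empirical P(X > x) should track e^{−λx}.

    import numpy as np

    rng = np.random.default_rng(0)
    lam = 2.0
    gaps = rng.exponential(scale=1.0 / lam, size=100_000)  # inter-arrival times
    x = 0.5
    print((gaps > x).mean(), np.exp(-lam * x))             # both close to 0.368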
5.3
The Central Limit Theorem: if n values are sampled from a population and n is sufficiently large, then the sample mean (or sum) is normally distributed whatever the distribution of the parent population. If the parent is normal, n can be relatively small; if the parent is very non-normal, a larger n is needed, though about 50 is usually enough.
Standard Normal Distribution: μ = 0, σ = 1.
E[aX + bY] = aE[X] + bE[Y] = aμ_X + bμ_Y.
σ²[aX + bY] = a²σ_X² + b²σ_Y² + 2ab Cov(X, Y).
σ²[aX − bY] = a²σ_X² + b²σ_Y² − 2ab Cov(X, Y).
σ²[3 + 2X] = 4σ_X².
Theorem: any linear function of normal variables is itself normally distributed.
Normal approximation to the binomial: if R is distributed binomially with n trials and p, the probability of success, then as n → ∞ with p constant, R → Normal. As n → ∞ with np constant (therefore p → 0), R → Poisson (use the Poisson if np < 10). If R → Normal, R ≈ N(np, npq).
IMPORTANT: when using the normal approximation to the binomial, remember to add or subtract a half when computing intervals or finding critical values, to reflect the discreteness of the original distribution.
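A sketch of the continuity correction (n = 100 and p = 0.4 are assumed values): approximate P(R ≤ 45) with a normal of mean np and variance npq.

    import numpy as np
    from scipy.stats import binom, norm

    n, p = 100, 0.4
    mu, sigma = n * p, np.sqrt(n * p * (1 - p))
    exact = binom.cdf(45, n, p)
    approx = norm.cdf((45 + 0.5 - mu) / sigma)   # add a half for discreteness
    print(exact, approx)                         # the two are close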
6 Sampling Theory
Let X ∼ N(μ, σ²) and let x₁, ..., x_n be a random sample from this population, with sample mean X̄ = (1/n) Σ_i x_i. Then
X̄ ∼ N(μ, σ²/n).
The standard deviation of X̄ = σ/√n = the standard error.
Parameters, Estimators, and Standard Errors:
Parameter = μ; Estimator = x̄; Standard Error = σ/√n.
Parameter = p; Estimator = r/n; Standard Error = √(pq/n).
7 Estimation
7.1 Point Estimation
We want to estimate some parameter θ using an estimator θ̂. Calculate the Mean Square Error, MSE = E[(θ̂ − θ)²].
Squaring out, MSE = σ²_θ̂ + (E[θ̂] − θ)².
Or otherwise written, MSE = Variance + bias².
Desirable properties of estimators: Unbiased: E[estimator] = parameter. Efficient: small variance.
For example, E[s²] = E[CSS/(n − 1)] = E[Σ(x − x̄)²/(n − 1)] = σ².
Hence dividing by n − 1 is explained: it gives us an unbiased estimator. However, efficiency is more important than unbiasedness: if one estimator is slightly biased but extremely efficient, use it, because of the high variability of the alternative.
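A short simulation sketch of this bias (assumed setup: normal samples of size 5 with σ² = 1): dividing the corrected sum of squares by n systematically underestimates σ², while dividing by n − 1 does not.

    import numpy as np

    rng = np.random.default_rng(1)
    samples = rng.normal(size=(200_000, 5))
    css = ((samples - samples.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
    print(css.mean() / 5, css.mean() / 4)   # about 0.8 (biased) versus about 1.0 (unbiased)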
7.2 Interval Estimation
A 95 percent confidence interval for the mean:
μ ∈ (x̄ ± Z_crit(2.5%) SE(x̄)) = (x̄ ± 1.96 σ/√n).
An incorrect interpretation of this interval would be: there is a 95 percent chance that x̄ is within 1.96 standard errors of μ. A correct (purist) statement would be: if you took many samples and calculated the confidence interval for the parameter each time, then 95 percent of the confidence intervals would contain the true value of the parameter. This is because the interval is the thing that has variability, not μ; μ is a constant.
Confidence intervals for proportions:
p ∈ ( r/n ± Z_crit √( (r/n)(1 − r/n) / n ) ).
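An illustrative sketch of both intervals with invented numbers (x̄ = 52.3, σ = 4, n = 40 for the mean; r = 37 successes out of m = 120 for the proportion):

    import numpy as np
    from scipy.stats import norm

    z = norm.ppf(0.975)                          # about 1.96

    xbar, sigma, n = 52.3, 4.0, 40
    print(xbar - z * sigma / np.sqrt(n), xbar + z * sigma / np.sqrt(n))

    r, m = 37, 120
    phat = r / m
    se = np.sqrt(phat * (1 - phat) / m)
    print(phat - z * se, phat + z * se)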
Sample Size Determination. Define d to be the tolerance, or the half-length of the confidence interval. To obtain a 95 percent confidence interval for a mean within a certain tolerance d, set n = (Z_crit σ / d)². One may have to estimate σ with s using a small sample first and then determine the optimal n. In general, d = Z_crit · SE, and the SE involves n, so solve for n and plug in d.
Exact formulation of the variance of x̄ when sampling from a finite population of size N: Var(x̄) = (σ²/n)(1 − n/N).
7.3
Suppose the sample is small and the variance is unknown. A confidence interval for μ is
μ ∈ (x̄ ± t_crit,n−1 · s/√n).
The t distribution, AKA Student's t distribution, is more spread out to allow for the variability of both x̄ and s. If σ is known, use the Z distribution for sure (unless n is incredibly low). If n is large, use Z because, even though the t distribution is theoretically correct, t → Z as n → ∞. One other case: if n is small and the distribution is really not normal (the Central Limit Theorem does not apply), then one must use a non-parametric method.
Comparison of Means: 3 cases.
Paired Data. Calculate d_i = x_i − y_i. We want an estimate for μ_d = μ_x − μ_y, so the confidence interval becomes
μ_d ∈ (d̄ ± t_{n−1} (s_d/√n)).
We use the t distribution because n is small and we are estimating σ_d.
Unpaired Large Samples. μ_x − μ_y is estimated by x̄ − ȳ. The standard error here is
S_{x̄−ȳ} = √( S_x²/n_x + S_y²/n_y ),
and thus a confidence interval becomes
μ_x − μ_y ∈ ( (x̄ − ȳ) ± Z_crit √( S_x²/n_x + S_y²/n_y ) ).
Unpaired Small Samples. One must make the assumption that the variances of the two populations are the same! A risky assumption. Assume σ₁ = σ₂ = σ_p. Then
S_p² = [ (n₁ − 1)S₁² + (n₂ − 1)S₂² ] / (n₁ + n₂ − 2) = (CSS₁ + CSS₂) / (n₁ + n₂ − 2),
and
SE = S_p √( 1/n₁ + 1/n₂ ).
Notice that S_p² is a weighted average of the sample variances, with each one's degrees of freedom as the weights. The test statistic for a hypothesis test or a confidence interval will follow a t distribution with n₁ + n₂ − 2 degrees of freedom.
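A sketch of the pooled procedure on two invented samples; scipy's equal-variance t test uses exactly this pooled standard error.

    import numpy as np
    from scipy.stats import t, ttest_ind

    x = np.array([5.1, 4.8, 6.0, 5.5, 5.9])
    y = np.array([4.2, 4.9, 4.6, 5.0, 4.4, 4.7])
    n1, n2 = len(x), len(y)

    sp2 = ((n1 - 1) * x.var(ddof=1) + (n2 - 1) * y.var(ddof=1)) / (n1 + n2 - 2)
    se = np.sqrt(sp2 * (1 / n1 + 1 / n2))
    diff = x.mean() - y.mean()
    tcrit = t.ppf(0.975, df=n1 + n2 - 2)
    print(diff - tcrit * se, diff + tcrit * se)   # 95 percent CI for mu_x - mu_y
    print(ttest_ind(x, y, equal_var=True))        # matching t statistic and p-value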
7.4
S2 =
8 Hypothesis Testing
Testing H0 versus H1. Always choose the null hypothesis to be the simpler of the two alternatives.
Type I Error: rejecting H0 when it is true (α). Type II Error: failing to reject H0 when it is false (β). α and β both decrease with a larger sample size.
Power Function: the probability of accepting H1 (rejecting H0) for different values of the true parameter, θ.
Some might use the terminology "accepting H1", but this would be incorrect if it implies proof. All we are saying is that the available data support the hypothesis. Purists would never just "accept"; they would use the terminology "fail to reject H0".
To carry out a test, define the hypotheses, compute the test statistic, and compare it with the relevant distribution. If n is large, use the Z distribution for your decision. If n is smaller and σ is unknown, use the t distribution. If a test statistic is on a division point of the critical values, maybe you cannot confidently reject H0, but you should be very suspicious that it is actually true.
"Always report the lowest possible level (highest possible confidence). Doing otherwise is just ignorant." - C. Dougherty.
The P-value of the test tells you exactly where the test statistic lies: it is the probability, under the null hypothesis, of observing an estimate as extreme as or more extreme than your value.
When computing standard errors for a test, always compute them with the null values. Since we are assuming that the null is true until shown otherwise, one must use its values when doing the test.
Advantage of a paired test: it is much less sensitive to variability between subjects, since each subject serves as its own control.
Never use the data to form your hypothesis: choose the nature of the test (one-tailed or two-tailed, null and alternative hypotheses, etc.) first, and then carry out the test using the data.
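A minimal sketch of a two-sided test of H0: μ = 5 against H1: μ ≠ 5 on invented data, reporting the test statistic and P-value:

    import numpy as np
    from scipy.stats import ttest_1samp

    x = np.array([5.3, 5.9, 5.1, 6.2, 5.7, 5.4, 6.0, 5.8])
    tstat, pvalue = ttest_1samp(x, popmean=5.0)
    print(tstat, pvalue)     # reject H0 at the 5 percent level if pvalue < 0.05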
Σ (O_i − E_i)² / E_i ∼ χ²_{(r−1)(c−1)}.
The larger the statistic, the more likely we are to reject H0 in favor of association. The statistic is distributed as a chi-squared with (rows − 1)(cols − 1) degrees of freedom.
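A sketch of the test of association on a hypothetical 2 x 3 contingency table (the counts are invented):

    import numpy as np
    from scipy.stats import chi2_contingency

    observed = np.array([[20, 30, 25],
                         [35, 25, 15]])
    stat, pvalue, dof, expected = chi2_contingency(observed)
    print(stat, pvalue, dof)   # dof = (2 - 1) * (3 - 1) = 2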
10
Let R be a random variable with p.d.f. p_r, and let T be some function φ(R). Then Prob(T = t) = Σ p_r, where the sum is over all values of r such that φ(r) = t. Work out the distributions of R and then T to see that this is true.
Theorem: for a random variable X and a random variable Y = φ(X) such that φ is a monotonic function, the c.d.f. for X equals the c.d.f. for Y: F(x) = G(y). Also (IMPORTANT THEOREM), for the same transformation φ,
g(y) = f(x) |dx/dy|.
For a general transformation φ of a random variable (not necessarily monotonic), just look at the graph of the transformed X and evaluate the above theorem over each monotonic section.
Joint density functions of two random variables: f(x, y). This is simply a surface in three dimensions, with the volume under the surface (instead of the area under the curve) representing probability. The total volume under the surface is again equal to one. All of the Bayes calculus on probabilities also applies to density functions: f(y) = ∫ f(y|x) f(x) dx.
10.1
Covariance: Cov(X, Y) = σ_XY = E[(X − μ_X)(Y − μ_Y)]. If σ_XY > 0, X and Y work in the same direction; if σ_XY < 0, X and Y work in opposite directions. It can also be shown that Cov(X, Y) = E[XY] − E[X]E[Y]. Since the covariance depends on the units of the random variables, we define the correlation coefficient to be
ρ = σ_XY / (σ_X σ_Y).
ρ is the linear correlation coefficient, and it lies between −1 and 1. If X and Y are independent, it can be shown that Cov(X, Y) = 0. If X is a linear function of Y, then ρ_XY = ±1.
Properties of Variance and Covariance:
Var(aX + bY) = a² Var(X) + b² Var(Y) + 2ab Cov(X, Y). Variance is a second-order operator. The variances always add, though the covariance term takes the sign of ab.
Three-variable case: Var(aX + bY + cZ) = a² Var(X) + b² Var(Y) + c² Var(Z) + 2ab Cov(X, Y) + 2ac Cov(X, Z) + 2bc Cov(Y, Z).
Cov(aX + bY, cS + dT) = ac Cov(X, S) + ad Cov(X, T) + bc Cov(Y, S) + bd Cov(Y, T).
11
Σ is the variance-covariance matrix. All diagonal elements of this matrix are the variances of the random variables; the off-diagonal entries are covariances. It is of course a symmetric matrix. Σ = E[(X − μ)(X − μ)^T].
Theorem: if X ∼ N(0, Σ) (i.e., X is multivariate normal), then
X^T Σ⁻¹ X ∼ χ²_p.
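A simulation sketch of this theorem (p = 3 and the positive-definite Σ below are arbitrary assumed values): the quadratic form should behave like a χ²₃ variable, which has mean 3 and variance 6.

    import numpy as np

    rng = np.random.default_rng(2)
    sigma = np.array([[2.0, 0.5, 0.3],
                      [0.5, 1.0, 0.2],
                      [0.3, 0.2, 1.5]])
    x = rng.multivariate_normal(mean=np.zeros(3), cov=sigma, size=100_000)
    q = np.einsum('ij,jk,ik->i', x, np.linalg.inv(sigma), x)   # x^T Sigma^{-1} x, row by row
    print(q.mean(), q.var())                                   # close to 3 and 6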
12
Six basic statistics are needed for regression: n, the sample size; x̄, the sample mean of the independent variable; ȳ, the sample mean of the dependent variable; Sxx, the corrected sum of squares for the x's; Syy, the corrected sum of squares for the y's; and Sxy, the corrected sum of products for x and y.
Sxx = Σ_i (x_i − x̄)².
Syy = Σ_i (y_i − ȳ)².
Sxy = Σ_i (x_i − x̄)(y_i − ȳ) = Σ_i x_i y_i − n x̄ ȳ.
12.1
Covariance = c = Sxy / (n − 1).
Correlation = r = Sxy / √(Sxx Syy).
A correlation of zero means that there is no linear relationship between X and Y, but it does not necessarily mean there is no relationship at all: it could be nonlinear.
Test for Correlation: H0: ρ = 0 versus H1: ρ ≠ 0.
Test Statistic = r √(n − 2) / √(1 − r²), compared with a t distribution on n − 2 degrees of freedom.
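A sketch on invented data: the hand-computed statistic agrees with the P-value from scipy's correlation test.

    import numpy as np
    from scipy.stats import pearsonr, t

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
    y = np.array([2.1, 2.9, 3.8, 5.2, 5.9, 7.3, 8.1, 8.8])
    r, p = pearsonr(x, y)
    n = len(x)
    tstat = r * np.sqrt(n - 2) / np.sqrt(1 - r ** 2)
    print(tstat, 2 * t.sf(abs(tstat), df=n - 2), p)   # the two p-values agree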
12.2
Use scatterplots of the data as a starting point.
Simple Linear Model: y_i = α + βx_i + ε_i, with error term ε_i iid N(0, σ²).
Least squares chooses the estimates to minimize Σ_i (y_i − α − βx_i)².
This yields the estimators
b = Sxy / Sxx, and a = ȳ − b x̄.
It can be shown that a and b are B.L.U.E.: Best Linear Unbiased Estimators.
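A fitting sketch on invented data: the formulas above reproduce what np.polyfit returns.

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    y = np.array([2.3, 2.9, 4.1, 4.8, 6.2, 6.9])
    sxx = np.sum((x - x.mean()) ** 2)
    sxy = np.sum((x - x.mean()) * (y - y.mean()))
    b = sxy / sxx
    a = y.mean() - b * x.mean()
    print(a, b, np.polyfit(x, y, 1))   # polyfit returns [slope, intercept]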
The residuals are e_i = y_i − ŷ_i, and the residual sum of squares is RSS = Σ_i (y_i − ŷ_i)² = Σ_i e_i². Then
S² = RSS / (n − p)
is an unbiased estimator of σ².
Rearranging the definition of RSS, we find RSS = Syy − b² Sxx. In other words, RSS is the extra variability in y that we cannot explain after fitting the model. If Syy is the total variability, then b² Sxx is the explained variability.
Analysis of Variance Table:
Source                   Degrees of Freedom   Sums of Squares       Mean Square
Regression (Explained)   p − 1                b² Sxx                S_r²
Residual (Unexplained)   n − p                RSS = Syy − b² Sxx    S² = RSS/(n − p)
Total                    n − 1                Syy                   S_T²
Define
R² = b² Sxx / Syy = Sxy² / (Sxx Syy).
R² is the percentage of the variability in y that is explained by the independent variables via the regression equation. In words, it is the explained variability over the total variability, so it is a good measure of how well the line fits the data. In simple linear regression, we saw that the correlation coefficient is r = Sxy / √(Sxx Syy); thus, in SLR, R² = r². This doesn't carry over to multiple regression, because there we have many correlations and only one R² value.
Adjusted R². Good for comparing models in the multiple regression setting. It reflects the fact that adding more variables, while always increasing R², might lead to a worse model. Define
R²_adj = (S_T² − S²) / S_T².
SE(a) = S √( 1/n + x̄²/Sxx ).
Hypothesis tests and inferences about α and β are carried out as always and follow a t distribution, because we are estimating σ.
F Test for Regression: particularly useful for multiple regression. In SLR, F = t². Test H0: β_i = 0 for all i versus H1: β_i ≠ 0 for at least one i. The null hypothesis is that the regression has no effect. The test statistic is F = S_r²/S². If F is much different from 1, then reject H0 and conclude that there is a valid regression effect. It can be shown, as a ratio of two chi-squared variables, that
S_r² / S² ∼ F_{p−1, n−p}.
12.3 Prediction Intervals
Plug your x value into the regression equation and get your predicted y. Be careful, though, of points outside the range of your data. For an interval of confidence, develop a prediction interval for y. ŷ is your estimator, and the standard error of ŷ is
SE(ŷ) = S √( 1 + 1/n + (x − x̄)²/Sxx ).
So your prediction interval becomes
y ∈ ( ŷ ± t_{n−2} SE(ŷ) ).
From the last term in the SE formula, it is clear that the further away from the mean of x you are, the larger your prediction interval.
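A sketch of a 95 percent prediction interval at an assumed new point x0 = 4.5, continuing the invented data used above:

    import numpy as np
    from scipy.stats import t

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    y = np.array([2.3, 2.9, 4.1, 4.8, 6.2, 6.9])
    n = len(x)
    sxx = np.sum((x - x.mean()) ** 2)
    syy = np.sum((y - y.mean()) ** 2)
    sxy = np.sum((x - x.mean()) * (y - y.mean()))
    b = sxy / sxx
    a = y.mean() - b * x.mean()
    s2 = (syy - b ** 2 * sxx) / (n - 2)                    # RSS / (n - p), with p = 2
    x0 = 4.5
    yhat = a + b * x0
    se = np.sqrt(s2 * (1 + 1 / n + (x0 - x.mean()) ** 2 / sxx))
    tcrit = t.ppf(0.975, df=n - 2)
    print(yhat - tcrit * se, yhat + tcrit * se)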
12.4 Multiple Regression
Model: y = xβ + ε, where x is the design matrix with a leading column of ones,
x =
[ 1  x11  x12  ...  x1p ]
[ 1  x21  x22  ...  x2p ]
[ ...                   ]
[ 1  xn1  xn2  ...  xnp ]
and β = (β₁, β₂, ..., β_p)^T.
Thus, for OLS, we minimize (y − xβ)^T (y − xβ), which yields
β̂ = b = (x^T x)⁻¹ (x^T y).
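A sketch of the matrix formula on simulated data (the coefficients 1.0, 2.0, −0.5 and the noise level are invented); np.linalg.lstsq gives the same answer.

    import numpy as np

    rng = np.random.default_rng(3)
    n = 50
    x1, x2 = rng.normal(size=n), rng.normal(size=n)
    y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.3, size=n)

    X = np.column_stack([np.ones(n), x1, x2])      # leading column of ones
    b = np.linalg.solve(X.T @ X, X.T @ y)          # (X'X)^{-1} X'y
    print(b)
    print(np.linalg.lstsq(X, y, rcond=None)[0])    # same estimates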
13 Time Series
Laspeyres Price Index: for comparing prices using quantities at the base time, t₀:
P_t = 100 Σ q₀ p_t / Σ q₀ p₀.
Paasche Price Index:
P_t = 100 Σ q_t p_t / Σ q_t p₀.
Quantity Index:
P_t = 100 Σ q_t p₀ / Σ q₀ p₀.
Value Index:
P_t = 100 Σ q_t p_t / Σ q₀ p₀.
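A small worked sketch of the Laspeyres and Paasche price indices for two goods, with invented base-period (0) and current-period (t) prices and quantities:

    import numpy as np

    p0, q0 = np.array([1.0, 3.0]), np.array([10.0, 4.0])
    pt, qt = np.array([1.2, 3.6]), np.array([9.0, 5.0])

    laspeyres = 100 * np.sum(q0 * pt) / np.sum(q0 * p0)   # base-period quantities
    paasche = 100 * np.sum(qt * pt) / np.sum(qt * p0)     # current-period quantities
    print(laspeyres, paasche)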
Index Linking. It is useful to re-index from time to time, but to avoid jumps, define
P_t = P_{0,t} for t = 0, ..., 10, and
P_t = (P_{0,10} / 100) P_{10,t} = (P_{0,10} / 100) · 100 Σ q₁₀ p_t / Σ q₁₀ p₁₀ for t = 10, ..., 20.
Time Series: x₀, x₁, x₂, ..., x_t, ...
Classical economic time series: x_t = T_t + S_t + C_t + I_t (trend + seasonal + cyclical + irregular stationary component).
Stationary Time Series: relate the variable to itself using 1 or more lags.
Autoregression: (x_t − x̄) = b(x_{t−1} − x̄).
Autocorrelation (τ is the number of lags):
r_τ = [ Σ_t (x_t − x̄)(x_{t+τ} − x̄) / (n − 1) ] / [ Σ_t (x_t − x̄)² / (n − 1) ].
Models.
First-order autoregressive model: x_t = αx_{t−1} + ε_t, where |α| < 1 for a stationary time series and |α| ≥ 1 for a non-stationary one.
Second-order autoregressive model: x_t = α₁x_{t−1} + α₂x_{t−2} + ε_t.
Moving Average Model:
x_t = ε_t + bε_{t−1},
x_{t+1} = ε_{t+1} + bε_t,
x_{t+2} = ε_{t+2} + bε_{t+1}.
Here every pair of neighboring terms is correlated, but terms further apart are not. This can be extended to more than one lagged interaction.
ai xti =
i=0 i=0
bi
ti .
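A simulation sketch of a stationary first-order autoregressive series with an assumed coefficient α = 0.7; the lag-1 sample autocorrelation comes out close to α.

    import numpy as np

    rng = np.random.default_rng(4)
    alpha, n = 0.7, 5_000
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = alpha * x[t - 1] + rng.normal()   # x_t = alpha * x_{t-1} + e_t
    xd = x - x.mean()
    r1 = np.sum(xd[1:] * xd[:-1]) / np.sum(xd ** 2)
    print(r1)                                    # approximately 0.7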