Quantitative Analysis


Reading 13 - Random Variables
Reading 14 - Common Univariate Random Variables
Reading 15 - Multivariate Random Variables
Reading 16 - Sample Moments
Random Variables and Probability Functions
• Discrete random variable: one that can take on only a countable number of
possible outcomes (a Bernoulli random variable, which takes only the values 0
and 1, is the simplest example)
• Example: the number of heads in a series of coin flips, the number of days in June that will have a
temperature greater than 35 °C
• Continuous random variable: one that can take on an uncountable number of possible outcomes.
• Example: the amount of rainfall that will fall in June
• For continuous random variables, the probability of any single exact value is zero, so we measure
probabilities only over intervals (e.g., the probability that rainfall in June will be between 500 and 520 mm).
• A probability mass function (PMF), f (x) = P(X = x), gives us the probability that the
outcome of a discrete random variable, X, will be equal to a given number, x.
• A cumulative distribution function (CDF) gives us the probability that a random
variable will take on a value less than or equal to x [i.e., F(x) = P(X ≤ x)].
Expected value
• Expected value: weighted average of the possible outcomes of a random variable,
where the weights are the probabilities that the outcomes will occur.
E(X) = ΣPiXi = P1X1 + P2X2 + … + PnXn
where Pi is the probability that outcome Xi occurs

• The following are two useful properties of expected values:


1. If c is any constant, then:
E(cX) = cE(X)
2. If X and Y are any random variables, then:
E(X + Y) = E(X) + E(Y)
Expected value
• EXAMPLE: Expected earnings per share (EPS)
The probability distribution of EPS for Ron’s Stores is given in the following table.
Calculate the expected earnings per share.

Probability | EPS
0.10 | £1.80
0.20 | £1.60
0.40 | £1.20
0.30 | £1.00

Answer:
The expected EPS is simply a weighted average of each
possible EPS, where the weights are the probabilities of
each possible outcome.

E(EPS) = 0.10(1.80) + 0.20(1.60) + 0.40(1.20) + 0.30(1.00) = £1.28
MEAN, VARIANCE, SKEWNESS, AND KURTOSIS
• Four common population moments of a random variable
• Mean: Expected value (μ)
• Variance: The second central moment of a random variable
• Measures how widely dispersed the values of the random variable are around
the mean
• σ² = E{[X − E(X)]²} = E[(X − μ)²]
• Standard deviation: σ = √σ², the positive square root of the variance

Example: Calculate the variance and standard deviation of EPS for Ron’s Stores
σ² = 0.10(1.80 − 1.28)² + 0.20(1.60 − 1.28)² + 0.40(1.20 − 1.28)² + 0.30(1.00 − 1.28)²
= 0.0736
σ = √0.0736 = £0.2713
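The same calculations as a minimal Python sketch (the probabilities and EPS values are taken from the Ron’s Stores example above):

```python
# Probability-weighted moments of the Ron's Stores EPS distribution
probs = [0.10, 0.20, 0.40, 0.30]   # P(EPS = x)
eps = [1.80, 1.60, 1.20, 1.00]     # possible EPS outcomes (GBP)

mean = sum(p * x for p, x in zip(probs, eps))                # E(X) = 1.28
var = sum(p * (x - mean) ** 2 for p, x in zip(probs, eps))   # E[(X - mu)^2] = 0.0736
std = var ** 0.5                                             # 0.2713

print(mean, var, std)
```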
MEAN, VARIANCE, SKEWNESS, AND KURTOSIS
• Skewness, a measure of a distribution’s symmetry, is the standardized third
moment.
• Skewness = E{[X − E(X)]³} / σ³ = E[(X − μ)³] / σ³
• Skew = 0 → perfectly symmetric distribution


MEAN, VARIANCE, SKEWNESS, AND KURTOSIS
• Kurtosis is the standardized fourth moment, E[(X − μ)⁴] / σ⁴.
• Kurtosis is a measure of the shape of a distribution, in particular the total probability
in the tails of the distribution relative to the probability in the rest of the distribution.
• The higher the kurtosis, the greater the probability in the tails of the distribution.

[Figure: distributions with positive excess kurtosis (fatter tails than the normal) and negative excess kurtosis (thinner tails than the normal)]
The Normal Distribution
• Many of the random variables that are relevant to finance and other professional
disciplines follow a normal distribution.
• It is completely described by its mean, μ, and variance, σ², stated as X ~ N(μ, σ²).
In words, this says, “X is normally distributed with mean μ and variance σ².”
• Skewness = 0, meaning the normal distribution is symmetric about its mean, so
that P(X ≤ μ) = P(μ ≤ X) = 0.5, and mean = median = mode.
• Kurtosis = 3.
• A linear combination of normally distributed independent random variables is
also normally distributed.
• The probabilities of outcomes further above and below the mean get smaller and
smaller but do not go to zero (the tails get very thin but extend infinitely).
Confidence interval
• A confidence interval is a range of values around the expected outcome within
which we expect the actual outcome to be some specified percentage of the
time.
• A 95% confidence interval is a range that we expect the random variable to be in
95% of the time.
• For a normal distribution, this interval is based on the expected value (sometimes
called a point estimate) of the random variable and on its variability, which we
measure with standard deviation.
Confidence interval

• The 90% confidence interval for X is μ − 1.65σ to μ + 1.65σ
• The 95% confidence interval for X is μ − 1.96σ to μ + 1.96σ
• The 99% confidence interval for X is μ − 2.58σ to μ + 2.58σ
Confidence intervals
• EXAMPLE: Confidence intervals
The average return of a mutual fund is 10.5% per year and the standard deviation
of annual returns is 18%. If returns are approximately normal, what is the 95%
confidence interval for the mutual fund return next year?
Answer:
Here μ and σ are 10.5% and 18%, respectively. Thus, the 95% confidence interval
for the return, R, is:
10.5 ± 1.96(18) = −24.78% to 45.78%
Symbolically, this result can be expressed as:
P(−24.78 < R < 45.78) = 0.95 or 95%
The interpretation is that the annual return is expected to be within this interval
95% of the time, or 95 out of 100 years.
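A quick way to check this interval in Python (a sketch assuming SciPy is available; μ and σ come from the example above):

```python
from scipy.stats import norm

mu, sigma = 10.5, 18.0  # mean and standard deviation of annual returns (%)

# 95% two-sided interval for a normal random variable: mu +/- 1.96*sigma
low, high = norm.interval(0.95, loc=mu, scale=sigma)
print(low, high)        # approximately -24.78, 45.78
```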
The standard normal distribution

• A standard normal distribution (i.e., z-distribution) is a normal distribution that has been
standardized so it has a mean of zero and a standard deviation of 1
• X ~ N(0, 1)
The standard normal distribution

• EXAMPLE: Standardizing a random variable (calculating z-values)


Assume the annual earnings per share (EPS) for a population of firms are normally
distributed with a mean of $6 and a standard deviation of $2. What are the z-values for EPS
of $2 and $8?
• Answer:
If EPS = x = $8, then z = (x − μ) / σ = ($8 − $6) / $2 = +1
If EPS = x = $2, then z = (x − μ) / σ = ($2 − $6) / $2 = –2
Here, z = +1 indicates that an EPS of $8 is one standard deviation above the mean, and z =
−2 means that an EPS of $2 is two standard deviations below the mean.
Z table
• A cumulative z-table gives the probability that a standard normal random
variable will be less than a given z-value.
[Table: cumulative probabilities for a standard normal distribution]
• EXAMPLE: Using the z-table (1)
Considering again EPS distributed with μ = $6 and σ = $2, what is the probability
that EPS will be $9.70 or more?
Answer:
The z-value for EPS = $9.70 is: z = (x − μ) / σ = ($9.70 − $6) / $2 = 1.85
That is, $9.70 is 1.85 standard deviations above the mean EPS value of $6. From the
z-table, we have F(1.85) = 0.9678, but this is P(EPS ≤ 9.70).
P(EPS > 9.70) = 1 − 0.9678 = 0.0322, or 3.22%
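The table lookup can be verified in Python (a sketch assuming SciPy; the values come from the EPS example above):

```python
from scipy.stats import norm

z = (9.70 - 6.0) / 2.0   # standardize: z = (x - mu) / sigma = 1.85
p_below = norm.cdf(z)    # P(EPS <= 9.70) = F(1.85), about 0.9678
p_above = 1 - p_below    # P(EPS > 9.70), about 0.0322
print(z, p_below, p_above)
```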
The Lognormal Distribution
• The lognormal distribution is generated by the function e^x, where x is normally distributed.
• Because the natural logarithm, ln, of e^x is x, the logarithms of lognormally distributed random
variables are normally distributed.
• The lognormal distribution is skewed to the right.
• The lognormal distribution is bounded from below by zero, so it is useful for modeling asset
prices, which never take negative values.
Student’s t-Distribution
• Student’s t-distribution is similar to a normal distribution, but has fatter tails (i.e.,
a greater proportion of the outcomes are in the tails of the distribution).
• Use it when working with small samples (n < 30) from a population with unknown
variance and a normal, or approximately normal, distribution.
• It can also be used when the population variance is unknown and the sample size is
large enough that the central limit theorem assures the sampling distribution is
approximately normal.
Student’s t-Distribution
• It is symmetrical.
• It is defined by a single parameter, the
degrees of freedom (df): the number of
sample observations minus 1 (n − 1) for
sample means.
• It has a greater probability in the tails
(fatter tails) than the normal distribution.
• As the degrees of freedom (the sample
size) gets larger, the shape of the t-
distribution more closely approaches a
standard normal distribution.
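A small sketch (assuming SciPy) illustrating both the fatter tails and the convergence to the standard normal as df grows:

```python
from scipy.stats import norm, t

# 97.5th percentile: the t critical value shrinks toward the normal 1.96 as df grows
for df in (5, 30, 1000):
    print(df, t.ppf(0.975, df))   # about 2.57, 2.04, 1.96
print(norm.ppf(0.975))            # about 1.96
```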
Other Distributions
• The Chi-Squared Distribution
• The F-Distribution
• The Exponential Distribution
• The Beta Distribution
• Mixture distributions
Covariance
• Covariance is the expected value of the product of the deviations of
the two random variables from their respective expected values.
• Covariance measures how two variables move with each other or the
dependency between the two variables.
• Covariance is denoted Cov(X,Y) or σXY.
• Cov(X,Y) = E{[X − E(X)][Y − E(Y)]}
• Cov(X,Y) = E(XY) − E(X) × E(Y)
• EXAMPLE: Covariance
Assume that the economy can be in three possible states (S) next year: boom, normal, or
slow economic growth. An expert source has calculated that P(boom) = 0.30, P(normal) =
0.50, and P(slow) = 0.20. The returns for Stock A, RA, and Stock B, RB, under each of the
economic states are provided in the following table. What is the covariance of the returns
for Stock A and Stock B?

State | P(S) | RA | RB
Boom | 0.30 | 0.20 | 0.30
Normal | 0.50 | 0.12 | 0.10
Slow | 0.20 | 0.05 | 0.00

Answer:
E(RA) = (0.3)(0.20) + (0.5)(0.12) + (0.2)(0.05) = 0.13
E(RB) = (0.3)(0.30) + (0.5)(0.10) + (0.2)(0.00) = 0.14
Cov(RA,RB) = 0.3(0.20 − 0.13)(0.30 − 0.14) + 0.5(0.12 − 0.13)(0.10 − 0.14)
+ 0.2(0.05 − 0.13)(0.00 − 0.14) = 0.0058
Correlation
• Covariance is difficult to interpret because it depends on the scales of X1 and X2.
It can take on extremely large values, ranging from negative to positive
infinity, and, like variance, these values are expressed in terms of squared units.
Correlation is easier to interpret.
• Correlation measures the strength of the linear relationship between two variables:
Corr(X1, X2) = Cov(X1, X2) / [σ(X1) × σ(X2)]
• Correlation ranges from −1 to +1 for two variables (i.e., −1 ≤ Corr(X1, X2) ≤ +1).
ρ = +1: the two variables are perfectly positively correlated
ρ = −1: the two variables are perfectly negatively correlated
Correlation
EXAMPLE: Correlation
Using our previous example, compute and interpret the correlation of the returns for Stocks A
and B, given that σ2(RA) = 0.0028 and σ2(RB) = 0.0124 and recalling that Cov(RA,RB) = 0.0058.

Answer:
σ(RA) = √0.0028 = 0.0529
σ(RB) = √0.0124 = 0.1114
Corr(RA,RB) = 0.0058 / (0.0529 × 0.1114) = 0.98
A correlation of 0.98 means the returns of Stocks A and B have a very strong
positive linear relationship.
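Both calculations as a minimal Python sketch, using the state probabilities and returns from the covariance example:

```python
# Probability-weighted covariance and correlation for Stocks A and B
probs = [0.30, 0.50, 0.20]   # P(boom), P(normal), P(slow)
ra = [0.20, 0.12, 0.05]      # Stock A returns by state
rb = [0.30, 0.10, 0.00]      # Stock B returns by state

mean_a = sum(p * x for p, x in zip(probs, ra))   # E(RA) = 0.13
mean_b = sum(p * x for p, x in zip(probs, rb))   # E(RB) = 0.14

cov = sum(p * (a - mean_a) * (b - mean_b)
          for p, a, b in zip(probs, ra, rb))                   # 0.0058
var_a = sum(p * (a - mean_a) ** 2 for p, a in zip(probs, ra))  # 0.0028
var_b = sum(p * (b - mean_b) ** 2 for p, b in zip(probs, rb))  # 0.0124

corr = cov / (var_a ** 0.5 * var_b ** 0.5)                     # about 0.98
print(cov, corr)
```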
Sample moments (Reading 16)
• How sample moments (mean, variance, skewness, and kurtosis) are used to
estimate the true population moments for data generated from independent and
identically distributed (i.i.d.) random variables
• Sample mean: x̄ = (Σxi) / n
n: the number of observations in the sample
• Population mean: μ = (Σxi) / N
N: the number of observations in the population
Sample moments
• Biased sample variance (divides by n)
s² = Σ(xi − x̄)² / n
• Unbiased sample variance (divides by n − 1)
s² = Σ(xi − x̄)² / (n − 1)
• Population variance
σ² = Σ(xi − μ)² / N
Sample moments
• Using the entire number of sample observations, n, instead of n − 1 as the divisor
in the computation of s² systematically underestimates the population parameter,
σ², particularly for small sample sizes → this makes the sample variance a biased
estimator of the population variance.
• Using n − 1 instead of n in the denominator, however, improves the statistical
properties of s² as an estimator of σ².
• EXAMPLE: Estimating the mean, variance, and standard deviation with sample data
Assume you are evaluating the stock of Alpha Corporation. You have calculated the stock
returns for Alpha Corporation over the last five years to develop the following sample
data set. Given this information, calculate the sample mean, variance, and standard
deviation.
Data set: 24%, 34%, 18%, 54%, 10%
Answer:
Sample mean = (24 + 34 + 18 + 54 + 10) / 5 = 28%
s² = [(24 − 28)² + (34 − 28)² + (18 − 28)² + (54 − 28)² + (10 − 28)²] / (5 − 1) = 1,152 / 4 = 288(%²)
s = √288 = 16.97%
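The same calculation as a short Python sketch (ddof=1 applies the unbiased n − 1 divisor):

```python
import numpy as np

returns = np.array([24, 34, 18, 54, 10])  # Alpha Corporation returns (%)

mean = returns.mean()        # 28.0
var = returns.var(ddof=1)    # 288.0, divides by n - 1
std = returns.std(ddof=1)    # about 16.97
print(mean, var, std)
```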
Quantitative Analysis
Reading 21
Stationary Time Series
Time series
• Time series is data collected over regular time periods
• Examples: monthly S&P 500 returns, quarterly dividends paid by a
company, etc.
• Time series data have trends (the component that changes over
time), seasonality (systematic changes that occur at specific times of
the year), and cyclicality (changes occurring over time cycles).
Covariance Stationary
• To be covariance stationary, a time series must exhibit the following
three properties:
1. Its mean must be stable over time.
2. Its variance must be finite and stable over time.
3. Its covariance structure must be stable over time.

• Covariance structure refers to the covariances among the values of a
time series at its various lags: a lag is a given number of periods
apart at which we can observe the series’ values.
Autocovariance and Autocorrelation Functions
• The covariance between the current value of a time series and its
value τ periods in the past is referred to as its autocovariance at lag τ.
• Its autocovariances for all τ make up its autocovariance function. If a
time series is covariance stationary, its autocovariance function is
stable over time.
• To convert an autocovariance function to an autocorrelation function
(ACF), we divide the autocovariance at each τ by the variance of the
time series. This gives us an autocorrelation for each τ that will be
scaled between −1 and +1.
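A sketch of the conversion (assuming statsmodels is installed; the simulated series is white noise, so autocorrelations at lags τ ≥ 1 should be near zero):

```python
import numpy as np
from statsmodels.tsa.stattools import acf

rng = np.random.default_rng(42)
y = rng.normal(size=1000)   # simulated zero-mean white noise

# Dividing the autocovariance at each lag by the variance gives the ACF,
# scaled between -1 and +1; lag 0 is always exactly 1
print(acf(y, nlags=5))
```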
White noise
• A time series might exhibit zero correlation among any of its lagged values. Such a
time series is said to be serially uncorrelated.
• A special type of serially uncorrelated series is one that has a mean of zero and a
constant variance. This condition is referred to as white noise, or zero-mean
white noise, and the time series is said to follow a white noise process.
• One important use of the white noise concept is in evaluating a forecasting
model: a well-specified model’s forecast errors should follow a white noise process.
Autoregressive Processes
• The first-order autoregressive [AR(1)] process is specified in the form of a variable
regressed against itself in lagged form. This relationship can be shown in the following
formula:
yt = d + Φyt–1 + εt
where:
• d = intercept term
• yt = the time series variable being estimated
• yt–1 = one-period lagged observation of the variable being estimated
• εt = current random white noise shock (mean 0)
• Φ = coefficient for the lagged observation of the variable being estimated
• For an AR(1) process to be covariance stationary, the absolute value of the
coefficient on the lagged variable must be less than one (i.e., |Φ| < 1). Similarly, for an
AR(p) process, the absolute values of all coefficients should be less than 1.
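A minimal simulation of a covariance-stationary AR(1) process (the intercept d, coefficient Φ, and sample size are illustrative choices, not values from the reading):

```python
import numpy as np

rng = np.random.default_rng(0)
d, phi, n = 1.0, 0.8, 500   # |phi| < 1 keeps the process stationary

y = np.zeros(n)
y[0] = d / (1 - phi)        # start at the unconditional mean d / (1 - phi) = 5
for t in range(1, n):
    eps = rng.normal()                # white noise shock
    y[t] = d + phi * y[t - 1] + eps   # AR(1): y_t = d + phi*y_{t-1} + eps_t

print(y.mean())             # hovers near the unconditional mean of 5
```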
Autoregressive Processes
• Autoregressive model predicts future values based on past values.
• For example, an autoregressive model might seek to predict a stock's future prices
based on its past performance.
• Based on the assumption that past values have an effect on current values.
• For example, an investor using an autoregressive model to forecast stock prices
would need to assume that new buyers and sellers of that stock are influenced by
recent market transactions when deciding how much to offer or accept for the
security.
• This assumption does not always hold.
• For example, in the years prior to the 2008 Financial Crisis, most investors were not
aware of the risks posed by the large portfolios of mortgage-backed securities held
by many financial firms. During those times, an investor using an autoregressive
model to predict the performance of U.S. financial stocks would have had good
reason to predict an ongoing trend of stable or rising stock prices in that sector.
Moving average process
• An MA process is a linear regression of the current values of a time series against both
the current and previous unobserved white noise error terms, which are random shocks.
MAs are always covariance stationary.
• The first-order moving average [MA(1)] process can be defined as:
yt = μ + θεt−1 + εt
where:
• μ​= mean of the time series
• εt = current random white noise shock (mean 0)
• εt−1 = one-period lagged random white noise shock
• θ = coefficient for the lagged random shock
• The MA(1) process is considered to be first-order because it has only one lagged error
term (εt−1). This gives the process a very short memory, because it only incorporates
what happened one period ago.
Moving average process
• Example of daily demand for ice cream (yt):
yt = 5,000 + 0.3εt−1 + εt
• The error term is the unexpected daily shock to demand.
• Using only the current period’s shock (εt): if today’s shock is positive, then
estimated daily demand for ice cream rises above its mean of 5,000.
• If yesterday’s shock (εt−1) was also positive, it carries over into today’s
demand, scaled by the coefficient 0.3.
• If the coefficient θ is negative, the series aggressively mean reverts, because the
effect of the previous shock reverses in the current period.
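A short simulation of this MA(1) ice cream example (the shock scale is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(7)
mu, theta, n = 5000.0, 0.3, 365        # mean demand, MA(1) coefficient, days

eps = rng.normal(scale=200.0, size=n)  # illustrative white noise shocks (mean 0)
y = np.empty(n)
y[0] = mu + eps[0]                     # no lagged shock on the first day
for t in range(1, n):
    y[t] = mu + theta * eps[t - 1] + eps[t]  # y_t = mu + theta*eps_{t-1} + eps_t

print(y.mean())                        # close to 5,000: MA processes are stationary
```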
Quantitative Analysis
Reading 22
Non-Stationary Time Series
Time Trends
• Non-stationary time series may exhibit deterministic trends,
stochastic trends, or both.
• Deterministic trends include both time trends and deterministic
seasonality.
• Stochastic trends include unit root processes such as random walks.
Time Trends
• Time trends may be linear or nonlinear.
• Linear trend model
yt = β0 + β1t + εt
• Log-linear model
ln(yt) = β0 + β1t + εt
• Nonlinear, e.g., the log-quadratic model
ln(yt) = β0 + β1t + β2t² + εt
Seasonality
• Seasonality in a time series is a pattern that tends to repeat from year to year.
• Example: monthly sales data for a retailer. Because sales data normally vary with the calendar,
we might expect this month’s sales (xt) to be related to sales for the same month last year (xt−12).
• Specific examples of seasonality relate to increases that occur at only certain
times of the year.
• Example: purchases of retail goods typically increase dramatically every year in the weeks leading up to
Christmas. Similarly, sales of gasoline generally increase during the summer months when people take
more vacations.
• Weather is another common example of a seasonal factor as production of agricultural commodities is
heavily influenced by changing seasons and temperatures.
• Seasonality in a time series can also refer to cycles shorter than a year.
• Example: Calendar effects (January effects)
• An effective technique for modeling seasonality is to include seasonal dummy
variables in a regression, as sketched below.
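A minimal sketch of seasonal dummies in a trend regression, using plain NumPy least squares (the simulated monthly series and its December bump are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
months = 12
t = np.arange(6 * months)   # six years of monthly observations

# Simulated series: linear trend plus a December (month index 11) sales bump
y = 100 + 0.5 * t + 30 * (t % months == 11) + rng.normal(scale=5, size=t.size)

# Design matrix: intercept, time trend, and 11 monthly dummies
# (January is omitted as the base month to avoid perfect collinearity)
dummies = np.column_stack([(t % months == m).astype(float) for m in range(1, months)])
X = np.column_stack([np.ones(t.size), t, dummies])

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta[1])    # trend estimate, near 0.5
print(beta[-1])   # December dummy estimate, near 30
```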
Unit roots
• We describe a time series as a random walk if its value in any given period is its
previous value plus-or-minus a random “shock.” Symbolically, we state this as
yt = yt−1 + εt.
• It follows logically that the same was true in earlier periods:
yt−1 = yt−2 + εt−1
yt−2 = yt−3 + εt−2 and so forth
y1 = y0 + ε1.
• If we substitute these (recursively) back into yt = yt−1 + εt, we eventually get:
yt = y0 + ε1 + ε2 + … + εt−2 + εt−1 + εt.
That is, any observation in the series is a function of the beginning value and all the
past shocks, as well as the shock in the observation’s own period.
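A quick simulation (illustrative parameters) showing that a random walk’s dispersion grows with time, which is why it is not covariance stationary:

```python
import numpy as np

rng = np.random.default_rng(3)
n_paths, n_steps = 10000, 200

# Each path follows y_t = y_{t-1} + eps_t, starting from y_0 = 0
paths = rng.normal(size=(n_paths, n_steps)).cumsum(axis=1)

# The cross-sectional variance at time t grows roughly linearly: var = t * sigma^2
for t in (10, 50, 200):
    print(t, paths[:, t - 1].var())   # approximately 10, 50, 200
```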
Random walk theory
• Random walk theory suggests that changes in stock prices have the same
distribution and are independent of each other.
• Therefore, it assumes the past movement or trend of a stock price or market
cannot be used to predict its future movement.
• In short, random walk theory proclaims that stocks take a random and
unpredictable path that makes all methods of predicting stock prices futile in the
long run.
Unit roots
• A key property of a random walk is that its variance increases with time. This
implies a random walk is not covariance stationary, so we cannot model one
directly with AR, MA, or ARMA techniques.
• A random walk is a special case of a wider class of time series known as unit root
processes.
• The most common way to test a series for a unit root is with an augmented
Dickey-Fuller (ADF) test, sketched below.
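A sketch of the ADF test (assuming statsmodels is installed), applied to a simulated random walk, where we would expect to fail to reject the unit root null:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(11)
random_walk = rng.normal(size=500).cumsum()   # unit root process

# Null hypothesis of the ADF test: the series has a unit root
stat, pvalue, *_ = adfuller(random_walk)
print(stat, pvalue)   # a large p-value means we cannot reject the unit root null
```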
