Empirical Methods For Finance: Sjoerd Van Den Hauwe

The document discusses principal components analysis (PCA) as a method to reduce the dimensionality of correlated explanatory variables and extract common signals from imperfect proxies. It provides an example of using PCA to compute investor sentiment from a set of market-based sentiment proxies. Specifically, it (1) standardizes and transforms the proxies, (2) computes principal components from the correlation matrix to obtain loadings and eigenvectors, and (3) uses the top component to represent the common sentiment signal shared among the proxies.


Empirical Methods for Finance

(FEM11198-21)

Sjoerd van den Hauwe

[email protected]

Department of Business Economics – Finance

Erasmus School of Economics
Erasmus Universiteit Rotterdam

Master’s in Financial Economics

Week 7, December 8, 2021
Outline Week 7

Part I: Principal components analysis

- Motivation
- Investor sentiment
- Computation
- Number of components

Part II: Q&A

Motivation

All methods covered so far take the set of regressors as given.

Two issues might arise:
- Regressors show strong mutual correlation (week 3).
- Regressors are imperfect proxies for the financial quantity of interest.

Remedies:
- Summarize the regressors in fewer variables (dimension reduction).
- Combine proxies to extract a stronger signal of the financial quantity.

→ Principal components analysis (PCA) can do both jobs.

Setting

For each firm i (i = 1, 2, ..., T) we have

- A set of explanatory variables $x_{j,i}$ (j = 1, 2, ..., J),
- each potentially relevant for the dependent variable $y_i$.

Pairs of regressors $(x_{j_1}, x_{j_2})$ can

- Be strongly correlated (check sample correlations; VIFs).
- Share a common component (economics/finance theory).

PCA: Construct linear combinations of the regressors such that

- Much of the variation in the regressor set is captured by a few combinations.
- These linear combinations are mutually uncorrelated (orthogonal).

Regressor Combination

We stack the regressor observations per firm i:

- $x_i = (x_{1,i}, x_{2,i}, \ldots, x_{J,i})'$
- No constant (= a regressor that does not vary).

Collect these firm regressor vectors in

$$X = \begin{pmatrix} x_1' \\ x_2' \\ \vdots \\ x_T' \end{pmatrix} \quad \text{(see week 1)},$$

a T × J matrix.

A linear combination of the J regressors is

$$z = Xa = \Big( \sum_{j=1}^{J} x_{j,1} a_j,\; \sum_{j=1}^{J} x_{j,2} a_j,\; \ldots,\; \sum_{j=1}^{J} x_{j,T} a_j \Big)',$$

where $a = (a_1, a_2, \ldots, a_J)'$ is a J × 1 vector of linear combination weights.

Principal Components

PCA finds the linear combinations $z_k = X a_k$ (k = 1, 2, ..., J) such that

- The total variation of all the $z_k$ equals the total regressor variation.
- All combinations are mutually uncorrelated: $z_{k_1}' z_{k_2} = 0$ for $k_1 \neq k_2$.

→ These linear combinations, ordered (descending) by their sample variance, are called the sample principal components.

Hence, the sample variance of the first principal component $z_1$ is at least as large as that of the second, $z_2$, and so on.
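
The two defining properties can be checked numerically. The sketch below is illustrative only: the data are synthetic, and it assumes the weight vectors $a_k$ are the unit-length eigenvectors of $X'X$ on demeaned regressors, which is in line with the computation described later in these slides. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, illustrative only
T, J = 500, 4

# Synthetic regressors sharing one common component (an assumption for this demo).
common = rng.normal(size=(T, 1))
X = common + 0.8 * rng.normal(size=(T, J))
X = X - X.mean(axis=0)                  # demean, so z'z is proportional to a sample variance

# Weights a_k: unit-length eigenvectors of X'X, sorted by descending eigenvalue.
eigvals, eigvecs = np.linalg.eigh(X.T @ X)
order = np.argsort(eigvals)[::-1]
A = eigvecs[:, order]

Z = X @ A                               # column k is z_k = X a_k

total_var_x = X.var(axis=0, ddof=1).sum()   # total regressor variation
var_z = Z.var(axis=0, ddof=1)               # sample variance of each combination
cross = Z.T @ Z                             # off-diagonal entries are z_k1' z_k2

print("total variance of X:", total_var_x)
print("total variance of Z:", var_z.sum())
print("variances of z_k (descending):", var_z)
```

The off-diagonal entries of `cross` are zero up to floating-point error, and the variances of the combinations add up to the total variance of the regressors.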

Investor Sentiment Example

Baker & Wurgler (2006, 2007) [BW]:

- Study the relation between investor sentiment and stock prices.

BW define investor sentiment as “[investor’s] propensity to speculate.”

There are no definitive or uncontroversial measures of sentiment →

- Construct a composite index.
- Use the common variation in a set of sentiment proxies.

Proxies can be

- Survey based (e.g., confidence indexes)
- Market based (trading characteristics)

Both types can be handled with PCA, which extracts a common component that is then defined as investor sentiment.

Market-Based Proxies

BW consider 5 variables, each containing a part that is representative of investor sentiment.

Monthly data on (acronym; sign of relation with sentiment)

- Number of initial public offerings (IPOs) (NIPO; +)
- First-day IPO returns: monthly average (RIPO; +)
- Closed-end fund discount: average difference between net asset value per share and market price (CEFD; –)
- Equity share in new issues: volume of new equity issues relative to new equity + long-term debt issues (ENI; +)
- Dividend premium: log difference between the value-weighted average market-to-book (M2B) ratios of dividend payers and non-payers (PDND; –)

All 5 contain a sentiment component and an idiosyncratic (= non-sentiment) part.

Transformations

Check regressors for

- Relative timing: some lead/lag others (time-series data)
- Trending behavior: same reasons as in week 3 (time-series data)
- Aberrant observations

→ Check plots.

Additionally: In most cases we standardize the $x_j$’s.

- If one regressor’s sample variance dominates the others’, the first principal component will essentially be that regressor.

→ Standardization removes the impact of the regressors’ scale.

- It is the scale-free mutual correlation we need.
- PCA combines those parts of the regressors that co-move to form a linear combination with maximum sample variance.
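
A small numerical illustration of why standardization matters. This is a sketch on synthetic data, not the course data: two regressors share a common component, while a third is unrelated but measured on a much larger scale. The helper `first_loading` is hypothetical and simply returns the unit-length eigenvector belonging to the largest eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 1000
common = rng.normal(size=T)
x1 = common + rng.normal(size=T)        # shares the common component
x2 = common + rng.normal(size=T)        # shares the common component
x3 = 100.0 * rng.normal(size=T)         # unrelated, but on a huge scale
X = np.column_stack([x1, x2, x3])

def first_loading(M):
    """Unit-length eigenvector belonging to the largest eigenvalue of M."""
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, np.argmax(vals)]

load_cov = first_loading(np.cov(X, rowvar=False))       # no standardization
load_cor = first_loading(np.corrcoef(X, rowvar=False))  # standardized regressors

print("first loading, covariance matrix :", np.round(load_cov, 3))
print("first loading, correlation matrix:", np.round(load_cor, 3))
```

Without standardization the first component loads almost entirely on `x3`; with standardization it loads mainly on the two co-moving regressors. Note that the sign of an eigenvector is arbitrary, so only the relative magnitudes are informative here.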

Example

Investor sentiment: Monthly data for January 1967–December 2018 (T = 624)

Timing:
Some proxies lead/lag the others (from inspecting plots and cross-correlations):
- Dividend premium (PDND) leads by 12 months.
- RIPO leads by 12 months.

Transformations:
- 12-month moving averages to remove noise.
- All 5 proxies standardized.
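
The slides do not show how these transformations are implemented, so the following is only one possible sketch. It assumes a trailing 12-month moving average, a 12-month lag for the leading proxies (consistent with the t−12 subscripts in the first-component formula), and standardization applied last. The function names and the order of the steps are assumptions, and the input series is synthetic.

```python
import numpy as np

def moving_average(x, window=12):
    """Trailing moving average; the first window-1 entries are NaN."""
    out = np.full(len(x), np.nan)
    csum = np.cumsum(np.insert(x, 0, 0.0))
    out[window - 1:] = (csum[window:] - csum[:-window]) / window
    return out

def lag(x, k=12):
    """Shift a series k periods back: entry t holds x[t-k]."""
    out = np.full(len(x), np.nan)
    out[k:] = x[:-k]
    return out

def standardize(x):
    """Zero mean and unit sample variance, ignoring NaNs created by the steps above."""
    return (x - np.nanmean(x)) / np.nanstd(x, ddof=1)

rng = np.random.default_rng(3)
raw = rng.normal(size=120)                          # stand-in for a raw monthly proxy
proxy = standardize(lag(moving_average(raw), 12))   # smooth, lag 12 months, standardize
```

Observations lost to the moving average and the lag show up as NaN at the start of the series and would have to be dropped before computing the correlation matrix.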

Inversely Related Proxies
[Figure: the 2 transformed proxies inversely related to sentiment (CEFD, PDND), with recession periods indicated, Jan. 1967–Dec. 2018.]

Positively Related Proxies
[Figure: the 3 transformed proxies positively related to sentiment (RIPO, NIPO, ENI), with recession periods indicated, Jan. 1967–Dec. 2018.]

Eigenvalues and Eigenvectors

A square (q × q) matrix A has q

- Eigenvalues (scalars) $\lambda_i$, i = 1, ..., q
- Associated eigenvectors (q × 1 vectors) $e_i$.

Each eigenvalue-eigenvector pair $(\lambda_i, e_i)$ has the characteristic that

$$A e_i = \lambda_i e_i$$

→ Matrix times eigenvector equals eigenvector multiplied by the eigenvalue.

Note:
- Eigenvectors are known only up to a proportionality factor.
- If $e_i$ is an eigenvector, then so is $c \cdot e_i$ for any scalar c ≠ 0.
- We normalize the eigenvectors to unit length, such that $e_i' e_i = 1$.
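
As a quick check of these properties, here is a minimal sketch with a small symmetric 2 × 2 matrix chosen purely for illustration. It uses `numpy.linalg.eigh`, which is intended for symmetric matrices and returns eigenvalues in ascending order together with unit-length eigenvectors.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                 # small symmetric example matrix

eigvals, eigvecs = np.linalg.eigh(A)       # ascending eigenvalues, eigenvectors in columns
lam, e = eigvals[-1], eigvecs[:, -1]       # largest eigenvalue and its eigenvector

residual = A @ e - lam * e                 # A e = lambda e, so this is numerically zero
scaled_residual = A @ (3.0 * e) - lam * (3.0 * e)   # c * e is an eigenvector as well
length = e @ e                             # normalization: e'e = 1

print("largest eigenvalue:", lam)
print("residual:", residual, "length:", length)
```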

Computing Principal Components

If the regressor variables are standardized, then

$$R \equiv \frac{1}{T-1} X'X$$

is the matrix of sample correlations.

Employing PCA:

1. Compute all eigenvalue-eigenvector pairs $(\lambda_k, e_k)$ of R (k = 1, 2, ..., J).
   - Sort the eigenvalues in descending order: $\lambda_1 \geq \lambda_2 \geq \ldots \geq \lambda_J$.
   - Eigenvectors are normalized: $e_k' e_k = 1$.

2. The kth sample principal component (SPC) is the T × 1 vector $z_k = X e_k$ → the ith observation of the kth SPC is $z_{k,i} = x_i' e_k$.
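
The two steps above can be sketched in a few lines of NumPy. This is an illustrative implementation under stated assumptions, not the code used for the course example: the function name `sample_principal_components` is hypothetical, and the input is synthetic data rather than the sentiment proxies.

```python
import numpy as np

def sample_principal_components(X_raw):
    """Return (eigenvalues, loadings, SPCs) based on the sample correlation matrix."""
    T = X_raw.shape[0]
    X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0, ddof=1)  # standardize
    R = X.T @ X / (T - 1)                                         # sample correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)                          # ascending order
    order = np.argsort(eigvals)[::-1]                             # re-sort descending
    lam, E = eigvals[order], eigvecs[:, order]
    Z = X @ E                                                     # column k is z_k = X e_k
    return lam, E, Z

rng = np.random.default_rng(42)
X_raw = rng.normal(size=(300, 1)) + rng.normal(size=(300, 5))     # synthetic, T = 300, J = 5
lam, E, Z = sample_principal_components(X_raw)

print("eigenvalues:", np.round(lam, 3))
print("sample variances of the SPCs:", np.round(Z.var(axis=0, ddof=1), 3))
```

The printed sample variances of the SPCs coincide with the eigenvalues, and the eigenvalues sum to J. Because eigenvectors are only determined up to sign, software may return a component with all signs flipped.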

Correlation Matrix

Investor sentiment example January 1967–December 2018 (T = 624)

Correlation matrix R (lower triangle shown; R is symmetric):

        CEFD    NIPO    PDND    RIPO    ENI
CEFD    1
NIPO   −0.35    1
PDND    0.61   −0.53    1
RIPO   −0.24    0.29   −0.51    1
ENI     0.25    0.27   −0.04    0.15    1

→ Finance interpretation: Mutual correlation is predominantly due to the sentiment component the proxies share.

→ PCA to extract this common component: The first SPC summarizes investor sentiment.

Loadings

Eigenvalues of the sample correlation matrix R in descending order:

λ1      λ2      λ3      λ4      λ5
2.30    1.25    0.73    0.42    0.29

The associated eigenvectors are called the SPCs’ loadings (rows ordered CEFD, NIPO, PDND, RIPO, ENI):

        e1       e2       e3       e4       e5
CEFD   −0.473    0.467    0.208    0.489   −0.525
NIPO    0.482    0.256   −0.540    0.621    0.159
PDND   −0.589    0.081   −0.028    0.200    0.778
RIPO    0.437    0.182    0.805    0.213    0.287
ENI     0.080    0.822   −0.126   −0.538    0.109
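
The reported eigenvalues and first loading vector can be roughly reproduced from the rounded correlation matrix shown on the previous slide. The sketch below is a sanity check only: because the correlations are rounded to two decimals, the results agree with the slide values only approximately. The sign flip at the end is an assumption made to match the slide's sign convention.

```python
import numpy as np

# Rounded sample correlations from the slides; order: CEFD, NIPO, PDND, RIPO, ENI.
lower = [
    [1.00],
    [-0.35, 1.00],
    [0.61, -0.53, 1.00],
    [-0.24, 0.29, -0.51, 1.00],
    [0.25, 0.27, -0.04, 0.15, 1.00],
]
R = np.eye(5)
for i, row in enumerate(lower):
    for j, r in enumerate(row):
        R[i, j] = R[j, i] = r              # fill both triangles symmetrically

eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]          # descending eigenvalues
lam, E = eigvals[order], eigvecs[:, order]

# Eigenvector signs are arbitrary; flip so that NIPO loads positively on the
# first component, as on the slide.
if E[1, 0] < 0:
    E[:, 0] = -E[:, 0]

print("eigenvalues:", np.round(lam, 2))
print("first loading vector:", np.round(E[:, 0], 3))
```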

Interpretation

Check how the kth SPC $z_k$ loads on the original variables $x_j$ (j = 1, ..., J).

- As the $x_j$ are standardized, the sample correlation of SPC and regressor is

$$\widehat{\mathrm{cor}}(z_k, x_j) = e_{k,j} \sqrt{\lambda_k}.$$

- Example: First sample principal component = investor sentiment:

$$z_{1,t} = -0.473 \cdot \mathrm{CEFD}_t + 0.482 \cdot \mathrm{NIPO}_t - 0.589 \cdot \mathrm{PDND}_{t-12} + 0.437 \cdot \mathrm{RIPO}_{t-12} + 0.080 \cdot \mathrm{ENI}_t.$$

- Correlation of the 5 sentiment proxies with investor sentiment:

CEFD    NIPO    PDND    RIPO    ENI
−0.72   0.73    −0.89   0.66    0.12
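
The correlations in the table follow directly from the first loading vector and the first eigenvalue. The reasoning is that the sample covariance of $z_k$ and $x_j$ is the jth element of $R e_k = \lambda_k e_k$, while $z_k$ has sample variance $\lambda_k$ and $x_j$ has variance 1. A minimal sketch using the rounded numbers reported in the slides:

```python
import numpy as np

names = ["CEFD", "NIPO", "PDND", "RIPO", "ENI"]
e1 = np.array([-0.473, 0.482, -0.589, 0.437, 0.080])   # first loading vector (slides)
lambda1 = 2.30                                          # first eigenvalue (slides)

cor_z1_x = e1 * np.sqrt(lambda1)    # cor(z_1, x_j) = e_{1,j} * sqrt(lambda_1)

for name, c in zip(names, cor_z1_x):
    print(f"{name}: {c:.2f}")
```

Up to rounding, the output matches the correlations listed above.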

Investor Sentiment Index
[Figure: time-series plot of the Baker & Wurgler investor sentiment index (PC1), with recession periods indicated, Jan. 1967–Dec. 2018.]

Number of Components

In the example we have a clear financial interpretation of the first component.

How many components to select in general?

- If the goal is to summarize much of the total variation by a few components: select the first p that capture a relatively large share.

If regressors are standardized →

- Total sample variance equals the sum of the diagonal elements of R: J · 1 = J.
- Sample variance of the kth SPC equals $e_k' R e_k = \lambda_k$.
- Hence, the kth SPC accounts for $\lambda_k / J \times 100\%$ of total sample variance.

As the SPCs are uncorrelated, the first p explain $J^{-1} \sum_{k=1}^{p} \lambda_k \times 100\%$.
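
Applied to the eigenvalues of the investor sentiment example, the formula gives the share of total sample variance explained by each SPC and by the first p SPCs together. This is a short sketch using the rounded eigenvalues from the slides, so the eigenvalues sum to 4.99 rather than exactly J = 5.

```python
import numpy as np

lam = np.array([2.30, 1.25, 0.73, 0.42, 0.29])   # eigenvalues from the slides (rounded)
J = lam.size

share = lam / J * 100                  # percentage explained by the kth SPC
cumulative = np.cumsum(lam) / J * 100  # percentage explained by the first p SPCs

for k, (s, c) in enumerate(zip(share, cumulative), start=1):
    print(f"SPC {k}: {s:5.1f}%   first {k}: {c:5.1f}%")
```

The first component alone accounts for close to half of the total sample variance in this example.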

Choosing Components

How to choose the SPCs depends on the financial application.

1. Summarizing much of the total sample variance in a few SPCs (multicollinearity)
   - Make a scree plot of $\lambda_k$ versus k.
   - Look for an “elbow”: eigenvalues are more or less of equal size after the elbow.

2. Extracting a common component with a financial interpretation
   - Compute the loadings of the SPCs on the regressors.
   - Look for a pattern in the loadings/check plots.

The relation between the regressors and $y_i$ is NOT taken into account by PCA.

- The last SPC can have the highest sample correlation with the regressand $y_i$!

Scree Plot
Cumulative percentage of total sample variance explained by the first k SPCs

k    1    2    3    4    5
%    46   71   86   94   100

[Figure: scree plot of the ordered eigenvalues $\lambda_k$ against k = 1, ..., 5.]

Principal Components Regression

Observations $z_{k,i}$ on the first p SPCs can serve as regressors in a model for $y_i$,

$$y_i = \beta_0 + \sum_{k=1}^{p} \beta_k z_{k,i} + \varepsilon_i.$$

- Much of the variation of the original regressors is retained.
- New regressors are mutually uncorrelated by construction.
- Finance interpretation: Examine the correlation between the relevant SPCs in the regression and the original regressors.
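
A principal components regression along these lines can be sketched as follows. The example is hedged in several ways: the data are synthetic, the choice p = 2 is arbitrary, and plain least squares via `numpy.linalg.lstsq` stands in for whatever estimation routine is used in practice.

```python
import numpy as np

rng = np.random.default_rng(7)
T, J, p = 400, 5, 2

# Synthetic data: regressors share a common component, and y depends on it.
common = rng.normal(size=T)
X_raw = common[:, None] + rng.normal(size=(T, J))
y = 1.0 + 0.5 * common + rng.normal(size=T)

# Standardize, then take the loadings from the sample correlation matrix.
X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0, ddof=1)
eigvals, eigvecs = np.linalg.eigh(X.T @ X / (T - 1))
E = eigvecs[:, np.argsort(eigvals)[::-1]]          # descending eigenvalue order
Z = X @ E[:, :p]                                   # observations on the first p SPCs

W = np.column_stack([np.ones(T), Z])               # intercept plus the p SPCs
beta, *_ = np.linalg.lstsq(W, y, rcond=None)       # OLS estimates (beta_0, ..., beta_p)

corr_Z = np.corrcoef(Z, rowvar=False)              # off-diagonal entries are numerically zero

print("estimated coefficients:", np.round(beta, 3))
```

Since the SPCs have zero mean, the estimated intercept equals the sample mean of y. The components are chosen without reference to y, so a low-variance SPC that is left out may still be informative about the regressand.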

Example: Baker & Wurgler use investor sentiment

- To trade on it.
- As a factor (= regressor) in an empirical asset-pricing model.

Course Conclusions

A range of commonly applied empirical methods for finance → to analyze

- Time-series data
- Cross-sectional data
- Panel data

Scheme for doing empirical work in finance

- Research question (RQ) → empirical (regression) model
- Estimate model parameters
- Check model assumptions (diagnostics)
- Adjust model/use robust s.e.’s if need be
- Test hypotheses (t and F testing)

Last Words . . .

This course is epitomized in the “Last Words” by Angrist & Pischke (2009, p. 327):

“If applied econometrics were easy, theorists would do it. But it’s not as hard as the dense pages of Econometrica might lead you to believe. Carefully applied to coherent causal questions, regression and 2SLS almost always make sense. Your standard errors probably won’t be quite right, but they rarely are. Avoid embarrassment by being your own best skeptic, and especially, DON’T PANIC!”

Joshua D. Angrist and Jörn-Steffen Pischke (2009), Mostly Harmless Econometrics, Princeton, NJ: Princeton University Press.

Course Material Week 7

- Slides

Principal components analysis

- Brooks: Appendix 4.2

Background Material Week 7

Principal Components Analysis and Investor Sentiment

- M. Baker & J. Wurgler (2006), Investor Sentiment and the Cross-Section of Stock Returns, Journal of Finance, 61, pp. 1645–1680
- M. Baker & J. Wurgler (2007), Investor Sentiment in the Stock Market, Journal of Economic Perspectives, 21, pp. 129–151

Q&A

Any questions on the 7 lectures or course material?
