
Correlation analysis 1: Canonical correlation analysis

Ryan Tibshirani
Data Mining: 36-462/36-662

February 14 2013

1
Review: correlation
Given two random variables X, Y ∈ R, the (Pearson) correlation
between X and Y is defined as

    Cor(X, Y) = Cov(X, Y) / (√Var(X) √Var(Y))

Recall that

    Cov(X, Y) = E[(X − E[X])(Y − E[Y])]

and

    Var(X) = E[(X − E[X])²] = Cov(X, X)

This measures the linear association between X and Y. Properties:

- −1 ≤ Cor(X, Y) ≤ 1
- X, Y independent ⇒ Cor(X, Y) = 0 (Homework 2)
- Cor(X, Y) = 0 ⇏ X, Y independent (Homework 2)

More on this later ...
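As a quick illustration of the last property (a simulated sketch in R,
not from the slides): take X standard normal and Y = X². Then Y is
completely determined by X, yet Cov(X, Y) = E[X³] = 0:

set.seed(1)
x = rnorm(100000)   # draws of X ~ N(0, 1)
y = x^2             # Y is a deterministic function of X
cor(x, y)           # sample correlation is near 0 despite total dependence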
2
Review: sample correlation
Given centered x, y ∈ Rn, the sample correlation between x and y
is defined as

    cor(x, y) = xᵀy / (√(xᵀx) √(yᵀy))

Note the analogy to the definition on the last slide—we just
replace everything by its sample version. I.e., if we write cov and
var for the sample covariance and variance, then

    cor(x, y) = cov(x, y) / (√var(x) √var(y))

Note: if x, y ∈ Rn are centered unit vectors, then cor(x, y) = xᵀy

This measures the linear association between x and y. Properties:

- −1 ≤ cor(x, y) ≤ 1
- cor(x, y) = 0 ⇔ x, y are orthogonal
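A quick numerical check of the note above (an illustrative R sketch):
centering x and y and scaling them to unit norm makes cor(x, y)
exactly the inner product:

set.seed(1)
x = rnorm(20)
y = rnorm(20)
xc = (x - mean(x)) / sqrt(sum((x - mean(x))^2))   # centered unit vector
yc = (y - mean(y)) / sqrt(sum((y - mean(y))^2))   # centered unit vector
cor(x, y)      # base R's sample correlation
sum(xc * yc)   # the inner product xc^T yc: identical value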
3
Canonical correlation analysis
Principal component analysis attempts to answer the question:
“which directions account for much of the observed variance in a
data set?” Given a centered matrix X ∈ Rn×p, we first find the
direction v1 ∈ Rp to maximize the sample variance of Xv:

    v1 = argmax_{‖v‖₂=1} var(Xv)

Canonical correlation analysis is similar, but instead attempts to
answer: “which directions account for much of the covariance
between two data sets?” Now we are given two centered matrices
X ∈ Rn×p, Y ∈ Rn×q, and we seek the two directions α1 ∈ Rp,
β1 ∈ Rq that maximize the sample covariance of Xα and Yβ:

    α1, β1 = argmax_{‖Xα‖₂=1, ‖Yβ‖₂=1} cov(Xα, Yβ)

Subject to the constraints, this is equivalent to maximizing
cor(Xα, Yβ). (Why?)
4
Canonical directions and variates
The first canonical directions α1 ∈ Rp, β1 ∈ Rq are given by

    α1, β1 = argmax_{‖Xα‖₂=1, ‖Yβ‖₂=1} (Xα)ᵀ(Yβ)

The vectors Xα1, Yβ1 ∈ Rn are called the first canonical variates,
and ρ1 = (Xα1)ᵀ(Yβ1) ∈ R is called the first canonical correlation

Given the first k − 1 directions, the kth canonical directions
αk ∈ Rp, βk ∈ Rq are defined as

    αk, βk = argmax (Xα)ᵀ(Yβ)
    subject to ‖Xα‖₂ = 1, ‖Yβ‖₂ = 1,
               (Xα)ᵀ(Xαj) = 0, j = 1, ..., k − 1,
               (Yβ)ᵀ(Yβj) = 0, j = 1, ..., k − 1

The vectors Xαk, Yβk ∈ Rn are called the kth canonical variates,
and ρk = (Xαk)ᵀ(Yβk) ∈ R is called the kth canonical correlation

5
Example: scores data
Example: n = 88 students took tests in each of 5 subjects:
mechanics, vectors, algebra, analysis, statistics. (From Mardia et
al. (1979) “Multivariate analysis”.) Each test is out of 100 points

The tests on mechanics and vectors were closed book, and those on
algebra, analysis, and statistics were open book. There's clearly
some correlation between these two sets of scores:

         alg    ana    sta
mec    0.547  0.409  0.389
vec    0.610  0.485  0.436

Canonical correlation analysis attempts to explain this phenomenon
using the variables in each set jointly. Here X contains the closed
book test scores and Y contains the open book test scores, so
X ∈ R88×2 and Y ∈ R88×3

6
The first canonical directions (multiplied by 10³):

    α1 = (2.770, 5.517)ᵀ          (mec, vec)
    β1 = (8.782, 0.860, 0.370)ᵀ   (alg, ana, sta)
The first canonical correlation is ρ1 = 0.663, and the variates:

[Scatterplot of the first canonical variates: Yβ1 against Xα1]

The second directions are more surprising, but ρ2 = 0.041

7
How many canonical directions are there?

We have X ∈ Rn×p and Y ∈ Rn×q. How many pairs of canonical
directions (α1, β1), (α2, β2), ... are there?

We know that any n orthogonal (hence linearly independent) vectors
in Rn form a basis for Rn. Therefore there cannot be more than p
orthogonal vectors of the form Xα, α ∈ Rp, and q orthogonal
vectors of the form Yβ, β ∈ Rq. (Why?)

Hence there are exactly r = min{p, q} canonical directions
(α1, β1), ..., (αr, βr)¹

¹ This assumes that n ≥ p and n ≥ q. In general, there are actually
only r = min{rank(X), rank(Y)} canonical directions
8
Transforming the problem

If A ∈ Rp×p, B ∈ Rq×q are invertible, then computing

    α̃1, β̃1 = argmax_{‖XAα̃‖₂=1, ‖Y Bβ̃‖₂=1} (XAα̃)ᵀ(Y Bβ̃)

is equivalent to the first step of canonical correlation analysis. In
particular, the first canonical directions are given by α1 = Aα̃1 and
β1 = Bβ̃1. The same is also true of further directions

I.e., we can transform our data matrices to be X̃ = XA, Ỹ = Y B
for any invertible A, B, solve the canonical correlation problem
with X̃, Ỹ, and then back-transform to get our desired answers

Why would we ever do this? Because there is a transformation
A, B that makes the computational problem simpler

9
Sphering
For any symmetric positive definite matrix A ∈ Rn×n, there is a
matrix A^(1/2) ∈ Rn×n, called the (symmetric) square root of A,
such that A^(1/2) A^(1/2) = A

We write the inverse of A^(1/2) as A^(−1/2). Note that
A^(−1/2) A A^(−1/2) = I. (Why?)

Given centered matrices X ∈ Rn×p and Y ∈ Rn×q,² we define
VX = XᵀX ∈ Rp×p and VY = YᵀY ∈ Rq×q. Then

    X̃ = X VX^(−1/2) ∈ Rn×p and Ỹ = Y VY^(−1/2) ∈ Rn×q

are called the sphered versions of X and Y.³ Note that the sample
covariances of X̃ and Ỹ are

    cov(X̃) = I/n and cov(Ỹ) = I/n

² Here we are assuming that rank(X) = p and rank(Y) = q
³ Alternatively, for sphering we would sometimes define VX = (XᵀX)/n
and VY = (YᵀY)/n, so that the transformed sample covariances are
exactly I
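A minimal sketch of this step in R (sym_inv_sqrt is our own helper,
not a built-in), assuming X is centered with rank(X) = p:

sym_inv_sqrt = function(V) {
  # symmetric inverse square root via the eigendecomposition V = U diag(lam) U^T
  e = eigen(V, symmetric = TRUE)
  e$vectors %*% diag(1 / sqrt(e$values), nrow = length(e$values)) %*% t(e$vectors)
}
Xtilde = X %*% sym_inv_sqrt(crossprod(X))   # sphered X; crossprod(X) = X^T X
round(crossprod(Xtilde), 10)                # the identity, so cov(Xtilde) = I/n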
10
Transforming the problem (continued)
As suggested by the previous slide, we will take X̃ = X VX^(−1/2)
and Ỹ = Y VY^(−1/2), and we'll solve the problem

    α̃1, β̃1 = argmax_{‖X̃α̃‖₂=1, ‖Ỹβ̃‖₂=1} (X̃α̃)ᵀ(Ỹβ̃)

Recall that then α1 = VX^(−1/2) α̃1 and β1 = VY^(−1/2) β̃1.

So why is this simpler? Note that the constraint says

    1 = (X̃α̃)ᵀ(X̃α̃) = α̃ᵀ VX^(−1/2) XᵀX VX^(−1/2) α̃ = α̃ᵀα̃

i.e., ‖α̃‖₂ = 1. Similarly, ‖β̃‖₂ = 1. Hence our problem can be
rewritten as:

    α̃1, β̃1 = argmax_{‖α̃‖₂=1, ‖β̃‖₂=1} α̃ᵀMβ̃

where M = X̃ᵀỸ = VX^(−1/2) XᵀY VY^(−1/2) ∈ Rp×q. The same is
true for further directions
11
Computing canonical directions and variates
Now comes the singular value decomposition to the rescue
(again!). Let r = min{p, q}. Then we can decompose

    M = U D Vᵀ

where U ∈ Rp×r, V ∈ Rq×r have orthonormal columns, and
D = diag(d1, ..., dr) ∈ Rr×r with d1 ≥ ... ≥ dr ≥ 0. Further:

- The transformed canonical directions α̃1, ..., α̃r ∈ Rp and
  β̃1, ..., β̃r ∈ Rq are the columns of U and V, respectively
- The canonical directions α1, ..., αr ∈ Rp and β1, ..., βr ∈ Rq
  are the columns of VX^(−1/2) U and VY^(−1/2) V, respectively
- The canonical variates Xα1, ..., Xαr ∈ Rn and
  Yβ1, ..., Yβr ∈ Rn are the columns of X VX^(−1/2) U ∈ Rn×r
  and Y VY^(−1/2) V ∈ Rn×r, respectively
- The canonical correlations ρ1 ≥ ... ≥ ρr are equal to
  d1 ≥ ... ≥ dr, the diagonal entries of D
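Putting these steps together, here is a minimal R sketch of the whole
computation (cca_svd is our own illustrative function, not a library
routine; it assumes X and Y are centered with full column rank):

cca_svd = function(X, Y) {
  # symmetric inverse square root, via the eigendecomposition V = U diag(lam) U^T
  inv_sqrt = function(V) {
    e = eigen(V, symmetric = TRUE)
    e$vectors %*% diag(1 / sqrt(e$values), nrow = length(e$values)) %*% t(e$vectors)
  }
  VXi = inv_sqrt(crossprod(X))          # V_X^(-1/2), with crossprod(X) = X^T X
  VYi = inv_sqrt(crossprod(Y))          # V_Y^(-1/2)
  M = VXi %*% crossprod(X, Y) %*% VYi   # M = V_X^(-1/2) X^T Y V_Y^(-1/2)
  s = svd(M)                            # M = U D V^T
  list(alpha = VXi %*% s$u,             # canonical directions for X
       beta  = VYi %*% s$v,             # canonical directions for Y
       rho   = s$d)                     # canonical correlations d_1 >= ... >= d_r
}

For out = cca_svd(X, Y), the canonical variates are then
X %*% out$alpha and Y %*% out$beta; the correlations should agree
with cancor's, and the directions should agree up to sign flips.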
12
Example: olive oil data
Example: n = 572 olive oils, with p = 9 features (the olives data
set from the R package classifly):

1. region
2. palmitic
3. palmitoleic
4. stearic
5. oleic
6. linoleic
7. linolenic
8. arachidic
9. eicosenoic

Variable 1 takes values in {1, 2, 3}, indicating the region (in Italy)
of origin. Variables 2-9 are continuous valued and measure the
percentage composition of 8 different fatty acids

13
We are interested in the correlations between the region of origin
and the fatty acid measurements. Hence we take X ∈ R572×8 to
contain the fatty acid measurements, and Y ∈ R572×3 to be an
indicator matrix, i.e., each row of Y indicates the region with a 1
and otherwise has 0s. This might look like:

    Y = [ 1 0 0
          1 0 0
          0 0 1
          0 1 0
          ...   ]

(In this case, canonical correlation analysis actually does the exact
same thing as linear discriminant analysis, an important tool that
we will learn later for classification)
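A sketch of this setup in R (the region and fatty acid column names
below are taken from the list two slides back, and are assumed to
match those in the classifly package):

library(classifly)   # provides the olives data set
data(olives)
acids = c("palmitic", "palmitoleic", "stearic", "oleic",
          "linoleic", "linolenic", "arachidic", "eicosenoic")
X = as.matrix(olives[, acids])                 # 572 x 8 fatty acid measurements
Y = model.matrix(~ factor(olives$region) - 1)  # 572 x 3 indicator matrix of region
cc = cancor(X, Y)   # cancor centers X and Y by default
cc$cor              # canonical correlations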

14
The first two canonical X variates, with the points colored by
region:

[Scatterplot of the second canonical x variate against the first
canonical x variate, with points colored by Region 1, Region 2,
Region 3]

15
Canonical correlation analysis in R

Canonical correlation analysis is implemented by the cancor
function in R's base distribution (the stats package). E.g.,

cc = cancor(x, y)      # cancor also centers x and y by default
alpha = cc$xcoef       # canonical directions for x
beta = cc$ycoef        # canonical directions for y
rho = cc$cor           # canonical correlations
xvars = x %*% alpha    # canonical variates for x (with x centered)
yvars = y %*% beta     # canonical variates for y (with y centered)

16
Recap: canonical correlation analysis

In canonical correlation analysis we are looking for pairs of
directions, one in each of the feature spaces of two data sets
X ∈ Rn×p, Y ∈ Rn×q, that maximize the covariance (or correlation)

We defined the pairs of canonical directions (α1, β1), ..., (αr, βr),
where r = min{p, q}, and αj ∈ Rp, βj ∈ Rq. We also defined the
pairs of canonical variates (Xα1, Yβ1), ..., (Xαr, Yβr), where
Xαj ∈ Rn and Yβj ∈ Rn. Finally, we defined the canonical
correlations ρ1, ..., ρr ∈ R

We saw that transforming the problem leads to a simpler form. From
this simpler form we can compute the canonical directions,
correlations, and variates using the singular value decomposition

17
Next time: measures of correlation
A lot of work has been done, but there's still a lot of interest ...

[Images of correlation research spanning 1888 to 2012]
18
