
mixOmics

This document outlines a presentation on multivariate projection methodologies for integrating large biological data sets. It discusses using statistical techniques like principal component analysis, discriminant analysis, and canonical correlation analysis to integrate multiple 'omics datasets, such as transcriptomics and metabolomics data, in order to gain insight into complex biological systems. The presentation covers exploring single datasets, discriminating between classes, integrating multiple datasets, related graphical outputs, and extensions like sparse methods. The goal is to shift from univariate to multivariate analysis and identify combinations of biomarkers across datasets.


Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Multivariate projection methodologies


for the integration of large biological
data sets
Application in R using mixOmics

math.univ-toulouse.fr/biostat
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Agenda

● Introduction
● Reminders (?)
● Explore one data set (PCA)
● Discriminant analysis (LDA, PLS-DA)
● Data integration (PLS, CCA, GCCA)
● Graphical outputs
● Extensions: sparse and multilevel
● Conclusion

2 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Introduction

3 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Research hypothesis

● Molecular entities act together to trigger cells' responses and need to be appropriately modelled and identified using novel statistical techniques.

● Multivariate statistical methods to shift away from the univariate statistics paradigm and obtain deeper insight into biological systems
– Identify a combination of biomarkers rather than univariate biomarkers
– Integrate multiple sources of biological data
– Reduce the dimension of the data for a better understanding of complex biological systems

4 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Multidisciplinarity!
● Nearly unlimited quantity of data from multiple and
heterogeneous sources
● Computational issues to foresee
● Biological interpretation for validation
● Keep pace with new technologies
A close interaction between statisticians, bioinformaticians and
molecular biologists is essential to provide meaningful results

[Figure: three overlapping domains — « Bio », « Stat » and « Bio-info » — illustrated with BLAST, FASTA, BAM, DNA, RNA, the sequence ATGCC, the mean (1/n) Σᵢ₌₁ⁿ Xᵢ and the least-squares estimator (X'X)⁻¹X'Y.]
5 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Data integration

Generally, data integration can be defined as the process of


combining data residing in diverse sources to provide users with a
comprehensive view of such data. There is no universal approach to
data integration, and many techniques are still evolving.
From Schneider, M. V., & Jimenez, R. C. (2012). Teaching the Fundamentals of Biological Data Integration Using
Classroom Games. PLoS Computational Biology, 8(12)

mixOmics philosophy in this context:


● R toolkit for multivariate data analysis of ‘omics data
● Statistical data integration

Data-driven approaches (≠ database or knowledge-based approaches)

6 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Overview

[Diagram: samples described by several quantitative blocks (Transcriptome, Proteome, Metabolome, other...) and by qualitative variables (Genotype, Condition).]

● Univariate
Mean, median, standard deviation...

● Bivariate: two quantitative variables, one quantitative + one qualitative, or two qualitative variables
Correlation, statistical tests (Student's t, ANOVA, Chi-squared)

● Multivariate unsupervised
PCA

● Multivariate supervised
PLS-DA

● Multi-block unsupervised
PLS (2 blocks), GCCA

● Multi-block supervised
GCC-DA

7 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

The mixOmics story


● Started with two PhD projects at Université de Toulouse:
– Ignacio González (2004-2007): rCCA
– Kim-Anh Lê Cao (2005-2008): sPLS
● The Australian mixOmics immigration process began in 2008...
– K-A moved to UQ for a postdoc (IMB)
– Core team established: Kim-Anh Lê Cao (FR, AUS), Ignacio González (FR),
Sébastien Déjean (FR)
● First R CRAN release in May 2009
● Today
– 21,000 downloads (unique IP addresses) in 2016 (4,000 in 2014, 10,000 in 2015)
– Website: www.mixomics.org
– Two web interfaces (shiny and PHP, also Galaxy but not advertised)
– 19 multivariate methodologies and sparse variants (13 are our own methods)
– Team: 3 core members and 4 key contributors
– Move to … in October 2018

8 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Guidelines
● I want to explore one single data set (e.g. microarray data):
– I would like to identify the trends or patterns in my data and any experimental bias, or identify whether my samples 'naturally' cluster according to the biological conditions: Principal Component Analysis (PCA)
● I have one single data set (e.g. microarray data) and I am interested in classifying my samples into known classes:
– Here X = expression data and Y = vector indicating the classes of the samples. I would like to know how informative my data are to rightly classify my samples, as well as to predict the class of new samples: PLS-Discriminant Analysis (PLS-DA)
● I want to unravel the information contained in two data sets, where two types of variables are measured on the same samples (e.g. metabolomics and transcriptomics data):
– I would like to know if I can extract common information from the two data sets (or highlight the correlation between the two data sets). If the total number of variables is less than the number of samples: Canonical Correlation Analysis (CCA) or Projection to Latent Structures (PLS) canonical mode. If the total number of variables is greater than the number of samples: Regularized Canonical Correlation Analysis (rCCA) or Projection to Latent Structures (PLS) canonical mode

9 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Practical works

● Based on the vignette of the package


bioconductor.org/packages/release/bioc/vignettes/mixOmics/inst/doc/vignette.html

● Quick start section for every method

● Focus on PCA, (Sparse-)PLS-DA and (Sparse-)PLS
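
To fix ideas, here is a minimal R quick-start sketch in the spirit of the vignette, assuming mixOmics and its bundled liver.toxicity example data are available (the arguments shown are illustrative):

  library(mixOmics)
  data(liver.toxicity)
  X <- liver.toxicity$gene                    # expression data (samples x genes)
  Y <- liver.toxicity$treatment$Dose.Group    # factor describing the samples

  pca.res   <- pca(X, ncomp = 2, center = TRUE, scale = TRUE)   # explore one data set
  plsda.res <- plsda(X, Y, ncomp = 2)                           # discriminate known classes
  pls.res   <- pls(X, liver.toxicity$clinic, ncomp = 2,
                   mode = "canonical")                          # integrate two data sets

  plotIndiv(pca.res)    # the same plotting functions apply to all three objects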

10 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Reminders (?)

11 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Variance and Standard Deviation


var(X) = (1/n) Σᵢ₌₁ⁿ (Xᵢ − X̄)²        Mean of the squared deviations from the mean

σ(X) = √var(X)                        Square root of the variance

Properties of the standard deviation:


• Positive (zero if the series is constant)
• Unchanged by translation
• Sensitive to extreme values
● In the same unit as the data (like the mean, but unlike the variance): if the data are expressed in m, then the standard deviation is also expressed in m (like the mean) and the variance in m²!

12 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Variance and Standard Deviation


Square root of the mean of the squared deviations from the mean

X  X 
5
2

X  X 
4
2

X  X 
3
2
Variance

X  X 
2
2

X  X 
1
2

X4 X3 X2 X1 X X5 Standard
deviation
X1 X
X 2 X
X 3 X
X 4 X
X 5 X

13 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Humour for statisticians... Source: xkcd.com

14 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Covariance
Covariance:   cov(X,Y) = (1/n) Σᵢ₌₁ⁿ (Xᵢ − X̄)(Yᵢ − Ȳ)        cov(X,X) = var(X)

Sign of the product (Xᵢ − X̄)(Yᵢ − Ȳ)

Intuitively:
● If the + win → positive linear relationship
● If the − win → negative linear relationship

[Figure: scatter plot of Y against X, split into four quadrants by X̄ and Ȳ; in this example cov(X,Y) = −1.36.]

The covariance depends on the physical units → correlation coefficient

15 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Correlation
Some properties of correlation coefficients:
– Between –1 and 1

– Pearson correlation coefficient: linear relationship

  ρ(X,Y) = cov(X,Y) / (σ(X) σ(Y))

– Spearman correlation coefficient (on ranks): monotone relationship

  ρs(X,Y) = ρ(RX, RY)          X   Y   RX  RY
                               12  5   3   1
                               15  3   1   2
                               14  2   2   3

– If the coefficient is positive: when one variable is high, the other is also high (and the same holds replacing high with low).
– If the coefficient is negative: when one variable is high, the other is low, and conversely.
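
As a quick illustration, a base R sketch of these quantities on the small made-up vectors above (note that R's cov() uses the n−1 denominator rather than 1/n):

  x <- c(12, 15, 14)
  y <- c(5, 3, 2)
  cov(x, y)                         # covariance: depends on the physical units
  cor(x, y, method = "pearson")     # linear relationship, between -1 and 1
  cor(x, y, method = "spearman")    # Pearson correlation of the ranks RX and RY
  cor(rank(x), rank(y))             # identical to the Spearman coefficient (no ties here)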
16 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Correlation
[Figure: six scatter plots and their Pearson (ρ) and Spearman (ρs) correlation coefficients]
ρ = 0.944 - ρs = 0.855    ρ = 0.722 - ρs = 0.915    ρ = 0.89 - ρs = 1
ρ = 0.914 - ρs = 0.964    ρ = -0.518 - ρs = -0.248    ρ = 0.24 - ρs = 0.248

17 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Linear combination
2 variables
2 coefficients: c1 = 0.5 ; c2 = 2        W = (0.5, 2)'

Linear combination of the 2 variables Height and Weight with coefficients c1 and c2:

  Height  Weight        LC = 0.5 * Height + 2 * Weight
  174.0    65.6         218.20
  175.3    71.8         231.25
  193.5    80.7         258.15
  186.5    72.6         238.45
  187.2    78.8         251.20
  181.5    74.8         240.35
  184.0    86.4         264.80
  184.5    78.4         249.05
  175.0    62.0         211.50
  184.0    81.6         255.20

Matrix notation: LC = XW

A principal component is a linear combination of the initial variables.
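
A short R sketch of the same computation as a matrix product (values copied from the table above):

  X <- cbind(Height = c(174.0, 175.3, 193.5, 186.5, 187.2,
                        181.5, 184.0, 184.5, 175.0, 184.0),
             Weight = c(65.6, 71.8, 80.7, 72.6, 78.8,
                        74.8, 86.4, 78.4, 62.0, 81.6))
  W  <- c(0.5, 2)      # coefficients c1 and c2
  LC <- X %*% W        # one value per individual, e.g. 218.20 for the first one
  LC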


18 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Center / scale

● Center: remove the mean

● Scale: divide by the standard deviation

● Express different variables on a common scale, without


physical unit; the observations are thus expressed as numbers
of standard deviations related to the mean.

● After centering and scaling, the mean is zero and the standard deviation is 1 (as is the variance).

  Zi = (Xi − X̄) / σ(X)

● Sometimes called « z-transformation » or « z-score »

19 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Center / scale
X X-mean(x) X/sd(x) [X-mean(x)] / sd(X)

v1 v2 v3 v4 v1 v2 v3 v4 v1 v2 v3 v4 v1 v2 v3 v4
1 3.9 0.2 -3.2 0.6 1 1.4 0.4 -3.8 0.1 1 1.1 0.1 -0.6 0.8 1 0.7 0.3 -0.8 0.1
2 3.3 -1.7 -4.0 0.6 2 0.8 -1.5 -4.5 0.0 2 1.0 -1.1 -0.8 0.8 2 0.4 -1.0 -0.9 0.0
3 1.2 1.8 3.3 0.6 3 -1.3 2.0 2.8 0.0 3 0.3 1.2 0.7 0.7 3 -0.6 1.3 0.6 0.0
4 0.4 -0.9 -1.4 0.3 4 -2.1 -0.7 -2.0 -0.3 4 0.1 -0.6 -0.3 0.4 4 -1.0 -0.5 -0.4 -0.6
5 3.6 -0.6 10.3 -0.2 5 1.1 -0.4 9.8 -0.8 5 1.0 -0.4 2.0 -0.3 5 0.5 -0.3 2.0 -1.7
6 2.5 2.4 -7.5 0.4 6 0.0 2.5 -8.0 -0.2 6 0.7 1.5 -1.5 0.5 6 0.0 1.7 -1.6 -0.3
7 -1.4 0.6 3.2 1.2 7 -3.9 0.8 2.7 0.7 7 -0.4 0.4 0.6 1.6 7 -1.8 0.5 0.5 1.4
8 2.4 -1.2 -1.0 1.4 8 -0.1 -1.0 -1.5 0.9 8 0.7 -0.7 -0.2 1.8 8 -0.1 -0.6 -0.3 1.7
9 2.3 0.2 2.1 0.1 9 -0.2 0.4 1.6 -0.4 9 0.7 0.1 0.4 0.2 9 -0.1 0.2 0.3 -0.9
10 6.7 -2.6 3.4 0.7 10 4.2 -2.5 2.8 0.1 10 2.0 -1.7 0.7 0.9 10 1.9 -1.6 0.6 0.2

Mean 2.5 -0.2 0.5 0.6 Mean 0 0 0 0 Mean 1.1 -0.1 0.1 1.2 Mean 0 0 0 0
S.D. 2.2 1.5 5.0 0.5 S.D. 2.2 1.5 5.0 0.5 S.D. 1 1 1 1 S.D. 1 1 1 1
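
In R, the transformation illustrated in the rightmost block of the table above is a one-liner; a small sketch on simulated data:

  X <- matrix(rnorm(40, mean = 5, sd = 2), ncol = 4)   # 10 observations, 4 variables
  Z <- scale(X, center = TRUE, scale = TRUE)           # (X - mean) / sd, column by column
  round(colMeans(Z), 10)                               # ~ 0 for every column
  apply(Z, 2, sd)                                      # 1 for every column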

20 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Log transformation
 X             log2(X)
 0.125 = 2^-3    -3
 0.25  = 2^-2    -2
 0.5   = 2^-1    -1
 1     = 2^0      0
 2     = 2^1      1
 4     = 2^2      2
 8     = 2^3      3

 4 < 5 < 8       2 < ~2.3 < 3
 2 < 3 < 4       1 < ~1.6 < 2
 0.1 < 0.125     ~ -3.3 < -3

 Y = log2(X)  ↔  X = 2^Y
 Y = log10(X) ↔  X = 10^Y
 Y = ln(X)    ↔  X = e^Y = exp(Y)

 City                     Population   log10
 Toulouse                   441 802    5.65
 Colomiers                   35 186    4.55
 Tournefeuille               25 340    4.40
 Muret                       23 864    4.38
 ...
 Castanet-Tolosan            11 033    4.04
 Saint-Orens...              10 918    4.04
 Saint-Jean                  10 259    4.01
 Revel                        9 361    3.97
 Portet-sur-Garonne           9 435    3.97
 Auterive                     9 107    3.96
 ...
 La Magdelaine-sur-T/         1 006    3.00
 Grépiac                        990    2.99
 Landorthe                      946    2.98
 Vigoulet-Auzil                 944    2.97
 ...
 Belbèze-de-Lauragais           104    2.02
 Saint-Germier                  103    2.01
 Seyre                          102    2.01
 Gouzens                         95    1.98
 Lourde                          98    1.99
 Pouze                           97    1.99
 ...
 Saccourvielle                   13    1.11
 Cirès                           13    1.11
 Bourg-d'Oueil                    8    0.90
 Trébons-de-Luchon                8    0.90
 Caubous                          6    0.78
 Baren                            5    0.70
21 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Explore one data set

22 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Principal Components Analysis

Describe, with no prior assumptions, a data set composed exclusively of quantitative variables
p variables

n individuals

23 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Body data set


• 20 individuals
       V1     V2    V3    V4     V5
• 5 variables H 1   106.2   89.5  71.5  65.6  174.0
H 2   110.5   97.0  79.0  71.8  175.3
V1 : shoulder girth (cm) H 3   115.1   97.5  83.2  80.7  193.5
V2 : chest girth (cm) H 4   104.5   97.0  77.8  72.6  186.5
V3 : waist girth (cm) H 5   107.5   97.5  80.0  78.8  187.2
V4 : weight (kg) H 6   119.8   99.9  82.5  74.8  181.5
V5 : height (cm) H 7   123.5  106.9  82.0  86.4  184.0
H 8   120.4  102.5  76.8  78.4  184.5
H 9   111.0   91.0  68.5  62.0  175.0
H 10  119.5   93.5  77.5  81.6  184.0
F 1   105.0   89.0  71.2  67.3  169.5
F 2   100.2   94.1  79.6  75.5  160.0
F 3    99.1   90.8  77.9  68.2  172.7
F 4   107.6   97.0  69.6  61.4  162.6
F 5   104.0   95.4  86.0  76.8  157.5
F 6   108.4   91.8  69.9  71.8  176.5
F 7    99.3   87.3  63.5  55.5  164.4
F 8    91.9   78.1  57.9  48.6  160.7
F 9   107.1   90.9  72.2  66.4  174.0
F 10  100.5   97.1  80.4  67.3  163.8

24 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

1D graphical output: stripchart


Height 174.0 175.3 193.5 186.5 187.2 181.5 184.0 184.5 175.0 184.0

Height in cm

Weight 65.6 71.8 80.7 72.6 78.8 74.8 86.4 78.4 62.0 81.6

Weight in kg

25 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

2D graphical output: scatter plot

Height 174.0 175.3 193.5 186.5 187.2 181.5 184.0 184.5 175.0 184.0
Weight 65.6 71.8 80.7 72.6 78.8 74.8 86.4 78.4 62.0 81.6
Weight in kg

Height in cm
26 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

3D graphical output: scatter plot

Height 174.0 175.3 193.5 186.5 187.2 181.5 184.0 184.5 175.0 184.0
Weight 65.6 71.8 80.7 72.6 78.8 74.8 86.4 78.4 62.0 81.6
Waist g. 71.5 79.0 83.2 77.8 80.0 82.5 82.0 76.8 68.5 77.5

[3D scatter plot with axes: Waist girth in cm, Weight in kg, Height in cm]

27 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

4D ?

28 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Alternative to 4D (or more)

Shoulder g.

Chest g.

Waist g.

Weight

Height

29 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

[3D scatter plot with axes: Shoulder girth, Chest girth, Waist girth]
30 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

[The same 3D scatter plot (Shoulder girth, Chest girth, Waist girth) with the direction of the
1st Principal Component:
«beefyness»]
31 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

32 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Comments: The measurements are rather strongly correlated.
Indeed, one can assume that a person with a high shoulder girth will also have a high chest girth (even if exceptions exist...). In these conditions, the information brought by the 5 variables is redundant. Graphically, in the cube determined by shoulder girth, chest girth and waist girth, there are nearly empty areas. One variable calculated as a combination of these 3 variables (represented as the dotted arrow) would be enough to represent the individuals with a minimal loss of information, because all the points are located along this direction, which is the first principal component.

Among the potential projections into 2D spaces, not all make it easy to identify the object. Among the 3 proposed projections, the image at the center is the closest to the original. One can easily recognize the initial object because the projection was made on the plane formed by the 2 directions along which the object spreads out the most (high variability). The information brought by the 3rd dimension is minimal and its loss is not a problem to recognize the fish.

33 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

In other words

● PCA allows determining the sub-spaces of lower dimension than the initial space on which the projection of the individuals is the least modified, that is to say, the sub-spaces that keep the greatest part of the information (i.e. variability).
● The principle of PCA consists in finding a direction (the first PC), calculated as a linear combination of the initial variables, such that the variance of the points along this direction is maximal. This process is iterated in orthogonal directions to determine the following principal components. The number of PCs that can be calculated is equal to the number of initial variables.
● Concerning the variables, PCA preserves as well as possible the correlation structure between the initial variables.

34 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

PCA: simulated examples


Data set : 50 observations, 3 variables (V1 – V2 - V3)

Case 1) Case 2) Case 3)


{V1} - {V2} - {V3} {V1 - V2} - {V3} {V1 - V2 - V3}
[Scatterplot matrices of the variables V1, V2 and V3 for the three cases]

Pearson Correlation matrices


1) V1 V2 V3 2) V1 V2 V3 3) V1 V2 V3
V1 1.0 -0.10 0.00 V1 1.00 0.88 -0.05 V1 1.00 0.88 0.92
V2 -0.1 1.00 -0.12 V2 0.88 1.00 -0.11 V2 0.88 1.00 0.81
V3 0.0 -0.12 1.00 V3 -0.05 -0.11 1.00 V3 0.92 0.81 1.00

35 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Example: 3D scatter plots


Case 1)
Case 2)

Case 3)

36 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Example: individuals plot

1)

37 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Example: individuals plot

2)

38 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Example: individuals plot

3)

39 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Variables plot

The coordinates of a variable Xj on a principal component PCi are given by the correlation between this variable and the component PCi.

40 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Variables plot
Correlation ≈ cosine (remember trigonometry and right triangles):

The correlation between two


variables is represented as:

● An acute angle (cos(α) > 0)


if it is positive

● An obtuse angle (cos(θ) < 0)


if it is negative

● A right angle (cos(β)≈0) if it


is near zero

41 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Variables plot
1) 2) 3)

Correlation matrices
1) V1 V2 V3 2) V1 V2 V3 3) V1 V2 V3
V1 1.0 -0.10 0.00 V1 1.00 0.88 -0.05 V1 1.00 0.88 0.92
V2 -0.1 1.00 -0.12 V2 0.88 1.00 -0.11 V2 0.88 1.00 0.81
V3 0.0 -0.12 1.00 V3 -0.05 -0.11 1.00 V3 0.92 0.81 1.00

42 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Example: biplot representation


Individuals and variables are plotted on the same graph

1) 2) 3)

43 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

And what about PCA?

● Mathematically, performing a PCA consists in diagonalising the covariance matrix (or the correlation matrix for a scaled PCA).
● Indeed, it can be shown that the sub-space in which the projected points have maximal variance is spanned by the first eigenvectors of the covariance (or correlation) matrix; the variances are given by the corresponding eigenvalues.
● The first eigenvector provides the direction (via the coefficients of the linear combination applied to the initial variables) that explains the greatest part of the variability. The second explains the greatest part of the remaining variance, and so on...
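
A base R sketch of this equivalence, using R's built-in iris measurements (which also appear later in these slides): the PCA obtained by diagonalising the covariance matrix coincides with prcomp(), up to the sign of the axes.

  X  <- scale(iris[, 1:4], center = TRUE, scale = FALSE)   # centred data
  S  <- cov(X)                                             # covariance matrix
  e  <- eigen(S)            # eigenvectors = directions, eigenvalues = variances of the PCs
  PC <- X %*% e$vectors     # coordinates of the individuals on the PCs
  e$values                  # variance carried by each PC, in decreasing order
  head(prcomp(X)$x)         # same coordinates (possibly with opposite signs)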

44 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

PCA: practical aspects


● Should I scale my data before performing PCA?
– Without scaling: one variable with a high variance will structure the first principal component nearly alone
– With scaling: one noisy variable with low variability will be given the same variance as other, meaningful variables
● Can I perform PCA with missing values?
– Specific algorithms to deal with missing values exist (for instance NIPALS, implemented in mixOmics). It can be used to impute missing values but it requires « many » components.

The best thing to do about missing data is not to have any.


Gertrude Cox, 1900-1978, American statistician
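
A sketch of a scaled PCA with mixOmics (assuming the package and its liver.toxicity data are installed); a nipals() function is also provided for data sets containing missing values:

  library(mixOmics)
  data(liver.toxicity)
  X <- liver.toxicity$gene                         # samples x genes
  res <- pca(X, ncomp = 3, center = TRUE, scale = TRUE)
  plot(res)                                        # screeplot: variance explained per PC
  plotIndiv(res, comp = c(1, 2))                   # individuals plot
  plotVar(res, comp = c(1, 2), var.names = FALSE)  # correlation circle for the variables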
45 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

PCA: body data set


Variables plot and screeplot
[Screeplot: 73 %, 17 %, 7 %, 2 %, 1 % of the variance explained by the successive PCs]

● 90% of the variability is explained by the first two PCs
● 10% of the information is lost when projecting from 5 to 2 dimensions.
● PC 1 «beefyness»: separation of beefy people on the right (high values for the 5 variables) and weakling ones on the left.
● PC 2 «fatness, rotundity»: bottom, variables linked to height and shoulders; top, weight, waist and chest girth.

Correlation matrix (T.ep = shoulder girth, T.p = chest girth, T.t = waist girth, M = weight, T = height):
       T.ep   T.p   T.t    M     T
T.ep   1.00  0.74  0.48  0.72  0.71
T.p    0.74  1.00  0.78  0.81  0.51
T.t    0.48  0.78  1.00  0.86  0.37
M      0.72  0.81  0.86  1.00  0.61
T      0.71  0.51  0.37  0.61  1.00

46 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

PCA: body data set 73 %

17 %
7%
2% 1%

Individual plots
47 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

PCA: body data set


s.g    c.g   w.g    w     h
H 1   106.2   89.5  71.5  65.6  174.0
H 2   110.5   97.0  79.0  71.8  175.3
H 3   115.1   97.5  83.2  80.7  193.5
H 4   104.5   97.0  77.8  72.6  186.5
H 5   107.5   97.5  80.0  78.8  187.2
H 6   119.8   99.9  82.5  74.8  181.5
H 7   123.5  106.9  82.0  86.4  184.0
H 8   120.4  102.5  76.8  78.4  184.5
H 9   111.0   91.0  68.5  62.0  175.0
H 10  119.5   93.5  77.5  81.6  184.0
F 1   105.0   89.0  71.2  67.3  169.5
F 2   100.2   94.1  79.6  75.5  160.0
F 3    99.1   90.8  77.9  68.2  172.7
F 4   107.6   97.0  69.6  61.4  162.6
F 5   104.0   95.4  86.0  76.8  157.5
F 6   108.4   91.8  69.9  71.8  176.5
F 7    99.3   87.3  63.5  55.5  164.4
F 8    91.9   78.1  57.9  48.6  160.7
F 9   107.1   90.9  72.2  66.4  174.0
F 10  100.5   97.1  80.4  67.3  163.8

Origin (coordinates (0,0)): the average individual
       s.g    c.g   w.g    w      h
      108.1   94.2  75.4  70.6  174.4

48 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

PCA: body data set


Data Covariance matrix
       s.g      c.g     w.g      w      h
       s.g    c.g   w.g    w     h Shoulder.g    68.64     37.74   28.08   55.32  61.19
H 1   106.2   89.5  71.5  65.6  174.0 Chest.g       37.74     37.51   33.90   45.70  32.40
H 2   110.5   97.0  79.0  71.8  175.3 Waist.g       28.08     33.90   50.77   56.58  27.70
H 3   115.1   97.5  83.2  80.7  193.5 Weight        55.32     45.70   56.58   85.71  59.52
H 4   104.5   97.0  77.8  72.6  186.5 Height        61.19     32.40   27.70   59.52 109.31
H 5   107.5   97.5  80.0  78.8  187.2
H 6   119.8   99.9  82.5  74.8  181.5
H 7   123.5  106.9  82.0  86.4  184.0 68.64 + 37.51 + 50.77 + 85.71 + 109.31 = 351.94
H 8   120.4  102.5  76.8  78.4  184.5
H 9   111.0   91.0  68.5  62.0  175.0
H 10  119.5   93.5  77.5  81.6  184.0
F 1   105.0   89.0  71.2  67.3  169.5
F 2   100.2   94.1  79.6  75.5  160.0
F 3    99.1   90.8  77.9  68.2  172.7 351.94 represents (somehow) the
F 4   107.6   97.0  69.6  61.4  162.6
F 5   104.0   95.4  86.0  76.8  157.5 quantity of information contained in
F 6   108.4   91.8  69.9  71.8  176.5
F 7    99.3   87.3  63.5  55.5  164.4
the data.
F 8    91.9   78.1  57.9  48.6  160.7
F 9   107.1   90.9  72.2  66.4  174.0
F 10  100.5   97.1  80.4  67.3  163.8

Mean  108.1   94.2  75.3  70.6  174.4 
Var.   68.6   37.5  50.8  85.7  109.3 

49 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

PCA: body data set Coordinates of the individuals


on the PCs
Coefficients (optimally calculated) to       Dim1  Dim2  Dim3  Dim4  Dim5
build principal components H1   ­6.50 ­4.48 ­0.37 ­1.03  1.27
H2    4.40  2.04  0.81  1.87  1.38
Dim1 Dim2 Dim3 Dim4 Dim5 H3   22.66 ­5.94 ­6.18  0.11  1.97
shoulder.g 0.45 -0.16 0.78 -0.18 0.36 H4    7.78 ­5.24 ­8.38  4.10 ­1.74
H5   13.73 ­2.67 ­8.02  0.82 ­2.15
chest.g 0.32 0.25 0.26 0.72 -0.49
H6   15.67 ­0.15  4.49  2.33  4.40
waist.g 0.34 0.53 -0.33 0.24 0.66 H7   26.99  3.19  6.29  0.04 ­3.08
weight 0.54 0.36 -0.17 -0.60 -0.44 H8   18.41 ­3.43  5.63  1.09 ­1.96
height 0.54 -0.70 -0.43 0.17 0.02 H9   ­6.25 ­8.48  4.97  0.79  1.86
H10  16.78 ­3.67  1.99 ­7.08  1.22
F1   ­8.83 ­0.78  0.28 ­3.02  0.07
PC1 = 0.45*shoulder.g + 0.32*chest.g F2   ­7.28 15.41 ­2.31 ­3.00 ­2.35
    + 0.34*waist.g + 0.54*weight + 0.54*height F3   ­6.45  2.25 ­7.60  0.95  1.15
F4  ­12.51  2.68  8.91  4.27 ­1.53
F5   ­3.65 20.76 ­0.30 ­2.45  1.99
PC2 = ­0.16*shoulder.g + 0.25*chest.g F6   ­0.63 ­4.62  0.34 ­3.46 ­2.80
    + 0.53*waist.g + 0.36*weight – 0.70*height F7  ­23.61 ­5.07  2.20  1.19 ­1.15
F8  ­37.50 ­9.07 ­1.33 ­1.89 ­0.02
F9   ­4.98 ­3.61  0.33 ­0.50  1.02
PC3 = ... F10  ­8.24 10.89 ­1.74  4.86  0.44
Mean   0     0     0     0     0
Var. 255.7  60.2  23.5  8.61  4.0

Covariance matrix between PCs:
      PC1     PC2    PC3    PC4   PC5
PC1  255.66   0.00   0.00   0.00  0.00
PC2    0.00  60.18   0.00   0.00  0.00
PC3    0.00   0.00  23.48   0.00  0.00
PC4    0.00   0.00   0.00   8.61  0.00
PC5    0.00   0.00   0.00   0.00  4.01

255.66 + 60.18 + 23.48 + 8.61 + 4.01 = 351.94

255.66 is the greatest value of variance that we can obtain on the individuals with a linear combination of the initial variables. The same quantity of information (351.94) is kept, but it is "optimally" allocated.
50 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Biological data set (1)


PCA for quality control!
3 conditions, 4 replicates, 38,000 genes, Affymetrix chip

[Individuals plot showing:]
– 1 control replicate (to be removed?)
– 4 replicates of condition A
– 1 replicate of condition B (to be removed?)
– 3 replicates of condition B and 3 control replicates
51 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Biological data set (2)


PCA for quality control!

4 conditions (2 treatments * 2 genotypes), 3 replicates, 20000 genes (Affymetrix)


3 replicates
C2_wt
2 replicates
C1_mut

3 replicates
C2_mut
1 replicate C1_mut
(to be removed?) 3 replicates
C1_wt

52 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

PCA = projection

● The interpretation of the graphical results of a PCA must be done keeping in mind that one is looking at a projection on a plane (or in a volume for a 3D representation).
● Be careful when interpreting visual proximities
● Illustration in comics with the only true super-hero ...

53 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

PCA =
projection
I’m TWO-D boy. The
boy X-Y who doesn’t
care about the Z !

Scenario &
illustration
Pascal Jousselin

Colour
Laurence Croix

Web
pjousselin.free.fr

54 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Discriminant analysis

55 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Linear Discriminant Analysis (LDA)


Explore a data set composed of quantitative variables and one qualitative variable, in order to separate the individuals based on their membership of the categories of the qualitative variable.

p quantitative variables        1 qualitative variable

n individuals

56 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Body data set

Can we find a space where the projection of the individuals will separate men and women (qualitative variable S) according to the 5 body measurements (V1 to V5)?

       V1     V2    V3    V4     V5   S
I 1   106.2   89.5  71.5  65.6  174.0 M
I 2   110.5   97.0  79.0  71.8  175.3 M
I 3   115.1   97.5  83.2  80.7  193.5 M
I 4   104.5   97.0  77.8  72.6  186.5 M
I 5   107.5   97.5  80.0  78.8  187.2 M
I 6   119.8   99.9  82.5  74.8  181.5 M
I 7   123.5  106.9  82.0  86.4  184.0 M
I 8   120.4  102.5  76.8  78.4  184.5 M
I 9   111.0   91.0  68.5  62.0  175.0 M
I 10  119.5   93.5  77.5  81.6  184.0 M
I 11  105.0   89.0  71.2  67.3  169.5 W
I 12  100.2   94.1  79.6  75.5  160.0 W
I 13   99.1   90.8  77.9  68.2  172.7 W
I 14  107.6   97.0  69.6  61.4  162.6 W
I 15  104.0   95.4  86.0  76.8  157.5 W
I 16  108.4   91.8  69.9  71.8  176.5 W
I 17   99.3   87.3  63.5  55.5  164.4 W
I 18   91.9   78.1  57.9  48.6  160.7 W
I 19  107.1   90.9  72.2  66.4  174.0 W
I 20  100.5   97.1  80.4  67.3  163.8 W
57 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

LDA: simulated example 1

Data set
● 50 individuals, 4 variables
● 3 quantitative: V1 – V2 – V3
● 1 qualitative: Group, with 2 categories A and B

Can we find a space where the projections of the individuals from groups A and B are well separated?

[Table: the 50 x 4 simulated data set (V1, V2, V3, Group); summary of the quantitative variables:]

           V1  V2  V3
Mean        0   0   0
Variance   20  10   2

58 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

LDA: simulated example


Results of a PCA applied only on the quantitative variables (without considering the qualitative variable).

[Screeplot: 58 %, 30 %, 12 % of variance explained; individuals and variables plots on PC1-PC2]

● The 3 PCs are clearly identified with the 3 initial variables.
● Most of the variability in the data is explained by V1, then by V2, then by V3.
59 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

LDA: simulated example


Representation of the 50 individuals according to the 3 variables. The color depends on the category (A: black, B: red).

Although displaying the smallest variability, V3 is relevant when addressing a discrimination purpose.

60 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

LDA: simulated example


LDA result

[Stripchart of the individuals on LD1, by group]

● 2 categories → 1 discriminant variable (graphical representation in 1D)
● Linear combination of the initial variables:
  LD1 = -0.058 * V1 - 0.028 * V2 - 4.41 * V3
● LD1 roughly corresponds to V3 (with a negative coefficient, but the sign doesn't matter).
61 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

LDA: body data set


Centroïd of the 2 groups
s.g c.g w.g w h LD1
F 102.31 91.15 72.82 65.88 166.17 33.81
H 113.80 97.23 77.88 75.27 182.55 36.82

Coefficients of linear
discriminants:
LD1
shoulder g. 0.12
chest g. -0.02
waist g. 0.11
weight -0.11
height 0.14

33 34 35 36 37 38

LD 1

The coefficients indicates that chest girth is the less discriminant variable
(loading -0.02)... The other variables participate nearly in the same way
(loadings around 0.1 in absolute value).
62 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

LDA: principle

● LDA is similar to a PCA performed on the centroids of the groups determined by the categories of the qualitative variable.
● Thus, we are looking for a sub-space of small dimension in which the centroids are as far apart as possible (having maximal variability).
● If the number of categories is 2, then the dimension of the sub-space is 1; so LDA will provide only LD1.

63 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Decision-making with LDA

● For a supplementary individual, knowing the quantitative variables, the decision-making problem consists in assigning this individual to a category of the qualitative variable.
● Naive (and not so bad) rule: assign the new individual to the category whose centroid is the closest (many other, more sophisticated rules exist).
● Applications: credit scoring, quality control, diagnosis...
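
A minimal sketch of this decision rule with MASS::lda() on the iris data shown on the next slides (the new flower's measurements are made up for illustration):

  library(MASS)
  fit <- lda(Species ~ ., data = iris)             # linear discriminant analysis
  new.flower <- data.frame(Sepal.Length = 6.0, Sepal.Width = 3.0,
                           Petal.Length = 4.5, Petal.Width = 1.5)
  predict(fit, new.flower)$class                   # predicted category
  predict(fit, new.flower)$posterior               # posterior probability of each class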

64 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Iris data set

    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1            5.1         3.5          1.4         0.2 setosa
2            4.9         3.0          1.4         0.2 setosa
3            4.7         3.2          1.3         0.2 setosa
4            4.6         3.1          1.5         0.2 setosa
5            5.0         3.6          1.4         0.2 setosa
----------------------------------------------------------------
45           5.1         3.8          1.9         0.4 setosa
46           4.8         3.0          1.4         0.3 setosa
47           5.1         3.8          1.6         0.2 setosa
48           4.6         3.2          1.4         0.2 setosa
49           5.3         3.7          1.5         0.2 setosa
50           5.0         3.3          1.4         0.2 setosa
51           7.0         3.2          4.7         1.4 versicolor
52           6.4         3.2          4.5         1.5 versicolor
53           6.9         3.1          4.9         1.5 versicolor
54           5.5         2.3          4.0         1.3 versicolor
55           6.5         2.8          4.6         1.5 versicolor
----------------------------------------------------------------
95           5.6         2.7          4.2         1.3 versicolor
96           5.7         3.0          4.2         1.2 versicolor
97           5.7         2.9          4.2         1.3 versicolor
98           6.2         2.9          4.3         1.3 versicolor
99           5.1         2.5          3.0         1.1 versicolor
100          5.7         2.8          4.1         1.3 versicolor
101          6.3         3.3          6.0         2.5 virginica
102          5.8         2.7          5.1         1.9 virginica
103          7.1         3.0          5.9         2.1 virginica
104          6.3         2.9          5.6         1.8 virginica
105          6.5         3.0          5.8         2.2 virginica
----------------------------------------------------------------
145          6.7         3.3          5.7         2.5 virginica
146          6.7         3.0          5.2         2.3 virginica
147          6.3         2.5          5.0         1.9 virginica
148          6.5         3.0          5.2         2.0 virginica
149          6.2         3.4          5.4         2.3 virginica
150          5.9         3.0          5.1         1.8 virginica

This famous (Fisher's or Anderson's) iris data set gives the measurements in centimeters of the variables sepal length and width and petal length and width, respectively, for 50 flowers from each of 3 species of iris. The species are Iris setosa, versicolor, and virginica.

R documentation
65 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Iris data set

PCA LDA

66 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Projection to Latent Structure Discriminant


Analysis (PLS-DA)
p quantitative variables          1 qualitative variable
                                  F     G1  G2
                                  G1     1   0
                                  G2     0   1
                                  G1     1   0
n individuals        X            G1     1   0
                                  G2     0   1
                                  G2     0   1
                                  G1     1   0

The PLS regression(*) has been extended to deal with discrimination issues. To do that, the qualitative variable is converted into a dummy matrix (composed of 0s and 1s) with as many rows as individuals and as many columns as categories of the qualitative variable.
Please wait a few slides (section Integration) to learn more about PLS!
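
A short mixOmics sketch on the SRBCT data used on the next slide (assuming the srbct data set shipped with the package; the keepX values are illustrative):

  library(mixOmics)
  data(srbct)
  X <- srbct$gene      # 63 samples x 2308 genes
  Y <- srbct$class     # tumour class: BL, EWS, NB, RMS
  res <- plsda(X, Y, ncomp = 2)                     # the dummy matrix is built internally
  plotIndiv(res, ind.names = FALSE, legend = TRUE, ellipse = TRUE)
  # sparse variant with variable selection (see the sparse extensions):
  res.s <- splsda(X, Y, ncomp = 2, keepX = c(50, 50))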
67 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Comparison PCA vs PLS-DA

[Sample plots: PCA | PLS-DA | PLS-DA with variable selection (see sparse extensions) — view in 3D]

The Small Round Blue Cell Tumors dataset from Khan et al. (2001) contains information on 63 samples and 2308 genes. The samples are distributed in four classes as follows: 8 Burkitt Lymphoma (BL), 23 Ewing Sarcoma (EWS), 12 neuroblastoma (NB), and 20 rhabdomyosarcoma (RMS).

68 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Data integration

69 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Data integration
The two types of variables are measured on the same
matching samples: X (n x p) and Y (n x q), n << p + q

p quantitative q quantitative
variables variables

n individuals

Aims:
● Understand the correlation/covariance structure between

two data sets


● Select co-regulated biological entities across samples

70 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

CCA: simulated example


X (variables X1 … X5)        Y (variables Y1 … Y3)

[Table: simulated data set with 5 X-variables and 3 Y-variables measured on the same samples]

Correlation matrix (X,Y), displayed with the R package corrplot:

      X1    X2    X3    X4    X5    Y1    Y2    Y3
X1  1.00  0.00 -0.03  0.13 -0.17  0.40 -0.10 -0.03
X2  0.00  1.00  0.06  0.07  0.15  0.10  0.27 -0.74
X3 -0.03  0.06  1.00 -0.18  0.02  0.07 -0.05  0.07
X4  0.13  0.07 -0.18  1.00 -0.16 -0.02  0.23  0.05
X5 -0.17  0.15  0.02 -0.16  1.00 -0.11  0.01 -0.14
Y1  0.40  0.10  0.07 -0.02 -0.11  1.00  0.05 -0.15
Y2 -0.10  0.27 -0.05  0.23  0.01  0.05  1.00 -0.12
Y3 -0.03 -0.74  0.07  0.05 -0.14 -0.15 -0.12  1.00

71 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

CCA: simulated example


Graphical outputs Variable plot
Individual plot

72 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

CCA: principle
● CCA can be viewed as an iterative algorithm (like PCA)
● Maximize the correlation (ρ1) between two linear combinations: one of the X variables (t1), the other of the Y variables (u1).
  t1 = a11X1 + a12X2 + … + a1pXp        t1 and u1 are the first canonical
  u1 = b11Y1 + b12Y2 + … + b1qYq        variates and ρ1 is the first
  ρ1 = cor(t1,u1) = max t,u cor(t,u)    canonical correlation.
  For the next levels, iterate the process under orthogonality constraints.
● CCA is analogous to PCA for the production and the interpretation of graphical outputs.
● The mathematical aspects are in the same vein as PCA (eigen decomposition of matrices)
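
A sketch of regularised CCA with mixOmics on the full nutrimouse data introduced on the next slide (the slide itself uses a small subset of 5 genes and 3 lipids); the regularisation values are illustrative and would normally be tuned, e.g. with tune.rcc():

  library(mixOmics)
  data(nutrimouse)
  X <- nutrimouse$gene      # gene expression for 40 mice
  Y <- nutrimouse$lipid     # hepatic fatty acid concentrations for the same mice
  res <- rcc(X, Y, ncomp = 3, lambda1 = 0.064, lambda2 = 0.008)
  res$cor                                              # canonical correlations
  plotIndiv(res, group = nutrimouse$genotype, legend = TRUE)
  plotVar(res, cutoff = 0.5)                           # variables most correlated with the variates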
73 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

CCA: nutrimouse data set


[Table: 40 rows x 8 columns — expression of the genes CYP4A10, CYP4A14, CAR1, RXRa, C16SR and concentration of the lipids C22.6n.3, C16.0, C20.2n.6]

● 40 mice (2 genotypes)
● Expression of 5 genes
● Concentration of 3 lipids

Question: are there any genes related to lipids?

Correlation matrix
74 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

CCA: nutrimouse data set


Individual plot
Variable plot
color depending on the
genotype added a posteriori

Canonical correlations : 0.853 0.627 0.253 


75 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

CCA: a fundamental method...

● If one data set has only one quantitative variable, CCA is equivalent to multiple linear regression.
● If one data set is a dummy matrix corresponding to the categories of a qualitative variable, CCA is equivalent to Linear Discriminant Analysis.
● If the two data sets are dummy matrices corresponding to the categories of two qualitative variables, CCA is equivalent to Correspondence Analysis.

76 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

… with limits

● CCA can only be performed with « enough » observations: n >> p+q (sounds like a joke regarding 'omics data...)

● The X and Y variables must not be « too » correlated

● Alternative: regularised CCA

77 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Alternatives
● PLS-related methods. In PLS, the algorithm amounts to finding linear combinations of the X and Y variables that have the greatest covariance.

● Regularized CCA. Apply a « ridge » penalty, using regularization parameters, to make the computation possible.

78 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

CCA: simulated example

● variables X1 and Y1 are strongly correlated (0.9)
● variables X2 and Y2 are less strongly correlated (0.7)
● the canonical correlations for X and Y are approximately
  ρ1 = 0.9, ρ2 = 0.7 and ρ3 = … = ρp = 0
● simulations are run for
  n = 50, p = 10 and q = 10; 25 and 39

79 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

CCA: simulated example

80 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Generalisation

● Generalized CCA (GCCA): integration of more than 2 data sets; maximizes the sum of all pairwise covariances between components.
● Sparse (see extensions) GCCA (SGCCA): variable selection is performed on each data set

Tenenhaus, A., Philippe, C., Guillemot, V., Lê Cao K-A., Grill, J., Frouin, V. 2014,
Variable selection for generalized canonical correlation analysis, Biostatistics

81 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Graphical outputs

The same principles as those for PCA are still true for other
multivariate methods mentioned here:

● Individuals plots: the coordinates of the individuals are given by


the components calculated with the method (PCA, PLS-DA, PLS,
CCA...)

● Variables plot: the variables are


usually represented using their
correlation with the components
defining the axes. In other words,
the coordinate of one variable Xj
on the component ti is given by
cor(Xj,ti)

82 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Alternative graphical outputs


Motivations: the usual plots are difficult to read and interpret when
● The number of variables is too high
● The number of relevant components is greater than 2, inducing a « more than 2D » representation space.

Propositions:
● Identify the pairs of highly related variables
● Produce graphical displays that make the interpretation easy

I. González, K-A. Lê Cao, M. Davis, S. Déjean (2012) – Visualising associations


between paired ‘omics’ data sets. BioData Mining

83 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Alternative graphical outputs


● Clustered Image Maps (CIM), Weinstein et al. (1997)
● Heatmaps, Eisen et al. (1998)

[CIM crossing the variables of set 1 with the variables of set 2 — it differs from usual heatmaps, which cross individuals and variables]

84 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Alternative graphical outputs


● Covariance Graph, Cox and Wermuth (1993)
● Relevance Network, Butte et al. (2000)

85 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Alternative graphical outputs


● Be careful when interpreting network-based visualisations!

● The same network (same links between the same nodes) can be represented in very different ways.

[Figure: two different layouts of the same network with nodes 1 to 10]

86 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Extensions: sparse

87 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Curse of dimensionality
https://en.wikipedia.org/wiki/Curse_of_dimensionality

The curse of dimensionality refers to various phenomena that arise when


analyzing and organizing data in high-dimensional spaces (often with
hundreds or thousands of dimensions) that do not occur in low-
dimensional settings such as the three-dimensional physical space of
everyday experience. The expression was coined by Richard E. Bellman
when considering problems in dynamic optimization.

→ Sparse methods aim at dealing with problems related to the high


dimension of the data.

Occam's razor (law of parsimony): this principle states that among


competing hypotheses, the one with the fewest assumptions should be
selected.

https://en.wikipedia.org/wiki/Occam's_razor

88 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Sparse PCA
High-throughput experiments: too many variables, noisy or irrelevant. The PCA output is difficult to visualise and understand.
→ the signal is clearer if some of the variable weights {a1, …, ap} are set to 0 for the 'irrelevant' variables (small weights):

t = 0.x1 + a2.x2 + … + 0.xp

● Important weights: important contribution to defining the PCs.
● Null weights: those variables are not taken into account when calculating the PCs
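
A sketch of sparse PCA with mixOmics (keepX gives the number of non-zero weights allowed per component; the values are illustrative):

  library(mixOmics)
  data(liver.toxicity)
  X <- liver.toxicity$gene
  res <- spca(X, ncomp = 2, keepX = c(10, 10))   # 10 non-zero loadings on each PC
  selectVar(res, comp = 1)$name                  # variables selected on PC 1
  plotVar(res, comp = c(1, 2))                   # correlation circle (selected variables)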

89 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Graphical outputs
PCA Sparse PCA

Variables plot

Individuals plot

90 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Extensions: multilevel

91 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Principle
● In repeated-measures experiments, the between-subject variation can be larger than the time/treatment variation
● Multivariate projection-based methods make the assumption that samples are independent of each other
● In univariate analysis we use a paired t-test rather than a t-test
● In multivariate analysis we use a multilevel approach
● Different sources of variation can be separated (treatment effect within subjects and differences between subjects)

92 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Paired data

● No paired structure (A): no


significant difference between
ctrl and trt

● Paired analysis (B): the data


is decomposed into a mean
(black circles) and a
difference (d) per subject. The
differences (net treatment
effect) are projected on the Y-
axis per subject, and are all
different from 0.

Fig. from Westerhuis et al. (2009). Multivariate paired data analysis: multilevel
PLSDA versus OPLSDA. Metabolomics 6(1).
93 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Data decomposition
Decomposition of the data into within and between variations

X = Xm + Xb + Xw
offset term between-sample within-sample

● The multilevel approach extracts the within variation matrix

● Classical multivariate tools can then be applied on the within matrix

→ We thus take into account the repeated-measures design of the experiment

Liquet, B. Lê Cao, K-A., et al. (2012). A novel approach for biomarker selection and
the integration of repeated measures experiments from two platforms, BMC
Bioinformatics, 13:325.
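
A sketch of this decomposition with mixOmics, following its multilevel documentation (assuming the vac18 repeated-measures data shipped with the package; exact argument names may vary with the package version):

  library(mixOmics)
  data(vac18)
  X <- vac18$genes                                 # gene expression, repeated measures
  design <- data.frame(sample = vac18$sample)      # subject identifier for each row
  Xw <- withinVariation(X, design = design)        # within-subject variation matrix
  pca.within <- pca(Xw, ncomp = 2, scale = TRUE)   # classical PCA applied to the within matrix
  # many mixOmics functions also accept a 'multilevel' argument directly:
  pca.ml <- pca(X, ncomp = 2, scale = TRUE, multilevel = design)
  plotIndiv(pca.ml, group = vac18$stimulation, legend = TRUE)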

94 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Multilevel: simulated example


3 variables (A, B, C) measured on 10 subjects (1...10) in 2 conditions, control or treatment.

Raw data set Between-subject matrix Within-subject matrix

condition subject A B C subject A B C DA DB DC


control 1 20 10 20 1 20.5 11 20 -1 -2 0
control 2 18 12 17 2 19.5 13 17 -3 -2 0
control 3 16 15 14 3 16.5 16 14 -1 -2 0
control 4 14 16 11 4 15.5 17 11 -3 -2 0
control 5 10 2 8 5 10.5 3 8 -1 -2 0
control 6 9 3 5 6 10.5 4 5 -3 -2 0
control 7 7 7 2 7 7.5 8 2 -1 -2 0
control 8 7 7 8 8 8.5 8 8 -3 -2 0
control 9 3 9 14 9 3.5 10 14 -1 -2 0
control 10 2 9 17 10 3.5 10 17 -3 -2 0
treatment 1 21 12 20 1 20.5 11 20 1 2 0
treatment 2 21 14 17 2 19.5 13 17 3 2 0
treatment 3 17 17 14 3 16.5 16 14 1 2 0
treatment 4 17 18 11 4 15.5 17 11 3 2 0
treatment 5 11 4 8 5 10.5 3 8 1 2 0
treatment 6 12 5 5 6 10.5 4 5 3 2 0
treatment 7 8 9 2 7 7.5 8 2 1 2 0
treatment 8 10 9 8 8 8.5 8 8 3 2 0
treatment 9 4 11 14 9 3.5 10 14 1 2 0
treatment 10 5 11 17 10 3.5 10 17 3 2 0

From Westerhuis et al. (2009).


Multivariate paired data analysis: multilevel PLSDA versus OPLSDA. Metabolomics 6(1).
95 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Multilevel: simulated example

PCA on raw data

● The main information lies in the close locations of the two measurements made on each subject (1-11, 2-12, …, 9-19, 10-20)

● No treatment effect can be observed

96 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Multilevel: simulated example


PCA on between matrix:
● Nearly the same information as obtained on the raw data
● Because the variability between subjects is greater than the variability due to the treatment

PCA on within matrix:
● Only 4 distinct points (related to the 4 unique rows in the within matrix)
● Treatment effect clearly appears
97 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

To put it in a nutshell
● Multivariate linear methods enable us to answer a wide range of biological questions
  – data exploration
  – classification
  – integration of multiple data sets
● Variable selection (sparse)
● Cross-over design (multilevel)

● Principles
  PCA    : max var(aX)            → a ?
  PLS1   : max cov(aX, by)        → a, b ?
  PLS2   : max cov(aX, bY)        → a, b ?
  CCA    : max cor(aX, bY)        → a, b ?
  PLS-DA → PLS2
  GCCA   : max Σ cov(aiXi, bjXj)  → ai, bj ?

98 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

Questions, feedback

Web site with tutorial :


www.mixomics.org

Contact : [email protected]­toulouse.fr

Register to our newsletter for the latest updates :


http://mixomics.org/a-propos/contact-us/

99 / 100
Introduction Reminders Exploration Discrimination Integration Graphics Extensions Conclusion

mixOmics would not exist without...

mixOmics development
Kim-Anh Lê Cao, Univ. Melbourne Methods development
Ignacio González, INRA Toulouse Amrit Singh, UBC, Vancouver
Benoît Gautier, UQDI Benoît Liquet, Univ. Pau
Florian Rohart, TRI, UQ Jasmin Straube, QFAB
Sébastien Déjean, Univ. Toulouse Philippe Besse, INSA Toulouse
François Bartolo, Methodomics Christèle Robert, INRA Toulouse
Xin Yi Chua, QFAB

Data providers and biological point of view


Pascal Martin, INRA Toulouse

And many many mixOmics users and attendees!


100 / 100
