

Lecture 6:

Modeling Input/Output Data: Partial Least Squares (PLS)

It was shown in the previous chapter that a few simple concepts from linear algebra opened up a world of ideas on how to look at real laboratory data. In this lecture we pursue this further. In Lecture 3, using the method of least squares (linear regression), we dealt with the case of an overdetermined system of linear equations. At the time it was pointed out that, given enough data, i.e., input/output data, we could use least squares to determine an assumed linear relation between these, i.e., the matrix relating them. Specifically, suppose a stimulus, s, and a response, r, with the relation

r = Ts

(6.1)

with dim(r) = n and dim(s) = m, so that T has n × m unknowns. In principle we can envision m linearly independent stimuli {s_j}, j = 1, ..., m, with corresponding responses {r_j}, so that

R = TS   (6.2)

with

R = [r_1  r_2  ...  r_m]   (6.3)

and

S = [s_1  s_2  ...  s_m].   (6.4)

It then follows that

T = R S⁻¹.   (6.5)
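Under the stated assumption of m linearly independent stimuli, (6.5) can be checked numerically. A minimal Matlab sketch (all names and sizes are illustrative assumptions, not part of the lecture):

n = 4; m = 3;
T_true = randn(n, m);      % the unknown relation
S = randn(m, m);           % m linearly independent stimuli as columns
R = T_true * S;            % the corresponding responses, (6.2)
T = R / S;                 % T = R*inv(S), i.e. (6.5)
norm(T - T_true)           % ~0 in the noise-free case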


In a real experiment noise is an issue, as is the nature and quality of the relation of input to output. This lecture will deal with these questions in a larger framework.
Determining T, the coefficient matrix in (6.5), has more than one perspective. If one dons the mask of a theoretician, the obvious choice for s_k is s_k = [0, ..., 0, s_k, 0, ..., 0]′, i.e., zeros except at the kth entry. This leads to the evaluation of the kth column of T at each of the m experiments, k = 1, ..., m. On the other hand, the experimenter is likely to repeat the same experiment, and as a result the differences in s & r are due to noise. Within this framework the system is being explored by noise. This is a legitimate procedure but carries with it the need for a (very) large number of repeats. Another feature deserves mention. In the example to be introduced next we consider the influence of tissue qualities (input) on malignancy (output). The latter is coded as +1 for cancer and -1 for non-malignancy. This represents a lumping, or coarse-graining, of the description which is inherited by T. If the outcome could be refined by, say, the degree and quality of the malignancy, a more refined model would result.

Linearity

To give substance to these ideas consider Table 6.1, which shows protein microarray data of lung tissue cell lines with 15 biomarkers, as collected from lung tissue and mesothelium, a lung membrane. Results are shown for both malignant and normal tissue. Protein levels represent the input and malignancy/non-malignancy the output.

Table 6.1: Protein biomarkers at left. In total there are 6 malignant and 3 non-malignant assays, all of which are represented as columns in the table; thus nine experiments in all.

Exercise 6.1. Download smalldata.xls from the website, which yields Table 6.1. (a) Merge the non-malignant data into a matrix and perform SVD on it. How many significant components?


(b) Same for the malignant data.

We first comment on the suitability of the assumption of linearity. For purposes of exposition imagine an input/output relation¹ of the general form

r = F(s).

(6.6)

At this stage in our thinking we imagine this as laboratory data in the form of a table. Suppose the input s is p dimensional and the output r is n dimensional. In the neighborhood of some operating point of interest, (r0 , s0 ),

r0 = F(s0 ),

(6.7)

then there should be some range over which a linear approximation, given by a two-term Taylor expansion,

y = r - r_0 = (∂F/∂s)|_{s_0} (s - s_0) = (∂F/∂s)|_{s_0} x,   (6.8)

is valid. This should provide a reasonable description of the input/output behavior over the same range. Further discussion of such an approach will be facilitated by reformatting the problem. First define

B = (∂F/∂s)′|_{s_0}   (6.9)

(the transpose enters because inputs and outputs will now be written as row vectors), in terms of which (6.8) can be written as

y = x B,

(6.10)

where the output, y, is an n-dimensional row vector, the input, x, a p-dimensional row vector, and B is a p × n matrix. The challenge in modeling the data, in input/output form, is then to determine B. As will be seen, the reference points (zero subscripts) might be taken to be average values. To give this some perspective, suppose (6.6) is a scalar equation, i.e., the experiment produces one (scalar) output y from a vector input, so that instead of (6.10) we would write

y = x b

(6.11)

to emphasize that (6.9) in this case is a vector, of order p, rather than a matrix. Clearly, many experiments are needed (> p) in order to determine the components of the vector b, followed by an appropriate regression analysis.
¹ Other equivalents are: stimulus/response, predictor/resultant, and so forth.


Returning to the general case, (6.10), we can assemble many, say N, experiments as the rows of an N × n matrix Y, and each of the inputs as the rows of an N × p matrix X,

Y = [y_1; y_2; ... ; y_N]  &  X = [x_1; x_2; ... ; x_N],   (6.12)

so (6.10) becomes

Y = XB.   (6.13)

Recalling our discussion of the similar case in Lecture 3, we might then consider a least squares fit of (6.13), which yields

B = (X′X)⁻¹ X′Y,   (6.14)

and if X′X is not of full rank we can determine the Moore-Penrose inverse, X⁺ = pinv(X), so that

B = X⁺Y.   (6.15)
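As a minimal Matlab sketch of (6.14) and (6.15), using the X and Y just assembled (the variable names Bls and Bp are illustrative):

Bls = (X'*X) \ (X'*Y);   % (6.14), valid when X'*X is of full rank
Bp  = pinv(X)*Y;         % (6.15), the Moore-Penrose alternative
% The two agree when X has full column rank.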

To illustrate with the data of Table 6.1 we merge all the data, so that each column of the table represents an experiment and appears as a row of protein assays in

X = [0.93 1.51 1.19 0.03 0.02 0.27 0.06 0.00 0.12 0.00 0.12 0.52 0.00 0.12 0.00;
     1.16 2.07 1.79 1.16 0.09 0.02 0.69 0.03 0.24 0.05 0.16 0.40 0.03 0.86 0.01;
     0.68 1.67 1.25 0.34 0.25 0.71 0.66 0.00 0.41 0.00 0.23 0.71 0.00 0.60 0.00;
     1.08 2.04 1.36 0.59 0.506 1 0.64 0.03 0.26 0.07 0.05 0.19 0.01 0.01 0.075;
     1.01 1.64 1.54 0.04 0.79 0.13 0.31 0.00 0.88 0.00 0.23 0.37 0.00 0.25 0.03;
     0.88 2.01 1.74 0.31 0.99 0.02 1.34 0.09 0.25 0.02 0.01 0.20 0.04 0.46 0.04;
     0.54 4.61 4.87 0.00 0.00 0.17 2.85 0.49 1.09 0.29 2.87 7.47 0.87 3.79 0.32;
     0.12 5.38 4.32 0.28 0.2 0.35 2.43 0.16 0.91 0.48 0.89 5.62 0.27 1.26 0.72;
     0.21 3.14 2.28 0.10 0.363 0.00 1.25 0.11 1.02 0.17 0.73 2.68 0.32 1.08 0.40].   (6.16)

The output is the presence or absence of malignancy. This we might express as

Y1 = [1 1 1 1 -1 -1 1 1 -1]′   (6.17)

or equivalently as


Y2 = [1 1 1 1 0 0 1 1 0; 0 0 0 0 1 1 0 0 1]′,   (6.18)

whose two columns indicate malignancy and non-malignancy, respectively.

Moore-Penrose (Pseudo) Inverse

This is an opportune time to reconsider the Moore-Penrose inverse, since it is easily understood within the framework of the SVD. Suppose X is N × M and has the SVD

X = U Σ V′ = ∑_{j=1}^{N} σ_j u_j v_j′   (6.19)

with u_j & v_j the orthonormal columns of U & V, respectively, and σ_j the elements of the diagonal matrix Σ. Then the Moore-Penrose inverse X⁺ is given by

X⁺ = ∑_{j=1}^{N} σ_j⁺ v_j u_j′,   (6.20)

where

σ_j⁺ = 1/σ_j if σ_j ≠ 0;  σ_j⁺ = 0 if σ_j = 0.   (6.21)

Observe that

X⁺X = ∑_{σ_j ≠ 0} v_j v_j′   (6.22)

is the identity matrix in a restricted space.

Exercise 6.2. Show
1. X X⁺ X = X
2. X⁺ X X⁺ = X⁺
3. (X X⁺)′ = X X⁺
4. (X⁺ X)′ = X⁺ X
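As a numerical illustration of (6.19)-(6.21) and of the identities in Exercise 6.2, the following Matlab sketch (all variable names and the test matrix are assumptions) builds X⁺ from the SVD and checks the four conditions; pinv carries out essentially the same construction.

X = randn(5, 3) * randn(3, 8);            % a deliberately rank-deficient matrix
[U, S, V] = svd(X, 'econ');
s = diag(S);
sp = zeros(size(s));
sp(s > 1e-10) = 1 ./ s(s > 1e-10);        % invert only the nonzero singular values, (6.21)
Xp = V * diag(sp) * U';                   % X^+ as in (6.20)
norm(X*Xp*X - X), norm(Xp*X*Xp - Xp)      % conditions 1 and 2, both ~0
norm((X*Xp)' - X*Xp), norm((Xp*X)' - Xp*X)  % conditions 3 and 4, both ~0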

Exercise 6.3. For (6.16) and (6.17) the model is a scalar one,

y = x b.   (6.23)

Use the Moore-Penrose inverse to find b. How well does this reproduce (6.17)? From the SVD of X in this case, what are the σ_j? Do you see a problem?


As will become clear, the discussion leading to (6.14) or (6.15) is purely formal. It does not address a number of underlying issues that accompany this approach, or for that matter any general discussion of data modeling. We enumerate some of the issues:

(1) One might, for example, think of using SVD to decompose both Y and X, but these lead to representations in formally unrelated spaces, and there is no direct way to link them.

(2) In constructing Y, we assemble what we regard as a reasonable collection of responses (observations), but there might be some redundancy.

(3) In assembling X, the input, or predictors, we face the same problem. In addition, we might be recording input that is uncorrelated with what is deemed to be the appropriate output.

Observation: An attractive idea is to try to find a way which exploits the correlation between input and output.

(4) The units of both Y and X can be an issue. To see this, suppose voltage and temperature are recorded output variables. Clearly, these carry different units, and for each, what units should we use? If we use microvolts instead of volts, a factor of 10⁶ enters and gives a great weight to voltage. Even if the physical units are homogeneous, problems can appear. Imagine a biochemical soup with vastly different concentrations.

Data preparation

Item (4) addresses a general issue in dealing with data and can be taken care of by the following recipe: normalize all data points by removing the mean (called centering) and then divide the result by the standard deviation. More specifically, if {x_j}, j = 1, 2, ..., N, denotes the elements of a column of X or Y, then form the mean,

x̄ = (1/N) ∑_{j=1}^{N} x_j,   (6.24)

and the variance,

σ² = (1/N) ∑_{j=1}^{N} (x_j - x̄)²,   (6.25)

and redefine the input/output as

z_j = (x_j - x̄)/σ.   (6.26)

The last is called the Z-score of the measurements (zscore in Matlab); it puts all columns of X on an equal footing, and similarly for Y. Observe that this recipe centers all the data, confers unit variance on it, and in doing so eliminates physical units.
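In Matlab the recipe (6.24)-(6.26) is essentially a one-liner; a minimal sketch (variable names assumed):

Xz = zscore(X, 1);   % center each column and scale to unit variance, as in (6.26)
Yz = zscore(Y, 1);   % the flag 1 divides by N as in (6.25); zscore's default divides by N-1
% equivalently: Xz = (X - ones(size(X,1),1)*mean(X)) ./ (ones(size(X,1),1)*std(X,1));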


On a cautionary note, it should be observed that for the Rogues Gallery problem the data is composed of images, in particular the gray levels at pixels. If the gray levels at each pixel are transformed to Z-scores, to some degree we might lose the information carried by the placement of the pixels.

Exercise 6.4 (Optional). Subject the male population ensemble to the Z-score transformation. Calculate eigenfaces and spectra accordingly. After undoing the Z-score in the results, compare these with the original analysis.

Partial Least Squares

Observe that the above data preparation does not depend on the motivational Taylor-expansion discussion given at the start of the lecture. We can now regard Y and X as the matrices of measured output and input, respectively, that have been prepared, perhaps in Z-score form. Further, in order to model the underlying phenomena we seek a matrix B such that

Y = XB + ε,   (6.27)

where we put in an error term ε, the residual, to reflect the fact that there may be a (small) error in the model. It is sometimes useful to write (6.27) as

Y [N × n] = X [N × p] B [p × n] + ε [N × n]   (6.28)

to remind you of the matrix sizes. At this point we can think of B as the differential approximation of some unknown (probably non-linear) law. Next we deal with the first three issues raised above, and follow up on the Observation. Clearly, it would be desirable to correlate input with output. But, as mentioned, in general the two vectors live in different spaces.
Motivation is provided by going back to SVD, for say the Rogues gallery problem. In this case an ensemble of faces is depicted by the matrix F, (5.47), of N rows, the number of faces, each of which has a large number, say M , of pixel gray levels given by the number of columns of F. Formally, we can recast (5.33) and (5.34) by regarding

v = Fw   (6.29)

as an input/output relation and requiring that

C = (v, Fw)   (6.30)

be maximized subject to the side conditions

S_w = ‖w‖ - 1 = 0  &  S_v = ‖v‖ - 1 = 0.   (6.31)

This leads to the Lagrangian

L = (v, Fw) - λ(‖w‖ - 1) - μ(‖v‖ - 1),   (6.32)


which in turn yields (5.33) and (5.34).² This suggests how to link input and output.

We can now seek a linear combination of the output columns of (6.13), viz.,

u = Yq   (6.33)

with

‖q‖ = 1,   (6.34)

and a linear combination of the input columns,

t = Xw,   (6.35)

with weights w (also called loadings) normalized so that

‖w‖ = 1,   (6.36)

such that the correlation of output and input,

(t, u) = (Xw, Yq),   (6.37)

is maximized. Within the framework of the lung data of Table 6.1, (6.37) poses the question: is there a combination of biomarkers, essentially averaged over all experiments, which is best correlated with some combination of outcomes, again averaged over all experiments? Note that if we maximize (6.37) then t and u will be an admixture of the data, as was the case for SVD. It is worth noting in passing that for the cancer problem Y is a column vector, and therefore we are looking at its correlation with columns of X. Therefore once again we are led to an optimization framework involving Lagrange multipliers. The criterion function now is

C = (Xw, Yq)   (6.38)

and the side conditions are (6.34) and (6.36).³ The Lagrangian is then given by

L = (Xw, Yq) - λ(‖w‖² - 1)/2 - μ(‖q‖² - 1)/2,   (6.39)

²The analysis then implies λ = μ.
³Note that (6.38) can take on either sign. A large negative value of C, anti-correlation, is as much an indication of output/input linkage as a large positive value of C.

and the variational equations are easily seen to lead to


X′Yq = λw   (6.40)

and

Y′Xw = μq,   (6.41)

along with (6.34) and (6.36). This is similar to (5.33) and (5.34), but unlike those two, which are correlations, X′Y and Y′X are cross-correlations of input and output, which underlines the desire to link output to input. Since a matrix and its adjoint are on an equal footing we can take

λ = μ.   (6.42)

Proceeding as for the case of SVD we can back-substitute (6.41) into (6.40) to obtain

X′YY′Xw = λ²w   (6.43)

and vice versa to obtain

Y′XX′Yq = λ²q.   (6.44)

As was the case for SVD only one of these need be solved, in which case either (6.40) or (6.41) is used to get the other. Formally, from (6.33) and (6.35) we observe that

YY′XX′u = λ²u   (6.45)

and

XX′YY′t = λ²t.   (6.46)

Neither of these need be solved if either w or q is known, since (6.35) gives t and (6.33) gives u.
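In practice w and q need not be obtained from the eigenproblems directly: maximizing (6.37) under the side conditions makes them the leading left and right singular vectors of the cross-correlation matrix X′Y. A minimal Matlab sketch (names assumed; X and Y already prepared):

[Wc, Sc, Qc] = svd(X'*Y);     % SVD of the cross-correlation of input and output
w = Wc(:,1);  q = Qc(:,1);    % satisfy (6.40)-(6.41) with lambda = mu = Sc(1,1)
t = X*w;      u = Y*q;        % the latent vectors (6.35) and (6.33)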

It should be noted that the matrix appearing in (6.45) (and also in (6.46)) is not necessarily symmetric, and therefore, even though the eigenvectors of (6.43) or (6.44) are orthonormal, since the corresponding matrices are symmetric, this is not true for, say, (6.46). As will become clear shortly, Partial Least Squares (PLS) requires a set {t_k} of vectors (the latent vectors) which are orthogonal. Therefore at present we have only determined one latent vector, say t_1, which corresponds to w_1, i.e., to the maximal value of λ² from (6.43). The strategy for moving forward is clear: the principal factor t_1 should be removed from the input and the output, and with this done we repeat the optimization on the reduced input and output matrices. This is reminiscent of the matrix deflation that appears in Lecture 5. We now review this in more detail.
Deflation of a Matrix

Suppose A = A_1 is the matrix in question and {w_j}, j = 1, 2, ..., a set of linearly independent vectors; then we define

t_k = A_k w_k,  p_k = A_k′ t_k / ‖t_k‖²,  A_{k+1} = A_k - t_k p_k′.   (6.47)

Observe that from (6.47)

t_k′ A_{k+1} = 0.   (6.48)

Note that this algorithm is to be applied recursively for an appropriate number of steps. If A_1 is of rank N, then A_{N+1} = 0.

Exercise 6.5. Use (6.48) to show that the {t_k} form an orthogonal set and from this that

A = ∑_{k=1}^{N} t_k p_k′.   (6.49)
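A minimal Matlab sketch of the recursion (6.47)-(6.49) (the test matrices and all names are illustrative assumptions):

A = randn(6, 4);  W = orth(randn(4, 4));   % A1 and linearly independent w_k
Ak = A;  T = [];  P = [];
for k = 1:4
    tk = Ak * W(:,k);                      % t_k = A_k w_k
    pk = Ak' * tk / (tk' * tk);            % p_k = A_k' t_k / ||t_k||^2
    Ak = Ak - tk * pk';                    % A_{k+1} = A_k - t_k p_k'
    T = [T, tk];  P = [P, pk];
end
norm(Ak)                          % ~0: A is fully deflated after rank(A) steps
norm(A - T*P')                    % reproduces (6.49)
norm(T'*T - diag(diag(T'*T)))     % ~0: the t_k are mutually orthogonal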

Before continuing, mention should be made that there are different algorithms that parade under the name of PLS. This should underline for you the fact that this is really a framework, which might require tailoring in a particular situation. However, in all cases the centerpiece of the development is extremizing (6.37). Thus, unlike SVD (or PCA), where the weights (loadings) w reflect the covariance within a data structure, in PLS the weights w reflect the covariance structure of input and output. This is important for the modeling we wish to achieve, (6.27). However, quite aside from this it will also lead to a clearer realization of what is the response, and of what might be regarded as relevant input.

PLS Procedure

In implementing PLS the predictor (input) matrix is emphasized, and in this spirit the matrix X is deflated. The algorithm is started by setting

X_1 = X  &  Y_1 = Y,   (6.50)

and w_1 is the eigenvector of (6.43) that corresponds to the maximal value of λ², as already discussed. Then for k = 1 we determine

t_k = X_k w_k,  p_k = X_k′ t_k / ‖t_k‖²,  q_k = Y_k′ t_k / ‖t_k‖².   (6.51)

At this point we deflate X_k & Y_k,

X_{k+1} = X_k - t_k p_k′,  Y_{k+1} = Y_k - t_k q_k′.   (6.52)

Next we repeat the process of maximizing the covariance of input and output and find the solution of

X_{k+1}′ Y_{k+1} Y_{k+1}′ X_{k+1} w_{k+1} = λ² w_{k+1}   (6.53)

for maximal λ². We continue to cycle through (6.51) and so forth to obtain

T = [t_1  t_2  ...].   (6.54)

As an alternative to carrying ‖t_k‖² in the above we can force t to have unit norm, by defining

r = w / ‖Xw‖   (6.55)

and then redefining t by

t = Xr,   (6.56)

so that

‖t‖ = 1.   (6.57)

The degree of deflation is left open. It then follows that

X = TP + E,   (6.58)

with

P = (T′T)⁻¹ T′X,   (6.59)



where E is the error matrix, which is zero if we completely deflate X. The fractional variance captured is given by

(‖X‖² - ‖E‖²) / ‖X‖²,   (6.60)

where

‖X‖² = ∑_{ij} X_ij²,   (6.61)

called the Frobenius norm, which is just the usual vector norm if the rows of a matrix are concatenated to form a vector. If X is fully deflated then 100% of the variance is captured. We can also write

Y = TQ + F,   (6.62)

with

Q = (T′T)⁻¹ T′Y.   (6.63)

Since (6.37) is being maximized we can hope that a high variance capture of X is accompanied by a relatively high variance capture of Y. In any case

(‖Y‖² - ‖F‖²) / ‖Y‖²   (6.64)

is the fractional capture, which may not be unity even if (6.60) is.

is the fractional capture, which may not be unity even if (6.60) is. Finally, if we can write

Q = PB

(6.65)

then substitution in (6.62) gives us the sought after model. Since both Q and P are known, B in principal is known. Since the inversion of P may be problematical we resort to the Moore-Penrose inverse and write

B0 = P+ Q. Thus the required model is

(6.66)

Y XB0 , where we dont explicitly point out the error term.

(6.67)
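Gathering (6.50)-(6.53), (6.59), (6.63) and (6.66), a minimal Matlab sketch of the procedure just described might look as follows. This is the lecture's NIPALS-style scheme, not the SIMPLS code behind plsregress; the function name, the variable names, and the use of K components are assumptions.

function B0 = pls_sketch(X, Y, K)
    % X: prepared N x p input, Y: prepared N x n output, K: number of latent vectors
    Xk = X;  Yk = Y;  T = [];
    for k = 1:K
        [W, ~, ~] = svd(Xk' * Yk);      % leading eigenvector of (6.53)
        wk = W(:, 1);
        tk = Xk * wk;                   % latent vector, (6.51)
        pk = Xk' * tk / (tk' * tk);     % X loading, (6.51)
        qk = Yk' * tk / (tk' * tk);     % Y loading, (6.51)
        Xk = Xk - tk * pk';             % deflation, (6.52)
        Yk = Yk - tk * qk';
        T = [T, tk];
    end
    P = (T' * T) \ (T' * X);            % (6.59)
    Q = (T' * T) \ (T' * Y);            % (6.63)
    B0 = pinv(P) * Q;                   % (6.66); the model is Y ~ X*B0, (6.67)
end

For the cancer data one might call, say, B0 = pls_sketch(zscore(X), zscore(Y1), 2) and compare the result with Figure 6.3 below.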


The objective in arriving at (6.67) was not to recast the known data in this form, but rather to arrive at a predictive model to be applied to new data. As will be seen, there is value in examining (6.66) for the data used in obtaining B_0. Incidentally, the data used to generate B_0 can be referred to as the training set. The algorithm just described goes under the acronym NIPALS (Höskuldsson, A. (1988). PLS regression methods. Journal of Chemometrics, 2:211-228; Geladi, P., & Kowalski, B. (1986). Partial least-squares regression: A tutorial. Analytica Chimica Acta, 185:1-17). A variation of this, due to de Jong (de Jong, S. (1993). SIMPLS: an alternative approach to partial least squares regression. Chemometrics and Intelligent Laboratory Systems, 18:251-263), called SIMPLS, appears in the Matlab Statistics toolbox as plsregress. The reason for the given exposition is that, although far from pretty, it is systematic, and I believe it does reveal the internal structure, which was the intention. A minor difference of this algorithm is that the latent vectors t_n are taken to be of unit length. An extensive website for PLS can be found at www.models.kvl.dk/source/nwaytoolbox/. PLS has been used effectively to model biochemical signaling. See for example Gaudet, S., Janes, K.A., Albeck, J.G., Pace, E.A., Lauffenburger, D.A. and Sorger, P.K. A compendium of signals and responses triggered by prodeath and prosurvival cytokines. Mol. Cell. Proteomics, 4:1569-1590, 2005; and Janes, K.A., Albeck, J.G., Gaudet, S., Sorger, P.K., Lauffenburger, D.A. and Yaffe, M.B. A systems model of signaling identifies a molecular basis set for cytokine-induced apoptosis. Science, 310:1646-1653, 2005.

Example: The Cancer Data

To begin with it should be mentioned that there is considerable variation in notation and procedures associated with PLS. This Lecture mainly attempts to convey the basic ideas and a consensus notation. The approach given by S. de Jong, cited above, is what Matlab implements. The Matlab documentation of plsregress is opaque, and the following is intended as a guide to the use of this algorithm. It should first be noted that plsregress does not directly yield results for the standardized Z-score form of the data, but it does center the data. In terms of input X and output Y, Matlab calls PLS with

[XL, YL, XS, YS, β, PCTVAR, MSE, stats] = plsregress(X, Y, ncomp)   (6.68)

ncomp refers to the number of components. The various elements of (6.68) will be discussed in the course of modeling (6.16)-(6.17). In tensor form, (6.27) can be written as

Y_ij = X_im B_mj + ε_ij,   (6.69)

where the repeated index m is summed, i = 1, ..., N indexes the experiments, 9 in number, j = 1, ..., n (n = 1) the output readings for each, and m = 1, ..., p (p = 15) the input measurements for each experiment; ε_ij is the error, or residual.


plsregress first centers the data. To accomplish this we consider the average over experiments,

Ȳ_j = (1/N) ∑_{i=1}^{N} Y_ij = (1/N) ∑_{i=1}^{N} X_im B_mj + (1/N) ∑_{i=1}^{N} ε_ij = X̄_m B_mj + ε̄_j.   (6.70)

If we define u as

(u)_i = 1,  i = 1, ..., N,   (6.71)

then the centered data is given by

Y°_ij = Y_ij - u_i Ȳ_j = (X_im - u_i X̄_m) B_mj + (ε_ij - u_i ε̄_j),   (6.72)

and clearly each term of (6.72) averages to zero. Note the resemblance of this to (6.8). Y°_ij is the centered output and

X°_im = X_im - u_i X̄_m   (6.73)

the centered input. As mentioned, a distinction between the algorithm and the development given in this Lecture is that the t_i appearing in (6.58) are normalized to be of unit length in the algorithm (and thus w is no longer of unit length). The T matrix, (6.54), is essentially given by XS, the third output in (6.68). It follows from the form of X°, (6.73), that T requires more explanation. As you can verify, rank(X) = 9; however, rank(XS) is 8. To resolve this, note that X° is X with the mean of each column removed; hence the rows of X° sum to the zero vector, so the rank is diminished by 1 and the rank of XS should be 8. The first two entries of (6.68) are given by

XL = (X°)′ XS,  YL = (Y°)′ XS.   (6.74)

In terms of the original variables we can write

Y_ij ≈ u_i Ȳ_j + X°_im B_mj = u_i (Ȳ_j - X̄_m B_mj) + X_im B_mj,   (6.75)

where for convenience the error is not carried. The quantity β that appears in (6.68) is defined so that


β_{1j} = Ȳ_j - X̄_m B_mj,  β_{mj} = B_{m-1,j},  m > 1.   (6.76)

Note that the first relation of (6.76) is the first term on the right of (6.75), and is the offset, or intercept, for X = 0. In Matlab language the model is

Y = [ones(n, 1), X] β + residual.   (6.77)

Returning to (6.68): PCTVAR is a 2 × ncomp matrix of fractional variances for each component, of X (1st row) and of Y (2nd row). MSE contains various error estimates, and stats, a structure, refers to various statistics. In Figure 6.1 we plot the cumulative percent variance captured, PCTVAR, for each PLS component:

plot(1:8, cumsum(p6(2,:))*100, 'ro', 1:8, cumsum(p6(1,:))*100, 'b*')

The variance of X appears in blue and that of Y in red. Observe that for 2 or 3 components it would seem that we do quite well.
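For completeness, a sketch of the call that produces the quantity plotted above (the variable name p6 and the use of 8 components are assumptions; X is (6.16) and Y1 is (6.17)):

[XL, YL, XS, YS, beta, p6] = plsregress(X, Y1, 8);   % p6 is PCTVAR, a 2 x 8 matrix
% cumsum(p6(1,:)) and cumsum(p6(2,:)) are the cumulative fractions of the
% X- and Y-variance captured by the first 1, 2, ..., 8 components.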

Fig. 6.1: Percent capture of X and Y


A slightly different interpretation emerges if we actually carry out the fit indicated by (6.75) or (6.77):

yt = [u, X] β;   (6.78)

The exact result is shown in Figure 6.2 as a red O. Also indicated are the fits using all components, 2 components (+), and 3 components (blue O). The last two show some relatively large discrepancies. This is mostly due to the fact that variance is the square of the standard deviation.

Fig. 6.2: Recovery of the output Y in terms of 2,3,8 components

The output in this example, ±1, is a scalar, and therefore once again the sought-after model has the form

y ≈ x b,   (6.79)

which is the form given in (6.78). Figure 6.3 shows b as a bar plot, and if we go back to Table 6.1 we can get an indication of which proteins are the most significant determinants of malignancy/non-malignancy.


Fig. 6.3: b the vector of (6.79)

There are at least two questions which should be explored. (1) How good is the model as a predictive tool? (2) How good is the procedure at eliminating irrelevant data? The following exercise attempts to answer these questions using a larger dataset that is available in Matlab.

Exercise 6.6 (Special). In Matlab enter doc plsregress, which describes plsregress and also furnishes an example using a dataset which you can access with

load spectra


This furnishes an input X = NIR and an output octane. The latter gives the octane values of 60 gasoline samples, and the former the near-infrared (NIR) spectral intensities at 401 wavelengths for each of these samples.

(I) To deal with question (1), randomly select 10 experiments and remove them from the data for the purposes of testing the model. Perform PLS on the remaining set of 50 with ncomp = 2, 5, 10, 20, 40, 60 and compute b in each case. Test to see how well

y ≈ x b   (6.80)

fits the known values for the ten test cases. In this case y is the octane value and b is a vector of 401 components. What is the best ncomp?

(II) The spectral intensities appear to lie within (0, 1). We can create a cohort of fictitious spectra by considering

Xnew = [NIR, rand(60, k)]   (6.81)

instead of NIR. Ideally the last k wavelengths should contribute negligible components to b. (a) Perform SVD on NIR. How many significant modes? (b) Repeat this calculation on Xnew for k = 1, 2, 6, 10.

(c) Using the best ncomp from part (I), and using the full Xnew & octane, see what the contribution to b is at the fictitious wavelengths for k = 1, 2, 6, 10.

Exercise 6.7 (Special). (a) Obtain an algorithm in the form of an m-file which includes the option of putting the data in Z-score form. The program should call plsregress. (b) The program should contain as one of the outputs the model fit in the original variables, i.e., with the Z-score transformation (mean and standard deviation) undone. (c) Apply this to the cancer data and compare with the results discussed in this Lecture.
