Linear Mixed Models for Longitudinal Data

Geert Verbeke
Geert Molenberghs

Springer

Springer Series in Statistics
Advisors: P. Bickel, P. Diggle, S. Fienberg, K. Krickeberg, I. Olkin, N. Wermuth, S. Zeger

Springer
New York Berlin Heidelberg Barcelona Hong Kong London Milan Paris Singapore Tokyo

Springer Series in Statistics

Andersen/Borgan/Gill/Keiding: Statistical Models Based on Counting Processes.
Berger: Statistical Decision Theory and Bayesian Analysis, 2nd edition.
Bolfarine/Zacks: Prediction Theory for Finite Populations.
Borg/Groenen: Modern Multidimensional Scaling: Theory and Applications.
Brockwell/Davis: Time Series: Theory and Methods, 2nd edition.
Chen/Shao/Ibrahim: Monte Carlo Methods in Bayesian Computation.
Efromovich: Nonparametric Curve Estimation: Methods, Theory, and Applications.
Fahrmeir/Tutz: Multivariate Statistical Modelling Based on Generalized Linear Models.
Farebrother: Fitting Linear Relationships: A History of the Calculus of Observations 1750-1900.
Federer: Statistical Design and Analysis for Intercropping Experiments, Volume I: Two Crops.
Federer: Statistical Design and Analysis for Intercropping Experiments, Volume II: Three or More Crops.
Fienberg/Hoaglin/Kruskal/Tanur (Eds.): A Statistical Model: Frederick Mosteller's Contributions to Statistics, Science and Public Policy.
Fisher/Sen: The Collected Works of Wassily Hoeffding.
Good: Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses, 2nd edition.
Gouriéroux: ARCH Models and Financial Applications.
Grandell: Aspects of Risk Theory.
Haberman: Advanced Statistics, Volume I: Description of Populations.
Hall: The Bootstrap and Edgeworth Expansion.
Härdle: Smoothing Techniques: With Implementation in S.
Hart: Nonparametric Smoothing and Lack-of-Fit Tests.
Hartigan: Bayes Theory.
Hedayat/Sloane/Stufken: Orthogonal Arrays: Theory and Applications.
Heyde: Quasi-Likelihood and its Application: A General Approach to Optimal Parameter Estimation.
Huet/Bouvier/Gruet/Jolivet: Statistical Tools for Nonlinear Regression: A Practical Guide with S-PLUS Examples.
Kolen/Brennan: Test Equating: Methods and Practices.
Kotz/Johnson (Eds.): Breakthroughs in Statistics Volume I.
Kotz/Johnson (Eds.): Breakthroughs in Statistics Volume II.
Kotz/Johnson (Eds.): Breakthroughs in Statistics Volume III.
Küchler/Sørensen: Exponential Families of Stochastic Processes.
Le Cam: Asymptotic Methods in Statistical Decision Theory.
Le Cam/Yang: Asymptotics in Statistics: Some Basic Concepts.
Longford: Models for Uncertainty in Educational Testing.
Miller, Jr.: Simultaneous Statistical Inference, 2nd edition.
Mosteller/Wallace: Applied Bayesian and Classical Inference: The Case of the Federalist Papers.
Parzen/Tanabe/Kitagawa: Selected Papers of Hirotugu Akaike.
Politis/Romano/Wolf: Subsampling.
(continued after index)

Geert Verbeke
Geert Molenberghs

Linear Mixed Models for Longitudinal Data

With 128 Illustrations

Springer

Geert Verbeke
Biostatistical Centre
Katholieke Universiteit Leuven
Kapucijnenvoer 35
B-3000 Leuven
Belgium

Geert Molenberghs
Biostatistics
Center for Statistics
Limburgs Universitair Centrum
Universitaire Campus, Building D
B-3590 Diepenbeek
Belgium

Library of Congress Cataloging-in-Publication Data
Verbeke, Geert.
Linear mixed models for longitudinal data / Geert Verbeke, Geert Molenberghs.
p. cm. -- (Springer series in statistics)
Includes bibliographical references and index.
ISBN 0-387-95027-3 (alk. paper)
1. Linear models (Statistics) 2. Longitudinal method. I. Molenberghs, Geert. II. Title. III. Series.
QA279 .V458 2000
519.5'3--dc21

© 2000 Springer-Verlag New York, Inc.
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue, New York, NY 10010, USA), except for brief excerpts in connection with reviews or scholarly analysis.
Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the former are not especially identified, is not to be taken as a sign that such names, as understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.

00-026596

ISBN 0-387-95027-3 Springer-Verlag New York Berlin Heidelberg SPIN 10761593

Preface

The dissemination of the MIXED procedure in SAS and related software has made a whole class of linear mixed-effects models, some with a long history, available for routine use. Experience shows that both the ideas behind the techniques and their software implementation are not at all straightforward, and users from various applied backgrounds often encounter difficulties in using the methodology effectively. Courses and consultancy in this domain have been in great demand over the last decade, illustrating the clear need for resource material to aid the user.

As an outgrowth of such courses, Verbeke and Molenberghs (1997) was intended as a contribution to bridging this gap. Since its appearance, it has been the basis for several short and regular courses in academia and industry. In the meantime, many research papers on these and related topics have appeared in the statistical literature. Therefore, it is considered timely to present a second, entirely recast version. Material kept from Verbeke and Molenberghs (1997) has been reworked, and a large range of new topics has been added.
The structure of the book reflects not only our own research activity but also our experience in teaching various applied longitudinal modeling courses, such as the Longitudinal Data Analysis course in the Master of Science in Biostatistics Programme of the Limburgs Universitair Centrum, the Repeated Measures course in the International Study Programme in Statistics of the Katholieke Universiteit Leuven, and the Topics in Biostatistics course at the Universiteit Antwerpen.

As with the first version, we hope this book will be of value to a wide audience, including applied statisticians and biomedical researchers, particularly in the pharmaceutical industry, medical and public health research organizations, contract research organizations, and academic departments. This implies that most of the chapters are explanatory rather than research oriented and that they emphasize practice rather than mathematical rigor. In this respect, guidance and advice on practical issues are the main focus of the text. On the other hand, some more advanced topics are included as well, which we believe to be of use to the more demanding modeler.

In the first version, we had placed strong emphasis on the SAS procedure MIXED, without discouraging the non-SAS users. Considerable effort was put into treating data analysis issues in a generic fashion, instead of making them fully software dependent. Therefore, a research question was first translated into a statistical model by means of algebraic notation. In a number of cases, such a model was then implemented using SAS code. This was positively received by many readers and we have therefore for the most part kept this format. In this version, many of the SAS-related issues are centralized in a single chapter, and we still keep selected examples throughout the text. Additionally, an Appendix is devoted to other software tools (MLwiN, SPlus).
Because SAS Version 7 has not been generally marketed, SAS Version 6.12 was used throughout this book. The Appendix briefly lists the most important changes in Version 7.

Selected macros for tools discussed in the text, not otherwise available in commercial software packages, as well as publicly available data sets, can be found at Springer-Verlag's URL: www.springer-ny.com.

Geert Verbeke (Katholieke Universiteit Leuven, Leuven)
Geert Molenberghs (Limburgs Universitair Centrum, Diepenbeek)

Acknowledgments

This book has been accomplished with considerable help from several people. We would like to gratefully acknowledge their support.

A large part of this book is based on joint research. We are grateful to several co-authors: Larry Brant (Gerontology Research Center and The Johns Hopkins University, Baltimore), Luc Bijnens (Janssen Research Foundation, Beerse), Tomasz Burzykowski (Limburgs Universitair Centrum), Marc Buyse (International Institute for Drug Development, Brussels), Desmond Curran (European Organization for Research and Treatment of Cancer, Brussels), Helena Geys (Limburgs Universitair Centrum), Mike Kenward (London School of Hygiene and Tropical Medicine), Emmanuel Lesaffre (Katholieke Universiteit Leuven), Stuart Lipsitz (Medical University of South Carolina, Charleston), Bart Michiels (Janssen Research Foundation, Beerse), Didier Renard (Limburgs Universitair Centrum), Ziv Shkedy (Limburgs Universitair Centrum), Bart Spiessens (Katholieke Universiteit Leuven), Herbert Thijs (Limburgs Universitair Centrum), Tony Vangeneugden (Janssen Research Foundation, Beerse), and Paige Williams (Harvard School of Public Health, Boston).

Russell Wolfinger (SAS Institute, Cary, NC) has been kind enough to provide us with a trial version of SAS Version 7.0 during the development of this text. Bart Spiessens (Katholieke Universiteit Leuven) kindly provided us with technical support.
Steffen Fieuws (Katholieke Universiteit Leuven) commented on earlier versions of the text.

We gratefully acknowledge support from Research Project Fonds voor Wetenschappelijk Onderzoek Vlaanderen G.0002.98: "Sensitivity Analysis for Incomplete Data," NATO Collaborative Research Grant CRG950648: "Statistical Research for Environmental Risk Assessment," and from Onderzoeksfonds K.U.Leuven grant PDM/96/105.

It has been a pleasure to work with John Kimmel and Jenny Wolkowicki of Springer-Verlag.

We apologize to our wives, daughters, and son for the time not spent with them during the preparation of this book and we are very grateful for their understanding. The preparation of this book has been a period of close and fruitful collaboration, of which we will keep good memories.

Geert and Geert
Kessel-Lo, December 1999

Contents

Preface
Acknowledgments
1 Introduction
2 Examples
  2.1 The Rat Data
  2.2 The Toenail Data (TDO)
  2.3 The Baltimore Longitudinal Study of Aging (BLSA)
    2.3.1 The Prostate Data
    2.3.2 The Hearing Data
  2.4 The Vorozole Study
  2.5 Heights of Schoolgirls
  2.6 Growth Data
  2.7 Mastitis in Dairy Cattle
3 A Model for Longitudinal Data
  3.1 Introduction
  3.2 A Two-Stage Analysis
    3.2.1 Stage 1
    3.2.2 Stage 2
    3.2.3 Example: The Rat Data
    3.2.4 Example: The Prostate Data
    3.2.5 Two-Stage Analysis
  3.3 The General Linear Mixed-Effects Model
    3.3.1 The Model
    3.3.2 Example: The Rat Data
    3.3.3 Example: The Prostate Data
    3.3.4 A Model for the Residual Covariance Structure
4 Exploratory Data Analysis
  4.1 Introduction
  4.2 Exploring the Marginal Distribution
    4.2.1 The Average Evolution
    4.2.2 The Variance Structure
    4.2.3 The Correlation Structure
  4.3 Exploring Subject-Specific Profiles
    4.3.1 Measuring the Overall Goodness-of-Fit
    4.3.2 Testing for the Need of a Model Extension
    4.3.3 Example: The Rat Data
    4.3.4 Example: The Prostate Data
5 Estimation of the Marginal Model
  5.1 Introduction
  5.2 Maximum Likelihood Estimation
  5.3 Restricted Maximum Likelihood Estimation
    5.3.1 Variance Estimation in Normal Populations
    5.3.2 Estimation of Residual Variance in Linear Regression
    5.3.3 REML Estimation for the Linear Mixed Model
    5.3.4 Justification of REML Estimation
    5.3.5 Comparison Between ML and REML Estimation
  5.4 Model-Fitting Procedures
  5.5 Example: The Prostate Data
  5.6 Estimation Problems
    5.6.1 Small Variance Components
    5.6.2 Model Misspecifications
6 Inference for the Marginal Model
  6.1 Introduction
  6.2 Inference for the Fixed Effects
    6.2.1 Approximate Wald Tests
    6.2.2 Approximate t-Tests and F-Tests
    6.2.3 Example: The Prostate Data
    6.2.4 Robust Inference
    6.2.5 Likelihood Ratio Tests
  6.3 Inference for the Variance Components
    6.3.1 Approximate Wald Tests
    6.3.2 Likelihood Ratio Tests
    6.3.3 Example: The Rat Data
    6.3.4 Marginal Testing for the Need of Random Effects
    6.3.5 Example: The Prostate Data
  6.4 Information Criteria
7 Inference for the Random Effects
  7.1 Introduction
  7.2 Empirical Bayes Inference
  7.3 Henderson's Mixed-Model Equations
  7.4 Best Linear Unbiased Prediction (BLUP)
  7.5 Shrinkage
  7.6 Example: The Random-Intercepts Model
  7.7 Example: The Prostate Data
  7.8 The Normality Assumption for Random Effects
    7.8.1 Introduction
    7.8.2 Impact on EB Estimates
    7.8.3 Impact on the Estimation of the Marginal Model
    7.8.4 Checking the Normality Assumption
8 Fitting Linear Mixed Models with SAS
  8.1 Introduction
  8.2 The SAS Program
    8.2.1 The PROC MIXED Statement
    8.2.2 The CLASS Statement
    8.2.3 The MODEL Statement
    8.2.4 The ID Statement
    8.2.5 The RANDOM Statement
    8.2.6 The REPEATED Statement
    8.2.7 The CONTRAST Statement
    8.2.8 The ESTIMATE Statement
    8.2.9 The MAKE Statement
    8.2.10 Some Additional Statements and Options
  8.3 The SAS Output
    8.3.1 Information on the Iteration Procedure
    8.3.2 Information on the Model Fit
    8.3.3 Information Criteria
    8.3.4 Inference for the Variance Components
    8.3.5 Inference for the Fixed Effects
    8.3.6 Inference for the Random Effects
  8.4 Note on the Mean Parameterization
  8.5 The RANDOM and REPEATED Statements
  8.6 PROC MIXED versus PROC GLM
9 General Guidelines for Model Building
  9.1 Introduction
  9.2 Selection of a Preliminary Mean Structure
  9.3 Selection of a Preliminary Random-Effects Structure
  9.4 Selection of a Residual Covariance Structure
  9.5 Model Reduction
10 Exploring Serial Correlation
  10.1 Introduction
  10.2 An Informal Check for Serial Correlation
  10.3 Flexible Models for Serial Correlation
    10.3.1 Introduction
    10.3.2 Fractional Polynomials
    10.3.3 Example: The Prostate Data
  10.4 The Semi-Variogram
    10.4.1 Introduction
    10.4.2 The Semi-Variogram for Random-Intercepts Models
    10.4.3 Example: The Vorozole Study
    10.4.4 The Semi-Variogram for Random-Effects Models
    10.4.5 Example: The Prostate Data
  10.5 Some Remarks
11 Local Influence for the Linear Mixed Model
  11.1 Introduction
  11.2 Local Influence
  11.3 The Detection of Influential Subjects
  11.4 Example: The Prostate Data
  11.5 Local Influence Under REML Estimation
12 The Heterogeneity Model
  12.1 Introduction
  12.2 The Heterogeneity Model
  12.3 Estimation of the Heterogeneity Model
  12.4 Classification of Longitudinal Profiles
  12.5 Goodness-of-Fit Checks
  12.6 Example: The Prostate Data
  12.7 Example: The Heights of Schoolgirls
13 Conditional Linear Mixed Models
  13.1 Introduction
  13.2 A Linear Mixed Model for the Hearing Data
  13.3 Conditional Linear Mixed Models
  13.4 Applied to the Hearing Data
  13.5 Relation with Fixed-Effects Models
14 Exploring Incomplete Data
15 Joint Modeling of Measurements and Missingness
  15.1 Introduction
  15.2 The Impact of Incompleteness
  15.3 Simple ad hoc Methods
  15.4 Modeling Incompleteness
  15.5 Terminology
  15.6 Missing Data Patterns
  15.7 Missing Data Mechanisms
  15.8 Ignorability
  15.9 A Special Case: Dropout
16 Simple Missing Data Methods
  16.1 Introduction
  16.2 Complete Case Analysis
  16.3 Simple Forms of Imputation
    16.3.1 Last Observation Carried Forward
    16.3.2 Imputing Unconditional Means
    16.3.3 Buck's Method: Conditional Mean Imputation
    16.3.4 Discussion of Imputation Techniques
  16.4 Available Case Methods
  16.5 MCAR Analysis of Toenail Data
17 Selection Models
  17.1 Introduction
  17.2 A Selection Model for the Toenail Data
    17.2.1 MAR Analysis
    17.2.2 MNAR analysis
  17.3 Scope of Ignorability
  17.4 Growth Data
    17.4.1 Analysis of Complete Growth Data
    17.4.2 Frequentist Analysis of Incomplete Growth Data
    17.4.3 Likelihood Analysis of Incomplete Growth Data
    17.4.4 Missingness Process for the Growth Data
  17.5 A Selection Model for Nonrandom Dropout
  17.6 A Selection Model for the Vorozole Study
18 Pattern-Mixture Models
  18.1 Introduction
    18.1.1 A Simple Illustration
    18.1.2 A Paradox
  18.2 Pattern-Mixture Models
  18.3 Pattern-Mixture Model for the Toenail Data
  18.4 A Pattern-Mixture Model for the Vorozole Study
  18.5 Some Reflections
19 Sensitivity Analysis for Selection Models
  19.1 Introduction
  19.2 A Modified Selection Model for Nonrandom Dropout
  19.3 Local Influence
    19.3.1 Review of the Theory
    19.3.2 Applied to the Model of Diggle and Kenward
    19.3.3 Special Case: Compound Symmetry
    19.3.4 Serial Correlation
  19.4 Analysis of the Rat Data
  19.5 Mastitis in Dairy Cattle
    19.5.1 Informal Sensitivity Analysis
    19.5.2 Local Influence Approach
  19.6 Alternative Local Influence Approaches
  19.7 Random-coefficient-based Models
  19.8 Concluding Remarks
20 Sensitivity Analysis for Pattern-Mixture Models
  20.1 Introduction
  20.2 Pattern-Mixture Models and MAR
    20.2.1 MAR and ACMV
    20.2.2 Nonmonotone Patterns: A Counterexample
  20.3 Multiple imputation
    20.3.1 Parameter and Precision Estimation
    20.3.2 Hypothesis Testing
  20.4 Pattern-Mixture Models and Sensitivity Analysis
  20.5 Identifying Restrictions Strategies
    20.5.1 Strategy Outline
    20.5.2 Identifying Restrictions
    20.5.3 ACMV Restrictions
    20.5.4 Drawing from the Conditional Densities
  20.6 Analysis of the Vorozole Study
    20.6.1 Fitting a Model
    20.6.2 Hypothesis Testing
    20.6.3 Model Reduction
  20.7 Thoughts
21 How Ignorable Is Missing At Random?
  21.1 Introduction
  21.2 Information and Sampling Distributions
  21.3 Illustration
  21.4 Example
  21.5 Implications for PROC MIXED
22 The Expectation-Maximization Algorithm
23 Design Considerations
  23.1 Introduction
  23.2 Power Calculations Under Linear Mixed Models
  23.3 Example: The Rat Data
  23.4 Power Calculations When Dropout Is to Be Expected
  23.5 Example: The Rat Data
    23.5.1 Constant $p_{j,k|\geq k}$, Varying $n_j$
    23.5.2 Constant $p_{j,k|\geq k}$, Constant $n_j$
    23.5.3 Increasing $p_{j,k|\geq k}$ over Time, Constant $n_j$
24 Case Studies
  24.1 Blood Pressures
  24.2 The Heat Shock Study
    24.2.1 Introduction
    24.2.2 Analysis of Heat Shock Data
  24.3 The Validation of Surrogate Endpoints from Multiple Trials
    24.3.1 Introduction
    24.3.2 Validation Criteria
    24.3.3 Notation and Motivating Examples
    24.3.4 A Meta-Analytic Approach
    24.3.5 Data Analysis
    24.3.6 Computational Issues
    24.3.7 Extensions
    24.3.8 Reflections on Surrogacy
    24.3.9 Prediction Intervals
    24.3.10 SAS Code for Random-Effects Model
  24.4 The Milk Protein Content Trial
    24.4.1 Introduction
    24.4.2 Informal Sensitivity Analysis
    24.4.3 Formal Sensitivity Analysis
  24.5 Hepatitis B Vaccination
    24.5.1 Time Evolution of Antibodies
    24.5.2 Prediction at Year 12
    24.5.3 SAS Code for Vaccination Models
Appendix
A Software
  A.1 The SAS System
    A.1.1 Standard Applications
    A.1.2 New Features in SAS Version 7.0
  A.2 Fitting Mixed Models Using MLwiN
  A.3 Fitting Mixed Models Using SPlus
    A.3.1 Standard SPlus Functions
    A.3.2 OSWALD for Nonrandom Nonresponse
B Technical Details for Sensitivity Analysis
  B.1 Local Influence: Derivation of Components of $\Delta_i$
  B.2 Proof of Theorem 20.1
References
Index

1 Introduction

In applied sciences, one is often confronted with the collection of correlated data. This generic term embraces a multitude of data structures, such as multivariate observations, clustered data, repeated measurements, longitudinal data, and spatially correlated data.

Among those, multivariate data have received most attention in the statistical literature (e.g., Seber 1984, Krzanowski 1988, Johnson and Wichern 1992). Techniques devised for this situation include multivariate regression and multivariate analysis of variance, which have been implemented in the SAS procedure GLM (SAS 1991) for general linear models. In addition, SAS contains a battery of relatively specialized procedures for principal components analysis, canonical correlation analysis, discriminant analysis, factor analysis, cluster analysis, and so forth (SAS 1989).

As an example of a simple multivariate study, assume that a subject's systolic and diastolic blood pressure are measured simultaneously. This is different from a clustered setting where, for example, for a number of families, diastolic blood pressure is measured for all of their members. A design where, for each subject, diastolic blood pressure is recorded under several experimental conditions is often termed a repeated measures study. In the case that diastolic blood pressure is measured repeatedly over time for each subject, we are dealing with longitudinal data.
Although one could view all of these data structures as special cases of multivariate designs, we believe there are many fundamental differences, thoroughly affecting the mode of analysis. First, certain multivariate techniques, such as principal components, are hardly useful for the other designs. Second, in a truly multivariate set of outcomes, the variance-covariance structure is usually unstructured, in contrast to, for example, longitudinal data. Therefore, the methodology of the general linear model is too restrictive to perform satisfactory data analyses of these more complex data. In contrast, the general linear mixed model, as implemented in the SAS procedure MIXED (Littell et al. 1996), is much more flexible.

Replacing the time dimension in a longitudinal setting with one or more spatial dimensions leads naturally to spatial data. While ideas in the longitudinal and spatial areas have developed relatively independently, efforts have been spent in bridging the gap between both disciplines. In 1996, a workshop was devoted to this idea: "The Nantucket Conference on Modeling Longitudinal and Spatially Correlated Data: Methods, Applications, and Future Directions" (Gregoire et al. 1997).

Still, restricting attention to the correlated data settings described earlier is too limited to fully grasp the wide applicability of the general linear mixed model. In designed experiments, such as analysis of variance (ANOVA) or nested factorial designs, the variance structure has to reflect the design and thus elaborate structures will be needed. A good mode of analysis should be able to account for various sources of variability. Linear mixed models originated precisely in this area of application. For a review, see Robinson (1991).

Among the clustered data settings, longitudinal data perhaps require the most elaborate modeling of the random variability. Diggle, Liang, and Zeger (1994) distinguish among three components of variability.
The first one groups traditional random effects (as in a random-effects ANOVA model) and random coefficients (Longford 1993). It stems from interindividual variability (i.e., heterogeneity between individual profiles). The second component, serial association, is present when residuals close to each other in time are more similar than residuals further apart. This notion is well known in the time-series literature (Ripley 1981, Diggle 1983, Cressie 1991). Finally, in addition to the other two components, there is potentially also measurement error. This results from the fact that, for delicate measurements (e.g., laboratory assays), even immediate replication will not be able to avoid a certain level of variation. In longitudinal data, these three components of variability can be distinguished by virtue of both replication and a clear distance concept (time), one of which is lacking in classical spatial and time-series analysis and in clustered data. This implies that adapting models for longitudinal data to other data structures is in many cases relatively straightforward. For example, clustered data could be analyzed by leaving out all aspects of the model that refer to time.

A very important characteristic of data to be analyzed is the type of outcome. Methods for continuous data form the best developed and most advanced body of research; the same is true for software implementation. This is natural, since the special status and the elegant properties of the normal distribution simplify model building and ease software development. It is in this area that the general linear mixed model and the SAS procedure MIXED, as well as its counterparts in, for example, SPlus and MLwiN, are situated. However, categorical (nominal, ordinal, and binary) and discrete outcomes are also very prominent in statistical practice. For example, quality of life outcomes are often scored on ordinal scales. Two fairly different views can be adopted.
The first one, supported by large-sample results, states that normal theory should be applied as much as possible, even to non-normal data such as ordinal scores and counts. A different view is that each type of outcome should be analyzed using instruments that exploit the nature of the data. We will adopt the second standpoint. In addition, since the statistical community has been familiarized with generalized linear models (GLIM; McCullagh and Nelder 1989), some have taken the view that the normal model for continuous data is but one type of GLIM. Although this is correct in principle, it fails to acknowledge that normal models are much further developed than any other GLIM (e.g., model checks and diagnostic tools) and that they enjoy unique properties (e.g., the existence of closed-form solutions, exact distributions of test statistics, unbiased estimators). Extensions of GLIM to the longitudinal case are discussed in Diggle, Liang, and Zeger (1994), where the main emphasis is on generalized estimating equations (Liang and Zeger 1986). Generalized linear mixed models have been proposed by, for example, Breslow and Clayton (1993). Fahrmeir and Tutz (1994) devote an entire book to GLIM for multivariate settings.

In longitudinal settings, each individual typically has a vector Y of responses with a natural (time) ordering among the components. This leads to several, generally nonequivalent, extensions of univariate models. In a marginal model, marginal distributions are used to describe the outcome vector Y, given a set X of predictor variables. The correlation among the components of Y can then be captured either by adopting a fully parametric approach or by means of working assumptions, such as in the semiparametric approach of Liang and Zeger (1986). Alternatively, in a random-effects model, the predictor variables X are supplemented with a vector b of random effects, conditional upon which the components of Y are usually assumed to be independent.
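In symbols, the conditional-independence assumption of a random-effects model can be sketched as follows. The notation is an assumption introduced for illustration only: $y_1, \dots, y_n$ denote the components of Y, $f(\cdot)$ denotes a generic density, and the random effects are taken to have density $f(b)$.

```latex
\[
  f(y_1, \dots, y_n \mid X, b) \;=\; \prod_{j=1}^{n} f(y_j \mid X, b),
  \qquad
  f(y_1, \dots, y_n \mid X) \;=\; \int \prod_{j=1}^{n} f(y_j \mid X, b)\, f(b)\, db .
\]
```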
This does not preclude more elaborate models if residual dependence is detected (Longford 1993). Finally, a conditional model describes the distribution of the components of Y, conditional on X but also conditional on (a subset of) the other components of Y. In a longitudinal context, a particularly relevant class of conditional models describes a component of Y given the ones recorded earlier in time. Well-known members of this class of transition models are Markov type models. Several examples are given in Diggle, Liang, and Zeger (1994).

For normally distributed data, marginal models can easily be fitted, for example, with the SAS procedure MIXED, the SPlus function lme, or within the MLwiN package. For such data, integrating a mixed-effects model over the random effects produces a marginal model, in which the regression parameters retain their meaning and the random effects contribute in a simple way to the variance-covariance structure. For example, the marginal model corresponding to a random-intercepts model is a compound symmetry model that can be fitted without explicitly acknowledging the random-intercepts structure. In the same vein, certain types of transition model induce simple marginal covariance structures. For example, some first-order stationary autoregressive models imply an exponential or AR(1) covariance structure. As a consequence, many marginal models derived from random-effects and transition models can be fitted with mixed-models software.

It should be emphasized that the above elegant properties of normal models do not extend to the general GLIM case. For example, opting for a marginal model for longitudinal binary data precludes the researcher from answering conditional and transitional questions in terms of simple model parameters. This implies that each model family requires its own specific software tools.
For example, an analysis based on generalized estimating equations can be performed with the GENMOD procedure in SAS, or the SPlus set of functions termed OSWALD (Smith, Robertson, and Diggle 1996). Mixed-effects models for non-Gaussian data can be fitted using the MIXOR program (Hedeker and Gibbons 1994, 1996), MLwiN, or the SAS procedure NLMIXED. The latter procedure is available from Version 7 onward and is the successor of the macros GLIMMIX and NONLINMIX.

Motivated by the above discussion, we have restricted the scope of this book to linear mixed models for continuous outcomes. Fahrmeir and Tutz (1994) discuss generalized linear (mixed) models for multivariate outcomes, while longitudinal versions are treated in Diggle, Liang, and Zeger (1994). Nonlinear models for repeated measurement data are discussed by Davidian and Giltinan (1995).

While research in this area has largely focused on the formulation of linear mixed-effects models, inference, and software implementation, other important aspects, such as exploratory analysis, the investigation of model fit, and the construction of diagnostic tools have received considerably less attention. In addition, longitudinal data are typically very prone to incompleteness, due to dropout or intermediate missing values. This poses particular challenges to methodological development. In this book, we have attempted to give a detailed account of several of these topics. By no means has it been our intention to give a complete or definitive overview. Indeed, given the high research activity, this would be impossible.

Broadly, the structure of the book is as follows. The key examples, used throughout the book, are introduced in Chapter 2. Chapters 3 to 9 provide the core about the linear mixed-effects model, while Chapters 10 to 13 discuss more advanced tools for model exploration, influence diagnostics, as well as extensions of the original model.
Chapters 14 to 16 introduce the reader to basic incomplete data concepts. Chapters 17 and 18 discuss strategies to model incomplete longitudinal data, based on the linear mixed model. The sensitivity of such strategies to parametric assumptions is investigated in Chapters 19 and 20. Some additional missing data topics are presented in Chapters 21 and 22. Chapter 23 is devoted to design considerations. Five case studies are treated in detail in Chapter 24. Appendix A reviews a number of software tools for fitting mixed models. Since the book puts relatively more emphasis on SAS than on other packages, the SAS procedure MIXED is discussed in detail in Chapter 8, while worked examples can be found throughout the text. Some technical background material from the sensitivity chapters is deferred until Appendix B.

2 Examples

This chapter introduces the longitudinal sets of data which will be used throughout the book. The rat data are presented in Section 2.1. The TDO data, studying toenails, are described in Section 2.2. Section 2.3 is devoted to the Baltimore Longitudinal Study of Aging, with two substudies: prostate-specific antigen data (Section 2.3.1) and data on hearing (Section 2.3.2). Section 2.4 introduces the Vorozole study, focusing on quality of life in breast cancer patients. In Section 2.5, we will introduce data, previously analyzed by Goldstein (1979), on the heights of 20 schoolgirls. Section 2.6 presents the growth data of Potthoff and Roy (1964). Mastitis in dairy cattle is the subject of Section 2.7.

To complement the data introduced in this chapter, five case studies, involving additional sets of data, are presented in Chapter 24.

2.1 The Rat Data

In medical science, there has recently been increased interest in the therapeutic use of hormones. However, such drastic therapies require detailed knowledge about their effect on the different aspects of growth.
In this respect, an experiment has been set up at the Department of Orthodontics of the Catholic University of Leuven (KUL) in Belgium (see Verdonck et al. 1998). The primary aim was to investigate the effect of the inhibition of the production of testosterone in male Wistar rats on their craniofacial growth. A total of 50 male Wistar rats have been randomized to either a control group or one of the two treatment groups, where treatment consisted of a low or high dose of the drug Decapeptyl, which is an inhibitor for testosterone production in rats. The treatment started at the age of 45 days, and measurements were taken every 10 days, with the first observation taken at the age of 50 days. The responses of interest are distances (in pixels) between well-defined points on X-ray pictures of the skull of each rat, taken after the rat has been anesthetized. Of primary interest is the estimation of changes over time and testing whether these changes are treatment dependent.

FIGURE 2.1. Rat Data. Individual profiles for each of the treatment groups in the rat experiment separately. [One panel per treatment group; horizontal axis: Age (days).]

For the purpose of this book, we will consider one of the measurements which can be used to characterize the height of the skull. The individual profiles are shown in Figure 2.1. It is clear that not all rats have measurements up to the age of 110 days. This is due to the fact that many rats do not survive anesthesia and therefore drop out before the end of the study. Table 2.1 shows the number of rats observed at each occasion. While 50 rats have been randomized at the start of the experiment, only 22 of them survived the first 6 measurements, so measurements on only 22 rats are available in the way anticipated at the design stage.
For example, at the second occasion (age = 60 days), only 46 rats were available, implying that for 4 rats only 1 measurement could be recorded.

TABLE 2.1. Rat Data. Summary of the number of observations taken at each occasion in the rat experiment, for each group separately and in total.

                    # Observations
Age (days)   Control    Low    High    Total
    50          15       18     17       50
    60          13       17     16       46
    70          13       15     15       43
    80          10       15     13       38
    90           7       12     10       29
   100           4       10     10       24
   110           4        8     10       22

2.2 The Toenail Data (TDO)

The data introduced in this section were obtained from a randomized, double-blind, parallel group, multicenter study for the comparison of two oral treatments (in the sequel coded as A and B) for toenail dermatophyte onychomycosis (TDO), described in full detail by De Backer et al. (1996). TDO is a common toenail infection, difficult to treat, affecting more than 2 out of 100 persons (Roberts 1992). Antifungal compounds, classically used for treatment of TDO, need to be taken until the whole nail has grown out healthy. The development of new such compounds, however, has reduced the treatment duration to 3 months. The aim of the present study was to compare the efficacy and safety of 12 weeks of continuous therapy with treatment A or with treatment B.

In total, 2 x 189 patients were randomized, distributed over 36 centers. Subjects were followed during 12 weeks (3 months) of treatment and followed further, up to a total of 48 weeks (12 months). Measurements were taken at baseline, every month during treatment, and every 3 months afterward, resulting in a maximum of seven measurements per subject. For our purposes, we will only consider one of the secondary endpoints, unaffected nail length, which is measured as follows. At the first occasion, the treating physician indicates one of the affected toenails as the target nail, the nail which will be followed over time.
At each occasion, the unaffected nail length (measured from the nail bed to the infected part of the nail, which is always at the free end of the nail) of the target nail is measured in millimeters. Obviously, this response will be related to the toe size. Therefore, we will only include here those patients for whom the target nail was one of the two big toenails. This reduces our sample under consideration to 150 and 148 subjects, respectively. Figure 2.2 shows the observed profiles of 30 randomly selected subjects from treatment group A and treatment group B, respectively.

FIGURE 2.2. Toenail Data. Individual profiles of 30 randomly selected subjects in each of the treatment groups in the toenail experiment. [Panels: Treatment A, Treatment B; vertical axis: Unaffected Nail Length (mm); horizontal axis: Time (months).]

Due to a variety of reasons, 72 (24%) out of the 298 participants left the study prematurely. Table 2.2 summarizes the number of subjects still in the study at each occasion, for both treatment groups separately. Although the comparison of the average evolutions in both treatment groups was of primary interest, there was also some interest in studying the relationship between the dropout process and the actual outcome. For example, are patients who drop out doing better or worse than patients who do not drop out from the study?

2.3 The Baltimore Longitudinal Study of Aging (BLSA)

The Baltimore Longitudinal Study of Aging (BLSA) is an ongoing multidisciplinary observational study, which started in 1958, with the study of normal human aging as its primary objective (Shock et al. 1984). Participants in the BLSA are volunteers who return approximately every 2 years for 3 days of biomedical and psychological examinations. They are predominantly white (95%), well educated (over 75% have bachelor's degrees), and financially comfortable (82%).
So far, over 1400 men with an average of almost 7 visits and 16 years of follow-up have participated in the study since its inception in 1958. Later on, females have been included in the study as well.

TABLE 2.2. Toenail Data. Summary of the number of observations taken at each occasion in the TDO study, for each group separately and in total.

                         # Observations
Time (months)   Treatment A   Treatment B   Total
      0             150           148        298
      1             149           142        291
      2             146           138        284
      3             140           131        271
      6             131           124        255
      9             120           109        229
     12             118           108        226

The BLSA (Pearson et al. 1994) is a unique resource for rapidly evaluating longitudinal hypotheses because of the availability of data from repeated clinical examinations and a bank of frozen blood samples from the same individuals over 30 years of follow-up (where new studies would require many years to conduct). On the other hand, the observational aspect of the study poses additional complications on the statistical analysis. For example, although repeated visits are scheduled every 2 years, some subjects may have more than one visit within 1 year of time, while others have over 10 years between two successive visits. Also, longitudinal evolutions may be highly influenced by many covariates which may or may not be recorded in the study.

In this book, two of the many responses measured in the BLSA will be used to illustrate the statistical methodology. In Section 2.3.1, it will be discussed how data from the BLSA can be used to study the natural history of prostate disease. Afterward, in Section 2.3.2, the hearing data will be presented.

2.3.1 The Prostate Data

During the last 10 years, many papers have been published on the natural history of prostate disease; see, for example, Carter et al. (1992a, 1992b) and Pearson et al. (1991, 1994).
According to Carter and Coffey (1990), prostate disease is one of the most common and most costly medical problems in the United States, and prostate cancer has become the second leading cause of male cancer deaths. It is therefore very important to look for markers which can detect the disease at an early stage. The prostate-specific antigen (PSA) is such a marker. PSA is an enzyme produced by both normal and cancerous prostate cells, and its level is related to the volume of prostate tissue. Still, an elevated PSA level is not necessarily an indicator of prostate cancer because patients with benign prostatic hyperplasia (BPH) also have an enlarged volume of prostate tissue and therefore also an increased PSA level. This overlap of the distribution of PSA values in patients with prostate cancer and BPH has limited the usefulness of a single PSA value as a screening tool since, according to Pearson et al. (1991), up to 60% of BPH patients may be falsely identified as potential cancer cases based on a single PSA value.

TABLE 2.3. Prostate Data. Description of subjects included in the prostate data set, by diagnostic group. The cancer cases are subdivided into local/regional (L/R) and metastatic (M) cancer cases.

                                                              Cancer cases
                               Controls     BPH cases       L/R          M
Number of participants            16           20            14          4
Age at diagnosis (years)
  Median                          66           75.9          73.8        72.1
  Range                        56.7-80.5    64.6-86.7     63.6-85.4   62.7-82.8
Years of follow-up
  Median                          15.1         14.3          17.2        17.4
  Range                         9.4-16.8     6.9-24.1     10.6-24.9    10-25.3
Time between measurements (years)
  Median                           2            2             1.7         1.7
  Range                         1.1-11.7     0.9-8.3       0.9-10.8     0.9-4.8
Number of measurements per individual
  Median                           8            8            11           9.5
  Range                          4-10      [illegible]-11   7-15         7-12

Based on clinical practice, researchers have hypothesized that the rate of change in PSA level might be a more accurate method of detecting prostate cancer in the early stages of the disease. This has been extensively investigated by Pearson et al.
(1994), who analyzed repeated PSA measures from the Baltimore Longitudinal Study of Aging (BLSA), using linear mixed models. A retrospective case-control study was undertaken that utilized frozen serum samples from 18 BLSA participants identified as prostate cancer cases, 20 cases of BPH, and 16 controls with no clinical signs of prostate disease. In order to be eligible for the analyses, men had to meet several criteria:

1. seven or more years of follow-up prior to diagnosis of prostate cancer, simple prostatectomy for BPH, or exclusion of prostate disease by a urologist,
2. confirmation of the pathological diagnosis, and
3. no prostate surgery prior to diagnosis.

FIGURE 2.3. Prostate Data. Longitudinal trends in PSA in men with prostate cancer, benign prostatic hyperplasia, or no evidence of prostate disease. [Panels: Controls, BPH cases, L/R cancer cases, Metastatic cancer cases; horizontal axis: Years before diagnosis.]

To the extent possible, age at diagnosis and years of follow-up were matched for the control, BPH, and cancer groups. However, due to the high prevalence of BPH in men over age 50, it was difficult to find age-matched controls with no evidence of prostate disease. In fact, the control group remained significantly younger at first visit and at diagnosis, compared to the BPH group. For this reason, our analyses of this data set will always correct for age differences at the time of the diagnosis.

A description of the data, differentiating between local/regional (L/R) cancer cases and metastatic cancer cases, is given in Table 2.3. The number of repeated PSA measurements per individual varies between 4 and 15, and the follow-up period ranges from 6.9 to 25.3 years.
Since it was anticipated that PSA values would increase exponentially in prostate cancer cases, the responses were transformed to ln(PSA + 1). These transformed individual profiles are shown in Figure 2.3.

FIGURE 2.4. Hearing Data. Individual profiles of 30 randomly selected subjects in the hearing data set, for the left and the right ear separately. [Panels: Left ear, Right ear; horizontal axis: Time (years).]

2.3.2 The Hearing Data

Also recorded in the BLSA study are hearing threshold sound pressure levels (SPLs in dB), measured at 11 different frequencies [varying from 125 to 8000 hertz (Hz)] on both ears, yielding a maximum of 22 observations per visit. This was done by means of a sound proof chamber and a Bekesy audiometer. Using these data, Brant and Fozard (1990) have shown that the relationship between hearing threshold level and frequency can be well described by a quadratic function of the logarithm of frequency, the parameters of which depend on age and are highly subject-specific. Morrell and Brant (1991) and Brant and Pearson (1994) considered the data of 268 elderly male participants whose first visit occurred at about 70 years of age or older. They studied how hearing thresholds change over time and how these evolutions depend on age and on the frequency under consideration.

For our purposes, we now consider all available hearing thresholds for 500 Hz, from male BLSA participants only, without otologic disease, unilateral hearing loss, or evidence of noise-induced hearing loss. Individual profiles on the left and right ear separately are shown in Figure 2.4 for 30 randomly selected subjects. In total, we have 6170 observations (3089 on the left ear and 3081 on the right ear), from 681 males. Their age at the first visit ranged from 17.2 to 90.5 years, with median value equal to 53 years.
The number of visits per subject varied from 1 to 15, and some of the participants were followed for over 22 years (median 7.5 years).

TABLE 2.4. Heights of Schoolgirls. Classification of 20 preadolescent schoolgirls in three groups, according to their mother's height.

                  Mother's height       Children numbers
Small mothers     < 155 cm                  1 to 6
Medium mothers    [155 cm; 164 cm]          7 to 13
Tall mothers      > 164 cm                 14 to 20

2.4 The Vorozole Study

This study was an open-label, multicenter, parallel group design conducted at 67 North American centers. Patients were randomized to either the new drug Vorozole (2.5 mg taken once daily) or the standard drug megestrol acetate (40 mg four times daily). The patient population consisted of postmenopausal patients with histologically confirmed estrogen-receptor positive metastatic breast carcinoma. All 452 randomized patients were followed until disease progression or death. The main objective was to compare the treatment groups with respect to response rate, whereas secondary objectives included a comparison relative to duration of response, time to progression, survival, safety, pain relief, performance status, and quality of life. Full details of this study are reported in Goss et al. (1999). In this book, we will focus on overall quality of life, measured by the total Functional Living Index: Cancer (FLIC; Schipper et al. 1984). A higher FLIC score is the more desirable outcome. Even though this outcome is, strictly speaking, of the ordinal type, the total number of categories encountered exceeds 70, justifying the use of continuous-outcome methods.

Patients underwent screening and, for those deemed eligible, a detailed examination at baseline (occasion 0) took place. Further measurement occasions were month 1, then from month 2 at bimonthly intervals until month 44.

Goss et al. (1999) analyzed FLIC using a two-way ANOVA model with effects for treatment, disease status, as well as their interaction.
No significant difference was found. Apart from treatment, important covariates are dominant site of the disease as well as clinical stage. This example will be used, for example, to introduce exploratory tools in Chapter 4.

FIGURE 2.5. Heights of Schoolgirls. Growth curves of 20 school girls from age 6 to 10, for girls with small, medium, or tall mothers. [Panels: Short mother, Medium mother, Tall mother; vertical axis: Height (cm); horizontal axis: Age (years).]

2.5 Heights of Schoolgirls

Goldstein (1979, Table 4.3, p. 101) reports growth curves of 20 preadolescent girls, measured on a yearly basis from age 6 to 10. The girls were classified according to the height of their mother, which was discretized as in Table 2.4. The individual profiles are shown in Figure 2.5, for each group separately. The measurements are given at exact years of age, some having been previously adjusted to these. The values Goldstein reports for the fifth girl in the first group are 114.5, 112, 126.4, 131.2, and 135.0. This suggests that the second measurement is incorrect. We therefore replaced it by 122. An extensive analysis of this data set can be found in Section 4.2 of Verbeke and Molenberghs (1997). Of primary interest is to test whether the growth of these schoolgirls is related to the height of their mothers.

2.6 Growth Data

These data, introduced by Potthoff and Roy (1964), contain growth measurements for 11 girls and 16 boys. For each subject, the distance from the center of the pituitary to the maxillary fissure was recorded at ages 8, 10, 12, and 14. The data were used by Jennrich and Schluchter (1986) to illustrate estimation methods for unbalanced data, where unbalancedness is now to be interpreted in the sense of an unequal number of boys and girls.

TABLE 2.5. Growth Data for 11 Girls and 16 Boys. Measurements marked with * were deleted by Little and Rubin (1987).

             Age (in years)                        Age (in years)
Girl     8      10      12     14      Boy     8      10      12     14
   1   21.0   20.0    21.5   23.0       1   26.0   25.0    29.0   31.0
   2   21.0   21.5    24.0   25.5       2   21.5   22.5*   23.0   26.5
   3   20.5   24.0*   24.5   26.0       3   23.0   22.5    24.0   27.5
   4   23.5   24.5    25.0   26.5       4   25.5   27.5    26.5   27.0
   5   21.5   23.0    22.5   23.5       5   20.0   23.5*   22.5   26.0
   6   20.0   21.0*   21.0   22.5       6   24.5   25.5    27.0   28.5
   7   21.5   22.5    23.0   25.0       7   22.0   22.0    24.5   26.5
   8   23.0   23.0    23.5   24.0       8   24.0   21.5    24.5   25.5
   9   20.0   21.0*   22.0   21.5       9   23.0   20.5    31.0   26.0
  10   16.5   19.0*   19.0   19.5      10   27.5   28.0    31.0   31.5
  11   24.5   25.0    28.0   28.0      11   23.0   23.0    23.5   25.0
                                       12   21.5   23.5*   24.0   28.0
                                       13   17.0   24.5*   26.0   29.5
                                       14   22.5   25.5    25.5   26.0
                                       15   23.0   24.5    26.0   30.0
                                       16   22.0   21.5*   23.5   25.0

Source: Potthoff and Roy (1964), Jennrich and Schluchter (1986).

Little and Rubin (1987) deleted 9 of the [(11 + 16) x 4] measurements, rendering 9 incomplete subjects. Deletion is confined to the age 10 measurements. Little and Rubin (1987) describe the mechanism to be such that subjects with a low value at age 8 are more likely to have a missing value at age 10. The data are presented in Table 2.5. The measurements that were deleted are marked with an asterisk. In Section 17.4.1, the complete data will be analyzed in some detail. Sections 17.4.2 and 17.4.3 are devoted to frequentist and likelihood-based ignorable analyses of the incomplete version of the data, respectively. Section 17.4.4 is devoted to insight into the missingness mechanism.

FIGURE 2.6. Mastitis in Dairy Cattle. The first panel shows a scatter plot of the second measurement versus the first measurement. The second panel shows a scatter plot of the change versus the baseline measurement.

2.7 Mastitis in Dairy Cattle

This example,
concerning the occurrence of the infectious disease mastitis in dairy cows, was introduced in Diggle and Kenward (1994) and reanalyzed in Kenward (1998). Data were available of the milk yields in thousands of liters of 107 dairy cows from a single herd in 2 consecutive years: $Y_{ij}$ ($i = 1, \ldots, 107$; $j = 1, 2$). In the first year, all animals were supposedly free of mastitis; in the second year, 27 became infected. Mastitis typically reduces milk yield, and the question of scientific interest is whether the probability of occurrence of mastitis is related to the yield that would have been observed had mastitis not occurred. A graphical representation of the complete data is given in Figure 2.6.

3 A Model for Longitudinal Data

3.1 Introduction

In practice, longitudinal data are often highly unbalanced in the sense that not an equal number of measurements is available for all subjects and/or that measurements are not taken at fixed time points. In the rat data set and the toenail data set, presented in Section 2.1 and in Section 2.2, respectively, a fixed number of measurements was scheduled to be taken on all subjects, at fixed time points. However, during the study, rats died, and patients left the toenail study prematurely, implying unbalance. This is different from the prostate data and the hearing data (Sections 2.3.1 and 2.3.2, respectively), where the unbalance is an immediate result from the fact that the volunteers participating in the BLSA were asked to return approximately every 2 years for medical examination.

Due to their unbalanced nature, many longitudinal data sets cannot be analyzed using multivariate regression techniques (see, for example, Seber 1984, Chapters 8 and 9, Hand and Taylor 1987). A natural alternative arises from observing that subject-specific longitudinal profiles can often be well approximated by linear regression functions.
One hereby summarizes the vector of repeated measurements for each subject by a vector of a relatively small number of estimated subject-specific regression coefficients. Afterward, in a second stage, multivariate regression techniques can be used to relate these estimates to known covariates such as treatment, disease classification, baseline characteristics, and so forth. This so-called two-stage analysis will be introduced in Section 3.2. Afterward, in Section 3.3, the general linear mixed model will be introduced as a result of combining the two stages into one single statistical model.

3.2 A Two-Stage Analysis

3.2.1 Stage 1

Let the random variable $Y_{ij}$ denote the (possibly transformed) response of interest, for the $i$th individual, measured at time $t_{ij}$, $i = 1, \ldots, N$, $j = 1, \ldots, n_i$, and let $Y_i$ be the $n_i$-dimensional vector of all repeated measurements for the $i$th subject, that is, $Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{in_i})'$. The first stage of the two-stage approach assumes that $Y_i$ satisfies the linear regression model

$$Y_i = Z_i \beta_i + \varepsilon_i, \tag{3.1}$$

where $Z_i$ is an $(n_i \times q)$ matrix of known covariates, modeling how the response evolves over time for the $i$th subject. Further, $\beta_i$ is a $q$-dimensional vector of unknown subject-specific regression coefficients, and $\varepsilon_i$ is a vector of residual components $\varepsilon_{ij}$, $j = 1, \ldots, n_i$. It is usually assumed that all $\varepsilon_i$ are independent and normally distributed with mean vector zero, and covariance matrix $\sigma^2 I_{n_i}$, where $I_{n_i}$ is the $n_i$-dimensional identity matrix. This latter assumption will be extended in Section 3.3.

Obviously, model (3.1) includes very flexible models for the description of subject-specific profiles. In practice, polynomials will often suffice. However, extensions such as fractional polynomial models (Royston and Altman 1994), or extended spline functions (Pan and Goldstein 1998) can be considered as well.
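As a rough illustration of the first stage, the sketch below fits model (3.1) to a single subject by ordinary least squares, using a polynomial design matrix $Z_i$. The function names (`design_matrix`, `fit_stage1`) and the toy numbers are assumptions made for this example only; they do not come from any of the data sets described in Chapter 2.

```python
import numpy as np

def design_matrix(times, degree=1):
    """Hypothetical helper: Z_i with columns 1, t, ..., t**degree."""
    t = np.asarray(times, dtype=float)
    return np.vander(t, N=degree + 1, increasing=True)

def fit_stage1(times, y, degree=1):
    """OLS estimate of beta_i in Y_i = Z_i beta_i + eps_i for one subject.

    Returns (beta_hat, sigma2_hat); sigma2_hat is NaN when n_i <= q,
    because no residual degrees of freedom are left.
    """
    Z = design_matrix(times, degree)
    y = np.asarray(y, dtype=float)
    beta_hat, _, _, _ = np.linalg.lstsq(Z, y, rcond=None)
    n, q = Z.shape
    resid = y - Z @ beta_hat
    sigma2_hat = resid @ resid / (n - q) if n > q else float("nan")
    return beta_hat, sigma2_hat

# Toy subject (made-up, noise-free values lying exactly on y = 2 + 0.5 t)
times = [0.0, 1.0, 2.0, 3.0]
y = [2.0, 2.5, 3.0, 3.5]
beta_hat, sigma2_hat = fit_stage1(times, y)
```

Repeating this fit for every subject yields the collection of estimates that the second stage takes as its input.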
We refer to Lesaffre, Asefa and Verbeke (1999) for an example where subject-specific profiles have been modeled using fractional polynomials.

3.2.2 Stage 2

In a second step, a multivariate regression model of the form

$$\beta_i = K_i \beta + b_i, \tag{3.2}$$

is used to explain the observed variability between the subjects, with respect to their subject-specific regression coefficients $\beta_i$. $K_i$ is a $(q \times p)$ matrix of known covariates, and $\beta$ is a $p$-dimensional vector of unknown regression parameters. Finally, the $b_i$ are assumed to be independent, following a $q$-dimensional normal distribution with mean vector zero and general covariance matrix $D$.

3.2.3 Example: The Rat Data

The rat data presented in Section 2.1 have been analyzed by Verbeke and Lesaffre (1999), who describe the subject-specific profiles shown in Figure 2.1 by straight lines, after transforming the original time scale (age expressed in days) logarithmically (see also Section 4.3.3). The first-stage model (3.1) then becomes

$$Y_{ij} = \beta_{1i} + \beta_{2i} t_{ij} + \varepsilon_{ij}, \qquad j = 1, \ldots, n_i, \tag{3.3}$$

where $t_{ij} = \ln[1 + (\mathrm{Age}_{ij} - 45)/10]$, implying that $t = 0$ corresponds to the start of the treatment. The matrix $Z_i$ has two columns: one containing only ones, and one containing all time points $t_{ij}$, $j = 1, \ldots, n_i$.

In the second stage, the subject-specific intercepts and time effects are related to the treatment of the rats (low dose, high dose, control). Our second-stage model (3.2) then becomes

$$\begin{aligned} \beta_{1i} &= \beta_0 + b_{1i}, \\ \beta_{2i} &= \beta_1 L_i + \beta_2 H_i + \beta_3 C_i + b_{2i}, \end{aligned} \tag{3.4}$$

in which $L_i$, $H_i$, and $C_i$ are indicator variables defined to be one if the rat belongs to the low-dose group, the high-dose group, or the control group, respectively, and zero otherwise. The randomization in combination with the chosen transformation of the original time scale allows us to assume the subject-specific intercepts $\beta_{1i}$ not to depend on the treatment.
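To make the rat example concrete, the following sketch builds the first-stage matrix $Z_i$ of (3.3) and a second-stage matrix $K_i$ that reproduces (3.4) when the parameter vector is ordered as $\beta = (\beta_0, \beta_1, \beta_2, \beta_3)'$. The function names and the group labels `"low"`, `"high"`, and `"control"` are assumptions of this illustration, not notation from the study.

```python
import numpy as np

def rat_time(age_days):
    """Transformed time t = ln(1 + (Age - 45) / 10); t = 0 at age 45 days."""
    age = np.asarray(age_days, dtype=float)
    return np.log(1.0 + (age - 45.0) / 10.0)

def rat_Z(age_days):
    """(n_i x 2) matrix Z_i: a column of ones and the transformed times."""
    t = rat_time(age_days)
    return np.column_stack([np.ones_like(t), t])

def rat_K(group):
    """(2 x 4) matrix K_i for beta = (beta_0, beta_1, beta_2, beta_3)'.

    Row 1 gives the common mean intercept beta_0; row 2 picks the mean
    slope of the rat's group via the indicators L_i, H_i, C_i.
    The string labels are illustrative only.
    """
    L, H, C = (float(group == g) for g in ("low", "high", "control"))
    return np.array([[1.0, 0.0, 0.0, 0.0],
                     [0.0, L, H, C]])

ages = [50, 60, 70]          # first three scheduled measurement occasions
Z = rat_Z(ages)
K = rat_K("high")
```

For a high-dose rat, the second row of `K` selects $\beta_2$, so `K @ beta` returns the mean intercept and mean slope stated in (3.4).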
The parameter $\beta_0$ can be interpreted as the average response at the start of the treatment, whereas the parameters $\beta_1$, $\beta_2$, and $\beta_3$ represent the average time effects for each treatment group separately. Of primary interest is the comparison of these average slopes, since this directly measures the treatment effect on the average growth.

3.2.4 Example: The Prostate Data

Pearson et al. (1994) and Verbeke and Molenberghs (1997, Chapter 3) have previously analyzed the prostate data presented in Section 2.3.1, assuming that each individual profile shown in Figure 2.3 can be well approximated by a quadratic function over time, where time is expressed as years before diagnosis (see also Section 4.3.4). The regression model (3.1) in the first stage is then

$$Y_{ij} = \ln(\mathrm{PSA}_{ij} + 1) = \beta_{1i} + \beta_{2i} t_{ij} + \beta_{3i} t_{ij}^2 + \varepsilon_{ij}, \qquad j = 1, \ldots, n_i, \tag{3.5}$$

and the columns of the covariate matrix $Z_i$ contain only ones, all time points $t_{ij}$, and all squared time points $t_{ij}^2$.

In the second stage, the subject-specific intercepts and linear as well as quadratic time effects are related to the diagnostic class of the subject (control, BPH case, local cancer case, or metastatic cancer case). The age at the time of diagnosis is included as a covariate in order to correct for the age differences among the four diagnostic groups. Model (3.2) in the second stage then becomes

$$\begin{aligned} \beta_{1i} &= \beta_1 \mathrm{Age}_i + \beta_2 C_i + \beta_3 B_i + \beta_4 L_i + \beta_5 M_i + b_{1i}, \\ \beta_{2i} &= \beta_6 \mathrm{Age}_i + \beta_7 C_i + \beta_8 B_i + \beta_9 L_i + \beta_{10} M_i + b_{2i}, \\ \beta_{3i} &= \beta_{11} \mathrm{Age}_i + \beta_{12} C_i + \beta_{13} B_i + \beta_{14} L_i + \beta_{15} M_i + b_{3i}, \end{aligned} \tag{3.6}$$

in which $\mathrm{Age}_i$ equals the subject's age at diagnosis ($t = 0$), and where $C_i$, $B_i$, $L_i$, and $M_i$ are indicator variables defined to be one if the subject is a control, a BPH case, a local cancer case, or a metastatic cancer case, respectively, and zero otherwise. The parameters $\beta_2$, $\beta_3$, $\beta_4$, and $\beta_5$ are the average intercepts for the controls, the BPH cases, the L/R cancer cases, and the metastatic cancer cases, respectively, after correction for age at diagnosis.
Similar interpretations hold for the other parameters in (3.6).

3.2.5 Two-Stage Analysis

In practice, the regression parameters in (3.2) are of primary interest. They can be estimated by sequentially fitting the models (3.1) and (3.2). First, all $\beta_i$ are estimated by fitting model (3.1) to the observed data vector $y_i$ of each subject separately, yielding estimates $\hat\beta_i$. Afterward, model (3.2) is fitted to the estimates $\hat\beta_i$, providing inferences for $\beta$.

Fitting the models (3.5) and (3.6) to the prostate data, Verbeke and Molenberghs (1997, Section 3.3) found that the subject-specific regression parameters $\beta_i$ did not depend on age at diagnosis (at the 5% level of significance), but highly significant differences were found among the diagnostic groups. No significant differences were obtained between the controls and the BPH cases, and the two groups of cancer patients only differed with respect to their intercepts.

Note how this two-stage analysis can be interpreted as the calculation (first stage) and analysis (second stage) of summary statistics. First, the actually observed data vector $y_i$ is summarized by $\hat\beta_i$ for each subject separately. Afterward, regression methods are used to assess the relation between the so-obtained summary statistics and relevant covariates. Other summary statistics frequently used in practice are the area under each individual profile (AUC), the mean response for each individual, the largest observation (peak), the half-time, and so forth (see, for example, Weiner 1981 and Rang and Dale 1990).

As for any analysis of summary statistics, the two-stage analysis obviously suffers from at least two problems. First, information is lost in summarizing the vector $y_i$ of observed measurements for the $i$th subject by $\hat\beta_i$. Second, random variability is introduced by replacing the $\beta_i$ in model (3.2) by their estimates $\hat\beta_i$.
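The two-stage procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the analysis reported in the text: both stages are fitted by ordinary least squares (the text does not prescribe a fitting method for the second stage), the data are simulated rather than taken from the rat or prostate studies, and the names `two_stage` and `K_for` as well as all numerical values are hypothetical.

```python
import numpy as np

def two_stage(y_list, Z_list, K_list):
    """Two-stage estimate of beta (both stages by ordinary least squares).

    Stage 1: regress y_i on Z_i for each subject, giving beta_hat_i.
    Stage 2: regress the stacked beta_hat_i on the stacked K_i.
    """
    beta_hat_i = [np.linalg.lstsq(Z, y, rcond=None)[0]
                  for y, Z in zip(y_list, Z_list)]
    K = np.vstack(K_list)              # (N*q, p)
    b = np.concatenate(beta_hat_i)     # (N*q,)
    beta_hat = np.linalg.lstsq(K, b, rcond=None)[0]
    return beta_hat, np.array(beta_hat_i)

def K_for(group):
    """Second-stage design for a rat-type model; group labels are hypothetical."""
    L, H, C = (float(group == g) for g in ("low", "high", "control"))
    return np.array([[1.0, 0.0, 0.0, 0.0],
                     [0.0, L, H, C]])

# Simulated illustration: 30 subjects, 7 time points each (all values made up).
rng = np.random.default_rng(0)
beta = np.array([68.0, 7.5, 6.9, 7.3])
t = np.log(1.0 + (np.arange(50, 111, 10) - 45.0) / 10.0)
Z = np.column_stack([np.ones_like(t), t])

y_list, Z_list, K_list = [], [], []
for group in ["low", "high", "control"] * 10:
    K = K_for(group)
    b_i = rng.normal(scale=[3.0, 0.5])            # random effects b_i
    beta_i = K @ beta + b_i                       # model (3.2)
    y_list.append(Z @ beta_i + rng.normal(scale=1.0, size=t.size))  # model (3.1)
    Z_list.append(Z)
    K_list.append(K)

beta_hat, beta_hat_i = two_stage(y_list, Z_list, K_list)
print(np.round(beta_hat, 2))
```

In this sketch every $\hat\beta_i$ receives the same weight in the second stage, regardless of how many measurements the subject contributed or when they were taken.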
Moreover, the covariance matrix of $\hat\beta_i$ depends strongly on the number of measurements available for the $i$th subject as well as on the time points at which these measurements were taken, and this has not been taken into account in the second stage of the analysis. In Section 3.3, it will be shown how this can be solved by combining the two stages into one model, the so-called linear mixed-effects model.

3.3 The General Linear Mixed-Effects Model

3.3.1 The Model

In order to combine the models from the two-stage analysis, we replace $\beta_i$ in (3.1) by expression (3.2), yielding

$$Y_i = X_i \beta + Z_i b_i + \varepsilon_i, \tag{3.7}$$

where $X_i = Z_i K_i$ is the appropriate $(n_i \times p)$ matrix of known covariates, and where all other components are as defined earlier. Model (3.7) is called a linear mixed(-effects) model with fixed effects $\beta$ and with subject-specific effects $b_i$. It assumes that the vector of repeated measurements on each subject follows a linear regression model where some of the regression parameters are population-specific (i.e., the same for all subjects), whereas other parameters are subject-specific. As in Section 3.2.2, the $b_i$ are assumed to be random and are therefore often called random effects.

In general, a linear mixed-effects model is any model which satisfies (Laird and Ware 1982)

$$\begin{aligned} & Y_i = X_i \beta + Z_i b_i + \varepsilon_i, \\ & b_i \sim N(0, D), \\ & \varepsilon_i \sim N(0, \Sigma_i), \\ & b_1, \ldots, b_N, \varepsilon_1, \ldots, \varepsilon_N \ \text{independent}, \end{aligned} \tag{3.8}$$

where $Y_i$ is the $n_i$-dimensional response vector for subject $i$, 1
0), and which are shown in Figure 3.2 for $\phi = 1$. Note that the most important qualitative difference between these functions is their behavior near $u = 0$, although their tail behavior is also different.

Although Diggle, Liang, and Zeger (1994) discuss model (3.11) in full generality, they do not fit any models which simultaneously include serial correlation as well as random effects other than intercepts. They argue that, in applications, the effect of serial correlation is very often dominated by the combination of random effects and measurement error. In practice, this is often reflected in estimation problems for models which include several random effects, serial correlation, as well as measurement error. We refer to Section 9.4 for an example. In Chapter 10, we will discuss how appropriate residual covariance structures can be found in the presence of random effects other than just intercepts. We also refer to Chapter 4 in the book by Davidian and Giltinan (1995) for a discussion of components of variability in the context of nonlinear mixed models.

4 Exploratory Data Analysis

4.1 Introduction

Most books on longitudinal data discuss exploratory analysis. See, for example, Diggle, Liang, and Zeger (1994). However, most effort is spent on model building and formal aspects of inference. In this section, we present a selected set of techniques to underpin the model building. We distinguish between two modes of display. In Section 4.2, the marginal distribution of the responses in the Vorozole study is explored; that is, we explore the observed profiles averaged over (sub)populations. Three aspects of the data will be looked at in turn: the average evolution, the variance function, and the correlation structure. Afterward, in Section 4.3, we will discuss some procedures for exploring the observed profiles in a subject-specific way.
4.2 Exploring the Marginal Distribution

4.2.1 The Average Evolution

The average evolution describes how the profile for a number of relevant subpopulations (or the population as a whole) evolves over time. The results of this exploration will be useful in order to choose a fixed-effects structure for the linear mixed model.

[Figure 4.1. Vorozole Study. Individual profiles, raw residuals, and standardized residuals.]

The individual profiles are displayed in Figure 4.1, and the mean profiles, per treatment arm, are plotted in Figure 4.2. The average profiles indicate an increase over time which is slightly stronger for the Vorozole group. In addition, the Vorozole group is, with the exception of month 16, consistently higher than the AGT group. Of course, at this point it is not yet possible to decide on the significance of this difference. It is useful to explore the treatment difference separately since, even when both evolutions might be complicated, the treatment difference, which is often of primary interest, could follow a simple model, or vice versa. The treatment difference is plotted in Figure 4.3.

The individual profiles augment the averaged plot with a suggestion of the variability seen within the data. The thinning of the data toward the later study times suggests that trends at later times should be treated with caution. Although these plots also give us some indications about the variability at given times and even about the correlation between measurements of the same individual, it is easier to base such considerations on residual profiles and standardized residual profiles.

[Figure 4.2. Vorozole Study. Mean profiles (standard and new treatment).]
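The quantities plotted in this exploration (mean profiles per arm, the number of observations behind each mean, and the treatment difference) can be computed as in the following minimal sketch. It assumes the data are held in long format, with one entry per measurement; the function names and the toy values are illustrative and are not the Vorozole data.

```python
import numpy as np

def mean_profiles(time, arm, y):
    """Return {arm: (times, means, counts)} from long-format data."""
    time, arm = np.asarray(time), np.asarray(arm)
    y = np.asarray(y, dtype=float)
    out = {}
    for a in np.unique(arm):
        in_arm = arm == a
        times = np.unique(time[in_arm])           # sorted time points
        means = np.array([y[in_arm & (time == t)].mean() for t in times])
        counts = np.array([np.sum(in_arm & (time == t)) for t in times])
        out[str(a)] = (times, means, counts)
    return out

def treatment_difference(profiles, arm_a, arm_b):
    """Difference in mean response (arm_a minus arm_b) at common time points."""
    ta, ma, _ = profiles[arm_a]
    tb, mb, _ = profiles[arm_b]
    common = np.intersect1d(ta, tb)
    return common, ma[np.isin(ta, common)] - mb[np.isin(tb, common)]

# Toy long-format data (made up): measurement month, arm, response.
time = [0, 0, 0, 0, 2, 2, 2, 4, 4]
arm = ["new", "new", "standard", "standard",
       "new", "standard", "standard", "new", "standard"]
y = [1.0, 3.0, 2.0, 4.0, 5.0, 3.0, 5.0, 7.0, 6.0]

profiles = mean_profiles(time, arm, y)
months, diff = treatment_difference(profiles, "new", "standard")
for name, (times, means, counts) in profiles.items():
    print(name, times, means, counts)
print(months, diff)
```

Reporting the counts next to the means makes the thinning of the data at later times visible, which is the reason the text advises caution about late trends.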
4.2.2 The Variance Structure

In addition to the average evolution, the evolution of the variance is important for building an appropriate longitudinal model. Clearly, one has to correct the measurements for the fixed-effects structure, and hence raw residuals must be used. Again, two plots are of interest. The first one pictures the average evolution of the variance as a function of time; the second one merely produces the individual residual plots. The detrended profiles are displayed in Figure 4.1, and the corresponding variance function is plotted in Figure 4.4.

The variance function seems to be relatively stable, and hence a constant variance model could be a plausible starting point. The individual detrended profiles show subjects' tendency, most clearly in the Vorozole group, to decrease right before they leave the study. In addition, the detrended profiles suggest that the variance would decrease over time. This contradicts the variance function and is entirely due to considerable attrition. This observation suggests that caution should be used with incomplete data.
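A variance function of this kind can be computed from raw residuals as in the sketch below. Two assumptions are made that the text does not specify: the fixed-effects structure used for detrending is a separate mean for each arm at each time point, and the variance at a time point is estimated by the mean of the squared residuals at that time. The function names and the toy data are illustrative only.

```python
import numpy as np

def detrend(time, arm, y):
    """Raw residuals: each response minus the mean of its arm at that time."""
    time, arm = np.asarray(time), np.asarray(arm)
    y = np.asarray(y, dtype=float)
    resid = np.empty_like(y)
    for a in np.unique(arm):
        for t in np.unique(time[arm == a]):
            cell = (arm == a) & (time == t)
            resid[cell] = y[cell] - y[cell].mean()
    return resid

def variance_function(time, resid):
    """Mean squared residual and number of observations at each time point."""
    time = np.asarray(time)
    resid = np.asarray(resid, dtype=float)
    times = np.unique(time)
    var = np.array([np.mean(resid[time == t] ** 2) for t in times])
    n = np.array([np.sum(time == t) for t in times])
    return times, var, n

# Toy long-format data (made up): measurement month, arm, response.
time = [0, 0, 0, 0, 2, 2, 2, 4, 4]
arm = ["new", "new", "standard", "standard",
       "new", "standard", "standard", "new", "standard"]
y = [1.0, 3.0, 2.0, 4.0, 5.0, 3.0, 5.0, 7.0, 6.0]

resid = detrend(time, arm, y)
times, var, n = variance_function(time, resid)
print(times, var, n)
```

Because the number of contributing subjects shrinks over time, the later entries of the variance function rest on few observations, which matches the caution about attrition expressed above.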