Bayesian Methods
for Data Analysis
Third Edition



CHAPMAN & HALL/CRC
Texts in Statistical Science Series
Series Editors
Bradley P. Carlin, University of Minnesota, USA
Julian J. Faraway, University of Bath, UK
Martin Tanner, Northwestern University, USA
Jim Zidek, University of British Columbia, Canada

Analysis of Failure and Survival Data
P.J. Smith
The Analysis of Time Series — An Introduction, Sixth Edition
C. Chatfield
Applied Bayesian Forecasting and Time Series Analysis
A. Pole, M. West and J. Harrison
Applied Nonparametric Statistical Methods, Fourth Edition
P. Sprent and N.C. Smeeton
Applied Statistics — Handbook of GENSTAT Analysis
E.J. Snell and H. Simpson
Applied Statistics — Principles and Examples
D.R. Cox and E.J. Snell
Bayesian Data Analysis, Second Edition
A. Gelman, J.B. Carlin, H.S. Stern and D.B. Rubin
Bayesian Methods for Data Analysis, Third Edition
B.P. Carlin and T.A. Louis
Beyond ANOVA — Basics of Applied Statistics
R.G. Miller, Jr.
Computer-Aided Multivariate Analysis, Fourth Edition
A.A. Afifi and V.A. Clark
A Course in Categorical Data Analysis
T. Leonard
A Course in Large Sample Theory
T.S. Ferguson
Data Driven Statistical Methods
P. Sprent
Decision Analysis — A Bayesian Approach
J.Q. Smith
Elementary Applications of Probability Theory, Second Edition
H.C. Tuckwell
Elements of Simulation
B.J.T. Morgan
Epidemiology — Study Design and Data Analysis, Second Edition
M. Woodward
Essential Statistics, Fourth Edition
D.A.G. Rees
Extending the Linear Model with R: Generalized Linear, Mixed Effects and Nonparametric Regression Models
J.J. Faraway
A First Course in Linear Model Theory
N. Ravishanker and D.K. Dey
Generalized Additive Models: An Introduction with R
S. Wood
Interpreting Data — A First Course in Statistics
A.J.B. Anderson
An Introduction to Generalized Linear Models, Third Edition
A.J. Dobson and A.G. Barnett
Introduction to Multivariate Analysis
C. Chatfield and A.J. Collins
Introduction to Optimization Methods and Their Applications in Statistics
B.S. Everitt
Introduction to Probability with R
K. Baclawski
Introduction to Randomized Controlled Clinical Trials, Second Edition
J.N.S. Matthews
Introduction to Statistical Methods for Clinical Trials
Thomas D. Cook and David L. DeMets
Large Sample Methods in Statistics
P.K. Sen and J. da Motta Singer
Linear Models with R
J.J. Faraway
Markov Chain Monte Carlo — Stochastic Simulation for Bayesian Inference, Second Edition
D. Gamerman and H.F. Lopes
Mathematical Statistics
K. Knight



Texts in Statistical Science

Bayesian Methods
for Data Analysis
Third Edition

Bradley P. Carlin
University of Minnesota
Minneapolis, MN, U.S.A.

Thomas A. Louis
Johns Hopkins Bloomberg School of Public Health
Baltimore, MD, U.S.A.



Chapman & Hall/CRC
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2009 by Taylor & Francis Group, LLC


Chapman & Hall/CRC is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works


Printed in the United States of America on acid-free paper
10 9 8 7 6 5 4 3 2 1

International Standard Book Number-13: 978-1-58488-697-6 (Hardcover)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts
have been made to publish reliable data and information, but the author and publisher cannot assume
responsibility for the validity of all materials or the consequences of their use. The authors and publishers
have attempted to trace the copyright holders of all material reproduced in this publication and apologize
to copyright holders if permission to publish in this form has not been obtained. If any copyright material
has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, trans-
mitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter
invented, including photocopying, microfilming, and recording, or in any information storage or retrieval
system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com
(http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood
Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and
registration for a variety of users. For organizations that have been granted a photocopy license by the
CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are
used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data

Carlin, Bradley P.
Bayesian methods for data analysis / authors, Bradley P. Carlin and Thomas A.
Louis. -- 3rd ed.
p. cm. -- (Chapman & Hall/CRC texts in statistical science
series ; 78)
Originally published: Bayes and Empirical Bayes methods for data analysis. 1st ed.
Includes bibliographical references and index.
ISBN 978-1-58488-697-6 (alk. paper)
1. Bayesian statistical decision theory. I. Louis, Thomas A., 1944- II. Carlin, Bradley
P. Bayes and Empirical Bayes methods for data analysis. III. Title. IV. Series.

QA279.5.C36 2008
519.5’42--dc22 2008019143

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com
and the CRC Press Web site at http://www.crcpress.com



to

Caroline, Samuel, Joshua, and Nathan

and

Germaine, Margit, and Erica


Contents

Preface to the Third Edition xiii

1 Approaches for statistical inference 1


1.1 Introduction 1
1.2 Motivating vignettes 2
1.2.1 Personal probability 2
1.2.2 Missing data 2
1.2.3 Bioassay 3
1.2.4 Attenuation adjustment 4
1.3 Defining the approaches 4
1.4 The Bayes-frequentist controversy 6
1.5 Some basic Bayesian models 10
1.5.1 A Gaussian/Gaussian (normal/normal) model 11
1.5.2 A beta/binomial model 11
1.6 Exercises 13

2 The Bayes approach 15


2.1 Introduction 15
2.2 Prior distributions 27
2.2.1 Elicited priors 28
2.2.2 Conjugate priors 32
2.2.3 Noninformative priors 36
2.2.4 Other prior construction methods 40
2.3 Bayesian inference 41
2.3.1 Point estimation 41
2.3.2 Interval estimation 48
2.3.3 Hypothesis testing and Bayes factors 50
2.4 Hierarchical modeling 59
2.4.1 Normal linear models 59
2.4.2 Effective model size and the DIC criterion 70
2.5 Model assessment 79
2.5.1 Diagnostic measures 79

2.5.2 Model averaging 89


2.6 Nonparametric methods 93
2.7 Exercises 98

3 Bayesian computation 105


3.1 Introduction 105
3.2 Asymptotic methods 108
3.2.1 Normal approximation 108
3.2.2 Laplace’s method 110
3.3 Noniterative Monte Carlo methods 112
3.3.1 Direct sampling 112
3.3.2 Indirect methods 115
3.4 Markov chain Monte Carlo methods 120
3.4.1 Gibbs sampler 121
3.4.2 Metropolis-Hastings algorithm 130
3.4.3 Slice sampler 139
3.4.4 Hybrid forms, adaptive MCMC, and other algorithms 140
3.4.5 Variance estimation 150
3.4.6 Convergence monitoring and diagnosis 152
3.5 Exercises 159

4 Model criticism and selection 167


4.1 Bayesian modeling 168
4.1.1 Linear models 168
4.1.2 Nonlinear models 174
4.1.3 Binary data models 176
4.2 Bayesian robustness 181
4.2.1 Sensitivity analysis 181
4.2.2 Prior partitioning 188
4.3 Model assessment 194
4.4 Bayes factors via marginal density estimation 196
4.4.1 Direct methods 197
4.4.2 Using Gibbs sampler output 198
4.4.3 Using Metropolis-Hastings output 200
4.5 Bayes factors via sampling over the model space 201
4.5.1 Product space search 203
4.5.2 “Metropolized” product space search 205
4.5.3 Reversible jump MCMC 206
4.5.4 Using partial analytic structure 208
4.6 Other model selection methods 210
4.6.1 Penalized likelihood criteria: AIC, BIC, and DIC 210
4.6.2 Predictive model selection 215
4.7 Exercises 217

5 The empirical Bayes approach 225


5.1 Introduction 225
5.2 Parametric EB (PEB) point estimation 226
5.2.1 Gaussian/Gaussian models 227
5.2.2 Computation via the EM algorithm 228
5.2.3 EB performance of the PEB 234
5.2.4 Stein estimation 236
5.3 Nonparametric EB (NPEB) point estimation 240
5.3.1 Compound sampling models 240
5.3.2 Simple NPEB (Robbins’ method) 240
5.4 Interval estimation 244
5.4.1 Morris’ approach 245
5.4.2 Marginal posterior approach 246
5.4.3 Bias correction approach 248
5.5 Bayesian processing and performance 251
5.5.1 Univariate stretching with a two-point prior 251
5.5.2 Multivariate Gaussian model 252
5.6 Frequentist performance 253
5.6.1 Gaussian/Gaussian model 254
5.6.2 Beta/binomial model 255
5.7 Empirical Bayes performance 258
5.7.1 Point estimation 259
5.7.2 Interval estimation 262
5.8 Exercises 265

6 Bayesian design 269


6.1 Principles of design 269
6.1.1 Bayesian design for frequentist analysis 269
6.1.2 Bayesian design for Bayesian analysis 271
6.2 Bayesian clinical trial design 274
6.2.1 Classical versus Bayesian trial design 275
6.2.2 Bayesian assurance 277
6.2.3 Bayesian indifference zone methods 279
6.2.4 Other Bayesian approaches 282
6.2.5 Extensions 286
6.3 Applications in drug and medical device trials 287
6.3.1 Binary endpoint drug trial 287
6.3.2 Cox regression device trial with interim analysis 297
6.4 Exercises 308

7 Special methods and models 311


7.1 Estimating histograms and ranks 311
7.1.1 Bayesian ranking 311
7.1.2 Histogram and triple goal estimates 324

7.1.3 Robust prior distributions 328


7.2 Order restricted inference 333
7.3 Longitudinal data models 334
7.4 Continuous and categorical time series 341
7.5 Survival analysis and frailty models 343
7.5.1 Statistical models 343
7.5.2 Treatment effect prior determination 344
7.5.3 Computation and advanced models 345
7.6 Sequential analysis 346
7.6.1 Model and loss structure 347
7.6.2 Backward induction 348
7.6.3 Forward sampling 349
7.7 Spatial and spatio-temporal models 352
7.7.1 Point source data models 353
7.7.2 Regional summary data models 356
7.8 Exercises 361

8 Case studies 373


8.1 Analysis of longitudinal AIDS data 374
8.1.1 Introduction and background 374
8.1.2 Modeling of longitudinal CD4 counts 375
8.1.3 CD4 response to treatment at two months 384
8.1.4 Survival analysis 385
8.1.5 Discussion 386
8.2 Robust analysis of clinical trials 387
8.2.1 Clinical background 387
8.2.2 Interim monitoring 388
8.2.3 Prior robustness and prior scoping 393
8.2.4 Sequential decision analysis 398
8.2.5 Discussion 401
8.3 Modeling of infectious diseases 402
8.3.1 Introduction and data 402
8.3.2 Stochastic compartmental model 403
8.3.3 Parameter estimation and model building 406
8.3.4 Results 409
8.3.5 Discussion 414

Appendices 417

A Distributional catalog 419


A.1 Discrete 420
A.1.1 Univariate 420
A.1.2 Multivariate 421
A.2 Continuous 421

A.2.1 Univariate 421


A.2.2 Multivariate 425

B Decision theory 429


B.1 Introduction 429
B.1.1 Risk and admissibility 430
B.1.2 Unbiased rules 431
B.1.3 Bayes rules 433
B.1.4 Minimax rules 434
B.2 Procedure evaluation and other unifying concepts 435
B.2.1 Mean squared error (MSE) 435
B.2.2 The variance-bias tradeoff 435
B.3 Other loss functions 436
B.3.1 Generalized absolute loss 437
B.3.2 Testing with a distance penalty 437
B.3.3 A threshold loss function 437
B.4 Multiplicity 438
B.5 Multiple testing 439
B.5.1 Additive loss 439
B.5.2 Non-additive loss 440
B.6 Exercises 441

C Answers to selected exercises 445

References 487

Author index 521

Subject index 529


Preface to the Third Edition

As has been well-discussed, the explosion of interest in Bayesian methods
over the last ten to twenty years has been the result of the convergence of
modern computing power and efficient Markov chain Monte Carlo (MCMC)
algorithms for sampling from posterior distributions. Practitioners trained
in traditional, frequentist statistical methods appear to have been drawn
to Bayesian approaches for two reasons. One is that Bayesian approaches
implemented with the majority of their informative content coming from
the current data, and not any external prior information, typically have
good frequentist properties (e.g., low mean squared error in repeated use).
But perhaps more importantly, these methods as now readily implemented
in WinBUGS and other MCMC-driven packages now offer the simplest ap-
proach to hierarchical (random effects) modeling, as routinely needed in
longitudinal, frailty, spatial, time series, and a wide variety of other set-
tings featuring interdependent data.
This book represents the third edition of a book originally titled Bayes
and Empirical Bayes Methods for Data Analysis, first published in 1996.
This original version was primarily aimed at advanced students willing to
write their own Fortran or C++ code to implement empirical Bayes or fully
Bayes–MCMC analyses. When we undertook our first revision in 2000,
we sought to improve the usefulness of the book for the growing legion
of applied statisticians who wanted to make Bayesian thinking a routine
part of their data analytic toolkits. As such, we added a number of new
techniques needed to handle advanced computational and model selection
problems, as well as a variety of new application areas. However, the book’s
writing style remained somewhat terse and mathematically formal, and
thus potentially intimidating to those with only minimal exposure to the
traditional approach. Now, with the WinBUGS language freely available to
any who wish to try their hands at hierarchical modeling, we seek to further
broaden the reach of our book to practitioners for whom statistical analysis
is an important component of their work, but perhaps not their primary
interest.
As such, we have made several changes to the book’s structure, the most
significant of which is the introduction of MCMC thinking and related data
analytic techniques right away in Chapter 2, the basic Bayes chapter. While
the theory supporting the use of MCMC is only cursorily explained at this
point, the aim is to get the reader up to speed on the way that a great deal of
applied Bayesian work is now routinely done in practice. While a probabilist
might disagree, the real beauty of MCMC for us lies not in the algorithms
themselves, but in the way their power enables us to focus on statistical
modeling and data analysis in a way impossible before. As such, Chapter 2
is now generously endowed with data examples and corresponding R and
WinBUGS code, as well as several new homework exercises along these same
lines. The core computing and model criticism and selection material, for-
merly in Chapters 5 and 6, has been moved up to Chapters 3 and 4, in
keeping with our desire to get the key modeling tools as close to the front
of the book as possible. On a related note, new Sections 2.4 and 4.1 contain
explicit descriptions and illustrations of hierarchical modeling, now com-
monplace in Bayesian data analysis. The philosophically related material
on empirical Bayes and Bayesian performance formerly in Chapters 3 and
4 has been thinned and combined into new Chapter 5. Compensating for
this, the design of experiments material formerly (and rather oddly) tacked
onto Chapter 4 has been expanded into its own chapter (Chapter 6) that in-
cludes more explicit advice for clinical trialists and others requiring a basic
education in Bayesian sample size determination, as well as the frequentist
checks still often required of such designs (e.g., by regulatory agencies) be-
fore they are put into practice. Finally, the remaining chapters have been
updated as needed, including a completely revised and expanded Subsec-
tion 7.1 on ranking and histogram estimation, and a new Subsection 8.3
case study on infectious disease modeling and the 1918 flu epidemic.
As with the previous two editions, this revision presupposes no previ-
ous exposure to Bayes or empirical Bayes (EB) methods, but readers with
a master’s-level understanding of traditional statistics – say, at the level
of Hogg and Craig (1978), Mood, Graybill, and Boes (1974), or Casella
and Berger (1990) – may well find the going easier. Thanks to the rear-
rangements mentioned above, a course on the basics of modern applied
Bayesian methods might cover only Chapters 1 to 4, since they provide
all that is needed to do some pretty serious hierarchical modeling in stan-
dard computer packages. In the Division of Biostatistics at the University
of Minnesota, we do essentially this in a three-credit-hour, single-semester
(15 week) course aimed at master’s and advanced undergraduate students
in statistics and biostatistics, and also at master’s and doctoral students
in other departments who need to know enough hierarchical modeling to
analyze their data. For those interested in fitting advanced models be-
yond the scope of standard packages, or in doing methodological research
of their own, the material in the latter chapters may well be crucial. At
Minnesota, we also have a one-semester course of this type, aimed at
doctoral and advanced master’s students in statistics and biostatistics.
See http://www.biostat.umn.edu/~brad/ on the web for many of our
datasets and other teaching-related information.
We owe a debt of gratitude to those who helped in our revision process.
Haijun Ma and Laura Hatfield did an enormous amount of work on the
new examples and homework problems in Chapters 2 and 4. Speaking of
new homework problems, many were authored by Sudipto Banerjee, as part
of his teaching an early master’s-level version of this material at the Uni-
versity of Minnesota. The Bayesian clinical trial material in Sections 6.2
and 6.3 owes a great deal to the relentless Brian Hobbs. Much of Subsec-
tion 7.1.1 is due to Rongheng Lin, and virtually all of Section 8.3 is due
to Anny-Yue Yin, both recent graduates from Johns Hopkins Biostatis-
tics. Gareth Roberts patiently explained the merits of Langevin-Hastings
sampling in terms so plain and convincing that we had no choice but to
include it with the other Metropolis-Hastings material in Subsection 3.4.2.
The entire University of Minnesota 2008 spring semester “Introduction to
Bayesian Analysis” class, co-taught with Prof. Banerjee and ably assisted
by Ms. Hatfield (who is also currently co-developing an instructor’s man-
ual for the book) served as aggressive copy-editors, finding flawed home-
work problems, missing references, and an embarrassing number of other
goof-ups. Rob Calver, David Grubbs, and the legendary Bob Stern at Chap-
man and Hall/CRC/Taylor and Francis Group (or whatever the company’s
called now) were pillars of strength and patience, as usual. Finally, we thank
our families, whose ongoing love and support made all of this possible.
Bradley P. Carlin Minneapolis, Minnesota
Thomas A. Louis Baltimore, Maryland
April 2008
CHAPTER 1

Approaches for statistical inference

1.1 Introduction

The practicing statistician faces a variety of challenges: designing complex
studies, summarizing complex data sets, fitting probability models, draw-
ing conclusions about the present, and making predictions for the future.
Statistical studies play an important role in scientific discovery, in policy
formulation, and in business decisions. Applications of statistics are ubiq-
uitous, and include clinical decision making, conducting an environmental
risk assessment, setting insurance rates, deciding whether (and how) to
market a new product, and allocating federal funds. Currently, most statis-
tical analyses are performed with the help of commercial software packages,
most of which use methods based on a classical, or frequentist, statistical
philosophy. In this framework, maximum likelihood estimates (MLEs) and
hypothesis tests based on p-values figure prominently.
Against this background, the Bayesian approach to statistical design
and analysis is emerging as an increasingly effective and practical alterna-
tive to the frequentist one. Indeed, due to computing advances that enable
relevant Bayesian designs and analyses, the philosophical battles between
frequentists and Bayesians that were once common at professional statis-
tical meetings are being replaced by a single, more eclectic approach. The
title of our book makes clear which philosophy and approach we prefer;
that you are reading it suggests at least some favorable disposition (or at
the very least, curiosity) on your part as well. Rather than launch head-
long into another spirited promotional campaign for Bayesian methods,
we begin with motivating vignettes, then provide a basic introduction to
the Bayesian formalism followed by a brief historical account of Bayes and
frequentist approaches, with attention to some controversies. These lead
directly to our methodological and applied investigations.

1.2 Motivating vignettes


1.2.1 Personal probability
Suppose you have submitted your first manuscript to a journal and have
assessed the chances of its being accepted for publication. This assessment
uses information on the journal’s acceptance rate for manuscripts like yours
(let’s say around 30%), and your evaluation of the manuscript’s quality.
Subsequently, you are informed that the manuscript has been accepted
(congratulations!). What is your updated assessment of the probability
that your next submission (on a similar topic) will be accepted?
The direct estimate is of course 100% (thus far, you have had one success
in one attempt), but this estimate seems naive given what we know about
the journal’s overall acceptance rate (our external, or prior, information in
this setting). You might thus pick a number smaller than 100%; if so, you
are behaving as a Bayesian would because you are adjusting the (unbiased,
but weak) direct estimate in the light of your prior information. This ability
to formally incorporate prior information into an analysis is a hallmark of
Bayesian methods, and one that frees the analyst from ad hoc adjustments
of results that “don’t look right.”
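
To make the update concrete, here is a minimal R sketch; the Beta(3, 7) prior (mean 0.30, standing in for the journal's 30% acceptance rate) and the one-submission data are our own illustrative assumptions, anticipating the beta/binomial model of Subsection 1.5.2.

```r
# Illustrative sketch: a Beta(3, 7) prior has mean 0.30 (the assumed journal
# acceptance rate); we then observe 1 acceptance in 1 submission.
a <- 3; b <- 7                 # assumed prior hyperparameters
y <- 1; n <- 1                 # one submission, one acceptance
(a + y) / (a + b + n)          # posterior mean 0.364: above 0.30, far below 1
```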

1.2.2 Missing data


Consider Table 1.1, reporting an array of stable event prevalence or inci-
dence estimates scaled per 10,000 population, with one value (indicated by
“∗”) missing at random. The reader may think of them as geographically
aligned disease prevalences, or perhaps as death rates cross-tabulated by
clinic and age group.

79 87 83 80 78
90 89 92 99 95
96 100 ∗ 110 115
101 109 105 108 112
96 104 92 101 96

Table 1.1 An array of well-estimated rates per 10,000 with one estimate missing.

With no direct information for ∗, what would you use for an estimate?
Does 200 seem reasonable? Probably not, since the unknown rate is sur-
rounded by estimates near 100. To produce an estimate for the missing
cell you might fit an additive model (rows and columns) and then use the
model to impute a value for ∗, or merely average the values in surround-
ing cells. These are two examples of borrowing information. Whatever your
approach, some number around 100 seems reasonable.

Now assume that we obtain data for the ∗ cell and the estimate is, in
fact, 200, based on 2 events in a population of 100 (200 = 10000 × 2/100).
Would you now estimate ∗ by 200 (a very unstable estimate based on very
little information), when with no information a moment ago you used 100?
While 200 is a perfectly valid estimate (though its uncertainty should be
reported), some sort of weighted average of this direct estimate (200) and
the indirect estimate you used when there was no direct information (100)
seems intuitively more appealing. The Bayesian formalism allows just this
sort of natural compromise estimate to emerge.
Finally, repeat this mental exercise assuming that the direct estimate is
still 200 per 10,000, but now based on 20 events in a population of 1000,
and then on 2000 events in a population of 100,000. What estimate would
you use in each case? Bayes and empirical Bayes methods structure this
type of statistical decision problem, automatically giving increasing weight
to the direct estimate as it becomes more reliable.
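
A small sketch of this compromise (the Gamma/Poisson form and all numerical choices below are ours, not from the text): center a conjugate Gamma prior at the indirect estimate of 100 per 10,000 and let the data weight grow with the population size.

```r
# Assumed Gamma(a, b) prior on the per-person event rate, centered at the
# indirect estimate 100 per 10,000; b = 1000 is an illustrative prior weight.
a <- 10; b <- 1000
y <- c(2, 20, 2000)                  # direct event counts
n <- c(100, 1000, 100000)            # populations; each direct rate is 200
round(10000 * (a + y) / (b + n), 1)  # posterior means: 109.1 150.0 199.0
```

As the direct evidence strengthens, the estimate moves from near the indirect value (100) toward the direct estimate (200), exactly the weighting described above.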

1.2.3 Bioassay
Consider a carcinogen bioassay where you are comparing a control group
(C) and an exposed group (E) with 50 rodents in each (see Table 1.2). In
the control group, 0 tumors are found; in the exposed group, there are 3,
producing a non-significant, one-sided Fisher exact test p-value of approx-
imately 0.125. However, your colleague, who is a veterinary pathologist,
states, “I don’t know about statistical significance, but three tumors in 50
rodents is certainly biologically significant!”

C E Total
Tumor 0 3 3
No Tumor 50 47 97
Total 50 50 100

Table 1.2 Hypothetical bioassay results; one-sided p = 0.125.

This belief may be based on information from other experiments in the
same lab in the previous year in which the tumor has never shown up in con-
trol rodents. For example, if there were 400 historical controls in addition
to the 50 concurrent controls, none with a tumor, the one-sided p-value be-
comes 0.001 (see Table 1.3). Statistical and biological significance are now
compatible. In general, it can be inappropriate simply to pool historical
and concurrent information. However, Bayes and empirical Bayes methods
may be used to structure a valid synthesis; see, for example, Tarone (1982),
Dempster et al. (1983), Tamura and Young (1986), Louis and Bailey (1990),
and Chen et al. (1999).

C E Total
Tumor 0 3 3
No Tumor 450 47 497
Total 450 50 500

Table 1.3 Hypothetical bioassay results augmented by 400 historical controls, none
with a tumor; one-sided p = 0.001.
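
Both p-values can be checked (up to rounding) with Fisher's exact test in R; this snippet is a verification of Tables 1.2 and 1.3, not part of the original text.

```r
# Columns: control, exposed; rows: tumor, no tumor.
concurrent <- matrix(c(0, 50, 3, 47), nrow = 2)   # Table 1.2
augmented  <- matrix(c(0, 450, 3, 47), nrow = 2)  # Table 1.3
fisher.test(concurrent, alternative = "less")$p.value  # approx. 0.12
fisher.test(augmented,  alternative = "less")$p.value  # approx. 0.001
```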

1.2.4 Attenuation adjustment


In a standard errors-in-variables simple linear regression model, the least
squares estimate of the regression slope (β) is biased toward 0, an example
of attenuation. More formally, suppose the true regression is Y = βx + ε,
ε ∼ N(0, σ²), but Y is regressed not on x but on X ≡ x + δ, where
δ ∼ N(0, σδ²). Then the least squares estimate β̂ has expectation E[β̂] ≈ ρβ,
with ρ = σ²/(σ² + σδ²) ≤ 1. If ρ is known or well-estimated, one can correct for
attenuation and produce an unbiased estimate by using β̂/ρ to estimate β.
Though unbiasedness is an attractive property, especially when the stan-
dard error associated with the estimate is small, in general it is less im-
portant than having the estimate “close” to the true value. The expected
squared deviation between the true value and the estimate (mean squared
error, or MSE) provides an effective measure of proximity. Fortunately for
our intuition, MSE can be written as the sum of an estimator’s sampling
variance and its squared bias,
2
MSE = variance + (bias) . (1.1)
The unbiased estimate sets the second term to 0, but it can have a very
large MSE relative to other estimators; in this case, because dividing β̂ by
ρ inflates the variance as the price of eliminating bias. Bayesian estimators
typically strike an effective tradeoff between variance and bias.
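
A quick simulation illustrates both the attenuation and the price of correcting it; the numerical settings, and the assumption that the true covariate x also has variance σ² (so that ρ matches the formula above), are ours.

```r
# Sketch: the naive least squares slope is biased toward 0; dividing by
# rho removes the bias but inflates the variance by the factor 1/rho^2.
set.seed(1)
beta <- 1; sigma <- 1; sigma_d <- 1; n <- 100
rho <- sigma^2 / (sigma^2 + sigma_d^2)   # 0.5 in this setup
slope <- replicate(2000, {
  x <- rnorm(n, 0, sigma)                # true covariate (assumed sd sigma)
  X <- x + rnorm(n, 0, sigma_d)          # covariate observed with error
  Y <- beta * x + rnorm(n, 0, sigma)
  coef(lm(Y ~ X))[2]                     # naive slope estimate
})
c(mean(slope), rho * beta)               # mean is attenuated, near rho * beta
c(var(slope), var(slope / rho))          # the bias correction inflates variance
```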

1.3 Defining the approaches


Three principal approaches to inference guide modern data analysis: fre-
quentist, Bayesian, and likelihood. We now briefly describe each in turn.
The frequentist evaluates procedures based on imagining repeated sam-
pling from a particular model (the likelihood), which defines the probabil-
ity distribution of the observed data conditional on unknown parameters.
Properties of the procedure are evaluated in this repeated sampling frame-
work for fixed values of unknown parameters; good procedures perform well
over a broad range of parameter values.
The Bayesian requires a sampling model and, in addition, a prior distribution on all unknown quantities in the model (parameters and missing
data). The prior and likelihood are used to compute the conditional distri-
bution of the unknowns given the observed data (the posterior distribution),
from which all statistical inferences arise. Allowing the observed data to
play some role in determining the prior distribution produces the empirical
Bayes (EB) approach. The Bayesian evaluates procedures over repeated
sampling of unknowns from the posterior distribution for a given data set.
The empirical Bayesian may also evaluate procedures under repeated sam-
pling of both the data and the unknowns from their joint distribution.
Finally, the likelihoodist (or Fisherian) develops a sampling model but not
a prior, as does the frequentist. However, inferences are restricted to pro-
cedures that use the data only as reported by the likelihood, as a Bayesian
would. Procedure evaluations can be from a frequentist, Bayesian, or EB
point of view.
As presented in Appendix B, the frequentist and Bayesian approaches
can be connected in a decision-theoretic framework, wherein one considers
a model where the unknown parameters and observed data have a joint
distribution. The frequentist conditions on parameters and replicates (in-
tegrates) over the data; the Bayesian conditions on the data and replicates
(integrates) over the parameters. EB (or preposterior) evaluations integrate
over both parameters and data, and can be represented as frequentist per-
formance averaged over the prior distribution, or as Bayesian posterior
performance averaged over the marginal sampling distribution of the ob-
served data (the sampling distribution conditional on parameters, averaged
over the prior). Preposterior properties are relevant to both Bayesians and
frequentists (see Rubin, 1984).
Historically, frequentists have criticized Bayesian procedures for their in-
ability to deal with all but the most basic examples, for overreliance on
computationally convenient priors, and for being too fragile in their depen-
dence on a specific prior (i.e., for a lack of robustness in settings where the
data and prior conflict). Bayesians have criticized frequentists for failure
to incorporate relevant prior information, inefficiency, inflexibility, and in-
coherence (i.e., a failure to process available information systematically, as
a Bayesian approach would). Another common Bayesian criticism is that,
while frequentist methods do avoid dependence on any single set of prior
beliefs, the resulting claims of “objectivity” are often illusory since such
methods still require myriad assumptions about the underlying data gen-
erating mechanism, such as a simple (often normal) model free from con-
founding, selection bias, measurement error, etc. Bayesians often remark
that the choice of prior distribution is only one assumption that should be
explicitly declared and checked in a statistical analysis. Greenland (2006)
points out that statistically significant frequentist findings in observational
epidemiology rarely come with the requisite “truth-in-packaging” caveats
that prior selection forces Bayesians to provide routinely.
Recent computing advances have all but eliminated constraints on priors and models, but leave open the more fundamental difficulties of prior
selection and possible non-robustness. In this book, therefore, we shall of-
ten seek the middle ground: Bayes and EB procedures that offer many of
the Bayes advantages, but do so without giving up too much frequentist
robustness. Procedures occupying this middle ground can be thought of
as having good performance over a broad range of prior distributions, but
not so broad as to include all priors (which would in turn require good
performance over all possible parameter values).

1.4 The Bayes-frequentist controversy

While probability has been the subject of study for hundreds of years (most
notably by mathematicians retained by rich noblemen to advise them on
how to maximize their winnings in games of chance), statistics is a relatively
young field. Linear regression first appeared in the work of Francis Galton
in the late 1800s, with Karl Pearson adding correlation and goodness-of-
fit measures around the turn of the last century. The field did not really
blossom until the 1920s and 1930s, when R.A. Fisher developed the notion
of likelihood for general estimation, and Jerzy Neyman and Egon Pearson
developed the basis for classical hypothesis testing. A flurry of research
activity was energized by World War II, which generated a wide variety
of difficult applied problems and the first substantive government funding
for their solution in the United States and Great Britain.
By contrast, Bayesian methods are much older, dating to the original
1763 paper by the Rev. Thomas Bayes, a minister and amateur mathe-
matician. The area generated some interest by Laplace, Gauss, and others
in the 19th century, but the Bayesian approach was ignored (or actively
opposed) by the statisticians of the early 20th century. Fortunately, dur-
ing this period several prominent non-statisticians, most notably Harold
Jeffreys (a physicist) and Arthur Bowley (an econometrician), continued
to lobby on behalf of Bayesian ideas (which they referred to as “inverse
probability”). Then, beginning around 1950, statisticians such as L.J. Sav-
age, Bruno de Finetti, Dennis Lindley, and many others began advocating
Bayesian methods as remedies for certain deficiencies in the classical ap-
proach. The following example discusses the case of interval estimation.
Example 1.1 Suppose Xi are iid N(θ, σ²), i = 1, . . . , n, where N denotes the
normal (Gaussian) distribution and iid stands for “independent and iden-
tically distributed.” We desire a 95% interval estimate for the population
mean θ. Provided n is sufficiently large (say, bigger than 30), a classical
approach would use the confidence interval

δ(x) = x̄ ± 1.96 s/√n ,

where x = (x1, . . . , xn), x̄ is the sample mean, and s is the sample standard
deviation. This interval has the property that, on average over repeated
applications, δ(x) will fail to capture the true mean θ only 5% of the time.
An alternative interpretation is that, before any data are collected, the
probability that the interval contains the true value is 0.95. This property
is attractive in the sense that it holds for all true values of θ and σ 2 .
On the other hand, its use in any single data-analytic setting is some-
what difficult to explain and understand. After collecting the data and
computing δ(x), the interval either contains the true θ or it does not; its
coverage probability is not 0.95, but either 0 or 1. After observing x, a
statement like, “the true θ has a 95% chance of falling in δ(x),” is not
valid, though most people (including most statisticians irrespective of their
philosophical approach) interpret a confidence interval in this way. Thus,
for the frequentist, “95%” is not a conditional coverage probability, but
rather a tag associated with the interval to indicate either how it is likely
to perform before we evaluate it, or how it would perform over the long
haul. A 99% frequentist interval would be wider, a 90% interval narrower,
but, conditional on x, all would have coverage probability 0 or 1.
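
The long-run claim is easy to verify by simulation; this sketch (our own, with arbitrary θ, σ, and n) draws many samples and records how often the realized interval covers θ.

```r
# Long-run coverage of the interval x.bar +/- 1.96 s/sqrt(n).
set.seed(2)
theta <- 0; sigma <- 1; n <- 50
covers <- replicate(10000, {
  x <- rnorm(n, theta, sigma)
  abs(mean(x) - theta) < 1.96 * sd(x) / sqrt(n)
})
mean(covers)   # close to 0.95, though any single interval covers or it does not
```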

By contrast, Bayesian confidence intervals (known as “credible sets,” and
discussed further in Subsection 2.3.2) are free of this awkward frequentist
interpretation. For example, conditional on the observed data, the proba-
bility is 0.95 that θ is in the 95% credible interval. Of course, this natural
interpretation comes at the price of needing to specify a (possibly quite
vague) prior distribution for θ.
The Neyman-Pearson testing structure can also lead to some very odd
results, which have been even more heavily criticized by Bayesians.

Example 1.2 Consider the following simple experiment, originally sug-
gested by Lindley and Phillips (1976), and reprinted many times. Suppose
in 12 independent tosses of a coin, I observe 9 heads and 3 tails. I wish
to test the null hypothesis H0 : θ = 1/2 versus the alternative hypothesis
Ha : θ > 1/2, where θ is the true probability of heads. Given only this
much information, two choices for the sampling distribution emerge:
1. Binomial: The number n = 12 tosses was fixed beforehand, and the
random quantity X was the number of heads observed in the n tosses.
Then X ∼ Bin(12, θ), and the likelihood function is given by

L1(θ) = C(n, x) θ^x (1 − θ)^(n−x) = C(12, 9) θ^9 (1 − θ)^3 ,   (1.2)

where C(n, x) denotes the binomial coefficient.

2. Negative binomial: Data collection involved flipping the coin until the
third tail appeared. Here, the random quantity X is the number of heads
required to complete the experiment, so that X ∼ NegBin(r = 3, θ),

with likelihood function given by

L2(θ) = C(r + x − 1, x) θ^x (1 − θ)^r = C(11, 9) θ^9 (1 − θ)^3 .   (1.3)
Under either of these two alternatives, we can compute the p-value corre-
sponding to the rejection region, “Reject H0 if X ≥ c.” Doing so using the
binomial likelihood (1.2), we obtain
α1 = Pθ=1/2(X ≥ 9) = Σ_{j=9}^{12} C(12, j) θ^j (1 − θ)^(12−j) = .075 ,

while for the negative binomial likelihood (1.3),

α2 = Pθ=1/2(X ≥ 9) = Σ_{j=9}^{∞} C(2 + j, j) θ^j (1 − θ)^3 = .0325 .

Thus, using the “usual” Type I error level α = .05, we see that the two
model assumptions lead to two different decisions: we would reject H0 if X
were assumed negative binomial, but not if it were assumed binomial. But
there is no information given in the problem setting to help us make this
determination, so it is not clear which analysis the frequentist should regard
as “correct.” In any case, assuming we trust the statistical model, it does
not seem reasonable that how the experiment was monitored should have
any bearing on our decision; surely only its results are relevant! Indeed, the
likelihood functions tell a consistent story, since (1.2) and (1.3) differ only
by a multiplicative constant that does not depend on θ.
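
Both tail areas are one-line computations in R; this check (not from the text) confirms that they straddle α = .05 even though the two likelihoods are proportional.

```r
# Binomial: 12 tosses fixed in advance, reject if X >= 9 heads.
alpha1 <- 1 - pbinom(8, size = 12, prob = 0.5)
# Negative binomial: toss until the 3rd tail; pnbinom counts the heads
# ("failures") observed before the 3rd tail ("success").
alpha2 <- 1 - pnbinom(8, size = 3, prob = 0.5)
c(alpha1, alpha2)   # approx. 0.073 and 0.033, on opposite sides of .05
```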
A Bayesian explanation of what went wrong in the previous example
would be that the Neyman-Pearson approach allows unobserved outcomes
to affect the rejection decision. That is, the probability of X values “more
extreme” than 9 (the value actually observed) was used as evidence against
H0 in each case, even though these values did not occur. More formally, this
is a violation of a statistical axiom known as the Likelihood Principle, a no-
tion present in the work of Fisher and Barnard, but not precisely defined
until the landmark paper by Birnbaum (1962). In a nutshell, the Likeli-
hood Principle states that once the data value x has been observed, the
likelihood function L(θ|x) contains all relevant experimental information
delivered by x about the unknown parameter θ. In the previous exam-
ple, L1 and L2 are proportional to each other as functions of θ, hence are
equivalent in terms of experimental information (recall that multiplying a
likelihood function by an arbitrary function h(x) does not change the MLE
θ̂). Yet in the Neyman-Pearson formulation, these equivalent likelihoods
lead to two different inferences regarding θ. Put another way, frequentist
test results actually depend not only on what x was observed, but on how
the experiment was stopped.
Some statisticians attempt to defend the results of Example 1.2 by arguing that all aspects of the design of an experiment are relevant pieces
of information even after the data have been collected, or perhaps that
the Likelihood Principle is itself flawed (or at least should not be consid-
ered sacrosanct). We do not delve deeper into this foundational discussion,
but refer the interested reader to the excellent monograph by Berger and
Wolpert (1984) for a presentation of the consequences, criticisms, and de-
fenses of the Likelihood Principle. Violation of the Likelihood Principle is
but one of the possibly anomalous properties of classical testing methods;
more are outlined later in Subsection 2.3.3, where it is also shown that
Bayesian hypothesis testing methods overcome these difficulties.
If Bayesian methods offer a solution to these and other drawbacks of the
frequentist approach that were publicized several decades ago, it is per-
haps surprising that Bayesian methodology did not make bigger inroads
into actual statistical practice until only recently. There are several rea-
sons for this. First, the initial staunch advocates of Bayesian methods were
primarily subjectivists, in that they argued forcefully that all statistical cal-
culations should be done only after one’s own personal prior beliefs on the
subject had been carefully evaluated and quantified. But this raised con-
cerns on the part of classical statisticians (and some Bayesians) that the
results obtained would not be objectively valid, and could be manipulated
in any way the statistician saw fit. (The reply of some subjectivists that
frequentist methods were invalid anyway and should be discarded did lit-
tle to assuage these concerns.) Second, and perhaps more important from
an applied standpoint, the Bayesian alternative, while theoretically sim-
ple, required evaluation of complex integrals in even fairly rudimentary
problems. Without inexpensive, high-speed computing, this practical im-
pediment combined with the theoretical concerns to limit the growth of
realistic Bayesian data analysis. However, this growth finally did occur in
the 1980s, thanks to a more objective group of Bayesians with access to
inexpensive, fast computing. The objectivity issue is discussed further in
Chapter 2, while computing is the subject of Chapter 3.
These two main concerns regarding routine use of Bayesian analysis (i.e.,
that its use is not easy or automatic, and that it is not always clear how
objective results may be obtained) were raised in the widely-read paper
by Efron (1986). While the former issue has been largely resolved in the
intervening years thanks to computing advances, the latter remains a chal-
lenge (see Subsection 2.2.3 below). Still, we contend that the advantages
in using Bayes and EB methods justify the increased effort in computation
and prior determination. Many of these advantages are presented in detail
in the popular textbook by Berger (1985, Section 4.1), to wit:

1. Bayesian methods provide the user with the ability to formally incorpo-
rate prior information.

2. Inferences are conditional on the actual data.


3. The reason for stopping the experimentation does not affect Bayesian
inference (a concern in Example 1.2).
4. Bayesian answers are more easily interpretable by nonspecialists (a con-
cern in Example 1.1).
5. All Bayesian analyses follow directly from the posterior; no separate the-
ories of estimation, testing, multiple comparisons, etc. are needed.
6. Any question can be directly answered through Bayesian analysis.
7. Bayes and EB procedures possess numerous optimality properties.
As an example of the sixth point, to investigate the bioequivalence of two
drugs, we would need to test H0 : θ1 ≠ θ2 versus Ha : θ1 = θ2 . That
is, we must reverse the traditional null and alternative roles, because the
hypothesis we hope to reject is that the drugs are different, not that they
are the same. This reversal turns out to make things quite awkward for
traditional testing methods (see e.g., Berger and Hsu, 1996), but not for
Bayesian testing methods, since they treat the null and alternative hypothe-
ses equivalently; they really can “accept” the null, rather than merely “fail
to reject” it. Finally, regarding the seventh point, Bayes procedures are typ-
ically consistent, can automatically impose parsimony in model choice, and
can even define the class of optimal frequentist procedures, thus “beating
the frequentist at his own game.” We return to this final issue (frequentist
motivations for using Bayes and EB procedures) in Chapter 5.

1.5 Some basic Bayesian models


The most basic Bayesian model has two stages, with a likelihood specifica-
tion Y |θ ∼ f (y|θ) and a prior specification θ ∼ π(θ), where either Y or θ
can be vectors. In the simplest Bayesian analysis, π is assumed known, so
that by probability calculus, the posterior distribution of θ is given by
p(θ|y) = f(y|θ)π(θ) / m(y) ,   (1.4)

where

m(y) = ∫ f(y | θ)π(θ) dθ ,   (1.5)

the marginal density of the data y. Equation (1.4) is a special case of
Bayes’ Theorem, the general form of which we present in Section 2.1. For
all but rather special choices of f and π (see Subsection 2.2.2), evaluating
integrals such as (1.5) used to be difficult or impossible, forcing Bayesians
into unappealing approximations. However, recent developments in Monte
Carlo computing methods (see Chapter 3) allow accurate estimation of
such integrals, and thus have enabled advanced Bayesian data analysis.
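
As a preview of those Monte Carlo ideas, the integral (1.5) can be estimated by averaging the likelihood over draws from the prior; the Gaussian choices below are illustrative, and Subsection 1.5.1 supplies the closed form used as a check.

```r
# Monte Carlo sketch of m(y): average f(y|theta) over draws from the prior.
set.seed(3)
mu <- 0; tau <- sqrt(2); sigma <- sqrt(2); y <- 4
theta <- rnorm(1e5, mu, tau)                   # draws from the prior
m_hat <- mean(dnorm(y, theta, sigma))          # Monte Carlo estimate of m(y)
c(m_hat, dnorm(y, mu, sqrt(sigma^2 + tau^2)))  # versus the exact marginal
```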

1.5.1 A Gaussian/Gaussian (normal/normal) model


We now consider the case where both the prior and the likelihood are
Gaussian (normal) distributions, namely,
θ ∼ N(μ, τ²)   (1.6)
Y |θ ∼ N(θ, σ²) .

The marginal distribution of Y given in (1.5) turns out to be N(μ, σ² + τ²),
and the posterior distribution (1.4) is also Gaussian with mean and variance

E(θ | Y ) = Bμ + (1 − B)Y   (1.7)
Var(θ | Y ) = (1 − B)σ² ,   (1.8)

where B = σ²/(σ² + τ²). Since 0 ≤ B ≤ 1, the posterior mean is a weighted
average of the prior mean μ and the direct estimate Y ; the Bayes estimate is
pulled back (or shrunk) toward the prior mean. Moreover, the weight on the
prior mean B depends on the relative variability of the prior distribution
and the likelihood. If σ 2 is large relative to τ 2 (i.e., our prior knowledge is
more precise than the data information), then B is close to 1, producing
substantial shrinkage. If σ 2 is small (i.e., our prior knowledge is imprecise
relative to the data information), B is close to 0 and the estimate is moved
very little toward the prior mean. As we show in Chapter 5, this shrinkage
provides an effective tradeoff between variance and bias, with beneficial
effects on the resulting mean squared error; see equation (1.1).
If one is willing to assume that the structure (1.6) holds, but that the
prior mean μ and variance τ 2 are unknown, then hierarchical or empirical
Bayes methods can be used; see Chapters 2 and 5, respectively.
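
In R, the updating formulas (1.7)-(1.8) take only a few lines; the numerical values here are our own illustration of the shrinkage just described.

```r
mu <- 0; tau2 <- 1             # prior mean and variance
sigma2 <- 4; y <- 3            # sampling variance and observed data
B <- sigma2 / (sigma2 + tau2)  # shrinkage factor, 0.8: prior relatively precise
B * mu + (1 - B) * y           # posterior mean 0.6, pulled well toward mu
(1 - B) * sigma2               # posterior variance 0.8
```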

1.5.2 A beta/binomial model


Next, consider applying the Bayesian approach given in (1.4)–(1.5) to esti-
mating a binomial success probability. With Y the number of events in n
independent trials and θ the event probability, the sampling distribution is

P(Y = y | θ) = f(y | θ) = C(n, y) θ^y (1 − θ)^(n−y) .
In order to obtain a closed form for the marginal distribution, we use the
Beta(a, b) prior distribution for θ (see Section A.2 in Appendix A). For
convenience we reparametrize from (a, b) to (μ, M ) where μ = a/(a+b), the
prior mean, and M = a + b, a measure of prior precision. More specifically,
the prior variance is then μ(1 − μ)/(M + 1), a decreasing function of M .
The marginal distribution of Y is then referred to as beta-binomial, and
can be shown to have mean and variance satisfying

E[Y/n] = μ   and   Var[Y/n] = (μ(1 − μ)/n) [1 + (n − 1)/(M + 1)] .

The term in square brackets is known variously as the “variance inflation
factor,” “design effect,” or “component of extra-binomial variation.”
The posterior distribution of θ given Y in (1.4) is again Beta with mean

θ̂ = (M/(M + n)) μ + (n/(M + n)) (Y/n)   (1.9)
  = μ + (n/(M + n)) (Y/n − μ) ,

and variance Var(θ | Y ) = [θ̂(1 − θ̂)]/(M + n + 1). Note that, similar to the
posterior mean (1.7) in the Gaussian/Gaussian example, θ̂ is a weighted
average of the prior mean μ and the maximum likelihood estimate Y /n,
with weight depending on the relative size of M (the information in the
prior) and n (the information in the data).
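
A matching sketch for (1.9), again with illustrative numbers: the posterior mean lies between the prior mean μ and the MLE Y/n, with M and n acting as the weights.

```r
mu <- 0.3; M <- 10                 # prior mean and precision (a = 3, b = 7)
y <- 12; n <- 20                   # observed events and trials
c(y / n, (M * mu + y) / (M + n))   # MLE 0.60 versus posterior mean 0.50
```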

Discussion
Statistical decision rules can be generated by any philosophy under any col-
lection of assumptions. They can then be evaluated by any criteria, even
those arising from an utterly different philosophy. We contend (and the
subsequent chapters will show) that the Bayesian approach is an excel-
lent “procedure generator,” even if one's evaluation criteria are frequentist,
provided that the prior distributions introduce only a small amount of in-
formation. This agnostic view considers features of the prior (possibly the
entire prior) as “tuning parameters” that can be used to produce a decision
rule with broad validity. The Bayesian formalism will be even more effective
if one desires to structure an analysis using either personal opinion or ob-
jective information external to the current data set. The Bayesian approach
also encourages documenting assumptions and quantifying uncertainty. Of
course, no approach automatically produces broadly valid inferences, even
in the context of the Bayesian models. A procedure generated by a high-
information prior with most of its mass far from the truth will perform
poorly under both Bayesian and frequentist evaluations.
Statisticians need design and analysis methods that strike an effective
tradeoff between efficiency and robustness, irrespective of the underlying
philosophy. For example, in estimation, central focus should be on reduc-
tion of MSE and related performance measures through a tradeoff between
variance and bias. This concept is appropriate for both frequentists and
Bayesians. In this context, our strategy will be to use the Bayesian formal-
ism to reduce MSE even when evaluations are frequentist.
Importantly, the Bayesian formalism properly propagates uncertainty
through the analysis, enabling a more realistic (typically inflated) assess-
ment of the variability in estimated quantities of interest. Also, the for-
malism structures the analysis of complicated models where intuition may
produce faulty or inefficient approaches. This structuring becomes especially important in multiparameter models in which the Bayesian approach
requires that the joint posterior distribution of all parameters structure all
analyses. Appropriate marginal distributions focus on specific parameters,
and this integration ensures that all uncertainties influence the spread and
shape of the marginal posterior.

1.6 Exercises
1. Let θ be the true proportion of men in your community over the age of
40 with hypertension. Consider the following “thought experiment”:
(a) Though you may have little or no expertise in this area, give an initial
point estimate of θ.
(b) Now suppose a survey to estimate θ is established in your community,
and of the first 5 randomly selected men, 4 are hypertensive. How does
this information affect your initial estimate of θ?
(c) Finally, suppose that at the survey’s completion, 400 of 1000 men
have emerged as hypertensive. Now what is your estimate of θ?
What guidelines for statistical inference do your answers suggest?
2. Repeat the journal publication thought problem from Subsection 1.2.1
for the situation where
(a) you have won a lottery on your first try.
(b) you have correctly predicted the winner of the first game of the World
Series (professional baseball).
3. Assume you have developed predictive distributions of the length of time
it takes to drive to work, one distribution for Route A and one for Route
B. What summaries of these distributions would you use to select a route
(a) to maximize the probability that the drive takes less than 30 minutes?
(b) to minimize your average commuting time?
4. For predictive distributions of survival time associated with two medical
treatments, propose treatment selection criteria that are meaningful to
you (or if you prefer, to society).
5. Here is an example in a vein similar to that of Example 1.2, and orig-
inally presented by Berger and Berry (1988). Consider a clinical trial
established to study the effectiveness of vitamin C in treating the com-
mon cold. After grouping subjects into pairs based on baseline variables
such as gender, age, and health status, we randomly assign one member
of each pair to receive vitamin C, with the other receiving a placebo.
We then count how many pairs had vitamin C giving superior relief
after 48 hours. We wish to test H0 : P(vitamin C better) = 1/2 versus
Ha : P(vitamin C better) ≠ 1/2.

(a) Consider the experimental design wherein we sample n = 17 pairs,
and observe x = 13 preferences for vitamin C. What is the p-value
for the above two-sided test?
(b) Now consider a two-stage design, wherein we first sample n1 = 17
pairs, and observe x1 preferences for vitamin C. In this design, if
x1 ≥ 13 or x1 ≤ 4, we stop and reject H0 . Otherwise, we sample an
additional n2 = 27 pairs, and subsequently reject H0 if X1 + X2 ≥ 29
or X1 + X2 ≤ 15. (This second stage rejection region was chosen
because, under H0 , P (X1 + X2 ≥ 29 or X1 + X2 ≤ 15) = P (X1 ≥
13 or X1 ≤ 4), the p-value in part (a) above.)
If we once again observe x1 = 13, what is the p-value under this new
design? Is your answer consistent with that in part (a)?
(c) What would be the impact on the p-value if we kept adding stages to
the design, but kept observing x1 = 13?
(d) How would you analyze these data in the presence of a necessary but
unforeseen change in the design – say, because the first five patients
developed an allergic reaction to the treatment, and the trial was
stopped by its clinicians?
(e) What does all of this suggest about the claim that p-values constitute
“objective evidence” against H0 ?
6. In the Normal/Normal example of Subsection 1.5.1, let σ² = 2, μ = 0,
and τ² = 2.
(a) Suppose we observe y = 4. What are the mean and variance of the
resulting posterior distribution? Sketch the prior, likelihood, and pos-
terior on a single set of coordinate axes.
(b) Repeat part (a) assuming τ² = 18. Explain any resulting differences.
Which of these two priors would likely have more appeal for a fre-
quentist statistician?
7. In the basic diagnostic test setting, a disease is either present (D = 1)
or absent (D = 0), and the test indicates either disease (T = 1) or no
disease (T = 0). Represent P (D = d|T = t) in terms of test sensitivity,
P (T = 1|D = 1), specificity, P (T = 0|D = 0), and disease prevalence,
P (D = 1), and relate to Bayes’ theorem (1.4).
8. In analyzing data from a Bin(n, θ) likelihood, the MLE is θ̂MLE = Y /n,
which has MSE = Ey|θ(θ̂MLE − θ)² = Vary|θ(θ̂MLE) = θ(1 − θ)/n. Find
the MSE of the estimator θ̂Bayes = (Y + 1)/(n + 2) and discuss in what
contexts you would prefer it over θ̂MLE . (θ̂Bayes is the estimator from
equation (1.9) with μ = 1/2 and M = 2.)
CHAPTER 2

The Bayes approach

2.1 Introduction
We begin by reviewing the fundamentals introduced in Chapter 1. The
Bayesian approach begins exactly as a traditional frequentist analysis does,
with a sampling model for the observed data y = (y1 , . . . , yn ) given a vector
of unknown parameters θ. This sampling model is typically given in the
form of a probability distribution f (y|θ). When viewed as a function of θ
instead of y, this distribution is usually called the likelihood, and sometimes
written as L(θ; y) to emphasize our mental reversal of the roles of θ and
y. Note that L need not be a probability distribution for θ given y; that is,
∫ L(θ; y) dθ need not be 1; it may not even be finite. Still, given particular
data values y, it is very often possible to find the value θ that maximizes
the likelihood function, i.e.,
θ̂ = argmaxθ L(θ; y) .
This value is called the maximum likelihood estimate (MLE) for θ. This
idea dates to Fisher (1922; see also Stigler, 2005) and continues to form
the basis for many of the most commonly used statistical analysis methods
today.
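For instance, when no closed form is convenient, the MLE can be computed
numerically in R; the following sketch uses purely hypothetical binomial data
(y = 7 successes in n = 10 trials), for which the closed-form answer is y/n = 0.7:
R code loglik <- function(theta, y, n) dbinom(y, size=n, prob=theta, log=TRUE)
       optimize(loglik, interval=c(0,1), y=7, n=10, maximum=TRUE)$maximum
       # returns 0.7, the maximizing value of theta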
In the Bayesian approach, instead of supposing that θ is a fixed (though
unknown) parameter, we think of it as a random quantity as well. This
is operationalized by adopting a probability distribution for θ that sum-
marizes any information we have about it not related to that provided by
the data y, called the prior distribution (or simply the prior). Just as the
likelihood had parameters θ, the prior may have parameters η; these are
often referred to as hyperparameters, in order to distinguish them from the
likelihood parameters θ. For the moment we assume that the hyperparam-
eters η are known, and thus write the prior as π(θ) ≡ π(θ|η). Inference
concerning θ is then based on its posterior distribution, given by
p(θ|y) = p(y, θ)/p(y) = p(y, θ) / ∫ p(y, θ) dθ
       = f(y|θ)π(θ) / ∫ f(y|θ)π(θ) dθ .   (2.1)
This formula is known as Bayes’ Theorem, and first appeared (in a some-
what simplified form) in Bayes (1763). Notice the contribution of both the
experimental data (in the form of the likelihood f ) and prior opinion (in
the form of the prior π) to the posterior in the last expression of equation
(2.1). The posterior is simply the product of the likelihood and the prior,
renormalized so that it integrates to 1 (and is thus itself a valid probability
distribution).
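This renormalization is easy to carry out numerically. The following sketch
(ours, with illustrative values: a single observation y = 6, a N(2, 1) prior for θ,
and a N(θ, 1) likelihood) evaluates the unnormalized product on a grid and
rescales it to integrate to 1:
R code theta.grid <- seq(-5, 15, length.out=401)
       unnorm <- dnorm(6, mean=theta.grid, sd=1) * dnorm(theta.grid, mean=2, sd=1)
       width <- theta.grid[2] - theta.grid[1]
       post <- unnorm/(sum(unnorm)*width)   # renormalized posterior on the grid
       sum(post)*width                      # check: equals 1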
Readers less comfortable with the probability calculus needed to handle
the continuous variables in (2.1) may still be familiar with a discrete, set
theoretic version of Bayes’ Theorem from a previous probability or statis-
tics course. In this simpler formulation, we are given an event of interest A
and a collection of events Bj , j = 1, . . . , J that are mutually exclusive and
exhaustive (that is, exactly one of them must occur). Given the probabili-
ties of each of these events P (Bj ), as well as the conditional probabilities
P (A|Bj ), from fundamental rules of probability, we have
P(Bj|A) = P(A, Bj)/P(A) = P(A, Bj) / Σ_{j=1}^{J} P(A, Bj)
        = P(A|Bj)P(Bj) / Σ_{j=1}^{J} P(A|Bj)P(Bj) ,   (2.2)
where P (A, Bj ) indicates the joint event where both A and Bj occur; many
textbooks write P (A∩Bj ) for P (A, Bj ). The reader will appreciate that all
four expressions in (2.2) are just discrete, finite versions of the correspond-
ing expressions in (2.1), with the Bj playing the role of the parameters θ
and A playing the role of the data y.
This simplified version of Bayes’ Theorem (referred to by many textbook
authors as Bayes’ Rule) may appear too simple to be of much practical
value, but interesting applications do arise:
Example 2.1 Ultrasound tests done near the end of the first trimester of a
pregnancy are often used to predict the sex of the baby. However, the errors
made by radiologists in reading ultrasound results are not symmetric, in
the following sense: girls are virtually always correctly identified as girls,
while boys are sometimes misidentified as girls (in cases where the penis is
not clearly visible, perhaps due to the child’s position in the womb). More
specifically, a leading radiologist states that
P (test + |G) = 1 and P (test + |B) = .25 ,
where “test +” denotes that the ultrasound test predicts the child is a girl.
Thus, we have a 25% false positive rate for girl, but no false negatives.
Suppose a particular woman’s test comes back positive for girl, and we
wish to know the probability she is actually carrying a girl. Assuming 48%
of babies are girls, we can use (2.2) where “boy” and “girl” provide the
J = 2 mutually exclusive and exhaustive cases Bj . Thus, with A being the
event of a positive test, we have

P(G | test+) = P(test + |G)P(G) / [P(test + |G)P(G) + P(test + |B)P(B)]
             = (1)(.48) / [(1)(.48) + (.25)(.52)] = .787 ,
or only a 78.7% chance the baby is, in fact, a girl.
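The arithmetic is quickly checked in R (our snippet, using the numbers above):
R code sens <- 1; fpos <- .25; pG <- .48        # sensitivity, false positive rate, P(G)
       sens*pG/(sens*pG + fpos*(1 - pG))        # returns 0.7869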
Now let us return to the general case, where expressions on either side of
the conditioning bar can be continuous random variables. Apparently the
greatest challenge in evaluating the posterior lies in performing the integral
in the denominator of (2.1). Notice we are writing this as a single integral,
but in fact it is a multiple integral, having dimension equal to the number
of parameters in the θ vector.
The denominator integral is sometimes written as m(y) ≡ m(y|η), the
m denoting that this is the marginal distribution of the data y given the
value of the hyperparameter η. This distribution plays a crucial role in
model checking and model choice, since it can be evaluated at the observed
y; we will say much more about this in Subsections 2.3.3 and 2.5 below.
In some cases this integral can be evaluated in closed form, leading to a
closed form posterior:
Example 2.2 Consider the normal (Gaussian) likelihood, where f(y|θ) =
(1/(σ√2π)) exp(−(y − θ)²/(2σ²)), y ∈ ℝ, θ ∈ ℝ, and σ is a known positive constant.
Henceforth we employ the shorthand notation f(y|θ) = N(y|θ, σ²) to denote
a normal density with mean θ and variance σ². Suppose we take
π(θ|η) = N(θ|μ, τ²), where μ ∈ ℝ and τ > 0 are known hyperparameters,
so that η = (μ, τ). By plugging these expressions into equation (2.1), it is
fairly easy to show that the posterior distribution for θ is given by
 
p(θ|y) = N(θ | (σ²μ + τ²y)/(σ² + τ²), σ²τ²/(σ² + τ²)) ;   (2.3)
see the end of this chapter, Exercise 2. Writing B = σ²/(σ² + τ²), this
posterior has mean
Bμ + (1 − B)y .
Since 0 < B < 1, the posterior mean is a weighted average of the prior mean
and the observed data value, with weights that are inversely proportional
to the corresponding variances. For this reason, B is sometimes called a
shrinkage factor, because it gives the proportion of the distance that the
posterior mean is “shrunk back” from the ordinary frequentist estimate y
toward the prior mean μ. Note that when τ² is large relative to σ² (i.e.,
vague prior information), B is small and the posterior mean is close to the
data value y. On the other hand, when τ² is small relative to σ² (i.e., a
highly informative prior), B is large and the posterior mean is close to the
prior mean μ.
Figure 2.1 Prior, likelihood, and posterior distributions, elementary normal/normal
model with a single observation y = 6.

Turning to the posterior variance, which from (2.3) is Bτ² ≡ (1 − B)σ²,
note that it is smaller than the variance of either the prior or the likelihood.
It is easy to show that the precision in the posterior is the sum of the
precisions in the likelihood and the prior, where precision is defined to
be the reciprocal of the variance; again see Exercise 2. Thus the posterior
distribution offers a sensible compromise between our prior opinion and the
observed data, and the combined strength of the two sources of information
leads to increased precision in our understanding of θ.
As a concrete example, suppose that μ = 2, τ = 1, y = 6, and σ = 1.
Figure 2.1 plots the prior (centered at θ = 2), the likelihood (centered at
θ = 6), and the posterior arising from (2.3). Note that due to the equal
weighting of the prior and the likelihood in this problem (τ = σ), the
posterior mean is 4, the unweighted average of the prior and the likelihood
means. The posterior is not however “twice as tall” as either the prior or
the likelihood since it is precisions (not density heights) that are additive
here: the posterior has precision 1 + 1 = 2, hence variance 1/2, hence
standard deviation √(1/2) ≈ .707. Thus, the posterior covers a range of
roughly 4 ± 3(.707) ≈ (1.88, 6.12), as seen in Figure 2.1.
Given a sample of n independent observations, we can obtain f(y|θ) as
∏_{i=1}^{n} f(yi|θ), and proceed with equation (2.1). But evaluating this expres-
sion may be simpler if we can find a statistic S(y) that is sufficient for
θ (that is, for which f (y|θ) = h(y)g(S(y)|θ)). To see this, note that for
S(y) = s,

p(θ|y) = f(y|θ)π(θ) / ∫ f(y|u)π(u) du
       = h(y)g(S(y)|θ)π(θ) / ∫ h(y)g(S(y)|u)π(u) du
       = g(s|θ)π(θ) / m(s) = p(θ|s) ,
provided m(s) > 0, since h(y) cancels in the numerator and denominator of
the middle expression. So if we can find a sufficient statistic, we may work
with it instead of the entire dataset y, thus reducing the dimensionality of
the problem.
Example 2.3 Consider again the normal/normal setting of Example 2.2,
but where we now have an independent sample of size n from f (y|θ). Since
S(y) = ȳ is sufficient for θ, we have that p(θ|y) = p(θ|ȳ). But, because we
know that f(ȳ|θ) = N(θ, σ²/n), equation (2.3) implies that

p(θ|ȳ) = N(θ | ((σ²/n)μ + τ²ȳ)/((σ²/n) + τ²), (σ²/n)τ²/((σ²/n) + τ²))
       = N(θ | (σ²μ + nτ²ȳ)/(σ² + nτ²), σ²τ²/(σ² + nτ²)) .   (2.4)
Returning to the specific setting of Example 2.2, suppose we keep μ = 2
and τ = σ = 1, but now let ȳ = 6. Figure 2.2 plots the prior distribution,
along with the posterior distributions arising from two different sample
sizes, 1 and 10. As already seen in Figure 2.1, when n = 1, the prior and
likelihood receive equal weight, and hence the posterior mean is 4 = (2 + 6)/2.
When n = 10, the data dominate the prior, resulting in a posterior mean
much closer to ȳ. Notice that the posterior variance also shrinks as n gets
larger; the posterior collapses to a point mass on ȳ as n tends to infinity.
Plots of posterior distributions like Figures 2.1 and 2.2 are easily drawn
in the R package; see www.r-project.org. This software, the freeware heir
to the S and S-plus languages, has become widely popular with statis-
ticians for its easy blending of data manipulation and display, graphics,
programmability, and both built-in and user-contributed packages for fit-
ting and testing a wide array of common statistical models. The reader is
referred to Venables and Ripley (2002) for an extensive and popular R tu-
torial; here as an initial illustration we provide the code to draw Figure 2.2.
First, we set up a function called postplot to calculate the posterior mean
and variance in our normal/normal model, as indicated by equation (2.4):
Figure 2.2 Prior and posterior distributions based on samples of size 1 and 10,
normal/normal model.

R code postplot <- function(mu, tau, ybar, sigma, n){
         # posterior mean and sd for the normal/normal model, eq. (2.4)
         denom <- sigma^2/n + tau^2
         mu <- (sigma^2/n*mu + tau^2*ybar)/denom
         stdev <- sqrt(sigma^2/n*tau^2/denom)
         return(c(mu, stdev))
       }
Next we set up the horizontal plotting axis, and plot the prior using the
dnorm function, which gives the density of a normal with the given mean
and standard deviation over the grid:
R code x <- seq(-2,8,length.out=100)
plot(x, dnorm(x, mean=2, sd=1), ylim=c(0, 1.3),
xlab=expression(theta), ylab="density", type= "l")
Here the ylim option sets the vertical scale, while the type="l" option
indicates we want a connected line graph (not merely points plotted at
each grid value). Next, we may add the posterior densities for n = 1 and
n = 10 using the lines command:
R code param1 <- postplot(2,1,6,1,1)
lines(x, dnorm(x, mean=param1[1], sd=param1[2]), lty=2)
param10 <- postplot(2,1,6,1,10)
lines(x, dnorm(x, mean=param10[1], sd=param10[2]), lty=3)
where we request dashed (lty=2) or dotted (lty=3) line types for the two
posteriors, since we’ve already used a solid line (lty=1) for the prior. Fi-
nally, we can add a legend to the figure by typing
R code legend(-2, 1.3, legend=c("prior", "posterior with n=1",
"posterior with n=10"), lty=1:3, ncol=1)
This completes Figure 2.2.
The R language also allows the user to draw random samples from a
wide variety of probability distributions. This is sometimes called Monte
Carlo sampling, after the city known for its famous casinos (and presumably
rarely visited by probabilists who named the technique). In the previous
example, we may draw 2000 independent random samples directly from the
posterior (say, in the n = 1 case) using the rnorm command:
R code y1 <- rnorm(2000, mean=param1[1], sd=param1[2])
## param1: [1] 4.0000000 0.7071068
A histogram of these samples can be added to our previous plot using hist
and lines as follows:
R code r1 <- hist(y1, freq=F, breaks=20, plot=F)
lines(r1, lty=2, freq=F, col="gray90")
producing Figure 2.3. We remark that hist can be used without first stor-
ing its result in r1, but this will start the plot over again, erasing the
three existing curves. The empirical mean and standard deviation of our
Monte Carlo samples may be obtained simply by typing mean(y1) and
sd(y1) in R; for our sample we obtained 4.03 and 0.693, very close to the
true values of 4.0 and 0.707, respectively. These estimates could be made
more accurate by increasing the Monte Carlo sample size (say, from 2000
to 10,000 or 100,000); we defer details about the error inherent in Monte
Carlo estimation to Section 3.3.
Of course, Monte Carlo methods are not necessary in this simple nor-
mal/normal example, since the integral in the denominator of Bayes’ Theo-
rem can be evaluated in closed form. In this case, we would likely prefer the
resulting smooth curve and corresponding exact answers for the posterior
mean and variance to the bumpy histogram and the estimated mean and
variance produced by Monte Carlo. However, if the likelihood f and the
prior π do not permit evaluation of the denominator integral, Monte Carlo
methods are generally the preferred method for estimating the posterior.
The reason for this is the approach’s great generality: samples can typi-
cally be drawn from any posterior regardless of how high-dimensional the
parameter vector θ is. Thus, while we no longer get a smooth, exact func-
tional form for the posterior p(θ|y), we gain the ability to work problems
of essentially unlimited complexity.
Figure 2.3 Prior and posterior distributions for the normal/normal model, with
a histogram of Monte Carlo draws superimposed for the n = 1 case.

Chapter 3 contains an extensive discussion of the various computational
methods useful in Bayesian data analysis; for the time being, we give only
the briefest indication of how these methods can be implemented in the
WinBUGS language. WinBUGS is a freely available program developed by
statisticians and probabilistic expert systems researchers at the Medical
Research Council Biostatistics Unit at the University of Cambridge, Eng-
land. In a nutshell, it allows us to draw samples from any posterior dis-
tribution, freeing us from having to worry overmuch about the integral in
(2.1). This allows us instead to focus on the statistical modeling, which is
after all our primary interest. WinBUGS uses syntax very similar to that of
R, and in fact can now be called from R using the BRugs package, a subject
to which we return below.
WinBUGS implements a particular form of Monte Carlo algorithm known
as Markov chain Monte Carlo, or MCMC for short. Again, while we re-
turn to this subject in excruciating detail in Section 3.4, for now we sim-
ply note that the approach essentially amounts to sampling sequentially
from each parameter’s full conditional distribution, i.e., the distribution of
each model parameter given every other model parameter and the data.
So for example, in a 10-parameter model, the full conditional for θ1 is
p(θ1 |θ2 , θ3 , . . . , θ10 , y). It turns out these full conditional distributions are
often easy to sample from even when the full joint posterior distribution (in
this case, p(θ1 , . . . , θ10 |y)) is not. Because of the sequential nature of the
Monte Carlo updating, an MCMC algorithm produces correlated (not independent)
draws from the true joint posterior, so deciding when to stop the
algorithm can be tricky (see Subsection 3.4.6). However, for a broad class
of fairly standard linear hierarchical models, MCMC methods work well
and are widely accepted as the “industry standard” for Bayesian modeling.
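To make the idea of sampling from full conditionals concrete, here is a toy
Gibbs sampler of our own (not from the text) for yi ∼ N(θ, σ²) with a N(0, 100)
prior on θ and an Inverse-Gamma(.01, .01) prior (rate parameterization) on σ²;
both full conditionals are available in closed form:
R code set.seed(1); y <- rnorm(20, mean=5, sd=2); n <- length(y)  # simulated data
       theta <- 0; sig2 <- 1; draws <- matrix(NA, 1000, 2)
       for (t in 1:1000) {
         prec <- 1/100 + n/sig2                     # full conditional for theta
         theta <- rnorm(1, mean=(sum(y)/sig2)/prec, sd=sqrt(1/prec))
         sig2 <- 1/rgamma(1, shape=.01 + n/2,       # full conditional for sig2
                          rate=.01 + sum((y - theta)^2)/2)
         draws[t,] <- c(theta, sig2)
       }
       colMeans(draws[501:1000,])   # discard burn-in, then summarize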
Example 2.4 Consider once again the normal/normal setting of Exam-
ple 2.3, but where we now wish to use WinBUGS to sample from the posterior
of θ. As of the current writing, the latest version (1.4.3) of the program
may be downloaded from www.mrc-bsu.cam.ac.uk/bugs/welcome.shtml.
Once installed, a good way to learn the basics of the language is to follow
the tutorial: click on Help, pull down to User Manual, and then click on
Tutorial. Perhaps even more easily, one can watch “WinBUGS – The
Movie,” a delightful Flash introduction to running the software available
at www.statslab.cam.ac.uk/~krice/winbugsthemovie.html. Finally, a
large collection of worked examples are available within WinBUGS by clicking
on Help and pulling down to Examples Vol I or Examples Vol II. These
examples enable the familiar statistical computing strategy of attempting
to find a piece of code that does something similar to what you want to
do, modifying it to fit your particular model and data set, and then hoping
the code still runs once the modifications are made.
WinBUGS solutions to Bayesian hierarchical modeling problems require
three basic elements: (1) some WinBUGS code to specify the statistical model,
(2) a file containing the data, and (3) a short file giving starting values for
the MCMC sampling algorithm. For our normal/normal problem, the first
element looks like this:
BUGS code model{
          prec.ybar <- n/sigma2              # sampling precision of ybar
          prec.theta <- 1/tau2               # prior precision
          ybar ~ dnorm(theta, prec.ybar)     # likelihood (BUGS parameterizes by precision)
          theta ~ dnorm(mu, prec.theta)      # prior
       }

Notice the use of <- (arrow) for assignment and ~ (tilde) for “is distributed
as,” consistent with usual notation in R and statistical practice generally.
However, note that the second parameter in the normal distribution expres-
sion dnorm is the precision (reciprocal variance), not the variance itself, a
convention that many students find confusing initially. To be consistent
with our notation from Examples 2.2 and 2.3, the code above includes
assignment statements for the precision in the data (prec.ybar) and the
prior (prec.theta) that give the appropriate conversions from the corresponding
variances, σ²/n and τ².
The data file in this simple problem consists of the one line
BUGS code list(ybar=6, mu=2, sigma2=1, tau2=1, n=1)
Figure 2.4 Estimated posterior density based on 30,000 Gibbs samples (histogram
not shown), computed by WinBUGS using the “density” command within the
Sample Monitor tool.

Thus the data file includes not only the data ȳ, but also the sample size n
and the hyperparameters μ, σ², and τ². Finally, the initial values file takes
the form
BUGS code list(theta = 0)
which simply initializes the sampler at 0 (though in this univariate problem
the starting place actually turns out to be arbitrary).
Running the sampler for 30,000 iterations, we obtained a smoothed kernel
density estimate plotted in Figure 2.4, which looks very similar to the
histogram in Figure 2.3. The sample mean and standard deviation of the
30,000 draws were 3.997 and 0.704, very close to the true values of 4.0 and
0.707.
Again, we do not need all this MCMC power to solve this tiny little
problem, but the benefits of WinBUGS will become very apparent as this
chapter wears on and we begin tackling models with many more unknown
parameters and more complicated distributional structures.

Models with three or more stages


The basic Bayesian model considered in equation (2.1) has two stages,
one for f (y|θ), the likelihood of the data y given the parameters θ, and
one for π(θ|η), the prior distribution of the model parameters θ given a
vector of hyperparameters η. In many cases, however, we may need to
use a model with more than two stages. For instance, suppose we were
unsure as to the proper value for the η vector. The proper Bayesian solution
would be to quantify this uncertainty in a second-stage prior distribution
(sometimes called a hyperprior). Denoting this distribution by h(η), the
desired posterior for θ is now obtained by also marginalizing over η,

p(θ|y) = p(y, θ)/p(y) = ∫ p(y, θ, η) dη / ∫∫ p(y, u, η) dη du
       = ∫ f(y|θ)π(θ|η)h(η) dη / ∫∫ f(y|u)π(u|η)h(η) dη du .   (2.5)
Alternatively, we might simply replace η by an estimate η̂ obtained as the
value that maximizes the marginal distribution m(y|η) viewed as a func-
tion of η. Inference is now based on the estimated posterior distribution
p(θ|y, η̂), obtained by plugging η̂ into equation (2.1) above. Such an ap-
proach is often referred to as empirical Bayes analysis, since we are using
the data to estimate the prior parameter η. This is the subject of Chap-
ter 5, where we also draw comparisons with the straight Bayesian approach
using hyperpriors (c.f. Meeden, 1972; Deely and Lindley, 1981).
Of course, in principle, there is no reason why the hyperprior for η can-
not itself depend on a collection of unknown parameters λ, resulting in a
generalization of (2.5) featuring a second-stage prior h(η|λ) and a third-
stage prior g(λ). This enterprise of specifying a model over several levels
is called hierarchical modeling, with each new distribution forming a new
level in the hierarchy. The proper number of levels varies with the problem;
see the discussions in Sections 2.4 and 4.1 below. Because we are continu-
ally adding randomness as we move down the hierarchy, subtle changes to
levels near the top are not likely to have much of an impact on the one at
the bottom (the data level), which is typically the only one for which we
actually have observations.
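As a minimal WinBUGS sketch of a three-stage normal hierarchy (ours; the
variable names and the fixed hyperprior values are purely illustrative), one
might write:
BUGS code model{
          for (i in 1:n) { y[i] ~ dnorm(theta, prec.y) }  # stage 1: likelihood
          theta ~ dnorm(eta, prec.theta)   # stage 2: prior with unknown mean eta
          eta ~ dnorm(0, 0.001)            # stage 3: vague hyperprior on eta
       }
where prec.y and prec.theta would be supplied as data, or themselves assigned
priors.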
In order to concentrate on more foundational issues, in the remainder of
this section we will typically limit ourselves to the simplest model having
only two levels (likelihood and prior), again suppressing the dependence of
the prior on the known value of η.

Bayesian estimation and prediction


Observe that equation (2.1) may be expressed in the convenient shorthand
p(θ|y) ∝ f (y|θ)π(θ) ,
or in words, “the posterior is proportional to the likelihood times the prior.”
Clearly the likelihood may be multiplied by any constant (or even any
function of y alone) without altering the posterior.
Bayes’ Theorem may also be used sequentially: suppose we have two
independently collected samples of data, y1 and y2 . Then
p(θ|y1 , y2 ) ∝ f (y1 , y2 |θ)π(θ)
= f2 (y2 |θ)f1 (y1 |θ)π(θ)
∝ f2 (y2 |θ)p(θ|y1 ) . (2.6)
That is, we can obtain the posterior for the full dataset (y1 , y2 ) by first
finding p(θ|y1 ) and then treating it as the prior for the second portion of
the data y2 . This easy algorithm for updating the posterior is quite natural
when the data arrive sequentially over time, as in a clinical trial or perhaps
a business or economic setting.
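This is easy to verify numerically with the postplot function given earlier
(our check, with illustrative values μ = 2, τ = σ = 1):
R code step1 <- postplot(mu=2, tau=1, ybar=5, sigma=1, n=1)  # update on y1 = 5
       step2 <- postplot(mu=step1[1], tau=step1[2], ybar=7, sigma=1, n=1)
       step2   # matches a single update with ybar = 6, n = 2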
Many authors (notably Geisser, 1993) have argued that concentrating
on inference for the model parameters is misguided, since θ is merely an
unobservable, theoretical quantity. Switching to a different model for the
data may result in an entirely different θ vector. Moreover, even a perfect
understanding of the model does not constitute a direct attack on the
problem of predicting how the system under study will behave in the future
– often the real goal of a statistical analysis. To this end, suppose that yn+1
is a future observation, independent of y given the underlying θ. Then the
predictive distribution for yn+1 is given by

p(yn+1 |y) = p(yn+1 , θ|y)dθ

= f (yn+1 |θ, y)p(θ|y)dθ

= f (yn+1 |θ)p(θ|y)dθ , (2.7)

the last equality holding thanks to the conditional independence of yn+1


and y given the parameters θ (i.e., the usual independence of observations
often assumed in the likelihood model). The predictive distribution sum-
marizes the information concerning the likely value of a new observation,
given the likelihood, the prior, and the data we have observed so far. (We
remark that some authors refer to this distribution as the posterior predictive,
and refer to the marginal distribution m(yn+1) = ∫ f(yn+1|θ)π(θ) dθ
as the prior predictive, since the latter summarizes our information con-
cerning yn+1 before having seen the data.)
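In practice, draws from (2.7) are obtained by composition: sample θ from
the posterior, then yn+1 from the likelihood given that θ. Continuing the
n = 1 normal/normal example (our sketch, reusing param1 from above):
R code theta.draws <- rnorm(2000, mean=param1[1], sd=param1[2])  # theta ~ p(theta|y)
       ynew <- rnorm(2000, mean=theta.draws, sd=1)  # y_{n+1} ~ f(y|theta), sigma = 1
       c(mean(ynew), sd(ynew))   # approx. 4 and sqrt(1/2 + 1) = 1.22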
The Bayesian decision-making paradigm improves on the traditional, fre-
quentist approach to statistical analysis in its more philosophically sound
foundation, its unified and streamlined approach to data analysis, and its
ability to formally incorporate the prior opinion of one or more experi-
menters into the results via the prior distribution π. Practicing statisti-
cians and biostatisticians, once reluctant to adopt the Bayesian approach
due to general skepticism concerning its philosophy and a lack of neces-
sary computational tools, are now turning to it with increasing regularity
as traditional analytic approaches emerge as both theoretically and prac-
tically inadequate. For example, hierarchical Bayes methods form an ideal
setting for combining information from several published studies of the
same research area, an emerging scientific discipline commonly referred to
as meta-analysis (DuMouchel and Harris, 1983; Cooper and Hedges, 1994).
More generally, the hierarchical structure allows for honest assessment of
heterogeneity both within and between groups, such as laboratories or census areas.
In the remainder of this chapter, we discuss the steps necessary to con-
duct a fully Bayesian data analysis, including prior specification, posterior
and predictive inference, hierarchical modeling, model selection, and diag-
nosis of departures from the assumed prior and likelihood form. Throughout
our discussion, we illustrate with examples from the realms of biomedical
science, public health, and environmental risk assessment as appropriate.

2.2 Prior distributions


Implementation of the Bayesian approach as indicated in the previous sub-
section depends on a willingness to assign probability distributions not only
to data variables like y, but also to parameters like θ. Such a requirement
may or may not be consistent with the usual long-run frequency notion of
probability. For example, if
θ = true probability of success for a new surgical procedure,
then it is possible (at least conceptually) to think of θ as the limiting value
of the observed success rate as the procedure is independently repeated
again and again. But if
θ = true proportion of U.S. men who are HIV-positive,
the long-term frequency notion does not apply; it is not possible to even
imagine “running the HIV epidemic over again” and reobserving θ. More-
over, the randomness in θ does not arise from any real-world mechanism; if
an accurate census of all men and their HIV status were available, θ could
be computed exactly. Rather, here θ is random only because it is unknown
to us, though we may have some feelings about it (say, that θ = .05 is
more likely than θ = .50). Bayesian analysis is predicated on such a belief
in subjective probability, wherein we quantify whatever feelings (however
vague) we may have about θ before we look at the data y in a distribution
π. This distribution is then updated by the data via Bayes’ Theorem (as
in (2.1) or (2.5)) with the resulting posterior distribution reflecting a blend
of the information in the data and the prior.
Historically, a major impediment to widespread use of the Bayesian
paradigm has been that determination of the appropriate form of the prior
π (and perhaps the hyperprior h) is often an arduous task. Typically, these
distributions are specified based on information accumulated from past
studies or from the opinions of subject-area experts. In order to streamline
the elicitation process, as well as simplify the subsequent computational
burden, experimenters often limit this choice somewhat by restricting π to
some familiar distributional family. An even simpler alternative, available
in some cases, is to endow the prior distribution with little informative
content, so that the data from the current study will be the dominant
force in determining the posterior distribution. We address each of these
approaches in turn.

2.2.1 Elicited priors


Suppose for the moment that θ is univariate. Perhaps the simplest ap-
proach to specifying π(θ) is first to limit consideration to a manageable
collection of θ values deemed “possible,” and subsequently to assign proba-
bility masses to these values in such a way that their sum is 1, their relative
contributions reflecting the experimenter’s prior beliefs as closely as possi-
ble. If θ is discrete-valued, such an approach may be quite natural, though
perhaps time-consuming. If θ is continuous, we must instead assign the
masses to intervals on the real line, rather than to single points, resulting
in a histogram prior for θ. Such a histogram (necessarily over a bounded
region) may seem inappropriate, especially in concert with a continuous
likelihood f (y|θ), but can perhaps be thought of as just another discrete
finite approximation to a continuous underlying “truth,” similar to those
employed by many numerical integration routines. Moreover, a histogram
prior may have as many bins as the patience of the elicitee and the accu-
racy of his prior opinion will allow. It is vitally important, however, that
the range of the histogram be sufficiently wide, since, as can be seen from
(2.1), the support of the posterior will necessarily be a subset of that of
the prior. That is, the data cannot possibly lend credence to intervals for
θ deemed “impossible” in the prior.
Alternatively, we might simply assume that the prior for θ belongs to
a parametric distributional family π(θ|η), choosing η so that the result
matches the elicitee’s true prior beliefs as nearly as possible. For exam-
ple, if η were two-dimensional, then specification of two moments (say, the
mean and the variance) or two quantiles (say, the 50th and 75th) would
be sufficient to determine its exact value. This approach limits the effort
required of the elicitee, and also overcomes the finite support problem in-
herent in the histogram approach. It may also lead to simplifications in the
posterior computation, as we shall see in Subsection 2.2.2.
A limitation of this approach is of course that it may not be possible for
the elicitee to “shoehorn” his or her prior beliefs into any of the standard
parametric forms. In addition, two distributions that look virtually identical
may in fact have quite different properties. For example, Berger (1985, p.
79) points out that the Cauchy(0,1) and Normal(0, 2.19) distributions have
identical 25th, 50th, and 75th percentiles (–1, 0, and 1, respectively) and
density functions that appear very similar when plotted, yet may lead to
quite different posterior distributions.
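This near-coincidence of quartiles is quickly verified in R (our check):
R code qcauchy(c(.25, .5, .75))                       # exactly -1, 0, 1
       qnorm(c(.25, .5, .75), mean=0, sd=sqrt(2.19))  # approximately -1, 0, 1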
Early statistical methods for eliciting priors on an unknown proportion
were given by Winkler (1967), and by Kadane et al. (1980) for normal linear
models with unknown variance parameters. More general reviews of the
area have been published by Kadane and Wolfson (1996, 1998), Chaloner
(1996), Garthwaite et al. (1995), and, most recently, in the fine book by
O’Hagan et al. (2006). These references provide overviews of the various
philosophies of elicitation, as well as reviews of the methods proposed for
various models and distributional settings.
In particular, O’Hagan et al. (2006) provide an extensive review of both
the statistical and psychological literature, the latter of which has much
to say about the ways humans think about and update probabilistic
statements. For instance, many previous psychological studies have shown
that humans tend to be overconfident in their probability assessments: in
situations where the “correct answer” can be ascertained after the fact
(e.g., weather forecasting), the 95% subjective confidence intervals tend
to include these answers far less than 95% of the time. This appears to
be the result of the inherent difficulty in considering the unlikely events
that lead to observations in the tails of a distribution. Overconfidence may
also result from elicitees failing to condition on events outside their own
personal range of experience. A doctor’s opinion about the likely success
of a new AIDS drug may reflect his own experience successfully treating
affluent gay men, but not include the likely outcomes when the drug is given
to inner-city intravenous drug-using women, an important but very different
component of the drug’s target population. As a remedy, O’Hagan et al.
(2006) recommend avoiding elicitation of extreme quantiles (say, the 95th),
and instead focus on quantiles closer to the middle of the distribution (e.g.,
the 50th, 25th, and 75th). A general strategy they recommend is to first
elicit the prior median of θ with a question like
Can you determine a value (your median) such that θ is equally likely to be
less than or greater than this point?
and then follow up with a question about the 25th percentile, say
Suppose you were told that θ is below your assessed median. Can you now
determine a new value (the 25th percentile) such that it is equally likely that
θ is less than or greater than this value?
Repeating this latter question with “below” replaced by “above” yields
an elicited value for the 75th percentile, which can then be used to assess
symmetry of the elicitee’s opinion, as well as overall consistency with the
first two answers.
An opportunity to experiment with both histogram and functional form-
matching prior elicitation in a simple setting is afforded below in Exercise 4.
We now illustrate multivariate elicitation in a challenging, real-life setting.

Example 2.5 In attempting to model the probability pijkl of an incorrect


response by person j in cue group i on a working memory exam question
having k “simple conditions” and “query complexity” level l, Carlin et al.
(1992) consider the model


               ⎧ θj^(i) + γk,           l = 0
logit(pijkl) = ⎨ θj^(i) + γk + α,       l = 1
               ⎩ θj^(i) + γk + α + β,   l = 2 .
In this model, θj^(i) denotes the effect for subject ij, γ denotes the effect due
to the number of simple conditions in the question, α measures the marginal
increase in the logit error rate in moving from complexity level 0 to 1, and
β does the same in moving from complexity level 1 to 2. Prior information
concerning the parameters was based primarily on rough impressions, pre-
cluding precise prior elicitation. As such, π({θj^(i)}, γ, α, β) was defined as a
product of independent, normally distributed components, namely,

∏_{i=1}^{I} ∏_{j=1}^{J} N(θj^(i) | μθ, σθ²) × N(γ | μγ, σγ²) × N2(α, β | μα, μβ, σα², σβ², ρ) ,   (2.8)

where N represents the univariate normal distribution and N2 represents
the bivariate normal distribution. Hyperparameter mean parameters were
chosen by eliciting “most likely” error rates, and subsequently converting
to the logit scale. Next, variances were determined by considering the effect
on the error rate scale of a postulated standard deviation (and resulting
95% confidence set) on the logit scale. Note that the symmetry imposed by
the normal priors on the logit scale does not carry over to the error rate
(probability) scale. While neither normality nor prior independence was
deemed totally realistic, prior (2.8) was able to provide a suitable starting
place, subsequently updated by the information in the data.
Even when scientifically relevant prior information is available, elicita-
tion of the precise forms for the prior distribution from the experimenters
can be a long and difficult process, with several simplifying assumptions
required to complete the task in reasonable time. For instance, in the pre-
vious example, the assumption of prior independence across parameters
was probably unrealistic, but was necessary to avoid elicitation of manifold
covariances as well as variances, say in a multivariate normal prior (see
Chapter 6 of O’Hagan et al., 2006, for more on this subject). In addition,
the way that questions are presented to the elicitee can have a significant
impact on the results. In the previous example, questions asked about 95%
confidence intervals probably led to overconfident priors; tail elicitation
should have focused on 25th and 75th (or perhaps even the 33rd and 66th)
percentiles.
Greenland (2007a) argues for “data priors,” which seek the data equiv-
alent of the level of information desired in a prior. Such an approach is
useful for elucidating parallels between frequentist and Bayesian inference,
and may also allow Bayesian computation to be carried out in standard,
frequentist software packages (because now both the prior and likelihood
information can be input as data records in the program). In the main,
however, prior elicitation issues tend to be application- and elicitee-specific,
meaning that general purpose algorithms are typically unavailable; see e.g.
Greenland (2006, 2007b) for illustration of his prior data approach in the
2 × 2 table and linear regression settings, respectively.
As with many other areas of Bayesian endeavor, the difficulty of prior
elicitation has been ameliorated somewhat through the addition of interac-
tive computing, especially dynamic graphics and object-oriented computer
languages such as R. As of the current writing, the best source of informa-
tion on up-to-date elicitation software may be www.shef.ac.uk/beep/, the
website of the BEEP (Bayesian Elicitation of Expert Probabilities) project.
BEEP is the grant-funded research effort associated with the O’Hagan et
al. (2006) book. Currently available are fairly high-level programs by Paul
Garthwaite and David Jenkinson to elicit beliefs about relationships assum-
ing a particular kind of linear model. Another, more low-level approach
called SHELF, developed by Tony O’Hagan and Jeremy Oakley, is aimed
at eliciting a single distribution in as rigorous and defensible a framework
as possible. This R software includes templates for elicitors to follow that
will both guide them through the process and also provide a record of
the elicitation. Software by John Paul Gosling called ROBEO is also avail-
able for implementing the nonparametric elicitation method of Oakley and
O’Hagan (2007) and its extensions. The group also plans extensions to mul-
tivariate settings and to allow for uncertainty in the elicited probabilities
or quantiles.
Example 2.6 In the arena of monitoring clinical trials, Chaloner et al.
(1993) show how to combine histogram elicitation, matching a functional
form, and interactive graphical methods. Following the advice of Kadane
et al. (1980), these authors elicit a prior not on θ (in this case, an unob-
servable proportional hazards regression parameter), but on corresponding
observable quantities familiar to their medically-oriented elicitees, namely
the proportion of individuals failing within two years in a population of
control patients, p0 , and the corresponding two-year failure proportion in
a population of treated patients, p1 . Writing the survivor function at time
t for the controls as S(t), under the proportional hazards model we then
have that p0 = 1 − S(2) and p1 = 1 − S(2)^exp(θ). Hence, the equation
log[− log(1 − p1 )] = θ + log[− log(1 − p0 )] (2.9)
gives the relationship between p0 , p1 , and θ.
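For illustration, (2.9) inverts easily; a hypothetical R helper (ours, not part
of the authors' software) converting an elicited (p0, p1) pair to θ would be:
R code theta.from.p <- function(p0, p1){   # invert equation (2.9)
         log(-log(1 - p1)) - log(-log(1 - p0))
       }
       theta.from.p(p0=.5, p1=.4)   # failure rate falling from .5 to .4 gives -0.31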
Since the effect of a treatment is typically thought of by clinicians as
being relative to the baseline (control) failure probability p0 , they first
elicit a best guess for this rate, p̂0 . Conditional on this modal value, they
then elicit an entire distribution for p1 , beginning with initial guesses for
the upper and lower quartiles of p1 ’s distribution. These values determine
initial guesses for the parameters μ and σ of a smooth density function that
corresponds on the θ scale to an extreme value distribution. This smooth
functional form is then displayed on the computer screen, and the elicitee
is allowed to experiment with new values for μ and σ in order to obtain
an even better fit to his or her true prior beliefs. Finally, fine tuning of the
density is allowed via mouse input directly onto the screen. The density is
restandardized after each such change, with updated quantiles computed
for the elicitee’s approval. At the conclusion of this process, the final prior
distribution is discretized onto a suitably fine grid of points (p11 , . . . , p1K ),
and finally converted to a histogram-type prior on θ via equation (2.9), for
use in computing its posterior distribution.

In this example, the interactive nature of the computer program allows


continual reassessment and checking of the elicited prior against the ex-
pert’s opinion, a good guard against bias due to overconfidence or other
sources. However, the inherent difficulty of the elicitation task makes us
lean toward its use only in situations where anticipated data sample sizes
are small, and the experts possess good reliable prior information (often in
the form of intimate knowledge of previous data) on the subject at hand.
The only context in which we shall view the use of elicited priors as es-
sential will be in the area of experimental design, where some idea of the
nature of the system being studied must be input in order to plan efficient
experiments and accurately predict their operating characteristics. We will
return to this subject in detail in Chapter 6.

2.2.2 Conjugate priors

In choosing a prior belonging to a specific distributional family π(θ|η), some


choices may be more computationally convenient than others. In particu-
lar, it may be possible to select a member of that family that is conjugate
with the likelihood f (y|θ), that is, one that leads to a posterior distribu-
tion p(θ|y) belonging to the same distributional family as the prior. The
computational advantages are best illustrated through an example.

Example 2.7 Suppose that X is the number of pregnant women arriving


at a particular hospital to deliver their babies during a given month. The
discrete count nature of the data plus its natural interpretation as an arrival
rate suggest adopting a Poisson likelihood,

f(x|θ) = e^(−θ) θ^x / x! ,   x ∈ {0, 1, 2, . . .}, θ > 0.
To effect a Bayesian analysis, we require a prior distribution for θ having
support on the positive real line. A reasonably flexible choice is provided
by the gamma distribution,

π(θ) = θ^(α−1) e^(−θ/β) / (Γ(α) β^α) ,   θ > 0, α > 0, β > 0,
or θ ∼ G(α, β) in distributional shorthand. Note that we have suppressed
π’s dependence on η = (α, β) since we assume it to be known. The gamma
distribution has mean αβ, variance αβ², and can have a shape that is
either one-tailed (α ≤ 1) or two-tailed (α > 1); for large α the distribution
resembles a normal distribution. The β parameter is a scale parameter,
stretching or shrinking the distribution relative to 0, but not changing its
shape. Using Bayes’ Theorem (2.1) to obtain the posterior density, we have
p(θ|x) ∝ f(x|θ)π(θ) ∝ (e^(−θ) θ^x)(θ^(α−1) e^(−θ/β))
       = θ^(x+α−1) e^(−θ(1+1/β)) .   (2.10)
Notice that since our intended result is a normalized function of θ, we are
able to drop any multiplicative functions that do not depend on θ. (For
example, in the first line we have dropped the marginal distribution m(x)
in the denominator, since it is free of θ.) But now looking at (2.10), we see
that it is proportional to a gamma distribution with parameters α′ = x + α
and β′ = (1 + 1/β)⁻¹. Note this is the only function proportional to (2.10)
that still integrates to 1. Because density functions uniquely determine dis-
tributions, we know that the posterior distribution for θ is indeed G(α′, β′),
and that the gamma is the conjugate family for the Poisson likelihood.
As a concrete illustration, suppose we observe x = 42 moms arriving at
our hospital to deliver babies during December 2007. Suppose we adopt
a G(5, 6) prior, which has mean 5(6) = 30 and variance 5(6²) = 180 (see
dotted line in Figure 2.5), reflecting the hospital’s totals for the past 24
months, which have been slightly less busy on average. The R code to set
up a grid for θ and compute the gamma prior and posterior is
R code alpha<-5
beta<-6
theta<-seq(0, 100, length.out=101)
prior <- dgamma(theta, shape=alpha, scale=beta)
x<-42
posterior <- dgamma(theta, shape=x+alpha, scale=1/(1+1/beta) )
using the intrinsic dgamma (density of a gamma) function. We may now
plot these two densities by typing:
R code plot(theta, posterior, xlab=expression(theta), ylab="density")
lines(theta, prior, lty=3)
As in Figure 2.3, we may also sample directly from the gamma posterior,
this time using the rgamma function:
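A minimal sketch of this sampling step (our completion, assuming the same
shape/scale parameterization used above):
R code theta.post <- rgamma(2000, shape=x+alpha, scale=1/(1+1/beta))
       c(mean(theta.post), sd(theta.post))  # near the exact G(47, 6/7) values 40.3 and 5.9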
Exploring the Variety of Random
Documents with Different Content
and the foot of the northern tyrant will be on our necks within the
next year. As the commander of the finest army in the south, I do not
believe you will disappoint them.
Hood. Let the prisoner be brought forth.
Brightly. Sentinel, the Commander-in-Chief would speak with
the prisoner at once. (Sentinel unlocks the door, and kicks Halcom to
wake him. He springs to his feet.)
Halcom. Well, what next? (Sentinel points to the door, and
Halcom passes out, &c.)
Hood. You are a native of Tennessee?
Halcom. Well?
Hood. What do you mean by well?
Halcom. Interpret to suit yourself.
Hood. It has been represented that you are a traitor to your native
state.
Halcom. Undoubtedly.
Hood. Do you deny it?
Halcom. Who is my accuser?
Brightly. I!
Halcom. An assassin and ravisher of defenceless women!
Brightly. Liar!
Halcom. A coward, who covers his tracks with the knife and torch!
Brightly. A traitor accuses me!
Halcom. A blatant ruffian, who fights only when no danger steps
in his way. (Brightly draws to attack him. Hood steps between.)
Hood. Enough of this.
Halcom. Leave him to his way.
Hood. You were captured yesterday—
Halcom. While insensible from wounds.
Hood. While fighting against your native state.
Halcom. To save her honor.
Hood. By virtue of treason.
Halcom. Who are you that speaks of treason?
Hood. A soldier who never forgets his obligations to the soil that
gave him heritage.
Halcom. Whose sword is dishonored with blighted virtue and
broken hearts, bartered for gold in the shambles of the auction yards.
Hood. Keep your foul tongue civil, or I may forget myself.
Halcom. It is honorable to be a traitor, when allegiance would
strangle liberty—outrage virtue—rob the poor of the right to their
miserable earnings, and trample on the most sacred affections of the
heart.
Hood. The defence of a hypocrite.
Halcom. Only cowards defend dishonor. (Brightly draws, and
attempts to rush on him. D’A. dashes between.)
D’A. The man is unarmed.
Brightly. Which leaves him no right to convey an insult.
Hood. Call a court-martial at once. The military law shall settle
this. (Brightly hurries out, R.) D’Arneaux, search his person for
arms. (D’A. makes a fruitless search. Enter Brightly with a drum
and camp-stool, followed by a rebel officer.) Col. Gilday, you will act
as judge advocate. (Gilday prepares for business.) Capt. Brightly,
take the stand. (Sworn.) State to the court what you know of this
man.
Brightly. The prisoner’s name is Francis Halcom. He is a native of
Creelsboro’, Tennessee, on the Cumberland river. I have known the
family since my childhood. With the exception of three years in
Massachusetts for education, Creelsboro’ has always been his home.
When Tennessee withdrew from the confederation, he immediately
went north, raised troops, and has since led them on to pillage and
murder in his native state. Yesterday, he was captured with arms in
his hands, fighting as becomes a traitor. (Steps aside.)
Hood. D’Arneaux, take the stand. (Sworn.) Tell the court what you
know of this case.
D’A. I am acquainted with all the facts related by Captain Brightly.
In addition, while the prisoner was absent in Massachusetts, his
family was assassinated, and home burned, on account of political
differences. When the war broke out, he was exiled for the same
reason.
Hood. You would defend this murderer?
D’A. Justice demands all the facts.
Hood. Which palliate nothing.
D’A. Had the assassin destroyed my family, and deprived me of my
civil rights in the name of the state, I too would have been a traitor!
Hood. Leave your sword at my headquarters, and consider
yourself under arrest. Step aside.
D’A. I wash my hands of this murder about to be consummated.
Hood. Go to your quarters, sir. I command here. (D’A. leaves
slowly. To Halcom.) You have heard the evidence against you—what
have you to say?
Halcom. Of what use is a defence in such a court as this?
Hood. The court will hear an excuse, even.
Halcom. The principal evidence is guilty of the murder of my
family.
Brightly. I demand that he shall be made to prove that.
Halcom. The closing of my life saves his.
Brightly. I demand an end of this cant.
Hood. I will hold him responsible for every word he speaks.
Halcom. Who speaks of responsibility? The history of today is yet
to be written. When it is, a page will be given to the infamy of the
leaders of this revolt. Two thousand years of the world’s best
civilization tramples with disdain on the barbarisms for which you
contend. Justice, Christianity and manhood alike repudiate the
dishonor your sword sustains. What is treason? (Pointing to B.) To
defend my country against such reptiles as that!
Brightly. Will the court listen to this croaking liar longer?
Hood. Leave him to his falsehoods. They but invite the bullet still
more.
Halcom. Most wise judge! How evenly are the scales of justice
balanced in your court! How commendable are the tales that suit the
judge! How villainously disgusting are the defensive presumptions of
the prisoner, that might so basely impugn the intentions of the court!
Hood. Who hatches crime, will defend a lie!
Halcom. Who subverts justice, is a traitor to God!
Brightly. Let the bullet settle this at once.
Hood. (To the court.) Gentlemen of the court, you have heard the
evidence. Is the prisoner guilty?
All. Aye, guilty!
Hood. Captain Brightly, return the prisoner to the cabin. He will
be allowed fifteen minutes to prepare. You will then call a squad of
men, and see to it that he is shot to the death.
Halcom. Gen. Hood, I request that I may die by the hand of a
brave and honorable man.
Hood. So I have decreed!
Halcom. His hands are tainted with the murder of defenceless
women.
Brightly. ’Tis false!
Halcom. So is he a coward! Twice I have thrown my knife at his
feet to defend himself against my empty hands, and he has refused!
Brightly. (To Hood.) Do you believe the falsehoods of a traitor?
Halcom. Then be it so now!
Hood. (To Brightly.) Well?
Brightly. I will not risk a life that may be of use to my country, in
a duel with a man who has been condemned to death for treason.
Hood. Well said, sir! Sentinel, remand the prisoner. (Exit Hood, R.
Sentinel points to the cabin. Halcom goes slowly, as if to enter.
Halts at door and turns.)
Halcom. Keele Brightly, the chances of war have favored you. I am
the last of my family. My mother’s ashes are still unavenged. I have
had faith in God. Justice may come at last from other hands than
mine. (Turns and enters the cabin, and falls on one knee. Sentinel
locks the door. Brightly leaves R. As he disappears, Sentinel
resumes his beat, and Zina shows around L. end of cabin, and taps
lightly to attract Halcom’s attention. He hastens to listen.)
Zina. (Peering between the logs.) It is I, Zina, come to save you.
There is a bar behind the door. Bar the door on the inside, and make
no noise. Then return quickly.
Halcom. God bless your brave little heart! (Bars the door, and
returns to listen.)
Zina. This cabin is close to the river. Your friends are on the other
side. The walls are too strong to be broken. I will climb to the roof,
tear off some boards, throw a rope over a limb, and drop it through
the opening. On this, ascend to the roof quickly. The river is too deep
to ford. A log is lodged on the shore in rear of the cabin. With the
rope, swing yourself astride this. Pull a rope fastened to the other
shore, and it will soon land you with your friends on the other side. If
you are fired upon from this side, throw yourself into the water and
cling to the log.
Halcom. But what chance of escape is there for yourself?
Zina. Don’t fear for me.
Halcom. I will not accept my life, even, at the slightest risk to your
own.
Zina. Do not hesitate. If you do, you are lost.
Halcom. Tell me, on honor, is there any danger for yourself?
(Enter Brightly, with squad of men, for execution, R.)
Zina. On my honor, I shall be safe. Watch for the rope. I join you at
your own camp. (Zina springs to rear of cabin, and ascends to roof,
while Brightly is saying)—
Brightly. Sentinel, bring out the prisoner. (Meantime Zina is
tugging to get off a board. Sentinel finds door fast.) Break down the
door; there is an attempt to escape! (Rebs rush at door, one with an
axe. Zina gets off first board at word “escape.” Heavy firing, long
roll, L.) Some to the roof! Smash the door! (Zina gets off second
board at word “door;” then fires at rebs climbing up sides, when
they retreat. Brightly to rebs retreating, sword drawn. Gets off
third board.) Back to the roof, cowards, or I will spit you like dogs.
Get a log and crush it! (Meantime, she fires again, drives them back,
and gets off fourth board.)
Soldier. (Entering L. in haste.) The Yanks are bridging the river.
Brightly. Fight them like hell! (Fourth board drops; rebs crash in
the door. Zina screams, flings rope into tree, and drops it through
hole. Meantime shots inside cabin, and rebs tumble out door.
Halcom climbs up a rope to roof. Rebs climb cabin to catch him on
roof. As H. arrives on roof, Zina pushes him off rear into the water,
and turns on the rebs.)
Zina. (Drawing knife.) Back, you cowards, or I kill you this time!
(Brightly dashes to R. rear. Curtain. Encore.)
(Curtain rises on last tableau, except Zina has seized the rope.
Suddenly she places her knife in her teeth, springs off rear, and
swings into the water. Brightly dashes off building to L.)
Soldiers on Roof. (Rising.) She is swimming the river! (Brightly
seizes a rifle from a soldier, dashes round L., and, during a flash of
lightning, fires at her. D’Arneaux dashing in L., knocks the rifle
aside, too late. Brightly springs to R.)
D’A. You have murdered that heroic girl! Take your knife, coward,
for, by heaven, one of us shall follow!
Brightly. (To soldiers.) Arrest that man for treason! (Soldiers
surround D’A. with a cordon of bayonets, when he drops his knife
and hangs his head.)
Brightly. I have waited for this! A court-martial and the bullet
shall end it! (Curtain.)
ACT IV.

Scene 1. Night. Heavy forest. Gen. Sherman disc. looking away to
R. Occasional flashes of lightning, and thunder in the distance.
Occasional picket firing, R. Staff, L.

Sherman. A terrible storm! The men must be wet and hungry.
Orderly! (Enter Ord. L. U. E.) Tell the commissary to hurry the hot
coffee and fresh food to the front at once. (Ex. Ord. L. U. E.) I must
cross the river before daylight, or my opportunity is lost. Martel!
(Enter Telegraph Operator, L. U. E.) Tell Schofield and Howard they
must force a passage of the river at four o’clock, at all hazards. (Op.
works machine and waits.) Do they understand?
Operator. They do. (Enter Halcom, R. U. E., coatless, hair
dishevelled, wounded.)
Sherman. (Rushing to grasp his hand.) In heaven’s name,
Halcom, from where do you come?
Halcom. The rebel camp.
Sherman. How did you escape? (Men offer clothing.)
Halcom. Ask God, and the angel sent to my relief. (Declining
clothes.) Thank you, gentlemen, I need nothing now but a coat.
Sherman. Ah! A woman at the bottom of it. (Halcom watches out
R.) I sent word to Hood that if any harm came to you, I would
retaliate on every rebel officer in my charge.
Halcom. Thank you, General. But your communication would,
doubtless, have come too late. But for my escape, I should have been
executed two hours ago.
Sherman. Your escapes are marvelous. By the way, I have orders
from Washington to advance you to the first vacancy among the
corps commanders.
Halcom. (Dropping his head.) I had not expected that.
Sherman. Why not? In this army, sir, the best man wins.
Halcom. I am a native and citizen of the south.
Sherman. There are no lines for loyalty in this country.
Halcom. I am indebted to you for this.
Sherman. You are indebted to your own right arm, sir.
Halcom. I have been but a simple soldier, no more entitled to
advancement than the private who takes the brunt of the fight in the
first line.
Sherman. Halcom, some men are born to command—to lead a
forlorn hope—
Halcom. Which I never have.
Sherman. Indeed! When at Lookout Mountain the storm of rebel
shot had melted the first line, and the reserves were already
wavering, and you seized and dared them to follow their flag, rallying
the broken ranks to that wild charge that swept the rebel army from
its entrenchments among the clouds, it was a glory beside which the
command of this army pales into insignificance!
Halcom. Then the soldier shares equally with his commander!
(Watches out R.)
Sherman. But you have not told me of this marvelous escape.
Halcom. Ask me of something I cannot comprehend, and you have
all I can give.
Sherman. It often acts like that.
Halcom. How?
Sherman. Simple as any other phase of life. A storm at night. A
handsome cavalier, unjustly condemned, awaiting execution. A
lovely maiden hovers near. She drugs the guard, and sets the
prisoner free. Bewildered by the ecstasy of love in such a moment of
excitement, both are lost in its wild delirium. They wake to an utter
incomprehensibility of all that has passed.
Halcom. General, I am content if such chaffing pleases you. But I
am weighted with an anxiety that will drive me mad. When I can
know the heroic girl is safe, who perhaps has sacrificed her life to
save mine, I can forget that I am a coward, and unfit to live! (Crosses
over to L.)
Sherman. Ah! I am getting interested in this case. Who is this
woman? What do you fear? Where is she? I can hardly imagine a
situation in this country or in either army, that can be dangerous to a
woman!
Halcom. No danger to a woman? They killed my mother when she
was helpless, and, with my sister, burned her in her own home.
Sherman. Such men are devils!
Halcom. And so am I! Can you trace the maniac through
Nashville, Chickamauga, and over Lookout Mountain, to the banks of
this river, and not guess at the origin of the hell that is so fast
consuming my life?
Sherman. Treat it calmly, Halcom. It is something that can never
be mended. Leave the past to take care of itself.
Halcom. There are fires that refuse to be quenched. No one has
struggled more manfully than myself to forget this. When I would
forget, memory conjures up the scene in the old home! My mother’s
helpless struggles with the devils who crushed her innocent life! Of
my sister burned alive! My God! How can I forget this?
Sherman. Tell me of your capture and escape.
Halcom. (Hesitating.) My division was overwhelmed by the whole
rebel army. In the desperate struggle, I was left wounded and
senseless on the field of battle. I was discovered by my old enemy
and conveyed to an old hut on the banks of the Chattahoochee. After
a parley with Hood and others, I was tried by a drum-head court-
martial for treason to my native state, and sentenced to die fifteen
minutes later. I was remanded to the hut to await the preparations
for my execution. I could see no chance for escape, for Brightly had
the details of my execution at his own command. The rifles were
already loading that were to send me to eternity. I had sunk on my
knees for the last prayer, when a tapping on the logs outside, in rear
of the hut, attracted my attention. I hastened to listen. It was too
dark to see. But through the crevices between the logs, I learned that
the little rebel owl who had escaped your bullet, because she was not
a man, had come to effect my escape.
Sherman. That child? Surely, I was only in jest.
Halcom. That heroic child had eluded your guard, swum the river
at midnight in the violence of that terrible thunder-storm, dragging a
log hitched to a rope that led to the friendly shore, that I might
escape.
Sherman. Impossible!
Halcom. I refused to save my life at the hazard of hers. She had
planned to escape with me. I heard the tramp of the soldiers detailed
to take my life. I heard her clambering to the roof of the hut; the
orders to drag me out to die; the sentinel try the barred door; the
crack of the breaking boards as she was making an opening for my
escape; the crash of the axe breaking the door; an order that sent the
devils to the roof to prevent my escape; the ring of her pistol as she
drove them back to the earth again. The door crashed in, and the
devils were upon me; a rope fell at my feet. With almost superhuman
strength, I flung them back and gained the roof. A crowd were
clambering up the sides to destroy us. I sprang forward to her
defence. In an instant, she pushed me clear of the hut, safely into the
river.
Sherman. Did you leave her!
Halcom. The next flash of lightning revealed her on the roof, with
her knife drawn, holding the traitors at bay, that I might escape. I
sprang back for the shore. I heard a splash in the water. The next
lightning flash revealed her battling the rapids of the river to gain the
other shore. A shot from the rebel side, and all was dark again. I
sprang after her. Two hours I have frantically searched this bank of
the river, without avail. She has perished in the rapids of the river, or
by that coward shot from the rebel rifle, and I live like a coward!
(Zina staggers in at R. U. E., as if unconscious of the presence of
any one; wounded in the left side of the head, often looking behind
to see if she is pursued. She staggers and is about to fall, when she is
discovered by Halcom, who springs forward, and catches her in his
arms. Sherman tears off his military cloak, and wraps it about her.)
Halcom. She has fainted.
Sherman. And is wounded. (They revive her.)
Zina. Please let me stay on this side of the river.
Sherman. Let you stay on this side of the river! I will shoot any
man who attempts to prevent it! You shall command this army if you
like. (Zina faints again.)
Halcom. The poor child is dying.
Sherman. Not a bit of it. She is too smart to die! Take her to my
quarters. Orderly, here! (Enter Ord. L. U. E.; with Halcom takes her
out, L. U. E.) Have my surgeon attend that girl, and tell him if he lets
her die, I will hang him an hour after. (Exit Ord. L.) I am the biggest
ass in the service. If I ever abuse a woman again, I hope I may be
shot by an idiot! (Exit L. Enter Barney and Hez. L. U. E.)
Barney. Now whin I would be arrestin’ a blackguard like that,
don’t you be a botherin’ me.
Hezekiah. Now you git out. I guess it was jest about as cheap for
him ter git away, as it would be for you to get a collapse in your real
estate. (Set guns against tree, sit down and wipe perspiration, &c.)
Barney. Now look in these two eyes of me. Didn’t ye be kickin’ that
blackguard whin I would be takin’ him?
Hezekiah. I rayther kalkerlate you was on the pint er passin’ in yer
chips when I lit on that critter.
Barney. Ah ha! I’m nobody, I s’pose. Was I?
Hezekiah. I guess that feller was the most astonished piece er
meat I ever traveled over. I kalkerlate that when I lit on the other
eend of his corperation, he come to the conklusion that he was
wrastlin’ with a first-class earthquake.
Barney. I don’t care about thim airthquakes. I want none er thim.
My reputashin is spit upon.
Hezekiah. I reckon I never jumped onter anything in that line er
critter that wanted ter go home so bad as he did.
Barney. Now look in me two eyes and be talkin’ honest about it,
and no braggin’. Didn’t ye be makin’ that blackguard get away when I
would arrest him?
Hezekiah. Now, Irish, you just spill your gas in some other line er
preachin’, er else I’ll let him get your guzzle next time. (Enter
Brightly and rebel soldiers, R. U. E., stealthily, seize the guns and
cover both.)
Barney. Now whin I arrest a blackguard again, don’t you be
botherin’ me.
Brightly. Throw up your hands! (Points gun at them.)
Barney. (Turning in surprise.) Stop that! That gun is loaded.
Hezekiah. (Throws off coat.) If I don’t make him drop that gun.
(Turns and meets gun—subsides.)
Brightly. Surrender, or I’ll kill you like a dog.
Hezekiah. Don’t care ef I dew.
Brightly. (Pointing R. U. E.) Step into line there. (Both comply.)
Hezekiah. Say? Got eny terbacker in yer trowsis?
Brightly. Shut your mouth and march now, or I will see what
virtue there is in this gun.
Hezekiah. (March off R. U. E.) Don’t care if I dew.
Scene 2. Gen. Hood’s headquarters. Gen. seated at table, rear
centre. D’Arneaux and two guards, L., facing R.

Hood. Lt. D’Arneaux, when you entered the military service, I
believed that you would soon wear the stars of a division
commander. Instead, you have presented us with the strange
anomaly of patriot and traitor. While to me you have presented a
soul of honor, you have sought every opportunity to strike your
country a cowardly blow in the dark!
D’A. And I deny the falsehood with my whole soul and life.
Hood. Under the circumstances, a denial is wholly unnecessary.
You have had a fair trial. No one regrets more than myself the
military necessity that compels me to sign the warrant for your
execution. Your brilliant military record is no excuse for disloyalty,
and a most flagrant treason.
D’A. As I expect to meet God before the next sunset, that
accusation is doubly false, though it comes from your own lips!
Hood. There are a score of witnesses who saw you attempt the life
of your superior officer. (D’A. hangs his head in silence.) If there had
never occurred another offence, the articles of war would meet you with the
bullet. (To guards.) Remove the prisoner to the care of the guard.
(Ex. D’A. and guard, L.) Orderly! (Enter rebel Orderly, L. U. E.)
Take this dispatch to Gen. McGruder. (Exit Ord. with dispatch.
Enter Keele Brightly, L., salutes.)
Brightly. I have the honor to report that I have captured two
Yankees, found lurking within our lines as spies.
Hood. Have them brought in. (Brightly salutes and retires, L.)
The camp is swarming with them! It is utterly useless to attempt to
prevent it without recourse to the most severe measures! This
careless indifference of the guards allows a constant betrayal of my
means of defence. (Enter Brightly, L., followed by Hez. and Barney,
under guard.) The guard will retire. (Exit guard, R. Brightly
observes R.)
Hezekiah. (Rushing up to shake hands with Hood.) How de dew,
Gineral? (Hood refuses to shake. Hez. astonished.) Don’t blame ye a
Hannah Cook! Never felt so mean about anything afore in my life.
You must think I’m putty darn small pertaters, to let myself get
roped in by a pair er runts like them. (Looks in Hood’s face a
moment.)
Hood. Well, sir, what have you to say for spying?
Hezekiah. Now you get out! Why I know you (Grabs Hood’s hand.)
jest as well as I do Abe Linkon. (Hood tries to disengage his hand.)
Why, you are that old covey that I met down there in the woods, that
wanted ter know where the old man lived. (Lets go his hand.) Don’t
blame ye for wantin’ ter give me the shake. Say? Got any terbacker in
yer trowsis?
Hood. No, sir!
Hezekiah. (Confidentially.) Say, I never felt so disgraceful about
anything afore in my life. ’Tween you and I, let me have a chance ter
distribit their meat in a fair scratch, and I’ll give ye forty dollars.
Hood. (To Brightly.) Who is this fellow?
Brightly. His name is Goferum.
Hood. Goferum! What a name!
Hezekiah. (Dashing to L., and throwing off coat.) Jess you say. I
want you to understand that forty dollars is scarcer than fools are in
this country. (Coat off, turns.)
Hood. (To Brightly.) Seize the fool! (Barney throws off coat, &c.)
Hezekiah. You bet! (As he dashes for Brightly, he meets a pistol,
and knocks it one side as it goes off. Clinches Brightly, throws him,
and proceeds to punch his ribs, and struggle around.)
Hood. (Meantime.) Guards, ho! (Barney dashes about for a
fight.)
Barney. (To Hood.) Don’t you say guard-house to me, you
grayback thafe er the wurruld!
Hood. Guards, ho! Guards, ho!
Barney. Come out er that! Come out, you thafe er the wurruld.
Come out, and I bat your dam head off you. Come out. (Dashes
forward, kicks table over, clinches Hood, throws him, and proceeds
to punch his ribs, as guards rush in R., and overpower them.)
Scene 3. Landscape and wood front. Enter Sally with pail, L.,
female attire.

Sally. (Looking about.) Now didn’t I wool that sargeant. I’ll bet he
hain’t got brains enough for a mule. It takes seven hundred er them
fellers to know as much as a Yankee. When he was stealin’ the
chickens at that deserted house, I told him it warn’t fair to steal my
chickens, when I was givin’ his men coffee. Gorry, won’t they sleep
some! Now Hez. he has learned ter steal chickens since he come
down here. You jest wait and see me break him er that when I get
him back to Pordunk! Now I should like to see a man of mine stealin’
chickens, or runnin’ after other wimen! Now wouldn’t there be the
handsomest fuss Pordunk ever looked at! (Looking about.) I guess
them fellers are snorin’ by this time. (Exit R., cautiously.)
Scene 4. Room covering whole stage. Door at R. centre. Large
box, R. U. E. Hezekiah and Barney disc. rear centre, chained to a ring
in the floor.

Hezekiah. I’ll bet ye tew dollars that feller come to the
conclewshun that he must er stole my gun from a whole regiment.
Barney. And the grayback thafe at the table, that twitted me about
the guard-house.
Hezekiah. Guess he thought he was goin’ through a fullin’ mill.
Barney. The blackguard! (Very sober.)
Hezekiah. ’Drather give fifty dollars than ter had yer hit the old
General.
Barney. How the divil should I know he was a general, without the
two brass things on ’im?
Hezekiah. All them fellers az has ritin’ tools and tables in their
tents, is generals.
Barney. Didn’t the sargeant tell me I was never to know one er
thim without the two brass things on him?
Hezekiah. It don’t make no difference, now ye bin gone and done
it.
Barney. Didn’t he begin it, twittin’ me about the guard-house, the
thafe!
Hezekiah. He was only callin’ the guard for help.
Barney. The blackguard! Whin he was as big as I! And he called
thim three spalpeens a coort, when it takes more than two dozen to
make one er thim any day. (Door opens R., rebel soldier enters and
reads from a paper.)
Soldier. The General commanding orders that the two union
prisoners, O’Flanagan and Goferum, convicted of spying in the
confederate camp, be notified that they are to be shot at daylight. Per
order General commanding. (Exit soldier, R. Barney and Hez. look
at each other a moment in silence.)
Barney. He will do that?
Hezekiah. That’s the kind of hairpin he is.
Barney. The blackguard!
Hezekiah. Wal, I guess I’ve airn’t the powder and shot. If my old
shooter hain’t tapped a hundred and fifty er them critters, you can
jest hope ter holler.
Barney. I will get some lawyer to appeal that coort.
Hezekiah. You get out!
Barney. That was no coort. The constitution of Ameriky says
nothing about a coort like that.
Hezekiah. It don’t make no difference. The shootin’ will come.
They don’t care for constitewshuns down here.
Barney. I’ll have that thafe tried for murder if he does that. And I’ll
tell him that to his face, too. I don’t care who any man is that will do
an illagal thing like that.
Hezekiah. They don’t stop for law down here.
Barney. The more the shame for ’em. He will have the contimpt er
the wurruld upon ’im.
Hezekiah. It wouldn’t do no good. They’ll bury you at daylight.
(Short silence.)
Barney. And there ain’t niver a praste to be had in this haythen
country at all.
Hezekiah. Ye don’t need none. If I hain’t licked rebels enough ter
get ter heaven without a priest, they can jest kick me out.
Barney. Havn’t I done that same meself?
Hezekiah. So ye have, Barney, and this ain’t yer own country,
neither. If they don’t give ye two harps to my one, it ain’t doin’ the
fair thing by ye.
Barney. Divil a bit do I care for a harp, if I can get out er this.
(Door opens, and Sally appears with two carbines in her hands;
hesitates a moment.)
Hezekiah. Now let me die.
Barney. ’Pon my word.
Hezekiah. Come here, and let me see if you ain’t a ghost. (Sally
lays carbines behind the box and rushes to embrace Hez.)
Barney. Give us a taste er that.
Hezekiah. You git out. There ain’t enough ter go round. (Sally
tries to unfasten irons.)
Barney. Oh don’t you spread yourself. I have one er thim. (Turns
away.)
Sally. (Hunting round for axe.) Hain’t ye got no axe, Hez.?
Hezekiah. ’Taint no use, Sal. Them irons can’t be broke.
Sally. You git out, Hez. You jest show me where they keep the axe.
Hezekiah. They don’t leave no axes round here. If ye had one, ye’d
get up such a noise, old Hood and the whole coop would be down
here whoopin’.
Sally. I got the whole caboodle asleep with opium.
Hezekiah. ’Taint no use, Sal. That Keele Brightly said we was
spies, and we’re goin’ ter get shot at daylight. (Sally speechless with
astonishment.)
Barney. The thafe. (Sally drops on her knees sobbing.)
Sally. Oh what shall I do?
Hezekiah. I know how yer heart is, Sal, but ye can’t do us no
good. Jest git out as fast as ye can, and save yourself.
Barney. And tell Gineral Halcom about it, and divil a bit but he
will bat that spalpeen in the mornin’.
Sally. (Springing to her feet and wiping eyes.) I have it. (Dashes
for the door.) I know what I’ll do.
Hezekiah. Say, Sal. (She turns back.) Perhaps I shan’t never see ye
again. (Sally falls on his breast sobbing.) Tell mother she ain’t got
nothin’ to be ashamed on about me, except I’m rough, and can’t talk
so fine as some folks. Now she is cheated out of her part er the farm,
and the old man is so mean. I don’t know what she will do. I’ve sent
her all my wages and bounty.
Sally. Keep yer upper lip solid, Hez.; cos if yer lost to yer mother,
she can have a home with me as long as she lives. Good bye. I got to
get ye out, and I ain’t no time to lose. (Dashes out at R. door.)
Barney. ’Pon my word, that gal will knock the hell’s blazes out er
thim spalpeens, or I’m a thafe and a liar.
Hezekiah. Ain’t she a rusher?
Barney. ’Pon me word she is. Yer a lucky boy to have a gal like
that.
Hezekiah. Makes me sick, cos it’s all goin’ for nothin’. (Makes a
bad face, as if to cry.)
Barney. Ah-r, don’t be doin’ that. Thim blackguards will be sayin’
yer a Yankee coward.
Hezekiah. The man that can’t grind out some grief at leavin’ a gal
like that, ain’t got brains enough to know what he’s losin’.
Barney. Indade! Isn’t Biddy Maloney as fine a gal as she, barrin’
the fitin’? (Door opens at R., and Keele Brightly enters, followed by
D’Arneaux and guard, one of whom proceeds to iron D’A. to the
same ring with Hez. and Barney.)
Brightly. (Looking about and at prisoners.) As incomprehensible
as ever. The guard drugged and disarmed, and the prisoners
unmolested. Corporal, place a guard of twenty men around this
building, and you have my orders to shoot any person, man or
woman, approaching it without authority. I have placed a barrel of
powder beneath, with a fuse attached, leading out under the door. If
the Yankees attack us before daybreak, fire the fuse, or kill the
prisoners, and join your regiment at once. (Guard leaves with
Corporal, R. Brightly lingers to see all is secure, then leaves R.)
Hezekiah. (To Barney.) Bet ye tew dollars this old machine is
about gin out. They’re killin’ their own.
Barney. (To Hez.) Is he a Gineral? (D’A. hangs head.)
Hezekiah. (To D’A.) Say! Yer couldn’t tell a feller who’s gittin’
licked outside, could ye? (D’A. gives them no attention.)
Barney. (To D’A.) You don’t be talkin’?
Hezekiah. (To D’A.) Talk is cheap, and I thought I’d give ye a
chance on what ye had the most on.
Barney. Shoot thim at daylight, sez he. (Makes a bad face as if
about to cry.)
Hezekiah. Don’t be blubberin’, Barney.
Barney. Don’t you see the daylight is comin’ through thim cracks
there?
Hezekiah. Let her come. It ain’t goin’ to last long. (A board lifts up
at L. and Zina crawls up through.)
D’A. Zina!
Hezekiah. Now let me die!
Barney. ’Pon my word! (Zina motions quiet.)
Zina. The guard! Master D’Arneaux, how are you here?
D’A. A victim of the falsehood of your master.
Zina. How?
D’A. Convicted of treason by false testimony, and sentenced to die
at sunrise.
Zina. Oh this is so cowardly and unjust to you, who have been so
brave and kind. Oh what shall I do?
D’A. You can do nothing, Zina.
Zina. I will go to the General and say it is not true.
D’A. You are but a poor slave girl. It would avail nothing. Zina,
through economy and speculations, I have become possessed of five
thousand dollars in gold. It is all buried beneath the roots of the old
cotton-wood that stands by the grave of our Nelly. No one but my
mother knows this. If, by the fortunes of war, I should fall, it would
keep my mother from want. If, when peace and independence come,
and I should live, it was to buy your freedom; for I had determined to
offer you my heart, hand, and the honor of a soldier.
Zina. Oh you would not throw yourself away on a poor slave! You
do not know what you say!
D’A. This has been the nurtured ambition of my heart, since, with
all your native goodness, I saw your generous devotion to my
helpless old mother.
Zina. How can you love a poor, degraded slave girl, who has
nothing to offer but these miserable rags, and the memory that she
came of the hated race, so despised by all the world. (Falls on her
knees, covers face.)
D’A. As God loves goodness in the human heart—as manhood
admires the noble, unselfish woman, though her covering be
undeserving rags—as the heart plays captive to the most generous
impulses of nature—as the honor of a soldier reaches out to grasp its
ideal, so do I offer my tribute of love. Zina, all these dreams of the
future die with me when the sun rises over the eastern hills. Go out
from here. Avoid the guard. Find the money, and fly with my mother,
where you can be free. Save my mother from want, and I am content.
Waste no time, or you too may be lost.
Zina. Oh I cannot be so cowardly as to leave you now! (Rising.)
D’A. Why did you come here, where there is nothing but danger?
Zina. (Pointing to Hez. and Barney.) To save these who have been
so good and kind to me. When my master had turned me away to
starve, these men gave me their own food and blankets when the
storm was cold and pitiless. (Shot R. Zina goes to R. door to listen.)
D’A. (To Hez. and Bar.) My hand, good fellows. One often sees
that to admire in an enemy. (Shake all, Hez. grudgingly. Zina looks
around the room and discovers the carbines, places them on the
box.)
Barney. When I was first lookin’ at ye, didn’t I be knowin’ ye was
no blackguard.
D’A. When the other world begins to lift its shadows to light us to
the other side, the animosities of this life should be forgotten.
Hezekiah. (To D’A.) Give me your hand again. I allus said I’d
never shake with a rebel, but I’ll take it all back.
D’A. Zina, before I die, there is a secret in your history the
excitement of the hour had well nigh caused me to forget. It came to
me by accident. You were not born a slave!
Zina. Then who am I?
D’A. A lost child of the Halcoms!
Zina. (Falling on her knees and covering her face.) My brave,
noble brother!
D’A. While confined, previous to my trial, I overheard a
conversation between Brightly and one of his ruffian comrades,
detailing your history and a plan for your destruction. The reason—
slavery is abrogated, and you are one of the Halcoms. Seventeen
years since, Brightly was the leader of a band of Regulators, raised to
protect the planters from the abolitionists, who were running off
their help. I was a member of that company, though a mere boy. An
old political grudge had existed between Brightly and your father for
many years. On a dark December night, backed by a crowd of
selected desperadoes, he murdered your father when he was without
means of defence, outraged and killed your mother,—then fired the
house.
Zina. (Shuddering.) My poor mother! (Sobbing.)
D’A. Some of those men are now standing guard around this
building. You were then a helpless infant in the cradle. Old Milly, the
nurse, escaped with you to the wood. Two days after, you were both
kidnapped by Brightly, taken to his plantation in Alabama, where he
raised you as a slave. At the time of the murder, your brother Frank,
at the age of 12 years, was educating in the free schools of New
England. During the last 15 years he has not ceased to search for the
murderer of his family. He has no knowledge that you have been
saved from the burning home. Within the last three years, Brightly
has repeatedly tried to sell you to cotton planters on the coast. Only
my vigilance and the color of your skin have prevented it. It was
Brightly’s hand that sent the bullet after your life, on the night of
your brother’s escape. If you are found here, your life is lost. Go now.
Day is breaking. God bless you. Remember my mother. (Distant
rapid firing.)
Zina. (Springing to her feet and listening.) Hark! My brother is
coming!
D’A. Escape while you can. Quick, or you will be lost!
Zina. (Flings off turban.) I will defend you until his sword shall
save us!
D’A. You cannot, you are a weak girl! (Zina bars the door and
slings carbine on belt.)
Zina. So I can fight and die with you! (Rebs. attack the door
furiously. Zina holds it.)
D’A. This building is mined and you will be blown to atoms. (Zina
holds the door.)
Zina. I have filled the powder with water!
D’A. You will be killed. Conceal yourself beneath the floor. (Rebs.
knock holes in middle of door with an axe.)
Hezekiah. Yes, go, Zina. God bless yer brave little heart.
Barney. Please go, little girl, ye can’t do us no good! (Heavy,
increasing firing R. Blows on the door rapid and continuous. She
holds it.)
D’A. You cannot defend us! (Zina seizes carbine and, springing
back, exclaims:)
Zina. I am a Halcom! This rifle shall avenge my mother’s life.
(Confederates smash the door until it breaks to pieces, and a crowd
of rebels rush through. Five rapid shots from Zina and they retreat
outside; three men fall. She drops the old carbine and seizes another
as Brightly urges them back. Five more shots throw them into a
crowding confusion at the door, when, her carbine unloaded, she stops
firing. Brightly and six soldiers rush to left front. Zina draws
knife to defend prisoners.)
Brightly. (As he and soldiers dash to L.) Kill the prisoners.
(Soldiers spring forward to bayonet them and are met by Zina.)
Zina. Who strikes the helpless is a coward! (Soldiers hesitate, with
bayonets at her breast.)
Brightly. You shall be food for my dogs!
Zina. Coward! Thief! Assassin of my mother!
Brightly. So you bite the hand that fed you to life!
Zina. My hands have earned your bread and mine!
Brightly. (To soldiers.) Kill her! (Halcom dashes in R. followed
by soldiers, who cover rebs.)
Halcom. Throw down your arms! (Rebels drop arms and Zina
rushes into her brother’s arms saying:)
Zina. My brother!
Halcom. I have long suspected this. My mother’s face lives in this
girl, and in my memory, seventeen years since, as she begged for
mercy from a man who never felt it.
Brightly. I am a prisoner of war.
Halcom. We have met, sir, for the last time. You shall fight women
and helpless prisoners no longer.
Brightly. Then have done with your preaching and come on!
(Drops sword and draws knife.)
Halcom. I will not keep you waiting long! You shall fight for your
life this time like an honorable man!
Brightly. (To reb. soldiers.) The psalm of a traitor who has
stabbed his country in the back!
Halcom. (To prisoners and Union soldiers.) If this man passes my
hands safely, he shall go free! (Taking advantage while Halcom is
speaking to the Union prisoners, Brightly rushes forward to stab
him treacherously in the back. Zina catches his purpose, drops on
one knee, knocks his hand up, and drives her knife to the hilt in the
ruffian’s heart. Brightly staggers back and falls. Zina springs up,
aghast at the result, then drops the knife and, covering her face,
says:—)
Zina. My poor mother! (Drops on her knees, then face, sobbing
until curtain falls.)

THE END.
TRANSCRIBER’S NOTES
1. The stage directions were inconsistently formatted. Some
were italicized and some not. Also some were in
parentheses and some in square brackets. (As if the
typesetter ran out of parentheses or italics
occasionally.) They were all altered to parentheses and
italics.
2. Silently corrected typographical errors and variations in
spelling.
3. Retained anachronistic, non-standard, and uncertain
spellings as printed.
*** END OF THE PROJECT GUTENBERG EBOOK ZINA: THE SLAVE
GIRL; OR, WHICH THE TRAITOR? ***
