A Gentle Introduction to Stata
4th Edition

ALAN C. ACOCK
Oregon State University

A Stata Press Publication


StataCorp LP
College Station, Texas
Copyright © 2006, 2008, 2010, 2012, 2014 by StataCorp LP
All rights reserved.
First edition 2006
Second edition 2008
Third edition 2010
Revised third edition 2012
Fourth edition 2014

Published by Stata Press, 4905 Lakeway Drive, College Station, Texas 77845
Typeset in LaTeX 2ε
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1

ISBN-10: 1-59718-142-0
ISBN-13: 978-1-59718-142-6

Library of Congress Control Number: 2014935652

No part of this book may be reproduced, stored in a retrieval system, or transcribed, in any
form or by any means—electronic, mechanical, photocopy, recording, or otherwise—without
the prior written permission of StataCorp LP.
Stata, Stata Press, Mata, and NetCourse are registered trademarks of StataCorp LP.
Stata and Stata Press are registered trademarks with the World Intellectual Property Organization of the United Nations.
LaTeX 2ε is a trademark of the American Mathematical Society.
Contents
List of figures xiii
List of tables xix
List of boxed tips xxi
Preface xxv
Support materials for the book xxix
1 Getting started 1
1.1 Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 The Stata screen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.4 Using an existing dataset . . . . . . . . . . . . . . . . . . . . . . . . 9
1.5 An example of a short Stata session . . . . . . . . . . . . . . . . . . 11
1.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2 Entering data 21
2.1 Creating a dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2 An example questionnaire . . . . . . . . . . . . . . . . . . . . . . . . 23
2.3 Developing a coding system . . . . . . . . . . . . . . . . . . . . . . . 24
2.4 Entering data using the Data Editor . . . . . . . . . . . . . . . . . . 29
2.4.1 Value labels . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.5 The Variables Manager . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.6 The Data Editor (Browse) view . . . . . . . . . . . . . . . . . . . . . 40
2.7 Saving your dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.8 Checking the data . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

2.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3 Preparing data for analysis 49
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.2 Planning your work . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.3 Creating value labels . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.4 Reverse-code variables . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.5 Creating and modifying variables . . . . . . . . . . . . . . . . . . . . 63
3.6 Creating scales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
3.7 Saving some of your data . . . . . . . . . . . . . . . . . . . . . . . . 71
3.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
3.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4 Working with commands, do-files, and results 75
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.2 How Stata commands are constructed . . . . . . . . . . . . . . . . . 76
4.3 Creating a do-file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
4.4 Copying your results to a word processor . . . . . . . . . . . . . . . . 86
4.5 Logging your command file . . . . . . . . . . . . . . . . . . . . . . . 87
4.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
4.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
5 Descriptive statistics and graphs for one variable 91
5.1 Descriptive statistics and graphs . . . . . . . . . . . . . . . . . . . . 91
5.2 Where is the center of a distribution? . . . . . . . . . . . . . . . . . . 92
5.3 How dispersed is the distribution? . . . . . . . . . . . . . . . . . . . 96
5.4 Statistics and graphs—unordered categories . . . . . . . . . . . . . . 98
5.5 Statistics and graphs—ordered categories and variables . . . . . . . . 107
5.6 Statistics and graphs—quantitative variables . . . . . . . . . . . . . 109
5.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
5.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
6 Statistics and graphs for two categorical variables 121
6.1 Relationship between categorical variables . . . . . . . . . . . . . . . 121

6.2 Cross-tabulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122


6.3 Chi-squared test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
6.3.1 Degrees of freedom . . . . . . . . . . . . . . . . . . . . . . . 127
6.3.2 Probability tables . . . . . . . . . . . . . . . . . . . . . . . . 127
6.4 Percentages and measures of association . . . . . . . . . . . . . . . . 130
6.5 Odds ratios when dependent variable has two categories . . . . . . . 133
6.6 Ordered categorical variables . . . . . . . . . . . . . . . . . . . . . . 135
6.7 Interactive tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
6.8 Tables—linking categorical and quantitative variables . . . . . . . . 140
6.9 Power analysis when using a chi-squared test of significance . . . . . 143
6.10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
6.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
7 Tests for one or two means 149
7.1 Introduction to tests for one or two means . . . . . . . . . . . . . . . 149
7.2 Randomization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
7.3 Random sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
7.4 Hypotheses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
7.5 One-sample test of a proportion . . . . . . . . . . . . . . . . . . . . . 155
7.6 Two-sample test of a proportion . . . . . . . . . . . . . . . . . . . . 157
7.7 One-sample test of means . . . . . . . . . . . . . . . . . . . . . . . . 162
7.8 Two-sample test of group means . . . . . . . . . . . . . . . . . . . . 164
7.8.1 Testing for unequal variances . . . . . . . . . . . . . . . . . 170
7.9 Repeated-measures t test . . . . . . . . . . . . . . . . . . . . . . . . 171
7.10 Power analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
7.11 Nonparametric alternatives . . . . . . . . . . . . . . . . . . . . . . . 183
7.11.1 Mann–Whitney two-sample rank-sum test . . . . . . . . . . 183
7.11.2 Nonparametric alternative: Median test . . . . . . . . . . . 184
7.12 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
7.13 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186

8 Bivariate correlation and regression 189


8.1 Introduction to bivariate correlation and regression . . . . . . . . . . 189
8.2 Scattergrams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
8.3 Plotting the regression line . . . . . . . . . . . . . . . . . . . . . . . . 195
8.4 An alternative to producing a scattergram, binscatter . . . . . . . . 196
8.5 Correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
8.6 Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
8.7 Spearman’s rho: Rank-order correlation for ordinal data . . . . . . . 211
8.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
8.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
9 Analysis of variance 215
9.1 The logic of one-way analysis of variance . . . . . . . . . . . . . . . . 215
9.2 ANOVA example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
9.3 ANOVA example using survey data . . . . . . . . . . . . . . . . . . . 225
9.4 A nonparametric alternative to ANOVA . . . . . . . . . . . . . . . . 228
9.5 Analysis of covariance . . . . . . . . . . . . . . . . . . . . . . . . . . 231
9.6 Two-way ANOVA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
9.7 Repeated-measures design . . . . . . . . . . . . . . . . . . . . . . . . 249
9.8 Intraclass correlation—measuring agreement . . . . . . . . . . . . . . 255
9.9 Power analysis with ANOVA . . . . . . . . . . . . . . . . . . . . . . 257
9.9.1 One-way ANOVA . . . . . . . . . . . . . . . . . . . . . . . . 257
Power analysis for two-way ANOVA . . . . . . . . . . . . . 260
9.9.2 Power analysis for repeated-measures ANOVA . . . . . . . . 262
9.9.3 Summary of power analysis for ANOVA . . . . . . . . . . . 264
9.10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
9.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
10 Multiple regression 267
10.1 Introduction to multiple regression . . . . . . . . . . . . . . . . . . . 267
10.2 What is multiple regression? . . . . . . . . . . . . . . . . . . . . . . . 268
10.3 The basic multiple regression command . . . . . . . . . . . . . . . . 269

10.4 Increment in R-squared: Semipartial correlations . . . . . . . . . . . 273


10.5 Is the dependent variable normally distributed? . . . . . . . . . . . . 275
10.6 Are the residuals normally distributed? . . . . . . . . . . . . . . . . . 278
10.7 Regression diagnostic statistics . . . . . . . . . . . . . . . . . . . . . 283
10.7.1 Outliers and influential cases . . . . . . . . . . . . . . . . . . 283
10.7.2 Influential observations: DFbeta . . . . . . . . . . . . . . . . 286
10.7.3 Combinations of variables may cause problems . . . . . . . . 287
10.8 Weighted data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
10.9 Categorical predictors and hierarchical regression . . . . . . . . . . . 291
10.10 A shortcut for working with a categorical variable . . . . . . . . . . . 299
10.11 Fundamentals of interaction . . . . . . . . . . . . . . . . . . . . . . . 301
10.12 Nonlinear relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
10.12.1 Fitting a quadratic model . . . . . . . . . . . . . . . . . . . 311
10.12.2 Centering when using a quadratic term . . . . . . . . . . . . 317
10.12.3 Do we need to add a quadratic component? . . . . . . . . . 319
10.13 Power analysis in multiple regression . . . . . . . . . . . . . . . . . . 321
10.14 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324
10.15 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
11 Logistic regression 329
11.1 Introduction to logistic regression . . . . . . . . . . . . . . . . . . . . 329
11.2 An example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
11.3 What is an odds ratio and a logit? . . . . . . . . . . . . . . . . . . . 334
11.3.1 The odds ratio . . . . . . . . . . . . . . . . . . . . . . . . . 336
11.3.2 The logit transformation . . . . . . . . . . . . . . . . . . . . 336
11.4 Data used in the rest of the chapter . . . . . . . . . . . . . . . . . . 337
11.5 Logistic regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
11.6 Hypothesis testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
11.6.1 Testing individual coefficients . . . . . . . . . . . . . . . . . 346
11.6.2 Testing sets of coefficients . . . . . . . . . . . . . . . . . . . 347
11.7 More on interpreting results from logistic regression . . . . . . . . . . 349

11.8 Nested logistic regressions . . . . . . . . . . . . . . . . . . . . . . . . 353


11.9 Power analysis when doing logistic regression . . . . . . . . . . . . . 355
11.10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
11.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
12 Measurement, reliability, and validity 361
12.1 Overview of reliability and validity . . . . . . . . . . . . . . . . . . . 361
12.2 Constructing a scale . . . . . . . . . . . . . . . . . . . . . . . . . . . 362
12.2.1 Generating a mean score for each person . . . . . . . . . . . 363
12.3 Reliability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364
12.3.1 Stability and test–retest reliability . . . . . . . . . . . . . . 367
12.3.2 Equivalence . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
12.3.3 Split-half and alpha reliability—internal consistency . . . . 368
12.3.4 Kuder–Richardson reliability for dichotomous items . . . . . 371
12.3.5 Rater agreement—kappa (κ) . . . . . . . . . . . . . . . . . . 372
12.4 Validity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375
12.4.1 Expert judgment . . . . . . . . . . . . . . . . . . . . . . . . 375
12.4.2 Criterion-related validity . . . . . . . . . . . . . . . . . . . . 376
12.4.3 Construct validity . . . . . . . . . . . . . . . . . . . . . . . . 377
12.5 Factor analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378
12.6 PCF analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
12.6.1 Orthogonal rotation: Varimax . . . . . . . . . . . . . . . . . 386
12.6.2 Oblique rotation: Promax . . . . . . . . . . . . . . . . . . . 388
12.7 But we wanted one scale, not four scales . . . . . . . . . . . . . . . . 389
12.7.1 Scoring our variable . . . . . . . . . . . . . . . . . . . . . . . 390
12.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391
12.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 392
13 Working with missing values—multiple imputation 393
13.1 The nature of the problem . . . . . . . . . . . . . . . . . . . . . . . . 393
13.2 Multiple imputation and its assumptions about the mechanism for
missingness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395

13.3 What variables do we include when doing imputations? . . . . . . . 397


13.4 Multiple imputation . . . . . . . . . . . . . . . . . . . . . . . . . . . 398
13.5 A detailed example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399
13.5.1 Preliminary analysis . . . . . . . . . . . . . . . . . . . . . . 400
13.5.2 Setup and multiple-imputation stage . . . . . . . . . . . . . 402
13.5.3 The analysis stage . . . . . . . . . . . . . . . . . . . . . . . 405
13.5.4 For those who want an R² and standardized βs . . . . . . . 406
13.5.5 When impossible values are imputed . . . . . . . . . . . . . 408
13.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 410
13.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 411
14 The sem and gsem commands 413
14.1 Ordinary least-squares regression models using sem . . . . . . . . . . 413
14.1.1 Using the SEM Builder to fit a basic regression model . . . 415
14.2 A quick way to draw a regression model and a fresh start . . . . . . 422
14.2.1 Using sem without the SEM Builder . . . . . . . . . . . . . 425
14.3 The gsem command for logistic regression . . . . . . . . . . . . . . . 425
14.3.1 Fitting the model using the logit command . . . . . . . . . 426
14.3.2 Fitting the model using the gsem command . . . . . . . . . 428
14.4 Path analysis and mediation . . . . . . . . . . . . . . . . . . . . . . . 434
14.5 Conclusions and what is next for the sem command . . . . . . . . . . 438
14.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 440
A What’s next? 443
A.1 Introduction to the appendix . . . . . . . . . . . . . . . . . . . . . . 443
A.2 Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443
A.2.1 Web resources . . . . . . . . . . . . . . . . . . . . . . . . . . 444
A.2.2 Books about Stata . . . . . . . . . . . . . . . . . . . . . . . 446
A.2.3 Short courses . . . . . . . . . . . . . . . . . . . . . . . . . . 449
A.2.4 Acquiring data . . . . . . . . . . . . . . . . . . . . . . . . . 449
A.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450

References 453
Author index 457
Subject index 459
Figures

1.1 Stata menu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2


1.2 Stata’s opening screen . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3 The toolbar in Stata for Windows . . . . . . . . . . . . . . . . . . . 9
1.4 The toolbar in Stata for Mac . . . . . . . . . . . . . . . . . . . . . . 9
1.5 Stata command to open cancer.dta . . . . . . . . . . . . . . . . . . 10
1.6 The summarize dialog box . . . . . . . . . . . . . . . . . . . . . . . 12
1.7 Histogram of age . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.8 The histogram dialog box . . . . . . . . . . . . . . . . . . . . . . . 14
1.9 The tabs on the histogram dialog box . . . . . . . . . . . . . . . . . 15
1.10 The Titles tab of the histogram dialog box . . . . . . . . . . . . . . 15
1.11 First attempt at an improved histogram . . . . . . . . . . . . . . . . 16
1.12 Final histogram of age . . . . . . . . . . . . . . . . . . . . . . . . . . 17

2.1 Example questionnaire . . . . . . . . . . . . . . . . . . . . . . . . . . 24


2.2 The Data Editor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.3 Data Editor (Edit) and Data Editor (Browse) icons on the toolbar . 30
2.4 Variable name and variable label . . . . . . . . . . . . . . . . . . . . 31
2.5 Data Editor with a complete dataset . . . . . . . . . . . . . . . . . . 33
2.6 The Variables Manager icon on the Stata toolbar . . . . . . . . . . . 34
2.7 Using the Variables Manager to add a label for gender . . . . . . . 35
2.8 Variables Manager with value labels added . . . . . . . . . . . . . . 38
2.9 Dataset shown in the Data Editor (Browse) mode . . . . . . . . . . 41
2.10 The describe dialog box . . . . . . . . . . . . . . . . . . . . . . . . 46

3.1 The Variables Manager . . . . . . . . . . . . . . . . . . . . . . . . . 56



3.2 The Variables Manager with value labels assigned . . . . . . . . . . 57


3.3 recode: specifying recode rules on the Main tab . . . . . . . . . . . 60
3.4 recode: specifying new variable names on the Options tab . . . . . . 60
3.5 The generate dialog box . . . . . . . . . . . . . . . . . . . . . . . . 66
3.6 Two-way tabulation dialog box . . . . . . . . . . . . . . . . . . . . . 67
3.7 The Main tab for the egen dialog box . . . . . . . . . . . . . . . . . 68
3.8 The by/if/in tab for the egen dialog box . . . . . . . . . . . . . . . . 70

4.1 The Do-file Editor icon on the Stata menu . . . . . . . . . . . . . . 81


4.2 The Do-file Editor of Stata for Windows . . . . . . . . . . . . . . . . 82
4.3 The Do-file Editor toolbar of Stata for Windows . . . . . . . . . . . 82
4.4 Highlighting in the Do-file Editor . . . . . . . . . . . . . . . . . . . . 83
4.5 Commands in the Do-file Editor window of Stata for Mac . . . . . . 85

5.1 How many children do families have? . . . . . . . . . . . . . . . . . 94


5.2 Distributions with same M = 1000 but SDs = 100 or 200 . . . . . . . 97
5.3 Dialog box for frequency tabulation . . . . . . . . . . . . . . . . . . 98
5.4 The Options tab for pie charts (by category) . . . . . . . . . . . . . 103
5.5 Pie charts of marital status in the United States . . . . . . . . . . . 103
5.6 The Graph Editor . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5.7 Using the histogram dialog box to make a bar chart . . . . . . . . . 106
5.8 Bar chart of marital status of U.S. adults . . . . . . . . . . . . . . . 106
5.9 Histogram of political views of U.S. adults . . . . . . . . . . . . . . . 109
5.10 Histogram of time spent on the World Wide Web . . . . . . . . . . . 112
5.11 Histogram of time spent on the World Wide Web (fewer than 25
hours a week, by gender) . . . . . . . . . . . . . . . . . . . . . . . . 113
5.12 The Main tab for the tabstat dialog box . . . . . . . . . . . . . . . 114
5.13 Box plot of time spent on the World Wide Web (fewer than 25
hours a week, by gender) . . . . . . . . . . . . . . . . . . . . . . . . 116

6.1 The Main tab for creating a cross-tabulation . . . . . . . . . . . . . 123


6.2 Results of search chitable . . . . . . . . . . . . . . . . . . . . . . 128

6.3 Entering data for a table . . . . . . . . . . . . . . . . . . . . . . . . 139


6.4 Summarizing a quantitative variable by categories of a categorical
variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
6.5 The Bar label properties dialog box . . . . . . . . . . . . . . . . . . 142
6.6 Bar graph summarizing a quantitative variable by categories of a
categorical variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142

7.1 Restrict observations to those who score 1 on wrkstat . . . . . . . . 163


7.2 Two-sample t test using groups dialog box . . . . . . . . . . . . . . . 166
7.3 Cohen’s d effects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
7.4 Power and sample-size control panel . . . . . . . . . . . . . . . . . . 178

8.1 Dialog box for a scattergram . . . . . . . . . . . . . . . . . . . . . . 191


8.2 Scattergram of son’s education on father’s education . . . . . . . . . 192
8.3 Scattergram of son’s education on father’s education with “jitter” . 193
8.4 Scattergram of son’s education on father’s education with a
regression line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
8.5 Scattergram relating hourly wage to job tenure . . . . . . . . . . . . 197
8.6 Average wage by tenure . . . . . . . . . . . . . . . . . . . . . . . . . 198
8.7 Relationship between wages and tenure with a discontinuity in the
relationship at 3 years . . . . . . . . . . . . . . . . . . . . . . . . . . 199
8.8 Relationship between wages and tenure with a discontinuity in the
relationship at 3 years; whites shown with solid lines and blacks
shown with dashed lines . . . . . . . . . . . . . . . . . . . . . . . . . 200
8.9 The Model tab of the regress dialog box . . . . . . . . . . . . . . . 207
8.10 Confidence band around regression prediction . . . . . . . . . . . . . 211

9.1 One-way analysis-of-variance dialog box . . . . . . . . . . . . . . . . 219


9.2 Bar graph of relationship between prestige and mobility . . . . . . . 227
9.3 Bar graph of support for stem cell research by political party
identification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
9.4 Box plot of support for stem cell research by political party
identification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
9.5 The Specification 1 dialog box under margins . . . . . . . . . . . . . 239

9.6 Hours of TV watching by whether the person works full time . . . . 247
9.7 Hours of TV watching by whether the person is married . . . . . . . 248
9.8 Hours of TV watching by whether the person is married and
whether the person works full time . . . . . . . . . . . . . . . . . . . 249
9.9 Effect size for power of 0.80, alpha of 0.05 for N's from 40 to 500 . . 259
9.10 Effect size for power of 0.80 with two rows in each of the three
columns for N's from 100 to 300 . . . . . . . . . . . . . . . . . . . . 261
9.11 Effect size for power of 0.80, alpha of 0.05, four repeated
measurements, and a 0.60 correlation between measurements for
N's from 100 to 300 . . . . . . . . . . . . . . . . . . . . . . . . . . . 263

10.1 The Model tab for multiple regression . . . . . . . . . . . . . . . . . 269


10.2 The Main tab of the pcorr dialog box . . . . . . . . . . . . . . . . . 274
10.3 Histogram of dependent variable, env_con . . . . . . . . . . . . . . . 275
10.4 Hanging rootogram of dependent variable, env_con . . . . . . . . . . 276
10.5 Heteroskedasticity of residuals . . . . . . . . . . . . . . . . . . . . . 280
10.6 Residual-versus-fitted plot . . . . . . . . . . . . . . . . . . . . . . . . 281
10.7 Actual value of environmental concern regressed on the predicted
value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282
10.8 Collinearity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
10.9 Education and gender predicting income, no interaction . . . . . . . 303
10.10 Education and gender predicting income, with interaction term . . . 307
10.11 Five quadratic curves . . . . . . . . . . . . . . . . . . . . . . . . . . 310
10.12 Graph of quadratic model . . . . . . . . . . . . . . . . . . . . . . . . 312
10.13 binscatter representation of nonlinear relationship between the
log of wages and total years of experience . . . . . . . . . . . . . . . 313
10.14 Quadratic model of relationship between total experience and log
of income . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
10.15 Quadratic model relating log of income to total experience where
experience is centered . . . . . . . . . . . . . . . . . . . . . . . . . . 318
10.16 Comparison of linear and quadratic models . . . . . . . . . . . . . . 319

11.1 Positive feedback and divorce . . . . . . . . . . . . . . . . . . . . . . 331


11.2 Predicted probability of positive feedback and divorce . . . . . . . . 332

11.3 Predicted probability of positive feedback and logit of divorce . . . . 333


11.4 Positive feedback and divorce using OLS regression . . . . . . . . . . 334
11.5 Dialog box for doing logistic regression . . . . . . . . . . . . . . . . . 339
11.6 Risk factors associated with teen drinking . . . . . . . . . . . . . . . 344
11.7 Estimated probability that an adolescent drank in last month
adjusted for age, race, and frequency of family meals . . . . . . . . . 353

12.1 Scree plot: National priorities . . . . . . . . . . . . . . . . . . . . . . 386

14.1 SEM Builder on a Mac . . . . . . . . . . . . . . . . . . . . . . . . . . 415


14.2 Initial SEM diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . 416
14.3 Adding variable names and correlations of independent variables . . 417
14.4 Result without any reformatting . . . . . . . . . . . . . . . . . . . . 419
14.5 Intermediate results . . . . . . . . . . . . . . . . . . . . . . . . . . . 420
14.6 The SEM Text dialog box in Stata for Mac . . . . . . . . . . . . . . 421
14.7 Final result . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422
14.8 Regression component dialog box . . . . . . . . . . . . . . . . . . . . 423
14.9 Quick drawing of regression model . . . . . . . . . . . . . . . . . . . 424
14.10 Maximum likelihood estimation of model using listwise deletion . . . 424
14.11 A logistic regression model with the outcome, obese, clicked to
highlight it . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 430
14.12 Initial results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 431
14.13 Dialog box for changing information in a textbox . . . . . . . . . . . 433
14.14 Final results for logistic regression . . . . . . . . . . . . . . . . . . . 433
14.15 BMI predicted without using the quickfood variable . . . . . . . . . 434
14.16 A path model with the quickfood variable mediating part of the
effect of educ and incomeln on bmi . . . . . . . . . . . . . . . . . . 435
14.17 Direct effects without the mediator . . . . . . . . . . . . . . . . . . . 436
14.18 Final mediation model . . . . . . . . . . . . . . . . . . . . . . . . . . 437
14.19 More complex path model . . . . . . . . . . . . . . . . . . . . . . . . 440

A.1 Growth of downloads of files from Statistical Software Components
(source: http://logec.repec.org/scripts/seriesstat.pf?item=repec:boc:bocode) . . . . . . . . . . . . . . . . . . . . . . . . . 445
A.2 A path model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
Tables

2.1 Example codebook . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26


2.2 Example coding sheet . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.3 New variable names and labels . . . . . . . . . . . . . . . . . . . . . 31

3.1 Sample project task outline . . . . . . . . . . . . . . . . . . . . . . . 51


3.2 NLSY97 sample codebook entries . . . . . . . . . . . . . . . . . . . . 53
3.3 Reverse-coding plan . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.4 Arithmetic symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

4.1 Relational operators used by Stata . . . . . . . . . . . . . . . . . . . 78

5.1 Level of measurement and choice of average . . . . . . . . . . . . . . 94

9.1 Hypothetical data—wide view . . . . . . . . . . . . . . . . . . . . . 217


9.2 Hypothetical data—long view . . . . . . . . . . . . . . . . . . . . . . 218

10.1 Regression equation and Stata output . . . . . . . . . . . . . . . . . 271


10.2 Effect size of f² and R² . . . . . . . . . . . . . . . . . . . . . . . . . 322

12.1 Four kinds of reliability and the appropriate statistical measure . . . 365
12.2 Correlations you might expect for one factor . . . . . . . . . . . . . 378
12.3 Correlations you might expect for two factors . . . . . . . . . . . . . 379

14.1 Selected families available with gsem . . . . . . . . . . . . . . . . . . 429


14.2 Direct and indirect effects of mother’s education and family income
on her BMI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 438
Boxed tips

Why do we show the dot prompt with these commands?
Setting how much output is in the Results window
Work along with the book
Searching for help
Internet access to datasets
Clearing the Results window: The cls command
When to use Submit and when to use OK

Variables and items
I typed the letter l for the number 1
Saving data and different versions of Stata
Scrolling the results
Working with Excel files

What is a Stata dictionary file?
Stata and capitalization
More on recoding rules
Beyond egen
Deciding among different ways to do something

What is a command? What is a do-file?
Stata do-files for this book
Saving tabular output

Tabulating a series of variables and including missing values
Obtaining both numbers and value labels

Independent and dependent variables
Reporting chi-squared results
Why can φ be negative?

Random sample and randomization
Distinguishing between two p-values
Proportions and percentages
Degrees of freedom
Effect size

How can you get the same result each time?
Predictors and outcomes
Statistical and substantive significance
Multiple-comparison procedures with correlations

Can Stata give me an F table?
What are categorical covariates and what are continuous covariates?
Estimating the effect size and omega-squared, ω²
Estimating the effect size and omega-squared, ω², continued

Names for categorical variables
More on testing a set of parameter estimates
Tabular presentation of hierarchical regression models
Centering quantitative predictors before computing interaction terms
Do not compare correlations across populations

Predicting a count variable
Using Stata as a calculator
Odds ratio versus relative-risk ratio

Requiring a 75% completion rate
A problem generating a total scale score
Alpha, average correlation, number of items

What is a strong kappa?
What’s in a name?
Preface
This book was written with a particular reader in mind. This reader is learning social
statistics and needs to learn Stata but has no prior experience with other statistical
software packages. When I learned Stata, I found there were no books written explicitly
for this type of reader. There are certainly excellent books on Stata, but they assume
extensive prior experience with other packages, such as SAS or IBM SPSS Statistics; they
also assume a fairly advanced working knowledge of statistics. These books moved
quickly to advanced topics and left my intended reader in the dust. Readers who have
more background in statistical software and statistics will be able to read chapters
quickly and even skip sections. The goal is to move the true beginner to a level of
competence using Stata.

With this target reader in mind, I make far more use of the menus and dialog boxes
in Stata’s interface than do any other books about Stata. Advanced users may not
see the value in using the interface, and the more people learn about Stata, the less
they will rely on the interface. Also, even when you are using the interface, it is still
important to save a record of the sequence of commands you run. Although I rely on
the commands much more than the dialog boxes in the interface in my own work, I still
find value in the interface. The dialog boxes in the interface include many options that
I might not have known or might have forgotten.
To illustrate the interface as well as graphics, I have included more than 100 figures,
many of which show dialog boxes. I present many tables and extensive Stata “results”
as they appear on the screen. I interpret these results substantively in the belief that
beginning Stata users need to learn more than just how to produce the results—users
also need to be able to interpret them.

I have tried to use real data. There are a few examples where it is much easier to
illustrate a point with hypothetical data, but for the most part, I use data that are in
the public domain. For example, I use the General Social Surveys for 2002 and 2006
in many chapters, as well as the National Longitudinal Survey of Youth, 1997. I have simplified the
files by dropping many of the variables in the original datasets, but I have kept all the
observations. I have tried to use examples from several social-science fields, and I have
included a few extra variables in several datasets so that instructors, as well as readers,
can make additional examples and exercises that are tailored to their disciplines. People
who are used to working with statistics books that have contrived data with just a few
observations, presumably so work can be done by hand, may be surprised to see more
than 1,000 observations in this book’s datasets. Working with these files provides better
experience for other real-world data analysis. If you have your own data and the dataset
has a variety of variables, you may want to use your data instead of the data provided
with this book.
The exercises use the same datasets as the rest of the book. Several of the exercises
require some data management prior to fitting a model because I believe that learning
data management requires practice and cannot be isolated in a single chapter or single
set of exercises.
This book takes the student through much of what is done in introductory and
intermediate statistics courses. It covers descriptive statistics, charts, graphs, tests of
significance for simple tables, tests for one and two variables, correlation and regression,
analysis of variance, multiple regression, logistic regression, reliability, factor analysis,
and path analysis. There are chapters on constructing scales to measure variables and
on using multiple imputation for working with missing values.
By combining this coverage with an introduction to creating and managing a dataset,
the book will prepare students to go even further on their own or with additional
resources. More advanced statistical analysis using Stata is often even simpler from a
programming point of view than what we will cover here. If an intermediate course
goes beyond what we do with logistic regression to multinomial logistic regression, for
example, the programming is simple enough. The logit command can simply be
replaced with the mlogit command. The added complexity of these advanced statistics is
the statistics themselves and not the Stata commands that implement them. Therefore,
although more advanced statistics are not included in this book, the reader who learns
these statistics will be more than able to learn the corresponding Stata commands from
the Stata documentation and help system.
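As a rough sketch of how small that change is, the two models below are fit with nearly identical commands. The datasets lbw and sysdsn1 are example files from the Stata manuals, not datasets used in this book, and webuse needs an Internet connection; the choice of predictors is only for illustration.

```stata
* Sketch only: lbw and sysdsn1 are manual example datasets (assumed reachable)
webuse lbw, clear                 // binary outcome: low birthweight (0/1)
logit low age lwt smoke           // logistic regression

webuse sysdsn1, clear             // outcome insure has three unordered categories
mlogit insure age male nonwhite   // same layout; only the command name changes
```

The outcome comes first and the predictors follow in both commands, which is why moving from one model to the other is mostly a matter of understanding the statistics.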
I would like to point out the use of punctuation after quotes in this book. While the
standard U.S. style of punctuation calls for periods and commas at the end of a quote
to always be enclosed within the quotation marks, Stata Press follows a style typically
used in mathematics books and British literature. In this style, any punctuation mark
at the end of a quote is included within the quotation marks only if it is part of the
quote. For instance, the pleased Stata user said she thought that Stata was a “very
powerful program”. Another user simply said, “I love Stata.”
I assume that the reader is running Stata 13, or a later version, on a Windows-based
PC. Stata works equally well on Mac and on Unix systems. Readers who are running
Stata on one of those systems will have to make a few minor adjustments to some of
the examples in this book. I will note some Mac-specific differences when they are
important. In preparing this book, I have used both a Windows-based PC and a Mac.

Corvallis, OR Alan C. Acock


March 2014
Acknowledgments
I acknowledge the support of the Stata staff who have worked with me on this project.
Special thanks go to Lisa Gilmore, the Stata Press production manager, and Deirdre
Skaggs, the Stata Press technical editor. I also thank my students who have tested my
ideas for the book. They are too numerous to mention, but Shauna Tominey deserves
special recognition for going through the entire draft of the second edition to find errors.
Stata has many outstanding technical support people. I was lucky to have Kristin
MacDonald assigned as technical support for the fourth edition. After I
made an initial draft of changes and additions to this edition of the book, Kristin found
several errors. She helped make sure these were fixed. The remaining errors are my
responsibility. This edition is a vastly better book in terms of the statistical analysis,
efficient use of Stata coding, and ease of reading because of Kristin’s work. You, the
reader, will benefit from her amazingly helpful technical support.
My education benefited from the knowledge of many, but I would like to acknowledge
two of my former professors who were especially important. Henry (Bud) Kass, one
of my undergraduate professors, was a role model for how to work with students; he
encouraged me to pursue my graduate education. Louis Gray, one of my graduate school
professors, taught me the power of quantitative analysis and shared his enthusiasm for
research.
Finally, I thank my wife, Toni Acock, for her support and for her tolerance of my
endless excuses for why I could not do things. She had to pick up many tasks I should
have done, and she usually smiled when told it was because I had to finish this book.
Support materials for the book
All the datasets and do-files for this book are freely available for you to download. In
the Command window, type

. net from http://www.stata-press.com/data/agis4/
. net describe agis4
. net get agis4

Notice that each of these commands is preceded by a period (.) and a space. This is a
convention used by Stata. When you enter the command, you just type the command
without the . and space that precede it in the instructions.
Stata comes in several varieties. Small Stata is limited to analyzing datasets with a
maximum of 99 variables and 1,200 observations. If you are using Small Stata, you will
be able to do everything in this book, but you will need to download a different set of
datasets that meet these restrictions. In the Command window, type

. net from http://www.stata-press.com/data/agis4/
. net describe agis4_small
. net get agis4_small

Stata will place the datasets in a directory where you can access them. On a Windows
machine, this is probably C:\Users\userid\Documents. You may want to create a new
directory and copy the materials there. If you have several projects, it may be useful
to have a separate folder for each project. For simplicity, throughout this book, we will
use C:\data as the data directory.
To open one of the datasets that you downloaded with the commands above, for
example, relate.dta, type use relate in the Command window. If you are using
Small Stata, _small is appended to the dataset name (relate_small.dta), so you
would type use relate_small. Those readers using Small Stata will need to append
_small whenever I mention a dataset in this book.
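A minimal sketch of checking that a downloaded dataset opens correctly follows. It assumes the book's datasets have already been downloaded into the current working directory; the clear option and the two describing commands are additions for illustration.

```stata
* Sketch only: assumes relate.dta is in the current working directory
dir *.dta               // list the Stata datasets in this directory
use relate, clear       // clear discards any data already in memory
describe, short         // report the number of observations and variables
```

Small Stata users would type use relate_small, clear instead.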
If your computer is connected to the Internet, you can also load the dataset by
specifying the complete URL of the dataset. For example,

. use http://www.stata-press.com/data/agis4/firstsurvey

This text complements the material in the Stata manuals but does not replace it.
For example, chapters 5 and 6, respectively, show how to generate graphs and tables,
but these are only a few of the possibilities described in the Stata Reference manuals.
All reference material is available in PDF format. In the Stata menu, click on Help ⊲ PDF
Documentation. One of the best aspects of the Stata documentation is that it provides
several real-data examples for most commands. An entry will start with a fairly simple
example and then give examples that are more complex. Looking at the examples is
how I have learned much of what I know about Stata. You will find that the capabilities
for many of the commands I discuss far exceed what I was able to cover here.
If you remember the name of a command, you can type help command name in
the Command window. For example, typing help summarize would display a Viewer
window with brief information and examples of how to run the command. If you do not
know the exact name of the command, you could just enter the first part. For example,
typing help sum opens a window with two options, one of which is summarize. If you
enter the wrong name for a command, say, you type help summary, Stata opens a
Viewer window with a list of files where the word “summary” was listed as a keyword.
You scroll through the list and find the summarize command. If you click on summarize,
the help file for the summarize command opens in the Viewer window.
The help file does not give you all the detailed explanation and examples that you
get from the PDF documentation, but it is often all you need. You can open the PDF
document for a specific command by clicking on the command name in the Title section
or in the Also See menu of the help file.
My hope in writing this book is to give you sufficient background so that you can
use the manuals effectively.
1 Getting started

1.1 Conventions
1.2 Introduction
1.3 The Stata screen
1.4 Using an existing dataset
1.5 An example of a short Stata session
1.6 Summary
1.7 Exercises

1.1 Conventions
Listed below are the conventions that are used throughout the book. I thought it might
be convenient to list them all in one place should you want to refer to them quickly.

Typewriter font. I use this font when something would be meaningful to Stata as
input. I also use it to indicate Stata output.
I use a typewriter font to indicate the text to type in the Command window.
Because Stata commands do not have any special characters at the end, any
punctuation mark at the end of a command in this book is not part of the
command. Sometimes, to be consistent with Stata manuals, I will put a command on
a line by itself with the dot preceding it, as in

. sysuse cancer, clear

All of Stata’s dialog boxes generate commands, which will be displayed in the
Review window and in the Results window. In the Results window, each
command will be preceded by the dot prompt. If you make a point of looking at the
command Stata prints each time you use the dialog boxes, you will quickly learn
the commands. I may include the equivalent command in the text after explaining
how to navigate to it through the dialog boxes.


Why do we show the dot prompt with these commands?

When we show a listing of Stata commands, we place a dot and a space in front of
each command. When you enter these commands in the Command window, you
enter the command itself and not the dot prompt or space. We include these because
Stata always shows commands this way in the Results window. Stata manuals and
many other books about Stata follow this convention.

When you type a Stata command in the Command window, you execute the
command when you press the Enter key. The command may wrap onto more than
one line, but if you press the Enter key in the middle of entering a command,
Stata will interpret that as the end of the command and will probably generate
an error. The rule is that you should just keep typing when entering a command
in the Command window, no matter how long the command is. Press Enter only
when you want to execute the command.
I also use the typewriter font for variable names, for names of datasets, and to
show Stata’s output. In general, I use the typewriter font whenever the text is
something that can be typed into Stata or when the text is something that Stata
might print as output. This approach may seem cumbersome now, but you will
catch on quickly.
Folder names, filenames, and filename extensions, as in “The survey.dta file is in
the C:\data directory (or folder)”, are also denoted in the typewriter font. Stata
assumes that .dta will be the extension, so you can use just the filename without
an extension, if you prefer.

Sans serif font. I use this font to indicate menu items (in conjunction with the ⊲ symbol),
button names, dialog-box tab names, and particular keys:

• Menu items, such as “Select Data ⊲ Data utilities ⊲ Rename groups of variables
from the Stata menu” (see figure 1.1).

Figure 1.1. Stata menu


• Buttons that can be clicked on, as in “Remember, if you are working on a
dialog box, it will now be up to you to click on OK or Submit, whichever you
prefer.”
• Keys on your keyboard, as in “The Page Up and Page Down keys will move
you backward and forward through the commands in the Review window.”
Some functions require the use of the Shift, Ctrl, or Alt key, which will be
held down while the second key is pressed. For example, Alt+f will open the
File menu.

Slant font. I use this font for dialog-box titles and when I talk about labeled elements
of a dialog box, with both items capitalized as they are on the dialog box.

Italics font. I use this font when I refer to a word that is to be replaced.

Quotes. I use double quotes when I am talking about labels in a general way, but I
will use the typewriter font to indicate a specific label in a dataset. For example,
if we decided to label the variable age “Age at first birth”, we would enter Age
at first birth in the textbox.

Capitalization. Stata is case sensitive, so summarize is a Stata command, whereas
Summarize is not and will generate an error if you use it. Stata also recognizes
capitalization in variable names, so agegroup, Agegroup, and AgeGroup will be
three different variables. Although you can certainly use capital letters in variable
names, you will probably find yourself making more typographical errors if you
do. I have found that using all lowercase letters when creating variable names is
usually the best practice.
I will capitalize the names of the various Stata windows, but I do not set them off
by using a different font. For example, we will type commands in the Command
window and look at the output in the Results window.
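The case-sensitivity rule can be tried with a short sketch. It uses auto.dta, an example dataset installed with Stata rather than one of this book's datasets, and the variable Price is a hypothetical name created only for the demonstration.

```stata
* Sketch only: auto.dta ships with Stata; Price is a made-up variable name
sysuse auto, clear
summarize price                   // lowercase command name: runs normally
capture noisily Summarize price   // capitalized: Stata shows an error message
display _rc                       // a nonzero return code confirms the failure
generate Price = price / 1000     // Price and price are now two different variables
describe price Price
```

Here capture noisily lets the error message appear without stopping the rest of the commands.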

Setting how much output is in the Results window

The default scrollback buffer size for the Results window is 200 kilobytes,
approximately 200,000 characters. If you have many results being displayed
in the Results window, the default is to drop the oldest lines once you use up the
200 kilobyte buffer. If you want to be able to scroll back further, you can make
the buffer size larger, up to 2,000 kilobytes. Select Edit ⊲ Preferences ⊲ General
Preferences... and click on the Windowing tab. Stata for Mac users can make this
change by selecting Stata ⊲ Preferences ⊲ General Preferences... and clicking on
the Windows tab. You might change the scrollback buffer size from the default
200 kilobytes to 500 kilobytes. This change will not take effect until you restart
Stata.

Stata for Unix users cannot make this change from the Preferences dialog
box; they must type the command set scrollbufsize 500000 directly in the
Command window.

Typing the command sets the scrollback buffer size in bytes by default, whereas
using the menu method sets the size in kilobytes.

Many Stata users find it irritating to have to click on the --more-- message when it
appears in the Results window. It is designed to make it easier to read
the results of a single command, but if you do not like this feature, you can type
the command set more off or set more off, permanently. The permanently
option specifies that the setting be remembered for each future Stata session until
you reverse the action by typing set more on or set more on, permanently.
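The two settings in this tip can be sketched as typed commands. The value 500000 is the example size from the tip; the display lines are an added check and assume the system values c(scrollbufsize) and c(more) are available in your version of Stata.

```stata
* Sketch only: 500000 bytes corresponds to the 500 kilobytes suggested above
set scrollbufsize 500000    // size in bytes; applies after Stata is restarted
set more off                // turn off the pause for this session only
display c(scrollbufsize)    // scrollback buffer setting, in bytes
display c(more)             // shows whether the pause is on or off
```

Adding the permanently option to set more off would keep the setting for future sessions.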

1.2 Introduction
The best way to learn data analysis is to actually do it with real data. These days,
doing statistics means doing statistics with a computer and a software package. There
is no other software package that can match the internal consistency of Stata, which
makes it easy to learn and a joy to use. Stata empowers users more effectively than any
other statistical package.

Work along with the book

Although it is not necessary, you will probably find it helpful to have Stata running
while you read this book so that you can follow along and experiment for yourself.
Having your hands on a keyboard and replicating the instructions in this book
will make the lessons that much more effective, but more importantly, you will get
in the habit of just trying something new when you think of it and seeing what
happens. In the end, experimentation is how you will really learn how Stata works.
The other great advantage to following along is that you can save the examples we
do for future use.

Stata is a powerful tool for analyzing data. Stata makes statistics and data analysis
fun because it does so much of the tedious work for you. A new Stata user should
start by using the dialog boxes. As you learn more about Stata, you will be able to
do more sophisticated analyses with Stata commands. Learning Stata well now is an
investment that will pay off in saved time later. Stata is constantly being extended with
new capabilities, which you can install using the Internet from within Stata. Stata is a
program that grows with you.
Stata is a command-driven program. It has a remarkably simple command structure
that you use to tell it what you want it to do. You can use a dialog box to generate
the commands (this is a great way to learn the commands or prompt yourself if you
do not remember one exactly), or you can enter commands directly. If you enter the
summarize command, you will get a summary of all the variables in your dataset (mean,
standard deviation, number of observations, minimum value, and maximum value).
Enter the command tabulate gender, and Stata will make a frequency distribution of
the variable called gender, showing you the number and percentage of men and women
in your dataset.
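To try these two commands before you load one of this book's datasets, here is a small sketch that uses auto.dta, an example dataset installed with Stata. That file has no gender variable, so the categorical variable foreign stands in for it here.

```stata
* Sketch only: foreign is a stand-in for the gender variable discussed above
sysuse auto, clear
summarize           // observations, mean, SD, minimum, and maximum for each variable
tabulate foreign    // frequency distribution with counts and percentages
```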
After you have used Stata for a while, you may want to skip the dialog box and
enter these commands directly. When you are just beginning, however, it is easy to be
overwhelmed by all the commands available in Stata. If you were learning a foreign
language, you would have no choice but to memorize hundreds of common words right
away. This is not necessary when you are learning Stata because the dialog boxes are
so easy to use.

Searching for help

Stata can help when you want to find out how to do something. You can use the
search command along with a keyword. For example, you believe that a t test is
what you want to use to compare two means. Enter search t test; Stata searches
its own resources and others that it finds on the Internet. The first entry of the
results is
[R] ttest . . . . . . . . . . . . . . . t tests (mean-comparison tests)
(help ttest)

The [R] at the beginning of the line means that details and examples can be found
in the Stata Base Reference Manual. Click on the blue ttest to go to the help file
for the ttest command. If you think this help is too cryptic, repeat the search t
test command and look farther down the list. Scroll past the lines starting with
Video, and look for the lines starting with FAQ (frequently asked questions). One
of these is “What statistical analysis should I use?” Click on the blue URL to go
to a UCLA webpage that will help you decide whether the t test is the best choice
for what you are doing. You might click on some of the other resources to see how
much support you get from a wide variety of resources.

When using the search command, you need to pick a keyword that Stata
knows. You might have to try different keywords before you get one that works.
Searching these Internet locations is a remarkable capability of Stata. If you are
reading this book and want to know more about a command, the online help is
the first place to start. Suppose that we are discussing the summarize command
and you want to know more options for this command. Type help summarize
and you will get an informative help screen. To obtain complete information for a
command, you should see the PDF documentation. The PDF documentation can be
opened from the Stata menu by selecting Help ⊲ PDF Documentation. Bookmarks
to all the Stata manuals are available; click on the plus sign (+) next to each
manual to see bookmarks to sections therein.
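The sequence described in this tip might be typed as follows. This is only a sketch; what the search returns depends on your version of Stata and on whether you are connected to the Internet.

```stata
* Sketch only: results vary by Stata version and Internet access
search t test       // keyword search; look for the [R] ttest entry
help ttest          // open the help file for the command the search found
help summarize      // look up the options for a command you already know
```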

Stata has done a lot to make the dialog boxes as friendly as possible so that you
feel confident using them. The dialog boxes often show many options, which control the
results that are shown and how they are displayed. You will discover that the dialog
boxes have default values that are often all you need, so you may be able to do a great
deal of work without specifying any options.
As we progress, you will be doing more complex analyses. You can do these using the
dialog boxes, but Stata lets you create files that contain a series of commands you can
run all at once. These files, called do-files, are essential once you have many commands
to run. You can reopen the do-file a week or even several months later and repeat
exactly what you did. Keeping a record of what you do is essential; otherwise, you will
not be able to replicate results of elaborate analyses. Fortunately, Stata makes this easy.
You will learn more about replicating results in chapter 4. The do-files that reproduce
most of the tables, graphs, and statistics for each chapter are available on the webpage
for this book (http://www.stata-press.com/data/agis4/).
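To give a sense of what a do-file looks like before chapter 4, here is a minimal sketch. The filename first_session.do is hypothetical, and auto.dta is an example dataset installed with Stata rather than one of this book's datasets.

```stata
* first_session.do (hypothetical filename)
* Sketch only: a do-file is a plain text file of commands, one per line
version 13              // interpret the commands as Stata 13 would
sysuse auto, clear      // example dataset installed with Stata
summarize price mpg
tabulate foreign
```

If these lines were saved as first_session.do in the working directory, typing do first_session in the Command window would run them all, in order, each time.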
Because Stata is so powerful and easy to use, I may include some analyses that are
not covered in your statistics textbook. If you come to a procedure that you have not
already learned in your statistics text, give it a try. If it seems too daunting, you can
skip that section and move on. On the other hand, if your statistics textbook covers
a procedure that I omit, you might search the dialog boxes yourself. Chances are that
you will find it there.
Depending on your needs, you might want to skip around in the book. Most people
tend to learn best when they need to know something, so skipping around to the things
you do not know may be the best use of the book and your time. Some topics, though,
require prior knowledge of other topics, so if you are new to Stata, you may find it best
to work through the first four chapters carefully and in order. After that, you will be
able to skip around more freely as your needs or interests demand.

1.3 The Stata screen


When you open Stata, you will see a screen that looks something like figure 1.2.

Figure 1.2. Stata’s opening screen


You can rearrange the windows to look the way you want them, although many users
are happy with the default layout. If you are satisfied with the defaults, you might
skip the next couple of paragraphs and come back to them if you change your mind later.
Many experienced Stata users have particular ways to arrange these screens. Feel free
to experiment with the layout.
Selecting Edit ⊲ Preferences gives you several options. One thing you might want
to do is change the size of the buffer for the Results window. The factory default of
200 kilobytes may be too small to be able to scroll through all your results. To change
the size of the buffer, select Edit ⊲ Preferences ⊲ General Preferences... and then click
on the tab labeled Windowing. Depending on how much memory your computer has
available, you might want to raise the default value to as much as 500 kilobytes. You
can resize the Stata interface as you would any other Windows product. There are other
options you can try under Windowing and each of the other tabs. It is nice to personalize
your interface in a way that is attractive to you. I will use the generic “factory settings”
for this book, however. If you make several changes and want to return to the starting
point, select Edit ⊲ Preferences ⊲ Load Preference Set ⊲ Widescreen Layout (default). If
you are using Stata for Mac, select Stata ⊲ Preferences ⊲ Manage Preferences ⊲ Factory
Settings.
When you open a file that contains Stata data, which we will call a Stata dataset, a
list of the variables will appear in the Variables window. The Variables window reports
the name of the variable (for example, abortion) and a label for the variable (for
example, Attitude toward abortion). Other information about the variable is shown
in the Properties window, such as the type of variable (for example, float) and the
format of the variable (for example, %8.0g). For now, just consider the name and label.
You can vary the width of each column in the Variables window by placing your cursor
on the vertical line between the name and label, clicking on it, and then dragging your
cursor to the right or left.
When Stata executes a command, it prints the results or output in the Results
window. First, it prints the command preceded by a . (dot) prompt, and then it prints
the output. The commands you run are also listed in the Review window. If you click
on one of the commands listed in the Review window, it will appear in the Command
window. If you double-click on one of the commands listed in the Review window, it
will be executed. You will then see the command and its output, if any, in the Results
window.
When you are not using the interface, you enter commands in the Command window.
You can use the Page Up and Page Down keys on your keyboard to recall commands
from the Review window. On a Mac that does not have the Page Up and Page Down
keys, you can use the fn key with the arrow up or arrow down key. You can also edit
commands that appear in the Command window. I will illustrate all these methods in
the coming chapters.
The gray bar at the bottom of the screen, called the status bar, displays the current
working directory (folder). This directory may be different on different computers
depending on how Stata was installed. The working directory is where Stata will look for a
file or save a file unless you specify the full path to a different directory that contains the
file. If you have a project and want to store all files related to that project in a particular
directory, say, C:\data\thesis, you could enter the command cd C:\data\thesis.
This command assumes that this directory already exists on your computer.
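The working-directory commands described above can be tried safely with a short sketch like the following. The folder name thesis_demo is hypothetical and stands in for a project folder such as C:\data\thesis; the sketch creates it inside the current working directory so that nothing depends on how your computer is set up. It is meant to be run as a do-file or typed line by line in one session.

```stata
* Show the current working directory (the same path the status bar displays)
pwd

* Remember the starting directory so that we can return to it later
local start "`c(pwd)'"

* thesis_demo is a hypothetical folder name used only for this illustration;
* capture suppresses the error if the folder already exists
capture mkdir "thesis_demo"

* Make the new folder the working directory and confirm the change
cd "thesis_demo"
pwd

* Return to where we started
cd "`start'"
```

Creating the folder first avoids the error you would get from cd if the directory did not already exist.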
On a Mac, the gray bar at the bottom looks slightly different. To change the working
directory on a Mac or Unix computer from the current working directory to a Documents
folder in your home directory, you would type cd "~/Documents". Stata recognizes the
tilde to represent your home directory. If you had a folder in your Documents folder
called Learning Stata, you would type cd "~/Documents/Learning Stata". Also on
a Mac, you have help if you cannot remember where you saved a file containing data:
You can click on the magnifying glass in the upper right corner of your screen to search
the name of the file, and then click on the file to open it. You may want to type the
clear command first.
Stata has the usual Windows title bar across the top, on the right side of which
are the three buttons (in order from left to right) to minimize, to expand to full-screen
mode, and to close the program. Immediately below the Stata title bar is the menu bar,
where the names of the menus appear. Some of the menu items (File, Edit, and Window)
will look familiar because they are used in other programs. The Data, Graphics, and
Statistics menus are specific to Stata, but their names provide a good idea of what you
will find under them.
Figures 1.3 and 1.4 show the Stata toolbar as it appears in Windows and Mac,
respectively. The icons provide alternate ways to perform some of the actions you
would normally do with the menus. If you hold the cursor over any of these icons for a
couple of seconds, a brief description of the function appears. For a complete list of the
toolbar icons and their functions, see the Getting Started with Stata manual.

Figure 1.3. The toolbar in Stata for Windows

Figure 1.4. The toolbar in Stata for Mac

1.4 Using an existing dataset


Chapter 2 discusses how to create your own dataset, save it, and use it again. You will
also learn how to use datasets that are on the Internet. For now, we will use a simple
dataset that came with Stata. Although we could use the dialog box to do this, we will
enter a simple command. Click once in the Command window to put the cursor there,
and then type the command sysuse cancer, clear; the Command window should
look like the one in figure 1.5.

Figure 1.5. Stata command to open cancer.dta

The sysuse command we just used will find the sample dataset on your computer
by name alone, without the extension; in this case, the dataset name is cancer, and the
file that is found is actually called cancer.dta. The cancer dataset was installed with
Stata. This particular dataset has 48 observations and 8 variables related to a cancer
treatment.
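A minimal sketch of the same step as commands, assuming the example datasets were installed with Stata in the usual way. The counts displayed at the end should agree with the 48 observations and 8 variables reported for the version of cancer.dta used in this book; a different Stata release could ship a slightly different file.

```stata
* List the example datasets that were installed with Stata
sysuse dir

* Load cancer.dta by name; clear discards any data already in memory
sysuse cancer, clear

* _N holds the number of observations and c(k) the number of variables
display "observations: " _N
display "variables:    " c(k)
```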
What if you forget the command sysuse? You could open a file that comes with
Stata by using the menu File ⊲ Example Datasets.... A new window opens in which
you click on Example datasets installed with Stata. The next window then lists all the
datasets that come with Stata. You can click on use to open the dataset.
Now that we have some data read into Stata, type describe in the Command
window. That is it: just type describe and press the Enter key. describe will yield a
brief description of the contents of the dataset.

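. describe

Contains data from C:\Program Files\Stata13\ado\base/c/cancer.dta
  obs:            48                          Patient Survival in Drug Trial
 vars:             8                          3 Mar 2011 16:09
 size:           576
-------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
-------------------------------------------------------------------------------
studytime       int     %8.0g                 Months to death or end of exp.
died            int     %8.0g                 1 if patient died
drug            int     %8.0g                 Drug type (1=placebo)
age             int     %8.0g                 Patient’s age at start of exp.
_st             byte    %8.0g
_d              byte    %8.0g
_t              byte    %10.0g
_t0             byte    %10.0g
-------------------------------------------------------------------------------
Sorted by: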

The description includes a lot of information: the full name of the file, cancer.dta
(including the path entered to read the file); the number of observations (48); the
number of variables (8); the amount of memory the data consume (576 bytes); a brief
description of the dataset (Patient Survival in Drug Trial); and the date the file was last
saved. The body of the table displayed shows the names of the variables on the far left
and the labels attached to them on the far right. We will discuss the middle columns
later.
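When the full description is more than you need, describe can be narrowed. The sketch below assumes cancer.dta is available through sysuse and shows two variations along with the results that describe leaves behind in r().

```stata
sysuse cancer, clear

* Header information only: observations, variables, and size
describe, short

* Describe just the variables you name
describe studytime age

* describe also stores its counts in r() for later use
quietly describe
display "obs = " r(N) ", vars = " r(k)
```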
Now that you have opened cancer.dta, note that the Variables window lists the
eight variables studytime, died, drug, age, _st, _d, _t, and _t0.

Internet access to datasets

Stata can use data stored on the Internet just as easily as data stored on your computer.
If you did not have the cancer.dta file installed on your computer, you could
read it by entering webuse cancer. However, you are not limited to data stored at
the Stata site. Typing use http://www.ats.ucla.edu/stat/stata/notes/hsb2
will open a dataset stored at the UCLA website.

Stata does not discard changes to the dataset currently in memory unless
you tell it to do so. That is, if you have a dataset in memory and you have modified
it, you will receive an error message if you try to load another dataset. You need
to save the dataset in memory, type the clear command to discard the changes,
or add the clear option to the use command to discard the changes. You can
then load the new dataset.
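The rule about unsaved changes can be seen in a small sketch. The variable age_months and the scalar rc_refused are hypothetical names made up for this illustration, and auto is simply another dataset installed with Stata. The capture prefix traps the error so that the remaining commands still run.

```stata
sysuse cancer, clear

* Change the data in memory (age_months is a hypothetical new variable)
generate age_months = age * 12

* Without clear, Stata refuses to load another dataset;
* capture traps the error and _rc holds the return code
capture sysuse auto
scalar rc_refused = _rc
display "return code when changes would be lost: " rc_refused

* With the clear option, the changed data are discarded and auto.dta loads
sysuse auto, clear
```

A nonzero return code means the load was refused and the modified cancer data were still in memory at that point.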

Stata provides all the datasets for every example in its manuals. For example,
click on File ⊲ Example Datasets.... A new window opens in which you click
on Stata 13 manual datasets. There you might click on Base Reference Manual
[R]; scroll down to correlate, and click on use to open any of the datasets or
describe to see what variables are in the dataset.

1.5 An example of a short Stata session


If you do not have cancer.dta loaded, type the command sysuse cancer. We will
execute a basic Stata analysis command. Type summarize in the Command window
and then press Enter.
Rather than typing in the command directly, you could use the dialog box by selecting
Data ⊲ Describe data ⊲ Summary statistics to open the corresponding dialog box.
Simply clicking on the OK button located at the bottom of the dialog box will produce
the summarize command we just entered. Because we did not enter any variables in the
dialog box, Stata assumed that we wanted to summarize all the variables in the dataset.
You might want to select specific variables to summarize instead of summarizing
them all. Open the dialog box again and click on the pulldown menu within the Variables
box, located at the top of the dialog box, to display a list of variables. Clicking on a
variable name will add it to the list in the box. Dialog boxes allow you to enter a
variable more than once, in which case the variable will appear in the output more than
once. You can also type variable names in the Variables box. A last alternative is to
click on the variable name in the Variables window. Figure 1.6 shows the dialog box
with the drop-down variable list displaying the variables in your dataset:

Figure 1.6. The summarize dialog box

In the bottom left corner of the dialog box, there are three icons: a help icon, a reset
icon, and a copy icon. The help icon gives us a help screen explaining the various options.
The explanations are brief, but there are examples at the bottom of the Viewer window.
The reset icon resets the dialog box. Just to the right of the reset icon is the copy icon,
which looks like two pages. If you click on this icon, the command is copied to the Clipboard.
If you enter the summarize command directly in the Command window, simply
follow it with the names of the variables for which you want summary statistics. For
example, typing summarize studytime age will display only statistics for the two
variables named studytime and age.
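As a short sketch of the typed alternative, assuming cancer.dta is available through sysuse, the commands below summarize selected variables, request more detail for one of them, and then pick up a stored result.

```stata
sysuse cancer, clear

* Summary statistics for two named variables only
summarize studytime age

* detail adds percentiles, variance, skewness, and kurtosis
summarize age, detail

* summarize stores its results in r(); here is the mean of age
quietly summarize age
display "mean age = " r(mean)
```

Stored results such as r(mean) are handy when a later command needs a number that summarize has already computed.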
In the Results window, the summarize command will display the number of observations
(also called cases or N), the mean, the standard deviation, the minimum value,
and the maximum value for each variable.

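. summarize

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
   studytime |        48        15.5    10.25629          1         39
        died |        48    .6458333    .4833211          0          1
        drug |        48       1.875    .8410986          1          3
         age |        48      55.875    5.659205         47         67
         _st |        48           1           0          1          1
-------------+--------------------------------------------------------
          _d |        48    .6458333    .4833211          0          1
          _t |        48        15.5    10.25629          1         39
         _t0 |        48           0           0          0          0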

The first line of output displays the dot prompt followed by the command. After
that, the output appears as a table. As you can see, there are 48 observations in this
dataset. Observations is a generic term. These could be called participants, patients,
subjects, organizations, cities, or countries depending on your field of study. In Stata,
each row of data in a dataset is called an observation. The average, or mean, age is
55.875 years with a standard deviation of 5.659 (see footnote 1), and the subjects are all between 47
(the minimum) and 67 (the maximum) years old.
If you have computed means and standard deviations by hand, you know how long
this can take. Stata’s virtually instant statistical analysis is what makes Stata so valuable.
It takes time and skill to set up a dataset so that you can use Stata to analyze it,
but once you learn how to set up a dataset (chapter 2), you will be able to compute a
wide variety of statistics in little time.
We will do one more thing in this Stata session: we will make the histogram for the
age variable, shown in figure 1.7.
[Histogram: Density (vertical axis, 0 to .08) by Patient’s age at start of exp. (horizontal axis, 45 to 65)]

Figure 1.7. Histogram of age

A histogram is just a graph that shows the distribution of a variable, such as age, that
takes on many values.
Simple graphs are simple to create. Just type the command histogram age in the
Command window, and Stata will produce a histogram using reasonable assumptions.
I will show you how to use the dialog boxes for more complicated graphs shortly.
At first glance, you may be happy with this graph. Stata used a formula to determine
that six bars should be displayed, and this is reasonable. However, Stata starts the
lowest bar (called a bin) at 47 years old, and each bin is 3.33 years wide (this information
is displayed in the Results window) even though we are not accustomed to measuring
years in thirds of a year. Also notice that the vertical axis measures density, but we
might prefer that it measure the frequency, that is, the number of people represented
by each bar.

1. I may round numbers in the text to fewer digits than shown in the output unless it would make
finding the corresponding number in the output difficult.
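As a rough check on where the six bars and the 3.33-year width come from, the sketch below reproduces the arithmetic. It assumes the default rule given in the Stata documentation for histogram, k = min(sqrt(N), 10 ln(N)/ln(10)), and it assumes the result is truncated to a whole number, which is consistent with the six bars reported here. The scalar names nbins and binwidth are made up for this illustration.

```stata
sysuse cancer, clear
quietly summarize age

* Assumed default rule for the number of bins, truncated to an integer
scalar nbins = floor(min(sqrt(r(N)), 10*ln(r(N))/ln(10)))

* Each bin covers an equal share of the range from the minimum to the maximum
scalar binwidth = (r(max) - r(min)) / nbins

display "bins = " nbins ", start = " r(min) ", width = " %5.2f binwidth
```

Because the width follows mechanically from the range and the number of bins, it will rarely be a round number, which is one reason to set the width and the starting value yourself.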
Using the dialog box can help us customize our histogram. Let’s open the histogram
dialog box shown in figure 1.8 by selecting Graphics ⊲ Histogram from the menu bar.

Figure 1.8. The histogram dialog box

Let’s quickly go over the parts of the dialog box. There is a textbox labeled Variable
with a pulldown menu. As we saw on the summarize dialog, you can pull down the list
of variables and click on a variable name to enter it in the box, or you can type the
variable’s name yourself. Only one variable can be used for a histogram, and here we
want to use age. If we stop here and click on OK, we will have re-created the histogram
shown in figure 1.7.
There are two radio buttons visible to the right of the Variable box: one labeled Data
are continuous (which is shown selected in figure 1.8) and one labeled Data are discrete.
Radio buttons indicate mutually exclusive items—you can choose only one of them.
Here we are treating age as if it were continuous, so make sure that the corresponding
radio button is selected. On the right side of the Main tab is a section labeled Y axis.
Click on the radio button for Frequency so that the histogram shows the frequency of
each interval. In the section labeled Bins, check the box labeled Width of bins and type
2.5 in the textbox that becomes active (because the variable is age, the 2.5 indicates
2.5 years). Also check the box labeled Lower limit of first bin and type 45, which will
be the smallest age represented by the bar on the left.
The dialog box shows a sequence of tabs just under its title bar, as shown in figure 1.9.
Different categories of options will be grouped together, and you make a different set
of options visible by clicking on each tab. The options you have set on the current tab
will not be canceled by clicking on another tab.

Figure 1.9. The tabs on the histogram dialog box

Graphs are usually clearer when there is a title of some sort, so click on the Titles
tab and add a title. Here we type Age Distribution of Participants in Cancer
Study in the Title box. Let’s add the text Data: Sample cancer dataset to the Note
box so that we know which dataset we used for this graph. Your dialog box should look
like figure 1.10.

Figure 1.10. The Titles tab of the histogram dialog box

Now click on the Overall tab. Let’s select s1 monochrome from the pulldown menu on the
Scheme box. Schemes are basically templates that determine the standard attributes of
a graph, such as colors, fonts, and size; which elements will be shown; and more.
From the Legend tab, under the Legend behavior section, click on the radio button
for Show legend. Whether a legend will be displayed is determined by the scheme that
is being used, and if we were to leave Default checked, our histogram might have a
legend or it might not, depending on the scheme. Choosing Show legend or Hide legend
overrides the scheme, and our selection will always be honored.
Now that we have made these changes, click on Submit instead of OK to generate
the histogram shown in figure 1.11. The dialog box does not close. To close the dialog
box, click on the X (close) button in the upper right corner, but we are not ready to do
that yet.

[Histogram titled "Age Distribution of Participants in Cancer Study": Frequency (vertical axis, 0 to 10) by Patient’s age at start of exp. (horizontal axis, 45 to 65); legend: Frequency; note: Data: Sample cancer dataset]

Figure 1.11. First attempt at an improved histogram

If you look at the complex command that the dialog box generated, you will see why
even experienced Stata programmers will often rely on the dialog box to create graph
commands. In reading this command, you will want to ignore the opening dot (Stata
prints this in front of commands in the Results window, but the dot is not part of the
command and you do not type it). Stata prints the > sign at the start of the second and
third lines, which might be confusing. Stata uses the Enter key to submit a command.
Because of this, Stata sees the entire command as one line. To print the entire line in
the confines of the Results window, Stata inserts the > for a line break. If you wanted to
enter this command in the Command window, you would simply type the entire thing
without the > and let Stata do the wrapping as needed in the Command window. Never
press the Enter key until you have entered the entire command.
. histogram age, width(2.5) start(45) frequency
> title(Age Distribution of Participants in Cancer Study)
> note(Data: Sample cancer dataset) legend(on) scheme(s1mono)
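If you save the command in a do-file instead of typing it in the Command window, one common approach is to break it across lines with the /// continuation marker, which tells Stata that the command continues on the next line. The sketch below is the same command the dialog box generated. It assumes cancer.dta is available through sysuse and that the lines are run from a do-file, because /// is not understood in the Command window.

```stata
sysuse cancer, clear

* The same histogram command, split over several lines with ///
histogram age, width(2.5) start(45) frequency                  ///
    title(Age Distribution of Participants in Cancer Study)    ///
    note(Data: Sample cancer dataset)                          ///
    legend(on) scheme(s1mono)
```

Each option sits on its own line here only for readability; Stata still treats the whole thing as a single command.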

Clearing the Results window: The cls command

As you run commands, the results are displayed in the Results window. There may
be times when you want to clear the Results window, so that, for example, seeing
the top of the results of a command is easier, especially if your commands and
results are lengthy. Beginning with Stata 13, you can type the cls command (with
no options) to clear the Results window.

It is much more convenient to use the dialog box to generate that command than
to try to remember all its parts and the rules of their use. If you do want to enter a
long command in the Command window, remember to type it as one line. Whenever
you press Enter, Stata assumes that you have finished the command and are ready to
submit it for processing.

When to use Submit and when to use OK

Stata’s dialogs give you two ways to run a command: by clicking on OK or by
clicking on Submit. If you click on OK, Stata creates the command from your
selections, runs the command, and closes the dialog box. This is just what you
want for most tasks. At times, though, you know you will want to make minor
adjustments to get things just right, so Stata provides the Submit button, which
still runs the command but leaves the dialog open. This way, you can go back to
the dialog box and make changes without having to reopen the dialog box.

The resulting histogram in figure 1.11 is an improvement, but we might want fewer
bins. Here we are making small changes to a Stata command, then looking at the results,
and then trying again. The Submit button is useful for this kind of interactive, iterative
work. If the dialog box is hidden, we can use the Alt+Tab (Windows) or Cmd+Tab
(Mac) key combination to move through Stata’s windows until the one we want is on
top again.
Instead of a width of 2.5 years, let’s use 5 years, which is a more common way to
group ages. If you clicked on OK instead of on Submit, you need to reopen the histogram
dialog box as you did before. When you return to a dialog that you have already used
in the current Stata session, the dialog box reappears with the last values still there.
So all you need to do is change 2.5 to 5 in the Width of bins box on the Main tab and
click on Submit. The result is shown in figure 1.12.

[Histogram titled "Age Distribution of Participants in Cancer Study": Frequency (vertical axis, 0 to 15) by Patient’s age at start of exp. (horizontal axis, 45 to 70); legend: Frequency; note: Data: Sample cancer dataset]

Figure 1.12. Final histogram of age


Notice how different the three graphs appear. You need to use judgment to pick the best
combination and avoid using graphs that misrepresent the distribution. A good graph
will give the reader a true picture of the distribution, but a poor graph may be quite
deceptive. When people say that you can lie with statistics, they are often thinking
about graphs that do not provide a fair picture of a distribution or a relationship. Can
you think of any more improvements? The legend at the bottom center of the graph is
unnecessary. You might want to go back to the dialog box, click on the Legend tab, and
click on Hide legend to turn off the legend.
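For readers who prefer typing, the following sketch shows roughly what the dialog box would submit after both changes: 5-year bins and no legend. It is an assumption that Hide legend corresponds to the legend(off) option, and the /// line breaks assume that the command is run from a do-file.

```stata
sysuse cancer, clear

* Figure 1.12 settings with the legend turned off
histogram age, width(5) start(45) frequency                    ///
    title(Age Distribution of Participants in Cancer Study)    ///
    note(Data: Sample cancer dataset)                          ///
    legend(off) scheme(s1mono)
```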
To finish our first Stata session, we need to close Stata. Do this with File ⊲ Exit. If
you are using Stata for Mac, select Stata ⊲ Quit Stata.

1.6 Summary
We covered the following topics in this chapter:

• The font and punctuation conventions I will use throughout the book
• The Stata interface and how you can customize it
• How to open a sample Stata dataset
• The parts of a dialog box and the use of the OK and Submit buttons
• How to summarize the variables
• How to create and modify a simple histogram

1.7 Exercises
Some of these exercises involve little or no writing; they are simply things you can do to
make sure you understand the material. Other exercises may require a written answer.

1. You can copy and paste text to and from Stata as you wish. You should try
highlighting some text in Stata’s Results window, copying it to the Clipboard, and
pasting it into another program, such as your word processor. To copy highlighted
text, you can use the Edit ⊲ Copy menu or, as indicated on the menu, Ctrl+c. You
will probably need to change the font to a monospaced font (for example, Courier),
and you may need to reduce its font size (for example, to 9 point) after pasting
it to prevent the lines from wrapping. You may wish to experiment with copying
Stata output into your word processor now so that you know which font size and
typeface work best. It may help to use a narrower margin, such as 1 inch, on each
side.
2. After you highlight material in the Results window, right-click on it. You can copy
this output in several formats, including Copy, Copy Table (only works with some
commands), Copy Table as HTML, and Copy as Picture (copies a graphic image
of what you highlighted). The Copy option works nicely, but you will need to
use a monospaced font, such as Courier, and may need to use a smaller font size
when you paste it into your word processing document. The Copy Table option
is limited because it only works with a few commands. The Copy Table as HTML
option will create a table that looks like what you would see on a webpage. Using
Microsoft Word, you can edit the table by making columns wider or narrower and
by aligning the columns so that each number has the same number of decimal
places. Just copy the tabular results and not the command when using the HTML
option. The Copy as Picture option works nicely in Windows, but you cannot edit
it in Word because it is a graphic image. In Word, you can resize the image.
Run the summarize command and copy the results to a Word document by using
each of the options. Highlight the table. Right-click on it and then select the
option you want. Switch to your word processor. Press Ctrl+v to paste what you
copied. In your word processor, make the table as nice as you can by adjusting
the font, font size, margins, etc.

3. Stata has posted all the datasets from its manuals that were used to illustrate
how to do procedures. You can access the manual datasets from within Stata by
going to the File ⊲ Example Datasets... menu, which will open a Viewer window.
Click on Stata 13 manual datasets and then click on User’s Guide [U].
The Viewer window works much like a web browser, so you can click on any of
the links in the list of datasets. Scroll down to chapter 25, and select the use link
for censusfv.dta, which opens a dataset that is used for chapter 25 of the User’s
Guide. Run two commands, describe and summarize. What is the variable
divorcert and what is the mean (average) divorce rate for the 50 states?

4. Open cancer.dta. Create histograms for age using bin widths of 1, 3, and 5. Use
the right mouse button to copy each graph to the Clipboard, and then paste it
into your word processor. Does the overall shape of the histogram change as the
bins get wider? How?

5. UCLA has a Stata portal containing a lot of helpful material about Stata. You
might want to browse the collection now just to get an idea of the topics covered
there. The URL for the main UCLA Stata page is

http://www.ats.ucla.edu/stat/stata/

In particular, you might want to look at the links listed under Learning Stata.
On the Stata Starter Kit page, you will find a link to Class notes with movies.
These movies demonstrate using Stata’s commands rather than the dialog box.
The topics we will cover in the first few chapters of this book are also covered on
the UCLA webpage using the commands. Each movie is about 25 minutes long.
Some of these movies are for older versions of Stata, but they are still useful.
King's garden, saw a great shadow cross the moon. Brushing his
hand uneasily across his eyes he looked again, but this time the
shadow had gone. Concluding that it had been but a dark cloud, the
Soothsayer drew a deep breath and, leaning forward, broke the
golden pear from the sacred bough. Now Akbad hardly knew what to
expect, but the thing that did happen exceeded his wildest
imaginings. The pear in his hands grew larger and larger, bursting
finally with such a golden splutter and glare he was almost blinded.
Stars! It was a pair of wings!
Thoroughly frightened, the soothsayer fell back against the tree,
putting up both hands to beat off the whirling pinions. But it was no
use. The great wings swooped down upon him and next moment had
fastened themselves to his shoulders. His heart, as they lifted him
into the air, dropped so suddenly into his boots both boots fell off.
Motionless and helpless and just above the emerald tree he hung
suspended, trembling so violently his turban came unwound and
fluttered like a banner in the evening breeze. For about as long as
you could count ten Akbad dangled limply between the golden wings.
Then recovering a little of his courage he moistened his lips and
muttered weakly.
"Take me to the Emerald City of Oz." Next instant, another shadow
had crossed the moon and Akbad, like some strange ungainly bird,
was being borne swiftly and silently towards the South.
The Vanished Queen of the Ozure Isles
CHAPTER 3
The Strange Public Benefactor

In the dusty shop of Dan, the second-hand man, there was no sound
except the whirr of a rickety sewing machine in the back room. Dan
bought old clothes which he mended and pressed and sold again to
people who could not afford new ones. Usually he spent every
evening in his dim little Boston shop, but to-night Dan's niece was to
be married, and the old clothes man was hurriedly stitching up a rent
in a dress suit he had bought that very morning from a dusky
gentleman in Grant street. It was worn and shabby, but surveying
himself in the cracked mirror a few moments later Danny felt he
would look quite as fine as the groom. Well pleased with his
appearance he nodded to his reflection and taking down a second-
hand high hat from his shelf let himself out into the night.
It was a warm starry evening in May and, coming to the end of the
narrow street in which he lived, Dan struck out across a small park,
whistling softly to himself. He would have preferred his pipe, but in
honor of the grand occasion had purchased a handful of five cent
cigars. Placing one between his teeth, he fumbled in his pocket for
the box of matches he had surely placed there before starting. His
fingers closed instead on a small leather book.
"What's this?" exclaimed Danny in surprise and, stepping under a
park lamp, he began fluttering over the pages. It was filled with
closely written paragraphs in a strangely cramped hand. The words
were no words Danny had ever heard or seen. To prove it he settled
his specs more firmly and read a whole paragraph aloud, moistening
his lips between the long hard sentences, and keeping his cigar in
place in his mouth with great difficulty.
"Well, did anyone ever hear the like of that?" chuckled Danny,
winking up at the statue of a Public Benefactor who stood facing him
in a small plot of grass. "What do you think of it yourself, old felly?"
"I hardly know," murmured the Public Benefactor, letting the arm
which had been stiffly extended fall heavily at his side. "I hardly
know. You see, I've never thought before, and—"
"Merciful mackerel!" The cigar fell from Danny's lips, the high hat
from his head and hurling the leather book into a clump of bushes,
he turned and fled for his life, bumping into trees and benches and
running in the opposite direction from the wedding. In fact, I am not
sure he ever did get to the wedding at all. The Public Benefactor
watched him go with round unwinking eyes, then stepping down
from his pedestal, picked up the high hat, fortunately an extremely
large one, and placed it gravely upon his head.
"Now for an umbrella," murmured the stone gentleman determinedly.
"I must have an umbrella. What I've suffered all these years, rain and
snow. Ah—hh." Catching sight of an old lady hurrying down one of
the cinder paths, he called loudly. "Stop! Stop! Give me that
umbrella!" For some seconds the old lady who was quite deaf paid no
attention, but when, looking over her shoulder, she saw a gray stone
gentleman in a frock coat pounding after her, waving both arms, she
picked up her skirts, jumped over a little hedge and fell face down
among the pansies. Without feeling at all sorry, or stopping to help
her to her feet, the Public Benefactor took the umbrella from her
hand. Opening it with a little grunt of satisfaction and holding it over
his head as he had seen other people do, he stepped carelessly over
the old lady and continued down the cinder path. "I've always
wanted to be like other people," mused the statue, striding along
contentedly, "and now, I am. But I wonder why I never did this
before?"
Why indeed? Simply because he had never been alive before. The
words in the little black book must have held some strange and
mysterious force; the owner of Danny's dress suit must have been a
powerful magician to bring this cold statue to life. And as he strode
across the little Boston park, with Danny's hat upon his head and the
old lady's umbrella clasped tightly in his hand, little boys who had
come for a quiet game of marbles before bed time, men and women
on their way home to tea, stared in perfect astonishment and then
took to their heels, screaming hoarsely as they ran.
"I'm acting just the way they are acting, and yet they run away,"
grumbled the Public Benefactor crossly. "What's the matter with them
anyway?" He sank down on a park bench to puzzle it all out, but the
bench, which had been built to hold only ordinary folk, crumpled like
a match under his great weight. A tramp who had been asleep on the
other end, wakened by the terrible tumble, took one glance at the
stone man, then rolled into a clump of shrubbery where he lay
trembling so violently leaves fell in showers to the walk. By the time
the Public Benefactor had struggled to his feet a great crowd had
gathered. At a safe distance they peered at him, waving their arms,
shaking their heads and looking so frightened the Public Benefactor
began to feel frightened himself.
Turning his back upon them, he walked out of the park and straight
into the middle of a busy crossing. Here he stopped to gaze at a
winking electric sign when a dreadful thump almost knocked the
umbrella from his hand, and a series of shouts almost raised the hat
from his head. A motor truck going at a fast clip had run right into
him! But instead of upsetting the stone man, the truck splintered to
bits and lay scattered about the street like a broken toy! Surely a
pleasant change from breaking up poor pedestrians. But the truck
driver did not seem to think so. Separating himself from the
wreckage, he advanced threateningly upon the Public Benefactor. But
one good look at that calm stone figure seemed to be enough. A
mounted policeman leaning down seized the high hatted gentleman
by the arm, then feeling the hard stone beneath his fingers he reined
back his horse and blew a shrill blast on his whistle.
In less than a minute the street was a seething mass of men,
women, little girls and boys, all striving for a glimpse of the man who
had stopped a truck. Next someone turned in a fire alarm and the fire
engines came clanging on the scene. The firemen not knowing what
else to do turned their hose full upon the offending statue.
Alarmed and disgusted, and protecting himself as well as he could
with the old lady's umbrella, the Public Benefactor decided to return
to his pedestal. But in the excitement he took a wrong turning. Then
he began to run and the crowd to run after him—faster and faster
and faster. His stone feet, thudding upon the asphalt, shook the
houses on both sides and, dodging as best he could the sticks, stones
and other missiles of his pursuers, the poor bewildered statue ran on.
Being very large and perfectly tireless, he soon out-distanced them
and, looking over his shoulder to make sure, failed to notice the
steep embankment ahead, till it was too late. The workmen
themselves had not intended to blow such a terrific hole in the earth;
a thin crust of earth at the bottom hid the yawning cavity from view.
But the stone man, tumbling head over heels down the steep sides,
crashed through this crust as if it had been paper and plunged into a
damp darkness.
"What now?" groaned the statue dismally, clutching his umbrella.
"Am I a bird? Why, Oh why did I ever leave my pedestal?" But
wishing made no difference at all and down he dropped to the very
bottom of no where. Then all at once he crashed through a crust of
blue sky out into the blazing sunlight and thumped down in the
middle of a broad green field. Luckily he landed upon his feet, but so
hard and so heavily that he went down to his knees in soft earth. For
a few moments he stood perfectly still. Then, closing his umbrella, he
pulled one leg and then the other out of the mud and took a few
steps to shake the stuff from his stone shins.
"It was night and now it is day. I was there and now I am here. What
next?" he muttered uneasily. The country into which he had fallen so
suddenly seemed safe enough. Green fields, dotted with feathery
trees, stretched to the right and left. But after the dusty Boston park
it seemed large and lonely. As he gazed about uncertainly, he noticed
a blue figure, walking briskly along a yellow highway that ran through
the center of the fields. He had never in his whole carved career seen
a fellow like this and as the figure drew nearer he grasped his
umbrella firmly and made ready to fight or run.
It was a Scarecrow, a live, jolly, sure enough straw stuffed Scarecrow.
As he came opposite he took off his hat.
"Good after-night," said the Scarecrow politely. The Public Benefactor
made an unsuccessful effort to remove his own hat, but he had
jammed it down too hard.
"I suppose you mean good morning," he remarked stiffly, returning
the Scarecrow's bow.
"Have it your own way," smiled the Scarecrow, with a care free wave,
"and speaking of ways, where are you going?"
"I'm not going, I'm coming," announced the Public Benefactor sulkily.
The experiences of the past few hours had made him suspicious of
every place and everybody. The Scarecrow considered his answer for
a few seconds in silence, then stepping closer inquired earnestly, "Tell
me, are you a person?"
"Are you?" At this quick and unexpected turning of his question, the
Scarecrow threw back his head and laughed heartily.
"I don't know," he admitted merrily, "whether I'm a person or not,
but I do know that I'm alive and it's great fun to be alive!"
"Is it?" The Public Benefactor looked dubiously into the Scarecrow's
cheerful cotton countenance. "I'm not sure I like it," he sighed,
shaking his head ponderously.
"Oh, you'll get used to it." Clapping on his hat, the straw man
regarded his companion attentively. "You're the only live statue I've
ever seen," he observed at last. "How do you happen to be alive?"
There was something so jolly about this queer fellow, the poor statue
began to feel a little happier.
"First," he began slowly, "I was quarried, then I was hacked and
hewn into my present shape. For many years I stood on a pedestal in
a little park in the city of Boston. While I could neither move nor talk
I could see and hear all that went on about me. And what I saw and
heard was interesting enough. I watched the children sail their boats
in the small pond, listened to the band on warm summer evenings
and observed the strange habits of the men and women who walked
about under the trees. If I had just had a hat or umbrella to protect
me from the rain and snow, I could have been perfectly happy."
"You must be perfectly happy now," put in the Scarecrow slyly, "for I
see you have both." The Public Benefactor shook his head impatiently
at the interruption.
"Once a year," he continued pompously, "a crowd of citizens came
and hung wreaths around my neck, and in long tedious speeches
which I could not understand referred to me as a great public
benefactor. Do you know what a Public Benefactor is?" he inquired
curiously.
"Well," answered the Scarecrow cautiously, "you probably founded a
school or a library or gave large sums of money to the poor. What
was your name anyway?"
"I never knew," replied the gray stone gentleman sadly. "It was
carved on the base of my pedestal and as I was unable to bend over
I could never discover this interesting information."
"Then I shall call you Benny," decided the Scarecrow cheerfully,
"short for public benefactor, you know. Do you look like the person
you're supposed to be?"
The statue shook his head. "I don't know that either," he admitted
gloomily.
"Oh, never mind that," said the Scarecrow, sitting down on a nearby
tree stump. "You are a speaking likeness of somebody, but how did
you come to life?"
"I was coming to that," exclaimed Benny quickly, and in short excited
sentences he told how an old Irishman in evening clothes had
stopped under the park lamp and read some strange words from a
little black book and how he immediately felt a desire to step down
from his pedestal. "So I did," he went on mournfully, and proceeded
to relate his terrifying experiences and his final fall into this strange
land. "It is very queer," he finished in a depressed voice. "When I was
uninteresting and unalive, people treated me with respect and hung
wreaths around my neck, yet when I came to life they turned a hose
on me and even hit me with bricks."
The Scarecrow shook his head. "There's no accounting for mortals,"
he explained solemnly, "but now that you are in the fairy Kingdom of
Oz, things will be different. Anybody can be alive here, and no
questions asked. They even let me live!" he concluded gaily.
"Is it a republic?" asked Benny, eyeing the Scarecrow with new
interest.
"Indeed not!" exclaimed the straw man loftily. "We are a magic
monarchy under the beneficent rule of a little fairy and there—," he
waved proudly to the left, "lies the capital. If you wish, I will take you
to the Emerald City at once and present you to the Queen. What
would you like to be now that you are alive?" he asked curiously.
"Well," said Benny after a moment's thought, "I should like to be a
real person. Do you think I could ever be a real person, Scarecrow?"
The Scarecrow took off his hat and pulled several wisps of straw from
his head.
"I don't see why not," he decided brightly. "The way to be a real
person is to act like a real person. Just begin acting like a real
person, Benny, my boy, and first thing you know you'll be one!"
"Is that what you did?" Benny looked doubtfully at this strange citizen
of Oz. The Scarecrow nodded modestly and, taking the stone man's
elbow, started down the yellow brick highway. "Look alive now," he
chuckled merrily, "for you are to meet a Queen."
"It's hard for a stone man to look alive but I'll do the best I can,"
sighed the Public Benefactor in a resigned voice. "How do you
happen to be alive yourself?" he inquired heavily.
"That!" said the Scarecrow airily, "that is a long story, you see—"
"I see a great ugly bird," interrupted the Public Benefactor, waving his
umbrella wildly. "Let's run; I never did like birds. They perch on my
head."
"Pray do not concern yourself," begged his companion earnestly, "and
try to act like a real person, can't you?" Withdrawing his arm from
Benny's the Scarecrow took off his hat and blinked upward.
"Well," queried Benny nervously, "what would a real person do now?"
"He would run," choked the Scarecrow in a hoarse whisper. "Run you
son of a boulder, run!"
CHAPTER 4
Finding a Mortal Maiden
So well did Benny carry out the Scarecrow's instruction, the flimsy
straw man was jerked from the ground and fairly flew through the air
at the stone man's side. And so intent were they both upon their
running, they never saw the little girl in the pink dress until they had
bumped right into her. Now to be run into is upsetting under any
circumstances, but to be run into by a live statue is the most
upsetting thing yet. Trot, for it was Trot, not only was upset but rolled
over and over and bumped her head on an emerald milestone at the
side of the road.
"Stop!" cried the Scarecrow, recognizing her at once. "Now see what
you've done!"
"But the bird!" quavered Benny coming to a reluctant halt and
glancing fearfully over his shoulder.
With an impatient exclamation the Scarecrow dropped his hand and
hurried over to Trot. "Fancy, running into you like this," he puffed
ruefully.
"Fancy it!" gasped Trot rubbing her head with one hand and her knee
with the other, "I don't fancy it at all. Why don't you look where
you're going!" She frowned crossly at the Scarecrow and then
catching a glimpse of Benny jumped to her feet in real alarm. "Who's
he?" she asked in a frightened whisper.
"Just now he's a Public Benefactor, but he's trying to be a real
person," explained the Scarecrow hastily. "Benny, old fellow, this is
Trot, a little girl from California who was shipwrecked and came to
the Land of Oz. She lives in the royal palace with Ozma. Benny comes
from America too," he added proudly.
[Illustration: This Is Trot]
"But the bird!" panted Benny, nodding absently to Trot.
"You see my dear, we were escaping from a horriblus bird when we
ran across you," apologized the Scarecrow with an anxious glance
upward.
"I don't see any bird." Still rubbing her knee, Trot looked up too and
after they had all gazed intently at the sky for several minutes they
had to admit that Trot was right. There was not even a speck in the
bright blue expanse overhead.
"But there was a bird, a most fearful, queerful bird," the Scarecrow
assured her positively. Trot gave a little sniff and while she did not
exactly say so, both Benny and the Scarecrow felt that she did not
believe there had been any bird at all.
"I was coming to see you," continued the Scarecrow in a slightly
embarrassed voice. "How fortunate that we met this way, now we
can all go to the Emerald City together." Trot, looking down at her
skinned knee and feeling the lump on her forehead, could not help
thinking it had not been so fortunate for her, but being a really
sweet-tempered little girl she said nothing further and walked along
quietly between these two singular looking gentlemen. The
Scarecrow she had known for years, but she kept stealing inquisitive
glances at his solemn stone companion. Seeing her evident interest,
the straw man told her all about Benny's strange coming to life and
his fall into Oz.
"Do you think I can ever be a real person?" asked Benny wistfully as
the Scarecrow finished his story. "Now, as you see, I am a hard
person of stone. But I wish to be like other people, to laugh, to sing,
to dance and be happy."
It was hard to imagine this pompous looking image singing and
dancing, but Trot had seen stranger things than this happen in the
marvelous Land of Oz, so, stifling her misgivings, she smiled at him
kindly.
"You'll have to be a little careful about dancing," she cautioned
gently, "not to step on anyone's foot, or hold them too tightly or—"
"Ho Ho!" roared the Scarecrow. "I should say you had better be
careful. One step from your stone toes, and one squeeze from those
stone arms would finish any partner brave enough to waltz round
with you." At this the stone man looked so downcast that Trot felt
really sorry for him.
"I guess stone arms and legs are not much use," he sighed, rolling
his eyes sadly at the little girl.
"But they're terribly strong," Trot reminded him cheerfully, "and
would be fine in a battle. And after awhile, when you're quite used to
being alive, I wouldn't mind dancing with you," finished Trot in a little
burst of generosity.
"Wouldn't you?" Stopping stock still, Benny began to bow. "My dear,"
exclaimed the stone man gratefully, and bending so low he almost
lost his balance, "those are the kindest words I've heard since I came
to life and to Oz." Trot, pleased and delighted at such appreciation,
curtsied back.
"Hurrah!" shouted the Scarecrow, tossing his hat into the air. "You're
acting realer every minute. Do you know, this reminds me of my first
journey to the Emerald City. I was not always the accomplished
person you see before you," he confided mysteriously.
For a long time Benny had been trying to puzzle out just what kind of
a person the Scarecrow was. Never in his whole park experience had
he seen anyone so curiously constructed, so unsteady and flimsy, yet
so gaily alive. He listened attentively therefore as the straw man
began to tell his story to his new friend.
"I am a Scarecrow," he began impressively, and I must admit he was
as fond of talking about himself as most of the gentlemen of my own
acquaintance. Trot who had heard the story many times began to
hum a little tune and to think of something else.
"Originally," continued the Scarecrow brightly, "I was intended to
scare away the crows from a farmer's corn field. My head is a small
stuffed sack on which the features are neatly painted. This blue suit
and these red boots and cotton gloves belonged to the farmer; also
this hat. Having assembled me in this more or less careless fashion
and stuffed me with hay, he hung me upon a tall pole in the corn
field and went about his planting. For a long time I hung around, not
knowing how interesting life could be. Then, one day," the Scarecrow
paused and waved his arms dramatically, "along came Dorothy, a
little girl about the size of Trot. She had been blown from Kansas by a
cyclone and was on her way to the Emerald City to ask the Wizard of
Oz to send her back home. Well, to make a long story short, Dorothy
lifted me from my pole and I found I could walk and talk almost as
fast as she could. But while I was alive, I realized that I could never
be a really important person with a head full of hay. So I decided to
go to the Emerald City with Dorothy and ask the Wizard of Oz to give
me some brains."
"Well, did he?" Benny looked curiously at the Scarecrow's bulging
forehead.
"Haven't you noticed them?" demanded the Scarecrow in a vexed
voice. Removing his hat he tapped the top of his head proudly. "In
here are the finest and most magic brains in Oz," he announced
seriously. "Not only did they help me to become an Emperor, but they
have since solved many questions of state for our present ruler, Ozma
of Oz. I can think of anything, can't I, Trot?"
The little girl nodded politely and Benny, much impressed, watched
the Scarecrow put on his hat. "I have a castle of my own in the
Winkie Country but spend most of my time in the Emerald City," he
concluded proudly.
"Did the Wizard send Dorothy back to America?" asked Benny, as the
Scarecrow stopped to pick a green rose for Trot.
"Certainly!" answered the Scarecrow, pulling two thorns from his
cotton thumb, "but she is in Oz again. No one who has lived in Oz
can stay away long. Dorothy lives in the castle with Ozma, Betsy and
Trot. Betsy Bobbin is another little girl from America, so you see you'll
have lots of company, old fellow."
[Illustration: Princess Dorothy]
"Does the Wizard live there, too?" questioned Benny eagerly, as the
Scarecrow clumsily presented the rose to Trot, "and do you think he
could change me to a real person?"
"Of course, but if I were you, I should stay as you are. There are lots
of real people but precious few stone ones. Think of the advantages!"
Tapping Benny lightly on the chest the Scarecrow began to
enumerate them. "First of all," he explained merrily, "you will never
tire, need food or suffer pain. You will never wear out nor require
clothes. Why, you have all the advantages of life without any of its
inconveniences. Isn't that true, Trot?"
Trot smiled and made a gesture that might have been "yes" or "no".
It would have taken a wiser person than Trot to settle a question like
the Scarecrow's.
They were drawing nearer to the Emerald City every moment now.
Over the tree tops ahead, Benny could see the tall towers and
flashing spires of the castle. The air was fresh, fragrant and somehow
exciting. On each side of the yellow brick road, cozy green cottages
with domed emerald roofs began to appear. Friendly faced folk, in
stiff green silk costumes, waved to them from the doorways. Trot and
the Scarecrow waved back, and Benny, taking off his hat and bowing
stiffly from time to time, decided that he was going to find life in the
Land of Oz extremely pleasant and interesting. At Trot's suggestion
they turned off the yellow brick highway to take a short cut to the
castle.
"Well," laughed Trot, dancing along through the pleasant little wood,
"We'll soon be in the Emerald City now, and then—and then!"
"Then what?" wheezed the Scarecrow, stopping to swing on a low
branch.
"Why, then we'll have a party!" exclaimed Trot. "Don't we always
have a party when you come to the castle, but this party will be for
Benny, in honor of his coming to life." The stone man was not sure
just what a party was, but so long as Trot was in it he knew
everything would be all right. "We'll have games," continued the little
girl happily, "and music and riddles and refreshments—and—"
"Stop!" roared an imperious voice in Trot's ear. "Now then, will you
come along peaceably or must I use force?" At this sudden horrid
interruption, Benny and the Scarecrow swung round in perfect
astonishment.
"A—a Goblin!" faltered Trot, catching wildly at Benny.
"Run! Run! That awful bird!" panted the Scarecrow, taking a great
leap forward.
"Run if you want to," rumbled the Public Benefactor stopping short.
"But as I am not a real person, I shall stay here and fight. Get away
from here, you wild Whankus! Leave Trot alone, you old
Wallybuster!" Words that he had never known were in his head came
tumbling from Benny's stone lips and brandishing his umbrella
threateningly he stepped between the little girl and the great ugly
bird-man. But Akbad, for of course it was Akbad, paid no attention to
Benny's expostulations. He was looking earnestly at the picture he
had torn from his history of Oz. All night the magic wings had carried
him steadily toward the capital and it was Akbad who had scared the
two travelers. After frightening them to his heart's content, he had
alighted in a small orchard to refresh himself with a few peaches.
When he flew on again the wings had carried him straight after Trot
and her companions. Looking down and seeing a little girl with them
this time, he had immediately dropped to earth.
"You'll do, you're one of them!" shrilled the Soothsayer, waving the
picture triumphantly. "Come on, there's no time to lose!" Before
either Benny or the Scarecrow realized what was happening, Akbad
seized the little girl and spread his great golden wings.
"Stop!" yelled the Scarecrow, running back and catching Trot by the
hand.
"Stop!" gritted Benny, making a wild snatch for the Soothsayer's
heels. As Benny's stone fingers closed around his ankles Akbad
soared into the air. You would have thought the great weight of the
stone man would have held him down. But what are a thousand
pounds to a pair of magic wings! Up and away, over the sparkling
spires of the capital circled Akbad, paying no more attention to Benny
than to a feather and scarcely noticing the Scarecrow at all.
"Take us to the Ozure Isles," he commanded, tightening his grasp on
Trot's arm.
CHAPTER 5
In the Cave of Quiberon

It had taken the golden wings nearly nine hours to carry Akbad to the
Emerald City. It took scarcely five to bring him back, so that it was a
little after noon when the Soothsayer and his prisoners reached the
sparkling shores of the Ozure Isles. Not a word had been spoken by
anyone during the entire flight. Trot had started to scream, but the
wind rushing down her throat about a mile a minute had almost
choked her. When she managed to get her mouth shut again she was
glad to keep it that way, her eyes too, for that matter. Benny was too
startled to say anything and the Scarecrow had all he could do to
keep himself from blowing apart. But as Akbad, folding his wings,
began to descend, Trot with a long sigh opened her eyes.
The five lovely islands of Cheeriobed lay glittering just below and Trot
gave a little gasp of relief and pleasure, as they hovered over the
gorgeous Sapphire City. Frightened though she was, Trot's heart
began to beat with excitement and curiosity. Surely nothing so very
dreadful could happen in a place like this! But Akbad did not stop,
and flying over the beautiful city carried them to the extreme end of
the last island. Here the waters of Orizon were pounding and roaring
between two jeweled cliffs. Between the two cliffs and at the very
mouth of a great cave, Akbad closed his wings. With a suddenness
that took what little breath Trot had left, they came tumbling down
on the narrow beach. Benny got such a thump, he let go the
Soothsayer's heels and almost fell into the lake. Trot and the
Scarecrow rolled over twice and, clutching each other wildly, sat up,
simply speechless with indignation.
"You," puffed Akbad, for he, too, was worn out by the long fly, "you
have been chosen to save the Ozure Isles." He shook his long finger
in Trot's face. "These others may escape if they wish, but you must
stay and serve the monster Quiberon." As Trot, blinking her eyes
between shock and consternation, tried to understand what it was all
about, there came a great snort and splashing and in toward the
cave swam the monster himself.
"Here's your mortal maiden!" yelled Akbad, and spreading his wings,
rose quickly into the air, leaving Trot and her friends to face the giant
fear-fish. Benny had by this time struggled to his feet, but at sight of
the monster he nearly lost his balance again. As for Trot and the
Scarecrow, after one horrified glance, they seized hands and dashed
in the only direction open to them—straight into the blue cave.
"Wait!" thundered Quiberon, shooting a long tongue of flame from
his fiery nostrils. He was so close that the fire and smoke blackened
both Benny's eyes. With a grunt of surprise and displeasure, the
stone man snatched up his umbrella and pounded after Trot and the
Scarecrow.
[Illustration: "Wait," Thundered Quiberon]
"I thought you said that in Oz things would be different," shouted
Benny, grinding the jeweled pebbles on the floor of the cave to
powder beneath his flying stone boots.
"Well, isn't this different?" stuttered the Scarecrow, tripping over a
sapphire boulder and sprawling upon his nose.
"Oh, hurry!" begged Trot, jerking him quickly to his feet. "Here it
comes." At another time the three travelers might have paused to
admire the great jeweled grotto, but with this snorting, puffing
monster at their heels they scarcely glanced at the sapphire icicles
hanging from the roof and jutting out from the sides and the
sparkling gems that strewed the floor of the cave. Water rushed
through the center and it was no easy task running over the rocks
and boulders at the side. The glowing eyes of the monster lighted up
the whole cavern. Like a steam engine, he puffed and snorted behind
them, filling the air with a sulphurous smoke, till it smelled like
twenty Fourths of July rolled into one. At every flash from his nostrils,
the poor Scarecrow would wince and shudder.
"One spark, and I am an ash heap!" groaned the unhappy straw
man, leaping wildly from boulder to rock.
"What shall we do now?" wailed Trot, stopping in dismay, for they
had come to the very back of the cavern and could run no farther.
"I don't know what a real person would do," panted Benny glancing
around desperately, "but I'll do something. Quick, squeeze into that
little opening." There was just time for Trot and the Scarecrow to slip
into the narrow crevice at the back of the cave before Quiberon
dragged himself out of the water and flung himself up on the rocks.
"Where is the mortal maiden?" roared the great dragon, as Benny
placed himself bravely between his friends and the monster.
"Turn off your fire works! Do you want to burn her to a crisp?"
shouted the stone man, waving his umbrella boldly under Quiberon's
very nose. "Can't you talk without smoking?" he continued crossly,
"You're turning me quite black."
"Speak without smoking," muttered the monster in a puzzled voice.
"Well, I might try it. Is this better?" he grunted presently. Benny
nodded and waving the cloud of smoke from before his eyes peered
anxiously downward.
"What do you want with Trot?" he asked suspiciously.
"I want her for a servant," answered Quiberon promptly. "She must
polish my scales, comb my hair," he lifted a great silver lock that
hung between his horns, "sweep out the cave and tell me stories."
