An Introduction to Stata Programming
Second Edition
CHRISTOPHER F. BAUM
Department of Economics and School of Social Work
Boston College
Published by Stata Press, 4905 Lakeway Drive, College Station, Texas 77845
Typeset in LaTeX 2ε
Contents
Figures
Tables
Preface
Acknowledgments
The import delimited command
Accessing data stored in spreadsheets
Fixed-format data files
2.7.2 Importing data from other package formats
2.8 Guidelines for Stata do-file programming style
2.8.1 Basic guidelines for do-file writers
2.8.2 Enhancing speed and efficiency
2.9 How to seek help for Stata programming
3 Do-file programming: Functions, macros, scalars, and matrices
3.1 Introduction
3.1.1 What you should learn from this chapter
3.2 Some general programming details
3.2.1 The varlist
3.2.2 The numlist
3.2.3 The if exp and in range qualifiers
3.2.4 Missing-data handling
Recoding missing values: The mvdecode and mvencode
commands
3.2.5 String-to-numeric conversion and vice versa
Numeric-to-string conversion
Working with quoted strings
3.3 Functions for the generate command
3.3.1 Using if exp with indicator variables
3.3.2 The cond() function
3.3.3 Recoding discrete and continuous variables
3.4 Functions for the egen command
Official egen functions
egen functions from the user community
3.5 Computation for by-groups
3.5.1 Observation numbering: _n and _N
3.6 Local macros
3.7 Global macros
3.8 Extended macro functions and macro list functions
3.8.1 System parameters, settings, and constants: creturn
3.9 Scalars
3.10 Matrices
4 Cookbook: Do-file programming I
4.1 Tabulating a logical condition across a set of variables
4.2 Computing summary statistics over groups
4.3 Computing the extreme values of a sequence
4.4 Computing the length of spells
4.5 Summarizing group characteristics over observations
4.6 Using global macros to set up your environment
4.7 List manipulation with extended macro functions
4.8 Using creturn values to document your work
5 Do-file programming: Validation, results, and data management
5.1 Introduction
5.1.1 What you should learn from this chapter
5.2 Data validation: The assert, count, and duplicates commands
5.3 Reusing computed results: The return and ereturn commands
5.3.1 The ereturn list command
5.4 Storing, saving, and using estimated results
5.4.1 Generating publication-quality tables from stored estimates
5.5 Reorganizing datasets with the reshape command
5.6 Combining datasets
5.7 Combining datasets with the append command
5.8 Combining datasets with the merge command
5.8.1 The one-to-one match-merge
5.8.2 The dangers of many-to-many merges
5.9 Other data management commands
5.9.1 The fillin command
5.9.2 The cross command
5.9.3 The stack command
5.9.4 The separate command
5.9.5 The joinby command
5.9.6 The xpose command
6 Cookbook: Do-file programming II
6.1 Efficiently defining group characteristics and subsets
6.1.1 Using a complicated criterion to select a subset of
observations
6.2 Applying reshape repeatedly
6.3 Handling time-series data effectively
6.3.1 Working with a business-daily calendar
6.4 reshape to perform rowwise computation
6.5 Adding computed statistics to presentation-quality tables
6.6 Presenting marginal effects rather than coefficients
6.6.1 Graphing marginal effects with marginsplot
6.7 Generating time-series data at a lower frequency
6.8 Using suest and gsem to compare estimates from nonoverlapping
samples
6.9 Using reshape to produce forecasts from a VAR or VECM
6.10 Working with IRF files
7 Do-file programming: Prefixes, loops, and lists
7.1 Introduction
7.1.1 What you should learn from this chapter
7.2 Prefix commands
7.2.1 The by prefix
7.2.2 The statsby prefix
7.2.3 The xi prefix and factor-variable notation
7.2.4 The rolling prefix
7.2.5 The simulate and permute prefixes
7.2.6 The bootstrap and jackknife prefixes
7.2.7 Other prefix commands
7.3 The forvalues and foreach commands
8 Cookbook: Do-file programming III
8.1 Handling parallel lists
8.2 Calculating moving-window summary statistics
8.2.1 Producing summary statistics with rolling and merge
8.2.2 Calculating moving-window correlations
8.3 Computing monthly statistics from daily data
8.4 Requiring at least n observations per panel unit
8.5 Counting the number of distinct values per individual
8.6 Importing multiple spreadsheet pages
9 Do-file programming: Other topics
9.1 Introduction
9.1.1 What you should learn from this chapter
9.2 Storing results in Stata matrices
9.3 The post and postfile commands
9.4 Output: The export delimited, outfile, and file commands
9.5 Automating estimation output
9.6 Automating graphics
9.7 Characteristics
10 Cookbook: Do-file programming IV
10.1 Computing firm-level correlations with multiple indices
10.2 Computing marginal effects for graphical presentation
10.3 Automating the production of LaTeX tables
10.4 Extracting data from graph files’ sersets
10.5 Constructing continuous price and returns series
11 Ado-file programming
11.1 Introduction
11.1.1 What you should learn from this chapter
11.2 The structure of a Stata program
11.3 The program statement
11.4 The syntax and return statements
11.5 Implementing program options
11.6 Including a subset of observations
11.7 Generalizing the command to handle multiple variables
11.8 Making commands byable
Program properties
11.9 Documenting your program
11.10 egen function programs
11.11 Writing an e-class program
11.11.1 Defining subprograms
11.12 Certifying your program
11.13 Programs for ml, nl, and nlsur
Maximum likelihood estimation of distributions’
parameters
11.13.1 Writing an ml-based command
11.13.2 Programs for the nl and nlsur commands
11.14 Programs for gmm
11.15 Programs for the simulate, bootstrap, and jackknife prefixes
11.16 Guidelines for Stata ado-file programming style
11.16.1 Presentation
11.16.2 Helpful Stata features
11.16.3 Respect for datasets
11.16.4 Speed and efficiency
11.16.5 Reminders
11.16.6 Style in the large
11.16.7 Use the best tools
12 Cookbook: Ado-file programming
12.1 Retrieving results from rolling
12.2 Generalization of egen function pct9010() to support all pairs of
quantiles
12.3 Constructing a certification script
12.4 Using the ml command to estimate means and variances
12.4.1 Applying equality constraints in ml estimation
12.5 Applying inequality constraints in ml estimation
12.6 Generating a dataset containing the longest spell
12.7 Using suest on a fixed-effects model
13 Mata functions for do-file and ado-file programming
13.1 Mata: First principles
13.1.1 What you should learn from this chapter
13.2 Mata fundamentals
13.2.1 Operators
13.2.2 Relational and logical operators
13.2.3 Subscripts
13.2.4 Populating matrix elements
13.2.5 Mata loop commands
13.2.6 Conditional statements
13.3 Mata’s st_ interface functions
13.3.1 Data access
13.3.2 Access to locals, globals, scalars, and matrices
13.3.3 Access to Stata variables’ attributes
13.4 Calling Mata with a single command line
13.5 Components of a Mata function
13.5.1 Arguments
13.5.2 Variables
13.5.3 Stored results
13.6 Calling Mata functions
13.7 Example: st_ interface function usage
13.8 Example: Matrix operations
13.8.1 Extending the command
13.9 Mata-based likelihood function evaluators
13.10 Creating arrays of temporary objects with pointers
13.11 Structures
13.12 Additional Mata features
13.12.1 Macros in Mata functions
13.12.2 Associative arrays in Mata functions
13.12.3 Compiling Mata functions
13.12.4 Building and maintaining an object library
13.12.5 A useful collection of Mata routines
14 Cookbook: Mata function programming
14.1 Reversing the rows or columns of a Stata matrix
14.2 Shuffling the elements of a string variable
14.3 Firm-level correlations with multiple indices with Mata
14.4 Passing a function to a Mata function
14.5 Using subviews in Mata
14.6 Storing and retrieving country-level data with Mata structures
14.7 Locating nearest neighbors with Mata
14.8 Using a permutation vector to reorder results
14.9 Producing LaTeX tables from svy results
14.10 Computing marginal effects for quantile regression
14.11 Computing the seemingly unrelated regression estimator
14.12 A GMM-CUE estimator using Mata’s optimize() functions
References
Author index
Subject index
Figures
5.1 Superimposed scatterplots
6.1 Change in Treasury bill rate
6.2 Average marginal effects in a probit model
6.3 Predictions of real exchange rates
7.1 Rolling quantile regression coefficients
7.2 Distribution of the sample median via Monte Carlo simulation
7.3 Q–Q plot of the distribution of the sample median
8.1 Moving-average growth rates
8.2 Estimated monthly volatility from daily data
9.1 Automated graphics
10.1 Point and interval elasticities computed with margins
10.2 Air quality in U.S. cities
12.1 Rolling lincom estimates
Tables
2.1 Numeric data types
5.1 Models of sulphur dioxide concentration
9.1 Grunfeld company statistics
9.2 Grunfeld company estimates
9.3 Wage equations for 1984
10.1 Director-level variables
11.1 MCAS percentile ranges
14.1 Demographics: Full sample
Preface
This book is a concise introduction to the art of Stata programming. It
covers three types of programming that can be used in working with Stata:
do-file programming, ado-file programming, and Mata functions that work
in conjunction with do- and ado-files. Its emphasis is on the automation of
your work with Stata and how programming on one or more of these levels
can help you use Stata more effectively.
The format of the book may be unfamiliar to readers who have some
familiarity with other books that help you learn how to use Stata. Beginning
with chapter 4, each even-numbered chapter is a “cookbook” chapter
containing several “recipes”, 47 in total. Each recipe poses a problem: how
can I perform a certain task with Stata programming? The recipe then
provides a complete worked solution to the problem and describes how the
features presented in the previous chapter can be put to good use. You may
not want to follow a recipe exactly from the cookbook; just as in cuisine, a
minor variation on the recipe may meet your needs, or the techniques
presented in that recipe may help you see how Stata programming applies to
your specific problem.
Most Stata users who delve into programming use do-files to automate
and document their work. Consequently, the major focus of the book is do-
file programming, covered in chapters 3, 5, 7, and 9. Some users will find
that writing formal Stata programs, or ado-files, meets their needs.
Chapter 11 is a concise summary of ado-file programming, with the
cookbook chapter that follows presenting several recipes that contain
developed ado-files. Stata’s matrix programming language, Mata, can also
be helpful in automating certain tasks. Chapter 13 presents a summary of
Mata concepts and the key features that allow interchange of variables,
scalars, macros, and matrices. The last chapter, cookbook chapter 14,
presents several examples of Mata functions developed to work with ado-
files. All the do-files, ado-files, Mata functions, and datasets used in the
book’s examples and recipes are available from the Stata Press website, as
discussed in Notation and typography.
The second edition of this book contains several new recipes illustrating
how do-files, ado-files, and Mata functions can be used to solve
programming problems. Several recipes have also been updated to reflect
new features in Stata added between versions 10 and 14. The discussion of
maximum-likelihood function evaluators has been significantly expanded in
this edition. The new topics covered in this edition include factor variables
and operators; use of margins, marginsplot, and suest; Mata-based
likelihood function evaluators; and associative arrays.
Acknowledgments
I must acknowledge many intellectual debts that have been incurred during
the creation of the first and second editions of this book. I am most indebted
to Nicholas J. Cox, who served as a technical reviewer of the original
manuscript, both for his specific contributions to this project and for his
willingness to share his extensive understanding of Stata with all of us in the
Stata user community. His Speaking Stata columns alone are worth the cost
of a subscription to the Stata Journal. Studying Nick’s many routines and
working with him on developing several Stata commands has taught me a
great deal about how to program Stata effectively.
Notation and typography
In this book, I assume that you are somewhat familiar with Stata, that you
know how to input data, and that you know how to use previously created
datasets, create new variables, run regressions, and the like.
I designed this book for you to learn by doing, so I picture you reading
this book while sitting at a computer and using the sequences of commands
contained in the book to replicate my results. In this way, you will be able to
generalize these sequences to suit your own needs.
I use the italic font for words that are not supposed to be typed; instead,
you are to substitute another word or words for them. For example, if I said
to type by(groupvar), you should replace “groupvar” with the actual name
of the group variable.
All the datasets and do-files for this book are freely available for you to
download. You can also download all the user-written commands described
in this book. See http://www.stata-press.com/data/itsp2.html for
instructions.
In a net-aware Stata, you can also load the dataset by specifying the
complete URL of the dataset. For example,
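The book's own example is not reproduced here. As a minimal sketch of the idea, the command below loads a dataset directly from a URL. The URL is an assumption chosen for illustration: it points to the auto dataset that Stata Press distributes with recent Stata releases, not to one of this book's datasets, and it requires an internet connection.

```stata
* Load a dataset by giving its complete URL (illustrative URL, not from the book)
use https://www.stata-press.com/data/r18/auto, clear
describe, short
```

The file extension .dta is assumed when none is given, just as it is for a local file.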
This text complements the material in the Stata manuals but does not
replace it, so I often refer to the Stata manuals by using [R], [P], etc. For
example, [R] summarize refers to the Stata Base Reference Manual entry
for summarize, and [P] syntax refers to the entry for syntax in the Stata
Programming Reference Manual.
Chapter 1
Why should you become a Stata programmer?
This book provides an introduction to several contexts of Stata
programming. I must first define what I mean by “programming”. You can
consider yourself a Stata programmer if you write do-files, which are text
files of sequences of Stata commands that you can execute with the do
([R] do) command, by double-clicking on the file, or by running them in the
Do-file Editor ([R] doedit). You might also write what Stata formally
defines as a program, which is a set of Stata commands that includes the
program ([P] program) command. A Stata program, stored in an ado-file,
defines a new Stata command. You can also use Stata’s matrix programming
language, Mata, to write routines in that language that are called by ado-
files. Any of these tasks involves Stata programming.1
With that set of definitions in mind, we must deal with the why: why
should you become a Stata programmer? After answering that essential
question, this text takes up the how: how you can become a more efficient
user of Stata by using programming techniques, be they simple or complex.
Do-file programming
by a set of do-files that execute every step, from the input of the raw data to
the production of the final tables and graphs. Because a do-file can call
another do-file (and so on), a hierarchy of do-files can be used to handle a
complex project.
That advice does not imply that Stata’s interactive capabilities should
be shunned. Stata is a powerful and effective tool for exploratory data
analysis and ad hoc queries about your data. But data-management tasks and
the statistical analyses leading to tabulated results should not be performed
with “point-and-click” tools that leave you without an audit trail of the steps
you have taken.
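The hierarchy of do-files described above might be sketched as follows. All file names here are hypothetical. So that the sketch runs on its own, it first writes two small step do-files into Stata's temporary directory; in a real project, those files would already exist and the master do-file would contain only the do commands.

```stata
* Sketch: a master do-file that runs one do-file per step of a project
local start "`c(pwd)'"
cd "`c(tmpdir)'"

* Create two hypothetical step files (normally written by hand)
tempname fh
file open `fh' using step1_import.do, write text replace
file write `fh' "sysuse auto, clear" _n
file write `fh' "save step1_data, replace" _n
file close `fh'

file open `fh' using step2_analyze.do, write text replace
file write `fh' "use step1_data, clear" _n
file write `fh' "regress price mpg weight" _n
file close `fh'

* The master section: each step leaves an audit trail in the log
do step1_import
do step2_analyze

cd "`start'"
```

Because every step is a file, rerunning the master do-file reproduces the whole analysis from the raw data onward.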
Ado-file programming
On a second level, you may find that despite the breadth of Stata’s official
and user-written commands, there are tasks you must repeatedly perform
that involve variations on the same do-file. You would like Stata to have a
command to perform those tasks. At that point, you should consider Stata’s
ado-file programming capabilities. Stata has great flexibility: a Stata
command need be no more than a few lines of Stata code. Once defined, that
command becomes a “first-class citizen”. You can easily write a Stata
program, stored in an ado-file, that handles all the features of official Stata
commands such as if exp, in range, and command options. You can (and
should) write a help file that documents the program’s operation for your
benefit and for those with whom you share the code. Although ado-file
programming requires that you learn how to use some additional commands
used in that context, it can help you become more efficient in performing the
data-management, statistical, or graphical tasks that you face.
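To give a sense of how short such a command can be, here is a minimal sketch of an r-class program that honors the if exp and in range qualifiers. The command name showrange is hypothetical, and the program is defined inline so that the example is self-contained; in practice it would be saved as showrange.ado on the ado-path.

```stata
* Sketch of a small user-written command (hypothetical name: showrange)
capture program drop showrange
program define showrange, rclass
    version 14
    syntax varname(numeric) [if] [in]
    marksample touse                       // marks observations to use
    quietly summarize `varlist' if `touse'
    display as text "range of `varlist': " as result r(max) - r(min)
    return scalar range = r(max) - r(min)  // stored results for later use
    return scalar N = r(N)
end

sysuse auto, clear
showrange price if foreign
showrange price
return list
```

The syntax and marksample commands do most of the work: they parse the command line and handle the qualifiers, which is why the program body stays so short.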
Mata programming for ado-files
On a third level, your ado-files can perform some complicated tasks that
involve many invocations of the same commands. Stata’s ado-file language
is easy to read and write, but it is interpreted. Stata must evaluate each
statement and translate it into machine code. The Mata programming
language (help mata) creates compiled code, which can run much faster
than ado-file code. Your ado-file can call a Mata routine to carry out a
computationally intensive task and return the results in the form of Stata
variables, scalars, or matrices. Although you may think of Mata solely as a
matrix language, it is actually a general-purpose programming language,
suitable for many nonmatrix-oriented tasks, such as text processing and list
management.
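As a minimal sketch of that division of labor, the Mata function below reads a Stata variable through a view and hands a result back to Stata as a scalar. The function name vrange() and the scalar name price_range are hypothetical, and the function is called here from the command line rather than from an ado-file to keep the example short.

```stata
* Sketch: a Mata function that returns a result to Stata
capture mata: mata drop vrange()
mata:
real scalar vrange(string scalar varname)
{
    real matrix X
    st_view(X, ., varname)        // view onto the Stata variable, no copy
    return(max(X) - min(X))
}
end

sysuse auto, clear
mata: st_numscalar("price_range", vrange("price"))
display scalar(price_range)
```

The function is compiled once when it is defined, so repeated calls avoid the interpretation overhead described above.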
The level of Stata programming that you choose to attain and master
depends on your needs and skills. As I have argued, the vast majority of
interactive Stata users can and should take the next step of learning how to
use do-files efficiently to take full advantage of Stata’s capabilities and to
save time. A few hours of investment in understanding the rudiments of do-
file programming—as covered in the chapters to follow—will save you
days or weeks over the course of a sizable research project.
The chapters of this book present the details of the three types of Stata
programming discussed above, placing the greatest emphasis on effective
use of do-file programming. Each fairly brief chapter on the structure of
programming techniques is followed by a “cookbook” chapter. These
chapters contain several “recipes” for the solution of a particular,
commonly encountered problem, illustrating the necessary programming
techniques to compose a solution. As in a literal cookbook, the recipes
here are illustrative examples; you are free to modify the ingredients to
produce a somewhat different dish. The recipes as presented may not
address your precise problem, but they should prove helpful in devising a
solution as a variation on the same theme.
Newer versions of the user-written commands that you install today may
become available. The official Stata command adoupdate
([R] adoupdate), which you can use at any time, will check to see whether
newer versions of any user-written commands are available. Just as the
command update query will determine whether your Stata executable and
official ado-files are up to date, adoupdate will determine whether any
user-written commands installed in your copy of Stata are up to date.
1. There are also specialized forms of Stata programming, such as dialog programming,
scheme programming, and class programming. A user-written program can present a
dialog, like any official Stata command, if its author writes a dialog file. The command
can also be added to the User menu of Stata’s graphical interface. For more information,
see [P] dialog programming and [P] window programming. Graphics users can write
their own schemes to set graphic defaults. See [G-4] schemes intro for details. Class
programming allows you to write object-oriented programs in Stata. As [P] class
indicates, this has primarily been used in Stata’s graphics subsystem and graphical user
interface. I do not consider these specialized forms of programming in this book.
2. For details on the SSC (Boston College) archive of user-contributed routines, type
help ssc.
Chapter 2
Some elementary concepts and tools
2.1 Introduction
This chapter lays out some of the basics that you will need to be an effective
Stata programmer. The first section discusses navigational and
organizational issues: How should you organize your files? How will Stata
find a do-file or an ado-file? The following sections describe how to edit
files, appropriate data types, several useful commands for programmers,
and some guidelines for Stata programming style. The last section suggests
how you can seek help for your programming problems.
Know where your files are: master the current working directory and
the ado-path
Learn how to edit do- and ado-files effectively
Use appropriate data types for your variables
Use compress when useful
Use time-series operators effectively
Use factor variables and operators effectively
Use capture, preserve, and restore to work efficiently
Use Stata’s data input commands effectively
Adopt a good style for do-file programming and internal documentation
Know where (and when) to seek help with your Stata programming
Know how to trace your do-file’s execution to diagnose errors
and organizational issues to help ensure that your files are in the best places.
Like most programs, Stata has a concept of the current working directory
(CWD). At any point in time, Stata is referencing a specific directory or
folder accessible from your computer. It may be a directory on your own
hard disk, or one located on a network drive or removable disk. In
interactive mode, Stata displays the CWD in its status bar. Why is the CWD
important? If you save a file—a .dta file or a log file—it will be placed in
the CWD unless you provide a full file specification directing it to another
directory. That is, save myfile, replace will save myfile in the CWD.
Likewise, if you attempt to use a file with the syntax use myfile, clear,
it will search for myfile in the CWD, returning an error if the file is not
located.
One of the most common problems that beginning users of Stata face is
saving a data file and not knowing where it was saved. Of course, if you
never change the CWD, all of your materials will be in the same place, but
do you really want everything related to your research to be located in one
directory? On my Mac OS X computer, the directory is
/Users/baum/Documents, and I would rather not commingle documents
with Stata data files, log files, and graph files. Therefore, I would probably
change the CWD to a directory, or folder, dedicated to my research project
and set up multiple directories for separate projects. You can change your
CWD with the cd command; for example, cd /data/city, cd
d:/project, or cd "My Documents/project1".1 You can use the pwd
command to display the CWD at any time. Both cd and pwd are described in
[D] cd.
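The behavior described above can be tried with a short sketch. It assumes nothing beyond a standard installation; the file name nosuchfile is hypothetical and deliberately does not exist, and Stata's temporary directory is used so that no project folder is disturbed.

```stata
* Sketch: where files go when no path is given
pwd                              // display the CWD
local start "`c(pwd)'"           // the CWD is also available as c(pwd)

cd "`c(tmpdir)'"                 // change the CWD
sysuse auto, clear
save myfile, replace             // written to the CWD as myfile.dta
use myfile, clear                // found in the CWD

capture use nosuchfile, clear    // not in the CWD
display _rc                      // nonzero return code: file not found

erase myfile.dta                 // clean up
cd "`start'"                     // return to the original directory
```

Saving the starting directory in a local macro and restoring it at the end keeps the do-file from leaving you in an unexpected location.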
You may want Stata to automatically change the CWD to your preferred
location when you start Stata. You can accomplish this with profile.do.
This file, placed in your home directory,2 will execute a set of commands
when you invoke Stata. You might place in profile.do the command cd
c:/data/NIHproject to direct Stata to automatically change the CWD to
that location.
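A profile.do along those lines might look like the sketch below. The project path is the one mentioned in the text and is only an example; the capture prefix and the error message are additions that keep startup from failing if the directory is missing.

```stata
* profile.do (sketch): executed automatically when Stata is invoked
capture cd "c:/data/NIHproject"
if _rc != 0 {
    display as error "profile.do: project directory not found; CWD unchanged"
}
```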
The sysdir ([P] sysdir) command provides a list of six directories or
folders on your computer that are important to Stata. The BASE directory
contains the Stata program itself and the official ado-files that make up most
of Stata. You should not tamper with the files in this directory. Stata’s
update ([R] update) command will automatically modify the contents of
the BASE directory. The SITE directory may reference a network drive in a
university or corporate setting where a system administrator places ado-
files to be shared by many users.
The PERSONAL directory is, as its name suggests, personal. You can
place your own ado-files in that directory. If you want to modify an official
Stata ado-file, you should make a copy of it, change its name (for instance,
rename sureg.ado to sureg2.ado), and place it in your PERSONAL
directory.
Why are there all these different places for Stata’s ado-files? The
answer lies in the information provided by the adopath ([P] sysdir)
command:
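The listing depends on your installation, so none is shown here. As a minimal sketch, the following commands display it on your own machine; the same information is also available in the c(adopath) and c(sysdir_personal) system values.

```stata
* Sketch: inspect Stata's system directories and the ado-path
sysdir                            // the system directories
adopath                           // the same directories in search order
display "`c(adopath)'"            // search order as a single string
display "`c(sysdir_personal)'"    // location of the PERSONAL directory
```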
Like sysdir, this command lists six directories. The order of these
directories is important because it defines how Stata will search for a
command. It will attempt to find foo.ado in BASE, the location of Stata’s
official ado-files. The third directory4 is “.”, that is, the CWD. The fourth is
PERSONAL, while the fifth is PLUS.5 This pecking order implies that if
foo.ado is not to be found among Stata’s official ado-files or the SITE
directory, Stata will examine the CWD. If that fails, it will look for foo.ado
in PERSONAL (and its subdirectories). If that fails, it will look in PLUS (and
its subdirectories) and as a last resort in OLDPLACE. If foo.ado is nowhere
to be found, Stata will generate an unrecognized command error.
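To see the search in action, the sketch below asks Stata where it finds an official ado-file and what happens when nothing matches. It uses sureg.ado, the official file mentioned above, and assumes that no command named foo is installed on your system.

```stata
* Sketch: where does Stata find a command?
which sureg               // reports the path of the ado-file that would run
findfile sureg.ado        // searches the ado-path; result in r(fn)
display "`r(fn)'"

capture foo               // foo is assumed not to exist
display _rc               // nonzero: unrecognized command
```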
This search hierarchy indicates that you can locate an ado-file in one of
several places. In the next section, I discuss how you might choose to
organize ado-files, as well as do-files and data files related to your
research project.
It is crucially important that you place ado-files on the ado-path. You can
place them in your CWD ([3] above in the ado-path listing), but that is
generally a bad idea because if you work in any other directory, those ado-
files will not be found. If the ado-files are your own or have been written by
a coworker, place them in PERSONAL. If you download ado-files from the
SSC archive, please heed the advice that you should always use Stata—not a
web browser—to perform the download and locate the files in the correct
directory (in PLUS).
What about your do-files, data files, and log files? It makes sense to
create a directory, or folder, in your home directory for each separate
project and to store all project-related files in that directory. You can
always fully qualify a data file when you use it in a do-file, but if you move
that do-file to another computer the do-file will fail to find the data file.
Referencing files in the same directory simplifies making a copy of that
directory for a coworker or collaborator and makes it possible to run the
do-files from an external drive, such as a flash disk, or from a shared
storage location, such as Dropbox or Google Drive.
You might also have several projects that depend on one or two key data
files. Rather than duplicating possibly large data files in each project
directory, you can refer to them with a relative file specification. Say that
your research directory is d:/data/research with subdirectories
d:/data/research/project1 and d:/data/research/project2. Place
the key data file master.dta in the research directory, and refer to it in
the project directories with use ../master, clear. The double dot
indicates that the file is to be found in the parent (enclosing) directory,
while allowing you to move your research directory to a different drive (or
even to a Mac OS X or a Linux computer) without having to alter the use
statement.
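The relative file specification can be tried with the following sketch. It builds a small research/project1 directory pair inside Stata's temporary directory rather than on d:, and it uses the auto dataset as a stand-in for master.dta; both choices are assumptions made so that the example runs anywhere.

```stata
* Sketch: a shared data file referenced from a project subdirectory
local start "`c(pwd)'"
cd "`c(tmpdir)'"
capture mkdir research
cd research
capture mkdir project1

sysuse auto, clear
save master, replace       // the key data file lives in research/

cd project1                // work inside the project directory
use ../master, clear       // ".." refers to the parent directory
describe, short

cd "`start'"
```

Because the use statement never names a drive or an absolute path, the research directory can be copied elsewhere and the do-file still runs.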
The Do-file Editor has an advantage over most external editors: it allows
you to execute only a section of the file by selecting those lines and hitting
the Do icon. You should recognize that do- and ado-files are merely text
files with file types of .do or .ado rather than .txt. As such, it is a very
poor idea to edit them in a word processor, such as Microsoft Word. A
word processor must read the text file and convert it into its own binary
format; when the file is saved, it must reverse the process.6 Furthermore, a
word processor will usually present the file in a variable-width character
format, which is harder to read. But the biggest objection to word
processing a do-file or ado-file is the waste of your time: it is considerably
faster to edit a file in the Do-file Editor and execute it immediately without
the need to translate it back into text.
You can use other text editors (but not word processors) to edit do-files.
However, Stata’s Do-file Editor supports syntax highlighting, automatic
indentation, line numbering, bookmarks, and collapsible nodes, so there are
few features provided by external editors that are not readily available in
the Do-file Editor.7
Stata, as a programming language, supports more data types than do many
statistical packages. The major distinction to consider is between numeric
and string data types. Data-management tasks often involve conversions
between numeric and string variables. For instance, data read from a text
file (such as a .csv or tab-delimited file created by a spreadsheet) are
often considered to be a string variable by Stata even though most of the
contents are numeric. The commands destring and tostring (see
[D] destring for both) are helpful in this regard, as are encode and decode
(see [D] encode for both).
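A minimal sketch of these conversions, using a small made-up dataset (the variable names and values are hypothetical):

```stata
* Sketch: converting between string and numeric variables
clear
input str8 price_str str6 region
"1200" "north"
"950"  "south"
"n/a"  "north"
end

* String to numeric: force turns nonnumeric entries such as "n/a" into missing
destring price_str, generate(price) force

* Categorical string to labeled integer, and back again
encode region, generate(region_id)
decode region_id, generate(region_txt)

* Numeric to string
tostring price, generate(price_txt)

list
```

Use destring when the strings are really numbers and encode when they are category names; applying encode to numeric-looking strings would assign arbitrary integer codes rather than the numbers themselves.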
Stata’s numeric data types include byte, int, long, float, and double.
The byte, int, and long data types can only hold integer contents. See
table 2.1 for a summary of the numeric data types.
Table 2.1. Numeric data types
Storage type   Bytes
byte           1
int            2
long           4
float          4
double         8
The long integer data type can hold all signed nine-digit integers but
only some ten-digit integers. Integers are held in their exact representation
by Stata so that you can store a nine-digit integer (such as a U.S. Social
Security number) as a long data type. However, lengthy identification
numbers also can be stored as a double data type or as a string variable.
That will often be a wise choice, because then you need not worry about
possible truncation of values. You also will find it useful to use string
variables when a particular identification code could contain characters.
For instance, the CUSIP (Committee on Uniform Security Identification
Procedures) code used to identify U.S. security issues used to be wholly
numeric but now may contain one or more nonnumeric characters. Storing
these values as strings avoids later problems with numeric missing values.
Further details of this issue can be found in [U] 12.2.2 Numeric storage
types, Gould (2006b), and Cox (2006b). The implications are clear: an if
condition that tests some floating-point value for equality, such as if diff
== 0.01, is likely to fail when you expect that it would succeed.8 A float
contains approximately 7 significant digits in its mantissa. This implies that
if you read a set of nine-digit U.S. Social Security numbers into a float,
they will not be held exactly. A double contains approximately 15
significant digits. We know that residuals computed from a linear regression
using regress and predict eps, residual should sum to exactly 0. In
Stata’s finite-precision computer arithmetic using the default float data
type, residuals from such a regression will sum to a small nonzero value
rather than exactly 0.0. Thus, discussions of the predict ([R] predict)
command often advise using predict double eps, residual to
compute more accurate residuals.
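The points above can be checked directly. The following sketch uses auto.dta for the residuals and a one-observation dataset for the precision tests. The ID value and the variable names are invented for illustration.

```stata
* Residuals stored as float (the default) and as double
sysuse auto, clear
regress price mpg weight
predict eps, residual
predict double eps_d, residual
summarize eps eps_d              // compare the two means, both near zero

* A float holds about 7 significant digits, so a nine-digit ID is not exact
clear
set obs 1
generate float  id_f = 123456789
generate long   id_l = 123456789
generate double id_d = 123456789
format id_f id_l id_d %12.0f
list id_f id_l id_d              // id_f differs from the other two

* Exact comparison of a float variable with a decimal constant
generate float diff = 0.01
count if diff == 0.01                 // 0: the constant is held as a double
count if diff == float(0.01)          // 1: both sides rounded to float
count if reldif(diff, 0.01) < 1e-6    // 1: approximate equality
```

The tolerance 1e-6 is an arbitrary choice for this sketch; a suitable value depends on the scale and storage type of the data.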
What are the implications of finite-precision arithmetic for Stata
programming?

• You should store ID numbers with many digits as string variables, not
  as integers, floats, or doubles.
• You should not rely on exact tests of a floating-point value against a
  constant, not even 0. The reldif() function ([FN] Mathematical
  functions) can be used to test for approximate equality.
• As suggested above, use double floating-point values for any
  generated series where a loss of precision might be problematic, such
  as residuals, predicted values, scores, and the like.
• You should be wary of variables’ values having very different scales,
  particularly when a nonlinear estimation method is used. Any
  regression of price from the venerable auto.dta reference dataset
  on a set of regressors will display extremely large sums of squares in
  the analysis of variance table. Scaling price from dollars to
  thousands of dollars obviates this problem. The scale of this variable
  does not affect the precision of linear regression, but it could be
  problematic for nonlinear estimation techniques.
• Use integer data types where it is appropriate to do so. Storing values
  as byte or int data types when feasible saves disk space and
  memory.
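A short sketch of the last two recommendations, using auto.dta. The new variable names price_k and highmpg, and the cutoff of 25 miles per gallon, are hypothetical choices for this illustration.

```stata
sysuse auto, clear

* Rescale price from dollars to thousands of dollars
generate double price_k = price / 1000
regress price   mpg weight       // large sums of squares in the ANOVA table
regress price_k mpg weight       // same fit, sums of squares in more modest units

* An indicator needs only one byte of storage
generate byte highmpg = mpg > 25
describe price_k highmpg
```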
Stata does not have a separate data type for calendar dates. Dates are
represented, as they are in a spreadsheet program, by numbers known as %t
values measuring the time interval from a reference date or “epoch”. For
example, the epoch for Stata and for SAS is midnight on 1 January 1960.
Days following that date have positive integer values, while days prior to it
have negative integer values. Dates represented in days are known as %td
values. Other calendar frequencies are represented by the number of weeks,
months, quarters, or half-years since that reference date: %tw, %tm, %tq, and
%th values, respectively. The year is represented as a %ty value, ranging
from AD 100 to AD 9999. You can also use consecutive integers and the
generic form, as %tg.
When working with variables containing dates and times, you must
ensure that the proper Stata data type is used for their storage. Weekly and
lower-frequency values (including generic values) can be stored as data
type int or as data type float. Daily (%td) values should be stored as data
type long or as data type float. If the int data type is used, dates more
than 32,740 days from 1 January 1960 (that is, beyond 21 August 2049)
cannot be stored.
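The date arithmetic described above can be verified with a few display commands. This is only an illustrative sketch; the variable name day is an assumption.

```stata
* Day 0 is the epoch; earlier dates are negative
display td(01jan1960)            // 0
display td(31dec1959)            // -1
display %td 0                    // 01jan1960

* The largest value an int can hold is 32,740
display td(21aug2049)            // 32740
display %td 32740                // 21aug2049

* Other frequencies count periods since the start of 1960
display tm(1960m1)               // 0
display tq(1961q1)               // 4
display %tm 12                   // 1961m1

* A long can hold daily dates beyond the int limit
clear
set obs 3
generate long day = td(21aug2049) + _n - 1
format day %td
list
```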
More stringent requirements apply to clock (date-and-time) values.
These values must be stored as data type double to avoid overflow
conditions. Clock values, like other time values, are integers, and there are
86,400,000 milliseconds in a day. The double data type is capable of
precisely storing date-and-time measurements within the range of years
defined in Stata (AD 100–9999).
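The need for double storage can be seen in a small sketch. The timestamp and the variable names (stamp, when, when_f) are made up for this example.

```stata
display %15.0f cofd(1)           // 86400000 milliseconds in one day

clear
input str20 stamp
"08nov2006 14:30:00"
end

generate double when   = clock(stamp, "DMYhms")
generate float  when_f = clock(stamp, "DMYhms")   // too few digits: value is rounded
format when when_f %tc
list
```

The float copy no longer displays the original time of day, which is why clock values should always be created with generate double.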
If you are preparing to move data from a spreadsheet into Stata with the
import delimited ([D] import delimited) command, make sure that any
date variables in the spreadsheet display as four-digit years. It is possible to
deal with two-digit years (for example, 11/08/06) in Stata, but it is easier
to format the dates with four-digit years (for example, 11/08/2006) before
reading those data into Stata.
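If two-digit years cannot be avoided, one option is to state the century in the mask of the date() function. The lines below are a sketch using the example date from the text.

```stata
display %td date("11/08/2006", "MDY")     // 08nov2006
display %td date("11/08/06", "MD20Y")     // 08nov2006: two-digit year read as 20xx
display date("11/08/06", "MDY")           // . (no century given, so missing)
```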
Stata provides the time-series operators L., F., D., and S., which allow the
specification of lags, leads (forward values), differences, and seasonal
differences, respectively.10 The time-series operators make it unnecessary
to create a new variable to use a lag, difference, or lead. When combined
with a numlist, they allow the specification of a set of these constructs in a
single expression.
Consider the lag operator, L., which when added to the beginning of a
variable name refers to the (first-)lagged value of that variable: L.x. A
number can follow the operator: L4.x refers to the fourth lag of x. More
generally, a numlist can be used: L(1/4).x refers to the first through fourth
lags of x, and L(1/4).(x y z) defines a list of four lagged values of each
of the variables x, y, and z. Similarly to the lag operator, the lead operator,
F., allows the specification of future values of one or more variables.
Strictly speaking, the lead operator is unnecessary because a lead is a
negative lag, and an expression such as L(-4/4).x will work, labeling the
negative lags as leads.
You can also combine the time-series operators: LD.x is the lag of the
first difference of x (that is, $x_{t-1} - x_{t-2}$) and refers to the same
expression as DL.x. Any of the above expressions can be used almost anywhere
that a varlist is required.
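A minimal sketch of these operators on an artificial series follows. The variables t and x and the generated names are hypothetical.

```stata
clear
set obs 8
generate t = _n
generate x = t^2
tsset t

generate lag1  = L.x        // x at t-1
generate lead1 = F.x        // x at t+1
generate d1    = D.x        // x(t) - x(t-1)
generate ld    = LD.x       // x(t-1) - x(t-2)
generate dl    = DL.x       // same expression as LD.x

list t x lag1 lead1 d1 ld dl

* Operators with a numlist can be used directly in a varlist
list t L(0/2).x
```

Creating the variables is done here only to display them; in practice L.x or D.x can be typed directly in an estimation command.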
The time-series operators also provide an important benefit in the
context of longitudinal or panel datasets ([XT] xt), where each observation,
$x_{i,t}$, is identified with both an $i$ and a $t$ subscript. If those data
are xtset ([XT] xtset) or tsset ([TS] tsset), using the time-series operators
will ensure that references will not span panels. For instance, z[_n-1] in a
panel context will allow you to reference $z_{1,T}$ (the last observation of
panel 1) as the prior value of $z_{2,1}$ (the first observation of panel 2).
In contrast, L.z (or D.z) will never span panel boundaries. Panel data should
always be xtset or tsset, and any time-series references should use the
time-series operators.
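The difference is easy to see in a tiny two-panel dataset. The data and variable names below are invented for illustration.

```stata
clear
input id t z
1 1 10
1 2 20
2 1 30
2 2 40
end
xtset id t

generate z_prev_naive = z[_n-1]   // crosses from panel 1 into panel 2
generate z_prev       = L.z       // respects panel boundaries

list, sepby(id)
```

In the third observation (the first of panel 2), z_prev_naive picks up the value 20 from panel 1, whereas z_prev is missing.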
is equivalent to
You can force inclusion of the base level or exclusion of other levels by
specifying variations on the i. operator. For instance, if we wanted to
specify 2 as the base level for gender, we could do this using the notation
ib2.gender. Or, we could include both values of gender in our model and
exclude the constant by specifying ibn.gender and the noconstant
option. For more information about selecting levels for indicator variables,
see [U] 11.4.3 Factor variables.
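A sketch of these variations on simulated data follows. The outcome y, the coding of gender as 1 and 2, and the coefficients used to generate the data are all assumptions made for illustration.

```stata
clear
set seed 2024
set obs 200
generate byte gender = 1 + (runiform() < 0.5)     // hypothetical coding: 1, 2
generate double y = 1 + 0.5*(gender == 2) + rnormal()

regress y i.gender                   // level 1 is the base (omitted) category
regress y ib2.gender                 // level 2 is the base category
regress y ibn.gender, noconstant     // both levels included, no constant
```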
For instance, we could include main effects for gender and region
and create an interaction between them in our model by typing
Or, we could include main effects and an interaction between region and
income by typing
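The commands for these two specifications are not shown above. One way to write them, assuming variables named y, gender, region, and income (simulated here, with arbitrary coefficients), is sketched below using the # and ## interaction operators and the c. prefix for continuous variables.

```stata
clear
set seed 2024
set obs 400
generate byte gender = 1 + (runiform() < 0.5)
generate byte region = 1 + floor(4*runiform())
generate double income = exp(rnormal(10, 0.5))
generate double y = 0.5*(gender == 2) + 0.2*region + 0.00002*income + rnormal()

* Main effects of gender and region plus their interaction
regress y i.gender i.region i.gender#i.region
regress y i.gender##i.region             // shorthand for the same model

* Main effects of region and continuous income plus their interaction
regress y i.region c.income i.region#c.income
regress y i.region##c.income             // shorthand for the same model
```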
For example, suppose you are analyzing the bpress dataset containing
blood pressure measurements of individuals categorized by gender (with
indicator variable sex, 0 = male) and one of three agegrp codes (1 = 30–45,
2 = 46–59, 3 = 60+).
You can then easily obtain marginal effects for each of the covariates:
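The bpress dataset is not reproduced here. As a stand-in, the sketch below uses bplong.dta, which ships with Stata and contains variables bp, sex, and agegrp. Treating it as equivalent to bpress, and the choice of model, are assumptions of this example.

```stata
sysuse bplong, clear

* Blood pressure on indicator sets for sex and age group
regress bp i.sex i.agegrp

* Marginal effects of each covariate
margins, dydx(*)

* Predictive margins for each level of the factors
margins sex agegrp
```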
2.5 Handling errors: The capture command
When an error is encountered in an interactive Stata session, it is displayed
on the screen. When a do-file is being executed, however, Stata’s default
behavior causes the do-file to abort when an error occurs.11 There are
circumstances when a Stata error should be ignored, for example, when
calculating a measure from each by-group that can be computed only if there
are more than 10 observations in the by-group.
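A minimal sketch of capture follows, using auto.dta. The variable name nosuchvar is deliberately nonexistent, and the regression inside the loop is only an example of a command that may or may not succeed in every group.

```stata
sysuse auto, clear

* capture suppresses the error and stores the return code in _rc
capture confirm variable nosuchvar
display _rc                          // 111: variable not found

* Run a command for each group; report groups where it fails and carry on
levelsof rep78, local(levels)
foreach r of local levels {
    capture noisily regress price mpg weight if rep78 == `r'
    local rc = _rc
    if `rc' != 0 {
        display as text "rep78 = `r': skipped (return code `rc')"
    }
}
```

Because the do-file continues after a captured error, it is good practice to inspect _rc immediately, before another command overwrites it.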
Several Stata commands replace the data in memory with a new dataset. For
instance, the collapse ([D] collapse) command makes a dataset of
summary statistics, whereas contract ([D] contract) makes a dataset of
frequencies or percentages. In a program, you may want to invoke one of
these commands, but you may want to retain the existing contents of memory
for further use in the do-file. You need the preserve and restore (see
[P] preserve for both) commands, which will allow you to set aside the
current contents of memory in a temporary file and bring them back when
needed. For example,
We use and modify auto.dta, and then preserve the modified file. The
collapse command creates a dataset with one observation for each value
of rep78, the by() variable. We sort that dataset of summary statistics and
save it.
The restore command brings the preserved dataset back into memory.
We sort by rep78 and use merge to combine the individual automobile
data in memory with the summary statistics from repstats.dta. Although
these computations could have been performed without collapse,12 the
convenience of that command is clear. The ability to set the current dataset
aside (without having to explicitly save it) and bring it back into memory
when needed is a useful feature.
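The steps described above can be sketched as follows. The particular modification (a log price variable) and the summary statistics collapsed are assumptions; only the sequence of commands and the filename repstats.dta follow the description.

```stata
sysuse auto, clear
generate lprice = log(price)         // an assumed modification of the data
preserve                             // set the current data aside

collapse (mean) mean_price=price mean_mpg=mpg, by(rep78)
sort rep78
save repstats, replace               // one observation per value of rep78

restore                              // bring the automobile data back
sort rep78
merge m:1 rep78 using repstats
tabulate _merge
```

After the merge, every automobile carries the mean price and mileage of its rep78 group alongside its own values.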
This section details data input and manipulation issues.13 Source data can
be downloaded from a website, acquired in spreadsheet format, or made
available in the format of some other statistical package. The following two
subsections deal with those variations.
Before carrying out statistical analysis with Stata, many researchers must
face several thorny issues in converting their foreign data into a Stata-usable
form. These issues range from the mundane (for example, a text-file dataset
may have coded missing values as 99) to the challenging (for example, a
text-file dataset may be in a hierarchical format, with master records and
detail records). Although a brief guide to these issues cannot possibly cover
all the ways in which external data can be organized and transformed for
use in Stata, several rules apply:
• Familiarize yourself with the various Stata commands for data input.
  Each has its use, and in the spirit of “do not pound nails with a
  screwdriver”, data handling is much simpler if you use the correct tool.
  Reading [U] 21 Entering and importing data is well worth the
  investment of your time.
• When you need to manipulate a text file, use a text editor, not a word
  processor or a spreadsheet.
• Get the data into Stata as early in the process as you can, and perform
  all manipulations via well-documented do-files that can be edited and
  re-executed if need be (or if a similar dataset is encountered). Given
  this exhortation, I will not discuss input ([D] input) or the Data
  Editor, which allows interactive entry of data, or various copy-and-
  paste strategies involving simultaneous use of a spreadsheet and Stata.
  Such strategies are not reproducible and should be avoided.
• Keeping track of multiple steps of the data input and manipulation
  process requires good documentation. If you ever need to replicate or
  audit the data manipulation process, you will regret it if your
  documentation did not receive the proper attention.
• Working with anything but a simple rectangular data array will almost
  always require the use of append ([D] append), merge ([D] merge),
  or reshape ([D] reshape). You should review the information about
  those commands in chapter 5 and understand their capabilities.
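As an example of one of the mundane issues mentioned earlier, a numeric missing-value code such as 99 can be converted inside a do-file with mvdecode. The tiny dataset below is invented and is typed in with input only to keep the sketch self-contained.

```stata
clear
input id income
1 250
2 99
3 410
end

* Treat the code 99 as a missing value
mvdecode income, mv(99)
list
```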
Handling text files
Text files, often described as ASCII files, are the most common source of
raw data in microeconomic research. Text files can have any file extension:
they can be labeled .raw or .csv (as Stata would prefer) or .txt or
.asc. A text file is just that: text. Word processing programs, like Microsoft
Word or OpenOffice, are inappropriate tools for working with text files
because they have their own native, binary format and generally use features
such as proportional spacing, which causes columns to be misaligned. A
word processor uses a considerable amount of computing power to translate
a text file into its own native format before it can display it on the screen.
The inverse of that translation must be used to create a text file that can be
subsequently read by Stata.
Stata does not read binary files other than those in its own .dta
format.14 The second rule above counsels the use of a text editor rather than
a word processor or spreadsheet when manipulating text files. Every
operating system supports a variety of text editors, many of which are freely
available.15 You will find that a good text editor is much faster than a word
processor when scrolling through a large data file. Many text editors
colorize Stata commands, making them useful for Stata program
development. Text editors are also extremely useful when working with
large survey datasets that are accompanied with machine-readable
codebooks, often many megabytes in size. Searching those codebooks for
particular keywords with a powerful text editor is efficient.
Text files can be free format or fixed format. A free-format file contains
several fields per record, separated by delimiters, characters that are not to
be found within the fields. A purely numeric file (or one with simple string
variables, such as U.S. state codes) can be whitespace-delimited; that is,
successive fields in the record are separated by one or more whitespace
characters:
The columns in the file need not be aligned. These data can be read from a
text file (by default with extension .raw) with Stata’s infile ([D] infile
(free format)) command, which must assign names (and if necessary, data
types) to the variables:
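The listing that belongs here is not shown. As a stand-in, the following sketch writes a small whitespace-delimited file and reads it back. The filename states_demo.raw, the variable names, and the values are all hypothetical.

```stata
* Write a small whitespace-delimited file; columns need not be aligned
capture file close fh
file open fh using "states_demo.raw", write text replace
file write fh "AK 12.34  0.09 262000" _n
file write fh "MA 27.1 0.31   75000" _n
file write fh "NH  9.80 0.12 48000" _n
file close fh

* Name the variables and declare the first one as a two-character string
infile str2 state members prop potential using states_demo.raw, clear
list
```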
Here we must indicate that the first variable is a string variable of maximum
length two characters (str2); otherwise, every record will generate an
error that says state cannot be read as a number.
However, this scheme will break down as soon as we hit New Hampshire.
The space within the state name will be taken as a delimiter, and Stata will
become quite befuddled. If string variables with embedded spaces are to be
used in a space-delimited file, they themselves must be delimited (usually
with quotation marks in the text file):
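A sketch of such a file and the command to read it is given below. The filename, variable names, and values are again invented for illustration.

```stata
* Each state name is enclosed in double quotes inside the text file
capture file close fh
file open fh using "states_quoted.raw", write text replace
file write fh `""New Hampshire" 9.80 48000"' _n
file write fh `""Alaska" 12.34 262000"' _n
file close fh

infile str20 state members potential using states_quoted.raw, clear
list
```

The compound double quotes in the file write commands are needed only because this sketch creates the file from within Stata; a file prepared in a text editor would simply contain the quoted names.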
So what should you do if your text file is space-delimited and contains
string variables with embedded spaces? That is a difficult question because
no mechanical transformation will generally solve this problem. For
instance, using a text editor to change multiple spaces to a single space and
then to change each single space to a tab character will not help, because it
will then place a tab between New and Hampshire.
If the data are downloadable from a web page that offers formatting
choices, you should choose a tab-delimited rather than a space-delimited
format. The other option, comma-delimited text or comma-separated values
(.csv), has its own difficulties. Consider field contents (without quotation
marks) such as “College Station, TX”, “J. Arthur Jones, Jr.”, “F. Lee Bailey,
Esq.”, or “T. Frank Kennedy, S.J.”. If every city name is followed by a
comma, then no problem, because the city and state can then be read as
separate variables. But if some are written without commas (“Brighton
MA”), the problem returns. In any case, parsing proper names with
embedded commas is problematic. Tab-delimited text avoids most of these
problems.
If we are to read tab-delimited text files, the infile ([D] infile (free
format)) command is no longer the right tool for the job; we should now use
import delimited ([D] import delimited).16 The import delimited
command reads a tab-delimited or comma-delimited (.csv) text file
regardless of whether a spreadsheet program was involved in its creation.
For instance, most database programs contain an option to generate a tab-
delimited or comma-delimited export file, and many datasets available for
web download are in one of these formats.
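A minimal sketch of reading a tab-delimited file follows. The filename cities_demo.txt, the variable names, and the figures are made up. Note that a comma or space inside a field causes no trouble, and an empty field between two tabs is read as missing.

```stata
capture file close fh
file open fh using "cities_demo.txt", write text replace
file write fh "city" _tab "pop" _tab "area" _n
file write fh "College Station, TX" _tab "120000" _tab "51.2" _n
file write fh "Brighton MA" _tab _tab "3.1" _n
file write fh "Austin, TX" _tab "960000" _tab "320.5" _n
file close fh

* First row holds variable names; fields are separated by tabs
import delimited using "cities_demo.txt", delimiters("\t") varnames(1) clear
list
```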
usually denoting the omission of one or more fields on that record. Such an
omission can also trigger one or more error messages. For instance, leaving
out a numeric field on a particular record will move an adjacent string field
into that variable. Stata will then complain that it cannot read the string as a
number. A distinct advantage of the tab- and comma-delimited formats is
that missing values can be coded with two successive delimiters. As will be
discussed in chapter 5, assert ([D] assert) can be used advantageously to
ensure that reasonable values appear in the data.
You can also use the import excel ([D] import excel) command to
read the contents of an Excel or Excel-compatible worksheet, in either .xls
or .xlsx format, into Stata. You can specify the worksheet to be input and
optionally provide a cell range from which data are to be read. The first
row can be used to provide Stata variable names. Note that if a column of
the worksheet contains at least one cell with nonnumerical text (such as NA),
the entire column is imported as a string variable. Therefore, you should be
familiar with Stata’s string-to-numeric conversion capabilities, as discussed
in section 3.2.5.
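To keep the following sketch self-contained, it first creates a small workbook with export excel and then reads it back. The filename auto_demo.xlsx and the cell range are assumptions chosen for this example.

```stata
* Create a small workbook from the first ten automobiles
sysuse auto, clear
export excel make price mpg using "auto_demo.xlsx" in 1/10, firstrow(variables) replace

* Read a named sheet and cell range, taking variable names from the first row
import excel using "auto_demo.xlsx", sheet("Sheet1") cellrange(A1:C11) firstrow clear
describe
```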
Two caveats regarding dates: First, both Excel and Stata work with the
notion that calendar dates are successive integers from an arbitrary starting
point. To read the dates into a Stata date variable, they must be formatted
with a four-digit year, preferably in a format with delimiters (for example,
12/6/2004 or 6-Dec-2004). It is much easier to make these changes in the
spreadsheet program before reading the data into Stata. Second, Mac OS X
users of Excel should note that Excel’s default is the 1904 date system. If the
spreadsheet was produced in Excel for Windows and the steps above are
used to create a new sheet with the desired data, the dates will be off by
four years (the difference between Excel for Mac and Excel for Windows
defaults). Uncheck the preference Use the 1904 date system before saving
the file as text.
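The four-year discrepancy can be seen in the arithmetic of the two date systems. The sketch below assumes the commonly documented conversions, in which an Excel serial number is turned into a Stata date by adding td(30dec1899) under the 1900 system or td(01jan1904) under the 1904 system. The serial number 38327 is chosen to correspond to 6 December 2004 under the 1900 system.

```stata
display td(30dec1899)                    // -21916
display td(01jan1904) - td(30dec1899)    // 1462 days, about four years

clear
input excel_serial
38327
end
generate long d1900 = excel_serial + td(30dec1899)
generate long d1904 = excel_serial + td(01jan1904)
format d1900 d1904 %td
list
```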
The codebook specifies the column in which each variable starts (LOC)
and the number of columns it spans (WIDTH).20 In this fragment of the
codebook, only integer numeric variables appear. The missing-data codes
(MD) for each variable are also specified. The listing above provides the full
codebook details for variable 13, marital status, quoting the question posed
by the interviewer, coding of the six possible responses, and the frequency
counts of each response.
Alternatively, we could set up a dictionary file for the fixed-format
version of infile ([D] infile (fixed format)). This is the more powerful
option because it allows us to attach variable labels and specify value
labels. However, rather than specifying the column range of each field that
you want to read, you must indicate where it starts and its field width, given
as the %infmt for that variable. With a codebook like the one displayed
above, we have the field widths available. We could also calculate the field
widths from the starting and ending column numbers. We must not only
specify which are string variables but also give their data storage type. The
storage type could differ from the %infmt for that variable. You might read
a six-character code into a ten-character field knowing that other data use
the latter width for that variable.
The _column() directives in this dictionary are used where dictionary
fields are not adjacent. You could skip back and forth along the input record
because the columns read need not be in ascending order. But then we could
achieve the same thing with the order ([D] order) command after data
input. We are able to define variable labels by using infile.
In both examples above, the dictionary file specifies the name of the
data file, which need not be the same as that of the dictionary file. For
example, highway.dct could read highway.raw, and if that were the case,
the latter filename need not be specified. But we might want to use the same
dictionary to read more than one .raw file. To do so, leave the filename out
of the dictionary file, and use the using modifier to specify the name of the
.raw file. After loading the data, we can describe ([D] describe) its
contents:
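The dictionaries discussed above are not shown. The following is a minimal sketch of the approach under stated assumptions: the filenames sales_demo.raw and sales_demo.dct, the variable names, the column positions, and the data are all hypothetical. The dictionary omits the data filename so that it can be reused, and the sales field is read with an implied two-decimal format.

```stata
* Fixed-format data: year in columns 1-4, code in 6-11, sales in 13-18
capture file close fh
file open fh using "sales_demo.raw", write text replace
file write fh "2005 AB1234 123456" _n
file write fh "2006 CD5678 098765" _n
file close fh

* Dictionary: start column, storage type, name, input format, variable label
file open fh using "sales_demo.dct", write text replace
file write fh "dictionary {" _n
file write fh `"  _column(1)  int    year   %4f    "Calendar year""' _n
file write fh `"  _column(6)  str10  code   %6s    "Product code""' _n
file write fh `"  _column(13) double sales  %6.2f  "Daily sales in dollars""' _n
file write fh "}" _n
file close fh

* Name the data file at read time with the using() option
infile using sales_demo.dct, using(sales_demo.raw) clear
describe
list
```

Here code is stored as str10 although only six characters are read, which mirrors the case of a storage type that differs from the input format.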
The dictionary indicates that value labels are associated with the variables,
but it does not define those labels. Commands such as
%6.2f, a value such as 123456 can be read properly as daily sales of
$1,234.56.
The previous section discussed how foreign data files could be brought into
Stata. Often, the foreign data are already in the format of some other
statistical package or application. For instance, several economic- and
financial-data providers make SAS-formatted datasets readily available,
while socioeconomic datasets are often provided in SPSS format. The most
straightforward and inexpensive way to deal with these package formats
involves the third-party application Stat/Transfer, a product of Circle
Systems, Inc. Stat/Transfer has the advantage of a comarketing relationship
with StataCorp, so you can acquire a copy of Stat/Transfer at an
advantageous price from StataCorp.
flexible set of data types. Stata, like the C language in which its core code is
written, offers five numeric data types ([D] data types): the integer types
byte, int, and long, and the floating-point types float and double. Stata
also offers the string types str1-str2045 and strL. Most other packages
do not have this broad array of data types and instead resort to storing all
numeric data in a single data type: “Raw data come in many different forms,
but SAS simplifies this. In SAS there are just two data types: numeric and
character” (Delwiche and Slaughter 1998, 4). This simplicity bears a
sizable cost because an indicator variable requires only one byte of storage
and a double-precision floating-point variable requires eight bytes to hold
up to 15 decimal digits of accuracy.
Stata allows the user to specify the data type based on the contents of
each variable, which can result in considerable savings of both disk space
and execution time when reading or writing those variables to disk.
Stat/Transfer can be instructed to optimize a target Stata-format file in the
transfer process, or you can use Stata’s compress ([D] compress) command
to automatically perform that optimization. In any case, you should always
take advantage of this optimization because it will reduce the size of files
and require less of your computer’s memory to work with them.
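The effect of compress can be seen in a short sketch with auto.dta, in which two variables are first stored wastefully as doubles.

```stata
sysuse auto, clear
recast double price mpg          // deliberately wasteful storage
describe price mpg

compress                         // demote each variable to the smallest safe type
describe price mpg
display "`: type price'  `: type mpg'"
```

Because compress changes a storage type only when no information is lost, it is safe to run before saving any dataset.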
I must also mention an alternative solution for data transfer between
databases supporting some flavor of structured query language (SQL). Stata
can perform Open Database Connectivity (ODBC) operations with databases
accessible via that protocol (see [D] odbc for details). Because most SQL
databases and non-SQL data structures, such as Excel and Microsoft Access,
support ODBC, this is often suggested as a workable solution to dealing with
foreign data. It does require that the computer system on which you are
running Stata is equipped with ODBC drivers. These are installed by default
on Windows systems with Microsoft Office but may require the purchase of
a third-party product for Mac OS X or Linux systems. If the necessary
database connectivity is available, Stata’s odbc is a full-featured solution.
It allows for both the query of external databases and the insertion or update
of records in those databases.
As you move away from interactive use of Stata and make greater use of do-
files and ado-files in your research, the style of the contents of those files
becomes more important. One of the reasons for using do-files is the audit
trail that they provide. Are your do-files readable and comprehensible—not
only today but also in several months? To highlight the importance of good
programming style practices, I present an edited excerpt from Nicholas
J. Cox’s excellent essay “Suggestions on Stata programming style”22
(Cox 2005f). The rest of this section is quoted from that essay.
People who program a great deal know this: setting rules for yourself
and then obeying them ultimately yields better programs and saves time.
They also know that programmers may differ in style and even argue
passionately about many matters of style, both large and small. In this
morass of varying standards and tastes, I suggest one overriding rule: Set
and obey programming style rules for yourself. Moreover, obey each of the
rules I suggest unless you can make a case that your own rule is as good or
bij het stadhuis, waar hij de remonstrantie aan den burgemeester ter
bewaring overgaf.
Een uur later verliet Lodewijk zijne vrienden, onder voorwendsel
van zich naar huis te begeven; doch het was om vol droefheid en
eenzaam door de stad te dwalen; het was om zich geheel over te
geven aan de smart, die deze schriktooneelen hem veroorzaakten.
Wanhopig en buiten zich zelven, stapte hij langzaam door de straten
en scheen zich bijna niet meer te bekreunen over hetgeen er
gebeurde. Een gevoel van schaamte belette hem, zich naar
Godmaerts woning te begeven. Zou hij zeggen, dat dit alles onder
zijne oogen geschied was, zonder dat hij iets had kunnen doen om
het te beletten?
Nu de stormers door de onmacht der regeering van straffeloosheid
verzekerd waren, gingen zij voort met alles in de stad aan stukken te
houwen. Geen beeldje lieten zij op poort of muur ongeschonden
staan. En wanneer de vreedzame burger zich tegen hun geweld
wilde verzetten, werd hij door deze booswichten wreedelijk
mishandeld en met smaadwoorden bejegend. Een oneindig getal
inwoners, die over de gevolgen dezer goddeloosheid en vernieling
verschrikten, vielen van de zijde der hervormers af.
De zon had zich van wolken ontdaan. Heerlijk en prachtig zond zij
hare stralen boven de puinhoopen, die overal op de openbare
plaatsen bijeengezameld waren. Afwisselende scharen van
ontelbare menschen stroomden met blij gejuich door de stad.
“Heil! Heil!” schreeuwden zij, alsof eene razende vreugde hen dol
had gemaakt. Bijlen, ladders, koorden en meer ander werktuig
werden door hen zegepralend rondgedragen. Wanneer zij, aldus
loopende, op den gevel van eenig gebouw nog een beeld, hoe hoog
het ook ware, bemerkten, klommen zij, door het grauw toegejuicht,
naar boven, en het beeld viel dan onder het geroep: Heil! Heil!
kletterend en verbrijzeld op den grond.
Alle winkels waren gesloten, alle kerken beroofd, de gevels van
alle huizen en openbare gebouwen geschonden. Puinhoopen van
kostelijk marmer belemmerden de kruisstraten. Het scheen, dat de
Antwerpenaren, door uitzinnigheid verblind, hunne huizen niet meer
bewonen wilden en hunne eigene stad met hardnekkigheid
vernielden.
Van deze gruweldaden geschiedden er vele op de markten en in
de straten, waar Lodewijk voorbijging. Zoo zag hij voor de St.
Jakobskerk eenen grooten hoop beelden, kruisen en vele andere
gewijde zaken in een groot vuur, dat de stormers aangestoken
hadden, tot assche verbranden.
Op den namiddag ging hij voorbij het Minderbroedersklooster,
alwaar men bezig was met plunderen. De broeders en priesters
werden met spotternij en mishandeling verjaagd en vervolgd. Dit
ziende, verschrikte Lodewijk hevig, daar hij aan pater Franciscus
dacht, dan eerst ontwaakte hij uit de radeloosheid, welke hem dien
ganschen dag tot een gevoelloos mensch gemaakt had. Hij hief het
hoofd op; een nieuw vuur blikkerde in zijne oogen, en hij wendde
zich met haastige stappen naar de Veemarkt, om pater Franciscus te
gaan vinden en hem van mishandeling te bevrijden, indien het
mogelijk ware.
Dáár komende, vond hij voor het Predikheerenklooster eenen
ontelbaren hoop beeldenstormers, die hem den doorgang beletten.
Met veel moeite, na lang drukken en stooten, geraakte hij eindelijk
binnen in het klooster, dat met booswichten en dieven was vervuld.
Hij zag hen om de zilveren kandelaren vechten, hoorde de
schandelijkste vloeken tegen de welfsels bonzen, en vond den refter
vol dronken menschen, die in onzedige liedekens en lasterende
spotternijen zich vermaakten.
Lodewijk ging dwars door deze goddeloozen en gaf geene acht op
hunne scherts; hij klom de trap op, om zich naar de cel van pater
Franciscus te begeven, en kwam weldra op het eerste verdiep, waar
hij weinig volk aantrof.
De cellen stonden open, alles was binnen deze doodstil; eenige
deuren waren aan stukken geslagen als een teeken der
balddadigheden, die men hier gepleegd had. Reeds klopte het hart
van den jongeling langzamer; zijn hoofd viel met moedeloosheid
voorover, en er was weinig hoop meer in hem, alhoewel hij nog
voortstapte door den gang, wanneer hij op eens eenige stemmen
van verre zegepralend hoorde roepen:
“Hier hebben wij nog eenen paap! Werpt hem op de straat, dien
hond!”
Lodewijk sprong vooruit, smeet drie of vier mannen van de celdeur
weg en deed eenen stap in het kleine vertrek, terwijl de verbaasde
stormers elkander met ondervragende blikken bezagen.
Pater Franciscus lag, zoo lang hij was, met het aangezicht tegen
den grond voor een kruisbeeld uitgestrekt, zijne zilveren haren
raakten van wederzijden den vloer. Van tijd tot tijd deed hij eene
beweging als om de handen hemelwaarts te heffen, en eenige
vurige woorden, die zijnen mond ontsnapten, getuigden, dat hij bezig
was met bidden.
Er ontstond in den geest van Lodewijk eene gedachte om al de
spotters, die aan de deur stonden, te dooden; hij kon dit doen, want
zij waren weinig in getal en niet gewapend; maar hij verliet welhaast
dit inzicht en wierp zich geknield nevens pater Franciscus, wiens
eene hand hij in de zijne nam. Dan sprak hij:
“Vader, hier ben ik, uw beminde zoon Lodewijk. Ik kom u redden.”
De priester rechtte zich op de knieën, bezag Lodewijk met eenen
dankbaren blik en antwoordde, terwijl hij de oogen op het Christus-
beeld gericht hield:
“Lodewijk, mijn goede zoon, ik dank u om uwe genegenheid: maar
ik zal u niet volgen. Hier, in deze cel, wil ik sterven, indien God over
mijn leven heeft beschikt. Laat mij bidden, stoor mij niet. Ik wil de
wereld verlaten met den naam des Heeren op mijnen mond. Ga
heen, denk niet aan mij.”
Lodewijk sloeg als verdwaald zijne twee armen om het hoofd des
priesters, tranen borsten uit zijne oogen, en hij snikte:
“Gij sterven! Gij, mijn goede vader! O, Geertruid zou mij
vermaledijden, indien ik u hier liet! Kom aan, de goddeloozen zullen
u mishandelen; zij zullen u vermoorden.... Het is nog tijd.... Ik zal u
verdedigen of sterven met u.”
“Lodewijk, mijn brave zoon, wees bedaard.... Zie, de kroon des
marteldoods wordt mij aangeboden; zou ik die weigeren? De Heer
heeft mij zeventig jaren gegund, ik ben niet ondankbaar.”
De jongeling plaatste zijne hand op den mond des priesters.
“Uwe woorden zijn heilig,” riep hij, “maar zij branden op mijn hart
als vuur! O, zie mijne tranen, denk aan Geertruid, aan Godmaert. Gij
alleen kunt ons troosten: uw dood zou uwen vriend Godmaert het
leven kosten; want nu durf ik het zeggen, en gij weet het, hij zou deel
hebben in den moord; uw bloed zou op zijn hoofd terugvallen,... hij
heeft uwe vijanden opgestookt.... Zult gij wreed genoeg zijn, o goede
vader, om hem die eeuwige wroeging op den hals te laden, om uw
eigen bloed over hem te werpen, en zijne dochter haren vader te
doen beschuldigen? Neen, niet waar, gij gaat met mij? Gij zijt te
edelmoedig, te goed om uwen evennaaste, uwen vriend, dit ongeluk
aan te doen!”
Gedurende deze woorden had Lodewijk den priester met geweld
doen rechtstaan, en trok nu als zinneloos aan zijne hand om hem uit
de cel te doen gaan.
“Ik zal u volgen,” sprak eindelijk de pater, “maar luister wel op
deze woorden, mijn zoon; want ik wil, dat gij ze volbrenget als een
onverbrekelijk bevel.... Misschien zal men u en mij bespotten en
mishandelen; gij zult lijden met mij, zonder gemor, zonder
tegenweer.... Wat er ook gebeuren moge, al ware het dat men mij
het leven name, zoo is mijn wil, dat gij niets doet om mij te
verdedigen of te wreken,... ik verbied het u. Zult gij daartoe moeds
genoeg hebben?”
“Ja, ja, vader, kom; ik zal alles verdragen.”
Zij gingen dan ter celdeur uit, onder de smaadwoorden dergenen,
die zich in den gang bevonden, en kwamen weldra in den refter,
waar zij door eenen hoop dronken mannen moesten gaan. Dezen
hieven een verward gejuich aan, zoodra zij den priester zagen.
“Een paap! Een paap!” werd er geschreeuwd.
In a moment Father Franciscus was surrounded by the wicked rabble;
all manner of abuse was snarled at him: one pulled at the hood of
his habit, another spat beer in his face; yet the priest, his eyes
cast down, walked slowly on and seemed insensible to all these
outrages; his habit was torn to shreds, and beer dripped from his
venerable head.
Lodewijk’s face was terrible. One could read plainly upon it what
lion-like fury consumed him: the whites of his eyes showed above
and below, his teeth were clenched, and without knowing it he
squeezed the priest’s hands almost to crushing. Nevertheless he
remembered the command he had received and gave no sign that he
meant to offer resistance.
After much ill-treatment they finally reached the Veemarkt, but
here their situation grew even worse. A countless multitude
followed them; many came to speak the most loathsome words, the
bloodiest blasphemies, into the priest’s ears; others threw mud and
filth, so that Father Franciscus’s silver hair was shamefully
smeared with sand and mire. Already Lodewijk had more than once
begged and cried out:
“Oh, father, let me kill them, or my veins will burst! I cannot....
cannot keep still any longer. For God’s sake, let me avenge you and
die!”
But the priest answered:
“How beautiful it is, Lodewijk, to suffer because one is faithful
to one’s God. Think of the Christian heroes of ancient times: they
were tortured, burned, crushed, but amid the boiling oil, under the
claws of the lions, not a single complaint, not a single vengeful
word came from their holy mouths; they only raised their hands to
God to ask forgiveness for their executioners. Let us follow their
example, my son; perhaps today we shall appear before the Lord with
the shining crown of martyrdom!”
At the corner of the Zwartzusterstraat, by the Koepoort, stood a
half-built house, beside which lay a heap of broken slates.
Lodewijk had gone barely a few steps past it when he heard a piece
of slate whistle past his ear. Soon more slates flew toward them,
until at last one struck Father Franciscus’s bare forehead and
dealt him a wide wound.... Lodewijk saw the blood stream over his
face....
Now he knew no more caution; now he forgot the father’s command
and, without looking back at him, ran to the man he had seen throw
the slate and drove his sword through his body with such force that
it came out at the back; he looked around for more victims, but all
the mockers had run off to a fair distance.
Meanwhile Father Franciscus had fallen down in the street; the blow
of the cutting slate had struck him so cruelly that he had sunk
powerless to the ground.
Lodewijk approached him with an anguished cry and, half lifting
him, dragged him against the wall of a house, where he let him sink
down into a sitting position. Meanwhile the ruffians had drawn near
again with greater fury and threw more and more stones, slates, and
filth.
Full of despair, at his wits’ end and not knowing what to do to
save the priest, Lodewijk crouched down in front of him and so
covered him with his own body. Stones flew unceasingly against his
limbs, and many a cry of pain escaped him. Perhaps he would have
remained long in this position, but part of the rabble came to
throw from another side, so that they often hit the priest. The
latter, roused from his faintness, tried by force to make Lodewijk
leave him.
“Let me die,” he said, “let me be a martyr; do not expose yourself
for me any longer.... I will pray for you in heaven. Come, my good,
my brave son, give me a farewell kiss....”
But Lodewijk did not answer; all his attention was fixed on the
flying stones; all his care was to shield the priest’s body with
his arms or shoulders, as with a shield. Then at last the number of
their enemies grew so great that Lodewijk could no longer protect
the priest. He threw both arms around Father Franciscus’s neck and
clung tightly to his breast.
“There is the kiss you asked for, father,” he cried, “but it is no
farewell kiss.... No, let us die together for our God. Oh, I too
shall be a martyr.... How beautiful is that certainty!....”
His voice failed, and he hid his head against Father Franciscus’s
bosom.
Surely he would have let himself be killed in this position; but a
heavy stone, which struck Father Franciscus’s body, drew a loud cry
from his breast. Lodewijk tore himself loose, sprang up in
bewilderment and peered through a hail of stones into the streets
to see whether any help was to be had. All at once he saw, far off
in the Koepoortstraat, some men he knew approaching.
An expression of joy passed over his face, and he shouted as with a
supernatural voice:
“Wolfangh! Wolfangh!”
And then he covered the priest with his body again.
At the name of Wolfangh the stones seemed to stick fast in the
throwers’ hands; they stared questioningly at one another and
looked around to see whether they would truly behold the man who
bore the universally dreaded name of Wolfangh.
Soon some ten men came up to Lodewijk: they were his friends, whom
he had left at the town hall.
“Wolfangh! Schuermans!” cried Lodewijk, moving away from in front
of Father Franciscus, “see, this is how they treat the best of all
men, a seventy-year-old priest!”
“Ha!” cried Wolfangh as if with joy, “there are wickeder men than
I! The blood of the murderers shall flow!”
Then he cast a pitying glance at Father Franciscus and a measuring
glance at those who had mistreated him: he took a dagger in each
hand and drew his head between his shoulders.... a bellow came from
his breast as from the throat of a wild bull.... and, like a
battering ram, he hurled himself forward....
Before Schuermans and the others could follow him, many a villain
already lay writhing in his blood; and after a moment not a single
person was to be seen in all the adjoining streets. Only in the
distance could one hear the cry “Wolfangh! Wolfangh!” raised as a
terrifying call.
Then Wolfangh came back to Father Franciscus; he looked with deep
indignation on the priest’s noble face, now made unrecognizable
under a mask of mud and blood; but, after staring for a while as if
stunned at this scene, he left Lodewijk and his friends and ran to
the door of the house opposite. Despite his knocking and calling,
no one opened.
Wolfangh flared up in rage; in desperation he wrenched the iron
door knocker crooked, but suddenly his indomitable spirit regained
the upper hand: a moment later he stood before the door with a
bluestone doorsill that he had fetched from the unfinished house.
Lock and bolt flew off.... The door fell with a crash.
Shortly afterward Wolfangh came running out of the house; in one
hand he held a bowl of water and in the other some linen cloths. He
knelt down beside the priest, washed his head and face, and
bandaged his wound with such skill that one might have taken him
for a surgeon.
Now one could see what a terrible change had come over Father
Franciscus. The blood he had lost had taken all his strength; his
sunken face was more than pale, it was ashen and translucent; his
lips were of the same color as the murderous slates that lay around
him. And yet there shone on the priest’s face a heavenly expression
of submission to the will of the Lord, a smile like that of the
angels.
Lodewijk likewise knelt beside Father Franciscus and helped
Wolfangh bandage the wound. It was mostly on Lodewijk that the
priest kept his fading eye fixed.
“Oh, you will be saved, good father,” said the young man tenderly,
“your wound will heal. You will be able to be our guardian angel
for a long time yet.”
“Lodewijk, my dear son,” sighed the priest, “the Lord has disposed
of me. He has granted me the crown of martyrdom. I shall die. Not
from the wound you are bandaging; but a stone, the last one, has
crushed my chest. I feel it in my body: my soul is struggling to
tear itself free; it longs to rise heavenward.... but do not weep
for me; my lot is too beautiful.”
To these words Lodewijk made no reply; he only stared with a fixed
gaze at the priest’s face.
“You love me dearly, then?” said Father Franciscus, pressing
Lodewijk’s hand.
Those words made the tears stream like brooks from the young man’s
eyes.
“Oh yes, you love me dearly!” the priest repeated. “I will pray for
you, Lodewijk.”
Now Father Franciscus was carefully lifted by Wolfangh and
Schuermans, supported with every precaution and led slowly toward
the Keizerstraat, while Van Halen and Lodewijk’s other friends held
themselves ready to take the life of the first mocker.
They finally arrived at Godmaert’s house and were let in by
Theresia.
X
Gloria in altissimis Deo, et in terra pax hominibus bonae
voluntatis.
Luc. Cap. ii. v. 14.
Glory to God in the highest, and peace on earth to men of good
will.
• You pay a royalty fee of 20% of the gross profits you derive from
the use of Project Gutenberg™ works calculated using the
method you already use to calculate your applicable taxes. The
fee is owed to the owner of the Project Gutenberg™ trademark,
but he has agreed to donate royalties under this paragraph to
the Project Gutenberg Literary Archive Foundation. Royalty
payments must be paid within 60 days following each date on
which you prepare (or are legally required to prepare) your
periodic tax returns. Royalty payments should be clearly marked
as such and sent to the Project Gutenberg Literary Archive
Foundation at the address specified in Section 4, “Information