Journal of Statistical Software: Reviewer: Oliver Kirchkamp, Hiltrud Niggemann
Introduction
Stata is a statistical software package which allows solving a number of standard problems
with the help of simple and often straightforward commands. More involved problems may
need some programming: Non-standard estimators can often be implemented with the help
of small programs. Also, the systematic preparation of data and the efficient presentation of
results usually require writing a program. In these cases, some of Stata’s features, which are
a blessing for simple tasks, demand at least some creative rethinking. Stata’s macros, scalars,
matrices, views, and other features are quite different from the concepts found in many other
programming languages and often seem intimidating to the uninitiated. A person who wants
to use Stata (or any other statistical software package) does not only have to learn a number
of commands. Even more important is an understanding of an efficient workflow from data
management to statistical estimation.
There are books that specialize in some of the issues mentioned above. For example, Kohler and
Kreuter (2012) provide a general overview of the Stata software. StataCorp’s User’s Guide
(2015) describes the elements of Stata and explains how to write programs. Long (2009)
presents a book entirely dedicated to workflow with Stata. Baum’s An Introduction to Stata
Programming (2nd Edition) combines a concise introduction to the Stata programming
environment with a large number of applications to workflow, data management and estimation.
Overview
The book can be divided into four parts: A first part (Chapters 1 and 2) provides a general
discussion, motivation and introduction to the basic concepts of data handling and
programming with Stata. The second, and, in terms of pages, largest part (Chapters 3 to 10) presents
essential Stata commands that can be used in do-files (but also in ado-files). In a third part
(Chapters 11 and 12), ado-files and the syntax of self-written Stata commands are presented.
A fourth and last part (Chapters 13 and 14) introduces the Mata language. Throughout the
book, chapters with odd numbers give a first introduction to a concept. Chapters with even
numbers subsequently illustrate the concept with examples.
The first two chapters provide a general discussion of the key aspects of the Stata environment.
Chapter 1 highlights the importance of reproducibility. Using a program does not only help
to solve complex statistical problems. Following a program helps, in particular, to structure
and document even trivial tasks, thus making the analysis reproducible. Chapter 2 then
explains the basic concepts of how data are organized in Stata and how and where elements of
programs are stored.
Chapters 3 to 10 introduce the main elements of the Stata language as they are used in do-files.
Chapter 3 starts with the conventions for how Stata’s commands are structured, how parts of these
commands can be stored and manipulated in macros, how subsets of a dataset are selected,
how one data type can be converted into another data type, and how standard statistics can
be calculated conditionally and unconditionally. Chapter 4 then applies these concepts to
standard tasks.
Chapter 5 presents several commands to validate and reorganize, reshape and append data.
The chapter also explains how Stata functions return results, how these results can be
extracted, and how results can be stored. In line with an efficient workflow, this chapter presents
examples of how to translate estimation results into publication-ready tables. Chapter 6 again
presents several useful applications, in particular for merging and reshaping data, but also
for the generation of tables and graphs.
Chapter 7 discusses repetition in various forms. Stata offers a number of concepts to perform
the same task in a loop, or repeatedly for different conditions, for different values or with
different indicators. This chapter also presents commands that are useful in the context
of resampling. Chapter 8 presents applications where statistics are calculated for different
countries or individuals, or for a moving window. Furthermore, the chapter illustrates how
to collect data from several spreadsheets.
Chapter 9 deals with a restriction in Stata: There is only a single dataset. How can the
user organize data from repeated computations? How can the user compare different cases?
This chapter presents matrices and external files as one possible solution. Baum also gives
further advice on how to produce publication-ready tables and graphs. Chapter 10 provides
examples of how statistics for subsets of the data are stored in matrices and how one can display
estimation results in a graph if these estimation results are conditional on a variable. The
chapter also gives another example of the systematic generation of publication-ready tables
and explains how data can be extracted from Stata graphs.
Chapters 11 and 12 introduce the reader to writing their own programs, i.e., ado-files. Chapter 11
discusses the syntax to define these programs, their options and the variables they refer to.
The chapter also explains how to return results and how to write programs so that they follow
standard conventions to select subsets of the data and so that these programs can be executed
repeatedly for subsets of the data. Some Stata commands require user-written evaluator
programs. Baum gives examples of how to write such programs so that they can be used in
the context of maximum likelihood estimation, nonlinear least-squares, generalized method
of moments, or together with resampling commands. Furthermore, the chapter provides
some guidelines on programming style. Chapter 12 presents a number of examples which use
user-written programs.
Chapter 13 starts with a brief introduction to the fundamentals of the Mata language.
Thereafter, the author explains how to access and modify other Stata objects from within
Mata. He then guides the reader through applications where Mata programs, combined with
ado-files, are put to practical use. Chapter 14 presents a number of applications, including the
reorganization of data, more involved estimation problems, and improving the presentation
of results.
combination of several basic commands. More importantly, the author points out common
mistakes, dangers and pitfalls.
The book is neither designed to be used as a comprehensive reference for Stata commands,
nor as an exhaustive reference for econometric problems. The book can and should be used as
a creative inspiration for programming techniques for Stata users. The examples are realistic
but sometimes hard to grasp for readers not from economics and finance. Each example
serves as a useful illustration for a programming concept but, obviously, the examples cannot
exhaust all possible applications. Hence, most readers will read the book from cover to cover.
The focus on practical applications also implies that related programming strategies appear
and reappear in different chapters. Here, an overview could have helped the reader to find
where related concepts appear throughout the book. The book does have an index, but
this index becomes most helpful once a reader has read the entire volume and is now looking
for a specific item.
Conclusion
An Introduction to Stata Programming (2nd Edition) is a well-organized book. We find
it suitable for any Stata user at an intermediate or advanced level, a user who already
has some experience with Stata and who wants to go deeper into programming or who wants
to extend Stata’s built-in commands for estimation and data management. In particular, the
large number of practical examples, mostly taken from economics and finance, helps the reader
a great deal.
References
Long JS (2009). The Workflow of Data Analysis Using Stata. Stata Press.
Reviewer:
Oliver Kirchkamp
Friedrich-Schiller-Universität Jena
School of Economics
07737 Jena, Germany
E-mail: [email protected]
URL: https://www.kirchkamp.de/
Hiltrud Niggemann
Schlehendornweg 24
07751 Jena, Germany
E-mail: [email protected]
URL: https://www.p-wert.de/