Bash, The Bourne-Again Shell
Chet Ramey
Case Western Reserve University
chet@po.cwru.edu
ABSTRACT
1. Introduction
Bash is the shell, or command language interpreter, that will appear in the GNU operating system.
The name is an acronym for the ‘‘Bourne-Again SHell’’, a pun on Steve Bourne, the author of the direct
ancestor of the current UNIX shell /bin/sh, which appeared in the Seventh Edition Bell Labs Research
version of UNIX [1].
Bash is an sh-compatible shell that incorporates useful features from the Korn shell (ksh) [2] and the
C shell (csh) [3], described later in this article. It is ultimately intended to be a conformant implementation
of the IEEE POSIX Shell and Tools specification (IEEE Working Group 1003.2). It offers functional
improvements over sh for both interactive and programming use.
While the GNU operating system will most likely include a version of the Berkeley shell csh, bash
will be the default shell. Like other GNU software, bash is quite portable. It currently runs on nearly every
version of UNIX and a few other operating systems: an independently-supported port exists for OS/2, and
there are rumors of ports to DOS and Windows NT. Ports to UNIX-like systems such as QNX and Minix
are part of the distribution.
The original author of bash was Brian Fox, an employee of the Free Software Foundation. The cur-
rent developer and maintainer is Chet Ramey, a volunteer who works at Case Western Reserve University.
2. What is a shell?
At its base, a shell is simply a macro processor that executes commands. A UNIX shell is both a
command interpreter, which provides the user interface to the rich set of UNIX utilities, and a programming
language, allowing these utilities to be combined. The shell reads commands either from a terminal or a
file. Files containing commands can be created, and become commands themselves. These new commands
have the same status as system commands in directories like /bin, allowing users or groups to establish
custom environments.
(history, getopts, kill, or pwd, for example) to obtain via separate utilities. Shells may be used
interactively or non-interactively: they accept input typed from the keyboard or from a file.
meaning. Words containing these special characters are replaced with a sorted list of matching pathnames.
If a word generates no matches, it is left unchanged.
Quoting is used to remove the special meaning of characters or words. It can disable special treat-
ment for shell operators or other special characters, prevent reserved words from being recognized as such,
and inhibit variable expansion. The shell has three quoting mechanisms: a backslash preserves the literal
value of the next character, a pair of single quotes preserves the literal value of each character between the
quotes, and a pair of double quotes preserves the literal meaning of enclosed characters while allowing
some expansions.
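The three quoting mechanisms can be compared side by side. The following is a minimal sketch; the
variable names are illustrative and do not come from the paper.

```bash
# Illustrative variable; the name is an assumption, not from the paper.
name=world

# Backslash: preserves the literal value of the next character only.
backslash=\$name

# Single quotes: every enclosed character is literal, so nothing is expanded.
single='$name and `date`'

# Double quotes: enclosed characters are literal, but $-expansion still occurs.
double="hello, $name"

printf '%s\n' "$backslash" "$single" "$double"
```

Only the double-quoted string has the variable's value substituted; the other two keep the dollar sign
as ordinary text.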
Some of the commands built into the shell are part of the programming language. The break and
continue commands control loop execution as in the C language. The eval builtin allows a string to be
parsed and executed as a command. Wait tells the shell to pause until the processes specified as arguments
have exited.
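A short sketch of these four builtins working together follows. The loop values and variable names
are made up for illustration.

```bash
# Loop control: continue skips the value 2, break stops the loop at 4.
seen=""
for i in 1 2 3 4 5; do
    if [ "$i" -eq 2 ]; then continue; fi
    if [ "$i" -eq 4 ]; then break; fi
    seen="$seen$i"
done
echo "seen=$seen"

# eval: the string is parsed and executed as if it had been typed as a command.
cmd='greeting="built at run time"'
eval "$cmd"
echo "$greeting"

# wait: pause until the background process given as an argument has exited,
# then pick up its exit status.
( exit 3 ) &
pid=$!
status=0
wait "$pid" || status=$?
echo "background job exited with status $status"
```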
• Aspects of the shell’s syntax and command language. A number of special builtins such as cd and
exec are being specified as part of the shell, since their functionality usually cannot be implemented
by a separate executable;
• A set of utilities to be called by shell scripts and applications. Examples are programs like sed, tr,
and awk. Utilities commonly implemented as shell builtins are described in this section, such as test
and kill. An expansion of this section’s scope, termed the User Portability Extension, or UPE, has
standardized interactive programs such as vi and mailx;
• A group of functional interfaces to services provided by the shell, such as the traditional system() C
library function. There are functions to perform shell word expansions, perform filename expansion
(globbing), obtain values of POSIX.2 system configuration variables, retrieve values of environment
variables (getenv()), and other services;
• A suite of ‘‘development’’ utilities such as c89 (the POSIX.2 version of cc), and yacc.
Bash is concerned with the aspects of the shell’s behavior defined by POSIX.2. The shell command
language has of course been standardized, including the basic flow control and program execution
constructs, I/O redirection and pipelining, argument handling, variable expansion, and quoting. The special
builtins, which must be implemented as part of the shell to provide the desired functionality, are specified
as being part of the shell; examples of these are eval and export. Other utilities appear in the sections of
POSIX.2 not devoted to the shell which are commonly (and in some cases must be) implemented as builtin
commands, such as read and test.
POSIX.2 also specifies aspects of the shell’s interactive behavior as part of the UPE, including job
control, command line editing, and history. Interestingly enough, only vi-style line editing commands have
been standardized; emacs editing commands were left out due to objections.
There were certain areas in which POSIX.2 felt standardization was necessary, but no existing
implementation provided the proper behavior. The working group invented and standardized functionality in
these areas. The command builtin was invented so that shell functions could be written to replace builtins;
it makes the capabilities of the builtin available to the function. The reserved word ‘‘!’’ was added to
negate the return value of a command or pipeline; it was nearly impossible to express ‘‘if not x’’ cleanly
using the sh language. There exist multiple incompatible implementations of the test builtin, which tests
files for type and other attributes and performs arithmetic and string comparisons. POSIX considered none
of these correct, so the standard behavior was specified in terms of the number of arguments to the
command. POSIX.2 dictates exactly what will happen when four or fewer arguments are given to test, and
leaves the behavior undefined when more arguments are supplied. Bash uses the POSIX.2 algorithm,
which was conceived by David Korn.
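The three inventions described above can be sketched briefly. The wrapper function and the strings
being matched are illustrative assumptions, not examples from the standard.

```bash
# A shell function that replaces the cd builtin. Inside the function,
# `command cd` reaches the real builtin; a bare `cd` here would call the
# function itself recursively.
cd() {
    command cd "$@" && echo "now in $PWD"
}
cd /

# The `!` reserved word negates the exit status of a pipeline, which gives
# a direct way to write "if not x".
negated=no
if ! echo abc | grep xyz >/dev/null; then
    negated=yes
fi
echo "negated=$negated"

# test is specified by argument count. With one argument the result is true
# when the argument is not empty, even if it looks like an operator.
one_arg=false
test "-f" && one_arg=true

# With three arguments and a binary operator in the middle, the two outer
# arguments are compared.
three_args=false
test "a" = "a" && three_args=true
echo "one_arg=$one_arg three_args=$three_args"
```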
While POSIX.2 includes much of what the shell has traditionally provided, some important things
have been omitted as being ‘‘beyond its scope.’’ There is, for instance, no mention of a difference between
a login shell and any other interactive shell (since POSIX.2 does not specify a login program). No fixed
startup files are defined, either; the standard does not mention .profile.
4. Shell Comparison
This section compares features of bash, sh, and ksh (the three shells closest to POSIX compliance).
Since ksh and bash are supersets of sh, the features common to all three are covered first. Some of the
features bash and ksh contain which are not in sh will be discussed. Next, features unique to bash will be
listed. The first three sections provide a progressively more detailed overview of bash. Finally, features of
ksh-88 (the currently-available version) not in sh or bash will be presented.
Variables may be declared as integer, which causes arithmetic evaluation to be performed on the value
whenever they are assigned to.
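As a minimal sketch of the integer attribute in bash (ksh spells the same declaration typeset -i); the
variable names are illustrative:

```bash
# declare -i gives the variable the integer attribute.
declare -i total
total=2+3*4        # the right-hand side is evaluated as an arithmetic expression
echo "$total"

# Without the attribute, the same assignment stores an ordinary string.
plain=2+3*4
echo "$plain"
```

The first echo prints the computed value, and the second prints the expression text unchanged.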
There are new expansions to obtain the length of a variable’s value and to remove substrings
matching specified patterns from the beginning and end of variable values. A new form of command substitution,
$(list), is much easier to nest than `list` and has simplified quoting rules.
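These expansions can be tried on a single value. The pathname below is hypothetical and chosen only
because it contains several slashes and dots.

```bash
path=/tmp/archive.tar.gz    # hypothetical pathname, used only as sample data

length=${#path}             # length of the value
no_lead=${path#*/}          # shortest match of */ removed from the beginning
base=${path##*/}            # longest match of */ removed from the beginning
dir=${path%/*}              # shortest match of /* removed from the end
stem=${path%%.*}            # longest match of .* removed from the end
printf '%s\n' "$length" "$no_lead" "$base" "$dir" "$stem"

# Nesting with $(...) needs no extra escaping.
new_style=$(basename "$(dirname "$path")")

# The backquote form requires the inner backquotes to be escaped.
old_style=`basename \`dirname "$path"\``
printf '%s\n' "$new_style" "$old_style"
```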
There are new variables to control the shell’s behavior, and additional variables set or interpreted spe-
cially by the shell. RANDOM and SECONDS are dynamic variables: their values are generated afresh
each time they are referenced. RANDOM returns a different random number each time it is referenced,
and SECONDS returns the number of seconds since the shell was started or the variable was assigned to,
plus any value assigned. PWD and OLDPWD are set to the current and previous working directories,
respectively. TMOUT controls how long the shell will wait at a prompt for input. If TMOUT is set to a
value greater than zero, the shell exits after waiting that many seconds for input. REPLY is the default
variable for the read builtin; if no variable names are supplied as arguments, the line read is assigned to
REPLY.
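A brief sketch of three of these variables follows. The starting value assigned to SECONDS and the
input line are arbitrary choices for illustration.

```bash
# RANDOM is generated afresh on every reference.
first=$RANDOM
second=$RANDOM
echo "two draws: $first $second"

# Assigning to SECONDS restarts the count from the assigned value.
SECONDS=100
sleep 1
elapsed=$SECONDS
echo "elapsed: $elapsed"

# read with no variable names stores the whole line in REPLY.
read <<EOF
a line of input
EOF
echo "REPLY holds: $REPLY"
```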
Various modes tell what a command word is (reserved word, alias, function, builtin, or file) or which
version of a command will be executed based on a user’s search path. Some of this functionality has been
adopted by POSIX.2 and folded into the command utility.
There is a readline command to re-read the file, so users can edit the file, change some bindings, and begin
to use them almost immediately.
Bash implements the bind builtin for more dynamic control of readline than the startup file permits.
Bind is used in several ways. In list mode, it can display the current key bindings, list all the readline
editing directives available for binding, list which keys invoke a given directive, or output the current set of key
bindings in a format that can be incorporated directly into an inputrc file. In batch mode, it reads a series
of key bindings directly from a file and passes them to readline. In its most common usage, bind takes a
single string and passes it directly to readline, which interprets the line as if it had just been read from the
inputrc file. Both key bindings and variable assignments can appear in the string given to bind.
The readline library also provides an interface for word completion. When the completion character
(usually TAB) is typed, readline looks at the word currently being entered and computes the set of
filenames of which the current word is a valid prefix. If there is only one possible completion, the rest of the
characters are inserted directly; otherwise the common prefix of the set of filenames is added to the current
word. A second TAB character entered immediately after a non-unique completion causes readline to list
the possible completions; there is an option to have the list displayed immediately. Readline provides
hooks so that applications can provide specific types of completion before the default filename completion
is attempted. This is quite flexible, though it is not completely user-programmable. Bash, for example, can
complete filenames, command names (including aliases, builtins, shell reserved words, shell functions, and
executables found in the file system), shell variables, usernames, and hostnames. It uses a set of heuristics
that, while not perfect, is generally quite good at determining what type of completion to attempt.
4.3.4. History
Access to the list of commands previously entered (the command history) is provided jointly by bash
and the readline library. Bash provides variables (HISTFILE, HISTSIZE, and HISTCONTROL) and the
history and fc builtins to manipulate the history list. The value of HISTFILE specifies the file where bash
writes the command history on exit and reads it on startup. HISTSIZE is used to limit the number of
commands saved in the history. HISTCONTROL provides a crude form of control over which commands are
saved on the history list: a value of ignorespace means to not save commands which begin with a space; a
value of ignoredups means to not save commands identical to the last command saved. HISTCONTROL
was named history_control in earlier versions of bash; the old name is still accepted for backwards
compatibility. The history command can read or write files containing the history list and display the current
list contents. The fc builtin, adopted from POSIX.2 and the Korn Shell, allows display and re-execution,
with optional editing, of commands from the history list. The readline library offers a set of commands to
search the history list for a portion of the current input line or a string typed by the user. Finally, the
history library, generally incorporated directly into the readline library, implements a facility for history recall,
expansion, and re-execution of previous commands very similar to csh (‘‘bang history’’, so called because
the exclamation point introduces a history substitution):
$ echo a b c d e
a b c d e
$ !! f g h i
echo a b c d e f g h i
a b c d e f g h i
$ !-2
echo a b c d e
a b c d e
$ echo !-2:1-4
echo a b c d
a b c d
The command history is only saved when the shell is interactive, so it is not available for use by shell
scripts.
The string being assigned is surrounded by single quotes so that if it is exported, SHLVL will be updated
by a child shell:
chet@odin [21:13:35] src(2:638)$ export PS1
chet@odin [21:17:40] src(2:639)$ bash
chet@odin [21:17:46] src(3:696)$
The \$ escape is displayed as ‘‘$’’ when running as a normal user, but as ‘‘#’’ when running as root.
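The prompt shown above is consistent with an assignment along the following lines. This is a
reconstruction for illustration, not necessarily the exact string used by the author; it assumes the bash
prompt escapes \u, \h, \t, \W, \! and \$. The variables frozen and deferred are hypothetical names added
to show the difference the quotes make.

```bash
# Single quotes keep $SHLVL unexpanded at assignment time, so each shell that
# inherits PS1 substitutes its own nesting level when it prints the prompt.
PS1='\u@\h [\t] \W($SHLVL:\!)\$ '
export PS1

# The same text in double quotes is expanded once, at assignment time, so a
# child shell would keep showing the parent's level.
frozen="($SHLVL)"

# In single quotes the reference is stored literally and expanded later.
deferred='($SHLVL)'

printf '%s\n' "$frozen" "$deferred"
```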
4.4.3. Arrays
Arrays are an aspect of ksh that has no real bash equivalent. They are easy to create and manipulate:
an array is created automatically by using subscript assignment (name[index]=value), and any variable
may be referred to as an array. Ksh arrays, however, have several annoying limitations: they may be
indexed only up to 512 or 1024 elements, depending on how the shell is compiled, and there is only the
clumsy set -A to assign a list of values sequentially. Despite these limits, arrays are useful, if underutilized
by shell programmers.
4.4.5. Expansion
The ksh filename generation (globbing) facilities have been extended beyond their bash and sh
counterparts. In this area, ksh can be thought of as egrep to the bash grep. Ksh globbing offers things like
alternation, the ability to match zero or more instances of a pattern, and the ability to match exactly one
occurrence of any of a list of patterns.
4.4.7. History
Finally, the ksh history implementation differs slightly from bash. Each instance of bash keeps the
history list in memory and offers options to the history builtin to write the list to or read it from a named
file. Ksh keeps the history in a file, which it accesses each time a command is saved to or retrieved from
the history. Ksh history files may be shared among different concurrent instances of ksh, which could be a
benefit to the user.
5. Features in Bash-2.0
The next release of bash, 2.0, will be a major overhaul. It will include many new features, for both
programming and interactive use. Redundant existing functions will be removed. There are several cases
where bash treats a variable specially to enable functionality available another way ($nolinks vs. set -o
physical, for example); the special treatment of the variable name will be removed.
5.1. Arrays
Bash-2.0 will include arrays which are a superset of those in ksh, with the size limitations removed.
The declare, readonly, and export builtins will accept options to specify arrays, and the read builtin will
have an option to read a list of words and assign them directly to an array. There will also be a new array
compound assignment syntax available for assignment statements and the declare builtin. This new syntax
has the form name=(value1 ... valueN), where each value has the form [subscript]=string. Only the string
is required. If the optional brackets and subscript are included, that index is assigned to, otherwise the
index of the element assigned is the last index assigned to by the statement plus one. Indexing starts at
zero. The same syntax is accepted by declare. Individual array elements may be assigned to using the ksh
name[subscript]=value syntax.
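The assignment rules above can be sketched as follows. The array names and values are illustrative.
The -a option letters for declare and read, and the ${#name[@]} element count, are what released
versions of bash use; the paper itself does not name them.

```bash
# Compound assignment: a plain string takes the next index, and
# [subscript]=string assigns to an explicit index.
fruits=(apple banana [5]=cherry date)
count=${#fruits[@]}     # number of elements that are set

echo "${fruits[0]}"     # first element, since indexing starts at zero
echo "${fruits[5]}"     # the element given an explicit subscript
echo "${fruits[6]}"     # date landed at the last assigned index plus one

# Individual elements use the ksh-style assignment.
fruits[2]=elderberry

# declare accepts the same compound syntax.
declare -a primes=(2 3 5 7)

# read can split a line into words and assign them directly to an array.
read -a words <<EOF
one two three
EOF
echo "${#words[@]} words read"
```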
5.3. Builtins
Some of the existing builtins will change in bash-2.0. As previously noted, declare, export,
readonly, and read will accept new options to specify arrays. The jobs builtin will be able to list only stopped
or running jobs. The enable command will take a new -s option to restrict its actions to the POSIX.2
special builtins. Kill will be able to list signal numbers corresponding to individual signal names. The
readline library interface, bind, will have an option to remove the binding for any key sequence (which is not
the same as binding it to self-insert).
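A short sketch of three of these changes, as they appear in released versions of bash. The -r and -s
option letters for jobs are taken from those releases, since the paper does not name them, and the
background sleep is only there to give jobs something to list.

```bash
# kill -l with a signal name prints the corresponding signal number.
hup_number=$(kill -l HUP)
echo "SIGHUP is signal $hup_number"

# enable -s restricts the listing to the POSIX.2 special builtins.
enable -s

# jobs -r lists only running jobs; jobs -s lists only stopped jobs.
sleep 1 &
jobs -r
jobs -s
wait
```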
There will be two new builtin commands in bash-2.0. The disown command will remove jobs from
bash’s internal jobs table when job control is active. A disowned job will not be listed by the jobs
command, nor will its exit status be reported. Disowned jobs will not be sent a SIGHUP when an interactive
shell exits. Most of the shell’s optional or toggled functionality will be folded into the new shopt builtin.
Many of the variables which alter the shell’s behavior when set (regardless of their value) will be made
options settable with shopt. Examples of such variables include allow_null_glob_expansion,
glob_dot_filenames, and MAIL_WARNING.
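As a rough sketch of both builtins in a released bash: nullglob is the shopt option name that later
releases use for the behavior of allow_null_glob_expansion, a name the paper does not give. The
directory name is deliberately one that should not exist.

```bash
# With nullglob set, a pattern that matches nothing expands to nothing
# instead of being left unchanged.
shopt -s nullglob
matches=(/nonexistent-directory-for-example/*)
match_count=${#matches[@]}
echo "matches: $match_count"
shopt -u nullglob

# disown with no arguments removes the current job from the jobs table,
# so the jobs builtin no longer lists it.
sleep 1 &
disown
jobs
```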
5.5. Readline
Naturally, there will be improvements to readline as well. All of the POSIX.2 vi-mode editing com-
mands will be implemented; missing commands like ‘m’ to save the current cursor position (mark) and the
‘@’ command for macro expansion will be available. The ability to set the mark and exchange the current
cursor position (point) and mark will be added to the readline emacs mode as well. Since there are com-
mands to set the mark, commands to manipulate the region (the characters between the point and the mark)
will be available. Commands have been added to the readline emacs mode for more complete ksh compati-
bility, such as the C-]c character search command.
5.6. Configuration
Bash was the first GNU program to completely autoconfigure. Its autoconfiguration mechanism
predates autoconf, the current GNU configuration program, and needs updating. Bash-2.0 may include an
autoconf-based configuration script, if the necessary new functionality can be added to autoconf, or its
limitations bypassed.
5.7. Miscellaneous
The POSIX mode will be improved in bash-2.0; it will provide a more complete superset of the
POSIX standard. For the first time, bash will recognize the existence of the POSIX.2 special builtins.
A new trap value, DEBUG, will be present, as in ksh. Commands specified with a DEBUG trap will
be executed after every simple command. Since this makes shell script debuggers possible, I hope to
include a bash debugger in the bash-2.0 release.
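A minimal sketch of a DEBUG trap follows; the counter is a made-up stand-in for whatever a debugger
would do at each step. Current bash releases run the trap before each simple command rather than after
it, so the exact count depends on the version, and the example does not rely on it.

```bash
# Count how many times the DEBUG trap fires while three commands run.
count=0
trap 'count=$((count + 1))' DEBUG
true
true
true
trap - DEBUG    # remove the trap again
echo "DEBUG trap fired $count times"
```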
6. Availability
The current version of bash is available for anonymous FTP from prep.ai.mit.edu as
/pub/gnu/bash-1.14.2.tar.gz.
7. Conclusion
This paper has presented an overview of bash, compared its features with those of other shells, and
hinted at features in the next release, bash-2.0.
Bash is a solid replacement for sh. It is sufficiently portable to run on nearly every version of UNIX
from 4.3 BSD to SVR4.2, and several UNIX workalikes, and robust enough to replace sh on most of those
systems. It is very close to POSIX.2-conformant in POSIX mode, and is getting faster. It is not,
unfortunately, getting smaller, but there are many optional features. It is very easy to build a small subset to use as
a direct replacement for /bin/sh.
Bash has thousands of users worldwide, all of whom have helped to make it better. This is another
testament to the benefits of free software.
8. References
[1] S. R. Bourne, ‘‘UNIX Time-Sharing System: The UNIX Shell’’, Bell System Technical Journal, 57(6),
July-August, 1978, pp. 1971-1990.
[2] Morris Bolsky and David Korn, The KornShell Command and Programming Language, Prentice Hall,
1989.
[3] Bill Joy, An Introduction to the C Shell, UNIX User’s Supplementary Documents, University of Califor-
nia at Berkeley, 1986.
[4] IEEE, IEEE Standard for Information Technology -- Portable Operating System Interface (POSIX) Part
2: Shell and Utilities, 1992.
9. Author Information
Chet Ramey is a software engineer working at Case Western Reserve University. He has a B.S. in
Computer Engineering and an M.S. in Computer Science, both from CWRU. He has been working on bash
for six years, and has been the primary maintainer for one.