Notes On Introduction To Unix
Table of Contents
1. Introduction
2. 100 Useful Unix Commands
3. Getting Started: Opening the Terminal
4. The Definitive Guides to Unix, Bash, and the Coreutils
5. The Unix Filestructure
6. The Great Trailing Slash Debate
7. Where Are You? - Your Path and How to Navigate through the Filesystem
1 of 87 25/10/2019, 10:59 AM
Oliver | An Introduction to Unix http://oliverelliott.org/article/computing/tut_unix/
41. sort
42. history
43. Piping in Unix
A word about terminology here: I'm in the habit of horrendously confusing and misusing all of the precisely
defined words "Unix", "Linux", "The Command Line", "The Terminal", "Shell Scripting", and "Bash." Properly
speaking, unix is an operating system while linux refers to a closely-related family of unix-based operating
systems, which includes commercial and non-commercial distributions [1]. (Unix was not free under its
developer, AT&T, which caused the unix-linux schism.) The command line, as Wikipedia says, is:
... a means of interacting with a computer program where the user issues commands to the program in the
form of successive lines of text (command lines) ... The interface is usually implemented with a command line
shell, which is a program that accepts commands as text input and converts commands to appropriate
operating system functions.
So what I mean when I proselytize for "unix", is simply that you learn how to punch commands in on the
command line. The terminal is your portal into this world. Here's what mine looks like:
There is a suite of commands to become familiar with—The GNU Core Utilities (wiki entry)—and, in the
course of learning them, you learn about computers. Unix is a foundational piece of a programming
education.
In terms of bang for the buck, it's also an excellent investment. You can gain powerful abilities by learning
just a little. My coworker was fresh out of his introductory CS course, when he was given a small task by our
boss. He wrote a full-fledged program, reading input streams and doing heavy parsing, and then sent an
email to the boss that began, "After 1.5 days of madly absorbing perl syntax, I completed the exercise..." He
didn't know how to use the command-line at the time, and now a print-out of that email hangs on his wall as
a joke—and as a monument to the power of the terminal.
You can find ringing endorsements for learning the command line from all corners of the internet. For
instance, in the excellent course Startup Engineering (Stanford/Coursera) Balaji Srinivasan writes:
A command line interface (CLI) is a way to control your computer by typing in commands rather than clicking
on buttons in a graphical user interface (GUI). Most computer users are only doing basic things like clicking
on links, watching movies, and playing video games, and GUIs are fine for such purposes.
But to do industrial strength programming - to analyze large datasets, ship a webapp, or build a software
startup - you will need an intimate familiarity with the CLI. Not only can many daily tasks be done more
quickly at the command line, many others can only be done at the command line, especially in non-Windows
environments. You can understand this from an information transmission perspective: while a standard
keyboard has 50+ keys that can be hit very precisely in quick succession, achieving the same speed in a GUI
is impossible as it would require rapidly moving a mouse cursor over a profusion of 50 buttons. It is for this
reason that expert computer users prefer command-line and keyboard-driven interfaces.
What do all of these have in common? All are hard to do in the GUI, but easy to do on the command line. It
would be remiss not to mention that unix is not for everything. In fact, it's not for a lot of things. Knowing
which language to use for what is usually a matter of common sense. However, when you start messing
about with computers in any serious capacity, you'll bump into unix very quickly—and that's why it's our
starting point.
Is this the truth, the whole truth, and nothing but the truth, so help my white ass? I believe it is, but also my
trajectory through the world of computing began with unix. So perhaps I instinctively want to push this on
other people: do it the way I did it. And, because it occupies a bunch of my neuronal real estate at the
moment, I could be considered brainwashed :-)
[1] Still confused about unix vs linux? Refer to the full family tree and these more precise definitions from Wikipedia:
unix: a family of multitasking, multiuser computer operating systems that derive from the original AT&T Unix, developed in the 1970s at the Bell
linux: a Unix-like and mostly POSIX-compliant computer operating system assembled under the model of free and open-source software
development and distribution [whose] defining component ... is the Linux kernel, an operating system kernel first released [in] 1991 by Linus
Torvalds
If you have a PC, abandon all hope, ye who enter here! Just kidding—partially. None of the native Windows
shells, such as cmd.exe or PowerShell, are unix-like. Instead, they're marked with hideous deformities that
betray their ignoble origin as grandchildren of the MS-DOS command interpreter. If you didn't have a
compelling reason until now to quit using PCs, here you are [1]. Typically, my misguided PC friends don't use
the command line on their local machines; instead, they have to ssh into some remote server running Linux.
(You can do this with an ssh client like PuTTY, Chrome's Terminal Emulator, or MobaXterm, but don't ask me
how.) On Macintosh you can start practicing on the command line right away without having to install a Linux
distribution (the Mac-flavored unix is called Darwin). [2]
For both Mac and PC users who want a bona fide Linux command line, one easy way to get it is in the cloud
with Amazon EC2 via the AWS Free Tier. If you want to go whole hog, you can download and install a Linux
distribution (Ubuntu, Mint, Fedora, and CentOS are popular choices), but this is asking a lot of
non-hardcore-nerds. Less drastically, you could boot Linux off of a USB drive or run it in a virtual machine.
[1] I should admit, you can and should get around this by downloading something like Cygwin, whose homepage states: "Get that Linux feeling -
on Windows"
[2] However, if you're using Mac OS rather than Linux, note that OS does not come with the GNU coreutils, which are the gold standard. You
The Definitive Guides to Unix, Bash, and the Coreutils
Before going any further, it's only fair to plug the authoritative guides which, unsurprisingly, can be found
right on your command line:
$ man bash
$ info coreutils
(The $ at the beginning of the line represents the terminal's prompt.) These are good references, but
too overwhelming to serve as a starting point. There are also great resources online:
The Linux Information Project
GNU Bash Reference Manual
The Linux Documentation Project
SS64 | Command line reference
Stack Overflow
Wikipedia
although these guides, too, are exponentially more useful once you have a small foundation to build on.
The root contains directories, which contain other directories, and so on, just like our tree. To get to any
particular file or directory, we need to specify the path, which is a slash-delimited address:
/dir1/dir2/dir3/some_file
Note that a full path always starts with the root, because the root contains everything. As we'll see below, this
won't necessarily be the case if we specify the address in a relative way, with respect to our current location
in the filesystem.
Let's examine the directory structure on our Macintosh. We'll go to the root directory and look down just one
level with the unix command tree. (If we tried to look at the whole thing, we'd print out every file and
directory on our computer!) We have:
While we're at it, let's also look at the directory /Users/username, which is the specially designated home
directory on the Macintosh:
One thing we notice right away is that the Desktop, which holds such a revered spot in the GUI, is just
another directory—simply the first one we see when we turn on our computer.
If you're on Linux rather than Mac OS, the directory tree might look less like the screenshot above and more
like this:
The naming of these folders is not always intuitive, but you can read about the role of each one at
thegeekstuff.com.
If the syntax of a unix path looks familiar, it is. A webpage's URL, with its telltale forward slashes, looks like a
unix path with a domain prepended to it. This is not a coincidence! For a simple static website, its structure
on the web is determined by its underlying directory structure on the server, so navigating to:
http://www.example.com/abc/xyz
will serve you content in the folder websitepath/abc/xyz on the host's computer (i.e., the one owned by
example.com). Modern dynamic websites are more sophisticated than this, but it's neat to reflect that the
whole world has learned this unix syntax without knowing it.
To learn more, see the O'Reilly discussion of the unix file structure.
The Great Trailing Slash Debate
Sometimes you'll see directories written with a trailing slash, as in:
dir1/
This helpfully reminds you that the entity is a directory rather than a file, but on the command line using the
more compact dir1 is sufficient. There are a handful of unix commands which behave slightly differently if
you leave the trailing slash on, but this sort of extreme pedantry isn't worth worrying about.
Where Are You? - Your Path and How to Navigate through the Filesystem
When you open up the terminal to browse through your filesystem, run a program, or do anything, you're
always somewhere. Where? You start out in the designated home directory when you open up the terminal.
The home directory's path is preset by a global variable called HOME. Again, it's /Users/username on a
Mac.
As we navigate through the filesystem, there are some conventions. The current working directory (cwd)—
whatever directory we happen to be in at the moment—is specified by a dot:
./
When a program is run in the cwd, you often see the syntax:
$ ./myprogram
which emphasizes that you're executing a program from the current directory. The directory one above the
cwd is specified by two dots:
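..
or, with the trailing slash:
../
The home directory has its own shorthand, the tilde:
~
or:
~/
A handful of commands cover the basics of getting around. To print the working directory, change directory, go home (cd with no argument), and make a new directory, use respectively:
$ pwd
$ cd /some/path
$ cd
$ mkdir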
As an example, suppose we're in our home directory, /Users/username, and want to go back one level to
/Users. We can do this two ways:
$ cd /Users
or:
$ cd ..
This illustrates the difference between an absolute path and a relative path. In the former case, we specify
the complete address, while in the latter we give the address with respect to our cwd. We could even
accomplish this with:
$ cd /Users/username/..
$ cd /Users/username/../username/..
if our primary goal were obfuscation. This distinction between the two ways to specify a path may seem
pedantic, but it's not. Many scripting errors are caused by programs expecting an absolute path and
receiving a relative one instead or vice versa. Use relative paths if you can because they're more portable: if
the whole directory structure gets moved, they'll still work.
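To make the distinction concrete, here is a minimal sketch run in a throwaway directory. The Users/username layout is recreated under a temporary folder (an assumption for illustration, so nothing on the real filesystem is touched), and both routes should land in the same place.

```shell
# Build a scratch copy of the Users/username layout (hypothetical names).
base=$(cd "$(mktemp -d)" && pwd)
mkdir -p "$base/Users/username"

# Route 1: a relative path, resolved with respect to the cwd.
cd "$base/Users/username"
cd ..
via_relative=$(pwd)

# Route 2: an absolute path, spelled out from the top.
cd "$base/Users/username"
cd "$base/Users"
via_absolute=$(pwd)

echo "relative: $via_relative"
echo "absolute: $via_absolute"
```

The relative route keeps working if the whole tree under $base is moved somewhere else, while the absolute one would have to be rewritten.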
Let's mess around. We know cd with no arguments takes us home, so try the following experiment:
What happened? We stayed in /some/path rather than returning to /Users/username. The point? There's
nothing magical about home—it's merely set by the variable HOME. More about variables soon!
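The commands for this experiment are not reproduced above, so what follows is a reconstruction under assumptions: a scratch directory stands in for /some/path, and the change to HOME is confined to a subshell so that the real home directory is left alone.

```shell
# A scratch directory plays the role of /some/path (hypothetical).
somepath=$(cd "$(mktemp -d)" && pwd)

# The $( ... ) runs in a subshell, so HOME is only changed inside it.
landed=$(
  HOME="$somepath"   # point HOME somewhere new
  cd                 # cd with no arguments goes to $HOME
  pwd
)

echo "cd took us to: $landed"
```

Outside the subshell, HOME still holds its usual value, which is a reasonably safe way to poke at a variable this important.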
Now that we've dipped one toe into the water, let's make a list of the 10 most important unix commands in
the universe:
1. pwd
2. ls
3. cd
4. mkdir
5. echo
6. cat
7. cp
8. mv
9. rm
10. man
Every command has a help or manual page, which can be summoned by typing man. To see more
information about pwd, for example, we enter:
$ man pwd
The man pages tend to give TMI (too much information) but the most important point is that commands have
flags which usually come in a one-dash-one-letter or two-dashes-one-word flavor:
command -f
command --flag
and the docs will tell us what each option does. You can even try:
$ man man
$ man woman
No manual entry for woman
Below we'll discuss the commands in the top 10 list in more depth.
ls
Let's go HOME and try out ls with various flags:
$ cd
$ ls
$ ls -1
$ ls -hl
$ ls -al
First, vanilla ls. We see our files—no surprises. And ls -1 merely displays our files in a column. To show the
human-readable, long form we stack the -h and -l flags:
ls -hl
This is equivalent to:
ls -h -l
Screenshot:
This lists the owner of the file; the group to which the owner belongs (staff); the date the file was last modified; and the
file size in human-readable form, which means bytes will be rounded to kilobytes, gigabytes, etc. The column
on the left shows permissions. If you'll indulge mild hyperbole, this simple command is already revealing
secrets that are well-hidden by the GUI and known only to unix users. In unix there are three spheres of
permission—user, group, and other/world—as well as three particular types for each sphere—read, write,
and execute. Everyone with an account on the computer is a unique user and, although you may not realize
it, can be part of various groups, such as a particular lab within a university or team in a company. To see
yourself and what groups you belong to, try:
$ whoami
$ groups
---------
rwx------
rwxrwx---
rwxrwxrwx
This means, respectively: no permission for anybody; read, write, execute permission for only the user; rwx
permission for the user and anyone in the group; and rwx permission for the user, group, and everybody
else. Permission is especially important in a shared computing environment. You should internalize now that
two of the most common errors in computing stem from the two P words we've already learned: paths and
permissions. The command chmod, which we'll learn later, governs permission.
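As a preview, the four permission strings above can be produced on a scratch file. This is only a sketch: the file name demo.txt and the helper function show are made up for the example, and the symbolic chmod syntax (u, g, o, a with + and -) is used here ahead of its proper introduction. The output has one extra leading character, the file type, which is a plain dash for an ordinary file.

```shell
# Work on a scratch file so no real permissions are changed.
cd "$(mktemp -d)"
touch demo.txt

# Print only the ten-character type-and-permission string from ls -l.
show() { ls -l demo.txt | cut -c1-10; }

chmod a-rwx demo.txt; show   # ----------  nobody may do anything
chmod u+rwx demo.txt; show   # -rwx------  the user only
chmod g+rwx demo.txt; show   # -rwxrwx---  the user and the group
chmod o+rwx demo.txt; show   # -rwxrwxrwx  everybody
```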
If you look at the screenshot above, you see a tenth letter prepended to the permission string, e.g.:
drwxrwxrwx
This has nothing to do with permissions and instead tells you about the type of entity in the directory: d
stands for directory, l stands for symbolic link, and a plain dash denotes a file.
ls -al
lists all files in the directory, including dotfiles. These are files that begin with a dot and are hidden in the
GUI. They're often system files—more about them later. Screenshot:
Note that, in contrast to ls -hl, the file sizes are in pure bytes, which makes them a little hard to read.
A general point about unix commands: they're often robust. For example, with ls you can use an arbitrary
number of arguments and it obeys the convention that an asterisk matches anything (this is known as file
globbing, and I think of it as the prequel to regular expressions). Take:
This monstrosity would list anything in the cwd; anything in directory dir1; anything in the directory one above
us; anything in directory dir2 that ends with .txt; and anything in directory dir3 that starts with A and ends with
.html. You get the point.
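The command being described is not reproduced above; something along the lines of ls . dir1 .. dir2/*.txt dir3/A*.html would fit the description. A smaller sketch of the same globbing idea follows, using made-up file names in a scratch directory.

```shell
# Scratch directory with a few empty files (all names are made up).
cd "$(mktemp -d)"
mkdir dir2 dir3
touch dir2/notes.txt dir2/photo.png
touch dir3/About.html dir3/index.html dir3/Atlas.txt

# The shell expands each pattern into matching paths before ls ever runs.
ls dir2/*.txt dir3/A*.html
```

Only dir2/notes.txt and dir3/About.html match both the prefix and the suffix asked for, so those are the only two paths ls receives.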
$ mkdir -p squirrel/mouse/fox
(The -p flag is required to make nested directories in a single shot.) The directory fox is empty, but if I ask you
to list its contents for the sake of argument, how would you do it? Beginners often do it like this:
$ cd squirrel
$ cd mouse
$ cd fox
$ ls
But this takes forever. Instead, list the contents from the current working directory, and—crucially—use bash
autocompletion, a huge time saver. To use autocomplete, hit the tab key after you type ls. Bash will show
you the directories available (if there are many possible directories, start typing the name, sq..., before you
hit tab). In short, do it like this:
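The one-liner itself is not reproduced above. A minimal sketch of the idea, assuming the squirrel/mouse/fox directories from the earlier mkdir and run in a scratch directory, might look like this:

```shell
# Scratch directory so the example leaves no clutter behind.
cd "$(mktemp -d)"
mkdir -p squirrel/mouse/fox

# One command, no cd: hand ls the whole relative path.
# (At a prompt you would type "ls sq", press tab, and let bash fill in the rest.)
ls squirrel/mouse/fox/
```

Nothing is printed because fox is empty, and the working directory never changes.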
$ # This is a comment.
$ # If we put the pound sign in front of a command, it won't do anything:
$ # ls -hl
Suppose you write a line of code on the command line and decided you don't want to execute it. You have
two choices. The first is pressing Cntrl-c, which serves as an "abort mission." The second is jumping to the
beginning of the line (Cntrl-a) and adding the pound character. This has an advantage over the first method
that the line will be saved in bash history (discussed below) and can thus be retrieved and modified later.
In a script, pound-special-character (like #!) is sometimes interpreted (see below), so take note and include a
space after # to be safe.
Because text editors are extremely important, some people develop deep relationships with them. My co-
worker, who is a Vim aficionado, turned to me not long ago and said, "You know how you should think about
editing in Vim? As if you're talking to it." On the terminal, a ubiquitous and simple editor is nano. If you're
more advanced, try Vim or Emacs. Not immune to my co-worker's proselytizing, I've converted to Vim.
Although it's sprawling and the learning curve can be harsh—Vim is like a programming language in itself—
you can do a zillion things with it. There's a section on Vim near the end of this article.
On the GUI, there are many choices: Sublime, Visual Studio Code, Atom, Brackets, TextMate (Mac only),
Aquamacs (Mac only), etc.
$ nano file.txt
1 x
4 b
z 9
$ echo joe
$ echo "joe"
$ cat file.txt
would print out the contents of both file.txt and file2.txt concatenated together, which is where this command
gets its slightly confusing name.
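Here is a small sketch of that concatenation. The contents of file.txt are the three lines shown earlier; file2.txt and its contents are invented for the example.

```shell
cd "$(mktemp -d)"

# Create two small files; the second one's contents are made up.
printf '1 x\n4 b\nz 9\n' > file.txt
printf 'second file\n' > file2.txt

cat file.txt             # one file: just prints it
cat file.txt file2.txt   # two files: prints them back to back
```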
$ echo -n "joe" # suppress newline
$ echo -e "joe\tjoe\njoe" # interpret special chars ( \t is tab, \n newline )
$ cat -n file.txt # print file with line numbers
$ cp file1 file2
$ cp -R dir1 dir2
The first line would make an identical copy of file1 named file2, while the second would do the same thing for
directories. Notice that for directories we use the -R flag (for recursive). The directory and everything inside it
are copied.
$ cp -R dir1 ../../
Answer: it would make a copy of dir1 up two levels from our current working directory.
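A quick way to convince yourself is to try both forms of cp in a scratch directory. The nested file inner.txt below is a made-up name, there only to show that the contents come along.

```shell
cd "$(mktemp -d)"
mkdir -p dir1/sub
touch file1 dir1/sub/inner.txt   # inner.txt is a made-up file

cp file1 file2      # copy a file
cp -R dir1 dir2     # copy a directory and everything inside it

ls dir2/sub         # lists inner.txt
```

One thing to watch: if dir2 already exists, cp -R dir1 dir2 places the copy inside it, as dir2/dir1.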
$ mv file1 file2
In a sense, this command also moves files, because we can rename a file into a different path. For example:
$ mv file1 dir1/dir2/file2
would move file1 into dir1/dir2/ and change its name to file2, while:
$ mv file1 dir1/dir2/
would simply move file1 into dir1/dir2/ or, if you like, rename ./file1 as ./dir1/dir2/file1.
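Both behaviors can be sketched side by side in a scratch directory. The second file, other, is a made-up name so that each form of mv gets its own file to act on.

```shell
cd "$(mktemp -d)"
mkdir -p dir1/dir2
touch file1 other          # "other" is a made-up second file

mv file1 dir1/dir2/file2   # move and rename in one step
mv other dir1/dir2/        # move only; the name stays "other"

ls dir1/dir2               # lists file2 and other
```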
Once we've declared something as a variable, we need to use $ to access its value (and to let bash know it's
a variable). For example:
$ a=3
$ echo a
a
$ echo $a
3
So, with no $ sign, bash thinks we just want to echo the string a. With a $ sign, however, it knows we want to
access what the variable a is storing, which is the value 3. Variables in unix are loosely-typed, meaning you
don't have to declare something as a string or an integer.
We can declare and echo two variables at the same time, and generally play fast and loose, as we're used
to doing on the command line:
$ a=3; b=4
$ echo $a $b
3 4
$ echo $a$b # mesh variables together as you like
34
$ echo "$a$b" # use quotes if you like
34
$ echo -e "$a\t$b" # the -e flag tells echo to interpret \t as a tab
3 4
As we've seen above, if you want to print a string with spaces, use quotes. You should also be aware of how
bash treats double vs single quotes. If you use double quotes, any variable inside them will be expanded
(the same as in Perl). If you use single quotes, everything is taken literally and variables are not expanded.
Here's an example:
$ var=5
$ joe=hello $var
-bash: 5: command not found
$ joe="hello $var"
$ echo $joe
hello 5
$ joe='hello $var'
$ echo $joe
hello $var
An important note is that often we use variables to store paths in unix. Once we do this, we can use all of our
familiar directory commands on the variable:
$ d=dir1/dir2/dir3
$ ls $d
$ cd $d
$ d=.. # this variable stores the directory one above us (relative path)
$ cd $d/.. # cd two directories up
Escape sequences are important in every language. When bash reads $a it interprets it as whatever's stored
in the variable a. What if we actually want to echo the string $a? To do this, we use \ as an escape
character:
What if we want to echo the backslash, too? Then we have to escape the escape character (using the escape
character!):
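The two escapes are not shown above, so here is a short sketch of both, assuming bash and a variable a set to 3 as in the earlier examples.

```shell
a=3
echo $a       # prints 3: the variable is expanded
echo \$a      # prints $a: the backslash escapes the dollar sign
echo \\\$a    # prints \$a: an escaped backslash, then an escaped dollar sign
```

Single quotes are a blunter alternative: inside them nothing is expanded, so no escaping is needed.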
This really comes down to parsing. The backslash helps bash figure out if your text is a plain old string or a
variable. It goes without saying that you should avoid special characters in your variable names. In unix we
might occasionally fall into a parsing tar-pit trap. To avoid this, and make extra sure bash parses our variable
right, we can use the syntax ${a} as in:
$ echo ${a}
3
When could this possibly be an issue? Later, when we discuss scripting, we'll learn that $n, where n is a
number, is the nth argument to our script. If you were crazy enough to write a script with 11 arguments, you'd
discover that bash interprets a=$11 as a=$1 (the first argument) concatenated with the string 1 while
a=${11} properly represents the eleventh argument. This is getting in the weeds, but FYI.
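You can see this without writing a script, because a shell function receives positional arguments the same way. The function name show_eleventh is made up for this sketch.

```shell
# A function receives positional arguments just like a script does.
show_eleventh() {
  echo "$11"     # parsed as ${1} followed by a literal 1
  echo "${11}"   # the eleventh argument
}

show_eleventh a b c d e f g h i j k
```

With the arguments a through k, the first line comes out as a1 and the second as k.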
$ a=3
$ echo $a # variable a equals 3
3
$ echo $apple # variable apple is not set
$ echo ${a}pple # this describes the variable a plus the string "pple"
3pple
$ set
HOME
PS1
TMPDIR
EDITOR
DISPLAY
HOME, as we've already seen, is the path to our home directory (preset to /Users/username on
Macintosh). PS1 sets the shell's prompt. For example:
$ PS1=':-) '
changes our prompt from a dollar-sign into an emoticon, as in:
This is amusing, but the PS1 variable is actually important for orienting you on the command line. A good
prompt is a descriptive one: it tells you what directory you're in and perhaps your username and the name of
the computer. The default prompt should do this but, in case you want to nerd out, you can read about the
arcane language required to customize PS1 here. (If you're familiar with git, you can even have your prompt
display the branch of the git repository you're on. [1])
On your computer there is a designated temporary directory and its path is stored in TMPDIR. Some
commands, such as sort, which we'll learn later, surreptitiously make use of this directory to store
intermediate files. At work, we have a shared computer system and occasionally this common directory
$TMPDIR will run out of space, causing programs trying to write there to fail. One solution is to simply set
TMPDIR to a different path where there's free space. EDITOR sets the default text editor (you can invoke it
by pressing Cntrl-x-e). And DISPLAY is a variable related to the X Window System.
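Redirecting TMPDIR can be sketched as follows. A scratch directory stands in for the path with free space (an assumption for the example), and the input is far too small for sort to need intermediate files, so this only shows the mechanics of setting the variable.

```shell
# A scratch directory stands in for "a different path with free space".
scratch=$(mktemp -d)

# A variable set in front of a command applies to that command only.
printf '3\n1\n2\n' | TMPDIR="$scratch" sort -n

# To change it for the rest of the session instead:
# export TMPDIR="$scratch"
```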
Many programs rely on their own agreed-upon global variables. For example, if you're a Perl user, you may
know that Perl looks for modules in the directory whose path is stored in PERL5LIB. Python looks for its
modules in PYTHONPATH; R looks for packages in R_LIBS; Matlab uses MATLABPATH; awk uses
AWKPATH; compiled programs (C++, for instance) look for shared libraries in LD_LIBRARY_PATH; and so on.
These variables don't exist in the shell by default. A program will check its environment for the variable. If
the user has had the need or foresight to define it, the program can make use of it.
[1] Here's the trick, although it's beyond the scope of this introduction. Add these lines to your bash configuration file:
# via: https://techcommons.stanford.edu/topics/git/show-git-branch-bash-prompt
parse_git_branch() {
    git branch --no-color 2> /dev/null | sed -e '/^[^*]/d' -e 's/* \(.*\)/(\1)/'
}
Let's revisit the idea of a command in unix. What's a command? It's nothing more than a program sitting in a
directory somewhere. So, if ls is a program [1], where is it? Use the command which to see its path:
For the sake of argument, let's say I download an updated version of the ls command, and then type ls in
my terminal. What will happen: will the old ls or the new ls execute? The PATH comes into play here
because it also determines priority. When you enter a command, unix will look for it in each directory of the
PATH, from first to last, and execute the first instance it finds. For example, if:
PATH=/bin/dir1:/bin/dir2:/bin/dir3
and there's a command named ls in both /bin/dir1 and /bin/dir2, the one in /bin/dir1 will be executed.
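Rather than tampering with the real ls, the priority rule can be tried out with a harmless stand-in. In this sketch the command name greet and the two scratch directories are made up, and each PATH change happens inside a subshell so it does not leak into your session.

```shell
work=$(mktemp -d)
mkdir "$work/dir1" "$work/dir2"

# Two different programs with the same (made-up) name, greet.
printf '#!/bin/sh\necho "greet from dir1"\n' > "$work/dir1/greet"
printf '#!/bin/sh\necho "greet from dir2"\n' > "$work/dir2/greet"
chmod u+x "$work/dir1/greet" "$work/dir2/greet"

# Parentheses start a subshell, so the PATH changes stay inside them.
(PATH="$work/dir1:$work/dir2:$PATH"; greet)   # dir1 is listed first, so it wins
(PATH="$work/dir2:$work/dir1:$PATH"; greet)   # order swapped, so dir2 wins
```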
$ echo $PATH
To emphasize the point again, all the programs in the directories specified by your PATH are all the
programs that you can access on the command line by simply typing their names.
The PATH is not immutable. You can set it to be anything you want, but in practice you'll want to augment,
rather than overwrite, it. By default, it contains directories where unix expects executables, like:
/bin
/usr/bin
/usr/local/bin
Let's say you have just written the command /mydir/newcommand. If you're not going to use the
command very often, you can invoke it using its full path every time you need it:
$ /mydir/newcommand
However, if you're going to be using it frequently, you can just add /mydir to the PATH and then invoke the
command by name:
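The exact commands are not reproduced here. A minimal sketch, with a scratch directory playing the role of /mydir and a one-line stand-in for newcommand (both assumptions for illustration), could be:

```shell
# A scratch directory plays the role of /mydir (hypothetical).
mydir=$(mktemp -d)
printf '#!/bin/sh\necho "newcommand ran"\n' > "$mydir/newcommand"
chmod u+x "$mydir/newcommand"

"$mydir/newcommand"      # invoke by full path

PATH="$mydir:$PATH"      # augment the PATH; the old entries are kept
newcommand               # now the bare name is enough
command -v newcommand    # prints where the shell found it
```

Note that the assignment puts the new directory in front of the existing $PATH rather than replacing it, so every other command keeps working.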
This is a frequent chore in unix. If you download some new program, you will often find yourself updating the
PATH to include the directory containing its binaries. How can we avoid having to do this every time we
open the terminal for a new session? We'll discuss this below when we learn about .bashrc.
If you want to shoot yourself in the foot, you can vaporize the PATH:
[1] In fact, if you want to view the source code (written in the C programming language) of ls and other members of the coreutils, you can
download it at ftp.gnu.org/gnu/coreutils. E.g., get and untar the latest version as of this writing:
$ wget ftp://ftp.gnu.org/gnu/coreutils/coreutils-8.9.tar.xz
$ tar -xvf coreutils-8.9.tar.xz
$ ln -s /path/to/target/file mylink
This produces:
$ rm mylink
If we give the target (or source) path as the sole argument to ln, the name of the link will be the same as the
source file's. So:
$ ln -s /path/to/target/file
produces:
Links are incredibly useful for all sorts of reasons—the primary one being, as we've already remarked, if you
want a file to exist in multiple locations without having to make extraneous, space-consuming copies. You
can make links to directories as well as files. Suppose you add a directory to your PATH that has a particular
version of a program in it. If you install a newer version, you'll need to change the PATH to include the new
directory. However, if you add a link to your PATH and keep the link always pointing to the most up-to-date
directory, you won't need to keep fiddling with your PATH. The scenario could look like this:
$ ls -hl myprogram
current -> version3
version1
version2
version3
(where I'm hiding some of the output in the long listing format.) In contrast to our other examples, the link is
in the same directory as the target. Its purpose is to tell us which version, among the many crowding a
directory, we should use.
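The versioned-directory scenario can be sketched in a scratch directory. The version4 directory is made up to play the part of a newly installed release, and readlink is used to print where the link points.

```shell
cd "$(mktemp -d)"
mkdir -p myprogram/version1 myprogram/version2 myprogram/version3

# Point "current" at the newest version (the target is relative to the link's own directory).
ln -s version3 myprogram/current
readlink myprogram/current    # version3

# A newer version arrives (version4 is made up): re-point the link, leave PATH alone.
mkdir myprogram/version4
rm myprogram/current          # removes the link only, never the target
ln -s version4 myprogram/current
readlink myprogram/current    # version4
```

Anything that refers to myprogram/current, a PATH entry included, picks up the new version without being edited.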
Another good practice is putting links in your home directory to folders you often use. This way, navigating to
those folders is easy when you log in. If you make the link:
~/MYLINK -> /some/long/and/complicated/path/to/an/often/used/directory
then from your home directory you need only type:
$ cd MYLINK
rather than:
$ cd /some/long/and/complicated/path/to/an/often/used/directory
$ mkdir tmp
$ cd tmp
$ pwd
/Users/oliver/tmp
$ touch myfile.txt # the command touch creates an empty file
$ ls
myfile.txt
$ ls myfile_2.txt # purposely execute a command we know will fail
ls: cannot access myfile_2.txt: No such file or directory
What if we want to repeat the exact same sequence of commands 5 minutes later? Massive bombshell—we
can save all of these commands in a file! And then run them whenever we like! Try this:
$ nano myscript.sh
# a first script
mkdir tmp
cd tmp
pwd
touch myfile.txt
ls
ls myfile_2.txt
Gratuitous screenshot:
This file is called a script (.sh is a typical suffix for a shell script), and writing it constitutes our first step into
the land of bona fide computer programming. In general usage, a script refers to a small program used to
perform a niche task. What we've written is a recipe that says:
This script, though silly and useless, teaches us the fundamental fact that all computer programs are
ultimately just lists of commands.
$ ./myscript.sh
-bash: ./myscript.sh: Permission denied
WTF! It's dysfunctional. What's going on here is that the file permissions are not set properly. In unix, when
you create a file, the default permission is not executable. You can think of this as a brake that's been
engaged and must be released before we can go (and do something potentially dangerous). First, let's look
at the file permissions:
$ ls -hl myscript.sh
-rw-r--r-- 1 oliver staff 75 Oct 12 11:43 myscript.sh
Let's change the permissions with the command chmod and execute the script:
$ chmod u+x myscript.sh # add executable(x) permission for the user(u) only
$ ls -hl myscript.sh
-rwxr--r-- 1 oliver staff 75 Oct 12 11:43 myscript.sh
$ ./myscript.sh
/Users/oliver/tmp/tmp
myfile.txt
ls: cannot access myfile_2.txt: No such file or directory
Not bad. Did it work? Yes, it did because it's printed stuff out and we see it's created tmp/myfile.txt:
$ ls
myfile.txt myscript.sh tmp
$ ls tmp
myfile.txt
An important note is that even though there was a cd in our script, if we type:
$ pwd
/Users/oliver/tmp
we see that we're still in the same directory as we were in when we ran the script. Even though the script
entered /Users/oliver/tmp/tmp, and did its bidding, we stay in /Users/oliver/tmp. Scripts always work
this way—where they go is independent of where we go.
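The reason is that a script runs in its own child process, and a cd in the child can't move the parent. Here's a minimal sketch of that, done in a scratch directory so nothing real is touched (go_sub.sh and the directory names are made up for illustration):

```shell
workdir=$(mktemp -d)                 # scratch directory
cd "$workdir"
mkdir sub

# go_sub.sh: a hypothetical two-line script that changes directory and reports where it is
printf '#!/bin/bash\ncd sub\npwd\n' > go_sub.sh
chmod u+x go_sub.sh

before=$(pwd)
inside=$(./go_sub.sh)                # the script runs in a child process
after=$(pwd)

echo "script was in: $inside"
echo "we are in:     $after"
```

The script reports a path ending in sub, while our own directory is the same before and after.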
If you're wondering why anyone would write such a pointless script, you're right—it would be odd if we had
occasion to repeat this combination of commands. There are some more realistic examples of scripting
below.
suffixes, like:
.txt - for text files
.html - for html files
.sh - for shell scripts
.pl - for Perl scripts
.py - for Python scripts
.cpp - for c++ code
and so on. Adhering to this organizational practice will enable us to quickly scan our files, and make
searching for particular file types easier [1]. As we saw above, commands like ls and find are particularly
well-suited to use this kind of information. For example, list all text files in the cwd:
$ ls *.txt
List all text files in the cwd and below (i.e., including child directories):
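One way to do this is with find and its -name test. The sketch below builds a small throwaway directory tree (all the file names are invented) so that it can be run anywhere:

```shell
workdir=$(mktemp -d)                 # scratch directory
cd "$workdir"
mkdir -p child/grandchild
touch top.txt child/mid.txt child/grandchild/deep.txt child/notes.html

# list every .txt file in the cwd and below
found=$(find . -name "*.txt" | sort)
echo "$found"
```

The quotes around "*.txt" matter: they stop the shell from expanding the star itself, so find gets to do the matching in every subdirectory.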
[1] An astute reader noted that, for commands—as opposed to, say, html or text files—using suffixes is not the best practice because it violates
the principle of encapsulation. The argument is that a user is neither supposed to know nor care about a program's internal implementation
details, which the suffix advertises. You can imagine a program that starts out as a shell script called mycommand.sh, is upgraded to Python as
mycommand.py, and then is rewritten in C for speed, becoming the binary mycommand. What if other programs depend on mycommand? Then
each time mycommand's suffix changes they have to be rewritten—an annoyance. Although I make this sloppy mistake in this article, that
Update: There's a subtlety inherent in this argument that I didn't appreciate the first time around. I'm going to jump ahead of the narrative here, so
you may want to skip this for now and revisit it later. Suppose you have two identical Python scripts. One is called hello.py and one is simply
#!/usr/bin/env python
def yo():
    print('hello')
>>> import hi
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: No module named hi
Being able to import scripts in Python is important for all kinds of things, such as making modules, but you can only import something with a .py
extension. So how do you get around this if you're not supposed to use file extensions? The sage answers this question as follows:
The best way to handle this revolves around the core question of whether the file should be a command or a library. Libraries have
to have the extension, and commands should not, so making a tiny command wrapper that handles parsing options and then calls
To elaborate, this considers a command to be something in your PATH, while a library—which could be a runnable script—is not. So, in this
example, hello.py would stay the same, as a library not in your PATH:
#!/usr/bin/env python
def yo():
    print('hello')
#!/usr/bin/env python
import hello
hello.yo()
The first way to tell unix which program to use to interpret the script is simply to say so on the command line.
For example, we can use bash to execute bash scripts:
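A minimal sketch, assuming a one-line script called myscript_1.sh with no shebang and no execute permission, created here in a scratch directory:

```shell
workdir=$(mktemp -d)                 # scratch directory
cd "$workdir"
echo 'echo "hello kitty"' > myscript_1.sh

# name the interpreter explicitly on the command line
result=$(bash ./myscript_1.sh)
echo "$result"                       # hello kitty
```

Handing the same file to the wrong interpreter fails, as perl shows here: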
$ perl ./myscript_1.sh
String found where operator expected at ./myscript_1.sh line 1,
near "echo "hello kitty""
(Do you need to predeclare echo?)
syntax error at ./myscript_1.sh line 1, near "echo "hello kitty""
Execution of ./myscript_1.sh aborted due to compilation errors.
The second way to specify the proper interpreter—and the better way, which you should emulate—is to put it
in the script itself using a shebang. To do this, let's remind ourselves where bash and perl reside on our
system. On my computer, they're here:
$ which perl
/usr/bin/perl
although perl could be somewhere else on your machine (bash should be in /bin by convention). The
shebang specifies the language in which your script is interpreted according to the syntax #! followed by the
path to the language. It should be the first line of your script. Note that it's not a comment even though it
looks like one. Let's add shebangs to our two scripts:
$ cat myscript_1.sh
#!/bin/bash
echo "hello kitty"
$ cat myscript_1.pl
#!/usr/bin/perl
print "hello kitty\n";
$ ./myscript_1.sh
hello kitty
$ ./myscript_1.pl
hello kitty
However, there's still a lingering issue and it has to do with portability, an important software principle. What
if perl is in a different place on your machine than mine and you copy my scripts and try to run them? The
path will be wrong and they won't work. The solution to this issue is courtesy of a neat trick using env. We
can amend our script to be:
$ cat myscript_1.pl
#!/usr/bin/env perl
print "hello kitty\n";
Of course, this assumes you have a copy of env in /usr/bin, but this is usually a correct assumption. What
env does here is search the directories listed in your PATH environment variable and run the first perl it
finds.
This is a useful practice even if you're not sharing scripts. Suppose you've updated your version of perl and
there's a newer copy than /usr/bin/perl. You've appropriately updated your PATH such that the directory
containing the updated perl comes before /usr/bin. If you have env in your shebang, you're all set.
However, if you've hardwired the old path in your shebang, your script will run on the old perl [1].
The question that the shebang resolves—which program will run your script?—reminds us of a more
fundamental distinction between interpreted languages and compiled languages. The former are those like
bash, Perl, and Python, where you can cat a script and look inside it. The latter, like C++, require
compilation, the process whereby code is translated into machine language (the result is sometimes called a
binary). This can be done with a command line utility like g++:
Compiled programs, such as the unix utilities themselves, tend to run faster. Don't try to cat a binary, such
as ls, or it will spew out gibberish:
[1] Of course, depending on circumstances, you may very well want to stick with the old version of Perl or whatever's running your program. An
update can have unforeseen consequences and this is the motivation for tools like virtualenv (Python), whose docs remind us: "If an application
works, any change in its libraries or the versions of those libraries can break the application" ↑
The Bourne shell (sh) is a shell, or command-line interpreter, for computer operating systems. The shell is a
command that reads lines from either a file or the terminal, interprets them, and generally executes other
commands. It is the program that is running when a user logs into the system ... Commands can be typed
directly to the running shell or can be put into a file and the file can be executed directly by the shell
As it describes, sh is special because it's both a command interpreter and a command itself (usually found at
/bin/sh). Put differently, you can run myscript as:
$ sh ./myscript
or start an interactive sh session with:
$ sh
If, inside that session, you run:
$ ./myscript
without specifying an interpreter or using a shebang, your script will be interpreted by sh by default. On most
computers, however, the default shell is no longer sh but bash (usually located at /bin/bash). To mash up
Wikipedia and the manual page:
The Bourne-Again SHell (bash) is a Unix shell written by Brian Fox for the GNU Project as a free software
replacement for the Bourne shell. bash is an sh-compatible command language interpreter that executes
commands read from the standard input or from a file ... There are some subtle differences between bash
and traditional versions of sh
Like sh, bash is a command you can either invoke on a script or use to start an interactive bash shell. Read
more on Stackoverflow: Difference between sh and bash.
Which shell are you using right now? Almost certainly bash, but if you want to double check, there's a neat
command given here to display your shell type:
$ ps -p $$
There are more exotic shells, like Z shell and tcsh, but they're beyond the scope of this article.
u - user
g - group
o - other/world
r - read
w - write
x - execute
we can mix and match these how we like, using a plus sign to grant permissions according to the syntax:
chmod entity+permissiontype
E.g.:
$ chmod a-rwx myfile # remove all permissions for you, the group,
# and the rest of the world
If you find the above syntax cumbersome, there's a numerical shorthand you can use with chmod. The only
two I have memorized are 777 and 755:
Read more about the numeric code here. In general, it's a good practice to allow your files to be writable by
you alone, unless you have a compelling reason to share access to them.
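The numeric shorthand packs each entity's permissions into one digit: read counts 4, write 2, and execute 1, summed separately for user, group, and other. Here's a small sketch on a throwaway file (myfile is just a placeholder) that shows what 755 and 777 turn into:

```shell
workdir=$(mktemp -d)                       # scratch directory
cd "$workdir"
touch myfile

chmod 755 myfile                           # 7 = rwx for user, 5 = r-x for group, 5 = r-x for other
perm_755=$(ls -l myfile | cut -c1-10)
echo "$perm_755"                           # -rwxr-xr-x

chmod 777 myfile                           # rwx for everybody
perm_777=$(ls -l myfile | cut -c1-10)
echo "$perm_777"                           # -rwxrwxrwx
```

So 755 is the usual choice for a script you want others to run but not edit, while 777 lets anyone do anything to the file.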
ssh username@host
For example:
$ ssh [email protected]
If you're trying to ssh into a private computer and don't know the hostname, use its IP address
(username@IP-address).
ssh also allows you to run a command on the remote server without logging in. For instance, to list the
contents of your remote computer's home directory, you could run:
Cool, eh? Moreover, if you have ssh access to a machine, you can copy files to or from it with the utility
rsync—a great way to move data without an external hard drive.
The file:
~/.ssh/config
determines ssh's behavior and you can create it if it doesn't exist (the dot in the name .ssh confers
invisibility—see the discussion about dotfiles below). On your own private computer, you can ssh into
selected servers without having to type in a password by updating this configuration file. To do this, generate
rsa ssh keys:
$ mkdir -p ~/.ssh
$ cd ~/.ssh
$ ssh-keygen -t rsa -f localkey
~/.ssh/localkey.pub
~/.ssh/localkey
You can share your public key, but do not give anyone your private key! Suppose you want to ssh into
myserver.com. Normally, that's:
$ ssh myusername@myserver.com
Host Myserver
HostName myserver.com
User myusername
IdentityFile ~/.ssh/localkey
~/.ssh/authorized_keys
on the remote machine (i.e., the myserver.com computer). Now on your local computer, you can ssh into
myserver.com without a password:
$ ssh Myserver
You can also use this technique to push to github.com [2], without having to punch your password in each
time, by pasting your public key into:
If this is your first encounter with ssh, you'd be surprised how much of the work of the world is done by ssh.
It's worth reading the extensive man page, which gets into matters of computer security and cryptography.
[1] The host also has to enable ssh access. On Macintosh, for example, it's disabled by default, but you can turn it on, as instructed here. For
[2] As you get deeper into the game, tracking your scripts and keeping a single, stable version of them becomes crucial. Git, a vast subject for
another tutorial, is the neat solution to this problem and the industry standard for version control. On the web GitHub provides free hosting of
script repositories and connects to the command line via the git interface ↑
Returning to our first script, myscript.sh, let's save the output to a file:
$ cat out.txt
/Users/oliver/tmp/tmp
myfile.txt
This is interesting: out.txt has its output. However, not everything went into out.txt, because some error
messages were echoed to the console. What's going on here is that there are actually two output streams:
stdout (standard out) and stderr (standard error). Look at the following figure from Wikipedia:
Proper output goes into stdout while errors go into stderr. The syntax for saving stderr in unix is 2> as in:
$ # save the output into out.txt and the error into err.txt
$ ./myscript.sh > out.txt 2> err.txt
$ cat out.txt
/Users/oliver/tmp/tmp
myfile.txt
$ cat err.txt
mkdir: cannot create directory ‘tmp’: File exists
ls: cannot access myfile_2.txt: No such file or directory
When you think about it, the fact that output and error are separated is supremely useful. At work,
sometimes we parallelize heavily and run 1000 instances of a script. For each instance, the error and output
are saved separately. The 758th job, for example, might look like this:
(I'm in the habit of using the suffixes .o for output and .e for error.) With this technique we can quickly scan
through all 1000 .e files and check if their size is 0. If it is, we know there was no error; if not, we can re-run
the failed jobs. Some programs are in the habit of echoing run statistics or other information to stderr. This is
an unfortunate practice because it muddies the water and, as in the example above, would make it hard to
tell if there was an actual error.
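The scan itself can be a short loop. This is only a sketch with three fake jobs (the job_N.e names are invented); it leans on the -s file test, which is true when a file exists and has a size greater than zero:

```shell
workdir=$(mktemp -d)                 # scratch directory
cd "$workdir"

# fake results for three jobs; only job 2 wrote anything to its error file
touch job_1.e job_2.e job_3.e
echo "ls: cannot access myfile_2.txt: No such file or directory" > job_2.e

failed=""
for f in job_*.e; do
    # a non-empty .e file means the job complained about something
    if [ -s "$f" ]; then
        failed="$failed $f"
    fi
done
echo "jobs with errors:$failed"
```

The names collected in failed are the jobs to re-run.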
Output vs error is a distinction that many programming languages make. For example, in C++ writing to
stdout and stderr is like this:
In Perl it's:
In Python it's:
and so on.
as in:
What if we want to choose where things will be printed from within our script? Then we can use the following
syntax:
#!/bin/bash
# version 1
echo "hello kitty"
#!/bin/bash
# version 2
echo "hello kitty" > somefile.txt
#!/bin/bash
# version 3
echo "hello kitty" >&1
#!/bin/bash
# version 4
echo "hello kitty" >&2
#!/bin/bash
# version 5
echo "hello kitty" > 1
This illustrates the point of the ampersand syntax: it distinguishes between the output streams and files
named 1 or 2. Let's try running script version 4 as a sanity check to make sure these scripts are working as
expected:
This syntax makes it easy to see how we could, e.g., redirect the standard error to standard output:
I rarely have occasion to do this and, although it's not something you need in your introductory unix toolkit,
it's good to know.
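For the record, the usual spelling is 2>&1. In this sketch a made-up pair of commands writes one line to each stream, and both lines end up in the same file (all.txt is a placeholder name):

```shell
workdir=$(mktemp -d)                 # scratch directory
cd "$workdir"

# one line goes to stdout, one to stderr; 2>&1 sends stderr wherever stdout is going
{ echo "proper output"; echo "an error message" >&2; } > all.txt 2>&1

cat all.txt
line_count=$(wc -l < all.txt | tr -d ' ')
echo "lines captured: $line_count"
```

The order of the redirections matters: > all.txt has to come first, so that by the time 2>&1 is read, stdout already points at the file.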
$ a=joe
$ if [ $a == "joe" ]; then echo hello; fi
hello
or:
$ a=joe
$ if [ $a == "joe" ]; then echo hello; echo hello; echo hello; fi
hello
hello
hello
Everything between the words then and fi (if backwards in case you didn't notice) will execute if the
condition is satisfied. In other languages, this block is often defined by curly brackets: { }. For example, in a
Perl script, the same code would be:
#!/usr/bin/env perl
my $a="joe";
if ( $a eq "joe" )
{
print "hello\n";
print "hello\n";
print "hello\n";
}
In bash, if is if, else is else, and else if is elif. In a script it would look like this:
#!/bin/bash
a=joe
if [ $a == "joe" ]; then
    echo hello;
elif [ $a == "doe" ]; then
    echo goodbye;
else
    echo "ni hao";
fi
You can also use a case statement to implement conditional logic. See an example of that here.
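As a rough sketch, the same three-way choice written with case might look like this (greet is a hypothetical function name, used so the logic can be tried with different inputs):

```shell
# greet: hypothetical wrapper around a case statement
greet() {
    case "$1" in
        joe) echo hello ;;           # each pattern ends with ) and each branch with ;;
        doe) echo goodbye ;;
        *)   echo "ni hao" ;;        # * matches anything else, like a final else
    esac
}

greet joe
greet doe
greet someone_else
```

Just as fi closes if, esac (case backwards) closes case.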
In our context, this would read: bash is supreme on the command line, but not inside of a script.
$ file=emptyfile
$ if [ -e $file ]; then echo "exists"; if [ -s $file ]; then echo "non-0"; fi; fi
exists
$ file=nonemptyfile
$ if [ -e $file ]; then echo "exists"; if [ -s $file ]; then echo "non-0"; fi; fi
exists
non-0
Read The Linux Documentation Project's discussion of file test operators here.
Changing the subject altogether, you may be familiar with the idea of a return value in computer science.
Functions can return a value upon completion. In unix, commands also have a return value or exit code,
queryable with:
$?
This is usually employed to tell the user whether or not the command successfully executed. By convention,
successful execution returns 0. For example:
$ echo joe
joe
$ echo $? # query exit code of previous command
0
Let's see how the exit code can be useful. We'll make a script, test_exitcode.sh, such that:
$ cat test_exitcode.sh
#!/bin/bash
sleep 10
This script just pauses for 10 seconds. First, we'll let it run and then we'll interrupt it using Ctrl-c:
$ ./test_exitcode.sh; # interrupt it
^C
$ echo $?
130
The non-zero exit code tells us that it's failed. Now we'll try the same thing with an if statement:
$ ./test_exitcode.sh
$ if [ $? == 0 ]; then echo "program succeeded"; else echo "program failed"; fi
program succeeded
$ ./test_exitcode.sh;
^C
$ if [ $? == 0 ]; then echo "program succeeded"; else echo "program failed"; fi
program failed
In research, you might run hundreds of command-line programs in parallel. For each instance, there are two
key questions: (1) Did it finish? (2) Did it run without error? Checking the exit status is the way to address the
second point. You should always check the program you're running to find information about its exit code,
since some use different conventions. Read The Linux Documentation Project's discussion of exit status
here.
This is yet another example of bash allowing you to stretch syntax like silly putty. In this code snippet,
echo joe
is run, and its successful execution passes a true return code to the if statement. So, the two joes we see
echoed to the console are from the statement to be evaluated and the statement inside the conditional. We
can also invert this formula, doing something if our command fails:
Did you follow that? (! means logical NOT in unix.) The idea is, we try to cd but, if it's unsuccessful, we echo
an error message. This is a particularly useful line to include in a script. If the user gives an output directory
as an argument and the directory doesn't exist, we exit. If it does exist, we cd into it and it's business as
usual:
Without this line, the script will run in whatever directory it's in if cd fails. Once in lab, I was running a script
that didn't have this kind of protection. The output directory wasn't found and the script started making and
deleting files in the wrong directory. It was powerfully uncool!
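Here's a minimal sketch of that kind of guard. To keep it safe to paste into a terminal it lives in a function and uses return; enter_outputdir and the directory names are assumptions for illustration, and in a real script you'd exit instead:

```shell
# enter_outputdir: hypothetical helper that refuses to carry on if cd fails
enter_outputdir() {
    if ! cd "$1" 2>/dev/null; then
        echo "[error] can't cd to $1" >&2
        return 1                     # in a script proper, this would be: exit 1
    fi
    echo "now working in $(pwd)"
}

enter_outputdir /no/such/directory/hopefully
status_missing=$?

scratch=$(mktemp -d)                 # a directory that does exist
enter_outputdir "$scratch"
status_ok=$?

echo "missing dir: $status_missing, real dir: $status_ok"
```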
$ touch file{1..4}
$ ls
file1 file2 file3 file4
The && operator will chug through a chain of commands and keep on going until one of the commands
fails, as in:
In contrast, the || operator will proceed through the command chain and stop after the first successful one,
as in:
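Both operators can be tried out with the builtin false, which does nothing except fail. A small sketch:

```shell
# && : keep going until a command fails (c is never echoed)
and_out=$(echo a && echo b && false && echo c)
echo "$and_out"

# || : stop after the first command that succeeds (y is never echoed)
or_out=$(false || echo x || echo y)
echo "$or_out"
```

The first chain prints a and b, then stops at false; the second skips past false and stops as soon as echo x succeeds.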
Many other languages wouldn't let you get away with combining data types in the iterations of a loop, but this
is a recurrent bash theme: it's fast; it's loose; it's malleable.
To count from 1 to 10, try:
$ echo {1..10}
why do we need a loop here? Loops really come into their own in bash when—no surprise!—we're dealing
with files, paths, and commands. For example, to loop through all of the text files in the cwd, use:
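A minimal sketch of such a loop, run in a scratch directory with a few made-up files so that it stands on its own:

```shell
workdir=$(mktemp -d)                 # scratch directory
cd "$workdir"
touch a.txt b.txt c.html             # invented files for the demonstration

listed=""
for i in *.txt; do
    echo "$i"                        # anything at all could go here, between do and done
    listed="$listed$i "
done
```

For a simple listing like this one, the loop prints the same names as: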
$ ls *.txt
the former construction has the advantage that we can stuff as much code as we like in the block between
do and done. Let's make a random directory structure like so:
$ mkdir -p myfolder{1..3}/{X,Y}
We can populate it with token files (fodder for our example) via a loop:
$ j=0; for i in myfolder*/*; do echo "*** "$i" ***"; touch ${i}/a_${j}.txt ${i}/b_${j}.txt; ((j++)); done
In bash, ((j++)) is a way of incrementing j. We echo $i to get some visual feedback as the loop iterates.
Now our directory structure looks like this:
To practice loops, suppose we want to find any file that begins with b in any subfolder and make a symbolic
link to it from the cwd:
As we learned above, a link is not a copy of a file but, rather, a kind of pointer that allows us to access a file
from a path other than the one where it actually resides. Our loop yields the links:
b_4.txt -> myfolder3/X/b_4.txt
b_5.txt -> myfolder3/Y/b_5.txt
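The linking loop can be sketched as follows. To keep it self-contained it first rebuilds the toy tree in a scratch directory, spelling out the mkdir and the counter in plain form rather than with the brace and ((j++)) shorthands:

```shell
workdir=$(mktemp -d)                 # scratch directory
cd "$workdir"

# same tree as: mkdir -p myfolder{1..3}/{X,Y}
for n in 1 2 3; do mkdir -p "myfolder$n/X" "myfolder$n/Y"; done

# same token files as before: a_0.txt and b_0.txt in the first folder, and so on
j=0
for i in myfolder*/*; do
    touch "${i}/a_${j}.txt" "${i}/b_${j}.txt"
    j=$((j + 1))
done

# link every file that begins with b into the cwd
for i in myfolder*/*/b*; do
    echo "*** $i ***"
    ln -s "$i" .
done

ls -l | grep -- '->'
```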
I can't overstate all the heroic things you can do with loops in bash. Suppose we want to change the
extension of any text file that begins with a and resides in an X subfolder from .txt to .html:
$ for i in myfolder*/X/a*.txt; do echo "*** "$i" ***"; j=$( echo $i | sed 's|\.txt|\.html|' ); echo $j; mv $i $j; echo; done
But I've jumped the gun! This example features three things we haven't learned yet: command substitution,
piping, and sed. You should revisit it after reading those sections, but the idea is that the variable j stores a
path that looks like our file's but has the extension replaced. And you see that a knowledge of loops is like a
stick of dynamite you can use to blow through large numbers of files.
$ for i in $( echo $PATH | tr ":" " " ); do echo "*** "$i" ***"; ls $i | head; echo; done | less
Can you guess what this does? It shows the first ten commands in each folder in our PATH—not something
you'd likely need to do, but a demonstration of the fluidity of these constructions.
If we want to run a command or script in parallel, we can do that with loops, too. gzip is a utility to compress
files, thereby saving hard drive space. To compress all text files in the cwd, in parallel, do:
But I've gotten ahead of myself again. We'll leave the discussion of this example to the section on processes.
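In the meantime, here is one way it might look, as a sketch on three throwaway files. It assumes gzip is installed; the trailing & sends each gzip to the background so the loop doesn't wait for it:

```shell
workdir=$(mktemp -d)                 # scratch directory
cd "$workdir"
echo one > 1.txt; echo two > 2.txt; echo three > 3.txt

for i in *.txt; do
    gzip "$i" &                      # & = run in the background, so the jobs overlap
done
wait                                 # pause here until every background job has finished

ls
```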
I use while loops much less than for loops, but here's an example:
The while loop can also take input from a file. Suppose there's a file junk.txt such that:
$ cat junk.txt
1
2
3
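A sketch of a while loop that reads such a file one line at a time. It recreates junk.txt in a scratch directory, and the running total is just something to do with each line:

```shell
workdir=$(mktemp -d)                 # scratch directory
cd "$workdir"
printf '1\n2\n3\n' > junk.txt

total=0
while read -r line; do               # read puts the next line of the file into $line
    echo "line: $line"
    total=$((total + line))
done < junk.txt                      # the file is fed to the loop with <
echo "sum: $total"
```

The loop ends on its own when read runs out of lines.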
#!/bin/bash
echo hello
#!/bin/bash
echo hello $1
Now:
$ ./hellokitty.sh kitty
hello kitty
In bash $1 represents the first argument to the script, $2 the second, and so on. If our script is:
#!/bin/bash
echo $0
echo hello $1 $4
Then:
In most programming languages, arguments passed in on the command line are stored as an array. Bash
stores the nth element of this array in the variable $n. $0 is special and refers to the name of the script itself.
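To see these variables in action, here's a sketch that writes the three-line script above into a scratch directory and runs it with four arguments (args.sh and the argument words are made up):

```shell
workdir=$(mktemp -d)                 # scratch directory
cd "$workdir"

# args.sh: hypothetical name for the script from the text
printf '#!/bin/bash\necho $0\necho hello $1 $4\n' > args.sh

out=$(bash ./args.sh my cat is kitty)
echo "$out"
```

The first line printed is ./args.sh, the value of $0, and the second is hello my kitty, because $1 is my and $4 is kitty while $2 and $3 are never used.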
For casual scripts this suits us well. However, as you go on to write more involved programs with many
options, it becomes impractical to rely on the position of an argument to determine its function in your script.
The proper way to do this is using flags that can be deployed in arbitrary order, as in:
You can do this with the command getopts, but it's sometimes easier just to write your own options parser.
Here's a sample script called test_args. Although a case statement would be a good way to handle
numerous conditions, I'll use an if statement:
#!/bin/bash
    elif [ "$1" == "-f2" -o "$1" == "--flag2" ]; then
        shift; var2=$1; shift
    elif [ "$1" == "-f3" -o "$1" == "--flag3" ]; then
        shift; var3=$1; shift
    # if unknown argument, just shift
    else
        shift
    fi
done
### main
# echo variable if not empty
if [ ! -z $var1 ]; then echo "flag1 passed "$var1; fi
if [ ! -z $var2 ]; then echo "flag2 passed "$var2; fi
if [ ! -z $var3 ]; then echo "flag3 passed "$var3; fi
The code loops through the argument array and keeps popping off elements until the array size is zero,
whereupon it exits the loop. For example, one might run this script as:
To spell out how this works, the first argument is --flag1. Since this matches one of our checks, we shift.
This pops this element out of our array, so the first element, $1, becomes x. This is stored in the variable
var1, then there's another shift and $1 becomes -f2, which matches another condition, and so on.
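Here's a compact, self-contained sketch of the same shift-based idea, wrapped in a function so it can be tried without making a separate file. The name parse_flags and the -f1 spelling are assumptions made by analogy with the flags above, not the original script:

```shell
# parse_flags: hypothetical function that mimics the loop described above
parse_flags() {
    local var1="" var2="" var3=""
    # keep looping while there are arguments left
    while [ $# -gt 0 ]; do
        if [ "$1" = "-f1" ] || [ "$1" = "--flag1" ]; then
            shift; var1=$1; shift    # drop the flag, store its value, drop the value
        elif [ "$1" = "-f2" ] || [ "$1" = "--flag2" ]; then
            shift; var2=$1; shift
        elif [ "$1" = "-f3" ] || [ "$1" = "--flag3" ]; then
            shift; var3=$1; shift
        else
            shift                    # unknown argument: just discard it
        fi
    done
    # echo each variable if it is not empty
    if [ -n "$var1" ]; then echo "flag1 passed $var1"; fi
    if [ -n "$var2" ]; then echo "flag2 passed $var2"; fi
    if [ -n "$var3" ]; then echo "flag3 passed $var3"; fi
}

parse_flags --flag1 x -f2 y --flag3 z
```

Because every branch ends by shifting, the flags can come in any order, and anything unrecognized is silently skipped.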
We're brushing up against the outer limits of bash here. My prejudice is that you usually shouldn't go this far
with bash, because its limitations will come sharply into focus if you try to do too-involved scripting. Instead,
use a more friendly language. In Perl, for example, the array containing inputs is @ARGV; in Python, it's
sys.argv. Let's compare these common scripting languages:
Perl has a Getopt package that is convenient for reading arguments, and Python has an even better one
called argparse. Their functionality is infinitely nicer than bash's, so steer clear of bash if you're going for a
script with lots of options.
[1] The distinction between $* and $@ is knotty. Dive into these subtleties on Stackoverflow ↑
Let's continue in the realm of scripting. You can do a multi-line comment in bash with an if statement:
# multi-line comment
if false; then
echo hello
echo hello
echo hello
fi
Multi-line strings are handy for many things. For example, if you want a help section for your script, you can
do it like this:
cat <<_EOF_
Usage:
Required Arguments:
Options:
_EOF_
How does this syntax work? Everything between the _EOF_ tags comprises the string and is printed. This is
called a Here Document. Read The Linux Documentation Project's discussion of Here Documents here.
$ cat ./test_src.sh
#!/bin/bash
myvariable=54
echo $myvariable
If we run it and then check what happened to the variable on our command line, we get:
$ ./test_src.sh
54
$ echo $myvariable
The variable is undefined. The command source is for solving this problem. If we want the variable to
persist, we run:
and—voilà!—our variable exists in the shell. An equivalent syntax for sourcing uses a dot:
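A sketch of the difference between running and sourcing, using the dot spelling (in bash, source ./test_src.sh is another way of writing the same thing). The script is recreated in a scratch directory, and the after_* variables exist only to record what the shell can see at each step:

```shell
workdir=$(mktemp -d)                 # scratch directory
cd "$workdir"
printf '#!/bin/bash\nmyvariable=54\necho $myvariable\n' > test_src.sh

unset myvariable
bash ./test_src.sh                   # runs in a child process
after_run=${myvariable:-undefined}   # our shell never saw the assignment

. ./test_src.sh                      # runs in the current shell
after_dot=${myvariable:-undefined}

echo "after running: $after_run"
echo "after sourcing: $after_dot"
```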
But now observe the following. We'll make a new script, test_src_2.sh, such that:
$ cat ./test_src_2.sh
#!/bin/bash
echo $myvariable
$ ./test_src_2.sh
Nothing! So $myvariable is defined in the shell but, if we run another script, its existence is unknown. Let's
amend our original script to add in an export:
$ cat ./test_src.sh
#!/bin/bash
export myvariable=54
echo $myvariable
$ ./test_src.sh
54
$ ./test_src_2.sh
$ source ./test_src.sh
54
$ ./test_src_2.sh
54
So, at last, we see how to do this. If we want access on the shell to a variable which is defined inside a
script, we must source that script. If we want other scripts to have access to that variable, we must source
plus export.
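The effect of export on its own can be sketched like this, with a stand-in for test_src_2.sh created in a scratch directory (the "child sees" wording is invented so the empty case is visible):

```shell
workdir=$(mktemp -d)                 # scratch directory
cd "$workdir"
printf '#!/bin/bash\necho "child sees: [$myvariable]"\n' > test_src_2.sh

unset myvariable
myvariable=54                        # defined in this shell only
without_export=$(bash ./test_src_2.sh)

export myvariable                    # now passed along to child processes
with_export=$(bash ./test_src_2.sh)

echo "$without_export"
echo "$with_export"
```

Before the export the child sees an empty value; afterwards it sees 54.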
$ touch .test
Such a file will be invisible in the GUI and you won't see it with vanilla ls either. (This works the same way for
directories.) The only way to see it is to use the list all option:
ls -al
or to list it explicitly by name. This is useful for files that you generally want to keep hidden from the user or
discourage tinkering with.
Many programs, such as bash, Vim, and Git, are highly configurable. Each uses dotfiles to let the user add
functionality, change options, switch key bindings, etc. For example, here are some of the dotfiles each
program employs:
bash - .bashrc
vim - .vimrc
git - .gitconfig
The most famous dotfile in my circle is .bashrc which resides in HOME and configures your bash. Actually,
let me retract that: let's say .bash_profile instead of .bashrc (read about the difference here). In any case, the
idea is that this dotfile gets executed as soon as you open up the terminal and start a new session. It is
therefore ideal for setting your PATH and other variables, adding functions (like this one), creating aliases
(discussed below), and doing any other setup related chore. For example, suppose you download a new
program into /some/path/to/prog and you want to add it to your PATH. Then in your .bash_profile you'd
add:
export PATH=/some/path/to/prog:$PATH
Recalling how export works, this will allow any programs we run on the command line to have access to our
amended PATH. Note that we're adding this to the front of our PATH (so, if the program exists in our PATH
already, the existing copy will be superseded). Here's an example snippet of my setup file:
There is much ado about .bashrc (read .bash_profile) and it inspired one of the greatest unix blog-post titles
of all time: Pimp my .bashrc—although this blogger is only playing with his prompt, as it were. As you go on
in unix and add things to your .bash_profile, it will evolve into a kind of fingerprint, optimizing bash in your
own unique way (and potentially making it difficult for others to use).
If you have multiple computers, you'll want to recycle much of your program configuration on all of them. My
co-worker uses a nice system I've adopted where the local and global aspects of setup are separated. For
example, if you wanted to use certain aliases across all your computers, you'd put them in a global settings
file. However, changes to your PATH might be different on different machines, so you'd store this in a local
settings file. Then any time you change computers you can simply copy the global files and get your familiar
setup, saving lots of work. A convenient way to accomplish this goal of a unified shell environment across all
the systems you work on is to put your dotfiles on a server, like GitHub or Bitbucket, that you can access from
anywhere. This is exactly what I've done and you can get the up-to-date versions of my dotfiles on GitHub.
Here's a sketch of how this idea works: in HOME make a .dotfiles/bash directory and populate it with your
setup files, using a suffix of either local or share:
$ ls -1 .dotfiles/bash/
bash_aliases_local
bash_aliases_share
bash_functions_share
bash_inirun_local
bash_paths_local
bash_settings_local
bash_settings_share
bash_welcome_local
bash_welcome_share
When .bash_profile is called at the startup of your session, it sources all these files:
# to make local configurations, add these files into this directory:
# bash_aliases_local
# bash_paths_local
# bash_settings_local
# bash_welcome_local
# this line, e.g., protects the functionality of rsync by only turning on the below if the shell is in interactive mode
# In particular, rsync fails if things are echo-ed to the terminal
[[ "$-" != *i* ]] && return
# bash welcome
if [ -e "${INIT_DIR}/bash_welcome_local" ]; then
    cat ${INIT_DIR}/bash_welcome_local
elif [ -e "${INIT_DIR}/bash_welcome_share" ]; then
    cat ${INIT_DIR}/bash_welcome_share
fi
#--------------------LOCAL------------------------------
# aliases local
if [ -e "${INIT_DIR}/bash_aliases_local" ]; then
    source "${INIT_DIR}/bash_aliases_local"
    echo "bash_aliases_local loaded"
fi
# settings local
if [ -e "${INIT_DIR}/bash_settings_local" ]; then
    source "${INIT_DIR}/bash_settings_local"
    echo "bash_settings_local loaded"
fi
# paths local
if [ -e "${INIT_DIR}/bash_paths_local" ]; then
    source "${INIT_DIR}/bash_paths_local"
    echo "bash_paths_local loaded"
fi
#---------------SHARE-----------------------------
# aliases share
if [ -e "${INIT_DIR}/bash_aliases_share" ]; then
    source "${INIT_DIR}/bash_aliases_share"
    echo "bash_aliases_share loaded"
fi
# settings share
if [ -e "${INIT_DIR}/bash_settings_share" ]; then
    source "${INIT_DIR}/bash_settings_share"
    echo "bash_settings_share loaded"
fi
# functions share
if [ -e "${INIT_DIR}/bash_functions_share" ]; then
    source "${INIT_DIR}/bash_functions_share"
    echo "bash_functions_share loaded"
fi
A word of caution: echoing things in your .bash_profile, as I'm doing here, can be dangerous and break the
functionality of utilities like scp and rsync. However, we protect against this with the cryptic line near the top.
Taking care of bash is the hard part. Other programs are less of a chore because, even if you have different
programs in your PATH on your home and work computers, you probably want everything else to behave
the same. To accomplish this, just drop all your other configuration files into your .dotfiles repository and link
to them from your home directory:
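A minimal sketch of the linking step, assuming the repository holds files named vimrc and gitconfig (both names are made up for illustration). It runs against a throwaway directory standing in for HOME, so nothing real is touched:

```shell
# A scratch directory stands in for $HOME so the sketch is safe to run as-is.
fake_home=$( mktemp -d )
mkdir -p "$fake_home/.dotfiles"

# Hypothetical configuration files kept in the repository.
echo '" vim settings' > "$fake_home/.dotfiles/vimrc"
echo '# git settings' > "$fake_home/.dotfiles/gitconfig"

# Link each one into the (fake) home directory under its usual dotted name.
for f in vimrc gitconfig; do
    ln -s "$fake_home/.dotfiles/$f" "$fake_home/.$f"
done

ls -la "$fake_home"
```

On a real machine the same ln -s lines would point from ~/.dotfiles into ~ instead of the scratch directory.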
Working Faster with Readline Functions and Key Bindings
If you've started using the terminal extensively, you might find that things are a bit slow. Perhaps you need
some long command you wrote yesterday and you don't want to write the damn thing again. Or, if you want
to jump to the end of a line, it's tiresome to move the cursor one character at a time. Failure to immediately
solve these problems will push your productivity back into the stone age and you may end up swearing off
the terminal as a Rube Goldberg-ian dystopia. So—enter keyboard shortcuts!
The backstory about shortcuts is that there are two massively influential text editors, Emacs and Vim, whose
users—to be overdramatic—are divided into two warring camps. Each program has its own conventions for
shortcuts, like jumping words with your cursor, and in bash they're Emacs-flavored by default. But you can
toggle between either one:
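A short sketch of the toggle, using the shell's standard set -o options (the last line is only there to show which mode ended up active):

```shell
set -o vi      # switch the command line to Vim-style editing
set -o emacs   # switch back to Emacs-style editing (the bash default)

# List the two editing-mode options and whether each is on or off.
set -o | grep -E '^(emacs|vi)[[:space:]]'
```

Putting one of the two set -o lines in a setup file makes the choice stick across sessions.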
Although I prefer Vim as a text-editor, I use Emacs key bindings on the command line. The reason is that in
Vim there are multiple modes (normal mode, insert mode, command mode). If you want to jump to the front
of a line, you have to switch from insert mode to normal mode, which breaks up the flow a little. In Emacs
there's no such complication. Emacs commands usually start with the Control key or the Meta key (usually
Esc). Here are some things you can do:
These are supremely useful! I use these numerous times a day. (On the Mac, the first three even work in the
Google search bar!) The first bunch of these fall under the umbrella of Readline Functions (read GNU's
extensive documentation here). There are actually tons more, and you can list them all by entering bind -l.
For the first two—which are absolutely indispensable—you can use the default Emacs way:
However, reaching for the Esc key is a royal pain in the ass: you have to re-position your hands on the
keyboard. This is where key-binding comes into play. Using the command bind, you can map a Readline
Function to any key combination you like. Of course, you should be careful not to overwrite pre-existing key
bindings that you want to use. I like to map the following keys to these Readline Functions:
Cntrl-forward-arrow - forward-word
Cntrl-backward-arrow - backward-word
up-arrow - history-search-backward
down-arrow - history-search-forward
(although these may not work universally [1].) How does this cryptic symbology translate into these particular
keybindings? There's a neat trick you can use, to be revealed in the next section.
Tip: On Mac, you can move your cursor to any position on the line by holding down Option and clicking your
mouse there. I rarely use this, however, because it's faster to make your cursor jump via the keyboard.
[1] If you have trouble getting this to work on OS's terminal, try iTerm2 instead, as described here ↑
The American Standard Code for Information Interchange (ASCII) is a character-encoding scheme originally
based on the English alphabet that encodes 128 specified characters—the numbers 0-9, the letters a-z and
A-Z, some basic punctuation symbols, some control codes that originated with Teletype machines, and a
blank space—into the 7-bit binary integers.
For example, the character A is mapped to the number 65, while q is 113. Of special interest are the control
characters, which are the representations of things that cannot be printed like return or delete. Again from
Wikipedia, here is the portion of the ASCII table for these control characters:
([a], [b], and [c] refer to different notations—unicode, caret notation, and escape codes.)
So, the punchline: on the terminal you can press Cntrl-v followed by a special key, like up-arrow or delete, to
see its key binding. For example, what's the code for the up-arrow key? Press Cntrl-v up-arrow. On my
computer the result is:
$ ^[[A
$ ^[[B
$ ^[[5D
and so on. These may be different on your computer. As the chart above shows us, the ^[ is the terminal's
notation for Escape (it can also be represented as 033 or \e). To state the obvious, a sequence of characters
following the escape character can take on a life of its own: Esc-f means something different than f. So, after
all this effort, we finally get the payoff of understanding how the following lines work:
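A sketch of what such lines could look like in a setup file, built from the escape codes shown above. The code for Cntrl-forward-arrow (^[[5C) is an assumption by symmetry with ^[[5D, and all four codes may differ on your terminal, so check yours with Cntrl-v first:

```shell
# Only set key bindings in an interactive shell, where line editing is on.
if [[ $- == *i* ]]; then
    # \e is the Escape character that Cntrl-v displays as ^[
    bind '"\e[5C": forward-word'              # Cntrl-forward-arrow (assumed code)
    bind '"\e[5D": backward-word'             # Cntrl-backward-arrow
    bind '"\e[A": history-search-backward'    # up-arrow
    bind '"\e[B": history-search-forward'     # down-arrow
fi
```

Each bind argument pairs a quoted key sequence with the name of a Readline Function.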
If you want to use different key bindings than mine or more functions, go for it, but the important thing is that
you make use of the Readline Functions—they will save you loads of time. Remember, you can list their
names with bind -l.
Note: On the Mac you can also see and edit some key bindings for the terminal in:
Terminal > Preferences > Settings > Keyboard
Screenshot:
alias c="cat"
This means every time you type a c the terminal is going to read it as cat, so instead of writing:
$ cat file.txt
you can write:
$ c file.txt
Another use of alias is to weld particular flags onto a command, so every time the command is called, the
flags go with it automatically, as in:
or
Recall the former allows you to copy directories as well as files, and the latter allows you to make nested
directories. Perhaps you always want to use these options, but this is a tradeoff between convenience and
freedom. In general, I prefer to use new words for aliases and not overwrite preexisting bash commands.
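Going by the description above (copying directories, making nested directories), the two flag-welding aliases would be along these lines. This is a sketch, and the exact definitions are an assumption:

```shell
# Weld flags onto existing commands (the two cases the text describes).
alias cp="cp -R"        # copy directories as well as files
alias mkdir="mkdir -p"  # create nested directories in one go

# Print the definitions back to confirm what each name now expands to.
alias cp mkdir
```

Running unalias cp mkdir removes them again, which is one way to get the freedom back.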
Here are some aliases I use in my setup file (see the complete list here):
alias c="cat"
alias s="less -S"
alias l="ls -hl"
alias lt="ls -hlt"
alias ll="ls -al"
alias rr="rm -r"
alias r="readlink -m"
alias ct="column -t"
alias ch="chmod -R 755"
alias chh="chmod -R 644"
alias grep="grep --color"
alias yy="rsync -azv --progress"
# remove empty lines with white space
alias noempty="perl -ne '{print if not m/^(\s*)$/}'"
alias awkf="awk -F'\t'"
alias length="awk '{print length}'"
Aliases are a godsend. But they also have their limitations. If you're familiar with git, you know the following
commands might go together:
$ git add .
$ git commit -m "some message"
$ git push
If you always want to call them together, you can put them in a function which gets sourced in your setup
file. We'll call this function gup for git update:
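gup () {
    # the commit message is the first argument; the fallback text is an assumed default
    local mymessage="${1:-next update}"
    git add .
    git commit -m "$mymessage"
    git push
}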
A couple of notes about functions in bash: (1) the arguments to a function are retrieved as $1, $2, etc., just
as in a script; (2) if you're making a function in a script, variables are global by default unless you declare
them as local. I declared mymessage as a local variable, even though in this case there's no danger I'd
confuse it with a global variable of the same name.
This teaches us how to build functions but it's not terribly useful, because you can git add and git
commit in a single step anyway—so this is a lot of overhead merely to call two commands together. The
Linux Documentation Project provides a sample .bashrc with the following more interesting function:
extract () {
    # quoting "$1" keeps file names containing spaces in one piece
    if [ -f "$1" ] ; then
        case "$1" in
            *.tar.bz2) tar xvjf "$1" ;;
            *.tar.gz) tar xvzf "$1" ;;
            *.bz2) bunzip2 "$1" ;;
            *.rar) unrar x "$1" ;;
            *.gz) gunzip "$1" ;;
            *.tar) tar xvf "$1" ;;
            *.tbz2) tar xvjf "$1" ;;
            *.tgz) tar xvzf "$1" ;;
            *.zip) unzip "$1" ;;
            *.Z) uncompress "$1" ;;
            *.7z) 7z x "$1" ;;
            *) echo "don't know how to extract '$1'..." ;;
        esac
    else
        echo "'$1' is not a valid file!"
    fi
}
There are many ways to compress files in unix, such as zip and bzip2. And the command tar is for archiving,
that is, taking a whole directory structure and rolling it up into a single file (sometimes called a tarball). This
function serves as a one-size-fits-all command to untar and/or uncompress these many different
compression formats. Note that, because there are so many cases, it's better to use a case statement than
an if statement.
$ extract my_compressed_file.gz
$ extract my_compressed_file.bz2
$ extract my_compressed_tarball.tar.gz
1. pwd
2. ls
3. cd
4. mkdir
5. echo
6. cat
7. cp
8. mv
9. rm
10. man
11. head
12. tail
13. less
14. more
15. sort
16. grep
17. which
18. chmod
19. history
20. clear
We've already seen the first 10, along with which and chmod. The easiest new one on the above list is
clear, which simply clears your screen. We'll cover the rest below.
These are great alternatives to cat, because you usually don't want to spew out a giant file. You only want to
peek at it, get a sense of the formatting, or hone in on some specific portion. If you combine head and tail
together in the same command chain—we'll see how to do this in the section about unix pipelines—you can
get a row of your file by number.
Another thing I like to do with head is look at multiple files simultaneously. For example, whereas cat
concatenates files together:
$ cat kitty.txt
kitty
kitty
kitty
head will print out the name of the file when it takes multiple arguments, as in:
$ head *
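A small sketch of that behavior in a scratch directory. kitty.txt mirrors the file above; hello.txt and its contents are made up for illustration:

```shell
# Work in a scratch directory with two small files.
cd "$( mktemp -d )" || exit 1
printf 'hello\nhello\n' > hello.txt
printf 'kitty\nkitty\nkitty\n' > kitty.txt

# With more than one file, head prints a "==> name <==" banner before each.
head -2 *
```

The banners make it easy to tell where one file's lines stop and the next one's begin.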
Type the arrow keys to scroll up and down, Space to page down, and q to exit. Another nice thing about
less: it has many Vim-like features you can read about on its man page, and this is not a coincidence. For
example, if you want to search for the word apple in your file, you just type slash ( / ) followed by apple.
If you have a file with many columns, it's hard to view in the terminal. A neat less flag to solve this problem is
-S:
This enables horizontal scrolling instead of having rows messily wrap onto the next line. This flag works
particularly well in combination with the column command, which forces the columns of your file to line up
nicely:
(This construction will be clearer when we learn about unix pipes below.) column delimits on whitespace by
default, so if your fields themselves contain spaces and you want to delimit on tabs, amend it to:
cat myfile.txt | column -s'	' -t | less -S
(where you make a tab in the terminal by typing Cntrl-v tab). This makes the terminal feel almost like an
Excel spreadsheet. Observe how it changes the viewing experience for this file of fake financial data:
I included less and more together because I think about them as a pair, but I told a little white lie in calling
more indispensable: you really only need to use less, which is an improved version of more. Less is more
:-)
$ grep apple myfile.txt # return lines of file with the text apple
Also useful are what I call the ABCs of Grep—that's After, Before, Context. Here's what they do:
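A quick sketch of the three flags on a small made-up file (the fruit names and the count of 1 line are only for illustration):

```shell
# A small made-up file to search.
fruit_file=$( mktemp )
printf '%s\n' pear plum apple fig kiwi > "$fruit_file"

grep -A 1 apple "$fruit_file"   # the match plus 1 line After it:   apple, fig
grep -B 1 apple "$fruit_file"   # the match plus 1 line Before it:  plum, apple
grep -C 1 apple "$fruit_file"   # 1 line of Context on both sides:  plum, apple, fig
```

Raising the number widens the window around each match.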
You can also do an inverse grep with the -v flag. To find lines that don't contain apple, try:
$ grep -v apple myfile.txt # return lines that don't contain apple
You can exit after, say, finding the first two instances of apple, so you don't waste more time searching:
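One way to do this is grep's -m (max count) flag. A sketch on a made-up file:

```shell
# A made-up file with three lines that mention apple.
apple_file=$( mktemp )
printf '%s\n' 'apple 1' 'pear' 'apple 2' 'apple 3' > "$apple_file"

# -m 2 tells grep to stop reading after the second matching line.
grep -m 2 apple "$apple_file"
```

On a very large file this can save real time, because grep never reads past the second hit.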
There are more powerful variants of grep, like egrep (equivalent to grep -E), which permits the use of extended
regular expressions (more on these later), as in:
as well as other fancier grep-like tools, such as ack, available for download. For some reason—perhaps
because it's useful in a quick and dirty way and doesn't have any other meanings in English—grep has
inspired a peculiar cult following. There are grep t-shirts, grep memes, a grep function in Perl, and—
unbelievably—even a whole O'Reilly book devoted to the command:
$ cat testsort.txt
vfw 34 awfjo
a 4 2
f 10 10
beb 43 c
f 2 33
f 1 ?
Then:
$ sort testsort.txt
a 4 2
beb 43 c
f 1 ?
f 10 10
f 2 33
vfw 34 awfjo
What happened? The default behavior of sort is to dictionary sort the rows of a file according to what's in the
first column, then second column, and so on. Where the first column has the same value (f in this
example) the values of the second column determine the order of the rows. Dictionary sort means that
things are sorted as they would be in a dictionary: 1,2,10 gets sorted as 1,10,2. If you want to do a numerical
sort, use the -n flag; if you want to sort in reverse order, use the -r flag. You can also sort according to a
specific column. The notation for this is:
sort -kn,m
where n and m are numbers which refer to the range column n to column m. In practice, it may be easier to
use a single column rather than a range so, for example:
sort -k2,2
means sort by the second column (technically, from column 2 to column 2).
Answer: The file has been sorted first by the first column, in reverse dictionary order, and then—where the
first column is the same—by the second column in numerical order. You get the point!
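One command that produces the ordering described in this answer is sketched below. It is an assumption that the question used exactly this form; the file contents are the testsort.txt rows from above:

```shell
# Recreate the example file from above.
sort_file=$( mktemp )
printf '%s\n' 'vfw 34 awfjo' 'a 4 2' 'f 10 10' 'beb 43 c' 'f 2 33' 'f 1 ?' > "$sort_file"

# Key 1: column 1 only, reversed (r). Key 2: column 2 only, numeric (n).
sort -k1,1r -k2,2n "$sort_file"
```

The r and n letters attach to individual keys, so the reversal applies to column 1 only and the three f rows still come out as 1, 2, 10.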
Two more notes about sort. Sometimes it is necessary to sort rows uniquely and the flag for this is -u:
We touched on this briefly in the section about global variables. Behind the curtain, sort does its work by
making temporary files, and it needs a place to put those files. By default, this is the directory set by
TMPDIR, but if you have a giant file to sort, you might have reason to instruct sort to use another directory,
and that's what the -T flag does.
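Both notes can be sketched on a tiny made-up file. The scratch directory passed to -T is just a stand-in for wherever you have room for sort's temporary files:

```shell
# A made-up file with repeated lines.
dup_file=$( mktemp )
printf '%s\n' b a b c a > "$dup_file"

# -u keeps one copy of each distinct line.
sort -u "$dup_file"

# -T names the directory sort should use for its temporary files.
scratch_dir=$( mktemp -d )
sort -T "$scratch_dir" "$dup_file"
```

With a file this small sort never needs to spill to disk, so -T only matters in practice for very large inputs.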
$ history
The easiest way to search the history, as we've seen, is to bind the Readline Function history-search-backward
to something convenient, like the up arrow. Then you just press up to scroll backwards through
your history. So, if you enter a c, pressing up will step through all the commands you entered that began with
c. If you want to find a command that contains the word apple, a useful tool is reverse incremental search,
which is invoked with Cntrl-r:
(reverse-i-search)`':
Although we won't cover unix pipelines until the next section, I can't resist giving you a sneak-preview. An
easier way to find any commands you entered containing apple is:
If you wanted to see the last 10 commands you entered, it would be:
$ history | tail
What's going on under the hood is somewhat complicated. Bash actually records your command history in
two ways: (1) it stores it in memory—the history list—and (2) it stores it in a file—the history file. There are a
slew of global variables that mediate the behavior of history, but some important ones are:
~/.bash_history
It's important to understand the interplay between the history list and the history file. In the default setup,
when you type a command at the prompt it goes into your history list—and is thus revealed via history—but
it doesn't get written into your history file until you log out. The next time you log in, the contents of your
history file are automatically read into memory (your history list) and are thus searchable in your session. So,
what's the difference between the following two commands?
$ history | tail
$ tail ~/.bash_history
The former will show you the last 10 commands you entered in your current session while the latter will show
you the last 10 commands from the previous session. This is all well and good, but let's imagine the following
scenario: you're like me and you have 5 or 10 different terminal windows open at the same time. You're
constantly switching windows and entering commands, but history-search is such a useful function that you
want any command you enter in any one of the 5 windows to be immediately accessible on all the others.
We can accomplish this by putting the following lines in our setup dotfile (.bash_profile):
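Based on the man page passage quoted just after this, the setup lines would look roughly like this (a sketch of the two settings it names):

```shell
# Append to the history file on exit instead of overwriting it.
shopt -s histappend

# Before each prompt is printed, write any new history lines to the history file.
PROMPT_COMMAND='history -a'
```

One caveat: these settings write each command out promptly, but a window that is already open only sees lines written by the others once it re-reads the file, for example with history -n.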
How does this bit of magic work? Here's what the histappend option does, to quote from the man page of
shopt:
If set, the history list is appended to the history file when the shell exits, rather than overwriting the history file.
shopt -s histappend
To append every line to history individually set:
PROMPT_COMMAND='history -a'
With these two settings, a new shell will get the history lines from all previous shells instead of the default
'last window closed'>history (the history file is named by the value of the HISTFILE variable)
Let's break this down. shopt stands for "shell options" and is for enabling and disabling various
miscellaneous options. The
history -a
part will "Append the new history lines (history lines entered since the beginning of the current Bash session)
to the history file" (as the man page says). And the global variable PROMPT_COMMAND is a rather funny
one that executes whatever code is stored in it right before each printing of the prompt. Put these together
and you're immediately updating the ~/.bash_history file with every command instead of waiting for this to
happen at logout. Since every shell can contribute in real time, you can run multiple shells in parallel and
have access to a common history on all of them, which is a good situation.
What if you are engaged in unsavory behavior and want to cover your tracks? You can clear your history as
follows:
However, this only deletes what's in memory. If you really want to be careful, you had better check your
history file, .bash_history, and delete any offending portions. Or just wipe the whole thing:
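A sketch of both steps, clearing the in-memory list and emptying the file. It points HISTFILE at a scratch file with made-up entries so that running it does not touch your real ~/.bash_history:

```shell
# Use a scratch history file so the sketch leaves ~/.bash_history alone.
HISTFILE=$( mktemp )
printf '%s\n' 'echo secret one' 'echo secret two' > "$HISTFILE"
history -r            # read the file into the in-memory history list

history -c            # clear the in-memory history list
: > "$HISTFILE"       # truncate the history file to zero length
```

Against the real file, the last two lines would be history -c followed by : > ~/.bash_history.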
A wise man once said, "Those who don't know history are destined to [re-type] it." Often it's necessary to
retrace your footsteps or repeat some commands. In particular, if you're doing research, you'll be juggling
lots of scripts and files and how you produced a file can quickly become mysterious. To deal with this, I like
to selectively keep shell commands worth remembering in a "notes" file. If you looked on my computer, you'd
see notes files scattered everywhere. You might protest: but these are in your history, aren't they? Yes, they
are, but the history is gradually erased; it's clogged with trivial commands like ls and cd; and it doesn't have
information about where the command was run. A month from now, your ~/.bash_history is unlikely to have
that long command you desperately need to remember, but your curated notes file will.
alias n="history | tail -2 | head -1 | tr -s ' ' | cut -d' ' -f3- | awk '{print \"# \"\$0}' >> notes"
which—you know the refrain!—will be easier to understand after we learn pipes. To use this, just type n (for
notesappend) after a command:
Now a notes file has been created (or appended to) in our cwd and we can look at it:
$ cat notes
# ./long_hard-to-remember_command --with_lots --of_flags > poorly_named_file
Let's parse this: instead of going to stdout, the output of cat goes into sort, and the output of sort, in turn,
goes into less, whose output finally lands on our screen. This is perfect when you have a big file you want to
view page by page, but you want to see it sorted. Remember that there are two streams, stdout and stderr.
With piping, only the stdout stream gets passed through the pipeline; the stderr (we hope there isn't any)
hits our screen right away. There's a nice figure from Wikipedia summarizing this:
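The split between the two streams can also be seen in a short experiment. The function name speak is made up for this sketch:

```shell
# A function that writes one line to stdout and one to stderr.
speak() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Only stdout travels down the pipe, so only that line is upper-cased;
# the stderr line goes straight to the terminal unchanged.
speak | tr 'a-z' 'A-Z'
```

Adding 2>/dev/null after speak silences the stderr line and leaves only the piped text.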
It's been hard to get this far without mentioning pipes and you saw we were starting to break down in the
section about loops as well as history:
because there's no other simple way to do these things. I also mentioned that we'd learn how to print, say,
the 37th line of a file using pipes. In unix there's always more than one way to skin a cat, so here are two:
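A sketch of two such ways, run on a made-up 50-line file: the head and tail combination mentioned earlier, and an awk test on the row number. The exact commands are an assumption:

```shell
# A made-up 50-line file: "line 1" through "line 50".
num_file=$( mktemp )
seq 1 50 | sed 's/^/line /' > "$num_file"

# Way 1: keep the first 37 lines, then keep the last of those.
cat "$num_file" | head -37 | tail -1

# Way 2: let awk print the row whose number is 37.
cat "$num_file" | awk 'NR==37'
```

Both print the same single line; the awk form reads a little more directly once you know what NR means.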
(In awk, which we'll learn about below, NR is a special variable which refers to the row number.) Another
command you can use in your pipelines is tee. Consider the following. We have a file test.txt such that:
$ cat test.txt
1 c
3 c
2 t
1 c
Then:
$ cat tmp.txt
1 c
2 t
3 c
tee, in rough analogy with a plumber's tee fitting, allows us to save a file in the middle of the pipeline and
keep going. In this case, the output of sort is both saved as tmp.txt and passed through the pipe to wc -l,
which counts the lines of the file.
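A pipeline matching this description can be sketched as follows. Using sort -u is an assumption, inferred from tmp.txt holding three unique lines:

```shell
# Recreate test.txt from above in a scratch directory.
cd "$( mktemp -d )" || exit 1
printf '%s\n' '1 c' '3 c' '2 t' '1 c' > test.txt

# Sort uniquely, save the sorted lines to tmp.txt, and count them in one pass.
cat test.txt | sort -u | tee tmp.txt | wc -l

# The intermediate result is still there to inspect afterwards.
cat tmp.txt
```

Without tee you would need to run sort twice, or save the file first and count it in a second command.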
Using pipes expands the number of things you can do in the shell exponentially. The following are some
miscellaneous examples of command pipelines.
Number the fields in a tab-delimited header and display them as a column:
Display columns 2 through 4 of a file for rows such that the first column equals 1.
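Hedged sketches of pipelines for these two tasks, run on a made-up tab-delimited file. The file contents and the exact commands are assumptions:

```shell
# A made-up tab-delimited file with a header row.
tab_file=$( mktemp )
printf 'id\tname\tcolor\tsize\n1\tapple\tred\tsmall\n2\tplum\tpurple\tsmall\n1\tpear\tgreen\tlarge\n' > "$tab_file"

# Number the header fields: take row 1, turn tabs into newlines, number the lines.
head -1 "$tab_file" | tr '\t' '\n' | nl

# Columns 2 through 4 of the rows whose first column equals 1.
awk -F'\t' '$1 == 1' "$tab_file" | cut -f2-4
```

The first pipeline is handy for finding out which column number a given header name corresponds to before cutting on it.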
In the previous section, I talked about my homemade alias n for notesappend. To flesh out this code snippet,
the last command in history is the one we just typed:
$ history | tail -1
1022 history | tail -1
$ echo Hello
Hello
$ history | tail -2 | head -1
1023 echo Hello
The -s flag in tr stands for squeeze-repeats and the point of this is that there can be a varying number of
space characters in the whitespace. Since we're going to cut on space as the delimiter, we have to make
sure this doesn't trip us up. With this out of the way, we simply cut on space and use awk to put a pound
sign in front of the command, so it's read as a comment:
$ history | tail -2 | head -1 | tr -s ' ' | cut -d' ' -f3- | awk '{print "# "$0}'
# echo Hello
One can always write a script to do something but some unix programmers like to rely on pipes to create
long (sometimes unreadable) chains of commands, all the while staying on the same line. This is called a
one-liner. I've witnessed some epic ones in my day!
$ d=$( pwd )
$ d=`pwd`
These are two different syntaxes for doing the same thing. You should use the first one, because it's much
cleaner and nicer (if you're a Perl user, you'll recognize the second one, which Perl shares). To take another
example, suppose you want to store the length of your file in a variable:
Recall that in a unix script $0 refers to the script itself. The readlink command is a fancy way of getting the
absolute path of our script and dirname gets the directory part of the path. Neat, eh?
Let's look at the first thing we said command substitution could do, using the output of a command as an
argument. To head all files with extension .txt in a directory, the command is:
$ head *.txt
$ head $( ls *.txt )
$ head $( find . -maxdepth 1 -name "*.txt" )
You would never use these in practice, because there's a simpler way to do it. However, more often than not,
command substitution is the shortcut. Let's say you want to look at myscript.pl and it's in your PATH but you
can't remember where it's located. Then you could use:
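A sketch of that shortcut, with which feeding cat through command substitution. The scratch directory and the one-line Perl script are made up so the example is self-contained:

```shell
# Put a made-up script named myscript.pl somewhere on the PATH.
bin_dir=$( mktemp -d )
printf '%s\n' '#!/usr/bin/env perl' 'print "hi\n";' > "$bin_dir/myscript.pl"
chmod +x "$bin_dir/myscript.pl"
PATH="$bin_dir:$PATH"

# which prints the script's full path; command substitution hands it to cat.
cat "$( which myscript.pl )"
```

The same pattern works with less, head, or an editor in place of cat.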
Command substitution is also very useful in loops. For example, seq prints a sequence of numbers, so:
Answer: The first loops through all unique elements of the third column of file.txt. Here, it just echoes it, but
you could do anything. The second loops through all .txt files in the cwd that do not contain the word apple.
Note that we're combining command substitution with pipes!
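Loops of the kind this answer describes can be sketched as below. The commands are an assumption, and the three-column file is named file.tsv here, rather than file.txt, only so that it stays out of the second loop:

```shell
cd "$( mktemp -d )" || exit 1

# Made-up inputs: a three-column tab-delimited file and a few .txt files.
printf 'a\t1\tred\nb\t2\tblue\nc\t3\tred\n' > file.tsv
echo "apple pie" > 1.txt
echo "plum tart" > 2.txt
echo "pear cake" > 3.txt

# Loop over the unique values in column 3 of file.tsv.
for i in $( cut -f3 file.tsv | sort -u ); do echo "value: $i"; done

# Loop over the .txt files that do NOT contain the word apple
# (grep -L lists the files without a match).
for i in $( grep -L apple *.txt ); do echo "no apple in: $i"; done
```

Because the shell splits the substituted output on whitespace, this style assumes values and file names without spaces in them.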
After a long road, we're finally in a better position to understand the code we discussed above to change file
suffixes. To recap, if we make three text files:
$ touch {1..3}.txt
$ for i in {1..3}.txt; do j=$( echo $i | sed 's|\.txt|\.html|' ); cmd="mv $i $j"; echo "Run command: $cmd"; echo $c
This line features a trick: saving a command in a variable, cmd, allows us to echo it—showing the user what
will be run—and then execute it by piping it into bash.
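The same trick, spread over several lines so each step is visible. This is a sketch of the idea rather than the exact one-liner, and it runs in a scratch directory:

```shell
cd "$( mktemp -d )" || exit 1
touch 1.txt 2.txt 3.txt

for i in *.txt; do
    j=$( echo "$i" | sed 's|\.txt$|.html|' )   # new name: swap the suffix
    cmd="mv $i $j"                             # build the command as a string
    echo "Run command: $cmd"                   # show the user what will run
    echo "$cmd" | bash                         # then run it by piping it into bash
done

ls
```

Since the command is assembled as plain text, file names containing spaces or shell metacharacters would need extra quoting before this is safe.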
We can get as wacky as we like. For the first three directories in our PATH, the following tells us how many
things are in each and how big each is:
... process substitution is a form of inter-process communication that allows the input or output of a command
to appear as a file. The command is substituted in-line, where a file name would normally occur, by the
command shell. This allows programs that normally only accept files to directly read from or write to another
program.
Let's take an example. Suppose the first line of your file is a header line, and you want to look at the tail of
the file, but you also want to print the header. In that case, you could use:
cat expects files as arguments, but we're getting around that with process substitution. Whatever's in the
block:
<( )
is treated as a file. Maybe this looks trivial to you, and you argue that you could do this just as well using:
You are correct. But what if you wanted to pipe all of this into another command (like column -t, which prints
your file arranged into pretty columns)? Then you'd have to do it like this:
So, is this the only way to do this? As long as I'm arguing with myself, I reiterate that bash is so elastic
there's always another way to do something. Any ideas? Here's one answer:
Curly brackets create code blocks in bash, as in many other languages. In this case, the code in the curly
brackets is executed first and everything is piped, together, into column -t.
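A sketch of the two approaches side by side on a made-up file. The downstream tr is only a stand-in filter; column -t from the text would slot into the same place:

```shell
# A made-up file: one header line followed by 20 data rows.
data_file=$( mktemp )
{ echo "HEADER"; seq 1 20; } > "$data_file"

# Process substitution: each <( ... ) is handed to cat as if it were a file.
cat <( head -1 "$data_file" ) <( tail -3 "$data_file" )

# The same lines from a curly-bracket group, piped on as a single stream.
{ head -1 "$data_file"; tail -3 "$data_file"; } | tr 'A-Z' 'a-z'
```

Note the spaces inside the curly brackets and the semicolon before the closing one; bash requires both.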
How would you print the second column in a file twice? One way would be with awk, which we'll cover
below:
$ paste <( cat test.txt | cut -f2 ) <( cat test.txt | cut -f2 )
Sometimes you want to sort a file, but you don't want the sorting to scramble your file header. Here's how to
sort on, say, the first column without touching your header:
$ cat <( head -1 temp.txt ) <( cat temp.txt | sed '1d' | sort -k1,1n )
As a final example, suppose you have a simple script, test_script, which counts the lines in a file and echoes
some text:
#!/bin/bash
myfile=$1
len=$( cat "$myfile" | wc -l )   # count the lines in the file
echo "Your file is $len lines long."
The first argument is a file you pass in to the script—say, it's 1.txt—so you could run it like this:
$ ./test_script 1.txt
Your file is 11 lines long.
$ gzip 1.txt
The program can't handle zipped input, but unzipping your files just to run them in this program is a pain.
Process substitution to the rescue!
This gets at the marrow of what process substitution is really good at—avoiding the creation of temporary
files.
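A self-contained sketch of the idea. The function count_lines is a made-up stand-in for test_script, and 1.txt is generated on the spot:

```shell
cd "$( mktemp -d )" || exit 1

# Stand-in for test_script: report the number of lines in the file named by $1.
count_lines() {
    local myfile=$1
    local len
    len=$( cat "$myfile" | wc -l )
    echo "Your file is $(( len )) lines long."
}

seq 1 11 > 1.txt
gzip 1.txt        # leaves 1.txt.gz in place of 1.txt

# gunzip -c writes the uncompressed text to stdout; <( ) presents it as a file.
count_lines <( gunzip -c 1.txt.gz )
```

The compressed file stays compressed on disk, and no uncompressed copy is ever written.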
If unix were a video game, piping would be level one, command substitution level two, and process
substitution level three. Once you've gotten comfortable with all three, you've beaten at least one part of this
game. Congratulations!
Now it's time for a fun and educational interlude. In 1982, Bell Labs AT&T produced a video touting the
merits of unix. It features funky turtlenecks and futuristic, yet dated, geometrical figures doing an odd dance
number. I don't recognize some of the unix commands they use. Yet, to me the funniest thing about this
video is how relevant many of the principles in it still are (once you get past the part about telephone
switchboards). Check it out:
(Video credit: YouTube: UNIX: Making Computers Easier To Use -- AT&T Archives film from 1982, Bell Laboratories)
Processes and Running Processes in the Background
Processes are a deep subject intertwined with topics like memory usage, threads, scheduling, and
parallelization. My understanding of this stuff (which encroaches on the domain of operating systems) is
narrow, but here are the basics from a unix-eye-view. The command ps displays information about your
processes:
What if you have a program that takes some time to run? You run the program on the command line but,
while it's running, your hands are tied. You can't do any more computing until it's finished. Or can you? You
can—by running your program in the background. Let's look at the command sleep which pauses for an
amount of time specified in seconds. To run things in the background, use an ampersand:
$ sleep 60 &
[1] 6846
$ sleep 30 &
[2] 6847
The numbers printed are the PIDs or process IDs, which are unique identifiers for every process running on
our computer. Look at our processes:
$ ps -f
UID PID PPID C STIME TTY TIME CMD
501 5166 5165 0 Wed12PM ttys008 0:00.20 -bash
501 6846 5166 0 10:57PM ttys008 0:00.01 sleep 60
501 6847 5166 0 10:57PM ttys008 0:00.00 sleep 30
The header is almost self-explanatory, but what's TTY? Wikipedia tells us that this is an anachronistic name
for the terminal that derives from TeleTYpewriter. Observe that bash itself is a process which is running. We
also see that this particular bash—PID of 5166—is the parent process of both of our sleep commands
(PPID stands for parent process ID). Like directories, processes are hierarchical. Every directory, except for
the root directory, has a parent directory and so every process, except for the first process—init in unix or
launchd on the Mac—has a parent process. Just as tree shows us the directory hierarchy, pstree shows us
the process hierarchy:
$ pstree
-+= 00001 root /sbin/launchd
 |-+= 00132 oliver /sbin/launchd
 | |-+= 00252 oliver /Applications/Utilities/Terminal.app/Contents/MacOS/Terminal
 | | \-+= 05165 root login -pf oliver
 | |   \-+= 05166 oliver -bash
 | |     |--= 06846 oliver sleep 30
 | |     |--= 06847 oliver sleep 60
 | |     \-+= 06848 oliver pstree
 | |       \--- 06849 root ps -axwwo user,pid,ppid,pgid,command
This command actually shows every single process on the computer, but I've cheated a bit and cut out
everything but the processes we're looking at. We can see the parent-child relationships: sleep and pstree
are children of bash, which is a child of login, which is a child of Terminal, and so on. With vanilla ps, every
process we saw began in a terminal—that's why it had a TTY ID. However, we're probably running Safari or
Chrome. How do we see these processes as well as the myriad others running on our system? To see all
processes, not just those spawned by terminals, use the -A flag:
$ ps -A
Returning to our example with sleep, we can see the jobs we're currently running with the command jobs:
$ jobs
[1]- Running sleep 60 &
[2]+ Running sleep 30 &
$ sleep 60 &
[1] 7161
$ jobs
[1]+ Running sleep 60 &
$ kill %1
$
[1]+ Terminated: 15 sleep 60
In this notation %n refers to the nth job. Recall we can also kill a job in the foreground with Ctrl-c. More
generally, if we didn't submit the command ourselves, we can kill any process on our computer using the
PID. Suppose we want to kill the terminal in which we are working. Let's grep for it:
We see that this particular process happens to have a PID of 252. grep returned any process with the word
Terminal, including the grep itself. We can be more precise with awk and see the header, too:
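The two commands themselves are missing at this point, so here is only a sketch of what they might look like. To keep it reproducible, the filters run on a small made-up listing held in the variable listing (a hypothetical stand-in for real ps -Af output); on a live system you would pipe ps -Af into the same grep and awk.

```shell
# Made-up stand-in for the output of `ps -Af` (columns trimmed for brevity)
listing='UID PID PPID CMD
501 252 132 /Applications/Utilities/Terminal.app/Contents/MacOS/Terminal
501 5166 5165 -bash
501 7200 5166 grep Terminal'

# grep keeps every line mentioning Terminal, including the grep process itself
by_grep=$(printf '%s\n' "$listing" | grep Terminal)

# awk keeps the header (NR==1) OR any row whose 2nd field, the PID, is 252
by_awk=$(printf '%s\n' "$listing" | awk 'NR==1 || $2==252')

printf '%s\n' "$by_awk"
```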
(that's print the first row OR anything with the 2nd field equal to 252.) Now let's kill it:
$ kill 252
Running stuff in the background is useful, especially if you have a time-consuming program. If you're
scripting a lot, you'll often find yourself running something like this:
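The command is missing here. A minimal sketch of the idea, assuming a hypothetical function myscript standing in for your program and placeholder log names log.o and log.e:

```shell
# Hypothetical stand-in for a slow program: one line to stdout, one to stderr
myscript() { echo "result"; echo "warning" >&2; }

logdir=$(mktemp -d)   # scratch directory so the example leaves no clutter behind

# > saves stdout, 2> saves stderr, and the trailing & sends the job to the background
myscript > "$logdir/log.o" 2> "$logdir/log.e" &

wait $!               # $! is the PID of the last background job; wait until it's done
cat "$logdir/log.o" "$logdir/log.e"
```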
i.e., running something in the background and saving both the output and error.
Two other commands that come to mind are time, which times how long your script takes to run, and
nohup ("no hang up"), which allows your script to run even if you quit the terminal:
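The example that belongs here is missing. As a rough sketch, with sleep 1 standing in for a real script and the log path being a placeholder:

```shell
logdir=$(mktemp -d)

# nohup shields the command from the hangup signal sent when the terminal closes;
# its output goes to a log file of our choosing instead of the default nohup.out
nohup sleep 1 > "$logdir/nohup.log" 2>&1 &
nohup_pid=$!

wait "$nohup_pid"     # only here so the sketch finishes cleanly
echo "job $nohup_pid finished"

# Timing a command works the same way: prefix it with time, e.g.  time sleep 1
```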
As we mentioned above, you can also set multiple jobs to run in the background, in parallel, from a loop:
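The loop itself didn't survive here. A small sketch, assuming sleep as the stand-in job and a hypothetical variable pids to keep track of what was launched:

```shell
pids=""
for i in 1 2 3; do
    sleep 1 &             # & returns control at once, so the loop doesn't wait
    pids="$pids $!"       # remember each job's PID
done

wait                      # block until all three background jobs have finished
echo "finished jobs:$pids"
```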
(Of course, you'd want to run something more useful than sleep!) If you're lucky enough to work on a large
computer cluster shared by many users—some of whom may be running memory- and time-intensive
programs—then scheduling different users' jobs is a daily fact of life. At work we use a queuing system
called the Sun Grid Engine to solve this problem. I wrote a short SGE wiki here.
To get a dynamic view of your processes, along with loads of other information, and to sort them in
different ways, use top:
$ top
htop can show us a traditional top output split-screened with a process tree. We see various users—root
(discussed next), ubuntu, and www-data—and we see various processes, including init which has a PID of 1.
htop also shows us the percent usage of each of our cores—here there's only one and we're using just
0.7%. Can you guess what this computer might be doing? I'm using it to host a website, as nginx, a popular
web server, gives away. To echo a point made at the beginning of this article, this is one of the zillion doors
basic command line competence opens.
$ ls -hl /
you'll notice that these files do not belong to you but, rather, have been created by another user named root.
Did you know you were sharing your computer with somebody else? You're not exactly, but there is another
account on your computer for the root user who has, as far as the computer's concerned, all the power (read
Wikipedia's discussion about the superuser here). It's good that you're not the root user by default because it
protects you from yourself. You don't have permission to catastrophically delete an important system file.
If you want to run a command as the superuser, you can use sudo. For example, you'll have trouble if you
try to make a directory called junk in the root directory:
$ mkdir /junk
mkdir: cannot create directory ‘/junk’: Permission denied
However, if you invoke this command as the root user, you can do it:
$ sudo mkdir /junk
provided you type in the password. Because root—not your user—owns this file, you also need sudo to
remove this directory:
$ sudo rmdir /junk
To start a root login shell, in which every command you type runs as root, use:
$ sudo -i
Obviously, you should be cautious with sudo. When might using it be appropriate? The most common use
case is when you have to install a program, which might want to write into a directory root owns, like
/usr/bin, or access system files. I discuss installing new software below. You can also do terrible things with
sudo, such as gut your whole computer with a command so unspeakable I cannot utter it in syntactically
viable form. That's sudo are em dash are eff forward slash—it lives in infamy in an Urban Dictionary entry.
The key point about awk is, it works line by line. A typical awk construction is:
Awk executes its code once every line. Let's say we have a file, test.txt, such that:
$ cat test.txt
1 c
3 c
2 t
1 c
In awk, $1 is the notation for the first field, $2 is for the second field, and so on. The whole line is $0. For
example:
$ cat test.txt | awk '{print}' # print full line
1 c
3 c
2 t
1 c
There are two exceptions to the execute code per line rule: anything in a BEGIN block gets executed before
the file is read and anything in an END block gets executed after it's read. If you define variables in awk
they're global and persist rather than being cleared every line. For example, we can concatenate the
elements of the first column with an @ delimiter using the variable x:
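The command is missing at this point, so here is one way it could be written (a sketch, not necessarily the original). The sample data mirrors test.txt above and is held in the hypothetical variable sample.

```shell
# Same rows as test.txt above, tab-delimited
sample=$(printf '1\tc\n3\tc\n2\tt\n1\tc\n')

# x is never reset, so it grows by "@" plus the first field on every line;
# the END block prints it once the whole input has been read
joined=$(printf '%s\n' "$sample" | awk '{x = x "@" $1} END {print x}')

echo "$joined"   # @1@3@2@1
```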
Awk has a bunch of built-in variables which are handy: NR is the row number; NF is the total number of
fields; and OFS is the output delimiter. There are many more you can read about here. Continuing with our
very contrived examples, let's see how these can help us:
$ cat test.txt | awk '{OFS="\t"; print $1,$2}' # set output field separator to tab
1 c
3 c
2 t
1 c
Setting OFS spares us having to type a "\t" every time we want to print a tab. We can just use a comma
instead. Look at the following three examples:
$ cat test.txt | awk '{OFS="\t"; print $1,$2}' # print file as is
1 c
3 c
2 t
1 c
$ cat test.txt | awk '{OFS="\t"; print NR,$1,$2}' # print row num
1 1 c
2 3 c
3 2 t
4 1 c
$ cat test.txt | awk '{OFS="\t"; print NR,NF,$1,$2}' # print row & field num
1 2 1 c
2 2 3 c
3 2 2 t
4 2 1 c
So the first command prints the file as it is. The second command prints the file with the row number added
in front. And the third prints the file with the row number in the first column and the number of fields in the
second—in our case always two. Although these are purely pedagogical examples, these variables can do a
lot for you. For example, if you wanted to print the 3rd row of your file, you could use:
$ cat test.txt | awk '{if (NR==3) {print $0}}' # print the 3rd row of your file
2 t
$ cat test.txt | awk '{if (NR==3) {print}}' # same thing, more compact syntax
2 t
Sometimes you have a file and you want to check if every row has the same number of columns. Then use:
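The command is missing here. One common way to do the check, offered as a sketch: print NF for every row and collapse the duplicates. If a single number comes back, every row has that many fields.

```shell
# Same four rows as test.txt above: every row has two fields
ncols=$(printf '1 c\n3 c\n2 t\n1 c\n' | awk '{print NF}' | sort -u)
echo "$ncols"    # 2

# A made-up ragged file gives more than one distinct field count
ragged=$(printf 'a b\na b c\n' | awk '{print NF}' | sort -u | wc -l)
echo $ragged     # 2 distinct counts, so the rows are not uniform
```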
An important point is that by default awk delimits on white-space, not tabs (unlike, say, the command cut).
White space means any combination of spaces and tabs. You can tell awk to delimit on anything you like by
using the -F flag. For instance, let's look at the following situation:
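The example is missing at this point. As a sketch of the kind of situation meant, assume an input line a b that contains a space but no tab:

```shell
# Default: split on any whitespace, so $1 is just "a"
default_split=$(echo "a b" | awk '{print $1}')

# -F"\t": split on tabs only; there is no tab, so $1 is the whole line
tab_split=$(echo "a b" | awk -F"\t" '{print $1}')

echo "$default_split"   # a
echo "$tab_split"       # a b
```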
When we feed a space b into awk, $1 refers to the first field, a. However, if we explicitly tell awk to delimit on
tabs, $1 refers to the whole line, a b, because the line contains no tab.
You can also use shell variables inside your awk by importing them with the -v flag:
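The example is missing here. A minimal sketch, where greeting and g are hypothetical names chosen for illustration:

```shell
greeting="hello"     # an ordinary shell variable

# -v copies its value into the awk variable g before any input is read
out=$(echo "world" | awk -v g="$greeting" '{print g, $0}')

echo "$out"          # hello world
```

Passing the value with -v is usually safer than splicing "$greeting" into the awk program text, because awk then treats it as data rather than code.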
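awk can also write its output straight to files. Here we split the rows of test.txt by the value of the first field: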
$ cat test.txt | awk '{if ($1==1) {print > "file1.txt"} else {print > "file2.txt"}}'
$ cat file1.txt
1 c
1 c
$ cat file2.txt
3 c
2 t
Question: In the following case, how would you print the row numbers such that the first field equals the
second field?
$ echo -e "a\ta\na\tc\na\tz\na\ta"
a a
a c
a z
a a
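One possible answer, as a sketch: compare $1 with $2 and print NR, the row number, whenever they match.

```shell
# Same four tab-delimited rows as the echo above (printf used for portability)
matching_rows=$(printf 'a\ta\na\tc\na\tz\na\ta\n' | awk '$1 == $2 {print NR}')

echo $matching_rows   # 1 4
```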
The take-home lesson is, you can do tons with awk, but you don't want to do too much. Anything that you
can do crisply on one, or a few, lines is awk-able. I'll do some more involved examples below so you can get
a better sense of awk scripting.
Sometimes the first line of a text file is a header and you want to remove it. Suppose we have a file,
test_header.txt, such that:
$ cat test_header.txt
This is a header
1 asdf
2 asdf
2 asdf
Then:
$ cat test_header.txt | sed '1d' # delete the first line
1 asdf
2 asdf
2 asdf
1,3 is sed's notation for the range 1 to 3. We can't do much more without entering regular expression
territory. One sed construction is:
/pattern/d
where d stands for delete if the pattern is matched. If we have a file, test_comment.txt, such that:
$ cat test_comment.txt
1 asdf
# This is a comment
2 asdf
# This is a comment
2 asdf
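The command that goes with this file is missing. Assuming the aim is to drop the comment lines, a sketch using the /pattern/d construction:

```shell
# Same five lines as test_comment.txt above
no_comments=$(printf '1 asdf\n# This is a comment\n2 asdf\n# This is a comment\n2 asdf\n' |
    sed '/^#/d')     # delete every line that starts with #

printf '%s\n' "$no_comments"
```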
Another common sed construction is:
s/A/B/
where s stands for substitute. So this means replace A with B. By default, this only works for the first
occurrence of A on each line, but if you put a g at the end, for global, all As are replaced:
s/A/B/g
For example, the following replaces the first occurrence of kitty with X:
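The command is missing here. A sketch on a made-up sentence, shown with and without the g flag:

```shell
first_only=$(echo "hello kitty. goodbye kitty" | sed 's/kitty/X/')
all_of_them=$(echo "hello kitty. goodbye kitty" | sed 's/kitty/X/g')

echo "$first_only"    # hello X. goodbye kitty
echo "$all_of_them"   # hello X. goodbye X
```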
This is such a useful ability that all text editors allow you to perform find-and-replace as well. By replacing
some text with nothing, you can also use this as a delete:
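Again the command is missing, so this is just an illustration on a made-up string: an empty replacement deletes whatever the pattern matched.

```shell
deleted=$(echo "hello kitty" | sed 's/ kitty//')

echo "$deleted"    # hello
```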
Sed is especially good when you're trying to rename batches of files or directories on the command line. I
often have occasion to use it in for loops. This example of changing file extensions should be very familiar by
now:
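The loop is missing at this point. A sketch of the pattern, assuming we want to turn .txt into .html and working in a scratch directory with two placeholder files:

```shell
workdir=$(mktemp -d)
touch "$workdir/a.txt" "$workdir/b.txt"     # two placeholder files

for f in "$workdir"/*.txt; do
    # sed rewrites the name: a trailing .txt becomes .html
    new=$(echo "$f" | sed 's/\.txt$/.html/')
    mv "$f" "$new"
done

ls "$workdir"
```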
Example 1
Suppose we have some files that start with the prefix myfile and we want to concatenate them together.
However, in the resulting file, we want the first column to be the name of the file from which that row
originated. We can accomplish this as follows:
$ for i in myfile*; do echo "*** "$i" ***"; cat $i | awk -v x=${i} '{print x"\t"$0}' >> files.concat.txt; done
Example 2
Suppose we have a text file of URLs, test_markup.txt:
$ cat test_markup.txt
https://www.google.com
http://www.nytimes.com
http://en.wikipedia.org/wiki/Main_Page
and we want to convert them into HTML links. This is easy with awk:
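The awk command is missing here. One way to write it, as a sketch that uses the URL both as the link target and as the link text:

```shell
links=$(printf 'http://www.nytimes.com\n' |
    awk '{print "<a href=\"" $1 "\">" $1 "</a>"}')

echo "$links"   # <a href="http://www.nytimes.com">http://www.nytimes.com</a>
```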
Notice that we've escaped the quote character with a backslash where necessary.
Example 3
Here's an example from work: the other day I ran multiple instances of a script and had 500 input and output
files. I wanted to check that the number of unique elements in the first column of each corresponding input
and output file was the same. Here was my one-liner:
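$ for i in {1..500}; do a=$( cat input${i}.txt | cut -f1 | sort -u | wc -l ); b=$( cat output${i}.txt | cut -f1 | sort -u | wc -l ); echo -e $i"\t"$a"\t"$b; done | awk '$2!=$3'
Or, spread over several lines for readability: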
$ for i in {1..500}; do
a=$( cat input${i}.txt | cut -f1 | sort -u | wc -l );
b=$( cat output${i}.txt | cut -f1 | sort -u | wc -l );
echo -e $i"\t"$a"\t"$b;
done | awk '$2!=$3' # print if col2 not equal to col3
So, any time these numbers disagreed, awk printed the line.
Example 4
Now let's take a detour into the world of bioinformatics, a field in which studying sequencing data is one of
the chief pursuits. (Because bioinformatics relies on numerous different standard, parsable file formats, it's
perhaps the best Petri dish ever created for learning bash, Perl, Python, awk, and sed.) For this example, all
you need to know is that a fasta file is one type of file format in which sequencing data is stored. In fasta
format, every sequence has an ID line, which begins with > followed by some sequence, which is allowed to
span multiple lines. For DNA, the sequence is composed of the letters A C T G (also called base pairs or
nucleotides). A fasta file of two genes could look like this:
$ cat myfasta.fa
>GeneA
ATGCTGAAAGGTCGTAGGATTCGTAG
>GeneB
ATGAACGTAA
(if genes were this small) Suppose we want to create a file with ID vs length. In this example, it would be:
$ cat myfasta.fa | awk '{if ($1 ~ />/) {printf "%s\t", $1} else {print length}}' | sed 's|>||'
GeneA 26
GeneB 10
Another common chore is filtering a fasta file. Sometimes we want to ignore small sequences—say, anything
20 or fewer nucleotides—because either we don't believe they're real or because we'll have trouble aligning
them. This is simply:
$ cat myfasta.fa | awk '{if ($0 ~ />/) {id=$0} else if (length > 20) {print id; print}}'
>GeneA
ATGCTGAAAGGTCGTAGGATTCGTAG
To be 100% accurate, the sequence portions of a fasta file can spill over onto more than one line and the
above examples don't account for this. We can handle this by first piping our fasta file into:
$ cat myfasta.fa | awk 'BEGIN{f=0}{if ($0~/^>/) {if (f) printf "\n"; print $0; f=1} else printf $0}END{printf "\n
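The one-liner above is cut off in this copy. As a readable sketch of the same idea (not the original command), the hypothetical function unwrap_fasta joins each record's sequence lines into a single line:

```shell
unwrap_fasta() {
    awk '
    /^>/ {                        # ID line
        if (seen) printf "\n"     # close the previous record sequence line
        print
        seen = 1
        next
    }
    { printf "%s", $0 }           # sequence line: print it without a newline
    END { if (seen) printf "\n" } # close the last record
    ' "$@"
}

# Made-up record whose sequence spills over two lines
printf '>GeneA\nATG\nCTG\n>GeneB\nAAA\n' | unwrap_fasta
```

The output can then be piped into the length and filter commands above.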
Here is a more difficult problem. When the sequencing machine does its magic, sometimes it can't determine
what the base is, so instead of writing A C T or G it uses the character N to denote an unknown.
Suppose you have a fasta file—perhaps the sequences are thousands of base pairs long—and you want to
know where the coordinates of these stretches of Ns are. As a test case, let's look at the following miniature
fasta file:
$ cat test.fa
>a
ACTGNNNAGGTNNNA
>b
NNACTGNNNAGGTNN
In the long tradition of one-liners, here's what I wrote to solve this problem:
$ cat test.fa | awk 'BEGIN{flag=0}{if ($0 ~ />/) {if (flag==1) {print nstart"\t"nend;}; print; pos=0; flag=0} els
Got that? Probably not, because the strength of one-liners has never been readability. Let's expand this line
so we can see what it's doing:
        flag=0;
    }
    else # sequence line
    {
        # loop through sequence
        for (i=1; i<=length; i++)
        {
            # increment position
            pos++;
            # grab the i-th letter
            c=substr($0,i,1);
Looks like we got the ranges right! Let's talk through this.
In the case we're on an ID line, we reset our position variable and our flag. The flag is a boolean (meaning it
will be either 1, hi, or 0, low) which we're going to set hi every time we see an N. This plays out in the
following way: once we hit a sequence line, we're going to loop over every single character. If the character
is an N and the flag is low—meaning the previous character was not an N—then we record this position,
pos, as being both the start and end of an N streak: nstart=pos and nend=pos. If the character is an N and
the flag is hi, it means the previous character was an N—hence, we're in the middle of a streak of Ns—so
the start position is whatever it was when we saw the first N and we just have to update the end position. If
the character is not an N but the flag is hi, it means the previous character was an N and we just finished
going through a streak, so we'll print the range of the streak and set the flag low to signify we're no longer in
a streak.
There's one more subtlety: what if the sequence line ends on an N character? Then we won't end up printing
it because our logic only prints things the next time we see a non-N. To solve this issue, we add a line in the
if (line == ID line) block to check if our flag is hi. If it is, we print the last-seen range before clearing the
variables. And there's an even more subtle point: what if the last line of our file ends on an N? In this case,
that fix won't work because we won't be hitting any more ID lines. To solve this, we use the END{ } block: if
the flag is hi and we're done reading the file, we'll print the last-seen range. In this example, our fasta file
was small and we could eyeball the N ranges but, with a monster file, we'd need to trust our awk rather than
our eyes.
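Since the expanded script is only partly legible here, the following is a sketch that follows the logic just described rather than a copy of the original. The function names n_ranges and report are my own; the variable names pos, flag, nstart and nend come from the text.

```shell
n_ranges() {
    awk '
    # print the streak we are in, if any, and leave it
    function report() { if (flag) print nstart "\t" nend; flag = 0 }

    /^>/ { report(); print; pos = 0; next }    # ID line: flush, reset position

    {                                          # sequence line
        for (i = 1; i <= length($0); i++) {
            pos++
            c = substr($0, i, 1)
            if (c == "N") {
                if (!flag) { nstart = pos; flag = 1 }   # a new streak starts
                nend = pos                              # extend the streak
            } else {
                report()                                # a streak just ended
            }
        }
    }

    END { report() }                           # file ended in the middle of a streak
    ' "$@"
}

printf '>a\nACTGNNNAGGTNNNA\n>b\nNNACTGNNNAGGTNN\n' | n_ranges
```

On the miniature test.fa above this prints the ranges 5-7 and 12-14 under >a, and 1-2, 7-9 and 14-15 under >b, each as a tab-separated start and end.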
Example 5
Here's a final example, leaving the world of bioinformatics. Suppose you want to add numbers above each
column at the top of a text file. If your file looks like this:
hellokitty kitty
hellokitty kitty
then you want to turn it into this:
1 2 3
hellokitty kitty
hellokitty kitty
Easy, right? It's actually not trivial because a long word in one of the columns is going to make it so you can't
simply print the numbers separated by tabs if you want them to line up with the columns. To get the spacing
right, you might need more than one tab in some cases. So, let's awk it:
What's going on? First, we're going to use the first row only (NR==1) and any other row is printed as is. To
number the columns, we iterate through the number of tab-delimited fields, NF. There's a word in the i-th
field, and we add int(length($i)/8) extra tabs. This is because we need a new tab every eight characters: if
the word is 6 letters long, we don't need any extra tabs, but if it's 12 letters we need one. Then for every field
we add one tab unless we're on the last field (i==NF), in which case we add a newline. Finally, we print the
first line as is.
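The awk command itself is missing above, so here is a sketch reconstructed from that description. The function name number_columns is hypothetical, and the eight-character tab stop is the assumption stated in the text.

```shell
number_columns() {
    awk -F"\t" '
    NR == 1 {
        for (i = 1; i <= NF; i++) {
            printf "%d", i
            # one extra tab for every full 8 characters in the i-th word
            for (j = 1; j <= int(length($i) / 8); j++) printf "\t"
            # a tab between fields, a newline after the last one
            printf "%s", (i == NF ? "\n" : "\t")
        }
    }
    { print }    # every row, including the first, is printed as is
    ' "$@"
}

printf 'hellokitty\tkitty\nhellokitty\tkitty\n' | number_columns
```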
Regular Expressions (regex for short) are a way of describing patterns in text—and once you've gone to the
trouble of describing a pattern, you can write code to match it, extract some piece of it, or parse it in just
about any way you like. Regex come up in so many walks of programming that it's critically important to
understand the basics, even if you're not a guru.
tttxc234
wer1
asfwaffffffff2342525
All follow the pattern: some number of letter characters followed by some number of numerical characters.
You could use a regex to test if a string matches this pattern or to grab the numerical part of any text that fits
this pattern.
[email protected]
[email protected]
[email protected]
all follow the pattern: string of non-@ characters followed by one @ followed by a string of non-@
characters. You can imagine that a form on the internet which requires an email address uses a regex to
verify that it follows this pattern.
And you know from above that these constructions are handy in loops. In the next section, to confuse you,
we'll see that asterisk means something different in the regex context.
This makes Perl behave just like awk—and it obeys some of the same conventions, such as using the
BEGIN and END keywords. You can read more about Perl's command line options here and in even gorier
detail here, but let's see how to use it. Suppose you have a text file and you want to format it as an HTML
table:
$ cat test_table.txt
x y z
1 2 3
a b c
Doing this by hand would be pure torture. Let's do it with a one-liner on the command line:
All we're doing here is embedding each line in a table row (tr) tag, and each field in a table data (td) tag, plus
printing table tags at the beginning and end of the file.
If we have to deal with nontrivial regex on the command line, we'll be glad we're using Perl, not bash or awk.
Somewhere off the internet, I stole this regex cheat sheet:
Syntax    Equivalent Syntax    What It Represents
\d        [0-9]                Any digit
To get a feeling for how to use these, let's take an example file such that:
$ cat test.txt
889
tttxc234
wer1
CAT
asfwaffffffff2342525
Everything obeys the pattern non-digit string digit string except for 889, just digits, and CAT, just non-digits.
Look at the following:
tests for a match against a regular expression. Referring to our cheat sheet, the above command prints
every row because every row has at least 0 digits. Let's change the asterisk to a plus sign:
This prints everything with at least 1 digit, which is every row except CAT. Let's invert this and print the rows
with non-digits:
tttxc234
wer1
CAT
asfwaffffffff2342525
This prints everything with at least 1 non-digit, which is every row except 889. We can try to match a more
specific pattern:
This prints everything with the pattern at least 1 digit, at least 1 non-digit, which no rows follow. What about
this?
It prints everything with the pattern at least 1 non-digit, at least 1 digit, which three rows follow.
Let's take another example, the one with emails. You have a file such that:
$ cat mail.txt
[email protected]
malformed.hotmail.com
malformed@@hotmail.com
[email protected]
[email protected]
Then to get the strings that are appropriately formatted as emails, we could do the following:
These do the same thing, but we're being a little more explicit in the second case. Escaping the @ sign with
a backslash isn't a bad idea because in Perl @ can denote an array. If we wanted to grab lines with two @s, the
syntax would be:
malformed@@hotmail.com
Answer:
Aside from doing matching and extracting pieces of patterns with regex, you can also do substitutions, use
variables and logic, and more. Read more about Perl regex here and read about Python regex here. I wrote
a short (unpolished) Perl Wiki which also has some more examples here.
Borrowing a bit from my Vim wiki, Vim is a powerful text editor that allows you to do almost anything you can
dream up in a text file. It has so many commands, you can almost think of it as a text editor-cum-
programming language. One of the nicest things about Vim is that it's always available in your terminal. If
you use a GUI text editor, like Sublime or Aquamacs, then every time you switch computers, you have to
make sure you've installed the program. If you've ssh-ed into a remote server and you're trying to edit a file
with one of these GUI text editors, then you have to locally mount the server to edit the file—another
headache. If these two reasons aren't compelling enough to switch to Vim, it's also a good editor in its own
right.
The downside is that Vim isn't the easiest thing to learn, and it sometimes seems geared towards those with
a love of the technical. But there is an extensive user manual, a Tips Wiki, and vim.org, which admirably
states:
The most useful software is sometimes rendered useless by poor or altogether missing documentation. Vim
refuses to succumb to death by underdocumentation.
(The manual weighs in at over 250 pages—mission accomplished!) Vim is such a big subject that the goal of
this section will simply be to get your feet wet and set you pointing in the right direction.
Here's a random snippet of the utility wc's source code (written in C) edited in Vim:
The yellow numbers on the left are line numbers, not part of the document. What about the word INSERT
you see at the bottom? The first lesson of Vim is that it has modes. In insert mode, you write text into your
document, just as in any other editor. But in normal mode, where you start when you open Vim, you issue
commands to the editor. To leave normal mode and enter insert mode, you have various choices, including:
o create new line (below cursor), enter insert mode at the beginning of it
O create new line (above cursor), enter insert mode at the beginning of it
If you type a colon ( : ) in normal mode, you enter command mode (technically called command-line mode,
but not to be confused with the unix command line) where you can enter syntactically more complex
commands at the prompt. Delete the colon to return to normal mode. If you type a v in normal mode, you
toggle visual mode, in which you can highlight text. For this section, I'll assume you're in normal mode
(when this is not quite true, I'll use the notation :somecommand instead of saying, more ponderously, go into
command-line mode and enter somecommand).
:w save
:q quit
Some basic commands to navigate through your file are:
27gg go to line 27
:27 go to line 27
Here are some basic copying (or "yanking") and pasting commands:
u undo
Ctrl-r redo
To find text in Vim, type a slash ( / ) followed by what you're searching for. To step through each instance of
what you found, it's:
This completes the whirlwind tour. Consider yourself officially weaned off of nano. For further reading, I put
together a quick and dirty Vim wiki here.
Problem 1
Question: How do we print the most recently modified (or created) file in the cwd?
Answer: ls has a -t flag, which sorts "by time modified (most recently modified first)", so the solution is
simply:
$ ls -t | head -1
Problem 2
Question: How do we print the number of unique elements in column 3 (delimiting on tab) of a text file? For
example, suppose tmp.txt is:
Answer: We can accomplish this with a simple pipeline using the -u flag of sort to sort column 3 uniquely:
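The pipeline is missing here. A sketch, run on a made-up three-column stand-in for tmp.txt since the original file isn't shown:

```shell
# Hypothetical tab-delimited data; column 3 holds x, y, x
n_unique=$(printf '1\ta\tx\n2\tb\ty\n3\tc\tx\n' |
    cut -f3 |      # keep column 3 (cut delimits on tab by default)
    sort -u |      # sort it, keeping one copy of each value
    wc -l)         # count what is left

echo $n_unique     # 2
```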
Problem 3
Question: How do we print only the odd-numbered rows of a file? For example, suppose tmp.txt is:
$ cat tmp.txt
aaa
bbb
ccc
ddd
eee
fff
ggg
hhh
iii
jjj
Answer: Let's explicitly list the row number as well as the row number mod 2:
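The commands are missing at this point. A sketch on the first five rows of tmp.txt: first the listing, then the filter it suggests.

```shell
rows=$(printf 'aaa\nbbb\nccc\nddd\neee\n')   # first five rows of tmp.txt

# Row number, row number mod 2, then the row itself
printf '%s\n' "$rows" | awk '{print NR, NR % 2, $0}'

# Odd-numbered rows are exactly those where NR mod 2 is 1
odd=$(printf '%s\n' "$rows" | awk 'NR % 2 == 1')
echo $odd    # aaa ccc eee
```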
Problem 4
Question: Given two files, how do you find the rows that are only in one of them? For example, suppose
file1.txt is:
$ cat file2.txt
3
1
200
324
95
Every row in the first file is somewhere in the second file except for the rows with 10 and a b c.
Answer: To find the rows that are unique to the first file, try:
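The command is missing here. This sketch follows the explanation given in the text; file1.txt is not shown in this copy, so the version below is a made-up stand-in that shares some rows with file2.txt and adds the rows 10 and a b c.

```shell
workdir=$(mktemp -d)
printf '3\n1\n200\n324\n95\n'    > "$workdir/file2.txt"   # rows of file2.txt as shown above
printf '1\n10\n200\na b c\n95\n' > "$workdir/file1.txt"   # hypothetical file1.txt

# file2.txt is listed twice, so none of its rows can survive uniq -u
only_in_1=$(cat "$workdir/file1.txt" "$workdir/file2.txt" "$workdir/file2.txt" |
    sort | uniq -u)

printf '%s\n' "$only_in_1"
```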
Every row in file2.txt is duplicated because we cat the file twice. And every row file1.txt and file2.txt share
will appear at least 3 times. Then we pipe this into uniq -u which will output only unique rows for a sorted
file.
Problem 5
My friend, Albert, gave me this elegant example. Here are the first 300 bases of chromosome 17 of the
human genome (build GRCh37) in a file chr17_300bp.fa:
Question: How would you count the number of As, Ts, Cs, and Gs in this sequence?
Answer: We can discard the identifier line with a grep -v ">". The command fold restricts the number of
characters printed per line, allowing us to turn a row into a column, as in:
Now we can sort this file, so all the A, T, C, and G rows go together, and then count them with uniq -c:
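The commands are missing in this copy. Putting the steps together as a sketch, on a tiny made-up record rather than the real chr17_300bp.fa:

```shell
counts=$(printf '>chr17\nAAGCTTCT\nGGAA\n' |
    grep -v ">" |     # discard the identifier line
    fold -w 1 |       # one character per line: the row becomes a column
    sort |            # bring identical letters together
    uniq -c)          # count each run

printf '%s\n' "$counts"
```

Each output row is a count followed by the letter it belongs to.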
Problem 6
Changing a text file from one format into another is a common scripting chore. Suppose you have a file,
example_data.txt, that looks like this:
,height,weight,salary,age
1,106,111,111300,62
2,124,91,79740,40
3,127,176,15500,46
Question: How do we convert it into the following?
1 height 106
2 height 124
3 height 127
1 weight 111
2 weight 91
3 weight 176
1 salary 111300
2 salary 79740
3 salary 15500
1 age 62
2 age 40
3 age 46
?
These two different styles are sometimes called wide format data and narrow (or long) format data. (If you're
familiar with the language R, the package reshape2 can toggle between these two formats).
Answer: Taking this step by step, let's cut everything after the first comma-delimited field:
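The command for this step is missing here. Judging from the full one-liner further down, it is a cut on the comma; a sketch on the first two rows of example_data.txt:

```shell
trimmed=$(printf ',height,weight,salary,age\n1,106,111,111300,62\n' |
    cut -d"," -f2-)     # keep field 2 onwards, dropping the row-number column

printf '%s\n' "$trimmed"
# height,weight,salary,age
# 106,111,111300,62
```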
Now we're going to transpose the data to make it easier to work with—one of the many great things GNU
datamash does:
Finally, we'll throw in a line of awk to finish the job:
awk '{for (i = 2; i <= NF; i++){print i-1"\t"$1"\t"$i}}'
This loops from the second field to the last field and prints a row for each iteration of the loop. Here is the
full one-liner:
$ cat example_data.txt | cut -d"," -f2- | sed 's|,|\t|g' | datamash transpose | awk '{for (i = 2; i <= NF; i++){print i-1"\t"$1"\t"$i}}'
1 height 106
2 height 124
3 height 127
1 weight 111
2 weight 91
3 weight 176
1 salary 111300
2 salary 79740
3 salary 15500
1 age 62
2 age 40
3 age 46
If this seems a little messy, it is, and that's a cue to switch to a more powerful parsing language. Let's revisit
this example when we discuss the Python shell below, so we can see how Python cuts through it like a knife
through butter.
Problem 7
You should avoid using spaces in file names on the command line, because it can cause all sorts of
annoyances. Use an underscore or dash instead.
Question: How many files and directories on your computer have spaces in their names?
Answer: To count every single file or directory on our computer, the command is:
$ sudo find / | wc -l
98912
(I'm testing this on Ubuntu Server 14.04 LTS.) How many have a space in their names?
But we just want to count the directory or file the first time we see it, not recursively count all the non-space-
containing children of a space-containing parent. We can do this as follows:
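The command is missing here. This sketch follows the explanation given in the text, run on a few made-up paths instead of the output of sudo find /:

```shell
# Hypothetical output of find: two names with spaces, plus a child of one of them
paths='/home/me/my docs
/home/me/my docs/report.txt
/home/me/notes.txt
/home/me/old stuff'

# -F"/" splits each path on slashes, so $NF is the file or directory name itself
n_spaced=$(printf '%s\n' "$paths" | awk -F"/" '$NF ~ / /' | wc -l)

echo $n_spaced    # 2: report.txt is not counted, although its parent has a space
```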
This works by telling awk to print only the lines such that the last field (delimiting on slash) has a space in it.
Here's an even simpler solution, via Albert:
Conclusion: only 0.07% of files and directories follow the poor naming convention of using a space.
Example Bash Scripts
When one-liners fail, it's time to script. When we were introduced to scripting above, we didn't have many
tools in our toolkit. Now that we do, let's consider some example bash scripts, picking our battles for things
bash can do more cleanly than Python or Perl.
Example 1
We've already seen the construction:
which allows us to look at a script in our PATH without having to remember its exact location. The following
example is courtesy of my friend, Albert, who packaged this up into a script called catwhich:
#!/bin/bash
# look up the full path of the script; discard any error from which
file=$(which "$1" 2>/dev/null)
# open it only if which found a regular file
if [[ -f "$file" ]]; then
    vim "$file"
else
    echo "file \"$1\" does not exist!"
fi
The -f file test operator, says The Linux Documentation Project, checks if the "file is a regular file (not a
directory or device file)." The path /dev/null, where any potential error from which gets routed, is a special
unix path that acts like a black hole: anything written there disappears. Functionally, it just
suppresses any error because we don't care what it is.
Example 2
In lab we often run scripts in parallel. Each instance of a script will produce two log files, one containing the
stdout of the script and one containing the stderr of the script. We sometimes use a convention of
printing a [start] at the beginning of the stdout and an [end] at the finish. This is one way of enabling us to
check if the job finished. An output log file might look like this:
[start]
[end]
Here's a script, checkerror, that loops through all of the stdout log files (I assume they have the extension .o)
in a directory and checks for these start and end tags:
#!/bin/bash
We're again making use of /dev/null because we don't want grep to return the lines it finds—we just want
to know if it's successful. We haven't encountered basename before. This simply gets the name of a file
from a path, so we don't output the whole file path.
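Here's a rough sketch of how a check like this could be written, assuming the .o extension and the [start] and [end] tags described above. The function name check_logs and the wording of the messages are my own inventions for illustration, not the original checkerror.

```shell
#!/bin/sh
# check_logs DIR: report every stdout log (*.o) in DIR that lacks a tag
check_logs() {
    for f in "$1"/*.o; do
        # if there are no .o files the glob stays literal, so skip it
        [ -e "$f" ] || continue
        # -F treats the pattern as a fixed string, so [ and ] are literal;
        # grep's output goes to /dev/null because only its exit code matters
        if ! grep -F '[start]' "$f" > /dev/null; then
            echo "$(basename "$f"): no [start] tag"
        fi
        if ! grep -F '[end]' "$f" > /dev/null; then
            echo "$(basename "$f"): no [end] tag"
        fi
    done
}
```

Calling check_logs on a directory of logs prints nothing if every job finished cleanly, which makes it easy to spot the stragglers.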
Example 3
The last example is one I wrote to test if the (bioinformatics) programs bwa, blastn, and bedtools are in the
user's PATH:
#!/bin/bash
# define dependencies
dependencies="bwa blastn bedtools"
for i in $dependencies; do
    if ! which "$i" > /dev/null; then
        echo "[error] $i not found"
        # exit with a non-zero code so callers can tell the check failed
        exit 1
    fi
done
If, for example, you're transferring a program with dependencies to somebody, this double-checks that they
have all of the dependencies installed and accessible in their PATH. It works using the command which,
which throws a bad exit code if its argument isn't found in the PATH. These lines might be something you'd
see at the beginning of a script rather than a complete script in their own right.
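A variation on the same idea, sketched with the shell built-in command -v in place of which (my substitution, not what the script above uses). The function name check_deps is hypothetical. It reports every missing program instead of stopping at the first one.

```shell
#!/bin/sh
# check_deps NAME...: return 0 if the shell can find every NAME, else 1.
# Note: command -v also reports shell builtins and functions, so it is
# slightly broader than a pure PATH lookup.
check_deps() {
    status=0
    for dep in "$@"; do
        if ! command -v "$dep" > /dev/null 2>&1; then
            echo "[error] $dep not found" >&2
            status=1
        fi
    done
    return "$status"
}
```

At the top of a script you might then write check_deps bwa blastn bedtools || exit 1.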
I'm a new convert to Python and that's why I've leaned on Perl in some of the examples. However, these
days it seems inescapable that Perl is the past and Python is the future. Python has many awesome
features and one is the ability to run a Python shell simply by entering:
$ python
You can do a million things with this shell, but it's certainly the easiest way to do mathematics in the terminal.
(Another convenient way is using the R shell, but you have to install R first.) For example:
$ python # open the Python shell
>>> 234+234
468
>>> import math
>>> math.log(234,3)
4.96564727304425
>>> math.log(234,2)
7.870364719583405
>>> math.pow(2,8)
256.0
>>> math.cos(3.141)
-0.9999998243808664
>>> math.sin(3.141)
0.0005926535550994539
You can read more about Python's math module here. If you want to do more sophisticated mathematics,
you can install the NumPy package.
The Python command line is great not just for math but for many tasks where bash feels clumsy. Remember
our parsing example above where we had a file, example_data.txt:
,height,weight,salary,age
1,106,111,111300,62
2,124,91,79740,40
3,127,176,15500,46
and wanted to reshape it into one record per line, like this:
1 height 106
2 height 124
3 height 127
1 weight 111
2 weight 91
3 weight 176
1 salary 111300
2 salary 79740
3 salary 15500
1 age 62
2 age 40
3 age 46
If you don't know Python, skip this section. If you do, how can you do it in the Python shell? Let's start by
opening the file and reading its contents:
>>> contents
',height,weight,salary,age\n1,106,111,111300,62\n2,124,91,79740,40\n3,127,176,15500,46\n'
Let's convert this string into a list, by splitting on the newline character:
>>> contents.split('\n')
[',height,weight,salary,age', '1,106,111,111300,62', '2,124,91,79740,40', '3,127,176,15500,46', '']
>>> contents.split('\n')[:-1]
[',height,weight,salary,age', '1,106,111,111300,62', '2,124,91,79740,40', '3,127,176,15500,46']
Use list comprehension to split the elements of this list on the comma character:
Now let's use a trick: if A is a list of lists, you can perform a matrix transpose with zip(*A):
This is using list comprehension to meld the first element of the tuple, height, to each subsequent element
and add a numerical index, as well. Let's apply this to each tuple in our list, and join everything with a
newline to finish the job:
Whew - done!
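If you'd rather not retype all that in the interactive shell, the same steps can be fed to Python from bash with a here document. This is only a sketch under a few assumptions: python3 is on your PATH, the file ends with a newline, and the function name melt_csv is something I made up. The Python inside follows the steps above (read, split on newlines, split on commas, transpose with zip, add a numerical index) but is my own wording of them.

```shell
#!/bin/sh
# melt_csv FILE: print "index column value" lines for a wide CSV whose
# first line holds the column names and whose first column holds row labels
melt_csv() {
    python3 - "$1" <<'EOF'
import sys

# read the whole file into one string
with open(sys.argv[1]) as f:
    contents = f.read()

# split on newlines (dropping the trailing empty string), then on commas
rows = [line.split(',') for line in contents.split('\n')[:-1]]

# transpose: each tuple is now a column, e.g. ('height', '106', '124', ...)
columns = list(zip(*rows))

lines = []
for col in columns[1:]:          # skip the column of row labels
    name = col[0]
    # meld the column name to each value and add a 1-based index
    lines.extend('%d %s %s' % (i, name, value)
                 for i, value in enumerate(col[1:], start=1))
print('\n'.join(lines))
EOF
}
```

Running melt_csv example_data.txt should then print the long-format listing shown earlier.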
to access multiple terminal "windows" within the same terminal, err, window and more.
Let's get the terminology straight. In tmux there are:
sessions
windows
panes
Sessions are groupings of tmux windows. For most purposes, you only need one session. Within each
session, we can have multiple "windows" which you see as tabs on the bottom of your screen (i.e., they're
virtual windows, not the kind you'd create on a Mac by typing ⌘N). You can further subdivide windows into
panes, although this can get hairy fast. Here's an example of a tmux session containing four windows (the
tabs on the bottom), where the active window contains 3 panes:
I'm showing off by logging into three different computers—home, work, and Amazon.
In tmux, every command starts with a sequence of keys I'll refer to as Leader. The leader sequence is
Ctrl-b by default but I like to remap it to Ctrl-f [2]. Here are the basics. To start a new tmux session, enter:
$ tmux
To get out of the tmux universe, detach from your current session using Leader d. To list your sessions,
type:
$ tmux ls
0: 4 windows (created Fri Sep 12 16:52:30 2014) [177x37]
This says we have one grouping of 4 virtual windows whose session id is 0. To attach to a session, use:
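For the session listed above, attaching would typically be tmux attach -t 0. As a small sketch built on that, here is a helper that attaches to a session if it exists and creates it otherwise. The function name attach_or_create is hypothetical, and it assumes tmux is installed.

```shell
#!/bin/sh
# attach_or_create [NAME]: attach to the tmux session NAME (default 0),
# or start a new session with that name if none exists yet
attach_or_create() {
    name="${1:-0}"
    if tmux has-session -t "$name" 2>/dev/null; then
        tmux attach -t "$name"
    else
        tmux new -s "$name"
    fi
}
```

With this in your shell startup file, attach_or_create work drops you back into the same session every time you log in.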
<Leader> x Kill the current window
See a longer list of commands here. tmux is amazing—I went from newbie to cheerleader in about 30
seconds!
Note: tmux does not come with your computer by default, so you'll have to download it (see the next
section).
[1] You could circumvent this with nohup, which makes a program persist even if the terminal quits ↑
[2] You can rebind the Leader using the configuration file ~/.tmux.conf ↑
When possible, avoid this approach and instead use a package manager, which takes care of installing a
program—and all of its dependencies—for you. Which package manager you use depends on which
operating system you're running. For Macintosh, it's impossible to live without brew, whose homepage calls
it, "The missing package manager for OS X." For Linux, it depends on your distribution's lineage: Ubuntu has
apt-get; Fedora has yum (dnf on current releases); and so on. All of these package managers can search for packages (i.e., see
what's out there) as well as install them.
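Since the right tool depends on the machine, a script can check which one is available before doing anything. Here's a minimal sketch, where the function name detect_pkg_manager and the order of the candidates are my own choices:

```shell
#!/bin/sh
# detect_pkg_manager: print the first package manager found on this
# machine, or "none" if none of the candidates is installed
detect_pkg_manager() {
    for pm in brew apt-get dnf yum; do
        if command -v "$pm" > /dev/null 2>&1; then
            echo "$pm"
            return 0
        fi
    done
    echo "none"
}
```

The result could then drive a case statement that picks the matching search or install command.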
Let's use the program gpg2 (The GNU Privacy Guard), a famous data encryption tool which implements the
OpenPGP standard, to see what this process looks like. First, let's search:
The exact details may vary on your computer but, in any case, now you're ready to wax dangerous and do
some Snowden-style hacking! (He reportedly used this tool.) I have a quick gpg2 primer here.
Bash in the Programming Ecosystem (or When Not to Use Bash)
All programming languages are the same. This statement is simultaneously true and not true. In the
programming landscape, bash occupies a unique place. We've seen it's definitely not for heavy-duty coding.
Using basic data structures, like arrays or hashes, or doing any kind of math is unpleasant in bash. Look at
the horrible syntax for floating-point arithmetic (via Wikipedia):
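As a rough sketch of what this tends to look like: bash itself only does integer arithmetic, so the usual workaround is to pipe an expression into an external program such as bc or awk. The function names fdiv_bc and fdiv_awk are made up for illustration, and the first one assumes bc is installed.

```shell
#!/bin/sh
# fdiv_bc A B: divide with bc; scale=2 keeps two decimal places
# (bc truncates the result rather than rounding it)
fdiv_bc() {
    echo "scale=2; $1 / $2" | bc
}

# fdiv_awk A B: the same division with awk, rounded to two decimal places
fdiv_awk() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}
```

Either way, a single division costs you a subshell and a pipe or a whole awk program, which is the point being made here.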
Bash's strength is system stuff, like manipulating files and directories. If you do write a bash script, it's often
a wrapper—the glue which ties a bunch of programs together and bends them to your purpose. Many people
would argue that you should seldom write a script in bash. Confine bash to the command line and use a
modern scripting language if you want to script. Python and Perl can do more, have much friendlier syntax,
and make it easy to call system commands.
However, if you're new to programming, start with unix because it will get you acquainted with important
principles like input and output streams, permissions, processes, etc. faster than anything else. Furthermore,
much of the material we covered above is not bash-specific. Perl is, in many ways, an extension of bash and
it shares a lot of the same syntax—for file test operators, system calls with backticks, here documents, and
more.
So how do you go about learning unix? Just as reading a detailed book about violin won't transform you into
a violinist, this article is of limited utility and defers to the terminal itself as the best teacher. You have to pay
your dues and experience counts for a lot because there's so much to learn (I'm already a few years in and I
learn things almost daily). Start using it and crash headfirst into problems. There's no substitute for having
somebody to bother who knows it better than you do. Failing that, there's always Stack Overflow and the rest
of the internet, which has become the greatest computer science manual ever compiled. In programming, I'm
always surprised how much more a couple years of work taught me than a couple years of class, and
perhaps this speaks volumes about the merits of apprenticeship versus classroom instruction.
Unix—with its power, elegance, and scope—is a modern marvel. If you know it and your friends don't, it's a
bit like having been exposed to some awesome book or song that they haven't discovered yet. To see all the
commands at your fingertips, type tab tab on the command line (there are over 2200 on my Mac!). Or do
some reading for pleasure in:
$ man bash
$ info coreutils
to remind yourself what you're working with. I bet you didn't think anything this cool came out of the 70s! :D
Albert Lee
Vladimir Trifonov
Alex North-Keys
Guido Jansen
I don't like remembering syntax and I forget what I've learned quickly. Thus this article started out as a memory aid:
miscellaneous command line-alia for which I wanted a reference done in my own particular way, prejudices included. Its waistline
expanded to its present (large) girth and it got a little play on the internet in early 2015. If I hadn't appreciated the importance of
editors and fact-checkers, I do now, as the editing of this page was nerve-wrackingly crowd-sourced in real time. I've since
corrected many of the errors people pointed out as well as polished and added sections, and the article is better for it.
If you've read this far, you know that this article is not, speaking in rigid literals, an introduction to unix. It's not about the nitty-gritty
of how operating system kernels work. And it's not even about UNIX®, which is still a trademark—"the worldwide Single UNIX
Specification integrating X/Open Company's XPG4, IEEE's POSIX Standards and ISO C" (say what!?), according to the official
website. These days UNIX® refers to a detailed specification rather than software. This article, in contrast, is about command line
basics. But it's not about all command lines, which would include rubbish like Windows Powershell. It's about unix-like command
This is what I mean when I use "unix" as a colloquial shorthand. However, to begin the article by embroiling readers in these
There were other comments but the funniest one came from my dad:
P.S. I see 2 keys at bottom of my keyboard that say, "command." Is this what all the fuss is about? I'm afraid if I touched them the
© 2018 Oliver