• GNOME is a popular desktop environment and graphical user interface that runs on top of the Linux
operating system.
• The default display manager for GNOME is called gdm.
• The gdm display manager presents the user with the login screen which prompts for the login username
and password.
• Logging out through the desktop environment kills all processes in your current X session and returns to
the display manager login screen.
Summary
You have completed this chapter. Let's summarize the key concepts covered:
• You can control basic configuration options and desktop settings through the System Settings panel
• Linux always uses Coordinated Universal Time (UTC) for its own internal time-keeping. You can set
Date and Time Settings from the System Settings window.
• The Network Time Protocol is the most popular and reliable protocol for setting the local time via Internet
servers.
• The Displays panel allows you to change the resolution of your display and configure multiple screens.
• Network Manager can present available wireless networks, allow the choice of a wireless or mobile
broadband network, handle passwords, and set up VPNs.
• dpkg and RPM are the most popular package management systems used on Linux distributions.
• Debian distributions use dpkg and apt-based utilities for package management.
• RPM was developed by Red Hat, and adopted by a number of other
distributions, including openSUSE, Mandriva, CentOS, Oracle Linux, and others.
Chapter 6
Introduction to Linux Documentation Sources
Whether you are an inexperienced user or a veteran, you won’t always know how to use
various Linux programs and utilities, or what to type at the command line. You will need to
consult the help documentation regularly. Because Linux-based systems draw from a large
variety of sources, there are numerous reservoirs of documentation and ways of getting
help. Distributors consolidate this material and present it in a comprehensive and easy-to-
use manner.
Typing man with a topic name as an argument retrieves the information stored in the
topic's man pages. Some Linux distributions require every installed program to have a
corresponding man page, which explains the depth of coverage. (Note: man is actually an
abbreviation for manual.) The man page structure was first introduced in early versions
of UNIX, in the early 1970s. Besides the man pages, documentation is available from a
number of other sources, including:
• Web pages
• Published books
• Graphical help
• Other formats
man
The man program searches, formats, and displays the information contained in the man
pages. Because many topics have a lot of information, output is piped through a terminal
pager program such as less to be viewed one page at a time; at the same time the information
is formatted for a good visual display.
When no options are given, by default one sees only the dedicated page specifically about
the topic. You can broaden this to view all man pages containing a string in their name by
using the -f option. You can also view all man pages that discuss a specified subject (even
if the specified subject is not present in the name) by using the -k option.
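For example, a minimal pair of invocations (printf is just an illustrative topic; -f is equivalent to the whatis command and -k to apropos):
$ man -f printf
$ man -k printf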
Manual Chapters
The man pages are divided into nine numbered chapters (1 through 9). Sometimes, a letter
is appended to the chapter number to identify a specific topic. For example, many pages
describing part of the X Window API are in chapter 3X.
$ man 3 printf      # view the printf page from chapter 3 (library functions)
$ man -a printf     # view, in succession, all man pages named printf
info
The GNU project's standard documentation format is info, which it prefers as an
alternative to man. The info system is more free-form and supports linked sub-sections.
Functionally, the GNU Info System resembles man in many ways. However, topics are
connected using links (even though its design predates the World Wide Web). Information
can be viewed through a command line interface or a graphical help utility, and it can
also be printed or viewed online.
You can view help for a particular topic by typing info <topic name>. The system then
searches for the topic in all available info files.
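For example (ls is just an illustrative topic):
$ info ls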
Some useful keys are: q to quit, h for help, and Enter to select a menu item.
Nodes are similar to sections and subsections in written documentation. You can move
between nodes or view each node sequentially. Each node may contain menus and linked
subtopics, or items.
Items can be compared to Internet hyperlinks. They are identified by an asterisk (*) at the
beginning of the item name. Named items (outside a menu) are identified with double-
colons (::) at the end of the item name. Items can refer to other nodes within the file or to
other files. The table lists the basic keystrokes for moving between nodes.
Key  Function
n    Go to the next node
p    Go to the previous node
u    Move one node up in the index
Introduction to the help Option
The third source of Linux documentation is the use of the help option.
Most commands have an available short description which can be viewed using the --help
or the -h option along with the command or application. For example, to learn more about
the man command, you can run the following command:
$ man --help
The --help option is useful as a quick reference and it displays information faster than
the man or info pages.
Some commands, such as cd and history, are built into the bash shell itself rather than
being stand-alone programs. For these built-in commands, help performs the same basic
function as the -h and --help arguments perform for stand-alone programs.
You can also start the graphical help system from a graphical terminal using the following
commands:
• GNOME: gnome-help
• KDE: khelpcenter
Package Documentation
Linux documentation is also available as part of the package management system. Usually
this documentation is directly pulled from the upstream source code, but it can also
contain information about how the distribution packaged and set up the software.
Online Resources
There are many places to access online Linux documentation, and a little bit of searching
will get you buried in it.
You can also find very helpful documentation for each distribution. Each distribution has its
own user-generated forums and wiki sections. Here are just a few links to such sources:
Ubuntu: https://help.ubuntu.com/
CentOS: https://www.centos.org/docs/
OpenSUSE: http://en.opensuse.org/Portal:Documentation
Gentoo: http://www.gentoo.org/doc/en
Moreover, you can use online search sites to locate helpful resources from all over the
Internet, including blog posts, forum and mailing list posts, news articles, and so on.
Summary
You have completed this chapter. Let’s summarize the key concepts covered:
• The main sources of Linux documentation are the man pages, GNU Info, the
help options and command, and a rich variety of online documentation sources.
• The man utility searches, formats, and displays man pages.
• The man pages provide in-depth documentation about programs and other topics
about the system including configuration files, system calls, library routines, and the
kernel.
• The GNU Info System was created by the GNU project as its standard
documentation. It is robust and is accessible via command line, web, and graphical
tools using info.
• Short descriptions for commands are usually displayed with the -h or --help argument.
• You can type help at the command line to display a synopsis of built-in commands.
• There are many other help resources both on your system and on the Internet.
Chapter 7
Command Line Operations
Using the command line offers a number of advantages:
• No GUI overhead.
• Virtually every task can be accomplished using the command line.
• You can script tasks and series of procedures.
• You can log on remotely to networked machines anywhere on the Internet.
• You can initiate graphical applications directly from the command line.
Some of the most common terminal emulator programs include:
• xterm
• rxvt
• konsole
• terminator
Virtual Terminals
Virtual Terminals (VT) are console sessions that use the entire display and keyboard
outside of a graphical environment. Such terminals are considered "virtual" because
although there can be multiple active terminals, only one terminal remains visible at a
time. A VT is not quite the same as a command line terminal window; you can have many
of those visible at once on a graphical desktop.
One virtual terminal (usually number one or seven) is reserved for the graphical
environment, and text logins are enabled on the unused VTs. Ubuntu uses VT 7, but
CentOS/RHEL and openSUSE use VT 1 for the graphical display.
An example of a situation where using the VTs is helpful is when you run into problems
with the graphical desktop. In this situation, you can switch to one of the text VTs and
troubleshoot.
To switch between the VTs, press CTRL-ALT-<function key> for the corresponding VT. For
example, press CTRL-ALT-F6 for VT 6. (Actually, you only have to press the ALT-F6 key
combination if you are in a VT not running X and want to switch to another VT.)
A command line consists of up to three basic elements:
• Command
• Options
• Arguments
The command is the name of the program you are executing. It may be followed by one
or more options (or switches) that modify what the command may do. Options usually
start with one or two dashes, for example, -p or --print, in order to differentiate them from
arguments, which represent what the command operates on.
However, plenty of commands have no options, no arguments, or neither. You can also
type other things at the command line besides issuing commands, such as setting
environment variables.
Use the sudo service gdm stop or sudo service lightdm stop commands to stop the
graphical user interface in Debian-based systems. On RPM-based systems, typing sudo
telinit 3 may have the same effect of killing the GUI.
sudo
All the demonstrations created have a user configured with sudo capabilities to provide
the user with administrative (admin) privileges when required. sudo allows users to run
programs using the security privileges of another user, generally root (superuser). The
functionality of sudo is similar to that of run as in Windows.
On your own systems, you may need to set up and enable sudo to work correctly. To do
this, you need to follow some steps that we won't explain in much detail now, but you will
learn about later in this course. When running Ubuntu, sudo is already set up
for you during installation. If you are running a distribution in the Fedora or openSUSE
families, you will likely need to set up sudo to work properly for you after
initial installation.
Next, you will learn the steps to set up and run sudo on your system.
1. You will need to make modifications as the administrative or super user, root. While
sudo will become the preferred method of doing this, we don't have it set up yet, so
we will use su (which we will discuss later in detail) instead. At the command line
prompt, type su and press Enter. You will then be prompted for the root password, so
enter it and press Enter. You will notice that nothing is printed; this is so others
cannot see the password on the screen. You should end up with a different looking
prompt, often ending with '#'. For example:
$ su
Password:
#
2. Now you need to create a configuration file to enable your user account to use sudo.
Typically, this file is created in the /etc/sudoers.d/ directory with the name of the file
the same as your username. For example, for this demo, let's say your username is
"student". After doing step 1, you would then create the configuration file for
"student" by doing this:
# echo "student ALL=(ALL) ALL" > /etc/sudoers.d/student
3. Finally, some Linux distributions will complain if you don't also change permissions on
the file by doing:
# chmod 440 /etc/sudoers.d/student
That should be it. For the rest of this course, if you use sudo you should be properly set
up. When using sudo, by default you will be prompted to give a password (your own user
password) at least the first time you do it within a specified time interval. It is possible
(though very insecure) to configure sudo to not require a password, or to change the time
window in which the password does not have to be repeated with every sudo command.
Basic Operations
In this section we will discuss how to accomplish basic operations from the command line.
These include how to log in and log out from the system, restart or shutdown the system,
locate applications, access directories, identify the absolute and relative paths, and
explore the filesystem.
Once your session is started (either by logging in to a text terminal or via a graphical
terminal program) you can also connect and log in to remote systems via the Secure
Shell (SSH) utility. For example, by typing ssh username@remote-server.com, SSH would
connect securely to the remote machine and give you a command line terminal window,
using passwords (as with regular logins) or cryptographic keys (a topic we won't discuss)
to prove your identity.
The halt and poweroff commands issue shutdown -h to halt the system;
reboot issues shutdown -r and causes the machine to reboot instead of just shutting
down. Both rebooting and shutting down from the command line require superuser (root)
access.
When administering a multiuser system, you have the option of notifying all users prior to
shutdown as in:
$ sudo shutdown -h 10:00 "Shutting down for scheduled maintenance."
Locating Applications
Depending on the specifics of your particular distribution's policy, programs and software
packages can be installed in various directories. In general, executable programs should
live in the /bin, /usr/bin, /sbin, and /usr/sbin directories, or under /opt.
One way to locate programs is to employ the which utility. For example, to find out
exactly where the diff program resides on the filesystem:
$ which diff
If which does not find the program, whereis is a good alternative because it looks for
packages in a broader range of system directories:
$ whereis diff
Accessing Directories
When you first log into a system or open a terminal, the default directory should be your
home directory; you can print the exact path of this by typing echo $HOME. (Note that
some Linux distributions actually open new graphical terminals in $HOME/Desktop.) The
following commands are useful for directory navigation:
Command     Result
pwd         Displays the present working directory
cd ~ or cd  Change to your home directory (shortcut name is ~ (tilde))
cd ..       Change to parent directory (..)
cd -        Change to previous directory (- (minus))
1. Absolute pathname: An absolute pathname begins with the root directory and
follows the tree, branch by branch, until it reaches the desired directory or file.
Absolute paths always start with /.
2. Relative pathname: A relative pathname starts from the present working directory.
Relative paths never start with /.
Multiple slashes (/) between directories and files are allowed, but all but one slash
between elements in the pathname are ignored by the system. ////usr//bin is valid, but is
seen as /usr/bin by the system.
Most of the time it is most convenient to use relative paths, which require less typing.
Usually you take advantage of the shortcuts provided by: . (present directory), .. (parent
directory) and ~ (your home directory).
For example, suppose you are currently working in your home directory and wish to move
to the /usr/bin directory. The following two ways will bring you to the same directory from
your home directory:
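For example (assuming your home directory is /home/student, so that ../.. leads to the root directory):
$ cd /usr/bin        # absolute pathname
$ cd ../../usr/bin   # relative pathname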
Command  Usage
cd /     Changes your current directory to the root (/) directory (or the path you supply)
ls       List the contents of the present working directory
ls -a    List all files, including hidden files and directories (those whose name starts with . )
tree     Displays a tree view of the filesystem
Hard and Soft (Symbolic) Links
ln can be used to create hard links and (with the -s option) soft links, also known as
symbolic links or symlinks. These two kinds of links are very useful in UNIX-based
operating systems. The advantages of symbolic links are discussed below.
Suppose that file1 already exists. A hard link, called file2, is created with the command:
$ ln file1 file2
Note that two files now appear to exist. However, a closer inspection of the file listing
shows that this is not quite true.
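A sketch of what the listing might look like (the inode number, owner, size, and date will differ on your system):
$ ls -li file1 file2
1234567 -rw-rw-r-- 3 student student 1601 Mar 27 15:15 file1
1234567 -rw-rw-r-- 3 student student 1601 Mar 27 15:15 file2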
The -i option to ls prints out in the first column the inode number, which is a unique
quantity for each file object. This field is the same for both of these files; what is really
going on here is that it is only one file but it has more than one name associated with it,
as is indicated by the 3 that appears in the ls output. Thus, there already was another
object linked to file1 before the command was executed.
Symbolic Links
Symbolic (or Soft) links are created with the -s option as in:
$ ln -s file1 file4
$ ls -li file1 file4
Notice file4 no longer appears to be a regular file, and it clearly points to file1 and has a
different inode number.
Symbolic links take no extra space on the filesystem (unless their names are very long).
They are extremely convenient as they can easily be modified to point to different places.
An easy way to create a shortcut from your home directory to long pathnames is to
create a symbolic link.
Unlike hard links, soft links can point to objects even on different filesystems (or
partitions) which may or may not be currently available or even exist. In the case where
the link does not point to a currently available or existing object, you obtain
a dangling link.
Hard links are very useful and they save space, but you have to be careful with their use,
sometimes in subtle ways. For one thing, if you remove either file1 or file2 in the example
above, the inode object (and the remaining file name) will remain,
which might be undesirable, as it may lead to subtle errors later if you recreate a file of
that name.
If you edit one of the files, exactly what happens depends on your editor; most editors
including vi and gedit will retain the link by default but it is possible that modifying one
of the names may break the link and result in the creation of two objects.
Navigating the Directory History
The cd command remembers where you were last, and lets you get back there with cd -.
For remembering more than just the last directory visited, use pushd to change the
directory instead of cd; this pushes your starting directory onto a list. Using popd will
then send you back to those directories, walking in reverse order (the most recent
directory will be the first one retrieved with popd). The list of directories is displayed with
the dirs command.
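A short illustrative session (the directory names are arbitrary; pushd and popd print the current directory list after each change):
$ pwd
/home/student
$ pushd /etc
/etc ~
$ pushd /usr/bin
/usr/bin /etc ~
$ dirs
/usr/bin /etc ~
$ popd
/etc ~
$ popd
~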
In Linux, all open files are represented internally by what are called file descriptors.
Simply put, these are represented by numbers starting at zero. stdin is file descriptor 0,
stdout is file descriptor 1, and stderr is file descriptor 2. Typically, if other files are
opened in addition to these three, which are opened by default, they will start at file
descriptor 3 and increase from there.
Below and in chapters ahead, you will see examples which alter where a
running command gets its input, where it writes its output, or where it prints diagnostic
(error) messages.
I/O Redirection
Through the command shell we can redirect the three standard filestreams so that we
can get input from either a file or another command instead of from our keyboard, and we
can write output and errors to files or send them as input for subsequent commands.
For example, if we have a program called do_something that reads from stdin and
writes to stdout and stderr, we can change its input source by using the less-than sign
( < ) followed by the name of the file to be consumed for input data:
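$ do_something < input-file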
If you want to send the output to a file, use the greater-than sign (>) as in:
$ do_something > output-file
Because stderr is not the same as stdout, error messages will still be seen on the
terminal window in the above example.
If you want to redirect stderr to a separate file, you use stderr’s file descriptor number
(2), the greater-than sign (>), followed by the name of the file you want to hold everything
the running command writes to stderr:
$ do_something 2> error-file
A special shorthand notation can be used to put anything written to file descriptor 2
(stderr) in the same place as file descriptor 1 (stdout): 2>&1
$ do_something > all-output-file 2>&1
Pipes
The UNIX/Linux philosophy is to have many simple and short programs (or commands)
cooperate together to produce quite complex results, rather than have one complex
program with many possible options and modes of operation. In order to accomplish this,
extensive use of pipes is made; you can pipe the output of one command or program into
another as its input.
In order to do this we use the vertical-bar, |, (pipe symbol) between commands as in:
$ command1 | command2 | command3
The above represents what we often call a pipeline and allows Linux to combine the
actions of several commands into one. This is extraordinarily efficient
because command2 and command3 do not have to wait for the previous pipeline
commands to complete before they can begin hacking at the data in their input streams;
on multiple CPU or core systems the available computing power is much better utilized
and things get done quicker. In addition there is no need to save output in (temporary)
files between the stages in the pipeline, which saves disk space and reduces reading and
writing from disk, which is often the slowest bottleneck in getting something done.
In this section, you will learn how to use the locate and find utilities, and how to use
wildcards in bash.
locate
The locate utility program performs a search through a previously constructed database
of files and directories on your system, matching all entries that contain a specified
character string. This can sometimes result in a very long list.
To get a shorter, more relevant list, we can use the grep program as a filter; grep will print
only the lines that contain one or more specified strings, as in:
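$ locate zip | grep bin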
which will list all files and directories with both "zip" and "bin" in their name. (We will
cover grep in much more detail later.) Notice the use of | to pipe the two commands
together.
locate utilizes the database created by another program, updatedb. Most Linux systems
run this automatically once a day. However, you can update it at any time by just
running updatedb from the command line as the root user.
Wildcard  Result
?         Matches any single character
*         Matches any string of characters
[set]     Matches any character in the set of characters, for example [adf] will match any occurrence of "a", "d", or "f"
[!set]    Matches any character not in the set of characters
To search for files using the ? wildcard, replace each unknown character with ?, e.g. if
you know only the first 2 letters are 'ba' of a 3-letter filename with an extension
of .out, type ls ba?.out.
To search for files using the * wildcard, replace the unknown string with *, e.g. if you
remember only that the extension was .out, type ls *.out.
For example, administrators sometimes scan for large core files (which contain
diagnostic information after a program fails) that are more than several weeks old in order
to remove them. It is also common to remove files in /tmp (and other temporary
directories, such as those containing cached files) that have not been accessed recently.
Many distros use automated scripts that run periodically to accomplish such house
cleaning.
Using find
When no arguments are given, find lists all files in the current directory and all of its
subdirectories. Commonly used options to shorten the list include -name (only list files
with a certain pattern in their name), -iname (also ignore the case of file names), and -type
(which will restrict the results to files of a certain specified type, such as d for directory,
l for symbolic link or f for a regular file, etc).
find can also run commands on the files it matches, using the -exec option; within the
command, the token {} stands for the matched file name. Note that you have to end the
command with either ';' (including the single-quotes) or \;. Both forms are fine.
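For example, a typical (and potentially destructive) invocation that deletes editor swap files under the current directory:
$ find . -name "*.swp" -exec rm {} ';'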
One can also use the -ok option which behaves the same as -exec except that find will
prompt you for permission before executing the command. This makes it a good way to
test your results before blindly executing any potentially dangerous commands.
Here, -ctime is when the inode meta-data (i.e., file ownership, permissions, etc) last
changed; it is often, but not necessarily when the file was first created. You can also
search for accessed/last read (-atime) or modified/last written (-mtime) times. The number
is the number of days and can be expressed as either a number (n) that means exactly
that value, +n which means greater than that number, or -n which means less than that
number. There are similar options for times in minutes (as in -cmin, -amin, and -mmin).
$ find / -size 0
Note the size here is in 512-byte blocks, by default; you can also specify bytes (c),
kilobytes (k), megabytes (M), gigabytes (G), etc. As with the time numbers above, file
sizes can also be exact numbers (n), +n or -n. For details consult the man page for find.
For example, to find files greater than 10 MB in size and running a command on those
files:
$ find / -size +10M -exec command {} ';'
Viewing Files
You can use the following utilities to view files:
Command  Usage
cat      Used for viewing files that are not very long; it does not provide any scroll-back.
tac      Used to look at a file backwards, one line at a time.
less     Used to view larger files because it is a paging program; it pauses at each screenful of text, provides scroll-back capabilities, and lets you search and navigate within the file. Note: Use / to search for a pattern in the forward direction and ? for a pattern in the backward direction.
tail     Used to print the last 10 lines of a file by default. You can change the number of lines by doing -n 15 or just -15 if you wanted to look at the last 15 lines instead of the default.
head     The opposite of tail; by default it prints the first 10 lines of a file.
touch and mkdir
touch is often used to set or update the access, change, and modify times of files. By
default it resets a file's time stamp to match the current time.
This is normally done to create an empty file as a placeholder for a later purpose.
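For example (myfile is an arbitrary name; the file is created empty if it does not already exist):
$ touch myfile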
• The -t option allows you to set the date and time stamp of the file.
To set the time stamp to a specific time:
$ touch -t 03201600 myfile
This sets myfile's time stamp to 4 p.m., March 20th (03 20 1600).
• To create a sample directory named sampdir under the current directory, type
mkdir sampdir.
• To create a sample directory called sampdir under /usr, type mkdir /usr/sampdir.
Removing a directory is simply done with rmdir. The directory must be empty or the command
will fail. To remove a directory and all of its contents, you have to use rm -rf, as we shall discuss.
Removing a File
Command  Usage
mv       Rename a file
rm       Remove a file
rm -f    Forcefully remove a file
rm -i    Interactively remove a file
If you are not certain about removing files that match a pattern you supply, it is always
good to run rm interactively (rm -i) to prompt before every removal.
While typing rm -rf is a fast and easy way to remove a whole filesystem tree recursively, it
is extremely dangerous and should be used with the utmost care, especially when used by
root (recall that recursive means drilling down through all sub-directories, all the way
down a tree). Below are the commands used to rename or remove a directory:
Command  Usage
mv       Rename a directory
rmdir    Remove an empty directory
rm -rf   Forcefully remove a directory recursively
Modifying the Command Line Prompt
The PS1 variable is the character string that is displayed as the prompt on the command
line. Most distributions set PS1 to a known default value, which is suitable in most cases.
However, users may want custom information to show on the command line. For example,
some system administrators require the user and the host system name to show up on the
command line as in:
student@quad32 $
This could prove useful if you are working in multiple roles and want to be always
reminded of who you are and what machine you are on. The prompt above could be
implemented by setting the PS1 variable to: \u@\h \$
For example:
$ echo $PS1
\$
$ PS1="\u@\h \$ "
coop@quad64 $ echo $PS1
\u@\h \$
coop@quad64 $
There are two broad families of package managers: those based on Debian and those
which use RPM as their low-level package manager. The two systems are incompatible,
but provide the same features at a broad level.
In this section, you will learn how to install, remove, or search for packages using the
different package management tools.
Most of the time, users need to work only with the high-level tool, which will take care of
calling the low-level tool as needed. Dependency tracking is a particularly important
feature of the high-level tool, as it handles the details of finding and installing each
dependency for you. Be careful, however, as installing a single package could result in
many dozens or even hundreds of dependent packages being installed.
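For example, installing the same package with each family's high-level tool might look like the following (the package name tree is just an illustration):
$ sudo apt-get install tree      # Debian family
$ sudo yum install tree          # Fedora/RHEL family
$ sudo zypper install tree       # openSUSE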
Summary (1 of 2)
You have completed this chapter. Let’s summarize the key concepts covered.
• Virtual terminals (VT) in Linux are consoles, or command line terminals that use the
connected monitor and keyboard.
• Different Linux distributions start and stop the graphical desktop in different ways.
• A terminal emulator program on the graphical desktop works by emulating a terminal
within a window on the desktop.
• The Linux system allows you to log in locally via a text terminal, or remotely, for
example using SSH.
• When typing your password, nothing is printed to the terminal, not even a generic
symbol to indicate that you typed.
• The preferred method to shut down or reboot the system is to use the shutdown
command.
• There are two types of pathnames: absolute and relative.
• An absolute pathname begins with the root directory and follows the tree, branch by
branch, until it reaches the desired directory or file.
• A relative pathname starts from the present working directory.
• Using hard and soft (symbolic) links is extremely useful in Linux.
• cd remembers where you were last, and lets you get back there with cd -.
• locate performs a database search to find all file names that match a given pattern.
• find locates files recursively from a given directory or set of directories.
• find is able to run commands on the files that it lists, when used with the -exec
option.
• touch is used to set the access, change, and modify times of files, as well as to create
empty files.
• The Advanced Packaging Tool (apt) package management system is used to
manage installed software on Debian-based systems.
• You can use the Yellowdog Updater Modified (yum) open-source command-line
package-management utility for RPM-compatible Linux operating systems.
• The zypper package management system is based on RPM and used for openSUSE.
Chapter 8
Introduction to Filesystems
In Linux (and all UNIX-like operating systems) it is often said “Everything is a file”, or at
least it is treated as such. This means whether you are dealing with normal data files and
documents, or with devices such as sound cards and printers, you interact with them
through the same kind of Input/Output (I/O) operations. This simplifies things: you open a
“file” and perform normal operations like reading the file and writing on it (which is one
reason why text editors, which you will learn about in an upcoming section, are so
important.)
On many systems (including Linux), the filesystem is structured like a tree. The tree is
usually portrayed as inverted, and starts at what is most often called the root
directory, which marks the beginning of the hierarchical filesystem and is also sometimes
referred to as the trunk, or simply denoted by /. The root directory is not the same
as the root user. The hierarchical filesystem also contains other elements in the path
(directory names) which are separated by forward slashes (/) as in /usr/bin/awk, where the
last element is the actual file name.
In this section, you will learn about some basic concepts including the filesystem hierarchy
as well as about disk partitions.
The Filesystem Hierarchy Standard (FHS) grew out of historical standards from early
versions of UNIX, such as the Berkeley Software Distribution (BSD) and others. The
FHS provides Linux developers and system administrators with a standard directory
structure for the filesystem, which provides consistency between systems and
distributions.
Visit http://www.pathname.com/fhs/ for a list of the main directories and their contents in
Linux systems.
Linux supports various filesystem types created for Linux, along with compatible
filesystems from other operating systems such as Windows and MacOS. Many older,
legacy filesystems, such as FAT, are supported.
Warning: If you mount a filesystem on a non-empty directory, the former contents of that
directory are covered-up and not accessible until the filesystem is unmounted. Thus mount
points are usually empty directories.
The mount command is used to attach a filesystem (which can be local to the computer
or, as we shall discuss, on a network) somewhere within the filesystem tree. Arguments
include the device node and mount point. For example,
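$ sudo mount /dev/sda5 /home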
will attach the filesystem contained in the disk partition associated with the /dev/sda5
device node, into the filesystem tree at the /home mount point. (Note that unless the
system is otherwise configured only the root user has permission to run mount.) If you
want it to be automatically available every time the system starts up, you need to edit the
file /etc/fstab accordingly (the name is short for Filesystem Table). Looking at this file will
show you the configuration of all pre-configured filesystems. man fstab will display how
this file is used and how to configure it.
Typing mount without any arguments will show all presently mounted filesystems.
The command df -Th (disk-free) will display information about mounted filesystems
including usage statistics about currently used and available space.
Using NFS (the Network Filesystem) is one of the methods used for sharing
data across physical systems. Many system administrators mount remote
users' home directories on a server in order to give them access to the same
files and configuration files across multiple client systems. This allows the
users to log in to different computers yet still have access to the same files and
resources.
We will now look in detail at how to use NFS on the server machine.
On the server machine, NFS daemons (built-in networking and service processes in Linux)
and other system servers are typically started with the following command:
$ sudo service nfs start
The text file /etc/exports contains the directories and permissions that a host is willing to
share with other systems over NFS. An entry in this file may look like the following:
/projects *.example.com(rw)
This entry allows the directory /projects to be mounted using NFS with read and write (rw)
permissions and shared with other hosts in the example.com domain. As we will detail in
the next chapter, every file in Linux has 3 possible permissions: read (r), write (w) and
execute (x).
After modifying the /etc/exports file, you can use the exportfs -av command to notify Linux
about the directories you are allowing to be remotely mounted using NFS (restarting NFS
with sudo service nfs restart will also work, but is heavier as it halts NFS for a short while
before starting it up again).
On the client machine, if it is desired to have the remote filesystem mounted automatically
upon system boot, the /etc/fstab file is modified to accomplish this. For example, an entry
in the client's /etc/fstab file might look like the following:
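Here, servername and the paths are illustrative:
servername:/projects /mnt/nfs/projects nfs defaults 0 0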
You can also mount the remote filesystem without a reboot or as a one-time mount by
directly using the mount command:
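$ sudo mount servername:/projects /mnt/nfs/projects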
Remember, if /etc/fstab is not modified, this remote mount will not be present the next
time the system is restarted.
Certain filesystems like the one mounted at /proc are called pseudo filesystems because
they have no permanent presence anywhere on disk.
The /proc filesystem contains virtual files (files that exist only in memory) that permit
viewing constantly varying kernel data. This filesystem contains files and directories that
mimic kernel structures and configuration information. It doesn't contain real files but
runtime system information (e.g. system memory, devices mounted, hardware
configuration, etc). Some important files in /proc are:
/proc/cpuinfo
/proc/interrupts
/proc/meminfo
/proc/mounts
/proc/partitions
/proc/version
/proc/<Process-ID-#>
/proc/sys
The /proc/<Process-ID-#> entries show that there is a directory for every process running
on the system, which contains vital information about it. The /proc/sys entry is a virtual
directory that contains a lot of information about the entire system, in particular its
hardware and configuration. The /proc filesystem is very useful because the information it
reports is gathered only as needed and never needs storage on disk.
Now that you know about the basics of filesystems, let's learn about the filesystem
architecture and directory structure in Linux.
Each user has a home directory, usually placed under /home. The /root (slash-root)
directory on modern Linux systems is no more than the root user's home directory.
The /home directory is often mounted as a separate filesystem on its own partition, or
even exported (shared) remotely on a network through NFS.
Sometimes you may group users based on their department or function. You can then
create subdirectories under the /home directory for each of these groups. For example, a
school may organize /home with something like the following:
/home/faculty/
/home/staff/
/home/students/
The /bin directory contains executable binaries, essential commands used in single-user
mode, and essential commands required by all system users, such as:
Command  Usage
ps       Produces a list of processes along with status information for the system.
ls       Produces a listing of the contents of a directory.
cp       Used to copy files.
To view a list of programs in the /bin directory, type: ls /bin
Commands that are not essential for the system in single-user mode are placed in the
/usr/bin directory, while the /sbin directory is used for essential binaries related to system
administration, such as ifconfig and shutdown. There is also a /usr/sbin directory for less
essential system administration programs.
The /etc directory is the home for system configuration files. It contains no binary
programs, although there are some executable scripts. For example, the
file resolv.conf tells the system where to go on the network to obtain host name to IP
address mappings (DNS). Files like passwd, shadow, and group, for managing user accounts,
are found in the /etc directory. System run level scripts are found in subdirectories of /etc.
For example, /etc/rc2.d contains links to scripts for entering and leaving run level 2. The rc
directory historically stood for Run Commands. Some distros extend the contents of /etc.
For example, Red Hat adds the sysconfig subdirectory that contains more configuration
files.
The /boot directory contains the few essential files needed to boot the system. For every
alternative kernel installed on the system there are four files:
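For a kernel whose version string is, say, 3.10.0 (the exact names vary by distribution), these are typically:
• vmlinuz: the compressed Linux kernel, required for booting
• initramfs (or initrd): the initial RAM filesystem, required for booting
• config: the kernel configuration file, used only for debugging and bookkeeping
• System.map: the kernel symbol table, used only for debugging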
As an example, a listing of the /boot directory taken from a CentOS system with three
installed kernels would contain these four files for each kernel. Names would vary and
things would look somewhat different on a different distribution.
/lib contains libraries (common code shared by applications and needed for them to run)
for the essential programs in /bin and /sbin. These library filenames either start with ld or
lib, for example, /lib/libncurses.so.5.7.
Most of these are what are known as dynamically loaded libraries (also known as
shared libraries or Shared Objects (SO)). On some Linux distributions there exists a
/lib64 directory containing 64-bit libraries, while /lib contains 32-bit versions.
Kernel modules (kernel code, often device drivers, that can be loaded and unloaded
without re-starting the system) are located in /lib/modules/<kernel-version-number>.
The /media directory is typically located where removable media, such as CDs, DVDs and
USB drives are mounted. Unless configuration prohibits it, Linux automatically mounts the
removable media in the /media directory when they are detected.
Directory name  Usage
/usr/include    Header files used to compile applications.
/usr/lib        Libraries for programs in /usr/bin and /usr/sbin.
/usr/lib64      64-bit libraries for 64-bit programs in /usr/bin and /usr/sbin.
/usr/sbin       Non-essential system binaries, such as system daemons.
/usr/share      Shared data used by applications, generally architecture-independent.
/usr/src        Source code, usually for the Linux kernel.
/usr/X11R6      X Window configuration files; generally obsolete.
/usr/local      Data and programs specific to the local machine. Subdirectories include bin, sbin, lib, share, include, etc.
/usr/bin        This is the primary directory of executable commands on the system.
Now that you know about the filesystem and its structure, let’s learn how to manage files
and directories.
diff is used to compare files and directories. This often-used utility program has many
useful options (see man diff), including:
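-c  Provides a listing of differences that includes three lines of context before and after the lines differing in content
-r  Used to recursively compare subdirectories, as well as the current directory
-i  Ignore the case of letters
-w  Ignore differences in spaces and tabs (white space)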
In this section, you will learn additional methods for comparing files and how to apply
patches to files.
You can compare three files at once using diff3, which uses one file as the reference basis
for the other two. For example, suppose you and a co-worker both have made
modifications to the same file working at the same time independently. diff3 can show the
differences based on the common file you both started with. The syntax for diff3 is as
follows:
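$ diff3 MY-FILE COMMON-FILE YOUR-FILE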
Many modifications to source code and configuration files are distributed utilizing
patches, which are applied, not surprisingly, with the patch program. A patch file contains
the deltas (changes) required to update an older version of a file to the new one. The
patch files are actually produced by running diff with the correct options, as in:
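$ diff -Nur originalfile newfile > patchfile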
Distributing just the patch is more concise and efficient than distributing the entire file. For
example, if only one line needs to change in a file that contains 1,000 lines, the patch file
will be just a few lines long.
To apply a patch you can just do either of the two methods below:
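$ patch -p1 < patchfile
$ patch originalfile patchfile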
The first usage is more common as it is often used to apply changes to an entire directory
tree, rather than just one file as in the second example. To understand the use of the -p1
option and many others, see the man page for patch.
In Linux, a file's extension often does not categorize it the way it might in other operating
systems. One can not assume that a file named file.txt is a text file and not an executable
program. In Linux a file name is generally more meaningful to the user of the system than
the system itself; in fact most applications directly examine a file's contents to see what
kind of object it is rather than relying on an extension. This is very different from the way
Windows handles filenames, where a filename ending with .exe, for example, represents
an executable binary file.
The real nature of a file can be ascertained by using the file utility. For the file names
given as arguments, it examines the contents and certain characteristics to determine
whether the files are plain text, shared libraries, executable programs, scripts, or
something else.
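For example (the output shown is typical but will vary with the file's actual contents):
$ file /etc/hosts
/etc/hosts: ASCII text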
There are many ways you can back up data or even your entire system. Basic ways to do
so include use of simple copying with cp and use of the more robust rsync.
Both can be used to synchronize entire directory trees. However, rsync is more efficient
because it checks if the file being copied already exists. If the file exists and there is no
change in size or modification time, rsync will avoid an unnecessary copy and save time.
Furthermore, because rsync copies only the parts of files that have actually changed, it
can be very fast.
cp can only copy files to and from destinations on the local machine (unless you are
copying to or from a filesystem mounted using NFS), but rsync can also be used to copy
files from one machine to another. Locations are designated in the target:path form where
target can be in the form of [user@]host. The user@ part is optional and used if the remote
user is different from the local user.
rsync is very efficient when recursively copying one directory tree to another, because
only the differences are transmitted over the network. One often synchronizes the
destination directory tree with the origin, using the -r option to recursively walk down the
directory tree copying all files and directories below the one listed as the source.
rsync is a very powerful utility. For example, a very useful way to back up a project
directory might be to use the following command:
$ rsync -r project-X archive-machine:archives/project-X
Note that rsync can be very destructive! Accidental misuse can do a lot of harm to data
and programs by inadvertently copying changes to where they are not wanted. Take care
to specify the correct options and paths. It is highly recommended that you first test your
rsync command using the --dry-run option to ensure that it provides the results that you
want.
To use rsync at the command prompt, type rsync sourcefile destinationfile, where either
file can be on the local machine or on a networked machine.
File data is often compressed to save disk space and reduce the time it takes to transmit
files over networks.
Command  Usage
gzip     The most frequently used Linux compression utility
bzip2    Produces files significantly smaller than those produced by gzip
xz       The most space-efficient compression utility used in Linux
zip      Is often required to examine and decompress archives from other operating systems
These techniques vary in the efficiency of the compression (how much space is saved) and
in how long they take to compress; generally the more efficient techniques take longer.
Decompression time doesn't vary as much across different methods.
In addition, the tar utility is often used to group files in an archive and then compress the
whole archive at once.
gzip is the most often used Linux compression utility. It compresses very well and is very
fast. The following table provides some usage examples:
Command           Usage
gzip *            Compresses all files in the current directory; each file is compressed and renamed with a .gz extension.
gzip -r projectX  Compresses all files in the projectX directory, along with all files in all of the directories under projectX.
gunzip foo        De-compresses foo found in the file foo.gz. Under the hood, the gunzip command is actually the same as gzip -d.
bzip2 has syntax that is similar to gzip but it uses a different compression algorithm and
produces significantly smaller files, at the price of taking a longer time to do its work.
Thus, it is more likely to be used to compress larger files.
Command        Usage
bzip2 *        Compress all of the files in the current directory and replace each file with a file renamed with a .bz2 extension.
bunzip2 *.bz2  Decompress all of the files with an extension of .bz2 in the current directory. Under the hood, bunzip2 is the same as calling bzip2 -d.
xz is the most space efficient compression utility used in Linux and is now used by
www.kernel.org to store archives of the Linux kernel. Once again it trades a slower
compression speed for an even higher compression ratio.
Command                            Usage
xz *                               Compress all of the files in the current directory and replace each file with one with a .xz extension.
xz foo                             Compress the file foo into foo.xz using the default compression level (-6), and remove foo if compression succeeds.
xz -dk bar.xz                      Decompress bar.xz into bar and don't remove bar.xz even if decompression is successful.
xz -dcf a.txt b.txt.xz > abcd.txt  Decompress a mix of compressed and uncompressed files to standard output, using a single command.
xz -d *.xz                         Decompress the files compressed using xz; compressed files are stored with a .xz extension.
The zip program is not often used to compress files in Linux, but is often required to
examine and decompress archives from other operating systems. It is only used in Linux
when you get a zipped file from a Windows user. It is a legacy program.
Command              Usage
zip backup *         Compresses all files in the current directory and places them in the file backup.zip.
zip -r backup.zip ~  Archives your login directory (~) and all files and directories under it in the file backup.zip.
unzip backup.zip     Extracts all files in the file backup.zip and places them in the current directory.
Archiving and Compressing Data Using tar
Historically, tar stood for "tape archive" and was used to archive files to a magnetic tape.
It allows you to create or extract files from an archive file, often called a tarball. At the
same time you can optionally compress while creating the archive, and decompress while
extracting its contents.
Command                         Usage
$ tar xvf mydir.tar             Extract all the files in mydir.tar into the mydir directory
$ tar zcvf mydir.tar.gz mydir   Create the archive and compress with gzip
$ tar jcvf mydir.tar.bz2 mydir  Create the archive and compress with bz2
$ tar Jcvf mydir.tar.xz mydir   Create the archive and compress with xz
$ tar xvf mydir.tar.gz          Extract all the files in mydir.tar.gz into the mydir directory. Note you do not have to tell tar it is in gzip format.
You can separate out the archiving and compression stages, as in:
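$ tar cvf mydir.tar mydir ; gzip mydir.tar
$ gunzip mydir.tar.gz ; tar xvf mydir.tar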
but this is slower and wastes space by creating an unneeded intermediary .tar file.
The dd program is very useful for making copies of raw disk space. For example, to back
up your Master Boot Record (MBR) (the first 512 byte sector on the disk that contains a
table describing the partitions on that disk), you might type:
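$ dd if=/dev/sda of=sda.mbr bs=512 count=1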
To use dd to make a copy of one disk onto another, (WARNING!) deleting everything
that previously existed on the second disk, type:
dd if=/dev/sda of=/dev/sdb
An exact copy of the first disk device is created on the second disk device.
• The filesystem tree starts at what is often called the root directory (or trunk, or /).
• The Filesystem Hierarchy Standard (FHS) provides Linux developers and system
administrators a standard directory structure for the filesystem.
• Partitions help to segregate files according to usage, ownership and type.
• Filesystems can be mounted anywhere on the main filesystem tree at a mount
point. Automatic filesystem mounting can be set up by editing /etc/fstab.
• NFS (The Network Filesystem) is a useful method for sharing files and data through
the network systems.
• Filesystems like /proc are called pseudo filesystems because they exist only in
memory.
• /root (slash-root) is the home directory for the root user.
• /var may be put in its own filesystem so that growth can be contained and not fatally
affect the system.
Chapter 9
User Environment
Identifying the Current User
As you know, Linux is a multiuser operating system; i.e., more than one user
can log on at the same time.
Linux uses groups for organizing users. Groups are collections of accounts
with certain shared permissions. Control of group membership is administered
through the /etc/group file, which shows a list of groups and their members.
By default, every user belongs to a default or primary group. When a user
logs in, the group membership is set for their primary group and all the
members enjoy the same level of access and privilege. Permissions on various
files and directories can be modified at the group level.
All Linux users are assigned a unique user ID (uid), which is just an
integer, as well as one or more group IDs (gid), including a default one
which is the same as the user ID.
Historically, Fedora-family systems start uids at 500; other distributions
begin at 1000.
These numbers are associated with names through the files /etc/passwd and
/etc/group.
Groups are used to establish a set of users who have common interests for
the purposes of access rights, privileges, and security considerations.
Access rights to files (and devices) are granted on the basis of the user
and the group they belong to.
Adding a new user is done with useradd and removing an existing user is done
with userdel. In its simplest form, an account for the new user turkey would
be created with:
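$ sudo useradd turkey
which, by default, sets the home directory to /home/turkey and adds the
following new line to /etc/passwd: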
turkey:x:502:502::/home/turkey:/bin/bash
and sets the default shell to /bin/bash. Removing a user account is as easy
as typing userdel turkey. However, this will leave the /home/turkey directory
intact. This might be useful if it is a temporary deactivation. To remove
the home directory while removing the account, one needs to use the -r option
to userdel.
Typing id with no argument gives information about the current user, as in:
$ id
uid=500(george) gid=500(george) groups=106(fuse),500(george)
$ groups turkey
turkey : turkey
The root account is very powerful and has full access to the system. Other
operating systems often call this the administrator account; in Linux it is
often called the superuser account. You must be extremely cautious before
granting full root access to a user; it is rarely if ever justified.
External attacks often consist of tricks used to elevate to the root
account.
However, you can use the sudo feature to assign more limited privileges to
user accounts, on a temporary basis and for specific tasks.
To fully become root, one merely types su and then is prompted for the root
password.
To execute just one command with root privilege type sudo <command>. When
the command is complete you will return to being a normal unprivileged user.
sudo configuration files are stored in the /etc/sudoers file and in the
/etc/sudoers.d/ directory. By default, the sudoers.d directory is empty.
In Linux, the command shell program (generally bash) uses one or more
startup files to configure the environment. Files in the /etc directory
define global settings for all users while Initialization files in the
user's home directory can include and/or override the global settings.
The startup files can do anything the user would like to do in every command
shell, such as customizing the prompt, defining command line aliases, and
setting environment variables. When you log in, bash looks for the following
files in your home directory, in this order:
1. ~/.bash_profile
2. ~/.bash_login
3. ~/.profile
The Linux login shell evaluates whatever startup file that it comes across
first and ignores the rest. This means that if it finds ~/.bash_profile, it
ignores ~/.bash_login and ~/.profile. Different distributions may use
different startup files.
However, every time you create a new shell, or terminal window, etc., you do
not perform a full system login; only the ~/.bashrc file is read and
evaluated. Although this file is not read and evaluated along with the login
shell, most distributions and/or users include the ~/.bashrc file from
within one of the three user-owned startup files. In the Ubuntu, openSUSE,
and CentOS distros, the user must make appropriate changes in
the ~/.bash_profile file to include the ~/.bashrc file.
The .bash_profile will have certain extra lines, which in turn will collect
the required customization parameters from .bashrc.
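A typical sketch of such lines in ~/.bash_profile (this exact snippet is common but not universal):
# source the user's .bashrc if it exists
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi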
Environment variables are simply named quantities that have specific values
and are understood by the command shell, such as bash. Some of these are
pre-set (built-in) by the system, and others are set by the user either at
the command line or within startup and other scripts. An environment
variable is actually no more than a character string that contains
information used by one or more applications.
There are a number of ways to view the values of currently set environment
variables; one can type set, env, or export. Depending on the state of your
system, set may print out many more lines than the other two methods.
$ set
BASH=/bin/bash
BASHOPTS=checkwinsize:cmdhist:expand_aliases:extglob:extquote:force_fignore
BASH_ALIASES=()
...
$ env
SSH_AGENT_PID=1892
GPG_AGENT_INFO=/run/user/me/keyring-Ilf3vt/gpg:0:1
TERM=xterm
SHELL=/bin/bash
...
$ export
declare -x COLORTERM=gnome-terminal
declare -x COMPIZ_BIN_PATH=/usr/bin/
declare -x COMPIZ_CONFIG_PROFILE=ubuntu
Task                                   Command
Show the value of a specific variable  echo $SHELL
Export a new variable value            export VARIABLE=value (or VARIABLE=value; export VARIABLE)
Add a variable permanently             1. Edit ~/.bashrc and add the line export VARIABLE=value; 2. Type source ~/.bashrc or just . ~/.bashrc (dot ~/.bashrc); or just start a new shell by typing bash
HOME is an environment variable that represents the home (or login)
directory of the user. cd without arguments will change the current working
directory to the value of HOME. Note the tilde character (~) is often used
as an abbreviation for $HOME. Thus cd $HOME and cd ~ are completely
equivalent statements.
$ echo $HOME
/home/me
$ cd /bin
This shows the value of the HOME environment variable, and then changes
directory (cd) to /bin.
PATH is an ordered list of directories (the path) that is scanned when a
command is given, in order to find the appropriate program or script to run.
Each directory in the path is separated by a colon (:). A null (empty)
directory name indicates the current directory at any given time, as in:
• :path1:path2
• path1::path2
In the example :path1:path2, there is a null directory before the first colon
(:). Similarly, for path1::path2 there is a null directory between path1 and
path2.
$ export PATH=$HOME/bin:$PATH
$ echo $PATH
/home/me/bin:/usr/local/bin:/usr/bin:/bin
PS1 is the primary prompt variable, which controls what your command line
prompt looks like. The following special characters can be included in PS1:
\u - User name
\h - Host name
\w - Current working directory
\! - History number of this command
\d - Date
They must be surrounded by single quotes when they are used, as in the
following example:
$ echo $PS1
$
$ export PS1='\u@\h:\w$ '
me@host:~$    # the new prompt
me@host:~$
An even better practice is to save the old prompt first, change the prompt
as desired, and eventually restore it:
$ OLD_PS1=$PS1
$ PS1=$OLD_PS1
The environment variable SHELL points to the user's default command shell
(the program that is handling whatever you type in a command window, usually
bash) and contains the full pathname to the shell:
$ echo $SHELL
/bin/bash
bash keeps track of previously entered commands and statements in a history
buffer; you can recall previously used commands simply by using the Up and
Down cursor keys. To view the list of previously executed commands, you can
just type history at the command line.
The list of commands is displayed with the most recent command appearing
last in the list. This information is stored in ~/.bash_history.
HISTSIZE stores the maximum number of lines in the history file for the
current session.
Key                         Usage
Up/Down arrow keys          Browse through the list of commands previously executed
!! (pronounced bang-bang)   Execute the previous command
CTRL-R                      Search previously used commands
If you want to recall a command in the history list, but do not want to
press the arrow key repeatedly, you can press CTRL-R to do a reverse
intelligent search.
As you start typing the search goes back in reverse order to the first
command that matches the letters you've typed. By typing more successive
letters you make the match more and more specific.
The following is an example of how you can use the CTRL-R command to search
through the command history:
$ <press CTRL-R>                     # this all happens on one line
(reverse-i-search)'s': sleep 1000    # searched for 's'; matched "sleep"
$ sleep 1000                         # pressed Enter to execute the searched command
$
The table describes the syntax used to execute previously used commands.
Syntax    Task
!         Start a history substitution
!$        Refer to the last argument in a line
!n        Refer to the nth command line
!string   Refer to the most recent command starting with string
All history substitutions start with !. In the line $ ls -l /bin /etc /var,
!$ refers to /var, which is the last argument in the line.
Consider the following command history:
1. echo $SHELL
2. echo $HOME
3. echo $PS1
4. ls -a
5. ls -l /etc/passwd
6. sleep 1000
7. history
$ !1 # Execute command #1 above
echo $SHELL
/bin/bash
$ !sl # Execute the command beginning with "sl"
sleep 1000
$
You can use keyboard shortcuts to perform different tasks quickly. The table
lists some of these keyboard shortcuts and their uses.
Keyboard Shortcut   Task
CTRL-L              Clears the screen
CTRL-D              Exits the current shell
CTRL-Z              Puts the current process into suspended background
CTRL-C              Kills the current process
CTRL-H              Works the same as backspace
CTRL-A              Goes to the beginning of the line
CTRL-W              Deletes the word before the cursor
CTRL-U              Deletes from beginning of line to cursor position
CTRL-E              Goes to the end of the line
Tab                 Auto-completes files, directories, and binaries
You can create customized commands or modify the behavior of already
existing ones by creating aliases. Most often these aliases are placed in
your ~/.bashrc file so they are available to any command shells you create.
Please note there should not be any spaces on either side of the equal sign
and the alias definition needs to be placed within either single or double
quotes if it contains any spaces.
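For example, a simple alias one might place in ~/.bashrc (the name ll here is
a common but arbitrary choice):
alias ll='ls -lF'    # no spaces around =; quotes needed because the value contains a space
After sourcing the file (or starting a new shell), typing ll behaves exactly
as if you had typed ls -lF.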
File Ownership
In Linux and other UNIX-based operating systems, every file is associated
with a user who is the owner. Every file is also associated with a group (a
subset of all users) which has an interest in the file and certain rights,
or permissions: read, write, and execute.
The following utility programs involve user and group ownership and
permission setting.
Command   Usage
chown     Used to change user ownership of a file or directory
chgrp     Used to change group ownership
chmod     Used to change the permissions on the file, which can be done separately for owner, group, and the rest of the world (often named other)
Files have three kinds of permissions: read (r), write (w), execute (x).
These are generally represented as in rwx. These permissions affect three
groups of owners: user/owner (u), group (g), and others (o).
There are a number of different ways to use chmod. For instance, to give the
owner and others execute permission and remove the group write permission:
$ ls -l test1
-rw-rw-r-- 1 coop coop 1601 Mar 9 15:04 test1
$ chmod uo+x,g-w test1
$ ls -l test1
-rwxr--r-x 1 coop coop 1601 Mar 9 15:04 test1
where u stands for user (owner), o stands for other (world), and g stands
for group.
This kind of syntax can be difficult to type and remember, so one often uses
a shorthand which lets you set all the permissions in one step. This is done
with a simple algorithm, and a single digit suffices to specify all
three permission bits for each entity. This digit is the sum of:
• 4 if read permission is desired
• 2 if write permission is desired
• 1 if execute permission is desired.
When you apply this to the chmod command, you give one such digit for each
entity (owner, group, and others), three digits in all, as in the sketch
below.
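A minimal sketch, using a hypothetical file somefile:
$ chmod 755 somefile    # 7 = 4+2+1 (rwx) for the owner; 5 = 4+1 (r-x) for group and others
This is equivalent to typing chmod u=rwx,g=rx,o=rx somefile.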
For example, here is the effect of changing the user ownership of file-1 to
root (done with root privilege, via sudo):
$ ls -l
total 4
-rw-rw-r--. 1 bob bob 0 Mar 16 19:04 file-1
-rw-rw-r--. 1 bob bob 0 Mar 16 19:04 file-2
drwxrwxr-x. 2 bob bob 4096 Mar 16 19:04 temp
$ sudo chown root file-1
$ ls -l
total 4
-rw-rw-r--. 1 root bob 0 Mar 16 19:04 file-1
-rw-rw-r--. 1 bob bob 0 Mar 16 19:04 file-2
drwxrwxr-x. 2 bob bob 4096 Mar 16 19:04 temp
The first listing shows the original user ownership of file-1; the second
shows the ownership after the change.
Chapter 10
Text Editors
At some point you will need to manually edit text files. You might be composing an email
off-line, writing a script to be used for bash or other command interpreters, altering a
system or application configuration file, or developing source code for a programming
language such as C or Java.
Linux administrators quite often sidestep text editors by using graphical
utilities for creating and modifying system configuration files. However,
this can be far more laborious than directly using a text editor. Note that
word processing applications, such as those that are part of office suites,
are not really basic text editors; they add a lot of extra (usually
invisible) formatting information that will probably render system
administration configuration files unusable for their intended purpose. So
using text editors really is essential in Linux.
By now you have certainly realized Linux is packed with choices; when it comes to text
editors, there are many choices ranging from quite simple to very complex, including:
- nano
- gedit
- vi
- emacs
In this section, we will learn about nano and gedit, editors which are
relatively simple and easy to learn. Before we start, let's take a look at
some cases where an editor is not needed.
Sometimes you may want to create a short file and don't want to bother invoking a full
text editor. In addition, doing so can be quite useful when used from within scripts, even
when creating longer files. You'll no doubt find yourself using this method when you start
on the later chapters that cover bash scripting!
If you want to create a file without using an editor there are two standard ways to create
one from the command line and fill it with content.
Earlier we learned that a single greater-than sign (>) will send the output
of a command to a file, while two greater-than signs (>>) will append new
output to an existing file; echo and cat can be used with these operators,
as sketched below.
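As a sketch of the two methods, using a hypothetical file myfile, you can
build the file up line by line with echo:
$ echo line one > myfile
$ echo line two >> myfile
$ echo line three >> myfile
or type the lines all at once with cat, ending the input with a line
containing only the chosen delimiter:
$ cat << EOF > myfile
> line one
> line two
> line three
> EOF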
Both the above techniques produce a file with the following lines in it:
line one
line two
line three
There are some text editors that are pretty obvious; they require no
particular experience to learn and are actually quite capable, even if not
terribly robust. One particularly easy one to use is the text-terminal-based
editor nano. Just invoke nano by giving a file name as an argument.
All the help you need is displayed at the bottom of the screen, and you should be able to
proceed without any problem.
As a graphical editor, gedit is part of the GNOME desktop system (kwrite is associated
with KDE). The gedit and kwrite editors are very easy to use and are extremely capable.
They are also very configurable. They look a lot like Notepad in Windows. Other variants
such as kedit and kate are also supported by KDE.
nano is easy to use, and requires very little effort to learn. To open a file in nano, type
nano <filename> and press Enter. If the file doesn't exist, it will be created.
nano provides a two line “shortcut bar” at the bottom of the screen that lists the available
commands. Some of these commands are:
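For reference, some of the standard nano shortcuts shown in that bar (the
exact set displayed may vary with the nano version) are:
• CTRL-G: display the help text
• CTRL-O: write out (save) the current file
• CTRL-W: search for a string within the file
• CTRL-X: exit nano, prompting to save the file if it has been modified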
gedit (pronounced 'g-edit') is a simple-to-use graphical editor that can only be run within
a Graphical Desktop environment. It is visually quite similar to the Notepad text editor in
Windows, but is actually far more capable and very configurable and has a wealth of
plugins available to extend its capabilities further.
To open a new file in gedit, find the program in your desktop's menu system, or from the
command line type gedit <filename>. If the file doesn't exist it will be created.
Using gedit is pretty straight-forward and doesn't require much training. Its interface is
composed of quite familiar elements.
Both vi and emacs have a basic purely text-based form that can run in a non-graphical
environment. They also have one or more X-based graphical forms with extended
capabilities; these may be friendlier for a less experienced user. While vi and emacs can
have significantly steep learning curves for new users, they are extremely efficient when
one has learned how to use them.
You need to be aware that fights among seasoned users over which editor is better can be
quite intense and are often described as a holy war.
Usually the actual program installed on your system is vim which stands for vi Improved,
and is aliased to the name vi. The name is pronounced as “vee-eye”.
Even if you don’t want to use vi, it is good to gain some familiarity with it: it is a standard
tool installed on virtually all Linux distributions. Indeed, there may be times where there is
no other editor available on the system.
GNOME extends vi with a very graphical interface known as gvim and KDE offers kvim.
Either of these may be easier to use at first.
When using vi, all commands are entered through the keyboard; you don’t need to keep
moving your hands to use a pointer device such as a mouse or touchpad, unless you want
to do so when using one of the graphical versions of the editor.
Typing vimtutor launches a short but very comprehensive tutorial for those who want to
learn their first vi commands. This tutorial is a good place to start learning vi. Even though
it provides only an introduction and just seven lessons, it has enough material to make you
a very proficient vi user because it covers a large number of commands. After learning
these basic ones, you can look up new tricks to incorporate into your list of vi commands
because there are always more optimal ways to do things in vi with less typing.
Modes in vi
vi provides three modes as described in the table below. It is vital to not lose track of
which mode you are in. Many keystrokes and commands behave quite differently in
different modes.
Mode      Feature
Command   By default, vi starts in Command mode. Each key is an editor command; keyboard strokes are interpreted as commands that can modify file contents.
Insert    Entered by typing i (among other commands). Keystrokes are inserted as text into the file being edited; pressing the Escape key returns to Command mode.
Line      Entered by typing : in Command mode. Each line of input (ended by pressing Enter) is interpreted as a command, used for operations such as reading and writing files, searching, and quitting.
Command        Usage
vi myfile      Start the vi editor and edit the myfile file
vi -r myfile   Start vi and edit myfile in recovery mode from a system crash
:r file2       Read in file2 and insert at current position
:w             Write to the file
:w myfile      Write out the file to myfile
:w! file2      Overwrite file2
:x or :wq      Exit vi and write out modified file
:q             Quit vi
:q!            Quit vi even though modifications have not been saved
Changing Cursor Positions in vi
The table describes the most important keystrokes used when changing cursor position in
vi. Line mode commands (those following colon (:)) require the ENTER key to be pressed
after the command is typed.
Key                   Usage
arrow keys            To move up, down, left and right
j or <ret>            To move one line down
k                     To move one line up
h or Backspace        To move one character left
l or Space            To move one character right
0                     To move to beginning of line
$                     To move to end of line
w                     To move to beginning of next word
:0 or 1G              To move to beginning of file
:n or nG              To move to line n
:$ or G               To move to last line in file
CTRL-F or Page Down   To move forward one page
CTRL-B or Page Up     To move backward one page
CTRL-L                To refresh and center screen
Searching for Text in vi
The table describes the most important commands used when searching for text in vi. The
ENTER key should be pressed after typing the search pattern.
Command    Usage
/pattern   Search forward for pattern
?pattern   Search backward for pattern
The table describes the most important keystrokes used when searching for text in vi.
Key   Usage
n     Move to next occurrence of search pattern
N     Move to previous occurrence of search pattern
Working with Text in vi
The table describes the most important keystrokes used when changing, adding, and
deleting text in vi.
Key          Usage
a            Append text after cursor; stop upon Escape key
A            Append text at end of current line; stop upon Escape key
i            Insert text before cursor; stop upon Escape key
I            Insert text at beginning of current line; stop upon Escape key
o            Start a new line below current line, insert text there; stop upon Escape key
O            Start a new line above current line, insert text there; stop upon Escape key
r            Replace character at current position
R            Replace text starting with current position; stop upon Escape key
x            Delete character at current position
Nx           Delete N characters, starting at current position
dw           Delete the word at the current position
D            Delete the rest of the current line
dd           Delete the current line
Ndd or dNd   Delete N lines
u            Undo the previous operation
yy           Yank (copy) the current line and put it in buffer
Nyy or yNy   Yank (copy) N lines and put it in buffer
p            Paste at the current position the yanked line or lines from the buffer
Using External Commands
Typing :sh opens an external command shell. When you exit the shell, you will
resume your vi editing session.
Typing :! executes a command from within vi. The command follows the
exclamation point. This technique is best suited for non-interactive
commands, such as:
:! wc %
Typing this will run the wc (word count) command on the file; the character %
represents the file currently being edited.
The fmt command does simple formatting of text. If you are editing a file and
want the file to look nice, you can run the file through fmt. One way to do
this while editing is to type :%!fmt, which runs the entire file (the % part)
through fmt and replaces the file's contents with the result.
The emacs editor is a popular competitor for vi. Unlike vi, it does not work with modes.
emacs is highly customizable and includes a large number of features. It was initially
designed for use on a console, but was soon adapted to work with a GUI as well. emacs
has many capabilities other than simple text editing; it can be used for
email, debugging, etc.
Rather than having different modes for command and insert, like vi, emacs uses the
CTRL and Esc keys for special commands.
Key             Usage
CTRL-x CTRL-s   Saves the current file
CTRL-x CTRL-w   Write to the file, giving a new name when prompted
CTRL-x CTRL-c   Exit after being prompted to save any modified files
The emacs tutorial is a good place to start learning basic emacs commands. It is available
any time when in emacs by simply typing CTRL-h (for help) and then the letter t for tutorial.
Changing Cursor Positions in emacs
The table lists some of the keys and key combinations that are used for changing cursor
positions in emacs.
Key                   Usage
arrow keys            Use the arrow keys for up, down, left and right
CTRL-n                One line down
CTRL-p                One line up
CTRL-f                One character forward/right
CTRL-b                One character back/left
CTRL-a                Move to beginning of line
CTRL-e                Move to end of line
Esc-f                 Move to beginning of next word
Esc-b                 Move back to beginning of preceding word
Esc-<                 Move to beginning of file
Esc-x goto-line n     Move to line n
Esc->                 Move to end of file
CTRL-v or Page Down   Move forward one page
Esc-v or Page Up      Move backward one page
CTRL-l                Refresh and center screen
Searching for Text in emacs
The table lists the key combinations that are used for searching for text in emacs.
Key      Usage
CTRL-s   Search forward for prompted pattern, or for next pattern
CTRL-r   Search backwards for prompted pattern, or for next pattern
Working with Text in emacs
The table lists some of the key combinations used for changing, adding, and deleting text
in emacs:
Key                      Usage
CTRL-o                   Insert a blank line
CTRL-d                   Delete character at current position
CTRL-k                   Delete the rest of the current line
CTRL-_                   Undo the previous operation
CTRL-SPACE (or CTRL-@)   Mark the beginning of the selected region; the end will be at the cursor position
CTRL-w                   Delete the current marked text and write it to the buffer
CTRL-y                   Insert at current cursor location whatever was most recently deleted
You have completed this chapter. Let’s summarize the key concepts covered.
• Text editors (rather than word processing programs) are used quite often in Linux, for
tasks such as creating or modifying system configuration files, writing scripts,
developing source code, etc.
• nano is an easy-to-use text-based editor that utilizes on-screen prompts.
• gedit is a graphical editor very similar to Notepad in Windows.
• The vi editor is available on all Linux systems and is very widely used. Graphical
extension versions of vi are widely available as well.
• emacs is available on all Linux systems as a popular alternative to vi. emacs can
support both a graphical user interface and a text mode interface.
• To access the vi tutorial, type vimtutor at a command line window.
• To access the emacs tutorial, type CTRL-h and then t from within emacs.
• vi has three modes: Command, Insert, and Line; emacs has only one but requires
use of special keys such as Control and Escape.
• Both editors use various combinations of keystrokes to accomplish tasks; the learning
curve to master these can be long, but once mastered, using either editor is extremely
efficient.
Chapter 11
Security Principles
The Linux kernel allows properly authenticated users to access files and
applications. While each user is identified by a unique integer (the user id
or UID), a separate database associates a username with each UID. Upon
account creation, new user information is added to the user database and the
user's home directory must be created and populated with some essential
files. Command line programs such as useradd and userdel as well as GUI
tools are used for creating and removing accounts.
For each user, the following seven fields are maintained in the /etc/passwd
file:
• Username: the user login name. Should be between 1 and 32 characters long.
• Password: the user password (or the character x if the password is stored
  in the /etc/shadow file) in encrypted format. The password is never shown
  in Linux when it is being typed; this stops prying eyes.
• User ID (UID): every user must have a user id (UID). UID 0 is reserved for
  the root user; UIDs ranging from 1-99 are reserved for other predefined
  accounts; UIDs ranging from 100-999 are reserved for system accounts and
  groups (except for RHEL, which reserves only up to 499). Normal users have
  UIDs of 1000 or greater, except on RHEL where they start at 500.
• Group ID (GID): the primary Group ID (GID), a Group Identification Number
  stored in the /etc/group file. Groups will be covered in detail in the
  chapter on Processes.
• User Info: this field is optional and allows insertion of extra information
  about the user, such as their name. For example: Rufus T. Firefly.
• Home Directory: the absolute path location of the user's home directory.
  For example: /home/rtfirefly.
• Shell: the absolute location of a user's default shell. For example:
  /bin/bash.
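Putting the seven fields together, a sample /etc/passwd entry (built from the
examples above) looks like:
rtfirefly:x:1000:1000:Rufus T. Firefly:/home/rtfirefly:/bin/bash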
By default, Linux distinguishes between several account types in order to
isolate processes and workloads. Linux has four types of accounts:
• root
• System
• Normal
• Network
For a safe working environment, it is advised to grant the minimum
privileges possible and necessary to accounts, and remove inactive accounts.
The last utility, which shows the last time each user logged into the
system, can be used to help identify potentially inactive accounts which are
candidates for removal from the system.
Keep in mind that practices you use on multi-user business systems are more
strict than practices you can use on personal desktop systems that only
affect the casual user. This is especially true with security. We hope to
show you practices applicable to enterprise servers that you can use on all
systems, but understand that you may choose to relax these rules on your own
personal system.
root is the most privileged account on a Linux/UNIX system. This account has
the ability to carry out all facets of system administration, including
adding accounts, changing user passwords, examining log files, installing
software, etc. Utmost care must be taken when using this account. It has no
security restrictions imposed upon it.
When you are signed in as, or acting as root, the shell prompt displays '#'
(if you are using bash and you haven’t customized the prompt as we discuss
elsewhere in this course). This convention is intended to serve as a warning
to you of the absolute power of this account.
1. At the command prompt, as root type useradd <username> and press the
ENTER key.
2. To set the initial password, type passwd <username> and press the ENTER
key. The New password: prompt is displayed.
3. Enter the password and press the ENTER key.
To confirm the password, the prompt Retype new password: is displayed.
4. Enter the password again and press the ENTER key.
The message passwd: all authentication tokens updated successfully. is
displayed.
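A sketch of the whole procedure, run as root (the # prompt), for a
hypothetical new user named turkey:
# useradd turkey
# passwd turkey
New password:
Retype new password:
passwd: all authentication tokens updated successfully.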
Operations That Do Not Require root Privileges
A regular account user can perform some operations requiring special
permissions; however, the system configuration must allow such abilities to
be exercised.
SUID (Set owner User ID upon execution—similar to the Windows "run as"
feature) is a special kind of file permission given to a file. SUID provides
temporary permissions to a user to run a program with the permissions of the
file owner (which may be root) instead of the permissions held by the user.
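For example, the passwd program is typically installed SUID and owned by
root, so that any user can run it to change their own password even though
the underlying password files are writable only by root. Note the s where the
owner's x would normally appear (size and date elided here):
$ ls -l /usr/bin/passwd
-rwsr-xr-x 1 root root ... /usr/bin/passwd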
su:
• When elevating privilege, you need to enter the root password. Giving the
  root password to a normal user should never, ever be done.
• Once a user elevates to the root account using su, the user can do anything
  that the root user can do, for as long as the user wants, without being
  asked again for a password.
• The command has limited logging features.
sudo:
• When elevating privilege, you need to enter the user's password and not the
  root password.
• sudo offers more features and is considered more secure and more
  configurable. Exactly what the user is allowed to do can be precisely
  controlled and limited. By default, the user will either always have to
  keep giving their password to do further operations with sudo, or can avoid
  doing so for a configurable time interval.
• The command has detailed logging features.
sudo has the ability to keep track of unsuccessful attempts at gaining root
access. Users' authorization for using sudo is based on configuration
information stored in the /etc/sudoers file and in the /etc/sudoers.d
directory.
The file has a lot of documentation in it about how to customize. Most Linux
distributions now prefer you add a file in the directory /etc/sudoers.d with
a name the same as the user. This file contains the individual user's sudo
configuration, and one should leave the master configuration file untouched
except for changes that affect all users.
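For example, a per-user file /etc/sudoers.d/student (for a hypothetical user
named student) granting full sudo rights would contain the single line:
student ALL=(ALL) ALL
Such files are best edited with visudo (as in visudo -f /etc/sudoers.d/student),
which checks the syntax before the change is installed.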
Each sudo log entry records information such as:
• Calling username
• Terminal info
• Working directory
• User account invoked
• Command with arguments
Running a command such as sudo whoami results in a log file entry such as:
Dec 8 14:20:47 server1 sudo: op : TTY=pts/6 PWD=/var/log USER=root
COMMAND=/usr/bin/whoami
Hard disks, for example, are represented as /dev/sd*. While the root user can
read and write to the disk in a raw fashion (for example, by reading from or
writing directly to /dev/sda), normal users cannot: the permissions on the
device nodes restrict such raw access to root.
However, it is well known that many systems do not get updated frequently
enough and problems which have already been cured are allowed to remain on
computers for a long time; this is particularly true with proprietary
operating systems where users are either uninformed or distrustful of the
vendor's patching policy as sometimes updates can cause new problems and
break existing operations. Many of the most successful attack vectors come
from exploiting security holes for which fixes are already known but not
universally deployed.
For example, if you wish to experiment with SHA-512 encoding, the word
"test" can be encoded using the program sha512sum to produce the SHA-512
form, as in the sketch below.
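Here the -n keeps echo from appending a newline, so that exactly the four
characters of "test" are hashed; the output is a 128-character hexadecimal
digest (truncated here):
$ echo -n test | sha512sum
ee26b0dd4af7e749aa1a8ee3c10ae9923f618980772e473f8819a5d4940e0db2... -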
IT professionals follow several good practices for securing the data and the
password of every user.
1. Password aging is a method to ensure that users get prompts that remind
   them to create a new password after a specific period. This can ensure
   that passwords, if cracked, will only be usable for a limited amount of
   time. This feature is implemented using chage, which configures the
   password expiry information for a user (see the sketch after this list).
2. Another method is to force users to set strong passwords using Pluggable
Authentication Modules (PAM). PAM can be configured to automatically
verify that a password created or modified using the passwd utility is
sufficiently strong. PAM configuration is implemented using a library
called pam_cracklib.so, which can also be replaced by pam_passwdqc.so
for more options.
3. One can also install password cracking programs, such as John the
   Ripper, to audit the password file and detect weak password entries.
   It is recommended that written authorization be obtained before
   installing such tools on any system that you do not own.
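A sketch of password aging with chage, for a hypothetical user alice:
$ sudo chage -l alice       # list the current password expiry settings
$ sudo chage -M 90 alice    # require a new password at most every 90 days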
You can secure the boot process with a secure password to prevent someone
from bypassing the user authentication step. For systems using the GRUB boot
loader, for the older GRUB version 1, you can invoke grub-md5-crypt, which
will prompt you for a password and then print an encrypted (MD5-hashed) form
of it.
You then must edit /boot/grub/grub.conf by adding a line of the following
form below the timeout entry, using the hash that grub-md5-crypt printed:
password --md5 <encrypted-password>
You can also force passwords for only certain boot choices rather than all.
For the now more common GRUB version 2 things are more complicated, and you
have more flexibility and can do things like use user-specific passwords,
which can be their normal login password. Also you never edit the
configuration file, /boot/grub/grub.cfg, directly, rather you edit system
configuration files in /etc/grub.d and then run update-grub. One explanation
of this can be found at https://help.ubuntu.com/community/Grub2/Passwords.
You have completed this chapter. Let’s summarize the key concepts covered:
• Using the user credentials, the system verifies the identity and
authenticity of the user.
• The SHA-512 algorithm is typically used to encode (hash) passwords.
Passwords can be hashed in this way, but the hash cannot be decrypted back
into the password.
• Pluggable Authentication Modules (PAM) can be configured to
automatically verify that passwords created or modified using the passwd
utility are strong enough (what is considered strong enough can also be
configured).
• Your IT security policy should start with requirements on how to
properly secure physical access to servers and workstations.
• Keeping your systems updated is an important step in avoiding security
attacks.
Chapter 12
Introduction to Networking
A network is a group of computers and computing devices connected together through
communication channels, such as cables or wireless media. The computers connected over
a network may be located in the same geographical area or spread across the world.
Devices attached to a network must have at least one unique network address identifier
known as the IP (Internet Protocol) address. The address is essential for routing
packets of information through the network.
Exchanging information across the network requires using streams of bite-sized packets,
each of which contains a piece of the information going from one machine to another.
These packets contain data buffers together with headers which contain information
about where the packet is going to and coming from, and where it fits in the sequence of
packets that constitute the stream. Networking protocols and software are rather
complicated due to the diversity of machines and operating systems they must deal with,
as well as the fact that even very old standards must be supported.
There are two different types of IP addresses available: IPv4 (version 4) and IPv6 (version
6). IPv4 is older and by far the more widely used, while IPv6 is newer and is designed to
get past the limitations of the older standard and furnish many more possible addresses.
IPv4 uses 32-bits for addresses; there are only 4.3 billion unique addresses available.
Furthermore, many addresses are allotted and reserved but not actually used. IPv4 is
becoming inadequate because the number of devices available on the global network has
significantly increased over the past years.
IPv6 uses 128-bits for addresses; this allows for 3.4 x 10^38 unique addresses. If you have
a larger network of computers and want to add more, you may want to move to IPv6,
because it provides more unique addresses. However, it is difficult to move to IPv6 as the
two protocols do not inter-operate. Due to this, migrating equipment and addresses to
IPv6 requires significant effort and hasn't been as fast as was originally intended.
Network addresses are divided into five classes: A, B, C, D, and E. Addresses in
classes A, B, and C are split into two parts: the network address (Net ID) and the host address (Host ID).
The Net ID is used to identify the network, while the Host ID is used to identify a host in
the network. Class D is used for special multicast applications (information is broadcast to
multiple computers simultaneously) and Class E is reserved for future use. In this section
you will learn about classes A, B, and C.
Class A addresses use the first octet of an IP address as their Net ID and use the other
three octets as the Host ID. The first bit of the first octet is always set to zero. So you can
use only 7-bits for unique network numbers. As a result, there are a maximum of 127 Class
A networks available. Not surprisingly, this was only feasible when there were very few
unique networks with large numbers of hosts. As the use of the Internet expanded, Classes
B and C were added in order to accommodate the growing demand for independent
networks.
Each Class A network can have up to 16.7 million unique hosts on its network. The range
of host address is from 1.0.0.0 to 127.255.255.255.
Class B addresses use the first two octets of the IP address as their Net ID and the last two
octets as the Host ID. The first two bits of the first octet are always set to binary 10, so
there are a maximum of 16,384 (14-bits) Class B networks. The first octet of a Class B
address has values from 128 to 191. The introduction of Class B networks expanded the
number of networks but it soon became clear that a further level would be needed.
Each Class B network can support a maximum of 65,536 unique hosts on its network. The
range of host address is from 128.0.0.0 to 191.255.255.255.
Class C addresses use the first three octets of the IP address as their Net ID and the last
octet as their Host ID. The first three bits of the first octet are set to binary 110, so almost
2.1 million (21-bits) Class C networks are available. The first octet of a Class C address has
values from 192 to 223. These are most common for smaller networks which don't have
many unique hosts.
Each Class C network can support up to 256 (8-bits) unique hosts. The range of host
address is from 192.0.0.0 to 223.255.255.255.
Typically, a range of IP addresses are requested from your Internet Service Provider (ISP)
by your organization's network administrator. Often your choice of which class of IP
address you are given depends on the size of your network and expected growth needs.
You can assign IP addresses to computers over a network manually or dynamically. When
you assign IP addresses manually, you add static (never changing) addresses to the
network. When you assign IP addresses dynamically (they can change every time you
reboot or even more often), the Dynamic Host Configuration Protocol (DHCP) is used
to assign IP addresses.
Before an IP address can be allocated manually, one must identify the size of the network
by determining the host range; this determines which network class (A, B, or C) can be
used. The ipcalc program can be used to ascertain the host range.
Note: The version of ipcalc supplied in the Fedora family of distributions does
not behave as described below, it is really a different program.
Assume that you have a Class C network. The first three octets of the IP address are
192.168.0. As it uses 3 octets (i.e. 24 bits) for the network mask, the shorthand for this
type of address is 192.168.0.0/24. To determine the host range of the address you can use
for this new host, at the command prompt, type: ipcalc 192.168.0.0/24 and press Enter.
From the result, you can check the HostMin and HostMax values to manually assign a
static address available from 1 to 254 (192.168.0.1 to 192.168.0.254).
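A sketch of what ipcalc reports (output trimmed; the exact layout varies
between ipcalc versions):
$ ipcalc 192.168.0.0/24
Network:   192.168.0.0/24
HostMin:   192.168.0.1
HostMax:   192.168.0.254
Hosts/Net: 254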
Given an IP address, you can obtain its corresponding hostname. Accessing the machine
over the network becomes easier when you can type the hostname instead of the IP
address.
You can view your system’s hostname simply by typing hostname with no argument.
Note: If you give an argument, the system will try to change its hostname to
match it; however, only the root user can do that.
The special hostname localhost is associated with the IP address 127.0.0.1, and describes
the machine you are currently on (which normally has additional network-related IP
addresses).
Network interfaces are a connection channel between a device and a network. Physically,
network interfaces can proceed through a network interface card (NIC) or can be more
abstractly implemented as software. You can have multiple network interfaces operating at
once. Specific interfaces can be brought up (activated) or brought down (de-activated) at
any time.
A list of currently active network interfaces is reported by the ifconfig utility which you
may have to run as the superuser, or at least, give the full path, i.e., /sbin/ifconfig, on
some distributions.
Network configuration files are essential to ensure that interfaces function correctly.
For Fedora family system configuration, the routing and host information is contained
in /etc/sysconfig/network. The network interface configuration script is located at
/etc/sysconfig/network-scripts/ifcfg-eth0.
For SUSE family system configuration, the routing and host information and network
interface configuration scripts are contained in the /etc/sysconfig/network directory.
You can type /etc/init.d/network start to start the networking configuration for Fedora and
SUSE families.
ip is a very powerful program that can do many things. Older (and more specific) utilities
such as ifconfig and route are often used to accomplish similar tasks. A look at the
relevant man pages can tell you much more about these utilities.
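For example, two common ip invocations (both also appear in this chapter's
summary):
$ ip addr show     # show the IP addresses assigned to each network interface
$ ip route show    # show the kernel routing table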
ping is used to check whether or not a machine attached to the network can receive and
send data; i.e., it confirms that the remote host is online and is responding.
To check the status of the remote host, at the command prompt, type ping <hostname>.
ping is frequently used for network testing and management; however, its usage can
increase network load unacceptably. Hence, you can abort the execution of ping by
typing CTRL-C, or by using the -c option, which limits the number of packets that ping will
send before it quits. When execution stops, a summary is displayed.
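For example, to send exactly three packets to the local machine and then
print the summary:
$ ping -c 3 localhost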
A network requires the connection of many nodes. Data moves from source to destination
by passing through a series of routers and potentially across multiple networks. Servers
maintain routing tables containing the addresses of each node in the network. The IP
Routing protocols enable routers to build up a forwarding table that correlates final
destinations with the next hop addresses.
route is used to view or change the IP routing table. You may want to change
the IP routing table to add, delete, or modify specific (static) routes to
specific hosts or networks. The table explains some commands that can be used
to manage IP routing.
Task                         Command
Show current routing table   $ route -n
Add static route             $ route add -net address
Delete static route          $ route del -net address
traceroute is used to inspect the route which the data packet takes to reach the
destination host which makes it quite useful for troubleshooting network delays and errors.
By using traceroute you can isolate connectivity issues between hops, which helps
resolve them faster.
To print the route taken by the packet to reach the network host, at the command prompt,
type traceroute <domain>.
Now, let's learn about some additional networking tools. Networking tools are very useful
for monitoring and debugging network problems, such as network connectivity and
network traffic.
Graphical web browsers commonly used on Linux include:
• Firefox
• Google Chrome
• Chromium
• Epiphany
• Opera
Sometimes you either do not have a graphical environment to work in (or have reasons not
to use it) but still need to access web resources. In such a case, you can use non-graphical
(text-based) browsers, such as the following:
• lynx
• links
• w3m
Besides downloading, you may want to obtain information about a URL, such as the source
code being used. curl can be used from the command line or a script to read such
information. curl also allows you to save the contents of a web page to a file, as
does wget.
You can read a URL using curl <URL>. For example, if you want to read
http://www.linuxfoundation.org , type curl http://www.linuxfoundation.org.
To get the contents of a web page and store it to a file, type curl -o saved.html
http://www.mysite.com. The contents of the main index file at the website will be saved
in saved.html.
When you are connected to a network, you may need to transfer files from one machine to
another. File Transfer Protocol (FTP) is a well-known and popular method for
transferring files between computers using the Internet. This method is built on a client-
server model. FTP can be used within a browser or with standalone client programs.
FTP clients enable you to transfer files with remote computers using the FTP protocol.
These clients can be either graphical or command line tools. Filezilla, for example, allows
use of the drag-and-drop approach to transfer files between hosts. All web browsers
support FTP; all you have to do is give a URL like ftp://ftp.kernel.org, where the
usual http:// becomes ftp://.
Some command line FTP clients are:
• ftp
• sftp
• ncftp
• yafc (Yet Another FTP Client)
sftp is a very secure mode of connection, which uses the Secure Shell (ssh) protocol,
which we will discuss shortly. sftp encrypts its data and thus sensitive information is
transmitted more securely. However, it does not work with so-called anonymous FTP
(guest user credentials). Both ncftp and yafc are also powerful FTP clients which work on
a wide variety of operating systems including Windows and Linux.
Secure Shell (SSH) is a cryptographic network protocol used for secure data
communication. It is also used for remote services and other secure services between two
devices on the network and is very useful for administering systems which are not easily
available to physically work on but to which you have remote access.
To run my_command on a remote system via SSH, at the command prompt, type ssh
<remotesystem> my_command and press Enter. ssh then prompts you for the remote
password. You can also configure ssh to securely allow your remote access without typing
a password each time.
We can also move files securely using Secure Copy (scp) between two networked hosts.
scp uses the SSH protocol for transferring data.
To copy a local file to a remote system, at the command prompt, type scp <localfile>
<user@remotesystem>:/home/user/ and press Enter.
You will receive a prompt for the remote password. You can also configure scp so that it
does not prompt for a password for each transfer.
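A short sketch of both commands, assuming a hypothetical user alice with an
account on host server1:
$ ssh alice@server1 uptime                    # run a single command remotely
$ scp notes.txt alice@server1:/home/alice/    # copy a local file to the remote system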
You have completed this chapter. Let’s summarize the key concepts covered:
• The IP (Internet Protocol) address is a unique logical network address that is assigned
to a device on a network.
• IPv4 uses 32-bits for addresses and IPv6 uses 128-bits for addresses.
• Every IP address contains both a network and a host address field.
• There are five classes of network addresses available: A, B, C, D & E.
• DNS (Domain Name System) is used for converting Internet domain and host names
to IP addresses.
• The ifconfig program is used to display current active network interfaces.
• The commands ip addr show and ip route show can be used to view IP address and
routing information.
• You can use ping to check if the remote host is alive and responding.
• You can use the route utility program to manage IP routing.
• You can monitor and debug network problems using networking tools.
• Firefox, Google Chrome, Chromium, and Epiphany are the main graphical
browsers used in Linux.
• Non-graphical or text browsers used in Linux are Lynx, Links, and w3m.
• You can use wget to download webpages.
• You can use curl to obtain information about URL's.
• FTP (File Transfer Protocol) is used to transfer files over a network.
• ftp, sftp, ncftp, and yafc are command line FTP clients used in Linux.
• You can use ssh to run commands on remote systems.
Chapter 13
Command Line Tools
Irrespective of the role you play with Linux (system administrator,
developer, or user) you often need to browse through and parse text files,
and/or extract data from them. These are file manipulation operations. Thus
it is essential for the Linux user to become adept at performing certain
operations on files.
Most of the time such file manipulation is done at the command line which allows users to perform tasks more
efficiently than while using a GUI. Furthermore the command line is more suitable for automating often executed
tasks.
Indeed, experienced system administrators write customized scripts to accomplish such repetitive tasks,
standardized for each particular environment. We will discuss such scripting later in much detail.
In this section, we will concentrate on command line file and text manipulation related utilities.
cat is short for concatenate and is one of the most frequently used Linux
command line utilities. It is often used to read and print files as well as
for simply viewing file contents. To view a file, use the following
command:
$ cat <filename>
For example, cat readme.txt will display the contents of readme.txt on the terminal. Often the main
purpose of cat, however, is to combine (concatenate) multiple files together. You can perform the actions listed in
the following table using cat:
Command                     Usage
cat file1 file2             Concatenate multiple files and display the output; i.e., the entire content of the first file is followed by that of the second file
cat file1 file2 > newfile   Combine multiple files and save the output into a new file
cat file >> existingfile    Append a file to the end of an existing file
cat > file                  Any subsequent lines typed will go into the file, until CTRL-D is typed
cat >> file                 Any subsequent lines are appended to the file, until CTRL-D is typed
The tac command (cat spelled backwards) prints the lines of a file in reverse order. (Each line remains the same
but the order of lines is inverted.) The syntax of tac is exactly the same as for cat as in:
$ tac file
$ tac file1 file2 > newfile
cat can be used to read from standard input (such as the terminal window) if
no files are specified. You can use the > operator to create and add lines
into a new file, and the >> operator to append lines (or files) to an
existing file.
To create a new file, at the command prompt type cat > <filename> and press the Enter key.
This command creates a new file and waits for the user to edit/enter the text. After you finish typing the required
text, press CTRL-D at the beginning of the next line to save and exit the editing.
Another way to create a file at the terminal is cat > <filename> << EOF. A new file is created and you can
type the required input. To exit, enter EOF at the beginning of a line.
Note that EOF is case sensitive. (One can also use another word, such as STOP.)
echo
echo simply displays (echoes) text. It is used simply as in:
$ echo string
echo can be used to display a string on standard output (i.e., the terminal) or to place in a new file (using the >
operator) or append to an already existing file (using the >> operator).
The -e option, along with the following switches, is used to enable special
character sequences, such as the newline character or horizontal tab:
• \n represents newline
• \t represents horizontal tab
echo is particularly useful for viewing the values of environment variables (built-in shell variables). For example,
echo $USERNAME will print the name of the user who has logged into the current terminal.
Command                       Usage
echo string > newfile         The specified string is placed in a new file
echo string >> existingfile   The specified string is appended to the end of an already existing file
Note that many Linux users and administrators will write scripts using more comprehensive language utilities
such as python and perl, rather than use sed and awk (and some other utilities we'll discuss later.) Using such
utilities is certainly fine in most circumstances; one should always feel free to use the tools one is experienced
with. However, the utilities that are described here are much lighter; i.e., they use fewer system resources, and
execute faster. There are times (such as during booting the system) where a lot of time would be wasted using the
more complicated tools, and the system may not even be able to run them. So the simpler tools will always be
needed.
sed is a powerful text processing tool and is one of the oldest and most
popular UNIX utilities. It is used to modify the contents of a file,
usually placing the contents into a new file. Its name is an abbreviation
for stream editor.
sed can filter text, as well as perform substitutions in data streams.
Data from an input source/file (or stream) is taken and moved to a working space. The entire list of
operations/modifications is applied over the data in the working space and the final contents are moved to the
standard output space (or stream).
Command                     Usage
sed -e command <filename>   Specify editing commands at the command line, operate on file and put the output on standard out (e.g., the terminal)
Now that you know that you can perform multiple editing and filtering
operations with sed, let’s explain some of them in more detail. The table
explains some basic operations, where pattern is the current string
and replace_string is the new string:
Command                                  Usage
sed s/pattern/replace_string/ file       Substitute the first occurrence of a string in every line
sed s/pattern/replace_string/g file      Substitute all occurrences of a string in every line
sed 1,3s/pattern/replace_string/g file   Substitute all occurrences of a string in a range of lines
sed -i s/pattern/replace_string/g file   Save changes for string substitution in the same file
You must use the -i option with care, because the action is not reversible. It is always safer to use sed without the
-i option and then replace the file yourself, as shown in the following example:
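A minimal sketch of that safer workflow, with hypothetical files file1 and file2:
$ sed s/pattern/replace_string/g file1 > file2   # write the edited version to a new file
$ cp file2 file1                                 # after checking file2, replace the original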
awk
awk is used to extract and then print specific contents of a file and is often used to construct reports. It was
created at Bell Labs in the 1970s and derived its name from the last names of its authors: Alfred Aho, Peter
Weinberger, and Brian Kernighan.
Command                            Usage
awk 'command' var=value file       Specify a command directly at the command line
awk -f scriptfile var=value file   Specify a file that contains the script to be executed, along with the -f option
As with sed, short awk commands can be specified directly at the command line, but a more complex script can
be saved in a file that you can specify using the -f option.
The command/action in awk needs to be surrounded with single quotes ('). awk can be used as
follows:
Command                              Usage
awk '{ print $0 }' /etc/passwd       Print the entire file
awk -F: '{ print $1 }' /etc/passwd   Print the first field (column) of every line, using : as the field separator
In this section, you will learn about the following additional utilities for
manipulating text files:
• sort
• uniq
• paste
• join
• split
You will also learn about regular expressions and search patterns.
sort
sort is used to rearrange the lines of a text file either in ascending or descending order, according to a sort key.
You can also sort by particular fields of a file. The default sort key is the order of the ASCII characters (i.e.,
essentially alphabetically).
Syntax                   Usage
sort <filename>          Sort the lines in the specified file
cat file1 file2 | sort   Append the two files, then sort the lines and display the output on the terminal
sort -r <filename>       Sort the lines in reverse order
When used with the -u option, sort checks for unique values after sorting the records (lines). It is equivalent to
running uniq (which we shall discuss) on the output of sort.
uniq is used to remove duplicate lines in a text file and is useful for
simplifying the text display. uniq requires that the duplicate entries to be
removed be consecutive. Therefore, one often runs sort first and then pipes
the output into uniq; if sort is passed the -u option, it can do all this in
one step. In the example below, the file is called names and originally
contains the lines Ted, Bob, Alice, Bob, Carol, Alice.
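A quick sketch of this, using the names file just described (one name per line):
$ sort names | uniq
Alice
Bob
Carol
Ted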
To remove duplicate entries from some files, use the following command:
sort file1 file2 | uniq > file3
OR
sort -u file1 file2 > file3
To count the number of duplicate entries, use the following command: uniq -c filename
Note: The next screen covers the Try-It-Yourself activity through which you can practice the procedure.
paste
Suppose you have a file that contains the full name of all employees and another file that lists their phone numbers
and Employee IDs. You want to create a new file that contains all the data listed in three columns: name,
employee ID, and phone number. How can you do this effectively without investing too much time?
paste can be used to create a single file containing all three columns. The different columns are identified based
on delimiters (characters used to separate two fields). For example, delimiters can be a blank space, a tab, or a
newline. In the examples used here, a single space is used as the delimiter in all files.
• -d delimiters, which specify a list of delimiters to be used instead of tabs for separating consecutive values
on a single line. Each delimiter is used in turn; when the list has been exhausted, paste begins again at the
first delimiter.
• -s, which causes paste to append the data in series rather than in parallel; that is, in a horizontal rather than
vertical fashion.
paste can be used to combine fields (such as name or phone number) from
different files as well as combine lines from multiple files. For example,
line one from file1 can be combined with line one of file2, line two from
file1 can be combined with line two of file2, and so on.
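A minimal sketch (names and phones are hypothetical files, one entry per line):
$ paste -d' ' names phones
This joins line one of names with line one of phones, line two with line two, and so on, using a single space as the delimiter.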
join
Suppose you have two files with some similar columns. You have saved employees’ phone numbers in two files,
one with their first name and the other with their last name. You want to combine the files without repeating the
data of common columns. How do you achieve this?
The above task can be achieved using join, which is essentially an enhanced version of paste. It first checks
whether the files share common fields, such as names or phone numbers, and then joins the lines in two files
based on a common field.
To combine two files on a common field, at the command prompt type join
file1 file2 and press the Enter key.
For example, the common field (i.e., it contains the same values) among the phonebook and directory files is the
phone number, as shown by the output of the following cat commands:
$ cat phonebook
555-123-4567 Bob
555-231-3325 Carol
555-340-5678 Ted
555-289-6193 Alice
$ cat directory
555-123-4567 Anytown
555-231-3325 Mytown
555-340-5678 Yourtown
555-289-6193 Youngstown
The result of joining these two files is shown in the output of the following command:
$ join phonebook directory
555-123-4567 Bob Anytown
555-231-3325 Carol Mytown
555-340-5678 Ted Yourtown
555-289-6193 Alice Youngstown
split
split is used to break up (or split) a file into equal-sized segments for easier viewing and manipulation, and is
generally used only on relatively large files. By default split breaks up a file into 1,000-line segments. The
original file remains unchanged, and set of new files with the same name plus an added prefix is created. By
default, the x prefix is added. To split a file into segments, use the command split infile.
To split a file into segments using a different prefix, use the command split infile <Prefix>.
$ wc -l american-english
99171 american-english
where we have used the wc program (soon to be discussed) to report on the number of lines in the file. Then
typing:
$ split american-english dictionary
will split the american-english file into equal-sized segments whose names begin with dictionary:
$ ls -l dictionary*
-rw-rw-r-- 1 me me 8653 Mar 23 20:19 dictionaryaa
-rw-rw-r-- 1 me me 8552 Mar 23 20:19 dictionaryab
. . .
Many text editors and utilities such as vi, sed, awk, find and grep work extensively with regular expressions.
Some of the popular computer languages that use regular expressions include Perl, Python and Ruby. It can get
rather complicated and there are whole books written about regular expressions; we'll only skim the surface here.
These regular expressions are different from the wildcards (or "metacharacters") used in filename matching in
command shells such as bash (which were covered in the earlier Chapter on Command Line Operations). The
table lists search patterns and their usage.
Consider the following sentence: the quick brown fox jumped over the lazy dog.
Some of the patterns that can be applied to this sentence are as follows:
Command Usage
a..       matches azy
b.|j.     matches both br and ju
..$       matches og
l.*       matches lazy dog
l.*y      matches lazy
the.*     matches the whole sentence
grep
grep is extensively used as a primary text searching tool. It scans files for specified patterns and can be used with
regular expressions as well as simple strings as shown in the table.
Command Usage
grep [pattern] <filename>         Search for a pattern in a file and print all matching lines
grep -v [pattern] <filename>      Print all lines that do not match the pattern
grep [0-9] <filename>             Print the lines that contain the numbers 0 through 9
grep -C 3 [pattern] <filename>    Print context of lines (the specified number of lines above and below the pattern) for lines matching the pattern; here, the number of lines is specified as 3.
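For instance, using the phonebook file shown earlier in the join discussion:
$ grep Alice phonebook
555-289-6193 Alice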
tr
In this section, you will learn about some additional text utilities that
you can use for performing various actions on your Linux files, such as
changing the case of letters or determining the count of words, lines, and
characters in a file.
The tr utility is used to translate specified characters into other characters or to delete them. The general syntax is
as follows:
$ tr [options] set1 [set2]
The items in the square brackets are optional. tr requires at least one argument and accepts a maximum of two.
The first, designated set1 in the example, lists the characters in the text to be replaced or removed. The second,
set2, lists the characters that are to be substituted for the characters listed in the first argument. Sometimes these
sets need to be surrounded by single quotes (') to keep the shell from interpreting characters that have special
meaning to it. It is usually safe (and may be required) to use single quotes around each of the
sets, as you will see in the examples below.
For example, suppose you have a file named city containing several lines of text in mixed case. To translate all
lower case characters to upper case, at the command prompt type cat city | tr a-z A-Z and press the
Enter key.
Command Usage
$ tr abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ    Convert lower case to upper case
$ tr '{}' '()' < inputfile > outputfile                       Translate braces into parentheses
$ echo "This is for testing" | tr [:space:] '\t'              Translate white space to tabs
$ echo "the geek stuff" | tr -d 't'                           Delete specified characters using the -d option
$ echo "my username is 432234" | tr -cd [:digit:]             Complement the sets using the -c option
tee
tee takes the output from any command, and while sending it to standard
output, it also saves it to a file. In other words, it "tees" the output
stream from the command: one stream is displayed on the standard output and
the other is saved to a file.
For example, to list the contents of a directory on the screen and save the output to a file, at the command prompt
type ls -l | tee newfile and press the Enter key.
wc
wc (word count) counts the number of lines, words, and characters in a file or list of files. For example, to print
the number of lines contained in a file, at the command prompt type wc -l filename and press the Enter key.
Option Description
-l    Display the number of lines
-w    Display the number of words
-c    Display the number of bytes
cut
cut is used to extract specific fields (columns) from files. For example, to display the third column delimited by a
blank space, at the command prompt type ls -l | cut -d" " -f3 and press the Enter key.
Note: The next two screens cover the Try-It-Yourself activities through which you can practice the
procedures.
less
For example, a banking system might maintain one simple large log file to record details of all of one day's ATM
transactions. Due to a security attack or a malfunction, the administrator might be forced to check for some data
by navigating within the file. In such cases, directly opening the file in an editor will cause issues, due to high
memory utilization, as an editor will usually try to read the whole file into memory first. However, one can use
less to view the contents of such a large file, scrolling up and down page by page without the system having to
place the entire file in memory before starting. This is much faster than using a text editor.
Viewing the file can be done by typing either of the two following commands:
$ less <filename>
$ cat <filename> | less
By default, manual (i.e., the man command) pages are sent through the less command.
head
head reads the first few lines of each named file (10 by default) and
displays them on standard output. You can give a different number of lines
as an option.
For example, if you want to print the first 5 lines from atmtrans.txt, use the following command:
$ head -n 5 atmtrans.txt
tail
tail prints the last few lines of each named file (10 by default) on standard output. For example, to display the
last 15 lines of atmtrans.txt, use the following command:
$ tail -n 15 atmtrans.txt
(You can also just say tail -15 atmtrans.txt.) To continually monitor new output in a growing log file:
$ tail -f atmtrans.txt
This command will continuously display any new lines of output in atmtrans.txt as soon as they appear. Thus it
enables you to monitor any current activity that is being reported and recorded.
Command Description
$ zcat compressed-file.txt.gz                     To view a compressed file
$ zless <filename>.gz or $ zmore <filename>.gz    To page through a compressed file
$ zgrep -i less test-file.txt.gz                  To search inside a compressed file
$ zdiff filename1.txt.gz filename2.txt.gz         To compare two compressed files
Note that if you run zless on an uncompressed file, it will still work and ignore the decompression stage. There are
also equivalent utility programs for other compression methods besides gzip; i.e., we have bzcat and
bzless associated with bzip2, and xzcat and xzless associated with xz.
Note: The next two screens cover the Try-It-Yourself activities through which you can practice the
procedures.
You have completed this chapter. Let’s summarize the key concepts covered:
• The command line often allows the users to perform tasks more efficiently than the GUI.
• cat, short for concatenate, is used to read, print and combine files.
• echo displays a line of text on standard output, or can place it in a file through redirection.
• sed is a popular stream editor often used to filter and perform substitutions on files and text data streams.
• awk is an interpreted programming language typically used as a data extraction and reporting tool.
• sort is used to sort text files and output streams in either ascending or descending order.
• uniq eliminates duplicate entries in a text file.
• paste combines fields from different files and can also extract and combine lines from multiple sources.
• join combines lines from two files based on a common field. It works only if files share a common field.
• split breaks up a large file into equal-sized segments.
• Regular expressions are text strings used for pattern matching. The pattern can be used to search for a
specific location, such as the start or end of a line or a word.
• grep searches text files and data streams for patterns and can be used with regular expressions.
• tr translates characters, copies standard input to standard output, and handles special characters.
• tee saves a copy of standard output to a file while still displaying it
at the terminal.
• wc (word count) displays the number of lines, words and characters in a file or group of files.
• cut extracts columns from a file.
• less views files a page at a time and allows scrolling in both directions.
• head displays the first few lines of a file or data stream on standard output. By default it displays 10 lines.
• tail displays the last few lines of a file or data stream on standard output. By default it displays 10 lines.
• strings extracts printable character strings from binary files.
• The z command family is used to read and work with compressed files.
Chapter 14
Introduction to Printing
To manage printers and print directly from a computer or across a networked
environment, you need to know how to configure and install a printer.
Printing itself requires software that converts information from the
application you are using to a language your printer can understand. The
Linux standard for printing software is the Common UNIX Printing System
(CUPS).
CUPS is the software that is used behind the scenes to print from
applications like a web browser or LibreOffice. It converts page
descriptions produced by your application (put a paragraph here, draw a line
there, and so forth) and then sends the information to the printer. It acts
as a print server for local as well as network printers.
Printers manufactured by different companies may use their own particular print languages and formats. CUPS
uses a modular printing system which accommodates a wide variety of printers and also processes various data
formats. This makes the printing process simpler; you can concentrate more on printing and less on how to print.
Generally, the only time you should need to configure your printer is when you use it for the first time. In
fact, CUPS often figures things out on its own by detecting and configuring any printers it locates.
CUPS carries out the printing process with the help of several components:
• Configuration Files
• Scheduler
• Job Files
• Log Files
• Filter
• Printer Drivers
• Backend
You will learn about each of these components in detail in the next few screens.
Scheduler
CUPS is designed around a print scheduler that manages print jobs, handles administrative commands, allows
users to query the printer status, and manages the flow of data through all CUPS components.
As you'll see shortly, CUPS has a browser-based interface which allows you to view and manipulate the order and
status of pending print jobs.
Configuration Files
The print scheduler reads server settings from several configuration files, the two most important of which are
cupsd.conf and printers.conf. These and all other CUPS related configuration files are stored under
the /etc/cups/ directory.
cupsd.conf is where most system-wide settings are located; it does not contain any printer-specific details.
Most of the settings available in this file relate to network security, i.e. which systems can access CUPS network
capabilities, how printers are advertised on the local network, what management features are offered, and so on.
printers.conf is where you will find the printer-specific settings. For every printer connected to the system,
a corresponding section describes the printer’s status and capabilities. This file is generated only after adding a
printer to the system and should not be modified by hand.
You can view the full list of configuration files by typing: ls -l /etc/cups/
Job Files
CUPS stores print requests as files under the /var/spool/cups directory (these can actually be accessed
before a document is sent to a printer). Data files are prefixed with the letter d while control files are prefixed with
the letter c. After a printer successfully handles a job, data files are automatically removed. These data files belong
to what is commonly known as the print queue.
Log Files
Log files are placed in /var/log/cups and are used by the scheduler to record activities that have taken
place. These files include access, error, and page records.
You can view the log files with the usual tools, as in:
$ sudo ls -l /var/log/cups
(Note: on some distributions, permissions are set such that you don't need the sudo.)
In short, when you execute a print command, the scheduler validates the command and processes the print job,
creating job files according to the settings specified in the configuration files. Simultaneously, the scheduler records
activities in the log files. Job files are processed with the help of the filter, printer driver, and backend, and then
sent to the printer.
Installing CUPS
Due to printing being a relatively important and fundamental feature of any Linux distribution, most Linux
systems come with CUPS preinstalled. In some cases, especially for Linux server setups, CUPS may have been
left uninstalled. This may be fixed by installing the corresponding package manually. To install CUPS, please
ensure that your system is connected to the Internet.
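As a hedged sketch (the package is typically named cups, but package names and tools vary by distribution):
$ sudo apt-get install cups     (on Debian/Ubuntu family systems)
$ sudo dnf install cups         (on Fedora family systems)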
Managing CUPS
After installing CUPS, you'll need to start and manage the CUPS daemon so that CUPS is ready for configuring a
printer. Managing the CUPS daemon is simple; all management features are wrapped around the cups init script,
which can be easily started, stopped, and restarted.
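On a systemd-based distribution, a typical session looks like the following (the service name cups is an assumption; some systems call the unit cupsd):
$ sudo systemctl start cups
$ sudo systemctl status cups
$ sudo systemctl restart cups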
When configuring a printer, make sure the device is currently turned on and connected to the system; if so it
should show up in the printer selection menu. If the printer is not visible, you may want to troubleshoot using
tools that will determine if the printer is connected. For common USB printers, for example, the lsusb utility will
show a line for the printer. Some printer manufacturers also require some extra software to be installed in order to
make the printer visible to CUPS, however, due to the standardization these days, this is rarely required.
Through the CUPS web interface, you can:
• Configure printers:
– Local/remote printers
– Share a printer as a CUPS server
• Control print jobs:
– Monitor jobs
Some pages require a username and password to perform certain actions, for example to add a printer. For most
Linux distributions, you must use the root password to add, modify, or delete printers or classes.
These commands are useful in cases where printing operations must be automated (from shell scripts, for instance,
which contain multiple commands in one file). You will learn more about the shell scripts in the upcoming
chapters on bash scripts.
lp is just a command line front-end to the lpr utility that passes input to lpr. Thus, we will discuss only lp in
detail. In the example shown here, the task is to print the file called test1.txt.
Using lp
lp and lpr accept command line options that help you perform all operations that the GUI can
accomplish. lp is typically used with a file name as an argument.
Some lp commands and other printing utilities you can use are listed in the table.
Command Usage
lp <filename>               To print the file to the default printer
lp -d printer <filename>    To print to a specific printer (useful if multiple printers are available)
program | lp
echo string | lp            To print the output of a program
In Linux, command line print job management commands allow you to monitor the job state, list all printers and
check their status, and cancel or move print jobs to another printer.
Command Usage
lpstat -p -d To get a list of available printers, along with their status
PostScript is a standard page description language widely used in printing. Some of its virtues:
• It can be used on any printer that is PostScript-compatible; i.e., any modern printer
• Any program that understands the PostScript specification can print to it
• Information about page appearance, etc. is embedded in the page
The commands that can be used with enscript are listed in the table below (for a file called 'textfile.txt').
Command Usage
enscript -p psfile.ps textfile.txt       Convert a text file to PostScript (saved to psfile.ps)
enscript -n -p psfile.ps textfile.txt    Convert a text file to n columns, where n=1-9 (saved in psfile.ps)
enscript textfile.txt                    Print a text file directly to the default printer
1. Evince is available on virtually all distributions and is the most widely used program.
2. Okular is based on the older kpdf and available on any distribution that provides the KDE environment.
3. GhostView is one of the first open source PDF readers and is universally available.
4. Xpdf is one of the oldest open source PDF readers and still has a good user base.
All of these open source PDF readers can also read files following the PostScript standard, unlike the
proprietary Adobe Acrobat Reader. Acrobat Reader was once widely used on Linux systems, but with the growth
of these excellent programs, very few Linux users use it today.
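pdftk is usually available through your distribution's package manager; for example (a hedged sketch, since package availability varies by release, and newer Ubuntu releases ship pdftk-java instead):
On Ubuntu or Debian:
$ sudo apt-get install pdftk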
On openSUSE:
$ sudo zypper install pdftk
You may find that CentOS (and RHEL) don't have pdftk in their packaging system, but you can obtain the PDF
Toolkit directly from the PDF Lab’s website by downloading from:
http://www.pdflabs.com/docs/install-pdftk-on-redhat-or-centos/
Using pdftk
You can accomplish a wide variety of tasks using pdftk including:
Command Usage
pdftk 1.pdf 2.pdf cat output 12.pdf             Merge the two documents 1.pdf and 2.pdf. The output will be saved to 12.pdf.
pdftk A=1.pdf cat A1-2 output new.pdf           Write only pages 1 and 2 of 1.pdf. The output will be saved to new.pdf.
pdftk A=1.pdf cat A1-endright output new.pdf    Rotate all pages of 1.pdf 90 degrees clockwise and save the result in new.pdf.
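pdftk can also encrypt a PDF file. A typical invocation (a sketch, with file names matching the description that follows) is:
$ pdftk public.pdf output private.pdf user_pw PROMPT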
When you run this command, you will receive a prompt to set the required password, which can have a maximum
of 32 characters. A new file, private.pdf, will be created with the identical content as public.pdf, but
anyone will need to type the password to be able to view it.
pdfinfo can extract information about PDF files, especially when the files are very large or when a graphical
interface is not available.
flpsed can add data to a PostScript document. This tool is specifically useful for filling in forms or adding short
comments into the document.
pdfmod is a simple application that provides a graphical interface for modifying PDF documents. Using this tool,
you can reorder, rotate, and remove pages; export images from a document; edit the title, subject, and author; add
keywords; and combine documents using drag-and-drop action.
For example, to collect the details of a document, you can use the following command:
$ pdfinfo /usr/share/doc/readme.pdf
Converting Between PostScript and PDF
Most users today are far more accustomed to working with files in PDF format, viewing them easily either on the
Internet through their browser or locally on their machine. The PostScript format is still important for various
technical reasons that the general user will rarely have to deal with.
From time to time you may need to convert files from one format to the other, and there are very simple utilities
for accomplishing that task. ps2pdf and pdf2ps are part of the ghostscript package installed on or available on all
Linux distributions. As an alternative, there are pstopdf and pdftops which are usually part of the poppler
package which may need to be added through your package manager. Unless you are doing a lot of conversions or
need some of the fancier options (which you can read about in the man pages for these utilities) it really doesn't
matter which ones you use.
Command Usage
pdf2ps file.pdf                  Converts file.pdf to file.ps
ps2pdf file.ps                   Converts file.ps to file.pdf
pstopdf input.ps output.pdf      Converts input.ps to output.pdf
pdftops input.pdf output.ps      Converts input.pdf to output.ps
You have completed this chapter. Let’s summarize the key concepts covered:
• CUPS provides two command-line interfaces: the System V and BSD interfaces.
• The CUPS interface is available at http://localhost:631
• lp and lpr are used to submit a document to CUPS directly from the command line.
• lpoptions can be used to set printer options and defaults.
• PostScript effectively manages scaling of fonts and vector graphics to provide quality prints.
• enscript is used to convert a text file to PostScript and other formats.
• Portable Document Format (PDF) is the standard format used to exchange documents while ensuring a
certain level of consistency in the way the documents are viewed.
• pdftk joins and splits PDFs; pulls single pages from a file; encrypts and decrypts PDF files; adds, updates,
and exports a PDF’s metadata; exports bookmarks to a text file; adds or removes attachments to a PDF; fixes
a damaged PDF; and fills out PDF forms.
• pdfinfo can extract information about PDF documents.
• flpsed can add data to a PostScript document.
• pdfmod is a simple application with a graphical interface that you can use to modify PDF documents.
Chapter 15
bash shell scripting
Introduction to Scripts
Suppose you want to look up a filename, check if the associated file exists, and then respond accordingly,
displaying a message confirming or not confirming the file's existence. If you only need to do it once, you can just
type a sequence of commands at a terminal. However, if you need to do this multiple times, automation is the way
to go. In order to automate sets of commands you’ll need to learn how to write shell scripts, the most common of
which are used with bash. The graphic illustrates several of the benefits of deploying scripts.
The #!/bin/bash in the first line should be recognized by anyone who has developed any kind of script in
UNIX environments. The first line of the script, that starts with #!, contains the full path of the command
interpreter (in this case /bin/bash) that is to be used on the file. As we will see on the next screen, you have a
few choices depending upon which scripting language you use.
Command Shell Choices
The command interpreter is tasked with executing statements that follow it in the script. Commonly used
interpreters include: /usr/bin/perl, /bin/bash, /bin/csh, /usr/bin/python and /bin/sh.
Typing a long sequence of commands at a terminal window can be complicated, time consuming, and error prone.
By deploying shell scripts, using the command-line becomes an efficient and quick way to launch complex
sequences of steps. The fact that shell scripts are saved in a file also makes it easy to use them to create new script
variations and share standard procedures with several users.
Linux provides a wide choice of shells; exactly what is available on the system is listed in /etc/shells.
Typical choices are:
/bin/sh
/bin/bash
/bin/tcsh
/bin/csh
/bin/ksh
Most Linux users use the default bash shell, but those with long UNIX backgrounds with other shells may want to
override the default.
bash Scripts
Let's write a simple bash script that displays a two-line message on the screen. Either type
$ cat > exscript.sh
#!/bin/bash
echo "HELLO"
echo "WORLD"
and press ENTER and CTRL-D to save the file, or just create exscript.sh in your favorite text editor. Then,
type chmod +x exscript.sh to make the file executable. (The chmod +x command makes the file
executable for all users.) You can then run it by simply typing ./exscript.sh or by doing:
$ bash exscript.sh
HELLO
WORLD
Note if you use the second form, you don't have to make the file executable.
Interactive Example Using bash Scripts
Now, let's see an example of an interactive script (saved as ioscript.sh) that reads a variable from the user:
#!/bin/bash
# Interactive reading of variables
echo "ENTER YOUR NAME"
read sname
# Display of variable values
echo $sname
In the above example, when the script ./ioscript.sh is executed, the user will receive a prompt ENTER
YOUR NAME. The user then needs to enter a value and press the Enter key. The value will then be printed out.
Additional note: The hash-tag/pound-sign/number-sign (#) is used to start comments in the script and can be
placed anywhere in the line (the rest of the line is considered a comment).
Return Values
All shell scripts generate a return value upon finishing execution; the value can be set with the exit statement.
Return values permit a process to monitor the exit state of another process often in a parent-child relationship.
This helps to determine how this process terminated and take any appropriate steps necessary, contingent on
success or failure.
$ ls /etc/passwd
/etc/passwd
$ echo $?
0
In this example, the system is able to locate the file /etc/passwd and returns a value of 0 to indicate success;
the return value is always stored in the $? environment variable. Applications often translate these return values
into meaningful messages easily understood by the user.
Character Description
#    Used to add a comment, except when used as \#, or as #! when starting a script
\    Used at the end of a line to indicate continuation on to the next line
;    Used to interpret what follows as a new command
$    Indicates what follows is a variable
Note that when # is inserted at the beginning of a line of commentary, the whole line is ignored.
The concatenation operator (\) is used to concatenate large commands over several lines in the shell.
For example, you want to copy the file /var/ftp/pub/userdata/custdata/read from server1.linux.com to the
/opt/oradba/master/abc directory on server3.linux.co.in. To perform this action, you can write the command
using the \ operator as:
scp [email protected]:\
/var/ftp/pub/userdata/custdata/read \
[email protected]:\
/opt/oradba/master/abc/
The command is divided into multiple lines to make it more readable and easier to understand. The \ operator at the
end of each line joins the lines so that they are executed as one single command.
The three commands in the following example will all execute even if the ones preceding them fail:
$ make ; make install ; make clean
However, you may want to abort subsequent commands if one fails. You can do this using the && (and) operator
as in:
$ make && make install && make clean
If the first command fails the second one will never be executed. A final refinement is to use the || (or) operator
as in:
$ cat file1 || cat file2 || cat file3
In this case, you proceed until something succeeds and then you stop executing any further steps.
Functions
A function is a code block that implements a set of operations. Functions are useful for executing procedures
multiple times perhaps with varying input variables. Functions are also often called subroutines. Using functions
in scripts requires two steps:
1. Declaring a function
2. Calling a function
The function declaration requires a name which is used to invoke it. The proper syntax is:
function_name () {
command...
}
display () {
echo "This is a sample function"
}
The function can be as long as desired and have many statements. Once defined, the function can be called later as
many times as necessary. In the example below, we also show an often-used refinement:
how to pass an argument to the function. The first argument can be referred to as $1, the second as $2, etc.
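A minimal sketch of such a function taking an argument (the function name and messages are arbitrary choices):
#!/bin/bash
display () {
    echo "This is a sample function"
    echo "The first argument passed in is: $1"
}
display Hello
Running this script prints the two messages, with Hello substituted for $1.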
• Compiled applications
• Built-in bash commands
• Other scripts
Compiled applications are binary executable files that you can find on the filesystem. The shell script always has
access to compiled applications such as rm, ls, df, vi, and gzip.
bash has many built-in commands, which can be used only within a terminal shell or shell
script. Sometimes these commands have the same name as executable programs on the system (such as echo),
which can lead to subtle problems. bash built-in commands include cd, pwd, echo, read, logout,
printf, let, and ulimit.
A complete list of bash built-in commands can be found in the bash man page, or by simply typing help.
Command Substitution
At times, you may need to substitute the result of a command as a portion of another command. It can be done in
two ways:
• By enclosing the inner command in $( )
• By enclosing the inner command with backticks (`)
Virtually any command can be executed this way. Both of these methods enable command substitution; however,
the $( ) method allows command nesting. New scripts should always use this more modern method. For
example:
$ cd /lib/modules/$(uname -r)/
In the above example, the output of the command uname -r becomes the argument for the cd command.
Environment Variables
Almost all scripts use variables containing a value, which can be used anywhere in the script. These variables can
either be user or system defined. Many applications use such environment variables (covered in the "User
Environment" chapter) for supplying inputs, validation, and controlling behaviour.
Some examples of standard environment variables are HOME, PATH, and HOST. When referenced, environment
variables must be prefixed with the $ symbol as in $HOME. You can view and set the value of environment
variables. For example, the following command displays the value stored in the PATH variable:
$ echo $PATH
However, no prefix is required when setting or modifying the variable value. For example, the following
command sets the value of the MYCOLOR variable to blue:
$ MYCOLOR=blue
You can get a list of environment variables with the env, set, or printenv commands.
Exporting Variables
By default, the variables created within a script are available only to the subsequent steps of that script. Any child
processes (sub-shells) do not have automatic access to the values of these variables. To make them available to
child processes, they must be promoted to environment variables using the export statement as in:
export VAR=value
or
VAR=value ; export VAR
While child processes are allowed to modify the value of exported variables, the parent will not see any changes;
exported variables are not shared, but only copied.
Script Parameters
Users often need to pass parameter values to a script, such as a filename, date, etc. Scripts will take different
paths or arrive at different values according to the parameters (command arguments) that are passed to them.
These values can be text or numbers as in:
$ ./script.sh /tmp
$ ./script.sh 100 200
Within a script, the parameter or an argument is represented with a $ and a number. The table lists some of these
parameters.
Parameter Meaning
$0 Script name
$1 First parameter
$2, $3, etc. Second, third parameter, etc.
$* All parameters
$# Number of arguments
Using Script Parameters
Using your favorite text editor, create a new script file named script3.sh with the following contents:
#!/bin/bash
echo The name of this program is: $0
echo The first argument passed from the command line is: $1
echo The second argument passed from the command line is: $2
echo The third argument passed from the command line is: $3
echo All of the arguments passed from the command line are : $*
echo
echo All done with $0
Make the script executable with chmod +x. Run the script giving it three arguments, as in ./script3.sh one
two three, and the script is processed as follows:
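Since the script's behavior is fully determined by its arguments, the session looks like this:
$ ./script3.sh one two three
The name of this program is: ./script3.sh
The first argument passed from the command line is: one
The second argument passed from the command line is: two
The third argument passed from the command line is: three
All of the arguments passed from the command line are : one two three

All done with ./script3.sh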
Output Redirection
Most operating systems accept input from the keyboard and display the output on the terminal. However, in shell
scripting you can send the output to a file. The process of diverting the output to a file is called output
redirection.
The > character is used to write output to a file. For example, the following command sends the output of free to
the file /tmp/free.out:
$ free > /tmp/free.out
To check the contents of the /tmp/free.out file, at the command prompt type cat /tmp/free.out.
Two > characters (>>) will append output to a file if it exists, and act just like > if the file does not already exist.
Input Redirection
Just as the output can be redirected to a file, the input of a command can be read from a file. The process of
reading input from a file is called input redirection and uses the < character. If you create a file called
script8.sh with the following contents:
#!/bin/bash
echo "Line count"
wc -l < /tmp/free.out
and then execute it with chmod +x script8.sh ; ./script8.sh, it will count the number of lines from
the /tmp/free.out file and display the results.
The if Statement
Conditional decision making, using an if statement, is a basic construct that any useful programming or scripting
language must have.
When an if statement is used, the ensuing actions depend on the evaluation of specified conditions such as:
if [ -f /etc/passwd ]
then
echo "/etc/passwd exists."
fi
Notice the use of the square brackets ([ ]) to delineate the test condition. There are many other kinds of tests you
can perform, such as checking whether two numbers are equal to, greater than, or less than each other, and making
a decision accordingly; we will discuss these other tests below.
In modern scripts, you may see doubled brackets as in [[ -f /etc/passwd ]]. This is not an error. It is
never wrong to do so, and it avoids some subtle problems, such as referring to an empty environment variable
without surrounding it in double quotes; we won't talk about this here.
bash provides a set of file conditionals that can be used with the if statement, including:
Condition Meaning
-e file Check if the file exists.
-d file Check if the file is a directory.
-f file Check if the file is a regular file (i.e., not a symbolic link, device node,
directory, etc.)
-s file Check if the file is of non-zero size.
-g file Check if the file has sgid set.
-u file Check if the file has suid set.
-r file Check if the file is readable.
-w file Check if the file is writable.
-x file Check if the file is executable.
You can view the full list of file conditions using the command man 1 test.
You can use the if statement to compare strings using the operator == (two equal signs). The syntax is as
follows:
if [ string1 == string2 ] ; then
ACTION
fi
In the example below, the if statement is used to compare the input provided by the user and
accordingly display the result.
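A minimal sketch of such a script (the prompt text and the compared value are arbitrary choices):
#!/bin/bash
echo "Enter a string:"
read mystring
if [ "$mystring" == "yes" ] ; then
    echo "You typed yes"
else
    echo "You did not type yes"
fi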
Numerical Tests
You can use specially defined operators with the if statement to compare numbers. The various operators that are
available are listed in the table.
Operator Meaning
-eq Equal to
-ne Not equal to
-gt Greater than
-lt Less than
-ge Greater than or equal to
-le Less than or equal to
The syntax for comparing numbers is as follows:
exp1 -op exp2
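For example, a minimal numerical test (the values are arbitrary):
#!/bin/bash
x=10
y=5
if [ $x -gt $y ] ; then
    echo "$x is greater than $y"
fi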
Arithmetic Expressions
Arithmetic expressions can be evaluated in the following three ways (spaces are important!):
• Using the expr utility: expr is a standard but somewhat deprecated program. The syntax is as follows:
expr 8 + 8
echo $(expr 8 + 8)
• Using the $((...)) syntax: This is the built-in shell format. The syntax is as follows:
echo $((x+1))
• Using the built-in shell command let. The syntax is as follows:
let x=x+1
You have completed this chapter. Let's summarize the key concepts covered:
• Scripts are a sequence of statements and commands stored in a file that can be executed by a shell. The most
commonly used shell in Linux is bash.
• Command substitution allows you to substitute the result of a command as a portion of another command.
• Functions (or routines) are named groups of commands that can be called, possibly repeatedly, from within a script.
• Environmental variables are quantities either pre-assigned by the shell or defined and modified by the user.
• To make environment variables visible to child processes, they need to
be exported.
• Scripts can behave differently based on the parameters (values) passed to them.
• The process of writing the output to a file is called output redirection.
• The process of reading input from a file is called input redirection.
• The if statement is used to select an action based on a condition.
• Arithmetic expressions consist of numbers and arithmetic operators, such as +, -, and *.
Chapter 16
Advanced bash scripting
String Manipulation
Let’s go deeper and find out how to work with strings in scripts.
A string variable contains a sequence of text characters. It can include letters, numbers, symbols and punctuation
marks. Some examples: abcde, 123, abcde 123, abcde-123, &acbde=%123
String operators include those that do comparison, sorting, and finding the length. The following table
demonstrates the use of some basic string operators.
Operator Meaning
[[ string1 > string2 ]]     Compares the sorting order of string1 and string2
[[ string1 == string2 ]]    Compares the characters of string1 with the characters of string2
${#string}                  Returns the length of string
In the first example below, we compare the first string with the second string and display an appropriate message
using the if statement.
In the second example, we pass in a file name and see if that file exists in the current directory or not.
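A hedged sketch of both examples (the string values and messages are arbitrary; the file name is taken from the first script argument):
#!/bin/bash
# First example: compare two strings
string1="apple"
string2="banana"
if [[ "$string1" > "$string2" ]] ; then
    echo "$string1 sorts after $string2"
else
    echo "$string1 sorts before $string2"
fi
# Second example: check whether the file named by $1 exists
if [ -e "$1" ] ; then
    echo "$1 exists in the current directory"
else
    echo "$1 does not exist"
fi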
Parts of a String
At times, you may not need to compare or use an entire string. To extract the first character of a string we can
specify:
${string:0:1} Here 0 is the offset in the string (i.e., which character to begin from) where the extraction
needs to start and 1 is the number of characters to be extracted.
To extract all characters in a string after a dot (.), use the following expression: ${string#*.}
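A quick illustration (the variable contents are arbitrary):
$ string="apple.fruit"
$ echo ${string:0:1}
a
$ echo ${string#*.}
fruit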
Boolean Expressions
Boolean expressions evaluate to either TRUE or FALSE, and results are obtained using the various Boolean
operators listed in the table.
Operator Operation
&&    AND
||    OR
!     NOT
For example, to check if the value of number1 is greater than the value of number2, use the following conditional
test:
[ $number1 -gt $number2 ]
The case Statement
The case statement is used in scenarios where the actual value of a variable can lead to different execution paths,
and is often used instead of a long chain of if statements. Its basic structure is:
case expression in
pattern1) execute commands;;
pattern2) execute commands;;
pattern3) execute commands;;
pattern4) execute commands;;
* ) execute some default commands or nothing ;;
esac
Looping Constructs
By using looping constructs, you can execute one or more lines of code repetitively. Usually you do this until a
conditional test returns either true or false as is required.
• for
• while
• until
All these loops are easily used for repeating a set of statements until the exit condition is true.
The for loop operates on each element of a list of items; its syntax is:
for variable-name in list
do
    execute one iteration for each item in the list
done
In this case, variable-name and list are substituted by you as appropriate (see examples). As with other
looping constructs, the statements that are repeated should be enclosed by do and done.
The example below uses a for loop to print the sum of the numbers 1 to 4.
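A minimal version of that example:
#!/bin/bash
sum=0
for i in 1 2 3 4
do
    sum=$(($sum + $i))
done
echo "The sum of the numbers 1 to 4 is: $sum"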
The until loop repeats a set of statements as long as the control condition remains false. Similar to the while loop,
the set of commands that need to be repeated should be enclosed between do and done. You can use any command
or operator as the condition.
The example below uses an until loop to display the odd numbers between 1 and 10.
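A minimal version of that example:
#!/bin/bash
i=1
until [ $i -gt 10 ]
do
    echo $i
    i=$(($i + 2))
done
This prints 1, 3, 5, 7, and 9.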
Debugging helps you troubleshoot and resolve such errors, and is one of the most important tasks a system
administrator performs.
In bash shell scripting, you can run a script in debug mode by doing bash -x ./script_file. Debug
mode helps identify errors because:
• It traces and prefixes each command with the + character
• It displays each command before executing it
• It can debug only selected parts of a script (if desired), by enclosing the chosen region with set -x (turn on
debug mode) and set +x (turn off debug mode)
In the example below, a buggy shell script is executed with the errors redirected to the file error.txt. Using
cat to display the contents of error.txt then shows the errors from executing the buggy shell script
(presumably for further debugging).
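A hedged sketch of that procedure (buggy.sh is a hypothetical script name; the -x trace output goes to stderr, so it lands in error.txt along with any error messages):
$ bash -x ./buggy.sh 2> error.txt
$ cat error.txt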
Temporary files (and directories) are meant to store data for a short time. Usually one arranges it so that these files
disappear when the program using them terminates. While you can also use touch to create a temporary file, this
may make it easy for hackers to gain access to your data.
The best practice is to create random and unpredictable filenames for temporary storage. One way to do this is
with the mktemp utility as in these examples:
The XXXXXXXX is replaced by the mktemp utility with random characters to ensure the name of the temporary
file cannot be easily predicted and is only known within your program.
Command Usage
TEMP=$(mktemp /tmp/tempfile.XXXXXXXX)         To create a temporary file
TEMPDIR=$(mktemp -d /tmp/tempdir.XXXXXXXX)    To create a temporary directory
First, the danger: If someone creates a symbolic link from a known temporary file used by root to the
/etc/passwd file, like this:
$ ln -s /etc/passwd /tmp/tempfile
There could be a big problem if a script run by root has a line in it like this:
echo $VAR > /tmp/tempfile
To prevent such a situation make sure you randomize your temporary filenames by replacing the above line with
the following lines:
TEMP=$(mktemp /tmp/tempfile.XXXXXXXX)
echo $VAR > $TEMP
Discarding Output with /dev/null
Sometimes you want to get rid of a command's output entirely; the pseudo-file /dev/null (often called the bit
bucket) serves exactly this purpose. It discards all data that gets written to it and never returns a failure on write
operations. Using the proper redirection operators, it can make the output disappear from commands that would
normally generate output to stdout and/or stderr:
$ find / > /dev/null
In the above command, the entire standard output stream is ignored, but any errors will still appear on the console.
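To discard the error messages as well, stderr can be redirected to the same place:
$ find / > /dev/null 2>&1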
Random Numbers
The easiest way to generate a random number within a script is the environment variable method, using bash's built-in RANDOM variable, as the example below shows.
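A one-line illustration (the number shown is just a sample; yours will differ):
$ echo $RANDOM
24521
Each reference to RANDOM yields a new pseudo-random integer between 0 and 32767.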
Regardless of which of these two sources is used, the system maintains a so-called entropy pool of random bits,
and random numbers are created from this pool.
The Linux kernel offers the /dev/random and /dev/urandom device nodes, which draw on the entropy pool to
provide random numbers; the quality depends on the estimated number of bits of noise in the entropy pool.
/dev/random is used where very high quality randomness is required, such as one-time pad or key generation, but
it is relatively slow to provide values. /dev/urandom is faster and suitable (good enough) for most cryptographic
purposes.
Furthermore, when the entropy pool is empty, /dev/random is blocked and does not generate any number until
additional environmental noise (network traffic, mouse movement, etc.) is gathered whereas /dev/urandom reuses
the internal pool to produce more pseudo-random bits.
• You can manipulate strings to perform actions such as comparison, sorting, and finding length.
• You can use Boolean expressions when working with multiple data types including strings or numbers as
well as files.
• The output of a Boolean expression is either true or false.
• Operators used in Boolean expressions include the && (AND), || (OR), and ! (NOT) operators.
• We looked at the advantages of using the case statement in scenarios where
the value of a variable can lead to different execution paths.
• Script debugging methods help troubleshoot and resolve errors.
• The standard and error outputs from a script or shell commands can easily be redirected into the same file or
separate files to aid in debugging and saving results.
• Linux allows you to create temporary files and directories, which store data for a short duration, both saving
space and increasing security.
• Linux provides several different ways of generating random numbers, which are widely used.
Chapter 17
Processes
What Is a Process?
A process is simply an instance of one or more related tasks (threads) executing on your computer. It is not the
same as a program or a command; a single program may actually start several processes simultaneously. Some
processes are independent of each other and others are related. A failure of one process may or may not affect the
others running on the system.
Processes use many system resources, such as memory, CPU (central processing unit) cycles, and peripheral
devices such as printers and displays. The operating system (especially the kernel) is responsible for allocating a
proper share of these resources to each process and ensuring overall optimum utilization.
Process Types
A terminal window (one kind of command shell), is a process that runs as long as needed. It allows users to
execute programs and access resources in an interactive environment. You can also run programs in the
background, which means they become detached from the shell.
Processes can be of different types according to the task being performed. Here are some different process types
along with their descriptions and examples.
However, sometimes processes go into what is called a sleep state, generally when they are waiting for something
to happen before they can resume, perhaps for the user to type something. In this condition a process is sitting in a
wait queue.
There are some other less frequent process states, especially when a process is terminating. Sometimes a child
process completes but its parent process has not asked about its state. Amusingly such a process is said to be in a
zombie state; it is not really alive but still shows up in the system's list of processes.
New PIDs are usually assigned in ascending order as processes are born. Thus PID 1 denotes the init process
(initialization process), and succeeding processes are gradually assigned higher numbers.
ID Type Description
Process ID (PID)            Unique Process ID number
Parent Process ID (PPID)    Process (Parent) that started this process
Thread ID (TID)             Thread ID number. This is the same as the PID for single-threaded processes. For a multi-threaded process, each thread shares the same PID but has a unique TID.
User and Group IDs
Many users can access a system simultaneously, and each user can run multiple processes. The operating system
identifies the user who starts the process by the Real User ID (RUID) assigned to the user.
The Effective UID (EUID) determines the access rights for the user. The EUID may
or may not be the same as the RUID.
Users can be categorized into various groups. Each group is identified by the Real Group ID, or RGID. The
access rights of the group are determined by the Effective Group ID, or EGID. Each user can be a member of one
or more groups.
Most of the time we ignore these details and just talk about the User ID (UID).
The priority for a process can be set by specifying a nice value, or niceness, for the process. The lower the nice
value, the higher the priority. Low values are assigned to important processes, while high values are assigned to
processes that can wait longer. A process with a high nice value simply allows other processes to be executed first.
In Linux, a nice value of -20 represents the highest priority and 19 represents the lowest. (This does sound kind of
backwards, but this convention, the nicer the process, the lower the priority, goes back to the earliest days of
UNIX.)
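For example, one can typically start a job with a lower priority, or lower the priority of a running process (the program name and PID here are hypothetical):
$ nice -n 5 myprogram
$ renice -n 5 -p 1234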
You can also assign a so-called real-time priority to time-sensitive tasks, such as controlling machines through a
computer or collecting incoming data. This is just a very high priority and is not to be confused with what is called
hard real time which is conceptually different, and has more to do with making sure a job gets completed within
a very well-defined time window.
ps has many options for specifying exactly which tasks to examine, what information to display about them, and
precisely what output format should be used.
Without options ps will display all processes running under the current shell. You can use the -u option to display
information of processes for a specified username. The command ps -ef displays all the processes in the system
in full detail. The command ps -eLf goes one step further and displays one line of information for every thread
(remember, a process can contain multiple threads).
The following tables show sample output of ps with the aux and axo qualifiers.
Command Output
ps aux USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 19356 1292 ? Ss Feb27 0:08 /sbin/init
root 2 0.0 0.0 0 0 ? S Feb27 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S Feb27 0:27 [migration/0]
. . .
Command Output
ps axo stat,priority,pid,pcpu,comm
STAT PRI   PID %CPU COMMAND
Ss    20     1  0.0 init
S     20     2  0.0 kthreadd
S   -100     3  0.0 migration/0
. . .
pstree displays the processes running on the system in the form of a tree diagram showing the relationship
between a process and its parent process and any other processes that it created. Repeated entries of a process are
not displayed, and threads are displayed in curly braces.
To terminate a process you can type kill -SIGKILL <pid> or kill -9 <pid>. Note however, you can
only kill your own processes: those belonging to another user are off limits unless you are root.
top
While a static view of what the system is doing is useful, monitoring the system performance live over time is also
valuable. One option would be to run ps at regular intervals, say, every two minutes. A better alternative is to use
top to get constant real-time updates (every two seconds by default) until you exit by typing q. top clearly
highlights which processes are consuming the most CPU cycles and memory (using appropriate commands from
within top.)
The percentage of user jobs running at a lower priority (niceness - ni) is then listed. Idle mode (id) should be low
if the load average is high, and vice versa. The percentage of jobs waiting (wa) for I/O is listed. Interrupts include
the percentage of hardware (hi) vs. software interrupts (si). Steal time (st) is generally relevant for virtual machines,
which have some of their idle CPU time taken for other uses.
You need to monitor memory usage very carefully to ensure good system performance. Once the physical memory
is exhausted, the system starts using swap space (temporary storage space on the hard drive) as an extended
memory pool, and since accessing disk is much slower than accessing memory, this will negatively affect system
performance.
If the system starts using swap often, you can add more swap space. However, adding more physical memory
should also be considered.
Command Description
t    Display or hide summary information (rows 2 and 3)
m    Display or hide memory information (rows 4 and 5)
A    Sort the process list by top resource consumers
r    Renice (change the priority of) a specific process
k    Kill a specific process
f    Enter the top configuration screen
Load Averages
Load average is the average of the load number for a given period of time. It takes into account processes that
are:
• Actively running on a CPU
• Considered runnable, but waiting for a CPU to run on
• Sleeping (i.e., waiting for some resource, typically I/O, to become available)
The last piece of information is the average load of the system. Assuming our system is a single-CPU system, the
0.25 means that for the past minute, on average, the system has been 25% utilized. 0.12 in the next position means
that over the past 5 minutes, on average, the system has been 12% utilized; and 0.15 in the final position means
that over the past 15 minutes, on average, the system has been 15% utilized. If we saw a value of 1.00 in the
second position, that would imply that the single-CPU system was 100% utilized, on average, over the past 5
minutes; this is good if we want to fully use a system. A value over 1.00 for a single-CPU system implies that the
system was over-utilized: there were more processes needing CPU than CPU was available.
If we had more than one CPU, say a quad-CPU system, we would divide the load average numbers by the number
of CPUs. In this case, for example, seeing a 1 minute load average of 4.00 implies that the system as a whole was
100% (4.00/4) utilized during the last minute.
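The load averages appear as the last item in the output of commands such as uptime, w, and top; a sample session (with the hypothetical values discussed above):
$ uptime
 20:19:15 up 3 days,  2:43,  2 users,  load average: 0.25, 0.12, 0.15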
Short-term increases are usually not a problem; a high peak is likely a burst of activity, not a new level.
For example, at start up, many processes start and then activity settles down. However, if a high peak is seen in the
5 and 15 minute load averages, it may be cause for concern.
You can either use CTRL-Z to suspend a foreground job or CTRL-C to terminate a foreground job and can
always use the bg and fg commands to run a process in the background and foreground, respectively.
Managing Jobs
The jobs utility displays all jobs running in the background. The display shows the job ID, state, and command name,
as shown below.
jobs -l provides the same information as jobs, and additionally includes the PID of the background jobs.
The background jobs are connected to the terminal window, so if you log off, the jobs utility will not show the
ones started from that window.
cron
cron is a time-based scheduling utility program. It can launch routine background jobs at specific times and/or
days on an on-going basis. cron is driven by a configuration file called /etc/crontab (cron table) which
contains the various shell commands that need to be run at the properly scheduled times. There are both system-
wide crontab files and individual user-based ones. Each line of a crontab file represents a job, and is composed of
a so-called CRON expression, followed by a shell command to execute.
The crontab -e command will open the crontab editor to edit existing jobs or to create new jobs. Each line of
the crontab file will contain 6 fields:
Field Description Values
MIN     Minutes          0 to 59
HOUR    Hour field       0 to 23
DOM     Day of Month     1 to 31
MON     Month field      1 to 12
DOW     Day Of Week      0 to 6 (0 = Sunday)
CMD     Command          Any command to be executed
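For example, the following hypothetical entry would run a script every day at 8:30 a.m. (the script path is an assumption for illustration):
30 8 * * * /home/me/backup.sh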
sleep suspends execution for at least the specified period of time, which can be given as the number of seconds
(the default), minutes, hours or days. After that time has passed (or an interrupting signal has been received)
execution will resume.
Syntax:
sleep NUMBER[SUFFIX]...
where SUFFIX may be:
1. s for seconds (the default)
2. m for minutes
3. h for hours
4. d for days
sleep and at are quite different; sleep delays execution for a specific period while at starts execution at a later
time.
You have completed this chapter.
Chapter 18
Common Applications
This chapter covers common types of Linux applications, including:
• Web browsers
• Email clients
• Online media applications
• Other applications
Web Browsers
As discussed in the earlier chapter on Network Operations, Linux offers a wide variety of web browsers, both
graphical and text based, including:
• Firefox
• Google Chrome
• Chromium
• Epiphany
• Konqueror
• w3m
• lynx
Email Applications
Email applications allow for sending, receiving, and reading messages over the Internet. Linux systems offer a
wide variety of email clients, both graphical and text-based. In addition, many users simply use their browsers to
access their email accounts.
Most email clients use the Internet Message Access Protocol (IMAP) or the older Post Office Protocol (POP)
to access emails stored on a remote mail server. Most email applications also display HTML (HyperText
Markup Language) formatted emails, which can include objects such as pictures and hyperlinks. Advanced email
applications can also import address books/contact lists, configuration information, and emails from other email
applications. Examples include:
• Graphical email clients, such as Thunderbird (produced by Mozilla), Evolution, and Claws Mail
• Text mode email clients such as mutt and mail
Other Internet Applications
Linux systems provide many other applications for performing Internet-related tasks. These include:
• FileZilla: an intuitive graphical FTP client that supports FTP, Secure File Transfer Protocol (SFTP), and
  FTP Secured (FTPS); used to transfer files to/from FTP servers
• Pidgin: to access GTalk, AIM, ICQ, MSN, IRC, and other messaging networks
• Ekiga: to connect to Voice over Internet Protocol (VoIP) networks
• Hexchat: to access Internet Relay Chat (IRC) networks
Office Applications
Most day-to-day computer systems have productivity applications (sometimes called office suites) available or
installed (click here for a list of commonly used suites). Each suite is a collection of closely coupled programs
used to create and edit different kinds of files such as:
Sound Players
Multimedia applications are used to listen to music, view videos, etc., as well as to present and view text and
graphics. Linux systems offer a number of sound player applications, including:
• Amarok: a mature MP3 player with a graphical interface that plays audio and video files and streams
  (online audio files). It allows you to create a playlist that contains a group of songs, and it uses a
  database to store information about the music collection.
• Audacity: used to record and edit sounds; it can be quickly installed through a package manager, and it
  has a simple interface to get you started.
• Rhythmbox: supports a large variety of digital music sources, including streaming Internet audio and
  podcasts. The application also enables searching for particular audio in a library. It supports 'smart
  playlists' with an 'automatic update' feature that can revise playlists based on specified selection criteria.
Of course, Linux systems can also connect to commercial online music streaming services, such as Pandora
and Spotify, through web browsers.
Movie Players
Movie (video) players can play input from many different sources, either local to the machine or on the
Internet. Linux systems offer a number of movie players, including:
• VLC
• MPlayer
• Xine
• Totem
Movie Editors
Movie editors are used to edit videos or movies. Linux systems offer a number of movie editors including:
• Kino: acquire and edit camera streams; Kino can merge and separate video clips
• Cinepaint: frame-by-frame retouching; Cinepaint is used for editing images in a video
• Blender: create 3D animation and design; Blender is a professional tool that uses modeling as a starting
  point, and it provides complex and powerful tools for camera capture, recording, editing, enhancing, and
  creating video, each having its own focus
• Cinelerra: capture, compose, and edit audio/video
• FFmpeg: record, convert, and stream audio/video; FFmpeg is a format converter, among other things, and
  it includes other tools, such as ffplay and ffserver
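For example, FFmpeg can convert a video from one container format to another directly from the command line
(the file names here are placeholders):

$ ffmpeg -i input.avi output.mp4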
GIMP (GNU Image Manipulation Program)
Graphic editors allow you to create, edit, view, and organize images of various formats like Joint Photographic
Experts Group (JPEG or JPG), Portable Network Graphics (PNG), Graphics Interchange Format (GIF), and
Tagged Image File Format (TIFF).
GIMP (GNU Image Manipulation Program) is a feature-rich image retouching and editing tool, similar to
Adobe Photoshop, that is available on all Linux distributions. It can handle a wide variety of image formats
and offers an extensive set of plugins, filters, and tools for retouching and editing images.
Graphics Utilities
In addition to the GIMP, there are other graphics utilities that help perform various image-related tasks,
including eog, Inkscape, convert, and Scribus.
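For example, convert (part of the ImageMagick suite) can change an image's format or size from the command
line (the file names are placeholders):

$ convert photo.png photo.jpg
$ convert photo.png -resize 50% photo_small.png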
You have completed this chapter. Let’s summarize the key concepts covered:
• Linux offers a wide variety of Internet applications such as web browsers, email clients, online media
applications, and others.
• Web browsers supported by Linux can be either graphical or text-based, such as Firefox, Google Chrome,
Epiphany, w3m, lynx, and others.
• Linux supports graphical email clients, such as Thunderbird, Evolution, and Claws Mail, and text mode
email clients, such as mutt and mail.
• Linux systems provide many other applications for performing Internet-related tasks, such as FileZilla,
Hexchat, Pidgin, and others.
• Most Linux distributions offer LibreOffice to create and edit different kinds of documents.
• Linux systems offer entire suites of development applications and tools, including compilers and debuggers.
• Linux systems offer a number of sound players including Amarok, Audacity, and Rhythmbox.
• Linux systems offer a number of movie players including VLC, MPlayer, Xine, and Totem.
• Linux systems offer a number of movie editors, including Kino, Cinepaint, and Blender, among others.
• The GIMP (GNU Image Manipulation Program) utility is a feature-rich image retouching and editing tool
available on all Linux distributions.
• Other graphics utilities that help perform various image-related tasks are eog, Inkscape, convert, and
Scribus.