Linux
Lajpat Nagar
BCA
Sem III
LINUX ENVIRONMENT
BCA III (JU)
UNIT I
UNIX & LINUX:- The Operating System: Linux history, Linux features, Linux
distributions, Linux’s relationship to Unix, Overview of Linux architecture,
Installation, Booting, Login and Shutdown Process, Start up scripts, controlling
processes, system processes (an overview).
Linux Internals - System Calls, Process Management, Memory Management, Disk
and filesystems, Networking, Security, Graphical User Interface, Device Drivers.
Getting help in Linux with --help, whatis, man command, info command; simple
commands like date, whoami, who, w, cal, bc, hostname, uname; concept of
aliases etc. Linux filesystem types ext2, ext3, ext4; basic Linux directory structure
and the functions of different directories; basic directory navigation commands
like cd, mv, cp, rm, cat command, less command, runlevel (importance of
/etc/inittab).
Unix Programming
Fig. 1.1 presents the architecture of a typical operating system and shows how an
OS succeeds in presenting users and application programs with a uniform
interface without regard to the details of the underlying hardware. We see that:
The operating system kernel is in direct control of the underlying hardware.
The kernel provides low-level device, memory and processor management
functions (e.g. dealing with interrupts from hardware devices, sharing the
processor among multiple programs, allocating memory for programs etc.)
Basic hardware-independent kernel services are exposed to higher-level
programs through a library of system calls (e.g. services to create a file,
begin execution of a program, or open a logical network connection to
another computer).
Application programs (e.g. word processors, spreadsheets) and system
utility programs (simple but useful application programs that come with the
operating system, e.g. programs which find text inside a group of files)
make use of system calls. Applications and system utilities are launched
using a shell (a textual command line interface) or a graphical user interface
that provides direct user interaction.
Operating systems (and different flavours of the same operating system) can be
distinguished from one another by the system calls, system utilities and user
interface they provide, as well as by the resource scheduling policies
implemented by the kernel.
OPERATING SYSTEM
There are many different types of operating system; here we will look at the
main categories into which they fall. These are:
Multi-user is a term that defines an operating system that allows
concurrent access by multiple users of a computer, i.e. it can be used by
more than one person at a time.
Time-sharing systems are multi-user systems. Most batch processing
systems for mainframe computers may also be considered "multi-user", to
avoid leaving the CPU idle while it waits for I/O operations to complete.
However, the term "Multi-tasking" is more common in this context.
An example is a Unix server where multiple remote users have access (via
Telnet) to the Unix shell prompt at the same time. Another example uses
multiple X sessions spread across multiple monitors powered by a single
machine.
Multi-User : Allows two or more users to run programs at the same time.
Some operating systems permit hundreds or even thousands of concurrent
users.
Multiprocessing : Supports running a program on more than one CPU.
Multitasking : Allows more than one program to run concurrently.
Multithreading : Allows different parts of a single program to run
concurrently.
Real Time: Responds to input instantly. General-purpose operating
systems, such as DOS and UNIX, are not real-time.
UNIX:
UNIX is a giant operating system developed by Ken Thompson and Dennis
Ritchie at Bell Laboratories, USA, in 1969. It has practically everything an
operating system should have, and several features which other operating
systems never had.
It has a number of profound and diverse concepts developed and perfected
over a period of time.
Its elegance and richness go beyond the commands and tools that
constitute it, while simplicity permeates the entire system.
It runs on practically every hardware platform and today champions the
cause of the Open Source movement.
Unix today is a mature operating system, and is used heavily in a large variety of
scientific, engineering, and mission-critical applications. Interest in Unix has
grown substantially in recent years because of the proliferation of the Linux (a
Unix look-alike) operating system. Recent developments are graphical interfaces:
MOTIF, X Windows, and OpenView.
UNIX Systems
The Seventh Edition, released in 1978, marked a split in UNIX development into
two main branches: SYSV (System V) and BSD (Berkeley Software Distribution).
BSD arose from the University of California at Berkeley where Ken Thompson
spent a sabbatical year. Its development was continued by students at Berkeley
and other research institutions.
SYSV was developed by AT&T and other commercial companies. UNIX flavours
based on SYSV have traditionally been more conservative, but better supported
than BSD-based flavours.
The latest incarnations of SYSV (SVR4, or System V Release 4) and BSD Unix are
actually very similar.
Linux is a free open source UNIX OS for PCs that was originally developed in 1991
by Linus Torvalds, a Finnish undergraduate student.
Linux is neither pure SYSV nor pure BSD. Instead, it incorporates some features from
each (e.g. SYSV-style startup files but BSD-style file system layout) and aims to
conform with a set of IEEE standards called POSIX (Portable Operating System
Interface). To maximise code portability, it typically supports SYSV, BSD and POSIX
system calls (e.g. poll, select, memset, memcpy, bzero and bcopy are all
supported).
The open source nature of Linux means that the source code for the Linux kernel
is freely available so that anyone can add features and correct deficiencies.
This approach has been very successful and what started as one person's project
has now turned into a collaboration of hundreds of volunteer developers from
around the globe. The open source approach has not just successfully been
applied to kernel code, but also to application programs for Linux.
Red Hat is the most popular distribution because it has been ported to a large
number of hardware platforms (including Intel, Alpha, and SPARC), it is easy to
use and install and it comes with a comprehensive set of utilities and applications
including the X Windows graphics system, GNOME and KDE GUI environments,
and the StarOffice suite (an MS-Office-like office suite for Linux).
Kernel
The Linux kernel includes device driver support for a large number of PC hardware
devices (graphics cards, network cards, hard disks etc.), advanced processor and
memory management features, and support for many different types of
filesystems (including DOS floppies and the ISO9660 standard for CDROMs). In
terms of the services that it provides to application programs and system utilities,
the kernel implements most BSD and SYSV system calls, as well as the system calls
described in the POSIX.1 specification.
The kernel (in raw binary form that is loaded directly into memory at system
startup time) is typically found in the file /boot/vmlinuz, while the source files can
usually be found in /usr/src/linux. The latest version of the Linux kernel sources
can be downloaded from http://www.kernel.org/.
Shells and GUIs
Linux supports two forms of command input: through textual command line shells
similar to those found on most UNIX systems (e.g. sh - the Bourne shell, bash - the
Bourne again shell and csh - the C shell) and through graphical interfaces (GUIs)
such as the KDE and GNOME desktop environments. If you are connecting remotely to
a server your access will typically be through a command line shell.
System Utilities
Virtually every system utility that you would expect to find on standard
implementations of UNIX (including every system utility described in the POSIX.2
specification) has been ported to Linux. This includes commands such as ls, cp,
grep, awk, sed, bc, wc, more, and so on. These system utilities are designed to be
powerful tools that do a single task extremely well (e.g. grep finds text inside files
while wc counts the number of words, lines and bytes inside a file). Users can
often solve problems by interconnecting these tools instead of writing a large
monolithic application program.
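This "connect small tools" idea can be tried directly at the shell. The sketch below
(fruits.txt and its contents are invented for the example) pipes the output of grep
into wc:

```shell
# Work in a throwaway directory so nothing else is touched
cd "$(mktemp -d)"

# Create a small sample file with three lines
printf 'apple\nbanana\napricot\n' > fruits.txt

# grep selects the lines beginning with "a"; wc -l counts them
grep '^a' fruits.txt | wc -l    # prints 2 (apple and apricot)
```

Each tool does one job; the pipe (|) hands one tool's output to the next without
any temporary file.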
Like other UNIX flavours, Linux's system utilities also include server programs
called daemons which provide remote network and administration services (e.g.
telnetd and sshd provide remote login facilities, lpd provides printing services,
httpd serves web pages, crond runs regular system administration tasks
automatically). A daemon (probably derived from the Greek word for a
beneficent spirit who watches over someone, or perhaps short for "Disk And
Execution MONitor") is usually spawned automatically at system startup and
spends most of its time lying dormant (lurking?) waiting for some event to occur.
Application programs
Linux is free:
As in free beer, they say. If you want to spend absolutely nothing, you don't even
have to pay the price of a CD. Linux can be downloaded in its entirety from the
Internet completely for free. No registration fees, no costs per user, free updates,
and freely available source code in case you want to change the behavior of your
system.
The license commonly used is the GNU General Public License (GPL). The license says that
anybody who may want to do so, has the right to change Linux and eventually to
redistribute a changed version, on the one condition that the code is still available
after redistribution. In practice, you are free to grab a kernel image, for instance
to add support for teletransportation machines or time travel and sell your new
code, as long as your customers can still have a copy of that code.
Although there are a large number of Linux implementations, you will find a lot of
similarities in the different distributions, if only because every Linux machine is a
box with building blocks that you may put together following your own needs and
views.
Linux may appear different depending on the distribution, your hardware and
personal taste, but the fundamentals on which all graphical and other interfaces
are built, remain the same. The Linux system is based on GNU tools (Gnu's Not
UNIX), which provide a set of standard ways to handle and use the system. All
GNU tools are open source, so they can be installed on any system. Most
distributions offer pre-compiled packages of most common tools, such as RPM
packages on Red Hat and Debian packages (also called deb or dpkg) on Debian, so
you needn't be a programmer to install a package on your system.
1.5.2. GNU/Linux
The Linux kernel is not part of the GNU project but uses the same license as GNU
software. A great majority of utilities and development tools (the meat of your
system), which are not Linux-specific, are taken from the GNU project. Because
any usable system must contain both the kernel and at least a minimal set of
utilities, some people argue that such a system should be called a GNU/Linux
system.
Fedora Core
Debian
SuSE Linux
Mandriva (formerly MandrakeSoft)
Knoppix: an operating system that runs from your CD-ROM; you don't
need to install anything.
Introduction
A lot of Linux systems use LILO, the LInux LOader, for booting operating systems.
Newer systems often use GRUB instead, which is easier to use and more flexible.
The boot process
When an x86 computer is booted, the processor looks at the end of the system
memory for the BIOS (Basic Input/Output System) and runs it. The BIOS program
is written into permanent read-only memory and is always available for use. The
BIOS provides the lowest level interface to peripheral devices and controls the
first step of the boot process.
The BIOS tests the system, looks for and checks peripherals, and then looks for a
drive to use to boot the system. Usually it checks the floppy drive (or CD-ROM
drive on many newer systems) for bootable media, if present, and then it looks to
the hard drive. The order of the drives used for booting is usually controlled by a
particular BIOS setting on the system. Once Linux is installed on the hard drive of
a system, the BIOS looks for a Master Boot Record (MBR) starting at the first
sector on the first hard drive, loads its contents into memory, then passes control
to it.
This MBR contains instructions on how to load the GRUB (or LILO) boot-loader,
using a pre-selected operating system. The MBR then loads the boot-loader,
which takes over the process (if the boot-loader is installed in the MBR). In the
default Red Hat Linux configuration, GRUB uses the settings in the MBR to display
boot options in a menu. Once GRUB has received the correct instructions for the
operating system to start, either from its command line or configuration file, it
finds the necessary boot file and hands off control of the machine to that
operating system.
This boot method is called direct loading because instructions are used to directly
load the operating system, with no intermediary code between the boot-loaders
and the operating system's main files (such as the kernel). The boot process used
by other operating systems may differ slightly from the above, however. For
example, Microsoft's DOS and Windows operating systems completely overwrite
anything on the MBR when they are installed without incorporating any of the
current MBR's configuration. This destroys any other information stored in the
MBR by other operating systems, such as Linux. The Microsoft operating systems,
as well as various other proprietary operating systems, are loaded using a chain
loading boot method. With this method, the MBR points to the first sector of the
partition holding the operating system, where it finds the special files necessary
to actually boot that operating system.
GRUB supports both boot methods, allowing you to use it with almost any
operating system, most popular file systems, and almost any hard disk your BIOS
can recognize.
4.2.4. Init
The kernel, once it is loaded, finds init in /sbin and executes it.
When init starts, it becomes the parent or grandparent of all of the processes that
start up automatically on your Linux system. The first thing init does is read its
initialization file, /etc/inittab. This instructs init to read an initial configuration
script for the environment, which sets the path, starts swapping, checks the file
systems, and so on. Basically, this step takes care of everything that your system
needs to have done at system initialization: setting the clock, initializing serial
ports and so forth.
Then init continues to read the /etc/inittab file, which describes how the system
should be set up in each run level and sets the default run level. A run level is a
configuration of processes. All UNIX-like systems can be run in different process
configurations, such as the single user mode, which is referred to as run level 1 or
run level S (or s). In this mode, only the system administrator can connect to the
system. It is used to perform maintenance tasks without risks of damaging the
system or user data. Naturally, in this configuration we don't need to offer user
services, so they will all be disabled. Another run level is the reboot run level, or
run level 6, which shuts down all running services according to the appropriate
procedures and then restarts the system.
The idea behind operating different services at different run levels essentially
revolves around the fact that different systems can be used in different ways.
Some services cannot be used until the system is in a particular state, or mode,
such as being ready for more than one user or having networking available.
#
# inittab This file describes how the INIT process should set up
# the system in a certain run-level.
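The comment above is the header of a typical /etc/inittab. On a SysV-style system
the rest of the file consists of entries of the form id:runlevels:action:process. The
lines below are an illustrative sketch (the exact paths and runlevels vary between
distributions; these follow the Red Hat layout mentioned later in this unit):

```
# Default runlevel to enter at boot
id:3:initdefault:

# System initialization script, run once at boot
si::sysinit:/etc/rc.d/rc.sysinit

# Run the scripts for the selected runlevel
l3:3:wait:/etc/rc.d/rc 3

# Respawn a getty on the first virtual console
1:2345:respawn:/sbin/mingetty tty1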
Feel free to configure unused run levels (commonly run level 4) as you see fit.
Many users configure those run levels in a way that makes the most sense for
them while leaving the standard run levels as they are by default. This allows
them to quickly move in and out of their custom configuration without disturbing
the normal set of features at the standard run levels.
Tools
Shutdown
UNIX was not made to be shut down, but if you really must, use the shutdown
command. After completing the shutdown procedure, the -h option halts the
system, while -r reboots it.
The reboot and halt commands are now able to invoke shutdown if run when the
system is in run levels 1-5, and thus ensure proper shutdown of the system, but it
is a bad habit to get into, as not all UNIX/Linux versions have this feature.
If your computer does not power itself down, you should not turn off the
computer until you see a message indicating that the system is halted or finished
shutting down, in order to give the system the time to unmount all partitions.
Being impatient may cause data loss.
BIOS
Master Boot Record (MBR)
Kernel
init
BIOS
Floppy
CDROM
SCSI drive
IDE drive
Master Boot Record
LILO
edit /etc/lilo.conf
o duplicate image= section, eg:
o image=/bzImage-2.2.12
o label=12
o read-only
o man lilo.conf for details
run /sbin/lilo
(copy modules)
reboot to test
Kernel
initialise devices
(optionally loads initrd, see below)
mount root FS
o specified by lilo or loadlin
o kernel prints:
VFS: Mounted root (ext2 filesystem) readonly.
run /sbin/init, PID 1
o can be changed with init=
o init prints:
INIT: version 2.76 booting
initrd
Details in /usr/src/linux/Documentation/initrd.txt
/sbin/init
reads /etc/inittab
runs script defined by this line:
o si::sysinit:/etc/init.d/rcS
switches to runlevel defined by
o id:3:initdefault:
sysinit
Run Levels
0 halt
1 single user
2-4 user defined
5 X11
6 Reboot
Default in /etc/inittab, eg
o id:3:initdefault:
Change using /sbin/telinit
Boot Summary
lilo
/etc/lilo.conf
debian runs
o /etc/rcS.d/S* and /etc/rc.boot/
o /etc/rc3.d/S* scripts
redhat runs
o /etc/rc.d/rc.sysinit
o /etc/rc.d/rc3.d/S* scripts
Software Installation
Introduction
A package in RPM format may include a dependency on the LSB Core and other
LSB specifications. Packages that are not in RPM format may test for the presence
of a conforming implementation by means of the lsb_release utility.
Note: The implementation itself may use a different packaging format for its own
packages, and of course it may use any available mechanism for installing the LSB-
conformant packages.
An RPM format file consists of 4 sections, the Lead, Signature, Header, and the
Payload. All values are stored in network byte order.
Lead
Signature
Header
Payload
These 4 sections shall exist in the order specified.
The signature section is used to verify the integrity, and optionally, the
authenticity of the majority of the package file.
The header section contains all available information about the package. Entries
such as the package's name, version, and file list, are contained in the header.
UNIX COMMANDS
A user of a Unix-based system works at a user terminal. After the boot procedure
is completed, that is, the operating system is loaded in memory, the following
message appears at each user's terminal:
Login:
Each user has an identification called the user name, which has to be entered
when the login: message appears.
Unix keeps track of all user names and keeps identity information in a special
file. If the login name entered does not match any of the user names, it
displays the login message again. This ensures that only authorized people use the
system.
When a valid user name is entered at the terminal, the ‘$’ symbol is displayed on
the screen. This is the Unix Prompt.
Once the user has logged into the system, the user's work session continues until
the user instructs the shell to terminate the session.
This is done by typing exit or logout, or by pressing Ctrl+d at the prompt.
In order to maintain security of the files, the user should NEVER leave the
terminal without logging out.
The Shell is an intermediary program which interprets the commands that are
typed at the terminal and translates them into commands that the kernel
understands. The shell thus acts as a blanket around the kernel and eliminates the
need for the programmer to communicate directly with the kernel.
It is a unique feature of UNIX that all its commands exist as utility programs. These
programs are located in individual files in one of the system directories such as
/bin, /etc or /usr/bin. The shell can be considered as a master utility program
which enables a user to gain access to all other utilities and resources of a
computer.
Types of Shells:
The shell runs like any other program under the UNIX system. Hence, one shell
program can replace the other, or call another shell program. Due to this feature,
a number of shells have been developed in response to different needs of the
user. Some of the popular shells are:
Bourne Shell:
Developed by Steve Bourne at Bell Laboratories, this is the oldest and most
widely available UNIX shell. The executable filename is sh.
C Shell:
Developed by Bill Joy at the University of California, Berkeley, it offers a
C-like syntax and features such as command history. The executable
filename is csh.
Korn Shell:
Developed by David Korn, this combines the best features of both the
above shells. Although very few systems currently offer this shell, it is slowly
gaining popularity. The executable filename is ksh.
Restricted Shell:
This is the restricted version of the Bourne shell. It is typically used for
guest logins- users who are not part of the system and in secure installations
where users must be restricted to work only in their own limited environments.
The executable file name is rsh.
Directory Commands:
When you login to the system, UNIX automatically places you in a directory called
the home directory. It is created by the system when a user account is opened.
Home directory can be changed also.
$ echo $HOME
The pwd (print working directory) command is used to display the path name of
the current directory.
$ pwd
$ cd <directory name>
$ cd /user
/user - name of the directory you want to change to
The two dots are used with the cd command to go to the parent directory.
$ cd ..
Note: The cd command without any pathname takes the user back to the HOME
Directory.
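A short session tying these commands together; the directory names docs and
reports are invented for the example:

```shell
cd "$(mktemp -d)"        # start in a fresh scratch directory
mkdir -p docs/reports    # build a small directory tree
cd docs/reports
pwd                      # prints the absolute pathname, ending in /docs/reports
cd ..                    # the two dots take us to the parent directory
pwd                      # now ends in /docs
```

Each pwd call confirms where the preceding cd left us.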
To create a directory:
$ mkdir <dir_name>
To remove a directory:
$ rmdir <dir_name>
To list the contents of a directory:
$ ls <directory_name>
ls Options:
Option      Description
-x          Multicolumnar output
-F          Marks executables with *, directories with / and symbolic links with @
-a          Shows all filenames beginning with a dot, including . and ..
-R          Recursive list
-r          Sorts filenames in reverse order
-l          Long listing showing seven attributes of each file
-1          One filename on each line
-d dirname  Lists only dirname if dirname is a directory
-t          Sorts filenames by last modification time
-lt         Sorts the long listing by last modification time
-u          Sorts filenames by last access time
-lu         Long listing in ASCII collating sequence, but showing last access time
-lut        As above, but sorted by access time
-i          Displays the inode number
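A short session trying a few of these options (the file and directory names are
made up for the example):

```shell
cd "$(mktemp -d)"
touch alpha beta .hidden    # two ordinary files and one name starting with a dot
mkdir subdir

ls        # .hidden is not shown
ls -a     # shows .hidden along with . and ..
ls -F     # marks subdir with a trailing /
ls -1     # one filename on each line
```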
Listing Directory Contents(-x):
$ ls -x firstfile secondfile
Here firstfile and secondfile are two directories and the above command will
display the contents of both directories.
Recursive Listing(-R):
It lists all files and subdirectories in a directory tree. Similar to Dir/s command of
DOS , this traversal of the directory tree is done recursively till there are no
subdirectories left.
$ ls -xR
cat is used to display the contents of a file. cat also accepts more than one
filename as argument:
$ cat file1 file2
Here file1 and file2 are two files and their contents are displayed one after
another sequentially.
To Display Nonprinting Characters (-v): cat is normally used for displaying text
files only. Executables, when seen with cat, simply display junk. To display
nonprinting ASCII characters in the input, this option is used.
Numbering Lines (-n): When debugging a program, if the lines are displayed
along with their line numbers, it makes error debugging easy.
The vi editor can already display line numbers of its own, so this is not needed
there. The pr command can also be used to number lines.
To Create a File:
The cat command is also used to create a file. To do this, enter the command cat,
followed by the > (right chevron) character and a filename.
This means that the output goes to the filename following the > symbol. After
writing the contents, use Ctrl+d to close the file.
Ctrl+d signifies the end of the input to the system; this is the eof character used
by UNIX systems.
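The same steps in script form; note.txt is an invented name, and the
here-document below stands in for typing the text and pressing Ctrl+d
interactively:

```shell
cd "$(mktemp -d)"

# Equivalent of: cat > note.txt, typing two lines, then Ctrl+d
cat > note.txt <<'EOF'
first line
second line
EOF

cat note.txt       # display the contents
cat -n note.txt    # display with line numbers prefixed
```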
Copying a File(cp):
The cp(copy) command copies a file or a group of files. It creates an exact image
of the file on disk with a different name. The syntax requires at least two
filenames to be specified in the command line. When both are ordinary files, the
first is copied to the second.
cp chap01 unit1
If the destination file doesn't exist, it will first be created before copying takes place.
Few examples:
cp can also be used to copy more than one file with a single invocation of the
command, but in this case the last filename must be a directory.
cp chap* progs
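A sketch of both forms in a scratch directory (all the file names here are
invented):

```shell
cd "$(mktemp -d)"
printf 'chapter one\n' > chap01
touch chap02 chap03

cp chap01 unit1     # copy one file to another name
mkdir progs
cp chap0* progs     # copy several files; the last argument must be a directory
ls progs
```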
cp Options
Interactive Copying (-i): asks for confirmation before overwriting the destination file.
$ cp -i chap01 unit1
Copying Directories (-R): copies a whole directory tree.
$ cp -R progs newprogs
Files can be deleted using rm command. It can delete more than one file with a
single invocation.
rm progs/chap01 progs/chap02
rm options
Interactive Deletion (-i): This option makes the command ask the user
for confirmation before removing each file.
$ rm -i chap01 chap02
Recursive Deletion (-r): Descends the directory tree, deleting all files and
subdirectories.
$ rm -r *
Forcing Removal (-f): rm will prompt you for removal if a file is write-protected.
The -f option overrides this minor protection. It forces removal even if
the files are write-protected. When you combine the -r option with it, it could
be the most risky thing to do:
$ rm -rf *
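The same commands run in a scratch directory (on a real system rm -rf * is
irreversible, so try it only somewhere disposable; the file names are invented):

```shell
cd "$(mktemp -d)"
mkdir -p progs/sub
touch progs/chap01 progs/chap02 progs/sub/notes

rm progs/chap01 progs/chap02    # delete two files in one invocation
rm -rf *                        # -r descends the tree, -f never prompts
ls                              # prints nothing: the directory is empty
```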
Renaming Files(mv) :
$ mv pis perdir
mv Options
Interactive Move (-i):
To make the command ask for user confirmation before moving a
file:
$ mv -i chap01 man01
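A minimal sketch of both uses of mv (the names chap01, man01 and perdir are
invented):

```shell
cd "$(mktemp -d)"
printf 'data\n' > chap01

mv chap01 man01    # rename the file
mkdir perdir
mv man01 perdir    # move it into a directory
ls perdir          # prints man01
```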
Paging Output (more):
$ more chap01
q is used to exit.
$ wc file1
Options:
-l : To count lines
-w : To count words
-c : To count characters
Multiple Filenames:
When used with multiple filenames ,wc produces a line for each file as well as a
total count.
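A sketch with two small invented files:

```shell
cd "$(mktemp -d)"
printf 'one two three\nfour five\n' > file1
printf 'six\n' > file2

wc file1            # lines, words and characters of one file
wc -l file1 file2   # per-file line counts plus a "total" line
```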
$ od -b file1
The options -b and -c (character) are used to display the character and octal
output together:
$ od -bc file1
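For instance, the invisible newline at the end of every line shows up as octal 012
(the linefeed character):

```shell
printf 'A\n' | od -b     # octal bytes: 101 is 'A', 012 is the newline
printf 'A\n' | od -bc    # octal values with the character form underneath
```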
Comparing Files (cmp):
Option used: -l (list)
This option gives a detailed list of the byte numbers and the differing bytes
in octal for each character that differs in the two files.
To Know What Is Common (comm):
To know the lines common to two files, the comm command is used. In this
case both files should be sorted. When we run comm, it displays three-columnar
output: the first column contains lines unique to the first file, the second column
shows lines unique to the second file, and the third column contains lines
common to both files.
$ comm file[12]          comparing file1 and file2
$ comm -3 file1 file2
$ comm -13 file1 file2
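A sketch of the three invocations above (file contents are invented, and both
files are written pre-sorted as comm requires):

```shell
cd "$(mktemp -d)"
printf 'apple\nbanana\ncherry\n' > file1
printf 'banana\ndate\n'          > file2

comm file1 file2        # three columns: only file1, only file2, common to both
comm -3 file1 file2     # drop column 3 (the common lines)
comm -13 file1 file2    # only the lines unique to file2: prints "date"
```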
diff uses certain special symbols and instructions to indicate the changes that are
required to make two files identical. These instructions are used by the sed
command, one of the most powerful commands on the system.
E.g.:
$ wc -c libc.html
$ gzip libc.html
$ wc -c libc.html.gz
gzip options:
-d (Decompress):
To restore the original, uncompressed file you have two options:
gzip -d or gunzip.
$ gunzip libc.html.gz
$ gzip -d libc.html.gz
$ gunzip libc.html.gz User_Guide.ps.gz
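A complete round trip; since libc.html from the text is not at hand, a
compressible sample file is generated first:

```shell
cd "$(mktemp -d)"

# Build a highly compressible file: the same 12-byte line 1000 times
for i in $(seq 1000); do echo 'hello world'; done > sample.txt

wc -c sample.txt       # 12000 bytes before compression
gzip sample.txt        # replaces sample.txt with sample.txt.gz
wc -c sample.txt.gz    # far fewer bytes

gunzip sample.txt.gz   # restores sample.txt (same as gzip -d)
wc -c sample.txt       # 12000 bytes again
```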
-r (Recursive Compression):
It compresses all files found in subdirectories. This option is used for
decompression also.
$ gzip -r progs
$ gunzip -r progs
or
$ gzip -dr progs
-c (Writing to Terminal):
When used with gzip, this option doesn't create a compressed file, but
sends the compressed output to the terminal (standard output).
The same option used with gunzip shows the decompressed contents of
a compressed file on the terminal. The original file remains unchanged.
tar options:
-c       Create an archive
-x       Extract files from archive
-t       Display files in archive
-f arch  Name of the archive arch
e.g.:
To extract files:
$ gunzip archive.tar.gz
$ tar -xvf archive.tar
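The full cycle of creating, compressing, listing and extracting an archive (the
directory and file names are invented for the example):

```shell
cd "$(mktemp -d)"
mkdir project
printf 'hello\n' > project/a.txt
printf 'world\n' > project/b.txt

tar -cf archive.tar project    # -c create, -f names the archive
tar -tf archive.tar            # -t lists the members without extracting
gzip archive.tar               # yields archive.tar.gz

# Extraction: gunzip first, then tar -x (-v prints each name)
gunzip archive.tar.gz
mkdir restore && cd restore
tar -xvf ../archive.tar
cat project/a.txt              # prints hello
```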
Recursive Compression (-r):
It descends the tree structure, compressing the files it finds as it goes:
$ zip -r sumit_home.zip
Using unzip, files can be restored. During restoration, if a file already exists
it asks for the user's permission before overwriting it.
$ unzip -v archive.zip
UNIT – II
Files: File Concept, File System Structure, Inodes, File Attributes, File Types, The
Linux File System: Basic Principles, Pathnames, Mounting and Unmounting File
Systems, Different File Types, File Permissions, Disk Usage Limits, Directory
Structure. Standard Input and Output, Redirecting Input and Output, Using Pipes
to connect processes, tee command. Linux File Security: permission types,
examining permissions, changing permissions (symbolic method, numeric
method), default permissions and umask. vi editor basics, three modes of the vi
editor; concept of inodes, inodes and directories, cp and inodes, mv and inodes, rm
and inodes; symbolic links and hard links; mount and umount command; creating
archives: tar, gzip, gunzip, bzip2, bunzip2 (basic usage of these commands).
2.1 File
A file is a collection of information, which can be data, an application, and
documents; in fact, under UNIX, a file can contain anything. When a file is created,
UNIX assigns the file a unique internal number (called an inode).
Unlike old DOS files, a UNIX file doesn't contain an eof (end-of-file) mark. A
file's size is not stored in the file itself; it is kept as an attribute in a separate area
on the hard disk which is not directly accessible to humans, only to the kernel.
UNIX treats directories and devices as files as well. A directory is simply a folder
where you store files and other directories. All physical devices like the hard disk,
memory, CD-ROM, printer and modem are also treated as file and even shell,
kernel and main memory of the system are also treated as a file.
The UNIX operating system is built around the concept of a filesystem which is
used to store all of the information that constitutes the long-term state of the
system. This state includes the operating system kernel itself, the executable files
for the commands supported by the operating system, configuration information,
temporary workfiles, user data, and various special files that are used to give
controlled access to system hardware and operating system functions.
Every item stored in a UNIX filesystem belongs to one of four types:
Ordinary files
Ordinary files can contain text, data, or program information. Files cannot contain
other files or directories. Unlike other operating systems, UNIX filenames are not
broken into a name part and an extension part (although extensions are still
frequently used as a means to classify files). Instead they can contain any
keyboard character except for '/' and be up to 256 characters long (note however
that characters such as *,?,# and & have special meaning in most shells and
should not therefore be used in filenames). Putting spaces in filenames also
makes them difficult to manipulate - rather use the underscore '_'.
This is the most common file type, also known as a regular file.
It contains only data as a stream of characters. All programs you write belong to
this file type. Ordinary files are of two kinds:
Text file
Binary file
Text file: contains only printable characters and you can often view the contents
and make sense out of them.
All java, C program sources, shell and Perl scripts are text files.
A text file contains lines of characters, where every line is terminated by the
newline character (Line Feed, LF; it can be viewed with the od command).
Binary File:
Contains both printable and unprintable characters that cover the entire ASCII
range.
Most UNIX commands are binary files. Object code and executables produced by
compiling a program are also binary files.
Using cat command for binary files gives unreadable output and even disturbs
terminal settings.
Directories
Directories are containers or folders that hold files, and other directories.
A directory contains no data of its own. It keeps details of the files and
subdirectories it contains.
UNIX file system is organized with a number of directories and subdirectories and
you can create them as and when you need them.
It is often required to group a set of files related to a specific application. This
also makes it possible to have the same file names under different directories.
A directory file contains the entry for every file and subdirectory that it houses.
Each entry has two components:
The filename
A unique identification for the file or directory(called the inode number)
You cannot write a directory file directly, but you can perform actions that make the kernel write a directory. E.g. when you create or remove a file, the kernel automatically updates the corresponding directory by adding or removing the entry (inode number and filename) associated with that file.
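The inode number half of a directory entry can be seen with ls -i, as in this small sketch (the filename is made up):

```shell
# Create an empty file; the directory gains a (filename, inode) entry.
touch notes.txt

# ls -i prints the inode number in front of each name.
ls -i notes.txt
```

The number printed is the unique identification mentioned above; the directory stores only this number and the name, everything else lives in the inode.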
NOTE: The name of a file can only be found in its directory; the file itself doesn’t
contain its own name or any other attributes, like its size or access rights.
Devices
To provide applications with easy access to hardware devices, UNIX allows them
to be used in much the same way as ordinary files. There are two types of devices
in UNIX - block-oriented devices which transfer data in blocks (e.g. hard disks) and
character-oriented devices that transfer data on a byte-by-byte basis (e.g.
modems and dumb terminals).
A device file is a special file that does not contain a stream of characters; in fact it doesn't contain anything at all. Every file has some attributes that are stored not in the file but elsewhere on the disk, and it is the attributes of a device file that entirely govern the operation of the device. The kernel identifies a device from these attributes and then uses them to operate the device.
When you print a file, install software from a CD-ROM, or back up a file to tape, all these activities are performed by reading or writing the file representing the device. E.g. you print a file by writing to the file representing the printer; when you restore files from tape, you read the file associated with the tape drive.
The kernel takes care of this mapping between these files and their respective devices.
Device filenames are generally found inside a single directory structure, /dev.
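For instance, on a typical Linux system /dev/null is a character device, and the first character of its long listing is 'c' (a sketch; the set of device names varies between systems):

```shell
# Long-list a character device: the first character of the mode field is 'c'
# (it would be 'b' for a block device such as a hard disk).
ls -l /dev/null
```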
Links
A link is a pointer to another file. There are two types of links - a hard link to a file
is indistinguishable from the file itself. A soft link (or symbolic link) provides an
indirect pointer or shortcut to a file. A soft link is implemented as a directory file
entry containing a pathname.
2.3 Typical UNIX Directory Structure
The UNIX filesystem is laid out as a hierarchical tree structure which is anchored
at a special top-level directory known as the root (designated by a slash '/').
Because of the tree structure, a directory can have many child directories, but
only one parent directory. Fig. 2.1 illustrates this layout.
Fig. 2.2 shows some typical directories you will find on UNIX systems and briefly describes their contents. Note that although these subdirectories appear as part of a seamless logical filesystem, they do not need to be present on the same hard disk device; some may even be located on a remote machine and accessed across a network.
When you log into UNIX, your current working directory is your user home
directory. You can refer to your home directory at any time as "~" and the
home directory of other users as "~<login>". So ~will/play is another way for
user jane to specify an absolute path to the directory /homes/will/play.
User will may refer to the directory as ~/play.
The Parent-Child Relationship:
The File System in UNIX is a collection of all these related files organized in a
hierarchical structure.
An implicit feature of every UNIX file system is that there is a top, which serves as a reference point for all files. This top is called root and is represented by a forward slash (/).
Root is actually a directory. It is conceptually different from the user-id root used by the system administrator to log in.
The root directory has a number of subdirectories under it. These subdirectories can in turn have subdirectories and other files under them.
Every file apart from root must have a parent, and it should be possible to trace the ultimate parentage of any file back to root.
From the administrative point of view, the entire file system comprises two groups of files.
The first group contains the files that are made available during system installation.
The second group consists of files that users create and edit. The contents of these directories change as more software and utilities are added to the system. Users can have their own files to write programs, send and receive mail, and create temporary files.
First Group:
/bin and /usr/bin : directories where all commonly used UNIX commands (binaries, hence bin) are found. These directories appear in your PATH; echo $PATH will list them.
/sbin and /usr/sbin : contain commands that can only be executed by the system administrator. They appear in the system administrator's PATH.
/etc : contains the configuration files of the system. Your login name and password are stored in files like /etc/passwd and /etc/shadow.
You can change a very important aspect of system functioning by editing a text file in this directory.
/dev : Contains all device files. These files don’t occupy space on disk. There could
be more subdirectories like pts, dsk and rdsk in this directory.
/lib and /usr/lib : contain all library files in binary form. You will need to link your C programs with files in these directories.
/usr/share/man : This is where the man pages are stored. There are
separate subdirectories here that contain the pages for each section.
SECOND GROUP:
/var : The variable part of the file system. Contains all your print jobs
and your outgoing and incoming mail.
/home : users are housed here. User kumar would have his home directory at /home/kumar. Your system may use a different location for home directories.
pwd displays the full absolute path of your current location in the filesystem:
$ pwd
/usr/bin
ls (list directory)
ls lists the contents of a directory. If no target directory is given, then the contents
of the current working directory are displayed. So, if the current working directory
is /,
$ ls
bin dev home mnt share usr var
boot etc lib proc sbin tmp vol
Actually, ls doesn't show you all the entries in a directory - files and directories
that begin with a dot (.) are hidden (this includes the directories '.' and '..' which
are always present). The reason for this is that files that begin with a . usually
contain important configuration information and should not be changed under
normal circumstances. If you want to see all files, ls supports the -a option:
$ ls -a
Even this listing is not that helpful - there are no hints to properties such as the
size, type and ownership of files, just their names. To see more detailed
information, use the -l option (long listing), which can be combined with the -
a option as follows:
$ ls -a -l
(or, equivalently,)
$ ls -al
type is a single character which is either 'd' (directory), '-' (ordinary file), 'l'
(symbolic link), 'b' (block-oriented device) or 'c' (character-oriented device).
links refers to the number of filesystem links pointing to the file/directory (see
the discussion on hard/soft links in the next section).
group denotes a collection of users who are allowed to access the file according
to the group access rights specified in the permissions field.
size is the length of a file, or the number of bytes used by the operating system
to store the list of files in a directory.
date is the date when the file or directory was last modified (written to). The -u option displays the time when the file was last accessed (read) instead.
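The fields above can be seen together in a single listing, as in this sketch (the filename is illustrative; the owner, group and date will differ on your system):

```shell
# Create a file with known permissions.
touch report.txt
chmod 644 report.txt    # rw- for owner, r-- for group and others

# Long listing: type+permissions, link count, owner, group, size,
# modification date, and filename.
ls -l report.txt
```

The first field reads -rw-r--r--: an ordinary file ('-') that its owner can read and write, and that the group and others can only read.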
$ man ls
man is the online UNIX user manual, and you can use it to get help with commands and find out what options are supported. It has quite a terse style which is often not that helpful, so some users prefer to use the (non-standard) info utility if it is installed:
$ info ls
$ more target-file(s)
displays the contents of target-file(s) on the screen, pausing at the end of each
screenful and asking the user to press a key (useful for long files). It also
incorporates a searching facility (press '/' and then type a phrase that you want
to look for).
You can also use more to break up the output of commands that produce
more than one screenful of output as follows (| is the pipe operator, which will
be discussed in the next chapter):
$ ls -l | more
less is just like more, except that it has a few extra features (such as allowing users to scroll backwards and forwards through the displayed file). less is not a standard utility, however, and may not be present on all UNIX systems.
$ ln filename linkname
creates another directory entry for filename called linkname (i.e. linkname is a
hard link). Both directory entries appear identical (and both now have a link
count of 2). If either filename or linkname is modified, the change will be
reflected in the other file (since they are in fact just two different directory
entries pointing to the same file).
$ ln -s filename linkname
creates a shortcut called linkname (i.e. linkname is a soft link). The shortcut
appears as an entry with a special type ('l'):
$ ln -s hello.txt bye.txt
$ ls -l bye.txt
lrwxrwxrwx 1 will finance 13 bye.txt -> hello.txt
$
The link count of the source file remains unaffected. Notice that the permission
bits on a symbolic link are not used (always appearing as rwxrwxrwx). Instead
the permissions on the link are determined by the permissions on the target
(hello.txt in this case).
Note that you can create a symbolic link to a file that doesn't exist, but not a
hard link. Another difference between the two is that you can create symbolic
links across different physical disk devices or partitions, but hard links are
restricted to the same disk partition. Finally, most current UNIX implementations
do not allow hard links to point to directories.
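The differences can be sketched as follows (GNU coreutils stat is assumed for reading the link count; the filenames are illustrative):

```shell
# Hard link: a second directory entry for the same file.
# The file's link count rises to 2.
echo "hello" > hello.txt
ln hello.txt hard.txt
stat -c %h hello.txt      # prints 2

# Soft link: a separate file of type 'l' holding a pathname;
# the source file's link count is unchanged.
ln -s hello.txt soft.txt
ls -l soft.txt
```

Deleting hard.txt would leave hello.txt intact (link count drops back to 1), whereas deleting hello.txt would leave soft.txt as a dangling shortcut.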
Note that the UNIX shell performs these expansions (including any filename
matching) on a command's arguments before the command is executed.
2.7 Quotes
As we have seen certain special characters (e.g. '*', '-','{' etc.) are interpreted in a
special way by the shell. In order to pass arguments that use these characters to
commands directly (i.e. without filename expansion etc.), we need to use special
quoting characters. There are three levels of quoting that you can try:
Use single forward quotes (') around arguments to prevent all expansions.
There is a fourth type of quoting in UNIX. Single backward quotes (`) are used to
pass the output of some command as an input argument to another. For example:
$ hostname
rose
$ echo this machine is called `hostname`
this machine is called rose
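The quoting levels can be compared side by side, as in this sketch ($HOME is just a convenient variable to demonstrate with):

```shell
# Single quotes suppress all expansion: the text is passed literally.
echo '$HOME'        # prints the literal text $HOME

# Double quotes allow variable and command substitution but
# suppress filename expansion.
echo "$HOME"        # prints your home directory path

# Backquotes substitute a command's output into the argument list.
echo "this machine is called `hostname`"
```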
3.2 File and Directory Permissions
Permission   For a file                        For a directory
read         User can look at the contents     User can list the files in the
             of the file                       directory
write        User can modify the contents      User can create new files and remove
             of the file                       existing files in the directory
execute      User can use the filename as      User can change into the directory, but
             a UNIX command                    cannot list the files unless (s)he also
                                               has read permission. User can read files
                                               if (s)he has read permission on them.
FILE ATTRIBUTES
A file also has a number of attributes that are changeable by well-defined rules. The UNIX file system lets users access files not belonging to them without infringing on security. A file has many attributes, but here we consider only the basic ones:
Permission
Ownership
To List File Attributes (ls -l):
The -l (long) option is used to display most attributes of a file, like its permissions, size and ownership details. ls looks up the file's inode to fetch its attributes. ls -l lists seven attributes of all files in the current directory.
$ ls -l
The list is always preceded by the words total n, where n is the number of blocks occupied by these files on disk, each block consisting of 512 bytes (1024 in Linux).
The first column shows the type and permissions associated with each file.
The first character in this column is - when the file is an ordinary one; when it is d, the entry is a directory.
After this you can see a series of characters that can take the values r, w, x and -. In UNIX there are three types of file permission: read, write and execute.
The second column indicates the number of links associated with the file.
Ownership: when you create a file, you automatically become its owner.
The third column shows the owner name of the file. The owner has full authority to tamper with the file's contents and permissions.
Group Ownership: when opening a user account, the system administrator also assigns the user to some group. The fourth column represents the group owner of the file. The group has a set of privileges distinct from both others and the owner.
File size: the fifth column shows the size of the file in bytes, i.e. the amount of data it contains. It is only a character count of the file and not a measure of the disk space it occupies.
Last Modification Time: the sixth, seventh and eighth columns indicate the last modification time of the file, which is stored to the nearest second. If the file was modified less than a year ago, the year is not displayed.
Relative Permission:
chmod takes as its argument an expression comprising letters and symbols that completely describe the user category and the type of permission being assigned or removed. The expression contains three components:
the user category (u - user, g - group, o - others, a - all, i.e. ugo)
the operation (+ to assign, - to remove, = to assign absolutely)
the type of permission (r, w, x)
Absolute Permissions
To set all nine permission bits explicitly, absolute permissions are used. The expression used by chmod here is a string of three octal (base-8) numbers. Each type of permission is assigned a number as shown:
Read permission-4
Write Permission-2
Execute permission-1
For each category we add the numbers that represent the assigned permissions. For instance, 6 represents read and write permission, and 7 represents all permissions.
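For example, 754 gives the owner rwx (4+2+1), the group r-x (4+1) and others r-- (4). A sketch with an illustrative filename (GNU stat is assumed for the octal read-back):

```shell
touch script.sh

# 7 = rwx for owner, 5 = r-x for group, 4 = r-- for others.
chmod 754 script.sh

# Read the permissions back, in octal and in symbolic form.
stat -c %a script.sh     # prints 754
ls -l script.sh          # mode field shows -rwxr-xr--
```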
chgrp (change group) can be used to change the group that a file or directory belongs to:
$ chgrp groupname filename
It also supports a -R option.
3.3 Inspecting File Content
Besides cat there are several other useful utilities for investigating the contents of
files:
file filename(s)
file analyzes a file's contents for you and reports a high-level description of what
type of file it appears to be:
file can identify a wide range of files but sometimes gets understandably
confused (e.g. when trying to automatically detect the difference between C++
and Java code).
head and tail display the first and last few lines in a file respectively. You can
specify the number of lines as an option, e.g.
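A typical invocation might look like this (a sketch; the file and the line counts are illustrative):

```shell
# Build a 100-line file to demonstrate with.
seq 1 100 > numbers.txt

head -n 3 numbers.txt    # first three lines: 1 2 3
tail -n 2 numbers.txt    # last two lines: 99 100
```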
tail includes a useful -f option that can be used to continuously monitor the last
few lines of a (possibly changing) file. This can be used to monitor log files, for
example:
$ tail -f /var/log/messages
objdump can be used to disassemble binary files - that is it can show the machine
language instructions which make up compiled application programs and system
utilities.
$ cat hello.txt
hello world
$ od -c hello.txt
0000000   h   e   l   l   o       w   o   r   l   d  \n
0000014
$ od -x hello.txt
0000000 6865 6c6c 6f20 776f 726c 640a
0000014
There are also several other useful content inspectors that are non-standard (in
terms of availability on UNIX systems) but are nevertheless in widespread use.
They are summarised in Fig. 3.2.
find
If you have a rough idea of the directory tree the file might be in (or even if you don't and you're prepared to wait a while) you can use find:
$ find directory -name targetfile -print
find will look for a file called targetfile in any part of the directory tree rooted at directory. targetfile can include wildcard characters. For example:
$ find /home -name "*.txt" -print 2>/dev/null
will search all user directories for any file ending in ".txt" and output any matching files (with a full absolute or relative path). Here the quotes (") are necessary to avoid filename expansion, while the 2>/dev/null suppresses error messages (arising from errors such as not being able to read the contents of directories for which the user does not have the right permissions).
find can in fact do a lot more than just find files by name. It can find files by type
(e.g. -type f for files, -type d for directories), by permissions (e.g. -perm o=r for all
files and directories that can be read by others), by size (-size) etc. You can also
execute commands on the files you find. For example,
$ find . -name "*.txt" -exec wc -l '{}' ';'
counts the number of lines in every text file in and below the current directory. The '{}' is replaced by the name of each file found and the ';' ends the -exec clause.
For more information about find and its abilities, use man find and/or info find.
which (sometimes also called whence) command
If you can execute an application program or system utility by typing its name
at the shell prompt, you can use which to find out where it is stored on disk. For
example:
$ which ls
/bin/ls
locate string
find can take a long time to execute if you are searching a large filespace (e.g.
searching from / downwards). The locate command provides a much faster way of
locating all files whose names match a particular search string. For example:
$ locate ".txt"
will find all filenames in the filesystem that contain ".txt" anywhere in their full
paths.
One disadvantage of locate is that it stores all filenames on the system in an index that
is usually updated only once a day. This means locate will not find files that have
been created very recently. It may also report filenames as being present even
though the file has just been deleted. Unlike find, locate cannot track down files
on the basis of their permissions, size and so on.
grep searches the named files (or standard input if no files are named) for
lines that match a given pattern. The default behaviour of grep is to print out
the matching lines. For example:
$ grep -vi hello *.txt
searches all text files in the current directory for lines that do not contain any form of the word hello (e.g. Hello, HELLO, or hELlO); here -v inverts the match and -i ignores case.
If you want to search all files in an entire directory tree for a particular pattern, you can combine grep with find, using backward single quotes to pass the output from find into grep. So
$ grep hello `find . -name "*.txt" -print`
will search all text files in the directory tree rooted at the current directory for lines containing the word "hello".
The patterns that grep uses are actually a special type of pattern known as
regular expressions. Just like arithmetic expressions, regular expressions are
made up of basic subexpressions combined by operators.
The caret `^' and the dollar sign `$' are special characters that match the beginning and end of a line respectively. The dot '.' matches any character. So, for example, the extended regular expression
'(^[0-9]{1,5}[a-zA-Z ]+$)|none'
matches any line that either consists of a number of up to five digits followed by one or more letters and spaces, or contains the word none.
You can read more about regular expressions on the grep and egrep manual
pages.
Note that UNIX systems also usually support another grep variant
called fgrep (fixed grep) which simply looks for a fixed string inside a file (but this
facility is largely redundant).
3.6 Sorting files
There are two facilities that are useful for sorting files in UNIX:
sort filenames
sort sorts lines contained in a group of files alphabetically (or, if the -n option is specified, numerically). The sorted output is displayed on the screen, and may be stored in another file by redirecting the output. So
$ sort input.txt > output.txt
outputs the sorted contents of input.txt into output.txt (the filenames here are illustrative).
uniq filename
uniq removes duplicate adjacent lines from a file. This facility is most useful when
combined with sort:
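For instance, the combination reduces a file to its distinct lines (a sketch with made-up data):

```shell
# Build an unsorted file containing duplicates.
printf 'pear\napple\npear\nbanana\napple\n' > fruit.txt

# sort brings the duplicates together; uniq then removes the adjacent repeats.
sort fruit.txt | uniq
```

Without the sort, uniq alone would miss duplicates that are not on adjacent lines.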
tar backs up entire directories and files onto a tape device or (more commonly)
into a single disk file known as an archive. An archive is a file that contains other
files plus information about them, such as their filename, owner, timestamps, and
access permissions. tar does not perform any compression by default.
cpio
cpio is another facility for creating and reading archives. Unlike tar, cpio doesn't automatically archive the contents of directories, so it's common to combine cpio with find when creating an archive:
$ find . -print -depth | cpio -ov -Htar > archivename
This will take all the files in the current directory and the directories below and place them in an archive called archivename. The -depth option controls the order in which the filenames are produced and is recommended to prevent problems with directory permissions when doing a restore. The -o option creates the archive, the -v option prints the names of the files as they are added, and the -H option specifies an archive format type (in this case it creates a tar archive). Another common archive type is crc, a portable format with a checksum for error control.
To restore the files, cpio reads the archive from standard input, e.g.:
$ cpio -idv < archivename
Here the -d option will create directories as necessary. To force cpio to extract files on top of files of the same name that already exist (and have the same or later modification time), use the -u option.
compress, gzip
compress and gzip are utilities for compressing and decompressing individual files (which may or may not be archive files). To compress files, use:
$ compress filename
or
$ gzip filename
To decompress them, use:
$ compress -d filename
or
$ gzip -d filename
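A full round trip with gzip looks like this (a sketch; the filename and contents are illustrative):

```shell
echo "some data worth keeping" > data.txt

gzip data.txt        # replaces data.txt with the compressed data.txt.gz
ls data.txt.gz

gzip -d data.txt.gz  # restores the original data.txt
cat data.txt
```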
UNIX supports tools for accessing removable media such as CDROMs and
floppy disks.
mount, umount
The mount command serves to attach the filesystem found on some device to
the filesystem tree. Conversely, the umount command will detach it again (it is
very important to remember to do this when removing the floppy or CDROM).
The file /etc/fstab contains a list of devices and the points at which they will be
attached to the main filesystem:
$ cat /etc/fstab
/dev/fd0 /mnt/floppy auto rw,user,noauto 0 0
/dev/hdc /mnt/cdrom iso9660 ro,user,noauto 0 0
In this case, the mount point for the floppy drive is /mnt/floppy and the mount
point for the CDROM is /mnt/cdrom. To access a floppy we can use:
$ mount /mnt/floppy
$ cd /mnt/floppy
$ ls (etc...)
To force all changed data to be written back to the floppy and to detach the
floppy disk from the filesystem, we use:
$ umount /mnt/floppy
mtools
If they are installed, the (non-standard) mtools utilities provide a convenient way
of accessing DOS-formatted floppies without having to mount and unmount
filesystems. You can use DOS-type commands like "mdir a:", "mcopy a:*.* .",
"mformat a:", etc. (see the mtools manual pages for more details).
7.1 Objectives
This lecture covers basic system administration concepts and tasks, namely:
Note that you will not be given administrator access on the lab machines. However,
you might like to try some basic administration tasks on your home PC.
One way to become root is to log in as usual using the username root and the
root password (usually security measures are in place so that this is only possible
if you are using a "secure" console and not connecting over a network). Using root
as your default login in this way is not recommended, however, because normal
safeguards that apply to other user accounts do not apply to root. Consequently
using root for mundane tasks often results in a memory lapse or misplaced
keystrokes having catastrophic effects (e.g. forgetting for a moment which
directory you are in and accidentally deleting another user's files, or accidentally
typing "rm -rf * .txt" instead of "rm -rf *.txt" ).
A better way to become root is to use the su utility. su (switch user) lets you
become another user (at least as far as the computer is concerned). If you don't
specify the name of the user you wish to become, the system will assume you
want to become root. Using su does not usually change your current directory,
unless you specify a "-" option which will run the target user's startup scripts and
change into their home directory (provided you can supply the right password of
course). So:
$ su -
Password: xxxxxxxx
#
Note that the root account often displays a different prompt (usually a #). To
return to your old self, simply type "exit" at the shell prompt.
You should avoid leaving a root window open while you are not at your machine.
/sbin/shutdown allows a UNIX system to shut down gracefully and securely. All
logged-in users are notified that the system is going down, and new logins are
blocked. It is possible to shut the system down immediately, or after a specified delay, and to specify what should happen afterwards (e.g. whether the machine should halt or reboot):
If you have to shut a system down extremely urgently or for some reason cannot
use shutdown, it is at least a good idea to first run the command:
# sync
At system startup, the operating system performs various low-level tasks, such as
initialising the memory system, loading up device drivers to communicate with
hardware devices, mounting filesystems and creating the init process (the parent
of all processes). init's primary responsibility is to start up the system services as
specified in /etc/inittab. Typically these services include gettys (i.e. virtual
terminals where users can login), and the scripts in the directory /etc/rc.d/init.d which usually spawn high-level daemons such as httpd (the web server). On most UNIX systems you can type dmesg to see system startup messages, or look in /var/log/messages.
If a mounted filesystem is not "clean" (e.g. the machine was turned off without
shutting down properly), a system utility fsck is automatically run to repair it.
Automatic running can only fix certain errors, however, and you may have to run
it manually:
# fsck filesys
where filesys is the name of a device (e.g. /dev/hda1) or a mount point (like /).
"Lost" files recovered during this process end up in the lost+found directory.
Some more modern filesystems called "journaling" file systems don't require fsck,
since they keep extensive logs of filesystem events and are able to recover in a
similar way to a transactional database.
useradd is a utility for adding new users to a UNIX system. It adds new user
information to the /etc/passwd file and creates a new home directory for the
user. When you add a new user, you should also set their password (using the -p option on useradd, or using the passwd utility):
# useradd bob
# passwd bob
7.5 Controlling User Groups
groupadd (in /usr/sbin):
groupadd creates a new user group and adds the new information to /etc/group:
# groupadd groupname
Every user belongs to a primary group and possibly also to a set of supplementary
groups. To modify the group memberships of an existing user, use the usermod command.
groups shows which groups a user belongs to:
# groups username
Look in /usr/src/linux for the kernel source code. If it isn't there (or if there is just
a message saying that only kernel binaries have been installed), get hold of a
copy of the latest kernel source code from http://www.kernel.org and untar it
into /usr/src/linux.
Change directory to /usr/src/linux and start the kernel configuration (e.g. with make menuconfig).
You will be asked to select which modules (device drivers, multiprocessor support etc.) you wish to include. For each module, you can choose to include it in the kernel code (y), incorporate it as an optional module that will be loaded if needed (m), or exclude it from the kernel code (n). To find out which optional modules have actually been loaded you can run lsmod when the system reboots.
Now build the kernel and its modules (on the 2.4-era kernels described here, typically make dep, make bzImage, make modules and make modules_install).
Finally, you may need to update the /etc/lilo.conf file so that lilo (the Linux
boot loader) includes an entry for your new kernel. Then run
# lilo
to update the changes. When you reboot your machine, you should be able
to select your new kernel image from the lilo boot loader.
Each entry in the /etc/crontab file entry contains six fields separated by spaces or
tabs in the following form:
minute 0 through 59
hour 0 through 23
day_of_month 1 through 31
month 1 through 12
weekday 0 (Sun) through 6 (Sat)
command a shell command
You must specify a value for each field. Except for the command field, these fields can contain a single number, a comma-separated list of numbers, a range of two numbers separated by a hyphen, or an asterisk (meaning all legal values).
You can also specify some execution environment options at the top of the
/etc/crontab file:
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
To run the calendar command at 6:30am. every Mon, Wed, and Fri, a suitable
/etc/crontab entry would be:
30 6 * * 1,3,5 /usr/bin/calendar
The output of the command will be mailed to the user specified in the MAILTO
environment option.
You don't need to restart the cron daemon crond after changing /etc/crontab - it
automatically detects changes.
rs:2345:respawn:/home/sms/server/RingToneServer
Here rs is a 2-character code identifying the service, and 2345 are the runlevels (to find out about runlevels, type man runlevel) for which the process should be created. The init process will create the RingToneServer process at system startup, and respawn it should it die for any reason.
The VI Editor
vi Basics:
To invoke a file:
$ vi <filename>
This is the mode where you can pass commands to act on text, using most of the keys of the keyboard. Pressing a key doesn't show it on the screen but performs a function like moving the cursor, deleting a line etc.
To enter text you have to move to the Input Mode. This can be done by pressing the key i.
To erase a letter written before the current cursor position, use backspace.
To invoke the ex Mode from the Command Mode, enter a colon (:), which shows up in the last line. Enter x and press Enter: the file is saved and the editor quits, taking you back to the shell prompt.
Modes of the vi Editor: there are three modes in the vi editor. These are:
Command Mode
Input Mode
Ex Mode or Last Line Mode
Command Mode:
Input Mode:
You can run an ex Mode command followed by [Enter]; it will take you back to the Command Mode.
To Remember:
Keys pressed in the Input Mode don't appear on the screen; this mode is used for entering, copying and replacing text. Some of the commands used in these modes:
Command   Action
:x        saves the file and quits the editor
:wq       as above
k         moves the cursor up one line
The repeat factor can also be used as a command prefix with all these four commands, e.g. 4k moves the cursor four lines up.
Word Navigation:
Scrolling
Absolute Movement:
Editing Text
Deleting text: x deletes characters to the right of the cursor position; X deletes to the left.
Moving Text
E.g. suppose you have typed Nuix instead of Unix; you have to place the n after the u. With the cursor on the n, delete it with x and put it back with p (the sequence xp).
p puts the deleted text to the right of the cursor, and P places it to the left.
The same commands can be used to put deleted lines from one location to another.
Copying Text
Joining Lines:
Use the actual command once and repeat it any number of times with the . (repeat) command; use u to undo the last change.
Searching can be made in both forward and reverse directions and can be
repeated.
With the ex mode, the s command is used to substitute text; it lets you replace a pattern in the file with something else.
Syntax:
:address s/source_pattern/target_pattern/flags
The address can be one line number or a pair of line numbers separated by a comma.
The commonly used flag is g, which carries out the substitution for all occurrences of the pattern in a line. If the g or gc flag is not used, the substitution is carried out only for the first occurrence in each addressed line.
:1,$s/director/member/g
replaces every occurrence of director with member in the whole file (1,$ addresses all lines).
For example, a program that normally reads from stdin (the keyboard) can be
redirected to read from a file.
It is also possible to redirect any device using the file descriptor number assigned
to that device.
By default, the C compiler sends its error output to the screen. To redirect the error output to a file, it is necessary to use the file descriptor (stderr is file descriptor 2).
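Using descriptor 2 to capture error output can be sketched like this (the failing ls call is just a convenient error generator; the filename is illustrative):

```shell
# This ls call fails; its error message goes into errors.txt rather than
# the screen (the || true just ignores the failing exit status).
ls /no/such/directory 2> errors.txt || true

# The captured message is now in the file.
cat errors.txt
```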
Pipes
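A pipe (|) connects one command's standard output to the next command's standard input; a minimal sketch is counting the lines another command produces:

```shell
# printf emits two lines; wc -l reads them from the pipe and counts them.
printf 'first\nsecond\n' | wc -l
```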
Following are examples of valid variable names:
_ALI
TOKEN_A
VAR_1
VAR_2
Following are examples of invalid variable names:
2_VAR
-VARIABLE
VAR1-VAR2
VAR_A!
The reason you cannot use other characters such as !,*, or - is that these
characters have a special meaning for the shell.
Defining Variables:
Variables are defined as follows:
variable_name=variable_value
For example:
NAME="Zara Ali"
Above example defines the variable NAME and assigns it the value "Zara Ali".
Variables of this type are called scalar variables. A scalar variable can hold only
one value at a time.
The shell enables you to store any value you want in a variable. For example:
VAR1="Zara Ali"
VAR2=100
Accessing Values:
To access the value stored in a variable, prefix its name with the dollar sign ( $):
For example, the following script would access the value of the defined variable NAME and print it on STDOUT:
#!/bin/sh
NAME="Zara Ali"
echo $NAME
This would produce the following output:
Zara Ali
Read-only Variables:
The shell provides a way to mark variables as read-only by using the readonly
command. After a variable is marked read-only, its value cannot be changed.
For example, the following script would give an error when trying to change the value of NAME:
#!/bin/sh
NAME="Zara Ali"
readonly NAME
NAME="Qadiri"
This would produce an error such as the following (the exact wording depends on your shell):
NAME: readonly variable
Unsetting Variables:
Unsetting or deleting a variable tells the shell to remove the variable from the list
of variables that it tracks. Once you unset a variable, you would not be able to
access stored value in the variable.
Following is the syntax to unset a defined variable using the unset command:
unset variable_name
The above command unsets the value of a defined variable. Here is a simple
example:
#!/bin/sh
NAME="Zara Ali"
unset NAME
echo $NAME
The above example would not print anything. Note that you cannot use the unset
command to unset variables that are marked readonly.
Variable Types:
The shell also maintains a set of internal variables known as shell variables. These
variables cause the shell to work in a particular way. Shell variables are local to
the shell in which they are defined; they are not available to the parent or child
shells. By convention, shell variable names are generally given in lower case in the
C shell family and upper case in the Bourne shell family.
C Shell Family
The C shell family explicitly distinguishes between shell variables and environment variables.
Shell Variables
A shell variable is defined by the set command and deleted by the unset
command. The main purpose of your .cshrc file (discussed later in this chapter) is
to define such variables for each process. To define a new variable or change the
value of one that is already defined, enter:
% set name=value
where name is the variable name, and value is a character string that is the value
of the variable. If value is a list of text strings, use parentheses around the list
when defining the variable, e.g.,
% set name=(value1 value2 value3)
The set command issued without arguments will display all your shell variables.
Do not try to check the value of a particular variable by entering set name and
omitting =value; this will effectively assign an empty value to the variable.
To delete a shell variable, enter:
% unset name
To use a shell variable in a command, preface it with a dollar sign ($), for example
$name. This tells the command interpreter that you want the variable's value, not
its name, to be used. You can also use ${name}, which avoids confusion when
concatenated with text.
% echo $name
If the value is a list, to see the value of the nth string in the list enter:
% echo $name[n]
The square brackets are required, and there is no space between the name and
the opening bracket.
To prepend text to the existing value, enter:
% set name=prepend_value${name}
or
% set name=${name}append_value
Note that when a shell is started up, four important shell variables are
automatically initialized to contain the same values as the corresponding
environment variables. These are user, term, home and path. If any of these are
changed, the corresponding environment variables will also be changed.
Environment Variables
For example, first we set a variable TEST and then we access its value using the
echo command:
$ TEST="Unix Programming"
$ echo $TEST
Unix Programming
Note that environment variables are set without using the $ sign, but while
accessing them we use the $ sign as a prefix. These variables retain their values
until we exit the shell.
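Marking a variable for export is what turns it into an environment variable that child processes can see. A sketch:

```shell
TEST="Unix Programming"   # plain assignment: visible only in this shell
export TEST               # now passed to every child process
sh -c 'echo "$TEST"'      # a child shell sees the value
```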
When you log in to the system, the shell undergoes a phase called initialization to
set up its environment. This is usually a two-step process that involves the
shell reading the following files:
1. The shell checks to see whether the file /etc/profile exists.
2. If it exists, the shell reads it. Otherwise, this file is skipped. No error
message is displayed.
3. The shell checks to see whether the file .profile exists in your home
directory. Your home directory is the directory that you start out in after
you log in.
4. If it exists, the shell reads it; otherwise, the shell skips it. No error message
is displayed.
5. As soon as both of these files have been read, the shell displays a prompt:
$
This is the prompt where you can enter commands in order to have them
executed.
Note - The shell initialization process detailed here applies to all Bourne type
shells, but some additional files are used by bash and ksh.
The file .profile is under your control. You can add as much shell customization
information as you want to this file. The minimum set of information that you
need to configure includes the type of terminal you are using, a list of
directories in which to locate commands (PATH), and the look of your prompt.
You can check your .profile available in your home directory. Open it using vi
editor and check all the variables set for your environment.
Usually the type of terminal you are using is automatically configured by either
the login or getty programs. Sometimes, the autoconfiguration process guesses
your terminal incorrectly.
If your terminal is set incorrectly, the output of commands might look strange, or
you might not be able to interact with the shell properly.
To make sure that this is not the case, most users set their terminal to the lowest
common denominator as follows:
$ TERM=vt100
$
Setting the PATH:
When you type any command on command prompt, the shell has to locate the
command before it can be executed.
The PATH variable specifies the locations in which the shell should look for
commands. Usually it is set as follows:
$ PATH=/bin:/usr/bin
$
Here each of the individual entries separated by the colon character, :, are
directories. If you request the shell to execute a command and it cannot find it in
any of the directories given in the PATH variable, a message similar to the
following appears:
$hello
hello: not found
$
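The usual fix for such a "not found" message is to append the command's directory to PATH; $HOME/bin is a conventional choice (illustrative here):

```shell
PATH="$PATH:$HOME/bin"   # keep the existing entries, add one more directory
export PATH
echo "$PATH"             # the new entry appears at the end
```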
There are variables like PS1 and PS2 which are discussed in the next section.
$ PS1='=>'
=>
Your prompt would become =>. To set the value of PS1 so that it shows the
working directory, issue the command:
=>PS1="[\u@\h \w]\$"
[root@ip-72-167-112-17 /var/www/tutorialspoint/unix]$
[root@ip-72-167-112-17 /var/www/tutorialspoint/unix]$
The result of this command is that the prompt displays the user's username, the
machine's name (hostname), and the working directory.
There are quite a few escape sequences that can be used as value arguments for
PS1; try to limit yourself to the most critical so that the prompt does not
overwhelm you with information.
Escape Sequence  Description
\n  Newline.
\W  Working directory.
\#  Command number of the current command. Increases with each new command entered.
\$  If the effective UID is 0 (that is, if you are logged in as root), end the prompt with the # character; otherwise, use the $ character.
You can make the change yourself every time you log in, or you can have the
change made automatically in PS1 by adding it to your .profile file.
When you issue a command that is incomplete, the shell will display a secondary
prompt and wait for you to complete the command and hit Enter again.
The default secondary prompt is > (the greater-than sign), but it can be changed
by re-defining the PS2 shell variable:
$ echo "this is
a > test"
this is a
test
$
$ PS2="secondary prompt->"
$ echo "this is a
secondary prompt-
>test" this is a
test
$
Environment Variables:
Variable  Description
DISPLAY   Contains the identifier for the display that X11 programs should use by default.
HOME      Indicates the home directory of the current user: the default argument for the cd built-in command.
IFS       Indicates the Internal Field Separator that is used by the parser for word splitting after expansion.
LANG      Expands to the default system locale; LC_ALL can be used to override this. For example, if its value is pt_BR, then the language is set to (Brazilian) Portuguese and the locale to Brazil.
RANDOM    Generates a random integer between 0 and 32,767 each time it is referenced.
TZ        Refers to the time zone. It can take values like GMT, AST, etc.
UID       Expands to the numeric user ID of the current user, initialized at shell startup.
$ echo $TERM
xterm
$ echo $PATH
/usr/local/bin:/bin:/usr/bin:/home/amrood/bin:/usr/local/bin
$
Display Environment Variable
Open the terminal and type the following commands to display all
environment variables and their values under UNIX-like operating systems:
$ set
OR
$ printenv
OR
$ env
Sample outputs:
echo $PATH
echo $PS1
A few more examples:
echo $USER
echo $PWD
echo $MAIL
echo $JAVA_PATH
echo $DB2INSTANCE
You can use the following command to change the environment variable for the
current session as per your shell.
var=value
export var
JAVA_PATH=/opt/jdk/bin
export JAVA_PATH
export var=value
export PATH=$PATH:/opt/bin:/usr/local/bin:$HOME/bin
For C shell (csh or tcsh), the equivalent is the setenv command:
setenv var value
For example, the $$ special variable holds the process ID number, or PID, of the
current shell:
$echo $$
29949
The following table shows a number of special variables that you can use in your
shell scripts:
Variable Description
$n  These variables correspond to the arguments with which a script was invoked. Here n is a positive decimal number corresponding to the position of an argument (the first argument is $1, the second argument is $2, and so on).
$*  All the arguments are double quoted. If a script receives two arguments, $* is equivalent to $1 $2.
$@  All the arguments are individually double quoted. If a script receives two arguments, $@ is equivalent to $1 $2.
$$  The process number of the current shell. For shell scripts, this is the process ID under which they are executing.
$!  The process number of the last background command.
Command-Line Arguments:
The command-line arguments $1, $2, $3,...$9 are positional parameters, with $0
pointing to the actual command, program, shell script, or function and $1, $2, $3,
...$9 as the arguments to the command.
#!/bin/sh
echo "Script name: $0, first argument: $1"
There are special parameters that allow accessing all of the command-line
arguments at once. $* and $@ both will act the same unless they are enclosed in
double quotes, "".
Both parameters specify all command-line arguments, but the "$*" special
parameter takes the entire list as one argument with spaces between, while the
"$@" special parameter takes the entire list and separates it into individual
arguments.
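The difference only shows up when an argument itself contains spaces. A sketch using set -- to fake the positional parameters (the values are illustrative):

```shell
set -- "Zara Ali" 10                  # pretend the script received two arguments
for t in "$@"; do echo "[$t]"; done   # two items: [Zara Ali] then [10]
for t in "$*"; do echo "[$t]"; done   # one item:  [Zara Ali 10]
```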
We can write the shell script shown below to process an unknown number of
command-line arguments with either the $* or $@ special parameters:
#!/bin/sh
for TOKEN in $*
do
echo $TOKEN
done
A sample run of the above script prints each command-line argument on a line of
its own.
Exit Status:
Exit status is a numerical value returned by every command upon its completion.
As a rule, most commands return an exit status of 0 if they were successful, and a
non-zero value (commonly 1) if they were unsuccessful.
Some commands return additional exit statuses for particular reasons. For
example, some commands differentiate between kinds of errors and will return
various exit values depending on the specific type of failure.
Following is an example of a successful command; the special variable $? holds
the exit status of the last command:
$ date
(the current date and time are printed)
$ echo $?
0
PAGINATING FILES (pr)
The pr command prepares a file for printing by adding suitable headers, footers
and formatted text.
$pr <filename>
$ pr dept.lst
pr adds five lines of margin at the top and five at the bottom. The lower portion of
the page has not been shown in the examples for reasons of economy. The header
shows the date and time of last modification of the file, along with the filename
and page number.
pr Options
$ a.out | pr -t -5            (-t suppresses the header and trailer; -5 arranges output in five columns)
$ pr -t -n -d -o 10 dept.lst  (-n numbers lines, -d double-spaces, -o 10 offsets each line by 10 spaces)
$ pr -l 54 chap01             (-l 54 sets the page length to 54 lines)
Because pr formats its input by adding margins and a header, it's often used as a
pre-processor before printing with the lp command.
$ pr -h "Department list" dept.lst | lp
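A quick way to see pr at work on a small sample file (the file name and its contents are illustrative; -t keeps the output short by suppressing the header and trailer):

```shell
printf 'sales\naccounts\npersonnel\n' > dept.lst   # make a small sample file
pr -t -n dept.lst                                  # -n numbers each line
```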
The head command, as the name implies, displays the top of the file. When used
without an option it displays the first ten lines of the file.
$ head emp.lst
To display a fixed number of lines, use the -n option:
$ head -n 3 emp.lst
$ tail -3 emp.lst
displays the last three lines of the file. The +count option represents the line
number from where the selection should begin.
Options:
The -f (follow) option keeps the file open and displays new lines as they are
appended, which is useful for watching a growing log file:
$ tail -f /oracle/app/oracle/product/8.1/install.log
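head and tail complement each other; a sketch on a generated sample file (the file name is illustrative):

```shell
seq 1 10 > nums.txt    # ten lines: 1 through 10
head -n 3 nums.txt     # first three lines: 1 2 3
tail -n 2 nums.txt     # last two lines: 9 10
tail -n +8 nums.txt    # from line 8 to the end: 8 9 10
```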
a) Cutting Columns (-c):
b) Cutting Fields (-f):
cut uses the tab as its default field delimiter; -f is used to select fields, and
-d is used to specify a different field delimiter.
$ who | cut -d " " -f 1
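A sketch of -d and -f on a passwd-style line (the line itself is made up):

```shell
line='tom:x:1000:1000:Tom Smith:/home/tom:/bin/bash'
echo "$line" | cut -d: -f1     # first field: tom
echo "$line" | cut -d: -f1,7   # fields 1 and 7: tom:/bin/bash
```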
Pasting Files (paste):
Joining Lines (-s): It is used to join lines; with this option, -d can also be
used to supply delimiters, and the delimiter list is reused in a circular manner.
Suppose three different pieces of information about a candidate are given on
three lines and we want to bring them all together on one line; paste -s with a
suitable delimiter does exactly that.
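A sketch of -s joining three lines about one candidate into a single line (the delimiter and data are illustrative):

```shell
# -s joins all input lines into one; -d supplies the delimiter between them
printf 'Tom Smith\n12 High Street\n9876543210\n' | paste -s -d'|' -
# prints: Tom Smith|12 High Street|9876543210
```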
Ordering a File (sort):
$ sort sortfile
sort reorders lines in ASCII collating sequence: whitespace first, then numerals,
uppercase letters and finally lowercase letters.
Sort options:
Sorting on a secondary key: if the primary key is the third field and the
secondary key is the 2nd field, then:
$ sort -k 3,3 -k 2,2 sortfile
Numeric sort(-n)
$sort numfile
$sort –n numfile
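The difference between character ordering and numeric ordering can be seen on three numbers:

```shell
printf '10\n2\n33\n' | sort     # character order: 10, 2, 33 ('1' sorts before '2')
printf '10\n2\n33\n' | sort -n  # numeric order: 2, 10, 33
```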
USERS
username:x:UID:GID:comment:home_directory:login_shell
(seven colon-separated fields)
root:x:0:0:Super User:/root:/bin/bash
idallen:x:500:500:Ian! D. Allen:/home/idallen:/bin/bash
When a system has shadow passwords enabled (the default), the password
field in /etc/passwd is replaced by an “x” and the user’s real encrypted
password is stored in /etc/shadow.
/etc/shadow is only readable by the root user, so even the encrypted
password is hidden and can’t be used in a password-cracking program
Each line in /etc/shadow contains the user’s login userid, their encrypted
password, and fields relating to password expiration.
Special passwords (see “man shadow”):
1. a leading ! means the password (and thus account) is locked
2. an asterisk (star) * indicates the account has been disabled
useradd
userdel
chsh
“CHange SHell”
Changes the login shell in /etc/passwd - does not affect current shell
Only root can change shells of other accounts
If a shell isn’t specified on the command line, it will prompt for one
Usually only allows setting a shell from a small system-defined list
passwd
GROUPS
Groups allow a set of permissions to be assigned to a group of users
Every file system object has “group” permissions; if you are not the owner
of the object but are in that group, group permissions apply to you.
File system objects have only one owner and can be in only
one group.
Logged in users can be “in” (members of) multiple groups.
Most group information is maintained in /etc/group and /etc/gshadow
BUT: At login, every user is given an initial group GID from
the passwd file.
A user will belong to other groups (supplementary groups), if the user is
a member of those groups in the /etc/group file.
groupname:x:GID:userid1,userid2,userid3
(four colon-separated fields)
root:x:0:
cdrom:x:500:idallen,alleni
group name
encrypted password (or an x marker indicating use of /etc/gshadow)
Group ID number (GID)
Optional list of userids that are members of that group
1. The above information about groups is kept in /etc/group
2. Modifications can be done by root or by the Group Administrator for a group
3. Its content can be viewed by anyone
4. Encrypted passwords are usually stored in /etc/gshadow, accessible only
by root
Group Shadow Passwords - /etc/gshadow
When a system has shadow passwords enabled (the default), the password field
in /etc/group is replaced by an “x” and the user’s real encrypted password is
stored in /etc/gshadow.
/etc/gshadow is only readable by the root user, so even the encrypted password
is hidden and can’t be used in a password-cracking program
Each line in /etc/gshadow contains the group name, the group encrypted
password, an optional list of Group Administrators, and an optional list of
Group Members (which should be the same in /etc/group)
Special passwords (see “man gshadow”):
Example: su --login
Opens up a subshell as the new user, with that user’s privileges
Exiting the subshell goes back to the previous user
Ordinary (non-root) users need to enter the password for the other account
A dash - or --login option (options must be surrounded by spaces) means use a
full login shell that clears the environment, sets groups and goes to the user’s
home directory as if the user had just logged in.
Without the full login, the command will set privileges but will leave most of the
existing environment unchanged, including an unchanged current directory (that
may not grant the new user any permissions!).
If you don’t give a userid, it assumes you want to become the root user
[idallen@localhost]$ whoami
idallen
[idallen@localhost]$ su
password: XXX
[root@localhost]# whoami
root
[root@localhost]# exit
[idallen@localhost]$
[idallen@localhost]$ whoami
idallen
sudo - do as if su
There are 3 special permissions that are available for executable files
and directories. These are:
1. SUID permission
2. SGID permission
3. Sticky bit
Have you ever wondered how a non-root user can change his own password when
he does not have write permission to the /etc/shadow file? To understand the
trick, check the permissions of the /usr/bin/passwd command:
# ls -lrt /usr/bin/passwd
-r-sr-sr-x 1 root sys 31396 Jan 20 2014
/usr/bin/passwd
– If you check carefully, you will find two s's in the permission field. The first
s stands for SUID and the second one stands for SGID.
– When a command or script with SUID bit set is run, its effective UID becomes
that of the owner of the file, rather than of the user who is running it.
# ls -l /bin/su
-rwsr-xr-x 1 root user 16384 Jan 12 2014 /bin/su
Note :
If a capital “S” appears in the owner’s execute field, it indicates that the setuid bit
is on, and the execute bit “x” for the owner of the file is off or denied.
– SGID permission is similar to the SUID permission; the only difference is that
when a script or command with SGID set is run, it runs as if it were a member of
the group that owns the file.
# ls -l /usr/bin/write
Note :
– If a lowercase letter “l” appears in the group’s execute field, it indicates that
the setgid bit is on, and the execute bit for the group is off or denied.
SGID on a directory
– When SGID permission is set on a directory, files created in the directory belong
to the group of which the directory is a member.
– For example if a user having write permission in the directory creates a file
there, that file is a member of the same group as the directory and not the user’s
group.
Sticky Bit
– It is useful for shared directories such as /var/tmp and /tmp because users can
create files, read and execute files owned by other users, but are not allowed to
remove files owned by other users.
– For example, if user bob creates a file named /tmp/bob, another user tom cannot
delete this file even when the /tmp directory has permissions of 777. If the
sticky bit is not set, then tom can delete /tmp/bob, as the /tmp/bob file inherits
the parent directory permissions.
– The root user (of course!) and the owner of a file can remove their own files.
# ls -ld /var/tmp
# chmod +t [path_to_directory]
or
# chmod 1777 [path_to_directory]
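Setting the sticky bit on a freshly made directory (the directory name is illustrative); the octal form 1777 combines the sticky bit (1000) with full rwx permissions (777):

```shell
mkdir -p shared
chmod 1777 shared   # same effect as: chmod 777 shared; chmod +t shared
ls -ld shared       # the permission string ends in 't': drwxrwxrwt
```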
http://thegeekdiary.com/what-is-suid-sgid-and-sticky-bit/
AWK FILTER
awk scans input lines one after the other, searching each line to see if it matches a
set of patterns or conditions specified in the awk program.
For each pattern, an action is specified. The action is performed when the pattern
matches that of the input line.
pattern { action }
pattern { action }
When awk scans an input line, it breaks it down into a number of fields. Fields are
separated by a space or tab character. Fields are numbered beginning at one, and
the dollar symbol ($) is used to represent a field.
For instance, the following line in a file
I like money.
$1 I
$2 like
$3 money.
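The field splitting described above can be checked directly from the command line:

```shell
echo 'I like money.' | awk '{ print $2 }'   # second field: like
echo 'I like money.' | awk '{ print NF }'   # number of fields: 3
```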
awk interprets the actions specified in the program file myawk1 (here, the single
action { print $0 }), and applies this to each line read from the file /etc/group.
The effect is to print out each input line read from the file, in effect,
displaying the file on the screen (same as the Unix command cat).
/brian/ { print $0 }
This involves specifying a pattern to match for each input line scanned. The
following awk program (myawk2) compares field one ($1) and if the field matches
the string "386", the specific action is performed (the entire line is printed).
$1 == "386" { print $0 }
Note: The == symbol represents an equality test, thus in the above pattern, it
compares the string of field one against the constant string "386", and performs
the action if it matches.
Note: < ctrl-d> is a keypress to terminate input to the shell. Hold down the ctrl key
and then press d. User input is shown in bold type.
The program prints out all input lines where the computer type is a "386".
Comments begin with the hash (#) symbol and continue till the end of the line.
The awk program below adds a comment to a previous awk program shown
earlier
Comments can be placed anywhere on the line. The example below shows the
comment placed after the action.
Relational Expressions
We have already seen the equality test. Detailed below are the other relational
operators used in comparing expressions.
<   less than
<=  less than or equal to
==  equal to
!=  not equal to
>=  greater than or equal to
>   greater than
~   matches (regular expression)
!~  does not match (regular expression)
# myawk5
# an awk program to print the location/serial number of 486 computers
$1 == "486" { print $3, $4 }
D404 MG0012
A424 CC0182
# myawk6
# an awk program to print out all computers belonging to management
/MG/ { print $0 }
The awk program myawk6 scans each input line searching for the occurrence of
the string MG. When found, the action prints out the line. The problem with this
is it might be possible for the string MG to occur in another field, but the serial
number indicates that it belongs to another department.
In all the previous examples, the output of the awk program has been either the
entire line or fields within the line. Let's add some text to make the output more
meaningful. Consider the following awk program,
# myawk7
# list computers located in D block, type and location
$3 ~ /D/ { print "Location = ", $3, " type = ", $1 }
Let's examine how to print out some simple text. Consider the following
statement,
#myawk8
$1 == "286" { printf( "Location : "); print $3 }
Let's now examine how to use printf to display a field which is a text string. In
the previous program, a separate statement (print $3) was used to write the room
location. In the program below, this is combined into the printf statement.
#myawk9
$1 == "286" { printf( "Location is %s\n", $3 ); }
myawk9 Program Output
Location is A423
Location is A425
Note: The symbol \n causes subsequent output to begin on a new line. The symbol
%s informs printf to print out a text string, in this case it is the contents of the
field $3.
Consider the following awk program which prints the location and serial number
of all 286 computers.
#myawk10
$1=="286" { printf( "Location = %s, serial # = %s\n", $3, $4 ); }
#myawk11
$1=="486" { printf("Location = %s, disk = %dKb\n", $3, $5 ); }
Formatting Output
Let's see how to format the output information into specific field widths. A
modifier to the %s symbol specifies the size of the field width, which by default
is right justified.
#myawk12
# formatting the output using a field width
$1=="286" {printf("Location = %10s, disk = %5dKb\n",$3,$5);}
%10s specifies to print out field $3 using a field width of 10 characters, and
%5d specifies to print out field $5 using a field width of 5 digits.
Below lists the options to printf covered above; [n] indicates an optional field
width.
%[n]s  print a text string, optionally right justified in a field n characters wide
%[n]d  print a decimal integer, optionally in a field n digits wide
\n     print a new-line
The keywords BEGIN and END are used to perform specific actions relative to the
programs execution.
BEGIN The action associated with this keyword is executed before the first
input line is read.
END The action associated with this keyword is executed after all input lines
have been processed.
The BEGIN keyword is normally associated with printing titles and setting default
values, whilst the END keyword is normally associated with printing totals.
Consider the following awk program, which uses BEGIN to print a title.
#myawk13
BEGIN { print "Location of 286 Computers" }
$1 == "286" { print $3 }
#myawk14
# print the number of computers
END { print "There are ", NR, "computers" }
awk programs support the use of variables. Consider an example where we want
to count the number of 486 computers we have. Variables are implicitly initialised
to zero by awk, so there is no need to assign a value of zero to them.
The following awk program counts the number of 486 computers, and uses the
END keyword to print out the total after all input lines have been processed.
When each input line is read, field one is checked to see if it matches 486. If it
does, the awk variable computers is incremented (the symbol ++ means
increment by one).
#myawk15
$1 == "486" { computers++ }
END { printf("The number of 486 computers is %d\n", computers); }
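myawk15 can be tried out on a few made-up inventory lines (the field layout is assumed to be: type, memory, location, serial, disk):

```shell
printf '486 4096 D404 MG0012 340\n386 2048 A424 CC0182 120\n486 8192 B101 MG0099 420\n' |
awk '$1 == "486" { computers++ }
     END { printf("The number of 486 computers is %d\n", computers) }'
# prints: The number of 486 computers is 2
```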
Regular Expressions
awk provides pattern searching which is more comprehensive than the simple
examples outlined previously. These patterns are called regular expressions, and
are similar to those supported by other UNIX utilities like grep.
The simplest regular expression is a string enclosed in slashes,
/386/
In the above example, any input line containing the string 386 will be printed. To
restrict a search to a specific field, the match (or not match) symbol is used. In the
following example, field one of the input line is searched for the string 386.
$1 ~ /386/
In regular expressions, the following symbols are metacharacters with special
meanings.
\^$.[]*+?()|
A set of characters enclosed in square brackets matches any single character from
the set; for example, the pattern $1 ~ /[86]/ searches field one for an '8' or a '6'.
Note: In this example, field one is searched for the character '8' and '6', in any
order of occurrence and position.
If the first character after the opening bracket ([) is a caret (^) symbol, this
complements the set so that it matches any character NOT IN the set. The
following example (myawk17) shows this, matching field one with any character
except "8" or "6".
#myawk17
# display all which do not contain 2, 3, 4, 6 or 8 in the first field
$1 ~ /[^23468]/ { print $0 }
#myawk18
# display all lines where field one contains A-Z, a-z
$1 ~ /[a-zA-Z]/ { print $0 }
myawk18 Program Output
#myawk19
# illustrate multiple searching using alternatives
/(Apple|Mac|68020|68030)/ { print $0 }
/b\$/ { print $0 }
\b backspace
\f formfeed
\r carriage return
\t tab
\" double quote
The following example prints all input lines which contain a tab character
/\t/ { print $0 }
Consider also the use of string concatenation in pattern matching. The plus (+)
symbol concatenates one or more strings in pattern matching. The following awk
program (myawk16a) searches for computer types which begin with a dollar ($)
symbol and are followed by an alphabetic character (a-z, A-Z), and the last
character in the string is the symbol x.
#myawk16a
$1 ~ /^\$+[a-zA-Z]+x$/ { print $0 }
#myawk17
# display all which do not contain 2, 3, 4, 6 or 8 in the first field
$1 ~ /[^23468]/ { print $0 }
The awk program below shows how to rewrite this (myawk17) using a variable
which is assigned the regular expression.
#myawk20
BEGIN { matchstr = "[^23468]" }
$1 ~ matchstr { print $0 }
Consider the following example, which searches for all lines which contain the
double quote character (").
#myawk21
BEGIN { matchstr = "\"" }
$1 ~ matchstr { print $0 }
Combining Patterns
Patterns can be combined to provide more powerful and complex matching. The
following symbols are used to combine patterns.
#myawk22
$1 == "486" && $5 > 250 { print $0 }
#myawk23
# demonstrate the use of pattern ranges
/XT/, /Mac/ { print $0 }
The awk program myawk23 prints out all input lines between the first occurrence
of "XT" and the next occurrence of "Mac".
Write an awk program using a pattern range to print out all input lines beginning
with the first computer fitted with 8192Kb of memory, up to the next computer
which has less than 80Mb of hard disk. After running the program successfully,
enter it in the space provided below.
awk provides a number of internal variables which it uses to process files. These
variables are accessible by the programmer. The following is a summary of awk's
built-in variables.
NR        the number of input records (lines) read so far
NF        the number of fields in the current input line
FNR       the record number within the current input file
FS        the input field separator (default: space or tab)
RS        the input record separator (default: newline)
OFS       the output field separator
ORS       the output record separator
FILENAME  the name of the current input file
ARGC, ARGV  the count and array of command-line arguments
#myawk24
# print the first five input lines of a file, bit like head
FNR == 1, FNR == 5 { print $0 }
#myawk25
# print each input line preceded with a line number
# print the heading which includes the name of the file
BEGIN { print "File:", FILENAME }
{ print NR, ":\t", $0 }
#myawk26
# demonstrate use of argc and argv parameters
BEGIN { print "There are ",ARGC, "parameters on the command line";
print "The first argument is ", ARGV[0];
print "The second argument is ", ARGV[1]
}
#myawk27
# print out the number of fields in each input line
{ print "Input line", NR, "has", NF, "fields" }
Using the BEGIN statement, it is often desirable to change both FS (the symbol
used to separate fields) and RS (the symbol used to separate input lines). The
following text file (awktext2) is used for the program myawk28. The test file
separates each field using a dollar symbol ($), and each input line by a carat
symbol (^). The program reads the file and prints out the username and password
for each user record. A heading is shown only for clarity.
(username$address$password$privilege$downloadlimit$protocol^)
#myawk28
BEGIN { FS = "$"; RS = "^"; print "Username Password" }
{ print $1, $3 }
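A runnable sketch of the same idea, with two made-up user records in the $-and-^ format described above; the NF guard skips the empty record left after the final ^:

```shell
printf 'tom$12 High St$secret$user$100$zmodem^ann$3 Low Rd$pass99$admin$200$xmodem^' |
awk 'BEGIN { FS = "$"; RS = "^"; print "Username Password" }
     NF    { print $1, $3 }'
```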
+ add
- subtract
* multiply
/ divide
++ increment
-- decrement
% modulus
^ exponential
+= plus equals
-= minus equals
*= multiply equals
/= divide equals
%= modulus equals
^= exponential equals
The following awk program displays the average installed memory capacity for an
IBM type computer (XT - 486). Note the use of %f within the printf statement to
print out the result in floating point format. The use of .2 between the % and f
symbols specifies two decimal places.
#myawk29
/(XT|286|386|486)/ { computers++; ram += $2 }
END { avgmem = ram / computers;
printf(" The average memory per PC = %.2f", avgmem )
}
myawk29 Program Output
The average memory per PC = 4480.00
"hello\n"
The following awk program (myawk31) uses the string function gsub to replace
each occurrence of 286 with the string AT.
#myawk31
{ gsub( /286/, "AT" ); print $0 }
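myawk31's substitution can be tried on a single line:

```shell
echo 'the 286 is old; replace every 286' | awk '{ gsub( /286/, "AT" ); print $0 }'
# prints: the AT is old; replace every AT
```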
The if statement has the general form
if( expression ) statement1 else statement2
The expression can include the relational operators, the regular expression
matching operators, the logical operators and parentheses for grouping.
expression is evaluated first, and if NON-ZERO then statement1 is executed,
otherwise statement2 is executed.
In the following awk program (myawk32), each input line is scanned and field $5
is compared against the value of the awk user defined variable disksize (awk
initialises it to 0). When field $5 is greater, it is assigned to disksize, and the input
line is saved in the other user defined variable computer. Note the use of the
braces { } to group the program statements as belonging to the if statement
(same syntax as in the C language).
#myawk32
#demonstrate use of if statement, find biggest disk
{ if( disksize < $5 )
{
disksize = $5
computer = $0
}
}
END { print computer }
#myawk33
# a while statement to print out each second field only for "286" computers
BEGIN { printf("Type\tLoc\tDisk\n") }
/286/ { field = 1
while( field <= NF )
{
printf("%s\t", $field )
field += 2
}
print ""
}
The for statement has the form for( expression1; expression; expression2 )
statement, and executes as follows:
1. expression1 is executed
2. expression is evaluated. If non-zero, go to step 3; otherwise exit
3. statement is executed
4. expression2 is executed
5. go to step 2
Consider the following awk program (myawk34) which is the same as myawk33
shown earlier.
#myawk34
# a for statement to print out each second field only for "286" computers
BEGIN { printf("Type\tLoc\tDisk\n") }
/286/ { for( field = 1; field <= NF; field += 2 ) printf("%s\t", $field )
print ""
}
#myawk35
# print out every second field for "286" computers
BEGIN { field = 1 }
$1 == "286" { do {
printf("%s\t", $field)
field += 2
} while( field <= NF )
}
The break statement causes an immediate exit from within a while or for loop.
The continue statement causes the next iteration of a loop.
The next statement skips to the next input line then re-starts from the first
pattern-action statement.
The exit statement causes the program to branch to the END statement (if one
exists), else it exits the program.
#myawk36
#print out computer types "286" using a next statement
{ while( $1 != "286" ) next; print $0 }
awk provides single dimensioned arrays. Arrays need not be declared, they are
created in the same manner as awk user defined variables.
Elements can be specified as numeric or string values. Consider the following awk
program (myawk37) which uses arrays to hold the number of "486" computers
and the disk space totals for all computers.
#myawk37
# count the "486" computers and total the disk space for all computers
# (disk size assumed to be in field 5)
$1 == "486" { count["486"]++ }
{ disk[$1] += $5 }
Note that the previous program (myawk37) uses TWO pattern action statements
for each input line. The first pattern action statement handles the number of
"486" type computers, whilst the second handles the total disk space for all
computer types.
Consider the following awk program (myawk38) which uses the in statement for
processing arrays. The program counts how many computers of each type appear
in the input, then prints the total for each type.
#myawk38
{ computers[$1]++ }
END { for ( name in computers )
print "The number of ",name,"computers is",computers[name]
}
myawk38 Program Output
Consider the following awk program (myawk39) which calculates the factorial of
an inputted number.
#myawk39
function factorial( n ) {
if( n <= 1 ) return 1
else return n * factorial( n - 1)
}
{ print "the factorial of ", $1, "is ", factorial($1) }
Sample myawk39 Program Output (awk -f myawk39)
10
the factorial of 10 is 3628800
3
the factorial of 3 is 6
1
the factorial of 1 is 1
4
the factorial of 4 is 24
awk Output
The statements print and printf are used in awk programs to generate output.
awk uses two variables, OFS (output field separator) and ORS (output record
separator) to delineate fields and output lines. These can be changed at any time.
The special characters used in printf, which follow the % symbol, are,
c single character
d decimal integer
e double number, scientific notation
f floating point number
g use e or f, whichever is shortest
o octal
s string
x hexadecimal
% the % symbol
The default output format for numbers is %.6g and can be changed by assigning a
new value to OFMT.
awk's output generated by print and printf can be redirected to a file by using
the redirection symbols > (create/write) and >> (append). The names of files
MUST be in quotes.
#myawk40
# demonstrates sending output to a file
$1 == "486" { print "Type=",$1, "Location=",$3 > "comp486.dat" }
The output of awk programs can also be piped into a UNIX command by following
the print statement with | and the command name in quotes.
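As a sketch of both forms (the data file and field are illustrative): awk's output can be piped from the shell, or the print statement itself can write to a pipe.

```shell
printf '386 research\n286 admin\n486 lab\n' > machines.dat
# 1. Pipe the whole program's output into sort from the shell:
awk '{ print $1 }' machines.dat | sort > sorted1.dat
# 2. Pipe from inside the awk program itself (note the quoted command):
awk '{ print $1 | "sort" }' machines.dat > sorted2.dat
cat sorted1.dat
```

Both produce the same sorted list of first fields.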
awk Input
Data Files
We have seen TWO methods to give file input to an awk program. The first
specified the filename on the command line, the other left it blank, and awk read
from the keyboard (examples were myawk30 and myawk39).
Program Files
We have used the -f parameter to specify the file containing the awk program.
awk programs can also be specified on the command-line enclosed in single
quotes, as the following example shows.
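For instance, a complete awk program can be given on the command line in single quotes (the data file and field values here are illustrative):

```shell
printf 'alpha 10\nbeta 20\n' > nums.dat
# The whole program is the quoted argument; no -f program-file is needed.
awk '{ total += $2 } END { print total }' nums.dat
```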
awk provides the function getline to read input from the current input file or from
a file or pipe.
getline reads the next input line, splits it into fields, and sets NF, NR and
FNR. It returns 1 for success, 0 for end-of-file, and -1 on error.
The statement
getline data
reads the next input line into the user defined variable data. No splitting of fields
is done and NF is not set.
The statement
getline < "temp.dat"
reads the next input line from the file "temp.dat"; field splitting is
performed, and NF is set.
The statement
getline data < "temp.dat"
reads the next input line from the file "temp.dat" into the user defined
variable data; no field splitting is done, and NF, NR and FNR are not altered.
Consider the following example, which pipes the output of the UNIX command
who into getline. Each time through the while loop, another line is read from who,
and the user defined variable users is incremented. The program counts the
number of people on the host system.
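A sketch of the program described above, run from the shell (the exact count depends on who is logged in to the host):

```shell
# Each successful getline from the "who" pipe reads one more line of who's
# output, so the loop counts the users on the host system.
msg=$(awk 'BEGIN {
    while ( ("who" | getline line) > 0 )
        users++
    close("who")
    print users+0, "users are logged in"
}')
echo "$msg"
```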
awk Summary
The following is a summary of the most common awk statements and features.
Command Line
awk 'program' filenames
awk -f program-file filenames
awk -Fs
(sets the field separator to the string s; -F'\t' sets the separator to tab)
Patterns
BEGIN
END
/regular expression/
relational expression
pattern && pattern
pattern || pattern
(pattern)
!pattern
pattern, pattern
Input Output
getline < file set $0 from next input line of file, set NF
getline var set var from next input line, set NR, FNR
getline var < file set var from next input line of file
print print current input line
print expr-list print expressions
print expr-list > file print expressions to file
printf fmt, expr-list format and print
printf fmt, expr-list > file format and print to file
system( cmd-line ) execute command cmd-line, return status
In print and printf above, >> appends to a file, and | command writes to
a pipe. Similarly, command | getline pipes into getline. The function getline
returns 1 on success, 0 on the end of a file, and -1 on an error.
Functions
String Functions
Arithmetic Functions
= += -= *= /= %= ^= assignment
?: conditional expression
|| logical or
&& logical and
~ !~ regular expression match, negated
match
< <= > >= != == relationals
blank string concatenation
+ - add, subtract
* / % multiply, divide, modulus
+ - ! unary plus, unary minus, logical
negation
^ exponentiation
++ -- increment, decrement
$ field
c matches the non-metacharacter c
\c matches literal character c
. matches any character except newline
^ matches beginning of line or string
$ matches end of line or string
[abc...] character class matches any of abc...
[^abc...] negated class matches any but abc... and newline
r1 | r2 matches either r1 or r2
r1r2 concatenation: matches r1, then r2
r+ matches one or more r's
r* matches zero or more r's
r? matches zero or one r
(r) grouping: matches r
Built-In Variables
Limits
Each implementation of awk imposes some limits. Below are typical limits
100 fields
2500 characters per input line
2500 characters per output line
1024 characters per individual field
1024 characters per printf string
400 characters maximum quoted string
400 characters in character class
15 open files
1 pipe
sleep
This command suspends execution for a specified time interval, expressed in
seconds.
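A quick sketch of sleep in a script, confirming that the requested time actually passed:

```shell
# Suspend execution for 2 seconds, measuring the pause with date +%s
# (seconds since the epoch).
start=$(date +%s)
sleep 2
end=$(date +%s)
elapsed=$((end - start))
echo "slept for about $elapsed seconds"
```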
UNIT – IV
Process related commands (ps, top, pstree, nice, renice), Introduction to the
Linux kernel, getting started with the kernel (obtaining the kernel source,
installing the kernel source, using patches, the kernel source tree, building
the kernel), process management (process descriptor and the task structure,
allocating the process descriptor, storing the process descriptor, process
state, manipulating the current process state, process context, the process
family tree, the Linux scheduling algorithm, overview of system calls),
introduction to kernel debuggers (in Windows and Linux)
Introduction to UNIX:
Lecture Four
4.1 Objectives
This lecture covers:
4.3 Pipes
The pipe ('|') operator is used to create concurrently executing processes that
pass data directly to one another. It is useful for combining system utilities
to perform more complex functions. For example:
$ who | sort | uniq
creates three processes (corresponding to who, sort and uniq) which execute
concurrently. As they execute, the output of the who process is passed on to
the sort process, which is in turn passed on to the uniq process. uniq displays
its output on the screen (a sorted list of users with duplicate lines removed).
Similarly:
The output from programs is usually written to the screen, while their input
usually comes from the keyboard (if no file arguments are given). In technical
terms, we say that processes usually write to standard output (the screen) and
take their input from standard input (the keyboard). There is in fact another
output channel called standard error, where processes write their error
messages; by default error messages are also sent to the screen.
To redirect standard output to a file instead of the screen, we use the > operator:
$ echo hello
hello
$ echo hello > output
$ cat output
hello
In this case, the contents of the file output will be destroyed if the file
already exists. If instead we want to append the output of the echo command to
the file, we can use the >> operator:
$ echo bye >> output
To redirect standard error to a file, we use the 2> operator:
$ cat nonexistent 2>errors
$ cat errors
cat: nonexistent: No such file or directory
$
You can redirect standard error and standard output to two different files
(e.g. 1>output 2>errors). To redirect standard input from a file instead of the
keyboard, we use the < operator:
$ cat < output
hello
bye
You can combine input redirection with output redirection, but be careful not
to use the same filename in both places. For example:
$ cat < output > output
will destroy the contents of the file output. This is because the first thing
the shell does when it sees the > operator is to create an empty file ready for
the output.
One last point to note is that many system utilities which expect filenames
accept "-" to mean standard input:
$ cat archive.tar.gz | gzip -d - | tar tvf -
Here the output of the gzip -d command is used as the input file to the tar
command.
Most shells provide sophisticated job control facilities that let you control many
running jobs (i.e. processes) at the same time. This is useful if, for example, you
are editing a text file and want to interrupt your editing to do something else.
With job control, you can suspend the editor, go back to the shell prompt, and
start work on something else. When you are finished, you can switch back to the
editor and continue as if you hadn't left.
Jobs can either be in the foreground or the background. There can be only one job
in the foreground at any time. The foreground job has control of the shell with
which you interact - it receives input from the keyboard and sends output to the
screen. Jobs in the background do not receive input from the terminal, generally
running along quietly without the need for interaction (and drawing it to your
attention if they do).
The foreground job may be suspended, i.e. temporarily stopped, by pressing the
Ctrl-Z key. A suspended job can be made to continue running in the foreground or
background as needed by typing "fg" or "bg" respectively. Note that suspending a
job is very different from interrupting a job (by pressing the interrupt key, usually
Ctrl-C); interrupted jobs are killed off permanently and cannot be resumed.
Background jobs can also be run directly from the command line, by appending a
'&' character to the command line. For example:
$ find / -print 1>output 2>errors &
[1] 27501
Here the [1] returned by the shell represents the job number of the background
process, and the 27501 is the PID of the process. To see a list of all the jobs
associated with the current shell, type jobs:
$ jobs
[1]+ Running find / -print 1>output 2>errors &
$
Note that if you have more than one job you can refer to the job as %n where n is
the job number. So for example fg %3 resumes job number 3 in the foreground.
To find out the process IDs of the underlying processes associated with the
shell and its jobs, use ps (process status):
$ ps
PID TTY TIME CMD
17717 pts/10 00:00:00 bash
27501 pts/10 00:00:01 find
27502 pts/10 00:00:00 ps
So here the PID of the shell (bash) is 17717, the PID of find is 27501 and the PID of
ps is 27502.
To terminate a process or job abruptly, use the kill command. kill allows jobs
to be referred to in two ways - by their PID or by their job number. So
$ kill %1
or
$ kill 27501
would terminate the find process. Actually kill only sends the process a signal
requesting that it shut down and exit gracefully (the SIGTERM signal), so this
may not always work. To force a process to terminate abruptly (and with a
higher probability of success), use a -9 option (the SIGKILL signal):
$ kill -9 27501
kill can be used to send many other types of signals to running processes. For
example a -19 option (SIGSTOP) will suspend a running process. To see a list of
such signals, run kill -l.
$ sleep 200&
This will run the sleep program in the background for 200 seconds.
$ pstree -p
You may even see the sleep process listed if it is still running in the background.
You can also use ps to show all processes running on the machine (not just the
processes in your current shell):
$ ps -e
Many UNIX versions have a system utility called top that provides an interactive
way to monitor system activity. Detailed statistics about currently running
processes are displayed and constantly refreshed. Processes are displayed in
order of CPU utilization. Useful keys in top are:
s - set update frequency
k - kill process (by PID)
u - display processes of one user
q - quit
One other useful process control utility that can be found on most UNIX systems
is the pkill command. You can use pkill to kill processes by name instead of PID or
job number. So another way to kill off our background find process (along with
any another find processes we are running) would be:
$ pkill find
[1]+ Terminated find / -print 1>output 2>errors
$
Note that, for obvious security reasons, you can only kill processes that belong to
you (unless you are the superuser).
The Process
Like living beings, processes are born, can give birth to other processes, and
die.
Basics of Processes:
Like files, processes have attributes. The attributes of every process are
maintained by the kernel in memory in a separate structure called the process
table; the process table is to processes what the inode is to files. Two
important attributes of a process are:
The PID (process-id) - a unique number by which the kernel identifies the
process.
The PPID (parent process-id) - the PID of the process's parent.
The other attributes also get inherited by the child from its parent.
$ echo $$
The PID
$ cat emp.lst
This process is started by the shell process. It remains active till the
command completes. The shell (sh, ksh, bash or csh) is said to be the parent of
cat. Every process has a parent; no process can be created without one.
The command:
sets up two processes for the two commands. UNIX's multitasking nature permits
a process to generate (or spawn) one or more children.
Process status (ps):
The ps command is used to display some process attributes. ps can be seen as
the process counterpart of the file system's ls command. The command reads
through the kernel's data structures and process tables to fetch the
characteristics of processes.
ps displays the processes associated with a user at the terminal; execute the
command immediately after logging in and you will typically see just your login
shell and ps itself.
ps Options:
Full Listing(-f):
To get a detailed listing which also shows the parent of every
process, use the –f(full) options
$ ps -f
$ ps -u shan
The -a (all) option lists the processes of all users but doesn't display the
system processes.
$ ps -a
To include the system processes as well, use the -e option:
$ ps -e
This displays the list of all processes along with the controlling terminal
(TTY), time and command (CMD) running on your system.
System processes are easily identified by a ? in the TTY column. These
processes are known as daemons because they run without a specific request from
a user. Many of these daemons are actually sleeping and wake up only when they
receive input.
Process creation has three distinct phases and uses three important system
calls:
Fork
Exec
Wait
Fork:
A process in Unix is created with the fork system call, which creates a copy
of the process that invokes it.
The process image is practically identical to that of the calling
process,except for a few parameters like PID.
When a process is forked in this way, the child gets a new PID.
The forking mechanism is responsible for the multiplication of processes in
the system.
Exec:
The child then overwrites the image that it has just created with the copy of
the program that has to be executed.
This is done with a system call belonging to the exec family, and the parent
is said to exec this process.
No additional process is created here; the existing program is simply
replaced with the new program. This process has the same PID as the child that
was just forked.
Wait:
The parent then executes the wait system call to wait for the child process
to complete.
It picks the exit status of the child and then continues with its other
functions.
A parent may, however, decide not to wait for the death of its child.
In brief when you run a command from the shell, the shell first forks another shell
process.
The newly forked shell then overlays itself with the executable image of the
command(say cat), which then starts to run.
The parent (shell) waits for the command (say, cat) to terminate and then picks
up the exit status of the child.
This is the number returned by the child to the kernel, and has great significance
in both shell programming and systems programming.
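The fork-exec-wait cycle can be watched from the shell itself. A minimal sketch (sleep stands in for any command):

```shell
# & makes the shell fork a child; the child then execs the sleep program.
sleep 1 &
pid=$!            # PID of the forked child, as reported by the shell
wait "$pid"       # the parent waits for the death of its child ...
status=$?         # ... and picks up the child's exit status
echo "child $pid exited with status $status"
```

An exit status of 0 indicates that the child terminated normally.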
When a process is forked, the child has a different PID and PPID from its
parent. However, it inherits most of the environment of its parent. The
important attributes that are inherited include the real UID and GID, the
current directory, the file descriptors, and the environment variables.
When the system moves to multiuser mode, init forks and execs a getty for every
active communication port(or line).
Each one of these gettys prints the login prompt on the respective terminal and
then goes off to sleep.
When a user attempts to log in, getty wakes up and fork-execs the login program
to verify the login name and the password entered.
On successful login, login forks-execs the process representing the login shell.
init goes off to sleep, waiting for the death of its children. The other
processes, getty and login, have extinguished themselves by overlaying. When
the user logs out, her shell is killed, and the death is intimated to init.
init then wakes up and spawns another getty for that line to monitor the next
login.
From the process point of view , the shell recognizes three types of commands:
External Commands
Shell Scripts
Internal Commands
External Commands:
The most commonly used ones are the UNIX utilities and programs like cat, ls
etc. The shell creates a process for each of these commands that it executes,
while remaining their parent.
Shell Scripts:
The shell executes these scripts by spawning another shell, which then executes
the commands listed in the script. The child shell becomes the parent of the
commands that feature in the script.
Internal Commands:
A multitasking system lets a user do more than one job at a time. Since there
can be only one job in the foreground, the rest of the jobs have to run in the
background. There are two ways of doing this:
With the shell's & operator
The nohup command
The shell immediately returns a number - the PID of the invoked command. The
prompt is returned and the shell is ready to accept another command even though
the previous command has not terminated yet. The shell, however, remains the
parent of the background process.
Using an &, you can run as many jobs in the background as the system load
permits.
Depending on the shell you are using ,the standard output and the standard error
of a job running in the background may or may not come to the terminal.
Background execution of a jobs is a useful feature that you should utilize to
relegate time-consuming or low-priority jobs to the background, and the run the
important ones in the foreground.
Background jobs cease to run when a user logs out (the C shell and Bash
excepted). That happens because her shell is killed. And when the parent is
killed, its children are also normally killed.
The UNIX system permits variation in this default behaviour. The nohup (no
hangup) command, when prefixed to a command, permits execution of the process
even after the user has logged out. You must use the & with it as well:
nohup sort emp.lst &
Note: Using the ps command after using nohup from another terminal few
important things are to be noticed like:
The shell died on logging out but its child (sort) didn't; it turned into an
orphan. The kernel handles such situations by reassigning the PPID of the
orphan to the system's init process (PID 1) - the parent of all shells. When
the user logs out, init takes over the parentage of any process run with nohup.
In this way you can kill a parent without killing its child.
If you run more than one command in a pipeline, then you should use the nohup
command at the beginning of each command in the pipeline:
To run a job with a low priority, the command name should be prefixed with nice:
nice wc –l manual
Or
nice wc –l manual &
nice is a built-in command in the C shell. nice values are system-dependent and
typically range from 1 to 19. Commands execute with a nice value that is
generally in the middle of the range - usually 10. A higher nice value implies
a lower priority. nice reduces the priority of any process, thereby raising its
nice value. You can also specify the nice value explicitly with the -n option:
nice -n 5 wc -l manual &
This raises the nice value (and so lowers the priority) of the command by 5.
As the same signal number may represent two different signals on two different
machines, it is better to represent signals by their names, which have the SIG
prefix.
If a program is running longer than you anticipated and you want to terminate
it, you normally press the interrupt key. This sends the process the SIGINT (2)
signal.
There is one signal that a process cannot ignore or handle with user-defined
code: the SIGKILL (9) signal, which terminates a process immediately.
kill 111
Here 111 is PID of a process
To facilitate premature termination, the & operator displays the PID of the
process that runs in the background. If you don't remember the PID, use the ps
command to find it and then kill the process.
If you want to kill more than one job, either in the background or in different
windows, all can be killed using a single kill statement with their respective
PIDs.
In most shells, the system variable $! stores the PID of the last background
job - the same number seen when the & is affixed to a command. So you can kill
the last background process without using the ps command to find out its PID:
$ kill $!
By default, kill uses the SIGTERM signal (15) to terminate the process.
Processes can also be killed with the SIGKILL signal (9). This signal can't be
generated at the press of a key, so you must use kill with the signal name
(without the SIG prefix) following the -s option:
kill -s KILL 121
A simple kill command won't kill the login shell. You can kill your login shell
with kill -9 $$ ($$ being the PID of the current shell).
kill -l
is used to view all signal names.
JOB CONTROL:
The & at the end of the line indicates that the job is now running in the
background. Any number of foreground jobs can go to the background, first with
[Ctrl+z] and then with the bg command.
Each job comprises one process; the list can be viewed using the jobs command.
Any background job can be brought to the foreground using the fg command.
Both the fg and bg commands can be used with the job number, job name or a
string as argument, prefixed with the % symbol.
e.g.
fg %1
fg %sort
bg %2
bg %?perm Sends to the background job containing string perm.
Start [Ctrl+q]
Stop [Ctrl+s]
Susp [Ctrl+z]
Dsusp [Ctrl+y]
at: One Job At A Time:
at takes as argument the time the job is to be executed and displays the at>
prompt. Input has to be supplied from the standard input:
at 14:08
at> abc.sh
[Ctrl+d]
at doesn't indicate the name of the script to be executed; that is something
the user has to remember. A user may prefer to redirect the output of the
command itself:
at 15:08
at> script1.sh > resu.lst
The month name and day of week can also be used, but remember they must be
either fully spelled out or abbreviated to three letters.
Jobs can be listed with at –l and removed with at –r.
Unfortunately, there's no way you can find out the name of the program
scheduled to be executed. This can create problems, especially when you are
unable to recall whether a specific job has actually been scheduled for later
execution.
batch:
This command doesn't take any arguments but uses an internal algorithm to
determine the execution time. This prevents too many CPU – hungry jobs from
running at the same time. The response of batch is similar to at .
Any job scheduled with batch goes to a special at queue, and can also be removed
with at –r.
The ps -e command always shows the cron daemon running. This is the UNIX
system's chronograph, ticking away every minute. Unlike at and batch, which are
meant for one-time execution, cron executes programs at regular intervals. It
is mostly dormant, but every minute it wakes up and looks in a control file
(the crontab file) in /var/spool/cron/crontabs for instructions to be performed
at that instant. After executing them, it goes to sleep, only to wake up the
next minute.
The first field (legal values 00 to 59) specifies the number of minutes after the
hour when the command is to be executed .
The range 00 to 10 schedules execution every minute in the first 10 minutes of
the hour.
The second field (17, i.e. 5 p.m.) indicates the hour in 24-hour format for
scheduling (legal values 0 to 23).
The third field(legal values 1 to 31) controls the day of the month. This field (here
an *) read with the other two, implies that the command is to be executed every
minute , for the first 10 minutes, starting at 5 p.m. every day.
The fourth field (3, 6, 9, and 12) specifies the month (legal values 1 to 12)
The fifth field (5, i.e. Friday) indicates the day of the week (legal values
0 to 6), Sunday having the value 0.
Now here the find command will thus be executed every minute in the first 10
minutes after 5 p.m. every Friday of the months March, June, September and
December of every Year.
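Putting the five fields described above together, the crontab entry would look something like the following sketch (only the five time fields come from the text; the find command's arguments are illustrative):

```
00-10 17 * 3,6,9,12 5 find / -newer .last_time -print
```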
You need to use the crontab command to place the file in the directory
containing crontab files for cron to read:
crontab cron.txt
To see the contents, crontab -l is used, and to remove them, crontab -r is
used.
The time command accepts the entire command line to be timed as its argument,
executes the program, and displays the time usage on the terminal.
This enables programmers to tune their programs to keep CPU usage at an optimum
level.
To find out the time taken to perform a sorting operation, precede the sort
command with time.
a) real: Clock elapsed time from the invocation of the command till its
termination.
b) user: Time spent by the program in executing itself, i.e. in user mode.
c) sys: Indicates the time used by the kernel in doing work on behalf of the
user process.
The sum of the user time and the sys time actually represents the CPU time.
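For example (the file name is illustrative; time's report is written to the terminal's standard error, separate from sort's output):

```shell
# Create a small file and time the sort; the real/user/sys report
# appears on standard error while the sorted data goes to sorted.lst.
printf 'beta\nalpha\ngamma\n' > names.lst
time sort names.lst > sorted.lst
```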
Using stty
This sets certain terminal I/O options for the device associated with stdin.
When typed without arguments, it displays the current settings of these
options. The format of the command is
stty [-a] [-g] [options]
The options that can be set include:
parity
character size
baud rate
stop bits
flow control (cts, xon-xoff)
upper/lowercase translation
cr/cr-lf translation, delays after cr/cr-lf, formfeed, tabs
echo, full/half-duplex
A - sign preceding an argument turns that option off. Some examples are,
stty -echo ;suppress echoing of input
stty erase # ;set erase character to #
stty quit @ ;set SIGQUIT to @ (initially ctrl-\)
Using tput
This command uses the terminfo database to make the values of terminal
dependent attributes available to the shell.
bold=`tput smso`
norm=`tput rmso`
echo "${bold}Hello there${norm}"
The last example assigns the codes for bold-on and bold-off to shell variables.
These are then used to highlight a text string.
Using tset
This command is used to specify the erase and kill characters for the terminal, as
well as specifying the terminal type and exporting terminal information for use in
the shell environment.
This command is used at login time (.profile) to determine the terminal type.
Examples of commands are,
Introduction to UNIX:
Lecture Five
5.1 Shells and Shell Scripts
A shell is a program which reads and executes commands for the user. Shells also
usually provide features such as job control, input and output redirection and a
command language for writing shell scripts. A shell script is simply an ordinary
text file containing a series of commands in a shell command language (just like a
"batch file" under MS-DOS).
When you log on to a Unix machine, you first see a prompt. This prompt remains
there till you key in something. Even though it may appear that the system is
idling, a Unix command is in fact running at the terminal. This command runs
all the time and never terminates unless you log out. This command is the
shell.
Even though the shell appears not to be doing anything meaningful when there is
no activity at the terminal, it swings into action the moment you key in
something from the keyboard. You can see that the shell is running using the ps
command.
The Bash shell runs at the terminal /dev/pts/2. The following are the
activities performed by the shell in its interpretive cycle:
The shell issues the prompt and waits for you to enter a command.
After a command is entered, the shell scans the command line for metacharacters
and expands abbreviations (like the * in rm *) to create a simplified command
line.
After command execution is complete, the prompt reappears and the shell returns
to its waiting role to start the next cycle. You are now free to enter another
command.
You can change this behaviour and instruct the shell not to wait, so you can
fire one job after another without waiting for the previous one to complete,
i.e. background processing using the metacharacter &.
There are three standard files, actually streams of characters, which many
commands see as input and output. A stream is simply a sequence of bytes. When
a user logs in, the shell makes available three files representing three
streams. Each stream is associated with a default device:
Standard Input - The file representing input, which is connected to the
keyboard.
Standard Output - The file representing output, which is connected to the
display.
Standard Error - The file representing error messages that emanate from the
command or shell, also connected to the display.
cat and wc are commands which, when used without arguments, read the file
representing standard input. The file is indeed special; it can represent three
input sources:
the keyboard, the default source;
a file, using redirection with the < symbol;
another program, using a pipeline.
$ wc < sample.txt
When a command takes input from multiple sources - say a file and standard
input - the - symbol must be used to indicate the sequence of taking input:
$ cat - foo first from standard input and then from foo
$ cat foo - bar first from foo, then standard input and then bar
Standard Output:
All commands displaying output on the terminal actually write to the standard
output file as a stream of characters and not directly to the terminal. The three
possible destinations are:
$wc sample.txt>newfile
$cat newfile
If the output file doesn’t exists, the shell creates it before executing the command
else overwrites it so to overcome this problem we can use append operator>>.
Standard Error :
To handle the error stream, file descriptors are used. A file descriptor is the
number used to represent each of the three files:
Standard Input - 0
Standard Output - 1
Standard Error - 2
When you enter an incorrect command or try to open a nonexistent file, an error
message is displayed on the screen. This is the default destination, i.e. the
terminal. The error message can be redirected using the descriptor 2, as in:
$ cat nonexistent 2> errorfile
/dev/null is a special file that simply accepts any stream without growing in
size. If you check the file size, it will be zero. /dev/null simply incinerates
all output written to it. No matter whether you direct or append output to this
file, its size always remains zero. It is actually a pseudo-device which is
useful in redirecting error messages away from the terminal.
/dev/tty: The second special file in the UNIX system is the one indicating
one's terminal - /dev/tty. This is not the file that represents standard output
or standard error. Commands generally don't write to this file, but you will
require it for redirecting some statements in shell scripts.
$who >/dev/tty
If two users issue the same command, both get the output on their own
terminals, say /dev/pts/1 and /dev/pts/2.
Redirecting a script's output implies redirecting the output of all statements
in the script. However, echo statements in the script that write to /dev/tty
are not affected when the script's output is redirected.
When you first login to a shell, your shell runs a systemwide start-up script
(usually called /etc/profile under sh, bash and ksh and /etc/.login under csh). It
then looks in your home directory and runs your personal start-up script (.profile
under sh, bash and ksh and .cshrc under csh and tcsh). Your personal start-up
script is therefore usually a good place to set up environment variables such as
PATH, EDITOR etc. For example with bash, to add the directory ~/bin to your
PATH, you can include the line:
export PATH=$PATH:~/bin
in your .profile. If you subsequently modify your .profile and you wish to import
the changes into your current shell, type:
$ source .profile
or
$ . ./.profile
The source command is built into the shell. It ensures that changes to the
environment made in .profile affect the current shell, and not the shell that would
otherwise be created to execute the .profile script.
With csh, to add the directory ~/bin to your PATH, you can include the line:
set path = ( $path $HOME/bin )
in your .cshrc.
Command Substitution
Command Substitution is a very handy feature of the bash shell. It enables you to
take the output of a command and treat it as though it was written on the
command line. For example, if you want to set the variable X to the output of a
command, the way you do this is via command substitution.
There are two forms of command substitution:
$( ) expansion
backtick expansion
An example is given:
#!/bin/bash
files="$(ls)"
web_files=`ls public_html`
echo $files
echo $web_files
X=`expr 3 \* 2 + 4` # expr evaluates arithmetic expressions; man expr for details.
echo $X
Note that even though the output of ls contains newlines, none appear in the
script's output: echoing a variable unquoted makes the shell split its value into
words and print them separated by single spaces. The advantage of the $( )
substitution method is almost self-evident: it is very easy to nest, and it is
supported by most Bourne shell variants. However, backtick substitution is slightly
more readable, and is supported by even the most basic shells.
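For example, $( ) forms nest without any escaping, while the backtick form needs backslashes for the inner pair (a sketch):

```shell
# Parent of the current directory, computed with nested $( ):
parent="$(dirname "$(pwd)")"
echo "$parent"
# The same with backticks: the inner pair must be escaped.
parent2=`dirname \`pwd\``
echo "$parent2"
```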
Single Quotes versus double quotes
Basically, variable names are expanded within double quotes, but not within single
quotes. If you do not need to refer to variables, single quotes are good to use, as
the results are more predictable.
An example
#!/bin/bash
echo -n '$USER='      # -n option stops echo from breaking the line
echo "$USER"
echo "\$USER=$USER" # this does the same thing as the first two lines
The output looks like this (assuming your username is elflord)
$USER=elflord
$USER=elflord
so double quotes can still reproduce the single-quote behaviour by escaping the $.
Double quotes are more flexible, but less predictable; given the choice between
single quotes and double quotes, use single quotes.
It is a good idea to protect variable references with double quotes. This is usually
most important if your variable's value either (a) contains spaces or (b) is the
empty string. Consider:
#!/bin/bash
X=""
if [ -n $X ]; then # -n tests to see if the argument is non empty
echo "the variable X is not the empty string"
fi
This script prints its message even though X is empty: the unquoted $X expands to
nothing, so the expression becomes [ -n ], which is true. Quoting the variable
fixes it:
#!/bin/bash
X=""
if [ -n "$X" ]; then # -n tests to see if the argument is non empty
echo "the variable X is not the empty string"
fi
In this version the expression expands to [ -n "" ], which returns false, since the
string enclosed in quotes is clearly empty.
#!/bin/bash
LS="ls"
LS_FLAGS="-al"
$LS $LS_FLAGS /home/elflord
This script executes the command
ls -al /home/elflord
(assuming that /home/elflord is your home directory). That is, the shell simply
replaces the variables with their values, and then executes the command.
Suppose you want to echo the value of the variable X, followed immediately by
the letters "abc". Let's have a try :
#!/bin/bash
X=ABC
echo "$Xabc"
This gives no output. The reason is that the shell thought we were asking for the
variable Xabc, which is uninitialised. The way to deal with this is to put braces
around X to separate it from the other characters. The following gives the desired
result:
#!/bin/bash
X=ABC
echo "${X}abc"
Any UNIX command or program can be executed from a shell script just as you
would execute it on the command line. You can also capture the output of a
command and assign it to a variable by using backquotes ` `:
#!/bin/sh
lines=`wc -l $1`
echo "the file $1 has $lines lines"
This script outputs the number of lines in the file passed as the first parameter.
Arithmetic operations
The Bourne shell doesn't have any built-in ability to evaluate simple mathematical
expressions. Fortunately the UNIX expr command is available to do this. It is
frequently used in shell scripts with backquotes to update the value of a
variable.
expr supports the operators +, -, *, /, % (remainder), <, <=, =, !=, >=, >, | (or) and
& (and).
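For example, a counter can be updated by capturing expr's output in backquotes (a minimal sketch):

```shell
#!/bin/sh
i=0
i=`expr $i + 1`        # i is now 1
i=`expr $i + 1`        # i is now 2
echo "i = $i"
echo `expr 17 % 5`     # remainder: prints 2
echo `expr 3 \* 4`     # * must be escaped from the shell: prints 12
```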
8.3 Shell Programming / Shell Scripts
Shell scripts are executed in a separate child shell process and this sub-shell need
not be the same as your login shell.
E.g.:
#!/bin/sh
# script.sh: a sample shell script
echo "Today's date: `date`"
echo "This month's calendar:"
cal `date "+%m 20%y"`
echo "My shell: $SHELL"
To run the script, make it executable first and then invoke it by name, or pass it
to sh directly:
$ chmod +x script.sh
$ script.sh
$ sh script.sh
Making Scripts Interactive (read)
The read statement is the shell's internal tool for taking input from the user, i.e.
making a script interactive. It is used with one or more variables. Input supplied
through the standard input is read into these variables.
read name
The script pauses at this point to take input from the keyboard. Whatever you
enter will be stored in the variable name.
#!/bin/sh
#emp1.sh
#
echo "Enter the pattern to be searched: \c"
read pname
echo "Enter the file to be used: \c"
read flname
echo "Searching for $pname from $flname"
grep "$pname" $flname
echo "Selected records shown above"
A shell lets you define variables (like most programming languages). A variable is a
piece of data that is given a name. Once you have assigned a value to a variable,
you access its value by prepending a $ to the name:
$ bob='hello world'
$ echo $bob
hello world
$
Variables created within a shell are local to that shell, so only that shell can access
them. The set command will show you a list of all variables currently defined in a
shell. If you wish a variable to be accessible to commands outside the shell, you
can export it into the environment:
$ export bob
(under csh you used setenv). The environment is the set of variables that are
made available to commands (including shells) when they are executed. UNIX
commands and programs can read the values of environment variables, and
adjust their behaviour accordingly. For example, the environment variable PAGER
is used by the man command (and others) to determine what command should be used
to display multiple pages. If you say:
$ export PAGER=cat
and then try the man command (say man pwd), the page will go flying past
without stopping. If you now say:
$ export PAGER=more
normal service should be resumed (since now more will be used to display the
pages one at a time). Another environment variable that is commonly used is the
EDITOR variable, which specifies the default editor to use (so you can set this to vi
or emacs or whichever other editor you prefer). To find out which environment
variables are used by a particular command, consult the man pages for that
command.
Another interesting environment variable is PS1, the main shell prompt string,
which you can use to create your own custom prompt.
The shell often incorporates efficient mechanisms for specifying common parts of
the shell prompt (e.g. in bash you can use \h for the current host, \w for the
current working directory, \d for the date, \t for the time, \u for the current user
and so on - see the bash man page).
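As a sketch, in bash you might set (the exact prompt string is purely a matter of taste):

```shell
# user@host:directory$ — built from the bash escape sequences above
PS1='\u@\h:\w\$ '
# A plain static string also works in any Bourne-style shell:
PS1='ready> '
```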
Another important environment variable is PATH, the colon-separated list of
directories the shell searches for commands. For example, if your PATH was set to
/bin:/usr/bin:/usr/local/bin:.
and you typed ls, the shell would look for /bin/ls, /usr/bin/ls etc. Note that this
PATH contains '.', i.e. the current working directory. This allows you to create a
shell script or program and run it as a command from your current directory
without having to explicitly say "./filename".
Note that PATH has nothing to do with filenames that are specified as arguments to
commands (e.g. cat myfile.txt would only look for ./myfile.txt, not
for /bin/myfile.txt, /usr/bin/myfile.txt etc.).
Shell scripts can also accept arguments from the command line. This way they
can be run non-interactively and be used with redirection and pipelines. When
arguments are specified with a shell script, they are assigned to certain special
variables called positional parameters.
The first argument is read into the parameter $1, the second argument into $2,
and so on. A few more special parameters: $0 holds the name of the script itself,
$# the number of arguments, $* all the arguments, and $$ the process number of
the current shell.
e.g.
cmd1 && cmd2
cmd1 || cmd2
The && delimits two commands; the command cmd2 is executed only when cmd1
succeeds. The || operator plays the inverse role: the second command is executed
only when the first fails. If you grep a pattern from a file without success, you
can notify the failure.
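A sketch of both operators with grep (the sample file emp.lst and its contents are invented for the demonstration):

```shell
#!/bin/sh
# Create a small sample file for the demo.
printf 'alice director\nbob manager\n' > emp.lst
# &&: the echo runs only because grep finds the pattern.
grep "director" emp.lst && echo "pattern found in emp.lst"
# ||: the echo runs only because grep fails this time.
grep "chairman" emp.lst || echo "chairman not found in emp.lst"
```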
#!/bin/sh
# this is a comment
echo "The number of arguments is $#"
echo "The arguments are $*"
echo "The first is $1"
echo "My process number is $$"
echo "Enter a number from the keyboard: "
read number
echo "The number you entered was $number"
The shell script begins with the line "#!/bin/sh" . Usually "#" denotes the start of a
comment, but #! is a special combination that tells UNIX to use the Bourne shell
(sh) to interpret this script. The #! must be the first two characters of the script.
The arguments passed to the script can be accessed through $1, $2, $3 etc. $*
stands for all the arguments, and $# for the number of arguments. The process
number of the shell executing the script is given by $$. The read number
statement assigns keyboard input to the variable number.
To execute this script, we first have to make the script file (here called simple) executable:
$ ls -l simple
-rw-r--r-- 1 will finance 175 Dec 13 simple
$ chmod +x simple
$ ls -l simple
We can use input and output redirection in the normal way with scripts;
redirecting standard input from a file containing a number would produce similar
output but would not pause to read a number from the keyboard.
if-then-else statements
Shell scripts are able to perform simple conditional branches:
if [ test ]
then
commands-if-test-is-true
else
commands-if-test-is-false
fi
When you use if to evaluate expressions, you require the test statement because
the true and false values returned by expressions can't be handled by if directly.
test uses certain operators to evaluate the condition on its right and returns
either a true or false exit status, which is then used by if for making decisions.
test works in three ways:
Numeric Comparison:
The numerical comparison operators used by test have a form different from
what you may have seen anywhere else. They always begin with a - (hyphen),
followed by a two-character word, and are enclosed on either side by whitespace.
Operator   Meaning
-eq        equal to
-ne        not equal to
-gt        greater than
-ge        greater than or equal to
-lt        less than
-le        less than or equal to
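A short sketch using two of the operators (the values are arbitrary):

```shell
#!/bin/sh
x=5
y=10
if [ $x -lt $y ]; then
    echo "$x is less than $y"
fi
if test $x -ne $y; then      # test and [ are the same command
    echo "$x and $y differ"
fi
```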
File Tests
test can be used to test various file attributes like type (file, directory
or symbolic link) or permissions (read, write, execute, SUID).
The test condition may involve file characteristics or simple string or numerical
comparisons. The [ used here is actually the name of a command (/bin/[) which
performs the evaluation of the test condition. Therefore there must be spaces
before and after it, as well as before the closing bracket. Some common test
conditions are:
Test              True if
-s file           true if file exists and is not empty
-f file           true if file is an ordinary file
-d file           true if file is a directory
-r file           true if file is readable
-w file           true if file is writable
-x file           true if file is executable
$X -eq $Y         true if X equals Y
$X -ne $Y         true if X not equal to Y
$X -lt $Y         true if X less than Y
$X -gt $Y         true if X greater than Y
$X -le $Y         true if X less than or equal to Y
$X -ge $Y         true if X greater than or equal to Y
"$A" = "$B"       true if string A equals string B
"$A" != "$B"      true if string A not equal to string B
$X ! -gt $Y       true if X is not greater than Y
$E -a $F          true if expressions E and F are both true
$E -o $F          true if either expression E or expression F is true
String Comparisons:
test can be used to compare strings with yet another set of operators. Equality is
tested with = and inequality with !=.
e.g.
#!/bin/sh
# Checks user input for null values
#
if [ $# -eq 0 ]; then
echo "Enter the string to be searched: \c"
read pname
if [ -z "$pname" ]; then
echo "You have not entered the string"
exit 1
fi
fi
e.g. 2
#!/bin/sh
# Using test, $0 and $# in an if-elif-else
#
if test $# -eq 0; then
echo "Usage: $0 pattern file" >/dev/tty
elif test $# -eq 2; then
grep "$1" $2 || echo "$1 not found in $2" >/dev/tty
else
echo "You didn't enter two arguments" >/dev/tty
fi
for loops
The following script sorts each text file in the current directory:
#!/bin/sh
for f in *.txt
do
echo sorting file $f
cat $f | sort > $f.sorted
echo sorted file has been output to $f.sorted
done
while loops
while [ test ]
do
statements (to be executed while test is true)
done
The following script waits until a non-empty file input.txt has been created:
#!/bin/sh
while [ ! -s input.txt ]
do
echo waiting...
sleep 5
done
echo input.txt is ready
You can abort a shell script at any point using the exit statement, so the following
script is equivalent:
#!/bin/sh
while true
do
if [ -s input.txt ]
then
echo input.txt is ready
exit
fi
echo waiting...
sleep 5
done
case statements
case statements are a convenient way to perform multiway branches where one
input pattern must be compared to several alternatives:
case variable in
pattern1)
statement (executed if variable matches pattern1)
;;
pattern2)
statement
;;
etc.
esac
The following script uses a case statement to have a guess at the type of non-
directory, non-executable files passed as arguments, on the basis of their
extensions (note how the "or" operator | can be used to denote multiple
patterns, how "*" has been used as a catch-all, and the effect of the
backquotes `):
#!/bin/sh
for f in $*
do
if [ -f $f -a ! -x $f ]
then
case $f in
core)
echo "$f: a core dump file"
;;
*.c)
echo "$f: a C program"
;;
*.cpp|*.cc|*.cxx)
echo "$f: a C++ program"
;;
*.txt)
echo "$f: a text file"
;;
*.pl)
echo "$f: a PERL script"
;;
*.html|*.htm)
echo "$f: a web document"
;;
*)
echo "$f: appears to be "`file -b $f`
;;
esac
fi
done