Unix Mod3
❍ Functions
❍ Aliases
● Shell Scripts
● Printing
● Remote Access
● Regular Expressions
● The Stream Editor sed
● Some Other Utilities
● Using awk Instead of cut
● Exercise 1
● Exercise 2
❍ When the illegal form of who is piped to wc, the two-line error message is displayed.
❍ The next line is the count reported by wc -l, which was zero since who did not produce any regular output but
only generated an end of file.
❍ Clearly, the error message was not piped to wc -l since we can see it.
❍ Error messages are output to a special output stream called stderr.
● There are numbers associated with the three input/output streams:
Stream   Number
stdin    0
stdout   1
stderr   2
● Just as the greater-than symbol > directs stdout to a file, you can direct stderr to a file with 2>:
[you@faraday you]$ who -x 2> errfile | wc -l
0
[you@faraday you]$ cat errfile
who: invalid option -- x
Try `who --help' for more information.
[you@faraday you]$ _
❍ This turns out to be amazingly useful. Old timers were really happy when this was added to the shell.
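❍ Since stdout and stderr are separate streams, you can redirect each to its own file in the same command. Here is a minimal sketch reusing the who -x example; the file name outfile is only illustrative:
[you@faraday you]$ who -x > outfile 2> errfile
[you@faraday you]$ _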
● To run a command that takes a long time to finish without tying up your terminal window, put it in the background by
ending the command with an ampersand &. The command will give you some information about the program you are
running and give you back your shell prompt.
[you@faraday you]$ long_running_command &
[1] 12345
[you@faraday you]$ _
❍ This is also the way to invoke a program, such as netscape or emacs in an X-window environment, that
uses its own window.
■ X-windows is discussed in Module 4.
❍ The number in the square brackets is the job number assigned by the shell
❍ The other number is the process identification number (pid) assigned by the kernel.
❍ If the program produces output, you will want to direct it to a file with >.
❍ If the program requires input, you will want to prepare it in a file and feed it to the program with <, as in the sketch below.
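❍ For example, you could have started the command this way; input_file and output_file are only placeholders:
[you@faraday you]$ long_running_command < input_file > output_file &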
● You may see the jobs that are running with the jobs command:
[you@faraday you]$ jobs
[1]+ Running long_running_command &
[you@faraday you]$ _
❍ You can list all your processes with ps as another way of finding the jobs you are running:
[you@faraday you]$ ps
PID TTY TIME CMD
12340 pts/0 00:00:00 bash
12345 pts/0 00:01:23 long_running_command
13589 pts/0 00:00:00 ps
[you@faraday you]$ _
■ The first column is the process identification number assigned by the kernel.
■ The second column is the device ("teletype") that invoked the command.
■ The third column is the cpu time used by the process in hours:minutes:seconds.
■ ps labels its columns by default. This makes it a chatterbox compared to many UNIX/Linux programs,
such as who.
● You can kill a background job with kill %n where n is the job number assigned by the shell.
[you@faraday you]$ kill %1
[you@faraday you]$ _
❍ You can also kill the job by giving kill the process identification number (pid) assigned by the kernel:
[you@faraday you]$ kill 12345
[you@faraday you]$ _
● Say you execute a command, such as long_running_command, in the foreground. Then you do not get your shell
prompt back.
[you@faraday you]$ long_running_command
_
If you then wish to place it in the background, you first stop the job with Ctrl-Z and then place it in the background
with bg
^z
[2]+ Stopped long_running_command
[you@faraday you]$ bg
[2]+ long_running_command &
[you@faraday you]$ _
● When you log out, by default all jobs that you are running will be terminated. The nohup command makes the
process immune to being killed when you "hang up" i.e. log out:
[you@faraday you]$ nohup other_long_running_command &
[3] 12571
[you@faraday you]$ _
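❍ If you do not redirect stdout yourself, nohup normally sends it to a file named nohup.out. You can instead combine nohup with the redirections described above; a sketch, with illustrative file names:
[you@faraday you]$ nohup other_long_running_command > out.log 2> err.log &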
● A few parts of this section have discussion that applies only to the bash shell.
❍ In most of those cases, tcsh has the same functionality with slightly different syntax.
● The shell maintains a list of variables, functions and aliases that have been defined.
❍ The variable $TERM identifies the type of terminal being used.
❍ By convention, variable names are all upper-case, although the convention is not required.
❍ You can define your own variable and display its value with echo; for bash:
[you@faraday you]$ myvariable='hi sailor'
[you@faraday you]$ echo $myvariable
hi sailor
[you@faraday you]$ _
❍ Note that when defining the variable, the dollar sign $ is not part of the name.
❍ There are no spaces around the equal sign =
❍ For tcsh the equivalent command is: set myvariable='hi sailor'
❍ Typing bash spawns a new shell; a variable defined as above is not known to the spawned
shell. But if you exit the spawned shell you get back to your login shell:
[you@faraday you]$ bash
[you@faraday you]$ exit
[you@faraday you]$ _
■ You can spawn any shell from any other shell in a similar way:
[you@faraday you]$ tcsh
[you@faraday ~]$ exit
[you@faraday you]$ _
■ For bash, to define a variable that will be known to all sub-shells, export it:
export other_variable='bye sailor'
■ For tcsh the equivalent is:
setenv other_variable 'bye sailor'
■ Note there is no equal sign in the tcsh form.
● The shell variable $PATH defines the path used to find commands to be executed:
[you@faraday you]$ echo $PATH
/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin:.
[you@faraday you]$ _
❍ The directories are searched in the order in which they are listed.
❍ The fields are separated by colons : just like the /etc/passwd file. This is common in UNIX/Linux.
❍ The directory indicated by a dot . stands for the present working directory.
■ Some environments do not include the present working directory in the path by default.
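❍ You can add a directory to the search path by redefining $PATH in terms of its old value. A sketch for bash, assuming you want your own ~/bin directory searched and that your home directory is /home/you:
[you@faraday you]$ PATH=$PATH:~/bin
[you@faraday you]$ echo $PATH
/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin:.:/home/you/bin
[you@faraday you]$ _
❍ To make the change permanent, put the same assignment in the ~/.bash_profile file discussed below.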
● You can also define functions. For example, the following sequence of commands is executed often by all users:
[you@faraday you]$ cd some_directory
[you@faraday you]$ ls
❍ You can execute the same sequence in a single line by separating the commands with a semi-colon ;
[you@faraday you]$ cd some_directory; ls
❍ You can define a function chd that rolls these two commands into one:
[you@faraday you]$ function chd()
> { cd $1; ls; }
[you@faraday you]$ chd some_directory
❍ You can also define aliases, which let you give another name to a command.
❍ As with shell variables, there are no spaces around the equal sign = when defining an alias.
❍ You can alias a command to itself. For example, the -i flag to rm asks for confirmation before removing a
file. You can make this the default behavior for rm with:
[you@faraday you]$ alias rm='rm -i'
[you@faraday you]$ rm some_file
rm: remove `some_file'? _
● You can customise the behavior of your login shell by creating a file .bash_profile in your home directory and
defining in it any variables, functions and aliases that you wish. Such a file might look like:
[you@faraday you]$ cat .bash_profile
# Local aliases
[you@faraday you]$ _
■ A non-login shell, such as one you spawn yourself, instead reads the file ~/.bashrc. The rc in the name comes from "run commands",
runcomm or rc for short. UNIX/Linux configuration involves many files and directories with an rc in
their name.
■ For tcsh users, the file ~/.tcshrc is read by both login and non-login shells.
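❍ The listing of .bash_profile above was abbreviated. Here is a sketch of the kind of thing such a file might contain, built only from the variables, aliases and functions discussed above; the particular values are illustrative:
# Local aliases
alias rm='rm -i'

# Local variables
export LPDEST=hpVsi
PATH=$PATH:~/bin

# Local functions
function chd()
{ cd $1; ls; }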
Shell Scripts
● For simple situations, you saw in Module 2 that the shell's history mechanism can save you much typing. For more
complicated or often-repeated tasks, this may not be enough.
● You can create a text file containing shell commands to be executed. Such files are called shell scripts.
❍ The .bash_profile file is a shell script.
● You alert the system to invoke a shell to run the contents of the file by beginning the file with the line:
#!/bin/bash
❍ The .bash_profile script does not begin by invoking /bin/bash to run it, since we want it to be
invoked by the shell that called it.
❍ Other programs can be used in similar scripts. For example, programs written in Perl usually begin with:
#!/usr/bin/perl
● You can then put in any shell commands that you wish:
#!/bin/bash
❍ The #! in the first line is an exception to the rule that lines that begin with a sharp sign # are comments.
❍ The history of this exception is ghastly!
● You then need to make the file executable:
[you@faraday you]$ chmod +x file_name
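● Putting the pieces together, here is a sketch of creating and running a trivial script; the name myscript and its contents are only illustrative. The leading ./ says to run the script from the present working directory, and works whether or not . is in your $PATH:
[you@faraday you]$ cat myscript
#!/bin/bash
# Print the date and then who is logged in
date
who
[you@faraday you]$ chmod +x myscript
[you@faraday you]$ ./myscript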
Printing
● Each flavor of UNIX/Linux has variations on how to do printing. We shall discuss modern Linux implementations.
❍ We are describing the LPRng software, which is available from: http://www.lprng.com/.
● You may determine which printers are currently accepting requests with lpstat -a
[you@faraday you]$ lpstat -a
hp2100 accepting requests since 2002-04-27-13:19:56.892
hp4mp accepting requests since 2002-04-27-13:19:56.910
hp4050n accepting requests since 2002-04-27-13:19:56.901
hpVsi accepting requests since 2002-04-27-13:19:56.919
[you@faraday you]$ _
❍ lpstat means "line printer status." I haven't seen an actual line printer in a long time.
● The shell variable $LPDEST determines which printer is the current print destination.
[you@faraday you]$ echo $LPDEST
hpVsi
[you@faraday you]$ _
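● To send a file to a printer, hand it to the print spooler with the lp command. A sketch, using the printer names reported by lpstat above; the -d flag selects a destination other than $LPDEST:
[you@faraday you]$ lp some_file
[you@faraday you]$ lp -d hp4050n some_file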
● Many installations, like ours, use PostScript as the language for their printers.
❍ If the print spooler is given a PostScript file, it is passed unchanged to the printer.
❍ If the spooler is given a text file, it invokes a filter such as a2ps to convert it to PostScript.
❍ For non-text files, many utilities exist to produce a PostScript file and/or convert to PostScript and send it
to the printer.
● Many flavors of UNIX/Linux have a -t option to the man command to typeset the page for printing.
[you@faraday you]$ man -t mkdir
❍ For many modern Linux distributions the flag produces PostScript which can then be sent to the printer:
[you@faraday you]$ man -t mkdir | lp
request-id is you@faraday+57
[you@faraday you]$ _
❍ For some other flavors of UNIX/Linux, the flag typesets the page and sends it to the printer:
[you@some_machine you]$ man -t mkdir
request-id is you@some_machine+88
[you@some_machine you]$ _
❍ The man page for man should document how the command works on your system.
● You can monitor the job with lpq
[you@faraday you]$ lpq
Printer: hpVsi@faraday 'Printer in Room 126'
Queue: 1 printable job
Server: pid 25321 active
Unspooler: pid 25322 active
Status: printing 'harrison@faraday+56' starting OF 'ofhp' at 13:45:36.743
Rank Owner/ID Class Job Files Size Time
active harrison@faraday+56 A 56 some_file 377 13:45:36
[you@faraday you]$ _
❍ If there is more than one job spooled for the printer, they will all be listed.
Remote Access
● You can access most UNIX/Linux systems from anywhere on the Internet.
● telnet is the best-known program to log in to a UNIX/Linux box remotely
❍ I strongly recommend you not use telnet
■ When you give your password, it is sent as clear text over the network. Any "packet sniffer" can grab it.
● The "secure shell" ssh encrypts the password before sending it to the UNIX/Linux machine, which then decrypts it.
❍ This is much more secure than telnet.
■ For Windoze machines, the PuTTY program works very well. It is free and available at
http://www.chiark.greenend.org.uk/~sgtatham/putty/
❍ There are also secure ways to copy files from one machine to another using the same technology: scp
and sftp.
■ You should consider using scp or sftp instead of the file transfer protocol program ftp.
❍ Many system administrators very much want to remove telnet and ftp from their computers, but users often insist on keeping them.
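❍ From another UNIX/Linux machine, the commands look something like the following sketch; here faraday stands for the full hostname of the machine you are connecting to, and the prompt is that of your local machine:
[you@home you]$ ssh you@faraday
[you@home you]$ scp some_file you@faraday: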
Regular Expressions
● Regular expressions, used in the name of the command grep for example, pervade UNIX/Linux and its utilities.
● It takes a particular kind of geek to be able to remember all of the sometimes very complex syntaxes of all the
possible regular expressions.
● All UNIX/Linux geeks are aware of the basics of regular expressions that we will discuss here.
● The caret ^ stands for the beginning of a line.
❍ All first and second year accounts on Faraday begin with the letter x. Thus to get all of these accounts out of
the password file but not match the letter x anywhere other than at the beginning of each line use:
[you@faraday you]$ grep '^x' /etc/passwd
❍ To match a literal caret in a file precede it with a backslash:
[you@faraday you]$ grep '\^' somefile
❍ This rule applies to the other special characters as well: precede one with a backslash to turn off its special meaning.
● The dollar sign $ stands for the end of a line. Thus we could get all tcsh users out of the password file with:
[you@faraday you]$ grep 'tcsh$' /etc/passwd
● The period . matches any single character. Thus to extract all the lines of a file containing exactly three characters
use:
[you@faraday you]$ grep '^...$' somefile
● The asterisk * matches zero or more occurrences of the preceding character. Thus to extract all lines that contain
one or more of the letter X in a row use:
[you@faraday you]$ grep 'XX*' somefile
❍ To match all lines that contain two X letters with anything at all between them, including nothing:
[you@faraday you]$ grep 'X.*X' somefile
❍ To match all lines that contain two X letters with one character or more between them:
[you@faraday you]$ grep 'X..*X' somefile
❍ Regular expressions are greedy: they always match the longest matching pattern. Thus the following matches
the entire line in a file:
[you@faraday you]$ grep '.*' somefile
● Square brackets [ ] match any one of the characters between the brackets. Thus to extract all lines containing either
the or The in a file:
[you@faraday you]$ grep '[Tt]he' somefile
❍ To get sed to quit the first time it encounters either The or the, put the regular expression between slashes:
[you@faraday you]$ sed '/[Tt]he/q' somefile
The Stream Editor sed
● sed, the "stream editor", reads its input one line at a time, optionally changes each line, and writes the result to stdout.
● Above we pointed out that ps is verbose, in the sense that by default it labels the columns of its output:
[you@faraday you]$ ps
PID TTY TIME CMD
12340 pts/0 00:00:00 bash
12345 pts/0 00:01:23 long_running_command
13589 pts/0 00:00:00 ps
[you@faraday you]$ _
❍ You can delete the first line, the column labels, by piping the output of ps through sed and telling sed to delete line 1:
[you@faraday you]$ ps | sed '1d'
■ You may similarly delete any line by giving its line number.
● You may substitute strings for other strings using s/from/to/. Thus to change occurrences of either The or the to
Das in each line of the file:
[you@faraday you]$ sed 's/[Tt]he/Das/' somefile
❍ By default, sed only replaces the first occurrence in each line. To replace all occurrences in each line add a
global flag g:
[you@faraday you]$ sed 's/[Tt]he/Das/g' somefile
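❍ Note that sed does not change somefile itself; to keep the edited version, redirect the output to a new file (the name newfile is only illustrative):
[you@faraday you]$ sed 's/[Tt]he/Das/g' somefile > newfile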
● sed by default always writes every line to stdout, whether or not it gets changed. The -n option tells the program
not to output a line unless you tell it to. You tell sed to output a line with p (for "print"). Thus to grep lines
containing either The or the:
[you@faraday you]$ sed -n '/[Tt]he/p' somefile
● In common with many UNIX/Linux utilities, sed is capable of a great number of advanced operations, which we
shall not discuss here.
Some Other Utilities
● The large number of utilities, plus the ability to combine them, is a major factor in making UNIX/Linux so powerful.
● Here we very briefly describe some of the most-used other utilities. The list is far from exhaustive and is only shown
to give you some of the flavor of the UNIX/Linux environment.
❍ sort - investigated in Module 2.
❍ grep - discussed in Modules 2 and 3.
❍ cut - used in Module 2's Exercise 3.
❍ wc - discussed in Module 2
❍ uniq - removes any lines that are identical to the preceding line.
❍ tr - translates letters or ranges of letters into other letters or ranges of letters.
❍ nl - numbers the lines in the file
❍ tee - copies stdin to stdout unchanged, and also writes a copy of stdin to a file; see the short example
after this list.
❍ fmt - a simple formatter for text files.
❍ expand - convert tabs to spaces.
❍ awk - a powerful "little programming language"
■ Complex enough that there is a whole book on it.
■ The name comes from the initials of its authors: Aho, Weinberger and Kernighan.
❍ perl - a very powerful programming language
■ The learning curve to find out how to do simple
■ The name stands for "Pathologically Eclectic Rubbish Lister", among other things.
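❍ As the small illustration of tee promised above, the following saves a copy of who's output in a file while still passing it on to wc -l; the file name who.save is illustrative and the count depends on who is logged in:
[you@faraday you]$ who | tee who.save | wc -l
8
[you@faraday you]$ _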
Using awk Instead of cut
● In earlier Exercises you have used cut to cut parts out of a file.
● For ASCII text files with fields separated by one or more spaces or tabs, awk provides an easy alternative. Here is a
text file:
[you@faraday you]$ cat some_file
ham meat
spam semi-meat
kolbassa unknown
broccoli vegetable
beer beverage
[you@faraday you]$
❍ There are 6 spaces between ham and meat, two tabs between spam and semi-meat, and single spaces
between the fields in the other three lines in the file.
❍ awk will treat multiple instances of "whitespace" as a single field separator.
❍ To pick out just the second field of each line, ask awk to print field number 2:
[you@faraday you]$ cat some_file |
> awk '{print $2}'
meat
semi-meat
unknown
vegetable
beverage
[you@faraday you]$
❍ It is possible to do the same operation with sed, but the regular expression syntax is pretty gory:
[you@faraday you]$ cat some_file |
> sed 's/^.*[[:space:]][[:space:]]*//'
meat
semi-meat
unknown
vegetable
beverage
[you@faraday you]$
❍ In the previous section, we mentioned the Perl programming language. Here is a way to have Perl do the same
operation:
[you@faraday you]$ cat some_file |
> perl -lane 'print $F[1]'
meat
semi-meat
unknown
vegetable
beverage
[you@faraday you]$
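❍ awk can print more than one field at a time; separate the fields with a comma and awk puts a space between them in the output. A small sketch using the same file, printing the two fields in reverse order:
[you@faraday you]$ awk '{print $2, $1}' some_file
meat ham
semi-meat spam
unknown kolbassa
vegetable broccoli
beverage beer
[you@faraday you]$ _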
Exercise 1
● Create a shell script containing something like the following; it does nothing except count, so it simply runs for a long time (the exact loop limit does not matter):
#!/bin/bash
i=0
while [ $i -lt 1000000 ]
do
    i=`expr $i + 1`
done
❍ For your convenience, we have created a version of the above script as a text file named long.txt. You may
download it by clicking here.
■ Remember to execute: chmod +x long.txt
❍ For now, the script may be treated as "magic." We will de-mystify it in Module 4.
● Create a file .bashrc in your home directory if one does not already exist. Otherwise edit the existing one.
❍ Have it set a variable somevariable to any contents that you wish.
❍ Spawn a new shell from your login shell and verify that the variable is set properly.
● Create a directory in your home directory named bin and change into it.
❍ Create a shell script in the bin directory with almost any name that you wish. Do not use the name of an
existing UNIX/Linux command.
❍ Have the script run who to list who is currently logged in.
❍ Often the same user is listed as having multiple logins. Modify the script so that it uses awk or cut to pick out
the user names, sort the output by these names and then use uniq and wc to count the number of unique
logins currently logged in.
❍ Spawn a new shell.
❍ Modify the $PATH variable so that it includes your newly created bin directory.
❍ Verify that your shell script runs just by giving its name.
❍ Unless you wish to keep your new shell script remove it.
■ Many users maintain their own personal bin of their own commands and edit ~/.bash_profile so that the bin directory is always in their $PATH.
● Use sed to create a file newpasswd in your home directory that is a copy of /etc/passwd except that your login
is replaced by the string bozo.
❍ The following two ways of using sed are equivalent: giving the name of the input file as an argument to sed, or using cat to pipe the file into sed.
❍ Be sure that if the string matching your login appears elsewhere in the file it is not changed.
❍ Be sure that if your login name is you that you do not change the login of a user named you2.
❍ Use diff to compare the two versions of the password file.
● Recall that the fourth field in the password file is the number corresponding to the group (the gid) of the user.
❍ Look in the password file to find the number of the group you are in.
❍ Confirm by using id.
❍ Look in the file /etc/group to discover how the gid is correlated with the name you see in the output from
id.
● Create a shell script to find out how many users are in your group. Do this two different ways:
❍ Use cut, grep and wc -l to count the number of users in your group.
❍ Use only grep and wc -l. (This will be a good test of your knowledge of regular expressions.)
■ By the time your shell script is finished, it will have five different procedures in it. Use echo to
label the output of each procedure.
● Add to your shell script a facility to count the number of users in each group:
❍ Use cut to pick out all the gids, sort them and then use uniq -c to count them.
❍ It will be nice to apply a further sort -n so that output is sorted by the number of users in each group.
■ The sort | uniq -c | sort -n sequence is a frequently occurring idiom in shell scripts.
❍ Duplicate the code to count the number of users in each group, but modify it so that it sorts by the gid instead
of by the number of users in that group.
■ You will want to know that when you ask awk to print more than one field, the fields should be
separated by a comma ,.
You, of course, know that looking at a solution is not a substitute for actually doing the exercise yourself. However, a sample
script that solves the above Exercise is available here; your script may be better than this one.
This document is Copyright © 2002 by David M. Harrison. This is $Revision: 1.20 $, $Date: 2003/06/12 12:43:19 $
(year/month/day UTC).
This material may be distributed only subject to the terms and conditions set forth in the Open Content License, v1.0 or later
(the latest version is presently available at http://opencontent.org/opl.shtml).