SPL_Lecture3
Spring 2025
❖ The metacharacters used to match filenames belong to a category called wild cards.
For instance, chap* represents all filenames beginning with chap. You can use this
pattern as an argument to a command rather than supply a long list of filenames which
the pattern represents. The shell will expand it suitably before the command is executed.
The * and ?
❖ The * matches any number of characters, including none. When it is appended to the string
chap, the pattern chap* matches filenames beginning with the string chap—including the file chap.
You can now use this pattern as an argument to ls:
When the shell encounters this command line, it immediately identifies the * as a metacharacter. It
then creates a list of files from the current directory that match this pattern. It reconstructs the
command line as follows, and then hands it over to the kernel for execution:
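The expansion described above can be observed with echo. This is a minimal sketch: the filenames and the temporary directory are assumptions made for illustration, not files from the lecture.

```shell
# Work in a throwaway directory so no real files are touched.
demo=$(mktemp -d) && cd "$demo" || exit 1
touch chap chap01 chap02 chapx note.txt   # hypothetical filenames

# echo shows the command line as it looks after the shell has expanded chap*.
expanded=$(echo ls chap*)
echo "$expanded"    # ls chap chap01 chap02 chapx

# ls itself only ever sees the expanded list, never the *.
ls chap*
```

Note that note.txt is absent from the list because it does not begin with chap, while the file chap itself is included.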
Caution!
❖ Be careful when you use the * with rm to remove files. You could land yourself in a real
mess if, instead of typing rm *.o, which removes all the C object files, you inadvertently
introduce a space between * and .o:
The error message here masks a disaster that has just occurred; rm has removed all
files in this directory! Whenever you use a * with rm, you should pause and check the
command line before you finally press [Enter].
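One cautious habit, sketched here under the assumption of three hypothetical files in a scratch directory, is to preview the expansion with echo before running rm. Nothing is deleted in this example.

```shell
# Work in a throwaway directory so no real files are touched.
demo=$(mktemp -d) && cd "$demo" || exit 1
touch main.c main.o util.o   # hypothetical files

# echo only prints the arguments rm would receive; it removes nothing.
intended=$(echo rm *.o)      # the command that was meant
mistyped=$(echo rm * .o)     # stray space: * now matches every file
echo "$intended"             # rm main.o util.o
echo "$mistyped"             # rm main.c main.o util.o .o
```

In the mistyped form, rm would receive every filename plus a nonexistent file named .o, which is why an error message appears only after everything else has been removed.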
The ?
❖ The ? matches a single character. When used with the same string chap (as chap?),
the shell matches all five-character filenames beginning with chap. Place another ? at
the end of this string, and you have the pattern chap??. Use both of these expressions
separately, and the meaning of the ? becomes obvious:
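A small sketch of the two patterns follows. The filenames are assumptions chosen so that each pattern matches a different subset.

```shell
# Work in a throwaway directory so no real files are touched.
demo=$(mktemp -d) && cd "$demo" || exit 1
touch chap chapx chapy chap01 chap15 chapter   # hypothetical filenames

one=$(echo chap?)     # exactly one character after chap
two=$(echo chap??)    # exactly two characters after chap
echo "$one"           # chapx chapy
echo "$two"           # chap01 chap15
```

Neither pattern matches chap (no extra character) or chapter (three extra characters).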
The Character Class
❖ The character class comprises a set of characters enclosed by the square brackets [ and
], but it matches a single character in the class. The pattern [abcd] is a character class, and it
matches a single character: an a, b, c, or d.
❖ Range specification is also possible inside the class with a - (hyphen); the two characters on
either side of it form the range of the characters to be matched. Here are two examples:
❖ The expression [a-zA-Z]* matches all filenames beginning with a letter, irrespective
of case.
❖ You can match a word character by including numerals and the underscore character
as well: [a-zA-Z0-9_]
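The class and range forms above might be tried out as follows. All filenames are assumptions made for this sketch.

```shell
# Work in a throwaway directory so no real files are touched.
demo=$(mktemp -d) && cd "$demo" || exit 1
touch chap01 chap02 chap03 chap04 chapx chapy chapz Report notes 9lives _draft

listed=$(echo chap0[124])   # one character from the set 1, 2, 4
ranged=$(echo chap[x-z])    # one character from the range x to z
echo "$listed"              # chap01 chap02 chap04
echo "$ranged"              # chapx chapy chapz

# Count the matches instead of printing them; set -- stores them in $1, $2, ...
set -- [a-zA-Z]*
letter_count=$#             # 9: every name except 9lives and _draft
set -- [a-zA-Z0-9_]*
word_count=$#               # 11: all names begin with a word character
echo "$letter_count $word_count"
```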
Shells
The shell is itself a program on the server and comes in several varieties:
➢ bash : The most popular shell and the default on most Linux systems. Installed on all Linux
systems
➢ tcsh : An enhanced C shell (csh) with a C-like syntax for scripting; supports arguments for
aliases, etc.
Source: https://www.purdue.edu/hla/sites/varalalab/wp-content/uploads/sites/20/2018/02/Lecture_5.pdf
Negating the Character Class (!)
❖ The solution that we prescribe here unfortunately doesn’t work with the C shell,
but with the other shells, you can use the ! as the first character in the class to
negate the class.
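A minimal sketch of negation in a Bourne-type shell (sh or bash, not the C shell) follows. The filenames are hypothetical.

```shell
# Work in a throwaway directory so no real files are touched.
demo=$(mktemp -d) && cd "$demo" || exit 1
touch prog.c prog.o prog.h 1st notes   # hypothetical filenames

# Single-character extensions other than c and o.
not_c_or_o=$(echo *.[!co])
# Names whose first character is not a letter.
not_alpha=$(echo [!a-zA-Z]*)
echo "$not_c_or_o"    # prog.h
echo "$not_alpha"     # 1st
```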
Matching the Dot
❖ If you want to list all hidden filenames in your directory having at least three characters
after the dot, then the dot must be matched explicitly:
❖ However, if the filename contains a dot anywhere but at the beginning, it need not be
matched explicitly. For example, the expression *c matches every filename that ends with c,
including all C programs (names ending in .c), regardless of what comes before it.
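Both cases can be sketched as below. The hidden and visible filenames are assumptions made for illustration.

```shell
# Work in a throwaway directory so no real files are touched.
demo=$(mktemp -d) && cd "$demo" || exit 1
touch .profile .rc .exrc prog.c runc   # hypothetical filenames

# Leading dot matched explicitly, followed by at least three characters.
hidden=$(echo .???*)
# An embedded dot needs no special treatment.
ends_in_c=$(echo *c)
echo "$hidden"       # .exrc .profile
echo "$ends_in_c"    # prog.c runc
```

The file .rc is skipped by the first pattern because only two characters follow its dot, and the second pattern skips .rc and .exrc because the * never matches a leading dot.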
There are two things that the * and ? can’t match
❖ First, they don’t match a filename beginning with a dot, although they can match any
number of embedded dots. For instance, apache*gz matches apache_1.3.20.tar.gz.
❖ Second, these characters don’t match the / in a pathname. You can’t use cd /usr*local
to switch to /usr/local.
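These two limits can be checked in a scratch directory. The sketch assumes sh or bash with default settings, where a pattern that matches nothing is passed on unchanged; the directory layout is hypothetical.

```shell
# Work in a throwaway directory so no real files are touched.
demo=$(mktemp -d) && cd "$demo" || exit 1
mkdir -p usr/local
touch .hidden apache_1.3.20.tar.gz   # hypothetical names

visible=$(echo *)             # the leading dot of .hidden is not matched
embedded=$(echo apache*gz)    # embedded dots are matched
no_slash=$(echo usr*local)    # * cannot stand in for the /
echo "$visible"     # apache_1.3.20.tar.gz usr
echo "$embedded"    # apache_1.3.20.tar.gz
echo "$no_slash"    # usr*local (no match, so the pattern is left as typed)
```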
The Shell’s Wild Cards
Wild Card          Matches
*                  Any number of characters, including none
?                  A single character
[x-z]              A single character that is within the ASCII range of the characters x and z
[!x-z]             A single character that is not within the ASCII range of the characters x and z
                   (Not in C shell)
{pat1,pat2...}     pat1, pat2, etc. (Not in Bourne Shell; see Going Further)
!(flname)          All except flname (Korn and Bash; see Going Further)
!(fname1|fname2)   All except fname1 and fname2 (Korn and Bash; see Going Further)
Escaping
❖ Escaping: Placing a \ (backslash) before a wild card removes (escapes) its special
meaning. In general, when the \ precedes a metacharacter, its special meaning is turned off.
To remove the file My Document.doc, which has a space embedded, similar reasoning
applies: escape the space.
$ rm My\ Document.doc
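A short sketch of escaping in action follows. The file literally named chap* is an assumption made for this demo, and the single quotes used to create it are only a convenience for setting up the test files.

```shell
# Work in a throwaway directory so no real files are touched.
demo=$(mktemp -d) && cd "$demo" || exit 1
touch 'chap*' chap01 chap02 'My Document.doc'   # hypothetical files

rm chap\*              # removes only the file literally named chap*
rm My\ Document.doc    # the escaped space keeps the name as one argument
remaining=$(echo *)
echo "$remaining"      # chap01 chap02
```

Without the backslash, rm chap* would have removed chap01 and chap02 as well.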
Ignoring the Newline Character
❖ Command lines that use several arguments often overflow to the next line. To
ensure better readability, split the wrapped line into two lines, but make sure that
you input a \ before you press [Enter]:
The \ here makes the shell ignore [Enter]. It also produces the secondary prompt (which could be
a > or a ?), which indicates that the command line is incomplete.
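In a script no secondary prompt appears, but the effect of the backslash is the same. A minimal sketch with made-up arguments:

```shell
# The backslash-newline pair is discarded, so echo receives five arguments.
joined=$(echo one two three \
four five)
echo "$joined"    # one two three four five
```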
Quoting
❖ There’s another way to turn off the meaning of a metacharacter. When a command
argument is enclosed in quotes, the meanings of all enclosed special characters are
turned off.
Single quotes
Double quotes
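The difference between the two kinds of quotes can be sketched as follows. The variable name and filenames are hypothetical. The point that double quotes still let the shell interpret $ is standard shell behaviour rather than something stated in these slides.

```shell
# Work in a throwaway directory so no real files are touched.
demo=$(mktemp -d) && cd "$demo" || exit 1
touch chap01 chap02             # hypothetical filenames
name=world                      # hypothetical variable

unquoted=$(echo chap* $name)    # both the * and the $ are interpreted
single=$(echo 'chap* $name')    # single quotes: nothing is interpreted
double=$(echo "chap* $name")    # double quotes: * is literal, $name is evaluated
echo "$unquoted"    # chap01 chap02 world
echo "$single"      # chap* $name
echo "$double"      # chap* world
```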
Redirection
Redirection in Linux is a mechanism that allows you to control the input and output of
commands by redirecting standard streams to files, other commands, or devices. Linux provides
three standard streams:
➢ Standard input: The file (or stream) representing input, which is connected to the keyboard.
➢ Standard output: The file (or stream) representing output, which is connected to the display.
➢ Standard error: The file (or stream) representing error messages that emanate from the
command or shell. This is also connected to the display.
Standard Input
❖ The keyboard, the default source.
The examples that follow use the wc command, which counts lines, words, and characters. Its options:
-c : Count bytes.
-m : Count characters.
-C : Same as -m.
-l : Count lines.
-w : Count words delimited by white space characters or new line characters. Delimiting characters
are Extended Unix Code (EUC) characters from any code set defined by iswspace().
If no option is specified, the default is -lwc (count lines, words, and bytes).
The keyboard, the default source
When you use wc without an argument and have no special symbols like the < and | in the
command line, wc obtains its input from the default source. You have to provide this input from the
keyboard and mark the end of input with [Ctrl-d]:
wc obtains its input from a file (<)
❖ The shell opens the file and assigns it as standard input to the wc command. This redirection
requires the < symbol:
The filename is missing from the output once again, which means that wc didn’t open /etc/passwd. It read the
standard input file as a stream but only after the shell made a reassignment of this stream to a disk
file. The sequence works like this:
➢ On seeing the <, the shell opens the disk file, /etc/passwd, for reading.
➢ It unplugs the standard input file from its default source and assigns it to /etc/passwd.
➢ wc reads from standard input that has previously been reassigned by the shell to /etc/passwd.
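The sequence above can be sketched with a small sample file standing in for /etc/passwd. The filename and contents are assumptions, and the exact spacing of wc output varies between systems.

```shell
# Work in a throwaway directory so no real files are touched.
demo=$(mktemp -d) && cd "$demo" || exit 1
printf 'one\ntwo\nthree\n' > sample.txt   # hypothetical stand-in for /etc/passwd

with_name=$(wc -l sample.txt)       # wc opens the file itself and reports its name
from_stdin=$(wc -l < sample.txt)    # the shell opens the file; wc sees only a stream
echo "$with_name"
echo "$from_stdin"
```

The first form prints the count followed by sample.txt, while the second prints the count alone, because wc never learns which file is behind its standard input.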
Taking Input Both from File and Standard Input
When a command takes input from multiple sources, say, a file and standard input, the -
symbol must be used to indicate the sequence of taking input.
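A hedged sketch using cat follows. Here printf and a pipe stand in for keyboard input so the example can run unattended, and notes.txt is a hypothetical file.

```shell
# Work in a throwaway directory so no real files are touched.
demo=$(mktemp -d) && cd "$demo" || exit 1
printf 'from file\n' > notes.txt   # hypothetical file

# The position of - decides when standard input is read.
combined=$(printf 'from stdin\n' | cat - notes.txt)   # standard input first
reversed=$(printf 'from stdin\n' | cat notes.txt -)   # file first
echo "$combined"
echo "$reversed"
```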
Standard Output
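All commands displaying output on the terminal actually write to the standard output file as a
stream of characters, and not directly to the terminal as such. There are three possible
destinations for this stream:
➢ The terminal, the default destination.
➢ A file, using redirection with the > symbol.
➢ Another program, using a pipeline.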
❖ Redirection can also be used with multiple files. The following example saves all C
programs:
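One way this could look, assuming two tiny hypothetical source files and an output name chosen for the sketch:

```shell
# Work in a throwaway directory so no real files are touched.
demo=$(mktemp -d) && cd "$demo" || exit 1
printf 'int a;\n' > a.c    # hypothetical C sources
printf 'int b;\n' > b.c

# The shell expands *.c, and cat writes all the files to one output file.
cat *.c > c_progs_all.txt
saved=$(cat c_progs_all.txt)
echo "$saved"
```

The output file deliberately does not end in .c, so it cannot be matched by the pattern and fed back into itself.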
The File Descriptor
❖ Each of the three standard files is represented by a number, called a file descriptor.
0: Standard input
1: Standard output
2: Standard error
Note: > and 1> mean the same thing to the shell, while < and 0< also are identical. We need
to explicitly use one of these descriptors when handling the standard error stream.
Standard Error
❖ When a command runs unsuccessfully, diagnostic messages often show up on the screen.
This is the standard error stream whose default destination is the terminal.
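The descriptors can be used to separate the two output streams. This sketch assumes one existing file and one deliberately missing file; all names are hypothetical.

```shell
# Work in a throwaway directory so no real files are touched.
demo=$(mktemp -d) && cd "$demo" || exit 1
printf 'hello\n' > present.txt   # hypothetical file; no_such_file does not exist

# 1> (same as >) captures normal output, 2> captures the diagnostic message.
cat present.txt no_such_file 1> out.txt 2> err.txt || echo "cat reported an error"
cat out.txt    # hello
cat err.txt    # the error message about no_such_file
```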
Filters: Using Both Standard Input and Standard Output
Example
These four commands are functionally equivalent in terms of their execution. They all use redirection to pass calc.txt as input to
the wc command and save the output to result.txt. Let's break them down:
1. < calc.txt: Redirects the contents of calc.txt as input to the wc command.
   > result.txt: Redirects the output of wc to result.txt.
2. > result.txt: Specifies that the output of wc should go to result.txt.
   < calc.txt: Specifies that the input for wc should come from calc.txt.
3. >result.txt: Redirects the output of wc to result.txt.
   <calc.txt: Redirects the contents of calc.txt as input to wc.
4. > result.txt: Redirects the output of wc to result.txt.
   < calc.txt: Redirects the contents of calc.txt as input to wc.
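The four command lines themselves did not survive in these notes. The sketch below is a reconstruction from the descriptions above and should be read as an assumption. It also writes to four separate output files (hypothetical names) so that the results can be compared.

```shell
# Work in a throwaway directory so no real files are touched.
demo=$(mktemp -d) && cd "$demo" || exit 1
printf 'a b c\nd e\n' > calc.txt   # hypothetical input

wc < calc.txt > result1.txt        # 1. input first, then output
wc > result2.txt < calc.txt        # 2. output first, then input
wc>result3.txt<calc.txt            # 3. no spaces around the symbols
> result4.txt < calc.txt wc        # 4. both redirections before the command

cmp result1.txt result2.txt && cmp result1.txt result3.txt &&
  cmp result1.txt result4.txt && echo "all four outputs are identical"
```

The shell removes the redirections from the command line before wc runs, which is why their order and spacing make no difference here.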
Collective Manipulation
So far, we have used the > to handle a single stream of a single command. But the shell also
supports collective stream handling. This can happen in these two ways:
❖ Handle two standard streams as a single one using the 2>&1 and 1>&2 symbols.
❖ Form a command group by enclosing multiple commands with the ( and ) symbols or {
and } symbols. You can then use a single instruction to control all commands in the
group.
Replicating Descriptors
➢ 1>&2 Send the standard output to the destination of the standard error.
➢ 2>&1 Send the standard error to the destination of the standard output.
The 2> symbol reassigns standard error to error.txt and 1>&2 sends the standard output of
echo to the standard error. Note the sequence: first we redirect and then we specify the
replication of the descriptor.
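A minimal sketch of both symbols follows. The message text and the name all.txt are assumptions; error.txt follows the description above.

```shell
# Work in a throwaway directory so no real files are touched.
demo=$(mktemp -d) && cd "$demo" || exit 1

# Redirect standard error first, then point standard output at the same place.
echo "disk full" 2> error.txt 1>&2
saved=$(cat error.txt)
echo "$saved"    # disk full

# The mirror image: standard error follows standard output into one file.
cat no_such_file > all.txt 2>&1 || echo "cat failed as expected"
```

Reversing the order in the first command (1>&2 before 2> error.txt) would send the message to the terminal, because standard error would still point there at the moment it is copied.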
Command Grouping
Sometimes, we need to manipulate a group of commands collectively: redirect them, run
them in the background, and so on. The () and {} handle a command group.
This saves all C program sources in a file preceded by a multicolumn list of programs acting
as a table of contents. The echo command serves to insert a blank line between them. The {}
can also be used for this purpose:
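The grouping described above might look like the following sketch. The source files and output names are hypothetical. Note that the {} form needs a space after { and a ; before }.

```shell
# Work in a throwaway directory so no real files are touched.
demo=$(mktemp -d) && cd "$demo" || exit 1
printf 'int a;\n' > a.c    # hypothetical C sources
printf 'int b;\n' > b.c

# One redirection applies to the output of all three commands in the group.
( ls -x *.c ; echo ; cat *.c ) > c_progs_all.txt
{ ls -x *.c ; echo ; cat *.c ; } > c_progs_braces.txt

cat c_progs_all.txt
```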
() vs {}
()
❖ Commands inside () are executed in a subshell.
❖ A subshell is a separate instance of the shell that inherits the environment of the parent shell but
does not affect it. Changes to variables, directories, or environment settings inside the
parentheses are isolated and do not persist in the parent shell.
{}
❖ Commands inside {} are executed in the current shell.
❖ No subshell is created, so any changes made to variables, directories, or environment settings
inside {} persist in the parent shell.
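The difference can be observed directly. In this sketch the variable x and the directory sub are hypothetical names.

```shell
# Work in a throwaway directory so no real files are touched.
demo=$(mktemp -d) && cd "$demo" || exit 1
mkdir sub
x=outer

( cd sub ; x=inner )                   # runs in a subshell
x_after_parens=$x                      # outer: the assignment was lost
dir_after_parens=$(basename "$PWD")    # still the scratch directory

{ cd sub ; x=inner ; }                 # runs in the current shell
x_after_braces=$x                      # inner: the assignment persists
dir_after_braces=$(basename "$PWD")    # sub: the cd persists too

echo "$x_after_parens $x_after_braces $dir_after_braces"    # outer inner sub
```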
Connecting two commands through an intermediate disk file has two drawbacks:
➢ For long-running commands, this process can be slow. The second command can’t act unless the
first has completed its job.
➢ You require an intermediate file that has to be removed after completion of the job. When you are
handling large files, temporary files can build up easily and eat up disk space in no time.
Pipes
The shell can connect these streams using a special operator—the | (pipe)—and avoid the
creation of the disk file.
The output of who has been passed directly to the input of wc, and who is said to be piped to wc.
When a sequence of commands is combined together in this way, a pipeline is formed.
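The contrast between an intermediate file and a pipe can be sketched as follows. To keep the result predictable, ls in a scratch directory stands in for who, and the filenames are assumptions.

```shell
# Work in a throwaway directory so no real files are touched.
demo=$(mktemp -d) && cd "$demo" || exit 1
touch alice bob carol      # hypothetical files to be counted

# Two steps with an intermediate file that has to be cleaned up afterwards.
tmp=$(mktemp)
ls > "$tmp"
two_step=$(wc -l < "$tmp")
rm "$tmp"

# One pipeline with no disk file; the same shape as who | wc -l.
piped=$(ls | wc -l)
echo $two_step $piped      # 3 3
```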
$ echo $total
5
Environment vs. Shell Variables
$ export FOO=BAR (FOO defined in the environment or global variable)
$ echo $FOO
$ echo $FOO2
(empty) [ The new shell cannot access FOO2 because it was not exported to the
environment in the parent shell.]
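The full sequence can be sketched as below. The value BAR2 and the use of sh -c to start the new shell are assumptions, since the original steps are only partly preserved.

```shell
export FOO=BAR     # placed in the environment, so child processes inherit it
FOO2=BAR2          # hypothetical value; a plain shell variable, not exported

# sh -c starts a new shell; single quotes make the child do the expansion.
in_child_foo=$(sh -c 'echo "$FOO"')
in_child_foo2=$(sh -c 'echo "$FOO2"')
echo "FOO in child:  [$in_child_foo]"     # FOO in child:  [BAR]
echo "FOO2 in child: [$in_child_foo2]"    # FOO2 in child: []
```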
Source: https://www.purdue.edu/hla/sites/varalalab/wp-content/uploads/sites/20/2018/02/Lecture_5.pdf
Variable Concatenation is Simple
$ ext=.avi # This line assigns the string .avi to the variable ext
$ moviename=holmes # This line assigns the string holmes to the variable moviename.
$ filename=$moviename$ext # Writing the two variables side by side concatenates them.
$ echo $filename
holmes.avi
All About Variables
➢ Variable names begin with a letter or an underscore and can contain numerals and the _ as the other characters.
➢ Unlike in programming languages, shell variables are not typed; you don’t need to use a char,
int, or long prefix when you define them.
➢ All shell variables are of the string type, which means that even a number like 123 is stored as a
string rather than in binary. (This may not remain true in the future.)
➢ All shell variables are initialized to null strings by default. Explicit assignment of a null string
with x="" or x='' or x= is also possible.
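These properties can be checked with a short sketch. All variable names here are hypothetical, and never_assigned is intentionally left unset.

```shell
count=123                  # stored as the three-character string 123
length=${#count}           # string length: 3
next=$((count + 1))        # the string is converted only when arithmetic is requested

# A variable that was never assigned expands to a null string.
if [ -z "$never_assigned" ]; then unset_is_empty=yes; fi

# Three equivalent ways to assign a null string explicitly.
x=""
y=''
z=
echo "$length $next $unset_is_empty [$x$y$z]"    # 3 124 yes []
```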
Thank you!