IOS203_Ch7

I/O Redirection and Text Processing Tools

1. Standard I/O Redirection
2. Pipes and Filters
3. Text Processing Tools


Objectives
Upon completion of this course, the student will be able to:
• Redirect I/O channels to files;
• Connect commands using pipes;
• Use tools for extracting, analyzing and manipulating text data.

Keywords
I/O Redirection, stdin, stdout, stderr, pipe, /dev/null, tr, tee, lpr, set, grep, egrep, sed,
REGEX, wc, head, tail, sort, uniq, cut, paste, diff, pr, uptime.


Linux can redirect a command's input, output and error data. The input of a program can come from
any source, and its output can go to any destination. Furthermore, the output of one command can
be fed directly to the input of another command through a pipe, and a filter can modify the
stream along the way.
1. Standard I/O Redirection
Linux provides three I/O channels to commands:

• stdin: standard input (channel 0) is connected, by default, to the keyboard;
• stdout: standard output (channel 1) is connected, by default, to the terminal window;
• stderr: standard error (channel 2) is connected, by default, to the terminal window.

Note: Commands produce two kinds of output, normal output and error message output, and the
shell can redirect each of these separately.

Figure 1: The three I/O channels

Consider the following command and its output, which assumes that there is a file called file2 in
your home directory, but no file called file5:
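
A command of this kind can be sketched as follows. The sketch assumes a scratch directory instead of the home directory, and the names out.txt and err.txt are illustrative; it separates the two kinds of output so that each can be inspected on its own:

```shell
# Work in a scratch directory so nothing in the home directory is touched
work=$(mktemp -d) && cd "$work"
touch file2                      # file2 exists, file5 does not

# Channel 1 (stdout) goes to out.txt, channel 2 (stderr) goes to err.txt.
# ls exits with a non-zero status because file5 is missing, hence "|| true".
ls file2 file5 > out.txt 2> err.txt || true

cat out.txt                      # the normal output: file2
cat err.txt                      # the error message about file5
```

Without the two redirections, both messages would appear together in the terminal window.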

Shell Redirection allows standard I/O channels to be redirected to/from a file. The following
table shows the common shell redirection operators:


Redirection Operator      Description

command > file            Redirect standard output of command to file
command 1> file
command >> file           Append standard output of command to file
command 1>> file
command < file            Send file as input to command
command 0< file
command 2> file           Redirect error messages from command to file
command 2>> file          Append error messages from command to file
command &> file           Redirect all output (stdout and stderr) of command to file
command > file 2>&1
command 1> file 2>&1
command 2> file 1>&2
command 2> file >&2

Note: Redirections are processed from left to right, so a duplication such as 2>&1 must come after
the redirection to file; written the other way round (command 2>&1 > file), only stdout reaches
the file.
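
Because the shell processes redirections from left to right, the position of 2>&1 matters. The sketch below illustrates this with a hypothetical helper function, both, that writes one line to each channel; the file names are illustrative as well:

```shell
work=$(mktemp -d) && cd "$work"

# Hypothetical helper: one line on stdout, one line on stderr
both() { echo "to stdout"; echo "to stderr" >&2; }

# stdout is pointed at all.txt first, then stderr is made a copy of it:
# both lines end up in all.txt
both > all.txt 2>&1

# Reversed order: stderr copies the *current* stdout (captured here in
# the variable leaked), and only afterwards is stdout sent to only_out.txt
leaked=$(both 2>&1 > only_out.txt)

wc -l < all.txt          # 2
cat only_out.txt         # to stdout
echo "$leaked"           # to stderr
```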

The find command illustrates the reason for separating stdout from stderr, especially when run
as an unprivileged user. For example, the following find command searches for all files named
passwd in the /etc directory and its subdirectories. It usually produces many “permission denied”
error messages:

[student@StudentHost ~]$ find /etc -name passwd

To discard all the error messages, use:


[student@StudentHost ~]$ find /etc -name passwd 2> /dev/null

Here the stderr output is redirected to the /dev/null device, which discards all data written to
it, so the error messages never reach the terminal window. /dev/null accepts any amount of data,
but nothing can be retrieved from it: anything written to /dev/null is lost forever. For this
reason, /dev/null is useful for discarding unwanted output from commands.


To save the list of matching paths to the fout file, use:


[student@StudentHost ~]$ find /etc -name passwd >fout 2> /dev/null

This command line redirects matching paths to the fout file, discards errors, and sends nothing to
the terminal window.

Note: If the fout file exists, it will be overwritten. If it does not exist, it will be created.

You can append the result of any other command to the same fout file, for instance:
[student@StudentHost ~]$ ls -l file1 file2 file5 >>fout 2> /dev/null

This command line appends the non-error results to the end of the fout file, sends the errors to
the /dev/null device, and sends nothing to the terminal window. If the file does not exist, it
will be created.

You can also redirect both stdout and stderr to the same fout file:
[student@StudentHost ~]$ find /etc -name passwd >fout 2>&1
The input of a command can also be redirected from a file. Just as > is used for output
redirection, < is used to redirect the input of a command: commands that normally take their
input from the standard input can read it from a file instead. For example, the following
command counts the number of lines in the fout file generated above:
[student@StudentHost ~]$ wc -l <fout
or
[student@StudentHost ~]$ wc -l 0<fout
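
The difference between passing a file name and redirecting input can be seen in a small sketch; the contents of fout here are illustrative, created in a scratch directory:

```shell
work=$(mktemp -d) && cd "$work"
printf 'one\ntwo\nthree\n' > fout

# With a file name argument, wc prints the count followed by the name
wc -l fout

# With input redirection, wc reads channel 0 and never learns the name,
# so only the count is printed
wc -l < fout
count=$(wc -l < fout)
```

The redirected form is convenient when only the number is wanted, for instance to store it in a variable as on the last line.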

tr command

The tr command is another example of input redirection: it does not accept filenames as
arguments, so its input must be redirected from a file or a pipe:

[student@StudentHost ~]$ tr 'A-Z' 'a-z' < .bash_profile

This command translates the uppercase characters in .bash_profile to lowercase characters, and
the command line:

[student@StudentHost ~]$ tr 'A-Z' 'a-z' < .bash_profile >fout

does the same thing, but redirects the result to the fout file.
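
A minimal sketch of the same translation on a small sample file (sample.txt and lower.txt are illustrative names, used so that the real .bash_profile is left alone):

```shell
work=$(mktemp -d) && cd "$work"
printf 'Hello World\nPATH=/usr/BIN\n' > sample.txt

# tr reads only from stdin, so the file is supplied with <
# and the translated text is saved with >
tr 'A-Z' 'a-z' < sample.txt > lower.txt
cat lower.txt

# The input can equally come from a pipe
echo 'MiXeD' | tr 'A-Z' 'a-z'
```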

2. Pipes and Filters


A pipe is a mechanism by which the output of one command becomes the input of a second:


Figure 2: Piping information from one command to another

To create a pipe, use the “|” character, with the command that writes to stdout on the left and
the command that reads from stdin on the right. You can use multiple pipes on one command line.
Pipes also help reduce the amount of information displayed on the terminal window: for example,
you can pipe the output of the cat command to less, which shows only one screenful of content at
a time:

[student@StudentHost ~]$ cat file2 | less

Suppose you wanted to run two commands back to back and send their output through a pipe
like this:

[student@StudentHost ~]$ cal 2019; cal 2020 | lpr

You would find that only the 2020 calendar was printed, while the 2019 calendar went to the
terminal window, because the pipe applies only to the last command. This can be solved by
grouping the commands in parentheses:

[student@StudentHost ~]$ (cal 2019; cal 2020) | lpr
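
The effect of the parentheses can be checked without a printer. In this sketch, wc -l stands in for lpr and two echo commands stand in for cal; both substitutions are assumptions made so that the example runs anywhere:

```shell
work=$(mktemp -d) && cd "$work"

# Without grouping, the pipe applies only to the second echo:
# "first" is printed as-is and wc counts a single line
echo first; echo second | wc -l

# With parentheses, both commands run in a subshell whose combined
# stdout feeds the pipe, so wc counts two lines
(echo first; echo second) | wc -l
grouped=$( (echo first; echo second) | wc -l )
```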

tee command

$ command1 | tee filename | command2

The tee command sends its input to multiple targets. It reads the stdout of command1 from its
stdin, stores a copy in filename, and passes the same data through the pipe to command2. For
instance, in the next command line, the output of the set command is written to the file set.out
while also being piped to less:
[student@StudentHost ~]$ set | tee set.out | less

The set command sets or unsets shell variables. When used without any argument, however, it
prints a list of all variables, including environment and shell variables, and shell functions.

Note: stderr is not forwarded across the pipes.


Here are some other examples:

While the date command shows the current date and time, the uptime command reports how long your
system has been running, the number of users with running sessions, and the system load averages
for the past 1, 5, and 15 minutes. The -a option of tee appends the result to file2 instead of
overwriting it.
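
A minimal sketch of tee with and without -a, run in a scratch directory. The second command uses echo as a stand-in so that the example does not depend on uptime being installed; any command, uptime included, could stand on the left of the pipe:

```shell
work=$(mktemp -d) && cd "$work"

# tee writes its input to file2 and also passes it on to stdout
date | tee file2

# With -a, tee appends to file2 instead of overwriting it
echo "appended line" | tee -a file2

# file2 now holds both results: the date line, then the appended line
lines=$(wc -l < file2)
cat file2
```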

Linux has filters (e.g., grep, sed) that take the standard input, do something useful with it,
and then return the result on the standard output. Filters use Regular Expressions (or REGEX)
for complex searches.

grep command

grep stands for “global regular expression print”. It displays the lines in a file that match a pattern.
It can also process standard input. The pattern may contain regular expression metacharacters.
For example, the following command lists the lines containing ‘bash’ in the fout file:

[student@StudentHost ~]$ grep 'bash' fout

The following table shows common grep options:

Option   Description
-v       return lines that do not contain the pattern
-n       precede returned lines with line numbers
-c       only return a count of lines with the matching pattern
-l       only return the names of files that have at least one line containing the pattern
-i       perform a case-insensitive search


Here are some grep examples:
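
The options above can be tried on a small sample file. The file name users.txt and its contents are illustrative, not taken from a real system:

```shell
work=$(mktemp -d) && cd "$work"
printf 'root:/bin/bash\ndaemon:/sbin/nologin\nstudent:/bin/bash\nBash fan:/bin/sh\n' > users.txt

grep 'bash' users.txt        # lines containing bash (case-sensitive)
grep -v 'bash' users.txt     # lines that do not contain bash
grep -n 'bash' users.txt     # matching lines, prefixed with line numbers
grep -c 'bash' users.txt     # number of matching lines: 2
grep -i 'bash' users.txt     # also matches the line with "Bash"
grep -l 'bash' users.txt     # names of files with at least one match
```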

regular expressions
“Regular expressions” (or REGEX) are text-matching patterns written in a standard and
well-characterized pattern-matching language. They are a universal standard used by most programs
that do pattern matching, although there are minor variations among implementations. REGEX
are used to parse and manipulate text. For example, the command:

[student@StudentHost ~]$ grep '^root' /etc/passwd

shows the lines beginning with ‘root’ in the passwd file. The common symbols used with text
patterns to form REGEX are listed in the following table:


Symbol        Matches
. (period)    any single occurrence of any character except a newline
[chars]       any character from a given set
[^chars]      any character not in a given set
^             the beginning of a line
$             the end of a line
\w            any “word” character (same as [A-Za-z0-9_])
\d            any digit (same as [0-9])
|             either the element to its left or the one to its right
(expr)        limits scope, groups elements, allows matches to be captured
?             zero or one match of the preceding element
*             zero, one, or many matches of the preceding element
+             one or more matches of the preceding element
{n}           exactly n instances of the preceding element
{min,}        at least min instances (note the comma)
{min,max}     any number of instances from min to max

Note: grep treats ?, +, |, ( ) and { } as special only in its extended syntax (grep -E); in the
basic syntax, braces are written \{ \}. \d is a Perl-style shorthand that grep does not recognize
in its basic or extended syntax; use [0-9] there (GNU grep accepts \d only with the -P option).

REGEX examples

▪ [a-z] exactly one lowercase letter
▪ [a-z]* zero or more lowercase letters
▪ [a-z]+ one or more lowercase letters
▪ [a-zA-Z0-9] one lowercase or uppercase letter, or a digit
▪ [^(] anything that is not '('
▪ ‘[^aeiou]’ any character except ‘a’, ‘e’, ‘i’, ‘o’, or ‘u’
▪ ‘[ab^&]’ any character among ‘a’, ‘b’, ‘^’, or ‘&’
▪ ‘[aA]wk’ ‘awk’ or ‘Awk’
▪ ‘[1-9]’ is the same as ‘[123456789]’
▪ ‘[abcde]’ is equivalent to ‘[a-e]’
▪ ‘[abcde123456789]’ is equivalent to ‘[a-e1-9]’
▪ ‘abc*d’ any string among ‘abd’, ‘abcd’, ‘abccd’, ‘abcccd’, or even ‘abcccccccccccccccccccccccccccccccccccd’
▪ ‘^The’ any ‘The’ that forms the first characters on a line
▪ ‘well$’ matches ‘well’ only if these are the last characters on a line, just before the NEWLINE character
▪ ‘^Ken$’ matches only a line that consists of ‘Ken’ and nothing else
▪ '[0-9]\{3\}-[0-9]\{4\}' three digits, a hyphen, then four digits (999-9999, like phone numbers)
▪ '"smug"' the word smug within double quotes
▪ 'B[oO][bB]' any string among BOB, Bob, BOb or BoB


egrep (grep with Extended Regular Expressions) is a similar command that uses a more powerful
set of regular expressions; it behaves exactly like grep -E, which is the form recommended by
current GNU grep:

‘ab+c’ matches ‘a’ followed by one or more ‘b’s followed by one ‘c’

‘ab|ac’ matches either ‘ab’ or ‘ac’

‘(D|N)\.Michel’ matches ‘D.Michel’ or ‘N.Michel’ (the period is escaped; an unescaped . would match any character)
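
Several of the patterns above can be checked against a small word list. The file words.txt and its contents are illustrative, and grep -E is used in place of egrep:

```shell
work=$(mktemp -d) && cd "$work"
printf 'abd\nabcd\nabccd\nacd\nKen\nKenneth\ncall 555-1234\nD.Michel\nN.Michel\nX.Michel\n' > words.txt

grep 'abc*d' words.txt                  # abd, abcd, abccd
grep '^Ken$' words.txt                  # only the line that is exactly Ken
grep '[0-9]\{3\}-[0-9]\{4\}' words.txt  # basic syntax: braces are escaped
grep -E '[0-9]{3}-[0-9]{4}' words.txt   # extended syntax: no backslashes
grep -E 'ab+c' words.txt                # abcd, abccd
grep -E '(D|N)\.Michel' words.txt       # D.Michel and N.Michel
```

Quoting each pattern with single quotes keeps the shell from interpreting characters such as *, | and $ before grep sees them.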

sed command

sed stands for stream editor; it performs basic text transformations on an input stream (a file
or input from a pipeline). It is very helpful for using regular expressions to change something
in the text. For example, the following command substitutes the ‘BASH’ string with the ‘SH’
string:

[student@StudentHost ~]$ cat fout | sed 's/BASH/SH/'

This command replaces the first occurrence of ‘BASH’ on each line containing it. To replace all
occurrences of ‘BASH’, you should use the following command line:

[student@StudentHost ~]$ cat fout | sed 's/BASH/SH/g'

sed makes no change to the original input file. It can also show selected lines from a given file:

[student@StudentHost ~]$ sed -n '1,2p' fout # prints the first two lines from the fout file

[student@StudentHost ~]$ sed -n '/Linux/p' fout # prints any line containing Linux from the fout file

[student@StudentHost ~]$ sed '1,2d' fout # prints the fout file without lines 1 and 2
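
These sed commands can be tried on a three-line sample; the contents of fout below are illustrative:

```shell
work=$(mktemp -d) && cd "$work"
printf 'BASH and BASH\nLinux shell\nlast line\n' > fout

sed 's/BASH/SH/' fout      # first occurrence on each line: SH and BASH
sed 's/BASH/SH/g' fout     # every occurrence: SH and SH
sed -n '1,2p' fout         # print only lines 1 and 2
sed -n '/Linux/p' fout     # print only lines containing Linux
sed '1,2d' fout            # print everything except lines 1 and 2

cat fout                   # the file itself is unchanged
```

The -n option suppresses the default printing of every line, which is why it is paired with p but left out when d is used.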

Notes:

• It is considered good practice to always quote your regular expressions;
• The difference between the * in a regex and the shell’s usage:
  - In a regex, the * stands for zero or more occurrences of a single preceding character
  - In the shell, the * stands for any number of characters that may or may not be different
• The ‘-’ character has a special meaning in a character class BUT ONLY if it is used within a
  range
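
The two meanings of * can be compared side by side. The sketch assumes a scratch directory containing four illustrative file names:

```shell
work=$(mktemp -d) && cd "$work"
touch ab abb abc b.txt

# Shell: * expands to file names; ab* means "ab followed by anything"
echo ab*                      # ab abb abc

# Regex: b* means "zero or more b". The pattern is quoted so that the
# shell does not expand it before grep sees it
ls | grep '^ab*$'             # ab and abb, but not abc
```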

3. Text Processing Tools


The Linux bash shell has a number of useful tools for text processing tasks: tools for analyzing
files, extracting what we need, and then displaying the required content in the terminal window.


head and tail commands

The head command outputs the first lines (default: 10) of files, while the tail command outputs
the last lines (default: 10) of files. Here are some examples:
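
A minimal sketch, assuming a sample file numbers.txt that holds the numbers 1 to 20, one per line:

```shell
work=$(mktemp -d) && cd "$work"
seq 1 20 > numbers.txt

head numbers.txt          # first 10 lines: 1 to 10
tail numbers.txt          # last 10 lines: 11 to 20
head -n 3 numbers.txt     # first 3 lines
tail -n 3 numbers.txt     # last 3 lines
```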

wc command

This command prints line, word, and byte counts for a given file. It can compute these statistics
for files or for the output of other commands passed to it through a pipe.
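
A short sketch on an illustrative two-line file, notes.txt:

```shell
work=$(mktemp -d) && cd "$work"
printf 'one two\nthree\n' > notes.txt

wc notes.txt              # lines, words, bytes, then the file name
wc -l < notes.txt         # lines only: 2
wc -w < notes.txt         # words only: 3
wc -c < notes.txt         # bytes only: 14
ls | wc -l                # count the entries produced by another command
```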

sort command

The sort command sorts lines of text files. Like the cat command, it can concatenate multiple files,
but it prints the sorted result of concatenation. The sort command has the following syntax:

$ sort [options] file(s)


The common options are described in the following table:

Option    Description
-r        Reverse sort to sort descending
-n        Numeric sort
-f        Ignore case of characters in strings
-u        Unique (remove duplicate lines in output)
-t 'x'    Use 'x' as field separator
-k pos1   Sort from field pos1

By default, the sort command sorts the lines in alphabetical order. However, it can also sort
them in numeric order with the -n option.

Suppose you need to sort the /etc/passwd file on its first field, using ‘:’ as the separator,
and you are interested in the first three lines only.

Suppose you need to sort the files in your home directory by size (from smaller to larger), and
you need to know the three biggest files.
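
The three sorting tasks above can be sketched as follows. To stay independent of the local system, the sketch uses a scratch directory and a small illustrative file, passwd.sample, in place of /etc/passwd and the home directory; it also assumes that the size is the fifth column of the ls -l output:

```shell
work=$(mktemp -d) && cd "$work"
printf '10\n9\n100\n' > nums.txt
printf 'root:x:0\nbin:x:1\nadm:x:3\ndaemon:x:2\n' > passwd.sample
seq 1 200 > big.txt          # a larger file for the size comparison

sort nums.txt                # alphabetical: 10, 100, 9
sort -n nums.txt             # numeric: 9, 10, 100

# Sort on the first ':'-separated field and keep the first three lines
sort -t ':' -k 1 passwd.sample | head -n 3

# Sort a long listing by size (field 5) and keep the three largest entries
ls -l | sort -n -k 5 | tail -n 3
```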

uniq command
If there is duplication in some lines, the uniq command detects the adjacent duplicate lines,
removes the repeated lines and keeps only one. The following table shows some options used
with this command:


Option   Description
-u       Print only unique lines
-d       Print only duplicated lines
-c       Prefix line with the number of its occurrences

Here are some examples:


Note: Because the uniq command only detects adjacent duplicate lines, it is almost always used
in conjunction with the sort command.
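
A minimal sketch on an illustrative file, fruit.txt, in which one duplicate is adjacent and one is not:

```shell
work=$(mktemp -d) && cd "$work"
printf 'apple\nbanana\napple\napple\ncherry\n' > fruit.txt

uniq fruit.txt               # only the adjacent pair of apple lines is merged
sort fruit.txt | uniq        # apple, banana, cherry
sort fruit.txt | uniq -c     # each line prefixed with its count
sort fruit.txt | uniq -d     # lines that occur more than once: apple
sort fruit.txt | uniq -u     # lines that occur exactly once: banana, cherry
```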

cut command
This command displays specific columns of a file or stdin.

Option   Description
-d       Specify the column delimiter (default is TAB)
-f       Specify the column to print
-c       Cut by characters

Here are some examples:
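
A short sketch, using an illustrative two-line file passwd.sample with ‘:’ as the delimiter:

```shell
work=$(mktemp -d) && cd "$work"
printf 'root:x:0:0\nstudent:x:1000:1000\n' > passwd.sample

cut -d ':' -f 1 passwd.sample      # first field: root, student
cut -d ':' -f 1,3 passwd.sample    # fields 1 and 3: root:0, student:1000
cut -c 1-4 passwd.sample           # first four characters: root, stud
```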

paste command
The paste command merges corresponding or subsequent lines of files. The general syntax for
the paste command is as follows:


$ paste [OPTION]... [FILE]...

You can use the -d option to specify a delimiter instead of the TAB separator.

[student@StudentHost ~]$ paste -d '_' b.txt c.txt
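
A minimal sketch, assuming illustrative contents for b.txt and c.txt. The last command uses the -s option, which joins the subsequent lines of a single file:

```shell
work=$(mktemp -d) && cd "$work"
printf '1\n2\n3\n' > b.txt
printf 'one\ntwo\nthree\n' > c.txt

paste b.txt c.txt            # corresponding lines joined by a TAB
paste -d '_' b.txt c.txt     # 1_one, 2_two, 3_three
paste -s b.txt               # lines of one file joined on a single line
```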

diff command
The diff command compares two files line by line. It can also compare the contents of directories. It
is most commonly used to create a patch containing the differences between two versions of one or
more files. For example, if you want to compare c.txt with cc.txt, which is a new version of the
same file, you could write:

[student@StudentHost ~]$ diff c.txt cc.txt


Remarks

• < indicates a line in the first file;
• > indicates a line in the second file;
• c indicates that a line changed;
• d indicates that a line is deleted;
• a indicates that a line is added.
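
A minimal sketch of how these markers appear, assuming two small illustrative versions of c.txt and cc.txt:

```shell
work=$(mktemp -d) && cd "$work"
printf 'alpha\nbeta\ngamma\ndelta\n' > c.txt
printf 'alpha\nBETA\ngamma\ndelta\nepsilon\n' > cc.txt

# diff exits with status 1 when the files differ, hence "|| true"
diff c.txt cc.txt > changes.txt || true
cat changes.txt
```

Here diff reports 2c2 for the changed second line, followed by the old line marked with < and the new one marked with >, and 4a5 for the line added after line 4.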

pr command
This command converts text files into a paginated, columned version. If no file is specified, pr
reads standard input. By default, pr formats files into single-column pages of 66 lines. To print
the formatted document, pipe the output of pr to lpr.
pr syntax is:

$ pr [options] [arguments]

For example, the command line:

[student@StudentHost ~]$ ls -a | pr -n -h "Files in $(pwd)" > Result.txt

fetches a listing of all files in the current directory using the ls command, and pipes the output
to pr, which formats the data in a printer-friendly format with a custom header and numbered lines.
The formatted pr output is written to the file Result.txt, which can then be printed.
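
A reduced sketch of the same pipeline in a scratch directory. The two file names and the header text are illustrative, and the files are named explicitly so that Result.txt itself does not appear in the listing:

```shell
work=$(mktemp -d) && cd "$work"
touch a.txt b.txt

# -n numbers the lines, -h replaces the file name in the page header
ls a.txt b.txt | pr -n -h "Files in demo" > Result.txt

head -n 8 Result.txt      # page header, then the numbered listing
```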


Questions
1. How many files are found in the /usr/bin directory?
2. How many times does the string conf appear in the file names of the /etc directory?
3. How many directories (not sub-directories) are found in the /etc directory?
4. From the /etc/passwd file, display the line of any account that starts with the letter ‘C’.
5. How many lines are found in the /etc/passwd file?
6. Display a list of usernames (and no other data) from the /etc/passwd file.
7. From the /etc/passwd file, display the line for any account that is using the bash shell.
8. From the /etc/passwd file, display the line for any account that is not using the bash shell.
9. From the /etc/passwd file, display the lines that contain the word root. Display only the
filenames and do not print errors.
10. Create a sorted list of all bash users and store it in users.txt.
11. Create a sorted list of all logged on users and store it in onUsers.txt.
12. Create a sorted list of all filenames stored in the /etc directory that contain the string conf at
the end of their filename.
13. Create a sorted list of all files stored in the /etc directory that contain the case insensitive
string conf in their filename.
14. Write a command line that displays only the IP address and the subnet mask from the output of
/sbin/ifconfig.
15. What command line should you type to remove all non-letters from a stream?
16. What command line should you type to read a text file and output each word on a separate
line?
17. What command line should you type to keep only lowercase letters from a stream?
18. What command line should you type to keep only lowercase letters and digits from a stream?
19. Create a sorted list of all users whose UID is greater than 510 and append it to the file users.txt.
20. Open two shells on the same computer. Create an empty story.txt file. Then type tail -f
story.txt. Use the second shell to append a line of text to that file. Verify that the first shell
displays this line.

