Lab 7 - Working With Text
7.1 Introduction
This is Lab 7: Working with Text. By performing this lab, students will learn how to redirect text
streams, use regular expressions and use commands for filtering text files.
In this lab, you will perform the following tasks:
Learn how to redirect and pipe standard input, output and error channels.
Use regular expressions to filter the output of commands or file content.
View large files or command output with pagers and with commands that display selected portions.
7.2.1 Step 1
Use the redirection symbol > with the echo command to send stdout to a file instead of its normal
destination, the terminal. The cat command displays file contents and is used in this example to
verify that the output reached the file. Type the following:
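A minimal sketch of this step. The file name mymessage matches the one used in Step 3; the message
text "Hello World" is an assumption:

```shell
# Send the output of echo to the file mymessage instead of the terminal
echo "Hello World" > mymessage

# Display the file to confirm the text was written
cat mymessage
```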
7.2.2 Step 2
When you use the > symbol to redirect stdout, the contents of the file are first destroyed. Type the
following commands to see a demonstration:
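A minimal sketch of such a demonstration, assuming the file mymessage from Step 3 and two made-up
messages:

```shell
# Create the file with some known contents
echo "Hello World" > mymessage
cat mymessage

# Redirect to the same file again with a single >; the old contents are replaced
echo "Greetings" > mymessage
cat mymessage
```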
Notice that redirecting with a single > symbol overwrites an existing file. This is called "clobbering" a file.
7.2.3 Step 3
You can avoid clobbering a file by using >> instead of >. By using >> you append to a file. Execute
the following commands to see a demonstration of this:
cat mymessage
echo "How are you?" >> mymessage
cat mymessage
Notice that by using >> all existing data is preserved and the new data is appended at the end of the
file.
7.2.4 Step 4
The find command is a good way to demonstrate how stderr works. This very flexible command can
search by criteria such as file name, size, date, type and permissions. The find command begins
the search in the directory specified and recursively searches all of its subdirectories. For
example, to search, starting in your home directory, for files whose names contain bash:
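A sketch of the search described above. The quoted pattern "*bash*" is an assumption about how
"containing the name bash" would be written; the quotes stop the shell from expanding the asterisks
before find sees them:

```shell
# Start in the home directory (~) and match any name that contains "bash"
find ~ -name "*bash*"
```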
Notice the error message indicating you do not have permission to access certain files/directories.
This is because as a regular user, you don't have the right to "look inside" some directories. These
types of error messages are sent to stderr, not stdout.
The find command is beyond the scope of this course. The purpose of using the command is to
demonstrate the difference between stdout and stderr.
7.2.5 Step 5
To redirect stderr (error messages) to a file, issue the following command:
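A sketch of the redirection, applied to the find search from Step 4. The search pattern is an
assumption; err.txt is the file named in the text:

```shell
# 2> sends stderr (file descriptor 2) to err.txt; stdout still goes to the terminal
find ~ -name "*bash*" 2> err.txt

# Review any saved error messages afterwards
cat err.txt
```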
Recall that the file descriptor for stderr is the number 2, so it is used along with the > symbol to
redirect the stderr output to a file called err.txt. Note that 1> is the same as >.
The previous example demonstrates why knowing redirection is important. If you want to "ignore" the
errors that the find command displays, you can redirect those messages into a file and look at them
later, making it easier to focus on the rest of the output of the command.
7.2.6 Step 6
You can also redirect stdout and stderr into two separate files.
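One way this might look, again using the find search from Step 4. The file names std.out and
std.err are hypothetical, chosen only for illustration:

```shell
# stdout goes to std.out; stderr goes to std.err (written with a space after 2>)
find ~ -name "*bash*" > std.out 2> std.err

# Inspect each stream separately
cat std.out
cat std.err
```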
Note that a space is permitted but not required after the > redirection symbol.
7.2.7 Step 7
To redirect both standard output (stdout) and standard error (stderr) to one file, first
redirect stdout to a file and then redirect stderr to that same file by using the notation 2>&1.
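A sketch of the combined redirection, assuming the find search from Step 4 and a hypothetical
output file named find.out:

```shell
# > find.out redirects stdout first; 2>&1 then points stderr at the same destination
find ~ -name "*bash*" > find.out 2>&1

# Results and error messages are now in the same file
cat find.out
```

The order matters: writing 2>&1 before > find.out would leave stderr attached to the terminal.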
7.2.8 Step 8
Standard input (stdin) can also be redirected. Normally stdin comes from the keyboard, but
sometimes you want it to come from a file instead. For example, the tr command translates
characters, but it only accepts data from stdin, never from a file name given as an argument. This
is great when you want to do something like capitalize data that is entered from the keyboard (Note:
press Control+d to signal the tr command to stop processing standard input):
tr a-z A-Z
this is interesting
how do I stop this?
^D
7.2.9 Step 9
The tr command accepts keyboard input (stdin), translates the characters and then writes the
result to stdout. To create a file of all lower-case characters, redirect that output to a file by
executing the following:
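A sketch of this step. In the lab the input lines are typed at the keyboard and ended with
Control+d; here a here-document stands in for the keyboard so the block can run unattended. Both
sample lines are assumptions:

```shell
# Translate upper-case letters to lower-case and store the result in myfile
tr A-Z a-z > myfile <<'EOF'
Wow, I SEE NOW
This WORKS!
EOF

# Confirm that the file contains only lower-case text
cat myfile
```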
Press the Enter key to make sure your cursor is on the line below "This works!", then
use Control+d to stop input. To verify you created the file, execute the following command:
cat myfile
7.2.10 Step 10
Execute the following commands to use the tr command by redirecting stdin from a file:
cat myfile
tr a-z A-Z < myfile
7.2.11 Step 11
Another popular form of redirection is to take the output of one command and send it into another
command as input. For example, the output of some commands can be massive, resulting in the
output scrolling off the screen too quickly to read. Execute the following command to take the output
of the ls command and send it into the more command, which displays one page of data at a time:
ls -l /etc | more
Press the spacebar to advance one page at a time. To leave the listing early, press q (CTRL+c
also works).
The cut command is useful for extracting fields from files that are either delimited by a character,
like the colon : in /etc/passwd, or that have a fixed width. It will be used in the next few examples
as it typically provides a great deal of output that we can use to demonstrate using the | character.
7.2.12 Step 12
In the following example, you will use a command called cut to extract all of the usernames from a
database called /etc/passwd (a file that contains user account information). First, try running
the cut command by itself:
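A sketch of the cut command for this step, assuming the colon delimiter described earlier and that
the username is the first field of each line:

```shell
# -d: sets the field delimiter to a colon; -f1 selects the first field (the username)
cut -d: -f1 /etc/passwd
```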
7.2.13 Step 13
The output in the previous example was unordered and scrolled off the screen. In the next step you
are going to take the output of the cut command and send it into the sort command to provide
some order to the output:
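A sketch of the pipeline, assuming the same cut options (colon delimiter, first field) as the
previous step:

```shell
# Extract the usernames, then pass them to sort for alphabetical ordering
cut -d: -f1 /etc/passwd | sort
```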
7.2.14 Step 14
Now the output is sorted, but it still scrolls off the screen. Send the output of the sort command to
the more command to solve this problem:
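A sketch of the three-stage pipeline, assuming the same cut options as before. Each | feeds the
stdout of the command on its left into the stdin of the command on its right:

```shell
# Extract the usernames, sort them, and page through the result one screen at a time
cut -d: -f1 /etc/passwd | sort | more
```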
10.3.1 Step 1
The /etc/passwd file is likely too large to be displayed on the screen without scrolling. To
see a demonstration of this, use the cat command to display the entire contents of
the /etc/passwd file:
cat /etc/passwd
10.3.2 Step 2
Use the more command to display the entire contents of the /etc/passwd file:
more /etc/passwd
Your output should be similar to the following:
Note
The --More--(92%) indicates you are "in" the more command and 92% through the current data.
10.3.3 Step 3
While you are in the more command, you can view the help screen by pressing the h key:
10.3.4 Step 4
Press the Spacebar to view the rest of the document:
<SPACE>
In the next example, you will learn how to search a document using either
the more or less commands.
Searching for a pattern within both the more and less commands is done by typing the slash /,
followed by the pattern to find. If a match is found, the screen should scroll to the first match. To
move forward to the next match, press the n key. With the less command you can also move
backwards to previous matches by pressing the N (capital n) key.
10.3.5 Step 5
Use the less command to display the entire contents of the /etc/passwd file. Then search for the
word bin, use n to move forward, and N to move backwards. Finally, quit the less pager by typing
the letter q:
less /etc/passwd
/bin
nnnNNNq
Important
Unlike the more command which automatically exits when you reach the end of a file, you must
press a quit key such as q to quit the less program.
10.3.6 Step 6
You can use the head command to display the top part of a file. By default, the head command will
display the first ten lines of the file:
head /etc/passwd
10.3.7 Step 7
Use the tail command to display the last ten lines of the /etc/passwd file:
tail /etc/passwd
10.3.8 Step 8
Use the head command to display the first two lines of the /etc/passwd file:
head -2 /etc/passwd
10.3.9 Step 9
Execute the following command line to pipe the output of the ls command to the tail command,
displaying the last five file names in the /etc directory:
ls /etc | tail -5
10.3.10 Step 10
Another way to specify how many lines to output with the head command is to use the option -n -#,
where # is the number of lines, counted from the bottom of the output, to exclude. Notice the minus
symbol - in front of the #. For example, if the /etc/passwd file contains 27 lines, the following
command will display lines 1-7, excluding the last twenty lines:
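In the lab this would be head -n -20 /etc/passwd. Because the length of /etc/passwd varies between
systems, the sketch below checks the arithmetic on a stand-in input of exactly 27 numbered lines
generated with seq (an addition, not part of the lab). It assumes GNU head, which accepts a
negative count:

```shell
# Generate 27 numbered lines, then drop the last 20; lines 1 through 7 remain
seq 1 27 | head -n -20
```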
10.4.1 Step 1
In its simplest form, grep searches for a given string of characters, such as sshd, in a file
such as /etc/passwd. The grep command prints the entire line containing the match:
cd /etc
grep sshd passwd
sysadmin@localhost:~$ cd /etc
sysadmin@localhost:/etc$ grep sshd passwd
sshd:x:106:65534::/run/sshd:/usr/sbin/nologin
sysadmin@localhost:/etc$
10.4.2 Step 2
Regular expressions are "greedy" in the sense that grep will match every instance of the
specified pattern, wherever it occurs on a line, not just the first one:
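A sketch, assuming the pattern root (the pattern used in the next step). The full path /etc/passwd
is used so the command works from any directory; after cd /etc, the name passwd alone is equivalent:

```shell
# Print every line that contains "root" anywhere, however many times it appears
grep root /etc/passwd
```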
10.4.3 Step 3
To limit the output, you can use regular expressions to specify a more precise pattern. For example,
the caret ^ character can be used to match a pattern at the beginning of a line; so, when you
execute the following command line, only lines that begin with root should be matched and
displayed:
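A sketch of the anchored search, written against the full path /etc/passwd and quoted as the Best
Practice in this step recommends:

```shell
# ^ anchors the pattern to the start of the line, so only lines beginning with "root" match
grep '^root' /etc/passwd
```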
Note that there are two additional instances of the word root but only the one appearing at the
beginning of the line is matched (displayed in red).
Best Practice
Use single quotes (not double quotes) around regular expressions to prevent the shell program from
trying to interpret them.
10.4.4 Step 4
Match the pattern sync anywhere on a line:
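A sketch of the unanchored search, assuming /etc/passwd is the file being searched as in the
earlier steps:

```shell
# No anchors: any line containing "sync" at any position is printed
grep 'sync' /etc/passwd
```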
10.4.5 Step 5
Use the $ symbol to match the pattern sync at the end of a line:
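A sketch of the end-of-line anchor, assuming /etc/passwd is the file being searched. The single
quotes keep the shell from treating $ as the start of a variable name:

```shell
# $ anchors the pattern to the end of the line, so "sync" must be the last four characters
grep 'sync$' /etc/passwd
```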
The command in the previous step matched every instance of sync; this one matches only the
instance at the end of the line.
10.4.6 Step 6
Use the period character . to match any single character. For example, execute the following
command to match any character followed by a 'y':
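A sketch of the single-character wildcard, assuming /etc/passwd is the file being searched:

```shell
# . stands for exactly one character, so a "y" at the very start of a line would not match
grep '.y' /etc/passwd
```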
10.4.7 Step 7
The pipe character, |, or "alternation operator", acts as an "or" operator. For example, execute the
following to attempt to match either sshd, root or operator:
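A sketch of the attempt, assuming /etc/passwd is the file being searched. The || echo fallback is
an addition that makes the empty result visible; it is not part of the lab command:

```shell
# In basic mode the | is a literal character, so grep looks for the text "sshd|root|operator"
grep 'sshd|root|operator' /etc/passwd || echo "no lines matched"
```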
Observe that the grep command does not recognize the pipe as the alternation operator by default.
Instead, grep treats the pipe as a plain character in the pattern to be matched.
Either grep -E or egrep enables extended regular expressions, including alternation.
10.4.8 Step 8
Use the -E switch to allow grep to operate in extended mode in order to recognize the alternation
operator:
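A sketch of the extended-mode search, using the three alternatives from Step 7 and assuming
/etc/passwd is the file being searched:

```shell
# With -E the | is the alternation operator: lines containing any one of the three words match
grep -E 'sshd|root|operator' /etc/passwd
```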
10.4.9 Step 9
Use another extended regular expression, this time with egrep, placing the alternation inside a
group. The strings nob and non will match:
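A sketch of a grouped alternation that matches nob and non. The exact pattern no(b|n) is an
assumption inferred from those two strings, and the command is written with grep -E, which Step 7
describes as interchangeable with egrep:

```shell
# The group (b|n) limits the alternation to one character that follows "no"
# egrep 'no(b|n)' /etc/passwd is the equivalent form named in the text
grep -E 'no(b|n)' /etc/passwd
```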
Note
The parentheses, ( ), were used to limit the "scope" of the | character. Without them, a pattern
such as nob|n would have meant "match either nob or n".
10.4.10 Step 10
The [ ] characters can also be used to match a single character. However, unlike the period
character ., the [ ] characters are used to specify exactly what character you want to match. For
example, if you want to match a numeric character, you can specify [0-9]. Execute the following
command for a demonstration:
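A sketch of the bracket expression, assuming /etc/passwd is the file being searched and that head
is used with its default of ten lines:

```shell
# [0-9] matches any single digit; head keeps only the first ten matching lines
grep '[0-9]' /etc/passwd | head
```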
Note
The head command was used to limit the output of the grep command.
10.4.11 Step 11
Suppose you want to search for a pattern containing a sequence of three digits. You can
use { } characters with a number to express that you want to repeat a pattern a specific number of
times; for example: {3}. The use of this numeric quantifier requires the extended mode of grep:
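A sketch that combines the [0-9] bracket expression from the previous step with {3}, assuming
/etc/passwd is the file being searched:

```shell
# [0-9]{3} matches three digits in a row; -E enables the { } repetition syntax
grep -E '[0-9]{3}' /etc/passwd
```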