My Unix Material
Grep is one of the most frequently used commands in Unix (or Linux). Most of us use grep only to find words in a file.
The power of grep comes from its options and regular expressions. You can analyze large sets of log files
with the help of the grep command.
The name grep comes from the ed editor command g/re/p: globally search for a regular expression and print.
Re-running the last grep command from the shell history saves a lot of time if you execute the same command again and again.
!grep
This re-runs the most recent command that started with grep. The shell displays the command and then prints its
result on the terminal.
This is the basic usage of the grep command. It searches for the given string in the specified file.
This searches for the string "Error" in the log file and prints all the lines that contain the word "Error".
Searching several files is also basic usage of the grep command. You can manually list the files you want to search,
or you can specify a file name pattern (a shell wildcard such as *.log, not a regular expression) to search in.
The -i option makes grep search for a string case-insensitively in the given file. It matches words like "UNIX",
"Unix", "unix".
This will search for the lines which start with a number. Regular expressions are a huge topic and I am not
covering them here. This example only illustrates the usage of regular expressions.
By default, grep matches the given string/pattern even if it is found as a substring in a file. The -w option
makes grep match only whole words.
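The three variations above (-i, a regular expression, and -w) can be sketched on a small made-up file. The file contents and variable names are assumptions chosen for illustration, not taken from the original examples.

```shell
# Made-up sample file; its contents are illustrative only.
f=$(mktemp)
printf '%s\n' 'UNIX is old' '9 lives' 'unixware' 'I use unix' > "$f"

any_case=$(grep -i "unix" "$f")     # case-insensitive: UNIX, unixware and unix all match
starts_num=$(grep "^[0-9]" "$f")    # regular expression: lines that start with a digit
whole_word=$(grep -w "unix" "$f")   # whole word only: "unixware" and "UNIX" are skipped

printf '%s\n' "-i:" "$any_case" "regex:" "$starts_num" "-w:" "$whole_word"
rm -f "$f"
```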
Sometimes, when you are searching for an error in a log file, it is good to know the lines around the error
lines to find the cause of the error.
With -B 2, grep prints the matched lines along with the two lines before each of them.
With -A 3, grep displays the matched lines along with the three lines after each of them.
With -C 5, grep displays the matched lines and also five lines before and after each of them.
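A minimal sketch of the three context options, assuming a grep that supports -A, -B and -C (GNU and BSD grep do; they are not part of POSIX). The generated log file is hypothetical.

```shell
# Hypothetical 20-line log with one error on line 10.
log=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 20; i++) print (i == 10 ? "Error at 10" : "line " i) }' > "$log"

before=$(grep -B 2 "Error" "$log")   # match plus the 2 lines before it
after=$(grep -A 3 "Error" "$log")    # match plus the 3 lines after it
around=$(grep -C 5 "Error" "$log")   # match plus 5 lines on each side

echo "$before"
rm -f "$log"
```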
You can search for a string in all the files under the current directory and its sub-directories with the help of the -r option.
grep -r "string" *
You can display the lines that do not match the specified search string or pattern using the -v option.
You can remove the blank lines using the grep command.
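Both uses of -v can be sketched as follows. The sample file is made up, and matching the empty-line pattern '^$' is one common way to drop blank lines, not necessarily the one the original example used.

```shell
# Made-up sample file with one blank line.
f=$(mktemp)
printf '%s\n' 'unix' 'linux' '' 'aix' > "$f"

not_unix=$(grep -v "unix" "$f")   # every line that does NOT contain "unix"
no_blank=$(grep -v '^$' "$f")     # every line that is not empty

echo "$no_blank"
rm -f "$f"
```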
We can find the number of lines that match the given string/pattern using the -c option.
We can display only the names of the files that contain the given string/pattern using the -l option.
15. Display the file names that do not contain the pattern.
We can display the files which do not contain the matched string/pattern.
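The counting and file-listing behaviour described above can be sketched with -c, -l and -L. The directory and file names are hypothetical, and -L is assumed to be available (it is in GNU and BSD grep, but not in POSIX).

```shell
# Hypothetical directory with one matching and one non-matching file.
d=$(mktemp -d)
printf '%s\n' 'unix one' 'aix' 'unix two' > "$d/a.txt"
printf '%s\n' 'aix only' > "$d/b.txt"

count=$(grep -c "unix" "$d/a.txt")                  # number of matching lines
with=$(cd "$d" && grep -l "unix" a.txt b.txt)       # files that contain the string
without=$(cd "$d" && grep -L "unix" a.txt b.txt)    # files that do not contain it

echo "count=$count with=$with without=$without"
rm -rf "$d"
```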
By default, grep displays the entire line which has the matched string. We can make grep display only
the matched string by using the -o option.
grep -o "string" file.txt
We can make the grep command display the line number of each line which contains the matched string
using the -n option.
The -b option makes the grep command display the byte offset, counted from the start of the file, of each
matching line (or of the matched string itself when combined with -o).
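A small sketch of -n and -b on a made-up two-line file. The first line "aix" plus its newline occupies 4 bytes, so the second line starts at byte offset 4. The -b option is assumed to be available (GNU and BSD grep provide it).

```shell
# Made-up sample file.
f=$(mktemp)
printf '%s\n' 'aix' 'my unix os' > "$f"

with_line=$(grep -n "unix" "$f")     # prefix each match with its line number
with_offset=$(grep -b "unix" "$f")   # prefix each match with the byte offset of its line

echo "$with_line"
echo "$with_offset"
rm -f "$f"
```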
The ^ regular expression pattern specifies the start of a line. This can be used in grep to match the lines which
start with the given string or pattern.
The $ regular expression pattern specifies the end of a line. This can be used in grep to match the lines which
end with the given string or pattern.
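The two anchors can be sketched on a made-up file; the contents are illustrative only.

```shell
# Made-up sample file.
f=$(mktemp)
printf '%s\n' 'unix is free' 'I like unix' 'unix' > "$f"

starts=$(grep '^unix' "$f")   # lines that start with "unix"
ends=$(grep 'unix$' "$f")     # lines that end with "unix"

printf '%s\n' "start:" "$starts" "end:" "$ends"
rm -f "$f"
```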
The sed command is mostly used to replace text in a file. The simple sed command below replaces the word
"unix" with "linux" in the file.
Here the "s" specifies the substitution operation. The "/" are delimiters. The "unix" is the search pattern and the
"linux" is the replacement string.
By default, the sed command replaces the first occurrence of the pattern in each line and it won't replace the
second, third...occurrence in the line.
Use the /1, /2 etc flags to replace the first, second occurrence of a pattern in a line. The below command
replaces the second occurrence of the word "unix" with "linux" in a line.
The substitute flag /g (global replacement) tells the sed command to replace all the occurrences of the
string in the line.
Use the combination of /1, /2 etc and /g to replace all the patterns from the nth occurrence of a pattern in a line
(this combination is a GNU sed extension and may not work in other sed implementations).
The following sed command replaces the third, fourth, fifth... "unix" word with the "linux" word in a line.
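The substitution flags described above can be compared side by side. This is a sketch on a made-up input line; the last command assumes GNU sed, because combining a number with g is not portable.

```shell
line='unix unix unix unix'

first=$(printf '%s\n' "$line" | sed 's/unix/linux/')         # first occurrence only (default)
second=$(printf '%s\n' "$line" | sed 's/unix/linux/2')       # second occurrence only
all=$(printf '%s\n' "$line" | sed 's/unix/linux/g')          # every occurrence
from_third=$(printf '%s\n' "$line" | sed 's/unix/linux/3g')  # third occurrence onward (GNU sed)

printf '%s\n' "$first" "$second" "$all" "$from_third"
```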
You can use any delimiter other than the slash. As an example, suppose you want to change one web URL to
another.
In this case the URL contains the delimiter character which we used, so you have to escape each slash
with a backslash character; otherwise the substitution won't work.
Using too many backslashes makes the sed command look awkward. In this case we can change the delimiter
to another character as shown in the example below.
>sed 's_http://_www_' file.txt
There might be cases where you want to search for the pattern and replace that pattern by adding some extra
characters to it. In such cases & comes in handy. The & represents the matched string.
The first pair of escaped parentheses \( \) in the pattern is referenced as \1, the second as \2, and so on.
The \1, \2 can be used in the replacement string to make changes to the source string. As an example, if you
want to replace the word "unix" in a line with the word doubled, like "unixunix", use the sed command as below.
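A minimal sketch of both mechanisms, & and \1, on a made-up input line. The braces added around the match are just an arbitrary choice of extra characters.

```shell
line='unix is great'

# & stands for the whole matched string, here wrapped in braces.
wrapped=$(printf '%s\n' "$line" | sed 's/unix/{&}/')

# \(unix\) captures the match as group 1; \1\1 writes it twice.
doubled=$(printf '%s\n' "$line" | sed 's/\(unix\)/\1\1/')

printf '%s\n' "$wrapped" "$doubled"
```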
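The /p print flag prints each replaced line a second time, so it appears twice on the terminal. If a line does not
have the search pattern and is not replaced, then it is printed only once.
The -n option suppresses the automatic printing of every line, so -n together with /p prints only the replaced
lines, once each. If you use -n alone without /p, then sed does not print anything.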
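The interaction between /p and -n can be sketched with a two-line made-up input:

```shell
# /p alone: the replaced line appears twice, the untouched line once.
both=$(printf 'unix\naix\n' | sed 's/unix/linux/p')

# -n with /p: only the replaced line is printed.
only=$(printf 'unix\naix\n' | sed -n 's/unix/linux/p')

# -n without /p: nothing is printed.
none=$(printf 'unix\naix\n' | sed -n 's/unix/linux/')

printf '%s\n' "$both" "---" "$only"
```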
You can run multiple sed commands by piping the output of one sed command as input to another sed
command.
Sed provides the -e option to run multiple sed commands in a single sed invocation. The above output can be
achieved in a single sed command as shown below.
You can restrict the sed command to replace the string on a specific line number. An example is
>sed '3 s/unix/linux/' file.txt
The above sed command replaces the string only on the third line.
You can specify a range of line numbers to the sed command for replacing a string.
Here the sed command replaces the string on lines 1 to 3. Another example is
Here $ indicates the last line in the file. So the sed command replaces the text from the second line to the last
line in the file.
You can specify a pattern to the sed command to match in a line. Only if the pattern matches does the sed
command look for the string to be replaced, and if it finds the string, it replaces it.
Here the sed command first looks for the lines which have the pattern "linux" and then replaces the word "unix"
with "centos".
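The line-range and pattern addresses described above can be sketched on a made-up four-line file:

```shell
# Made-up sample file.
f=$(mktemp)
printf '%s\n' 'unix a' 'unix b' 'unix linux' 'unix d' > "$f"

first_three=$(sed '1,3 s/unix/linux/' "$f")      # substitute on lines 1 to 3 only
to_end=$(sed '2,$ s/unix/linux/' "$f")           # substitute from line 2 to the last line
by_pattern=$(sed '/linux/ s/unix/centos/' "$f")  # substitute only on lines containing "linux"

echo "$by_pattern"
rm -f "$f"
```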
You can delete the lines of a file by specifying a line number or a range of line numbers with the d command.
You can make the sed command print each line of a file two times with the p command.
Here the sed command looks for the pattern "unix" in each line of a file and prints those lines that have the
pattern, which makes it work like grep.
You can also make the sed command work like grep -v by negating the match with NOT (!).
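A sketch of the d, p and !p forms on a made-up three-line file. The single quotes matter: inside double quotes an interactive bash would try to treat ! as history expansion.

```shell
# Made-up sample file.
f=$(mktemp)
printf '%s\n' 'unix one' 'aix two' 'unix three' > "$f"

deleted=$(sed '2d' "$f")                 # delete line 2
doubled=$(sed 'p' "$f")                  # print every line twice
like_grep=$(sed -n '/unix/p' "$f")       # like grep "unix"
like_grep_v=$(sed -n '/unix/!p' "$f")    # like grep -v "unix"

echo "$like_grep_v"
rm -f "$f"
```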
The sed command can add a new line after a pattern match is found. The "a" command to sed tells it to add a
new line after a match is found.
The sed command can add a new line before a pattern match is found. The "i" command to sed tells it to add a
new line before a match is found.
The sed command can be used to replace an entire line with a new line. The "c" command to sed tells it to
change the line.
>sed '/unix/ c "Change line"' file.txt
"Change line"
"Change line"
The sed command can be used to convert lower case letters to upper case letters by using the transform
command "y".
We will see the usage of the cut command by considering the text file below as an example.
unix or linux os
is unix good os
is linux good os
The above cut command prints the fourth character in each line of the file. You can print more than one
character at a time by specifying the character positions in a comma separated list as shown in the below
example
xo
ui
ln
This command prints the fourth and sixth character in each line.
You can print a range of characters in a line by specifying the start and end position of the characters.
x or
unix
linu
The above cut command prints the characters from fourth position to the seventh position in each line. To print
the first six characters in a line, omit the start position and specify only the end position.
cut -c-6 file.txt
unix o
is uni
is lin
To print the characters from tenth position to the end, specify only the start position and omit the end position.
inux os
ood os
good os
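If you omit both the start and end positions (cut -c- file.txt), some older implementations print the entire line,
but current GNU cut rejects a range with no endpoint. Use cut -c1- file.txt to print the entire line.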
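The character-position forms discussed above can be reproduced on the three-line example file. The option spellings below are inferred from the descriptions and the listed outputs, following the pattern of the surviving cut -c-6 example.

```shell
# The example file from this section.
f=$(mktemp)
printf '%s\n' 'unix or linux os' 'is unix good os' 'is linux good os' > "$f"

fourth=$(cut -c4 "$f")            # 4th character of each line
fourth_sixth=$(cut -c4,6 "$f")    # 4th and 6th characters
mid_range=$(cut -c4-7 "$f")       # characters 4 through 7
from_tenth=$(cut -c10- "$f")      # character 10 to the end of the line

printf '%s\n' "$fourth_sixth" "$mid_range" "$from_tenth"
rm -f "$f"
```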
3. Write a unix/linux cut command to print the fields using the delimiter?
You can use the cut command much like the awk command to extract the fields in a file using a delimiter. The -d
option of the cut command specifies the delimiter and the -f option specifies the field position.
or
unix
linux
This command prints the second field in each line by treating the space as delimiter. You can print more than
one field by specifying the position of the fields in a comma delimited list.
cut -d' ' -f2,3 file.txt
or linux
unix good
linux good
The above command prints the second and third field in each line.
Note: If the delimiter you specified does not exist in a line, then the cut command prints the entire line. To
suppress these lines use the -s option of the cut command.
You can print a range of fields by specifying the start and end position.
The above command prints the first, second and third fields. To print the first three fields, you can also omit the
start position and specify only the end position.
To print the fields from the second field to the last field, you can omit the end position.
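The field forms described above can be sketched on the same three-line example file. The option spellings are inferred from the descriptions, and the extra two-line input for -s is made up.

```shell
# The example file from this section.
f=$(mktemp)
printf '%s\n' 'unix or linux os' 'is unix good os' 'is linux good os' > "$f"

second=$(cut -d' ' -f2 "$f")          # 2nd field
first_three=$(cut -d' ' -f1-3 "$f")   # fields 1 through 3
same_three=$(cut -d' ' -f-3 "$f")     # same range with the start position omitted
from_second=$(cut -d' ' -f2- "$f")    # 2nd field to the last field

# -s drops lines that do not contain the delimiter at all.
only_delimited=$(printf '%s\n' 'nodelimiter' 'a b' | cut -d' ' -s -f1)

printf '%s\n' "$first_three" "$from_second"
rm -f "$f"
```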
5. Write a unix/linux cut command to display the first field from /etc/passwd file?
The /etc/passwd is a delimited file and the delimiter is a colon (:). The cut command to display the first field in
/etc/passwd file is
logfile.dat
sum.pl
add_int.sh
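Using the cut command, extract the portion after the dot (the file extension). The names have different lengths,
so cut cannot count fields from the end of the line directly.
First reverse the text in each line, then apply the cut command to take the first field, and finally reverse the
result again.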
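The reverse, cut, reverse idea can be sketched as a pipeline. This assumes the rev utility is installed (it ships with util-linux and BSD systems but is not part of POSIX).

```shell
# Reverse each name, keep the text before the first dot, then reverse back.
extensions=$(printf '%s\n' 'logfile.dat' 'sum.pl' 'add_int.sh' | rev | cut -d'.' -f1 | rev)

echo "$extensions"
```

For example, "logfile.dat" becomes "tad.elifgol", the first field is "tad", and reversing again gives "dat".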
Awk is one of the most powerful tools in UNIX for processing the rows and columns in a file. Awk has built-in
string functions and associative arrays. Awk supports most of the operators, conditional blocks, and loops
available in the C language.
One of the good things is that you can convert Awk scripts into Perl scripts using the a2p utility.
Here the actions in the BEGIN block are performed before processing the file and the actions in the END block
are performed after processing the file. The rest of the actions are performed while processing the file.
Examples:
Create a file input_file with the following data. This file can be easily created using the output of ls -l.
From the data, you can observe that this file has rows and columns. The rows are separated by a new line
character and the columns are separated by space characters. We will use this file as the input for the
examples discussed here.
Here $1 has a special meaning: $1, $2, $3... represent the first, second, third... columns in a row respectively. This
awk command will print the first column in each row as shown below.
-rw-r--r--
-rw-r--r--
-rw-r--r--
-rw-r--r--
-rw-r--r--
-rw-r--r--
To print the 4th and 5th columns in a file use awk '{print $4,$5}' input_file
Here the BEGIN and END blocks are not used in awk. So, the print command will be executed for each row it
reads from the file. In the next example we will see how to use the BEGIN and END blocks.
This will print the sum of the values in the 5th column. In the BEGIN block the variable sum is assigned the value
0. In the next block the value of the 5th column is added to the sum variable. This addition of the 5th column to the
sum variable repeats for every row that is processed. When all the rows are processed the sum variable will hold the
sum of the values in the 5th column. This value is printed in the END block.
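The BEGIN, main and END blocks described above can be sketched as a one-liner. The three ls -l style rows are made up (they are not the original input_file), so the total of 60 applies only to this sample.

```shell
# Hypothetical input in "ls -l" layout; the 5th column holds the sizes 10, 20 and 30.
f=$(mktemp)
printf '%s\n' \
  '-rw-r--r-- 1 user grp 10 Dec 8 21:39 f1' \
  '-rw-r--r-- 1 user grp 20 Dec 8 21:40 f2' \
  '-rw-r--r-- 1 user grp 30 Dec 8 21:41 f3' > "$f"

# BEGIN runs once before the rows, the middle block once per row, END once after the last row.
total=$(awk 'BEGIN {sum=0} {sum=sum+$5} END {print sum}' "$f")

echo "$total"
rm -f "$f"
```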
3. In this example we will see how to execute an awk script written in a file. Create a file sum_column and
paste the script below into that file
#!/usr/bin/awk -f
BEGIN {sum=0}
{sum=sum+$5}
END {print sum}
This will run the script in the sum_column file and display the sum of the 5th column in the input_file.
This awk command checks for the string "t4" in the 9th column and if it finds a match then it will print the entire
line. The output of this awk command is
This will print the squares of the numbers from 1 to 5. The output of the command is
square of 1 is 1
square of 2 is 4
square of 3 is 9
square of 4 is 16
square of 5 is 25
Notice that the syntax of “if” and “for” is similar to the C language.
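Both constructs can be sketched as follows. The for loop is one way to produce the listing above, and the two input rows for the if example are made up, so these are illustrations rather than the original commands.

```shell
# for loop in a BEGIN block: no input file is needed.
squares=$(awk 'BEGIN { for (i = 1; i <= 5; i++) print "square of", i, "is", i*i }')

# if condition on the 9th column of two hypothetical "ls -l" style rows.
matched=$(printf '%s\n' \
  '-rw-r--r-- 1 user grp 10 Dec 8 21:39 t1' \
  '-rw-r--r-- 1 user grp 20 Dec 8 21:40 t4' |
  awk '{ if ($9 == "t4") print $0 }')

printf '%s\n' "$squares" "$matched"
```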
You have already seen $0, $1, $2..., which hold the entire line, first column, second column... respectively.
Now we will see other built-in variables with examples.
So far, we have seen fields separated by a space character. By default Awk assumes that fields in a file are
separated by whitespace (spaces or tabs). If the fields in the file are separated by any other character, we can
use the FS variable to specify the delimiter.
39 p1
15 t1
38 t2
38 t3
39 t4
39 t5
By default whenever we printed the fields using the print statement the fields are displayed with space
character as delimiter. For example
center 0
center 17
center 26
center 25
center 43
center 48
center:17
center:26
center:25
center:43
center:48
Note: print $4,$5 and print $4$5 will not work the same way. The first one displays the output with space as
delimiter. The second one displays the output without any delimiter.
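The effect of FS and OFS, and the difference between print $4,$5 and print $4$5, can be sketched on a single row. The row is hypothetical, modelled on the ls -l layout used in this section.

```shell
row='-rw-r--r-- 1 center center 17 Dec 8 21:39 t1'

by_colon=$(printf '%s\n' "$row" | awk 'BEGIN {FS=":"} {print $2}')        # split input on ":"
spaced=$(printf '%s\n' "$row" | awk '{print $4,$5}')                      # default OFS is a space
with_ofs=$(printf '%s\n' "$row" | awk 'BEGIN {OFS=":"} {print $4,$5}')    # custom output separator
glued=$(printf '%s\n' "$row" | awk '{print $4$5}')                        # no comma: plain concatenation

printf '%s\n' "$by_colon" "$spaced" "$with_ofs" "$glued"
```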
index(string, search)
length(string)
split(string,array,separator)
substr(string, position)
substr(string,position,max)
tolower(string)
toupper(string)
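The functions listed above can be tried out in a BEGIN block. The sample string and the array name parts are arbitrary choices for this sketch.

```shell
result=$(awk 'BEGIN {
  s = "Unix Material"
  print index(s, "Mat")      # position of the first match, counted from 1
  print length(s)            # number of characters
  print substr(s, 6)         # from position 6 to the end
  print substr(s, 1, 4)      # 4 characters starting at position 1
  print tolower(s)
  print toupper(s)
  n = split("a,b,c", parts, ",")   # returns the number of elements
  print n, parts[2]
}')

echo "$result"
```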
Advanced Examples:
The awk split function splits a string into an array using the delimiter.
Now we will see how to filter the lines using the split function with an example.
1 U,N,UNIX,000
2 N,P,SHELL,111
3 I,M,UNIX,222
4 X,Y,BASH,333
5 P,R,SCRIPT,444
Required output: Now we have to print only the lines whose 2nd field has the string "UNIX" as its 3rd
comma-separated element (the 2nd field in each line is delimited by commas).
The output is:
1 U,N,UNIX,000
3 I,M,UNIX,222
awk '{
split($2,arr,",");
if(arr[3] == "UNIX")
print $0
} ' file.txt
!find
This will execute the last find command. It also displays the last find command executed along with the result
on the terminal.
./bkp/sum.java
./sum.java
This will find all the files with name "sum.java" in the current directory and sub-directories.
./SUM.java
./bkp/sum.java
./sum.java
This will find all the files with name "sum.java", ignoring the case, in the current directory and
sub-directories.
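The two name tests can be sketched with -name and -iname. The directory layout is made up, and -iname is assumed to be available (GNU and BSD find provide it; POSIX does not). The upper-case file is placed in its own directory so that the sketch also works on case-insensitive file systems.

```shell
# Hypothetical directory tree.
d=$(mktemp -d)
mkdir "$d/bkp" "$d/src"
touch "$d/sum.java" "$d/bkp/sum.java" "$d/src/SUM.java"

exact=$(cd "$d" && find . -name "sum.java" | LC_ALL=C sort)     # case-sensitive match
nocase=$(cd "$d" && find . -iname "sum.java" | LC_ALL=C sort)   # case-insensitive match

printf '%s\n' "$exact" "---" "$nocase"
rm -rf "$d"
```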
This will find the file "sum.java" in the current directory only
./SUM.java
./bkp/sum.java
./sum.java
./multiply.java
This displays all the files which have the word "java" in the filename
This will look for the files in the /etc directory with "java" in the filename
./SUM.java
./bkp
./multiply.java
This is like inverting the match. It prints all the files except the given file "sum.java".
./tmp/sum.java
./bkp/var/tmp/files/sum.java
./bkp/var/tmp/sum.java
./bkp/var/sum.java
./bkp/sum.java
./sum.java
You can see here the find command displayed all the files with name "sum.java" in the current directory and
sub-directories.
a. How to print the files in the current directory and one level down to the current directory?
./tmp/sum.java
./bkp/sum.java
./sum.java
b. How to print the files in the current directory and two levels down to the current directory?
./tmp/sum.java
./bkp/var/sum.java
./bkp/sum.java
./sum.java
c. How to print the files in the subdirectories between level 1 and 4?
./tmp/sum.java
./bkp/var/tmp/files/sum.java
./bkp/var/tmp/sum.java
./bkp/var/sum.java
./bkp/sum.java
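The depth-limited listings above can be reproduced with -maxdepth and -mindepth (GNU and BSD find options). The depth values below are the ones that yield the same sets of files on this tree; they are an inference from the listings, and the helper function count_matches is hypothetical.

```shell
# Rebuild the directory tree shown in the listings.
d=$(mktemp -d)
mkdir -p "$d/tmp" "$d/bkp/var/tmp/files"
touch "$d/sum.java" "$d/tmp/sum.java" "$d/bkp/sum.java" \
      "$d/bkp/var/sum.java" "$d/bkp/var/tmp/sum.java" "$d/bkp/var/tmp/files/sum.java"

# Count the sum.java files found with the given depth options.
count_matches() { (cd "$d" && find . "$@" -name "sum.java" | wc -l | tr -d ' '); }

current_only=$(count_matches -maxdepth 1)             # current directory only
one_down=$(count_matches -maxdepth 2)                 # plus one level down
two_down=$(count_matches -maxdepth 3)                 # plus two levels down
sub_levels=$(count_matches -mindepth 2 -maxdepth 5)   # sub-directories only, skipping ./sum.java

echo "$current_only $one_down $two_down $sub_levels"
rm -rf "$d"
```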
./empty_file
10. How to find the largest file in the current directory and sub directories
The find command "find . -type f -exec ls -s {} \;" will list all the files along with the size of the file. Then the sort
command will sort the files based on the size. The head command will pick only the first line from the output of
sort.
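The pipeline described above can be sketched as follows, together with the reversed sort for the smallest file. The two files are made up, and ls -s reports allocated blocks, so the exact numbers depend on the file system.

```shell
# Hypothetical directory with one large and one small file.
d=$(mktemp -d)
awk 'BEGIN { for (i = 0; i < 5000; i++) print "filler line", i }' > "$d/big.dat"
echo "hi" > "$d/small.dat"

# Largest: numeric sort in descending order, keep the first line.
largest=$(cd "$d" && find . -type f -exec ls -s {} \; | sort -n -r | head -1)

# Smallest: the same pipeline with an ascending sort.
smallest=$(cd "$d" && find . -type f -exec ls -s {} \; | sort -n | head -1)

printf '%s\n' "$largest" "$smallest"
rm -rf "$d"
```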
11. How to find the smallest file in the current directory and sub directories
find . -type s
b. Finding directories
find . -type d
find . -type f
14. How to find the files which are modified after the modification of a given file.
This will display all the files which are modified after the file "sum.java"
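A minimal sketch using the -newer test, which compares modification times. The file names other than sum.java and the timestamps set with touch -t are made up.

```shell
# Three files with fixed, increasing modification times.
d=$(mktemp -d)
touch -t 202001010000 "$d/old.txt"
touch -t 202101010000 "$d/sum.java"
touch -t 202201010000 "$d/new.txt"

# Files modified more recently than sum.java.
newer=$(cd "$d" && find . -type f -newer sum.java)

echo "$newer"
rm -rf "$d"
```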
15. Display the files which are accessed after the modification of a given file.
16. Display the files which are changed after the modification of a given file.
This will display the files which have read, write, and execute permissions. To know the permissions of files and
directories use the command "ls -l".
18. Find the files which were modified within the last 30 minutes.
find . -mtime -1
20. How to find the files which were modified 30 minutes ago
21. How to find the files which were modified 1 day ago.
find . -atime -1
find . -ctime -2
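The time-based tests can be sketched as follows. A minus sign means "less than" and a plus sign means "more than"; -mtime counts in days and -mmin in minutes. The -mmin test is assumed to be available (GNU and BSD find provide it), and the file names are made up.

```shell
# One file modified just now and one with a modification time in 2020.
d=$(mktemp -d)
touch "$d/recent.txt"
touch -t 202001010000 "$d/old.txt"

last_half_hour=$(cd "$d" && find . -type f -mmin -30)   # modified less than 30 minutes ago
last_day=$(cd "$d" && find . -type f -mtime -1)         # modified less than 1 day ago
older=$(cd "$d" && find . -type f -mtime +1)            # modified more than 1 day ago

printf '%s\n' "$last_half_hour" "$last_day" "$older"
rm -rf "$d"
```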
26. How to find the files which are created between two files.
So far we have just found the files and displayed them on the terminal. Now we will see how to perform some
operations on the files.
1. How to find the permissions of the files which contain the name "java"?
Alternate method is
2. Find the files which have "java" in their name and then display only the files which contain the word
"class"?
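Both questions can be sketched by running another command on each file that find returns, either with -exec or through xargs. The two sample files are made up, and these command forms are one reasonable way to do it, not necessarily the original ones.

```shell
# Hypothetical directory with two files whose names contain "java".
d=$(mktemp -d)
printf '%s\n' 'class A {}' > "$d/A.java"
printf '%s\n' 'todo list' > "$d/notes.java"

# Question 1: show permissions by running ls -l on each file found.
listed=$(cd "$d" && find . -name "*java*" -exec ls -l {} \; | wc -l | tr -d ' ')

# Question 2: keep only the files that contain the word "class".
with_class=$(cd "$d" && find . -name "*java*" -exec grep -l "class" {} \;)

# The same filter using xargs instead of -exec.
with_class_xargs=$(cd "$d" && find . -name "*java*" | xargs grep -l "class")

echo "$with_class"
rm -rf "$d"
```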
Similarly you can apply other Unix commands on the files found using the find command. I will add more
examples as and when I find them.