Removing duplicate lines using uniq utility
Create a text file named personame as follows:
personame
Hello I am vivek
12333
12333
welcome
to
sai computer academy, a'bad.
what still I remeber that name.
oaky! how are u luser?
what still I remeber that name.
After creating the file, issue the following command at the shell prompt:
$ uniq personame
Hello I am vivek
12333
welcome
to
sai computer academy, a'bad.
what still I remeber that name.
oaky! how are u luser?
what still I remeber that name.
The command above prints the file with adjacent duplicate lines collapsed into one.
For example, our original file contains 12333 twice in a row, so the additional
copy of 12333 is dropped. But if you examine the output of uniq, you will notice
that while the duplicate 12333 is gone, "what still I remeber that name." still
appears twice. This is because the uniq utility compares only adjacent lines, so
duplicate lines must be next to each other in the file to be removed. To solve this
problem, sort the file first so that identical lines become adjacent:
$ sort personame | uniq
General Syntax of uniq utility:
Syntax:
uniq {file-name}
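The difference between the two commands above can be checked by counting output lines. This is only a small sketch: it rebuilds the personame file in a temporary directory (the tmpdir variable and the variable names are made up for this example) and compares uniq alone with sort piped into uniq.

```shell
# Recreate the sample file in a temporary directory (location is illustrative).
tmpdir=$(mktemp -d)
cat > "$tmpdir/personame" <<'EOF'
Hello I am vivek
12333
12333
welcome
to
sai computer academy, a'bad.
what still I remeber that name.
oaky! how are u luser?
what still I remeber that name.
EOF

# uniq alone collapses only adjacent repeats, so the far-apart
# "what still I remeber that name." lines both survive.
adjacent_only=$(uniq "$tmpdir/personame" | wc -l | tr -d ' ')

# Sorting first puts every repeat next to its twin, so uniq can drop it.
all_removed=$(sort "$tmpdir/personame" | uniq | wc -l | tr -d ' ')

echo "uniq alone:  $adjacent_only lines"
echo "sort | uniq: $all_removed lines"
rm -r "$tmpdir"
```

Note that sorting changes the order of the lines, so this approach fits best when the original order does not matter.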
Linux Shell Scripting Tutorial (LSST) v1.05r3
Chapter 5: Essential Utilities for Power User
Finding matching pattern using grep utility
Create a text file named demofile as follows:
demofile
hello world!
cartoons are good
especially toon like tom (cat)
what
the number one song
12221
they love us
I too
After saving the file, issue the following command:
$ grep "too" demofile
cartoons are good
especially toon like tom (cat)
I too
grep locates every line that contains the pattern "too" and prints each matching
line on-screen. grep prints the line with too as well as the lines with cartoons
and toon, because grep treats "too" as an expression. grep reads the expression as
the letter t followed by o followed by o, so a line is printed if this sequence is
found anywhere on it. By default grep does not understand words.
Syntax:
grep "word-to-find" {file-name}
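To see the "grep does not understand words" point in numbers, here is a small sketch that counts matches with and without whole-word matching. It assumes your grep supports the -w option (GNU and BSD grep do); the temporary directory and variable names are made up for the example.

```shell
# Recreate the sample file from the text in a temporary directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/demofile" <<'EOF'
hello world!
cartoons are good
especially toon like tom (cat)
what
the number one song
12221
they love us
I too
EOF

# Plain pattern: "too" anywhere on the line, including inside cartoons and toon.
substring_hits=$(grep -c "too" "$tmpdir/demofile")

# -w (assumed available): count only lines where "too" stands as a whole word.
word_hits=$(grep -cw "too" "$tmpdir/demofile")

echo "lines containing too:        $substring_hits"
echo "lines with too as its own word: $word_hits"
rm -r "$tmpdir"
```

Only the line "I too" should remain once whole-word matching is requested.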
Chapter 6: Learning expressions with ex
Introduction
In Chapter 5, "Quick Tour of essential utilities", you saw the basic utilities.
Used together with other tools, these utilities are very useful for data processing
and other work. In the rest of the tutorial we will learn more about patterns,
filters and expressions, and of course sed and awk in depth.
Learning expressions with ex
What does "cat" mean to you?
First, it is the word cat; second, a cat is an animal (I know 'tom' cat!). If the
same question is put to a computer (or rather to the grep utility), grep will try
to find every occurrence of "cat" (remember, grep reads "cat" as the letter c
followed by a followed by t), including cat, copycat, catalog, etc.
A pattern is defined as:
"A set of characters (which may or may not be words) is called a pattern."
For example, "dog", "celeron", "mouse" and "ship" are all patterns. A pattern can
be changed from one to another, e.g. "ship" to "sheep".
Metacharacters are defined as:
"If patterns are identified using special characters, then those special characters
are known as metacharacters."
Expressions are defined as:
"A combination of patterns and metacharacters is known as an expression (regular
expression)."
Regular expressions are used by different Linux utilities like
grep
awk
sed
So you must know how to construct regular expressions. In the next part of LSST
you will learn how to construct regular expressions using the ex editor.
For this part of the tutorial, create a text file named 'demofile' using any text
editor.
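The definitions above (pattern, metacharacter, expression) can be tried out with grep before moving on to ex. The sketch below uses a made-up word list, not the tutorial's demofile, and the two standard metacharacters ^ (start of line) and $ (end of line).

```shell
# Four sample words, one per line (illustrative data only).
words=$(printf '%s\n' cat copycat catalog dog)

# Pattern only: "cat" anywhere on the line.
plain=$(printf '%s\n' "$words" | grep 'cat' | paste -sd ' ' -)

# Pattern plus the metacharacter ^ : "cat" only at the start of a line.
anchored=$(printf '%s\n' "$words" | grep '^cat' | paste -sd ' ' -)

# Pattern plus ^ and $ : the whole line must be exactly "cat".
exact=$(printf '%s\n' "$words" | grep '^cat$' | paste -sd ' ' -)

echo "cat    matches: $plain"
echo "^cat   matches: $anchored"
echo "^cat\$  matches: $exact"
```

The pattern stays the same in all three commands; only the metacharacters around it change which lines are selected.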
Getting started with ex
You can start the ex editor by typing ex at the shell prompt:
Syntax:
ex {file-name}
Example:
$ ex demofile
The : (colon) is the ex prompt, where you can type an ex editor command or a
regular expression. It is time to open our demofile; use ex as follows:
$ ex demofile
"demofile" [noeol] 20L, 387C
Entering Ex mode. Type "visual" to go to Normal mode.
:
As you can see, you get the : prompt, where you can type an ex command. Type q and
press the ENTER key to exit from ex, as shown below (remember that commands are
case sensitive):
: q
[vivek@ls vivek]$
After typing the q command you are returned to the shell prompt.
Printing text on-screen
First open our demofile as follows:
$ ex demofile
"demofile" [noeol] 20L, 387C
Entering Ex mode. Type "visual" to go to Normal mode.
Now type p at the : prompt as follows and press ENTER:
:p
Okay! I will stop.
:
NOTE By default the p command prints the current line; in our case it is the last
line of the text file above.
Printing lines using range
To print the first five lines (i.e. lines 1 to 5), give the command
:1,5 p
Hello World.
This is vivek from Poona.
I love linux.
It is different from all other Os
NOTE Here 1,5 is the address. If a single number is used (e.g. 5 p) it indicates a
line number; if two numbers are separated by a comma, they give a range of lines.
Printing particular line
To print the 2nd line of our file, give the command
:2 p
This is vivek from Poona.
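The same line and range addresses also work outside ex. As a rough sketch (this uses sed, not ex itself, because sed is easy to run from a script), sed -n with the p command prints only the addressed lines. The sample file below is a short made-up stand-in for demofile, built from a few lines seen in the tutorial.

```shell
tmpdir=$(mktemp -d)
# Hypothetical stand-in for demofile, not the full 20-line file.
cat > "$tmpdir/sample" <<'EOF'
Hello World.
This is vivek from Poona.
I love linux.
It is different from all other Os
My brother Vikrant also loves linux.
Okay! I will stop.
EOF

# Single address, like ":2 p" in ex: print only line 2.
second_line=$(sed -n '2p' "$tmpdir/sample")

# Range address, like ":1,5 p" in ex: print lines 1 through 5.
first_five_count=$(sed -n '1,5p' "$tmpdir/sample" | wc -l | tr -d ' ')

echo "line 2: $second_line"
echo "lines printed by 1,5: $first_five_count"
rm -r "$tmpdir"
```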
Printing entire file on-screen
Give command
:1,$ p
Hello World.
This is vivek from Poona.
I love linux.
It is different from all other Os
.....
...
.....
Okay! I will stop.
NOTE Here 1 is the first line and $ is a special character of ex that means the
last line. So 1,$ p means print from the first line to the last line (i.e. the
entire file). Here p stands for print.
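The $ address can be tried from the shell as well. This is a sketch with sed rather than ex, since sed understands $ as "last line" in the same way; the four-line sample file is made up for the example.

```shell
tmpdir=$(mktemp -d)
# Hypothetical four-line stand-in for demofile.
printf '%s\n' 'Hello World.' 'This is vivek from Poona.' \
    'I love linux.' 'Okay! I will stop.' > "$tmpdir/sample"

# $ alone addresses the last line, whatever its number is.
last_line=$(sed -n '$p' "$tmpdir/sample")

# 1,$ addresses every line, so the output is the whole file.
whole=$(sed -n '1,$p' "$tmpdir/sample")
original=$(cat "$tmpdir/sample")

echo "last line: $last_line"
rm -r "$tmpdir"
```

The single quotes around the sed script matter here: without them the shell would try to expand $p as a variable.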
Printing line number with our text
Give command
:set number
:1,3 p
1 Hello World.
2 This is vivek from Poona.
3
NOTE This command prints a number next to each line. If you don't want the
numbers, you can turn them off by issuing the following command
:set nonumber
:1,3 p
Hello World.
This is vivek from Poona.
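A similar numbered listing can be produced from the shell. The sketch below is not ex's set number option; it swaps in awk, which prints its own line counter NR in front of each line. The sample file is made up, with an empty third line to mirror the output shown above.

```shell
tmpdir=$(mktemp -d)
# Hypothetical stand-in for demofile; the third line is empty.
printf '%s\n' 'Hello World.' 'This is vivek from Poona.' '' \
    'I love linux.' > "$tmpdir/sample"

# Roughly like ":set number" followed by ":1,3 p":
# print the line number (NR) and then the line itself, for lines 1 to 3.
numbered=$(awk 'NR <= 3 { print NR, $0 }' "$tmpdir/sample")

# Roughly like ":set nonumber" followed by ":1,3 p": the lines alone.
plain=$(sed -n '1,3p' "$tmpdir/sample")

first_numbered=$(printf '%s\n' "$numbered" | sed -n '1p')
numbered_count=$(printf '%s\n' "$numbered" | wc -l | tr -d ' ')

printf '%s\n' "$numbered"
rm -r "$tmpdir"
```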
Deleting lines
Give command
:1, d
I love linux.
NOTE
Here 1 is the first line and d is the delete command, which deletes the addressed
line.
You can also delete a range of lines by giving the command
:1,5 d
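Deleting by address can also be sketched with sed, which takes the same d command. One difference to keep in mind: ex changes the text it has loaded for editing, while sed only prints the result and leaves the file on disk untouched. The sample file here is a made-up stand-in for demofile.

```shell
tmpdir=$(mktemp -d)
# Hypothetical five-line stand-in for demofile.
printf '%s\n' 'Hello World.' 'This is vivek from Poona.' 'I love linux.' \
    'It is different from all other Os' 'Okay! I will stop.' > "$tmpdir/sample"

# Like ":1 d": drop line 1, then look at what is now the first line.
first_after_delete=$(sed '1d' "$tmpdir/sample" | sed -n '1p')

# Like ":1,3 d": drop lines 1 to 3 and count what remains.
left_after_range=$(sed '1,3d' "$tmpdir/sample" | wc -l | tr -d ' ')

# sed printed the result only; the file itself still has all five lines.
still_on_disk=$(wc -l < "$tmpdir/sample" | tr -d ' ')

echo "first line after 1d: $first_after_delete"
echo "lines left after 1,3d: $left_after_range"
rm -r "$tmpdir"
```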
Copying lines
Give command as follows
:1,4 co $
:1,$ p
I love linux.
It is different from all other Os
....
.....
. (DOT) is special command of linux.
Okay! I will stop.
I love linux.
It is different from all other Os
My brother Vikrant also loves linux.
NOTE Here 1,4 addresses lines 1 to 4, co is the copy command, and $ is the end of
the file. So the command copies the first four lines to the end of the file. You
can delete the copied lines as follows
:18,21 d
Okay! I will stop.
:1,$ p
I love linux.
It is different from all other Os
My brother Vikrant also loves linux.
He currently lerarns linux.
Linux is cooool.
Linux is now 10 years old.
Next year linux will be 11 year old.
Rani my sister never uses Linux
She only loves to play games and nothing else.
Do you know?
. (DOT) is special command of linux.
Okay! I will stop.
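The effect of 1,4 co $ can be imitated from the shell to check what the result should look like. This is only a sketch with cat and sed, not the ex command itself: it prints the whole file and then prints lines 1 to 4 again after it. The six-line sample file is made up from lines seen above.

```shell
tmpdir=$(mktemp -d)
f="$tmpdir/sample"
# Hypothetical six-line stand-in for demofile.
cat > "$f" <<'EOF'
I love linux.
It is different from all other Os
My brother Vikrant also loves linux.
He currently lerarns linux.
Linux is cooool.
Okay! I will stop.
EOF

# Whole file first, then a second copy of lines 1-4 appended at the end.
copied=$(cat "$f"; sed -n '1,4p' "$f")

total_lines=$(printf '%s\n' "$copied" | wc -l | tr -d ' ')
new_last_line=$(printf '%s\n' "$copied" | sed -n '$p')

echo "lines after copy: $total_lines"
echo "new last line:    $new_last_line"
rm -r "$tmpdir"
```

Six original lines plus four copied lines give ten, and the new last line is the old line 4.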
Searching the words
(a) Give following command
:/linux/ p
I love linux.
Note In ex you can specify an address (line) by number for various operations.
This is useful if you know the line number in advance; if you don't, you can use a
contextual address to print the line on-screen. In the above example /linux/ is a
contextual address, constructed by surrounding a regular expression with two
slashes, and p is the print command of ex.
Try the following and note the difference (hint: the p is missing)
:/Linux/
(b) Give following command
:g/linux/ p
I love linux.
My brother Vikrant also loves linux.
He currently lerarns linux.
Next year linux will be 11 year old.
. (DOT) is special command of linux.
In the previous example (:/linux/ p) only one line is printed. To print every line
containing "linux" you have to use g, the global command, which instructs ex to
apply the command to all lines that match the pattern. Try the following
:1,$ g/linux/ p
which gives the same result, because g works on the range 1,$ (the whole file) by
default.
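The global print g/linux/ p is where the grep utility gets its name (g/re/p: global, regular expression, print), so the two searches above can be sketched from the shell with grep and sed. The sample file is a made-up stand-in for demofile. One caveat: a contextual address in ex searches forward from the current line, while the sed command below always starts from the top of the file.

```shell
tmpdir=$(mktemp -d)
f="$tmpdir/sample"
# Hypothetical six-line stand-in for demofile.
cat > "$f" <<'EOF'
I love linux.
It is different from all other Os
My brother Vikrant also loves linux.
Linux is cooool.
Rani my sister never uses Linux
Okay! I will stop.
EOF

# Like ":g/linux/ p": every line that matches the expression.
all_hits=$(grep 'linux' "$f")
sed_hits=$(sed -n '/linux/p' "$f")

# Roughly like ":/linux/ p": print the first matching line, then stop.
first_hit=$(sed -n '/linux/{p;q;}' "$f")

# Matching is case sensitive: "Linux" selects different lines than "linux".
lower_count=$(grep -c 'linux' "$f")
upper_count=$(grep -c 'Linux' "$f")

echo "first match: $first_hit"
echo "lines with linux: $lower_count, lines with Linux: $upper_count"
rm -r "$tmpdir"
```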
Saving the file in ex
Give command
:w
"demofile" 20L, 386C written
The w command saves the file.
Quitting ex
Give command
:q
The q command quits ex and returns you to the shell prompt.
Note Use the wq command to save and exit from ex.