Let us name the given text file bar.txt.
We use Python's file handling methods to remove duplicate lines from a text file. The code below is one way of removing duplicates from bar.txt; the output is stored in foo.txt. Because the code opens both files by bare name, Python looks for them in the current working directory. Run the script from the directory that contains bar.txt (usually the script's own directory), otherwise bar.txt will not be found.
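If the script may be started from another directory, one option is to build the file paths from the script's own location rather than relying on the working directory. The following is a minimal sketch; the helper name `sibling_path` is hypothetical and not part of the original example.

```python
from pathlib import Path


def sibling_path(script_file, name):
    """Return the path of `name` in the same directory as `script_file`.

    `sibling_path` is a hypothetical helper introduced for illustration.
    """
    return Path(script_file).parent / name


# Inside a script, __file__ holds the script's own path, so one could write:
#   infile_path = sibling_path(__file__, "bar.txt")
#   outfile_path = sibling_path(__file__, "foo.txt")
```

The returned `Path` objects can be passed directly to `open()`.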
The file bar.txt is as follows:
A cow is an animal.
A cow is an animal.
A buffalo too is an animal.
Lion is the king of jungle.
Example
The code below removes the duplicate lines in bar.txt and stores the result in foo.txt:
# This program opens bar.txt, removes duplicate lines, and writes
# the remaining lines to foo.txt.
lines_seen = set()  # holds lines already seen

print("The file bar.txt is as follows")
with open('foo.txt', "w") as outfile, open('bar.txt', "r") as infile:
    for line in infile:
        print(line, end="")  # each line already ends with a newline
        if line not in lines_seen:  # not a duplicate
            outfile.write(line)
            lines_seen.add(line)

print("The file foo.txt is as follows")
with open('foo.txt', "r") as result:
    for line in result:
        print(line, end="")
Output
The file foo.txt is as follows
A cow is an animal.
A buffalo too is an animal.
Lion is the king of jungle.
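Note that the script compares lines exactly as they are read, newline included. If the last line of bar.txt repeats an earlier line but has no trailing newline, it is not recognised as a duplicate. A possible variation, sketched below, strips the line ending before comparing. The function names `dedupe_lines` and `dedupe_file` are hypothetical and chosen only for this illustration.

```python
def dedupe_lines(lines):
    """Return the lines in first-seen order, compared without line endings."""
    seen = set()
    unique = []
    for line in lines:
        key = line.rstrip("\r\n")  # ignore the line ending when comparing
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def dedupe_file(src_path, dst_path):
    """Write the unique lines of src_path to dst_path, one per line."""
    with open(src_path, "r") as infile:
        unique = dedupe_lines(infile)
    with open(dst_path, "w") as outfile:
        for line in unique:
            outfile.write(line + "\n")
```

Calling `dedupe_file("bar.txt", "foo.txt")` would then produce the same foo.txt as above, and would also catch a repeated final line that lacks a newline.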