Python regex `finditer` function
Python regex finditer() function explained with examples.
WE'LL COVER THE FOLLOWING
• Python string finditer
• Syntax
• Example 1
Python string finditer #
finditer() is a powerful function in the re module. It returns an iterator
yielding MatchObject instances over all non-overlapping matches for the RE
pattern in string.
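To make "iterator" and "non-overlapping" concrete, here is a minimal sketch on a short in-memory string. The sample text and variable names are assumptions chosen for illustration.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "aaaa"

matches = re.finditer(r"aa", text)

# Matches do not overlap: the second match starts where the first one ended.
spans = [m.span() for m in matches]
print(spans)  # [(0, 2), (2, 4)]

# The iterator is now exhausted, so a second pass yields nothing.
leftover = list(matches)
print(leftover)  # []
```

If the matches are needed more than once, one option is to store them in a list first.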
Syntax #
re.finditer(pattern, string, flags=0)
Here the string is scanned left-to-right, and matches are returned in the order
found. Empty matches are included in the result unless they touch the
beginning of another match.
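A small sketch of the signature in use, assuming a made-up sentence as input. It calls re.finditer once with the default flags and once with re.IGNORECASE, and collects the matches in the order they are found.

```python
import re

# Hypothetical input sentence, used only for illustration.
text = "The cat saw the other cat near THE door."

# Default flags=0: matching is case-sensitive.
exact = [(m.start(), m.group()) for m in re.finditer(r"\bthe\b", text)]

# flags=re.IGNORECASE also finds "The" and "THE".
any_case = [(m.start(), m.group())
            for m in re.finditer(r"\bthe\b", text, flags=re.IGNORECASE)]

print(exact)     # [(12, 'the')]
print(any_case)  # [(0, 'The'), (12, 'the'), (31, 'THE')]
```

The start positions increase from one match to the next, which reflects the left-to-right scan.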
Example 1 #
Here is a simple example that demonstrates the use of finditer. It reads in a
page of HTML text, finds all occurrences of the word “the” (in any letter
case), and prints “the” and the following word. It also prints the character
position of each match using the MatchObject’s start() method.
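# Python 2 code: urllib2 and the print statement are not available in Python 3.
import re
import urllib2

# Download the page to search.
html = urllib2.urlopen('https://docs.python.org/2/library/re.html').read()
# "the" plus the next word; the word must be followed by whitespace.
pattern = r'\b(the\s+\w+)\s+'
regex = re.compile(pattern, re.IGNORECASE)
for match in regex.finditer(html):
    # start() is the character offset of the match; group(1) is "the <word>".
    print "%s: %s" % (match.start(), match.group(1))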
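Note that finditer yields match objects one at a time rather than a list of
tuples, so inside the loop you can do some computation for each match, for
example through its start(), end(), and group() methods.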
Expected output:
3261: The Python
4210: the backslash
4451: the same
4474: the same
4651: the pattern
4679: the regular
4930: The solution
5937: The functions
6301: the standard
and so on...
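The loop body can do more than print. As a rough sketch, the snippet below reuses the pattern from Example 1 on a short in-memory string and counts which words follow “the”. The sample text, the Counter, and the variable names are assumptions for illustration. The string stands in for the downloaded page so the sketch runs offline.

```python
import re
from collections import Counter

# Hypothetical stand-in for the downloaded page.
html = ("The pattern and the string are scanned; "
        "the pattern uses the standard rules.")

regex = re.compile(r'\b(the\s+\w+)\s+', re.IGNORECASE)

counts = Counter()
for match in regex.finditer(html):
    # group(1) is "the <word>"; keep only the word that follows "the".
    following = match.group(1).split()[1].lower()
    counts[following] += 1
    print("%d: %s" % (match.start(), match.group(1)))

print(counts.most_common(1))  # [('pattern', 2)]
```

Because the pattern ends in \s+, a word followed directly by punctuation is not matched. That is why every phrase in the sample text is followed by a space.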