23. Python Regular Expressions
A regular expression (regex) is a sequence of characters that defines a pattern
to search for in a string. Python's built-in re module provides regex support; import it
before using any of the functions below:
import re
Regex Functions
SN  Function   Description
1   match()    Tries to match the regex pattern at the beginning of the string, with optional flags. It returns a Match object if the pattern matches there, otherwise None.
2   search()   Scans the string and returns a Match object for the first match found, otherwise None.
3   findall()  Returns a list that contains all the matches of a pattern in the string.
4   split()    Returns a list in which the string has been split at each match.
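The difference between match() and search() is easy to miss, so here is a minimal sketch of it. The sample string is the one used in the examples that follow; the variable names are only illustrative.

```python
import re

text = "How are you. How is everything"

# match() only succeeds when the pattern matches at the very start of the string.
at_start = re.match("How", text)
not_at_start = re.match("are", text)

# search() scans the whole string and stops at the first match it finds.
anywhere = re.search("are", text)

print(at_start is not None)   # True
print(not_at_start)           # None
print(anywhere.span())        # (4, 7)
```

Neither function returns True or False; test the result against None (or use it directly in an if statement).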
The findall() function returns a list of all non-overlapping matches of a pattern within the
string, in the order they are found. If there are no matches, an empty list is
returned.
Example
import re
text = "How are you. How is everything"
matches = re.findall("How", text)
print(matches)
Output:
['How', 'How']
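As a small sketch of the two properties described above (an empty list when nothing matches, and results in left-to-right order), assuming the same sample string. The pattern r"\w*e\w*" is only an illustrative choice that picks out words containing the letter e.

```python
import re

text = "How are you. How is everything"

# "Why" does not occur, so findall() returns an empty list rather than None.
no_hits = re.findall("Why", text)

# Matches are reported in the order they occur in the string.
words_with_e = re.findall(r"\w*e\w*", text)

print(no_hits)        # []
print(words_with_e)   # ['are', 'everything']
```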
The search() function scans the string for a match and returns a Match object if
one is found. If there is more than one match, only the first occurrence is
returned.
The Match object contains information about the search and its result. If no
match is found, None is returned.
Example
import re
text = "How are you. How is everything"
matches = re.search("How", text)
print(type(matches))
print(matches)  # matches is a Match object
Output:
<class 're.Match'>
<re.Match object; span=(0, 3), match='How'>
Example
>>> import re
>>> text = "hi happy how. how how everything"
>>> matches = re.search("how", text)
>>> print(matches)
<re.Match object; span=(9, 12), match='how'>
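Because search() returns None when nothing matches, the result should be checked before it is used. The following is a minimal sketch of that check; first_position is a hypothetical helper written for this illustration, not part of the re module.

```python
import re

text = "hi happy how. how how everything"

def first_position(pattern, string):
    """Hypothetical helper: start index of the first match, or -1 if none."""
    m = re.search(pattern, string)
    if m is None:
        # Calling m.start() here would raise AttributeError, so bail out first.
        return -1
    return m.start()

print(first_position("how", text))   # 9
print(first_position("why", text))   # -1
```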
THE MATCH OBJECT METHODS
The Match object provides the following methods and attributes.
span(): Returns a tuple containing the start and end positions of the match.
group(): Returns the part of the string where the match was found.
string: Holds the string that was passed to the function.
Example
import re
text = "How are you. How is everything"
matches = re.search("How", text)
print(matches.span())    # (start, end) positions of the match
print(matches.group())   # the matched text
print(matches.string)    # the string that was searched
Output:
(0, 3)
How
How are you. How is everything
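The Match object has a few more accessors that are often useful alongside span() and group(). The sketch below assumes the same sample string and an illustrative pattern with two capturing groups; group(1), group(2), start() and end() are standard Match methods.

```python
import re

text = "How are you. How is everything"

# Two capturing groups: the word before " is " and the word after it.
m = re.search(r"(\w+) is (\w+)", text)

print(m.group())            # How is everything
print(m.group(1))           # How
print(m.group(2))           # everything
print(m.start(), m.end())   # 13 30
```

Here span() would return (13, 30), the same values as start() and end() combined.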
SPLIT FUNCTION
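The split() function splits a string at each match of the pattern and returns the pieces as a
list. In this example, the chunks are separated by a two-character delimiter made up of
underscores and commas, so the delimiter could be __, _,, ,_ or ,,. The regular expression that
covers these delimiters is '[_,][_,]', where [_,] matches a single character that is either _ or ,.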
Python Program
import re
text = '63__foo,,bar,_mango_,apple'
# split the string into chunks at each two-character delimiter
chunks = re.split('[_,][_,]', text)
# print the chunks
print(chunks)
Output:
['63', 'foo', 'bar', 'mango', 'apple']
The next example uses +, which matches one or more repetitions of the preceding element.
The regular expression '\d+' matches one or more decimal digits, so it can be used
to split a string into chunks that are separated by runs of
digits.
Python Program
import re
text = 'foo635bar4125mango2apple21orange'
# a raw string keeps the backslash in \d from being treated as a string escape
chunks = re.split(r'\d+', text)
print(chunks)
Output:
['foo', 'bar', 'mango', 'apple', 'orange']
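Two variations on the digit split are sketched here, assuming the same sample string: the optional maxsplit argument of re.split() limits the number of splits, and wrapping the pattern in a capturing group keeps the delimiters in the result. The variable names are illustrative.

```python
import re

text = 'foo635bar4125mango2apple21orange'

# maxsplit=2 stops after two splits; the rest of the string is returned as is.
first_two = re.split(r'\d+', text, maxsplit=2)

# A capturing group around the pattern keeps the matched digits in the list.
with_numbers = re.split(r'(\d+)', text)

print(first_two)
# ['foo', 'bar', 'mango2apple21orange']
print(with_numbers)
# ['foo', '635', 'bar', '4125', 'mango', '2', 'apple', '21', 'orange']
```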
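The sub() function replaces every match with the text of your choice. Here each whitespace character is replaced with 8:
>>> matches = re.sub(r"\s", "8", "hi happy how. how how everything")
>>> print(matches)
hi8happy8how.8how8how8everything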
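As a sketch of two further options for sub(), assuming the sample string from the earlier search() example and whitespace as the pattern: the optional count argument limits how many matches are replaced, and subn() also reports how many replacements were made.

```python
import re

text = "hi happy how. how how everything"

# count=2 replaces only the first two whitespace characters.
first_two = re.sub(r"\s", "8", text, count=2)

# subn() returns a tuple: (new string, number of replacements made).
replaced, n = re.subn(r"\s", "8", text)

print(first_two)   # hi8happy8how. how how everything
print(replaced)    # hi8happy8how.8how8how8everything
print(n)           # 5
```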