Regular Expression
RegEx
• A sequence of characters that forms a search
pattern.
• RegEx can be used to check if a string contains the
specified search pattern.
• Python has a built-in package called re,
which can be used to work with Regular
Expressions.
• Import the re module:
import re
• Regular expressions are supported by many programming languages,
such as Python, Java, and JavaScript.
• Applications:
• validating form fields such as email addresses, passwords, and
phone numbers.
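As a rough illustration of the validation use case, the sketch below checks a phone-number-like string with re.fullmatch(). The pattern (10 digits with an optional "+91" prefix) and the helper name is_valid_phone are assumptions made for this example, not a production-grade validator.

```python
import re

# Hypothetical pattern: exactly 10 digits, optionally prefixed with "+91".
# Real phone formats vary widely; this is only an illustration.
PHONE_PATTERN = re.compile(r"(\+91)?\d{10}")

def is_valid_phone(text):
    """Return True if the whole string matches the assumed phone pattern."""
    return PHONE_PATTERN.fullmatch(text) is not None

print(is_valid_phone("9876543210"))     # True
print(is_valid_phone("+919876543210"))  # True
print(is_valid_phone("98765"))          # False
```

fullmatch() is used instead of search() so that extra characters before or after the number cause the check to fail.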
Example
import re
#Check if the string starts with "The" and ends with "Spain":
txt = "The rain in Spain"
x = re.search("^The.*Spain$", txt)
if x:
    print("YES! We have a match!")
else:
    print("No match")
import re
Output:
['ai', 'ai']
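A minimal sketch of a findall() call that produces the output above, assuming the same sample string used in the other examples:

```python
import re

txt = "The rain in Spain"

# findall() returns every non-overlapping match as a list of strings
x = re.findall("ai", txt)
print(x)  # ['ai', 'ai']
```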
Example: Return an empty list if no match was found
import re
txt = "The rain in Spain"
#Check if "Portugal" is in the string:
x = re.findall("Portugal", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")
Output:
[]
No match
Use of findall()
import re
#Check if "INDIA" is in the string (the match is case-sensitive):
txt = "The rain in Spain, INDIA"
x = re.findall("INDIA", txt)
print(x)
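To make the case-sensitivity point concrete, the sketch below runs the same search with a lower-case pattern and then with the re.IGNORECASE flag. The flag is not covered in these notes; it is shown only as one way to relax the default behaviour.

```python
import re

txt = "The rain in Spain, INDIA"

exact = re.findall("INDIA", txt)
lower = re.findall("india", txt)
# re.IGNORECASE makes the pattern match regardless of letter case
ignore_case = re.findall("india", txt, flags=re.IGNORECASE)

print(exact)        # ['INDIA']
print(lower)        # []
print(ignore_case)  # ['INDIA']
```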
The search() Function:
The search() function searches the string for a match, and returns a Match object if there is a
match.
If there is more than one match, only the first occurrence of the match will be returned:
import re
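A minimal sketch of search() returning only the first occurrence, assuming the same sample string and the pattern "ai" (which appears twice in it):

```python
import re

txt = "The rain in Spain"

# "ai" occurs at index 5 ("rain") and index 14 ("Spain");
# search() reports only the first one
x = re.search("ai", txt)
print(x.start())  # 5

# With no match, search() returns None instead of a Match object
missing = re.search("Portugal", txt)
print(missing)  # None
```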
The split() function returns a list where the string has been split at each
match:
import re
#Split the string at every white-space character:
txt = "The rain in Spain"
x = re.split(r"\s", txt)
print(x)
Output: ['The', 'rain', 'in', 'Spain']
Example: Split the string only at the first occurrence
import re
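One way to split only at the first occurrence is the maxsplit parameter of re.split(). The sketch below assumes the same sample string and white-space pattern as the previous example:

```python
import re

txt = "The rain in Spain"

# maxsplit=1 stops after the first split, leaving the rest of the string intact
x = re.split(r"\s", txt, maxsplit=1)
print(x)  # ['The', 'rain in Spain']
```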
The sub() function replaces the matches with the text of your choice:
import re
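A minimal sketch of sub(), assuming the same sample string; the replacement text "9" is an arbitrary choice for illustration. The optional count parameter limits how many matches are replaced.

```python
import re

txt = "The rain in Spain"

# Replace every white-space character with "9"
x = re.sub(r"\s", "9", txt)
print(x)  # The9rain9in9Spain

# Replace only the first two matches
y = re.sub(r"\s", "9", txt, count=2)
print(y)  # The9rain9in Spain
```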
• A Match Object is an object containing information about the search and the result.
• If there is no match, the value None will be returned, instead of the Match Object.
import re
#The search() function returns a Match object:
txt = "The rain in Spain"
x = re.search("ai", txt)
print(x)
Output (Python 3.7 and later; older versions show <_sre.SRE_Match object; ...> instead):
<re.Match object; span=(5, 7), match='ai'>
• The Match object has properties and methods used to retrieve
information about the search and the result:
• .span() returns a tuple containing the start and end positions of the
match.
• .string returns the string passed into the function.
• .group() returns the part of the string where there was a match.
Example: Print the position (start and end position) of the first match occurrence.
import re
#Search for an upper case "S" character in the beginning of a word, and print its position:
txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.span())
Output:
(12, 17)
Example: Print the string passed into the function:
import re
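A minimal sketch of reading the .string property, assuming the same sample string and pattern as the .span() example:

```python
import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)

# .string is the full string that was passed to search(), not just the match
print(x.string)  # The rain in Spain
```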
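Example: Print the part of the string where there was a match:
import re
#Search for an upper case "S" character in the beginning of a word, and print the word:
txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.group())
Output: Spain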
Thank you