Python RegEx: re.match(), re.search(), re.findall() with Example
What is Regular Expression?
A regular expression in a programming language is a special text string used for describing a search
pattern. It is extremely useful for extracting information from text such as code, files, logs, spreadsheets, or
even documents.
When using regular expressions, the first thing to recognize is that everything is essentially a character,
and we are writing patterns to match a specific sequence of characters, also referred to as a string. ASCII or Latin
letters are those found on your keyboard, and Unicode is used to match foreign text. This includes digits,
punctuation, and all special characters such as $#@!%.
For instance, a regular expression could tell a program to search for specific text in a string and then
print out the result accordingly. An expression can include:
Text matching
Repetition
Branching
Pattern-composition etc.
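As a rough illustration of the four ideas listed above, the sketch below uses Python's re module. The sample strings and variable names are made up for this example and are not part of the original tutorial.

```python
import re

# Text matching: a literal pattern matches itself.
literal = re.findall(r"fun", "education is fun")

# Repetition: + means "one or more" of the preceding item.
digits = re.findall(r"\d+", "guru99 has 2 courses")

# Branching: | matches either alternative.
animals = re.findall(r"cat|dog", "one cat, two dogs")

# Pattern composition: smaller pieces combined into one pattern.
word = r"[a-z]+"
number = r"\d+"
composed = re.findall(word + number, "guru99 and abc123")

print(literal)   # ['fun']
print(digits)    # ['99', '2']
print(animals)   # ['cat', 'dog']
print(composed)  # ['guru99', 'abc123']
```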
In Python, a regular expression is denoted as RE (also called REs, regexes, or regex patterns) and is made
available through the re module, which is part of the standard library. Python regular expressions support
various things such as modifiers, identifiers, and whitespace characters.
import re
The "re" module is included with Python and is primarily used for string searching and manipulation
It is also frequently used for web page "scraping" (extracting large amounts of data from websites)
We will begin this tutorial with a simple exercise using the expressions \w+ and ^.
Here we will see an example of how to use the \w+ and ^ expressions in our code. We cover the re.findall
function later in this tutorial, but for now we simply focus on \w+ and ^: \w+ matches one or more word
characters (letters, digits, or underscore), and ^ anchors the match at the start of the string.
For example, for our string "guru99,education is fun", if we execute the code with \w+ and ^, it will give the
output ['guru99'].
import re
xx = "guru99,education is fun"
r1 = re.findall(r"^\w+",xx)
print(r1)
Remember, if you remove the + sign from \w+, the output will change, and it will give only the first character
of the first word, i.e., ['g'].
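The effect of the + quantifier can be checked directly. This is a minimal sketch that reuses the string from the example above; the names with_plus and without_plus are only illustrative.

```python
import re

xx = "guru99,education is fun"

# ^ anchors the match at the start of the string;
# \w+ then takes one or more word characters.
with_plus = re.findall(r"^\w+", xx)

# Without +, \w matches exactly one word character.
without_plus = re.findall(r"^\w", xx)

print(with_plus)     # ['guru99']
print(without_plus)  # ['g']
```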
To understand how this regular expression works in Python, we begin with a simple example of a split
function. In the example, we split a sentence into words using the re.split function, with the
expression \s (which matches a whitespace character) as the separator.
When you execute this code, the first call gives the output ['we', 'are', 'splitting', 'the', 'words'].
Now, let's see what happens if you remove "\" from \s. There is no letter 's' in the output, because
without the backslash the pattern "s" is treated as a regular character, and the string is split
wherever "s" occurs.
Similarly, there is a series of other regular expression tokens in Python that you can use in various ways,
such as \d, \D, $, \., and \b.
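To make the tokens just mentioned more concrete, here is a small sketch of each one. The sample strings and variable names are assumptions chosen for illustration only.

```python
import re

text = "guru99 costs 25.50 in 2024"

# \d matches a digit; \d+ matches a run of digits.
digit_runs = re.findall(r"\d+", text)

# \D matches any character that is not a digit.
non_digit_runs = re.findall(r"\D+", "a1b22c")

# $ anchors the pattern at the end of the string.
ends_with_year = re.search(r"\d{4}$", text)

# \. matches a literal dot (an unescaped . matches almost any character).
decimals = re.findall(r"\d+\.\d+", text)

# \b marks a word boundary, so only the standalone word "in" matches.
whole_words = re.findall(r"\bin\b", "in line, inline, begin in")

print(digit_runs)              # ['99', '25', '50', '2024']
print(non_digit_runs)          # ['a', 'b', 'c']
print(ends_with_year.group())  # 2024
print(decimals)                # ['25.50']
print(whole_words)             # ['in', 'in']
```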
import re
# \s matches a whitespace character, so the sentence is split into words
print(re.split(r'\s', 'we are splitting the words'))
# Without the backslash, 's' is a literal letter and is used as the separator
print(re.split(r's', 'split the words'))
Next, we will look at the types of methods that are used with regular expressions.
re.match()
re.search()
re.findall()
Note: Python offers two different primitive operations based on regular expressions. The match
method checks for a match only at the beginning of the string, while search checks for a match anywhere in
the string.
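The difference described in the note can be seen with a single pattern applied both ways. This is a minimal sketch; the sample text and variable names are assumptions made for this example.

```python
import re

text = "education is fun at guru99"

# re.match only tries the pattern at the very beginning of the string.
at_start = re.match(r"guru99", text)

# re.search scans the string and returns the first match anywhere in it.
anywhere = re.search(r"guru99", text)

print(at_start)          # None
print(anywhere.group())  # guru99
print(anywhere.start())  # 20
```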
Using re.match()
The match function is used to match the RE pattern against a string, with optional flags. In this method, the
expressions "\w+" and "\W" are used to match words starting with the letter 'g'; anything that does not
start with 'g' is not identified. To check for a match in each element of a list or string, we run a for loop.
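A minimal sketch of such a loop follows. The sample list and the pattern (g\w+)\W(g\w+) are assumptions chosen to fit the description above, not necessarily the tutorial's original listing.

```python
import re

# Hypothetical sample data: only the first two entries consist of
# two consecutive words that both start with 'g'.
phrases = ["guru99 get", "guru99 give", "guru Selenium"]

matched = []
for phrase in phrases:
    # (g\w+) a word starting with g, \W one non-word character,
    # (g\w+) another word starting with g.
    z = re.match(r"(g\w+)\W(g\w+)", phrase)
    if z:
        matched.append(z.groups())

print(matched)  # [('guru99', 'get'), ('guru99', 'give')]
```

The third entry is skipped because its second word starts with 'S', so re.match returns None for it.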
Finding Pattern in Text (re.search())
A regular expression is commonly used to search for a pattern in a text. This method takes a regular
expression pattern and a string and searches for that pattern within the string.
In order to use the search() function, you need to import re first and then execute the code. The search()
function takes the "pattern" and the "text" to scan and returns a match object when the
pattern is found, or None if it is not.
For example, here we look for two literal strings, "software testing" and "guru99", in the text string
"software testing is fun" (the search is case-sensitive by default). For "software testing" we find a match,
so the output is "found a match", while the word "guru99" cannot be found in the string, so the output is "no match".
import re
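A minimal sketch of the search described above. It assumes an all-lower-case text so that the default case-sensitive search succeeds; the results dictionary is a hypothetical helper added only to collect the outcomes.

```python
import re

patterns = ["software testing", "guru99"]
text = "software testing is fun"

results = {}
for pattern in patterns:
    # re.search returns a match object if the pattern occurs
    # anywhere in the text, otherwise None.
    if re.search(pattern, text):
        results[pattern] = "found a match"
    else:
        results[pattern] = "no match"
    print(f'Looking for "{pattern}" in "{text}" -> {results[pattern]}')
```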
Python Flags
Many Python regex methods and functions take an optional argument called flags. These flags can
modify the meaning of the given regex pattern. To understand them, we will look at one or two examples of
these flags.
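[re.S] Makes [ . ] match any character, including a newline
[re.M] Makes [ ^ ] and [ $ ] match at the start and end of each line, as in the example below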
import re
xx = """guru99
careerguru99
selenium"""
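k1 = re.findall(r"^\w", xx)  # ^ matches only at the start of the string
k2 = re.findall(r"^\w", xx, re.MULTILINE)  # ^ also matches at the start of each line
print(k1)  # ['g']
print(k2)  # ['g', 'c', 's']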
Likewise, you can also use other Python flags such as re.U (Unicode), re.L (follow locale), and re.X (allow
comments).
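Three further flags, re.I, re.S, and re.X, can be sketched in the same way. The sample strings and variable names below are assumptions for illustration only.

```python
import re

# re.I (re.IGNORECASE): letters match regardless of case.
ignore_case = re.findall(r"guru99", "Guru99 and GURU99", re.I)

# re.S (re.DOTALL): . also matches a newline.
without_dotall = re.search(r"guru.+selenium", "guru99\nselenium")
with_dotall = re.search(r"guru.+selenium", "guru99\nselenium", re.S)

# re.X (re.VERBOSE): whitespace and # comments inside the pattern are ignored.
verbose = re.search(
    r"""
    [a-z]+   # one or more lower-case letters
    \d+      # followed by one or more digits
    """,
    "guru99",
    re.X,
)

print(ignore_case)              # ['Guru99', 'GURU99']
print(without_dotall)           # None
print(with_dotall is not None)  # True
print(verbose.group())          # guru99
```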
Python 2 Example
The code above is written for Python 3. If you want to run it in Python 2, please consider the following code.
Summary
A regular expression in a programming language is a special text string used for describing a search
pattern. It can include digits, punctuation, and all special characters such as $#@!%. An expression can
include:
Literal text matching
Repetition
Branching
Pattern-composition etc.
In Python, a regular expression is denoted as RE (also called REs, regexes, or regex patterns) and is used
through the re module.
The "re" module is included with Python and is primarily used for string searching and manipulation
It is also frequently used for web page "scraping" (extracting large amounts of data from websites)
Regular expression methods include re.match(), re.search(), and re.findall()
Many Python regex methods and functions take an optional argument called flags
These flags can modify the meaning of the given regex pattern
Various Python flags used in regex methods are re.M, re.I, re.S, etc.