
What is Raw String Notation in Python regular expression?
This article discusses what raw string notation is in Python regular expressions. A regular expression (regex) is a sequence of characters that specifies a search pattern. It is mostly used in text processors and search engines to execute find and replace operations.
In Python, regular expressions are used to find patterns in text. But backslash sequences in a normal string (like \n and \t) may create problems, because Python converts them to special characters before the regex engine ever sees the pattern. Raw string notation (r'') is useful here. It makes Python treat backslashes as normal characters, which makes regex patterns easier to write and read.
In normal strings, Python uses the backslash (\) to escape characters. This can change your regex pattern and lead to unexpected results. Using r'' helps you avoid such problems.
Here is an example in which we have not used a raw string -
# Double backslash needed in a normal string
pattern = "\\d+"
Now let's see the same example using a raw string -
# Using a raw string
pattern = r"\d+"
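The difference between the two notations can be checked directly. The following is a minimal sketch. The variable names and the sample text are illustrative assumptions, not part of the original example.

```python
import re

# In a normal string, "\n" is one character (a newline).
# In a raw string, r"\n" is two characters (a backslash and the letter n).
normal_newline = "\n"
raw_newline = r"\n"
print(len(normal_newline))  # 1
print(len(raw_newline))     # 2

# A doubled backslash in a normal string and a single backslash
# in a raw string produce exactly the same pattern text.
escaped_pattern = "\\d+"
raw_pattern = r"\d+"
print(escaped_pattern == raw_pattern)  # True

# Both spellings therefore behave identically as a regex.
print(re.findall(raw_pattern, "Room 42, floor 7"))  # ['42', '7']
```

The raw form is simply easier to read, since each backslash appears once, the same way the regex engine receives it.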
How to Use Raw String Notation?
When we define regex patterns, we should always use raw string notation to avoid confusion. Here is the general syntax you can use in your programs -
# Importing the re module
import re

pattern = r"your_regex_pattern"
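Raw strings help most when the text itself contains backslashes. The sketch below matches a literal backslash. The sample path is a hypothetical value chosen only for illustration.

```python
import re

# Hypothetical sample text: the actual characters are C:\Users\guest
path = "C:\\Users\\guest"

# The regex engine needs two backslashes to mean "one literal backslash".
raw_pattern = r"\\"      # two characters, written as a raw string
plain_pattern = "\\\\"   # the same two characters, written as a normal string

print(raw_pattern == plain_pattern)        # True
print(len(re.findall(raw_pattern, path)))  # 2
```

Without raw string notation, each backslash has to be escaped once for Python and once for the regex engine, which is why the normal string needs four of them.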
Now let us see some examples with Raw String Notation in the section below -
Example 1
In this example, we use the re.findall() method from the re module to find all the digits in a string. The pattern is \d+, which means one or more digits.
# Import the re module
import re

# Sample text
text = "There are 126258 Articles in tutorialspoint website"

# Raw string notation
pattern = r"\d+"

# Find all matches
matches = re.findall(pattern, text)

# Print the matches
print(matches)
This will create the following outcome -
['126258']
Example 2
In this example, we use the re.findall() method to find all the email addresses in a string. The pattern is intentionally simple and is not meant to cover every valid address format.
# Import the re module
import re

# Pattern to match simple email addresses
pattern = r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"

# Sample text (placeholder addresses)
text = "Contact us at support@example.com or sales@example.org"

# Find all matches
matches = re.findall(pattern, text)

# Print the matches
print(matches)
This will generate the following result -
['support@example.com', 'sales@example.org']
Example 3
In this example, the pattern we have used is \b\w+\b, which matches whole words: one or more word characters with a word boundary on each side.
# Import the re module
import re

# Sample text
txt = "Hello, tutorialspoint!"

# Pattern to find words
p = r"\b\w+\b"

# Find all matches
print(re.findall(p, txt))
This will produce the following result -
['Hello', 'tutorialspoint']
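The word-boundary pattern from Example 3 also shows what goes wrong without raw string notation. In a normal string, Python converts \b to a backspace character before the regex engine sees it. The sketch below compares the two forms. The variable names are illustrative.

```python
import re

txt = "Hello, tutorialspoint!"

# Raw string: \b reaches the regex engine as a word boundary.
raw_pattern = r"\b\w+\b"

# Normal string: "\b" becomes a backspace character (ASCII 8),
# so the regex looks for backspaces around the word characters.
plain_pattern = "\b\\w+\b"

print(re.findall(raw_pattern, txt))    # ['Hello', 'tutorialspoint']
print(re.findall(plain_pattern, txt))  # []
```

The second call finds nothing because the text contains no backspace characters. No error is raised, which makes this kind of mistake easy to miss.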