
Python Regular Expression for Zero or More Occurrences
In this article, we will use regular expressions to match zero or more occurrences of a pattern. Regular expressions (regex) are a tool for identifying patterns in text, and Python's re module provides support for working with them. Zero or more occurrences is a key regex concept: the quantified element may be absent, or it may repeat any number of times.
What Does Zero or More Occurrences Mean?
In regex, the * symbol denotes zero or more instances of the preceding element. For example, the pattern a* will match -
- An empty string (zero occurrences)
- "a"
- "aa"
- "aaa"
- And so on...
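The list above can be checked with a minimal sketch. It assumes re.fullmatch() is used so that the whole string must fit the pattern; the candidate strings, including the non-matching "ab", are chosen only for illustration.

```python
import re

# Candidate strings: the first four should fit a*, the last should not.
candidates = ["", "a", "aa", "aaa", "ab"]

# re.fullmatch succeeds only if the entire string fits the pattern.
results = {text: bool(re.fullmatch(r"a*", text)) for text in candidates}

for text, ok in results.items():
    print(repr(text), ok)
```

The empty string fits because * allows zero repetitions, while "ab" fails because b is not covered by the pattern.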
Let us now look at some examples, ranging from simple to advanced -
Example 1: Matching Zero or More Letters
In this example, we define the regex pattern a*, which matches zero or more occurrences of the character a in a string. Because the pattern can match nothing at all, re.findall() also returns an empty string at every position where no a is found.
import re

# Define a string to search
input_string = "aaa bbb cc d eee"

# Use regex
pattern = r'a*'

# Find all matches
matches = re.findall(pattern, input_string)
print(matches)
Output
When you run the program, it will show this output -
['aaa', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
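To see where the empty strings come from, one option is to record the position of every match with re.finditer(). This is a minimal sketch using the same input string; the names spans and non_empty are introduced here only for illustration.

```python
import re

input_string = "aaa bbb cc d eee"

# Record (start, end, text) for every match of a*.
spans = [(m.start(), m.end(), m.group()) for m in re.finditer(r"a*", input_string)]

# Keep only the matches that actually consumed characters.
non_empty = [text for _, _, text in spans if text]

print(spans[:3])
print(non_empty)
```

Every position that does not start a run of a yields a zero-length match, including the end of the string. Filtering out the empty strings, or using a+ when at least one character is required, leaves only the runs that were found.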
Example 2: Matching Spaces or No Spaces
In the Python program below, we use \s* to match zero or more whitespace characters. In the string "Hello World", the pattern matches the space between the two words and produces an empty match at every other position.
import re

# Define a string with various spaces
input_string = "Hello World"

# Use Regex
pattern = r'\s*'

# Find all matches
matches = re.findall(pattern, input_string)
print(matches)
Output
After running the program, you will get this result -
['', '', '', '', '', ' ', '', '', '', '', '', '']
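On its own, \s* mostly produces empty matches. It tends to be more useful next to a required character, where it absorbs optional whitespace. The sketch below assumes a comma-separated string (the sample data is made up for illustration) and splits it with re.split().

```python
import re

# Hypothetical input with inconsistent spacing around the commas.
raw = "red ,green,  blue , yellow"

# The comma is required; the whitespace on either side is optional.
fields = re.split(r"\s*,\s*", raw)
print(fields)
```

Because the comma must be present, this pattern can never match an empty string, so the result contains only the trimmed fields.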
Example 3: Validating an Email
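In this program, we check a simple email format. The regex pattern [a-zA-Z]* accepts zero or more letters before the '@' symbol, so it also matches an address with nothing before the '@', such as "@example.com". Note that re.match() anchors the pattern only at the start of the string, so any characters after the matched part are not checked.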
import re

# Define a sample email string
email = "[email protected]"

# Match a simple email format
pattern = r'[a-zA-Z]*@[a-zA-Z]+\.[a-zA-Z]+'

# Search for the pattern
if re.match(pattern, email):
    print(f"'{email}' is a valid email address.")
else:
    print(f"'{email}' is not a valid email address.")
Output
This output will be displayed when the program runs -
'[email protected]' is a valid email address.
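The effect of * in this pattern can be compared with a stricter variant. The following is a minimal sketch, not a complete email validator: the sample addresses and the names loose and strict are assumptions made for illustration, and re.fullmatch() is used in place of re.match() so that the whole string must fit.

```python
import re

# Pattern from the example: zero or more letters before '@'.
loose = r"[a-zA-Z]*@[a-zA-Z]+\.[a-zA-Z]+"
# Variant with +: at least one letter before '@'.
strict = r"[a-zA-Z]+@[a-zA-Z]+\.[a-zA-Z]+"

# Hypothetical addresses used only to show the difference.
samples = ["user@example.com", "@example.com", "user@example.com!!"]

for s in samples:
    print(
        repr(s),
        "match/loose:", bool(re.match(loose, s)),
        "fullmatch/strict:", bool(re.fullmatch(strict, s)),
    )
```

With re.match() and *, all three samples are accepted. With re.fullmatch() and +, only the first one is, because the empty local part and the trailing characters are both rejected.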
Example 4: Extract Tags from HTML
This example shows how to extract HTML tags from a string. The regex <.*?> matches a '<', then zero or more characters, then a '>'. The ? after the * makes the quantifier lazy (non-greedy), so each match stops at the first '>' it reaches instead of running on to the last '>' in the string.
import re

# Define a string containing HTML
html_content = "<div>Hello</div><span>World</span>"

# Regex to find HTML tags
pattern = r'<.*?>'

# Find all HTML tags
tags = re.findall(pattern, html_content)
print(tags)
Output
This will lead to the following outcome -
['<div>', '</div>', '<span>', '</span>']
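To see what the ? contributes, the same string can be searched with both the greedy and the lazy form of the quantifier. This is a minimal sketch; the names greedy and lazy are introduced here only for illustration.

```python
import re

html_content = "<div>Hello</div><span>World</span>"

# Greedy: .* runs to the last '>' in the string, giving one long match.
greedy = re.findall(r"<.*>", html_content)

# Lazy: .*? stops at the first '>' after each '<', giving one match per tag.
lazy = re.findall(r"<.*?>", html_content)

print(greedy)
print(lazy)
```

The greedy pattern returns the whole string as a single match, while the lazy pattern returns the four individual tags. For real-world HTML, a dedicated parser is usually a safer choice than a regex.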