
Filter Valid Emails in a Pandas Series Using Regex
A regular expression is a sequence of characters that defines a search pattern. In this program, we will use a regular expression to separate valid emails from invalid ones.
We will define a Pandas series containing different emails and check which of them are valid. We will also use re, Python's built-in module for working with regular expressions.
Algorithm
Step 1: Define a Pandas series of different email ids.
Step 2: Define a regex for checking the validity of emails.
Step 3: Use the re.search() function in the re library to check the validity of each email.
Example Code
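import pandas as pd
import re

# Series holding the email ids to check
series = pd.Series(['[email protected]', 'hellowolrd.com'])

# Raw string so the backslashes reach the regex engine unchanged
regex = r'^[a-z0-9]+[\._]?[a-z0-9]+[@]\w+[.]\w{2,3}$'

for email in series:
    # re.search returns a match object for a valid email, otherwise None
    if re.search(regex, email):
        print("{}: Valid Email".format(email))
    else:
        print("{} : Invalid Email".format(email))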
Output
[email protected]: Valid Email
hellowolrd.com : Invalid Email
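The loop above prints a verdict for each email but does not actually filter the series. A minimal sketch of the filtering step, assuming the same pattern and using the Pandas string accessor str.match to build a boolean mask, could look like this. The sample addresses are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Same pattern as in the example above, written as a raw string.
regex = r'^[a-z0-9]+[\._]?[a-z0-9]+[@]\w+[.]\w{2,3}$'

# Hypothetical sample addresses, not taken from the original example.
emails = pd.Series(['alice@example.com', 'bob.smith@example.org', 'hellowolrd.com'])

# str.match tests the pattern against every element and returns a boolean Series.
is_valid = emails.str.match(regex)

# Boolean indexing keeps only the rows where the mask is True.
valid_emails = emails[is_valid]
invalid_emails = emails[~is_valid]

print(valid_emails.tolist())
print(invalid_emails.tolist())
```

Working with a mask avoids the explicit Python loop and leaves the result as a Series, so it can be used directly in later Pandas operations.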
Explanation
The regex variable has the following symbols:
- ^: Anchor for the start of the string
- [ ]: Opening and closing square brackets define a character class to match a single character
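- \ : Escape character; \. stands for a literal dot and \w matches a word character (a letter, a digit or an underscore)
- . : An unescaped dot outside square brackets matches any character except the newline symbol; in this regex every dot is escaped or placed inside square brackets, so it matches a literal dot
- + : The plus sign matches the preceding element one or more times
- ? : The question mark makes the preceding element optional
- {} : The opening and closing curly brackets set a repetition count; {2,3} matches the preceding element two or three times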
- $ : The dollar sign is the anchor for the end of the string
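To see how these symbols work together, the sketch below runs the same pattern against a few hypothetical addresses (none of them come from the original example). The helper name is_valid is an assumption introduced only for this illustration.

```python
import re

# Same pattern as in the example code, written as a raw string.
regex = r'^[a-z0-9]+[\._]?[a-z0-9]+[@]\w+[.]\w{2,3}$'

def is_valid(email):
    """Hypothetical helper: True if the whole string fits the pattern."""
    return re.search(regex, email) is not None

# Hypothetical samples, each exercising one part of the pattern.
samples = [
    'bob.smith@example.org',   # one optional '.' or '_' between two lowercase runs
    'bob..smith@example.org',  # '?' allows at most one separator
    'Alice@example.com',       # [a-z0-9] does not include uppercase letters
    'user@example.info',       # {2,3} rejects a four-letter ending
    'hellowolrd.com',          # no '@' at all
]

for sample in samples:
    print("{}: {}".format(sample, is_valid(sample)))
```

Only the first sample is accepted. The other results show that this pattern is deliberately strict and will reject some addresses that are valid in practice, such as those with uppercase letters or longer domain endings.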