
Match Spaces and Newlines with Python Regex
Python's built-in re (regular expression) module provides special sequences for matching spaces, newlines, tabs, and other whitespace. A space can be matched with the pattern " " and a newline with the pattern "\n".
The following is a brief overview of these special characters:
- Whitespace Character \s : Matches any whitespace character.
- Tab \t : Matches a tab character.
- Newline \n : Matches a newline character.
- Vertical Tab \v : Matches a vertical tab character.
- Form Feed \f : Matches a form feed character.
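As a quick check of the list above, the following sketch applies each sequence to a small sample string. The sample string and the variable names are illustrative assumptions, not part of the original tutorial.

```python
import re

# Illustrative sample containing a space, tab, newline, vertical tab and form feed
sample = "a b\tc\nd\ve\ff"

# Each of these patterns matches only its own character
tabs = re.findall(r"\t", sample)
newlines = re.findall(r"\n", sample)
vertical_tabs = re.findall(r"\v", sample)
form_feeds = re.findall(r"\f", sample)

# \s matches every whitespace character in the sample
all_whitespace = re.findall(r"\s", sample)

print(tabs)
print(newlines)
print(all_whitespace)
```

Here \s finds all five whitespace characters, while each of the other patterns finds exactly one.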
Matching All Spaces and Newlines
To match spaces and newlines specifically, we can use a character class in regex like [ \n]. This tells Python to look for either a space (" ") or a newline ("\n") in the string. If we want to match one or more of them together, we can use a + quantifier after the class.
Example
The following example demonstrates how to find all groups of spaces and newlines in a text (string). The re.findall() function from the re module is used to find all occurrences of a pattern in a given string.
import re

text = "Hello \n\nWorld!\nThis is Python."
matches = re.findall(r"[ \n]+", text)
print(matches)
The output lists each match, which may contain spaces, newlines, or both:
[' \n\n', '\n', ' ', ' ']
Using \s with Filters to Match Spaces and Newlines
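In Python regex, \s matches any whitespace character, including space, tab, newline, carriage return, vertical tab, and form feed (for Unicode string patterns, it also matches other Unicode whitespace). To match only spaces and newlines, we need to filter the \s results or use a narrower pattern such as [ \n]. If the input contains only spaces and newlines, \s+ and [ \n]+ produce the same matches.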
Example
In this example, we use the re.findall() function with \s+ to find whitespace sequences, then a list comprehension to keep only the sequences made up entirely of spaces and newlines.
import re

text = "Hello \n\n\tWorld\n"

# Find all runs of whitespace characters
whitespace_matches = re.findall(r"\s+", text)

# Keep only the runs that consist solely of spaces and newlines (optional)
filtered = [match for match in whitespace_matches if all(c in [' ', '\n'] for c in match)]
print(filtered)
Following is the output of the above code:
['\n']
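Note that the filter above drops the first run (' \n\n\t') entirely, because it contains a tab. As an alternative sketch, assuming the goal is to keep the space-and-newline part of such runs, the character class [ \n]+ can be applied directly to the same text. The variable name direct is an illustrative choice.

```python
import re

text = "Hello \n\n\tWorld\n"

# Match runs made up only of spaces and newlines; the tab ends the first run
direct = re.findall(r"[ \n]+", text)
print(direct)
```

This prints [' \n\n', '\n']: the tab is left out of the first match instead of causing the whole run to be discarded.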
Using re.sub() to Remove Spaces and Newlines
If the task is to remove all spaces and newlines, we can use re.sub(). This function performs regular expression substitution, replacing each occurrence of a pattern in a string with a replacement string or with the result of a callable.
Example
The following example demonstrates how to remove all spaces and newlines in a string using the re.sub() function.
import re

text = "Text with \n extra spaces\nand newlines."
cleaned = re.sub(r"[ \n]+", "", text)
print(cleaned)
Following is the output of the above code:
Textwithextraspacesandnewlines.
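Since re.sub() also accepts a callable as the replacement, the same pattern can collapse whitespace instead of deleting it. The following is a minimal sketch, assuming we want each run reduced to a single newline when it contains one, and to a single space otherwise. The helper name collapse and the sample text are hypothetical.

```python
import re

text = "Text with \n extra   spaces\nand newlines."

def collapse(match):
    # Hypothetical helper: keep one newline if the run contains any, else one space
    return "\n" if "\n" in match.group() else " "

# re.sub() calls collapse() once for every run of spaces and newlines
collapsed = re.sub(r"[ \n]+", collapse, text)
print(collapsed)
```

The callable receives a match object for every run, so the replacement can depend on what was actually matched.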