How does nested character class subtraction work in Python?


Nested Character Class Subtraction

Since we can use the full character class syntax within the subtracted character class, we can subtract a class from the class being subtracted. [0-9-[0-7-[0-3]]] first subtracts 0-3 from 0-7, yielding [0-9-[4-7]], or [0-38-9], which matches any character in the string 012389.
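Python's built-in re module does not implement the [...-[...]] subtraction syntax itself, so the following is only a minimal sketch of the same logic, assuming negative lookaheads as a stand-in for subtraction. The names inner and nested are illustrative and do not come from the original text.

```python
import re

# Emulates [0-9-[0-7-[0-3]]] with nested negative lookaheads.
# inner:  a character in 0-7 that is not in 0-3, i.e. one of 4-7
# nested: a digit 0-9 that the inner pattern does NOT match
inner = r"(?![0-3])[0-7]"
nested = re.compile(rf"(?!{inner})[0-9]")

matched = "".join(nested.findall("0123456789"))
print(matched)  # 012389
```

The lookahead form is used here because it nests the same way the subtraction does: each level removes characters from the class that follows it.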

The class subtraction must always be the last element in the character class. [0-9-[4-7]a-d] is not a valid regular expression; it should be rewritten as [0-9a-d-[4-7]]. The subtraction applies to the whole class, not only to the range written immediately before it.
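As a rough illustration of subtraction applying to the whole class, here is a sketch of [0-9a-d-[4-7]] in Python. It assumes a negative lookahead in place of the subtraction syntax, which the re module lacks; whole_class and kept are hypothetical names.

```python
import re

# Emulates [0-9a-d-[4-7]]: the lookahead guards the entire class,
# so 4-7 are excluded regardless of which range contributed them.
whole_class = re.compile(r"(?![4-7])[0-9a-d]")

kept = "".join(whole_class.findall("0123456789abcdef"))
print(kept)  # 012389abcd
```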

While we can use nested character class subtraction, we cannot subtract two classes sequentially. To subtract ASCII characters and Arabic characters from a class with all Unicode letters, combine the ASCII and Arabic characters into one class, and subtract that, as in [\p{L}-[\p{IsBasicLatin}\p{IsArabic}]].
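The re module supports neither \p{...} properties nor class subtraction, so the sketch below only mirrors the logic of that last pattern with plain predicates. It assumes \p{L} corresponds to Unicode general categories starting with "L", IsBasicLatin to the block U+0000 to U+007F, and IsArabic to the block U+0600 to U+06FF. All function names are hypothetical.

```python
import unicodedata

def is_letter(ch):
    # Rough equivalent of \p{L}: any Unicode letter category (Lu, Ll, Lt, Lm, Lo)
    return unicodedata.category(ch).startswith("L")

def in_basic_latin(ch):
    # Basic Latin block, U+0000 to U+007F
    return ord(ch) <= 0x7F

def in_arabic(ch):
    # Arabic block, U+0600 to U+06FF
    return 0x0600 <= ord(ch) <= 0x06FF

def matches(ch):
    # One combined subtraction: letters minus (Basic Latin or Arabic)
    return is_letter(ch) and not (in_basic_latin(ch) or in_arabic(ch))

# Latin a, e-acute, Cyrillic ya, Arabic beh, digit 1
sample = ["a", "\u00e9", "\u044f", "\u0628", "1"]
result = [ch for ch in sample if matches(ch)]
print(result)
```

The two excluded blocks are joined with a single "or" inside one negation, which is the predicate analogue of merging them into one subtracted class instead of subtracting twice.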