Python Unit - V

Unit V of the Python course focuses on Object-Oriented Programming (OOP) concepts, including classes, objects, inheritance, and polymorphism. It explains key terminology such as class variables, instance variables, and method overloading, along with practical examples of creating classes and instances. The unit also covers garbage collection, built-in class attributes, and different types of inheritance, highlighting the benefits of OOP in terms of code reusability and modularity.

Uploaded by

karpagam

Python: Unit - V

UNIT - V
Python – Object Oriented
Python has been an object-oriented language since its inception. Because of this, creating
and using classes and objects is straightforward.
Almost everything in Python is an object, with its own properties and methods. A class is
like an object constructor, or a "blueprint" for creating objects.

Overview of OOP Terminology


➢ Class − A user-defined prototype for an object that defines a set of attributes that
characterize any object of the class. The attributes are data members (class variables
and instance variables) and methods, accessed via dot notation.
➢ Class variable − A variable that is shared by all instances of a class. Class variables are
defined within a class but outside any of the class's methods. Class variables are not
used as frequently as instance variables are.
➢ Data member − A class variable or instance variable that holds data associated with a
class and its objects.
➢ Function overloading − The assignment of more than one behavior to a particular
function. The operation performed varies by the types of objects or arguments involved.
➢ Instance variable − A variable that is defined inside a method and belongs only to the
current instance of a class.
➢ Inheritance − The transfer of the characteristics of a class to other classes that are
derived from it.
➢ Instance − An individual object of a certain class. An object obj that belongs to a class
Circle, for example, is an instance of the class Circle.
➢ Instantiation − The creation of an instance of a class.
➢ Method − A special kind of function that is defined in a class definition.
➢ Object − A unique instance of a data structure that's defined by its class. An object
comprises both data members (class variables and instance variables) and methods.
➢ Operator overloading − The assignment of more than one function to a particular
operator.

Department of CS&IT, S.S.D.M College. 1


Creating Classes
The class statement creates a new class definition. The name of the class immediately
follows the keyword class followed by a colon.
Syntax:
class ClassName:
    'Optional class documentation string'
    class_suite
✓ The class has a documentation string, which can be accessed via
ClassName.__doc__.
✓ The class_suite consists of all the component statements defining class
members, data attributes and functions.
Example:
class Employee:
    'Common base class for all employees'
    empCount = 0

    def __init__(self, name, salary):
        self.name = name
        self.salary = salary
        Employee.empCount += 1

    def displayCount(self):
        print("Total Employee", Employee.empCount)

    def displayEmployee(self):
        print("Name : ", self.name, ", Salary: ", self.salary)

✓ The variable empCount is a class variable whose value is shared among all
instances of this class. It can be accessed as Employee.empCount from
inside the class or outside the class.
✓ The first method __init__() is a special method, which is called class constructor
or initialization method that Python calls when you create a new instance of this
class.

✓ You declare other class methods like normal functions, with the exception that
the first argument to each method is self. Python passes the instance as self
for you; you do not need to include it when you call the methods.
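
To make the role of self concrete, here is a minimal sketch. The Greeter class is a hypothetical example, not part of the Employee program. It shows that calling a method on an instance is equivalent to calling the function through the class and passing the instance explicitly.

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self):
        # self is the instance the method was called on
        return "Hello, " + self.name


g = Greeter("Zara")

via_instance = g.greet()       # Python supplies g as self
via_class = Greeter.greet(g)   # the same call with self passed by hand

print(via_instance)
print(via_class)
```

Both calls produce the same string, which is why self never appears in the argument list at the call site.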

Creating Instance Objects


To create instances of a class, call the class using class name and pass in whatever
arguments its __init__ method accepts.
Example:
# This creates the first object of the Employee class
emp1 = Employee("Zara", 2000)
# This creates the second object of the Employee class
emp2 = Employee("Manni", 5000)

Accessing Attributes
Access an object's attributes using the dot operator on the object. A class variable is
accessed using the class name.
Example:
emp1.displayEmployee()
emp2.displayEmployee()
print("Total Employee", Employee.empCount)

Complete Program:
class Employee:
    'Common base class for all employees'
    empCount = 0

    def __init__(self, name, salary):
        self.name = name
        self.salary = salary
        Employee.empCount += 1

    def displayCount(self):
        print("Total Employee", Employee.empCount)

    def displayEmployee(self):
        print("Name : ", self.name, ", Salary: ", self.salary)

# This creates the first object of the Employee class
emp1 = Employee("Zara", 2000)

# This creates the second object of the Employee class
emp2 = Employee("Manni", 5000)

emp1.displayEmployee()
emp2.displayEmployee()
print("Total Employee:", Employee.empCount)

Instead of using the normal statements to access attributes, you can use the following
built-in functions:
✓ getattr(obj, name[, default]) − to access an attribute of the object.
✓ hasattr(obj, name) − to check whether an attribute exists.
✓ setattr(obj, name, value) − to set an attribute. If the attribute does not exist, it
is created.
✓ delattr(obj, name) − to delete an attribute.
Example:
setattr(emp1, 'age', 8)    # Sets attribute 'age' to 8 (creates it if missing)
hasattr(emp1, 'age')       # Returns True if the 'age' attribute exists
getattr(emp1, 'age')       # Returns the value of the 'age' attribute
delattr(emp1, 'age')       # Deletes the attribute 'age'
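
The optional default argument of getattr() is worth a short illustration, because without it getattr() raises AttributeError for a missing attribute. The following is a minimal sketch that assumes a cut-down Employee class with only name and salary.

```python
class Employee:
    def __init__(self, name, salary):
        self.name = name
        self.salary = salary


emp = Employee("Zara", 2000)

# 'age' has not been set yet, so the default is returned instead of an error
age_before = getattr(emp, "age", None)

setattr(emp, "age", 8)            # creates the attribute
age_after = getattr(emp, "age")   # now the attribute exists

delattr(emp, "age")               # removes it again
has_age = hasattr(emp, "age")

print(age_before, age_after, has_age)
```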

Built-In Class Attributes


Every Python class keeps the following built-in attributes, and they can be accessed using
the dot operator like any other attribute −
➢ __dict__ − Dictionary containing the class's namespace.
➢ __doc__ − Class documentation string, or None if undefined.
➢ __name__ − Class name.
➢ __module__ − Module name in which the class is defined. This attribute is
"__main__" in interactive mode.
➢ __bases__ − A possibly empty tuple containing the base classes, in the order of
their occurrence in the base class list.
Example:
class Employee:
    empCount = 0

    def __init__(self, name, salary):
        self.name = name
        self.salary = salary
        Employee.empCount += 1

    def displayCount(self):
        print("Total Employee", Employee.empCount)

    def displayEmployee(self):
        print("Name : ", self.name, ", Salary: ", self.salary)

print("Employee.__doc__:", Employee.__doc__)
print("Employee.__name__:", Employee.__name__)
print("Employee.__module__:", Employee.__module__)
print("Employee.__bases__:", Employee.__bases__)
print("Employee.__dict__:", Employee.__dict__)

Destroying Objects (Garbage Collection)


Python deletes unneeded objects (built-in types or class instances) automatically to free
the memory space. The process by which Python periodically reclaims blocks of memory that
no longer are in use is termed Garbage Collection.
Python's garbage collector runs during program execution and is triggered when an
object's reference count reaches zero. An object's reference count changes as the number of
aliases that point to it changes.
An object's reference count increases when it is assigned a new name or placed in a
container (list, tuple, or dictionary). The object's reference count decreases when it's deleted
with del, its reference is reassigned, or its reference goes out of scope. When an object's
reference count reaches zero, Python collects it automatically.
Example:
a = 40        # Create object <40>
b = a         # Increase ref. count of <40>
c = [b]       # Increase ref. count of <40>

del a         # Decrease ref. count of <40>
b = 100       # Decrease ref. count of <40>
c[0] = -1     # Decrease ref. count of <40>
Normally you will not notice when the garbage collector destroys an orphaned instance and
reclaims its space. But a class can implement the special method __del__(), called a destructor,
that is invoked when the instance is about to be destroyed. This method might be used to clean
up any non-memory resources used by an instance.
Example:
class Point:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __del__(self):
        class_name = self.__class__.__name__
        print(class_name, "destroyed")

pt1 = Point()
pt2 = pt1
pt3 = pt1
print(id(pt1), id(pt2), id(pt3))  # prints the ids of the objects (all three are the same)
del pt1
del pt2
del pt3  # last reference removed; __del__ runs here in CPython
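
One way to observe that an object survives until its last reference disappears is a weak reference, which does not keep the object alive. The sketch below uses a hypothetical Resource class and relies on CPython's reference counting; other Python implementations may reclaim the object later.

```python
import weakref


class Resource:
    pass


obj = Resource()
alias = obj                 # second strong reference to the same object
probe = weakref.ref(obj)    # weak reference: does not raise the reference count

del obj
alive_after_first_del = probe() is not None    # alias still refers to the object

del alias
alive_after_second_del = probe() is not None   # no strong references remain

print(alive_after_first_del, alive_after_second_del)
```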

Inheritance
Instead of starting from scratch, you can create a class by deriving it from a preexisting
class by listing the parent class in parentheses after the new class name.
The child class inherits the attributes of its parent class, and you can use those attributes
as if they were defined in the child class. A child class can also override data members and
methods from the parent.
Syntax:
class SubClassName(ParentClass1[, ParentClass2, ...]):
    'Optional class documentation string'
    class_suite
Example:
class Parent:                # define parent class
    parentAttr = 100

    def __init__(self):
        print("Calling parent constructor")

    def parentMethod(self):
        print('Calling parent method')

    def setAttr(self, attr):
        Parent.parentAttr = attr

    def getAttr(self):
        print("Parent attribute :", Parent.parentAttr)

class Child(Parent):         # define child class
    def __init__(self):
        print("Calling child constructor")

    def childMethod(self):
        print('Calling child method')

c = Child()          # instance of child
c.childMethod()      # child calls its method
c.parentMethod()     # calls parent's method
c.setAttr(200)       # again call parent's method
c.getAttr()          # again call parent's method
Similarly, you can derive a class from multiple parent classes as follows:
Syntax:
class A:          # define your class A
    .....
class B:          # define your class B
    .....
class C(A, B):    # subclass of A and B
    .....
You can use the issubclass() or isinstance() functions to check the relationships between
classes and instances.
➢ The issubclass(sub, sup) boolean function returns True if the given subclass sub
is indeed a subclass of the superclass sup.
➢ The isinstance(obj, Class) boolean function returns True if obj is an instance of
class Class or is an instance of a subclass of Class.
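
The two checks described above can be sketched as follows. The classes A, B, C and Unrelated are placeholders chosen for illustration.

```python
class A:
    pass


class B:
    pass


class C(A, B):      # C derives from both A and B
    pass


class Unrelated:
    pass


c = C()

checks = {
    "issubclass(C, A)": issubclass(C, A),
    "issubclass(C, B)": issubclass(C, B),
    "issubclass(A, C)": issubclass(A, C),          # the relation is not symmetric
    "isinstance(c, A)": isinstance(c, A),          # instance of a subclass counts
    "isinstance(c, Unrelated)": isinstance(c, Unrelated),
}

for label, result in checks.items():
    print(label, "->", result)
```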

Types of Inheritance
✓ Single Inheritance
✓ Multiple Inheritance
✓ Multilevel Inheritance
✓ Hierarchical Inheritance
✓ Hybrid Inheritance

Single Inheritance
Single inheritance occurs when a class inherits from only one superclass. This is the
simplest form of inheritance.
Example:
class Animal:
    def speak(self):
        print("Animal speaks")

class Dog(Animal):
    def bark(self):
        print("Dog barks")

# Create an instance of Dog
dog = Dog()
dog.speak()   # Output: "Animal speaks"
dog.bark()    # Output: "Dog barks"

Multiple Inheritance
Multiple inheritance occurs when a class inherits from multiple superclasses. It allows
a subclass to inherit attributes and methods from more than one parent class.
Syntax:
class A:
    pass
class B:
    pass
class C(A, B):   # C inherits from both A and B
    pass
Example:
class A:
    def method_a(self):
        print("Method A")

class B:
    def method_b(self):
        print("Method B")

class C(A, B):
    def method_c(self):
        print("Method C")

# Create an instance of C
c = C()
c.method_a()   # Output: "Method A"
c.method_b()   # Output: "Method B"
c.method_c()   # Output: "Method C"

Multilevel Inheritance
Multilevel inheritance occurs when a class inherits from a superclass, and then another
class inherits from this subclass, creating a chain of inheritance.
Syntax:
class A:
    pass
class B(A):   # B inherits from A
    pass
class C(B):   # C inherits from B
    pass
Example:
class A:
    def method_a(self):
        print("Method A")

class B(A):
    def method_b(self):
        print("Method B")

class C(B):
    def method_c(self):
        print("Method C")

# Create an instance of C
c = C()
c.method_a()   # Output: "Method A"
c.method_b()   # Output: "Method B"
c.method_c()   # Output: "Method C"

Hierarchical Inheritance
Hierarchical inheritance occurs when multiple subclasses inherit from the same
superclass.
Syntax:
class Animal:
    pass
class Dog(Animal):   # Dog inherits from Animal
    pass
class Cat(Animal):   # Cat also inherits from Animal
    pass
Example:
class Animal:
    def speak(self):
        print("Animal speaks")

class Dog(Animal):
    def bark(self):
        print("Dog barks")

class Cat(Animal):
    def meow(self):
        print("Cat meows")

# Create an instance of Dog and Cat
dog = Dog()
cat = Cat()
dog.speak()   # Output: "Animal speaks"
dog.bark()    # Output: "Dog barks"
cat.speak()   # Output: "Animal speaks"
cat.meow()    # Output: "Cat meows"

Hybrid Inheritance
Hybrid inheritance is a combination of multiple inheritance and multilevel inheritance.
Syntax:
class A:
    pass
class B(A):
    pass
class C(A):
    pass
class D(B, C):   # D inherits from both B and C
    pass
Example:
class A:
    def method_a(self):
        print("Method A")

class B(A):
    def method_b(self):
        print("Method B")

class C(A):
    def method_c(self):
        print("Method C")

class D(B, C):
    def method_d(self):
        print("Method D")

# Create an instance of D
d = D()
d.method_a()   # Output: "Method A"
d.method_b()   # Output: "Method B"
d.method_c()   # Output: "Method C"
d.method_d()   # Output: "Method D"
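
When B and C both override the same method, Python decides which version D uses by following the method resolution order (MRO). The sketch below reuses the A, B, C, D shape from the example; the who() method is a hypothetical addition for illustration.

```python
class A:
    def who(self):
        return "A"


class B(A):
    def who(self):
        return "B"


class C(A):
    def who(self):
        return "C"


class D(B, C):   # B is listed first, so it is searched before C
    pass


mro_names = [cls.__name__ for cls in D.__mro__]
chosen = D().who()

print(mro_names)   # ['D', 'B', 'C', 'A', 'object']
print(chosen)      # B
```

The order of the base classes in the class statement matters: writing class D(C, B) would make C's version win instead.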

Benefits
➢ Code Reusability: Inheritance promotes code reuse by allowing subclasses to
inherit attributes and methods from their superclasses.
➢ Modularity: It facilitates modular programming by organizing code into
hierarchical structures.
➢ Extensibility: Inheritance allows classes to be easily extended by adding new
attributes and methods in subclasses.

Polymorphism
Polymorphism in Python refers to the ability of different objects to respond to the same
method calls or operators in different ways. It allows objects of different classes to be treated
as objects of a common superclass. There are two main types of polymorphism in Python:
method overriding (runtime polymorphism) and operator overloading.
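
As a minimal sketch of treating different objects through a common superclass, the following uses hypothetical Shape, Square and Circle classes. The loop calls area() without knowing which concrete class it is dealing with.

```python
import math


class Shape:
    def area(self):
        raise NotImplementedError("subclasses must implement area()")


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


shapes = [Square(2), Circle(1)]

# The same call works on every element; each class supplies its own behavior
areas = [shape.area() for shape in shapes]
print(areas)
```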

Method Overriding
You can always override your parent class methods. One reason for overriding parent's
methods is because you may want special or different functionality in your subclass.
Example:
class Parent:            # define parent class
    def myMethod(self):
        print('Calling parent method')

class Child(Parent):     # define child class
    def myMethod(self):
        print('Calling child method')

c = Child()       # instance of child
c.myMethod()      # child calls overridden method

Base Overloading Methods


The following table lists some generic functionality that you can override in your own
classes.

S.No.  Method                       Description                                 Sample Call
1      __init__(self [, args...])   Constructor (with any optional arguments)   obj = className(args)
2      __del__(self)                Destructor, deletes an object               del obj
3      __repr__(self)               Evaluable string representation             repr(obj)
4      __str__(self)                Printable string representation             str(obj)
5      __cmp__(self, x)             Object comparison (Python 2 only)           cmp(obj, x)

Note: __cmp__() and cmp() exist only in Python 2. In Python 3, comparisons are defined
with the rich comparison methods such as __eq__() and __lt__().
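
Since __cmp__() and cmp() are not available in Python 3, object comparison there is written with rich comparison methods such as __eq__() and __lt__(). The following is a minimal sketch with a hypothetical Version class. It also overrides __repr__() and __str__(), and uses functools.total_ordering to derive the remaining comparison operators.

```python
from functools import total_ordering


@total_ordering
class Version:
    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def __repr__(self):
        # Evaluable representation: looks like the constructor call
        return f"Version({self.major}, {self.minor})"

    def __str__(self):
        # Printable representation for end users
        return f"{self.major}.{self.minor}"

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return (self.major, self.minor) == (other.major, other.minor)

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return (self.major, self.minor) < (other.major, other.minor)


v = Version(1, 2)
print(repr(v))                           # Version(1, 2)
print(str(v))                            # 1.2
print(Version(1, 2) < Version(1, 10))    # True
print(Version(2, 0) >= Version(1, 9))    # True (derived by total_ordering)
```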

Operator Overloading
Suppose you have created a Vector class to represent two-dimensional vectors. What
happens when you use the plus operator to add them? By default Python raises a TypeError,
because it does not know how to add two Vector objects. Define the __add__ method in your
class to perform vector addition, and the plus operator will then behave as expected.
Example:
class Vector:
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __str__(self):
        return 'Vector (%d, %d)' % (self.a, self.b)

    def __add__(self, other):
        return Vector(self.a + other.a, self.b + other.b)

v1 = Vector(2, 10)
v2 = Vector(5, -2)
print(v1 + v2)   # Output: Vector (7, 8)

Benefits
➢ Code Reusability: Polymorphism allows you to reuse code by creating classes
that are interchangeable with each other.
➢ Flexibility and Extensibility: It provides flexibility in designing software
systems and allows for easy extension of functionality by adding new
subclasses.

Type Identification
Type identification in Python refers to the process of determining the data type of a
variable or object within your code. Python is a dynamically-typed language, meaning the
interpreter automatically assigns data types to variables based on the value they hold. It's often
useful to programmatically check the type of data in order to handle it appropriately during
runtime.
➢ type()
➢ isinstance()
type()
The type() function in Python is a built-in function that returns the type of an object.
It's commonly used to dynamically check the data type of variables or objects during runtime.
Syntax:
type(object)
Here, object is the object whose type you want to determine. The type() function returns
the type of the object as a type object.
Example:
x = 5
print(type(x))   # Output: <class 'int'>

y = "Hello"
print(type(y))   # Output: <class 'str'>

z = [1, 2, 3]
print(type(z))   # Output: <class 'list'>
✓ type(x) returns <class 'int'> because x is an integer.
✓ type(y) returns <class 'str'> because y is a string.
✓ type(z) returns <class 'list'> because z is a list.

Checking Type Equivalence


The results of type() can be compared to check whether two objects have the same type.
The comparison evaluates to True if the types are the same and False otherwise.
Example:
x=5
y = 10
print(type(x) == type(y)) # Output: True

Use Cases
➢ Debugging: You can use the type() function for debugging purposes to
understand the types of objects at different stages of your program.
➢ Type Checking: You can perform runtime type checking to ensure that variables
or objects have the expected types before performing operations on them.
➢ Dynamic Behavior: You can use the type information obtained from type() to
dynamically control the behavior of your code based on the types of objects.
The type() function is a powerful tool for introspection and dynamic behavior in
Python. It allows you to inspect the types of objects at runtime, enabling you to write more
flexible and robust code.

isinstance()
The isinstance() function in Python is used to check if an object is an instance of a
specified class or a subclass thereof. It returns True if the object is an instance of the specified
class, or False otherwise.

Syntax:
isinstance(object, classinfo)
✓ object is the object whose type you want to check.
✓ classinfo can be a class, type, or a tuple of classes and types. If object is an
instance of any of the classes or types specified in classinfo, isinstance() returns
True.
Example:
x=5
print(isinstance(x, int)) # Output: True
s = "Hello"
print(isinstance(s, str)) # Output: True
l = [1, 2, 3]
print(isinstance(l, list)) # Output: True

Handling Inheritance
isinstance() can also check if an object is an instance of a subclass of the specified class.
Example:
class Animal:
    pass
class Dog(Animal):
    pass
d = Dog()
print(isinstance(d, Animal))   # Output: True
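
The difference between comparing type() results and calling isinstance() shows up as soon as subclasses are involved. This sketch reuses the Animal and Dog classes from the example. The last line relies on the fact that bool is a subclass of int in Python.

```python
class Animal:
    pass


class Dog(Animal):
    pass


d = Dog()

exact_match_animal = type(d) == Animal       # False: type() ignores inheritance
exact_match_dog = type(d) is Dog             # True: d was created from Dog
instance_of_animal = isinstance(d, Animal)   # True: subclasses count
bool_is_int = isinstance(True, int)          # True: bool derives from int

print(exact_match_animal, exact_match_dog, instance_of_animal, bool_is_int)
```

For code that should accept subclasses, isinstance() is usually the safer check; comparing type() results is appropriate only when an exact class match is required.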

Handling Multiple Classes:


You can pass a tuple of classes or types to check if an object is an instance of any of
them.
Example:
x=5
print(isinstance(x, (int, float))) # Output: True
l = [1, 2, 3]
print(isinstance(l, (list, tuple))) # Output: True

Use Cases
➢ Type Checking: isinstance() is commonly used for type checking in Python
programs. It allows you to validate the types of objects before performing
operations on them.
➢ Handling Inheritance: isinstance() can be used to handle polymorphic behavior
in object-oriented programming by checking if an object is an instance of a base
class or any of its subclasses.
➢ Conditional Behavior: You can use isinstance() to control the behavior of your
code based on the types of objects passed to it.
The isinstance() function is a versatile tool for checking the types of objects in Python,
allowing you to write more flexible and robust code that can handle different types of input
data.

Character Matches
Character matches typically refer to the process of searching for specific characters or
sequences of characters within strings. This process is commonly done using regular
expressions, which provide a flexible and powerful way to define patterns for matching text.
Character matches can involve various operations, including searching for exact
character sequences, matching specific character classes, finding patterns with wildcards, and
more.
Example:
import re

text = "I have an apple, he has an apple, we all love apples."


pattern = "apple"
matches = re.findall(pattern, text)
print(matches) # Output: ['apple', 'apple', 'apple']

Types
✓ Exact Matches
✓ Character Classes
✓ Quantifiers
✓ Special Characters
✓ Grouping
✓ Dot Character
✓ Greedy Matches
✓ Match Objects
✓ Substituting
✓ Splitting a String
✓ Matching at Beginning or End
✓ Compiling Regular Expressions

Exact Matches
Performing an exact match typically involves searching for a specific sequence of
characters within a string. You can achieve this using various methods, including using built-
in string methods or regular expressions.

Using String Methods


You can use string methods like str.find() or str.index() to find the position of the
substring within the string.
Example:
text = "The quick brown fox jumps over the lazy dog"
pattern = "fox"
if text.find(pattern) != -1:
    print("Exact match found")   # Output: Exact match found
else:
    print("No exact match")

Using Regular Expressions


Regular expressions provide a powerful way to search for patterns within strings,
including exact matches.
Example:
import re
text = "The quick brown fox jumps over the lazy dog"
pattern = r"\bfox\b"   # \b represents word boundaries
match = re.search(pattern, text)
if match:
    print("Exact match found")   # Output: Exact match found
else:
    print("No exact match")
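
The word boundaries matter when the pattern also occurs inside a longer word. The sketch below, with a sample sentence chosen for illustration, contrasts a plain substring search with the \b-anchored regular expression.

```python
import re

text = "Two foxes crossed the road"

substring_hit = "fox" in text                    # True: 'fox' is part of 'foxes'
position = text.find("fox")                      # index where the substring starts
whole_word = re.search(r"\bfox\b", text)         # None: 'fox' is not a whole word here
plural_word = re.search(r"\bfoxes\b", text)      # matches the complete word

print(substring_hit, position)
print(whole_word)
print(plural_word.group())
```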

Character Classes
Character classes in regular expressions allow you to match specific groups or classes
of characters within a string. They provide a convenient way to specify patterns that match
certain types of characters, such as digits, letters, whitespace, or custom character ranges.

Predefined Character Classes


➢ \d: Matches any digit character (equivalent to [0-9]).
➢ \D: Matches any non-digit character (equivalent to [^0-9]).
➢ \w: Matches any alphanumeric character or underscore (equivalent to [a-zA-Z0-9_]).
➢ \W: Matches any character that is not alphanumeric or an underscore (equivalent to
[^a-zA-Z0-9_]).
➢ \s: Matches any whitespace character (equivalent to [ \t\n\r\f\v]).
➢ \S: Matches any non-whitespace character (equivalent to [^ \t\n\r\f\v]).
Note: The bracket equivalents describe ASCII text. For str patterns in Python 3, these
classes also match the corresponding Unicode characters unless the re.ASCII flag is used.

Custom Character Classes


You can also define custom character classes using square brackets [], allowing you to
specify a range or list of characters to match.

➢ [abc]: Matches any of the characters 'a', 'b', or 'c'.


➢ [a-z]: Matches any lowercase letter from 'a' to 'z'.
➢ [A-Z]: Matches any uppercase letter from 'A' to 'Z'.
➢ [0-9]: Matches any digit from 0 to 9.
➢ [a-zA-Z]: Matches any letter, both lowercase and uppercase.
➢ [a-zA-Z0-9]: Matches any alphanumeric character.

Negated Character Classes


You can negate a character class by placing a caret ^ at the beginning. This matches
any character not included in the character class.

➢ [^0-9]: Matches any non-digit character.


➢ [^a-zA-Z]: Matches any non-letter character.
➢ [^a-zA-Z0-9]: Matches any non-alphanumeric character.
Example:
import re

text = "The quick brown fox jumps over the lazy dog 123"

# Matches any digit character


digits = re.findall(r"\d", text)
print("Digits:", digits) # Output: Digits: ['1', '2', '3']

# Matches any word character or underscore


alphanumeric = re.findall(r"\w", text)
print("Alphanumeric:", alphanumeric)

# Matches any non-word character


non_alphanumeric = re.findall(r"\W", text)
print("Non-Alphanumeric:", non_alphanumeric)
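
The example above covers only the predefined classes. A comparable sketch for custom and negated classes follows; the sample strings are chosen for illustration. The last two lines show how the re.ASCII flag restricts \w to ASCII characters.

```python
import re

text = "Room 42B, floor 7"

digits = re.findall(r"[0-9]", text)             # custom class: any digit
uppercase = re.findall(r"[A-Z]", text)          # custom class: uppercase letters
lower_runs = re.findall(r"[a-z]+", text)        # runs of lowercase letters
non_alnum = re.findall(r"[^a-zA-Z0-9]", text)   # negated class: everything else

print(digits)       # ['4', '2', '7']
print(uppercase)    # ['R', 'B']
print(lower_runs)   # ['oom', 'floor']
print(non_alnum)    # [' ', ',', ' ', ' ']

# \w is Unicode-aware for str patterns unless re.ASCII is given
unicode_words = re.findall(r"\w+", "caf\u00e9")
ascii_words = re.findall(r"\w+", "caf\u00e9", re.ASCII)
print(unicode_words, ascii_words)
```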

Quantifiers
Quantifiers are special characters in regular expressions that specify the number of
occurrences of a character or group of characters that should be matched in a string. They
provide a way to define patterns that involve repetition.
➢ Asterisk (*)
➢ Plus (+)
➢ Question Mark (?)
➢ Curly Braces ({})

Asterisk (*) - Zero or More Occurrences


The asterisk * matches zero or more occurrences of the preceding character or group.
Example:
import re
text = "abbbbcccd"
# Matches zero or more 'b' characters
pattern = "ab*"
matches = re.findall(pattern, text)
print(matches) # Output: ['abbbb']

Plus (+) - One or More Occurrences


The plus + matches one or more occurrences of the preceding character or group.
Example:
# Matches one or more 'b' characters
pattern = "ab+"
matches = re.findall(pattern, text)
print(matches) # Output: ['abbbb']

Question Mark (?) - Zero or One Occurrence


The question mark ? matches zero or one occurrence of the preceding character or
group.
Example:
# Matches zero or one 'b' character
pattern = "ab?"
matches = re.findall(pattern, text)
print(matches) # Output: ['ab']

Curly Braces ({})


Curly Braces ({n}) - Exactly n Occurrences
The curly braces {n} match exactly n occurrences of the preceding character or group.
Example:
# Matches exactly three 'b' characters
pattern = "ab{3}"
matches = re.findall(pattern, text)
print(matches)   # Output: ['abbb']

Curly Braces ({n,}) - n or More Occurrences


The curly braces {n,} match n or more occurrences of the preceding character or group.
Example:
# Matches two or more 'b' characters
pattern = "ab{2,}"
matches = re.findall(pattern, text)
print(matches) # Output: ['abbbb']

Curly Braces ({n,m}) - Between n and m Occurrences


The curly braces {n,m} match between n and m occurrences of the preceding character
or group.
Example:
# Matches between two and four 'b' characters
pattern = "ab{2,4}"
matches = re.findall(pattern, text)
print(matches)   # Output: ['abbbb'] (quantifiers are greedy, so all four 'b' characters are taken)
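
The examples above apply quantifiers to a single character. As a minimal sketch of the "or group" case, the following applies {n} to a non-capturing group and to a character class; the sample strings are made up for illustration.

```python
import re

# {2} applied to the group (?:ab): exactly two consecutive 'ab' pairs
repeated = re.findall(r"(?:ab){2}", "ababab abab")

# {n} applied to the \d class: three digits, a hyphen, four digits
phones = re.findall(r"\d{3}-\d{4}", "call 555-1234 or 555-98")

print(repeated)   # ['abab', 'abab']
print(phones)     # ['555-1234']
```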

Special Characters
Special characters in Python regular expressions have special meanings and are used to
define specific patterns for matching text. They are a fundamental part of regular expressions
and provide a powerful way to search for complex patterns within strings. Here are some
common special characters in Python regular expressions:
➢ Dot (.): Matches any single character except newline \n. It is often used as a
wildcard to match any character.
➢ Caret (^): Matches the start of a string.
➢ Dollar Sign ($): Matches the end of a string.
➢ Pipe (|): Acts as an OR operator, allowing you to specify multiple alternatives.
➢ Question Mark (?): Makes the preceding character or group optional, matching
zero or one occurrence.
➢ Asterisk (*): Matches zero or more occurrences of the preceding character or
group.
➢ Plus (+): Matches one or more occurrences of the preceding character or group.
➢ Curly Braces ({}): Allow you to specify exact, minimum, or range of
occurrences of the preceding character or group.
Example:
import re

# Input text
text = "The quick brown fox jumps over the lazy dog"

# Dot (.): 'q', any two characters, then 'ck'
pattern_dot = "q..ck"
matches_dot = re.findall(pattern_dot, text)
print("Dot (.) Matches:", matches_dot)   # Output: Dot (.) Matches: ['quick']

# Caret (^): 'The' at the start of the string
pattern_caret = "^The"
match_caret = re.search(pattern_caret, text)
print("Caret (^) Match:", match_caret.group() if match_caret else "No match")   # Output: Caret (^) Match: The

# Dollar Sign ($): 'dog' at the end of the string
pattern_dollar = "dog$"
match_dollar = re.search(pattern_dollar, text)
print("Dollar Sign ($) Match:", match_dollar.group() if match_dollar else "No match")   # Output: Dollar Sign ($) Match: dog

# Pipe (|): either 'fox' or 'dog'
pattern_pipe = "fox|dog"
matches_pipe = re.findall(pattern_pipe, text)
print("Pipe (|) Matches:", matches_pipe)   # Output: Pipe (|) Matches: ['fox', 'dog']

# Question Mark (?): the final 'n' is optional
pattern_question = "brown?"
matches_question = re.findall(pattern_question, text)
print("Question Mark (?) Matches:", matches_question)   # Output: Question Mark (?) Matches: ['brown']

# Asterisk (*): 'q', zero or more 'u', then 'i'
pattern_asterisk = "qu*i"
matches_asterisk = re.findall(pattern_asterisk, text)
print("Asterisk (*) Matches:", matches_asterisk)   # Output: Asterisk (*) Matches: ['qui']

# Plus (+): 'q', one or more 'u', then 'i'
pattern_plus = "qu+i"
matches_plus = re.findall(pattern_plus, text)
print("Plus (+) Matches:", matches_plus)   # Output: Plus (+) Matches: ['qui']

# Curly Braces ({n}): exactly one 'o' followed by 'ver'
pattern_curly = "o{1}ver"
matches_curly = re.findall(pattern_curly, text)
print("Curly Braces ({}) Matches:", matches_curly)   # Output: Curly Braces ({}) Matches: ['over']

# Curly Braces ({n,m}): between one and three 'o' characters followed by 'ver'
pattern_curly_range = "o{1,3}ver"
matches_curly_range = re.findall(pattern_curly_range, text)
print("Curly Braces ({}) Range Matches:", matches_curly_range)   # Output: Curly Braces ({}) Range Matches: ['over']
These special characters, along with regular characters, form the building blocks of
regular expressions in Python and enable you to define sophisticated patterns for pattern
matching tasks. Understanding how to use special characters effectively is essential for
mastering regular expressions in Python.

Grouping
Grouping in Python regular expressions allows you to define sub patterns within a
larger pattern and capture specific parts of the matched text. It's useful for extracting specific
information from strings or for applying quantifiers to multiple characters or groups.
You can create a group by enclosing a sub pattern within parentheses ( and ).
➢ Capturing Groups
➢ Non-Capturing Groups
➢ Nested Groups

Capturing Groups
When you enclose a sub pattern within parentheses ( and ), you create a capturing group.
Capturing groups remember the text matched by the sub pattern and allow you to access it later
using special methods like group() or groups().
Example:
import re

text = "The price of the product is $25.99"

# Capture the price value (raw string so that \$ and \d reach the regex engine unchanged)
pattern = r"The price of the product is \$(\d+\.\d+)"
match = re.search(pattern, text)
if match:
    price = match.group(1)
    print("Price:", price)   # Output: Price: 25.99

Non-Capturing Groups
Sometimes you may want to group a pattern for applying a quantifier or alternation, but
you don't need to capture the matched text. In such cases, you can use a non-capturing group
(?:).
Example:
import re

text = "The color of the car is blue or green"

# Match the color without capturing the alternatives
pattern = "The color of the car is (?:blue|green)"
match = re.search(pattern, text)
if match:
    color = match.group()
    print("Color:", color)   # Output: Color: The color of the car is blue
In this example, (?:blue|green) is a non-capturing group that matches the alternatives
'blue' or 'green' without capturing them individually.
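To make the difference visible, the following sketch runs the same search with a capturing and a non-capturing group and compares what groups() returns (the variable names are illustrative):

```python
import re

text = "The color of the car is blue or green"

capturing = re.search(r"The color of the car is (blue|green)", text)
non_capturing = re.search(r"The color of the car is (?:blue|green)", text)

print(capturing.groups())      # ('blue',)
print(non_capturing.groups())  # ()
print(non_capturing.group())   # The color of the car is blue
```

Both patterns match the same text; only the capturing version stores the chosen alternative as group 1.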

Nested Groups
Groups can be nested within each other to create more complex patterns. Nested groups
allow you to apply quantifiers or alternation to specific parts of a pattern.
Example:
import re

text = "The price of the product is $25.99 or $19.95"

# Capture both prices (raw string, so the backslashes reach the regex engine unchanged)
pattern = r"The price of the product is (\$\d+\.\d+) or (\$\d+\.\d+)"
matches = re.findall(pattern, text)
print("Prices:", matches)  # Output: Prices: [('$25.99', '$19.95')]
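The example above places two groups side by side. As a hedged sketch of groups that are actually nested, the pattern below (an illustrative variation, not taken from the example above) wraps the dollars and cents groups inside an outer group for the whole price:

```python
import re

text = "The price of the product is $25.99 or $19.95"

# Group 1 is the whole price; groups 2 and 3 are nested inside it.
pattern = r"(\$(\d+)\.(\d+))"
matches = re.findall(pattern, text)
print(matches)  # [('$25.99', '25', '99'), ('$19.95', '19', '95')]
```

Groups are numbered by the position of their opening parenthesis, so the outer group comes first in each tuple.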

Grouping in Python regular expressions allows you to create more sophisticated
patterns, extract specific information from strings, and apply quantifiers or alternation to
specific parts of a pattern effectively.

Dot Character
The dot character (.) is a metacharacter in regular expressions, and it matches any single
character except for a newline character (\n). It acts as a wildcard, representing any character
in a pattern.
When the dot character is included in a regular expression pattern, it matches exactly
one character at the position where it appears in the pattern. This character can be any character
except for a newline.
Example:
import re
text = "cat, bat, rat"
# Match words starting with 'c', followed by any character, and ending with 't'
pattern = "c.t"
matches = re.findall(pattern, text)
print(matches) # Output: ['cat']

In this example, the dot character (.) in the regular expression pattern c.t matches any
single character in place of the dot, resulting in the match 'cat' from the input text 'cat, bat, rat'.
The words 'bat' and 'rat' are not matched because they do not start with 'c'.
The dot character is a powerful tool in regular expressions and is commonly used to
match patterns where specific characters are not known or relevant. However, it's important to
be cautious when using the dot character, as it matches any character, including special
characters like punctuation marks or spaces.
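The sketch below illustrates these points on a made-up string. It assumes two things not shown above: the re.DOTALL flag, which lets the dot match a newline as well, and the escaped form \. which matches only a literal dot:

```python
import re

text = "cat\nc\nt c.t"

default_matches = re.findall(r"c.t", text)
dotall_matches = re.findall(r"c.t", text, re.DOTALL)
literal_matches = re.findall(r"c\.t", text)

print(default_matches)  # ['cat', 'c.t']
print(dotall_matches)   # ['cat', 'c\nt', 'c.t']
print(literal_matches)  # ['c.t']
```

Escaping the dot is the usual choice when the pattern should match an actual period, for example in a file name or a decimal number.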

Greedy Matches
Greedy matching is a concept in regular expressions where the quantifiers attempt to
match as much text as possible while still allowing the overall pattern to match successfully.
In other words, a greedy quantifier will match as many repetitions of the preceding element as
possible.

Greedy Quantifiers: The quantifiers *, +, ?, and {m,n} are greedy by default: each one
attempts to match as much text as possible while still allowing the overall pattern to match.
Matching as Much as Possible: With a greedy quantifier, the regex engine first takes as
many repetitions of the preceding element as it can, extending forward from the
starting position.
Backtracking: If the rest of the pattern then fails to match, the regex engine backtracks,
giving up one repetition at a time, until the overall pattern matches or no alternatives remain.
Example:
import re
text = "abcdef"
# Greedy match to find all characters
pattern = ".*"
match = re.match(pattern, text)
if match:
    print("Greedy Match:", match.group())  # Output: Greedy Match: abcdef
In this example, the pattern .* matches the entire string "abcdef" because the * quantifier
is greedy and tries to match as many characters as possible.
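Backtracking can be observed with a small variation of this example. In the sketch below (the pattern (.*)f is an illustrative choice), .* first consumes the whole string and then gives back one character so that the trailing f can match:

```python
import re

text = "abcdef"

# .* first consumes all of "abcdef", then backtracks one character
# so that the final "f" in the pattern can still match.
match = re.match(r"(.*)f", text)
print(match.group())   # abcdef
print(match.group(1))  # abcde
```

The overall match is still the whole string, but the captured group shows how much text the greedy .* kept after backtracking.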

Greedy vs. Non-Greedy Quantifiers


Greedy Quantifiers: *, +, ?, and {m,n} are greedy by default, matching as much text as
possible.
Non-Greedy (or Lazy) Quantifiers: *?, +?, ??, and {m,n}? are their non-greedy counterparts,
matching as little text as possible.

Use Cases
➢ Capturing Larger Text Blocks: Greedy matching is useful when you want to
capture larger text blocks, such as entire paragraphs or sections.
➢ Performance Considerations: Greedy matching can sometimes lead to
performance issues, especially with large input texts or overly complex patterns.
In such cases, non-greedy quantifiers can be used to restrict matching to the
smallest possible text block.
Example:
import re
text = "<start>Some text here<end> More text <start>Some other text<end>"
# Non-greedy match to capture the text between each pair of markers
pattern = "<start>(.*?)<end>"
matches = re.findall(pattern, text)
print("Non-Greedy Matches:", matches)
# Output: Non-Greedy Matches: ['Some text here', 'Some other text']
In this example, the non-greedy quantifier (.*?) is used to match as little text as possible
between the <start> and <end> markers, resulting in the extraction of the text blocks "Some
text here" and "Some other text".
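For comparison, this sketch runs the greedy and the non-greedy version of the pattern on the same text (the variable names greedy and lazy are illustrative):

```python
import re

text = "<start>Some text here<end> More text <start>Some other text<end>"

greedy = re.findall(r"<start>(.*)<end>", text)
lazy = re.findall(r"<start>(.*?)<end>", text)

print(greedy)  # ['Some text here<end> More text <start>Some other text']
print(lazy)    # ['Some text here', 'Some other text']
```

The greedy version runs from the first <start> to the last <end>, while the non-greedy version stops at the nearest <end> each time.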

Match Objects
Match objects in Python represent the result of a pattern matching operation performed
by the re.match(), re.search(), or re.fullmatch() functions, or yielded one at a time by
re.finditer(). (re.findall() returns plain strings or tuples, not match objects.) They contain
information about the match, including the matched text, the position of the match, and any
captured groups.
➢ Properties and Methods: Match objects provide properties and methods to
access information about the match, such as the matched text, the start and end
positions of the match, and any captured groups.

➢ Accessing Matched Text: You can access the matched text using the group()
method or index 0. Additionally, you can access captured groups using their
index.
➢ Match Positions: Match objects provide methods like start() and end() to
retrieve the start and end positions of the match in the input string.
➢ Iterating Over Matches: If multiple matches are found, you can iterate over
the match objects returned by re.finditer() using a loop.
Example:
import re
text = "The price of the product is $25.99"
# Match the price value
pattern = r"\$\d+\.\d+"
match = re.search(pattern, text)
if match:
    # Access matched text
    print("Matched Text:", match.group())  # Output: Matched Text: $25.99
    # Access start and end positions
    print("Start Position:", match.start())  # Output: Start Position: 28
    print("End Position:", match.end())  # Output: End Position: 34
✓ We use re.search() to find the first occurrence of the pattern \$\d+\.\d+
(a dollar sign followed by one or more digits, a decimal point, and one or more digits).
✓ The search() function returns a match object, which we store in the variable
match.
✓ We access information about the match using various properties and methods
of the match object, such as group(), start(), and end().
Match objects are essential for extracting information from strings, validating patterns,
and performing various text-processing tasks using regular expressions in Python.
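When a string contains several matches, re.finditer() yields one match object per match. A minimal sketch, using a sample sentence with two prices (the list name found is illustrative):

```python
import re

text = "The price of the product is $25.99 or $19.95"

# Collect the matched text, its (start, end) span, and its groups for every match.
found = []
for match in re.finditer(r"\$(\d+)\.(\d+)", text):
    found.append((match.group(), match.span(), match.groups()))
    print(match.group(), match.span(), match.groups())

# Printed lines:
# $25.99 (28, 34) ('25', '99')
# $19.95 (38, 44) ('19', '95')
```

span() returns the start and end positions together as a tuple, equivalent to (match.start(), match.end()).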

Substituting
Substituting in Python regular expressions involves replacing matched patterns in a
string with specified replacement text. The re.sub() function is used for this purpose, allowing
you to perform substitutions based on patterns defined using regular expressions.

Syntax:
re.sub(pattern, replacement, string, count=0, flags=0)
➢ pattern: The regular expression pattern to search for.
➢ replacement: The replacement text to substitute for matched patterns.
➢ string: The input string in which substitutions will be made.
➢ count: Optional. Specifies the maximum number of substitutions to make
(default is 0, meaning all occurrences).
➢ flags: Optional. Additional flags that control the behavior of the regex matching.

Replacement Text: The replacement text can include references to captured groups in
the pattern using \1, \2, etc., to insert the text matched by specific capturing groups.
Substitution Process: The re.sub() function scans the input string for occurrences of the
pattern. Whenever a match is found, it replaces the matched text with the specified replacement
text.
Example:
import re
text = "The price of the product is $25.99"
# Substitute the price value with 'XX.XX'
pattern = r"\$\d+\.\d+"
replacement = "XX.XX"
substituted_text = re.sub(pattern, replacement, text)
print("Substituted Text:", substituted_text)
# Output: Substituted Text: The price of the product is XX.XX
✓ We use the re.sub() function to substitute the price value (e.g., $25.99) in the
input text with the replacement text 'XX.XX'.
✓ The pattern \$\d+\.\d+ matches a dollar sign followed by one or more digits, a decimal
point, and one or more digits, representing a typical price format.
✓ The replacement text 'XX.XX' is specified as the replacement for matched
patterns.
✓ The substituted text "The price of the product is XX.XX" is printed as the
output.
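The group references and the count parameter described above can be sketched as follows. The date strings are made-up sample data, and the replacement is written as a raw string so that \1, \2, and \3 reach re.sub() unchanged:

```python
import re

text = "2024-01-15 and 2023-12-31"

# \1, \2, and \3 refer to the year, month, and day groups.
swapped = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", text)
print(swapped)  # 15/01/2024 and 31/12/2023

# count=1 replaces only the first match.
first_only = re.sub(r"\d{4}-\d{2}-\d{2}", "DATE", text, count=1)
print(first_only)  # DATE and 2023-12-31
```

With the default count=0, every date in the string would be replaced.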

Splitting a String
Splitting a string based on a regular expression pattern in Python can be done using the
re.split() function. This function splits a string into substrings based on matches of the specified
pattern.
Syntax:
re.split(pattern, string, maxsplit=0, flags=0)
✓ pattern: The regular expression pattern to use for splitting the string.
✓ string: The input string to be split.
✓ maxsplit: Optional. Specifies the maximum number of splits to perform (default
is 0, meaning all occurrences).
✓ flags: Optional. Additional flags that control the behavior of the regex matching.
Example:
import re
text = "apple,banana,orange,grape"
# Split the text based on comma (,) as delimiter
tokens = re.split(r",", text)
print("Tokens:", tokens)
# Output: Tokens: ['apple', 'banana', 'orange', 'grape']

# Split the text based on any whitespace character (the text contains none, so it is not split)
tokens = re.split(r"\s+", text)
print("Tokens:", tokens)
# Output: Tokens: ['apple,banana,orange,grape']

# Split the text based on any vowel character
tokens = re.split(r"[aeiou]", text)
print("Tokens:", tokens)
# Output: Tokens: ['', 'ppl', ',b', 'n', 'n', ',', 'r', 'ng', ',gr', 'p', '']
✓ We use the re.split() function to split the input text based on different regular
expression patterns.
✓ For each pattern, the input text is split into substrings, and the resulting tokens
are printed as a list.
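A pattern can also describe several delimiters at once, and maxsplit limits how many splits are made. The following sketch uses a made-up string with mixed delimiters:

```python
import re

text = "apple, banana;orange  grape"

# Split on any run of commas, semicolons, or whitespace.
tokens = re.split(r"[,;\s]+", text)
print(tokens)   # ['apple', 'banana', 'orange', 'grape']

# maxsplit=1 performs only the first split and leaves the rest untouched.
limited = re.split(r"[,;\s]+", text, maxsplit=1)
print(limited)  # ['apple', 'banana;orange  grape']
```

This is the main advantage of re.split() over the string method str.split(), which accepts only a single fixed delimiter.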

Matching at Beginning or End


Matching at the beginning or end of a string in Python regular expressions is done using
special anchor characters. These anchor characters help to define patterns that must occur at
the start or end of a string.
➢ Caret (^) Anchor: The caret symbol (^) is used as an anchor to match the
beginning of a string. When ^ is placed at the start of a regular expression
pattern, it indicates that the pattern must occur at the beginning of the string.
➢ Dollar Sign ($) Anchor: The dollar sign symbol ($) is used as an anchor to match
the end of a string. When $ is placed at the end of a regular expression pattern,
it indicates that the pattern must occur at the end of the string (or just before
a newline that is the last character of the string).
➢ Matching the Entire String: By combining ^ and $ anchors, you can create a
pattern that matches the entire string from start to end.
➢ Multiline Mode: In Python, the behavior of ^ and $ can change when using the
re.MULTILINE flag. In multiline mode, ^ matches the beginning of a line, and
$ matches the end of a line.
Example:
import re
text = "Python is awesome"

# Match "Python" at the beginning of the string
pattern_start = r"^Python"
match_start = re.search(pattern_start, text)
print("Match at the beginning:", match_start.group() if match_start else "No match")
# Output: Match at the beginning: Python

# Match "awesome" at the end of the string
pattern_end = r"awesome$"
match_end = re.search(pattern_end, text)
print("Match at the end:", match_end.group() if match_end else "No match")
# Output: Match at the end: awesome

# Match the entire string "Python is awesome"
pattern_full = r"^Python is awesome$"
match_full = re.search(pattern_full, text)
print("Match the entire string:", match_full.group() if match_full else "No match")
# Output: Match the entire string: Python is awesome
➢ The ^Python pattern matches "Python" at the beginning of the string.
➢ The awesome$ pattern matches "awesome" at the end of the string.
➢ The ^Python is awesome$ pattern matches the entire string "Python is
awesome" from start to end.
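The effect of the re.MULTILINE flag described above can be sketched with a made-up two-line string:

```python
import re

text = "Python is awesome\nPython is fun"

start_default = re.findall(r"^Python", text)
start_multiline = re.findall(r"^Python", text, re.MULTILINE)
end_default = re.findall(r"\w+$", text)
end_multiline = re.findall(r"\w+$", text, re.MULTILINE)

print(start_default)    # ['Python']
print(start_multiline)  # ['Python', 'Python']
print(end_default)      # ['fun']
print(end_multiline)    # ['awesome', 'fun']
```

Without the flag, the anchors refer to the string as a whole; with it, they refer to each line.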

Compiling Regular Expressions


Regular expressions can be compiled into pattern objects using the re.compile()
function. Compiling regular expressions allows you to reuse them efficiently across multiple
matching operations.
Syntax:
re.compile(pattern, flags=0)
✓ pattern: The regular expression pattern to compile.
✓ flags: Optional. Additional flags that control the behavior of the regex matching.
Pattern Objects: The re.compile() function returns a pattern object, which represents the
compiled form of the regular expression pattern. Pattern objects have methods for performing
matching operations, such as search(), match(), findall(), etc.
Efficiency: Compiling regular expressions can improve performance, especially when
the same pattern is used many times in a program. The compiled pattern object can be reused
across multiple matching operations without needing to recompile the pattern each time.
The re module also caches recently compiled patterns, so the gain is usually modest for
programs that use only a few patterns.
Example:
import re
# Compile a regular expression pattern
pattern = re.compile(r"\b[A-Z][a-z]*\b")

# Input text
text = "Hello World! This is a Sample Text."

# Perform matching operation using the compiled pattern
matches = pattern.findall(text)
print("Matches:", matches)
# Output: Matches: ['Hello', 'World', 'This', 'Sample', 'Text']
➢ We use the re.compile() function to compile the regular expression pattern
\b[A-Z][a-z]*\b, which matches words that start with an uppercase letter followed by
zero or more lowercase letters.
➢ The compiled pattern object pattern is then used to perform a findall() operation
on the input text.
➢ The resulting matches are printed as a list of words found in the text that match
the specified pattern.
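As a sketch of reusing one compiled pattern for several operations, the example below applies the same pattern object to a list of lines with search() and then with sub(). The sample lines and the name price are made up for illustration:

```python
import re

# Compile once, then reuse the same pattern object for several operations.
price = re.compile(r"\$(\d+)\.(\d+)")

lines = ["Tea costs $3.50", "No price here", "Coffee costs $4.25"]

results = []
for line in lines:
    match = price.search(line)
    results.append(match.group() if match else "No match")
    print(line, "->", results[-1])

masked = price.sub("XX.XX", lines[0])
print(masked)  # Tea costs XX.XX
```

The methods of a pattern object take the same arguments as the module-level functions, except that the pattern itself is omitted.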
