Python Fundamentals
What is Python?
● Python is a high-level, general-purpose programming language.
● Python focuses on making code readable by using significant indentation. It is dynamically typed
and includes automatic garbage collection.
● Python supports various programming styles, including structured, object-oriented, and
functional programming.
History of Python:
Python's history dates back to the late 1980s and early 1990s. It was created by Guido van Rossum, a
Dutch programmer. Here is a brief overview of Python's history:
● Origin (Late 1980s - Early 1990s): Guido van Rossum started working on Python in the late 1980s
while he was at the Centrum Wiskunde & Informatica (CWI) in the Netherlands. He aimed to
create a language that was easy to read and write, with a clean and concise syntax. Python was
influenced by several programming languages, including ABC and Modula-3.
● First Release (February 20, 1991): Guido van Rossum released the first version of Python, Python
0.9.0, on February 20, 1991. This release included many of the fundamental features of Python,
such as exception handling and functions.
● Python 1.0 (January 26, 1994): Python 1.0 was released, featuring various improvements and
bug fixes. This version laid the foundation for Python's early growth.
● Python 2.0 (October 16, 2000): Python 2.0 introduced list comprehensions, a cycle-detecting
garbage collector, and Unicode support. It became widely adopted and was used for many years.
● Python 3.0 (December 3, 2008): Python 3.0, also known as Python 3, was a major milestone in
Python's history. It introduced significant changes to the language to improve consistency and
eliminate certain design flaws. However, these changes were not backward-compatible with
Python 2, leading to a period of coexistence between Python 2 and 3.
● Python 2 End of Life (January 1, 2020): Python 2 reached its end of life (EOL) on January 1, 2020.
This marked the official end of support for Python 2, and users were strongly encouraged to
migrate to Python 3.
● Continued Development: Python 3 has continued to evolve, with new versions being released
regularly. Each version introduces improvements, new features, and optimizations while
largely maintaining backward compatibility within the Python 3 series.
Features of Python:
Python is a versatile programming language. The sections below describe how Python code is
executed and introduce the core building blocks of the language.
Code Execution:
In Python, code execution involves several steps, but no separate ahead-of-time compilation step
as in languages such as C or C++. Instead, the interpreter compiles the source code to bytecode and
executes it at run time. Here's a simplified overview of how a Python script is executed:
1. Source Code: You write your Python code in a high-level, human-readable form. This is called
the source code.
2. Lexical Analysis (Tokenization): The Python interpreter performs lexical analysis on the source
code, breaking it down into a sequence of tokens. This involves identifying keywords, identifiers,
literals, and operators.
3. Syntax Analysis (Parsing): The parser then organizes the tokens into a hierarchical structure
called the Abstract Syntax Tree (AST). The AST represents the syntactic structure of the Python
code.
4. Intermediate Code Generation (Bytecode): The Python interpreter generates an intermediate
code known as bytecode from the AST. Bytecode is a low-level, platform-independent
representation of the source code.
5. Bytecode Caching: The bytecode is kept in memory and, for imported modules, cached on disk as
.pyc files, ready for execution. Caching avoids recompiling unchanged modules, which shortens
start-up time rather than speeding up execution itself.
6. Execution by the Python Virtual Machine (PVM):
a. The Python Virtual Machine (PVM) is responsible for executing the bytecode. It's part of the
Python interpreter.
b. The PVM is a stack-based virtual machine that executes the operations defined by the bytecode.
c. The dynamic nature of Python allows for features like runtime type checking and late
binding.
7. Memory Management and Garbage Collection:
a. Python employs automatic memory management with a combination of reference counting
and a cyclic garbage collector.
b. Memory is allocated for objects during execution and reclaimed when objects are no longer
referenced.
8. Dynamic Typing and Runtime Evaluation: Python is dynamically typed, meaning that variable
types are determined at runtime. This allows for flexibility but requires runtime type checking.
9. Exception Handling: The interpreter includes mechanisms for handling exceptions at runtime.
When an exception occurs, the interpreter searches for an appropriate exception handler in the
call stack.
10. Standard Library Interaction: The Python interpreter interacts with the Python Standard Library,
which provides a vast collection of modules and packages for various functionalities.
This process allows Python to be a highly dynamic and flexible language. The focus is on readability, ease
of use, and rapid development rather than a separate compilation step. The compilation to bytecode
and execution by the PVM happen on-the-fly, making Python an interpreted language.
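The tokenizing, parsing, and bytecode stages described above can be observed from within Python itself. The following is a minimal sketch using the standard-library tokenize, ast, and dis modules. The one-line source string is an illustrative assumption, and the exact bytecode listing differs between Python versions.

```python
import ast
import dis
import io
import tokenize

source = "result = 2 + 3\n"  # illustrative one-line program (assumption)

# Step 2: lexical analysis - split the source into tokens
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
token_strings = [tok.string for tok in tokens if tok.string.strip()]
print(token_strings)

# Step 3: syntax analysis - build the Abstract Syntax Tree
tree = ast.parse(source)
print(ast.dump(tree))

# Step 4: compile the AST into a bytecode (code) object
code_object = compile(tree, "<example>", "exec")
dis.dis(code_object)  # human-readable listing; varies by Python version

# Step 6: the virtual machine executes the bytecode
namespace = {}
exec(code_object, namespace)
print(namespace["result"])
```

Running the sketch prints the token list, the tree, the bytecode listing, and finally the computed value, which mirrors the order of the steps listed above.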
Variables:
A variable in Python is a name that refers to a value stored in memory. Variables are created when
you assign a value to them. Python is a dynamically typed language, which means you don't have to
declare the type of a variable explicitly; the type is determined at runtime from the assigned value.
# Following conventions
user_name = "Alice"
user_email = "[email protected]"
In this example, user_name and user_email are valid variable names that follow Python's naming rules:
a name may contain letters, digits, and underscores, must not start with a digit, and must not be a
reserved keyword.
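Because an invalid name in real code is a syntax error, one way to explore the naming rules safely is to test candidate names as strings. This is a small sketch, assuming the built-in str.isidentifier() method and the standard keyword module; the helper is_valid_variable_name and the candidate names are hypothetical.

```python
import keyword

# Candidate names to test (illustrative assumptions)
candidates = ["my_var", "user_age", "total_sum", "2nd_place", "user-name", "class"]


def is_valid_variable_name(name):
    """Return True if name is a legal identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


for name in candidates:
    print(f"{name!r}: {is_valid_variable_name(name)}")
```

The first three names pass. The others fail because they start with a digit, contain a hyphen, or are a reserved keyword.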
Global Variables: A global variable is defined outside of any function and is accessible throughout the
entire program, including inside functions.
Example:
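A minimal sketch of a global variable being read both inside and outside a function. The names app_name and print_app_name are illustrative assumptions, not taken from the original text.

```python
app_name = "Inventory"  # global variable, defined outside any function


def print_app_name():
    # Reading a global variable inside a function needs no special keyword
    print(app_name)
    return app_name


print_app_name()
print(app_name)  # the same variable is also accessible at module level
```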
Local Variables: A local variable is defined inside a function and is only accessible within that function. It
is not visible to code outside the function.
Example:
def print_local_var():
    local_var = 20  # This is a local variable
    print(local_var)

print_local_var()  # Output: 20
# Trying to access local_var outside the function will result in an error.
# Uncommenting the line below will result in a NameError.
# print(local_var)
In this example, local_var is a local variable that is only accessible within the function print_local_var.
If you want to modify a global variable inside a function, you need to use the global keyword.
Example:
global_var = 10

def modify_global_var():
    global global_var
    global_var = 30

modify_global_var()
print(global_var)  # Output: 30
In this example, the modify_global_var function uses the global keyword to indicate that it is modifying
the global variable global_var. After calling the function, the value of global_var outside the function is
changed.
Caution:
While global variables provide a way to share data across different parts of a program, it's generally a
good practice to use them judiciously. Overuse of global variables can lead to code that is harder to
understand and maintain. In many cases, passing values as parameters and returning values from
functions is a cleaner and more modular approach.
Remember that local variables in functions are confined to the scope of that function, and modifying
them does not affect variables with the same name outside the function.
Operators:
In Python, operators are special symbols or keywords that perform operations on operands. Operands
can be variables, values, or expressions. Python supports various types of operators, categorized into
different groups. Here are some common types of operators in Python:
Arithmetic Operators:
Used for mathematical operations.
● + # Addition
● - # Subtraction
● * # Multiplication
● / # Division
● % # Modulus (remainder)
● ** # Exponentiation (power)
● // # Floor Division (returns the floor value after division)
Code:
a = 10
b = 3
addition = a + b
subtraction = a - b
multiplication = a * b
division = a / b
modulus = a % b
exponentiation = a ** b
floor_division = a // b
print("Addition:", addition)
print("Subtraction:", subtraction)
print("Multiplication:", multiplication)
print("Division:", division)
print("Modulus:", modulus)
print("Exponentiation:", exponentiation)
print("Floor Division:", floor_division)
Comparison Operators:
Used to compare values.
● == # Equal to
● != # Not equal to
● < # Less than
● > # Greater than
● <= # Less than or equal to
● >= # Greater than or equal to
Code:
x = 5
y = 8
print("Is x equal to y?", x == y)
print("Is x not equal to y?", x != y)
print("Is x less than y?", x < y)
print("Is x greater than y?", x > y)
print("Is x less than or equal to y?", x <= y)
print("Is x greater than or equal to y?", x >= y)
Logical Operators:
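Used for logical operations.
● and # True if both operands are true
● or # True if at least one operand is true
● not # Inverts the truth value of the operand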
Code:
is_sunny = True
is_warm = False
print("Is it a sunny day and warm?", is_sunny and is_warm)
print("Is it a sunny day or warm?", is_sunny or is_warm)
print("Is it not a sunny day?", not is_sunny)
Assignment Operators:
Used to assign values to variables.
● = # Assignment
● += # Add and assign
● -= # Subtract and assign
● *= # Multiply and assign
● /= # Divide and assign
● %= # Modulus and assign
● **= # Exponentiate and assign
● //= # Floor divide and assign
Code:
total = 10
total += 5 # total = total + 5
total -= 3 # total = total - 3
total *= 2 # total = total * 2
total /= 4 # total = total / 4
print("Total:", total)
Bitwise Operators:
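Used for bit-level operations.
● & # Bitwise AND
● | # Bitwise OR
● ^ # Bitwise XOR
● ~ # Bitwise NOT
● << # Left shift
● >> # Right shift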
Code:
num1 = 10 # Binary: 1010
num2 = 5 # Binary: 0101
bitwise_and = num1 & num2
bitwise_or = num1 | num2
bitwise_xor = num1 ^ num2
bitwise_not_num1 = ~num1
left_shift = num1 << 1
right_shift = num1 >> 1
print("Bitwise AND:", bitwise_and)
print("Bitwise OR:", bitwise_or)
print("Bitwise XOR:", bitwise_xor)
print("Bitwise NOT (num1):", bitwise_not_num1)
print("Left Shift (num1):", left_shift)
print("Right Shift (num1):", right_shift)
Membership Operators:
Used to test whether a value is a member of a sequence or other container.
● in # True if the value is found
● not in # True if the value is not found
Code:
fruits = ['apple', 'banana', 'orange']
print("Is 'banana' in fruits?", 'banana' in fruits)
print("Is 'grape' not in fruits?", 'grape' not in fruits)
Identity Operators:
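Used to test whether two names refer to the same object in memory, rather than whether their values are equal.
● is # True if both operands are the same object
● is not # True if the operands are different objects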
Code:
x = [1, 2, 3]
y = [1, 2, 3]
z = x
print("Are x and y the same object?", x is y)
print("Are x and z the same object?", x is z)
print("Are x and y not the same object?", x is not y)
Unary Operators:
Operators that operate on a single operand.
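As a short illustration of the unary operators Python provides (unary minus, unary plus, bitwise NOT, and logical not), here is a sketch in which the variable names are illustrative assumptions:

```python
value = 5

negated = -value         # unary minus: -5
unchanged = +value       # unary plus: 5
inverted = ~value        # bitwise NOT: -(value + 1), so -6
logical_not = not value  # logical NOT: False, because 5 is truthy

print("Negated:", negated)
print("Unchanged:", unchanged)
print("Bitwise NOT:", inverted)
print("Logical NOT:", logical_not)
```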
These operators play a crucial role in performing various operations in Python, and understanding their
usage is fundamental to writing effective and expressive code.
Datatypes:
In Python, data types represent the type or nature of the data that a variable can store. Python is
dynamically typed, which means you don't need to declare the data type of a variable explicitly. The
interpreter infers the type based on the value assigned to the variable. Here are some common data
types in Python:
Numeric Types:
● int: Integer type, e.g., x = 5.
● float: Floating-point type, e.g., y = 3.14.
● complex: Complex number type, e.g., z = 2 + 3j.
Code:
# int
x = 5
print(x, type(x))
# float
y = 3.14
print(y, type(y))
# complex
z = 2 + 3j
print(z, type(z))
Sequence Types:
● str: String type, e.g., name = "John".
● list: List type, e.g., numbers = [1, 2, 3].
Code:
# str
name = "John"
print(name, type(name))
# list
numbers = [1, 2, 3]
print(numbers, type(numbers))
Set Types:
● set: Set type, e.g., my_set = {1, 2, 3}.
Code:
# set
my_set = {1, 2, 3}
print(my_set, type(my_set))
Mapping Type:
● dict: Dictionary type, e.g., person = {'name': 'John', 'age': 30}.
Code:
# dict
person = {'name': 'John', 'age': 30}
print(person, type(person))
Boolean Type:
● bool: Boolean type, either True or False.
Code:
# bool
is_active = True
print(is_active, type(is_active))
None Type:
● NoneType: Represents the absence of a value, often denoted as None.
Code:
# NoneType
no_value = None
print(no_value, type(no_value))
Decimal and Fraction Types (from the standard library):
Code:
# Decimal
from decimal import Decimal
decimal_number = Decimal('3.14')
print(decimal_number, type(decimal_number))
# Fraction
from fractions import Fraction
fraction_number = Fraction(3, 4)
print(fraction_number, type(fraction_number))
Binary Types:
● bytes: Immutable sequence of bytes, e.g., binary_data = b'hello'.
● bytearray: Mutable sequence of bytes, e.g., mutable_binary = bytearray(b'hello').
● memoryview: A view object that exposes an array’s buffer interface.
# bytes
binary_data = b'hello'
print(binary_data, type(binary_data))
# bytearray
mutable_binary = bytearray(b'hello')
print(mutable_binary, type(mutable_binary))
Collection Types:
● tuple: Immutable sequence, e.g., coordinates = (10, 20).
# tuple
coordinates = (10, 20)
print(coordinates, type(coordinates))
Other Built-in Types:
● range: Represents a range of numbers, e.g., my_range = range(5).
# range
my_range = range(5)
print(my_range, type(my_range))
User-Defined Types:
● class: Allows you to define your own types using classes.
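To make the idea of a user-defined type concrete, the sketch below defines a small class and creates an instance of it. The class name Point and its method are hypothetical examples, not part of the original material.

```python
class Point:
    """A simple user-defined type representing a point in 2D space."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def distance_from_origin(self):
        # Pythagorean theorem: sqrt(x^2 + y^2)
        return (self.x ** 2 + self.y ** 2) ** 0.5


point = Point(3, 4)
print(point.distance_from_origin(), type(point))
```

Calling type(point) reports the new class, in the same way that type(5) reports int.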
Type Conversion:
# int(), float(), str()
num_str = "123"
num_int = int(num_str)
num_float = float(num_str)
print(num_int, type(num_int))
print(num_float, type(num_float))
Positive indexing:
● "apple"[0] refers to 'a'
● "apple"[1] refers to 'p'
● "apple"[2] refers to 'p'
● "apple"[3] refers to 'l'
● "apple"[4] refers to 'e'
Negative indexing:
● "apple"[-1] refers to 'e'
● "apple"[-2] refers to 'l'
● "apple"[-3] refers to 'p'
● "apple"[-4] refers to 'p'
● "apple"[-5] refers to 'a'
Slicing
word = "apple"
List: Mutable (can be modified after creation), defined using square brackets [].
Tuple: Immutable (cannot be modified after creation), defined using parentheses ().
Feature       List                                Tuple
Mutability    Mutable (can be modified)           Immutable (cannot be modified)
Syntax        Defined using square brackets []    Defined using parentheses ()
Indexing:
Indexing in Python starts from 0, meaning the first element has an index of 0, the second has an index of
1, and so on.
Slicing:
Slicing allows you to extract a portion of a sequence by specifying a range of indices.
Slicing creates a new sequence containing the elements within the specified range. The
syntax is start_index:stop_index, where the start index is inclusive, and the stop index is exclusive.
Omitting the start index implies the beginning of the sequence, and omitting the stop index implies its end.
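The slicing rules above can be sketched with a string and a list. The sample values are illustrative assumptions.

```python
word = "apple"
numbers = [10, 20, 30, 40, 50]

first_three = word[0:3]   # indices 0, 1, 2 (stop index 3 is excluded)
from_two = word[2:]       # omitted stop: runs to the end
up_to_two = word[:2]      # omitted start: begins at index 0
middle = numbers[1:4]     # works the same way on lists
last_two = numbers[-2:]   # negative indices count from the end

print(first_three, from_two, up_to_two)
print(middle, last_two)
```

Each slice is a new object, so the original word and numbers are left unchanged.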
String Formatting
String formatting in Python allows you to create strings with dynamic content by substituting
placeholders with values. There are multiple ways to format strings in Python, including using the %
operator, the format() method, and f-strings (formatted string literals).
1. Using % Operator:
name = "Alice"
age = 25
# Old-style formatting using % operator
formatted_string = "My name is %s and I am %d years old." % (name, age)
print(formatted_string)
# Using f-strings
formatted_string = f"My name is {name} and I am {age} years old."
print(formatted_string)
Alignment:
message = "Hello"
# Left-aligned
left_aligned = "{:<10}".format(message)
print(left_aligned)
# Right-aligned
right_aligned = "{:>10}".format(message)
print(right_aligned)
# Center-aligned
center_aligned = "{:^10}".format(message)
print(center_aligned)
Type-specific Formatting:
number = 42
# Binary
binary_format = "Binary: {:b}".format(number)
print(binary_format)
# Hexadecimal
hex_format = "Hex: {:x}".format(number)
print(hex_format)
# Scientific notation
scientific_format = "Scientific: {:e}".format(number)
print(scientific_format)
These are just a few examples of string formatting in Python. Choose the method that best fits your
needs and the Python version you are working with. f-strings are generally considered more concise and
readable for modern Python code.
String Functions:
number = 42
string_number = str(number)
print("String representation of number:", string_number)
char = 'A'
unicode_code = ord(char)
print("Unicode code point of 'A':", unicode_code)
unicode_code = 65
character = chr(unicode_code)
print("Character represented by Unicode code 65:", character)
alpha_string = "Python"
is_alpha = alpha_string.isalpha()
print("Is alphabetic:", is_alpha)
numeric_string = "123"
is_digit = numeric_string.isdigit()
print("Is numeric:", is_digit)
alphanumeric_string = "Python123"
is_alnum = alphanumeric_string.isalnum()
print("Is alphanumeric:", is_alnum)
name = "Alice"
age = 30
formatted_text = "My name is {} and I'm {} years old.".format(name, age)
print("Formatted text:", formatted_text)
text = "Python"
centered_text = text.center(10, '*')
print("Centered text:", centered_text)
text = "Python"
left_justified_text = text.ljust(10, '-')
print("Left-justified text:", left_justified_text)
text = "Python"
right_justified_text = text.rjust(10, '-')
print("Right-justified text:", right_justified_text)
uppercase_text = "PYTHON"
is_upper = uppercase_text.isupper()
print("Is uppercase:", is_upper)
lowercase_text = "python"
is_lower = lowercase_text.islower()
print("Is lowercase:", is_lower)
swapcase(): Returns a new string with uppercase characters converted to lowercase and vice versa.
mixed_case_text = "PyThOn"
swapped_case_text = mixed_case_text.swapcase()
print("Swapped case text:", swapped_case_text)
zfill(): Returns a copy of the string left-filled with '0' characters.
number = "42"
zero_padded_number = number.zfill(5)
print("Zero-padded number:", zero_padded_number)
indented_text = "Python\tProgramming"
expanded_text = indented_text.expandtabs(4)
print("Expanded text:", expanded_text)
numeric_string = "123"
is_numeric = numeric_string.isnumeric()
print("Is numeric:", is_numeric)
rpartition(): Splits the string into three parts, starting from the right.
rindex(): Returns the highest index of a substring. Raises an error if not found.
sentence = "Python is fun and Python is powerful!"
index_of_is = sentence.rindex("is")
print("Rightmost index of 'is':", index_of_is)
rsplit(): Splits the string into a list of substrings, starting from the right.
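Neither rpartition() nor rsplit() is shown with an example above, so here is a brief sketch of both. The sample path string is an illustrative assumption.

```python
path = "home/user/docs/report.txt"

# rpartition() splits at the LAST occurrence of the separator and always
# returns a 3-tuple: (text before, separator, text after)
before, separator, after = path.rpartition("/")
print(before, separator, after)

# rsplit() with maxsplit=1 performs at most one split, starting from the right
parts = path.rsplit("/", 1)
print(parts)

# Without maxsplit, rsplit() returns the same pieces as split()
all_parts = path.rsplit("/")
print(all_parts)
```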
List Functions:
my_list = [1, 2, 3, 4, 5]
length = len(my_list)
print("Length:", length)
numbers = [1, 2, 3, 4, 5]
total = sum(numbers)
print("Sum:", total)
unsorted_list = [3, 1, 4, 1, 5, 9, 2]
sorted_list = sorted(unsorted_list)
print("Sorted list:", sorted_list)
enumerate(): Returns an enumerate object, which contains pairs of index and element.
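A short sketch of enumerate() in a loop, with an illustrative list of fruit names. The optional start argument changes the first index.

```python
fruits = ["apple", "banana", "orange"]

# Each iteration yields an (index, element) pair
for index, fruit in enumerate(fruits):
    print(index, fruit)

# Start counting from 1 instead of 0
numbered = list(enumerate(fruits, start=1))
print(numbered)
```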
my_list = [1, 2, 3]
my_list.append(4)
print("Appended list:", my_list)
first_list = [1, 2, 3]
second_list = [4, 5, 6]
first_list.extend(second_list)
print("Extended list:", first_list)
insert(): Inserts an element at a specified position in the list.
my_list = [1, 2, 3]
my_list.insert(1, 5)
print("Inserted list:", my_list)
remove(): Removes the first occurrence of a specified element from the list.
my_list = [1, 2, 3, 2, 4]
my_list.remove(2)
print("Removed list:", my_list)
pop(): Removes and returns the element at the specified position (default is the last element).
my_list = [1, 2, 3, 4]
popped_element = my_list.pop(2)
print("Popped element:", popped_element)
my_list = [1, 2, 3]
my_list.clear()
print("Cleared list:", my_list)
my_list = [1, 2, 2, 3, 4, 2]
count_of_2 = my_list.count(2)
print("Count of 2:", count_of_2)
my_list = [1, 2, 3, 4, 5]
my_list.reverse()
print("Reversed list:", my_list)
numbers = [3, 1, 4, 1, 5, 9, 2]
numbers.sort()
print("Sorted list:", numbers)
Tuples are immutable, meaning they cannot be modified after creation. As a result, tuples have fewer
methods compared to lists. However, there are still some methods available:
my_tuple = (1, 2, 3, 4, 5)
length = len(my_tuple)
print("Length:", length)
numbers = (1, 2, 3, 4, 5)
total = sum(numbers)
print("Sum:", total)
unsorted_tuple = (3, 1, 4, 1, 5, 9, 2)
sorted_list = sorted(unsorted_tuple)
print("Sorted list:", sorted_list)
my_tuple = (1, 2, 2, 3, 4, 2)
count_of_2 = my_tuple.count(2)
print("Count of 2:", count_of_2)
Set Functions:
my_set = {1, 2, 3, 4, 5}
length = len(my_set)
print("Length:", length)
numbers = {1, 2, 3, 4, 5}
total = sum(numbers)
print("Sum:", total)
my_set = {1, 2, 3}
my_set.add(4)
print("Updated set:", my_set)
update(): Adds elements from another iterable (e.g., list, tuple, set) to the set.
first_set = {1, 2, 3}
second_set = {3, 4, 5}
first_set.update(second_set)
print("Updated set:", first_set)
remove(): Removes a specified element from the set. Raises an error if the element is not present.
my_set = {1, 2, 3, 4}
my_set.remove(3)
print("Updated set:", my_set)
discard(): Removes a specified element from the set. Does not raise an error if the element is not
present.
my_set = {1, 2, 3, 4}
my_set.discard(3)
print("Updated set:", my_set)
pop(): Removes and returns an arbitrary element from the set. Raises an error if the set is empty.
my_set = {1, 2, 3, 4}
popped_element = my_set.pop()
print("Popped element:", popped_element)
my_set = {1, 2, 3, 4}
my_set.clear()
print("Cleared set:", my_set)
union(): Returns a new set containing all unique elements from two or more sets.
set1 = {1, 2, 3}
set2 = {3, 4, 5}
union_set = set1.union(set2)
print("Union set:", union_set)
intersection(): Returns a new set containing common elements from two or more sets.
set1 = {1, 2, 3, 4}
set2 = {3, 4, 5, 6}
intersection_set = set1.intersection(set2)
print("Intersection set:", intersection_set)
difference(): Returns a new set containing elements that are in the first set but not in the others.
set1 = {1, 2, 3, 4}
set2 = {3, 4, 5, 6}
difference_set = set1.difference(set2)
print("Difference set:", difference_set)
symmetric_difference(): Returns a new set containing elements that are in either of the sets, but not
both.
set1 = {1, 2, 3, 4}
set2 = {3, 4, 5, 6}
symmetric_difference_set = set1.symmetric_difference(set2)
print("Symmetric Difference set:", symmetric_difference_set)
issubset(): Returns True if all elements of the set are present in another set.
set1 = {1, 2, 3}
set2 = {1, 2, 3, 4, 5}
is_subset = set1.issubset(set2)
print("Is subset:", is_subset)
issuperset(): Returns True if all elements of another set are present in the set.
set1 = {1, 2, 3, 4, 5}
set2 = {1, 2, 3}
is_superset = set1.issuperset(set2)
print("Is superset:", is_superset)
set1 = {1, 2, 3}
set2 = {4, 5, 6}
is_disjoint = set1.isdisjoint(set2)
print("Are sets disjoint:", is_disjoint)
Dictionary Functions:
my_dict = {'name': 'Alice', 'age': 30}
length = len(my_dict)
print("Length:", length)
pop(): Removes and returns the value for a specified key. Raises an error if the key is not found.
popitem(): Removes and returns the last key-value pair as a tuple. Raises an error if the dictionary is
empty.
setdefault(): Returns the value for a specified key. If the key is not found, it inserts the key with a
specified default value.
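The three dictionary methods described above can be sketched as follows. The dictionary contents are illustrative assumptions, and the popitem() result relies on dictionaries preserving insertion order (Python 3.7 and later).

```python
my_dict = {"name": "Alice", "age": 30, "city": "Paris"}

# pop(): remove a key and return its value
age = my_dict.pop("age")

# pop() with a default avoids a KeyError when the key is missing
missing = my_dict.pop("email", None)

# popitem(): remove and return the last inserted key-value pair
last_item = my_dict.popitem()

# setdefault(): insert the key with a default only if it is not present
country = my_dict.setdefault("country", "France")
name = my_dict.setdefault("name", "Bob")  # key exists, value is unchanged

print(age, missing, last_item, country, name)
print(my_dict)
```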
In Python, a block of code refers to a set of statements that are grouped together and executed as a
single unit. Blocks are defined by indentation, and the standard convention is to use four spaces for each
level of indentation. Blocks are fundamental to the structure of Python programs and are used in various
constructs like functions, loops, conditional statements, and classes.
Conditional statements are crucial in programming because they provide a way to control the flow of a
program based on certain conditions. This allows your program to make decisions dynamically,
responding to different situations or inputs. Without conditional statements, programs would follow a
static and unchanging sequence of instructions, making them less flexible and less capable of handling
diverse scenarios.
Now, let's look at examples of each type of conditional statement:
1. if Statement:
age = 25
if age >= 18:
    print("You are eligible to vote.")
2. if-else Statement:
temperature = 25
if temperature > 30:
    print("It's a hot day.")
else:
    print("It's not a hot day.")
3. if-elif Statement:
score = 75
if score >= 90:
    print("A grade")
elif score >= 80:
    print("B grade")
elif score >= 70:
    print("C grade")
else:
    print("Fail")
4. if-elif-else Statement:
number = 0
if number > 0:
    print("Positive number")
elif number < 0:
    print("Negative number")
else:
    print("Zero")
5. Nested if Statements:
num = 10
if num > 0:
    if num % 2 == 0:
        print("Positive even number")
    else:
        print("Positive odd number")
else:
    print("Not a positive number")
Let's consider some real-world examples where conditional statements are commonly used in
programming:
1. Traffic Light Control:
traffic_light_color = "red"
if traffic_light_color == "red":
    print("Stop")
elif traffic_light_color == "yellow":
    print("Prepare to stop")
else:
    print("Go")
In a traffic light control system, the program decides the action based on the color of the traffic light.
2. User Authentication:
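A simple sketch of how conditional statements might drive a login check. The stored credentials and the function name check_login are hypothetical and for illustration only; a real system would compare securely hashed passwords rather than plain text.

```python
# Hypothetical stored credentials, for illustration only
stored_username = "admin"
stored_password = "s3cret"


def check_login(username, password):
    """Return a message describing the outcome of a login attempt."""
    if username == stored_username and password == stored_password:
        return "Access granted"
    elif username == stored_username:
        return "Incorrect password"
    else:
        return "Unknown user"


print(check_login("admin", "s3cret"))
print(check_login("admin", "wrong"))
print(check_login("guest", "s3cret"))
```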
Room Temperature Control:
current_temperature = 28
if current_temperature > 30:
    print("Turn on the air conditioner")
elif 25 <= current_temperature <= 30:
    print("Normal room temperature")
else:
    print("Turn on the heater")
A program controlling the temperature in a room based on the current temperature.
5. Grade Calculation:
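A grade calculation can be sketched as a function built on an if-elif-else chain. The thresholds reuse those from the score example earlier in this section, and the function name calculate_grade is a hypothetical choice.

```python
def calculate_grade(score):
    """Map a numeric score to a grade using an if-elif-else chain."""
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    elif score >= 70:
        return "C"
    else:
        return "Fail"


for score in (95, 85, 75, 40):
    print(score, "->", calculate_grade(score))
```

The conditions are checked from top to bottom, so each branch only needs to test the lower bound of its range.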
Loops in Python:
In Python, loops are used to repeatedly execute a block of code as long as a certain condition is met.
They provide a way to automate repetitive tasks and iterate over collections of data.
for Loop: It is used for iterating over a sequence (that is either a list, tuple, dictionary, string, or other
iterable types).
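A minimal sketch of the for loop, first iterating over a list and then over a range of numbers. The sample data is an illustrative assumption.

```python
fruits = ["apple", "banana", "orange"]

# The loop variable takes each element of the sequence in turn
for fruit in fruits:
    print(fruit)

# range(1, 6) produces the numbers 1 to 5 (the stop value is excluded)
squares = []
for number in range(1, 6):
    squares.append(number ** 2)
print(squares)
```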
while Loop: It repeatedly executes a block of code as long as a given condition remains true.
Syntax:
while condition:
    # code to be executed
Example:
count = 0
while count < 5:
    print(count)
    count += 1
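Example 1: Sum of Numbers using for Loop
numbers = [1, 2, 3, 4, 5]
sum_numbers = 0
for number in numbers:
    sum_numbers += number
print("Sum:", sum_numbers)
Example 2: Power of a Number using for Loop
base = 2
exponent = 5
result = 1
for _ in range(exponent):
    result *= base  # multiply by base once per iteration
print("Result:", result)
Example 3: Factorial using while Loop
number = 5
factorial = 1
while number > 0:
    factorial *= number
    number -= 1
print("Factorial:", factorial)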
This example calculates the factorial of a number using a while loop.
Real-world needs for loops are abundant: processing every record in a file, repeating an operation
until it succeeds, or applying the same calculation to each item in a collection.
Function in Python:
A function in Python is a reusable block of code that performs a specific task. It allows you to organize
code into modular and reusable components, making your code more readable, maintainable, and
efficient. Functions are defined using the def keyword, and they can take parameters, return values, and
be called multiple times.
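Syntax:
def function_name(parameters):
    # function body
    return value
where function_name is the name of the function, parameters are the optional inputs it accepts, and
the optional return statement sends a value back to the caller.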
Types of Functions:
Built-in Functions: Functions that are built into Python, such as print(), len(), and type().
User-defined Functions: Functions that you define yourself using the def keyword.
Parameters:
Parameters are the names listed in a function definition and used within the function's code. The
values passed in when the function is called are known as arguments.
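Parameter Types:
Positional Parameters: Arguments are matched to parameters by their position.
def greet(name, greeting):
    print(f"{greeting}, {name}!")
greet("Alice", "Hello")
Default Parameters: A parameter can be given a default value that is used when no argument is supplied.
def greet(name, greeting="Hello"):
    print(f"{greeting}, {name}!")
greet("Bob")
Keyword Parameters: Arguments are passed by name, so their order does not matter.
greet(greeting="Hi", name="Charlie")
Variable-Length Parameters: A parameter prefixed with * collects any extra positional arguments.
def print_items(*items):
    for item in items:
        print(item)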
Examples:
Simple Function:
def greet(name):
    print(f"Hello, {name}!")
greet("Alice")
Function with Return Value:
def add_numbers(a, b):
    return a + b
result = add_numbers(5, 3)
print("Sum:", result)
Function with Default Parameter:
def greet(name, greeting="Hello"):
    print(f"{greeting}, {name}!")
greet(greeting="Hi", name="Charlie")
You can use the *args syntax to indicate that a function can accept any number of positional arguments.
These arguments are collected into a tuple.
Example:
def print_arguments(*args):
    for arg in args:
        print(arg)
Similarly, the **kwargs syntax collects any number of keyword arguments into a dictionary.
Example:
def print_keyword_arguments(**kwargs):
    for key, value in kwargs.items():
        print(f"{key}: {value}")
You can use both *args and **kwargs in the same function to accept a combination of positional and
keyword arguments.
Example:
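A minimal sketch of a function that accepts both *args and **kwargs. The function name describe_call and the sample arguments are illustrative assumptions.

```python
def describe_call(*args, **kwargs):
    # args is a tuple of positional arguments,
    # kwargs is a dictionary of keyword arguments
    print("Positional:", args)
    print("Keyword:", kwargs)
    return args, kwargs


positional, keyword_args = describe_call(1, 2, name="Alice", age=30)
```

In the function definition, *args must come before **kwargs.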
Exception Handling:
Error:
An error is a broader term that encompasses any unexpected or incorrect behavior in a program. Errors
can be categorized into different types, and exceptions are one specific type of error. Other types of
errors include syntax errors, runtime errors, and logical errors.
Examples of Errors:
Syntax Error:
Runtime Error:
Logical Error:
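The three kinds of error can be illustrated with one small sketch. The snippets and function names are illustrative assumptions. compile() is used for the syntax error so that it can be caught and displayed, since a syntax error in the script itself would prevent it from running at all.

```python
# Syntax error: the code breaks the grammar rules (missing closing parenthesis)
caught_syntax_error = False
try:
    compile("print('hello'", "<example>", "exec")
except SyntaxError as error:
    caught_syntax_error = True
    print("Syntax error:", error.msg)

# Runtime error: the code is valid but fails while running
caught_runtime_error = False
try:
    result = 10 / 0
except ZeroDivisionError as error:
    caught_runtime_error = True
    print("Runtime error:", error)


# Logical error: the code runs without any exception but gives a wrong answer
def average_with_bug(a, b):
    return a + b / 2  # bug: only b is divided, because / binds tighter than +


def average_fixed(a, b):
    return (a + b) / 2


print(average_with_bug(4, 6), average_fixed(4, 6))
```

Only the first two kinds produce an error message. The logical error has to be found by testing the result.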
Exception (Runtime Error):
An exception in programming refers to an unexpected event or error that occurs during the execution of
a program. Exceptions disrupt the normal flow of a program and may lead to unintended or incorrect
behavior. These events can be caused by various factors, such as user input, file operations, network
issues, or programming errors.
In Python, exceptions are represented as objects. When an exceptional situation arises, an exception
object is raised. If the exception is not handled (caught) by the program, it typically results in the
termination of the program and the display of an error message.
Why Exception Handling is Required:
● Program Stability: Without exception handling, unexpected errors could cause a program to
crash, leading to instability and potential data loss.
● User-Friendly Error Messages: Exception handling allows you to provide meaningful error
messages to users, making it easier for them to understand and report issues.
● Debugging and Maintenance: Proper exception handling helps in identifying and fixing issues
during development and maintenance phases, contributing to code robustness.
● Graceful Degradation: Exception handling enables a program to continue running and handle
errors gracefully, even if some parts of the code encounter issues.
● Resource Cleanup: It allows for proper cleanup of resources (like file handles or network
connections) even in the presence of errors, preventing resource leaks.
● Error Logging: Exception handling facilitates the logging of errors, providing valuable information
for diagnosing and resolving issues.
● Fault Isolation: By handling exceptions at appropriate levels, you can isolate faults and prevent
them from propagating through the entire program.
Example:
try:
    # Code that may raise an exception
    num1 = int(input("Enter a numerator: "))
    num2 = int(input("Enter a denominator: "))
    result = num1 / num2
except ZeroDivisionError:
    # Handling division by zero exception
    print("Error: Division by zero!")
except ValueError:
    # Handling invalid input (non-integer) exception
    print("Error: Invalid input! Please enter integers.")
else:
    # Code to be executed if no exception is raised
    print("Result:", result)
finally:
    # Code to be executed regardless of whether an exception is raised or not
    print("Finally block: This will always execute.")
In this example, the try block contains the code that may raise an exception. The except blocks handle
specific exceptions, and the else block is executed if no exception occurs. The finally block contains code
that will always be executed, whether an exception is raised or not.
Custom exceptions:
Example 1
try:
    age = int(input("Enter the age: "))
    if age < 18:
        raise Exception
    else:
        print("The age is valid")
except Exception:
    print("The age is not valid")
Example 2
try:
    a = int(input("Enter a: "))
    b = int(input("Enter b: "))
    if b == 0:
        raise ArithmeticError
    else:
        print("a/b =", a / b)
except ValueError:
    print("The values must be numbers")
except ArithmeticError:
    print("The value of b can't be 0")
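The examples above raise built-in exception classes. A custom exception is usually written as a class that inherits from Exception. The following is a sketch under that assumption, in which the class name InvalidAgeError and the function validate_age are hypothetical.

```python
class InvalidAgeError(Exception):
    """Raised when an age is below the allowed minimum (hypothetical example)."""

    def __init__(self, age, minimum=18):
        self.age = age
        self.minimum = minimum
        super().__init__(f"Age {age} is below the minimum of {minimum}")


def validate_age(age):
    if age < 18:
        raise InvalidAgeError(age)
    return age


try:
    validate_age(15)
except InvalidAgeError as error:
    message = str(error)
    print("Invalid age:", message)
```

A dedicated class lets callers catch this specific problem without also catching unrelated errors such as ValueError.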
assert Statement
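Python has a built-in assert statement for checking conditions in a program. An assert statement contains
a condition or expression that is expected to always be true.
If the condition is false, assert raises an AssertionError, which halts the program unless it is handled.
Assertions are skipped when Python is run with the -O option, so they suit debugging checks rather than
validation of user input.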
Example 1
def avg(marks):
    assert len(marks) != 0, "The List is empty."
    return sum(marks) / len(marks)

marks1 = [67, 59, 86, 75, 92]
print("The Average of Marks1:", avg(marks1))
marks2 = []
# The next call raises AssertionError because marks2 is empty
print("The Average of Marks2:", avg(marks2))
Example 2
x = int(input("Enter x: "))
y = int(input("Enter y: "))
# Use assert to check that the divisor is not 0 before dividing
assert y != 0, "Divide by 0 error"
print("x / y value is:", x / y)
Example 3
# Assert with try - except
try:
    x = 5
    y = 10
    # This will raise an AssertionError because x is not equal to y
    assert x == y, "Values are not equal"
except AssertionError as e:
    print(f"AssertionError occurred: {e}")
    # You can add handling code here, such as logging the error or taking
    # corrective action
File Handling:
File handling in programming refers to the process of reading from and writing to files. Files are used to
store and retrieve data persistently, providing a way for programs to store information between runs.
File handling allows programs to interact with external storage devices, such as hard drives, to store and
retrieve data.
In most programming languages, including Python, file handling involves the following key operations:
● Opening a File: Use the open() function to open a file. This function returns a file object that is
used to interact with the file.
● Reading from a File: Use methods like read(), readline(), or readlines() to read data from the file.
● Writing to a File: Use methods like write() to write data to the file.
● Closing a File: Use the close() method to close the file after reading or writing.
Why File Handling is Required:
● Data Persistence: File handling enables programs to store data persistently, allowing
information to be saved between different program executions.
● Data Sharing: Files provide a common and easily accessible way for different programs or
components to share and exchange data.
● Configuration Storage: Configuration settings for programs can be stored in files, making it easy
to modify settings without changing the program code.
● Database Interaction: Many programs interact with databases through files. Data is often read
from or written to files before being transferred to or from a database.
● Logging and Debugging: Files are commonly used for logging information and debugging
purposes. Program logs can be stored in files for later analysis.
● Text and Binary Data Handling: File handling allows programs to work with both text and binary
data, making it versatile for various types of applications.
● Resource Sharing: Files provide a means for multiple processes or programs to share and access
the same data.
● Backup and Recovery: File handling facilitates data backup and recovery strategies by allowing
the program to store critical information in external files.
In summary, file handling is required for the persistent storage and retrieval of data, data sharing
between programs, configuration management, database interaction, logging, debugging, resource
sharing, and backup and recovery processes. It is a fundamental aspect of many software applications.
Closing a file.
To make sure a file is closed every time it is opened, follow one of the approaches below.
Use finally.
f = None
try:
    f = open("test.txt", "w+")
    f.write("123123123")
except Exception as e:
    print(e)
finally:
    # f is still None if open() itself failed
    if f is not None:
        f.close()
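Another common approach is the with statement, which the examples in the following sections use. A minimal sketch, assuming a scratch file in a temporary directory so that no real data is touched:

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "test.txt")

with open(path, "w+") as f:
    f.write("123123123")
    # The file is still open inside the block.
    open_inside = not f.closed

# Leaving the block closes the file, even if an exception was raised inside it.
closed_after = f.closed
print(open_inside, closed_after)
```

With this form there is no explicit close() call to forget.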
read method.
with open("large_file.txt", "r") as f:
    print(f.read())
The read() method reads the whole content of the file in one go.
It also accepts an optional parameter, size: the maximum number of characters (text mode) or bytes
(binary mode) to read.
readline method.
with open("large_file.txt", "r") as f:
    print(f.readline())
The readline() method reads only one line at a time.
It also accepts an optional parameter, size: the maximum number of characters or bytes to read
from that line.
readlines method.
with open("large_file.txt", "r") as f:
    print(f.readlines())
The readlines() method also reads the whole content of the file in one go, but returns it as a list of lines.
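The three read methods can be compared side by side. The sketch below first creates a small sample file in a temporary directory (an assumption made so the example runs on its own) and then reads it in different ways.

```python
import os
import tempfile

# Create a small sample file to read from.
path = os.path.join(tempfile.mkdtemp(), "large_file.txt")
with open(path, "w") as f:
    f.write("first line\nsecond line\nthird line\n")

with open(path, "r") as f:
    chunk = f.read(5)            # at most 5 characters
    rest_of_line = f.readline()  # continues from the cursor to the end of the line
    remaining = f.readlines()    # list of the lines that are left

with open(path, "r") as f:
    # Iterating over the file object yields one line at a time,
    # which avoids loading a large file into memory at once.
    stripped = [line.rstrip("\n") for line in f]

print(chunk, repr(rest_of_line), remaining, stripped)
```

Each call continues from where the previous one stopped, because all of them share the same file cursor.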
write method.
with open("write_example.txt", "w") as f:
    f.write("hello")
Writes the content to the file.
writelines method.
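names = list()
names.append("Anurag\n")
names.append("Ashutosh\n")
names.append("rana\n")
with open("names_example.txt", "w") as f:
    f.writelines(names)
Writes the list of lines/text to the file. writelines() does not add line separators, so each string
must end with "\n" if it should appear on its own line.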
r+ -> open for both reading and writing (file pointer is at the beginning of the file)
w+ -> open for both reading and writing (truncate the file if it exists)
a -> open for writing (append to the end of the file if it exists)
wb+ -> open for both writing and reading in binary format
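The effect of these modes can be seen by applying them to the same file in turn. This is a small sketch using a hypothetical file name in a temporary directory.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes_example.txt")

with open(path, "w+") as f:   # creates the file, or truncates an existing one
    f.write("one\n")

with open(path, "a") as f:    # appends after the existing content
    f.write("two\n")

with open(path, "r+") as f:   # read and write, cursor starts at position 0
    before = f.read()
    f.seek(0)
    f.write("ONE")            # overwrites the first three characters in place

with open(path, "r") as f:
    after = f.read()

print(repr(before), repr(after))
```

Note that r+ overwrites existing characters rather than inserting new ones, and that it fails if the file does not exist.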
tell() method
In Python file handling, the tell() method returns the current position of the file object's cursor
(also called the file pointer). The cursor determines where data will next be read from or written
to in the file.
seek() method
In Python file handling, the seek() method moves the file object's cursor to a required position.
The cursor determines where data will next be read from or written to in the file.
seek(offset, whence) moves the cursor by the given offset, relative to the position selected by
whence:
0 -> from the beginning of the file. This is the default.
1 -> from the current position. In text mode only an offset of 0 is allowed.
2 -> from the end of the file. A nonzero offset can be used in binary mode only.
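A short sketch of tell() and seek() working together. The file is opened in binary mode so that seeking relative to the end is allowed; the file name is hypothetical.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "seek_example.bin")
with open(path, "wb") as f:
    f.write(b"0123456789")

with open(path, "rb") as f:
    start = f.tell()      # a freshly opened file starts at position 0
    f.seek(4)             # whence defaults to 0 (from the beginning)
    middle = f.read(2)    # reads the bytes at positions 4 and 5
    position = f.tell()   # reading advanced the cursor by two bytes
    f.seek(-3, 2)         # three bytes before the end of the file
    tail = f.read()

print(start, middle, position, tail)
```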
Regular Expressions:
Regular expressions are useful for a variety of tasks, including data validation,
text parsing, and search and replace operations. They can be used in
programming languages like Python, Perl, and JavaScript, as well as in text
editors and command line tools. Common uses include:
● Validating user input in a web form (e.g. ensuring that an email address is in a
valid format).
● Reformatting data in a file (e.g. replacing all occurrences of a date format with
a different format).
● Extracting specific data from a large text document (e.g. finding all occurrences
of a keyword).
Regular expressions provide a flexible way to search and manipulate text data,
allowing you to quickly and efficiently perform complex operations on large
amounts of data. They can be used in a wide range of applications, from web
development to data science, and are a powerful tool for anyone working with
text data.
String concatenation:
String concatenation is a basic method for combining text data, but it doesn't
offer any pattern matching or manipulation capabilities. Regular expressions, on
the other hand, can be used to perform complex pattern matching and
manipulation operations, such as searching for specific patterns or extracting
data based on context.
Parsing libraries:
Parsing libraries can be used to extract data from structured text formats, such
as XML or JSON. While these libraries offer powerful functionality for parsing
specific formats, they may not be as flexible as regular expressions when it
comes to handling more general text data. Regular expressions can be used to
search and manipulate text data in a wide range of formats, making them a
more versatile tool.
Characters:
Character classes:
Character classes are groups of characters that can be matched with a single
regular expression. For example, the character class [aeiou] matches any vowel
in the text data.
Metacharacters:
Anchors:
Anchors are metacharacters that match specific positions in the text data. For
example, the ^ anchor matches the beginning of a line or string, while the
$ anchor matches the end of a line or string.
Groups:
Quantifiers:
Overall, regular expressions provide a powerful and flexible syntax for defining
patterns and searching for text data. By learning the basics of regular
expression syntax, you can use regular expressions to perform complex text
manipulation tasks in a wide range of applications.
Basic syntax:
Character literals:
Character classes:
Character classes are a group of characters that can be matched with a single
regular expression. For example, the character class [aeiou] matches any vowel
in the text data. Other common character classes include [0-9] (matching any
digit), [A-Z] (matching any uppercase letter), and [a-z] (matching any
lowercase letter).
Character ranges:
Negated character classes are character classes that match any character
except those specified in the class. For example, the regular expression [^a-z]
matches any character that is not a lowercase letter.
The question mark quantifier matches zero or one occurrence of the preceding
character or group. For example, the regular expression ab?c matches 'ac' and
'abc', but not 'abbc' or 'abbbc'.
The curly braces quantifier allows you to specify a range of occurrences for the
preceding character or group. For example, the regular expression a{2,4}b
matches 'aab', 'aaab', and 'aaaab', but not 'ab' or 'aaaaab'.
Using quantifiers effectively is important for creating flexible and precise regular
expressions. By specifying the appropriate quantifiers, you can match patterns
of varying length and complexity in text data.
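The two quantifier examples above can be checked with Python's re module. This sketch uses re.fullmatch, which succeeds only when the whole string matches, so that the "but not" cases behave as described.

```python
import re

optional_b = re.compile(r"ab?c")
two_to_four_a = re.compile(r"a{2,4}b")

# fullmatch succeeds only if the entire string matches the pattern.
optional_results = {s: bool(optional_b.fullmatch(s)) for s in ["ac", "abc", "abbc"]}
range_results = {
    s: bool(two_to_four_a.fullmatch(s)) for s in ["ab", "aab", "aaaab", "aaaaab"]
}

# search looks for the pattern anywhere, so it finds "aaaab" inside "aaaaab".
found_inside = two_to_four_a.search("aaaaab").group()

print(optional_results, range_results, found_inside)
```

The last line shows why the choice between search and fullmatch matters: a string with five a's does not match as a whole, but it still contains a matching substring.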
By combining the dot character with quantifiers, you can match multiple
characters in the text data. For example, the regular expression a.*c matches
any sequence of characters that starts with 'a' and ends with 'c', including zero
or more characters of any type in between.
To match a literal dot character in the text data, you can use the backslash (\)
character to "escape" the dot. For example, the regular expression a\.c matches
'a.c', but not 'abc' or 'adc'.
Because the dot character matches any character except for newline characters,
it can be useful for matching a wide range of special characters in the text data.
For example, the regular expression .\d matches any character followed by a
digit, and the regular expression .\s matches any character followed by a
whitespace character.
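A brief sketch of the dot behavior described above, using illustrative input strings:

```python
import re

# .* is greedy, so the match runs from the first 'a' to the last 'c'.
greedy_span = re.search(r"a.*c", "xxabcabcxx").group()

# An escaped dot matches only a literal '.'.
literal_dot = [bool(re.fullmatch(r"a\.c", s)) for s in ["a.c", "abc"]]

# An unescaped dot matches any character except a newline.
unescaped_dot = [bool(re.fullmatch(r"a.c", s)) for s in ["a.c", "abc", "a\nc"]]

print(greedy_span, literal_dot, unescaped_dot)
```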
Anchors:
Explanation of the ^ and $ characters and their use in anchoring matches to the beginning or
end of a string or line.
The caret (^) and dollar sign ($) characters are special characters in regular
expressions that allow you to anchor matches to the beginning or end of a
string or line. Here are some key concepts related to using the caret and dollar
sign characters:
The caret character (^) matches the beginning of a string or line. For example,
the regular expression ^hello matches any string or line that begins with the
word 'hello'.
The dollar sign ($) matches the end of a string or line. For example, the regular
expression world$ matches any string or line that ends with the word 'world'.
By combining the caret and dollar sign characters, you can anchor matches to
both the beginning and end of a string or line. For example, the regular
expression ^hello world$ matches only the exact string "hello world" and not
any string that contains "hello world" as a substring.
Multiline matching:
In some regular expression engines, the caret and dollar sign characters can be
used to anchor matches to the beginning and end of each line in a multiline
string. For example, the regular expression ^hello$ matches any line that
consists solely of the word 'hello', but not lines that contain other words in
addition to 'hello'.
Using the caret and dollar sign characters effectively is important for creating
regular expressions that match specific patterns at the beginning or end of
strings or lines. By anchoring matches to specific positions in the text data, you
can create precise and flexible patterns that match only the desired sequences
of characters.
Discussion of the difference between multiline and single-line mode.
Regular expression engines offer two options that are often confused: single-line
mode and multiline mode. Despite the names, they are not opposites. Each one
changes a different part of the matching behavior, and they can be enabled
independently or together. Here's a brief overview of the two:
Single-line mode:
In single-line mode (also called dot-all mode), the dot (.) character matches
any character, including newline characters. This means that a regular
expression like .* can match across multiple lines of text. This mode does not
change the behavior of the caret (^) and dollar sign ($) characters.
Multiline mode:
In multiline mode, the caret (^) and dollar sign ($) characters match the
beginning and end of each line in addition to the beginning and end of the
entire text data. This mode does not change the dot (.) character, which by
default matches any character except newline characters.
To specify which mode a regular expression engine should use, you can set an
appropriate flag or option when compiling or executing the regular expression.
For example, in Python's regular expression module, you can set the re.DOTALL
flag to enable single-line mode, and the re.MULTILINE flag to enable multiline
mode.
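The effect of the two Python flags can be seen on a two-line string. This is a small sketch with an illustrative input:

```python
import re

text = "hello\nworld"

# re.MULTILINE changes where ^ (and $) can match.
default_starts = re.findall(r"^\w+", text)
multiline_starts = re.findall(r"^\w+", text, re.MULTILINE)

# re.DOTALL changes what the dot can match.
default_dot = re.findall(r".+", text)
dotall_dot = re.findall(r".+", text, re.DOTALL)

# Both flags can be combined with the | operator.
both = re.findall(r"^.+$", text, re.MULTILINE | re.DOTALL)

print(default_starts, multiline_starts, default_dot, dotall_dot, both)
```

With both flags set, the greedy .+ crosses the newline and the pattern matches the whole text once.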
Suppose you want to match any string that contains either the word "apple" or
the word "orange". You could create a regular expression like this:
apple|orange
This regular expression matches any string that contains either "apple" or
"orange", regardless of the surrounding characters. For example, the regular
expression would match the following strings:
Note that the | character has lower precedence than other regular expression
operators, such as quantifiers (*, +, ?) and grouping parentheses (()). This means
that if you want to use the | character with other operators or parentheses,
you may need to use grouping parentheses to specify the correct order of
operations.
The | character is a powerful tool for creating flexible and customizable regular
expressions that can match a wide variety of patterns. By specifying multiple
alternatives separated by the | character, you can create regular expressions
that match complex patterns in text data.
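A short sketch of alternation and its low precedence. The gray/grey patterns are an illustrative addition, not taken from the text above.

```python
import re

fruit = re.compile(r"apple|orange")
samples = ["apple pie", "orange juice", "banana split"]
hits = [s for s in samples if fruit.search(s)]

candidates = ["gray", "grey", "gra", "ey"]
# Without parentheses, | splits the whole pattern: "gra" OR "ey".
loose = [bool(re.fullmatch(r"gra|ey", s)) for s in candidates]
# Grouping limits the alternation to the single letter.
grouped = [bool(re.fullmatch(r"gr(a|e)y", s)) for s in candidates]

print(hits, loose, grouped)
```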
Use of parentheses to group alternatives for more complex matching.
In regular expressions, parentheses are used to group together parts of a
pattern, allowing you to apply operators or modifiers to the entire group. This
can be especially useful when you want to match complex patterns that include
multiple alternatives or quantifiers.
Suppose you want to match any string that contains either the word "apple" or
the word "orange", followed by the word "juice". You could create a regular
expression like this:
(apple|orange) juice
This regular expression matches any string that contains either "apple juice" or
"orange juice", regardless of the surrounding characters.
By using parentheses to group the alternatives together, you limit the |
operator to the two fruit names, so the word "juice" must follow whichever
alternative matched. Without the parentheses, apple|orange juice would match
either "apple" on its own or "orange juice".
Parentheses can also be used to group together parts of a pattern that are
repeated multiple times, as in the following example:
Suppose you want to match any string that contains three consecutive groups of
three digits. You could create a regular expression like this:
(\d{3}){3}
This regular expression matches nine digits in a row. The inner braces make
\d{3} match exactly three digits, the parentheses group that pattern, and the
braces after the parentheses repeat the whole group exactly three times. The
three groups do not have to contain the same digits; requiring identical
repeats takes a backreference.
By using parentheses to group parts of your regular expressions, you can create
more complex patterns that match the desired sequences of characters. This
can be especially useful for matching patterns in text data that include multiple
alternatives, quantifiers, or other special characters.
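The two grouping patterns from this section can be tried out as follows. The input strings are illustrative.

```python
import re

juice = re.compile(r"(apple|orange) juice")
match = juice.search("a glass of orange juice, please")
fruit = match.group(1)   # text captured by the parentheses
whole = match.group(0)   # the entire match

nine_digits = re.compile(r"(\d{3}){3}")
ok = bool(nine_digits.fullmatch("123456789"))
too_short = bool(nine_digits.fullmatch("12345678"))
# A repeated group keeps only the text of its last repetition.
last_group = nine_digits.fullmatch("123456789").group(1)

print(fruit, whole, ok, too_short, last_group)
```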
Grouping:
Explanation of how parentheses can be used to group parts of a regular expression together.
In regular expressions, parentheses are used to group together parts of a
pattern, allowing you to apply operators or modifiers to the entire group. This
can be especially useful when you want to match complex patterns that include
multiple alternatives or quantifiers.
For example, suppose you want to match a phone number in a specific format:
(123) 456-7890. You could create a regular expression like this:
\(\d{3}\) \d{3}-\d{4}
This regular expression matches any string that contains a sequence of three
digits enclosed in parentheses, followed by a space, then a sequence of three
digits, a dash, and a sequence of four digits.
Suppose you want to match any string that contains a sequence of digits,
followed by either the word "apples" or the word "oranges". You could create a
regular expression
\d+ (apples|oranges)
This regular expression matches any string that contains one or more digits,
followed by a space, then either the word "apples" or the word "oranges".
By using parentheses to group the alternatives together, you limit the |
operator to the two words. The quantifier (+) applies only to \d, so the
regular expression matches any string with one or more digits followed by a
space and one of the two words, regardless of the surrounding characters.
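A sketch of both patterns in Python. For the second one, an extra pair of parentheses around \d+ is added here (a small variation on the pattern above) so that findall returns the number together with the word.

```python
import re

phone = re.compile(r"\(\d{3}\) \d{3}-\d{4}")
found = phone.search("Call (123) 456-7890 today").group()

# Variation: (\d+) captures the number as well as the fruit name.
count_fruit = re.compile(r"(\d+) (apples|oranges)")
pairs = count_fruit.findall("3 apples and 12 oranges, but 5 pears")

print(found, pairs)
```

In the phone pattern the parentheses are escaped with backslashes, so they match literal "(" and ")" characters instead of forming a group.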
To refer to a matched group later in the expression, you can use a backslash
followed by the group number or group name. Group numbers start at 1 and
increase for each additional set of parentheses. For example, if you use two sets
of parentheses to group a pattern, the first group is numbered 1, and the
second group is numbered 2.
Suppose you have a string that contains a person's name in the format "Last,
First". You want to extract the last name and use it in a new string in the
format "Hello, Last". You could use a regular expression like this:
([A-Za-z]+),\s([A-Za-z]+)
This regular expression matches any string that contains a sequence of one or
more uppercase or lowercase letters, followed by a comma and a space, and
then another sequence of one or more uppercase or lowercase letters. The first
set of parentheses groups the last name, and the second set of parentheses
groups the first name.
To refer to the matched groups later in the expression, you can use backslashes
followed by the group numbers (1 and 2, in this case). For example, to replace
the original string with the new format, you could use the following code:
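import re
pattern = re.compile(r'([A-Za-z]+),\s([A-Za-z]+)')
string = "Doe, John"  # sample input, assumed for illustration
result = pattern.sub(r"Hello, \1", string)  # \1 refers to the first group
print(result)  # Output: Hello, Doe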
Use of non-capturing groups when the matched group doesn't need to be saved.
Sometimes, you may want to use parentheses to group parts of a regular
expression without actually capturing the matched data. In these cases, you can
use non-capturing groups.
Because the middle group doesn't need to be saved, it's enclosed in a non-
capturing group. If we used a regular capturing group here, we'd end up with
unnecessary groupings in the final match result.
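In Python, a non-capturing group is written (?:...). The sketch below uses a hypothetical key/separator/value pattern in which the middle group (the separator) is needed for matching but not for the result.

```python
import re

text = "color=blue"  # illustrative input

# Three capturing groups: the separator shows up in the result.
capturing = re.search(r"(\w+)(=|:)(\w+)", text).groups()

# The middle group is non-capturing: it still matches, but is not saved.
non_capturing = re.search(r"(\w+)(?:=|:)(\w+)", text).groups()

print(capturing, non_capturing)
```

Both patterns match the same text; only the list of saved groups differs, which also keeps the group numbers of the remaining groups simple.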
Backreferences:
For example, suppose you want to match any string that repeats a sequence of
characters twice, such as "hellohello" or "goodbyegoodbye". You can use a regular
expression like this:
(\w+)\1
Backreferences can be very useful in many different scenarios. For example, you
can use them to ensure that two parts of a string match each other, or to
simplify complex regular expressions by reusing parts of the pattern.
It's important to note that not all regular expression engines support
backreferences, or they may support them in slightly different ways. So, if
you're using backreferences in your regular expressions, be sure to check the
documentation for the specific engine you're using.
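A sketch of the (\w+)\1 pattern in Python, using re.fullmatch so that the whole string must consist of a repeated sequence. The named-group form at the end is an illustrative addition.

```python
import re

doubled = re.compile(r"(\w+)\1")
words = ["hellohello", "goodbyegoodbye", "helloworld"]
results = {s: bool(doubled.fullmatch(s)) for s in words}

# With search, a short repeat inside a word is enough: "ll" in "hello".
inside = doubled.search("hello").group()

# Named groups: (?P<name>...) captures, (?P=name) refers back to it.
named = re.fullmatch(r"(?P<word>\w+)(?P=word)", "byebye").group("word")

print(results, inside, named)
```

The search result is a reminder that a backreference pattern without anchors can match in unexpected places, such as a doubled letter.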
Let's say you have a string that contains a repeated word, like "hello hello". You
want to use a regular expression to match this pattern.
You can use a capturing group to match the first word, and then use a
backreference to match the second occurrence of the same word. Here's the
regular expression you can use:
(\w+)\s+\1
So, the regular expression matches any sequence of word characters followed by
one or more whitespace characters, and then the exact same sequence of word
characters again.
Here's some example Python code that uses this regular expression to find
repeated words in a string:
import re
string = "hello hello world world"
pattern = r"(\w+)\s+\1"
matches = re.findall(pattern, string)
print(matches) # Output: ['hello', 'world']
Note that this regular expression only matches a word that is repeated
immediately, with nothing but whitespace between the two occurrences. If you
want to find a word that is repeated later in the string with other words in
between, you'll need a more complex regular expression, for example one that
uses a lookahead.
Explanation of how lookahead and lookbehind assertions allow matching based on the
context surrounding a match.
Lookahead and lookbehind assertions are advanced features of regular
expressions that allow you to match text based on the context surrounding a
match, without actually including that context in the match itself.
Lookahead assertions are patterns that match a position in the string, but don't
actually consume any characters. They are denoted by (?=pattern) for a positive
lookahead, or (?!pattern) for a negative lookahead. For example, the pattern
(?=foo) matches any position in the string where the next three characters are
"foo", but doesn't actually consume those characters.
import re
string = "42 apples"
pattern = r"\d+(?=\s\w+)"
match = re.search(pattern, string)
print(match.group()) # Output: '42'
Lookbehind assertions are similar to lookahead assertions, but they match the
text preceding the current position. They are denoted by (?<=pattern) for a
positive lookbehind, or (?<!pattern) for a negative lookbehind. For example, the
pattern (?<=foo) matches any position in the string where the previous three
characters are "foo", but doesn't include those characters in the match.
import re
string = "apples 42"
pattern = r"(?<=\w\s)\d+"
match = re.search(pattern, string)
print(match.group()) # Output: '42'
In this example, the regular expression (?<=\w\s)\d+ matches one or more digits
(\d+) that are preceded by a word character (\w) and a whitespace character
(\s). The (?<=\w\s) is a positive lookbehind assertion that matches the word and
whitespace characters, but doesn't include them in the match.
Lookahead and lookbehind assertions can be very powerful when used correctly,
but they can also be tricky to work with. It's important to understand how
they work and how they interact with the rest of your regular expression
pattern.
Example of using positive and negative lookahead/lookbehind to match specific patterns.
Here are some examples of using positive and negative lookahead/lookbehind to
match specific patterns:
Matching email addresses that end with ".com" or ".org" using positive
lookahead:
import re
string = "my email is user@example.com"  # illustrative address
pattern = r"\b\w+@\w+\.(?=com|org)\b"
match = re.search(pattern, string)
# The lookahead checks for "com" or "org" but does not include it in the match
print(match.group()) # Output: 'user@example.'
Matching strings that don't contain a specific pattern using negative lookahead:
import re
string = "The quick brown fox jumps over the lazy dog"
pattern = r"^(?!.*cat).*fox.*$"
match = re.search(pattern, string)
print(match.group()) # Output: 'The quick brown fox jumps over the lazy dog'
In this example, the regular expression ^(?!.*cat).*fox.*$ matches any string that
contains the word "fox", but doesn't contain the word "cat". The ^ and
$ anchors match the beginning and end of the string, and the two .* parts
around "fox" match any number of characters before and after it. The negative
lookahead (?!.*cat), checked at the start of the string, ensures that the
pattern only matches if "cat" doesn't appear anywhere in the string.
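Lookbehind assertions can be used in the same way. The following sketch, with an illustrative input string, picks out numbers depending on whether a dollar sign comes directly before them.

```python
import re

text = "cost: $40, discount: 15, tax: $3"

# Positive lookbehind: digits that directly follow a dollar sign.
dollar_amounts = re.findall(r"(?<=\$)\d+", text)

# Negative lookbehind: digits not preceded by a dollar sign or another digit.
plain_numbers = re.findall(r"(?<![\$\d])\d+", text)

print(dollar_amounts, plain_numbers)
```

The \d inside the negative lookbehind is needed so that the "0" of "$40" is not picked up as a number of its own.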
Greediness:
import re
string = "abcabcabcab"
pattern = r"a.+b"
match = re.search(pattern, string)
print(match.group()) # Output: 'abcabcabcab'
In this example, the pattern a.+b matches the entire string abcabcabcab,
because the .+ quantifier is greedy and matches as many characters as possible.
import re
string = "abcabcabcab"
pattern = r"a.+?b"
match = re.search(pattern, string)
print(match.group()) # Output: 'abcab'
In this example, the pattern a.+?b matches the string abcab, because the .+?
quantifier is non-greedy and matches as few characters as possible.
In general, it's a good idea to use non-greedy quantifiers when you're matching
patterns that might occur multiple times in a string and you only want to
match the first occurrence. However, if you want to match the longest possible
string that matches a pattern, you should use a greedy quantifier.
import re
string = "abcabcabc"
# match the shortest possible string that starts with "a" and ends with "c"
pattern = r"a.+?c"
match = re.search(pattern, string)
print(match.group()) # Output: 'abc'
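A typical case where greediness matters is text with repeated delimiters. This sketch uses a made-up HTML-like string; it is meant only to show the difference between the two quantifiers, not as a way to parse HTML.

```python
import re

html = "<b>bold</b> and <i>italic</i>"  # illustrative input

# Greedy: .+ runs to the last '>' in the string.
greedy = re.findall(r"<.+>", html)

# Non-greedy: .+? stops at the first '>' after each '<'.
lazy = re.findall(r"<.+?>", html)

print(greedy, lazy)
```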
Escape characters:
Here are some examples of escape characters and their special meanings in
regular expressions:
import re
string = "The quick brown fox jumps. The lazy dog sleeps."
# match the period character followed by a space
pattern = r"\. "
matches = re.findall(pattern, string)
print(matches) # Output: ['. ']
In this example, we're using the escape character \ to match the period
character (.) as a literal character. The regular expression pattern \. matches a
period character, and we're using a space character after it to match the
period character followed by a space. The findall() function returns a list of all
matches found in the string. In this case there is just one, because only the
first period is followed by a space.
import re
string = "The answer is 42."
# match any digit character
pattern = r"\d"
matches = re.findall(pattern, string)
print(matches) # Output: ['4', '2']
In this example, the regular expression pattern \d matches any digit character
in the string, which in this case are the digits 4 and 2.
\w: Matches any word character (i.e. alphanumeric characters and underscores).
import re
string = "Hello, World!"
# match any word character
pattern = r"\w"
matches = re.findall(pattern, string)
print(matches) # Output: ['H', 'e', 'l', 'l', 'o', 'W', 'o', 'r', 'l', 'd']
In this example, the regular expression pattern \w matches any word character
in the string, which are the letters in the words "Hello" and "World".
import re
string = "The quick brown fox\njumps over the lazy dog."
# match any whitespace character
pattern = r"\s"
matches = re.findall(pattern, string)
print(matches) # Output: [' ', ' ', ' ', '\n', ' ', ' ', ' ', ' ']
Flags:
import re
string = "Hello, World!"
# match the letter "h" in a case-insensitive manner
pattern = r"h"
matches = re.findall(pattern, string, re.I)
print(matches) # Output: ['H']
In this example, the regular expression pattern h matches the lowercase letter
"h" in the string, but with the re.I flag, it also matches the uppercase letter "H".
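import re
string = "apple\nbanana\ncherry"
# match the first character of each line
pattern = r"^\w"
matches = re.findall(pattern, string, re.M)
print(matches) # Output: ['a', 'b', 'c']
In this example, the ^ in the pattern ^\w matches at the start of each line
in the string, rather than just at the start of the entire string, so \w picks
up the first character of every line. The ^ on its own matches an empty
position, so findall() with the pattern ^ would return only empty strings.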
import re
string = "apple\nbanana\ncherry"
# match any character, including newlines
pattern = r"."
matches = re.findall(pattern, string, re.S)
print(matches)
# Output: ['a', 'p', 'p', 'l', 'e', '\n', 'b', 'a', 'n', 'a', 'n', 'a', '\n', 'c',
# 'h', 'e', 'r', 'r', 'y']
In this example, the regular expression pattern . matches any character in the
string, including the newline characters between the lines of text.
Other commonly used flags include re.UNICODE or re.U for Unicode matching,
and re.ASCII or re.A for ASCII matching.
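The difference between Unicode and ASCII matching shows up with accented letters. A small sketch, where the sample text "café naïve" is written with escape sequences so the source stays plain ASCII:

```python
import re

text = "caf\u00e9 na\u00efve"  # illustrative input with two accented letters

# By default, \w in a str pattern also matches non-ASCII letters.
unicode_words = re.findall(r"\w+", text)

# With re.ASCII, \w matches only [a-zA-Z0-9_], so accented letters split words.
ascii_words = re.findall(r"\w+", text, re.ASCII)

print(unicode_words, ascii_words)
```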
Examples of flags, such as i for case-insensitive matching and g for global matching.
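Here are some examples of flags and their usage, written in the /pattern/flags
notation used by JavaScript and similar engines:
i (case-insensitive): This flag makes matching ignore letter case. For example,
the regular expression /hello/i would match "Hello" and "HELLO".
g (global): This flag makes the engine find all matches in the text rather than
stopping after the first one.
m (multiline): This flag makes ^ and $ match at the start and end of each line.
For example, the regular expression /^hello/m would match "hello" at the
beginning of any line in a multi-line string.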
s (dot-all): This flag is used to match any character, including newlines, with
the dot (.) character. For example, the regular expression /hello.world/s would
match "hello\nworld" as a single string.
u (unicode): This flag is used to match patterns in Unicode mode. This allows
the regular expression engine to handle Unicode characters correctly.
y (sticky): This flag is used to perform a "sticky" search, which matches only at
the position indicated by the lastIndex property of the RegExp object.
These are just a few examples of the flags available in regular expressions.
Best practices:
Keep it simple: Start with a simple regular expression that matches the basic
pattern you're looking for. Then, gradually add complexity as needed.
Use character classes: Use character classes like [a-z], [A-Z], and [0-9] to
match specific characters.
Be specific: Avoid using the dot (.) character unless you really need to match
any character. Instead, be as specific as possible in your regular expression.
Use anchors: Use the ^ and $ characters to anchor your regular expression to
the beginning and end of a string, respectively.
Use non-capturing groups: Use non-capturing groups (?:...) when you don't
need to capture the matched group. This can improve performance.
Use quantifiers wisely: Use quantifiers like *, +, and ? sparingly, and make sure
they are used in the right context.
Test your regular expressions: Use a tool like regex101.com to test your regular
expressions and make sure they match what you expect.
Document your regular expressions: If you're using a regular expression in your
code, add comments to explain what it does and why it's necessary.
Break it down: If your regular expression is getting too complex, break it down
into smaller, more manageable parts.
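In Python, one way to follow the last two tips is the re.VERBOSE flag, which lets a pattern be split over several lines with a comment on each part. The date pattern below is a hypothetical example.

```python
import re

# Hypothetical pattern for dates written as YYYY-MM-DD.
date_pattern = re.compile(
    r"""
    ^
    (?P<year>\d{4})   # four-digit year
    -
    (?P<month>\d{2})  # two-digit month
    -
    (?P<day>\d{2})    # two-digit day
    $
    """,
    re.VERBOSE,
)

match = date_pattern.match("2024-01-15")
parts = match.groupdict()
rejected = date_pattern.match("15/01/2024") is None

print(parts, rejected)
```

In verbose mode, whitespace in the pattern is ignored and # starts a comment, so a literal space or # has to be escaped.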
Examples of common mistakes to avoid, such as overly complex expressions and redundant
character classes.
Here are some common mistakes to avoid when writing regular expressions:
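Redundant character classes: Avoid spelling out character classes that have a
shorter form. For example, [a-zA-Z] can be written as [a-z] with the
case-insensitive flag (i), and [0-9] as \d. Keep in mind that the escape
sequences are not always exact equivalents: \w also matches digits and the
underscore.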
Overuse of the dot (.) character: The dot character matches any character,
which can lead to unexpected matches. Instead, be as specific as possible in your
regular expression.
Greedy quantifiers: Greedy quantifiers like * and + can cause your regular
expression to match more than you intend. Use non-greedy quantifiers like *?
and +? to match the smallest possible string.
Using too many optional groups: Optional groups can make your regular
expression more complex and harder to read. Use them sparingly, and only
when they are necessary.
Not using anchors: Use the ^ and $ characters to anchor your regular expression
to the beginning and end of a string, respectively. This will ensure that your
regular expression only matches what you expect.
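The effect of missing anchors can be shown with a small validation sketch. The function name is_five_digit_code is hypothetical.

```python
import re


def is_five_digit_code(text):
    """Return True only if text consists of exactly five digits."""
    return re.fullmatch(r"\d{5}", text) is not None


# Without anchors, five digits anywhere in the string are enough.
unanchored = bool(re.search(r"\d{5}", "abc1234567"))

# With anchors, the whole string has to be five digits.
anchored = bool(re.search(r"^\d{5}$", "abc1234567"))

# In Python, $ also matches just before a trailing newline; fullmatch does not.
dollar_allows_newline = bool(re.search(r"^\d{5}$", "12345\n"))
fullmatch_rejects_newline = not is_five_digit_code("12345\n")

print(unanchored, anchored, dollar_allows_newline, fullmatch_rejects_newline)
```

For validating a complete input in Python, re.fullmatch is often the simpler choice, since it does not depend on remembering both anchors.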