Python Interview Questions

The document provides an overview of Python data structures including lists, tuples, sets, and dictionaries, highlighting their mutability, memory usage, performance, and methods. It also discusses key differences between these structures and introduces various Python libraries essential for data engineering, data analysis, cloud services, and utility functions. Additionally, it covers concepts like decorators and generators, explaining their functionality and advantages in Python programming.
List:
 Mutable: Elements can be changed after creation.
 Memory Usage: Consumes more memory than an equivalent tuple.
 Performance: Slower iteration compared to tuples but better for insertion and
deletion operations.
 Methods: Offers various built-in methods for manipulation.

Tuple:
 Immutable: Elements cannot be changed after creation.
 Memory Usage: Consumes less memory than an equivalent list.
 Performance: Faster iteration compared to lists but lacks the flexibility of lists.
 Methods: Limited built-in methods.

Key Difference Between List, Tuple, Set, and Dictionary in Python

Mutability:
 List: Mutable (modifiable).
 Tuple: Immutable (non-modifiable).
 Set: Mutable, but elements inside must be immutable.
 Dictionary: Mutable; keys are immutable, but values can change.
Order:
 List: Maintains order of elements.
 Tuple: Maintains order of elements.
 Set: No guaranteed order.
 Dictionary: As of Python 3.7+, insertion order is preserved.
Uniqueness:
 List: Allows duplicates.
 Tuple: Allows duplicates.
 Set: Only unique elements.
 Dictionary: Unique keys, values can be duplicated.
Data Structure:
 List: Ordered collection.
 Tuple: Ordered collection.
 Set: Unordered collection.
 Dictionary: Collection of key-value pairs.
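The contrasts above can be sketched in a few lines (the values are illustrative):

```python
my_list = [1, 2, 2]         # ordered, mutable, duplicates allowed
my_tuple = (1, 2, 2)        # ordered, immutable, duplicates allowed
my_set = {1, 2, 2}          # unordered, unique elements only
my_dict = {"a": 1, "b": 2}  # key-value pairs, keys unique

print(my_set)               # {1, 2} - the duplicate 2 is dropped
my_list[0] = 99             # fine: lists are mutable
print(my_list)              # [99, 2, 2]
# my_tuple[0] = 99          # TypeError: tuples are immutable
```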
Dictionary
Unlike all other collection types, dictionaries strictly contain key-value pairs.
 In Python versions < 3.7: a dictionary is an unordered collection of data.
 In Python 3.1: the collections module introduced OrderedDict, which behaves like a regular dictionary except that it remembers insertion order (as the name suggests).
 From Python 3.7 onward: the built-in dictionary is an ordered collection of key-value pairs, and the order is guaranteed to be the insertion order, i.e. the order in which the items were inserted.
Syntax

Code:
dict1 = {"key1": "value1", "key2": "value2"}
dict2 = {}
dict3 = dict({1: "one", 2: "two", 3: "three"})
print(dict1)
print(dict2)
print(dict3)
Output:
{'key1': 'value1', 'key2': 'value2'}
{}
{1: 'one', 2: 'two', 3: 'three'}

Indexing
Code:
dict1 = {"one": 1, "two": 2, "three": 3}
print(dict1.keys())
print(dict1.values())
print(dict1['two'])
Output:
dict_keys(['one', 'two', 'three'])
dict_values([1, 2, 3])
2
Adding New Element
Code:
dict1 = {"India": "IN", "Russia": "RU", "Australia": "AU"}
dict1.update({"Canada": "CA"})
print(dict1)
dict1.pop("Australia")
print(dict1)
Output:
{'India': 'IN', 'Russia': 'RU', 'Australia': 'AU', 'Canada': 'CA'}
{'India': 'IN', 'Russia': 'RU', 'Canada': 'CA'}
Deleting Element
Code:
dict1 = {"India": "IN", "Russia": "RU", "Australia": "AU"}
dict1.pop('Russia')
print(dict1)
Output:
{'India': 'IN', 'Australia': 'AU'}
Sorting Elements
Code:
dict1 = {"India": "IN", "Russia": "RU", "Australia": "AU"}
print(sorted(dict1))
Output:
['Australia', 'India', 'Russia']
Searching Elements
Code:
dict1 = {"India": "IN", "Russia": "RU", "Australia": "AU"}
print(dict1['Australia'])
Output:
AU

Key Differences Between Dictionary, List, Set, and Tuple


Syntax
 Dictionary: Uses curly brackets { } with key-value pairs separated by
commas.
 List: Employs square brackets [ ] with comma-separated elements.
 Set: Utilizes curly brackets { } with comma-separated elements.
 Tuple: Employs parentheses ( ) with comma-separated elements.
Order
 Dictionary: Maintains insertion order in Python 3.7+ (order preservation was only an implementation detail in 3.6).
 List: Maintains order.
 Set: Unordered.
 Tuple: Maintains order.
Duplicate Data
 Dictionary: Keys are unique, values can be duplicated.
 List: Allows duplicate elements.
 Set: Does not allow duplicate elements.
 Tuple: Allows duplicate elements.
Indexing
 Dictionary: Key-based indexing.
 List: Integer-based indexing starting from 0.
 Set: No index-based mechanism.
 Tuple: Integer-based indexing starting from 0.
Adding Elements
 Dictionary: Assign a new key-value pair (d[key] = value) or use update().
 List: New items can be added using the append() method.
 Set: Uses the add() method.
 Tuple: Being immutable, new data cannot be added.
Deleting Elements
 Dictionary: Uses pop(key) to remove the specified key and its value.
 List: Uses pop() to delete an element by index, or remove() to delete by value.
 Set: Uses remove() or discard() for a specific element; pop() removes an arbitrary element.
 Tuple: Being immutable, no data can be removed.
Sorting Elements
 Dictionary: sorted(d) returns the keys in sorted order.
 List: Uses the sort() method to sort elements in place.
 Set: Has no order of its own, but sorted(s) returns a sorted list of its elements.
 Tuple: Cannot be sorted in place, but sorted(t) returns a sorted list.
Searching Elements
 Dictionary: Uses get(key) to retrieve the value for a key (d[key] raises KeyError if the key is missing).
 List: Uses index() to return the index of the first occurrence.
 Set: Has no indexing, but supports fast membership testing with the in operator.
 Tuple: Uses index() to return the index of the first occurrence.
Reversing Elements
 Dictionary: No integer-based indexing; reversed(d) (Python 3.8+) iterates keys in reverse insertion order.
 List: Uses the reverse() method to reverse elements in place.
 Set: Unordered, so reversing is not meaningful.
 Tuple: Has no reverse() method, but reversed(t) returns a reverse iterator.
Counting Elements
 Dictionary: count() is not defined for dictionaries; len(d) gives the number of keys.
 List: Uses count() to count occurrences of a specific element.
 Set: count() is not defined for sets; elements are unique anyway.
 Tuple: Uses count() to count occurrences of a specific element.

Most Useful Python Libraries for Data Engineering


DATA WORKFLOW AND PIPELINE LIBRARIES

1. Library: apache-airflow
The apache-airflow library is a widely used scheduler and monitoring tool for executing and managing tasks, batch jobs, and orchestrating data pipelines. Data engineers use it to manage tasks and dependencies within a data workflow that can handle a large number of tasks. It provides a simple UI and API that include scripting for failure handling and error recovery, all wrapped in a high-performance framework. It lets you define complex workflows as directed acyclic graphs (DAGs) of tasks, where the nodes represent the tasks to be executed and the edges represent the dependencies between them.

2. Library: kafka-python
Apache Kafka is a popular distributed messaging platform used for building real-time data pipelines and streaming applications. It stores data and replicates it across multiple servers, providing high availability and durability in case of server failures. The kafka-python library provides a high-level API for producing and consuming messages from Apache Kafka, as well as lower-level APIs for more advanced use cases such as asynchronous processing, which lets you send and receive messages without blocking the main thread of execution.

DATA ANALYSIS LIBRARIES

1. Library: pandas
Pandas is one of the most popular Python libraries for working with small- and
medium-sized datasets. Built on top of NumPy, Pandas (abbreviation for Python
Data Analysis Library) is ideal for data analysis and data manipulation. It’s
considered a must-have given its large collection of powerful features such as data
merging, handling missing data, data exploration, and overall efficiency. Data
engineers use it to quickly read data from various sources, perform analysis and
transformation operations on the data, and output the results in various formats.
Pandas is also frequently paired with other Python libraries for data engineering, such as scikit-learn for data analysis and machine learning tasks.

2. Library: pyarrow
Developed in part by Pandas creator Wes McKinney to address some of the scalability limits of Pandas, Apache Arrow uses the now-popular columnar data format for better performance and flexibility. The PyArrow library provides a
Python API for the functionality provided by the Arrow libraries, along with tools for
Arrow integration and interoperability with pandas, NumPy, and other software in the
Python ecosystem. For data engineers, pyarrow provides a scalable library to easily
integrate data from multiple sources into a single, unified, and large dataset for easy
manipulation and analysis.

CLOUD LIBRARIES

1. Library: boto3
AWS is one of the most popular cloud service providers so there’s no surprise that
boto3 is on top of the list. Boto3 is a Software Development Kit (SDK) library for
programmers to write software that makes use of a long list of Amazon services
including data engineer favorites such as Glue, EC2, RDS, S3, Kinesis, Redshift,
and Athena. In addition to performing common tasks such as uploading and
downloading data, and launching and managing EC2 instances, data engineers can
leverage Boto3 to programmatically access and manage many AWS services, and to build data pipelines and automate data workflow tasks.

2. Library: azure-core
From another of the top cloud providers, azure-core is a Python library and API for interacting with Azure cloud services; data engineers use it for accessing resources and automating engineering tasks. Common tasks include submitting and monitoring batch jobs, accessing databases, data containers, and data lakes, and generally managing resources such as virtual machines and containers. A related Python library is azure-storage-blob, built to manage, retrieve, and store large amounts of unstructured data such as images, audio, video, or text.

DATA AND BIG DATA LIBRARIES

1. Library: SQLAlchemy
SQLAlchemy is the Python SQL toolkit that provides a high-level interface for
interacting with databases. It allows data engineers to query data from a database
using SQL-like statements and perform common operations such as inserting,
updating, and deleting data from a database. SQLAlchemy also provides support for
object-relational mapping (ORM), which allows data engineers to define the structure
of their database tables as Python classes and map those classes to the actual
database tables. SQLAlchemy provides a full suite of well-known enterprise-level
persistence patterns, designed for efficient and high-performing database access
such as connection pooling and connection reuse.
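As a brief illustration of the SQL-toolkit side, here is a minimal SQLAlchemy (1.4/2.x style) sketch against an in-memory SQLite database; the table and column names are invented for the example:

```python
from sqlalchemy import create_engine, text

# in-memory SQLite keeps the example self-contained
engine = create_engine("sqlite:///:memory:")

with engine.begin() as conn:  # begin() commits automatically on success
    conn.execute(text("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"))
    conn.execute(
        text("INSERT INTO users (name) VALUES (:name)"),
        [{"name": "Ada"}, {"name": "Grace"}],
    )

with engine.connect() as conn:
    rows = conn.execute(text("SELECT name FROM users ORDER BY id")).fetchall()

print([row[0] for row in rows])  # ['Ada', 'Grace']
```

The same engine object would also back the ORM layer described above; only the query style changes.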

2. Library: pyspark
Apache Spark is one of the most popular open-source data engineering platforms
thanks to its scalable design that lets it process large amounts of data fast, and
makes it ideal for tasks that require real-time processing or big data analysis
including ETL, machine learning, and stream processing. It can also easily integrate
with other platforms, such as Hadoop and other big data platforms, making it easier
for data engineers to work with a variety of data sources and technologies. The
PySpark library allows data engineers to work with a wide range of data sources and
formats, including structured data, unstructured data, and streaming data.

UTILITY LIBRARIES

1. Library: python-dateutil
The need to manipulate date and time is ubiquitous in Python, and often the built-in
datetime module doesn’t suffice. The dateutil module is a popular extension to the
standard datetime module. If you’re seeking to implement timezones, calculate time
deltas, or want more powerful generic parsing, then this library is a good choice.
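For instance, two things the standard datetime module does not do directly, flexible parsing and calendar-aware deltas (the date string here is made up):

```python
from dateutil import parser
from dateutil.relativedelta import relativedelta

# parse a free-form date string without specifying a format
dt = parser.parse("March 5, 2024 10:30 AM")
print(dt)  # 2024-03-05 10:30:00

# add one calendar month, not a fixed number of days
print(dt + relativedelta(months=1))  # 2024-04-05 10:30:00
```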

Python Decorators:
Python decorators are a powerful aspect of the language that allow you to modify the
behavior of functions or methods. They are functions themselves that wrap around
another function, enabling you to add functionality to existing code without modifying
it. Let’s dive into decorators with a simple example.

What Are Decorators?


Decorators in Python use the @decorator_function syntax and are essentially
functions that take another function as an argument and return a new function. They
help in modifying or enhancing the behavior of the original function.
Example: Creating a Decorator
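The code for this example did not survive formatting; a minimal reconstruction consistent with the explanation below (the printed messages are illustrative) is:

```python
def my_decorator(func):
    def wrapper():
        print("Before the function call")  # extra behavior added by the decorator
        func()
        print("After the function call")
    return wrapper

@my_decorator
def say_hello():
    print("Hello!")

say_hello()
# Before the function call
# Hello!
# After the function call
```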

Explanation:
 my_decorator is a function that takes another function (func) as an argument.
 wrapper is a nested function within my_decorator that adds extra functionality
before and after the original function (func) is called.
 @my_decorator is used above the say_hello function declaration, indicating
that say_hello will be passed to my_decorator as an argument.
 When say_hello is called, it actually executes the wrapper function created by
the decorator my_decorator. This allows for the additional behavior to be
added before and after the original say_hello function execution.
Conclusion:
Decorators are a versatile tool in Python that enable you to modify the behavior of
functions without changing their actual code. They are widely used in frameworks
like Flask and Django for tasks such as authentication, logging, and more.
Understanding decorators can greatly enhance your ability to write clean, reusable,
and efficient code.

Generators in Python

Generators are a powerful tool in Python, providing a way to create iterators in a more memory-efficient and readable manner. They allow for the creation of lazy iterables that generate values on the fly without the need to store the entire sequence in memory. This article delves into what generators are, how they work, their advantages, and practical use cases.

What Are Generators?


Generators are a type of iterable, like lists or tuples, but unlike lists, they do not store
their contents in memory. Instead, they generate values on the fly as you iterate over
them. Generators are defined using functions and the yield statement, which allows
them to produce a sequence of values over time rather than computing and storing
all the values upfront.
How Do Generators Work?
Generators are defined using a function with one or more yield statements. When a
generator function is called, it returns a generator object without starting execution
immediately. Each time the next() method is called on the generator object, the
generator function executes until it reaches the yield statement, where it pauses and
returns the yielded value. The state of the generator function, including local
variables and the point of execution, is saved. Execution resumes from this state the
next time next() is called.
Here’s a simple example of a generator:
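The snippet itself was lost in formatting; a version matching the description is:

```python
def count_up_to(max_value):
    count = 1
    while count <= max_value:
        yield count      # pause here and hand back a value
        count += 1

counter = count_up_to(3)
print(next(counter))  # 1
print(next(counter))  # 2
print(list(counter))  # [3] - only the values not yet produced
```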

In this example, count_up_to is a generator function that yields numbers from 1 up to a specified maximum value.
Advantages of Generators
1. Memory Efficiency: Generators are much more memory efficient than lists,
especially when dealing with large data sets. They generate values on the fly
and do not require the entire sequence to be stored in memory.
2. Lazy Evaluation: Generators compute values only when needed, which can
improve performance and responsiveness in applications where the full
sequence of values is not required all at once.
3. Readable and Concise Code: Generators can lead to more readable and
concise code, particularly in situations where you need to create iterators or
perform complex data processing in a step-by-step manner.
Use Cases for Generators
Reading Large Files: Generators are ideal for reading large files line by line without
loading the entire file into memory.
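A sketch of that file-reading pattern (the temporary file here is a small stand-in for a genuinely large one):

```python
import os
import tempfile

def read_lines(path):
    # yield one line at a time instead of loading the whole file into memory
    with open(path) as f:
        for line in f:
            yield line.rstrip("\n")

# create a small demo file
with tempfile.NamedTemporaryFile("w", delete=False, suffix=".txt") as tmp:
    tmp.write("alpha\nbeta\ngamma\n")

lines = list(read_lines(tmp.name))
print(lines)  # ['alpha', 'beta', 'gamma']
os.remove(tmp.name)
```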
Python Iterators and Iterables

Introduction

This blog will help you understand the following concepts:


 What is an iterable
 What is an iterator
 How iteration works under a for loop
 What is the iterator protocol
 What is lazy evaluation
 Benefits of an iterator
What is an Iterable:
An iterable is any Python object that can return its members one at a time and can be iterated over with a for loop. For example, list, string, and tuple are iterables, because we can extract their members one by one and iterate over them with a for loop.
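A minimal version of the example referenced below:

```python
nums = [1, 2, 3]   # a list is an iterable
for n in nums:     # so it can be used directly in a for loop
    print(n)
```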

In the example above we created a list, nums, and then iterated over it using a for loop.
So the question now is: how do we know whether an object is an iterable or not?
Answer: If the object has an __iter__() method (also called a dunder or magic
method), then it is an iterable, and we can check whether an object has an __iter__()
method using the built-in dir() function.
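The check described above looks like this:

```python
nums = [1, 2, 3]
print('__iter__' in dir(nums))  # True  -> a list is an iterable
print('__iter__' in dir(42))    # False -> an int is not
```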

Now it's clear: any object that has an __iter__() method is an iterable.
What is an Iterator:
An iterator is an object that stores the current state of iteration and produces the
next value when you pass it to the next() function. Any object that has a __next__() method is
therefore an iterator. We can create an iterator object by applying the iter() built-in
function to an iterable.
We can use next() to fetch data from an iterator in sequence; once the data is
consumed, it raises a StopIteration exception.
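For example:

```python
nums = [1, 2]
it = iter(nums)    # build an iterator from the iterable
print(next(it))    # 1
print(next(it))    # 2
try:
    next(it)       # the data is consumed, so this raises
except StopIteration:
    print("StopIteration raised")
```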
How Iteration Works Under a for Loop:
We can use a for loop in Python to iterate over an iterable such as a string, list, or tuple.
But how is this actually implemented? Let's have a look.
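The original snippet was lost in formatting; the equivalent of a for loop, written out by hand, is:

```python
nums = [10, 20, 30]
it = iter(nums)            # step 1: the for loop calls iter() on the iterable
collected = []
while True:
    try:
        n = next(it)       # step 2: it repeatedly calls next()
    except StopIteration:  # step 3: it silently catches StopIteration and stops
        break
    collected.append(n)
print(collected)  # [10, 20, 30]
```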

From the code above we can see that a for loop internally uses a while loop
together with an iterator.
What is the iterator protocol:
The Python iterator protocol involves two functions: iter() and next(). The
iter() function converts an iterable object into an iterator, and the next() function
fetches the next value.
What is lazy evaluation:
Iterators let us create lazy iterables that don't do any work until we ask them for the
next item.
Because of their laziness, iterators can help us deal with infinitely long iterables.
In some cases we can't even store all the information in memory, so we can
create an iterator that gives us the next element whenever we ask for it.
Iterators save memory and CPU time, and this approach is called lazy
evaluation.
What are the advantages of an iterator?
 Iterators in Python save resources. To get all the elements, only one element is
stored in memory at a time, unlike a list or tuple where all the values are
stored at once.
 For smaller datasets, iterator- and list-based approaches have similar
performance; for larger datasets, iterators save both time and space.
 Cleaner code.
 Iterators can work with infinite sequences.

Pandas for Data Engineering!

When using pandas for data engineering, several key concepts are particularly
important:
1. DataFrames and Series:
 DataFrames: The primary data structure in pandas, representing tabular data
with rows and columns.
 Series: A one-dimensional array-like object containing a sequence of values,
which can be of any data type.
2. Data Loading and Saving:
 Reading data: Using functions
like pd.read_csv(), pd.read_excel(), pd.read_sql(), etc., to load data from
various file formats and databases.
 Writing data: Using functions like df.to_csv(), df.to_excel(), df.to_sql(), etc., to
save data into various file formats and databases.
3. Data Cleaning and Preparation:
 Handling missing values: Methods such as df.isnull(), df.dropna(),
and df.fillna().
 Data type conversion: Functions like df.astype().
 String manipulation: Using the .str accessor for string operations on Series.
4. Indexing and Selecting Data:
 Indexing: Using df.set_index() and df.reset_index().
 Selecting data: Using .loc[], .iloc[], and boolean indexing for accessing
specific rows and columns.
5. Data Transformation:
 Aggregation: Using df.groupby() for aggregating data by groups.
 Pivot tables: Using df.pivot_table() for summarizing data.
 Merging and joining: Using pd.merge(), df.join(), and pd.concat() for
combining multiple DataFrames.
6. Reshaping Data:
 Melt and Pivot: Using pd.melt() to transform DataFrames from wide to long
format and df.pivot() to transform DataFrames from long to wide format.
 Stack and Unstack: Using df.stack() and df.unstack() to reshape the data.
7. Time Series Data:
 Datetime operations: Using pd.to_datetime(), df.resample(), and the .dt accessor
for time-based operations.
 Rolling and expanding windows: Using df.rolling() and df.expanding() for
calculating rolling statistics.
8. Performance Optimization:
 Vectorization: Avoiding loops by using pandas’ built-in functions that operate
on entire columns or DataFrames.
 Memory usage: Optimizing memory usage by downcasting data types
with pd.to_numeric().
9. Visualization:
 Plotting: Using df.plot() for basic visualizations and integrating with libraries
like Matplotlib and Seaborn for advanced visualizations.
10. Advanced Data Manipulation:
 Apply functions: Using df.apply() and df.applymap() for applying functions to
DataFrame elements.
 Lambda functions: Using lambda functions for inline operations
within apply() and other pandas methods.
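A few of the concepts above combined in one small sketch (the data is invented for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "region": ["east", "west", "east", "west"],
    "sales":  [100.0, 80.0, None, 60.0],
})

df["sales"] = df["sales"].fillna(0)           # 3. handle missing values
totals = df.groupby("region")["sales"].sum()  # 5. aggregate by group
print(totals.to_dict())  # {'east': 100.0, 'west': 140.0}
```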
Pandas vs PySpark..!

1. Definitions
1.1 What is PySpark?
PySpark is the Python library for Spark programming. It allows you to use the
powerful and efficient data processing capabilities of Apache Spark from within the
Python programming language. PySpark provides a high-level API for distributed
data processing that can be used to perform common data analysis tasks, such as
filtering, aggregation, and transformation of large datasets.
1.2 What is Pandas?
Pandas is a Python library for data manipulation and analysis. It provides powerful
data structures, such as the DataFrame and Series, that are designed to make it
easy to work with structured data in Python. With pandas, you can perform a wide
range of data analysis tasks, such as filtering, aggregation, and transformation of
data, as well as data cleaning and preparation.
Both definitions look more or less the same, but there is a difference in their
execution and processing architecture. Let’s go over some major differences
between these two.
2. Key Differences between PySpark and Pandas
1. PySpark is a library for working with large datasets in a distributed computing
environment, while pandas is a library for working with smaller, tabular
datasets on a single machine.
2. PySpark is built on top of the Apache Spark framework and uses the Resilient
Distributed Datasets (RDD) data structure, while pandas uses the DataFrame
data structure.
3. PySpark is designed to handle data processing tasks that are not feasible
with pandas due to memory constraints, such as iterative algorithms and
machine learning on large datasets.
4. PySpark allows for parallel, distributed processing of data, while pandas
runs on a single machine.
5. PySpark can read data from a variety of sources, including Hadoop
Distributed File System (HDFS), Amazon S3, and local file systems,
while pandas primarily works with local files (though it can also read from
URLs and cloud storage via optional dependencies).
6. PySpark can be integrated with other big data tools like Hadoop and Hive,
while pandas cannot.
7. Spark itself is written in Scala and runs on the Java Virtual Machine (JVM),
with PySpark as its Python API, while pandas is written in Python (with
performance-critical parts in C via NumPy).
8. PySpark has a steeper learning curve than pandas, due to the additional
concepts and technologies involved (e.g. distributed computing, RDDs, Spark
SQL, Spark Streaming, etc.).
How to decide which library to use — PySpark vs Pandas
The decision of whether to use PySpark or pandas depends on the size and
complexity of the dataset and the specific task you want to perform.
1. Size of the dataset: PySpark is designed to handle large datasets that are
not feasible to work with on a single machine using pandas. If you have a
dataset that is too large to fit in memory, or if you need to perform iterative or
distributed computations, PySpark is the better choice.
2. Complexity of the task: PySpark is a powerful tool for big data processing
and allows you to perform a wide range of data processing tasks, such as
machine learning, graph processing, and stream processing. If you need to
perform any of these tasks, PySpark is the better choice.
3. Learning Curve: PySpark has a steeper learning curve than pandas, as it
requires knowledge of distributed computing, RDDs, and Spark SQL. If you
are new to big data processing and want to get started quickly, pandas may
be the better choice.
4. Resources available: PySpark requires a cluster or distributed system to run,
so you will need access to the appropriate infrastructure and resources. If you
do not have access to these resources, then pandas is a good choice.
In summary, use PySpark for large datasets and complex tasks that are not feasible
with pandas, and use pandas for small datasets and simple tasks that can be
handled on a single machine.

What is NumPy and Why is it Important? 🤔


Roughly speaking, NumPy (short for Numerical Python) is an open source library
that provides support for large, multidimensional arrays and matrices, along with a
collection of mathematical functions to operate on these arrays. What sets NumPy
apart is its speed and efficiency, thanks to its C implementation and its focus on
vectorized operations.
The importance of NumPy in the field of data science and numerical analysis is
immense:
1. Efficient Data Manipulation: NumPy allows complex numerical
computations to be performed more efficiently than with native Python data
structures. Its focus on vectorization minimizes the need for explicit loops and
iterative operations, which are less efficient in pure Python.
2. Foundation for High-Level Libraries: Popular libraries in the field of
machine learning and data science, such as Pandas, Matplotlib, and
Scikit-learn, are built on top of NumPy. Understanding NumPy provides a solid
foundation for working with these advanced tools.
3. Applications in Diverse Fields: From physics and engineering to
bioinformatics and finance, NumPy is used to perform data analysis and
mathematical modeling in a multitude of disciplines.
In short, NumPy is not just another library in a data scientist’s arsenal; it is the
backbone of scientific computing in Python, facilitating the analysis of complex data
and the implementation of numerical algorithms.
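A tiny example of the vectorized style described above (the numbers are illustrative):

```python
import numpy as np

prices = np.array([10.0, 20.0, 30.0])
quantities = np.array([2, 3, 1])

# one vectorized expression instead of an explicit Python loop
revenue = prices * quantities
print(revenue)        # [20. 60. 30.]
print(revenue.sum())  # 110.0
```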

What is __init__() in Python?

The __init__() method is known as a constructor in object-oriented programming (OOP) terminology. It is used to initialize an object's state when it is created. This method is automatically called when a new instance of a class is instantiated.
Purpose:
 Assign values to object properties.
 Perform any initialization operations.
Example:
We have created a `book_shop` class with a constructor and a `book()` function. The
constructor stores the book title, and the `book()` function prints the book name.
To test our code, we initialize the `b` object with "Sandman" and call the
`book()` function.
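The code itself was dropped in formatting; a reconstruction matching the description is:

```python
class book_shop:
    # constructor: stores the book title on the new object
    def __init__(self, title):
        self.title = title

    def book(self):
        print(self.title)

b = book_shop("Sandman")  # __init__() is called automatically here
b.book()                  # Sandman
```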
What is the difference between a mutable data type and an immutable data
type?

Mutable data types:


 Definition: Mutable data types are those that can be modified after their
creation.
 Examples: List, Dictionary, Set.
 Characteristics: Elements can be added, removed, or changed.
 Use Case: Suitable for collections of items where frequent updates are
needed.

Immutable data types:


 Definition: Immutable data types are those that cannot be modified after their
creation.
 Examples: Numeric (int, float), String, Tuple.
 Characteristics: Elements cannot be changed once set; any operation that
appears to modify an immutable object will create a new object.
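A short sketch of the difference:

```python
nums = [1, 2, 3]
nums.append(4)           # lists are mutable: modified in place
print(nums)              # [1, 2, 3, 4]

s = "hello"
t = s.upper()            # strings are immutable: a new object is created
print(s, t)              # hello HELLO

point = (1, 2)
try:
    point[0] = 5         # tuples are immutable
except TypeError:
    print("tuples cannot be modified")
```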

Why use else in try/except construct in Python?

`try:` and `except:` are commonly known for exception handling in Python, so
where does `else:` come in handy? The `else:` block runs when no exception is
raised.
Example:
Let’s learn more about `else:` with a couple of examples.
1. On the first try, we enter 2 as the numerator and "d" as the denominator,
which is invalid, so `except:` is triggered and prints "Invalid input!".
2. On the second try, we enter 2 as the numerator and 1 as the denominator
and get the result 2. No exception is raised, so the `else:` block prints
the message "Division is successful."
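The original snippet was lost in formatting; a sketch matching the description, using a function argument in place of the original interactive input():

```python
def divide(numerator, denominator):
    try:
        result = int(numerator) / int(denominator)
    except (ValueError, ZeroDivisionError):
        print("Invalid input!")
    else:
        # runs only when no exception was raised in the try block
        print("Division is successful.")
        return result

print(divide(2, "d"))  # Invalid input! -> returns None
print(divide(2, 1))    # Division is successful. -> returns 2.0
```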
A Comprehensive Guide to Python String Functions

1. Length of a String: len()


The len() function returns the number of characters in a string.
text = "Hello, Python"
length = len(text)
print(length) # Output: 13
2. Concatenation: +
You can concatenate two strings using the + operator.
str1 = "Hello"
str2 = "Python"
result = str1 + " " + str2
print(result) # Output: "Hello Python"
3. String Repetition: *
You can repeat a string multiple times using the * operator.
text = "Python"
repeated_text = text * 3
print(repeated_text) # Output: "PythonPythonPython"
4. Accessing Characters: Indexing
You can access individual characters of a string using indexing. Python uses zero-based indexing.
text = "Python"
first_char = text[0] # 'P'
second_char = text[1] # 'y'
5. Slicing: [:]
Slicing allows you to extract a substring from a string by specifying a start and end
index (the end index is excluded).
text = "Python Programming"
substring = text[7:14] # "Program"
6. Upper and Lower Case: upper() and lower()
You can convert a string to uppercase or lowercase using these methods.
text = "Python"
uppercase_text = text.upper() # "PYTHON"
lowercase_text = text.lower() # "python"
7. String Replacement: replace()
The replace() function replaces all occurrences of a substring with another string.
text = "I love programming in Python."
new_text = text.replace("Python", "Java")
print(new_text)
# Output: "I love programming in Java."
8. Counting Substrings: count()
You can count the number of occurrences of a substring in a string using count().
text = "Python is easy to learn. Python is fun."
count = text.count("Python")
print(count) # Output: 2
9. Checking Prefix and Suffix: startswith() and endswith()
These functions check if a string starts with or ends with a given substring.
text = "Hello, World!"
startsWithHello = text.startswith("Hello") # True
endsWithWorld = text.endswith("World!") # True
10. Splitting a String: split()
You can split a string into a list of substrings based on a delimiter.
text = "apple,banana,cherry"
fruits = text.split(",")
print(fruits) # Output: ['apple', 'banana', 'cherry']
11. Joining a List: join()
You can join a list of strings into a single string using the join() method.
fruits = ['apple', 'banana', 'cherry']
text = ", ".join(fruits)
print(text) # Output: "apple, banana, cherry"
These are some of the essential string functions in Python. They provide powerful
tools for working with text data in various ways. Depending on your needs, you can
leverage these functions to manipulate and process strings effectively in your Python
programs.
