Python Developer Interview: Deep Dive Q&A Document
1. Can you tell me what Python is and why you chose it as your primary
language?
Answer: Python is a high-level, interpreted programming language known for its simplicity,
readability, and vast community support. I chose Python because it allows rapid development
and has extensive libraries for data analysis, web development, automation, and more.
2. You mentioned Python has a large library ecosystem. Can you name a
few libraries you've used, and in what kind of projects?
Answer: Yes, I've worked with:
● Pandas for data manipulation in analysis projects
● NumPy for numerical computing
● Matplotlib/Seaborn for data visualization
● Scikit-learn for building machine learning models
● Requests for APIs
● Flask for web application backend
3. Let’s take Pandas for example. How does Pandas handle missing data in
a DataFrame?
Answer: Pandas represents missing data with sentinel values: NaN for numeric data, None or NaN in
object columns, and NaT for datetimes. isnull() (alias isna()) detects them, dropna() removes
them, and fillna() replaces them.
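As a small illustration of the detection step, the sketch below counts missing values per column. The DataFrame and its column names are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one missing score and one missing city.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy"],
    "score": [90.0, np.nan, 75.0],
    "city": ["Pune", None, "Delhi"],
})

# isnull() (alias: isna()) returns a boolean mask with the same shape.
mask = df.isnull()

# Summing the mask counts the missing values in each column.
missing_per_column = mask.sum()
print(missing_per_column)
```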
4. What’s the difference between fillna() and dropna()? Can you give
an example where using one over the other would be better?
Answer:
● fillna() fills missing values with a specified value or method (e.g., forward fill).
● dropna() removes rows or columns with missing values.
Use fillna() when the rows are too valuable to lose and the gaps can be imputed sensibly.
Use dropna() when the missing data is negligible and can be safely removed.
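A minimal sketch of that choice, assuming a hypothetical orders table where amount is required and coupon is optional:

```python
import numpy as np
import pandas as pd

# Hypothetical data: "amount" is required, "coupon" is optional.
df = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "amount": [100.0, np.nan, 250.0, 80.0],
    "coupon": ["A", None, None, "B"],
})

# dropna(): discard rows where a required field is missing.
required = df.dropna(subset=["amount"])

# fillna(): keep every row and impute column by column.
imputed = df.fillna({"amount": df["amount"].median(), "coupon": "none"})
```

Here dropna() loses one of the four rows, while fillna() keeps all four at the cost of an imputed amount.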
5. Okay, now imagine you’re working with a large dataset, and your
fillna() function is taking too long. What could be the reason? How
would you optimize it?
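Answer: On a large dataset, calling fillna() on the whole DataFrame touches every column and
returns a new copy by default, so memory pressure and unnecessary work are the usual causes.
Optimization tips: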
● Use vectorized operations
● Avoid looping
● Use appropriate data types
● Consider filling specific columns rather than the entire DataFrame
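The tips above could look roughly like this in practice. The column names and the row count are stand-ins, and the actual gain depends on the data, so this is a sketch, not a benchmark.

```python
import numpy as np
import pandas as pd

n = 100_000  # stand-in size for illustration
df = pd.DataFrame({
    "price": np.where(np.arange(n) % 10 == 0, np.nan, 1.5),
    "volume": np.arange(n, dtype="float64"),
    "note": ["ok"] * n,
})

# Fill only the column that actually has gaps, with one vectorized call
# instead of a Python-level loop over rows.
df["price"] = df["price"].fillna(0.0)

# A smaller dtype reduces the memory pandas has to read and copy.
before = df["volume"].memory_usage(deep=True)
df["volume"] = df["volume"].astype("float32")
after = df["volume"].memory_usage(deep=True)
```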
6. Did you ever need to merge multiple DataFrames? Can you explain the
difference between merge(), join(), and concat()?
Answer:
● merge() is similar to SQL join (inner, outer, left, right)
● join() is a method for joining on indexes
● concat() combines along a particular axis (rows/columns)
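A short side-by-side sketch of the three, using made-up orders and customers tables:

```python
import pandas as pd

# Hypothetical tables for illustration.
orders = pd.DataFrame({"order_id": [1, 2, 3], "cust_id": [10, 20, 30]})
customers = pd.DataFrame({"cust_id": [10, 20], "name": ["Ana", "Ben"]})

# merge(): SQL-style join on a key column.
merged = orders.merge(customers, on="cust_id", how="left")

# join(): matches against the index of the right-hand frame.
joined = orders.join(customers.set_index("cust_id"), on="cust_id")

# concat(): stacks frames along an axis (rows here).
more_orders = pd.DataFrame({"order_id": [4], "cust_id": [10]})
stacked = pd.concat([orders, more_orders], ignore_index=True)
```

In this example merge() and join() produce the same result; the difference is whether the key lives in a column or in the index.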
7. What if two columns you’re merging have the same name? How does
Pandas handle that? What can you do to avoid conflicts?
Answer: Pandas adds suffixes like _x, _y. To avoid confusion, we can use the suffixes
parameter or rename columns beforehand.
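For example, with two hypothetical quarterly tables that both have a sales column:

```python
import pandas as pd

# Hypothetical quarterly tables that share the column name "sales".
q1 = pd.DataFrame({"product": ["A", "B"], "sales": [100, 200]})
q2 = pd.DataFrame({"product": ["A", "B"], "sales": [150, 250]})

# Default behaviour: overlapping columns get _x and _y.
default = q1.merge(q2, on="product")

# Explicit suffixes make the origin of each column obvious.
named = q1.merge(q2, on="product", suffixes=("_q1", "_q2"))

print(list(default.columns))
print(list(named.columns))
```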
8. Can you explain the difference between a list and a tuple?
Answer:
● List: Mutable, dynamic, defined using []
● Tuple: Immutable, static, defined using ()
9. Tuples are immutable, right? But why would someone use a tuple over a
list in real scenarios?
Answer: Tuples suit fixed collections (like coordinates), are slightly cheaper to create and store
than lists, and are hashable when all their elements are hashable (so they can be used as
dictionary keys or set members).
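A quick sketch of the dictionary-key point, using a hypothetical grid of readings keyed by (row, column):

```python
# Hypothetical grid of sensor readings keyed by (row, column).
readings = {(0, 0): 21.5, (0, 1): 22.1}
readings[(1, 0)] = 20.9

# A list cannot be used as a key because it is mutable and unhashable.
try:
    readings[[1, 1]] = 19.8
except TypeError as exc:
    error_message = str(exc)

# A tuple is only hashable if everything inside it is hashable.
try:
    hash((1, [2, 3]))
    nested_ok = True
except TypeError:
    nested_ok = False
```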
10. If tuples are faster, why not use them everywhere instead of lists?
Answer: Lists offer flexibility to add, remove, or change elements dynamically, which makes
them better for mutable data.
🧠 Algorithm/Logic Chain
11. Write a function to count how many times each character appears in a
string.
def count_chars(s):
    counts = {}
    for char in s:
        if char in counts:
            counts[char] += 1
        else:
            counts[char] = 1
    return counts
12. Modify it to be case-insensitive and ignore special characters.
import re

def count_chars_cleaned(s):
    # Keep ASCII letters only, then lowercase so 'A' and 'a' count together.
    s = re.sub(r'[^a-zA-Z]', '', s).lower()
    return count_chars(s)
13. What’s the time complexity of your function?
Answer: O(n), where n is the length of the string.
14. Can you improve the performance? Maybe use a built-in module or
different data structure?
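Answer: Yes, using collections.Counter. The complexity is still O(n), but the counting loop runs
in C, so it is usually faster and the code is much shorter: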
from collections import Counter
Counter(s)
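Combining Counter with the case-insensitive requirement from question 12 might look like the sketch below. The function name count_letters is made up here, and str.isalpha() also accepts non-ASCII letters, which is slightly broader than the regex used earlier.

```python
from collections import Counter

def count_letters(s):
    """Case-insensitive count of alphabetic characters only."""
    return Counter(ch for ch in s.lower() if ch.isalpha())

counts = count_letters("Hello, World!")
print(counts.most_common(1))  # the single most frequent letter
```

A Counter returns 0 for keys it has never seen, so lookups need no membership check.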
🐍 OOP and Code Design Pressure Chain
15. Have you worked with classes in Python? Can you write a basic class
for a BankAccount?
class BankAccount:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def withdraw(self, amount):
        if self.balance - amount >= 0:
            self.balance -= amount
            return True
        return False
16. Add a method for withdrawing money that checks for a minimum
balance.
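Answer: The withdraw method above already does this: it refuses any withdrawal that would take
the balance below zero, so the minimum balance is effectively 0.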
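If the interviewer wants a configurable minimum instead of zero, one possible variant is sketched below. The class name MinBalanceAccount and the min_balance parameter are illustrative additions, not part of the original answer.

```python
class MinBalanceAccount:
    """Hypothetical variant of BankAccount with a configurable floor."""

    def __init__(self, owner, balance, min_balance=0):
        self.owner = owner
        self.balance = balance
        self.min_balance = min_balance

    def withdraw(self, amount):
        # Reject non-positive amounts and anything that would breach the floor.
        if amount <= 0:
            return False
        if self.balance - amount < self.min_balance:
            return False
        self.balance -= amount
        return True

account = MinBalanceAccount("Asha", 1000, min_balance=500)
print(account.withdraw(600))  # would leave 400, below the floor
print(account.withdraw(500))  # leaves exactly 500
```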
17. Now, override the __str__ method so the object prints the account
details nicely.
# Inside the BankAccount class:
    def __str__(self):
        return f"Owner: {self.owner}, Balance: {self.balance}"
18. What is __init__()? Is it a constructor like in C++ or Java? What's the
difference?
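Answer: __init__() plays the role of a constructor: Python calls it automatically right after the
object has been created (by __new__()) in order to initialize it. Unlike C++ or Java, Python doesn't
have function overloading, so a class has a single __init__(); default or keyword arguments cover
the different ways of constructing an object.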
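A small sketch of how a default argument can stand in for overloaded constructors. The Rectangle class is a made-up example.

```python
class Rectangle:
    # One __init__ with a default covers both "overloads":
    # Rectangle(3) is a square, Rectangle(3, 4) is a general rectangle.
    def __init__(self, width, height=None):
        self.width = width
        self.height = width if height is None else height

    def area(self):
        return self.width * self.height

square = Rectangle(3)
rect = Rectangle(3, 4)
print(square.area(), rect.area())
```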
19. What if I told you I don’t want to use self? Can we write class methods
without it?
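Answer: Not for instance methods: they need the instance as their first parameter to access
instance variables and methods, and self is the conventional name for it (not a keyword). Methods
that don't need an instance can be written with @staticmethod (no implicit argument) or
@classmethod (receives the class as cls).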
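The three kinds of method can be compared in one made-up class:

```python
class Temperature:
    scale = "Celsius"

    def __init__(self, degrees):
        self.degrees = degrees

    def describe(self):
        # Instance method: receives the instance as its first argument.
        return f"{self.degrees} {self.scale}"

    @classmethod
    def freezing(cls):
        # Class method: receives the class, not an instance.
        return cls(0)

    @staticmethod
    def to_fahrenheit(celsius):
        # Static method: receives neither; a plain function in the class namespace.
        return celsius * 9 / 5 + 32

print(Temperature(25).describe())
print(Temperature.to_fahrenheit(100))
```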
⚙️ Bonus: Real-World Scenario Pressure
20. You are part of a team handling a real-time stock data processing
system in Python. Suddenly, data ingestion stops and your dashboard
breaks. What would be your first steps to debug it?
Answer:
● Check logs for recent errors
● Validate if the API source is live
● Inspect the ingestion script for errors or downtime
21. You find the API call is working but the DataFrame is empty. What now?
Answer:
● Check API schema for any changes
● Validate JSON structure
● Debug data parsing logic
22. You realize someone changed the schema of the API response. How do
you make your code more robust for such future cases?
Answer:
● Use try-except blocks
● Add schema validation checks
● Log schema mismatches
● Use dynamic field mapping
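The points above can be combined into a small validation step in front of the parsing logic. Everything named here (the field names, the alias table, normalize_record) is a hypothetical illustration, not the real API's schema.

```python
import logging

logger = logging.getLogger("ingestion")

# Hypothetical schema: field name -> expected Python type.
EXPECTED_FIELDS = {"symbol": str, "price": float, "timestamp": str}

# Hypothetical aliases in case the provider renames a field.
FIELD_ALIASES = {"ticker": "symbol", "last_price": "price", "ts": "timestamp"}

def normalize_record(raw):
    """Return a cleaned record, or None if it does not match the schema."""
    # Dynamic field mapping: translate known aliases to canonical names.
    record = {FIELD_ALIASES.get(key, key): value for key, value in raw.items()}
    cleaned = {}
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in record:
            logger.warning("Schema mismatch: missing field %r in %r", field, raw)
            return None
        try:
            cleaned[field] = expected_type(record[field])
        except (TypeError, ValueError):
            logger.warning("Schema mismatch: bad value for %r in %r", field, raw)
            return None
    return cleaned
```

Records that fail validation are logged and skipped instead of crashing the ingestion loop, so a schema change shows up in the logs rather than as a silently empty DataFrame.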
🔮 Most Probable Questions for HCL GET - Python
Developer Role
1. Write a function to remove duplicates from a list.
def remove_duplicates(lst):
    return list(set(lst))

# Alternative (preserve order):
def remove_duplicates(lst):
    seen = set()
    result = []
    for item in lst:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result
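A shorter order-preserving option relies on dictionaries keeping insertion order (guaranteed since Python 3.7). The function name below is made up for the example, and like the set-based versions it requires hashable items.

```python
def remove_duplicates_ordered(lst):
    """Drop duplicates while keeping the first occurrence of each item."""
    # dict keys are unique and keep insertion order.
    return list(dict.fromkeys(lst))

print(remove_duplicates_ordered([3, 1, 3, 2, 1]))
```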
2. What are modules and packages in Python?
A module is a single importable unit, usually one .py file, containing functions, classes, or
variables (e.g., math, random).
A package is a directory of modules, conventionally marked by an __init__.py file.
They help organize code and promote reusability.
3. What is PEP 8? Why is it important?
PEP 8 is Python’s style guide for writing clean, readable, and consistent code.
It helps teams collaborate better and makes code easier to debug and maintain.
Examples: 4-space indentation, snake_case naming, limiting line length to 79 chars.
4. Difference between mutable and immutable types in Python.
Mutable: Can be changed after creation (e.g., list, dict, set).
Immutable: Cannot be changed after creation (e.g., int, float, str, tuple).
Changing mutable objects in-place affects all references.
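A short sketch of the last point, showing how an in-place change is visible through every reference while rebinding an immutable value is not:

```python
a = [1, 2, 3]
b = a              # b is another name for the same list object
b.append(4)        # in-place change, visible through a as well

c = a.copy()       # an independent (shallow) copy
c.append(5)        # does not affect a or b

s = "abc"
t = s
t += "d"           # str is immutable: t is rebound to a new object

print(a, b, c)
print(s, t)
```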
5. Explain Python’s Global Interpreter Lock (GIL).
GIL is a mutex in CPython that allows only one thread to execute Python bytecode at a time.
It simplifies memory management but limits true multi-threading in CPU-bound programs.
For CPU-bound parallelism, use multiprocessing instead of threading; threads remain useful for
I/O-bound work, where the GIL is released while waiting.
6. Describe a real-life problem you solved using Python.
I automated the generation of sales reports from Excel files using Pandas and scheduled it
using cron jobs.
This reduced 3 hours of manual work to just 5 minutes and ensured timely delivery of reports.
7. What is a virtual environment and why do we need it?
A virtual environment is an isolated Python environment that allows specific dependencies per
project.
It prevents version conflicts across projects.
Tools: venv, virtualenv.
python -m venv env
source env/bin/activate # On Windows: env\Scripts\activate
8. Explain how you would connect Python with a database (like
MySQL).
Use mysql-connector-python or SQLAlchemy.
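import mysql.connector

# Connection details are placeholders; load real credentials from configuration.
conn = mysql.connector.connect(
    host="localhost",
    user="root",
    password="password",
    database="testdb"
)
cursor = conn.cursor()
cursor.execute("SELECT * FROM users")
for row in cursor.fetchall():
    print(row)

# Release the cursor and the connection when done.
cursor.close()
conn.close()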
9. What Python libraries have you used?
I’ve used:
- Pandas for data manipulation.
- NumPy for numerical operations.
- Matplotlib / Seaborn for visualizations.
- Scikit-learn for machine learning.
- SQLAlchemy for database operations.
- Requests for web APIs.
10. How do you handle missing values in a dataset using Python?
Using Pandas:
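import pandas as pd

df = pd.read_csv("data.csv")

# Each call returns a new DataFrame; assign the result to keep it.
dropped = df.dropna()                                # Drop rows with missing values
filled_const = df.fillna(0)                          # Fill with a constant
filled_ffill = df.ffill()                            # Forward fill (fillna(method='ffill') is deprecated)
filled_mean = df.fillna(df.mean(numeric_only=True))  # Fill numeric columns with their mean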