Top 100 Python Interview Questions for Data Analyst

The document provides a comprehensive list of the top 100 Python interview questions specifically tailored for data analysts. It covers various topics including Python fundamentals, data structures, algorithms, and libraries like Pandas and NumPy, along with scenario-based and advanced problem-solving questions. Each question is followed by a concise answer or explanation, making it a useful resource for interview preparation.

Uploaded by

itsparagp
Copyright: © All Rights Reserved

🔹 Python Fundamentals
1. What are Python's key features?
Python is interpreted, dynamically typed, high-level, portable, open-source, and object-oriented, with a rich standard library.

2. What are Python data types?
Common built-in types include int, float, str, bool, list, tuple, set, dict, and NoneType.

3. Difference between list, tuple, set, and dictionary?

● List: ordered, mutable
● Tuple: ordered, immutable
● Set: unordered, mutable, no duplicates
● Dict: mutable key-value pairs; preserves insertion order from Python 3.7+

4. What is the difference between is and ==?
== compares values; is checks object identity (whether both names refer to the same object in memory).
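A minimal sketch of the distinction; the variable names are illustrative only.

```python
# Two distinct list objects that happen to hold equal values
a = [1, 2, 3]
b = [1, 2, 3]
c = a  # c is another name for the same object as a

print(a == b)  # True: same contents
print(a is b)  # False: different objects
print(a is c)  # True: same object

# Identity checks are the idiomatic way to test for None
value = None
print(value is None)  # True
```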

5. What is PEP 8?
PEP 8 is Python’s official style guide for writing readable code.

6. What are *args and **kwargs?
*args collects variable-length positional arguments into a tuple; **kwargs collects variable-length keyword arguments into a dict.
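As a short illustration, the hypothetical function describe below simply returns whatever it receives, which makes the packing and unpacking visible.

```python
def describe(*args, **kwargs):
    """Return the positional and keyword arguments that were passed in."""
    return args, kwargs

positional, keyword = describe(1, 2, 3, name="sales", year=2024)
print(positional)  # (1, 2, 3)
print(keyword)     # {'name': 'sales', 'year': 2024}

# The same operators unpack a sequence or dict at the call site
values = [4, 5]
options = {"name": "costs"}
print(describe(*values, **options))  # ((4, 5), {'name': 'costs'})
```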

7. What are Python scopes?
Names are resolved in the order Local, Enclosing, Global, Built-in (the LEGB rule).
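The lookup order can be sketched with nested functions that each see a different x; all function names here are made up for the example.

```python
x = "global"

def outer():
    x = "enclosing"

    def inner():
        x = "local"
        return x

    def inner_no_local():
        # No local x here, so the lookup falls back to the enclosing scope
        return x

    return inner(), inner_no_local()

def read_global():
    # No local or enclosing x, so the module-level x is used
    return x

print(outer())        # ('local', 'enclosing')
print(read_global())  # global
print(len("abc"))     # 3 (len is found in the built-in scope)
```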

8. Difference between shallow and deep copy?
A shallow copy creates a new outer object but shares references to the nested objects; a deep copy recursively copies all nested objects.
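A small sketch using the standard copy module shows the practical consequence: mutating a nested list is visible through the shallow copy but not the deep one.

```python
import copy

original = [[1, 2], [3, 4]]
shallow = copy.copy(original)
deep = copy.deepcopy(original)

# Mutate a nested list that the shallow copy still shares
original[0].append(99)

print(shallow[0])  # [1, 2, 99]
print(deep[0])     # [1, 2]
```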

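9. What are lambda functions?
Anonymous, single-expression functions defined with the lambda keyword.

10. How to handle exceptions in Python?
Use try and except blocks to catch errors; an else block runs only when no exception was raised, and a finally block always runs.

11. What is the use of the with statement?
It manages resources such as files through a context manager, releasing them automatically (for example, closing the file) when the block exits, even if an exception occurs.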
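The two answers above can be sketched together. The helper parse_int is hypothetical, and io.StringIO stands in for a real file so the example runs without touching disk.

```python
import io

def parse_int(text):
    """Record which blocks run while converting text to an int."""
    steps = []
    try:
        value = int(text)
    except ValueError:
        steps.append("except")
        value = None
    else:
        steps.append("else")
    finally:
        steps.append("finally")
    return value, steps

print(parse_int("42"))   # (42, ['else', 'finally'])
print(parse_int("abc"))  # (None, ['except', 'finally'])

# with closes the resource as soon as the block exits
with io.StringIO("a,b\n1,2\n") as handle:
    header = handle.readline().strip()

print(header)         # a,b
print(handle.closed)  # True
```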

12. Difference between break, continue, and pass?

● break: exits the loop
● continue: skips to the next iteration
● pass: does nothing (placeholder)

13. What is list comprehension?
Compact syntax for creating lists: [x for x in iterable if condition].

14. What is the difference between mutable and immutable types?
Mutable objects (such as lists, dicts, and sets) can be changed in place; immutable objects (such as tuples, strings, and ints) cannot.

15. Explain Python memory management.
CPython manages memory primarily through reference counting, supplemented by a cyclic garbage collector exposed through the gc module.

🔹 Data Structures & Algorithms
16. What are iterators and generators?
Iterators implement __iter__() and __next__(); generators are functions that produce values lazily using yield.
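As a sketch, the same countdown can be written both ways; the class and function names are illustrative.

```python
class Countdown:
    """Iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

def countdown(start):
    """Generator equivalent: yield produces each value on demand."""
    while start > 0:
        yield start
        start -= 1

print(list(Countdown(3)))  # [3, 2, 1]
print(list(countdown(3)))  # [3, 2, 1]
```

The generator version needs no explicit state or StopIteration handling, which is why it is usually preferred for simple sequences.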

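17. Difference between a function and a method?
A function is defined independently; a method is a function defined in a class and bound to an instance or the class.

18. How do you reverse a list?
Use lst[::-1] (returns a new list), lst.reverse() (reverses in place), or reversed(lst) (returns an iterator).

19. How to remove duplicates from a list?
Use list(set(lst)) if order does not matter; to preserve order, use list(dict.fromkeys(lst)) or a loop that tracks the items already seen.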
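A brief sketch of the order-preserving options, using a made-up sample list:

```python
items = [3, 1, 3, 2, 1]

# dict keys are unique and keep insertion order (Python 3.7+)
deduped = list(dict.fromkeys(items))
print(deduped)  # [3, 1, 2]

# Equivalent loop that tracks what has already been seen
seen = set()
deduped_loop = []
for item in items:
    if item not in seen:
        seen.add(item)
        deduped_loop.append(item)
print(deduped_loop)  # [3, 1, 2]

# A set also removes duplicates but does not guarantee order, so sort if needed
print(sorted(set(items)))  # [1, 2, 3]
```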

20. What are modules and packages?
Module: a single .py file; package: a directory of modules, conventionally containing an __init__.py file.

21. What are decorators?
Functions that wrap other functions to modify their behavior, applied with the @decorator syntax.
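A minimal sketch of a decorator; count_calls is a hypothetical name chosen for this example.

```python
import functools

def count_calls(func):
    """Hypothetical decorator that counts how often func is called."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper

@count_calls
def add(a, b):
    return a + b

print(add(1, 2))     # 3
print(add(3, 4))     # 7
print(add.calls)     # 2
print(add.__name__)  # add
```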

22. How to merge two dictionaries?
Use {**dict1, **dict2} or dict1 | dict2 (Python 3.9+) to build a new dict, or dict1.update(dict2) to modify dict1 in place. Values from dict2 win on duplicate keys.

23. Difference between sorted() and .sort()?
sorted() returns a new list from any iterable; .sort() sorts a list in place and returns None.

24. How to check if a key exists in a dictionary?
Use the in operator: 'key' in my_dict.

25. Difference between list and array in Python?
Lists are general-purpose and can hold mixed types; NumPy arrays hold a single data type and are optimized for numerical operations.

26. What are comprehensions in Python?
Concise syntax for building lists, sets, or dicts from iterables.

27. Difference between enumerate() and range()?
range() generates a sequence of integers (often used as indices); enumerate() yields (index, value) pairs from an iterable.

28. What is zip()?
Combines multiple iterables element-wise into tuples, stopping at the shortest one.

29. What are map(), filter(), and reduce()?

● map: applies a function to all items
● filter: keeps the items that satisfy a condition
● reduce: performs a cumulative operation (from functools)
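The three functions can be sketched on one small list of sample numbers:

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5]

# map and filter return lazy iterators, so wrap them in list() to see the values
squares = list(map(lambda n: n * n, numbers))
evens = list(filter(lambda n: n % 2 == 0, numbers))

# reduce folds the list into a single value, starting from 0
total = reduce(lambda acc, n: acc + n, numbers, 0)

print(squares)  # [1, 4, 9, 16, 25]
print(evens)    # [2, 4]
print(total)    # 15
```

In everyday code, comprehensions and sum() often read more clearly than map, filter, and reduce for cases this simple.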

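30. What do globals() and locals() return?
Dictionaries representing the current global and local symbol tables.

31. Difference between id() and type()?
id() returns an object's unique identity (its memory address in CPython); type() returns its type.

32. What are docstrings?
String literals, usually triple-quoted, placed as the first statement of a module, class, or function to document it.

33. Purpose of __init__()?
Initializer method, commonly called the constructor, that runs after an instance is created to set its attributes.

34. What are magic methods?
Special double-underscore methods like __str__, __len__, and __add__ that let classes hook into built-in behavior such as printing, len(), and operators.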
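A short sketch with a hypothetical Vector class shows how these methods are triggered by ordinary syntax:

```python
class Vector:
    """Hypothetical 2D vector used to illustrate magic methods."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # Called for: vector + other
        return Vector(self.x + other.x, self.y + other.y)

    def __len__(self):
        # Called for: len(vector)
        return 2

    def __str__(self):
        # Called for: str(vector) and print(vector)
        return f"Vector({self.x}, {self.y})"

v = Vector(1, 2) + Vector(3, 4)
print(v)       # Vector(4, 6)
print(len(v))  # 2
```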

35. Why use if __name__ == "__main__"?
To ensure a block of code runs only when the file is executed directly as a script, not when it is imported as a module.

🔹 Pandas & NumPy
36. What is Pandas?
A Python library for data manipulation and analysis, built around the Series and DataFrame structures.

37. Difference between Series and DataFrame?
Series: 1D labeled array; DataFrame: 2D labeled table whose columns are Series.

38. How to read a CSV file?
pd.read_csv('filename.csv')

39. How to check for null values?
df.isnull() returns a Boolean mask; df.isnull().sum() counts the nulls in each column.

40. How to drop missing values?
df.dropna()

41. How to fill missing values?
df.fillna(value), or df.ffill() / df.bfill() to propagate neighboring values (fillna(method='ffill') is deprecated in recent pandas releases).

42. How to filter rows in Pandas?
df[df['column'] > 10]

43. Apply a function to a column?
df['column'].apply(func)

44. How to sort a DataFrame?
df.sort_values(by='column')

45. How to group data?
df.groupby('column')['value'].mean(), or .agg(['mean', 'sum']) for several aggregations at once.

46. How to merge DataFrames?
Using pd.merge(df1, df2, on='column')

47. Difference between merge() and join()?
merge() joins on columns or indexes with configurable keys and join types; join() is a convenience method that joins on the index by default.
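A small sketch of the difference, assuming pandas is installed; the two tables and their column names are invented for the example.

```python
import pandas as pd

customers = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["Ann", "Bob", "Cy"]})
orders = pd.DataFrame({"customer_id": [1, 1, 3], "amount": [50, 20, 70]})

# merge: match rows on a named column
merged = pd.merge(customers, orders, on="customer_id", how="inner")

# join: aligns on the index by default, so make the key the index first
joined = customers.set_index("customer_id").join(
    orders.set_index("customer_id"), how="inner"
)

print(merged)
print(joined)
```

Both calls keep the three matching orders; the visible difference is that merge leaves customer_id as a column while join carries it as the index.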

48. How to remove duplicates?
df.drop_duplicates()

49. What is NumPy?
A library for numerical computing that provides fast, vectorized operations on arrays.

50. What are NumPy arrays?
Multidimensional, single-dtype arrays that support efficient numerical operations.

51. How to create NumPy arrays?
np.array(), np.zeros(), np.arange()

52. How to reshape arrays?
array.reshape(rows, cols); the total number of elements must stay the same.

53. Difference between view and copy?
A view shares the underlying data with the original array; a copy is a new, independent array.

54. What is broadcasting in NumPy?
Automatic stretching of arrays with compatible shapes so that element-wise operations work across different shapes.
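Broadcasting can be sketched with a small matrix combined with a row and a column; this assumes NumPy is installed, and the array values are arbitrary.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])     # shape (2, 3)
row = np.array([10, 20, 30])       # shape (3,)
column = np.array([[100], [200]])  # shape (2, 1)

# The row is stretched across both rows of the matrix
print(matrix + row)
# [[11 22 33]
#  [14 25 36]]

# The column is stretched across all three columns
print(matrix + column)
# [[101 102 103]
#  [204 205 206]]

# Shapes (2, 1) and (3,) broadcast to (2, 3)
print((column + row).shape)  # (2, 3)
```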

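55. How to calculate mean/median/std?
Use np.mean(), np.median(), and np.std(). Note that np.std() computes the population standard deviation (ddof=0) by default, while the pandas .std() method uses the sample version (ddof=1).

🔹 Scenario-Based & Problem Solving
56. How to find duplicate rows in DataFrame?
df[df.duplicated()] (pass keep=False to show every copy, not just the repeats)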

57. How to remove outliers using IQR?

# Assumes every column of df is numeric
Q1 = df.quantile(0.25)
Q3 = df.quantile(0.75)
IQR = Q3 - Q1
# Keep only rows with no value outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]
df = df[~((df < (Q1 - 1.5 * IQR)) | (df > (Q3 + 1.5 * IQR))).any(axis=1)]

58. Write a function to calculate factorial.

def factorial(n):
    # Recursive definition; assumes n is a non-negative integer
    return 1 if n == 0 else n * factorial(n - 1)

59. Count missing values per column?
df.isnull().sum()

60. Convert string column to datetime?
df['date'] = pd.to_datetime(df['date'])

61. Drop rows with null values?
df.dropna()

62. How to create a pivot table?
pd.pivot_table(df, values='value', index='row', columns='column')

63. Convert column to category type?
df['col'] = df['col'].astype('category')

64. Sort DataFrame by multiple columns?
df.sort_values(by=['col1', 'col2'], ascending=[True, False])

65. Count word frequency in a sentence?

from collections import Counter
Counter(text.split())

66. How to calculate correlation matrix?
df.corr(numeric_only=True) (the flag skips non-numeric columns, which otherwise raise an error in pandas 2.x)

67. Create dummy variables?
pd.get_dummies(df['category_column'])

68. Random sample from DataFrame?
df.sample(frac=0.1) for a random 10% of rows, or df.sample(n=5) for a fixed number

69. Create a conditional column?
df['new'] = np.where(df['col'] > 10, 'High', 'Low')

70. How to melt a DataFrame?
pd.melt(df, id_vars=['id'])

🔹 Advanced
71. Difference between Python 2 and 3?
Python 3 makes print a function, uses Unicode strings by default, and makes / perform true division (use // for floor division).

72. How to measure code execution time?
Use time.perf_counter() (better suited to intervals than time.time()) for one-off timings, or the timeit module for repeatable benchmarks.
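Both approaches can be sketched as follows; build_squares is a made-up workload, and the printed timings will vary by machine.

```python
import time
import timeit

def build_squares(n):
    """Hypothetical workload to time."""
    return [i * i for i in range(n)]

# One-off measurement with a monotonic, high-resolution clock
start = time.perf_counter()
build_squares(100_000)
elapsed = time.perf_counter() - start
print(f"single run: {elapsed:.4f} s")

# timeit repeats the call and returns the total time for `number` runs
total = timeit.timeit(lambda: build_squares(100_000), number=20)
print(f"average over 20 runs: {total / 20:.4f} s")
```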

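73. What is virtualenv?
A tool for creating isolated Python environments, each with its own installed packages; the standard library's venv module (python -m venv) offers the same core feature.

74. How to handle large datasets in Pandas?
Read the file in chunks with pd.read_csv(..., chunksize=...), load only the needed columns with usecols, or move to Dask or PySpark for out-of-core or distributed processing.

75. How to use logging in Python?
Use the built-in logging module, which supports severity levels (DEBUG, INFO, WARNING, ERROR, CRITICAL) and configurable handlers.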
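A minimal sketch of the logging module; the logger name "analysis" and the messages are invented, and the handler writes to an in-memory buffer so the output is easy to inspect.

```python
import io
import logging

# Send log records to an in-memory buffer instead of the console
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

logger = logging.getLogger("analysis")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep records out of the root logger

logger.debug("not recorded: below the INFO threshold")
logger.info("loaded %d rows", 1200)
logger.warning("found %d missing values", 7)

print(buffer.getvalue())
# INFO:analysis:loaded 1200 rows
# WARNING:analysis:found 7 missing values
```

In a short script, logging.basicConfig(level=logging.INFO) is usually enough to get console output without configuring a handler by hand.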

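76. What is GIL?
Global Interpreter Lock: a mutex in CPython that allows only one thread to execute Python bytecode at a time.

77. Threading vs Multiprocessing?
Threading suits I/O-bound tasks, where threads spend most of their time waiting; multiprocessing suits CPU-bound tasks because separate processes sidestep the GIL.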
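The I/O-bound case can be sketched with a thread pool. Here fake_download is a hypothetical stand-in for a network call, simulated with time.sleep.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_download(task_id):
    """Stand-in for an I/O-bound call such as a network request."""
    time.sleep(0.2)  # the GIL is released while a thread sleeps or waits on I/O
    return task_id * 2

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fake_download, range(4)))
elapsed = time.perf_counter() - start

print(results)  # [0, 2, 4, 6]
print(f"{elapsed:.2f} s")  # typically close to 0.2 s, not 0.8 s, since the waits overlap
```

For CPU-bound work, concurrent.futures.ProcessPoolExecutor offers the same interface backed by processes instead of threads.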

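78. JSON vs Pickle?
JSON is a human-readable, language-independent text format limited to basic types; pickle is a Python-specific binary format that can serialize most Python objects but should never be loaded from untrusted sources.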
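A quick round-trip sketch with a made-up record shows one practical difference: JSON has no tuple type, while pickle restores Python types exactly.

```python
import json
import pickle

record = {"id": 7, "tags": ("a", "b"), "scores": [1.5, 2.0]}

# JSON: readable text, but tuples come back as lists
as_json = json.dumps(record)
print(as_json)              # {"id": 7, "tags": ["a", "b"], "scores": [1.5, 2.0]}
print(json.loads(as_json))  # {'id': 7, 'tags': ['a', 'b'], 'scores': [1.5, 2.0]}

# pickle: opaque bytes, but Python types round-trip exactly
as_bytes = pickle.dumps(record)
print(pickle.loads(as_bytes) == record)  # True
print(json.loads(as_json) == record)     # False: the tuple became a list
```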

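79. How to consume an API in Python?
Use the third-party requests library: requests.get(url).json() sends a GET request and parses the JSON response.

80. How to export a DataFrame?
df.to_csv('file.csv') or df.to_excel('file.xlsx'); pass index=False to omit the row index.

🧠 Additional Problem-Based Questions for Data Analyst Interviews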
81. Write a Python function to check if a string is a palindrome.

def is_palindrome(s):
    return s == s[::-1]

82. Write code to find the most frequent element in a list.

from collections import Counter

def most_frequent(lst):
    return Counter(lst).most_common(1)[0][0]

83. Given a list of numbers, return a list with only the even numbers.

def filter_even(lst):
    return [x for x in lst if x % 2 == 0]

84. Write a Python function to merge two dictionaries.

def merge_dicts(d1, d2):
    return {**d1, **d2}

85. Write a function to reverse words in a sentence.

def reverse_words(sentence):
    return ' '.join(sentence.split()[::-1])

86. Write a Python program to find the second highest number in a list.

def second_highest(lst):
    # set() removes repeats; requires at least two distinct values
    return sorted(set(lst))[-2]

87. Write a function to remove punctuation from a string.

import string

def remove_punctuation(text):
    return text.translate(str.maketrans('', '', string.punctuation))

88. Count the number of vowels in a string.

def count_vowels(s):
    return sum(1 for char in s if char.lower() in 'aeiou')

89. How to flatten a nested list in Python?

def flatten(lst):
    # Flattens exactly one level of nesting
    return [item for sublist in lst for item in sublist]
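The comprehension above handles a list of lists. For lists nested to arbitrary depth, one possible recursive sketch is shown here; flatten_deep is a hypothetical helper, not part of the original answer.

```python
def flatten_deep(items):
    """Hypothetical helper: recursively flatten lists nested to any depth."""
    flat = []
    for item in items:
        if isinstance(item, list):
            # Recurse into nested lists and splice in their contents
            flat.extend(flatten_deep(item))
        else:
            flat.append(item)
    return flat

print(flatten_deep([1, [2, [3, [4, 5]], 6], [7]]))  # [1, 2, 3, 4, 5, 6, 7]
```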

90. How would you group a DataFrame by a column and get the mean of each group?

df.groupby('column').mean(numeric_only=True)

91. Write a function to find all prime numbers less than N.

def primes_less_than(n):
    # Trial division: x is prime if no d from 2 to sqrt(x) divides it
    return [x for x in range(2, n)
            if all(x % d != 0 for d in range(2, int(x**0.5) + 1))]

92. How to rename multiple columns in a DataFrame?

df.rename(columns={'old1': 'new1', 'old2': 'new2'}, inplace=True)

93. Write a Python function to check if a year is a leap year.

def is_leap(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

94. Find common elements in two lists.

def common_elements(a, b):
    # The order of the result is not guaranteed
    return list(set(a) & set(b))

95. Convert a list of dictionaries to a DataFrame.

import pandas as pd
df = pd.DataFrame([{'a': 1, 'b': 2}, {'a': 3, 'b': 4}])

96. How to drop duplicate rows based on specific columns?

df.drop_duplicates(subset=['col1', 'col2'], inplace=True)

97. Write a program to find the longest word in a sentence.

def longest_word(sentence):
    return max(sentence.split(), key=len)

98. How to read only specific columns from a CSV using pandas?

pd.read_csv('file.csv', usecols=['col1', 'col2'])

99. Calculate the correlation matrix of a DataFrame.

df.corr(numeric_only=True)

100. Write a Python function to generate Fibonacci series up to N terms.

def fibonacci(n):
    a, b = 0, 1
    for _ in range(n):
        print(a, end=' ')
        a, b = b, a + b
