
? Python Interview Q

Uploaded by

Alok Misra

📘 Python Interview Q&A (For Data Analyst Faculty)

🔹 Section 1: Python Basics


Q1. What is Python? Why is it popular?
👉 Python is a high-level, interpreted, general-purpose programming language. It is popular
because of its simple syntax, huge libraries (like Pandas, NumPy, Matplotlib), and strong
community support.

Q2. What are Python data types?


👉 Common data types:

 Numbers → int, float


 Text → str
 Sequence → list, tuple
 Mapping → dict
 Boolean → True/False

Example:

a = 10 # int
b = 12.5 # float
c = "Hello" # string
d = [1,2,3] # list
e = (4,5,6) # tuple
f = {"x": 1} # dictionary

Q3. What is the difference between List, Tuple, and Dictionary?

 List → Ordered, mutable, allows duplicates [1,2,3]
 Tuple → Ordered, immutable, allows duplicates (1,2,3)
 Dictionary → Key-value pairs with unique keys; keeps insertion order since Python 3.7 {"name":"Alok"}
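The differences listed above can be checked with a short sketch. The variable names and the "role" key are hypothetical, chosen only for illustration.

```python
# A list is mutable: items can be changed in place.
scores = [1, 2, 3]
scores[0] = 10

# A tuple is immutable: item assignment raises TypeError.
point = (1, 2, 3)
try:
    point[0] = 10
    tuple_changed = True
except TypeError:
    tuple_changed = False

# A dict maps unique keys to values and keeps insertion order (Python 3.7+).
person = {"name": "Alok"}
person["role"] = "Faculty"  # hypothetical extra key

print(scores)         # [10, 2, 3]
print(tuple_changed)  # False
print(list(person))   # ['name', 'role']
```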

Q4. What are Python operators?

 Arithmetic: + - * / % ** //
 Comparison: ==, !=, >, <, >=, <=
 Logical: and, or, not

Q5. What is indentation in Python?


👉 Python uses indentation (spaces) instead of curly braces {} to define blocks.

if True:
    print("Indented correctly")

Q6. Difference between is and ==?

 == checks values are equal.


 is checks memory identity (same object).

a = [1,2,3]
b = [1,2,3]
print(a == b) # True
print(a is b) # False

🔹 Section 2: Control Flow


Q7. What are conditional statements in Python?
👉 if, elif, else

age = 20
if age >= 18:
    print("Adult")
else:
    print("Minor")

Q8. How does a for loop differ from a while loop?

 for → Runs for fixed range/collection


 while → Runs till condition is True
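A minimal sketch of the contrast, using made-up numbers: the for loop walks a known collection, while the while loop keeps going until its condition turns False.

```python
# for: the number of iterations is fixed by the collection.
total = 0
for n in [1, 2, 3, 4]:
    total += n

# while: repeat until the condition becomes False.
value = 100
halvings = 0
while value > 10:
    value //= 2   # 100 -> 50 -> 25 -> 12 -> 6
    halvings += 1

print(total)     # 10
print(halvings)  # 4
```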

Q9. Write Python code to print even numbers 1–20.

for i in range(1,21):
    if i % 2 == 0:
        print(i)
Q10. Write a program to find factorial of a number.

n = 5
fact = 1
for i in range(1, n+1):
    fact *= i
print(fact) # 120

🔹 Section 3: Functions & File Handling


Q11. What is a function in Python?
👉 A block of reusable code. Defined using def.

def add(a, b):
    return a+b

print(add(5,3)) # 8

Q12. What are default arguments in Python?

def greet(name="Guest"):
    print("Hello", name)

greet() # Hello Guest


greet("Alok") # Hello Alok

Q13. How do you open and read a file?

with open("data.txt", "r") as f: # File is closed automatically, even on error
    print(f.read())

Q14. Difference between read(), readline(), readlines()?

 read() → whole file


 readline() → one line
 readlines() → list of all lines
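The three methods can be compared on the same file. This sketch writes a small temporary file first so that it runs anywhere; the file content is hypothetical.

```python
import os
import tempfile

# Create a three-line temporary file to read back (hypothetical content).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("line 1\nline 2\nline 3\n")
    path = tmp.name

with open(path) as f:
    whole = f.read()       # entire file as one string

with open(path) as f:
    first = f.readline()   # only the first line, including its "\n"

with open(path) as f:
    lines = f.readlines()  # list with one string per line

os.remove(path)

print(repr(whole))
print(repr(first))
print(lines)
```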

🔹 Section 4: Pandas (Data Analysis)


Q15. How do you create a DataFrame in pandas?
import pandas as pd
data = {"Name": ["A","B","C"], "Score":[85,90,95]}
df = pd.DataFrame(data)
print(df)

Q16. How do you read a CSV file?

df = pd.read_csv("data.csv")
print(df.head())

Q17. How to check basic information of a DataFrame?

 df.head() → first 5 rows


 df.tail() → last 5 rows
 df.info() → data types, non-null counts
 df.describe() → summary statistics

Q18. How to filter rows in pandas?

df[df["Score"] > 90]

Q19. How to group data in pandas?

df.groupby("Name")["Score"].mean()

Q20. How to handle missing values?

 df.dropna() → drop missing


 df.fillna(0) → fill with 0

🔹 Section 5: NumPy (Numerical Python)


Q21. What is NumPy? Why used?
👉 NumPy is used for fast mathematical operations on arrays.

import numpy as np
arr = np.array([1,2,3,4])
print(arr.mean()) # 2.5
Q22. Difference between Python list and NumPy array?

 List → slower, general-purpose


 NumPy array → faster, optimized for math operations

🔹 Section 6: Visualization
Q23. How do you plot a simple line chart?

import matplotlib.pyplot as plt


x = [1,2,3,4]
y = [2,4,6,8]
plt.plot(x, y, marker="o")
plt.title("Line Chart")
plt.show()

Q24. How do you plot a bar chart in matplotlib?

x = ["A","B","C"]
y = [10,20,15]
plt.bar(x, y)
plt.title("Bar Chart")
plt.show()

Q25. What is Seaborn?


👉 Seaborn is a visualization library built on matplotlib. It makes statistical plots easier.

🔹 Section 7: Advanced / Faculty-Level


Q26. Explain difference between shallow copy and deep copy in Python.

 Shallow copy → creates a new outer object but shares the nested objects (changes to nested mutable items show up in both).
 Deep copy → recursively copies nested objects, creating a fully independent copy.
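A short sketch with the standard copy module shows the difference on a nested list (the data is made up for illustration):

```python
import copy

original = [[1, 2], [3, 4]]
shallow = copy.copy(original)   # new outer list, same inner lists
deep = copy.deepcopy(original)  # inner lists are copied as well

original[0].append(99)   # mutate a nested list
original.append([5, 6])  # mutate the outer list only

print(shallow)  # [[1, 2, 99], [3, 4]]
print(deep)     # [[1, 2], [3, 4]]
```

The nested change reaches the shallow copy because both outer lists point at the same inner list; the appended `[5, 6]` does not, because the outer lists are separate objects.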

Q27. What is the difference between Python list comprehension and loops?
👉 List comprehension is a shorter way:

squares = [x*x for x in range(5)]
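For comparison, here is the equivalent explicit loop next to the comprehension, plus a filtered variant. The variable names are illustrative.

```python
# Explicit loop: build the list step by step.
squares_loop = []
for x in range(5):
    squares_loop.append(x * x)

# Comprehension: same result in one expression.
squares_comp = [x * x for x in range(5)]

# A comprehension can also filter with a trailing if clause.
even_squares = [x * x for x in range(5) if x % 2 == 0]

print(squares_loop)  # [0, 1, 4, 9, 16]
print(squares_comp)  # [0, 1, 4, 9, 16]
print(even_squares)  # [0, 4, 16]
```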


Q28. What is the difference between CSV and Excel in Python?

 CSV → pd.read_csv()
 Excel → pd.read_excel()

Q29. How do you merge/join two DataFrames in pandas?


👉 Using merge() or concat()

pd.merge(df1, df2, on="ID")

Q30. What are some real-world uses of Python in data analysis?

 Cleaning datasets
 Exploratory Data Analysis (EDA)
 Visualization & dashboards
 Machine learning
📘 Regression in Python – Interview Q&A

🔹 Section 1: Basics of Regression


Q1. What is regression? Why is it used?
👉 Regression is a statistical method to model the relationship between a dependent variable
(target) and one or more independent variables (features).

 Helps in prediction (e.g., house price prediction)


 Helps in understanding relationships between variables

Q2. What is the difference between regression and classification?

 Regression → Output is continuous (e.g., price, salary, temperature).


 Classification → Output is categorical (e.g., spam/not spam, disease/healthy).

Q3. Types of regression in Python?

 Linear Regression
 Multiple Linear Regression
 Polynomial Regression
 Logistic Regression (used for classification but named regression)
 Regularized Regression (Ridge, Lasso)

🔹 Section 2: Linear Regression in Python


Q4. How do you implement simple linear regression in Python?

import pandas as pd
from sklearn.linear_model import LinearRegression

# Sample data
data = {"Hours": [1,2,3,4,5], "Score": [10,20,30,40,50]}
df = pd.DataFrame(data)

X = df[["Hours"]] # Independent variable (2D)


y = df["Score"] # Dependent variable
model = LinearRegression()
model.fit(X, y)

print("Slope:", model.coef_)
print("Intercept:", model.intercept_)
print("Predicted:", model.predict(pd.DataFrame({"Hours": [6]}))) # For 6 study hours; same column name as in training

Q5. What is the regression equation?


👉 For Linear Regression:

y = b0 + b1*x

 b0 = Intercept
 b1 = Coefficient (slope)

Example: if Intercept = 5, Coefficient = 2, then:

y = 5 + 2x
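To see where the intercept and slope come from, the sketch below computes the least-squares estimates by hand for one feature. The data is hypothetical, generated from the example equation with intercept 5 and slope 2, so the fit should recover those values.

```python
hours = [1, 2, 3, 4, 5]
scores = [7, 9, 11, 13, 15]  # hypothetical data generated from y = 5 + 2x

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(scores) / n

# Least-squares estimates for simple linear regression:
#   b1 = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x) ** 2)
#   b0 = mean_y - b1 * mean_x
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, scores))
s_xx = sum((x - mean_x) ** 2 for x in hours)
b1 = s_xy / s_xx
b0 = mean_y - b1 * mean_x


def predict(x):
    """Apply the fitted equation y = b0 + b1 * x."""
    return b0 + b1 * x


print(b0, b1)      # 5.0 2.0
print(predict(6))  # 17.0
```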

Q6. What is the difference between simple and multiple linear regression?

 Simple → One independent variable (e.g., Hours → Score)


 Multiple → More than one independent variable (e.g., Hours + Sleep → Score)

Q7. Write Python code for multiple regression.

data = {"Hours":[1,2,3,4,5], "Sleep":[6,7,5,8,7], "Score":[40,50,45,60,65]}


df = pd.DataFrame(data)

X = df[["Hours","Sleep"]]
y = df["Score"]

model = LinearRegression()
model.fit(X, y)

print("Coefficients:", model.coef_)
print("Intercept:", model.intercept_)

🔹 Section 3: Model Evaluation


Q8. How do you evaluate regression models?
 R² (Coefficient of Determination) → Measures how well model fits data (closer to 1 =
better).
 MAE (Mean Absolute Error)
 MSE (Mean Squared Error)
 RMSE (Root Mean Squared Error)

from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error


y_pred = model.predict(X)
print("R2:", r2_score(y, y_pred))
print("MAE:", mean_absolute_error(y, y_pred))
print("MSE:", mean_squared_error(y, y_pred))
print("RMSE:", mean_squared_error(y, y_pred) ** 0.5) # Square root of MSE
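The four metrics can also be computed by hand, which makes their definitions explicit. In this sketch the predictions in y_hat are hypothetical and do not come from a fitted model.

```python
import math

y_true = [40, 50, 45, 60, 65]
y_hat = [42, 48, 47, 58, 65]  # hypothetical predictions

n = len(y_true)
errors = [t - p for t, p in zip(y_true, y_hat)]

mae = sum(abs(e) for e in errors) / n  # mean absolute error
mse = sum(e ** 2 for e in errors) / n  # mean squared error
rmse = math.sqrt(mse)                  # same units as the target

# R^2 = 1 - (residual sum of squares / total sum of squares)
mean_y = sum(y_true) / n
ss_res = sum(e ** 2 for e in errors)
ss_tot = sum((t - mean_y) ** 2 for t in y_true)
r2 = 1 - ss_res / ss_tot

print(mae, mse)        # 1.6 3.2
print(round(rmse, 3))  # 1.789
print(round(r2, 3))    # 0.963
```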

Q9. What is overfitting in regression?


👉 Overfitting happens when a model fits the training data too well (captures noise), but performs
poorly on unseen data.

 Solution → Cross-validation, Regularization (Ridge/Lasso), Simpler model

🔹 Section 4: Advanced Regression


Q10. What is polynomial regression?
👉 Used when relationship between X and y is non-linear.

from sklearn.preprocessing import PolynomialFeatures


import numpy as np

X = np.array([1,2,3,4,5]).reshape(-1,1)
y = np.array([1,4,9,16,25]) # Quadratic relation

poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)

model = LinearRegression()
model.fit(X_poly, y)
print(model.predict(poly.transform([[6]]))) # Predict for x=6

Q11. What are Ridge and Lasso regression? Why used?

 Both are regularization techniques (to avoid overfitting).


 Ridge (L2 penalty) → Shrinks coefficients but not to zero.
 Lasso (L1 penalty) → Can shrink coefficients to zero (feature selection).

from sklearn.linear_model import Ridge, Lasso


ridge = Ridge(alpha=1.0)
lasso = Lasso(alpha=0.1)
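The shrinking effect of the L2 penalty can be illustrated without scikit-learn. The sketch below is a simplified, hypothetical setting: one feature and no intercept, where minimizing sum((y - b*x)^2) + alpha * b^2 gives b = sum(x*y) / (sum(x^2) + alpha). It is not how the Ridge class is implemented internally.

```python
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]  # exactly y = 2x


def ridge_slope(xs, ys, alpha):
    """Slope of a one-feature, no-intercept ridge fit."""
    s_xy = sum(a * b for a, b in zip(xs, ys))
    s_xx = sum(a * a for a in xs)
    return s_xy / (s_xx + alpha)


ols = ridge_slope(x, y, alpha=0.0)      # no penalty: ordinary least squares
shrunk = ridge_slope(x, y, alpha=55.0)  # alpha equal to sum(x^2) halves the slope

print(ols)     # 2.0
print(shrunk)  # 1.0
```

A larger alpha pulls the slope toward zero but never exactly to zero, which matches the Ridge behaviour described above.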
Q12. What is logistic regression? Is it regression?
👉 Logistic Regression is used for classification (binary outcomes like pass/fail).

 Predicts probability (0–1) using sigmoid function.


 Though named regression, it is a classification algorithm.
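A minimal sketch of the sigmoid step: a linear score is squashed into a probability between 0 and 1 and then thresholded into a class. The coefficients b0 and b1 and the threshold are hypothetical values, not estimates from data.

```python
import math


def sigmoid(z):
    """Map any real number to the interval (0, 1)."""
    return 1 / (1 + math.exp(-z))


def predict_pass(hours, b0=-4.0, b1=1.0, threshold=0.5):
    """Return (probability, label) using hypothetical coefficients."""
    p = sigmoid(b0 + b1 * hours)
    return p, int(p >= threshold)


print(sigmoid(0))       # 0.5
print(predict_pass(2))  # probability below 0.5, label 0 (fail)
print(predict_pass(6))  # probability above 0.5, label 1 (pass)
```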

🔹 Section 5: Real-World Application Questions


Q13. How would you explain regression to management students?
👉 Regression is like fitting a line (or curve) through data points to predict outcomes and
understand influence of factors.
Example: Predicting sales based on advertising spend.

Q14. Suppose you built a regression model and got R² = 0.25. What does it mean?
👉 Only 25% of variation in dependent variable is explained by independent variables → weak
model.

Q15. What are assumptions of linear regression?

 Linearity
 Independence of errors
 Homoscedasticity (constant variance)
 Normal distribution of errors
 No multicollinearity (independent variables not highly correlated)
📘 Correlation in Python – Interview Q&A

🔹 Section 1: Basics of Correlation


Q1. What is correlation? Why is it important in data analysis?
👉 Correlation measures the strength and direction of the relationship between two variables.

 Value ranges from -1 to +1


 Helps identify whether variables move together or in opposite directions.

Q2. What are types of correlation?

 Positive correlation (+ve): Both variables increase together (📈 Example: Hours studied
vs. Exam score).
 Negative correlation (-ve): One increases, other decreases (📉 Example: Price vs.
Demand).
 Zero correlation: No relationship (⚪ Example: Shoe size vs. Salary).

Q3. What is the difference between correlation and causation?

 Correlation → Two variables move together.


 Causation → One variable actually causes the other to change.
👉 Example: Ice cream sales and drowning cases are correlated (both increase in summer)
but not causally related.

🔹 Section 2: Python Implementation


Q4. How do you calculate correlation in Python using pandas?

import pandas as pd

data = {"Hours":[1,2,3,4,5], "Score":[10,20,30,40,50]}


df = pd.DataFrame(data)

print(df.corr())
👉 .corr() gives Pearson correlation by default.

Q5. What are the common correlation methods in Python?

 Pearson → Linear relationship (default)


 Spearman → Rank-based correlation (monotonic relationship)
 Kendall → Rank correlation (less common)

print(df.corr(method="pearson"))
print(df.corr(method="spearman"))
print(df.corr(method="kendall"))

Q6. How do you calculate correlation between two variables only?

print(df["Hours"].corr(df["Score"]))
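For intuition, the Pearson coefficient can be written out by hand as the covariance divided by the product of the standard deviations. The sketch reuses the Hours/Score data above; the demand list is a hypothetical series added to show a negative value.

```python
import math


def pearson(xs, ys):
    """Pearson r = sum of co-deviations / (root sum sq. dev. of x * of y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


hours = [1, 2, 3, 4, 5]
score = [10, 20, 30, 40, 50]
demand = [50, 40, 30, 20, 10]  # hypothetical: falls as hours rise

print(round(pearson(hours, score), 6))   # 1.0
print(round(pearson(hours, demand), 6))  # -1.0
```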

Q7. How do you visualize correlation in Python?

import seaborn as sns


import matplotlib.pyplot as plt

sns.heatmap(df.corr(), annot=True, cmap="coolwarm")


plt.show()

👉 Heatmap shows correlation matrix visually.

🔹 Section 3: Applied Questions


Q8. If correlation between X and Y is +0.9, what does it mean?
👉 Very strong positive relationship. As X increases, Y increases.

Q9. If correlation between X and Y is -0.8, what does it mean?


👉 Strong negative relationship. As X increases, Y decreases.

Q10. If correlation between X and Y is 0, what does it mean?


👉 No linear relationship between X and Y.
Q11. How would you handle multicollinearity in regression?
👉 If independent variables are highly correlated, it causes problems in regression.
Solutions:

 Remove one variable


 Use Principal Component Analysis (PCA)
 Use Regularization (Ridge/Lasso)

Q12. What is the correlation matrix?


👉 A table showing correlation values between all pairs of variables in dataset.

print(df.corr())

🔹 Section 4: Faculty-Level Questions


Q13. When should you use Spearman correlation instead of Pearson?
👉 Use Spearman when data is non-linear but has a monotonic relationship (values move in
same order, not same rate).

Q14. Can correlation detect non-linear relationships?


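👉 Pearson measures only linear association, so it can understate or miss a non-linear relationship.
👉 Spearman/Kendall are rank-based and capture non-linear monotonic trends, but none of the three detects non-monotonic patterns (e.g., a U-shape).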
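A small sketch of the difference, assuming a hypothetical cubic relationship: y always rises with x, so the rank-based Spearman value is 1, while Pearson stays below 1 because the points do not lie on a straight line.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3, 4, 5]})
df["y"] = df["x"] ** 3  # monotonic but not linear (hypothetical data)

pearson_r = df.corr(method="pearson").loc["x", "y"]
spearman_r = df.corr(method="spearman").loc["x", "y"]

print(round(pearson_r, 3))   # below 1
print(round(spearman_r, 3))  # 1.0
```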

Q15. How do missing values affect correlation in Python?


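👉 By default, pandas excludes missing values pairwise: each coefficient uses only the rows where both columns have values. With many NaNs, different pairs can rest on different (and small) samples, which can distort results.
Solution: Clean or impute data before correlation analysis.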

Q16. Example of correlation in real-world data analysis?

 Marketing: Ad spend vs. Sales revenue


 Finance: Stock A vs. Stock B prices
 Health: Exercise hours vs. BMI
📘 Matplotlib in Python – Interview Q&A

🔹 Section 1: Basics
Q1. What is Matplotlib? Why is it used?
👉 Matplotlib is a data visualization library in Python. It is used to create 2D plots, bar charts,
histograms, scatter plots, etc.

Q2. How do you install and import Matplotlib?

pip install matplotlib


import matplotlib.pyplot as plt

👉 pyplot is the commonly used module.

Q3. What are some key features of Matplotlib?

 Supports many chart types (line, bar, scatter, histogram, pie).


 Highly customizable (colors, labels, legends).
 Works well with NumPy & Pandas.
 Can save plots as images.

🔹 Section 2: Basic Plots


Q4. How do you create a simple line plot?

import matplotlib.pyplot as plt

x = [1,2,3,4]
y = [2,4,6,8]

plt.plot(x, y)
plt.title("Line Chart")
plt.xlabel("X Axis")
plt.ylabel("Y Axis")
plt.show()
Q5. How do you add labels, title, and legend in a plot?

plt.plot(x, y, label="Line 1")


plt.xlabel("X Axis")
plt.ylabel("Y Axis")
plt.title("Line Plot Example")
plt.legend()
plt.show()

Q6. How do you plot multiple lines in one graph?

x = [1,2,3,4]
y1 = [1,2,3,4]
y2 = [2,4,6,8]

plt.plot(x, y1, label="y = x")


plt.plot(x, y2, label="y = 2x")
plt.legend()
plt.show()

Q7. How do you plot a bar chart in Matplotlib?

categories = ["A","B","C"]
values = [10,20,15]

plt.bar(categories, values, color="skyblue")


plt.title("Bar Chart")
plt.show()

Q8. How do you plot a histogram?


👉 Histogram shows frequency distribution.

data = [7,8,5,6,6,7,8,9,10,10,8,7,6]

plt.hist(data, bins=5, color="orange", edgecolor="black")


plt.title("Histogram Example")
plt.show()

Q9. How do you create a scatter plot?

x = [5,7,8,7,6,9,5,6,7,8]
y = [99,86,87,88,100,86,103,87,94,78]

plt.scatter(x, y, color="red")
plt.title("Scatter Plot")
plt.show()
Q10. How do you create a pie chart?

sizes = [30, 40, 20, 10]


labels = ["A","B","C","D"]

plt.pie(sizes, labels=labels, autopct="%1.1f%%", startangle=90)


plt.title("Pie Chart Example")
plt.show()

🔹 Section 3: Customization
Q11. How do you change line style, color, and markers?

plt.plot(x, y, color="green", linestyle="--", marker="o")


plt.show()

Q12. How do you add grid to a plot?

plt.plot(x, y)
plt.grid(True)
plt.show()

Q13. How do you adjust figure size in Matplotlib?

plt.figure(figsize=(8,5))
plt.plot(x, y)
plt.show()

Q14. How do you save a Matplotlib plot as an image?

plt.plot(x, y)
plt.savefig("plot.png")

Q15. How do you create subplots (multiple plots in one figure)?

plt.subplot(1, 2, 1) # 1 row, 2 columns, 1st plot


plt.plot([1,2,3],[4,5,6])

plt.subplot(1, 2, 2) # 2nd plot


plt.plot([1,2,3],[7,8,9])

plt.show()
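The same layout can also be sketched with the object-oriented interface, where plt.subplots() returns the figure and an array of axes in one call. The titles and figure size are illustrative choices.

```python
import matplotlib.pyplot as plt

# One row, two columns; axes is an array with one Axes object per panel.
fig, axes = plt.subplots(1, 2, figsize=(8, 3))

axes[0].plot([1, 2, 3], [4, 5, 6])
axes[0].set_title("Left")

axes[1].plot([1, 2, 3], [7, 8, 9])
axes[1].set_title("Right")

fig.tight_layout()  # avoid overlapping titles and tick labels
```

Call plt.show() afterwards to display the figure in a script; notebooks such as Colab usually render it automatically at the end of the cell.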
🔹 Section 4: Applied / Faculty-Level
Q16. What is the difference between plt.plot() and plt.scatter()?

 plt.plot() → Line plot (connects points).


 plt.scatter() → Individual points (no line).

Q17. How does Matplotlib integrate with Pandas?


👉 You can call plotting functions directly on a DataFrame.

import pandas as pd
df = pd.DataFrame({"x":[1,2,3,4], "y":[10,20,30,40]})
df.plot(x="x", y="y", kind="line")
plt.show()

Q18. What is the role of tight_layout() in Matplotlib?


👉 Adjusts spacing so titles, labels, and plots don’t overlap.

plt.tight_layout()

Q19. Can you make 3D plots in Matplotlib?


👉 Yes, using mpl_toolkits.mplot3d. Example:

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D # Only needed on Matplotlib older than 3.2

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
x = [1,2,3,4]
y = [10,20,30,40]
z = [5,15,25,35]
ax.scatter(x, y, z)
plt.show()

Q20. What are some limitations of Matplotlib?

 Syntax can be verbose.


 Not as modern-looking as Seaborn/Plotly.
 3D support is limited.
📘 NumPy in Python – Interview Q&A

🔹 Section 1: Basics
Q1. What is NumPy? Why is it used?
👉 NumPy (Numerical Python) is a Python library used for fast mathematical and scientific
computing.

 Provides ndarray (n-dimensional array)


 Faster than Python lists
 Supports linear algebra, statistics, Fourier transforms

Q2. How do you install and import NumPy?

pip install numpy


import numpy as np

Q3. Difference between Python list and NumPy array?

 List → Can store mixed data types, slower.


 NumPy array → Stores same data type, optimized for fast numerical operations.

a = [1,2,3] # Python list


b = np.array([1,2,3]) # NumPy array

Q4. What is an ndarray?


👉 ndarray is the core data structure of NumPy — a multi-dimensional array.

🔹 Section 2: Creating Arrays


Q5. How do you create a NumPy array?

arr = np.array([1,2,3,4])
print(arr)
Q6. How do you create arrays of zeros/ones/random numbers?

np.zeros((2,3)) # 2x3 array of zeros


np.ones((2,3)) # 2x3 array of ones
np.random.rand(2,3) # 2x3 array of random numbers

Q7. What is the difference between arange() and linspace()?

 np.arange(start, stop, step) → Like Python range()


 np.linspace(start, stop, num) → Divides range into equal parts

np.arange(0,10,2) # [0,2,4,6,8]
np.linspace(0,1,5) # [0. ,0.25,0.5,0.75,1.]

🔹 Section 3: Array Operations


Q8. How do you check array shape and size?

arr = np.array([[1,2,3],[4,5,6]])
print(arr.shape) # (2,3)
print(arr.size) # 6

Q9. How do you perform element-wise operations?

a = np.array([1,2,3])
b = np.array([4,5,6])
print(a + b) # [5 7 9]
print(a * b) # [ 4 10 18]

Q10. How do you calculate statistics in NumPy?

arr = np.array([1,2,3,4,5])
print(arr.mean()) # 3.0
print(arr.std()) # 1.414...
print(arr.sum()) # 15

Q11. How do you reshape arrays?

arr = np.array([1,2,3,4,5,6])
print(arr.reshape(2,3)) # 2x3 matrix

Q12. How do you slice and index NumPy arrays?


arr = np.array([10,20,30,40,50])
print(arr[1:4]) # [20 30 40]

For 2D arrays:

mat = np.array([[1,2,3],[4,5,6]])
print(mat[0,1]) # 2

🔹 Section 4: Linear Algebra


Q13. How do you perform matrix multiplication?

a = np.array([[1,2],[3,4]])
b = np.array([[5,6],[7,8]])
print(np.dot(a,b))

Q14. How do you find transpose and inverse of a matrix?

print(a.T) # Transpose
print(np.linalg.inv(a)) # Inverse

Q15. How do you calculate eigenvalues and eigenvectors?

vals, vecs = np.linalg.eig(a)


print("Eigenvalues:", vals)
print("Eigenvectors:", vecs)

🔹 Section 5: Applied / Faculty-Level


Q16. How is NumPy faster than lists?
👉 NumPy uses contiguous memory blocks (C implementation) → operations are vectorized
(no Python loops).
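A minimal sketch of what "vectorized" means in practice: the list version runs a Python-level loop, while the array version expresses the same work as a single operation. No timings are claimed here; the actual speed-up depends on array size and hardware.

```python
import numpy as np

values = list(range(1, 6))

# List: each element is visited by an explicit Python loop.
doubled_list = [v * 2 for v in values]

# Array: one expression, with the loop executed inside NumPy's compiled code.
arr = np.array(values)
doubled_arr = arr * 2

print(doubled_list)          # [2, 4, 6, 8, 10]
print(doubled_arr.tolist())  # [2, 4, 6, 8, 10]
```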

Q17. What is broadcasting in NumPy?


👉 Broadcasting allows operations between arrays of different shapes.

a = np.array([1,2,3])
b = 5
print(a + b) # [6 7 8]

Q18. How do you stack arrays vertically and horizontally?


a = np.array([1,2])
b = np.array([3,4])
print(np.vstack((a,b))) # Vertical
print(np.hstack((a,b))) # Horizontal

Q19. What is the difference between copy() and view()?

 view() → Creates a new view (changes affect original).


 copy() → Creates an independent array.
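The sketch below shows the difference on a small array with made-up values: writing through the view changes the original, writing to the copy does not.

```python
import numpy as np

base = np.array([1, 2, 3, 4])
v = base.view()  # shares the same underlying data
c = base.copy()  # owns its own data

v[0] = 100  # visible in base
c[1] = 200  # not visible in base

print(base.tolist())               # [100, 2, 3, 4]
print(np.shares_memory(base, v))   # True
print(np.shares_memory(base, c))   # False
```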

Q20. Real-world use of NumPy in Data Analysis?

 Handling large datasets efficiently


 Numerical computations (mean, std, matrix ops)
 Pre-processing before machine learning
📘 Pandas in Python – Interview Q&A

🔹 Section 1: Basics
Q1. What is Pandas? Why is it used?
👉 Pandas is a Python library for data manipulation and analysis.

 Provides two main data structures:


o Series → 1D (like a column in Excel)
o DataFrame → 2D (like a table in Excel)
 Used for cleaning, filtering, grouping, joining, and analyzing data.

Q2. How do you install and import Pandas?

pip install pandas


import pandas as pd

Q3. What is the difference between a Series and a DataFrame?

 Series → One-dimensional (labels + values).

s = pd.Series([10,20,30], index=["a","b","c"])

 DataFrame → Two-dimensional (rows + columns).

df = pd.DataFrame({"Name":["A","B"], "Score":[85,90]})

🔹 Section 2: Data Input & Inspection


Q4. How do you read and write CSV files in Pandas?

df = pd.read_csv("data.csv") # Read
df.to_csv("output.csv", index=False) # Write

Q5. How do you inspect a DataFrame quickly?

 df.head() → first 5 rows


 df.tail() → last 5 rows
 df.info() → data types, null values
 df.describe() → summary statistics
 df.shape → (rows, columns)

Q6. How do you select a column and multiple columns?

df["Name"] # Single column


df[["Name","Score"]] # Multiple columns

Q7. How do you select rows by index and by condition?

df.iloc[0] # First row by position


df[df["Score"] > 80] # Rows where Score > 80

🔹 Section 3: Data Cleaning


Q8. How do you handle missing values in Pandas?

df.dropna() # Drop missing rows


df.fillna(0) # Replace NaN with 0
df["Score"].fillna(df["Score"].mean()) # Fill with mean

Q9. How do you rename columns?

df.rename(columns={"Name":"Student_Name"}, inplace=True)

Q10. How do you drop a column?

df.drop("Score", axis=1, inplace=True)

Q11. How do you change data type of a column?

df["Score"] = df["Score"].astype(float)

🔹 Section 4: Data Operations


Q12. How do you sort a DataFrame?
df.sort_values("Score", ascending=False)

Q13. How do you group data in Pandas?

df.groupby("Name")["Score"].mean()

Q14. How do you merge two DataFrames?

pd.merge(df1, df2, on="ID")

Q15. How do you concatenate DataFrames vertically and horizontally?

pd.concat([df1, df2], axis=0) # Vertical


pd.concat([df1, df2], axis=1) # Horizontal
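The snippets above assume df1 and df2 already exist. A self-contained sketch with two small hypothetical tables shows how merge() matches rows on a key while concat() stacks frames:

```python
import pandas as pd

# Hypothetical tables sharing an "ID" key.
students = pd.DataFrame({"ID": [1, 2, 3], "Name": ["A", "B", "C"]})
marks = pd.DataFrame({"ID": [2, 3, 4], "Score": [90, 95, 70]})

inner = pd.merge(students, marks, on="ID")             # only IDs in both: 2, 3
left = pd.merge(students, marks, on="ID", how="left")  # all students; ID 1 has NaN Score

# concat stacks frames without matching keys.
stacked = pd.concat([students, students], axis=0, ignore_index=True)

print(inner)
print(left)
print(stacked.shape)  # (6, 2)
```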

Q16. How do you apply functions to columns?

df["Score"].apply(lambda x: x*2)

Q17. How do you get unique values and their counts?

df["Name"].unique()
df["Name"].value_counts()

Q18. How do you find correlation between columns in Pandas?

df.corr(numeric_only=True) # Skip non-numeric columns such as Name

🔹 Section 5: Advanced / Faculty-Level


Q19. What is the difference between loc[] and iloc[]?

 loc[] → Label-based (row/column names).


 iloc[] → Index-based (numeric positions).

df.loc[0,"Name"] # Row labelled 0, Name column
df.iloc[0,0] # First row, first column by position (same value with the default 0,1,2... index)

Q20. How do you pivot a DataFrame?


df.pivot(index="Name", columns="Subject", values="Score")
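The pivot call above needs a long-format table with a Subject column. A runnable sketch with hypothetical data:

```python
import pandas as pd

# Long format: one row per (Name, Subject) pair (hypothetical data).
long_df = pd.DataFrame({
    "Name": ["A", "A", "B", "B"],
    "Subject": ["Math", "Stats", "Math", "Stats"],
    "Score": [85, 90, 70, 75],
})

# Wide format: one row per Name, one column per Subject.
wide = long_df.pivot(index="Name", columns="Subject", values="Score")
print(wide)
```

pivot() raises an error if a (Name, Subject) pair appears more than once; pivot_table() with an aggregation function is the usual alternative in that case.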

Q21. What is the difference between apply(), map(), and applymap()?

 apply() → Applies function along rows/columns.
 map() → Applies function element-wise to a Series.
 applymap() → Applies function element-wise to whole DataFrame (deprecated since pandas 2.1 in favor of DataFrame.map()).
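A short sketch of the three on a hypothetical marks table. Because the element-wise DataFrame method was renamed in pandas 2.1, the code picks whichever name the installed version provides.

```python
import pandas as pd

df = pd.DataFrame({"Math": [80, 90], "Stats": [70, 60]})  # hypothetical marks

# apply(): the function receives a whole column (or a row with axis=1).
col_range = df.apply(lambda col: col.max() - col.min())
row_total = df.apply(lambda row: row["Math"] + row["Stats"], axis=1)

# Series.map(): the function receives one element at a time.
grades = df["Math"].map(lambda s: "A" if s >= 85 else "B")

# Element-wise over the whole DataFrame: DataFrame.map in pandas 2.1+,
# applymap in older versions.
elementwise = df.map if hasattr(df, "map") else df.applymap
doubled = elementwise(lambda v: v * 2)

print(col_range.tolist())  # [10, 10]
print(row_total.tolist())  # [150, 150]
print(grades.tolist())     # ['B', 'A']
```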

Q22. How do you check for duplicates and remove them?

df.duplicated()
df.drop_duplicates(inplace=True)

Q23. How do you export DataFrame to Excel?

df.to_excel("output.xlsx", index=False)

Q24. How do you handle large datasets efficiently in Pandas?

 Use chunksize while reading files.


 Use df.sample() to test on subset.
 Use categorical data types for memory saving.
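Two of these ideas can be sketched on a small in-memory CSV (hypothetical data, built with io.StringIO so no file is needed): reading in chunks, and converting a repetitive text column to the category dtype.

```python
import io
import pandas as pd

# Hypothetical CSV: 1000 rows with only two distinct city names.
cities = ["Delhi", "Mumbai"] * 500
csv_text = "City,Sales\n" + "\n".join(f"{c},{i}" for i, c in enumerate(cities))

# chunksize: process the file piece by piece instead of loading it all at once.
total = 0
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=250):
    total += int(chunk["Sales"].sum())

# category dtype: store each distinct label once plus small integer codes.
df = pd.read_csv(io.StringIO(csv_text))
as_text = df["City"].memory_usage(deep=True)
as_category = df["City"].astype("category").memory_usage(deep=True)

print(total)  # 499500
print(as_text, as_category)
```

The exact byte counts depend on the pandas version, so only the comparison between the two matters.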

Q25. Real-world use of Pandas in data analysis?

 Importing raw data (CSV, Excel, SQL)


 Cleaning & transforming datasets
 Exploratory Data Analysis (EDA)
 Feature engineering for ML models
📘 Google Colab – Interview Q&A

🔹 Section 1: Basics
Q1. What is Google Colab? Why is it used?
👉 Google Colab (Colaboratory) is a free, cloud-based Jupyter notebook environment
provided by Google.

 No installation required (runs in browser).


 Free GPU/TPU support.
 Great for teaching, collaboration, and data analysis.

Q2. What is the difference between Jupyter Notebook and Google Colab?

 Jupyter Notebook → Local installation, runs on your computer.


 Google Colab → Cloud-based, runs on Google servers, shareable like Google Docs.

Q3. Do you need to install Python for Colab?


👉 No. Colab already has Python + popular libraries (NumPy, Pandas, Matplotlib, Scikit-
learn, TensorFlow, etc.) pre-installed.

Q4. How do you access Google Colab?


👉 Open https://colab.research.google.com with a Google account → Create New Notebook.

🔹 Section 2: Features & Usage


Q5. How do you write and run code in Colab?

 Create a code cell → Write Python code → Press Shift + Enter.


 Create a text cell → Use Markdown for notes, explanations.
Q6. How do you mount Google Drive in Colab?
👉 To access files stored in Drive:

from google.colab import drive


drive.mount('/content/drive')

Q7. How do you upload and download files in Colab?

 Upload from local system:

from google.colab import files


uploaded = files.upload()

 Download from Colab:

files.download("output.csv")

Q8. How do you install external libraries in Colab?

!pip install seaborn

👉 Exclamation mark ! is used to run shell commands in Colab.

Q9. How do you check GPU availability in Colab?

import tensorflow as tf
print(tf.config.list_physical_devices("GPU"))

Or use: !nvidia-smi

Q10. How do you change runtime in Colab?


👉 Menu: Runtime → Change runtime type → Select GPU/TPU

🔹 Section 3: Collaboration & Teaching


Q11. How do you share a Colab notebook with students?
👉 Click Share → Set permission (View/Edit) → Share link (similar to Google Docs).
Q12. Can multiple people edit the same Colab notebook at once?
👉 Yes, real-time collaboration is supported (like Google Docs).

Q13. How do you use Markdown in Colab?


👉 Markdown is used in text cells for notes, equations, and formatting. Example:

# Heading 1
## Heading 2
**Bold Text**
*Italic Text*
- Bullet Point

For math equations:

$y = mx + b$

🔹 Section 4: Applied Questions


Q14. How do you import data from a URL in Colab?

import pandas as pd
url = "https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv"
df = pd.read_csv(url)
print(df.head())

Q15. How do you connect Colab with Kaggle datasets?

!pip install kaggle


from google.colab import files
files.upload() # Upload kaggle.json API key

Q16. What are the advantages of using Colab for teaching?

 No setup issues for students.


 Easy sharing and collaboration.
 Free GPU/TPU for ML/DL experiments.
 Integration with Google Drive for saving work.

Q17. What are some limitations of Colab?

 Requires internet connection.


 Limited session time (~12 hrs).
 Limited free resources (GPU quota).
 Not suitable for very large datasets.

Q18. Can Colab be used for SQL?


👉 Yes, using SQLite or BigQuery. Example:

import sqlite3
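A minimal SQLite sketch that would run in a Colab cell as-is. It uses an in-memory database, and the scores table with its rows is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary database, nothing written to disk
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("A", 85), ("B", 90), ("C", 95)],
)

rows = conn.execute(
    "SELECT name, score FROM scores WHERE score > 85 ORDER BY score"
).fetchall()
avg = conn.execute("SELECT AVG(score) FROM scores").fetchone()[0]
conn.close()

print(rows)  # [('B', 90), ('C', 95)]
print(avg)   # 90.0
```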

Q19. Can you run shell commands in Colab?


👉 Yes, by prefixing with !

!ls
!pwd

Q20. Real-world use of Colab in data analysis?

 Running Python data analysis labs for students.


 Sharing notebooks for collaborative research.
 Training ML models on free GPU.
 Quick prototyping without local setup.
