Python ML Interview Questions
1. What is the difference between list and tuple?
➢ list → mutable: items can be added, removed, or replaced after creation
➢ tuple → immutable: its contents can’t be changed after creation
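The difference shows up as soon as you try to modify each one. A minimal sketch (the variable names are only illustrative):

```python
# A list can be modified in place.
fruits = ["apple", "banana"]
fruits.append("cherry")      # grows the list
fruits[0] = "apricot"        # replaces an element

# A tuple cannot: item assignment raises TypeError.
point = (3, 4)
try:
    point[0] = 10
    tuple_changed = True
except TypeError:
    tuple_changed = False

print(fruits)         # ['apricot', 'banana', 'cherry']
print(tuple_changed)  # False
```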
2. What is a loop in Python?
➢ for loop – goes through the items of a sequence or other iterable one by one (for example, range(5) to repeat five times)
➢ while loop – repeats as long as a condition is True
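Both loop types side by side, as a small sketch with made-up variable names:

```python
# for loop: visit each item of an iterable once.
squares = []
for n in range(1, 4):        # n takes the values 1, 2, 3
    squares.append(n * n)

# while loop: repeat as long as the condition stays True.
countdown = []
remaining = 3
while remaining > 0:
    countdown.append(remaining)
    remaining -= 1           # without this the loop would never end

print(squares)    # [1, 4, 9]
print(countdown)  # [3, 2, 1]
```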
3. What is the difference between is and ==?
➢ == → compares values (are the two objects equal?)
➢ is → compares identity (do the two names refer to the same object in memory?)
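Two lists with the same contents are equal but are not the same object. A short sketch:

```python
a = [1, 2, 3]
b = [1, 2, 3]   # a separate list with the same contents
c = a           # a second name for the same list object

same_value = (a == b)
same_object = (a is b)
alias_is_same = (a is c)

print(same_value)     # True
print(same_object)    # False
print(alias_is_same)  # True
```

In practice, `is` is mostly used for checks against None, as in `if x is None`.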
4. What is a regular expression (regex)?
It’s a pattern that describes a set of strings. It is used to search, match, or replace
text, such as finding phone numbers, emails, or words.
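A sketch using Python's built-in re module. The sample text and both patterns are assumptions made for illustration, and they are deliberately simple: real email and phone formats are messier.

```python
import re

text = "Contact alice@example.com or bob@example.org, or call 555-123-4567."

# Simplified patterns, not a full validation of either format.
email_pattern = r"[\w.+-]+@[\w-]+\.[\w.]+"
phone_pattern = r"\d{3}-\d{3}-\d{4}"

emails = re.findall(email_pattern, text)
phones = re.findall(phone_pattern, text)

print(emails)  # ['alice@example.com', 'bob@example.org']
print(phones)  # ['555-123-4567']
```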
5. What is the difference between a Python list and a NumPy array?
➢ List: general-purpose container, can hold different data types, slower for numeric work
➢ NumPy Array: holds only one data type, supports fast element-wise (vectorized) math
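A small sketch of both points, assuming NumPy is installed (the price values are made up):

```python
import numpy as np

prices = [10.0, 20.0, 30.0]

# With a list, element-wise math needs an explicit loop or comprehension.
doubled_list = [p * 2 for p in prices]

# Multiplying the list itself repeats it instead of doing math.
repeated_list = prices * 2

# With an array, the operation applies to every element at once.
price_array = np.array(prices)
doubled_array = price_array * 2

# A list keeps mixed types; an array converts everything to one common dtype.
mixed_list = [1, 2.5, 3]
mixed_array = np.array(mixed_list)   # all elements become float64

print(repeated_list)           # [10.0, 20.0, 30.0, 10.0, 20.0, 30.0]
print(doubled_array.tolist())  # [20.0, 40.0, 60.0]
print(mixed_array.dtype)       # float64
```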
6. What is the difference between AI, Machine Learning, and Deep Learning?
➢ AI (Artificial Intelligence) is the broad field of building machines that perform tasks
that normally need human intelligence, like a program that can play chess.
➢ Machine Learning (ML) is a part of AI where machines learn from data instead of
following hand-written rules, like a program that learns to recognize cats in photos.
➢ Deep Learning is a part of ML that uses neural networks with many layers, loosely
inspired by the brain. It works well for things like voice assistants or self-driving cars.
7. What are the 3 types of Machine Learning?
➢ Supervised Learning – You give the computer both the question and the answer (like
pictures of animals with names), so it learns to predict the answer for new examples.
➢ Unsupervised Learning – You only give data (no answers), and it finds patterns (like
grouping similar customers).
➢ Reinforcement Learning – The computer learns by trial and error, getting rewards or
penalties (like learning to play a video game).
8. What is overfitting and how can you avoid it?
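Overfitting means your model fits the training data too closely. It memorizes the examples,
including their noise, instead of learning the general pattern, so it does well on the
training data but poorly on new data.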
How to avoid it:
➢ Use more data
➢ Make your model simpler
➢ Use tricks like regularization, dropout, or early stopping
9. What are some important Python libraries used in AI/ML?
➢ NumPy – for numbers and arrays
➢ Pandas – for working with tables and data
➢ Matplotlib / Seaborn – for charts and graphs
➢ Scikit-learn – for basic machine learning
➢ TensorFlow / PyTorch – for deep learning (neural networks)
➢ OpenCV – for images
➢ NLTK / spaCy – for working with text (like chatbots)
10. How do you deal with missing data in a dataset?
If some values are missing:
➢ You can remove that row or column.
➢ You can fill in (impute) the missing part using the mean or median for numbers, or the most common value for categories.
➢ You can use models that can handle missing data (like XGBoost).
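The first two options can be sketched in plain Python. The records below are hypothetical, and None stands for a missing value. In practice this is usually done with Pandas, but the idea is the same.

```python
from statistics import mean, mode

# Hypothetical records; None marks a missing value.
rows = [
    {"age": 25, "city": "Paris"},
    {"age": None, "city": "Paris"},
    {"age": 35, "city": None},
    {"age": 30, "city": "Rome"},
]

# Option 1: drop every row that has a missing value.
complete_rows = [r for r in rows if None not in r.values()]

# Option 2: fill numeric gaps with the mean, categorical gaps with the mode.
age_mean = mean(r["age"] for r in rows if r["age"] is not None)
city_mode = mode(r["city"] for r in rows if r["city"] is not None)
filled_rows = [
    {
        "age": r["age"] if r["age"] is not None else age_mean,
        "city": r["city"] if r["city"] is not None else city_mode,
    }
    for r in rows
]

print(len(complete_rows))  # 2
print(filled_rows[1])      # {'age': 30, 'city': 'Paris'}
print(filled_rows[2])      # {'age': 35, 'city': 'Paris'}
```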
11. What is the bias-variance tradeoff?
➢ Bias = error from a model that is too simple → misses real patterns (underfitting)
➢ Variance = error from a model that is too complex → memorizes training data and reacts to noise (overfitting)
➢ A good model needs a balance – not too simple, not too complex.
12. What’s the difference between classification and regression?
➢ Classification = predicting a label (e.g., spam or not spam)
➢ Regression = predicting a number (e.g., price of a house)
13. What is an activation function in neural networks?
An activation function is applied to a neuron's output and adds non-linearity, which lets
the network learn complex patterns. Without it, stacked layers would behave like a single
linear model. It decides how strongly a neuron should "fire".
Common ones:
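➢ ReLU – outputs max(0, x); fast and simple
➢ Sigmoid – squashes any value into the range 0 to 1; good for probabilities
➢ Softmax – turns a list of scores into probabilities that sum to 1; used for multi-class classification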
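The three functions are short enough to write by hand. This is a plain-Python sketch for single values and small lists; deep learning libraries provide their own optimized versions.

```python
import math

def relu(x):
    # Passes positive values through, clips negatives to zero.
    return max(0.0, x)

def sigmoid(x):
    # Maps any real number into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def softmax(scores):
    # Subtracting the max first avoids overflow for large scores.
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])

print(relu(-3.0))    # 0.0
print(sigmoid(0.0))  # 0.5
```

The softmax output keeps the order of the input scores, so the highest score gets the highest probability.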
14. What is gradient descent?
It's an optimization method used to train the model. It repeatedly adjusts the model's
parameters by a small step in the direction that reduces the error (the loss), like walking
downhill a little at a time until you reach the bottom.
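A toy sketch of the idea with one parameter. The loss function (w - 3)^2, the starting point, and the learning rate are all assumptions chosen so the answer is easy to check: the minimum is at w = 3.

```python
def loss(w):
    # Toy loss with its minimum at w = 3.
    return (w - 3.0) ** 2

def gradient(w):
    # Derivative of the loss with respect to w.
    return 2.0 * (w - 3.0)

w = 0.0              # starting guess
learning_rate = 0.1  # size of each step

for step in range(100):
    # Move a small step against the gradient, i.e. downhill.
    w -= learning_rate * gradient(w)

print(round(w, 4))  # 3.0
```

A learning rate that is too large can overshoot the minimum, and one that is too small makes training slow.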
15. What’s the difference between bagging and boosting?
➢ Bagging = train many models independently (they can run in parallel) on random samples
of the data and combine their answers by voting or averaging (like Random Forest).
➢ Boosting = train models one after the other, where each one tries to fix the mistakes of
the ones before it (like XGBoost).
16. What is a confusion matrix?
It’s a table that compares a classification model’s predictions with the actual labels.
➢ TP = Model said "Yes", and it was really "Yes"
➢ FP = Model said "Yes", but it was "No"
➢ FN = Model said "No", but it was "Yes"
➢ TN = Model said "No", and it was really "No"
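The four counts can be tallied directly from two lists. In this sketch the labels are made up, with 1 meaning "Yes" and 0 meaning "No":

```python
# Hypothetical labels: 1 = "Yes", 0 = "No".
y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # what really happened
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]   # what the model said

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)
fp = sum(1 for t, p in pairs if t == 0 and p == 1)
fn = sum(1 for t, p in pairs if t == 1 and p == 0)
tn = sum(1 for t, p in pairs if t == 0 and p == 0)

print(tp, fp, fn, tn)  # 3 1 1 3
```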
17. What is one-hot encoding?
It’s a way to turn words or categories into numbers. Each category becomes its own column,
and each row gets a 1 in the column of its category and 0 everywhere else.
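A plain-Python sketch with a made-up color column. Libraries such as Pandas and Scikit-learn have ready-made encoders, but the logic is just this:

```python
colors = ["red", "green", "blue", "green"]

# One column per distinct category, in a fixed (sorted) order.
categories = sorted(set(colors))

# For each value, put 1 in its own category's column and 0 elsewhere.
one_hot = [[1 if c == cat else 0 for cat in categories] for c in colors]

print(categories)  # ['blue', 'green', 'red']
print(one_hot)     # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```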
18. How do you know if your model is good?
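For classification, use:
➢ Accuracy – share of all predictions that were correct: (TP + TN) / total
➢ Precision – of the predicted "yes", how many were really "yes": TP / (TP + FP)
➢ Recall – of the real "yes", how many were found: TP / (TP + FN)
➢ F1 Score – harmonic mean of precision and recall, so it is high only when both are high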
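All four metrics follow from the confusion-matrix counts. The counts in this sketch are invented for illustration:

```python
# Hypothetical confusion-matrix counts.
tp, fp, fn, tn = 40, 10, 20, 30

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(round(accuracy, 3))   # 0.7
print(round(precision, 3))  # 0.8
print(round(recall, 3))     # 0.667
print(round(f1, 3))         # 0.727
```

Accuracy alone can mislead when one class is much more common than the other, which is why precision and recall are reported alongside it.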
19. What’s the difference between shallow and deep learning?
➢ Shallow learning – uses simple models like decision trees or linear regression
➢ Deep learning – uses neural networks with many layers (great for voice, images, etc.)
20. How does a decision tree work?
It asks a yes/no question about one feature at each step to split the data. It keeps asking
until it reaches a final answer (a leaf).
Like:
➢ Is the color red? Yes.
➢ Is the size big? Yes.
➢ Then it's a "Truck".
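Written as code, a decision tree is a set of nested if statements. This sketch hard-codes the questions from the example. The function name and the "Car" and "Bicycle" labels are hypothetical additions to make the other branches return something. A real decision tree learns its questions from the training data rather than having them written by hand.

```python
def classify_vehicle(color, size):
    # First question: is the color red?
    if color == "red":
        # Second question: is the size big?
        if size == "big":
            return "Truck"
        return "Car"       # hypothetical label for red but not big
    return "Bicycle"       # hypothetical label for everything else

print(classify_vehicle("red", "big"))     # Truck
print(classify_vehicle("red", "small"))   # Car
print(classify_vehicle("blue", "small"))  # Bicycle
```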