
Top 50 Machine Learning Interview Questions & Answers

1. What is the difference between supervised and unsupervised learning?

Supervised learning uses labeled data; unsupervised does not.

2. What is overfitting and how can it be prevented?

Overfitting is when a model learns noise in the training data instead of the underlying pattern. Prevent it with regularization, more data, or simpler models; cross-validation helps detect it.

3. What is bias-variance tradeoff?

It's the tradeoff between underfitting (high bias) and overfitting (high variance).

4. What are precision and recall?

Precision = TP / (TP + FP), Recall = TP / (TP + FN).

5. What is F1-score?

Harmonic mean of precision and recall.

6. What is a confusion matrix?

A table showing TP, TN, FP, FN.
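
The four confusion-matrix counts and the metrics from questions 4–6 can be computed by hand. A minimal pure-Python sketch (the function names and example labels are illustrative, not from the document):

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, TN for a binary problem."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return tp, fp, fn, tn

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)

precision = tp / (tp + fp)                            # 3 / 4 = 0.75
recall = tp / (tp + fn)                               # 3 / 4 = 0.75
f1 = 2 * precision * recall / (precision + recall)    # 0.75
```

In practice you would use `sklearn.metrics` for this; the loop above just makes the formulas concrete.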

7. What is the difference between classification and regression?

Classification predicts labels; regression predicts continuous values.

8. What is cross-validation?

Technique that estimates generalization by repeatedly splitting data into training and validation folds (e.g., k-fold) and averaging the scores.
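
A minimal sketch of how k-fold splitting works under the hood (index generation only; the function name is illustrative):

```python
def kfold_indices(n, k):
    """Yield (train_idx, test_idx) pairs covering n samples in k folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    idx, start = list(range(n)), 0
    for size in fold_sizes:
        test = idx[start:start + size]
        train = idx[:start] + idx[start + size:]
        yield train, test
        start += size

splits = list(kfold_indices(10, 5))  # 5 folds of 2 test samples each
```

Libraries such as scikit-learn (`KFold`, `cross_val_score`) add shuffling and scoring on top of this idea.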

9. What is regularization?

Adding a penalty on model weights to the loss function to reduce overfitting (e.g., L1 or L2).

10. Difference between L1 and L2 regularization?

L1 penalizes the absolute values of weights (drives some to exactly zero, giving sparsity); L2 penalizes their squares (shrinks all weights smoothly).
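
The two penalty terms side by side, as they would be added to a loss. A toy sketch (weight values and `lam` are made up for illustration):

```python
def l1_penalty(weights, lam):
    """Lasso-style penalty: lambda * sum(|w|)."""
    return lam * sum(abs(w) for w in weights)

def l2_penalty(weights, lam):
    """Ridge-style penalty: lambda * sum(w^2)."""
    return lam * sum(w * w for w in weights)

w = [0.5, -2.0, 0.0, 1.5]
p1 = l1_penalty(w, 0.1)   # 0.1 * (0.5 + 2.0 + 0.0 + 1.5) = 0.4
p2 = l2_penalty(w, 0.1)   # 0.1 * (0.25 + 4.0 + 0.0 + 2.25) = 0.65
```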

11. How does a decision tree work?

Splits data using feature thresholds to reduce impurity.


12. What is pruning in decision trees?

Removing less useful branches to prevent overfitting.

13. What is ensemble learning?

Combining models to improve performance.

14. Difference between Bagging and Boosting?

Bagging trains models independently (in parallel) on bootstrap samples and averages or votes; boosting trains models sequentially, each focusing on the previous models' errors.
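
The two building blocks of bagging, bootstrap resampling and majority voting, fit in a few lines. A toy sketch (the data and predictions are invented for illustration):

```python
import random

def bootstrap_sample(data, rng):
    """Sample len(data) items with replacement."""
    return [rng.choice(data) for _ in range(len(data))]

def majority_vote(predictions):
    """Return the most common predicted label."""
    return max(set(predictions), key=predictions.count)

rng = random.Random(0)
data = list(range(10))
sample = bootstrap_sample(data, rng)   # same size, duplicates allowed
vote = majority_vote([1, 0, 1, 1, 0])  # 1 wins 3-2
```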

15. What is Random Forest?

An ensemble of decision trees, each trained on a bootstrap sample with random feature subsets; predictions are averaged (regression) or voted (classification).

16. What is XGBoost?

Efficient gradient boosting framework.

17. What is SVM?

Classifier that finds the best separating hyperplane.

18. What is the kernel trick?

Computes inner products in a high-dimensional feature space without explicitly mapping the data there, allowing linear separation in that space.
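
The trick can be verified directly: a degree-2 polynomial kernel equals the dot product of an explicit feature map, but never constructs that map. A sketch for 2-D inputs (the feature map and example points are illustrative):

```python
import math

def poly_kernel(x, y):
    """Degree-2 polynomial kernel: (x . y + 1)^2."""
    return (sum(a * b for a, b in zip(x, y)) + 1) ** 2

def phi(x):
    """Explicit 6-D feature map that poly_kernel implicitly uses."""
    x1, x2 = x
    r2 = math.sqrt(2)
    return [1.0, r2 * x1, r2 * x2, x1 * x1, x2 * x2, r2 * x1 * x2]

x, y = (1.0, 2.0), (3.0, 4.0)
k = poly_kernel(x, y)                                   # 144.0
explicit = sum(a * b for a, b in zip(phi(x), phi(y)))   # same value
```

The kernel evaluates one dot product and a square; the explicit route needs six features per point, and the gap grows quickly with dimension and degree.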

19. What is a ROC curve?

Graph of TPR vs FPR. AUC measures performance.
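
AUC has a direct interpretation: the probability that a random positive example scores higher than a random negative one. A brute-force pure-Python sketch of that definition (labels and scores are made up for illustration):

```python
def pairwise_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pairs, wins = 0, 0.0
    for t, s in zip(y_true, scores):
        if t != 1:
            continue
        for t2, s2 in zip(y_true, scores):
            if t2 == 0:
                pairs += 1
                if s > s2:
                    wins += 1
                elif s == s2:
                    wins += 0.5  # ties count half
    return wins / pairs

auc_val = pairwise_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2])  # 3 of 4 pairs correct
```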

20. What is a learning rate?

Controls step size during optimization.

21. What is gradient descent?

Algorithm to minimize loss function.
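
Gradient descent in its simplest form: repeatedly step opposite the gradient, scaled by the learning rate from question 20. A one-variable sketch minimizing f(x) = (x - 3)^2 (the function and settings are illustrative):

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Minimize a 1-D function given its gradient."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)  # step against the slope
    return x

# f(x) = (x - 3)^2 has gradient 2 * (x - 3) and minimum at x = 3
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

A learning rate that is too large makes the iterates diverge; too small and convergence is slow — which is why the learning rate itself is a hyperparameter (question 25).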

22. What is feature engineering?

Creating features to improve model performance.

23. What is one-hot encoding?

Binary vector representation of categories.


24. What is PCA?

Reduces dimensionality while keeping variance.

25. What are hyperparameters?

Configurable settings like depth, learning rate.

26. How do you tune hyperparameters?

Using GridSearchCV or RandomizedSearchCV.
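
GridSearchCV is, at its core, an exhaustive loop over the Cartesian product of parameter values, keeping the best validation score. A stripped-down pure-Python sketch (the scoring function here is a toy stand-in for cross-validated model accuracy):

```python
import itertools

def grid_search(param_grid, score_fn):
    """Try every parameter combination; return the best one and its score."""
    best_params, best_score = None, float("-inf")
    keys = list(param_grid)
    for values in itertools.product(*(param_grid[k] for k in keys)):
        params = dict(zip(keys, values))
        s = score_fn(params)
        if s > best_score:
            best_params, best_score = params, s
    return best_params, best_score

# toy score that peaks at depth=3, lr=0.1
score = lambda p: -abs(p["depth"] - 3) - abs(p["lr"] - 0.1)
best, _ = grid_search({"depth": [1, 3, 5], "lr": [0.01, 0.1]}, score)
```

RandomizedSearchCV replaces the exhaustive product with random draws from the grid, which scales better when there are many parameters.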

27. What is early stopping?

Stops training when validation score worsens.
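
Early stopping is usually implemented with a patience counter: stop once the validation score has failed to improve for a fixed number of epochs. A minimal sketch over a precomputed score list (the scores and patience value are illustrative):

```python
def train_with_early_stopping(val_scores, patience=2):
    """Return (best_epoch, best_score), stopping after `patience` flat epochs."""
    best, best_epoch, waited = float("-inf"), -1, 0
    for epoch, score in enumerate(val_scores):
        if score > best:
            best, best_epoch, waited = score, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break  # no improvement for `patience` epochs
    return best_epoch, best

# validation accuracy per epoch; stops after epochs 3-4 show no gain
epoch, score = train_with_early_stopping([0.70, 0.75, 0.78, 0.77, 0.76, 0.80])
```

Note the 0.80 at the end is never reached — patience trades a small risk of stopping too soon for much shorter training.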

28. What is a neural network?

Layered structure for learning patterns.

29. What is a CNN?

Neural network for image tasks using filters.

30. What is ReLU?

Activation function: max(0, x).

31. What is dropout?

Randomly disables neurons to reduce overfitting.

32. What is transfer learning?

Using pretrained model for new tasks.

33. What is an epoch?

One pass through the training data.

34. What is batch size?

Number of samples per training step.

35. Regression evaluation metrics?

MSE, RMSE, MAE, R2 score.
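
All four metrics come straight from the residuals. A pure-Python sketch (the example targets and predictions are invented for illustration):

```python
def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE, and R^2 for paired lists."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    rmse = mse ** 0.5
    mae = sum(abs(e) for e in errors) / n
    mean_t = sum(y_true) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    r2 = 1 - (n * mse) / ss_tot  # 1 - SS_res / SS_tot
    return mse, rmse, mae, r2

mse, rmse, mae, r2 = regression_metrics([3, 5, 7], [2, 5, 9])
```

RMSE is in the same units as the target, MAE is more robust to outliers, and R² compares the model against predicting the mean.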


36. What is the curse of dimensionality?

As dimensionality grows, data becomes sparse and distances lose meaning, so models need exponentially more data and performance degrades.

37. How to handle missing data?

Drop, fill, or impute with models.
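
The simplest "fill" strategy is mean imputation: replace each missing value with the mean of the observed ones. A minimal sketch using `None` to mark missing entries (the column values are illustrative):

```python
def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

filled = impute_mean([1.0, None, 3.0, None])  # mean of observed is 2.0
```

To avoid data leakage (question 41), the mean must be computed on the training split only and then applied to the test split.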

38. What are outliers?

Extreme values; can be removed or handled.

39. Difference: batch vs online learning?

Batch uses all data; online uses one sample at a time.

40. What is A/B testing?

Compare two variants for performance.

41. What is data leakage?

When information unavailable at prediction time leaks into the training data or preprocessing, inflating evaluation scores.

42. What is underfitting?

Model too simple; poor training performance.

43. What is variance in ML?

A model's sensitivity to fluctuations in the training data; high variance means predictions change a lot between training sets.

44. What is multicollinearity?

When features are highly correlated with each other, which destabilizes coefficient estimates in linear models and hurts interpretability.

45. What is label encoding?

Convert categories to integers.

46. What is stratified sampling?

Maintain class balance in splits.
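
Stratification splits each class separately so that the train and test sets keep the same class proportions. A pure-Python sketch (the labels and test fraction are illustrative):

```python
import random
from collections import defaultdict

def stratified_split(labels, test_frac, seed=0):
    """Return (train_idx, test_idx) preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    test = []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_frac)
        test.extend(idxs[:n_test])  # take test_frac of EACH class
    test_set = set(test)
    train = [i for i in range(len(labels)) if i not in test_set]
    return train, test

labels = [0] * 8 + [1] * 4          # 2:1 imbalance
train, test = stratified_split(labels, 0.25)
```

scikit-learn exposes the same idea via `train_test_split(..., stratify=y)` and `StratifiedKFold`.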

47. What is SMOTE?

Synthetic Minority Over-sampling Technique: creates new minority-class samples by interpolating between existing minority samples and their nearest neighbors.


48. Accuracy vs F1-score?

F1 is better for imbalance; accuracy can mislead.

49. When to use logistic regression?

For binary classification with interpretability.
