Machine Learning Viva Questions
5. What is NumPy?
Ans: NumPy is the fundamental package for scientific computing in Python. It is a Python
library that provides a multidimensional array object, various derived objects, and an
assortment of routines for fast operations on arrays, including mathematical, logical, shape
manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic
statistical operations, random simulation and much more.
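As an illustration of the array object and a few of the routines listed above, here is a minimal sketch. The array values are made up for the example, and it assumes NumPy is installed and imported under the usual alias np.

```python
import numpy as np

# A 2 x 3 multidimensional array (ndarray) of made-up values.
a = np.array([[1, 2, 3],
              [4, 5, 6]])

shape = a.shape                            # dimensions of the array: (2, 3)
doubled = a * 2                            # element-wise arithmetic, no Python loop
mask = a > 3                               # logical operation, gives a boolean array
reshaped = a.reshape(3, 2)                 # shape manipulation
col_means = a.mean(axis=0)                 # basic statistics: mean of each column
sorted_desc = np.sort(a, axis=1)[:, ::-1]  # sort each row, then reverse it
product = a @ a.T                          # basic linear algebra: 2 x 2 matrix product

print(shape, col_means, product, sep="\n")
```

Each operation works on the whole array at once, which is where the speed advantage over plain Python lists comes from.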
6. What is Pandas?
Ans: Pandas is an open-source library that provides high-performance data
manipulation and analysis tools in Python. It is built on top of the NumPy package, so
NumPy must be installed for Pandas to work. Pandas is mainly used to analyze tabular data.
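A small sketch of typical Pandas usage may help here. The column names and values are hypothetical, invented only for this example, and it assumes Pandas is installed and imported under the usual alias pd.

```python
import pandas as pd

# Hypothetical toy data, invented for this example.
df = pd.DataFrame({
    "city": ["Pune", "Delhi", "Pune", "Delhi"],
    "temp_c": [31.0, 38.5, 29.0, 40.5],
})

hot_days = df[df["temp_c"] > 30]                     # filter rows by a condition
mean_by_city = df.groupby("city")["temp_c"].mean()   # aggregate per group
values = df["temp_c"].to_numpy()                     # the column as a NumPy array

print(mean_by_city)
```

The last line of the setup shows the link to NumPy: a DataFrame column can be handed over as a plain NumPy array.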
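Regression                               | Classification
-----------------------------------------+------------------------------------------------
The output variable must be continuous   | The output variable must be a discrete
or real-valued.                          | value.
-----------------------------------------+------------------------------------------------
We try to find the best-fit line, which  | We try to find the decision boundary, which
can predict the output more accurately.  | can divide the dataset into different classes.
-----------------------------------------+------------------------------------------------
Ex: Weather prediction, house price      | Ex: Identification of spam emails, speech
prediction                               | recognition, identification of cancer cells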
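The contrast above can be sketched in a few lines. This is only an illustration under stated assumptions: the data points are made up, the best-fit line comes from NumPy's least-squares polyfit, and the decision boundary is a toy rule (the midpoint between the two class means) chosen for simplicity, not a standard classifier.

```python
import numpy as np

# Regression: continuous target. Made-up points lying on y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])
slope, intercept = np.polyfit(x, y, deg=1)   # least-squares best-fit line
predicted_y = slope * 4.0 + intercept        # a real-valued prediction for x = 4

# Classification: discrete target (class 0 or 1). Made-up 1-D feature values.
feature = np.array([1.0, 2.0, 8.0, 9.0])
label = np.array([0, 0, 1, 1])
# Toy decision boundary (an assumption): midpoint between the two class means.
boundary = (feature[label == 0].mean() + feature[label == 1].mean()) / 2


def classify(value):
    """Return class 1 if value lies above the boundary, else class 0."""
    return int(value > boundary)


print(predicted_y, boundary, classify(7.5), classify(2.5))
```

The regression part returns a number on a continuous scale, while classify can only ever return one of the two class labels.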
9. What is a confusion matrix?
Ans: A confusion matrix is an N x N matrix used for evaluating the performance of a
classification model, where N is the number of target classes. The matrix compares the actual
target values with those predicted by the machine learning model.
a. Accuracy = Correct Predictions / Total Predictions
            = (TP + TN) / (TP + TN + FP + FN)
b. Precision = True Positives / Total Predicted Positive
             = TP / (TP + FP)
c. Recall = True Positives / Total Actual Positive
          = TP / (TP + FN)
d. F1 Score = 2 x (Recall x Precision) / (Recall + Precision)
Here TP, TN, FP and FN are the counts of true positives, true negatives, false positives
and false negatives.
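The four metrics can be worked through for a binary classifier with a short sketch. The label lists are made up for illustration, and the function name confusion_counts is a hypothetical helper, not part of any library.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, TN, FP and FN for a binary problem."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1          # predicted positive, actually positive
        elif p == positive:
            fp += 1          # predicted positive, actually negative
        elif a == positive:
            fn += 1          # predicted negative, actually positive
        else:
            tn += 1          # predicted negative, actually negative
    return tp, tn, fp, fn


# Made-up labels: 4 actual positives followed by 6 actual negatives.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

tp, tn, fp, fn = confusion_counts(actual, predicted)
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * (recall * precision) / (recall + precision)

print(tp, tn, fp, fn)                    # 3 4 2 1
print(accuracy, precision, recall, f1)
```

With these labels the accuracy is 0.7, the precision 0.6 and the recall 0.75, which shows that the metrics can disagree on the same set of predictions.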
High bias mainly occurs when the model is too simple. Below are some ways to reduce high
bias: