Machine Learning QB

The document contains a series of questions related to machine learning concepts, including data splitting, types of learning, algorithms, evaluation metrics, and specific techniques like Naïve Bayes and k-Nearest Neighbour. It covers various aspects of machine learning, such as supervised and unsupervised learning, model evaluation, bias-variance trade-off, and decision trees. Additionally, it poses practical scenarios for applying machine learning principles and evaluating model performance.

1. Identify the two parts into which the data is split:
i) Training data and valid data
ii) Training data and test data
iii) Training data and big data
iv) None of the above
2. Identify the correct alternative to fill in the blank space:
A computer program is said to learn from __________ E with respect to some
class of tasks T and performance measure P, if its performance at tasks in T, as
measured by P, improves with E.
i. Training
ii. Experience
iii. Database
iv. Algorithm
3. Identify the type of learning in which labelled training data is used.
i) Supervised learning
ii) Reinforcement learning
iii) Unsupervised learning
iv) None of these
4. Machine learning is a subset of which of the following?
i) Artificial intelligence ii) Deep learning iii) Data learning iv) None of these
5. What is true about Machine Learning?
i) The main focus of ML is to allow computer systems to learn from experience
without being explicitly programmed or human intervention
ii) ML is a type of artificial intelligence that extracts patterns out of raw data by
using an algorithm or method
iii) ML is a field of computer science
iv) all of these
6. Name the learning process used to predict the price of crude oil.
i. Unsupervised learning
ii. Supervised regression problem
iii. Supervised classification problem
iv. Categorical attribute
7. Which of the following are common classes of problems in machine learning?
i) Classification ii) Regression iii) Clustering iv) All of these
8. What are the two techniques in Machine Learning?
9. What are the areas in robotics and information processing where sequential
prediction problems arise?
10. What is PAC?
11. Which, in your view, is more important: model accuracy or model
performance?
12. What is a test set?
13. What are the different methods for Sequential Supervised Learning?
14. What are the reasons for using feature scaling?
15. What is the rank of the following matrix?

16. What are the main criteria of a multiple regression model?


17. If a hypothesis test has a Type I error probability (α) of 0.01, what does
this mean?
18. What is the difference between machine learning and statistical learning?
19. Imagine you are working with “Analytics Vidhya” and you want to develop
a machine learning algorithm that predicts the number of views on the articles.
Your analysis is based on features like author name, number of articles written by
the same author on Analytics Vidhya in the past, and a few other features. Which of
the following evaluation metrics would you choose in that case?
A. Mean Square Error
B. Accuracy
C. F1 Score
20. Which of the following statement(s) is/are true for Gradient Descent (GD)
and Stochastic Gradient Descent (SGD)?
A. In GD and SGD, you update a set of parameters in an iterative manner to
minimize the error function.
B. In SGD, you have to run through all the samples in your training set for a single
update of a parameter in each iteration.
C. In GD, you either use the entire data or a subset of training data to update a
parameter in each iteration.
21. On which type of data do machine learning algorithms not work?
61. Identify what type of algorithm k-Nearest Neighbour is.
i. Supervised learning algorithm
ii. Unsupervised learning algorithm
iii. Semi-supervised learning algorithm
iv. Weakly supervised learning algorithm
62. Identify which of the following is not an inductive bias in a decision tree?
i. It prefers longer trees over shorter trees
ii. Trees that place nodes near the root with high information gain
are preferred
iii. Overfitting is a natural phenomenon in a decision tree
iv. Prefer the shortest hypothesis that fits the data
63. Identify what Naïve Bayes requires.
i. Categorical values
ii. Numerical values
iii. Either i or ii
iv. Both i and ii
77. What are the advantages of Naïve Bayes Algorithm?
23. Explain bias and variance. Also explain the bias-variance trade-off.
24. Explain hypothesis, hypothesis space, and Inductive bias.
25. Define concept learning. Explain the task of concept learning.
26. How can concept learning be viewed as a search task? Explain.
27. Explain sum of squares due to error in multiple linear regression.
28. In what areas is pattern recognition used?
29. How does Machine Learning differ from Deep Learning?
30. Mention some Exploratory Data Analysis (EDA) techniques.
31. Is it possible to test for the probability of improving model accuracy without
cross-validation techniques? If yes, please explain.
32. Define and explain the concept of inductive bias with some examples.
33. You are given a dataset on utilities fraud detection. You have built a
classifier model and achieved a performance score of 98.5%. Is this a good model?
If yes, justify; if no, what can you do?
34. You are given a data set on cancer detection. You’ve built a classification
model and achieved an accuracy of 95%. Why shouldn’t you be happy with your
model performance? What can you do about it?
35. What are the criteria for splitting data into training and test sets? Suppose your
data size is one trillion; how do you split the data into training and test sets?
36. Given the values X = (3, 9, 11, 5, 2)^T and Y = (1, 8, 11, 4, 3)^T,
determine the regression coefficients.
37. Apply multiple linear regression to the values given in the table below, where
weekly sales along with sales for products x1 and x2 are provided. Use the matrix
approach for finding the multiple linear regression.
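
A worked check for question 36 can be sketched as follows. This is a minimal illustration, assuming the question intends simple linear regression y = b0 + b1·x fitted by ordinary least squares; the function name fit_simple_ols is hypothetical. By hand, the means are 6 and 5.4, the sums Sxy and Sxx are both 60, so the slope is 1 and the intercept is 5.4 − 6 = −0.6.

```python
# Simple linear regression y = b0 + b1 * x by ordinary least squares,
# using the X and Y values stated in question 36.
X = [3, 9, 11, 5, 2]
Y = [1, 8, 11, 4, 3]


def fit_simple_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sxy: sum of cross-deviations, Sxx: sum of squared deviations of x
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


b0, b1 = fit_simple_ols(X, Y)
print(f"intercept b0 = {b0:.3f}, slope b1 = {b1:.3f}")
```

The same coefficients would follow from the matrix approach mentioned in question 37, where the coefficient vector is (XᵀX)⁻¹Xᵀy with a column of ones prepended to the design matrix.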

84. Explain, in brief, the decision tree algorithm, node and leaf in decision tree.
85. What is the entropy of a decision tree? Explain information gain in a decision
tree.
86. Explain the kNN model in detail. Discuss the error rate and validation error
in the kNN algorithm. Explain how to calculate the distance between the test data
and the training data for kNN.
87. Explain prior probability, posterior probability and likelihood probability with
an example.
88. Explain the Naïve Bayes classifier. Why is it so named? What is the optimal Bayes
classifier?
89. Explain with examples how unsupervised learning is different from
supervised learning.
106. Explain the Naïve Bayes classifier.
118. What is entropy? Why is it used in machine learning algorithms?
126. Consider the training dataset given in the table below. Use k-NN with 3 nearest
neighbours to predict whether a student who obtained 6.1 in CGPA, 40 in
Assessment, and 5 in Project submitted will pass or fail.
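
The k-NN procedure asked about in questions 86 and 126 can be sketched as below. This is only an illustration: the training rows are HYPOTHETICAL placeholders, not the table from the question, and Euclidean distance with majority voting is assumed as the method. Only the query point (6.1, 40, 5) and k = 3 come from question 126.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def knn_predict(train, query, k=3):
    """Majority label among the k training rows closest to the query."""
    nearest = sorted(train, key=lambda row: euclidean(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# HYPOTHETICAL rows of (CGPA, Assessment, Project submitted) -> result.
# Replace these with the rows of the actual table before using the output.
train = [
    ((9.0, 85, 8), "Pass"),
    ((8.0, 78, 7), "Pass"),
    ((6.0, 45, 5), "Fail"),
    ((5.5, 42, 4), "Fail"),
    ((6.5, 50, 4), "Fail"),
]

query = (6.1, 40, 5)  # the student described in question 126
print(knn_predict(train, query, k=3))
```

With these placeholder rows the three nearest neighbours are all labelled "Fail". Note that the Assessment column has a much larger numeric range than the other two and so dominates the distance, which is one reason feature scaling (question 14) is commonly applied before k-NN.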

129. For the following set of training samples, find which attribute can be chosen
as the root for decision tree classification.
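
One common way to choose the root attribute in question 129 is to pick the attribute with the highest information gain, as in ID3. The sketch below assumes that criterion; the rows and the attribute names "a", "b" and "y" are HYPOTHETICAL, since the training samples are not reproduced here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus its weighted entropy after splitting."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in set(r[attribute] for r in rows):
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# HYPOTHETICAL training samples: "a" separates the classes perfectly,
# "b" carries no information about the target "y".
rows = [
    {"a": "x", "b": "p", "y": "yes"},
    {"a": "x", "b": "q", "y": "yes"},
    {"a": "z", "b": "p", "y": "no"},
    {"a": "z", "b": "q", "y": "no"},
]

gains = {attr: information_gain(rows, attr, "y") for attr in ("a", "b")}
root = max(gains, key=gains.get)
print(gains, "-> root:", root)
```

In this toy data the gain for "a" is 1 bit and the gain for "b" is 0, so "a" would be placed at the root.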
