
MLMock Testsolution

The document discusses key concepts in hypothesis testing and machine learning algorithms. It provides examples to distinguish between type I and type II errors, defines p-values as the probability of rejecting the null hypothesis, and identifies random forest as an algorithm that uses bagging. It also answers questions about gradient boosting, linear regression, Bayesian networks, and probability.

Uploaded by

NAVEEN SAINI

1. A Type I error occurs when we:

a. reject a false null hypothesis
b. reject a true null hypothesis
c. do not reject a false null hypothesis
d. do not reject a true null hypothesis
e. fail to make a decision regarding whether to reject a hypothesis or not

Ans B

2. In a criminal trial, a Type I error is made when:


a. a guilty defendant is acquitted (set free)
b. an innocent person is convicted (sent to jail)
c. a guilty defendant is convicted
d. an innocent person is acquitted
e. no decision is made about whether to acquit or convict the defendant

Ans B

3. A Type II error occurs when we:


a. reject a false null hypothesis
b. reject a true null hypothesis
c. do not reject a false null hypothesis
d. do not reject a true null hypothesis
e. fail to make a decision regarding whether to reject a hypothesis or not

Ans C

4. If we reject the null hypothesis, we conclude that:


a. there is enough statistical evidence to infer that the alternative hypothesis is
true
b. there is not enough statistical evidence to infer that the alternative
hypothesis is true
c. there is enough statistical evidence to infer that the null hypothesis is true
d. the test is statistically insignificant at whatever level of significance the
test was conducted at
e. further tests need to be carried out to determine for sure whether the null
hypothesis should
be rejected or not

Ans A

5. The p-value of a test is the:


a. smallest significance level at which the null hypothesis cannot be rejected
b. largest significance level at which the null hypothesis cannot be rejected
c. smallest significance level at which the null hypothesis can be rejected
d. largest significance level at which the null hypothesis can be rejected
e. probability that no errors have been made in rejecting or not rejecting the null
hypothesis

Ans C
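
Note: the definition in question 5 can be checked numerically. The sketch below assumes a two-tailed z-test with a standard-normal test statistic; the function names and the observed value z = 2.2 are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def two_tailed_p_value(z: float) -> float:
    """Probability that a standard-normal statistic is at least as extreme as |z|."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def reject_null(z: float, alpha: float) -> bool:
    """Reject H0 exactly when the p-value does not exceed the significance level."""
    return two_tailed_p_value(z) <= alpha


z = 2.2  # hypothetical observed test statistic
print(f"p-value = {two_tailed_p_value(z):.4f}")
for alpha in (0.01, 0.05, 0.10):
    # H0 is rejected only for significance levels at or above the p-value
    print(alpha, reject_null(z, alpha))
```

Any significance level below the printed p-value fails to reject, which is why the p-value is the smallest level at which the null hypothesis can be rejected.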

6. Choose the options that are correct regarding machine learning (ML) and
artificial intelligence (AI):
(A) ML is an alternate way of programming intelligent machines.
(B) ML and AI have very different goals.
(C) ML is a set of techniques that turns a dataset into software.
(D) AI is software that can emulate the human mind.

Ans A, D

7. Which of the following is a widely used and effective machine learning
algorithm based on the idea of bagging?

a. Decision Tree

b. Regression

c. Classification

d. Random Forest

Ans D

8. Which of the following is a good test dataset characteristic?

a. Large enough to yield meaningful results

b. Is representative of the dataset as a whole

c. Both A and B

d. None of the above

Ans C

9. Which of the following is not a learning method?


a) Memorization
b) Analogy
c) Deduction
d) Introduction

Ans D

10. In order to determine the p-value of a hypothesis test, which of the following
is not needed?
a. whether the test is one-tail or two-tail
b. the value of the test statistic
c. the form of the null and alternate hypotheses
d. the level of significance
e. all of the above are needed to determine the p-value

Ans D

11. The probability of rejecting the null hypothesis when it is true is called the:
a) Level of Confidence
b) Level of Significance
c) Level of Margin
d) Level of Rejection

Ans B
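
Note: questions 1 and 11 can be tied together with a small simulation. The sketch below assumes a two-tailed z-test on samples drawn from a standard normal population, so the null hypothesis (mean 0) is true by construction. The function name, sample size, trial count and seed are all hypothetical choices.

```python
import random
from statistics import NormalDist


def type_one_error_rate(alpha=0.05, n=30, trials=2000, seed=0):
    """Fraction of trials in which a true null hypothesis is rejected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    rejections = 0
    for _ in range(trials):
        # H0 is true here: the population mean is 0 and the standard deviation is 1
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) * n ** 0.5  # (sample mean - 0) / (1 / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1  # a Type I error
    return rejections / trials


print(type_one_error_rate())
```

The printed fraction should land close to the chosen level of significance, since that level is the probability of a Type I error.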

12. A statement made about a population for testing purposes is called a:


a) Statistic
b) Hypothesis
c) Level of Significance
d) Test-Statistic
Ans B

13. A feature F1 can take the values A, B, C, D, E and F, and represents the grades
of students from a college.

Which of the following statements is true in this case?

A) Feature F1 is an example of a nominal variable.
B) Feature F1 is an example of an ordinal variable.
C) It doesn’t belong to either of the above categories.
D) Both of these

Ans B

14. The purpose of hypothesis testing is to:


a. test how far the mean of a sample is from zero
b. determine whether a statistical result is significant
c. determine the appropriate value of the significance level
d. derive the standard error of the data
e. determine the appropriate value of the null hypothesis

Ans B

15. To find the minimum or the maximum of a function, we set the gradient to zero
because:

a. The value of the gradient at extrema of a function is always zero

b. Depends on the type of problem

c. Both A and B

d. None of the above

Ans A

16. Which of the following is a disadvantage of decision trees?

a. Factor analysis

b. Decision trees are robust to outliers

c. Decision trees are prone to overfitting

d. None of the above

Ans C

17. Which of the following is/are true about bagging trees?

1. In bagging trees, individual trees are independent of each other
2. Bagging is a method for improving performance by aggregating the results
of weak learners

A) 1
B) 2
C) 1 and 2
D) None of these

Ans C

18. What are tree-based classifiers?

a. Classifiers which form a tree with each attribute at one level
b. Classifiers which perform a series of condition checks with one attribute at a
time
c. Both a and b
d. None of the options

Ans C

19. Tree/rule-based classification algorithms generate ... rules to perform the
classification.
a. if-then.
b. while.
c. do while.
d. switch.

Ans A

20. Which of the following statements are correct with reference to
information gain?
a. It is biased towards single-valued attributes
b. It is not biased towards multi-valued attributes
c. ID3 makes use of information gain
d. The approach used by ID3 is greedy

Ans C,D
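
Note: the quantities behind question 20 can be computed directly. The sketch below uses the usual entropy-based definition of information gain on a tiny made-up dataset. The rows, attribute names and function names are hypothetical and are not taken from ID3's original presentation.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical toy data: "id" is unique per row, "windy" has two values
rows = [
    {"id": 1, "windy": "no", "play": "yes"},
    {"id": 2, "windy": "no", "play": "yes"},
    {"id": 3, "windy": "yes", "play": "no"},
    {"id": 4, "windy": "no", "play": "no"},
]

for attribute in ("id", "windy"):
    print(attribute, round(information_gain(rows, attribute, "play"), 4))

# A greedy learner picks the attribute with the largest gain at each node
best = max(("id", "windy"), key=lambda a: information_gain(rows, a, "play"))
print("greedy choice:", best)
```

In this toy data the many-valued "id" attribute gets the highest gain even though it cannot generalise, which illustrates the bias of information gain towards multi-valued attributes.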

21. What is the estimated number of neurons in the human cortex?


a) 10^8
b) 10^52
c) 10^11
d) 10^20

Ans C

22. Why can’t we design a perfect neural network?


a) the full operation of biological neurons is still not known
b) the number of neurons is itself not precisely known
c) the number of interconnections is very large and very complex
d) all of the mentioned

Ans D

23. What is temporal learning?


a) concerned with capturing input-output relationship in patterns
b) concerned with capturing weight relationships
c) both weight & input-output relationships
d) none of the mentioned
Ans B

24. Which one of these is a tree based learner?


a. Rule based
b. Bayesian Belief Network
c. Bayesian classifier
d. Random Forest

Ans D

25. Suppose you want to apply the AdaBoost algorithm on data D, which has T observations.
You initially set half the data aside for training and half for testing. Now you want to
increase the number of training data points through T1, T2, …, Tn, where T1 < T2 < … < Tn-1 <
Tn. Which of the following is true about the training and testing error in this case?
A) The difference between training error and test error increases as number of
observations increases
B) The difference between training error and test error decreases as number of
observations increases
C) The difference between training error and test error will not change
D) None of These

Ans B

26. For what purpose are energy minima used?


a) pattern classification
b) pattern mapping
c) pattern storage
d) none of the mentioned

Ans C

27. Consider the hyperparameter “number of trees” and arrange the options in terms
of the time taken to build the Gradient Boosting model with each setting. The
remaining hyperparameters are the same.
1. Number of trees = 100
2. Number of trees = 500
3. Number of trees = 1000
A) 1~2~3
B) 1<2<3
C) 1>2>3
D) None of these

Ans B

28. In gradient boosting, it is important to use the learning rate to get optimum output.
Which of the following is true about choosing the learning rate?
A) Learning rate should be as high as possible
B) Learning rate should be as low as possible
C) Learning rate should be low, but not very low
D) Learning rate should be high, but not very high

Ans C

29. Given the following data pairs (x, y), find the regression equation.
(1, 1.24), (2, 5.23), (3, 7.24), (4, 7.60), (5, 9.97), (6, 14.31), (7, 13.99),
(8, 14.88), (9, 18.04), (10, 20.70)
a. y = 0.490 x - 0.053
b. y = 2.04 x
c. y = 1.98 x + 0.436
d. y = 0.49 x

Ans C
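
Note: the answer to question 29 can be reproduced with the ordinary least-squares formulas, slope = Sxy / Sxx and intercept = mean(y) - slope × mean(x). The sketch below is a minimal pure-Python version. The function name fit_line is a hypothetical helper, not part of any library.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Data pairs from question 29
data = [(1, 1.24), (2, 5.23), (3, 7.24), (4, 7.60), (5, 9.97),
        (6, 14.31), (7, 13.99), (8, 14.88), (9, 18.04), (10, 20.70)]

slope, intercept = fit_line(data)
print(f"y = {slope:.2f} x + {intercept:.3f}")  # prints: y = 1.98 x + 0.436
```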

30. Failing to reject the null hypothesis when it is false is:


a. alpha
b. Type I error
c. beta
d. Type II error

Ans D

31. A parameter is:


a. a sample characteristic
b. a population characteristic
c. unknown
d. normally distributed

Ans B

32. A statistic is:


a. a sample characteristic
b. a population characteristic
c. unknown
d. normally distributed

Ans A

33. When asked questions concerning personal hygiene, people commonly lie. This is
an example of:
a. sampling bias
b. confounding
c. non-response bias
d. response bias

Ans D

34. Selection of a football team for the FIFA World Cup is an example of:
a) random sampling
b) systematic sampling
c) purposive sampling
d) cluster sampling

Ans C

35. Which of the following statements about Naive Bayes is incorrect?


A. Attributes are equally important.
B. Attributes are statistically dependent on one another given the class value.
C. Attributes are statistically independent of one another given the class value.
D. Attributes can be nominal or numeric
E. All of the above

Ans B

36. How can a Bayesian network be used to answer any query?
a) Full distribution
b) Joint distribution
c) Partial distribution
d) All of the mentioned

Ans B

37. How can the compactness of a Bayesian network be described?


a) Locally structured
b) Fully structured
c) Partial structure
d) All of the mentioned

Ans A

38. What is the relationship between a node and its predecessors when constructing a
Bayesian network?
a) Functionally dependent
b) Dependent
c) Conditionally independent
d) Both conditionally dependent and dependent

Ans C

39. Bag I contains 4 white and 6 black balls while another Bag II contains 4 white
and 3 black balls. One ball is drawn at random from one of the bags and it is found
to be black. Find the probability that it was drawn from Bag I.
A. 1/2
B. 3/5
C. 3/7
D. 7/12

Ans D
Solution: Let E1 be the event of choosing Bag I, E2 the event of choosing
Bag II, and A the event of drawing a black ball.

Then, P(E1) = P(E2) = 1/2

Also, P(A|E1) = P(drawing a black ball from Bag I) = 6/10 = 3/5

P(A|E2) = P(drawing a black ball from Bag II) = 3/7

By Bayes’ theorem, the probability that the black ball was drawn from Bag I is
P(E1|A) = P(E1)P(A|E1) / (P(E1)P(A|E1) + P(E2)P(A|E2))

= (1/2 × 3/5) / (1/2 × 3/5 + 1/2 × 3/7) = (3/10) / (18/35) = 7/12
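
Note: the Bayes computation for question 39 can be checked with exact fractions. This is a minimal sketch. The function name posterior and the dictionary layout are hypothetical choices made for illustration.

```python
from fractions import Fraction


def posterior(priors, likelihoods, hypothesis):
    """Bayes' theorem: P(hypothesis | evidence) over a set of exclusive hypotheses."""
    evidence = sum(priors[h] * likelihoods[h] for h in priors)
    return priors[hypothesis] * likelihoods[hypothesis] / evidence


# Each bag is equally likely to be chosen
priors = {"Bag I": Fraction(1, 2), "Bag II": Fraction(1, 2)}
# Probability of drawing a black ball from each bag
black_given_bag = {"Bag I": Fraction(6, 10), "Bag II": Fraction(3, 7)}

print(posterior(priors, black_given_bag, "Bag I"))  # prints: 7/12
```

Using Fraction instead of floats keeps the result in the same form as the answer options.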

40. Prior probabilities in Bayes’ theorem that are updated with the help of newly
available information are classified as _________________
a) independent probabilities
b) posterior probabilities
c) interior probabilities
d) dependent probabilities

Ans B

41. At a certain university, 4% of men are over 6 feet tall and 1% of women are
over 6 feet tall. The total student population is divided in the ratio 3:2 in
favour of women. If a student is selected at random from among all those over six
feet tall, what is the probability that the student is a woman?
a) 2⁄5
b) 3⁄5
c) 3⁄11
d) 1⁄100
Ans C
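
Note: question 41 is given without a worked solution. One way to reach the answer is sketched below, where the symbols W (student is a woman), M (student is a man) and T (student is over 6 feet tall) are introduced here for convenience. The 3:2 ratio in favour of women gives P(W) = 3/5 and P(M) = 2/5.

```latex
\begin{aligned}
P(W \mid T) &= \frac{P(T \mid W)\,P(W)}{P(T \mid W)\,P(W) + P(T \mid M)\,P(M)} \\
            &= \frac{0.01 \times \tfrac{3}{5}}{0.01 \times \tfrac{3}{5} + 0.04 \times \tfrac{2}{5}}
             = \frac{0.006}{0.022} = \frac{3}{11}
\end{aligned}
```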

42. Three companies A, B and C supply 25%, 35% and 40% of the notebooks to a
school. Past experience shows that 5%, 4% and 2% of the notebooks produced by these
companies are defective. If a notebook was found to be defective, what is the
probability that the notebook was supplied by A?
a) 44⁄69
b) 25⁄69
c) 13⁄24
d) 11⁄24

Ans B
Explanation: Let A, B and C be the events that a notebook is supplied by company A, B and C
respectively.
Let D be the event that a notebook is defective.
Then,
P(A) = 0.25, P(B) = 0.35, P(C) = 0.40
P(D|A) = 0.05, P(D|B) = 0.04, P(D|C) = 0.02
P(A|D) = P(D|A)P(A) / (P(D|A)P(A) + P(D|B)P(B) + P(D|C)P(C))
= (0.05 × 0.25) / ((0.05 × 0.25) + (0.04 × 0.35) + (0.02 × 0.40)) = 0.0125 / 0.0345
= 25/69

43. A computer program is said to learn from experience E with
respect to some task T and some performance measure P if its
performance on T, as measured by P, improves with experience E. Suppose we feed a
learning algorithm a lot of historical weather data, and have it learn to predict
weather. What would be a reasonable choice for P?

a. The weather prediction task.
b. The process of the algorithm examining a large amount of historical weather
data.
c. None of these.
d. The probability of it correctly predicting a future date's weather.

Ans D

44. A computer program is said to learn from experience E with
respect to some task T and some performance measure P if its
performance on T, as measured by P, improves with experience E. Suppose we feed a
learning algorithm a lot of historical weather data, and have it learn to predict
weather.
In this setting, what is T?

a. The probability of it correctly predicting a future date's weather.
b. The weather prediction task.
c. The process of the algorithm examining a large amount of historical weather
data.
d. None of these.

Ans B

45. The relationship between number of beers consumed (x) and blood alcohol content
(y) was studied in 16 male college students by using least squares regression. The
following regression equation was obtained from this study:
ŷ = -0.0127 + 0.0180x
The above equation implies that:
a. each beer consumed increases blood alcohol by 1.27%
b. on average it takes 1.8 beers to increase blood alcohol content by 1%
c. each beer consumed increases blood alcohol by an average amount of 1.8%
d. each beer consumed increases blood alcohol by exactly 0.018

Ans C

46. A residual plot:


a. displays residuals of the explanatory variable versus residuals of the response
variable.
b. displays residuals of the explanatory variable versus the response variable.
c. displays explanatory variable versus residuals of the response variable.
d. displays the explanatory variable versus the response variable.
e. displays the explanatory variable on the x axis versus the response variable on
the y axis.

Ans C

47. The correlation coefficient is used to determine:
a. A specific value of the y-variable given a specific value of the x-variable
b. A specific value of the x-variable given a specific value of the y-variable
c. The strength of the relationship between the x and y variables
d. None of these

Ans C
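
Note: the correlation coefficient in question 47 can be computed from its definition, r = Sxy / sqrt(Sxx × Syy). The sketch below is a minimal pure-Python version. The function name and the sample data (hours studied versus exam score) are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


hours = [1, 2, 3, 4, 5]        # hypothetical explanatory variable
score = [52, 55, 61, 70, 72]   # hypothetical response variable
print(round(pearson_r(hours, score), 3))
```

A value near +1 or -1 indicates a strong linear relationship and a value near 0 a weak one. The coefficient measures strength and direction only, and does not predict a specific y for a given x.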

48. Movie recommendation systems are an example of:

1. Classification
2. Clustering
3. Reinforcement Learning
4. Regression

Options:

A. 2 Only

B. 1 and 2

C. 1 and 3

D. 2 and 3

E. 1, 2 and 3

Ans C

49. Sentiment analysis is an example of:

1. Regression
2. Classification
3. Clustering
4. Reinforcement Learning

Options:
A. 1 and 3

B. 1, 2 and 3

C. 1, 2 and 4

D. 1, 2, 3 and 4

Ans C

50. How can clustering (unsupervised learning) be used to improve the accuracy of
a linear regression model (supervised learning)?

1. Creating different models for different cluster groups.
2. Creating an input feature for cluster ids as an ordinal variable.
3. Creating an input feature for cluster centroids as a continuous variable.
4. Creating an input feature for cluster size as a continuous variable.
Options:

A. 1 only

B. 1 and 2

C. 1 and 4

D. 3 only

E. 2 and 4

F. All of the above


Ans F

51. Which of the following is the most appropriate strategy for data cleaning
before performing clustering analysis, given a less than desirable number of data
points?

1. Capping and flooring of variables
2. Removal of outliers

A. 1 only

B. 2 only

C. 1 and 2

D. None of the above

Ans A

52. What is the minimum no. of variables/ features required to perform clustering?

A. 0

B. 1

C. 2

D. 3
Ans B

53. If two variables, V1 and V2, are used for clustering, which of the following are
true for K-means clustering with k = 3?

1. If V1 and V2 have a correlation of 1, the cluster centroids will be in a straight
line
2. If V1 and V2 have a correlation of 0, the cluster centroids will be in a straight line

Options:

A. 1 only

B. 2 only

C. 1 and 2

D. None of the above

Ans A

54. Which of the following can act as possible termination conditions in K-Means?
1. For a fixed number of iterations.
2. Assignment of observations to clusters does not change between iterations.
Except for cases with a bad local minimum.
3. Centroids do not change between successive iterations.
4. Terminate when RSS falls below a threshold.
Options:
A. 1, 3 and 4
B. 1, 2 and 3
C. 1, 2 and 4
D. All of the above

Ans D
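
Note: the four termination conditions in question 54 can be seen side by side in a small implementation. The sketch below is a minimal one-dimensional K-Means written for illustration only. The function name, its parameters and the sample points are hypothetical, and real libraries differ in detail.

```python
def kmeans_1d(points, centroids, max_iter=100, rss_threshold=0.0):
    """Return (centroids, assignments, reason the loop stopped)."""
    assignments = None
    for _ in range(max_iter):  # condition 1: fixed number of iterations
        # Assignment step: each point goes to its nearest centroid
        new_assignments = [
            min(range(len(centroids)), key=lambda k: (p - centroids[k]) ** 2)
            for p in points
        ]
        if new_assignments == assignments:  # condition 2: assignments unchanged
            return centroids, assignments, "assignments unchanged"
        assignments = new_assignments

        # Update step: each centroid moves to the mean of its members
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == k]
            new_centroids.append(sum(members) / len(members) if members else old)

        if new_centroids == centroids:  # condition 3: centroids unchanged
            return centroids, assignments, "centroids unchanged"
        centroids = new_centroids

        # Residual sum of squares of points around their assigned centroids
        rss = sum((p - centroids[a]) ** 2 for p, a in zip(points, assignments))
        if rss < rss_threshold:  # condition 4: RSS below a threshold
            return centroids, assignments, "RSS below threshold"
    return centroids, assignments, "iteration limit reached"


points = [1.0, 2.0, 9.0, 10.0]
print(kmeans_1d(points, [0.0, 5.0]))
print(kmeans_1d(points, [0.0, 5.0], max_iter=1))
print(kmeans_1d(points, [0.0, 5.0], rss_threshold=2.0))
```

The three calls stop for different reasons on the same data, which shows that the conditions are alternatives rather than a single fixed rule.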

55. Is it possible that the assignment of observations to clusters does not change
between successive iterations in K-Means?
A. Yes
B. No
C. Can’t say
D. None of these

Ans A

56. Feature scaling is an important step before applying the K-Means algorithm. What
is the reason behind this?
A. In distance calculation it will give the same weights for all features
B. You always get the same clusters, whether or not you use feature scaling
C. In Manhattan distance it is an important step but in Euclidean it is not
D. None of these

Ans A
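
Note: the effect described in question 56 is easy to see on a few points. The sketch below assumes min-max scaling (one of several possible scaling choices) and Euclidean distance. The customer records and the helper name min_max_scale are hypothetical.

```python
from math import dist


def min_max_scale(rows):
    """Rescale every column to the range [0, 1]; assumes no column is constant."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple((v - lo) / (hi - lo) for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]


# Hypothetical customers: (age in years, income in dollars)
customers = [(25, 50_000), (60, 51_000), (26, 80_000)]

# Raw distances from customer 0: income dominates because of its larger units
raw = (dist(customers[0], customers[1]), dist(customers[0], customers[2]))

scaled_rows = min_max_scale(customers)
# Scaled distances from customer 0: both features now contribute comparably
scaled = (dist(scaled_rows[0], scaled_rows[1]), dist(scaled_rows[0], scaled_rows[2]))

print("raw:   ", raw)
print("scaled:", scaled)
```

Before scaling, a 35-year age gap counts for almost nothing next to a 30,000-dollar income gap. After scaling, the two gaps carry similar weight in the distance.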

57. Genetic algorithms:

A. are a part of evolutionary computing

B. are inspired by Darwin's theory of evolution - "survival of the fittest"

C. are adaptive heuristic search algorithms based on the evolutionary ideas of
natural selection and genetics

D. All of the above

Ans D

58. What is the problem associated with historical DNA samples?


a) They are small in amount, so amplification is difficult
b) Because the samples are very old, there can be contamination
c) They degrade during repeated cooling and heating cycles
d) As the samples are old, the standard sequences for comparison are not present

Ans B

59. Which of the following is useful in applications of PCR (polymerase chain
reaction)?
a) It is manual
b) Only one sample’s analysis can be carried out at a time
c) It has a high speed
d) The amount of DNA required initially is high

Ans C

60. A represents the dominant allele and a represents the recessive allele of a
pair. If, in 1000 offspring, 500 are aa
and 500 are of some other genotype, which of the following are most probably the
genotypes of the parents?
a. Aa and Aa
b. Aa and aa
c. AA and Aa
d. AA and aa
e. aa and aa

Ans B
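
Note: the expected offspring ratios for each option in question 60 can be enumerated with a simple Punnett-square calculation. The sketch below assumes each parent passes on either of its two alleles with equal probability. The function name is a hypothetical helper.

```python
from collections import Counter
from itertools import product


def offspring_ratios(parent1, parent2):
    """Expected genotype fractions when each parent contributes one allele."""
    counts = Counter("".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: count / total for genotype, count in counts.items()}


crosses = [("Aa", "Aa"), ("Aa", "aa"), ("AA", "Aa"), ("AA", "aa"), ("aa", "aa")]
for p1, p2 in crosses:
    fraction_aa = offspring_ratios(p1, p2).get("aa", 0.0)
    print(f"{p1} x {p2}: expected fraction of aa offspring = {fraction_aa}")
```

The cross whose expected aa fraction is 0.5 is the one that matches 500 aa offspring out of 1000.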

61. Sickle cell anaemia is a genetic disorder. Which of the following doesn’t hold
true for it?
a) It can be analysed by PCR
b) It destroys a restriction site
c) The mutation is in the alpha globin gene
d) The conventional approach took weeks for the whole analyses to be carried out

Ans C

62. Which of the following is the most likely explanation for a high rate of
crossing-over between two genes?
a. The two genes are far apart on the same chromosome.
b. The two genes are both located near the centromere.
c. The two genes are sex-linked.
d. The two genes code for the same protein.
e. The two genes are on different chromosomes.

Ans A

63. How many types of reinforcement learning exist?


a) 2
b) 3
c) 4
d) 5

Ans B

64. What is fixed credit assignment?


a) reinforcement signal given to input-output pair doesn’t change with time
b) input-output pair determines probability of positive reinforcement
c) input pattern depends on past history
d) none of the mentioned
Ans A

65. What is true for drive reinforcement learning?


a) logical And & Or operations are used for input output relations
b) weight corresponds to minimum & maximum of units are connected
c) weights are expressed as linear combination of orthogonal basis vectors
d) change in weight uses a weighted sum of changes in past input values

Ans D

66. What is true for principal component learning?


a) logical And & Or operations are used for input output relations
b) weight corresponds to minimum & maximum of units are connected
c) weights are expressed as linear combination of orthogonal basis vectors
d) change in weight uses a weighted sum of changes in past input values

Ans C

67. What is the final stage of an agent-based modeling (ABM) methodology?


1 Identifying the agents and determining their behavior
2 Determining agent-related data
3 Validating agent behavior against reality
4 Determining the suitability of ABM

Ans 3

68. All of the following are suitable problems for genetic algorithms EXCEPT
1. dynamic process control
2. pattern recognition with complex patterns
3. simulation of biological models
4. simple optimization with few variables

Ans 4

69. PCR amplification can be used for which type of samples?


a) Old samples only
b) Recent samples only
c) Equally to both recent and old samples
d) Recent samples are preferred but can be applied to old samples also

Ans C

70. Nucleosome is made up of __________


a) DNA, histone core protein
b) DNA, histone core protein, linker H1
c) RNA, histone core protein
d) RNA, histone core protein, linker H1

Ans B
