
Computational Machine Learning Mock Test

Le Chi Cuong - VinPioneers

Question 1: A tech company wants to develop a system that recommends movies to its users based on
their previous watching habits. Which type of machine learning should be employed?
1. Unsupervised Learning
2. Supervised Learning
3. Semi-Supervised Learning
4. Reinforcement Learning
Question 2: A retail store wants to segment their customers into different groups based on purchasing
behavior to tailor marketing strategies. Which machine learning approach is most appropriate?
1. Clustering
2. Classification
3. Regression
4. Reinforcement Learning
Question 3: An autonomous vehicle must learn to navigate in real time by experiencing various traffic
conditions and outcomes based on different actions taken. Which type of machine learning best describes
this scenario?
1. Supervised Learning
2. Unsupervised Learning
3. Semi-Supervised Learning
4. Reinforcement Learning
Question 4: A company receives thousands of support tickets daily and wants to automatically categorize
them to ensure they are sent to the appropriate department. Which machine learning method should
they use?
1. Clustering
2. Classification
3. Regression
4. Anomaly Detection

Question 5: A mobile app development company wants to improve user engagement by learning from
how users interact with their app and adjusting the app’s features in response to those interactions.
Which type of machine learning is most appropriate?
1. Supervised Learning
2. Unsupervised Learning
3. Semi-Supervised Learning
4. Reinforcement Learning
Question 6: A research lab has a large set of images, only some of which are labeled. They need to
categorize all images into distinct categories such as landscapes, cityscapes, and portraits. Which learning
method is most suitable for this task?
1. Supervised Learning
2. Unsupervised Learning
3. Semi-Supervised Learning
4. Reinforcement Learning
Question 7: Which of the following is a typical application of unsupervised learning?
1. Predicting the stock market
2. Detecting fraudulent transactions
3. Segmenting customers based on purchasing behavior
4. Classifying emails into spam and non-spam
Question 8: What is the purpose of data labeling in the context of machine learning data preparation?
1. To enhance the accuracy of the model by increasing data volume
2. To assign meaningful tags or labels to the data, facilitating supervised learning
3. To divide the dataset into training and testing sets
4. To identify outliers and remove them from the dataset
Question 9: What is the main reason for splitting data into training and testing sets in machine learning?
1. To ensure the model can handle large amounts of data
2. To allow the model to learn from one set and validate on another to prevent overfitting
3. To make the computation faster
4. To label the data automatically
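The idea behind the train/test split (option 2 above) can be sketched in a few lines of pure Python. The helper name `train_test_split` echoes the well-known scikit-learn function, but this is a from-scratch illustration, not that library's implementation:

```python
import random

def train_test_split(data, test_ratio=0.2, seed=42):
    """Shuffle and split a dataset so a model can learn from one
    part and be validated on held-out data it has never seen."""
    rng = random.Random(seed)
    indices = list(range(len(data)))
    rng.shuffle(indices)
    cut = int(len(data) * (1 - test_ratio))
    train = [data[i] for i in indices[:cut]]
    test = [data[i] for i in indices[cut:]]
    return train, test

train, test = train_test_split(list(range(100)))
# 80 training samples, 20 held-out test samples, no overlap
```

Because the two sets are disjoint, good test performance is evidence the model generalises rather than memorises.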
Question 10: Data augmentation is commonly used in which of the following scenarios?
1. When the dataset is too large and needs to be reduced
2. When the dataset is imbalanced
3. When there is an abundance of labeled data
4. When the data is perfectly labeled and prepared
Question 11: What is the primary function of a perceptron in deep learning?
1. It acts as a basic decision-making unit that processes input data.
2. It clusters data into meaningful groups.
3. It reduces the dimensionality of the input data.
4. It labels the data automatically.
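A perceptron as a basic decision unit (option 1 above) can be written directly: a weighted sum of the inputs followed by a threshold. The AND-gate weights below are hand-picked illustrative values, not learned ones:

```python
def perceptron(inputs, weights, bias):
    """Weighted sum of inputs plus a bias, passed through a step
    threshold: the simplest yes/no decision-making unit."""
    s = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if s > 0 else 0

# With weights (1, 1) and bias -1.5 it behaves like a logical AND:
out_both = perceptron([1, 1], [1.0, 1.0], -1.5)   # fires (1)
out_one = perceptron([1, 0], [1.0, 1.0], -1.5)    # stays off (0)
```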

Question 12: In the context of neural networks, what is a multilayer perceptron (MLP)?
1. A single-layer feedforward neural network.
2. A type of convolutional neural network used for image processing.
3. A deep neural network consisting of multiple layers of perceptrons.
4. A reinforcement learning algorithm used for sequence prediction.
Question 13: Why might a deep learning model be preferred over a simpler machine learning model for
tasks like image classification?
1. Deep learning models are always faster to train.
2. Deep learning models can handle linear relationships better.
3. Deep learning models can automatically learn and improve from vast amounts of data.
4. Deep learning models require less computational resources.
Question 14: What is the primary purpose of an activation function in neural networks?
1. To normalize the input data
2. To allow the network to learn non-linear patterns
3. To reduce the dimensionality of the data
4. To prevent overfitting during training
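Option 2 above is the key point: without a non-linear activation, stacked linear layers collapse into a single linear map. A minimal sketch using ReLU (the values below are arbitrary, chosen only to trace the computation):

```python
def relu(x):
    """Rectified Linear Unit: outputs x if positive, else 0."""
    return max(0.0, x)

def layer(x, w, b, activation=relu):
    # one tiny "neuron": a linear step followed by the activation
    return activation(w * x + b)

# Two stacked neurons; the ReLU between them is what lets the
# network represent non-linear patterns:
out = layer(layer(2.0, 1.5, -1.0), 2.0, 0.5)
```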
Question 15: What is the primary benefit of using cross-validation in machine learning model evaluation?
1. It reduces the computational time needed for training models.
2. It allows for extensive use of all available data by rotating the validation set.
3. It simplifies the model training process.
4. It reduces the need for data preprocessing.
Question 16: What does stratified k-fold cross-validation specifically aim to achieve?
1. It ensures that each fold is used as a validation set exactly once.
2. It guarantees that all folds have a similar mean response value.
3. It preserves the percentage of samples for each class in every fold.
4. It minimizes the variance between the different folds.
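Option 3 above can be demonstrated with a from-scratch fold assignment; dealing each class's samples round-robin across folds keeps the class proportions roughly equal in every fold. This is an illustrative sketch, not scikit-learn's `StratifiedKFold` algorithm:

```python
from collections import defaultdict

def stratified_folds(labels, k):
    """Assign sample indices to k folds so each fold preserves
    (approximately) the class proportions of the full dataset."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        for j, i in enumerate(idxs):
            folds[j % k].append(i)   # deal this class round-robin
    return folds

labels = ["A"] * 8 + ["B"] * 4       # overall ratio 2:1
folds = stratified_folds(labels, 4)
# every fold ends up with 2 "A" samples and 1 "B" sample
```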
Question 17: In the context of Leave-One-Out (LOO) cross-validation, how many times is the model
trained if the dataset contains 150 observations?
1. Once
2. 150 times
3. 149 times
4. 15 times
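The answer (150 times) follows directly from the definition: each observation serves as the validation set exactly once, so the model is retrained once per observation. A minimal split generator:

```python
def leave_one_out_splits(n):
    """Leave-One-Out CV: each of the n observations is held out as
    the validation set exactly once, so training happens n times."""
    for i in range(n):
        train_idx = [j for j in range(n) if j != i]
        yield train_idx, [i]

splits = list(leave_one_out_splits(150))
# 150 observations -> 150 training runs, each on 149 samples
```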
Question 18: What distinguishes Leave-P-Out (LPO) cross-validation from Leave-One-Out (LOO)?
1. LPO uses a larger validation set than LOO.
2. LPO requires more computational power than LOO.
3. LPO is less exhaustive and quicker than LOO.
4. LPO does not rotate the validation set, unlike LOO.

Question 19: Select the correct answer:
1. In a medical diagnostic test for a rare disease, a patient who actually has the disease tests positive.
What is this called?
(a) True Positive
(b) False Positive
(c) True Negative
(d) False Negative
2. In a spam email filter, a legitimate email is incorrectly marked as spam. What is this called?
(a) True Positive
(b) False Positive
(c) True Negative
(d) False Negative
3. In a credit card fraud detection system, a non-fraudulent transaction is correctly identified as not
fraudulent. What is this called?
(a) True Positive
(b) False Positive
(c) True Negative
(d) False Negative
4. In a medical test for a life-threatening condition, a patient who actually has the condition tests negative. What is this called?
(a) True Positive
(b) False Positive
(c) True Negative
(d) False Negative
5. In a security system, a harmless object is identified as a threat. What is this called?
(a) True Positive
(b) False Positive
(c) True Negative
(d) False Negative
6. In an animal conservation project, a species that is actually endangered is correctly identified as
such. What is this called?
(a) True Positive
(b) False Positive
(c) True Negative
(d) False Negative
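The six scenarios above all reduce to tallying the four confusion-matrix cells. A small counter makes the mapping explicit (the example labels are made up purely for illustration):

```python
def confusion_counts(y_true, y_pred):
    """Tally TP/FP/TN/FN for a binary task (1 = positive class)."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            counts["TP"] += 1      # sick patient tests positive
        elif t == 0 and p == 1:
            counts["FP"] += 1      # legitimate email flagged as spam
        elif t == 0 and p == 0:
            counts["TN"] += 1      # normal transaction cleared
        else:
            counts["FN"] += 1      # sick patient tests negative
    return counts

c = confusion_counts([1, 0, 0, 1], [1, 1, 0, 0])
```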
Question 20: Which metric would you use to evaluate a model’s ability to predict positive classes
accurately?
1. Precision
2. Recall
3. F1-Score
4. Mean Squared Error
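Precision (option 1) asks: of everything the model predicted positive, how much really was? Recall, by contrast, asks how many of the actual positives were found. Both drop out of the confusion-matrix counts:

```python
def precision(tp, fp):
    """Of all predicted positives, the fraction that were correct."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Of all actual positives, the fraction the model found."""
    return tp / (tp + fn)

# 8 true positives, 2 false positives, 8 false negatives:
p = precision(8, 2)   # 0.8
r = recall(8, 8)      # 0.5
```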
Question 21: Which of the following is a sign that a machine learning model might be overfitting?
1. The model performs well on the training data but poorly on unseen data.
2. The model performs equally well on both training and unseen data.
3. The model performs poorly on both training and unseen data.
4. The model’s performance on unseen data exceeds its performance on training data.
Question 22: What technique can be used to prevent overfitting in a machine learning model?
1. Decreasing the complexity of the model.
2. Using a larger dataset.
3. Applying regularization techniques.
4. All of the above.
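Regularization (part of answer 4 above) can be sketched as a loss penalty: an L2 (ridge-style) term grows with the weights, so the optimizer is pushed toward simpler models. The numbers below are illustrative only:

```python
def ridge_loss(y_true, y_pred, weights, lam=0.1):
    """Mean squared error plus an L2 penalty on the weights;
    larger weights (more complex fits) are penalised harder."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return mse + lam * sum(w * w for w in weights)

# Two models with identical fit quality but different weight sizes:
small = ridge_loss([1.0, 2.0], [1.0, 2.0], [0.5])
large = ridge_loss([1.0, 2.0], [1.0, 2.0], [5.0])
# the large-weight model pays a higher regularized loss
```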
Question 23: What does it indicate if a model has high variance?
1. The model performs well on the training data but poorly on unseen data.
2. The model performs poorly on both training and unseen data.
3. The model performs consistently on both training and unseen data.
4. The model requires more data to perform effectively.
Question 24: The following figure depicts the training and validation error curves of a learner as model complexity increases.

1. Which of the curves is more likely to be the training error and which is more likely to be the
validation error? Indicate on the graph by filling the dotted lines.
2. In which regions of the graph are bias and variance low and high? Indicate clearly on the graph
with four labels: “low variance”, “high variance”, “low bias”, “high bias”.
3. In which regions does the model overfit or underfit? Indicate clearly on the graph by labeling
“overfit” and “underfit”.
Question 25: For each of the listed descriptions below, circle whether the experimental set-up is ok or
problematic. If you think it is problematic, briefly state all the problems with their approach:
1. A project team reports a low training error and claims their method is good.
(a) Ok
(b) Problematic
2. A project team claimed great success after achieving 98 percent classification accuracy on a binary
classification task where one class is very rare (e.g., detecting fraud transactions). Their data
consisted of 50 positive examples and 5,000 negative examples.
(a) Ok
(b) Problematic
3. A project team split their data into training and test. Using their training data and cross-validation, they chose the best parameter setting. They built a model using these parameters and their training data, and then reported their error on test data.
(a) Ok
(b) Problematic
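Item 2 above is the classic imbalanced-accuracy trap, and a two-line calculation shows why: on 50 positive versus 5,000 negative examples, a useless model that always predicts "negative" still scores very high accuracy.

```python
# A degenerate classifier that always predicts "negative"
# on the fraud-detection dataset from item 2:
positives, negatives = 50, 5000
correct = negatives                      # every negative is "cleared"
accuracy = correct / (positives + negatives)
# accuracy is about 0.99 even though recall on fraud is 0.0
```

So a 98% accuracy claim here tells us almost nothing; precision, recall, or F1 on the rare class would be the meaningful metrics.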
Question 26: The following table lists a dataset from the credit scoring domain. Underneath the table
we list two prediction models that are consistent with this dataset, Model 1 and Model 2.

1. Which of these two models do you think will generalise better to samples not contained in the
dataset?
2. Do you think that the model that you rejected in part (1) of this question is overfitting or underfitting
the data?
Question 27:
In the context of regression, what does RMSE (Root Mean Squared Error) measure?
1. The number of classification errors.
2. The average squared difference between the predicted values and actual values.
3. The proportion of variance in the dependent variable that is predictable.
4. The average distance between each data point and the mean of the data set.
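Answer 2 above, written as code: RMSE is the square root of the mean of the squared prediction errors (the toy values here are chosen only to make the arithmetic easy to check).

```python
import math

def rmse(y_true, y_pred):
    """Root Mean Squared Error: sqrt of the average squared
    difference between predicted and actual values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

# errors of 2 and 0 -> mean squared error 2 -> RMSE = sqrt(2)
err = rmse([3.0, 5.0], [1.0, 5.0])
```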
