Midterm Exam - Attempt Review

Uploaded by

mrym shuttfup
6/19/24, 11:43 PM Midterm Exam: Attempt review

Home / My courses / UGRD-CYBS6101-2333T / MIDTERM EXAMINATION / Midterm Exam

Started on Tuesday, 18 June 2024, 6:49 PM


State Finished
Completed on Tuesday, 18 June 2024, 7:32 PM
Time taken 43 mins 30 secs
Marks 47.00/50.00
Grade 94.00 out of 100.00

Question 1
Correct

Mark 1.00 out of 1.00

What is the process of selecting a subset of data for analysis called?

Select one:
a. Normalizing
b. Sampling
c. Filtering
d. Cleaning

Question 2

Correct

Mark 1.00 out of 1.00

What is the EM algorithm used to estimate in the "E" step?

Select one:
a. The latent variables
b. The model parameters
c. The likelihood of the model
d. The prediction accuracy of the model

https://trimestralexam.amaesonline.com/2333A/mod/quiz/review.php?attempt=20676&cmid=12924&showall=1 1/17
Question 3

Correct

Mark 1.00 out of 1.00

What is a batch learning algorithm?

Select one:
a. An algorithm that processes the training data in small groups or batches
b. An algorithm that processes all of the training data at once
c. An algorithm that processes the training data in real-time
d. An algorithm that processes the training data one example at a time

Question 4

Correct

Mark 1.00 out of 1.00

Which of the following is NOT a disadvantage of the k-means algorithm?

Select one:
a. It can be computationally expensive for large datasets
b. It is sensitive to the initial placement of centroids
c. It can handle categorical variables
d. It may produce suboptimal results if the clusters are not spherical

Question 5
Correct

Mark 1.00 out of 1.00

What is the process of transforming data into a consistent format called?

Select one:
a. Filtering
b. Cleaning
c. Normalizing
d. Sampling

Question 6

Correct

Mark 1.00 out of 1.00

Which of the following is NOT a node type in KNIME?

Select one:
a. Sink node
b. Output node
c. Processor node
d. Source node

Question 7

Correct

Mark 1.00 out of 1.00

What is a key characteristic of Bayesian networks?

Select one:
a. They are based on probability theory
b. They use linear algebra for prediction
c. They use decision trees for prediction
d. They are trained on large amounts of data

Question 8
Correct

Mark 1.00 out of 1.00

What is an example of a batch learning algorithm used for clustering tasks?

Select one:
a. DBSCAN
b. Agglomerative clustering
c. K-means
d. All of the above

Question 9

Correct

Mark 1.00 out of 1.00

What is the technical term for an edge in a directed acyclic graph (DAG)?

Select one:
a. Edge
b. Vertex
c. Graph
d. Cycle

Question 10

Correct

Mark 1.00 out of 1.00

The ______________ linkage criterion is a popular choice for hierarchical clustering, which merges the two clusters based on the mean distance between their points.

Select one:
a. Average
b. Single
c. Centroid
d. Complete
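
The linkage criteria named in these options differ only in how they turn point-to-point distances into one cluster-to-cluster distance. The following is a minimal sketch, assuming Euclidean distance and two made-up clusters; the function names are illustrative and not taken from the exam or any library. In agglomerative clustering, whichever criterion is used, the pair of clusters with the smallest linkage value is the pair that gets merged.

```python
from itertools import product
from math import dist


def pairwise_distances(cluster_a, cluster_b):
    """All Euclidean distances between a point in cluster_a and one in cluster_b."""
    return [dist(p, q) for p, q in product(cluster_a, cluster_b)]


def single_linkage(cluster_a, cluster_b):
    """Distance between the two closest points of the clusters."""
    return min(pairwise_distances(cluster_a, cluster_b))


def complete_linkage(cluster_a, cluster_b):
    """Distance between the two farthest points of the clusters."""
    return max(pairwise_distances(cluster_a, cluster_b))


def average_linkage(cluster_a, cluster_b):
    """Mean of all pairwise distances between the clusters."""
    distances = pairwise_distances(cluster_a, cluster_b)
    return sum(distances) / len(distances)


# Illustrative data only.
cluster_a = [(0.0, 0.0), (0.0, 1.0)]
cluster_b = [(3.0, 0.0), (4.0, 0.0)]

single = single_linkage(cluster_a, cluster_b)
complete = complete_linkage(cluster_a, cluster_b)
average = average_linkage(cluster_a, cluster_b)
```

For any pair of clusters the average value lies between the single and complete values.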

Question 11
Correct

Mark 1.00 out of 1.00

What is the main goal of the EM algorithm?

Select one:
a. To minimize the error between the predicted and actual values of the data
b. To maximize the likelihood of a model given the data
c. To minimize the cost or loss function of a model
d. To maximize the prediction accuracy of the model

Question 12

Correct

Mark 1.00 out of 1.00

What is the Kullback-Leibler (KL) distance used for?

Select one:
a. To measure the dissimilarity between two probability distributions
b. To measure the uncertainty of a probability distribution
c. To measure the predictability of a probability distribution
d. To measure the similarity between two probability distributions

Question 13

Correct

Mark 1.00 out of 1.00

What is the least squares method used for?

Select one:
a. To calculate the mean of a data set
b. To find the line of best fit for a set of data
c. To solve systems of linear equations
d. To calculate the variance of a data set

Question 14
Correct

Mark 1.00 out of 1.00

How is the Hebb rule used in the training of a neural network?

Select one:
a. It is used to determine the input to the neural network
b. It is used to determine the structure of the neural network
c. It is used to calculate the output of the neural network
d. It is used to adjust the weights of the neural network based on the input and output
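
As a rough illustration of the weight adjustment described in option d, the basic Hebb rule adds the product of input and output to each weight: w_i becomes w_i + lr * x_i * y. The sketch below assumes bipolar (+1/-1) values and a made-up AND-style training set; `hebb_update` and `learning_rate` are hypothetical names chosen for this example.

```python
def hebb_update(weights, inputs, output, learning_rate=0.1):
    """Return new weights after one Hebbian step: w_i + learning_rate * x_i * y."""
    return [w + learning_rate * x * output for w, x in zip(weights, inputs)]


weights = [0.0, 0.0]

# Bipolar AND-style pairs of (inputs, target output); illustrative data only.
samples = [([1, 1], 1), ([1, -1], -1), ([-1, 1], -1), ([-1, -1], -1)]

for inputs, target in samples:
    weights = hebb_update(weights, inputs, target, learning_rate=1.0)
```

Weights grow when input and output share a sign and shrink when they disagree, which is the "adjust based on the input and output" behavior the option refers to.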

Question 15

Incorrect

Mark 0.00 out of 1.00

What is the role of the centroid in the k-means algorithm?

Select one:
a. It is a data point that is randomly chosen to be the initial center of a cluster [selected; marked incorrect]
b. It is a data point that is randomly chosen to be removed from the cluster
c. It is a data point that is representative of the cluster
d. It is the center point of a cluster
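
To make the distinction between these options concrete: in standard k-means the centroid is the coordinate-wise mean of the points currently assigned to a cluster, so it does not have to coincide with any data point. A minimal sketch, with a hypothetical `centroid` helper and made-up points:

```python
def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


# Illustrative cluster; the values are not from the exam.
cluster = [(1.0, 2.0), (3.0, 4.0), (5.0, 0.0)]
center = centroid(cluster)
```

Here the computed center is a point that does not appear in the cluster itself.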

Question 16

Correct

Mark 1.00 out of 1.00

What is KNIME used for?

Select one:
a. Data mining
b. All of the above
c. Data visualization
d. Data analysis

Question 17
Correct

Mark 1.00 out of 1.00

What is supervised learning used for?

Select one:
a. Regression tasks
b. Unsupervised learning tasks
c. Classification tasks
d. Both classification and regression tasks

Question 18

Correct

Mark 1.00 out of 1.00

The ______________ linkage criterion is a popular choice for hierarchical clustering, which merges the two clusters that have the minimum distance between them.

Select one:
a. Single
b. Average
c. Complete
d. Centroid

Question 19
Correct

Mark 1.00 out of 1.00

Can the Naive Bayes classifier handle missing or incomplete data?

Select one:
a. It can handle missing data but not incomplete data
b. No, it cannot handle missing or incomplete data
c. Yes, it can handle missing or incomplete data
d. It can handle incomplete data but not missing data

Question 20
Correct

Mark 1.00 out of 1.00

Which of the following is NOT a common application of the k-means algorithm?

Select one:
a. Customer segmentation
b. Image compression
c. Anomaly detection
d. Regression analysis

Question 21

Incorrect

Mark 0.00 out of 1.00

Which of the following is NOT a limitation of the k-means algorithm?

Select one:
a. It is sensitive to the initial placement of centroids
b. It may produce suboptimal results if the clusters are not spherical
c. It requires the user to specify the number of clusters in advance [selected; marked incorrect]
d. It is not affected by the scale of the variables
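
The statement in option d can be checked with a small experiment. Because k-means relies on Euclidean distance, a feature with a large numeric range can dominate the result. The sketch below assumes made-up (age, income) points and a hypothetical `min_max_scale` helper that rescales each feature to the range 0 to 1 (it assumes every feature has at least two distinct values).

```python
from math import dist


def min_max_scale(points):
    """Rescale each feature to [0, 1]; assumes max > min for every feature."""
    columns = list(zip(*points))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return [
        tuple((v - lo) / (hi - lo) for v, lo, hi in zip(point, lows, highs))
        for point in points
    ]


# (age in years, income in dollars); illustrative values only.
a, b, c = (25.0, 50000.0), (60.0, 50500.0), (26.0, 52000.0)

raw_ab, raw_ac = dist(a, b), dist(a, c)

scaled_a, scaled_b, scaled_c = min_max_scale([a, b, c])
scaled_ab, scaled_ac = dist(scaled_a, scaled_b), dist(scaled_a, scaled_c)
```

On the raw values, point a is nearer to b because income dominates the distance; after rescaling, a is nearer to c. The nearest-neighbor relationship, and therefore a cluster assignment, can change with the scale of the variables.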

Question 22

Correct

Mark 1.00 out of 1.00

The KL distance can be used to measure the difference between two probability distributions in terms of the information content of the distributions. In this context, the KL distance is also known as:

Select one:
a. The information ratio
b. The information gain
c. The information distance
d. The information divergence

Question 23
Correct

Mark 1.00 out of 1.00

What is the main goal of the k-means algorithm?

Select one:
a. To classify data into predefined categories
b. To discover patterns or relationships within a dataset
c. To predict the value of a continuous target variable
d. To partition a dataset into a specified number of clusters

Question 24

Correct

Mark 1.00 out of 1.00

What is the process of identifying and removing duplicate data called?

Select one:
a. Cleaning
b. Filtering
c. De-duplication
d. Sampling

Question 25

Correct

Mark 1.00 out of 1.00

How are batch learning algorithms typically used?

Select one:
a. To predict continuous values in batch mode
b. To predict continuous values in real-time
c. To classify data in real-time
d. To classify data in batch mode

Question 26
Correct

Mark 1.00 out of 1.00

What are some disadvantages of batch learning algorithms?

Select one:
a. They are slow to adapt to changes in the data
b. They are prone to overfitting
c. They require a small amount of data
d. They require a large amount of resources

Question 27

Correct

Mark 1.00 out of 1.00

What is the disadvantage of the Naive Bayes classifier?

Select one:
a. It is unable to handle large amounts of data
b. It is less accurate
c. It is inflexible
d. It is slower to train and predict

Question 28

Correct

Mark 1.00 out of 1.00

What is the process of calculating the error between the desired output and the actual output of a perceptron called?

Select one:
a. Testing
b. Pruning
c. Training
d. Validation
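
For context, here is one way the error between desired and actual output drives a perceptron's learning loop. This is a minimal sketch, assuming a step activation, integer weights, and the logical AND function as training data; `predict` and `train_perceptron` are hypothetical names, not part of the exam material.

```python
def predict(weights, bias, inputs):
    """Step activation: 1 if the weighted sum is non-negative, else 0."""
    activation = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if activation >= 0 else 0


def train_perceptron(samples, learning_rate=1, epochs=10):
    """Perceptron learning rule: shift weights by learning_rate * error * input."""
    weights = [0] * len(samples[0][0])
    bias = 0
    for _ in range(epochs):
        for inputs, desired in samples:
            # Error between the desired output and the actual output.
            error = desired - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Logical AND as (inputs, desired output) pairs; illustrative data only.
and_gate = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = train_perceptron(and_gate)
```

When the error is zero the weights stay unchanged, so the loop settles once every sample is classified correctly.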

Question 29
Correct

Mark 1.00 out of 1.00

How does supervised learning differ from unsupervised learning?

Select one:
a. Supervised learning involves predicting a continuous value, while unsupervised learning involves predicting a categorical value
b. Supervised learning involves clustering data, while unsupervised learning involves predicting a value
c. Supervised learning involves predicting a value, while unsupervised learning involves clustering data
d. Supervised learning involves labeled data, while unsupervised learning involves unlabeled data

Question 30

Correct

Mark 1.00 out of 1.00

What is the Naive Bayes classifier used for?

Select one:
a. To classify data into different categories based on certain features
b. All of the above
c. To predict the value of a continuous variable
d. To predict the probability of an event occurring

Question 31

Correct

Mark 1.00 out of 1.00

Is the least squares method a deterministic or a probabilistic method?

Select one:
a. Probabilistic
b. Deterministic
c. Neither deterministic nor probabilistic
d. Both deterministic and probabilistic

Question 32
Correct

Mark 1.00 out of 1.00

What is an example of a real-world application of directed acyclic graphs (DAGs)?

Select one:
a. Social media networks
b. Data pipelines
c. All of the above
d. Computer networks
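
Taking the data pipeline option as an example, each processing step can be treated as a vertex and each "must run before" dependency as a directed edge. The sketch below uses the standard-library `graphlib.TopologicalSorter` (Python 3.9 or later); the step names are hypothetical.

```python
from graphlib import TopologicalSorter

# Map each step to the set of steps it depends on (its predecessors).
# Step names are made up for illustration.
pipeline = {
    "clean": {"extract"},
    "normalize": {"clean"},
    "train": {"normalize"},
    "report": {"train", "clean"},
}

# A valid execution order exists only because the graph has no cycles.
order = list(TopologicalSorter(pipeline).static_order())
```

If a dependency cycle were introduced, `static_order()` would raise `graphlib.CycleError` instead of returning an order.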

Question 33

Correct

Mark 1.00 out of 1.00

The KL distance is often used in machine learning and artificial intelligence to compare two probability distributions, such as a model's predicted distribution and the true distribution. In this context, the KL distance can be used as a:

Select one:
a. Loss function
b. Cost function
c. Kernel function
d. Activation function

Question 34
Correct

Mark 1.00 out of 1.00

What is the "M" step in the EM algorithm?

Select one:
a. The step where the model parameters are updated
b. The step where the expectation of the latent variables is calculated
c. The step where the prediction accuracy of the model is calculated
d. The step where the likelihood of the model is maximized
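
The two steps named in these options can be sketched for one common use of EM, fitting a one-dimensional mixture of Gaussians. This is an illustrative sketch under stated assumptions, not the course's reference implementation: the data, the starting parameters, the variance floor, and the names `gaussian_pdf` and `em_step` are all made up for the example.

```python
from math import exp, pi, sqrt


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return exp(-((x - mean) ** 2) / (2 * var)) / sqrt(2 * pi * var)


def em_step(data, weights, means, variances):
    """One EM iteration for a 1-D Gaussian mixture."""
    # E step: expected component membership (responsibility) of each point,
    # computed under the current parameters.
    resp = []
    for x in data:
        joint = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        resp.append([j / total for j in joint])

    # M step: update the parameters to maximize the expected log-likelihood.
    n = len(data)
    new_weights, new_means, new_variances = [], [], []
    for k in range(len(weights)):
        nk = sum(r[k] for r in resp)
        mean = sum(r[k] * x for r, x in zip(resp, data)) / nk
        var = sum(r[k] * (x - mean) ** 2 for r, x in zip(resp, data)) / nk
        new_weights.append(nk / n)
        new_means.append(mean)
        new_variances.append(max(var, 1e-6))  # floor is an assumption to avoid collapse
    return new_weights, new_means, new_variances


# Two well-separated groups; illustrative data and starting guesses.
data = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
params = ([0.5, 0.5], [0.0, 6.0], [1.0, 1.0])
for _ in range(20):
    params = em_step(data, *params)
weights, means, variances = params
```

On this data the estimated means settle near the centers of the two groups.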

Question 35
Correct

Mark 1.00 out of 1.00

In information theory, the KL distance can be used to measure the information lost when approximating one distribution with another. Which of the following is NOT a property of the KL distance in this context?

Select one:
a. It is non-negative
b. It is zero only when the two distributions are identical
c. It is non-symmetric
d. It is always positive

Question 36

Correct

Mark 1.00 out of 1.00

Hierarchical clustering is sensitive to the ______________ of the data.

Select one:
a. All of the above
b. Scale
c. Outliers
d. Variance

Question 37

Correct

Mark 1.00 out of 1.00

What is the EM algorithm used for?

Select one:
a. Classification
b. All of the above
c. Regression
d. Clustering

Question 38
Correct

Mark 1.00 out of 1.00

How is the slope of the line of best fit calculated using the least squares method?

Select one:
a. By dividing the sum of the product of the x values and the y values by the sum of the squares of the x values
b. By dividing the sum of the y values by the sum of the x values
c. By dividing the sum of the y values by the sum of the squares of the x values
d. By dividing the sum of the product of the x values and the y values by the sum of the x values
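
A note on the formula in option a: dividing the sum of x*y by the sum of x squared gives the least squares slope when x and y are measured as deviations from their means (or when the line is forced through the origin). For raw data with an intercept, the usual form centers both variables first. The sketch below assumes made-up data; `fit_line` and `sum_squared_errors` are hypothetical helper names.

```python
def fit_line(xs, ys):
    """Least squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products and sum of squares of the mean-centered values.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_errors(xs, ys, slope, intercept):
    """The quantity that the least squares method minimizes."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Illustrative data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
best_sse = sum_squared_errors(xs, ys, slope, intercept)
```

Any other slope or intercept gives a larger sum of squared errors on the same data.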

Question 39

Correct

Mark 1.00 out of 1.00

The KL distance can be used to measure the information lost when approximating one distribution with another. In this context, the distribution being approximated is known as the:

Select one:
a. Approximation distribution
b. Target distribution
c. Reference distribution
d. Base distribution

Question 40
Correct

Mark 1.00 out of 1.00

What is an example of a classification task in supervised learning?

Select one:
a. Grouping customers into different segments based on their spending habits
b. Determining whether an email is spam or not
c. Predicting the price of a house based on its characteristics
d. Predicting the stock price for the next day based on historical data

Question 41
Correct

Mark 1.00 out of 1.00

What is the technical term for a node in a directed acyclic graph (DAG)?

Select one:
a. Graph
b. Vertex
c. Edge
d. Cycle

Question 42

Correct

Mark 1.00 out of 1.00

What is an example of a batch learning algorithm used for regression tasks?

Select one:
a. K-nearest neighbors
b. Decision tree
c. Support vector machine
d. Linear regression

Question 43

Correct

Mark 1.00 out of 1.00

What is a node in a Bayesian network?

Select one:
a. A probabilistic relationship between two variables
b. A point in the network where two or more edges meet
c. A variable in the system being modeled
d. All of the above

Question 44
Correct

Mark 1.00 out of 1.00

How is the line of best fit calculated using the least squares method?

Select one:
a. By minimizing the variance of the data set
b. By minimizing the sum of the absolute values of the errors between the data points and the line of best fit
c. By minimizing the mean of the data set
d. By minimizing the sum of the squares of the errors between the data points and the line of best fit

Question 45

Incorrect

Mark 0.00 out of 1.00

What is the process of using data mining techniques to identify trends and make predictions called?

Select one:
a. Data visualization
b. Data analysis
c. Data modeling [selected; marked incorrect]
d. Data mining

Question 46

Correct

Mark 1.00 out of 1.00

The ______________ linkage criterion is a popular choice for hierarchical clustering, which merges the two clusters that have the maximum distance between them.

Select one:
a. Average
b. Complete
c. Single
d. Centroid

Question 47
Correct

Mark 1.00 out of 1.00

What is the "E" step in the EM algorithm?

Select one:
a. The step where the expectation of the latent variables is calculated
b. The step where the prediction accuracy of the model is calculated
c. The step where the model parameters are updated
d. The step where the likelihood of the model is maximized

Question 48

Correct

Mark 1.00 out of 1.00

What is the process of visualizing data using charts and graphs called?

Select one:
a. Data modeling
b. Data analysis
c. Data mining
d. Data visualization

Question 49

Correct

Mark 1.00 out of 1.00

The KL distance between two discrete probability distributions P and Q is defined as:

Select one:
a. The sum of the products of the probabilities of each event in P and Q
b. The sum of the logarithm of the ratio of the probabilities of each event in P and Q
c. The sum of the differences between the probabilities of each event in P and Q
d. The sum of the ratio of the probabilities of each event in P and Q
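
For reference, the usual textbook definition weights each log ratio by the probability under P: D_KL(P || Q) = sum over x of P(x) * log(P(x) / Q(x)). The sketch below assumes two made-up discrete distributions over the same events and that Q(x) is positive wherever P(x) is positive; `kl_divergence` is a hypothetical helper name.

```python
from math import log


def kl_divergence(p, q):
    """D_KL(P || Q) in nats; terms with P(x) = 0 contribute nothing."""
    return sum(px * log(px / qx) for px, qx in zip(p, q) if px > 0)


# Illustrative distributions over two events.
p = [0.5, 0.5]
q = [0.9, 0.1]

p_to_q = kl_divergence(p, q)
q_to_p = kl_divergence(q, p)
p_to_p = kl_divergence(p, p)
```

The three values illustrate the properties usually listed for this measure: it is non-negative, it is zero when the two distributions are identical, and swapping the arguments generally changes the result.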

Question 50
Correct

Mark 1.00 out of 1.00

What does the least squares method aim to minimize?

Select one:
a. The sum of the absolute values of the errors between the data points and the line of best fit
b. The sum of the squares of the errors between the data points and the line of best fit
c. The mean of the data set
d. The variance of the data set
