Question Bank 3&4 Unit ML

Uploaded by

jagadeeswaris.ai
Copyright
© All Rights Reserved

UAM1501 - MACHINE LEARNING – II

QUESTION BANK
UNIT- III
2-Marks

1. What is the Dirichlet process?
The Dirichlet process (DP) is a stochastic process whose sample paths are probability
measures with probability one. Stochastic processes are distributions over function spaces,
with sample paths being random functions drawn from the distribution.
2. What is the EM algorithm?
The EM (expectation-maximization) algorithm is an iterative approach that alternates between
two steps. The expectation step (E-step) estimates the missing or latent variables given the
current parameters. The maximization step (M-step) then optimizes the parameters of the model
to best explain the data.
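The alternation between the E-step and the M-step can be sketched for a mixture of two one-dimensional Gaussians. This is a minimal illustration, not a reference implementation: the function names, the toy data, the starting values, and the variance floor are all assumptions made for the example.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_two_gaussians(data, means, variances, weights, n_iter=50):
    """Run EM for a 1-D Gaussian mixture; returns (means, variances, weights)."""
    means, variances, weights = list(means), list(variances), list(weights)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate the parameters from the weighted data.
        for k in range(len(means)):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # assumed floor to avoid a collapsed component
    return means, variances, weights

# Hypothetical toy data: two well-separated groups.
data = [1.0, 1.2, 0.8, 1.1, 5.0, 5.2, 4.8, 5.1]
means, variances, weights = em_two_gaussians(
    data, means=[0.0, 6.0], variances=[1.0, 1.0], weights=[0.5, 0.5]
)
```

With data this well separated, the estimated means settle near the averages of the two groups. On harder data the result depends on the starting values.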
3. What is the Gaussian mixture model?
A Gaussian mixture model is a probabilistic model that assumes all the data points are
generated from a mixture of a finite number of Gaussian distributions with unknown
parameters. One can think of mixture models as generalizing k-means clustering to
incorporate information about the covariance structure of the data as well as the centers of
the latent Gaussians.
4. Define model-based clustering.
It is a statistical approach to data clustering. The observed (multivariate) data is assumed to
have been generated from a finite mixture of component models. Each component model is a
probability distribution, typically a parametric multivariate distribution.
5. How does affinity propagation work?
In contrast to other clustering algorithms, Affinity Propagation does not require the number
of clusters to be specified beforehand. Instead, it iteratively adjusts the “responsibilities” and
“availabilities” between data points to determine the number of clusters and the assignment
of data points to those clusters.
6. What is a Laplacian in graph theory?
The Laplacian of a graph is the matrix L = D - A, where A is the adjacency matrix and D is the
diagonal matrix of vertex degrees. It provides a natural link between discrete representations,
such as graphs, and continuous representations, such as vector spaces and manifolds. Its most
important application is spectral clustering, which gives a computationally tractable solution
to the graph partitioning problem.
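As a small illustration, the unnormalized graph Laplacian is commonly defined as L = D - A, with A the adjacency matrix and D the diagonal degree matrix. The sketch below assumes that definition and uses a hypothetical three-vertex path graph. The function name is made up for the example.

```python
def graph_laplacian(adjacency):
    """Unnormalized Laplacian L = D - A for a symmetric adjacency matrix."""
    n = len(adjacency)
    degrees = [sum(row) for row in adjacency]
    return [
        [(degrees[i] if i == j else 0) - adjacency[i][j] for j in range(n)]
        for i in range(n)
    ]

# Hypothetical path graph 0 - 1 - 2.
A = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
L = graph_laplacian(A)
# L is [[1, -1, 0], [-1, 2, -1], [0, -1, 1]]; every row sums to zero.
```

Spectral clustering works with the eigenvectors of this matrix (or a normalized variant), which is not shown here.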
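7. What is the difference between single link, complete link, and average link?
Single linkage: the distance between two clusters is the minimum distance between any pair of
points, one from each cluster.
Complete linkage: the distance between two clusters is the maximum distance between any such
pair of points.
Average linkage: the distance between two clusters is the average distance over all such pairs.
In each case, the two clusters that are closest under the chosen distance are merged.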
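The three linkage criteria differ only in how they summarize the pairwise distances between two clusters. A minimal sketch, using two hypothetical clusters of 2-D points and Euclidean distance:

```python
import math
from itertools import product

def pairwise_distances(cluster_a, cluster_b):
    """Euclidean distances between every point in cluster_a and every point in cluster_b."""
    return [math.dist(p, q) for p, q in product(cluster_a, cluster_b)]

def single_link(cluster_a, cluster_b):
    return min(pairwise_distances(cluster_a, cluster_b))

def complete_link(cluster_a, cluster_b):
    return max(pairwise_distances(cluster_a, cluster_b))

def average_link(cluster_a, cluster_b):
    distances = pairwise_distances(cluster_a, cluster_b)
    return sum(distances) / len(distances)

# Hypothetical clusters chosen only for illustration.
cluster_a = [(0, 0), (0, 1)]
cluster_b = [(3, 0), (4, 0)]

d_single = single_link(cluster_a, cluster_b)
d_complete = complete_link(cluster_a, cluster_b)
d_average = average_link(cluster_a, cluster_b)
```

For any pair of clusters the single-link distance is never larger than the average-link distance, which in turn is never larger than the complete-link distance.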
8. Define Bayesian hierarchical clustering.
The Bayesian Hierarchical Clustering (BHC) algorithm is a fast approximate inference
method for a Dirichlet process mixture model, which performs agglomerative hierarchical
clustering in a Bayesian framework. BHC has previously been used to cluster genes from
single-time-point microarray observations, and an iterative reclassification extension
to BHC has been applied that improves the quality of the clustering.
9. Define curse of dimensionality.
As the number of dimensions or features increases, the amount of data needed for a machine
learning model to generalize accurately increases exponentially. The increase in dimensions
makes the data sparse, which makes generalization harder, so more training data is needed for
the model to generalize well.
10. What is dimensionality reduction?
It refers to techniques for reducing the number of input variables in training data. When
dealing with high dimensional data, it is often useful to reduce the dimensionality by
projecting the data to a lower dimensional subspace which captures the “essence” of the data.
11. What is singular value decomposition?
The singular value decomposition (SVD) of a matrix A is the factorization of A into the product
of three matrices, A = U D V^T, where the columns of U and V are orthonormal and the matrix D
is diagonal with non-negative real entries, the singular values (strictly positive in the
compact form that keeps only the rank-r part). The SVD is useful in many tasks.
12. What is the latent variable model approach?
Latent variable modeling refers to a varied group of statistical procedures that use one or
more unobserved (latent) variables to explain and explore relationships between a larger set
of observed variables.
13. Define Gibbs Sampling.
Gibbs sampling is a Markov chain Monte Carlo (MCMC) method that iteratively draws a value for
each variable from its distribution conditional on the current values of the other variables,
in order to estimate complex joint distributions. In contrast to the Metropolis-Hastings
algorithm, the proposal is always accepted.
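A minimal sketch of the idea, assuming a target that is a standard bivariate normal with correlation rho. For that target each conditional is itself normal, x given y being N(rho * y, 1 - rho^2), so each variable can be redrawn in turn from its exact conditional. The function names, the burn-in length, and the seed are choices made for this example.

```python
import math
import random

def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    cond_sd = math.sqrt(1 - rho ** 2)  # standard deviation of each conditional
    x, y = 0.0, 0.0
    samples = []
    for step in range(n_samples + burn_in):
        x = rng.gauss(rho * y, cond_sd)  # draw x given the current y
        y = rng.gauss(rho * x, cond_sd)  # draw y given the new x
        if step >= burn_in:
            samples.append((x, y))
    return samples

def correlation(samples):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    var_x = sum((x - mean_x) ** 2 for x, _ in samples)
    var_y = sum((y - mean_y) ** 2 for _, y in samples)
    return cov / math.sqrt(var_x * var_y)

samples = gibbs_bivariate_normal(rho=0.8, n_samples=20000)
estimated_rho = correlation(samples)
```

Every draw is kept, with no accept or reject step. The estimated correlation should land close to the rho used to build the conditionals.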
14. What is a Latent Dirichlet Allocation?
Latent Dirichlet Allocation (LDA) is a popular topic modeling technique to extract topics
from a given corpus. The term latent conveys something that exists but is not yet developed.
In other words, latent means hidden or concealed.

16-Marks

1. Solve the following problem using the k-means clustering algorithm.

Point   Coordinates     Point   Coordinates
A1      (2,10)          A9      (10,12)
A2      (2,6)           A10     (7,5)
A3      (11,11)         A11     (9,11)
A4      (6,9)           A12     (4,6)
A5      (6,4)           A13     (3,10)
A6      (1,2)           A14     (3,8)
A7      (5,10)          A15     (6,11)
A8      (4,9)
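One way to work through this problem is sketched below. The question does not state the number of clusters or the starting centroids, so the choice of k = 3 with A1, A5, and A9 as initial centroids is an assumption made only for illustration. A different choice can give different clusters.

```python
import math

points = {
    "A1": (2, 10), "A2": (2, 6), "A3": (11, 11), "A4": (6, 9), "A5": (6, 4),
    "A6": (1, 2), "A7": (5, 10), "A8": (4, 9), "A9": (10, 12), "A10": (7, 5),
    "A11": (9, 11), "A12": (4, 6), "A13": (3, 10), "A14": (3, 8), "A15": (6, 11),
}

def kmeans(points, centroids, max_iter=100):
    """Plain k-means: assign each point to the nearest centroid, then recompute means."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        clusters = [[] for _ in centroids]
        for name, p in points.items():
            nearest = min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            clusters[nearest].append(name)
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for k, members in enumerate(clusters):
            if members:
                xs = [points[m][0] for m in members]
                ys = [points[m][1] for m in members]
                new_centroids.append((sum(xs) / len(xs), sum(ys) / len(ys)))
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments no longer change
        centroids = new_centroids
    return centroids, clusters

# Assumed starting centroids (not given in the question): A1, A5, A9.
initial = [points["A1"], points["A5"], points["A9"]]
centroids, clusters = kmeans(points, initial)
```

Printing `centroids` and `clusters` after each pass of the loop reproduces the iteration tables that a hand solution would show.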

2. Explain in detail about EM algorithm with suitable example.


3. Explain how to apply and fit Dirichlet process mixture models.
4. Solve the problem using spectral clustering algorithms.

5. Solve the problem using hierarchical clustering algorithms.


6. Solve the problem using fuzzy-based clustering algorithms.

Cluster   (1, 3)   (2, 5)   (4, 8)   (7, 9)
1         0.8      0.7      0.2      0.1
2         0.2      0.3      0.8      0.9
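One iteration of fuzzy c-means on this table can be sketched as follows. The fuzzifier m = 2 is an assumption, since the question does not give one, and the function names are made up for the example. Each centroid is the mean of the points weighted by membership raised to the power m, and memberships are then recomputed from the distances to the new centroids.

```python
import math

points = [(1, 3), (2, 5), (4, 8), (7, 9)]
# memberships[k][i] is the membership of point i in cluster k (from the table).
memberships = [
    [0.8, 0.7, 0.2, 0.1],
    [0.2, 0.3, 0.8, 0.9],
]
m = 2  # assumed fuzzifier

def update_centroids(points, memberships, m):
    """Centroid k = sum_i u_ki^m * x_i / sum_i u_ki^m."""
    centroids = []
    for row in memberships:
        weights = [u ** m for u in row]
        total = sum(weights)
        centroids.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total for d in range(2)
        ))
    return centroids

def update_memberships(points, centroids, m):
    """u_ki = 1 / sum_j (d_ki / d_ji)^(2 / (m - 1)); assumes no point sits on a centroid."""
    exponent = 2 / (m - 1)
    new = [[0.0] * len(points) for _ in centroids]
    for i, p in enumerate(points):
        dists = [math.dist(p, c) for c in centroids]
        for k, dk in enumerate(dists):
            new[k][i] = 1 / sum((dk / dj) ** exponent for dj in dists)
    return new

centroids = update_centroids(points, memberships, m)
new_memberships = update_memberships(points, centroids, m)
```

With m = 2 the first centroid comes out near (1.568, 4.051) and the second near (5.348, 8.215). Repeating the two updates until the memberships stop changing completes the algorithm.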

7. Apply principal components analysis to the following data.

        CLASS 1      CLASS 2
X       2, 3, 4      5, 6, 7
Y       1, 5, 3      6, 7, 8
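A sketch of the calculation, under two assumptions: the two classes are pooled into six (X, Y) observations, since PCA ignores class labels, and the sample covariance uses the n - 1 divisor. The eigenvalues of the 2 x 2 covariance matrix are found in closed form from its trace and determinant.

```python
import math

# Pooled observations from both classes (pooling is an assumption).
x = [2, 3, 4, 5, 6, 7]
y = [1, 5, 3, 6, 7, 8]

def covariance(u, v):
    """Sample covariance with the n - 1 divisor."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)

sxx, syy, sxy = covariance(x, x), covariance(y, y), covariance(x, y)

# Eigenvalues of [[sxx, sxy], [sxy, syy]] from the characteristic equation.
trace = sxx + syy
det = sxx * syy - sxy ** 2
root = math.sqrt(trace ** 2 / 4 - det)
lambda1 = trace / 2 + root  # variance along the first principal component
lambda2 = trace / 2 - root

# Eigenvector for lambda1 (valid because sxy is non-zero here), scaled to unit length.
vx, vy = sxy, lambda1 - sxx
norm = math.hypot(vx, vy)
pc1 = (vx / norm, vy / norm)

explained_ratio = lambda1 / (lambda1 + lambda2)
```

Under these assumptions the covariance matrix is [[3.5, 4.4], [4.4, 6.8]] and the first principal component carries most of the total variance. Projecting the centered data onto `pc1` gives the one-dimensional representation.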

8. Explain latent Dirichlet allocation in detail.

UNIT- IV
2-Marks

1. What is a Bayesian network?
A Bayesian network (BN) is a probabilistic graphical model for representing knowledge
about an uncertain domain where each node corresponds to a random variable and each
edge represents a conditional dependence between the corresponding random variables.
2. Define joint probability distribution.
A joint probability distribution represents a probability distribution for two or more
random variables. Instead of events being labelled A and B, the convention is to use
random variables X and Y, as given below.
f(x,y) = P(X = x, Y = y)
The main purpose of this is to study the relationship between the two variables.
3. What is a linear Gaussian model?
A linear-Gaussian model is a Bayes net where all the variables are Gaussian, and each
variable's mean is linear in the values of its parents. They are widely used because they
support efficient inference. Linear dynamical systems are an important special case.
4. What is d-separation used for?
d-separation is a criterion for deciding, from a given causal graph, whether a set X of
variables is independent of another set Y, given a third set Z. The idea is to associate
"dependence" with "connectedness" (i.e., the existence of a connecting path) and
"independence" with "unconnected-ness" or "separation".
5. Discuss factorization properties.
In a Bayesian network, the joint distribution factorizes into a product of local conditional
distributions, one for each node given its parents in the graph:
P(x1, ..., xn) = P(x1 | pa(x1)) * ... * P(xn | pa(xn)), where pa(xi) denotes the parents of xi.
In an undirected graphical model (Markov random field), the joint distribution factorizes
instead into a product of potential functions over the cliques of the graph, divided by a
normalizing constant.
6. What is conditional independence property?
Conditional independence is the concept of independence, P(A ∩ B) = P(A) * P(B), applied
after conditioning on a third event. A and B are conditionally independent given C if
P(A ∩ B | C) = P(A | C) * P(B | C). An equivalent form, valid when P(B ∩ C) > 0, is
P(A | B ∩ C) = P(A | C).
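The two forms, one written with P(A ∩ B | C) and one with P(A | B ∩ C), can be checked numerically on a small joint distribution. The sketch below builds a hypothetical distribution over three binary variables in which A and B each depend only on C. All probabilities are made up for the example.

```python
import math
from itertools import product

# Hypothetical parameters: P(C = c), P(A = 1 | C = c), P(B = 1 | C = c).
p_c = {0: 0.4, 1: 0.6}
p_a_given_c = {0: 0.2, 1: 0.7}
p_b_given_c = {0: 0.5, 1: 0.1}

def bernoulli(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1 - p_one

# Joint table P(A = a, B = b, C = c), built so that A and B depend only on C.
joint = {
    (a, b, c): p_c[c] * bernoulli(p_a_given_c[c], a) * bernoulli(p_b_given_c[c], b)
    for a, b, c in product((0, 1), repeat=3)
}

def prob(a=None, b=None, c=None):
    """Marginal probability; arguments left as None are summed out."""
    return sum(
        p for (ai, bi, ci), p in joint.items()
        if (a is None or ai == a) and (b is None or bi == b) and (c is None or ci == c)
    )

def is_conditionally_independent():
    """Check both forms of 'A independent of B given C' for every value combination."""
    for a, b, c in product((0, 1), repeat=3):
        lhs = prob(a, b, c) / prob(c=c)                                    # P(A, B | C)
        rhs = (prob(a=a, c=c) / prob(c=c)) * (prob(b=b, c=c) / prob(c=c))  # P(A | C) P(B | C)
        alt = prob(a, b, c) / prob(b=b, c=c)                               # P(A | B, C)
        if not (math.isclose(lhs, rhs) and math.isclose(alt, prob(a=a, c=c) / prob(c=c))):
            return False
    return True

result = is_conditionally_independent()
```

In this example A and B are conditionally independent given C but not marginally independent, since P(A = 1, B = 1) differs from P(A = 1) * P(B = 1).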
7. Define learning.
Learning is “a process that leads to change, which occurs as a result of experience and
increases the potential for improved performance and future learning.”
8. What is plate notation?
Plate notation is a method of representing variables that repeat in a graphical model.
Instead of drawing each repeated variable individually, a plate or rectangle is used to
group variables into a subgraph that repeat together, and a number is drawn on the plate to
represent the number of repetitions of the subgraph in the plate. The assumptions are
that the subgraph is duplicated that many times, the variables in the subgraph are indexed
by the repetition number, and any links that cross a plate boundary are replicated once for
each subgraph repetition.
9. What is meant by Naïve Bayes classifier?
The Naïve Bayes classifier is a supervised machine learning algorithm used for
classification tasks such as text classification. It applies Bayes' theorem under the
"naive" assumption that the features are conditionally independent given the class. It
belongs to the family of generative learning algorithms, meaning that it models the
distribution of inputs for each class or category.
10. Why is Markov chain used?
Predicting traffic flows, communications networks, genetic issues, and queues are
examples where Markov chains can be used to model performance. Devising a physical
model for these chaotic systems would be impossibly complicated but doing so using
Markov chains is quite simple.
11. Define Markov decision process.
A Markov decision process (MDP) is defined as a stochastic decision-making process that
uses a mathematical framework to model the decision-making of a dynamic system in
scenarios where the results are either random or controlled by a decision maker, which
makes sequential decisions over time.
12. What is a Markov model?
A Markov model is a stochastic model of a randomly changing system that possesses the
Markov property. This means that, at any given time, the next state depends only on
the current state and, given the current state, is independent of anything in the past.
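Because the next state depends only on the current one, a Markov chain is fully described by a transition matrix, and the distribution over states is advanced one step by multiplying by that matrix. The two-state matrix below is hypothetical and the function name is made up for the example.

```python
def step(distribution, transition):
    """Advance a distribution over states by one step: p_next[j] = sum_i p[i] * T[i][j]."""
    n = len(transition)
    return [
        sum(distribution[i] * transition[i][j] for i in range(n))
        for j in range(n)
    ]

# Hypothetical transition matrix: row i holds P(next state = j | current state = i).
transition = [
    [0.9, 0.1],
    [0.5, 0.5],
]

distribution = [1.0, 0.0]  # start in state 0 with certainty
for _ in range(100):
    distribution = step(distribution, transition)
```

After many steps the distribution for this matrix approaches (5/6, 1/6), its stationary distribution, regardless of the starting state.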
13. Define hidden Markov models.
Hidden Markov models (HMMs) represent systems whose states are not directly observable. In
addition to states and transition probabilities, hidden Markov models also represent
observations and observation likelihoods for each state. Hidden Markov models are used for a
range of applications, including thermodynamics, finance and pattern recognition.

16-Marks

1. Consider the following Bayesian network. A, B, C, and D are Boolean random variables. If
we know that A is true, what is the probability of D being true? Explain Bayesian
networks in detail.

2. Write short notes on conditional independence.

3. Explain Markov random fields in detail.

4. Apply the naive Bayes classifier to the following problem.

No.   Outlook    Play     No.   Outlook    Play
0     Rainy      Yes      7     Overcast   Yes
1     Sunny      Yes      8     Rainy      No
2     Overcast   Yes      9     Sunny      No
3     Overcast   Yes      10    Sunny      Yes
4     Sunny      No       11    Rainy      No
5     Rainy      Yes      12    Overcast   Yes
6     Sunny      Yes      13    Overcast   Yes
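The calculation can be sketched directly from the table. The question does not say which outlook to classify, so querying "Sunny" is an assumption made for illustration. With a single feature, naive Bayes reduces to Bayes' rule: posterior is proportional to prior times likelihood. Exact fractions are used so the hand calculation can be compared step by step.

```python
from collections import Counter
from fractions import Fraction

# (Outlook, Play) rows 0 to 13 from the table.
data = [
    ("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"), ("Overcast", "Yes"),
    ("Sunny", "No"), ("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"),
    ("Rainy", "No"), ("Sunny", "No"), ("Sunny", "Yes"), ("Rainy", "No"),
    ("Overcast", "Yes"), ("Overcast", "Yes"),
]

def posterior(outlook):
    """Return P(Play = label | Outlook = outlook) for each label, via Bayes' rule."""
    class_counts = Counter(play for _, play in data)
    joint_counts = Counter(data)
    scores = {}
    for label, n_label in class_counts.items():
        prior = Fraction(n_label, len(data))                            # P(label)
        likelihood = Fraction(joint_counts[(outlook, label)], n_label)  # P(outlook | label)
        scores[label] = prior * likelihood
    total = sum(scores.values())  # equals P(outlook)
    return {label: score / total for label, score in scores.items()}

sunny = posterior("Sunny")  # assumed query; not specified in the question
```

For "Sunny" this gives P(Yes | Sunny) = 3/5 and P(No | Sunny) = 2/5, so the predicted class is Yes.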

5. Write a short note on learning.


6. Briefly discuss the Markov model.

7. Explain the hidden Markov model with the forward algorithm in detail.
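The forward algorithm computes the probability of an observation sequence by propagating, for each hidden state, the probability of the observations so far ending in that state. The sketch below uses a hypothetical two-state, two-symbol model. The initial, transition, and emission values are made up for the example.

```python
def forward(observations, initial, transition, emission):
    """Return P(observations) under an HMM using the forward recursion.

    initial[i]       : P(first state = i)
    transition[i][j] : P(next state = j | current state = i)
    emission[i][o]   : P(observation = o | state = i)
    """
    n_states = len(initial)
    # Initialization: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [initial[i] * emission[i][observations[0]] for i in range(n_states)]
    # Recursion: alpha_t(j) = b_j(o_t) * sum_i alpha_{t-1}(i) * a_ij
    for obs in observations[1:]:
        alpha = [
            emission[j][obs] * sum(alpha[i] * transition[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
    # Termination: sum over the final states.
    return sum(alpha)

# Hypothetical model parameters.
initial = [0.6, 0.4]
transition = [[0.7, 0.3], [0.4, 0.6]]
emission = [[0.9, 0.1], [0.2, 0.8]]

likelihood = forward([0, 1, 0], initial, transition, emission)
```

The recursion costs time proportional to the sequence length times the square of the number of states, instead of enumerating every possible state path.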
