QB Unit 2

The document is a question bank for a Machine Learning course, specifically focusing on supervised learning topics for the academic year 2024-2025. It includes questions and answers on various machine learning concepts such as linear regression, classification algorithms, and support vector machines. Additionally, it covers advanced topics like Bayesian regression, decision trees, and ensemble methods like random forests.


DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

(ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING)


Academic Year: 2024-2025 (Even Semester)

AL3451 MACHINE LEARNING QUESTION BANK

II YEAR / IV SEM

UNIT II – SUPERVISED LEARNING

Prepared by

S. BASKARI, M.Tech., MBA, (Ph.D.)

ASSISTANT PROFESSOR
PART A

1. What is the primary goal of linear regression?


The primary goal of linear regression is to model the relationship between a dependent variable and
one or more independent variables by fitting a linear equation to observed data.

2.What is the least squares method in linear regression?

The least squares method minimizes the sum of the squares of the residuals (the differences between
observed and predicted values) to find the best-fitting linear model.
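The closed-form least-squares fit for a single input variable can be sketched as follows (a minimal illustration in plain Python; the function name and sample data are invented for this example):

```python
# Least-squares fit of y = slope * x + intercept.
# slope = sum((x - x_mean)(y - y_mean)) / sum((x - x_mean)^2)
def least_squares_fit(xs, ys):
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    slope = num / den
    intercept = y_mean - slope * x_mean
    return slope, intercept

# Data lying exactly on y = 2x + 1, so the residuals are all zero
slope, intercept = least_squares_fit([1, 2, 3, 4], [3, 5, 7, 9])
```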

3. What distinguishes multiple linear regression from simple linear regression?

Multiple linear regression involves more than one independent variable, whereas simple linear regression involves
only one independent variable.

4. How does Bayesian linear regression differ from ordinary linear regression?

Bayesian linear regression incorporates prior distributions over the model parameters and updates these priors with
data to obtain posterior distributions, providing a probabilistic interpretation of the regression model.

5. What role does gradient descent play in linear regression?

Gradient descent is an optimization algorithm used to minimize the cost function by iteratively adjusting the
model parameters in the direction of the steepest decrease in the cost function.
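The update rule above can be sketched for one-variable linear regression with a mean-squared-error cost (an illustrative sketch; the learning rate and iteration count are arbitrary choices, not prescribed values):

```python
# Gradient descent on the MSE cost for the model y = w * x + b.
def gradient_descent(xs, ys, lr=0.05, steps=5000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        preds = [w * x + b for x in xs]
        # Partial derivatives of the MSE with respect to w and b
        dw = (2 / n) * sum((p - y) * x for p, y, x in zip(preds, ys, xs))
        db = (2 / n) * sum(p - y for p, y in zip(preds, ys))
        # Step in the direction of steepest decrease
        w -= lr * dw
        b -= lr * db
    return w, b

# Data on y = 2x + 1; the iterates converge toward w = 2, b = 1
w, b = gradient_descent([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])
```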

6. What is the discriminant function in the context of classification?

The discriminant function is a function used in classification to assign input data points to different classes by
determining decision boundaries.

7. Describe the Perceptron algorithm.

The Perceptron algorithm is a simple linear classifier that updates the weights of the model iteratively based on misclassified examples until it finds a separating hyperplane or reaches a stopping criterion.
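The update rule can be sketched for 2-D inputs with labels +1 / -1 (a minimal illustration; the four training points below are made up and happen to be linearly separable):

```python
# Perceptron: update weights only on misclassified examples.
def train_perceptron(points, labels, epochs=20, lr=1.0):
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(points, labels):
            activation = w[0] * x1 + w[1] * x2 + b
            # Misclassified when the activation disagrees in sign with y
            if y * activation <= 0:
                w[0] += lr * y * x1
                w[1] += lr * y * x2
                b += lr * y
    return w, b

points = [(2, 2), (3, 3), (-2, -1), (-3, -2)]   # linearly separable
labels = [1, 1, -1, -1]
w, b = train_perceptron(points, labels)
correct = all((w[0] * x1 + w[1] * x2 + b) * y > 0
              for (x1, x2), y in zip(points, labels))
```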

8. What is the key characteristic of logistic regression?

Logistic regression models the probability of a binary outcome using a logistic function, providing a probabilistic
interpretation of classification.
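The logistic function that produces these probabilities can be sketched directly (a minimal illustration; the weight and bias values below are arbitrary, not fitted):

```python
import math

# The logistic (sigmoid) function maps any real score to (0, 1);
# logistic regression uses it to turn a linear score w*x + b into P(y=1 | x).
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, w, b):
    return sigmoid(w * x + b)

p_mid = sigmoid(0.0)                         # a score of 0 gives probability 0.5
p_high = predict_proba(5.0, w=2.0, b=-1.0)   # a large positive score gives a probability near 1
```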

9. Explain the Naive Bayes classifier.

The Naive Bayes classifier is a probabilistic generative model that assumes independence between features given
the class label and uses Bayes' theorem to predict the probability of each class.
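A tiny hand-rolled version over word features makes the independence assumption concrete (an illustrative sketch; the "spam"/"ham" corpus is invented, and Laplace smoothing is added to avoid zero probabilities):

```python
import math
from collections import Counter

def train_nb(docs, labels):
    class_counts = Counter(labels)
    word_counts = {c: Counter() for c in class_counts}
    for words, c in zip(docs, labels):
        word_counts[c].update(words)
    return class_counts, word_counts

def predict_nb(words, class_counts, word_counts, vocab_size):
    total = sum(class_counts.values())
    best, best_score = None, float("-inf")
    for c, n_c in class_counts.items():
        # log P(c) + sum of log P(word | c), treating words as independent
        score = math.log(n_c / total)
        denom = sum(word_counts[c].values()) + vocab_size
        for w in words:
            score += math.log((word_counts[c][w] + 1) / denom)
        if score > best_score:
            best, best_score = c, score
    return best

docs = [["win", "money"], ["win", "prize"], ["meeting", "today"], ["project", "today"]]
labels = ["spam", "spam", "ham", "ham"]
class_counts, word_counts = train_nb(docs, labels)
pred_spam = predict_nb(["win"], class_counts, word_counts, vocab_size=6)
pred_ham = predict_nb(["today"], class_counts, word_counts, vocab_size=6)
```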

10. What is the objective of a Support Vector Machine (SVM)?

The objective of an SVM is to find the hyperplane that maximizes the margin between different classes, thereby
achieving the best possible separation.

11. How does a decision tree classify data?

A decision tree classifies data by recursively splitting the data into subsets based on the value of input features,
leading to a tree structure where each leaf node represents a class label.

12. What is a random forest?

A random forest is an ensemble learning method that constructs multiple decision trees during training and outputs
the mode of their predictions for classification tasks or the average for regression tasks.
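The "mode of their predictions" step can be sketched as a majority vote (an illustrative sketch: the "trees" here are stand-in functions; a real random forest would train each tree on a bootstrap sample with random feature subsets):

```python
from collections import Counter

# Majority vote over the predictions of several trees.
def forest_predict(trees, x):
    votes = [tree(x) for tree in trees]
    return Counter(votes).most_common(1)[0][0]

trees = [lambda x: "A", lambda x: "B", lambda x: "A"]
prediction = forest_predict(trees, x=None)   # two of three trees vote "A"
```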

13. What are the assumptions of the linear regression model?

The assumptions of the linear regression model include linearity, independence, homoscedasticity (constant variance of errors), normality of errors, and no multicollinearity among the independent variables.

14. How is the goodness-of-fit of a linear regression model measured?

The goodness-of-fit of a linear regression model is commonly measured using the coefficient of determination (R²), which indicates the proportion of the variance in the dependent variable that is predictable from the independent variables.
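R² = 1 - SS_res / SS_tot, which can be computed directly (a minimal sketch with invented data):

```python
# Coefficient of determination: 1 - (residual sum of squares / total sum of squares).
def r_squared(y_true, y_pred):
    y_mean = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - y_mean) ** 2 for yt in y_true)
    return 1 - ss_res / ss_tot

perfect = r_squared([1, 2, 3], [1, 2, 3])    # 1.0: all variance explained
baseline = r_squared([1, 2, 3], [2, 2, 2])   # 0.0: no better than predicting the mean
```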

15. Compare the Perceptron algorithm with logistic regression.

The Perceptron algorithm is a simple linear classifier that updates weights based on misclassified examples
without providing probability estimates, while logistic regression models the probability of class membership using
the logistic function and is suitable for binary classification with a probabilistic interpretation.

16. What is the significance of the kernel trick in Support Vector Machines (SVM)?

The kernel trick allows SVMs to efficiently perform nonlinear classification by implicitly mapping input features
into a higher-dimensional space without explicitly computing the coordinates in that space, enabling the separation
of nonlinearly separable data.

17. What is the Gini index, and how is it used in decision trees?

The Gini index is a measure of impurity used to evaluate splits in decision trees. It quantifies how often a randomly
chosen element would be incorrectly classified if it were randomly labelled according to the distribution of labels
in the subset.
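The impurity measure is Gini = 1 - Σ pₖ², where pₖ is the proportion of class k in the subset (a minimal sketch with invented labels):

```python
from collections import Counter

# Gini impurity of a list of class labels: 1 - sum(p_k^2).
def gini(labels):
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

pure = gini(["yes", "yes", "yes"])         # 0.0: a pure node
mixed = gini(["yes", "no", "yes", "no"])   # 0.5: maximally impure for two classes
```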

18. How does the inclusion of prior distributions influence the results in Bayesian linear regression?

The inclusion of prior distributions in Bayesian linear regression allows incorporating prior knowledge about the
model parameters, leading to posterior distributions that combine the prior information with the observed data,
providing a more comprehensive uncertainty estimation.

19. What is the main objective of the Maximum Margin Classifier?

The main objective of the Maximum Margin Classifier is to find a hyperplane that separates the data into classes
while maximizing the margin, which is the distance between the hyperplane and the nearest data points from either
class.

20. What are support vectors in the context of SVM?

Support vectors are the data points that are closest to the separating hyperplane in an SVM. They are critical in
defining the position and orientation of the hyperplane and directly influence the margin.

21. Define the term 'margin' in the context of SVM.


In SVM, the margin is defined as the distance between the hyperplane and the nearest points from each class. The
goal of SVM is to maximize this margin, which helps in achieving better generalization on the test data.
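The geometric quantity involved is the point-to-hyperplane distance |w·x + b| / ||w||; the margin is twice the distance of the closest point (a geometric sketch with an invented hyperplane and point):

```python
import math

# Distance from a point x to the hyperplane w . x + b = 0.
def distance_to_hyperplane(w, b, x):
    dot = sum(wi * xi for wi, xi in zip(w, x))
    norm = math.sqrt(sum(wi ** 2 for wi in w))
    return abs(dot + b) / norm

# Hyperplane x1 + x2 - 2 = 0 and the point (2, 2): distance = 2 / sqrt(2)
d = distance_to_hyperplane(w=[1, 1], b=-2, x=[2, 2])
```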

22. How does a linear SVM differ from a non-linear SVM?

A linear SVM uses a linear kernel to find a straight-line hyperplane for classification, suitable for linearly
separable data. A non-linear SVM uses kernel functions (e.g., polynomial, RBF) to transform the data into a
higher-dimensional space where a linear hyperplane can be used to separate non-linearly separable data.

23. What is the significance of the regularization parameter (C) in SVM?

The regularization parameter (C) in SVM controls the trade-off between maximizing the margin and minimizing
the classification error. A smaller C allows for a larger margin but potentially more misclassifications, while a
larger C aims for fewer misclassifications but may result in a smaller margin.

24. What is the role of the kernel function in SVM?

The kernel function in SVM enables the transformation of data into a higher-dimensional space where a linear hyperplane can separate the classes, allowing SVM to handle non-linearly separable data. Common kernels include linear, polynomial, and radial basis function (RBF).

25. Explain the concept of 'soft margin' in SVM.

The 'soft margin' concept in SVM allows for some misclassifications in the training data by introducing slack
variables. This approach provides a balance between achieving a large margin and allowing some classification
errors to improve generalization on noisy data.

26. What is the dual formulation in the context of SVM?

The dual formulation of SVM involves expressing the optimization problem in terms of Lagrange multipliers,
which simplifies the computation when using kernel functions and allows handling higher-dimensional spaces
without explicit transformations.

27. How does the SVM handle multi-class classification problems?

SVM handles multi-class classification problems using techniques like One-vs One (OvO) and One-vs-All (OvA).
OvO trains a classifier for every pair of classes, while OvA trains a classifier for each class against all other
classes.
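The number of binary classifiers each scheme requires can be counted directly: OvO needs k(k-1)/2 pairwise classifiers, OvA needs k (an illustrative sketch; the class names are made up):

```python
from itertools import combinations

# One-vs-One trains a classifier for every pair of classes.
def ovo_pairs(classes):
    return list(combinations(classes, 2))

# One-vs-All trains one classifier per class against the rest.
def ova_count(classes):
    return len(classes)

classes = ["cat", "dog", "bird", "fish"]
n_ovo = len(ovo_pairs(classes))   # 4 * 3 / 2 = 6 pairwise classifiers
n_ova = ova_count(classes)        # 4 one-vs-rest classifiers
```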

28. What are the advantages of using SVM for classification tasks?

Advantages of using SVM for classification tasks include its effectiveness in high-dimensional spaces, robustness to overfitting (especially with proper regularization), and its ability to handle non-linear data using appropriate kernel functions.
PART B&C

Linear Regression Models

1.Explain the least squares method for linear regression and its application in machine learning

2.Differentiate between single-variable and multiple-variable linear regression. Provide examples for both.

3. Discuss Bayesian linear regression and its advantages over standard linear regression.

4. Explain the gradient descent algorithm and its application in optimizing linear regression models.

5. Derive the equation for the cost function in linear regression and explain how gradient descent minimizes it.

6. Compare and contrast Bayesian linear regression and least squares regression.

Linear Classification Models

7. Describe the discriminant function and its role in linear classification.

8. Explain the Perceptron algorithm with an example. Discuss its convergence properties.

9. Illustrate the logistic regression model and derive its cost function.

10. Explain the Naive Bayes classifier and its assumptions. Discuss its application in text classification.

11. Compare probabilistic discriminative models (logistic regression) with probabilistic generative models (Naive
Bayes).

12. Discuss the maximum margin classifier and its implementation using support vector machines (SVM).

13. Explain the dual formulation of the support vector machine and the role of kernels.

Decision Tree and Random Forest

14. Illustrate the process of constructing a decision tree using the CART algorithm.

15. Explain how information gain and Gini index are used in decision tree splitting criteria.

16. Discuss the advantages and limitations of decision trees. How are these addressed in random forests?

17. Explain the concept of ensemble learning and how random forests improve upon single decision trees.

18. Compare and contrast decision trees with random forests in terms of accuracy, overfitting, and interpretability.

General Topics in Supervised Learning

19. Provide a detailed comparison of gradient descent and least squares methods for optimizing regression models.

20. Explain the concept of overfitting in supervised learning and discuss methods to prevent it, using examples
from decision trees and random forests.
