QB Unit 1

The document is a question bank for a Machine Learning course, prepared for the 2024-2025 academic year by S. Baskari. It covers fundamental concepts in machine learning, including linear algebra, supervised and unsupervised learning, reinforcement learning, and the importance of machine learning in various applications. Additionally, it discusses advanced topics like VC dimension, PAC learning, inductive bias, and the bias-variance trade-off.


DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

(ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING)


Academic Year: 2024-2025 (Even Semester)

AL3451 MACHINE LEARNING QUESTION BANK


II YEAR / IV SEM

UNIT I INTRODUCTION TO MACHINE LEARNING

Prepared by

S. BASKARI, M.Tech., MBA, (Ph.D.)

ASSISTANT PROFESSOR
PART A

Review of Linear Algebra for Machine Learning

1. What is a vector, and how is it used in machine learning?


A vector is an ordered list of numbers representing a point in space or a quantity with direction and magnitude. In machine learning, vectors are used to represent data points or features, and they underpin many algorithms, for example in computing distances or transformations. Vectors are fundamental to feature-space representation.
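As a minimal NumPy sketch (the specific values are illustrative, not from the question bank), two data points as feature vectors and the Euclidean distance between them, the building block of algorithms such as k-nearest neighbors:

```python
import numpy as np

# Two data points represented as feature vectors
x1 = np.array([1.0, 2.0, 3.0])
x2 = np.array([4.0, 6.0, 3.0])

# Euclidean distance between the two points
dist = np.linalg.norm(x1 - x2)
print(dist)  # 5.0
```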

2. What is a matrix, and why is it significant in machine learning?

A matrix is a rectangular array of numbers organized in rows and columns. It is used in machine learning to represent datasets, perform linear transformations, and store model parameters. Matrices are essential in operations like gradient computation and dimensionality reduction. Many algorithms, such as PCA, rely on matrix algebra.
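A small illustrative sketch (values chosen for the example): a dataset stored as a samples-by-features matrix, with one matrix multiplication applying a linear transformation to every sample at once:

```python
import numpy as np

# A tiny dataset: 3 samples (rows) x 2 features (columns)
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# A linear transformation (e.g. a reflection of the second feature)
W = np.array([[1.0, 0.0],
              [0.0, -1.0]])

# One matrix product transforms all samples simultaneously
Z = X @ W
print(Z[0])  # [ 1. -2.]
```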

3. What is an eigenvector, and what is its role in ML?

An eigenvector is a vector whose direction is unchanged by a linear transformation; it is only scaled by its eigenvalue. In machine learning, eigenvectors are used in PCA to identify directions of maximum variance in data. They are crucial for dimensionality reduction, helping to simplify complex datasets while preserving significant information.
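As a sketch with NumPy (the matrix here is a made-up example), an eigendecomposition and a check of the defining property A·v = λ·v:

```python
import numpy as np

# A small symmetric matrix, as would arise from a covariance computation
A = np.array([[2.0, 0.0],
              [0.0, 3.0]])

# Eigendecomposition: columns of eigvecs are the eigenvectors
eigvals, eigvecs = np.linalg.eig(A)

# Verify the defining property for the first eigenpair: A @ v == lambda * v
v = eigvecs[:, 0]
assert np.allclose(A @ v, eigvals[0] * v)
print(eigvals)  # [2. 3.]
```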

4. What does it mean for a matrix to be singular?

A singular matrix is one that does not have an inverse. This occurs when its determinant is
zero. In machine learning, singular matrices can arise in datasets with linearly dependent features,
which complicates computations like solving linear equations.
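A quick illustration (the matrix is a contrived example with one column twice the other) of how linearly dependent features produce a zero determinant and make inversion fail:

```python
import numpy as np

# Second column is twice the first -> linearly dependent features
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

print(np.linalg.det(A))  # determinant is zero, so A is singular

# Attempting to invert a singular matrix raises an error
try:
    np.linalg.inv(A)
except np.linalg.LinAlgError as err:
    print("inversion failed:", err)
```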

5. Define the concept of orthogonal vectors and their relevance in ML.

Orthogonal vectors have a dot product of zero, meaning they are perpendicular. In machine
learning, orthogonal vectors are important for ensuring that features are independent, which can
simplify computations. They are also used in algorithms like Gram-Schmidt for creating
orthonormal bases.
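A minimal sketch (example vectors chosen by hand) showing the zero dot product and the normalization step that Gram-Schmidt would perform to obtain an orthonormal basis:

```python
import numpy as np

u = np.array([1.0, 1.0])
v = np.array([1.0, -1.0])

# A dot product of zero means the vectors are orthogonal (perpendicular)
print(np.dot(u, v))  # 0.0

# Normalizing each vector to unit length yields an orthonormal basis
u_hat = u / np.linalg.norm(u)
v_hat = v / np.linalg.norm(v)
assert np.isclose(np.dot(u_hat, v_hat), 0.0)
assert np.isclose(np.linalg.norm(u_hat), 1.0)
```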

6. What is machine learning?

Machine learning is a branch of artificial intelligence where systems learn from data to
make predictions or decisions without being explicitly programmed. It involves using algorithms to
find patterns and relationships in data. ML is widely used in tasks like image recognition, speech
processing, and recommendation systems.

7. Differentiate between supervised and unsupervised learning.

Supervised learning uses labeled data to train models, where the output is known (e.g.,
regression, classification). Unsupervised learning works with unlabeled data, finding patterns or
groups (e.g., clustering). Both approaches aim to generalize from data but target different types of
problems.

8. What is reinforcement learning, and how does it differ from supervised learning?

Reinforcement learning trains agents through rewards or penalties based on actions taken in
an environment. Unlike supervised learning, it doesn’t rely on labeled data but on feedback from its
actions. It is widely used in games and robotics.

9. Why is machine learning important in real-world applications?

Machine learning automates decision-making by identifying patterns in data. It powers applications like fraud detection, personalized recommendations, and autonomous vehicles. Its ability to scale with data makes it a key technology for solving complex problems efficiently.

10. What is overfitting in machine learning?

Overfitting occurs when a model performs well on training data but fails to generalize to
unseen data. It happens when the model is too complex or learns noise in the data. Techniques like
regularization, pruning, and cross-validation help prevent overfitting.
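A small simulation sketch (the line y = 2x, noise level, and degrees are assumptions for illustration): fitting a degree-9 polynomial to 10 noisy points drives training error to nearly zero while test error against the true function grows, the signature of overfitting, whereas a degree-1 fit generalizes well:

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

# Noisy samples of a simple underlying line y = 2x
x_train = np.linspace(0.0, 1.0, 10)
y_train = 2.0 * x_train + rng.normal(0.0, 0.2, size=10)
x_test = np.linspace(0.0, 1.0, 50)
y_test = 2.0 * x_test  # noiseless ground truth

errors = {}
for degree in (1, 9):
    p = Polynomial.fit(x_train, y_train, degree)
    train_err = np.mean((p(x_train) - y_train) ** 2)
    test_err = np.mean((p(x_test) - y_test) ** 2)
    errors[degree] = (train_err, test_err)
    print(degree, train_err, test_err)
```

The degree-9 model has enough capacity to interpolate the noise, so its training error is lower but its test error is higher than the simpler model's.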

11. Explain how machine learning is used in fraud detection.

Machine learning models analyze transactional data to identify patterns associated with
fraudulent activities. They detect anomalies and alert users in real time. Techniques like supervised
learning and anomaly detection are commonly used. This helps prevent financial losses and
enhances security.

12. What is the role of ML in personalized recommendations?

ML analyzes user preferences and behaviors to recommend products or content. Algorithms like collaborative filtering and content-based filtering personalize the experience. For example, streaming services use ML to suggest shows users are likely to enjoy. This increases user engagement and satisfaction.

13. How is ML used in healthcare?

ML assists in disease diagnosis, drug discovery, and patient monitoring. For example, models
analyze medical images to detect cancer. Predictive analytics helps forecast patient outcomes. This
improves treatment accuracy and reduces diagnostic errors.

14. What is the importance of ML in natural language processing?

ML powers NLP tasks like sentiment analysis, translation, and speech recognition. Models
learn from text data to understand context and meaning. Applications include virtual assistants,
chatbots, and automatic transcription. NLP bridges the gap between human language and
computers.
15. How is ML applied in autonomous vehicles?

ML enables vehicles to perceive their environment through sensors and cameras. It helps in object detection, path planning, and decision-making. Algorithms like deep learning process massive amounts of driving data. This ensures safe and efficient autonomous navigation.

Vapnik-Chervonenkis (VC) Dimension

16. What is the Vapnik-Chervonenkis (VC) dimension?

The VC dimension measures the capacity of a hypothesis class by determining the largest set of points it can shatter. Shattering means the hypothesis class can realize all possible labelings of that set. It helps assess a model's complexity and generalization ability.

17. Why is VC dimension important in ML?

VC dimension quantifies the trade-off between model complexity and generalization. A high VC dimension indicates greater capacity but may lead to overfitting. It is crucial for understanding the limits of learnability in a hypothesis space.

18. Explain the concept of shattering in VC dimension.

Shattering occurs when a hypothesis class can perfectly classify all possible labelings of a dataset. For example, a linear classifier can shatter 3 points in general position in 2D space, but no set of 4 points (the XOR labeling of 4 points is not linearly separable). It demonstrates the expressiveness of the hypothesis class.

19. How is VC dimension related to overfitting?

A higher VC dimension indicates a model with greater capacity to fit data, which can
increase the risk of overfitting. Models with a low VC dimension might underfit. Balancing VC
dimension is essential for good generalization.

20. What does a low VC dimension signify?

A low VC dimension indicates limited model capacity and less risk of overfitting. However, it may also lead to underfitting if the model cannot capture the data complexity. It emphasizes simplicity in the hypothesis space.

Probably Approximately Correct (PAC) Learning

21. What is PAC learning?

PAC learning is a framework where a model learns a hypothesis that is approximately correct with high probability. It uses parameters like error (ϵ) and confidence (δ) to define learnability. PAC theory provides a foundation for understanding model generalization.

22. What are the parameters of PAC learning?

PAC learning relies on two parameters: error (ϵ) and confidence (δ). The error ϵ defines the acceptable deviation of the learned hypothesis from the true function, while δ bounds the probability of failure, so the hypothesis is approximately correct with probability at least 1−δ. These parameters guide learning algorithms.
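For a finite hypothesis class, a standard result (sketched here under the assumption of a consistent learner) ties these parameters to the number of training samples m that suffice: m ≥ (1/ϵ)(ln|H| + ln(1/δ)). The particular numbers below are illustrative:

```python
import math

def pac_sample_bound(h_size, epsilon, delta):
    # m >= (1/epsilon) * (ln|H| + ln(1/delta)): with probability at least
    # 1 - delta, a consistent hypothesis from H has error below epsilon.
    return math.ceil((math.log(h_size) + math.log(1.0 / delta)) / epsilon)

# e.g. |H| = 1000 hypotheses, epsilon = 0.05, delta = 0.05
m = pac_sample_bound(1000, 0.05, 0.05)
print(m)  # 199
```

Note how the bound grows only logarithmically in |H| and 1/δ, but linearly in 1/ϵ.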
23. Explain the term “probably” in PAC learning.

"Probably" indicates the probability (1−δ) that the learned hypothesis is


approximately correct. It quantifies the likelihood of a model performing well on unseen data,
emphasizing probabilistic guarantees in learning.

24. What does “approximately correct” mean in PAC learning?

"Approximately correct" refers to a hypothesis with an error less than a predefined threshold
(ϵ). It ensures that the model's predictions are close to the true labels. This balances accuracy with
feasibility in learning.

25. Why is PAC learning significant in ML?

PAC learning provides a theoretical framework for understanding generalization. It helps analyze the feasibility of learning algorithms and the relationship between data size, hypothesis complexity, and performance. It ensures models are robust to unseen data.

26. What is a hypothesis space?

The hypothesis space is the set of all possible functions a learning algorithm can
consider. It defines the model's flexibility and capacity. A larger hypothesis space provides more
options but increases the risk of overfitting. Selecting the right hypothesis space is crucial.

27. What is inductive bias in ML?

Inductive bias refers to the assumptions a learning algorithm makes to generalize beyond the training data. For example, linear models assume linear relationships. Inductive bias guides the learning process and affects model performance on unseen data.

28. Why is inductive bias necessary?

Without inductive bias, a learning algorithm cannot generalize beyond training data. It reduces the hypothesis space to manageable solutions. Proper inductive bias ensures models capture relevant patterns while avoiding overfitting.

29. What is the trade-off between bias and variance?

High bias leads to underfitting, as the model is too simplistic. High variance
causes overfitting, as the model captures noise in the data. The bias-variance trade-off seeks a
balance for optimal generalization.
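The trade-off can be sketched empirically (a simulation under assumed conditions: true function sin(2πx), Gaussian noise, a fixed test input) by comparing a constant predictor (high bias, low variance) with a 1-nearest-neighbor predictor (low bias, high variance) across many resampled training sets:

```python
import numpy as np

rng = np.random.default_rng(1)
f = lambda x: np.sin(2.0 * np.pi * x)   # assumed true function
x0 = 0.3                                # fixed test input
n_sets = 500                            # number of resampled training sets

preds_mean, preds_nn = [], []
for _ in range(n_sets):
    x = rng.uniform(0.0, 1.0, 20)
    y = f(x) + rng.normal(0.0, 0.3, 20)
    preds_mean.append(y.mean())                    # constant model: high bias
    preds_nn.append(y[np.argmin(np.abs(x - x0))])  # 1-NN: high variance

for name, p in (("constant", np.array(preds_mean)), ("1-NN", np.array(preds_nn))):
    bias2 = (p.mean() - f(x0)) ** 2   # squared bias at x0
    var = p.var()                     # variance of predictions at x0
    print(name, bias2, var)
```

The constant model shows large squared bias and small variance; the 1-NN model the reverse, which is exactly the tension the trade-off describes.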

30. How does hypothesis space affect model generalization?

A too-large hypothesis space increases the risk of overfitting, while a too-small space
may lead to underfitting. The hypothesis space must match the complexity of the problem to ensure
good generalization and performance.
PART B&C

1. Explain the role of linear algebra in machine learning with examples.

2. Describe the Vapnik-Chervonenkis (VC) dimension and its significance in machine learning.

3. Illustrate the concept of Probably Approximately Correct (PAC) learning with examples.

4. Discuss the importance of hypothesis spaces in machine learning and how they impact learning
performance.

5. What is inductive bias in machine learning? Explain its role with suitable examples.

6. Analyze the bias-variance trade-off and its effect on model performance in machine learning.

7. Provide a detailed review of linear algebra concepts essential for machine learning, such as vector
spaces, matrices, eigenvalues, and eigenvectors.

8. With examples, explain the Vapnik-Chervonenkis (VC) dimension and its role in determining the
capacity of a learning model.

9. What is PAC learning? Explain the conditions under which a hypothesis is considered PAC-learnable.

10. Explain the concept of generalization in machine learning and discuss its importance with
relevant examples.

11. Discuss the hypothesis space in detail and explain how the choice of hypothesis space impacts
the learning algorithm.

12. What is the bias-variance trade-off? Explain its mathematical formulation and impact on
machine learning models.
