QB Unit 1
Prepared by
S. BASKARI, M.Tech., MBA, (Ph.D.)
ASSISTANT PROFESSOR
PART A
A singular matrix is one that does not have an inverse. This occurs when its determinant is
zero. In machine learning, singular matrices can arise in datasets with linearly dependent features,
which complicates computations like solving linear equations.
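A minimal NumPy sketch of this (assuming NumPy is available; the matrix is a made-up example with linearly dependent columns):

import numpy as np

# The second column is twice the first, so the columns are linearly
# dependent and the matrix is singular.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])
print(np.linalg.det(A))  # 0.0 (up to floating-point rounding)
try:
    np.linalg.inv(A)
except np.linalg.LinAlgError as err:
    print("No inverse:", err)  # NumPy reports "Singular matrix"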
Orthogonal vectors have a dot product of zero, meaning they are perpendicular. In machine
learning, orthogonal vectors are important for ensuring that features are independent, which can
simplify computations. They are also used in algorithms like Gram-Schmidt for creating
orthonormal bases.
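As an illustrative sketch (assuming NumPy; the input vectors are arbitrary examples), the Gram-Schmidt process mentioned above can be written as:

import numpy as np

def gram_schmidt(vectors):
    # Turn linearly independent vectors into an orthonormal basis by
    # subtracting, from each vector, its projections onto the basis so far.
    basis = []
    for v in vectors:
        for b in basis:
            v = v - np.dot(v, b) * b
        basis.append(v / np.linalg.norm(v))
    return basis

u, w = gram_schmidt([np.array([3.0, 1.0]), np.array([2.0, 2.0])])
print(np.dot(u, w))  # ~0.0: the resulting vectors are orthogonal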
Machine learning is a branch of artificial intelligence where systems learn from data to
make predictions or decisions without being explicitly programmed. It involves using algorithms to
find patterns and relationships in data. ML is widely used in tasks like image recognition, speech
processing, and recommendation systems.
Supervised learning uses labeled data to train models, where the output is known (e.g.,
regression, classification). Unsupervised learning works with unlabeled data, finding patterns or
groups (e.g., clustering). Both approaches aim to generalize from data but target different types of
problems.
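A minimal sketch of the contrast, assuming scikit-learn and NumPy are installed (the two-cluster data is synthetic and purely illustrative):

import numpy as np
from sklearn.linear_model import LogisticRegression  # supervised: needs labels
from sklearn.cluster import KMeans                   # unsupervised: only inputs

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)   # known outputs, used only when supervised

clf = LogisticRegression().fit(X, y)          # learns the input-to-label mapping
km = KMeans(n_clusters=2, n_init=10).fit(X)   # discovers groups from X alone

print(clf.predict([[4.0, 4.0]]))  # predicted class label
print(km.labels_[:5])             # cluster ids (arbitrary, not class labels)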
8. What is reinforcement learning, and how does it differ from supervised learning?
Reinforcement learning trains agents through rewards or penalties based on actions taken in
an environment. Unlike supervised learning, it doesn’t rely on labeled data but on feedback from its
actions. It is widely used in games and robotics.
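A toy illustration of learning from reward feedback rather than labels: a minimal tabular Q-learning sketch on a made-up 5-state corridor, assuming NumPy (the environment and all constants are illustrative assumptions):

import numpy as np

n_states, n_actions = 5, 2            # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))   # value estimates, learned from rewards
alpha, gamma, epsilon = 0.1, 0.9, 0.2
rng = np.random.default_rng(0)

for episode in range(500):
    s = int(rng.integers(4))          # random non-terminal start state
    while s != 4:                     # state 4 is the goal
        # epsilon-greedy: mostly exploit current estimates, sometimes explore
        a = int(rng.integers(n_actions)) if rng.random() < epsilon else int(np.argmax(Q[s]))
        s_next = max(0, s - 1) if a == 0 else min(4, s + 1)
        r = 1.0 if s_next == 4 else 0.0   # reward only at the goal
        Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
        s = s_next

print(np.argmax(Q, axis=1))  # [1 1 1 1 0]: move right in every non-terminal state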
Overfitting occurs when a model performs well on training data but fails to generalize to
unseen data. It happens when the model is too complex or learns noise in the data. Techniques like
regularization, pruning, and cross-validation help prevent overfitting.
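A small NumPy sketch of the effect: fitting polynomials of increasing degree to 10 noisy samples of a sine curve (the data and degrees are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(1)
x_train = np.linspace(0, 1, 10)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.2, 10)  # noisy labels
x_test = np.linspace(0, 1, 100)
y_test = np.sin(2 * np.pi * x_test)

for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)  # degree 9 may warn: ill-conditioned
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(degree, round(train_err, 4), round(test_err, 4))

# Degree 9 drives the training error to ~0 but the test error blows up:
# the model has memorized the noise, i.e. it overfits.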
Machine learning models analyze transactional data to identify patterns associated with
fraudulent activities. They detect anomalies and alert users in real time. Techniques like supervised
learning and anomaly detection are commonly used. This helps prevent financial losses and
enhances security.
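As an illustrative sketch only (the two transaction features are made-up assumptions), scikit-learn's IsolationForest shows the anomaly-detection side:

import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Hypothetical transaction features: [amount, hour of day]
normal = np.column_stack([rng.normal(50, 10, 500), rng.normal(14, 3, 500)])
fraud = np.array([[900.0, 3.0], [1200.0, 4.0]])  # large, late-night outliers
X = np.vstack([normal, fraud])

# Unsupervised: isolate points that look unlike the bulk of the data.
model = IsolationForest(contamination=0.01, random_state=0).fit(X)
flags = model.predict(X)           # +1 = looks normal, -1 = anomaly
print(np.where(flags == -1)[0])    # indices flagged for review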
ML assists in disease diagnosis, drug discovery, and patient monitoring. For example, models
analyze medical images to detect cancer. Predictive analytics helps forecast patient outcomes. This
improves treatment accuracy and reduces diagnostic errors.
ML powers NLP tasks like sentiment analysis, translation, and speech recognition. Models
learn from text data to understand context and meaning. Applications include virtual assistants,
chatbots, and automatic transcription. NLP bridges the gap between human language and
computers.
15. How is ML applied in autonomous vehicles?
ML enables self-driving cars to perceive and act on their surroundings. Models trained on camera, lidar, and radar data detect lanes, vehicles, and pedestrians, predict the behavior of other road users, and support real-time path planning and control decisions.
Shattering occurs when a hypothesis class can realize every possible labeling
of a set of points. For example, a linear classifier can shatter 3 points in general position in 2D
space, but no set of 4 points. It demonstrates the expressiveness of the hypothesis class.
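A brute-force check of this claim, assuming NumPy and SciPy are available: for each possible labeling, a linear-programming feasibility problem decides whether a separating line exists.

import itertools
import numpy as np
from scipy.optimize import linprog

def separable(points, labels):
    # Feasible iff some (w, b) satisfies y_i * (w . x_i + b) >= 1 for all i,
    # i.e. the labeling is realizable by a linear classifier.
    y = np.array(labels, dtype=float)
    A_ub = -y[:, None] * np.hstack([points, np.ones((len(points), 1))])
    b_ub = -np.ones(len(points))
    res = linprog(np.zeros(3), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * 3, method="highs")
    return res.status == 0  # status 0: feasible, status 2: infeasible

three = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])             # general position
four = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])  # contains XOR

for pts in (three, four):
    ok = all(separable(pts, labs)
             for labs in itertools.product([-1, 1], repeat=len(pts)))
    print(len(pts), "points shattered:", ok)  # True for 3, False for 4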
A higher VC dimension indicates a model with greater capacity to fit data, which can
increase the risk of overfitting. Models with a low VC dimension might underfit. Balancing VC
dimension is essential for good generalization.
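One commonly cited form of the VC generalization bound makes this explicit. With probability at least 1 − δ over a training sample of size m, every hypothesis h from a class of VC dimension d satisfies

R(h) \le \hat{R}(h) + \sqrt{\frac{d\,(\ln(2m/d) + 1) + \ln(4/\delta)}{m}}

where R(h) is the true risk and \hat{R}(h) the training error; a larger d loosens the bound unless m grows accordingly.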
A low VC dimension indicates limited model capacity and less risk of overfitting.
However, it may also lead to underfitting if the model cannot capture the data complexity. It
emphasizes simplicity in the hypothesis space.
Probably Approximately Correct (PAC) learning relies on two parameters: error (ϵ) and
confidence (δ). The error ϵ defines the acceptable deviation of the learned hypothesis from the
true function, and δ bounds the probability that the algorithm fails to produce such a hypothesis,
so the result holds with probability at least 1 − δ. These parameters guide learning algorithms.
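For a finite hypothesis class H and a learner that outputs a hypothesis consistent with the training data, the two parameters combine in the standard sample-complexity bound

m \ge \frac{1}{\epsilon}\left(\ln|H| + \ln\frac{1}{\delta}\right)

training on at least m examples guarantees, with probability at least 1 − δ, that the returned hypothesis has true error at most ϵ.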
23. Explain the term “probably” in PAC learning.
"Probably" refers to the confidence requirement: the learner must output a good hypothesis
with probability at least 1 − δ over the random draw of the training sample. It acknowledges that
learning from finite random data can occasionally fail.
"Approximately correct" refers to a hypothesis with an error less than a predefined threshold
(ϵ). It ensures that the model's predictions are close to the true labels. This balances accuracy with
feasibility in learning.
The hypothesis space is the set of all possible functions a learning algorithm can
consider. It defines the model's flexibility and capacity. A larger hypothesis space provides more
options but increases the risk of overfitting. Selecting the right hypothesis space is crucial.
High bias leads to underfitting, as the model is too simplistic. High variance
causes overfitting, as the model captures noise in the data. The bias-variance trade-off seeks a
balance for optimal generalization.
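For squared loss this trade-off has an exact decomposition. Writing y = f(x) + ε with noise variance σ² and \hat{f} for the learned model, the expected error at a point x splits as

\mathbb{E}\big[(y - \hat{f}(x))^2\big] = \big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2 + \mathrm{Var}\big[\hat{f}(x)\big] + \sigma^2

the first term is the squared bias, the second the variance, and σ² is irreducible noise; shrinking one of the first two terms typically inflates the other.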
A too-large hypothesis space increases the risk of overfitting, while a too-small space
may lead to underfitting. The hypothesis space must match the complexity of the problem to ensure
good generalization and performance.
PART B&C
2. Describe the Vapnik-Chervonenkis (VC) dimension and its significance in machine learning.
3. Illustrate the concept of Probably Approximately Correct (PAC) learning with examples.
4. Discuss the importance of hypothesis spaces in machine learning and how they impact learning
performance.
5. What is inductive bias in machine learning? Explain its role with suitable examples.
6. Analyze the bias-variance trade-off and its effect on model performance in machine learning.
7. Provide a detailed review of linear algebra concepts essential for machine learning, such as vector
spaces, matrices, eigenvalues, and eigenvectors.
8. With examples, explain the Vapnik-Chervonenkis (VC) dimension and its role in determining the
capacity of a learning model.
9. What is PAC learning? Explain the conditions under which a hypothesis is considered
PAC-learnable.
10. Explain the concept of generalization in machine learning and discuss its importance with
relevant examples.
11. Discuss the hypothesis space in detail and explain how the choice of hypothesis space impacts
the learning algorithm.
12. What is the bias-variance trade-off? Explain its mathematical formulation and impact on
machine learning models.