The document discusses various machine learning concepts including methods to avoid overfitting in decision trees, the role of hyperplanes in classification, and the steps involved in the gradient descent algorithm. It also outlines the goals and challenges of computational learning theory, types of Naive Bayes models, applications of the kNN algorithm, and the concept of elitism in genetic algorithms. Additionally, it explains the agent's actions in reinforcement learning and the significance of inductive learning algorithms.

MACHINE LEARNING

AAT-II
-Samyuktha Kanugula
21951A6248

1) Mention the two classes into which the different approaches to avoid overfitting in decision trees are grouped.
The two primary classes are:
1. Pre-Pruning (Early Stopping):
o This approach involves stopping the tree's growth early,
before it fully fits the training data. Conditions like
maximum tree depth, minimum samples required for a
node split, or minimum information gain threshold are
set. Pre-pruning ensures the model does not become
overly complex, reducing the risk of overfitting while
maintaining generalization.
2. Post-Pruning (Pruning After Full Growth):
o Here, the tree is allowed to grow to its maximum size,
capturing all details of the training data. Afterward, less
significant branches are removed. Methods like reduced
error pruning and cost-complexity pruning are employed
to evaluate and eliminate sections that add little
predictive value. This balances complexity and
performance by fine-tuning the structure.
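The pre-pruning conditions described above can be sketched as a single stopping test (a minimal illustration; the threshold names and default values are example choices, not fixed standards):

```python
def should_stop(depth, n_samples, info_gain,
                max_depth=5, min_samples_split=10, min_gain=0.01):
    # Pre-pruning: stop splitting a node as soon as any
    # early-stopping condition is met.
    return (depth >= max_depth
            or n_samples < min_samples_split
            or info_gain < min_gain)

print(should_stop(depth=2, n_samples=50, info_gain=0.20))  # False: keep splitting
print(should_stop(depth=5, n_samples=50, info_gain=0.20))  # True: depth limit reached
```

A real decision-tree builder would call such a test at every candidate split; post-pruning instead runs after the full tree is built.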

2) Describe hyperplanes and their relation to the number of features present in the dataset.
A hyperplane is a generalization of a plane that serves as a decision
boundary in higher-dimensional spaces. It is commonly used in
machine learning algorithms like Support Vector Machines (SVM) for
classification tasks.
• Relation to Features:
For a dataset with n features, the hyperplane exists in an n-dimensional space. For instance:
o A hyperplane in 2D space is a line.
o In 3D space, it becomes a 2D plane.
o For datasets with more than three features, hyperplanes
exist in higher-dimensional spaces and cannot be
visualized directly.
Hyperplanes separate data points of different classes, and
their orientation and position depend on the feature
values.
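The idea that a hyperplane w·x + b = 0 separates classes in any number of dimensions can be shown with a tiny sign-based classifier (an illustrative sketch; the weights here are hand-picked, not learned by an algorithm such as SVM):

```python
def classify(w, b, x):
    # The side of the hyperplane w·x + b = 0 decides the class.
    # Works for any number of features: len(w) == len(x) == n.
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

# 2 features: the hyperplane x1 + x2 - 1 = 0 is a line
w, b = [1.0, 1.0], -1.0
print(classify(w, b, [2.0, 2.0]))  # 1  (above the line)
print(classify(w, b, [0.0, 0.0]))  # -1 (below the line)
```

The same function handles 3, 10, or 1000 features unchanged; only the visualization becomes impossible beyond three dimensions.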

3) Mention the steps in the gradient descent algorithm for training linear units.
1. Initialize Parameters: Assign initial random values to weights
and bias.
2. Calculate Predictions: Compute the model output based on
current parameters.
3. Compute Loss: Measure the difference between predicted and
actual values using a loss function (e.g., Mean Squared Error).
4. Calculate Gradients: Compute the derivative of the loss function
with respect to each parameter.
5. Update Parameters: Adjust weights and bias using the formula:
θ = θ − α (∂L/∂θ)
where α is the learning rate.
6. Iterate: Repeat until the loss converges or the algorithm
reaches the maximum iterations.
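The six steps above can be sketched for a single linear unit y = wx + b trained with a mean-squared-error loss (a minimal example; the data and learning rate are illustrative choices):

```python
def gradient_descent(xs, ys, lr=0.01, epochs=2000):
    w, b = 0.0, 0.0                                  # step 1: initialize parameters
    n = len(xs)
    for _ in range(epochs):                          # step 6: iterate
        preds = [w * x + b for x in xs]              # step 2: predictions
        errors = [p - y for p, y in zip(preds, ys)]  # step 3: terms of the MSE loss
        dw = (2 / n) * sum(e * x for e, x in zip(errors, xs))  # step 4: dL/dw
        db = (2 / n) * sum(errors)                             # step 4: dL/db
        w -= lr * dw                                 # step 5: θ = θ − α (∂L/∂θ)
        b -= lr * db
    return w, b

# data generated by y = 2x + 1
w, b = gradient_descent([0, 1, 2, 3], [1, 3, 5, 7])
print(round(w, 2), round(b, 2))  # close to 2.0 and 1.0
```

The loop recovers the slope and intercept because the MSE loss for a linear unit is convex, so repeated small steps downhill reach the global minimum.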

4) What are the main goals of computational learning theory? What are some of the main challenges faced?
Goals:
• Establish a theoretical foundation for machine learning
algorithms to understand their efficiency, effectiveness, and
limitations.
• Define concepts like VC dimension, sample complexity, and PAC
(Probably Approximately Correct) learning to assess algorithm
generalization.
• Develop algorithms that are computationally efficient and
perform well across various scenarios.
Challenges:
• Balancing computational complexity with algorithm
performance in real-world applications.
• Handling noisy, incomplete, or high-dimensional data, which
often violates theoretical assumptions.
• Bridging the gap between theoretical models and practical
implementations for robust learning.
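One concrete result from PAC learning is the sample-complexity bound for a finite hypothesis space H: with probability at least 1 − δ, a consistent learner needs m ≥ (1/ε)(ln|H| + ln(1/δ)) examples to reach true error at most ε. A small calculator (the numbers plugged in are illustrative):

```python
import math

def pac_sample_bound(h_size, epsilon, delta):
    # m >= (1/epsilon) * (ln|H| + ln(1/delta))
    # for a consistent learner over a finite hypothesis space
    return math.ceil((math.log(h_size) + math.log(1 / delta)) / epsilon)

# |H| = 1000 hypotheses, error at most 0.1, with 95% confidence
print(pac_sample_bound(1000, epsilon=0.1, delta=0.05))  # 100
```

Note how the bound grows only logarithmically in |H| and 1/δ but linearly in 1/ε, which is why tightening the accuracy requirement is the expensive part.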

5) Explain the types of Naive Bayes models in detail.
1. Gaussian Naive Bayes:
Assumes continuous features follow a Gaussian (normal) distribution
within each class. It is used when the input variables are numerical,
for example classifying flowers from continuous petal and sepal
measurements.
2. Multinomial Naive Bayes:
Best suited for discrete data, especially in text classification
problems. It models the frequency of occurrences of events. A
common example is document classification, where it evaluates
the frequency of words across documents.
3. Bernoulli Naive Bayes:
Deals with binary/boolean data. It predicts classes based on the
presence or absence of features. A typical application is
sentiment analysis, determining whether words in a text are
positive or negative.
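The Bernoulli variant is small enough to sketch end-to-end. Below, features mark the presence (1) or absence (0) of two example words, with Laplace smoothing to avoid zero probabilities (an illustrative toy, not a production implementation):

```python
import math

def train_bernoulli_nb(X, y, alpha=1.0):
    # X: binary feature vectors; y: class labels
    classes = sorted(set(y))
    n_features = len(X[0])
    priors, cond = {}, {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        priors[c] = len(rows) / len(X)
        # P(feature j = 1 | class c) with Laplace smoothing
        cond[c] = [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
                   for j in range(n_features)]
    return priors, cond

def predict(priors, cond, x):
    def log_posterior(c):
        lp = math.log(priors[c])
        for p, xj in zip(cond[c], x):
            lp += math.log(p if xj else 1 - p)  # absence also carries evidence
        return lp
    return max(priors, key=log_posterior)

# toy sentiment data: feature 0 = "good" present, feature 1 = "bad" present
priors, cond = train_bernoulli_nb([[1, 0], [1, 0], [0, 1], [0, 1]],
                                  ["pos", "pos", "neg", "neg"])
print(predict(priors, cond, [1, 0]))  # pos
```

Multinomial Naive Bayes would replace the presence/absence probabilities with word-count frequencies, and the Gaussian variant would replace them with per-class normal densities.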

6) Explain any three applications of the kNN algorithm in real life.
1. Recommendation Systems:
kNN identifies users with similar preferences to recommend
items such as movies, products, or books.
2. Medical Diagnosis:
kNN classifies diseases based on patient data, symptoms, and
historical cases, helping in quick diagnosis and treatment.
3. Fraud Detection:
Analyzes past transaction patterns to detect anomalous
behavior, which could signify fraudulent activities.
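All three applications reduce to the same mechanism: find the k closest labelled examples and take a majority vote. A minimal sketch on toy 2-D data (real uses such as fraud detection would have many more features):

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    # distance from the query point to every training example
    neighbours = sorted(
        (math.dist(p, x), label) for p, label in zip(train_X, train_y)
    )
    # majority vote among the k nearest neighbours
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]

# toy transaction data: two clusters of points
train_X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ["normal", "normal", "normal", "fraud", "fraud", "fraud"]
print(knn_predict(train_X, train_y, [0.5, 0.5]))  # normal
```

Because kNN stores the whole training set and defers all work to prediction time, it is often called a lazy learner; the cost of each query grows with the size of the stored data.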

7) Describe elitism in the context of genetic algorithms and its impact. Mention disadvantages of genetic algorithms.
Elitism:
In genetic algorithms, elitism involves retaining the best-performing
individuals across generations. By ensuring the survival of top
solutions, elitism prevents the loss of valuable genetic traits and
accelerates convergence to optimal solutions.
Disadvantages:
• Premature Convergence: The algorithm might focus too early
on suboptimal solutions, missing better alternatives.
• High Computational Cost: Repeated evaluations of fitness
functions are expensive for large datasets.
• Parameter Sensitivity: Requires careful tuning of mutation
rates, crossover rates, and population size.
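One generation step with elitism can be sketched as follows; individuals are bit lists and the fitness function is an illustrative stand-in (here, the number of 1-bits). Because the elites are copied over unchanged, the best fitness in the population can never decrease between generations:

```python
import random

def next_generation(population, fitness, elite_size=2, mutation_rate=0.05):
    ranked = sorted(population, key=fitness, reverse=True)
    new_pop = [list(ind) for ind in ranked[:elite_size]]  # elitism: best survive unchanged
    while len(new_pop) < len(population):
        p1, p2 = random.sample(ranked[:len(ranked) // 2], 2)  # fitter half reproduces
        cut = random.randrange(1, len(p1))                    # one-point crossover
        child = p1[:cut] + p2[cut:]
        child = [g ^ 1 if random.random() < mutation_rate else g
                 for g in child]                              # bit-flip mutation
        new_pop.append(child)
    return new_pop

random.seed(0)
pop = [[random.randint(0, 1) for _ in range(8)] for _ in range(10)]
for _ in range(20):
    pop = next_generation(pop, fitness=sum)
print(max(sum(ind) for ind in pop))  # best fitness only ever goes up
```

The flip side, as noted above, is that copying elites can reduce diversity and push the search toward premature convergence if elite_size is too large.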

8) Explain how an agent can take action to move from one state to
another with the help of rewards.
An agent in Reinforcement Learning (RL) interacts with its
environment by:
1. Observing its current state.
2. Taking an action based on its policy (set of rules).
3. Receiving a reward or penalty based on the outcome of the
action.
The agent iteratively adjusts its policy to maximize cumulative
rewards, using algorithms like Q-Learning or Deep Q-Networks. Over
time, this enables it to optimize decision-making and achieve desired
goals.
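This observe-act-reward loop can be sketched with tabular Q-learning on a toy corridor of states, where the agent earns a reward of 1 only on reaching the rightmost state (an illustrative example; the parameter values are arbitrary choices):

```python
import random

def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1):
    # actions: 0 = move left, 1 = move right; the goal is the last state
    Q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0                                     # observe the starting state
        while s != n_states - 1:
            # epsilon-greedy: mostly follow the policy, sometimes explore
            if random.random() < epsilon:
                a = random.randrange(2)
            else:
                a = 0 if Q[s][0] > Q[s][1] else 1
            s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
            r = 1.0 if s2 == n_states - 1 else 0.0    # reward only at the goal
            # Q-learning update: nudge Q(s,a) toward r + gamma * max Q(s',·)
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

random.seed(1)
Q = q_learning()
path, s = [0], 0
for _ in range(8):
    if s == 4:
        break
    s = s + 1 if Q[s][1] >= Q[s][0] else max(0, s - 1)
    path.append(s)
print(path)  # greedy route from the start state to the goal
```

After training, following the greedy policy walks straight to the rewarded state, which is exactly the "adjust the policy to maximize cumulative rewards" behaviour described above.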

9) What is an Inductive Learning Algorithm? Why is it needed even when other learning algorithms like ID3 and AQ were available?
Inductive Learning Algorithm:
A method of generalizing patterns or rules from observed data. It
identifies underlying relationships, which can then predict outcomes
for unseen data.
Need for Inductive Learning:
• ID3 and AQ specialize in rule-based learning for specific
problem types, while inductive learning is versatile and
applicable to broader scenarios.
• It is suitable for both discrete and continuous data, making it
more adaptable for real-world applications.
