
MACHINE LEARNING

1) Define Machine Learning and explain its relation to Artificial Intelligence

Definition of Machine Learning (ML)

Machine Learning (ML) is a branch of Artificial Intelligence (AI) that enables systems to learn from data,
identify patterns, and make decisions or predictions without explicit programming. ML focuses on creating
algorithms and models that improve their performance over time as they are exposed to more data.

Relationship to Artificial Intelligence

Artificial Intelligence is the broader field aimed at developing systems that simulate human intelligence,
such as reasoning, problem-solving, and decision-making. ML is a core subset of AI that provides the ability
to learn and adapt autonomously. While AI encompasses techniques like rule-based systems and robotics,
ML uses statistical and computational methods to achieve adaptability and improve performance. For
example, AI-powered virtual assistants like Siri or Alexa use ML to refine their responses based on user
interactions.

Features of Machine Learning

1. Data-Driven: ML models rely heavily on large datasets for training and performance improvement.

2. Self-Learning: Systems can learn and improve without human intervention.

3. Pattern Recognition: Identifies and leverages patterns in data for predictions or decisions.

4. Automation: Reduces the need for manual programming and decision-making.

5. Adaptability: Performance improves as new data becomes available.

6. Wide Applications: Used in image recognition, recommendation systems, natural language processing, and more.

2) Compare supervised, unsupervised, and reinforcement learning with examples

Aspect     | Supervised Learning                          | Unsupervised Learning                     | Reinforcement Learning
Definition | Learns from labeled data to make predictions. | Finds patterns in unlabeled data.        | Learns by interacting with the environment to maximize rewards.
Input Data | Labeled (input-output pairs).                 | Unlabeled (no predefined outputs).       | States, actions, and rewards.
Goal       | Predict outcomes for new inputs.              | Discover hidden structures or clusters.  | Learn a strategy to achieve a goal.
Examples   | Email spam detection, image classification.   | Customer segmentation, anomaly detection. | Self-driving cars, game-playing bots.
Techniques | Regression, Classification.                   | Clustering, Dimensionality Reduction.    | Q-Learning, Deep Q-Networks (DQN).
Output     | A model that maps input to output.            | Groups or patterns in data.              | A policy for decision-making.

3) Explain Bayes' Theorem and its application in the Naïve Bayes Classifier.

Bayes' Theorem is a fundamental concept in probability theory that describes the probability of an event
based on prior knowledge of conditions that might be related to the event. Mathematically, it is expressed
as:

P(A|B) = P(B|A) · P(A) / P(B)

Where:

 P(A|B) is the posterior probability: the probability of event A occurring given that event B has occurred.

 P(B|A) is the likelihood: the probability of event B occurring given that event A has occurred.

 P(A) is the prior probability: the initial probability of event A.

 P(B) is the marginal likelihood: the total probability of event B occurring.
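For instance, here is a minimal Python sketch that plugs hypothetical numbers into the formula (a made-up diagnostic-test scenario; none of these probabilities come from the text above):

```python
# Hypothetical numbers, chosen only to illustrate Bayes' Theorem.
p_disease = 0.01            # P(A): prior probability of having the disease
p_pos_given_disease = 0.95  # P(B|A): likelihood of a positive test given disease
p_pos = 0.05                # P(B): marginal probability of a positive test

# Posterior P(A|B) = P(B|A) * P(A) / P(B)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(p_disease_given_pos)  # 0.19 -> a positive test implies only a 19% chance of disease
```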

Naïve Bayes Classifier is a simple yet powerful classification algorithm based on Bayes' Theorem. It's called "naïve" because it assumes that the features (attributes) of the data are independent of each other, which is often not the case in real-world scenarios. Despite this simplifying assumption, Naïve Bayes classifiers often perform surprisingly well. Here's how it works:

1. Training Phase: The algorithm uses a training dataset to estimate the probability of each class (prior
probability) and the conditional probability of each feature given the class (likelihood).

2. Prediction Phase: For a new instance, the algorithm calculates the posterior probability for each class
using Bayes' Theorem and assigns the class with the highest posterior probability.
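A minimal sketch of these two phases, assuming scikit-learn's GaussianNB and synthetic data (neither appears in the original notes), might look like this:

```python
# Sketch: training and prediction with Gaussian Naive Bayes (synthetic data).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GaussianNB()
model.fit(X_train, y_train)         # Training phase: estimates priors and likelihoods
print(model.predict(X_test[:5]))    # Prediction phase: class with highest posterior
print(model.score(X_test, y_test))  # Accuracy on held-out data
```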

Applications of Naïve Bayes Classifier:

 Text Classification

 Medical Diagnosis

 Recommendation Systems

4) Describe the k-NN algorithm and its limitations.

k-Nearest Neighbors (k-NN) is a simple, non-parametric, and lazy machine learning algorithm used for
classification and regression tasks. It predicts the output for a given input based on the majority class or
average of the outputs of its k nearest neighbors in the feature space.

How k-NN Works:

1. Training Phase:

o No explicit training is required; the algorithm stores the entire dataset.

2. Prediction Phase:

o Calculate the distance (e.g., Euclidean, Manhattan) between the input point and all data
points in the dataset.

o Select the k closest points (neighbors).

o For classification, assign the class that is most frequent among the k neighbors.

o For regression, take the average value of the k neighbors.
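A minimal sketch of this procedure, assuming scikit-learn's KNeighborsClassifier on synthetic data (with k = 3 chosen arbitrarily), might look like this:

```python
# Sketch: k-NN classification on synthetic 2-D data.
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=150, n_features=2, n_redundant=0, random_state=1)
X = StandardScaler().fit_transform(X)      # scaling matters: k-NN is distance-based

knn = KNeighborsClassifier(n_neighbors=3)  # Euclidean distance by default
knn.fit(X, y)                              # "training" just stores the dataset
print(knn.predict(X[:5]))                  # majority vote among the 3 nearest neighbors
```

Note the scaling step: it addresses the feature-scaling sensitivity listed among the limitations below.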

Limitations of k-NN:

1. Computationally Expensive:

o Requires computing distances to all data points, making it slow for large datasets.

2. Memory Intensive:

o Stores the entire dataset, leading to high memory usage.

3. Sensitive to Feature Scaling:

o Performance depends on the scale of features. Features with larger ranges may dominate
distance calculations.

5) What is a Decision Tree, and what are its advantages and disadvantages?
A Decision Tree is a supervised machine learning algorithm used for both classification and regression tasks.
It splits the data into subsets based on feature values, resulting in a tree-like structure. Each internal node
represents a feature, each branch represents a decision rule, and each leaf node represents an output label
or value.

Example:

 Task: Predicting whether a person buys a product (Yes/No).

 Features: Age, Income, Marital Status.

 Decision Tree Structure:

o First split based on Age: If age > 30, move to the left; else, move to the right.

o Second split based on Income for the left child node, and based on Marital Status for the right
child node.
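A minimal sketch of such a tree, assuming scikit-learn and a small hand-made dataset mirroring this example (all values hypothetical), might look like this:

```python
# Sketch: a shallow decision tree for the buy/not-buy example.
from sklearn.tree import DecisionTreeClassifier, export_text

# Columns: [age, income in thousands, married (1 = yes)] -- hypothetical values
X = [[25, 30, 0], [40, 80, 1], [35, 60, 1], [22, 25, 0], [50, 90, 1], [28, 40, 0]]
y = [0, 1, 1, 0, 1, 0]  # 1 = buys the product, 0 = does not

tree = DecisionTreeClassifier(max_depth=2, random_state=0)  # limiting depth reduces overfitting
tree.fit(X, y)
print(export_text(tree, feature_names=["age", "income", "married"]))  # the learned splits
print(tree.predict([[33, 70, 1]]))  # classify a new person
```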

Advantages and Disadvantages of Decision Tree:

Advantages                                           | Disadvantages
Easy to understand and interpret.                    | Prone to overfitting if not pruned.
Handles both numerical and categorical data.         | Sensitive to small variations in data.
Can model non-linear relationships.                  | Can be unstable with large datasets.
Requires little data preprocessing.                  | Biased toward features with more levels.
Can be used for both classification and regression.  | May suffer from poor performance with complex datasets.

6) Discuss the role of kernels in SVM and how they improve performance.

Role of Kernels in Support Vector Machines (SVM)

In SVM, kernels are functions that enable the algorithm to operate in higher-dimensional spaces without
explicitly transforming the data. This is done through the kernel trick, which allows SVM to find linear
hyperplanes in higher-dimensional spaces, even when the data is not linearly separable in the original
feature space.

How Kernels Improve Performance:

1. Non-linear Data Mapping: Kernels map data to higher dimensions where a linear separation is
possible, enabling SVM to handle non-linear classification tasks.

2. Efficiency: The kernel trick avoids the need for costly explicit transformation, making it
computationally efficient.

3. Versatility: Different types of kernels (e.g., Linear, Polynomial, Radial Basis Function (RBF)) allow
SVM to handle a wide range of data types and distributions.

4. Improved Accuracy: By projecting the data into higher dimensions, kernels increase the chances of
finding a more optimal hyperplane, leading to better classification performance.

Common Kernels:

 Linear Kernel: For linearly separable data.

 Polynomial Kernel: Captures interactions between features.

 RBF Kernel: Handles highly non-linear data by measuring distance from data points.
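A minimal sketch comparing these kernels, assuming scikit-learn's SVC on a synthetic non-linear dataset, might look like this:

```python
# Sketch: the same SVM with three different kernels on non-linear data.
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

for kernel in ["linear", "poly", "rbf"]:
    clf = SVC(kernel=kernel)        # kernel trick: no explicit feature mapping needed
    clf.fit(X, y)
    print(kernel, clf.score(X, y))  # the RBF kernel typically fits this data best
```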

7) Differentiate between linear, lasso, and ridge regression.

8) Explain the concept of k-Means clustering and its limitations.

k-Means Clustering

k-Means is an unsupervised machine learning algorithm used for clustering similar data points into groups,
called clusters. The algorithm works as follows:

1. Initialization: Choose k initial centroids randomly.

2. Assignment: Assign each data point to the nearest centroid.

3. Update: Recalculate the centroids by averaging the points in each cluster.

4. Repeat: Repeat the assignment and update steps until the centroids no longer change or the
maximum number of iterations is reached.
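A minimal sketch of these steps, assuming scikit-learn's KMeans on synthetic data (with k = 3 taken as known), might look like this:

```python
# Sketch: k-Means on synthetic blob data.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

km = KMeans(n_clusters=3, n_init=10, random_state=0)  # several restarts mitigate bad initializations
labels = km.fit_predict(X)  # assignment + update steps repeat until convergence
print(km.cluster_centers_)  # final centroids
print(labels[:10])          # cluster index of the first 10 points
```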

Limitations of k-Means Clustering

1. Choosing k: The number of clusters k must be specified in advance, and choosing the wrong
value can affect performance.

2. Sensitive to Initialization: The algorithm can converge to different solutions based on the initial
centroids, making results inconsistent.

3. Assumes Spherical Clusters: k-Means assumes clusters are spherical and equally sized, which may
not be the case for all datasets.

4. Sensitive to Outliers: Outliers can skew the centroid, affecting the clustering results.

5. Non-Global Optimum: The algorithm may get stuck in a local minimum, especially with complex
data.

9) What are the differences between agglomerative and divisive clustering?

10) Discuss the importance of dimensionality reduction and compare PCA and LDA.

Importance of Dimensionality Reduction

 Reduces computational cost by lowering the number of features.

 Improves model performance by eliminating noise and redundant features.

 Makes data visualization easier in lower dimensions (2D, 3D).

 Helps avoid overfitting by simplifying the model.

 Enables better storage and faster processing of large datasets.
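To make the comparison concrete: PCA is unsupervised and projects data onto directions of maximum variance, while LDA is supervised and projects onto directions that best separate the classes. A minimal sketch of both, assuming scikit-learn and the Iris dataset, might look like this:

```python
# Sketch: PCA (unsupervised) vs. LDA (supervised) on the Iris dataset.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

X_pca = PCA(n_components=2).fit_transform(X)  # ignores labels; maximizes variance
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)  # uses labels
print(X_pca.shape, X_lda.shape)  # both reduce 4 features to 2
```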

11) What is Market Basket Analysis, and how is it applied in real-world scenarios?

Market Basket Analysis

Market Basket Analysis (MBA) is a data mining technique used to discover patterns of co-occurrence or
relationships between items that are frequently bought together. It helps in identifying associations between
products in a transaction database.

Application in Real-World Scenarios:

1. Retail: Stores use MBA to understand which products customers often purchase together, helping to
optimize product placement and promotions.

2. E-commerce: Online platforms recommend products based on items customers have previously
bought or viewed together.

3. Cross-Selling: Banks and telecom companies use MBA to suggest additional services or products to
existing customers.

4. Inventory Management: Helps businesses forecast demand by identifying product combinations that
should be stocked together.

5. Discount Strategies: Stores create bundle offers and discounts based on frequently bought product
pairs to increase sales.

12) Explain the working of the Apriori algorithm with an example.

The Apriori algorithm is used for association rule mining to identify frequent item sets in a transaction
dataset and derive association rules. The algorithm works in a bottom-up manner, starting from individual
items and iteratively combining them to find larger item sets.

Working of Apriori:

1. Step 1 (Find Frequent Items): Start by finding individual items that meet the minimum support
threshold (items bought frequently).

2. Step 2 (Generate Candidate Item Sets): Combine the frequent items from the previous step to form
larger item sets (pairs, triplets, etc.).

3. Step 3 (Prune Infrequent Item Sets): Remove candidate item sets that don't meet the minimum
support threshold.

4. Step 4 (Repeat): Repeat the process until no further frequent item sets can be generated.

EXAMPLE:

Consider the following transaction data:

Transaction ID | Items Purchased
T1             | Bread, Butter
T2             | Bread, Milk
T3             | Butter, Milk
T4             | Bread, Butter, Milk

Step 1: Find frequent individual items (1-item sets)

 Support for each item:

o Bread: 3/4 (75%)

o Butter: 3/4 (75%)

o Milk: 3/4 (75%)

Assume the minimum support threshold is 50%. All three items (Bread, Butter, Milk) are frequent since they
meet the threshold.

Step 2: Generate pairs of items (2-item sets)

 Candidate pairs:

o {Bread, Butter}

o {Bread, Milk}

o {Butter, Milk}

Step 3: Calculate the support for pairs

 Support for each pair:

o {Bread, Butter}: 2/4 (50%)

o {Bread, Milk}: 2/4 (50%)

o {Butter, Milk}: 2/4 (50%)

All pairs have a support of 50%, which meets the minimum threshold.

Step 4: Generate 3-item sets

 Candidate 3-item set: {Bread, Butter, Milk}

Step 5: Calculate the support for the 3-item set

 Support for {Bread, Butter, Milk}: 1/4 (25%)

This item set does not meet the minimum support threshold (50%), so it is pruned.

Final Frequent Item Sets:

 1-item sets: {Bread}, {Butter}, {Milk}

 2-item sets: {Bread, Butter}, {Bread, Milk}, {Butter, Milk}

Association Rule Generation:

Now, we can generate association rules from the frequent item sets. For example:

 Rule 1: {Bread} → {Butter} with support = 50% and confidence = 66.67%

 Rule 2: {Butter} → {Bread} with support = 50% and confidence = 66.67%

These rules show the relationships between items purchased together. The Apriori algorithm uses support
and confidence to identify strong rules in the data.
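To make the arithmetic above reproducible, here is a short self-contained Python sketch (illustrative only, not an optimized Apriori implementation) that recomputes each item set's support for these four transactions:

```python
# Sketch: brute-force support counts for the worked example.
from itertools import combinations

transactions = [
    {"Bread", "Butter"},
    {"Bread", "Milk"},
    {"Butter", "Milk"},
    {"Bread", "Butter", "Milk"},
]

def support(itemset):
    """Fraction of transactions containing every item in the set."""
    return sum(itemset <= t for t in transactions) / len(transactions)

items = {"Bread", "Butter", "Milk"}
for size in (1, 2, 3):
    for combo in combinations(sorted(items), size):
        print(set(combo), support(set(combo)))  # e.g. {'Bread', 'Butter'} -> 0.5
```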
