ML Q&A
Machine Learning (ML) is a branch of Artificial Intelligence (AI) that enables systems to learn from data,
identify patterns, and make decisions or predictions without explicit programming. ML focuses on creating
algorithms and models that improve their performance over time as they are exposed to more data.
Artificial Intelligence is the broader field aimed at developing systems that simulate human intelligence,
such as reasoning, problem-solving, and decision-making. ML is a core subset of AI that provides the ability
to learn and adapt autonomously. While AI encompasses techniques like rule-based systems and robotics,
ML uses statistical and computational methods to achieve adaptability and improve performance. For
example, AI-powered virtual assistants like Siri or Alexa use ML to refine their responses based on user
interactions.
Key characteristics of ML:
1. Data-Driven: ML models rely heavily on large datasets for training and performance improvement.
2. Pattern Recognition: Identifies and leverages patterns in data for predictions or decisions.
The three main learning paradigms compare as follows:
Supervised Learning: learns from labeled data (input-output pairs) to make predictions; the goal is to predict outcomes for new inputs, and the output is a model that maps input to output.
Unsupervised Learning: finds patterns in unlabeled data (no predefined outputs); the goal is to discover hidden structures or clusters, and the output is groups or patterns in the data.
Reinforcement Learning: learns by interacting with the environment to maximize rewards, taking states, actions, and rewards as input; the goal is to learn a strategy to achieve a goal, and the output is a policy for decision-making.
3) Explain Bayes Theorem and its application in the Naïve Bayes Classifier.
Bayes' Theorem is a fundamental concept in probability theory that describes the probability of an event
based on prior knowledge of conditions that might be related to the event. Mathematically, it is expressed
as:
P(A|B) = P(B|A) · P(A) / P(B)
Where:
P(A|B) is the posterior probability: the probability of event A occurring given that event B has occurred.
P(B|A) is the likelihood: the probability of event B occurring given that event A has occurred.
P(A) is the prior probability: the probability of event A before observing B.
P(B) is the evidence: the overall probability of event B.
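To make the formula concrete, here is a minimal Python illustration; the probability values are hypothetical, chosen only to show the arithmetic:

    # Hypothetical probabilities, purely for illustration.
    p_a = 0.3            # prior P(A)
    p_b_given_a = 0.8    # likelihood P(B|A)
    p_b = 0.5            # evidence P(B)

    # Bayes' Theorem: posterior = likelihood * prior / evidence
    p_a_given_b = p_b_given_a * p_a / p_b
    print(p_a_given_b)   # 0.48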
Naïve Bayes Classifier is a simple yet powerful classification algorithm based on Bayes' Theorem. It's called
"naïve" because it assumes that the features (attributes) of the data are independent of each other, which
is often not the case in real-world scenarios. Despite this simplifying assumption, Naïve Bayes classifiers
often perform surprisingly well. Here's how it works:
1. Training Phase: The algorithm uses a training dataset to estimate the probability of each class (prior
probability) and the conditional probability of each feature given the class (likelihood).
2. Prediction Phase: For a new instance, the algorithm calculates the posterior probability for each class
using Bayes' Theorem and assigns the class with the highest posterior probability.
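As a sketch of these two phases, the snippet below uses scikit-learn's GaussianNB, the variant that models each feature's likelihood as a Gaussian; the library choice and the toy data are assumptions, since the section does not name either:

    import numpy as np
    from sklearn.naive_bayes import GaussianNB

    # Hypothetical training data: two numeric features, two classes.
    X_train = np.array([[1.0, 2.1], [1.2, 1.9], [3.8, 4.0], [4.1, 3.9]])
    y_train = np.array([0, 0, 1, 1])

    # Training phase: estimate class priors and per-feature likelihoods.
    model = GaussianNB()
    model.fit(X_train, y_train)

    # Prediction phase: the class with the highest posterior wins.
    print(model.predict([[3.9, 4.2]]))        # -> [1]
    print(model.predict_proba([[3.9, 4.2]]))  # posterior probability per class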
Applications of Naïve Bayes Classifier:
Text Classification
Medical Diagnosis
Recommendation Systems
4) What is k-Nearest Neighbors (k-NN), and what are its limitations?
k-Nearest Neighbors (k-NN) is a simple, non-parametric, and lazy machine learning algorithm used for classification and regression tasks. It predicts the output for a given input based on the majority class or average of the outputs of its k nearest neighbors in the feature space.
1. Training Phase:
o Simply store the training data; no model is fit in advance, which is why k-NN is called "lazy."
2. Prediction Phase:
o Calculate the distance (e.g., Euclidean, Manhattan) between the input point and all data points in the dataset.
o Select the k data points closest to the input.
o For classification, assign the class that is most frequent among the k neighbors; for regression, average their outputs (see the sketch below).
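A minimal pure-Python sketch of this prediction procedure (the function name and toy data are illustrative):

    from collections import Counter
    import math

    def knn_predict(train, query, k=3):
        """Classify `query` by majority vote among its k nearest training points."""
        # Compute the Euclidean distance from the query to every training point.
        dists = [(math.dist(x, query), label) for x, label in train]
        # Select the k closest points.
        nearest = sorted(dists)[:k]
        # Assign the most frequent class among those k neighbors.
        return Counter(label for _, label in nearest).most_common(1)[0][0]

    # Hypothetical 2-D training data: (point, class label).
    train = [((1, 1), "A"), ((1, 2), "A"), ((5, 5), "B"), ((6, 5), "B")]
    print(knn_predict(train, (1.5, 1.5)))  # -> "A"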
Limitations of k-NN:
1. Computationally Expensive:
o Requires computing distances to all data points, making it slow for large datasets.
2. Memory Intensive:
o Must keep the entire training set in memory, since no compact model is learned.
3. Sensitive to Feature Scaling:
o Performance depends on the scale of features. Features with larger ranges may dominate distance calculations.
5) What is a Decision Tree, and what are its advantages and disadvantages?
A Decision Tree is a supervised machine learning algorithm used for both classification and regression tasks.
It splits the data into subsets based on feature values, resulting in a tree-like structure. Each internal node
represents a feature, each branch represents a decision rule, and each leaf node represents an output label
or value.
Example:
o First split based on Age: If age > 30, move to the left; else, move to the right.
o Second split based on Income for the left child node, and based on Marital Status for the right
child node.
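A brief sketch of fitting such a tree with scikit-learn on hypothetical age/income data, echoing the splits described above (both the library choice and the numbers are assumptions):

    from sklearn.tree import DecisionTreeClassifier, export_text

    # Hypothetical data: [age, income]; label 1 = positive outcome, 0 = negative.
    X = [[25, 30000], [45, 80000], [35, 40000], [50, 20000], [23, 25000], [40, 90000]]
    y = [0, 1, 0, 0, 0, 1]

    tree = DecisionTreeClassifier(max_depth=2, random_state=0)
    tree.fit(X, y)

    # Print the learned decision rules, one line per node of the tree.
    print(export_text(tree, feature_names=["age", "income"]))
    print(tree.predict([[48, 85000]]))  # follows the branches to a leaf label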
Advantages:
o Handles both numerical and categorical data.
o Can model non-linear relationships.
o Requires little data preprocessing.
o Can be used for both classification and regression.
Disadvantages:
o Sensitive to small variations in data.
o Can be unstable with large data sets.
o Biased toward features with more levels.
o May suffer from poor performance with complex datasets.
6) Discuss the role of kernels in SVM and how they improve performance.
In SVM, kernels are functions that enable the algorithm to operate in higher-dimensional spaces without
explicitly transforming the data. This is done through the kernel trick, which allows SVM to find linear
hyperplanes in higher-dimensional spaces, even when the data is not linearly separable in the original
feature space.
1. Non-linear Data Mapping: Kernels map data to higher dimensions where a linear separation is possible, enabling SVM to handle non-linear classification tasks.
2. Versatility: Different types of kernels (e.g., Linear, Polynomial, Radial Basis Function (RBF)) allow SVM to handle a wide range of data types and distributions.
3. Improved Accuracy: By projecting the data into higher dimensions, kernels increase the chances of finding a more optimal hyperplane, leading to better classification performance.
Common Kernels:
o Linear: K(x, y) = xᵀy, for data that is already (close to) linearly separable.
o Polynomial: K(x, y) = (xᵀy + c)^d, which captures polynomial feature interactions.
o RBF (Gaussian): K(x, y) = exp(−γ‖x − y‖²), a flexible default for non-linear data.
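The sketch below illustrates the kernel trick on concentric circles, a dataset no linear hyperplane can separate in the original 2-D space; scikit-learn and the specific gamma value are assumptions:

    from sklearn.datasets import make_circles
    from sklearn.svm import SVC

    # Two concentric rings of points: not linearly separable as-is.
    X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

    linear_svm = SVC(kernel="linear").fit(X, y)
    rbf_svm = SVC(kernel="rbf", gamma=2.0).fit(X, y)

    # The RBF kernel implicitly maps the rings into a space where they are
    # linearly separable, so it scores far higher than the linear kernel.
    print("linear accuracy:", linear_svm.score(X, y))  # typically near 0.5
    print("RBF accuracy:   ", rbf_svm.score(X, y))     # typically near 1.0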
k-Means Clustering
k-Means is an unsupervised machine learning algorithm used for clustering similar data points into groups,
called clusters. The algorithm works as follows:
1. Initialize: Choose k initial centroids (e.g., k randomly selected data points).
2. Assign: Assign each data point to the cluster of its nearest centroid.
3. Update: Recompute each centroid as the mean of the points currently assigned to it.
4. Repeat: Repeat the assignment and update steps until the centroids no longer change or the maximum number of iterations is reached.
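A minimal sketch of these steps using scikit-learn's KMeans on hypothetical 2-D points (the library and data are assumptions):

    import numpy as np
    from sklearn.cluster import KMeans

    # Hypothetical points forming two visually distinct groups.
    X = np.array([[1, 1], [1.5, 2], [1, 0.5], [8, 8], [8.5, 9], [9, 8]])

    # n_init=10 restarts the algorithm from 10 random initializations and keeps
    # the best result, which mitigates the initialization sensitivity noted below.
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
    print(km.labels_)           # cluster assignment for each point
    print(km.cluster_centers_)  # final centroids after convergence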
Limitations of k-Means:
1. Choosing k: The number of clusters k must be specified in advance, and choosing the wrong value can affect performance.
2. Sensitive to Initialization: The algorithm can converge to different solutions based on the initial
centroids, making results inconsistent.
3. Assumes Spherical Clusters: k-Means assumes clusters are spherical and equally sized, which may
not be the case for all datasets.
4. Sensitive to Outliers: Outliers can skew the centroid, affecting the clustering results.
5. Non-Global Optimum: The algorithm may get stuck in a local minimum, especially with complex
data.
10) Discuss the importance of dimensionality reduction and compare PCA and LDA.
Importance of Dimensionality Reduction
Dimensionality reduction removes redundant or weakly informative features. This lowers computation and storage costs, mitigates overfitting and the "curse of dimensionality," and makes high-dimensional data easier to visualize.
Comparison of PCA and LDA:
o PCA (Principal Component Analysis) is unsupervised: it ignores class labels and projects the data onto the directions of maximum variance.
o LDA (Linear Discriminant Analysis) is supervised: it uses class labels to find directions that maximize between-class separation relative to within-class scatter.
o PCA suits general compression, denoising, and visualization; LDA suits feature extraction for classification and yields at most (number of classes − 1) components.
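A short sketch contrasting the two on the classic Iris dataset with scikit-learn (an assumed setup, since the answer names no library):

    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    X, y = load_iris(return_X_y=True)  # 150 samples, 4 features, 3 classes

    # PCA is unsupervised: it ignores y and keeps directions of maximum variance.
    X_pca = PCA(n_components=2).fit_transform(X)

    # LDA is supervised: it uses y to find directions that best separate classes.
    X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)

    print(X.shape, "->", X_pca.shape, "and", X_lda.shape)  # (150, 4) -> (150, 2) twice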
11) What is Market Basket Analysis, and how is it applied in real-world scenarios?
Market Basket Analysis (MBA) is a data mining technique used to discover patterns of co-occurrence or relationships between items that are frequently bought together. It helps in identifying associations between products in a transaction database.
1. Retail: Stores use MBA to understand which products customers often purchase together, helping to optimize product placement and promotions.
2. E-commerce: Online platforms recommend products based on items customers have previously
bought or viewed together.
3. Cross-Selling: Banks and telecom companies use MBA to suggest additional services or products to
existing customers.
4. Inventory Management: Helps businesses forecast demand by identifying product combinations that
should be stocked together.
5. Discount Strategies: Stores create bundle offers and discounts based on frequently bought product
pairs to increase sales.
12) Explain the Apriori algorithm with an example.
The Apriori algorithm is used for association rule mining to identify frequent item sets in a transaction
dataset and derive association rules. The algorithm works in a bottom-up manner, starting from individual
items and iteratively combining them to find larger item sets.
Working of Apriori:
1. Step 1 (Find Frequent Items): Start by finding individual items that meet the minimum support
threshold (items bought frequently).
2. Step 2 (Generate Candidate Item Sets): Combine the frequent items from the previous step to form
larger item sets (pairs, triplets, etc.).
3. Step 3 (Prune Infrequent Item Sets): Remove candidate item sets that don't meet the minimum
support threshold.
4. Step 4 (Repeat): Repeat the process until no further frequent item sets can be generated.
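A compact pure-Python sketch of these four steps, run on the same four transactions as the example that follows (function and variable names are illustrative):

    def apriori(transactions, min_support):
        """Return every frequent itemset and its support, found level by level."""
        n = len(transactions)

        def support(itemset):
            return sum(itemset <= t for t in transactions) / n

        # Step 1: frequent individual items.
        level = {frozenset([item]) for t in transactions for item in t}
        level = {s for s in level if support(s) >= min_support}
        frequent, k = {}, 1
        while level:  # Step 4: repeat until no new frequent item sets appear.
            frequent.update({s: support(s) for s in level})
            k += 1
            # Step 2: join frequent sets into larger candidate item sets.
            candidates = {a | b for a in level for b in level if len(a | b) == k}
            # Step 3: prune candidates below the minimum support threshold.
            level = {c for c in candidates if support(c) >= min_support}
        return frequent

    transactions = [{"Bread", "Butter"}, {"Bread", "Milk"},
                    {"Butter", "Milk"}, {"Bread", "Butter", "Milk"}]
    for itemset, sup in sorted(apriori(transactions, 0.5).items(), key=lambda kv: len(kv[0])):
        print(sorted(itemset), f"support={sup:.0%}")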
EXAMPLE:
T1 Bread, Butter
T2 Bread, Milk
T3 Butter, Milk
T4 Bread, Butter, Milk
Assume the minimum support threshold is 50%. All three individual items (Bread, Butter, Milk) are frequent, since each appears in 3 of the 4 transactions (75% support).
Candidate pairs:
o {Bread, Butter}
o {Bread, Milk}
o {Butter, Milk}
All three pairs have a support of 50% (each appears in 2 of the 4 transactions), which meets the minimum threshold.
The only 3-item candidate, {Bread, Butter, Milk}, appears in just 1 of the 4 transactions (25% support); it does not meet the minimum support threshold (50%), so it is pruned.
Now, we can generate association rules from the frequent item sets. For example:
Rule 1: {Bread} → {Butter} with support = 50% and confidence = support({Bread, Butter}) / support({Bread}) = 50% / 75% = 66.67%.
These rules show the relationships between items purchased together. The Apriori algorithm uses support
and confidence to identify strong rules in the data.