DM - MP
1 of 29 06-02-2024, 20:12
Model Question paper 1 https://awaisahmed.notion.site/Model-Question-paper-1-0de375d19c08...
1. Define Entropy.
2. Prediction Phase:
• Count how often each attribute value occurs for each class; these
counts estimate the conditional probability of a value given the class.
Example:
Suppose you are classifying emails as either spam (C1) or not spam
(C2), and you treat the words in each email as the attributes.
• During training, you count how often each word occurs in spam
and non-spam emails.
• You also count how many emails are spam and non-spam.
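The counting step described above can be sketched as follows. This is a minimal illustration, not the notes' own implementation: the function names `train_counts` and `conditional_probability`, the label strings, and the four toy emails are all assumptions made up for the example, and no smoothing is applied.

```python
from collections import Counter, defaultdict


def train_counts(emails):
    """Count emails per class and word occurrences per class.

    `emails` is a list of (text, label) pairs (hypothetical format).
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    for text, label in emails:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return class_counts, word_counts


def conditional_probability(word, label, word_counts):
    """Estimate P(word | label) as a relative frequency (no smoothing)."""
    total_words = sum(word_counts[label].values())
    return word_counts[label][word] / total_words


# Toy training set invented for illustration only.
training = [
    ("win money now", "spam"),
    ("win a free prize", "spam"),
    ("meeting at noon", "not_spam"),
    ("lunch money for the meeting", "not_spam"),
]

class_counts, word_counts = train_counts(training)
p_win_given_spam = conditional_probability("win", "spam", word_counts)
```

Here "win" appears twice among the seven words seen in spam emails, so the estimate of P(win | spam) is 2/7. A practical classifier would normally add smoothing so that unseen words do not produce zero probabilities.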
Key Concepts:
1. Instance-Based Learning:
2. Distance Metric:
3. Parameter k:
2. Prediction:
2. Calculate Distances:
3. Identify Neighbors:
Example:
Suppose you have a dataset in which each instance is a fruit described
by features such as weight and sweetness. You want to predict the type
of fruit for a new instance.
• Choose k, say k = 3.
• Measure the distances between the new instance and all instances
in the dataset.
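A minimal sketch of these steps for the fruit example is shown below. The Euclidean distance, the majority vote among the k nearest neighbors, the function name `knn_predict`, and the toy measurements are assumptions chosen for illustration; the notes do not fix a particular distance metric or dataset.

```python
import math
from collections import Counter


def knn_predict(dataset, query, k=3):
    """Predict a label by majority vote among the k nearest instances.

    `dataset` is a list of (features, label) pairs; distance is Euclidean.
    """
    # Measure the distance from the query to every stored instance.
    distances = sorted(
        (math.dist(features, query), label) for features, label in dataset
    )
    # Keep the k closest instances and let them vote.
    neighbors = distances[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical (weight in grams, sweetness score) measurements.
fruits = [
    ((150.0, 6.0), "apple"),
    ((160.0, 7.0), "apple"),
    ((170.0, 6.5), "apple"),
    ((70.0, 2.0), "lemon"),
    ((80.0, 3.0), "lemon"),
    ((65.0, 2.5), "lemon"),
]

prediction = knn_predict(fruits, (155.0, 6.5), k=3)
```

With these values the three nearest neighbors of the query are all apples, so the prediction is "apple". Because weight is on a much larger scale than sweetness, it dominates the distance here; in practice the features are usually rescaled first.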
Considerations:
• Choice of k: The value of k affects the algorithm's performance.
A k that is too small makes predictions sensitive to noisy
instances, while a k that is too large oversmooths them and can
blur the boundaries between classes.
splits the dataset into subsets based on the most significant attribute,
resulting in a tree-like structure.
Key Concepts:
1. Decision Tree:
2. Splitting Criteria:
• The algorithm selects the attribute and the split point (or
threshold) that best separates the data into homogeneous
subsets.
3. Recursive Splitting:
4. Classification:
5. Regression:
• The algorithm selects the attribute and split point that best
separates the entire dataset.
2. Splitting:
3. Leaf Nodes:
4. Prediction/Classification:
• For a new instance, it traverses the tree from the root to a leaf,
making decisions based on attribute values.
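The root-to-leaf traversal can be sketched as below. The nested-dictionary representation, the function name `predict`, and the hand-written tree over gender and ticket class are assumptions for illustration only; the tree is not learned from any data.

```python
def predict(node, instance):
    """Walk from the root to a leaf, following the instance's attribute values."""
    while "label" not in node:
        value = instance[node["attribute"]]
        node = node["branches"][value]
    return node["label"]


# Hypothetical hand-built tree: internal nodes test an attribute,
# leaf nodes carry a class label.
tree = {
    "attribute": "gender",
    "branches": {
        "female": {"label": "survived"},
        "male": {
            "attribute": "ticket_class",
            "branches": {
                "first": {"label": "survived"},
                "second": {"label": "did not survive"},
                "third": {"label": "did not survive"},
            },
        },
    },
}

result = predict(tree, {"gender": "male", "ticket_class": "second"})
```

Each loop iteration makes one decision based on a single attribute value, so the number of tests performed equals the depth of the leaf that is reached.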
Example:
Let's consider a classification task where we want to predict whether a
passenger survived or not based on features like age, gender, and
ticket class.
For a regression task, the target might be the price of a house based
on features like the number of bedrooms and square footage.
• The tree would split the dataset based on features to create leaves
that represent predicted house prices.
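As a rough sketch of a single regression split, the code below picks a square-footage threshold and predicts the mean price in each resulting leaf. Minimizing the sum of squared errors and using the leaf mean are common choices assumed here, not details stated in the notes; the names `sse` and `best_split` and the house data are hypothetical.

```python
from statistics import mean


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows):
    """Find the threshold on x that minimizes the total squared error.

    `rows` is a list of (square_footage, price) pairs with at least two
    distinct square-footage values. Returns (threshold, left_mean, right_mean).
    """
    xs = sorted({x for x, _ in rows})
    best = None
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(xs, xs[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in rows if x <= threshold]
        right = [y for x, y in rows if x > threshold]
        error = sse(left) + sse(right)
        if best is None or error < best[0]:
            best = (error, threshold, mean(left), mean(right))
    return best[1:]


# Hypothetical data: (square footage, price in thousands).
houses = [(800, 100), (900, 110), (1000, 120), (2000, 300), (2200, 320)]

threshold, small_house_price, large_house_price = best_split(houses)
```

On this toy data the best threshold is 1500 square feet, and the two leaves predict 110 and 310 (thousand) respectively. A full tree would apply the same search recursively inside each leaf and over every feature.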
Applications:
• Classification: Predicting outcomes like spam or non-spam emails,
customer churn, etc.
2. C4.5:
2. Gain Ratio:
3. Gini Index:
result in the lowest Gini Index are chosen at each node during
the tree-building process. The goal is to create pure nodes
where most instances belong to a single class.
4. Chi-Square:
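To illustrate the Gini Index criterion from item 3 above, the sketch below computes the impurity 1 - sum(p_i^2) of a node and the size-weighted impurity of a candidate split. The function names `gini` and `weighted_gini` and the two toy splits are assumptions invented for the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of one node: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(groups):
    """Gini Index of a split: child impurities weighted by child size."""
    total = sum(len(group) for group in groups)
    return sum(len(group) / total * gini(group) for group in groups)


# Two hypothetical candidate splits of the same six instances.
split_a = [["yes", "yes", "yes"], ["no", "no", "no"]]  # pure children
split_b = [["yes", "no", "yes"], ["no", "yes", "no"]]  # mixed children

gini_a = weighted_gini(split_a)
gini_b = weighted_gini(split_b)
chosen = "split_a" if gini_a < gini_b else "split_b"
```

Split A yields a Gini Index of 0 because both children are pure, while split B yields 4/9, so the lower-Gini split A would be chosen at this node.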
• Apriori Algorithm: