Answer 2022-23 (PYQ)

Group-A: Very Short Answer Type Questions

1. Answer any ten of the following:


(i) ______ is a classification algorithm used to
assign observations to a discrete set of classes.
Answer: Decision Tree (or any appropriate
classification algorithm, such as Logistic Regression
or Naive Bayes).
(ii) The number of nodes in the input layer is 10
and the hidden layer is 5. The maximum number
of connections from the input layer to the hidden
layer are:
Answer: 10 × 5 = 50.
(iii) True or False: Hierarchical clustering is slower
than non-hierarchical clustering? Answer: True.
(iv) True or False: Ensemble learning can only be
applied to supervised learning methods. Answer:
False.
(v) A collection of individual models that learn to
predict a target by combining their strengths and
avoiding the weaknesses of each is called:
Answer: Ensemble Model.
(vi) Semi-supervised learning algorithms deal with
which type of data?
Answer: Data that is a combination of labeled and
unlabeled samples.
(vii) In an election, N candidates are competing
against each other, and people are voting for
either of the candidates. Voters don't
communicate with each other while casting their
votes. Which of the following ensemble methods
works similarly to the above-discussed election
procedure?
Answer: Bagging (Bootstrap Aggregating).
(viii) A feature F1 can take certain values: A, B,
C, D, E, and F, and represents the grade of students
from a college. Feature F1 is an example of a
______ variable.
Answer: Categorical Variable.
(ix) Imagine a newborn starting to learn to walk. It
will try to find a suitable policy for walking
after repeatedly falling and getting up. Which
type of machine learning is best suited to this?
Answer: Reinforcement Learning.
(x) The selling price of a house depends on many
factors. For example, it depends on the number of
bedrooms, number of kitchens, number of
bathrooms, the year the house was built, and the
square footage of the lot. Given these factors,
predicting the selling price of the house is an
example of which type of linear regression?
Answer: Multiple Linear Regression.
(xi) Targeted marketing, Recommender Systems,
and Customer Segmentation are applications of
which type of algorithm?
Answer: Clustering Algorithms.
(xii) The ______ is the difference between a
sample statistic used to estimate a population
parameter and the actual but unknown value of
the parameter.
Answer: Sampling error (bias is the expected,
systematic part of this difference).
Group-B (Short Answer Type Questions)
Answer all of the following (5 marks each):
2. Explain Matrix Factorization and where it is used.
Detailed Answer: Matrix Factorization is a
mathematical technique that decomposes a matrix
into two or more matrices, whose product
approximates the original matrix. It is primarily
used in recommendation systems, especially
collaborative filtering, to discover latent factors
representing user preferences and item attributes.
For example, in movie recommendations, Matrix
Factorization can predict user ratings for unseen
movies by uncovering hidden patterns in user-item
interactions.
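As a hedged illustration, here is a minimal NumPy-only latent-factor sketch trained by stochastic gradient descent; the toy ratings matrix R, the factor count k, and the learning-rate/regularization settings are all assumptions, not values from the source:

```python
import numpy as np

# Toy user-item ratings matrix (0 = unrated); values are illustrative.
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)

n_users, n_items = R.shape
k = 2                 # number of latent factors (assumed)
lr, reg = 0.01, 0.02  # learning rate and regularization (assumed)

rng = np.random.default_rng(0)
P = rng.normal(scale=0.1, size=(n_users, k))  # user factor matrix
Q = rng.normal(scale=0.1, size=(n_items, k))  # item factor matrix

# Gradient descent over observed (non-zero) entries only.
for _ in range(2000):
    for u, i in zip(*R.nonzero()):
        err = R[u, i] - P[u] @ Q[i]
        P[u] += lr * (err * Q[i] - reg * P[u])
        Q[i] += lr * (err * P[u] - reg * Q[i])

# P @ Q.T approximates R; the former zeros are now predicted ratings.
print(np.round(P @ Q.T, 2))
```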
3. Why is ensemble learning used? What is the
general principle of an ensemble method, and
what are bagging and boosting in an ensemble
method?
Detailed Answer:
o Purpose of Ensemble Learning: Ensemble
learning improves model performance by
combining predictions from multiple models
to reduce errors.
o General Principle: It leverages the diversity of
individual models to achieve better accuracy
and robustness than any single model.
o Bagging: It trains multiple models on different
random subsets of data and aggregates their
predictions (e.g., Random Forest). Bagging
reduces variance and prevents overfitting.
o Boosting: It trains models sequentially, where
each new model corrects the errors of the
previous ones (e.g., AdaBoost, Gradient
Boosting). Boosting reduces bias and improves
prediction accuracy.
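A brief sketch contrasting the two ideas, assuming scikit-learn; the synthetic dataset and estimator counts below are illustrative placeholders:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=0)

# Bagging: independent models on bootstrap samples (variance reduction).
bag = BaggingClassifier(n_estimators=50, random_state=0)
# Boosting: sequential models, each focusing on the previous errors (bias reduction).
boost = AdaBoostClassifier(n_estimators=50, random_state=0)

for name, model in [("bagging", bag), ("boosting", boost)]:
    print(name, cross_val_score(model, X, y, cv=5).mean())
```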

5. Compare the K-Means and KNN algorithms.
Answer: K-Means is an unsupervised clustering algorithm that partitions unlabeled data into K clusters by iteratively minimizing within-cluster variance. KNN (K-Nearest Neighbors) is a supervised, instance-based algorithm that predicts the label of a new point from the majority vote (or average) of its k nearest labeled neighbors. The "K" in K-Means counts clusters; the "k" in KNN counts neighbors.


6. Deciding the Value of "k" in the KNN Algorithm
(Expanded)
How to Choose k:
1. Cross-Validation (see the sketch after this list):
o Split the dataset into training and testing
sets.
o Evaluate model accuracy for different k
values.
o Choose the k that minimizes the error rate.
2. Domain Knowledge:
o Use domain-specific insights to decide the
granularity needed for prediction.
3. Data Size:
o Small datasets often require smaller k
values, while larger datasets may benefit
from a larger k.
4. Odd vs. Even k:
o An odd k avoids ties in classification
problems.
o For regression, odd vs. even k matters less,
as predictions are based on averages.
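A minimal sketch of the cross-validation step above, assuming scikit-learn and using the Iris dataset purely as a stand-in:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Score odd k values with 5-fold cross-validation and keep the best one.
scores = {k: cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
          for k in range(1, 22, 2)}
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```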

Why is an Odd k Preferable?


1. Avoiding Ties in Classification:
o When k is even, there's a chance of
equal votes for two classes, leading to
ambiguity.
o An odd k ensures a clear majority vote.
2. Multi-Class Scenarios:
o With multiple classes, an odd k reduces the
possibility of equal votes among several
classes.

Impact of k on Model Performance:


1. Small k (e.g., 1 or 3):
o Advantage: Captures local patterns and
adapts to fine-grained variations.
o Disadvantage: Sensitive to noise and
outliers, leading to overfitting.
2. Large k (e.g., 10 or 20):
o Advantage: Reduces noise impact by
considering more neighbors.
o Disadvantage: May oversmooth the
decision boundary, ignoring local
patterns.

Practical Example:
• A fruit classification task:
o k = 3: A fruit is classified based on the
labels of its 3 nearest neighbors.
o k = 10: A smoother decision boundary is
created, but small clusters may be
ignored.
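The small-versus-large-k contrast can be seen directly in code; this sketch assumes scikit-learn's make_moons data, whose noise level and train/test split are illustrative:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Noisy two-class data; sample size and noise level are placeholders.
X, y = make_moons(n_samples=300, noise=0.3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for k in (3, 10):
    clf = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr)
    # Small k tracks local structure (risking overfitting); large k smooths it.
    print(f"k={k}: train={clf.score(X_tr, y_tr):.2f}, test={clf.score(X_te, y_te):.2f}")
```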

Group-C (Long Answer Type Questions)


Answer all of the following (15 marks each):
7. (a) Discuss the different types of Machine
Learning.
Expanded Answer: Machine Learning is classified
into three main types:
o Supervised Learning:
▪ Definition: Works with labeled data
(input-output pairs), where the algorithm
learns the mapping from inputs to
outputs.
▪ Applications: Examples include
classification tasks like spam detection
and regression tasks like predicting house
prices.
▪ Key Algorithms: Linear Regression,
Logistic Regression, Decision Trees,
Support Vector Machines.
o Unsupervised Learning:
▪ Definition: Works with unlabeled data to
find patterns or structures within the
dataset.
▪ Applications: Clustering (e.g., customer
segmentation), association rule mining,
and dimensionality reduction (e.g., PCA).
▪ Key Algorithms: K-means, DBSCAN,
Hierarchical Clustering.
o Reinforcement Learning:
▪ Definition: Agents learn by interacting
with an environment, receiving rewards or
penalties, and optimizing their actions.
▪ Applications: Game AI (e.g., AlphaGo),
robotics, and self-driving cars.
▪ Key Algorithms: Q-Learning, Deep
Q-Networks, Policy Gradient Methods.
(b) What are parametric and non-parametric models?
Expanded Answer:
o Parametric Models:
▪ Assume a fixed number of parameters,
independent of data size.
▪ Faster to train and easier to interpret.
▪ Examples: Linear Regression, Logistic
Regression, Naive Bayes.
o Non-parametric Models:
▪ Do not assume a fixed parameter count;
they grow with data.
▪ Capture more complex relationships but
may require more data.
▪ Examples: KNN, Decision Trees, Random
Forests.
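A short sketch of the contrast, assuming scikit-learn; the sine-shaped toy data is an illustrative stand-in:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

# Parametric: a fixed set of parameters (slope + intercept), however large the data.
lin = LinearRegression().fit(X, y)
print("parameters:", lin.coef_, lin.intercept_)

# Non-parametric: the "model" is the stored training data itself and grows with it.
knn = KNeighborsRegressor(n_neighbors=5).fit(X, y)
print("R^2 linear:", lin.score(X, y), " R^2 KNN:", knn.score(X, y))
```

The linear model cannot capture the sine shape with two parameters, while KNN fits it closely at the cost of keeping all training points in memory.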
(c) How is machine learning related to AI?
Expanded Answer: Machine Learning is a subset of AI
focused on enabling systems to learn from data
without explicit programming. It forms the foundation
for AI applications like natural language processing,
image recognition, and recommendation systems. AI
encompasses broader areas such as expert systems and
robotics, while ML provides the tools and techniques to
implement intelligent behavior.
8. (a) Explain Generative Mixture Model. Expanded
Answer: A Generative Mixture Model assumes
that data points are generated from a mixture of
several probability distributions, where each
distribution represents a cluster. For example, the
Gaussian Mixture Model (GMM) uses Gaussian
distributions to model clusters. The goal is to
determine the parameters of these distributions
and the probabilities that a data point belongs to a
particular cluster.
(b) With a proper diagram, explain the steps of a
generative mixture model. Expanded Answer:
o Steps:
1. Initialize the parameters (e.g., means,
variances, and weights for each
distribution).
2. E-step: Compute the posterior
probabilities of cluster membership for
each data point.
3. M-step: Update the parameters of the
distributions using these probabilities.
4. Repeat the E and M steps until convergence.
o Diagram: A flowchart can depict data
points being assigned to clusters,
parameters being updated, and
probabilities being recalculated iteratively.
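For reference, scikit-learn's GaussianMixture runs this E/M loop internally; a hedged sketch on synthetic blob data (the cluster count is an assumption):

```python
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Synthetic data drawn from three clusters; counts are illustrative.
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

# fit() alternates the E-step (posterior responsibilities) and the
# M-step (update means, covariances, weights) until convergence.
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)

print(gmm.means_)                # learned cluster means
print(gmm.predict_proba(X[:3]))  # soft cluster memberships for three points
```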
(c) Write down the steps of PCA (Principal
Component Analysis).
Expanded Answer:
o Steps:
1. Standardize the Dataset: Ensure all
features have the same scale.
2. Compute Covariance Matrix: Measure
the relationships between features.
3. Calculate Eigenvalues and Eigenvectors:
Derive principal components.
4. Sort Components: Order eigenvectors by
eigenvalues in descending order.
5. Select Top Components: Retain
components explaining the most variance.
6. Transform Data: Project data onto the
selected components.
o Applications: PCA reduces dimensions in
high-dimensional datasets, aiding
visualization and speeding up algorithms.
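A from-scratch NumPy sketch that mirrors the six steps above; the random dataset and the choice to keep two components are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))            # placeholder dataset

# 1. Standardize the dataset.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)

# 2. Compute the covariance matrix.
cov = np.cov(Xs, rowvar=False)

# 3-4. Eigen-decompose and sort by descending eigenvalue.
vals, vecs = np.linalg.eigh(cov)         # eigh: covariance is symmetric
order = np.argsort(vals)[::-1]
vals, vecs = vals[order], vecs[:, order]

# 5-6. Keep the top-2 components and project the data onto them.
X_reduced = Xs @ vecs[:, :2]
print(X_reduced.shape, vals[:2] / vals.sum())  # shape, explained-variance ratio
```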
9. (a) Explain the Confusion Matrix with respect to
machine learning algorithms with a suitable
example.
Expanded Answer: The Confusion Matrix evaluates
classification performance by comparing actual vs.
predicted outcomes:
o Definitions:
▪ True Positive (TP): Correctly predicted
positive cases.
▪ True Negative (TN): Correctly predicted
negative cases.
▪ False Positive (FP): Incorrectly predicted
positives.
▪ False Negative (FN): Incorrectly predicted
negatives.
o Example: In email spam classification:
▪ TP: Emails correctly classified as spam.
▪ TN: Emails correctly classified as not
spam.
▪ FP: Genuine emails wrongly classified as
spam.
▪ FN: Spam emails wrongly classified as
genuine.
o Performance Metrics: Accuracy, precision, recall,
and F1-score are derived from the matrix.
(b) Calculate the accuracy percentage for the given
Confusion Matrix.
Expanded Answer:
o Given Confusion Matrix:
▪ TP = 12, TN = 9, FP = 3, FN = 1.
o Accuracy Formula: Accuracy = (TP + TN) / (TP + TN + FP + FN)
= (12 + 9) / 25 = 0.84 = 84%.
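The same numbers can be verified in code; this sketch assumes scikit-learn and reconstructs labels that reproduce the stated counts:

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Labels reconstructed to match the given counts: TP=12, TN=9, FP=3, FN=1.
y_true = np.array([1]*12 + [0]*9 + [0]*3 + [1]*1)
y_pred = np.array([1]*12 + [0]*9 + [1]*3 + [0]*1)

print(confusion_matrix(y_true, y_pred))  # rows: actual, cols: predicted
print(accuracy_score(y_true, y_pred))    # (12 + 9) / 25 = 0.84
```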
10. Supervised Feature Selection Techniques and
Related Concepts
(a) Explain three techniques under supervised
feature selection:
• Filter Method:
• Overview: Features are ranked based on their
correlation with the target variable, without
involving a predictive model.
• Example Techniques:
• Chi-square test: Evaluates the dependency
between categorical features and target variables.
• ANOVA (Analysis of Variance): Measures how
different groups (e.g., classes) vary with respect to
numerical features.
• Illustration: In a student-performance dataset, a
chi-square test can check whether a binned
"study hours" feature is significantly related to a
categorical "exam scores" outcome (e.g., pass/fail).
• Advantages: Computationally inexpensive and
fast.
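A hedged filter-method sketch using scikit-learn's SelectKBest with the chi-square score; the Iris dataset merely stands in for the student example:

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

X, y = load_iris(return_X_y=True)  # non-negative features, so chi2 applies

# Rank features by chi-square score against the class label; keep the top 2.
selector = SelectKBest(chi2, k=2).fit(X, y)
print(selector.scores_)                    # per-feature chi-square statistics
print(selector.get_support(indices=True))  # indices of the selected features
```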
• Wrapper Method:
• Overview: Features are selected based on their
contribution to the model's performance. Subsets
of features are iteratively tested.
• Example Techniques:
• Forward Selection: Starts with no features, adding
one at a time that improves model accuracy.
• Backward Elimination: Starts with all features,
removing the least significant one iteratively.
• Illustration: Recursive Feature Elimination (RFE)
evaluates subsets by training a model multiple
times and ranking features by importance.
• Advantages: Finds feature subsets that optimize
the model but can be computationally expensive.
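A wrapper-method sketch, assuming scikit-learn's Recursive Feature Elimination with a logistic-regression base model; the feature counts are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# 10 features, only 4 informative; all numbers are placeholders.
X, y = make_classification(n_samples=300, n_features=10, n_informative=4,
                           random_state=0)

# RFE repeatedly fits the model and drops the weakest feature each round.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=4).fit(X, y)
print(rfe.support_)   # boolean mask of kept features
print(rfe.ranking_)   # 1 = selected; higher = eliminated earlier
```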
• Embedded Method:
• Overview: Feature selection is integrated within
the training process of a machine learning model.
• Example Techniques:
• Lasso Regression (L1 regularization): Penalizes less
relevant features, effectively reducing their
coefficients to zero.

• Decision Trees: Rank features by importance


during tree construction.
Illustration: In Lasso Regression for predicting
house prices, irrelevant features like "zipcode"
are automatically excluded.
• Advantages: Combines the speed of filters with
the accuracy of wrappers.
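An embedded-method sketch, assuming scikit-learn's Lasso on synthetic regression data; the alpha value and feature counts are illustrative:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# 10 features, only 3 carrying signal; settings are placeholders.
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

# L1 regularization drives coefficients of irrelevant features to exactly
# zero, so selection happens inside the training step itself.
lasso = Lasso(alpha=1.0).fit(X, y)
print(np.round(lasso.coef_, 2))  # zeros mark the discarded features
```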

(b) Benefits of Feature Selection in Machine
Learning:
• Reduces Overfitting:
• Removes irrelevant features, preventing the
model from learning noise.
• Illustration: In text classification, removing rarely
used words reduces irrelevant complexity.
• Improves Interpretability:
• Simplifies models, making them easier to
understand and communicate.
• Illustration: A customer churn model with 5 key
predictors (e.g., "call duration" and "complaints")
is more interpretable than one with 50 features.

• Speeds up Training and Inference:
• Fewer features reduce computational cost and
memory usage.
• Illustration: Training a neural network with 10
features is faster than with 1000 features.

(c) Curse of Dimensionality:
• Definition: As dimensions increase, data points
become sparse, and models struggle to
generalize.
• Impacts:
• Distance Metrics Become Less Informative: In
high-dimensional space, the difference between
the nearest and farthest points diminishes.
• Computational Complexity: High dimensions
exponentially increase the resources needed for
processing.
• Illustration: For a dataset with 100 dimensions,
clustering becomes inefficient as all points appear
equidistant.

• Solutions: Use dimensionality reduction
techniques like PCA or t-SNE (see the sketch below).
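A small NumPy demonstration of the distance-concentration effect described above; the sample sizes and dimensions are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

# The ratio of farthest to nearest neighbor distance shrinks toward 1
# as dimensionality grows, making distance metrics less informative.
for d in (2, 10, 100, 1000):
    X = rng.uniform(size=(500, d))
    dists = np.linalg.norm(X - X[0], axis=1)[1:]  # distances from one point
    print(d, round(dists.max() / dists.min(), 2))
```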

11. Artificial Intelligence, Deep Learning, and
Comparisons
(a) What is Artificial Intelligence, and why do we
need it?
• Definition: AI is a field of computer science that
focuses on creating systems capable of tasks that
typically require human intelligence (e.g.,
reasoning, learning, perception).
• Why Needed:
• Automates repetitive tasks, freeing humans for
creative work.
• Solves complex, large-scale problems like
diagnosing diseases and predicting natural
disasters.
• Enhances efficiency and accuracy in industries like
finance, healthcare, and transportation.
(b) What is Deep Learning? Provide real-world
examples.
• Definition: Deep Learning is a subset of ML that
uses multi-layered neural networks to extract and
learn complex data patterns.
• Examples:
• Self-Driving Cars: Use object detection models
like CNNs to identify pedestrians, signs, and lanes.
• Facial Recognition: Identifies faces in security
systems and smartphones.
• Chatbots: Conversational AI like GPT understands
and generates human-like responses.
