
AI&ML

1. Describe a validation technique in machine learning with an illustrative example.

ANS- Validation is a technique for evaluating the efficiency of a model by training it on a subset of the input data and testing it on the remaining, unseen subset.

K-FOLD CROSS-VALIDATION

The steps for k-fold cross-validation are:

o Split the input dataset into K groups.
o For each group:
    Take one group as the reserve or test dataset.
    Use the remaining groups as the training dataset.
    Fit the model on the training set and evaluate its performance using the test set.

An example of 5-fold cross-validation is sketched below.
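For illustration, here is a minimal sketch of 5-fold cross-validation using scikit-learn; the iris dataset and the logistic regression model are assumptions chosen purely for demonstration:

# Minimal 5-fold cross-validation sketch (dataset and model are illustrative).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# cv=5 splits the data into 5 folds; each fold serves once as the test set
scores = cross_val_score(model, X, y, cv=5)
print("Per-fold accuracy:", scores)
print("Mean accuracy:", scores.mean())

Each of the five scores comes from fitting the model on four folds and testing it on the held-out fold.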


Limitations of Cross-Validation

 Under ideal conditions it provides a reliable estimate, but for inconsistent data the estimate may vary drastically across folds.
 In predictive modeling, the data often evolves over time, which can introduce differences between the training and validation sets.

Applications of Cross-Validation

 This technique can be used to compare the performance of different predictive modeling
methods.
 It has great scope in the medical research field.
 It can also be used for meta-analysis.

2. Define confusion matrix and explain its components like TP, TN, FP, FN, Type-1
and Type-2 errors and calculation using confusion matrix.

ANS- The confusion matrix, also known as an error matrix, is a matrix used to determine the performance of classification models for a given set of test data. Features of the confusion matrix are:

o For 2 prediction classes of a classifier, the matrix is a 2×2 table; for 3 classes, a 3×3 table, and so on.

o The matrix is divided into two dimensions, which are the predicted values and the actual values, along with the total number of predictions.
o Predicted values are those values, which are predicted by the model, and actual values are
the true values for the given observations.

o It looks like the table below:

                      Actual: Yes        Actual: No
    Predicted: Yes    True Positive      False Positive
    Predicted: No     False Negative     True Negative

The above table has the following cases:

o True Negative: The model has predicted No, and the real or actual value was also No.

o True Positive: The model has predicted Yes, and the actual value was also Yes.

o False Negative: The model has predicted No, but the actual value was Yes. It is also called a Type-II error.

o False Positive: The model has predicted Yes, but the actual value was No. It is also called a Type-I error.

Need/Advantages of the Confusion Matrix in Machine Learning

o It evaluates the performance of the classification models, when they make predictions on
test data, and tells how good our classification model is.

o It tells not only the errors made by the classifier but also the type of error, i.e., whether it is a Type-I or Type-II error.

Calculation using confusion matrix:


o Classification Accuracy: It defines how often the model predicts the correct output. The formula is:

  Accuracy = (TP + TN) / (TP + TN + FP + FN)

o Misclassification rate/error rate: It defines how often the model gives wrong predictions. The formula is:

  Error rate = (FP + FN) / (TP + TN + FP + FN)

o Precision: It is the proportion of the model's positive predictions that are actually correct. The formula is:

  Precision = TP / (TP + FP)

o Recall: It is the proportion of the total actual positive classes that the model predicts correctly. The recall should be as high as possible. The formula is:

  Recall = TP / (TP + FN)

o F-measure: This score helps us to evaluate recall and precision at the same time. The F-score is maximum when recall equals precision. The formula is:

  F-measure = (2 × Recall × Precision) / (Recall + Precision)
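As a hedged sketch, the metrics above can be computed from a confusion matrix as follows; the label vectors are hypothetical examples:

# Compute the above metrics from a 2x2 confusion matrix (labels are made up).
from sklearn.metrics import confusion_matrix

y_actual    = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_predicted = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

# ravel() flattens scikit-learn's 2x2 matrix in the order TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_actual, y_predicted).ravel()

accuracy   = (tp + tn) / (tp + tn + fp + fn)
error_rate = (fp + fn) / (tp + tn + fp + fn)
precision  = tp / (tp + fp)
recall     = tp / (tp + fn)
f_measure  = 2 * precision * recall / (precision + recall)
print(accuracy, error_rate, precision, recall, f_measure)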

3. Explain ROC-AUC curve.

ANS- The AUC-ROC curve is a performance measurement metric of a classification model at different threshold values.

ROC Curve:

ROC or Receiver Operating Characteristic curve represents a probability graph to show the
performance of a classification model at different threshold levels. The curve is plotted
between two parameters, which are:

o True Positive Rate or TPR

o False Positive Rate or FPR

In the curve, TPR is plotted on the Y-axis, whereas FPR is on the X-axis.

TPR or True Positive Rate can be calculated as:

  TPR = TP / (TP + FN)

FPR or False Positive Rate can be calculated as:

  FPR = FP / (FP + TN)

Here, TP: True Positive

FP: False Positive

TN: True Negative

FN: False Negative

AUC Curve:

AUC stands for Area Under the ROC Curve. It calculates the two-dimensional area under the entire ROC curve, ranging from (0,0) to (1,1).
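As an illustrative sketch, the ROC points and AUC can be computed with scikit-learn; the true labels and predicted scores below are hypothetical:

# Compute ROC curve points and AUC (labels and scores are made up).
from sklearn.metrics import roc_curve, roc_auc_score

y_true  = [0, 0, 1, 1, 0, 1, 1, 0]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.65, 0.3]  # predicted probability of class 1

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # one (FPR, TPR) point per threshold
auc = roc_auc_score(y_true, y_score)               # area under the ROC curve
print("FPR:", fpr)
print("TPR:", tpr)
print("AUC:", auc)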

Applications:

1. Classification of 3D models
The curve can be used to classify 3D models and separate them from normal models.

2. Healthcare
The curve has various applications in the healthcare sector. For example, it can be used to detect cancer in patients using the false positive and false negative rates; accuracy depends on the threshold value used for the curve.

3. Binary Classification

AUC-ROC curve is mainly used for binary classification problems to evaluate their
performance.

When to Use AUC-ROC

AUC is preferred in the following cases:

o AUC is used to measure how well the predictions are ranked instead of giving their
absolute values. Hence, we can say AUC is Scale-Invariant.

o It measures the quality of predictions of the model without considering the selected
classification threshold. It means AUC is classification-threshold-invariant.

When not to use AUC-ROC

o AUC is not preferable when we need to calibrate probability output.

o Further, AUC is not a useful metric when there are wide disparities in the cost of false
negatives vs false positives, and it is difficult to minimize one type of classification error.

4. Define clustering.

ANS- Clustering or cluster analysis is an unsupervised machine learning technique which groups an unlabelled dataset. It can be defined as "A way of grouping the data points into different clusters, consisting of similar data points. The objects with possible similarities remain in a group that has few or no similarities with another group."

e.g., grouping documents according to topic, or the recommendation systems of Amazon and Netflix.

The clustering methods are broadly divided into Hard clustering (a data point belongs to only one group) and Soft clustering (a data point can also belong to another group). Besides these, various other clustering approaches exist, such as partitioning, density-based, distribution model-based, hierarchical, and fuzzy clustering.

5. Explain the concept of K-means.

ANS- It is an iterative, centroid-based unsupervised learning algorithm that divides the unlabeled dataset into k different clusters in such a way that each data point belongs to only one group of points with similar properties.

The algorithm mainly performs two tasks:

o Determines the best value for the K center points or centroids by an iterative process.

o Assigns each data point to its closest k-center. The data points near a particular k-center form a cluster.

Working:

Step-1: Select the number K to decide the number of clusters.

Step-2: Select K random points or centroids. (These can be points other than those in the input dataset.)

Step-3: Assign each data point to their closest centroid, which will form the predefined K
clusters.

Step-4: Calculate the variance and place a new centroid of each cluster.

Step-5: Repeat the third step, which means reassigning each data point to the new closest centroid of each cluster.

Step-6: If any reassignment occurs, then go to step-4 else go to FINISH.

Step-7: The model is ready.


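As a concrete example, here is a minimal sketch using scikit-learn's KMeans on synthetic 2-D data; the blob dataset and K = 3 are illustrative assumptions:

# K-means on synthetic data; dataset and K are chosen for illustration.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=42)  # synthetic 2-D points

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)  # Step-1: K = 3
labels = kmeans.fit_predict(X)  # Steps 2-6: iterate until assignments stop changing

print("Cluster centroids:\n", kmeans.cluster_centers_)
print("First 10 labels:", labels[:10])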

6. Write the algorithm of K-medoid method and its advantages and disadvantages. (for
numerical y/t)

ANS-

1. Initialize: select k random points out of the n data points as the medoids.

2. Associate each data point to the closest medoid by using any common distance metric
methods.

3. While the cost decreases: for each medoid m, and for each data point o which is not a medoid:

 Swap m and o, associate each data point with its closest medoid, and recompute the cost.
 If the total cost is more than that in the previous step, undo the swap.
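Below is a minimal, unoptimized NumPy sketch of this swap-based (PAM-style) procedure; the function name and parameters are illustrative assumptions, not a canonical implementation:

import numpy as np

def k_medoids(X, k, max_iter=100, seed=None):
    # Minimal PAM-style sketch: illustrative, not optimized.
    rng = np.random.default_rng(seed)
    n = len(X)
    medoids = rng.choice(n, size=k, replace=False)                 # 1. random medoids
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances

    def total_cost(meds):
        # 2. associate each point with its closest medoid and sum the distances
        return dist[:, meds].min(axis=1).sum()

    cost = total_cost(medoids)
    for _ in range(max_iter):                     # 3. while the cost decreases
        improved = False
        for mi in range(k):
            for o in range(n):
                if o in medoids:
                    continue
                candidate = medoids.copy()
                candidate[mi] = o                 # swap medoid m with non-medoid o
                new_cost = total_cost(candidate)
                if new_cost < cost:               # keep only cost-reducing swaps
                    medoids, cost = candidate, new_cost
                    improved = True
        if not improved:
            break
    return medoids, dist[:, medoids].argmin(axis=1)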

Advantages:

1. It is simple to understand and easy to implement.

2. K-Medoid Algorithm is fast.

3. PAM is less sensitive to outliers than other partitioning algorithms.

Disadvantages:

1. It is not suitable for clustering non-spherical (arbitrarily shaped) groups of objects. This is
because it uses compactness as clustering criteria instead of connectivity.

2. It may obtain different results for different runs on the same dataset because the first k
medoids are chosen randomly.

7. Explain Density-Based Spatial Clustering Of Applications With Noise (DBSCAN)


algorithm.

ANS- Partitioning methods (K-means, PAM clustering) and hierarchical clustering are suitable only for compact and well-separated clusters, and they are also severely affected by the presence of noise and outliers in the data. Real-life data may contain irregularities: clusters can be of arbitrary shape, and data may contain noise. Thus we use DBSCAN in such cases.

Parameters Required For DBSCAN Algorithm

1. eps: It defines the neighborhood around a data point. If it is chosen very large, the clusters will merge and the majority of the data points will end up in the same cluster. To find the eps value, we use the k-distance graph.

2. MinPts: The minimum number of neighbors (data points) within the eps radius. The larger the dataset, the larger the value of MinPts that should be chosen. MinPts can be derived from the number of dimensions D in the dataset as MinPts >= D + 1. The minimum value of MinPts should be at least 3.

STEPS USED IN DBSCAN ALGORITHM


1. Find all the neighbor points within eps and identify the core points, i.e., the points with more than MinPts neighbors.
2. For each core point if it is not already assigned to a cluster, create a new cluster.
3. Find recursively all its density-connected points and assign them to the same cluster
as the core point.
4. Iterate through the remaining unvisited points in the dataset.

After this, we end up with three types of points:

 Core — a point that has at least MinPts points within distance eps of itself.
 Border — a point that has at least one core point within distance eps.
 Noise — a point that is neither a core nor a border point.
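A minimal sketch with scikit-learn's DBSCAN, assuming synthetic two-moon data and illustrative eps/MinPts values:

# DBSCAN on arbitrary-shaped clusters; data and parameters are illustrative.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

X, _ = make_moons(n_samples=300, noise=0.05, random_state=42)  # non-spherical clusters

db = DBSCAN(eps=0.2, min_samples=5).fit(X)  # eps and MinPts as described above
labels = db.labels_                         # label -1 marks noise points

print("Clusters found:", len(set(labels)) - (1 if -1 in labels else 0))
print("Noise points:", int(np.sum(labels == -1)))
print("Core points:", len(db.core_sample_indices_))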

8. Discuss the types of machine learning technique.


ANS- Machine learning involves showing a large volume of data to a machine so that it can
learn and make predictions, find patterns, or classify data. The three machine learning types
are supervised, unsupervised, and reinforcement learning.
1. Supervised Learning:
 In this technique, the algorithm learns from labeled training data.
 Each input is paired with the corresponding output, the goal is for the
algorithm to learn the mapping from inputs to outputs, enabling it to make
predictions or decisions on new, unseen data.
 Common algorithms include linear regression, decision trees, support
vector machines (SVM), and neural networks.
Supervised learning examples/applications
 Predicting real estate prices.
 Classifying whether bank transactions are fraudulent or not.
 Finding disease risk factors.
 Determining whether loan applicants are low-risk or high-risk.
2. Unsupervised Learning:
 The algorithm works with unlabeled data, seeking to find patterns or
intrinsic structures within the data.
 It explores the data to learn about its properties, clusters similar data points,
or reduces the dimensionality for easier visualization or processing.
 Clustering algorithms like k-means, hierarchical clustering, and principal
component analysis (PCA) fall under this category.
Unsupervised learning examples/applications
 Creating customer groups based on purchase behavior.
 Grouping inventory according to sales and/or manufacturing metrics.
 Pinpointing associations in customer data.
3. Reinforcement Learning:
 This technique involves an agent that learns to make decisions by interacting
with an environment.
 The agent learns from feedback in the form of rewards or penalties as it
navigates through the environment.
 Algorithms in reinforcement learning include Q-learning, deep Q-networks
(DQN), and policy gradients.

Reinforcement learning examples/applications


 Teaching cars to park themselves and drive autonomously.
 Dynamically controlling traffic lights to reduce traffic jams.
 Training robots to learn policies from raw video images as input, which they can use to replicate the actions they see.

9. Explain in brief the concept of K-NN.

ANS- K-Nearest Neighbors (K-NN) is a non-parametric machine learning algorithm based on the supervised learning technique. The K-NN algorithm assumes similarity between the new case/data and the available cases and puts the new case into the category most similar to the available categories. It is also called a lazy learner algorithm because it does not learn from the training set immediately.

Algorithm:
Step-1: Select the number K of the neighbors.

Step-2: Calculate the Euclidean distance of K number of neighbors.

Step-3: Take the K nearest neighbors as per the calculated Euclidean distance.

Step-4: Among these k neighbors, count the number of the data points in each category.

Step-5: Assign the new data points to that category for which the number of the neighbor is
maximum.

Step-6: Our model is ready.

Euclidean Distance: between two points A(x1, y1) and B(x2, y2), the distance is sqrt((x2 - x1)^2 + (y2 - y1)^2).

Points to remember while selecting the value of K:

o An odd number should be preferred.

o A very low value such as 1 or 2 can be noisy.
o A large value may cause some difficulties.
o The most preferred value of K is 5.
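Putting the steps together, here is a minimal sketch with scikit-learn's KNeighborsClassifier; the iris dataset and K = 5 are illustrative assumptions:

# K-NN classification sketch; dataset and K are chosen for illustration.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Euclidean distance is the default metric; the K = 5 neighbors vote on the class
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
print("Test accuracy:", knn.score(X_test, y_test))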

Advantages of KNN Algorithm:

o It is simple to implement.

o It is robust to noisy training data.

o It can be more effective if the training data is large.

Disadvantages of KNN Algorithm:

o It always needs the value of K to be determined, which may be complex at times.

o The computation cost is high.

Applications of KNN

Banking systems, calculating credit card ratings, politics, speech recognition, handwriting detection, image recognition, and video recognition.

10. Explain the concept of Decision trees.

ANS- Decision Tree is a Supervised learning technique that can be used for both
classification and Regression problems. It is called a decision tree because, similar to a tree,
it starts with the root node, which expands on further branches and constructs a tree-like
structure. In order to build a tree, we use the CART algorithm, which stands
for Classification and Regression Tree algorithm. A decision tree simply asks a question,
and based on the answer (Yes/No), it further splits the tree into subtrees.
In a decision tree:

Internal nodes represent the features of a dataset.

Branches represent the decision rules.

Each leaf node represents the outcome.

Algorithm:

Step-1: Begin the tree with the root node, say S, which contains the complete dataset.

Step-2: Find the best attribute in the dataset using Attribute Selection Measure (ASM).

Step-3: Divide S into subsets that contain possible values for the best attributes.

Step-4: Generate the decision tree node, which contains the best attribute.

Step-5: Recursively make new decision trees using the subsets of the dataset created in Step-3. Continue this process until a stage is reached where the nodes cannot be classified further; these final nodes are called leaf nodes.
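As an illustration of the algorithm above, here is a minimal sketch with scikit-learn's DecisionTreeClassifier, which implements CART; the dataset and depth limit are assumptions chosen for demonstration:

# CART decision tree sketch; dataset and max_depth are illustrative.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

tree = DecisionTreeClassifier(max_depth=3, random_state=42)
tree.fit(X_train, y_train)  # Steps 1-5: recursively split on the best attributes

print("Test accuracy:", tree.score(X_test, y_test))
print(export_text(tree, feature_names=load_iris().feature_names))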

Advantages of the Decision Tree


o It is simple to understand.

o It can be very useful for solving decision-related problems.

o It helps to think about all the possible outcomes for a problem.

Disadvantages of the Decision Tree

o The decision tree contains lots of layers, which makes it complex.

o It may have an overfitting issue, which can be resolved using the Random Forest
algorithm.

o For more class labels, the computational complexity of the decision tree may increase.
