Machine Learning GNIT Suggestions

Uploaded by

atik pal
MACHINE LEARNING: GNIT SUGGESTIONS

1. Define Artificial Neural Network. Write down the applications of ANN in Machine
Learning.
 An artificial neural network (ANN) is a machine learning model loosely modelled on the
human brain: layers of interconnected units (neurons) combine weighted inputs and pass the
result through an activation function. ANNs are the basis of deep learning, which is a
subfield of machine learning and, in turn, of artificial intelligence.
 ANNs have driven much of the progress in artificial intelligence over the past decade,
with notable success in image recognition, self-driving cars, and medical diagnosis.
 Image Recognition: ANNs excel at recognizing patterns and features in images. They
have been successfully applied to tasks such as identifying objects, faces, and scenes in
photographs or videos.
 Speech Recognition: ANNs play a crucial role in converting spoken language into text.
They are used in voice assistants, transcription services, and other speech-related
applications.
 Medical Diagnosis: ANNs assist in diagnosing diseases based on medical data. They
analyse patient records, images (such as X-rays or MRIs), and other relevant information
to help clinicians reach a diagnosis.
2. Compare and contrast Biological Neural Network and Computer Networks with proper
diagrams.
 Difference:

ASPECT | BIOLOGICAL NEURAL NETWORK (BNN) | COMPUTER NETWORK (CN)
Basic Unit | Neuron | Node (computer, router, etc.)
Communication | Electrochemical signals | Digital signals (bits)
Processing | Parallel | Serial
Adaptability | High (synaptic plasticity) | Low (configurations set)
Fault Tolerance | Yes (redundancy in connections) | Yes (redundancy in links)
Configuration | Adaptive | Static (mostly)

3. Define Machine Learning.


 Machine Learning is the field of study that gives computers the capability to learn without
being explicitly programmed. Machine learning (ML) is a branch of artificial intelligence
(AI) and computer science that focuses on using data and algorithms to enable AI to imitate
the way that humans learn, gradually improving its accuracy.
4. Explain different perspectives and issues in Machine Learning.
 Perspectives of Machine Learning:
 Machine learning can be viewed as a search through a very large space of possible
hypotheses for the one that best fits the observed data and any prior knowledge held by
the learner.
 Issues in Machine Learning:
 What algorithms should be used?
 Which algorithms perform best for which types of problems?
 How much training data is sufficient, and how much testing data?
 What kinds of methods should be used?
 What methods should be used to reduce learning overhead?
 Which methods should be used for which types of data?
5. Define Concept Learning. Tasks of concept learning.
6. Explain how the concept learning can be viewed as a task of searching.
 Concept learning can be viewed as the task of searching through a large space of hypotheses
implicitly defined by the hypothesis representation. The goal of this search is to find the
hypothesis that best fits the training examples.
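 To make the search view concrete, here is a minimal sketch of Find-S (one of the algorithms
named in question 11), which starts from the most specific hypothesis and generalises it only
as far as the positive examples require. The attribute names and training examples are
hypothetical, chosen only for illustration.

```python
def find_s(examples):
    """Return the most specific hypothesis consistent with the positive examples.

    examples: list of (attributes, label) pairs; label is True for a positive example.
    A '?' in the hypothesis means "any value is acceptable" for that attribute.
    """
    hypothesis = None  # most specific hypothesis: covers no example yet
    for attributes, label in examples:
        if not label:
            continue  # Find-S ignores negative examples
        if hypothesis is None:
            hypothesis = list(attributes)  # first positive example, copied as-is
        else:
            # Generalise every attribute that disagrees with this positive example.
            hypothesis = [h if h == a else "?" for h, a in zip(hypothesis, attributes)]
    return hypothesis


# Hypothetical training data: (sky, temperature, wind) -> enjoys sport?
training = [
    (("sunny", "warm", "strong"), True),
    (("sunny", "cold", "strong"), True),
    (("rainy", "cold", "weak"), False),
]
print(find_s(training))  # ['sunny', '?', 'strong']
```

 Each positive example moves the hypothesis one step through the hypothesis space, from
specific towards general, which is the search described above.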
7. Differentiate between Classification and Regression in Machine Learning.

SL. NO. | CLASSIFICATION | REGRESSION
1. | The target variables are discrete. | The target variables are continuous.
2. | We try to find the best possible decision boundary, which separates the two classes with the maximum possible separation. | We try to find the best-fit line, which represents the overall trend in the data.
3. | The output variable has to be a discrete value. | The output variable has to be a continuous (real) value.
4. | Classification algorithms can be further divided into Binary Classifiers and Multi-class Classifiers. | Regression algorithms can be further divided into Linear and Non-linear Regression.
5. | Example use cases are spam detection, image recognition, and sentiment analysis. | Example use cases are stock price prediction, house price prediction, and demand forecasting.

8. Explain briefly the working principle of Logistic Regression algorithm.

 Logistic regression is used for binary classification. It computes a weighted sum of the
independent variables, z = w·x + b, and passes it through the sigmoid function to produce a
probability between 0 and 1. The predicted class is 1 if this probability reaches a
threshold (commonly 0.5) and 0 otherwise.
 The sigmoid function, σ(z) = 1 / (1 + e^(−z)), has a characteristic "S"-shaped curve. It
maps any real number to a value between 0 and 1.
 The sigmoid function can also be used as an activation function in the hidden layers of a
neural network, where it takes the output of the previous layer and squashes it into the
range 0 to 1.
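 The prediction step of logistic regression can be sketched as follows. This is a minimal
illustration, not a full training routine: the weights and bias are hypothetical, hand-picked
values rather than parameters learned from data.

```python
import math


def sigmoid(z):
    """Map a real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Probability of class 1: sigmoid of the weighted sum of the inputs."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def predict_class(features, weights, bias, threshold=0.5):
    """Turn the probability into a 0/1 class label using a threshold."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0


# Hypothetical parameters and input (assumed for illustration only).
weights, bias = [1.5, -0.5], -1.0
sample = [2.0, 1.0]  # weighted sum z = 1.5*2.0 - 0.5*1.0 - 1.0 = 1.5

print(predict_proba(sample, weights, bias))  # roughly 0.82
print(predict_class(sample, weights, bias))  # 1
```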
9. Differentiate between Supervised and Unsupervised Learning.

SL. NO. | SUPERVISED LEARNING | UNSUPERVISED LEARNING
1. | Uses known, labelled data as input | Uses unlabelled data as input
2. | Less computational complexity | More computational complexity
3. | Uses off-line analysis | Uses real-time analysis of data
4. | The number of classes is known | The number of classes is not known
5. | Accurate and reliable results | Moderately accurate and reliable results

10. What do you mean by a well-defined learning problem? Explain with an example.
 Well-Posed Learning Problem: A computer program is said to learn from experience E with
respect to some task T and performance measure P if its performance on T, as measured by
P, improves with experience E.
 A problem can be classified as a well-posed learning problem if it has three traits:
 Task
 Performance Measure
 Experience
 A Checkers Learning Problem:
 Task: playing checkers
 Performance Measure: percentage of games won against opponents
 Experience: playing practice games against itself
11. Find-S algorithm and Candidate Elimination algorithm
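12. What is Perceptron model?
 A perceptron is the simplest form of artificial neural network: a single neuron that
computes a weighted sum of its inputs plus a bias and applies a step function to output one
of two classes. It is therefore a linear binary classifier. Its weights are learned by
adjusting them whenever a training example is misclassified.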
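 In the neural-network sense, a perceptron is a single neuron trained with a simple
error-driven rule. The sketch below is a minimal illustration that learns the logical AND
function; the function names, learning rate and epoch count are assumptions made for this
example.

```python
def perceptron_predict(inputs, weights, bias):
    """Step activation: output 1 if the weighted sum plus bias is positive, else 0."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, epochs=10, learning_rate=1.0):
    """Perceptron learning rule: nudge the weights only when a sample is misclassified."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - perceptron_predict(inputs, weights, bias)  # -1, 0 or +1
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Truth table of logical AND, a linearly separable problem.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
for inputs, target in and_gate:
    print(inputs, perceptron_predict(inputs, weights, bias), "expected", target)
```

 A single perceptron can only learn linearly separable functions such as AND or OR; it
cannot learn XOR.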
13. What is instance based learning?
 Instance-based learning involves using the entire dataset to make predictions. The machine
learns by storing all instances of data and then using these instances to make predictions on
new data. The machine compares the new data to the instances it has seen before and uses
the closest match to make a prediction.
 In instance-based learning, no model is created. Instead, the machine stores all of the
training data and uses this data to make predictions based on new data. Instance-based
learning is often used in pattern recognition, clustering, and anomaly detection.
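 k-nearest neighbours (KNN) is a common example of instance-based learning. The sketch below
stores the training points as they are and classifies a new point by a majority vote among
its k closest neighbours. The points and labels are hypothetical and are not the dataset
referred to in question 22.

```python
import math
from collections import Counter


def knn_predict(training, query, k=3):
    """Classify `query` by majority vote among its k nearest stored instances.

    training: list of (point, label) pairs; no model is built, the data is the model.
    """
    by_distance = sorted(training, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical 2-D points with two classes.
training = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"), ((7.0, 7.5), "B"), ((6.5, 5.0), "B"),
]
print(knn_predict(training, (2.0, 2.0)))  # A
print(knn_predict(training, (6.0, 5.5)))  # B
```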
14. Explain the role of Mean Squared Error (MSE) in Machine Learning in brief.
 In the fields of regression analysis and machine learning, the Mean Square Error (MSE) is a
crucial metric for evaluating the performance of predictive models. It measures the average
squared difference between the predicted and the actual target values within a dataset. The
primary objective of the MSE is to assess the quality of a model's predictions by measuring
how closely they align with the ground truth.
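 Written as a formula, MSE = (1/n) Σ (y_i − ŷ_i)², where y_i is the actual value and ŷ_i the
predicted value. A minimal sketch of the calculation, using hypothetical values:

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical target values and model predictions.
actual = [3.0, 5.0, 7.0]
predicted = [2.5, 5.0, 8.0]

# Squared errors are 0.25, 0.0 and 1.0, so the mean is 1.25 / 3.
print(mean_squared_error(actual, predicted))
```

 Because the errors are squared, large mistakes are penalised more heavily than small ones.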
15. Highlight the role of gradient descent in linear regression.
 Training can be posed as an optimization problem in which the goal is to minimize a cost
function E with respect to a number of free variables, usually the weights w_i. In linear
regression, E is typically the mean squared error between the predicted and actual values.
The gradient descent algorithm begins from an initialization of the weights (e.g. a random
initialization) and iteratively updates each weight w_i by a quantity Δw_i, where
Δw_i = −α (∂E/∂w_i). Here ∂E/∂w_i is the gradient of the cost function with respect to the
weight, and α is the learning rate, a small positive constant that keeps the updates small
and avoids oscillations.
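 The update rule can be sketched for a one-variable linear regression y ≈ w·x + b with the
mean squared error as the cost. The data, learning rate and number of steps below are
assumptions chosen for illustration.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0  # initialization (zeros here instead of random values)
    n = len(xs)
    for _ in range(steps):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of E = (1/n) * sum(error^2) with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient, scaled by the learning rate alpha.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))  # 2.0 1.0
```

 With a learning rate that is too large the updates overshoot and diverge; with one that is
too small, convergence becomes very slow.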
16. Numerical problems from confusion matrix, given actual value, predicted value. Calculate
accuracy, precision, recall, f1-score, and specificity.
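 These metrics follow directly from the four confusion-matrix counts (TP, FP, FN, TN). A
minimal sketch, using hypothetical actual and predicted labels since the question's own data
is not reproduced here:

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    return tp, fp, fn, tn


def classification_metrics(tp, fp, fn, tn):
    """Standard metrics; assumes no denominator is zero."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also called sensitivity
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical labels: 1 = positive class, 0 = negative class.
actual = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

counts = confusion_counts(actual, predicted)  # (TP, FP, FN, TN) = (3, 1, 2, 4)
metrics = classification_metrics(*counts)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```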
17. Explain the concepts of entropy and information gain with examples.
 Entropy: Entropy is a measure of uncertainty or disorder in a dataset. In machine learning,
it quantifies how mixed or random the labels or classes in a dataset are. High entropy
indicates high disorder or randomness, while low entropy suggests a more organized and
predictable dataset.
 Information Gain: Entropy plays a critical role in decision tree algorithms. Information
gain, a metric used in decision trees, measures the reduction in entropy achieved by splitting
a dataset on a particular feature. The goal is to maximize information gain, as it helps create
more informative and accurate decision trees.
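 In formulas, H(S) = −Σ p_i log2(p_i), where p_i is the proportion of class i in S, and
Gain(S, A) = H(S) − Σ_v (|S_v| / |S|) H(S_v), where S_v is the subset of S with value v for
feature A. The sketch below applies both to a hypothetical set of 14 labels split into three
groups by some feature.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(labels, groups):
    """Entropy of the whole set minus the size-weighted entropy of the groups."""
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - remainder


# Hypothetical dataset: 9 "yes" and 5 "no" labels, split three ways by a feature.
groups = [
    ["yes"] * 2 + ["no"] * 3,
    ["yes"] * 4,
    ["yes"] * 3 + ["no"] * 2,
]
labels = [label for group in groups for label in group]

print(round(entropy(labels), 3))                   # 0.94
print(round(information_gain(labels, groups), 3))  # 0.247
```

 The second group is pure (all "yes"), so its entropy is 0; splits that create such pure
groups produce the largest information gain.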
18. How does the Random Forest algorithm work? Explain with an example.
 Random Forest is a popular supervised machine learning algorithm. It can be used for both
Classification and Regression problems in ML. It is based on the concept of ensemble
learning, which is the process of combining multiple models to solve a complex problem and
to improve performance.
 A Random Forest trains a number of decision trees on various random subsets of the given
dataset and combines their predictions, by majority vote for classification or by averaging
for regression, to improve predictive accuracy.
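 The ensemble idea can be shown with a deliberately simplified sketch. It is not a full
Random Forest: each "tree" is a one-split stump trained on a bootstrap sample and a single
randomly chosen feature, and the forest predicts by majority vote. The data, function names
and parameters are assumptions made for this example.

```python
import random
from collections import Counter


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def train_stump(rows, feature):
    """Fit a one-split tree on one feature, choosing the threshold with fewest errors."""
    values = sorted({x[feature] for x, _ in rows})
    # Candidate thresholds are midpoints between neighbouring distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])] or [values[0]]
    best = None
    for threshold in candidates:
        left = [y for x, y in rows if x[feature] <= threshold]
        right = [y for x, y in rows if x[feature] > threshold]
        left_label = majority(left)
        right_label = majority(right) if right else left_label
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, feature, threshold, left_label, right_label)
    return best[1:]


def train_forest(rows, n_trees=15, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]  # bootstrap: sample with replacement
        feature = rng.randrange(len(rows[0][0]))   # random feature for this stump
        forest.append(train_stump(sample, feature))
    return forest


def forest_predict(forest, x):
    """Each stump votes; the most common vote is the forest's prediction."""
    votes = [left if x[f] <= t else right for f, t, left, right in forest]
    return majority(votes)


# Hypothetical 2-feature data with two well-separated classes.
rows = [
    ((1.0, 1.5), "A"), ((2.0, 1.0), "A"), ((1.5, 2.5), "A"), ((3.0, 2.0), "A"),
    ((7.0, 8.0), "B"), ((8.0, 7.5), "B"), ((9.0, 9.0), "B"), ((7.5, 8.5), "B"),
]
forest = train_forest(rows)
print(forest_predict(forest, (2.0, 2.0)))  # A
print(forest_predict(forest, (8.0, 8.0)))  # B
```

 Because every stump sees a different sample and feature, individual stumps can be wrong in
different ways, and the vote averages those errors out.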
19. Briefly explain hyperplane of Support Vector Machine (SVM) classification algorithm
with a neat diagram.
 Hyperplane: There can be multiple lines/decision boundaries that segregate the classes in
n-dimensional space, but we need to find the best decision boundary for classifying the
data points. This best boundary is known as the hyperplane of SVM: the boundary w·x + b = 0
that maximizes the margin, i.e. the distance to the nearest data points of each class (the
support vectors).
20. Apply ID3 algorithm for constructing decision tree for the following training example:

21. Draw a comparative study of clustering and classification with examples.

SL. NO. | CLASSIFICATION | CLUSTERING
1. | Used for supervised learning | Used for unsupervised learning
2. | Process of classifying the input instances based on their corresponding class labels | Grouping the instances based on their similarity without the help of class labels
3. | It has labels so there is need of training and testing dataset for verifying the model created | There is no need of training and testing dataset
4. | More complex as compared to clustering | Less complex as compared to classification
5. | Logistic regression, Naive Bayes classifier, Support vector machines, etc. | K-means clustering algorithm, Fuzzy c-means clustering algorithm, Gaussian (EM) clustering algorithm, etc.

22. On the basis of the above-given data, classify the below set of data as "Underweight" or
"Normal" using KNN algorithm. [K = 3, Weight = 57, Height = 170]

23. Explain the following distance metrics with proper examples:


a) Euclidean distance
 Euclidean distance measures the straight-line distance between two points in Euclidean
space.
 Formula:

$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$

b) Manhattan distance
 Also known as L1 distance, Manhattan distance measures the sum of absolute
differences between corresponding coordinates.
 Formula:
$$d(x, y) = \sum_{i=1}^{n} |x_i - y_i|$$

c) Minkowski distance
 Minkowski distance is a generalization of both Euclidean and Manhattan distances.
 It depends on a parameter ‘p’:
 When p = 1, it reduces to Manhattan distance.
 When p = 2, it becomes Euclidean distance.
 Formula:

$$d(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}$$
d) Hamming distance
 Hamming distance is used for sequences of equal length (e.g., binary vectors, strings,
DNA sequences).
 It counts the number of positions at which the two sequences differ.
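 The four metrics in this question can be sketched in a few lines, with small examples for
each. The points and strings are hypothetical values chosen so the results are easy to check
by hand.

```python
def euclidean(x, y):
    """Straight-line distance: square root of the sum of squared differences."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5


def manhattan(x, y):
    """L1 distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def minkowski(x, y, p):
    """Generalisation: p = 1 gives Manhattan, p = 2 gives Euclidean."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)


def hamming(x, y):
    """Number of positions at which two equal-length sequences differ."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(x, y))


p1, p2 = (0, 0), (3, 4)
print(euclidean(p1, p2))              # 5.0
print(manhattan(p1, p2))              # 7
print(minkowski(p1, p2, 1))           # 7.0
print(minkowski(p1, p2, 2))           # 5.0
print(hamming("karolin", "kathrin"))  # 3
```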
24. Using the K-Means Clustering algorithm, form two clusters (k1, k2) and assign the above
data points to the appropriate clusters.

25. If the weather is sunny, can a player play golf? Find out by implementing the Naive Bayes
classifier algorithm.

26. Following is the dataset mentioning details of stolen vehicles. Using Naive Bayes classifier,
classify the new data (Red, SUV, Domestic) i.e., whether this particular vehicle is stolen or
not.
