Machine Learning
1. Define machine learning. What are the different applications of ML? What
is the difference between traditional programming and ML?
applications of ML
https://www.javatpoint.com/applications-of-machine-learning
https://www.enjoyalgorithms.com/blog/introduction-to-machine-learning
https://www.javatpoint.com/regression-vs-classification-in-machine-learning
4. Define the terms variance and bias. Explain the trade-off between variance
and bias.
bias
difference between the average prediction of our model and the correct
value which we are trying to predict.
variance
variability of the model's prediction for a given data point, i.e. a value
which tells us the spread of the model's predictions
💡 formula
high bias and low variance ⇒ simple model with few parameters (tends to
underfit)
high variance and low bias ⇒ complex model with a large number of
parameters (tends to overfit)
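As a hedged reconstruction of the bias and variance definitions above, assuming squared-error loss and data generated as y = f(x) + ε with zero-mean noise of variance σ² (the symbols f, \hat f, ε and σ² are assumptions, not taken from these notes), the two definitions and the standard decomposition of the expected error can be written as:

```latex
\mathrm{Bias}[\hat f(x)] = \mathbb{E}[\hat f(x)] - f(x)

\mathrm{Var}[\hat f(x)] = \mathbb{E}\big[(\hat f(x) - \mathbb{E}[\hat f(x)])^2\big]

\mathbb{E}\big[(y - \hat f(x))^2\big]
  = \mathrm{Bias}[\hat f(x)]^2 + \mathrm{Var}[\hat f(x)] + \sigma^2
```

The trade-off follows from the last line: for a fixed noise level, lowering one of the first two terms by changing model complexity usually raises the other.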
linear regression
https://www.javatpoint.com/linear-regression-in-machine-learning
non-linear regression
flexible
parametric or non-parametric
Example
non-linear relationship between the gold price and US CPI inflation and
currency depreciation in many countries
gold prices are affected the most by inflation
Application
forestry research ⇒
a power function relating tree volume or weight to its diameter or
height
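A minimal sketch of the forestry application, assuming the power function has the form volume = a * diameter**b. The measurements and the constants a and b below are hypothetical values made up for illustration, not real forestry data.

```python
import numpy as np

# Hypothetical tree measurements (illustrative only): diameter in cm and
# volume generated from the assumed model volume = a * diameter**b.
diameter = np.array([10.0, 15.0, 20.0, 25.0, 30.0, 40.0])
true_a, true_b = 0.0002, 2.4
volume = true_a * diameter ** true_b

# A power function is linear in log space:
#   log(volume) = log(a) + b * log(diameter)
# so a degree-1 least-squares fit on the logs recovers b (slope)
# and log(a) (intercept).
b_hat, log_a_hat = np.polyfit(np.log(diameter), np.log(volume), 1)
a_hat = np.exp(log_a_hat)

print(f"a = {a_hat:.6f}, b = {b_hat:.3f}")
```

Fitting in log space is only one option; it weights relative errors rather than absolute ones, which may or may not suit a given data set.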
multivariate regression
cost function ⇒ the cost assigned to the model's wrong predictions
Step 4: Minimize the cost (loss) function
Advantages
helps you find a relationship between multiple variables
Disadvantages
requires high-level mathematical calculations
complex
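A minimal sketch of the cost-function idea above, assuming a linear model with several input variables and mean squared error as the cost. The data, the weights and the helper name `cost` are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 50 samples, 3 input variables, one target.
X = rng.normal(size=(50, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 4.0  # noise-free target with intercept 4.0

# Add a column of ones so the intercept is learned as an extra weight.
X1 = np.column_stack([np.ones(len(X)), X])

def cost(w):
    """Mean squared error: the average cost of the wrong predictions."""
    residual = X1 @ w - y
    return float(np.mean(residual ** 2))

# Least squares picks the weights that minimize this cost function.
w_hat, *_ = np.linalg.lstsq(X1, y, rcond=None)

print("cost at zero weights:  ", cost(np.zeros(4)))
print("cost at fitted weights:", cost(w_hat))
print("fitted weights:", w_hat)
```

The closed-form least-squares solve is used here for brevity; an iterative optimizer would minimize the same cost.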
https://www.geeksforgeeks.org/ml-linear-regression-vs-logistic-regression/
https://www.investopedia.com/terms/v/variance-inflation-factor.asp
💡 Gradient descent: an optimization algorithm commonly used to train
machine learning models and neural networks
Learning rate
cost function
calculates avg error for the entire training set
batch gradient descent
sums the error for each point in the training set, updating the model only
after all training examples have been evaluated
stochastic gradient descent
runs a training epoch for each example within the dataset and updates the
parameters after each training example, one at a time
mini-batch gradient descent
combination of both
splits the training dataset into small batches and performs an update
on each of those batches.
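The three update schemes described above differ only in how many examples feed each parameter update. A minimal sketch, assuming a linear model with mean squared error; the function names, learning rate, epoch count and data are hypothetical choices for illustration.

```python
import numpy as np

def gradient(w, X, y):
    """Gradient of the mean squared error for the linear model X @ w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

def gradient_descent(X, y, batch_size, learning_rate=0.05, epochs=200, seed=0):
    """batch_size = len(y): batch; 1: stochastic; in between: mini-batch."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        order = rng.permutation(len(y))  # reshuffle the examples each epoch
        for start in range(0, len(y), batch_size):
            idx = order[start:start + batch_size]
            # Step against the gradient computed on this batch only.
            w -= learning_rate * gradient(w, X[idx], y[idx])
    return w

# Hypothetical noise-free data generated from known weights.
rng = np.random.default_rng(1)
X = rng.normal(size=(64, 2))
true_w = np.array([3.0, -2.0])
y = X @ true_w

w_batch = gradient_descent(X, y, batch_size=len(y))
w_sgd = gradient_descent(X, y, batch_size=1)
w_mini = gradient_descent(X, y, batch_size=16)
print(w_batch, w_sgd, w_mini)
```

On noisy data the stochastic and mini-batch variants would fluctuate around the minimum instead of settling exactly, which is one reason the learning rate is often decayed in practice.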
c. at a saddle point the gradient is zero, but the point is a local
maximum along one direction and a local minimum along another, so
gradient descent can stall there.
a. Vanishing gradients:
ii. the gradient continues to become smaller as backpropagation moves
backward through the layers
b. Exploding gradients:
💡 Overfitting
is a modeling error that occurs when a function or model fits the training
set too closely and then performs drastically worse on the test set.
If our model does much better on the training set than on the test set,
then we’re likely overfitting.
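The train-versus-test comparison above can be sketched with polynomial fits of increasing degree. This is an illustration under stated assumptions: the sine-plus-noise data, the sample sizes and the chosen degrees are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_samples(n):
    """Hypothetical data: a sine curve plus Gaussian noise."""
    x = rng.uniform(-1.0, 1.0, n)
    return x, np.sin(3.0 * x) + rng.normal(scale=0.2, size=n)

x_train, y_train = noisy_samples(15)
x_test, y_test = noisy_samples(200)

def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on the given points."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

results = {}
for degree in (1, 3, 10):
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
    print(degree, results[degree])
```

Training error can only go down as the degree grows, so a low training error on its own says little; a test error that is much larger than the training error is the signal of likely overfitting.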
2. Data Augmentation
3. Cross-Validation
4. Feature Selection
5. Regularization
regularization
https://www.javatpoint.com/regularization-in-machine-learning
https://www.javatpoint.com/machine-learning-support-vector-machine-algorithm
LDA
💡 https://www.geeksforgeeks.org/ml-linear-discriminant-analysis/
💡 YouTube - https://youtu.be/83x5X66uWK0
https://www.simplilearn.com/tutorials/machine-learning-tutorial/principal-component-analysis#:~:text=The%20Principal%20Component%20Analysis%20is,plotting%20in%202D%20and%203D.
https://www.upgrad.com/blog/top-dimensionality-reduction-techniques-for-machine-learning/
https://www.i2tutorials.com/what-is-single-layer-perceptron-and-difference-between-single-layer-vs-multilayer-perceptron/
18. How does Gradient descent help in minimizing the cost function?
https://towardsdatascience.com/minimizing-the-cost-function-gradient-descent-a5dd6b5350e1
https://towardsdatascience.com/understanding-backpropagation-algorithm-7bb3aa2f95fd
MLE
https://analyticsindiamag.com/how-is-maximum-likelihood-estimation-used-in-machine-learning/#:~:text=By%20Sourabh%20Mehta-,Maximum%20Likelihood%20Estimation%20(MLE)%20is%20a%20probabilistic%20based%20approach%20to,panel%20data%20and%20discrete%20data.
MAP
💡 https://youtu.be/TSMJ-QRnk54
https://towardsdatascience.com/what-is-map-understanding-the-statistic-of-choice-for-comparing-object-detection-models-1ea4f67a9dbd
https://www.geeksforgeeks.org/artificial-neural-networks-and-its-applications/
22. Define the learning rate in a neural network. How do you choose a
learning rate for an optimization problem?
https://towardsdatascience.com/learning-rate-a6e7b84f1658
23. Define the terms Training, Activation function, Weights and loss
function in ANN.
Training
Activation function
💡 https://www.geeksforgeeks.org/activation-functions-neural-networks/
Weights
💡 A weight is a parameter within a neural network that transforms input
data within the network's hidden layers. A neural network is a series of
nodes, or neurons. Each node has a set of inputs, a set of weights, and a
bias value.
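A minimal sketch of a single node as described above, assuming a sigmoid activation (the notes do not fix one). The function name `neuron` and all numeric values are hypothetical.

```python
import numpy as np

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid."""
    z = np.dot(weights, inputs) + bias
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical values chosen only for illustration.
inputs = np.array([0.5, -1.0, 2.0])
weights = np.array([0.4, 0.3, -0.2])
bias = 0.1

output = neuron(inputs, weights, bias)
print(output)
```

Training adjusts `weights` and `bias`; the inputs come from the data or from the previous layer.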
Loss function
💡 https://www.geeksforgeeks.org/ml-common-loss-functions/
💡 https://www.geeksforgeeks.org/activation-functions-neural-networks/
26. How does gradient descent help in minimizing the cost function?
https://towardsdatascience.com/machine-leaning-cost-function-and-gradient-descend-75821535b2ef
27. How does the decision tree algorithm work? Give one example.
https://www.geeksforgeeks.org/decision-tree-introduction-example/
28. What are the attribute selection measures in a decision tree? Explain.
https://www.kdnuggets.com/2020/01/decision-tree-algorithm-explained.html
29. What is meant by pruning? What are the different techniques used for
pruning?
https://www.kdnuggets.com/2022/09/decision-tree-pruning-hows-whys.html
https://analyticsindiamag.com/what-is-pruning-in-tree-based-ml-models-and-why-is-it-done/
https://www.jigsawacademy.com/blogs/data-science/decision-tree-in-machine-learning/
https://www.javatpoint.com/overfitting-and-underfitting-in-machine-learning#:~:text=Overfitting%20occurs%20when%20our%20machine,and%20accuracy%20of%20the%20model.
regularization
https://www.javatpoint.com/regularization-in-machine-learning
32. What are the different cross-validation methods? Explain two
cross-validation methods.
https://www.geeksforgeeks.org/cross-validation-machine-learning/
https://analyticssteps.com/blogs/bootstrapping-method-types-working-and-applications
https://www.vosesoftware.com/riskwiki/TheparametricBootstrap.php
To estimate the uncertainty about the population standard deviation
using non-parametric bootstrap,
https://www.vosesoftware.com/riskwiki/ThenonparametricBootstrap.php
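A minimal sketch of the non-parametric bootstrap for the standard deviation mentioned above. The observed sample, the number of resamples and the percentile interval are assumptions made for illustration and are not taken from the linked page.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical observed sample (illustrative only).
sample = rng.normal(loc=10.0, scale=2.0, size=40)

n_resamples = 2000
boot_stds = np.empty(n_resamples)
for i in range(n_resamples):
    # Resample the observed data with replacement, same size as the original.
    resample = rng.choice(sample, size=len(sample), replace=True)
    boot_stds[i] = np.std(resample, ddof=1)

# The spread of the bootstrap estimates reflects the uncertainty about
# the population standard deviation.
low, high = np.percentile(boot_stds, [2.5, 97.5])
print(f"sample std = {np.std(sample, ddof=1):.3f}")
print(f"95% percentile interval: [{low:.3f}, {high:.3f}]")
```

No distribution is assumed for the data here; the parametric bootstrap would instead fit a distribution first and resample from that fit.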
https://www.analyticsvidhya.com/blog/2018/06/comprehensive-guide-for-ensemble-models/
35. What are the advantages and disadvantages of the random forest learning
algorithm?
https://www.mygreatlearning.com/blog/random-forest-algorithm/
k-means clustering
💡 YouTube - https://youtu.be/CLKW6uWJtTc
hierarchical clustering
https://www.geeksforgeeks.org/hierarchical-clustering-in-data-mining/
💡 YouTube - https://youtu.be/7enWesSofhg
BIRCH algorithm
https://www.javatpoint.com/birch-in-data-mining
HMM algorithm
https://www.jigsawacademy.com/blogs/data-science/hidden-markov-model
CURE algorithm
https://www.geeksforgeeks.org/basic-understanding-of-cure-algorithm/
38. Let’s say you are building a model that detects whether a person
has diabetes or not. After the train-test split, you got a test set of length 100,
out of which 70 data points are labelled positive (1), and 30 data points are
labelled negative (0). Draw confusion matrix based on the given data.
Calculate True positive rate, True negative rate, False positive rate and False
negative rate.
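The question as recorded here gives only the actual class sizes, not the model's predictions, so the rates cannot be computed from it alone. A minimal sketch, assuming a hypothetical split of predictions (64 of the 70 positives and 27 of the 30 negatives classified correctly); these four counts are an assumption, not part of the question above.

```python
# Hypothetical prediction counts: the question fixes only the actual class
# sizes (70 positive, 30 negative), so the split below is an assumption.
tp, fn = 64, 6   # the 70 actual positives
fp, tn = 3, 27   # the 30 actual negatives

confusion_matrix = [
    [tp, fn],  # actual positive: predicted positive, predicted negative
    [fp, tn],  # actual negative: predicted positive, predicted negative
]

tpr = tp / (tp + fn)  # true positive rate (sensitivity, recall)
tnr = tn / (tn + fp)  # true negative rate (specificity)
fpr = fp / (fp + tn)  # false positive rate = 1 - TNR
fnr = fn / (fn + tp)  # false negative rate = 1 - TPR

print(confusion_matrix)
print(f"TPR={tpr:.3f} TNR={tnr:.3f} FPR={fpr:.3f} FNR={fnr:.3f}")
```

Each rate is normalized by the size of the actual class, so TPR and FNR sum to 1, as do TNR and FPR.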
https://www.kdnuggets.com/2020/09/performance-machine-learning-model.html
https://www.geeksforgeeks.org/human-activity-recognition-using-deep-learning-model/
https://www.synopsys.com/ai/what-is-reinforcement-learning.html#:~:text=How%20Does%20Reinforcement%20Learning%20Work,maximization%20of%20expected%20cumulative%20reward.
https://www.guru99.com/reinforcement-learning-tutorial.html#reinforcement-learning-algorithms
algorithm - https://www.geeksforgeeks.org/ml-expectation-maximization-algorithm
convergence - https://arxiv.org/pdf/1611.00519.pdf
42. Write an algorithm for GMM.
https://towardsdatascience.com/gaussian-mixture-modelling-gmm-833c88587c7f
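For question 42, one common way to fit a GMM is expectation-maximization (EM). The sketch below is a one-dimensional illustration, not the linked article's implementation; the function name `fit_gmm_1d`, the initialization scheme, the iteration count and the data are all assumptions.

```python
import numpy as np

def fit_gmm_1d(x, n_components=2, n_iter=100):
    """EM for a one-dimensional Gaussian mixture (illustrative sketch)."""
    # Initialization (an assumption): spread the means over the data range.
    means = np.linspace(x.min(), x.max(), n_components)
    variances = np.full(n_components, x.var())
    weights = np.full(n_components, 1.0 / n_components)

    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        diff = x[:, None] - means[None, :]
        density = np.exp(-0.5 * diff ** 2 / variances) / np.sqrt(2 * np.pi * variances)
        weighted = weights * density
        resp = weighted / weighted.sum(axis=1, keepdims=True)

        # M-step: re-estimate the parameters from the responsibilities.
        totals = resp.sum(axis=0)
        weights = totals / len(x)
        means = (resp * x[:, None]).sum(axis=0) / totals
        diff = x[:, None] - means[None, :]
        variances = (resp * diff ** 2).sum(axis=0) / totals

    return weights, means, variances

# Hypothetical data: two well-separated clusters.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-5.0, 1.0, 200), rng.normal(5.0, 1.0, 200)])
weights, means, variances = fit_gmm_1d(x)
print(weights, means, variances)
```

A fixed iteration count keeps the sketch short; a fuller version would stop when the log-likelihood stops improving and would guard against a component's variance collapsing to zero.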
43. What are ensemble methods? What are the different types of ensemble
methods?
https://towardsdatascience.com/ensemble-methods-in-machine-learning-what-are-they-and-why-use-them-68ec3f9fef5f
https://towardsdatascience.com/how-neural-networks-solve-the-xor-problem-59763136bdd7