Module 3 Data Science Machine Learning
MACHINE LEARNING
“Machine learning is a field of study that
gives computers the ability to learn
without being explicitly programmed.”
—Arthur Samuel
■ Matplotlib is a popular 2D plotting package with
some 3D functionality.
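
As a minimal sketch of what Matplotlib looks like in use (the x and y values below are made up purely for illustration):

import matplotlib.pyplot as plt

# Toy data for illustration only
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

plt.plot(x, y, marker="o")   # simple 2D line plot
plt.xlabel("x")
plt.ylabel("y = x^2")
plt.title("A simple Matplotlib line plot")
plt.show()
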
Split validation
K-fold cross validation
Leave One Out cross validation
Cross validation
Cross validation is a technique used in machine
learning to evaluate the performance of a model on
unseen data. It involves dividing the available data
into multiple folds or subsets, using one of these folds
as a validation set, and training the model on the
remaining folds. This process is repeated multiple
times, each time using a different fold as the
validation set.
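
A rough sketch of this idea using scikit-learn's cross_val_score; the synthetic data and the choice of a logistic regression model are assumptions made only for illustration:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic data for illustration
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

model = LogisticRegression(max_iter=1000)

# 5-fold cross validation: each fold takes a turn as the validation set
scores = cross_val_score(model, X, y, cv=5)
print("Fold accuracies:", scores)
print("Mean accuracy:", scores.mean())
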
An example of split validation in machine learning
is when a dataset is divided into three sets: training,
validation, and testing (a sketch follows the list below):
Training set: Used to train the model
Validation set: Used to tune the model (e.g., its hyperparameters) during development
Testing set: Used to evaluate the final model on unseen data
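
A minimal sketch of such a three-way split using scikit-learn's train_test_split; the 60/20/20 proportions are an assumption for illustration:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)

# First split off the test set (20% of the data)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Then split the remainder into training (60%) and validation (20%)
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 600 200 200
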
K-fold cross validation
K-fold cross validation is a powerful technique for
evaluating predictive models in data science. It involves
splitting the dataset into k subsets or folds, where each
fold is used as the validation set in turn while the
remaining k-1 folds are used for training.
In the K-fold method, the dataset is divided into k
subsets, called folds. The model is trained on all but one
fold, and the fold left out is used to evaluate the model
once it is trained. This process is repeated for k
iterations, with a different fold reserved for testing in
each iteration.
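
A rough sketch of the K-fold procedure with scikit-learn's KFold; the choice of k = 5 and the logistic regression model are assumptions for illustration:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

X, y = make_classification(n_samples=200, random_state=0)
model = LogisticRegression(max_iter=1000)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
scores = []
for train_idx, test_idx in kf.split(X):
    # Train on k-1 folds, evaluate on the held-out fold
    model.fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[test_idx], y[test_idx]))

print("Per-fold accuracy:", scores)
print("Mean accuracy:", np.mean(scores))
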
L.O.O.C.V. (Leave One Out Cross Validation)
In the method of L.O.O.C.V. (Leave One Out Cross Validation),
the model is trained on the entire dataset except for a single
data point, which is held out for evaluation; this is repeated
once for each data point. One prominent benefit of this method
is that all the data points are used for training, so the
resulting estimate has low bias.
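
A minimal LOOCV sketch using scikit-learn's LeaveOneOut; the small synthetic dataset keeps the one-fit-per-point cost cheap and is purely illustrative:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

# Small dataset: LOOCV fits the model once per data point
X, y = make_classification(n_samples=50, n_features=5, random_state=0)
model = LogisticRegression(max_iter=1000)

scores = cross_val_score(model, X, y, cv=LeaveOneOut())
print("Number of fits:", len(scores))   # one per data point
print("LOOCV accuracy:", scores.mean())
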
Regularization is a technique
used to reduce errors by fitting
the function appropriately on the
given training set and avoiding
overfitting.
L1 regularization (Lasso regularization)
L2 regularization (Ridge regularization)
The L1 norm calculates the sum of the absolute values of the
vector elements, while the L2 norm calculates the square root
of the sum of the squared elements.
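
As a sketch, both penalties are available in scikit-learn as Lasso (L1) and Ridge (L2); the synthetic regression data and the alpha values below are arbitrary assumptions for illustration:

from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

X, y = make_regression(n_samples=200, n_features=20, noise=10.0, random_state=0)

# L1 (Lasso): penalizes the sum of absolute coefficient values,
# which tends to drive some coefficients exactly to zero
lasso = Lasso(alpha=1.0).fit(X, y)

# L2 (Ridge): penalizes the sum of squared coefficient values,
# which shrinks coefficients toward zero without eliminating them
ridge = Ridge(alpha=1.0).fit(X, y)

print("Lasso zero coefficients:", (lasso.coef_ == 0).sum())
print("Ridge zero coefficients:", (ridge.coef_ == 0).sum())
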
Also, suppose that the fruits are apple, banana, cherry, and
grape. Suppose one already knows from previous work (or
experience) the shape of each fruit present in the basket, so
it is easy to arrange the same type of fruits in one place.
Here, the previous work is called training data in Data
Mining terminology.