
# Notes on Three Machine Learning Algorithms

## Random Forest
Random Forest is a versatile and powerful supervised ML
algorithm used for both classification and regression tasks. It
operates by constructing a multitude of decision trees during
the training phase and outputs the mode (for classification) or
mean (for regression) prediction of the individual trees.
Random Forest works in the following manner:

1) Random Sampling with Replacement (Bootstrapping): RF builds multiple decision trees by sampling the training data with replacement, so each tree in the forest is trained on a bootstrapped subset of the original dataset. This random sampling introduces diversity among the trees.

IMPORTANT NOTE ABOUT BOOTSTRAPPING:


Bootstrapping is a resampling technique used in statistics to
estimate the sampling distribution of a statistic by repeatedly
resampling with replacement from the observed dataset. It is
particularly useful when the underlying distribution of the data
is unknown or when the sample size is small. It is performed
through:
Sampling with Replacement: Bootstrapping involves randomly
sampling observations from the original dataset with
replacement to create new "bootstrap samples." This means
that each observation in the original dataset has the same
chance of being selected for a bootstrap sample at each
iteration, and some observations may be selected multiple
times while others may not be selected at all.

Estimation: After creating multiple bootstrap samples (typically thousands or more), the statistic of interest (e.g., mean, median, standard deviation, regression coefficient) is computed for each bootstrap sample.

Calculating the Sampling Distribution: The collection of statistics computed from the bootstrap samples forms the "bootstrap distribution" or "sampling distribution" of the statistic of interest. From this distribution, properties such as confidence intervals, standard errors, and hypothesis test statistics can be estimated.
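
As a minimal numerical sketch of this procedure (the data here is synthetic, generated purely for illustration), bootstrapping the mean of a small sample with NumPy might look like this:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
data = rng.normal(loc=50, scale=10, size=30)  # a small illustrative sample

# Sampling with replacement: draw many bootstrap samples and compute the
# statistic of interest (here, the mean) for each one.
n_bootstrap = 10_000
boot_means = np.array([
    rng.choice(data, size=len(data), replace=True).mean()
    for _ in range(n_bootstrap)
])

# The bootstrap distribution yields a standard error and a 95% confidence interval.
standard_error = boot_means.std(ddof=1)
ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])
print(f"mean={data.mean():.2f}, SE={standard_error:.2f}, "
      f"95% CI=({ci_low:.2f}, {ci_high:.2f})")
```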

2) Random Feature Selection: At each node of a decision tree, instead of considering all features for splitting, Random Forest considers only a random subset of features. This adds further randomness to the trees and helps prevent overfitting.
3) Decision Tree Building: Each decision tree is grown to its maximum depth or until it reaches a stopping criterion (e.g., a minimum number of samples in a leaf node). The trees are typically constructed using techniques such as CART (Classification and Regression Trees).

4) Voting (Classification) or Averaging (Regression): For classification tasks, the prediction of each tree is considered, and the class with the most votes across all trees is chosen as the final prediction. For regression tasks, the predictions of all trees are averaged to produce the final prediction.
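
As a minimal sketch of these four steps in practice (assuming scikit-learn and its bundled iris toy dataset; the hyperparameter values are illustrative, not recommendations):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load a small benchmark dataset and hold out a test split.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# n_estimators sets the number of trees; max_features="sqrt" is the random
# feature subset considered at each split; each tree is trained on a
# bootstrap sample of the training data.
rf = RandomForestClassifier(n_estimators=200, max_features="sqrt",
                            random_state=42)
rf.fit(X_train, y_train)

# Prediction is a majority vote across the trees.
print("test accuracy:", rf.score(X_test, y_test))
```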

### Advantages of Random Forest


a) Robust to Overfitting: By aggregating predictions from
multiple trees and introducing randomness, Random Forest is
less prone to overfitting compared to individual decision trees.
b) Handles Large Datasets: Random Forest can efficiently handle
large datasets with a large number of features and instances.
c) Implicit Feature Selection: Random Forest provides a
measure of feature importance, allowing users to identify the
most relevant features in the dataset (a short snippet follows below).
Overall, Random Forest is a popular choice for many machine
learning tasks due to its robustness, flexibility, and ease of use.
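
Continuing from the sketch above, the fitted forest exposes impurity-based importances through `feature_importances_` (the names come from the same iris dataset):

```python
from sklearn.datasets import load_iris

# `rf` is the forest fitted in the earlier sketch.
for name, importance in zip(load_iris().feature_names, rf.feature_importances_):
    print(f"{name}: {importance:.3f}")
```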
## Support Vector Machines (SVM)
A Support Vector Machine (SVM) is a supervised ML algorithm
primarily used for classification tasks, although it can also be
adapted for regression tasks. SVMs are effective in high-
dimensional spaces and are particularly well-suited for
problems where the number of features exceeds the number of
samples. SVM works by:

1) Separating Hyperplane: SVM aims to find the optimal hyperplane that best separates the data points into different classes. In two dimensions this hyperplane is a line, in three dimensions it is a plane, and in higher dimensions it is a hyperplane.

2) Maximizing Margin: The optimal hyperplane is the one that maximizes the margin, which is the distance between the hyperplane and the nearest data points from each class; these nearest points are known as support vectors. Maximizing the margin helps improve the generalization ability of the classifier.

3) Kernel Trick: SVMs can efficiently handle nonlinear relationships between features through the kernel trick. By implicitly mapping the input features into a higher-dimensional space (without ever computing the mapping explicitly), SVM can find a hyperplane that effectively separates the data points.
4) Soft Margin SVM: In situations where the data is not linearly
separable, or there are outliers, a soft-margin SVM can be used.
This allows for some misclassification of data points, controlled
by a regularization parameter (C), to find a better separating
hyperplane.

5) Kernel Functions: SVMs can utilize different types of kernel functions (e.g., linear, polynomial, radial basis function (RBF), sigmoid) to map the data into a higher-dimensional space. The choice of kernel function depends on the problem at hand and the characteristics of the data.

6) Decision Function: Once trained, the SVM constructs a decision function based on the support vectors and their associated weights. This decision function is used to classify new, unseen data points into one of the predefined classes.
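
A minimal classification sketch tying these pieces together (again assuming scikit-learn and the iris dataset; kernel="rbf" and C=1.0 are illustrative choices, not recommendations):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# SVMs are sensitive to feature scale, so standardize first. kernel="rbf"
# applies the kernel trick; C is the soft-margin regularization parameter.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
svm.fit(X_train, y_train)

# The fitted decision function classifies new, unseen points.
print("test accuracy:", svm.score(X_test, y_test))
```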

### Advantages of Support Vector Machines

a) Effective in High-Dimensional Spaces: SVMs perform well
even in cases where the number of dimensions exceeds the
number of samples.
b) Versatile: SVMs can be adapted for various tasks, including
classification, regression, and outlier detection.
c) Regularization: SVMs offer regularization parameters to
control overfitting and handle noise effectively.
d) Kernel Trick: SVMs can handle nonlinear relationships
between features using kernel functions, enabling them to
model complex decision boundaries.

Support Vector Machines are widely used in various fields, including image classification, text classification, bioinformatics, and finance, owing to their effectiveness and versatility in handling both linear and nonlinear data.

## Artificial Neural Network (ANN)


ANN is a computational model inspired by the structure and
functioning of the human brain's biological neural networks.
ANNs are composed of interconnected nodes called artificial
neurons or perceptrons, organized in layers. These networks are
used for solving various machine learning tasks, including
classification, regression, clustering, and pattern recognition.

### Structure of an Artificial Neural Network

1) Input Layer - This consists of neurons that receive the input data. Each neuron corresponds to a feature in the input data.
2) Hidden Layers - Hidden layers are layers between the input
and output layers. They contain one or more layers of neurons
that perform intermediate computations. Deep neural networks
have multiple hidden layers.

3) Output Layer - The output layer produces the network's final predictions. The number of neurons in the output layer depends on the task; for example, in classification tasks, the number of output neurons corresponds to the number of classes.

4) Connections and Weights - Each connection between neurons has an associated weight, which determines the strength of the connection. During training, these weights are adjusted to minimize the difference between the predicted and actual outputs.

5) Activation Functions - These introduce nonlinearity into the network, allowing it to learn complex relationships in the data. Common activation functions include sigmoid, tanh, ReLU (Rectified Linear Unit), and softmax.
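
As a small sketch of how this structure maps onto code (using scikit-learn's MLPClassifier; the layer sizes here are arbitrary, chosen only for illustration):

```python
from sklearn.neural_network import MLPClassifier

# Two hidden layers with 32 and 16 neurons; the input and output layer
# sizes are inferred from the data when fit() is called. activation="relu"
# applies the ReLU nonlinearity to each hidden neuron.
mlp = MLPClassifier(hidden_layer_sizes=(32, 16), activation="relu",
                    max_iter=1000, random_state=42)
```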

### How Does an ANN Work?


1) Forward Propagation - During forward propagation, input data is fed into the network, and computations are performed layer by layer until the output layer produces predictions. Each neuron in a layer receives inputs from the previous layer, applies a weighted sum to those inputs, adds a bias term, and passes the result through an activation function (see the sketch after this list).

2) Backpropagation - This is the process of updating the network's weights based on the difference between predicted and actual outputs (the loss). It involves computing gradients of the loss function with respect to each weight and adjusting the weights using gradient descent or its variants.

3) Training - Training an ANN involves iteratively feeding training data through the network, adjusting the weights through backpropagation to minimize the loss function, and thereby optimizing network performance.

4) Inference - Once trained, the ANN can make predictions on new, unseen data by simply performing forward propagation.
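
A single-neuron NumPy sketch of forward propagation and one backpropagation update (the data, weights, and learning rate are invented for illustration; the gradient formula below is the standard one for a sigmoid output with binary cross-entropy loss):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(seed=0)
x = rng.normal(size=3)   # one input example with 3 features
y_true = 1.0             # its label
w = rng.normal(size=3)   # connection weights
b = 0.0                  # bias term

# Forward propagation: weighted sum of inputs, plus bias, through an activation.
z = w @ x + b
y_pred = sigmoid(z)

# Backpropagation: for sigmoid + binary cross-entropy, dLoss/dz = y_pred - y_true,
# so the gradients with respect to the weights and bias follow directly.
grad_w = (y_pred - y_true) * x
grad_b = y_pred - y_true

# One gradient-descent update; training repeats this over many examples.
learning_rate = 0.1
w -= learning_rate * grad_w
b -= learning_rate * grad_b
```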

### Advantages of Artificial Neural Networks

a) Ability to Learn Complex Patterns: ANNs can learn complex patterns and relationships in data, making them suitable for various tasks.
b) Adaptability: ANNs can adapt and learn from new data, allowing them to generalize well to unseen examples.
c) Feature Learning: Deep neural networks can automatically learn useful features from raw data, reducing the need for manual feature engineering.
d) Parallel Processing: ANNs can be efficiently implemented on parallel architectures, making them suitable for large-scale computations.

Artificial Neural Networks are widely used in fields such as image recognition, natural language processing, speech recognition, autonomous vehicles, and many more, owing to their effectiveness and flexibility in solving diverse problems.

## Comparing the Prediction Outputs of the Three Algorithms

Comparing Random Forest (RF), Support Vector Machine (SVM), and Artificial Neural Network (ANN) models for classification tasks involves evaluating their performance based on various metrics:
1) Accuracy - measures the proportion of correctly classified
instances out of the total instances. Higher accuracy indicates
better performance.

2) Confusion Matrix - The confusion matrix provides a detailed breakdown of the model's predictions, including true positives, true negatives, false positives, and false negatives. It helps in understanding the types of errors made by the model.

3) Precision and Recall - Precision measures the proportion of true positive predictions out of all positive predictions, while recall measures the proportion of true positive predictions out of all actual positive instances. Higher precision and recall values indicate better performance.

4) F1-score - This is the harmonic mean of precision and recall, providing a balance between the two metrics. It is useful when there is an imbalance between the classes.

5) ROC Curve and AUC (if applicable) - For binary classification tasks, the Receiver Operating Characteristic (ROC) curve and the Area Under the Curve (AUC) can be used to evaluate the trade-off between the true positive rate and the false positive rate across different threshold values.
6) Computational Complexity and Training Time - Consider the
computational resources required and the training time of each
model. Some models, like SVM with large datasets or complex
neural networks, may require more computational resources
and time for training.

7) Robustness and Generalization - Evaluate how well each model generalizes to unseen data and how robust it is to noise and outliers.

To compare RF, SVM, and ANN models, you can assess each
model's performance using these metrics on a common
validation or test dataset. Additionally, consider the
interpretability, ease of implementation, and suitability for the
specific problem domain when making comparisons. It's also a
good practice to perform cross-validation to ensure the
robustness of the evaluation. Ultimately, the choice of the best
model depends on the specific characteristics of the dataset
and the requirements of the problem at hand.
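
As a minimal sketch of such a comparison (reusing the `rf`, `svm`, and `mlp` models and the train/test split from the earlier sketches, all assumed to be in scope):

```python
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix)

mlp.fit(X_train, y_train)  # rf and svm were fitted in earlier sketches
for name, model in [("Random Forest", rf), ("SVM", svm), ("ANN", mlp)]:
    y_pred = model.predict(X_test)
    print(f"--- {name} ---")
    print("accuracy:", accuracy_score(y_test, y_pred))
    print(confusion_matrix(y_test, y_pred))
    # precision, recall, and F1-score per class:
    print(classification_report(y_test, y_pred))
```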
