0% found this document useful (0 votes)
43 views

Introduction of The Popular Machine Learning Algorithm

Machine Learning (ML) is a research area that has developed over the past few decades as a result of the work of a small group of computer enthusiasts who were interested in the idea of computers learning to play video games and from a branch of mathematics called statistics that hardly ever took computational methods into consideration. The development of a large number of algorithms that are frequently used for text interpretation, pattern recognition, and a variety of other business purposes has sparked clear research interest in data mining to find hidden regularities or irregularities in data. data. data. data. social data is growing by the second This article describes the idea and history of machine learning and contrasts the three most popular machine learning algorithms using some fundamental ideas. The Sentiment140 dataset has been used to demonstrate and evaluate the efficiency of each method in terms of training time, prediction time, and prediction accuracy. Machine learning algorithms have become indispensable tools in analyzing complex datasets and extracting valuable insights. Among the myriad of algorithms available, one particular technique has gained widespread popularity due to its versatility and effectiveness. This comprehensive review aims to delve into the efficacy of this popular machine learning algorithm by offering a comprehensive analysis of its underlying principles, diverse applications, notable strengths, and inherent limitations.
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
43 views

Introduction of The Popular Machine Learning Algorithm

Machine Learning (ML) is a research area that has developed over the past few decades as a result of the work of a small group of computer enthusiasts who were interested in the idea of computers learning to play video games and from a branch of mathematics called statistics that hardly ever took computational methods into consideration. The development of a large number of algorithms that are frequently used for text interpretation, pattern recognition, and a variety of other business purposes has sparked clear research interest in data mining to find hidden regularities or irregularities in data. data. data. data. social data is growing by the second This article describes the idea and history of machine learning and contrasts the three most popular machine learning algorithms using some fundamental ideas. The Sentiment140 dataset has been used to demonstrate and evaluate the efficiency of each method in terms of training time, prediction time, and prediction accuracy. Machine learning algorithms have become indispensable tools in analyzing complex datasets and extracting valuable insights. Among the myriad of algorithms available, one particular technique has gained widespread popularity due to its versatility and effectiveness. This comprehensive review aims to delve into the efficacy of this popular machine learning algorithm by offering a comprehensive analysis of its underlying principles, diverse applications, notable strengths, and inherent limitations.
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 5

Volume 8, Issue 5, May – 2023 International Journal of Innovative Science and Research Technology

ISSN No:-2456-2165

Introduction of the Popular Machine Learning Algorithm


Jasmin Praful Bharadiya1
Doctor of Philosophy Information Technology, University of the Cumberlands, USA

Abstract:- Machine Learning (ML) is a research area Humans have employed a number of tools since the
that has developed over the past few decades as a result beginning of time to speed up the completion of various
of the work of a small group of computer enthusiasts tasks. Thanks to the inventiveness of the human brain,
who were interested in the idea of computers learning to various technologies have been realized. By enabling people
play video games and from a branch of mathematics to fulfill a variety of needs, including those related to travel,
called statistics that hardly ever took computational industry, and computing, these devices improved the quality
methods into consideration. The development of a large of human existence. There is also machine learning among
number of algorithms that are frequently used for text these.
interpretation, pattern recognition, and a variety of
other business purposes has sparked clear research Machine Learning Algorithms (MLA) are used to help
interest in data mining to find hidden regularities or machines handle data more effectively. Even after
irregularities in data. data. data. data. social data is examining the data, there are times when we are unable to
growing by the second This article describes the idea and interpret or extrapolate the results. In that case, we apply
history of machine learning and contrasts the three most machine learning. The need for machine learning has
popular machine learning algorithms using some expanded as a result of the abundance of data sets. Many
fundamental ideas. The Sentiment140 dataset has been industries use machine learning to find relevant data.
used to demonstrate and evaluate the efficiency of each Machine learning aims to learn from data. This problem,
method in terms of training time, prediction time, and which requires the processing of large volumes of data, is
prediction accuracy. Machine learning algorithms have addressed by a large number of mathematicians and
become indispensable tools in analyzing complex programmers using a variety of methods.
datasets and extracting valuable insights. Among the
myriad of algorithms available, one particular technique To handle data challenges, machine learning employs
has gained widespread popularity due to its versatility a variety of strategies. The lack of a single type of algorithm
and effectiveness. This comprehensive review aims to that works well in all circumstances is a point that data
delve into the efficacy of this popular machine learning scientists want to underline. Solve a problem. The type of
algorithm by offering a comprehensive analysis of its problem you are trying to solve, how many variables there
underlying principles, diverse applications, notable are, what type of model would work best, and other
strengths, and inherent limitations. considerations affect the type of algorithm that is used.

Keywords:- Algorithm, Supervised Learning, Unsupervised Here is a brief overview of some of the most popular
Learning, Regression, Deep Learning and Support Vector machine learning (ML) algorithms.
Machines.
II. LINEAR REGRESSION
I. INTRODUCTION
Machine learning uses a variety of methodologies to
Machine learning is a paradigm that can be used to address data difficulties. Data scientists are meant to
describe learning from the past (in this case, historical data) emphasize the fact that there is no particular type of
to enhance performance in the future. This field only algorithm that works well in all situations. Solve a dilemma.
focuses on autonomous learning techniques. The term The type of method used depends on the type of problem
"learning" describes the automatic adjustment or you are trying to solve, the number of variables, the type of
enhancement of an algorithm based on prior "experiences" optimal model, and other factors.
without any outside aid from a person.

Machine learning's exponential rise in recent years has


revolutionized several industries, from data analysis to
artificial intelligence. Computers are now able to learn from
data, recognize patterns, and forecast the future thanks to
machine learning algorithms. Because of its adaptability and
exceptional performance across a wide range of
applications, Random Forests has become one of the most
well-known and effective algorithms among the many others
that are currently accessible.
Fig 1 Linear Regression

IJISRT23MAY2009 www.ijisrt.com 2504


Volume 8, Issue 5, May – 2023 International Journal of Innovative Science and Research Technology
ISSN No:-2456-2165
Linear regression can be expressed mathematically as:
y= β0+ β 1x+ ε

Since the Linear Regression algorithm represents a


linear relationship between a dependent (y) and one or more
independent (y) variables, it is known as Linear Regression.
This means it finds how the value of the dependent variable
changes according to the change in the value of the
independent variable. The relation between independent and
dependent variables is a straight line with a slope.

III. LOGISTIC REGRESSION

To estimate discrete values (often binary values such


as 0/1) from a set of independent variables, logistic
regression is used. By fitting the data to a logit function, it
helps predict the probability of an event. Also known as
logit regression.

These methods listed below are often used to help


improve logistic regression models

 Include Interaction Terms


 Eliminate Features
 Regularize Techniques
 Use A Non-Linear Model

Fig 3 Decision Tree

V. SVM (SUPPORT VECTOR MACHINE)


ALGORITHM

SVM algorithm is a method of a classification


algorithm in which you plot raw data as points in an n-
dimensional space (where n is the number of features you
have). The value of each feature is then tied to a particular
coordinate, making it easy to classify the data. Lines called
classifiers can be used to split the data and plot them on a
graph.

Fig 2 Logistic Regression Models

IV. DECISION TREE

Decision Tree algorithm in machine learning is one of


the most popular algorithm in use today; this is a supervised
learning algorithm that is used for classifying problems. It
works well in classifying both categorical and continuous
dependent variables. This algorithm divides the population
into two or more homogeneous sets based on the most
significant attributes/ independent variables.

Fig 4 SVM (Support Vector Machine) Algorithm

IJISRT23MAY2009 www.ijisrt.com 2505


Volume 8, Issue 5, May – 2023 International Journal of Innovative Science and Research Technology
ISSN No:-2456-2165
VI. NAIVE BAYES ALGORITHM

An assumption made by a Naive Bayes classifier is


that the existence of a feature in a class has no bearing on
the presence of other features.

A Naive Bayes classifier would consider each of these


features individually when determining the likelihood of a
certain outcome, even though these attributes are related to
each other.

A naive Bayesian model is simple to build and


efficient for large datasets. It is known to outperform the
most complex categorization techniques while remaining
basic.

Fig 6 KNN (K- Nearest Neighbors) Algorithm

VIII. K-MEANS

It is an unsupervised learning technique that solves


clustering problems. The datasets are divided into a number
of clusters - let's call it K - in such a way that the data points
in each cluster are homogeneous and distinct from those in
the other clusters.

 How K-Means Forms Clusters:

 For each cluster, the K-means algorithm selects k


centroids, or points.
 Each data point forms a cluster or K clusters based on
its closest centroids.
 Now creates new centroids based on existing cluster
members.
 These updated centroids are used to calculate the closest
Fig 5 Naive Bayes Algorithm distance for each data point. This procedure is repeated
until the centroids do not change.
VII. KNN (K- NEAREST NEIGHBORS)
ALGORITHM IX. RANDOM FOREST ALGORITHM
This method can be used to solve problems related to A group of decision trees is called a "random forest".
classification and regression. Solving categorization Each tree has a class assigned to it, and each tree "votes" for
problems seems to be a growing trend in the data science that class when a new object is classified based on its
industry. This is a simple technique that preserves all properties. The forest selects the tree with the most votes
instances that already exist after ranking new instances by among all the trees in the forest.
gaining approval from at least k of their neighbors. A case is
given to the class whose case has the most characteristics.  Each Tree is Planted & Grown as follows:

This calculation is performed using a distance  A random sample of N cases is selected if the training
function. KNN is simple to understand compared to reality. set has N occurrences. The training set for the tree will
For example, it makes sense to talk with a person's friends be this example.
and colleagues if you want to know more about the.  Each node is divided using the best division of a given
number mm if there are M input variables. m variables
 Things to Consider before Selecting K Nearest from M are randomly selected at each node. The value
Neighbours Algorithm: of m remained constant during this process.
 Each tree reaches the largest possible size. There is no
 The computational cost of KNN is high. size.
 Variables should be normalized to avoid biasing the
algorithm from higher variables.
 Data pre-processing is always necessary.

IJISRT23MAY2009 www.ijisrt.com 2506


Volume 8, Issue 5, May – 2023 International Journal of Innovative Science and Research Technology
ISSN No:-2456-2165
X. DIMENSIONALITY REDUCTION  Reinforcement Learning:
ALGORITHMS Reinforcement learning has seen advancements in
training algorithms and applications in various domains,
In the modern world, businesses, governments and such as robotics and game playing. Notable advancements
research institutes store and analyze huge volumes of data. include:
Knowing that there is a wealth of information in this raw
data as a data scientist, your task is to find important  Deep Q-Networks (DQN):
patterns and variables. DQN combines deep neural networks with Q-learning,
enabling agents to learn directly from raw sensory input. It
You can identify relevant information using has achieved remarkable results in playing video games.
dimensionality reduction methods such as decision tree,
factor analysis, missing value ratio, and random forest.  Proximal Policy Optimization (PPO):
PPO is an algorithm that improves policy optimization
XI. GRADIENT BOOSTING ALGORITHM AND in reinforcement learning. It has shown stable and efficient
ADABOOSTING ALGORITHM learning in complex environments.

When processing huge amounts of data to create  Transfer Learning:


predictions with high accuracy, boosting techniques such as Transfer learning has gained attention as a technique
Gradient Boosting Algorithm and AdaBoosting Algorithm that allows models to leverage knowledge from pre-trained
are used. Boosting is an ensemble learning approach that models and apply it to new tasks with limited data. Recent
increases resilience by combining the predictive strength of advancements include:
many base estimators.
 Pre-Trained Language Models:
In other words, it constructs a strong predictor by Large-scale pre-trained language models, such as
combining a number of weak or average predictors. These GPT-3 and T5, have demonstrated impressive performance
boosting algorithms regularly succeed in data science on a wide range of natural language processing tasks. Fine-
competitions, such as Kaggle, AV Hackathon, and tuning these models with task-specific data has become a
CrowdAnalytix. These are currently the most popular common practice.
machine learning algorithms. Use them in conjunction with
Python and R codes to get accurate results.  Domain Adaptation:
Techniques for transferring knowledge across different
 Popular Machine Learning (ML) Algorithms New domains have improved, allowing models to generalize well
Advancements: from the source domain to the target domain with limited
As per my pragmatic understanding the Machine labeled data.
learning algorithms have been continuously evolving, and
several advancements have taken place in recent years. Here  Bayesian Deep Learning:
are some popular machine learning algorithms and their Bayesian methods in deep learning have been explored
advancements: to capture uncertainty and improve model robustness.
Advancements include:
 Deep Learning:
Deep learning, particularly deep neural networks, has  Variational Inference:
made significant advancements in various areas, such as Variational inference techniques have been applied to
computer vision, natural language processing, and speech deep learning models, enabling Bayesian inference with
recognition. Some notable advancements include: deep neural networks. Variational Autoencoders (VAEs)
and Bayesian Neural Networks (BNNs) are examples of this
 Transformer Models: advancement.
Transformer models, such as the Transformer
architecture and its variants (e.g., BERT, GPT, T5), have  Bayesian Optimization:
achieved state-of-the-art performance in language-related Bayesian optimization algorithms have been used to
tasks by leveraging self-attention mechanisms. tune hyperparameters of deep learning models efficiently,
reducing the need for manualhyperparameter tuning.
 Generative Adversarial Networks (GANs):
GANs have become more powerful and versatile, XII. SUMMARY
enabling the generation of highly realistic and high-
resolution images and videos. Progressive GANs and The article discusses various machine learning
StyleGAN are examples of advancements in this area. algorithms, including supervised learning, unsupervised
learning, regression, deep learning, and support vector
machines. These algorithms are widely used in the field of
machine learning for different purposes.

IJISRT23MAY2009 www.ijisrt.com 2507


Volume 8, Issue 5, May – 2023 International Journal of Innovative Science and Research Technology
ISSN No:-2456-2165
 Supervised learning involves training a model using [2]. Bharadiya, J. . (2023). A Comprehensive Survey of
labeled data, where the algorithm learns to make Deep Learning Techniques Natural Language
predictions based on input-output pairs. Regression, a Processing. European Journal of Technology, 7(1), 58
type of supervised learning, is specifically used for - 66. https://doi.org/10.47672/ejt.1473
predicting continuous numeric values. [3]. Bharadiya, J. . (2023). Convolutional Neural
 Unsupervised learning, on the other hand, deals with Networks for Image Classification. International
unlabeled data and focuses on discovering patterns and Journal of Innovative Science and Research
structures within the data. Clustering is a common Technology, 8(5), 673 - 677. https://doi.org/10.5281/
unsupervised learning technique that groups similar zenodo.7952031
data points together based on their inherent [4]. J. M. Keller, M. R. Gray, J. A. Givens Jr., “A Fuzzy
characteristics. KNearest Neighbor Algorithm”, IEEE Transactions
 Deep learning refers to the use of deep neural networks, on Systems, Man and Cybernetics, Vol. SMC-15, No.
which are artificial neural networks with multiple 4, August 1985
layers. Deep learning has gained popularity due to its [5]. M. Bkassiny, Y. Li, and S. K. Jayaweera, “A survey
ability to automatically learn hierarchical on machine learning techniques in cognitive radios,”
representations from complex data, such as images, IEEE Communications Surveys & Tutorials, vol.
text, and audio. 15, no. 3, pp. 1136–1159, Oct. 2012.
 Support Vector Machines (SVMs) are a type of [6]. Nallamothu, P. T., & Bharadiya, J. P. (2023).
supervised learning algorithm that is effective for both Artificial Intelligence in Orthopedics: A Concise
classification and regression tasks. SVMs find the best Review. Asian Journal of Orthopaedic Research,
hyperplane that separates different classes in the data, 6(1), 17–27. Retrieved from https://journalajorr.com/
maximizing the margin between them. index.php/AJORR/article/view/164
 These algorithms play a crucial role in various [7]. P. Harrington, “Machine Learning in action”,
applications of machine learning, such as image Manning Publications Co., Shelter Island, New York,
recognition, natural language processing, fraud 2012
detection, and recommendation systems. Each [8]. S. Marsland, Machine learning: an algorithmic
algorithm has its strengths and weaknesses, and the perspective. CRC press, 2015
choice of algorithm depends on the specific problem [9]. W. Richert, L. P. Coelho, “Building Machine
and the available data. Learning Systems with Python”, Packt Publishing
Ltd., ISBN 978-1-78216-140-0
XIII. CONCLUSION

Machine learning is both supervised and unsupervised.


If you have fewer data points with clearly identified training
data, choose supervised learning. Unsupervised.

For large data sets, learning would generally result in


better performance and better results. Consider using deep
learning techniques if you have a large collection of data
that is easily accessible. Also, you found out what is deep
reinforcement learning and reinforcement learning. Given
his greater understanding of neural networks, their uses and
their drawbacks. Several different machine learning
algorithms are examined in this paper. Today, machine
learning is used by everyone, whether they know it or not.
Update your profile photo on social networking sites to
receive product recommendations when you shop online.
This article provides introductions to the vast majority of
known machine learning methods.

REFERENCES

[1]. Bharadiya , J. P., Tzenios, N. T., & Reddy , M.


(2023). Forecasting of Crop Yield using Remote
Sensing Data, Agrarian Factors and Machine
Learning Approaches. Journal of Engineering
Research and Reports, 24(12), 29–44. https://doi.org/
10.9734/jerr/2023/v24i12858

IJISRT23MAY2009 www.ijisrt.com 2508

You might also like