A Study On Regression Algorithm in Machine Learning
ISSN No:-2456-2165
Abstract:- In this fast-moving world, vast amounts of data and information exist and are accessible to all. Extracting exactly the data that is required from those collections is what makes accurate prediction possible. ML plays a vital role in converting data into knowledge. People interact with ML every day, and from each interaction it continually learns and improves. Regression is an important part of ML: it determines the relationship among variables. This paper provides a study of regression algorithms such as Linear Regression, Support Vector Machine and Random Forest, along with their strengths and weaknesses.

Keywords:- AI, ML, Regression, Support Vector Machine, Linear regression, Random Forest.

B. Regression
Regression is one of the basic capabilities of machine learning. It is a form of supervised learning. It discovers relationships among variables and provides output in terms of numbers rather than classes. Hence, it is useful for number-based prediction problems such as stock market prices, student test performance, or the temperature on a given day.

There are various regression algorithms that perform the task efficiently and produce accurate results. They are: Linear Regression, Logistic Regression, K Nearest Neighbors, Decision Trees, Support Vector Machine, Random Forest and Naive Bayes.

Each regression algorithm has its own features. Based on the prepared data, a regression algorithm is selected and trained, and a machine learning model is then generated.

II. LINEAR REGRESSION

Fig 1:- Linear Regression
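
To make the idea of a numeric output concrete, the following is a minimal sketch of a one-variable linear regression fitted by ordinary least squares. The function name `fit_line` and the hours/scores data are hypothetical, chosen only for illustration; they do not come from this study.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature.

    Returns (intercept, slope) of the line y = intercept + slope * x.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data (made up for illustration): hours studied vs. test score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]

intercept, slope = fit_line(hours, scores)
# The model returns a number (a predicted score), not a class label.
predicted = intercept + slope * 6.0
print(intercept, slope, predicted)
```

The fitted line is then evaluated at an unseen input (6 hours) to produce a numeric prediction, which is what distinguishes regression from classification.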
Random Forest is a collection of models. It consists of multiple decision trees that are combined (as shown in Fig. 4) to form a strong model that carries out classification and regression. The resulting model is more robust and accurate, and handles overfitting better, than the basic models. For classification, the output is decided by majority voting among the trees; for regression, the mean of the trees' outputs is taken.

The Random Forest model is good at handling tabular data and categorical features. It captures non-linear interactions between the features and the target. Tree-based models are not designed to work with very sparse features. When dealing with sparse input data, the sparse features can be pre-processed to generate numerical statistics, or a linear model can be used instead, which suits such scenarios better.

Strengths:
Robust, accurate and powerful model.
Reduces overfitting and variance.
Works well with categorical and continuous variables.
Supports implicit feature selection and derives feature importance.
Weaknesses:
High computational cost when the forest becomes large.
Slow in prediction.

Machine learning techniques are used to automatically find the underlying patterns within complex data that would otherwise be hard to discover. These hidden patterns, together with knowledge about the problem, make future event prediction and complex decision making possible. Machine learning converts data into knowledge. Classification and regression are the most important parts of ML. Regression discovers models based on the given dataset. In this paper, some regression techniques are described along with their strengths and weaknesses. Each type of model has its own special features. The optimal result depends on selecting the regression algorithm that best suits the data. In future, by combining various regression techniques, we may be able to derive a new regression technique that works in a short time with high accuracy even for large datasets.

REFERENCES