Advances in Parallel Computing Algorithms, Tools and Paradigms

D.J. Hemanth et al. (Eds.)

© 2022 The authors and IOS Press.
This article is published online with Open Access by IOS Press and distributed under the terms
of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0).
doi:10.3233/APC220071

Ola Data Analysis for Dynamic Price Prediction Using Multiple Linear Regression and Random Forest Regression

G. Venkat Sai Tarun (a) and P. Sriramya (b, 1)
(a) Research Scholar, Dept. of CSE, Saveetha School of Engineering,
(b) Professor, Dept. of AI&DS, Saveetha School of Engineering,
SIMATS, Chennai, Tamilnadu, India

Abstract. This research aims to build an efficient and accurate cab fare
prediction system using two machine learning algorithms, multiple linear
regression and random forest regression, and compares them on r-squared,
Mean Square Error (MSE), Root Mean Square Error (RMSE), and Root Mean
Squared Logarithmic Error (RMSLE). Multiple linear regression was taken as
group 1 and random forest as group 2; both groups were used to predict prices,
and their accuracy was compared. The algorithm should be efficient enough to
produce the fare amount of the trip before the trip starts. The sample size for
implementing this work was N=10 for each group considered. The sample size
was calculated with ClinCalc and estimated using G-power, with the pretest
power kept at 80%. Based on the statistical analysis, the significance values
for r-squared and MSE were 0.945 and 0.266 (p>0.05), respectively. Multiple
linear regression gives a slightly better result, with a mean r-squared of
71.69%, while the random forest algorithm has a mean r-squared of 71.29%.
Through this, prediction is made for online booking of cabs or taxis, and
multiple linear regression gives a slightly better r-squared value than the
random forest algorithm.

Keywords. Multiple Linear regression, Random Forest regression, Fare prediction,
Novel exploratory data analysis, Machine Learning.

1. Introduction

This research aims to predict the fare amount for online cab services using a
multiple linear regression algorithm and to compare its r-squared value with that
of a random forest algorithm [1]. The primary importance of this study lies in
predicting the prices of online cab services, so that the cost of a trip can be
shown before the trip begins. This process is called price prediction: the fare of
the trip is calculated from the given values of the attributes.
The attributes are the central inputs to the prediction; they include location,
date-time, passenger count, and the existing fare amount [2]. The existing fare
amount is updated through the program according to conditions such as the
weather and day or night, which affect the fare. Multiple linear regression, the
proposed algorithm, is used in a range of research applications [3]. It relates
two or more independent variables to a dependent variable and identifies
statistically significant relations. Under certain conditions, multiple linear
regression allows us to derive projected values for specific variables [4].

1
P. Sriramya, Dept. of AI&DS, Saveetha School of Engineering, Saveetha Institute of Medical And
Technical Sciences, Chennai, India. E-mail: [email protected].
According to the literature survey, more than 20,000 related articles were
published in Google Scholar and Scopus-indexed journals on machine learning. Many
existing reports predict cab fares with different approaches. The multiple
linear regression algorithm is one of the best algorithms for prediction analysis. One
study focuses on a mobile application for stock prediction that uses improved
multiple linear regression (IMLR) to provide accurate stock price predictions. IMLR is
a hybrid technique that combines the moving average approach with multiple linear
regression. It analyzes the daily stock history and predicts stock prices [5].
In another research work, multiple linear regression is used to forecast and
explain the formation of Spanish day-ahead electricity prices. The accuracy of multiple
linear regression models is improved at the expense of complexity. A quantile
regression model based on gradient boosted regression trees is shown in [6]. The price
trend of maize has been forecasted using several linear regression analysis models
under big data, although the estimated price diverges from the actual price. The
forecast value is derived from the study of independent factors and was then used to
better predict corn prices [2]. In the investigation of multiple linear regression
predictor analysis for electricity price forecasting, error rates are considerably
reduced using multiple linear regression.

2. Materials and Methods

The research was performed in the Data Analytics Lab in the Department of Computer
Science and Engineering, Saveetha School of Engineering, Saveetha Institute of
Medical And Technical Sciences. The sample size taken for the experiment was 10.
Two groups of machine learning algorithms are considered for predicting the fare
amount. Group 1 is the random forest algorithm and group 2 is the multiple linear
regression algorithm; they are compared on r-square, MSE, RMSE, and RMSLE values
to choose the better algorithm. The existing work is random forest regression, and the
proposed work is multiple linear regression for price prediction. The total number of
samples evaluated with the proposed methodology is 75 in each of the two groups.
Attributes are a crucial part of predicting the fare. The required number of samples
was calculated with G-power [7]. The minimum analytical power is set at 0.8, while
the maximum allowed error is set to 0.5. Figure 1 depicts the work's intended
architecture.

Figure 1. Multiple linear regression architecture diagram.

2.1. Random Forest Algorithm

Random forest regression is the existing algorithm in this work. Novel exploratory
data analysis is applied to analyze the input data and summarize its main
characteristics. The training dataset goes through novel exploratory data analysis to
extract the main features. The random forest algorithm is a tree-based technique that
combines the predictions of numerous decision trees. Random forest is a supervised
method that takes a training dataset as input and produces predicted values. A single
decision tree has disadvantages such as low accuracy and inaccurate predictions;
these can be mitigated by using the random forest algorithm. In this algorithm, the
data is divided into three sets, and the program is executed in different ways to
measure accuracy [8].
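
The idea described above (many decision trees, each trained on a resampled copy of the data, with their predictions averaged) can be sketched in plain Python. This is a minimal illustration and not the authors' implementation: the function names, the tree depth, the number of trees, and the synthetic toy data are all assumptions made for the example.

```python
import random

def mean(values):
    return sum(values) / len(values)

def sse(values):
    """Sum of squared deviations from the mean (the split criterion)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(X, y, rng, max_depth=4, min_leaf=2, n_features=None):
    """Grow one regression tree; each split looks at a random feature subset."""
    if max_depth == 0 or len(y) < 2 * min_leaf or len(set(y)) == 1:
        return mean(y)  # leaf: predict the mean target of the samples
    all_features = list(range(len(X[0])))
    candidates = rng.sample(all_features, n_features or len(all_features))
    best = None  # (squared error, feature index, threshold)
    for f in candidates:
        for t in sorted({row[f] for row in X})[:-1]:
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            err = sse(left) + sse(right)
            if best is None or err < best[0]:
                best = (err, f, t)
    if best is None:
        return mean(y)  # no valid split among the sampled features
    _, f, t = best

    def grow(idx):
        return build_tree([X[i] for i in idx], [y[i] for i in idx],
                          rng, max_depth - 1, min_leaf, n_features)

    lo = [i for i, row in enumerate(X) if row[f] <= t]
    hi = [i for i, row in enumerate(X) if row[f] > t]
    return {"feature": f, "threshold": t, "left": grow(lo), "right": grow(hi)}

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, dict):
        go_left = row[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node

def fit_forest(X, y, n_trees=25, seed=0, **tree_args):
    """Train each tree on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 rng, **tree_args))
    return forest

def predict_forest(forest, row):
    """The forest prediction is the average of the tree predictions."""
    return mean([predict_tree(tree, row) for tree in forest])

# Synthetic toy data (not from the paper): fare = 2.5 + 1.5 * distance_km
X = [[float(d), 1 + d % 2] for d in range(1, 21)]  # [distance_km, passenger_count]
y = [2.5 + 1.5 * row[0] for row in X]
forest = fit_forest(X, y, n_trees=25, seed=0, n_features=1)
short_trip = predict_forest(forest, [3.0, 1])
long_trip = predict_forest(forest, [18.0, 1])
```

Because every leaf stores a mean of training targets, a forest prediction always lies between the smallest and largest fare seen in training, which is one reason tree ensembles cannot extrapolate beyond the observed fare range.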

2.2. Multiple Linear Regression Algorithm

The multiple linear regression algorithm is the proposed algorithm in this article.
The testing procedure consists of training the algorithms on the dataset and then
testing and evaluating them.
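
As a hedged illustration of the proposed technique, multiple linear regression models the fare as fare ≈ b0 + b1·x1 + ... + bp·xp and chooses the coefficients that minimize the squared error. The sketch below solves the normal equations directly in plain Python; the function names and the synthetic data are assumptions for the example, and the paper does not state which solver or library was used.

```python
def fit_linear_regression(X, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    rows = [[1.0] + list(r) for r in X]  # prepend a column of ones for the intercept
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; X^T X is singular")
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]  # [intercept, coef_1, ..., coef_p]

def predict(coefs, row):
    """Evaluate the fitted linear function for one feature row."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))

# Synthetic toy data (not from the paper):
# fare = 2.5 + 1.5 * distance_km + 0.5 * passenger_count
X = [[1.0, 1], [2.0, 2], [3.5, 1], [5.0, 3], [7.2, 2], [9.0, 1]]
y = [2.5 + 1.5 * d + 0.5 * p for d, p in X]
coefs = fit_linear_regression(X, y)
estimated_fare = predict(coefs, [4.0, 2])
```

On noise-free data like this the fit recovers the generating coefficients; on real fares the coefficients are only the best linear approximation.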

Table 1. Sample Input Dataset

Fare_amount  Pickup_datetime          Pickup_longitude  Pickup_latitude  Dropoff_longitude  Dropoff_latitude  Passenger_count
4.5          2009-06-15 17:26:21 UTC  -73.844311        40.721319        -73.84161          40.712278         1
16.9         2010-01-05 16:52:16 UTC  -74.016048        40.711303        -73.979268         40.782004         1
5.7          2011-08-18 00:35:00 UTC  -73.982738        40.76127         -73.991242         40.750562         2
7.7          2012-04-21 04:30:42 UTC  -73.98713         40.733143        -73.991567         40.758092         1
5.3          2010-03-09 07:51:00 UTC  -73.968095        40.768008        -73.956655         40.783762         1
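
Raw coordinates and timestamps such as those in Table 1 are usually turned into numeric features before a regression model is fitted. The sketch below shows one common way to do this; the choice of features (great-circle distance, hour, weekday) and the lower-case dictionary keys are assumptions for illustration, since the paper does not list its derived features.

```python
import math
from datetime import datetime

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two (longitude, latitude) points."""
    earth_radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def make_features(row):
    """Turn one Table 1 row (a dict) into numeric features; the set is illustrative."""
    ts = datetime.strptime(row["pickup_datetime"], "%Y-%m-%d %H:%M:%S UTC")
    return {
        "distance_km": haversine_km(
            row["pickup_longitude"], row["pickup_latitude"],
            row["dropoff_longitude"], row["dropoff_latitude"],
        ),
        "hour": ts.hour,
        "weekday": ts.weekday(),  # Monday = 0
        "passenger_count": row["passenger_count"],
    }

# First data row of Table 1
first_row = {
    "fare_amount": 4.5,
    "pickup_datetime": "2009-06-15 17:26:21 UTC",
    "pickup_longitude": -73.844311,
    "pickup_latitude": 40.721319,
    "dropoff_longitude": -73.84161,
    "dropoff_latitude": 40.712278,
    "passenger_count": 1,
}
features = make_features(first_row)
```

For the first row this gives a trip of roughly one kilometre starting in the 17:00 hour, which is consistent with its low fare of 4.5.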

2.3. Statistical Analysis

For statistical comparisons of metrics such as r-squared and MSE, SPSS version 21 was
employed. The dependent attribute is the fare amount; pickup datetime, pickup
longitude, pickup latitude, dropoff longitude, dropoff latitude, and passenger count
are the independent attributes, and they appear in both data sets. The r-squared
value and Mean Square Error of the two algorithms were compared using an
independent sample t-test (p>0.05). Multiple linear regression has a higher mean
r-square value (71.69%) than random forest regression (71.29%), and the mean square
error of multiple linear regression (54.54%) is lower than that of random forest
(57.40%). Although the difference in accuracy is marginal, multiple linear regression
performs better than random forest regression in this comparison.

Figure 2. Comparative analysis of Test and training data for the performance evaluation parameters r square,
MSE, RMSE and RMSLE

Table 2. Comparison of the performance evaluation metrics for training data values achieved.

Metric    Multiple linear  Random Forest
MSE       4.017            5.724
RMSE      2.004            2.392
RMSLE     0.191            0.227
R square  0.796            0.712
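
The four evaluation metrics reported in Table 2 follow standard definitions, sketched below in plain Python. The predicted fares in the example are hypothetical values chosen for illustration (the true fares are the five in Table 1); they are not model outputs from the paper. RMSE is the square root of MSE, and the Table 2 values agree with that relation (the square root of 4.017 is about 2.004).

```python
import math

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: the square root of MSE, in fare units."""
    return math.sqrt(mse(y_true, y_pred))

def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error; requires non-negative values."""
    squared_log_errors = [
        (math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_errors) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

y_true = [4.5, 16.9, 5.7, 7.7, 5.3]  # fares from Table 1
y_pred = [5.0, 15.0, 6.0, 8.0, 6.0]  # hypothetical predictions, for illustration only

scores = {
    "MSE": mse(y_true, y_pred),
    "RMSE": rmse(y_true, y_pred),
    "RMSLE": rmsle(y_true, y_pred),
    "R square": r_squared(y_true, y_pred),
}
```

Lower MSE, RMSE and RMSLE indicate a better fit, while a higher R square indicates that more of the variance in the fare is explained.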

3. Results

In this study, we observed that multiple linear regression has a slightly better
r-squared value than random forest. On the training data, the r-squared and MSE
values of the two algorithms are close (Table 2), and the MSE of multiple linear
regression (4.017) is lower than that of random forest (5.724). Figure 2 gives a
comparative analysis of training data for the performance evaluation parameters
r-square, MSE, RMSE and RMSLE.

Figure 3. Comparative analysis of Training data for the performance evaluation parameters r square, MSE,
RMSE and RMSLE.

Figure 3 gives a comparative analysis of test data for the performance evaluation
parameters r-square, MSE, RMSE and RMSLE. From Table 3, it is observed that
multiple linear regression has a slightly higher r-square value than random forest on
the testing data (0.734 versus 0.708). On the testing data, the MSE of multiple linear
regression (5.601) is also slightly lower than that of random forest (5.651). Since the
testing data is what counts for the results, these figures indicate that multiple linear
regression predicts the price slightly more accurately.

Table 3. Comparison of the performance evaluation metrics for testing data values achieved.

Metric    Multiple linear  Random Forest
MSE       5.601            5.651
RMSE      2.366            2.377
RMSLE     0.220            0.225
R square  0.734            0.708
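
Comparing the training figures in Table 2 with the testing figures in Table 3 shows how much each model's performance changes on unseen data. The short sketch below computes that difference from the reported values; the variable and function names are illustrative.

```python
# Values copied from Table 2 (training data) and Table 3 (testing data)
train = {
    "Multiple linear": {"MSE": 4.017, "R square": 0.796},
    "Random Forest": {"MSE": 5.724, "R square": 0.712},
}
test = {
    "Multiple linear": {"MSE": 5.601, "R square": 0.734},
    "Random Forest": {"MSE": 5.651, "R square": 0.708},
}

def train_test_gap(train_scores, test_scores):
    """Return test minus train for every model and metric."""
    return {
        model: {
            metric: round(test_scores[model][metric] - value, 3)
            for metric, value in metrics.items()
        }
        for model, metrics in train_scores.items()
    }

gap = train_test_gap(train, test)
```

With these figures the MSE of multiple linear regression rises by about 1.58 from training to testing, while the random forest values change very little, so the margin between the two models is much smaller on the testing data than on the training data.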

A brief descriptive statistical analysis was performed to obtain the Mean, Std.
Deviation and Std. Error Mean of the r-squared and MSE values of the multiple linear
regression and random forest algorithms, presented in Table 4. An independent
sample t-test was performed with a fixed confidence level to obtain the t-test for
Equality of Means, presented in Table 5.

Table 4. Group Statistics: Comparison of the Random Forest and Multiple linear algorithms on r-square and
MSE. Multiple linear has a mean r-square of 71.69 and Random Forest has a mean r-square of 71.29.

Metric    Algorithm        N   Mean   Std. Deviation  Std. Error Mean
r-square  Multiple linear  10  71.69  .486            .153
r-square  RF               10  71.29  .467            .147
MSE       Multiple linear  10  54.54  1.172           .370
MSE       RF               10  57.40  .843            .266


Table 5. Independent Sample T test for the two groups Multiple linear and Random Forest algorithms.
[significance is 0.945 (r-square) and 0.266 (MSE), p>0.05]

Metric    Variances            F      Sig.  t       df      Sig. (2-tailed)  Mean Difference  Std. Error Difference
r-square  Equal assumed        .005   .945  1.865   18      .039             .398             .280
r-square  Equal not assumed                 1.865   17.973  .039             .398             .280
MSE       Equal assumed        1.317  .266  -6.248  18      .001             -2.85            .934
MSE       Equal not assumed                 -6.248  16.348  .001             -2.85            .934
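
The t statistics and degrees of freedom in Table 5 can be approximately reproduced from the group statistics in Table 4. The sketch below uses the standard two-sample formulas; it is an illustrative check and not the SPSS procedure itself, and because Table 4 reports rounded means and standard deviations the results are close to, but not identical with, the tabulated values.

```python
import math

def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic and Welch degrees of freedom from summary statistics.

    With equal group sizes the pooled-variance and Welch t statistics coincide;
    only the degrees of freedom differ (n1 + n2 - 2 versus the Welch value).
    """
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    standard_error = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / standard_error
    df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df_welch

# Group statistics from Table 4 (multiple linear first, random forest second)
t_r2, df_r2 = t_from_summary(71.69, 0.486, 10, 71.29, 0.467, 10)
t_mse, df_mse = t_from_summary(54.54, 1.172, 10, 57.40, 0.843, 10)
df_equal_variances = 10 + 10 - 2  # 18, as in the "equal variances assumed" rows
```

The computed values are about 1.88 for r-square and about -6.26 for MSE, against 1.865 and -6.248 in Table 5, and the Welch degrees of freedom come out close to the tabulated 17.973 and 16.348.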

4. Discussion

Multiple linear regression is a machine learning approach that maps numeric inputs to
numeric outputs by fitting a linear function to the data points. With a single input this
is the line that best fits the points on a plot; the fitted function is then used to forecast
output values for inputs that are not present in the data set, in the expectation that
those outputs will fall close to it. Multiple linear regression is slightly better than the
random forest algorithm in terms of explaining model variation and the relative
contribution of each independent variable to the total variance, with a significance
value of 0.945. The average r-square value of multiple linear regression is 71.69%.
The article on a multiple linear regression model for predicting bidding price
estimates accuracy from statistics of the validation data, using r-square and MAPE as
estimators of the model [9]. Multiple linear regression is described as one of the best
machine learning algorithms for prediction [2]. Similar findings report that multiple
linear regression provides more accurate predictions and is capable of capturing a
decent amount of variation [10]. No opposing findings were observed for this work.
The limitation of this work is that an overfitted model performs worse on the
testing dataset. Although the study's outcomes are marginally superior in both
experimental and statistical analysis, the work has limitations. Multiple linear
regression cannot produce accurate values when the data points do not follow a
linear relationship. It considers the effect of more than one explanatory variable on an
outcome. As a result, future work could include improving the algorithm so that it can
handle dynamic ride-sharing during peak traffic hours. To deal with this complexity,
deep neural networks can be used.

5. Conclusion

This research work proposed a method for cab fare prediction using machine learning
techniques, and the results showed slightly better accuracy for producing a
near-accurate estimate. Based on the significance value (0.945) obtained through
SPSS, the mean accuracy (r-squared) of multiple linear regression is 71.69% and that
of random forest is 71.29%. The mean square error of multiple linear regression is
lower than that of the random forest algorithm. Thus, multiple linear regression has
slightly better accuracy than the random forest algorithm.

References

[1] Politis DN. Model-Based Prediction in Regression. Model-Free Prediction and Regression 2015; 33–56.
[2] Ulgen T, Poyrazoglu G. Predictor Analysis for Electricity Price Forecasting by Multiple Linear Regression. 2020 International Symposium on Power Electronics, Electrical Drives, Automation and Motion (SPEEDAM). Epub ahead of print 2020. DOI: 10.1109/speedam48782.2020.9161866.
[3] Smith PF, Ganesh S, Liu P. A comparison of random forest regression and multiple linear regression for prediction in neuroscience. Journal of Neuroscience Methods 2013; 220: 85–91.
[4] Roback P, Legler J. Review of Multiple Linear Regression. Beyond Multiple Linear Regression 2021; 1–38.
[5] Izzah A, Sari YA, Widyastuti R, et al. Mobile app for stock prediction using Improved Multiple Linear Regression. 2017 International Conference on Sustainable Information Engineering and Technology (SIET). Epub ahead of print 2017. DOI: 10.1109/siet.2017.8304126.
[6] Sharma N. XGBoost. The Extreme Gradient Boosting for Mining Applications. GRIN Verlag, 2018.
[7] Yuen KC, Zhu L, Zhang D. Lifetime Data Analysis. 2002; 8: 401–412.
[8] Xu R. Improvements to random forest methodology. DOI: 10.31274/etd-180810-3436.
[9] Noi P, Degener J, Kappas M. Comparison of Multiple Linear Regression, Cubist Regression, and Random Forest Algorithms to Estimate Daily Air Surface Temperature from Dynamic Combinations of MODIS LST Data. Remote Sensing 2017; 9: 398.
[10] Kaushal A, Shankar A. House Price Prediction Using Multiple Linear Regression. SSRN Electronic Journal. DOI: 10.2139/ssrn.3833734.
