Taxi Demand Prediction Using Ensemble Model Based on RNNs and XGBoost
Abstract: Taxis play an important role in urban transportation. Understanding future taxi demand gives an opportunity to organize the taxi fleet better. It also reduces the waiting time of passengers and the cruising time of taxi drivers. Although some works have been proposed to predict taxi demand, few studies consider the function of areas such as hospital areas, department store areas, residential areas, and tourist attractions. One predictive model may not fit all types of area. We use points of interest (POIs) to match taxi demand with a place in order to study the taxi demand in areas with different functions. In this paper, we investigate the best predictive models for forecasting hourly taxi demand in 7 types of area function. The models selected for the experiment are long short-term memory (LSTM), gated recurrent unit (GRU), and extreme gradient boosting (XGBOOST). Then, we propose an ensemble model that can forecast taxi demand well for all types of area function using the information from those machine learning models. We build the models on a real-world dataset generated by over 5,000 taxis in Bangkok, Thailand, over 4 months. The results show that the proposed ensemble model outperforms the other models overall.

Keywords: Taxi Demand Prediction; Time Series; Neural Networks; Spatio-temporal Data

I. INTRODUCTION

Where to find a passenger is one of the most important questions for all taxi drivers. The more time a taxi driver spends cruising for a new passenger, the higher the fuel consumption and the fewer passengers can be picked up. Inexperienced taxi drivers usually do not know where to pick up a new passenger, since they have no experience of taxi demand over time and space. Information about future taxi demand can be used to guide both inexperienced and experienced drivers to catch up with the taxi demand in the city faster. Hence, it helps to match demand with supply in taxi services.

With rich information from taxi GPS sensors, many researchers have used this digital footprint to discover knowledge that can help improve the transportation system. Therefore, many existing systems use taxi GPS trajectories as their transportation sensors. Anomaly detection is one of the most popular topics using taxi GPS data. Zhang et al. [1] applied the isolation forest method, which uses the "few and different" concept, to develop an anomaly detection system for taxi trajectories. Lv et al. [2] used a GPS trace to build a deep learning model that predicts the traffic flow of a city.

For taxi demand prediction, Moreira-Matias et al. [3] presented a model for predicting the number of services that will occur at taxi stands by applying the time-varying Poisson model and the autoregressive integrated moving average (ARIMA) model. Moreover, they used a sliding-window ensemble framework to produce a prediction by combining the predictions of each model. The dataset was generated from 441 vehicles with 63 taxi stands in the city of Porto. Davis et al. [4] proposed a taxi demand forecasting model using a multi-level clustering approach and data from a leading taxi booking application in the city of Bengaluru, India. They applied many linear forecasting models, such as the Holt-Winters (HW) model, Seasonal Naive, STL decomposition, ARIMA, and TBATS. The results showed that STL had the best performance. There are also studies [5], [6], [7] that use artificial neural networks to forecast taxi demand. Mukai and Yoden [5] applied artificial neural networks to predict the taxi demand in 25 regions of Tokyo. They divided the hours of a day into 6 intervals of 4 hours each. They reported that the areas with the highest error rate are those with many types of transportation, such as railway and subway, because the taxi demand there is small and non-periodic. Zhao et al. [6] compared a Markov predictor with a neural network predictor using datasets from yellow taxicabs and Uber in New York. The results showed that the Markov predictor achieves high accuracy in areas with high theoretical maximum predictability, while the neural network model performs better in areas with lower theoretical maximum predictability. Xu et al. [7] divided the whole of New York City into around 6,500 areas. Then, they applied a special kind of recurrent neural network, the long short-term memory recurrent neural
                    Models
Areas               LSTM     GRU      XGBOOST  Ensemble
Hospital            24.01%   24.84%   26.92%   24.52%
Subway              27.01%   27.44%   25.75%   25.92%
Tourist Attraction  25.39%   25.44%   24.10%   23.85%
All places          24.32%   24.82%   25.55%   24.02%

The average sMAPE of all models for each area type is shown in Table II. The XGBOOST approach outperforms both the LSTM and GRU models in the airport, department store, and tourist attraction areas, which are high taxi demand areas. The predictions of GRU are very close to those of LSTM, but the results show that LSTM still provides better predictions than GRU for all types of area. Both LSTM and GRU work very well in low taxi demand areas such as residential, education, and hospital areas. With an sMAPE of 24.32%, LSTM is the best independent predictive model for taxi demand prediction in this experiment. Although the ensemble model provides the lowest error rate only in the tourist attraction and department store areas, it produces results close to those of the best model in each type of area function. The average sMAPE of the ensemble model over all places is 24.02%, which is lower than that of all standalone models.

2) Comparing the predictions over a specific place

Fig. 10. One week prediction in the subway area

Figure 9 shows the comparison of the one-day prediction of each model with the actual demand at the airport, while Figure 10 shows the comparison of the one-week prediction in the subway area. Figures 9 and 10 illustrate that the model that has provided better predictions recently is likely to produce a more accurate prediction at the next time step. The ensemble model follows this trend. Therefore, it provides better predictions overall.

V. CONCLUSION

We proposed an ensemble model based on long short-term memory network (LSTM), gated recurrent unit network (GRU), and eXtreme gradient boosting (XGBOOST) models to predict taxi demand in 7 types of area function in the city of Bangkok, Thailand. We use points of interest (POIs) to match the taxi demand with the studied areas. The model that has provided better predictions recently is likely to produce a more accurate prediction at the next time step. The ensemble model predicts taxi demand by following this trend. The results show that the ensemble model outperforms the standalone models, with an sMAPE of 24.02% over all areas. Among the independent models, LSTM provides the most accurate predictions in low taxi demand areas such as residential, hospital, and education areas. LSTM achieves better results than GRU in all types of area. The XGBOOST model gives better results than the others in high taxi demand areas such as department store, subway, and airport areas. This study shows that a single predictive model cannot provide the best prediction in all areas. Combining the predictions from various models can improve the overall performance.
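
The observation above, that the model with the better recent predictions tends to stay more accurate at the next time step, can be illustrated with a small sketch. This is not the ensemble described in the paper, whose construction is not restated in this section. It is a deliberately simple stand-in that, at each step, reuses the forecast of whichever model had the lowest mean absolute error over a short recent window. The function names (`smape`, `follow_recent_best`), the `window` parameter, and all numbers in the usage lines are hypothetical. The sMAPE variant used here, |a - p| / ((|a| + |p|) / 2), is one common definition and is an assumption, since several definitions exist.

```python
from typing import Dict, List, Sequence


def smape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Symmetric MAPE in percent (assumed variant, see the note above)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("series must be non-empty and of equal length")
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / 2.0
        # A step where both values are zero counts as a perfect prediction.
        total += 0.0 if denom == 0 else abs(a - p) / denom
    return 100.0 * total / len(actual)


def _mean_abs_error(actual, predicted, start, stop) -> float:
    """Mean absolute error over the half-open index range [start, stop)."""
    return sum(abs(actual[i] - predicted[i]) for i in range(start, stop)) / (stop - start)


def follow_recent_best(
    actual: Sequence[float],
    predictions: Dict[str, Sequence[float]],
    window: int = 3,
) -> List[float]:
    """Pick, at each step, the forecast of the recently most accurate model.

    Only observations strictly before step t are used to choose the model
    for step t, so no future information leaks into the choice.
    """
    names = list(predictions)
    combined: List[float] = []
    for t in range(len(actual)):
        if t == 0:
            # No history yet: fall back to the plain average of all models.
            combined.append(sum(predictions[n][0] for n in names) / len(names))
            continue
        start = max(0, t - window)
        best = min(names, key=lambda n: _mean_abs_error(actual, predictions[n], start, t))
        combined.append(predictions[best][t])
    return combined


# Hypothetical hourly pick-up counts and forecasts, for illustration only.
actual = [120, 135, 150, 160, 140, 110]
forecasts = {
    "lstm": [118, 130, 141, 150, 146, 118],
    "gru": [115, 128, 139, 149, 148, 121],
    "xgboost": [131, 140, 152, 158, 139, 104],
}
combined = follow_recent_best(actual, forecasts, window=2)
for name, series in {**forecasts, "ensemble": combined}.items():
    print(f"{name}: sMAPE = {smape(actual, series):.2f}%")
```

A rule like this can only match, never exceed, the best single model at a given step, which is consistent with the reported pattern of an ensemble that stays close to the best model in each area type rather than winning everywhere.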
ACKNOWLEDGMENT
This research is financially supported by Thailand Advanced
Institute of Science and Technology (TAIST), National Science
and Technology Development Agency (NSTDA), Tokyo
Institute of Technology, Sirindhorn International Institute of
Technology, Thammasat University under the TAIST Tokyo
Tech Program and partially supported by the Center of Excellence in
Intelligent Informatics, Speech and Language Technology and
Service Innovation (CILS). The real-world taxi GPS dataset is
supported by Toyota Tsusho Electronics (Thailand) Co., Ltd.
Fig. 9. One day prediction in the airport
REFERENCES
[1] D. Zhang, N. Li, Z.-H. Zhou, C. Chen, L. Sun, and S. Li,
"IBAT: detecting anomalous taxi trajectories from GPS traces," in 13th
conference on ubiquitous computing, UbiComp, 2011.
[2] Y. Lv, Y. Duan, "Traffic flow prediction with big data: A deep learning
approach," IEEE Trans. Intell. Transp. Syst., vol. 16, no. 2, pp. 865-873,
2015.
[3] Moreira-Matias, L., et al., "Predicting Taxi–Passenger Demand Using
Streaming Data," IEEE Transactions on Intelligent Transportation
Systems, vol. 14, no. 3, pp. 1393-1402, 2013.
[4] N. Davis, G. Raina, and K. Jagannathan, "A multi-level clustering
approach for forecasting taxi travel demand," in Proc. IEEE ITSC, Dec,
pp. 223-228, 2016.
[5] Naoto Mukai and Naoto Yoden, "Taxi Demand Forecasting Based on
Taxi Probe Data by Neural Network," Smart Innovation, Systems and
Technologies, 2012.
[6] Kai Zhao, Denis Khryashche, Juliana Freire, Cláudio Silva, and Huy,
"Predicting Taxi Demand at High Spatial Resolution: Approaching the
Limit of Predictability," in IEEE International Conference on Big Data,
2016.
[7] Jun Xu et al., "Real-Time Prediction of Taxi Demand Using
Recurrent Neural Networks," IEEE Trans. Intell. Transp. Syst., vol. pp,
no. 99, pp. 1-10, 2017.
[8] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural
Comput., vol. 9, no. 8, pp. 1735–1780, 1997.
[9] K. Cho, B. van Merrienboer, C. Gulcehre, F. Bougares, H. Schwenk, D.
Bahdanau, Y. Bengio, "Learning phrase representations using RNN
encoder-decoder for statistical machine translation", arXiv preprint arXiv:
1406.1078, 2014.
[10] Tianqi Chen, Carlos Guestrin “XGBoost: A Scalable Tree Boosting
System”, arXiv: 1603.02754v3 [cs.LG] 10 Jun 2016.
[11] Friedman, Jerome H. "Greedy function approximation: a gradient
boosting machine." Annals of statistics (2001): 1189-1232.
[12] Chollet, François, Keras, (2015), GitHub repository,
https://github.com/keras-team/keras