Big Data Analytics For Logistics and Transportation: Conference Paper
3 authors:
Adel M. Alimi
Ecole Nationale d'Ingénieurs de Sfax
All content following this page was uploaded by Abdelkarim Ben Ayed on 13 August 2015.
Abstract- Nowadays, the logistics industry faces many challenges, mainly with the integration of e-commerce and new sources of data such as smartphones, sensors, GPS and other devices. These new data sources generate a huge quantity of unstructured data every day; to deal with such complex [...]

Keywords- logistic; transportation; big data analytics.

I. INTRODUCTION

Social networks, smartphones, tablets, GPS devices, sensors, log files, and many other devices and sources generate a large quantity of unstructured data every second. Moreover, the quantity of data created each year is much larger than ever before, which is why our age is called the age of information (see Fig. 1).

[Figure: chart of worldwide data created each year, 2010 to 2020. Source: International Data Corporation]
Fig. 1. Exponential growth in digital data during the current decade [1], p. 3.

All these data contain treasures of valuable information that could be very useful for governments as well as private companies for deep analysis, monitoring, decision making, improving quality of service, etc. However, most traditional data analysis tools, such as relational databases, are unable to store and manage such highly complex data, called big data.

Four aspects of complexity, called the 4Vs (Volume, Velocity, Variety and Veracity), mainly characterize big data. Volume refers to the huge size of the data, starting from one terabyte; Velocity to the high speed of data collection; Variety to the high diversity of data types and formats that need to be stored and analyzed together; and Veracity to the uncertainty of the data (see Fig. 2).

[Figure: axis "Variety"; labels "transactional data", "All enterprise data", "Complemented by external data"]
Fig. 2. Large variety of travel and transportation companies data [2], p. 3.

Big data analytics is used in many areas such as machine learning, computer vision, web statistics, medical applications, DNA analysis, data classification and clustering [3], and in public and private industry, including the logistics and transportation industry.

To deal with big data, companies, including those in the logistics and transportation industry, need dedicated tools called big data analytics. These tools make it possible to manage and analyze, efficiently and easily, the huge volume of data coming from road and vehicle sensors, GPS devices, customer applications and websites, etc.

Today, many big data analytics solutions are available, but the most widely used is the open source Apache Hadoop framework [4]. Hadoop uses a distributed storage and parallel computation model over a cluster of many commodity machines to handle big data easily.

The remainder of this paper is organized as follows. In section 2, we give a brief presentation of Apache Hadoop. In section 3, we present examples of projects using big data analytics for the logistics and transportation industry. In section 4, we present our proposed system for container code recognition using big data analytics. Finally, we present a conclusion and future perspectives in section 5.

II. APACHE HADOOP: BIG DATA ANALYTICS

To deal with increased storage and computation requirements, older systems rely on either scale-up or scale-out solutions. Scale-up solutions use a classic non-parallel architecture with improved resources, but they are very expensive (cost/performance) and limited by a technical barrier. Scale-out solutions use parallel architectures to improve computation resources at a lower cost, but with a much higher engineering effort. On the other hand, big data analytics solutions such as Apache Hadoop are based on a framework that abstracts most of the engineering effort caused by parallel architectures.

A. Presentation of Hadoop

[...] only of communication with user application and managing the other worker nodes. The tasks are sent to the data (rather than the data being sent to the worker machines), which improves system performance and, above all, bandwidth usage (see Fig. 5).
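
The idea that tasks are sent to the data, rather than data to the workers, can be illustrated with a minimal single-process sketch in the MapReduce style. This is not Hadoop's API: the node names, the `blocks` layout, the sample log lines and the `map_task` / `reduce_task` helpers are all hypothetical, and on a real cluster each map task would run on the machine that stores the block.

```python
from collections import Counter

# Hypothetical cluster state: each worker node already stores some blocks
# of vehicle log lines in the form "<vehicle id> <event>".
blocks = {
    "node1": ["truck42 speeding", "truck17 normal"],
    "node2": ["truck42 speeding", "truck08 lane_departure"],
}

def map_task(lines):
    """Runs where the block lives and returns only a small per-block summary."""
    return Counter(line.split()[1] for line in lines)

def reduce_task(partials):
    """Merges the small summaries produced by the map tasks."""
    total = Counter()
    for partial in partials:
        total += partial
    return total

# Only the compact partial counts travel over the network, not the raw blocks.
partials = [map_task(lines) for lines in blocks.values()]
event_counts = reduce_task(partials)
print(dict(event_counts))
```

The point of the sketch is the data flow: raw blocks stay on their nodes, and only the per-block counters are moved and merged, which is why this model saves bandwidth.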
[Figure: Hadoop layers: MapReduce (cluster resource management & data processing) on top of HDFS (redundant, ...)]
2015 4th IEEE International Conference on Advanced Logistics and Transport (ICALT)
Fig. 6. Locations of trucks with risk factor [8], p. 2368.

The data is collected from many sources (see Table I), such as GPS devices on 1,000 buses that send data every 20 seconds, bus timetables, cameras, and traffic sensors on roads, tramways, and bus lanes. Collecting such enormous amounts of varied, fast-moving data was not possible with traditional data sets. That is why the data is collected on clustered servers running IBM big data analytics to build a real-time digital map of the Dublin city transportation network.

Advanced analytics on the collected data helped to identify traffic problems and answer questions such as the optimal time to start bus lanes, the best places to add more bus lanes, etc.

The project helped the city to better monitor and manage traffic in real time, which accelerated decision making and improved traffic flow and mobility in the city.

As a future improvement, a predictive analytics solution will combine meteorological data with other data to ensure good traffic flow in extreme weather conditions.

C. City of Da Nang, Vietnam, Traffic Management System

This project was carried out with IBM smart city technologies in order to reduce traffic congestion and pollution in the city of Da Nang in Vietnam [9], [12].

The objective of the project is to build a water and transportation traffic management system able to cope with the fast-growing population of a city of more than a million inhabitants.

Sensors on Da Nang's buses, roads and highways collect data for the management system, which uses it to optimize traffic light synchronization and reduce traffic congestion.

In addition, data coming from ship sensors gives information about water status, such as turbidity, salinity, pH, chlorine and conductivity levels, which allows customers to receive appropriate alerts if necessary.

All this information is combined in a big data analytics system that provides a real-time summary of traffic status, events and incidents through maps and alerts for better management of the city transportation network.

The project reduced traffic congestion and energy consumption, created an efficient control and management system that supports the high growth of the city population, and resulted in better and safer mobility for commuters.

D. British Airways' Know Me Program

This big data analytics project, started in 2012 by the company Opera Solutions, aims to improve the quality of British Airways services for its customers. The objective is to understand clients' needs better than any competing airline [9], [13].

The project collected different types of information about 20 million customers via websites, smartphone and tablet applications, blog site ratings, likes on social media, conversations with call centers, etc.

After that, a big data analytics system stores and analyzes all the structured and unstructured data to identify customers' preferences, characteristics and problems, and to provide them with high-quality services.

E. City of Stockholm Real-Time Intelligent Transportation Services

This is another project based on IBM InfoSphere big data analytics; it aims to improve the quality of the transportation network in the city of Stockholm [14]. 120,000 taxis and trucks equipped with GPS devices were used to collect and send a large volume of data every second, which was combined with a map containing over 600,000 links (see Fig. 8).

The big data analytics system combined the collected data with past traffic data and weather forecasts to generate more accurate predictions about future traffic conditions, such as shortest-time routes in real time. The results served the public, police officers, firemen, urban planners, etc.

[Figure: pipeline stages: GPS data streams; real-time transformation logic; real-time geo mapping; real-time speed and heading estimation; real-time aggregates and statistics]
Fig. 8. Stockholm real-time monitoring of transportation network flow [14], p. 4.

F. Cloud-Enhanced System Architecture for Logistics Tracking Services

The goal of this project is to build a system that combines Internet of Things, SaaS cloud architecture and big data analytics technologies to set up efficient real-time monitoring of customers' cargo [16].

The data is collected from mobile phones: 2D QR codes, GPS locations and RFID electronic codes. The data is then sent over wireless networks, using the RSA encryption algorithm to protect customers' privacy. A big data analytics system using HBase as its database stores all this unstructured data (see Fig. 9 and Fig. 10).

Using a traditional computing system to handle such a project is not possible because of the large storage, computation and bandwidth requirements of such a large amount of unstructured logistics information.
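
The RSA step mentioned for the logistics tracking data can be illustrated with a toy, textbook version of the algorithm. This is only a sketch under strong assumptions: the tiny primes, the byte-by-byte encryption and the sample tracking record are hypothetical choices made for readability, and nothing here reflects how the cited system actually implements encryption.

```python
from typing import List

# Toy textbook RSA: tiny primes, no padding. Illustration only, NOT secure.
p, q = 61, 53                 # hypothetical small primes
n = p * q                     # public modulus
phi = (p - 1) * (q - 1)       # Euler's totient of n
e = 17                        # public exponent, coprime with phi
d = pow(e, -1, phi)           # private exponent: modular inverse (Python 3.8+)

def encrypt(message: bytes) -> List[int]:
    """Encrypt each byte separately with the public key (e, n)."""
    return [pow(byte, e, n) for byte in message]

def decrypt(cipher: List[int]) -> bytes:
    """Recover the original bytes with the private key (d, n)."""
    return bytes(pow(c, d, n) for c in cipher)

# Hypothetical tracking record: a QR code value plus a GPS fix.
record = b"QR=ABCU1234567;GPS=34.74,10.76"
cipher = encrypt(record)
assert decrypt(cipher) == record
```

A production system would rely on a vetted cryptographic library with proper key sizes and padding rather than on hand-written arithmetic like this.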
[Figure: flowchart steps: Start; Container code acquisition; Pre-processing the image; RGB to gray level transformation]
References
[1] J.M. Tien, "Big Data: Unleashing information", Journal Syst Sci Syst Eng (Jun 2013) 22(2), pp. 127-151.
[2] K. Wedgwood and R. Howard, "Big data and analytics in travel and transportation", IBM Big Data and Analytics White Paper, November 2014.
[3] A. Ben Ayed, M. Ben Halima and A. M. Alimi, "Survey on clustering methods: Towards fuzzy clustering for big data", 6th International Conference of Soft Computing and Pattern Recognition (SoCPaR), Tunis, Tunisia, 2014, pp. 331-336.
[4] http://hadoop.apache.org/
[5] http://en.wikipedia.org/wiki/Apache_Hadoop
[6] S. Ghemawat, H. Gobioff and S. T. Leung, "The Google file system", SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles, pp. 29-43, 2003.