Big Data Analytics for Logistics and Transportation

Conference Paper · May 2015


DOI: 10.1109/ICAdLT.2015.7136630

3 authors:

Abdelkarim Ben Ayed, University of Sfax
Mohamed Ben Halima, Higher Institute of Computer Science and Multimedia, University of Sfax
Adel M. Alimi, Ecole Nationale d'Ingénieurs de Sfax

All content following this page was uploaded by Abdelkarim Ben Ayed on 13 August 2015.


2015 4th IEEE International Conference on Advanced Logistics and Transport (ICALT)

Big Data Analytics for Logistics and Transportation


Abdelkarim Ben Ayed, Mohamed Ben Halima, Adel M. Alimi
REGIM-Lab.: REsearch Groups in Intelligent Machines, University of Sfax, ENIS, BP 1173, Sfax, 3038, Tunisia
{abdelkarim.benayed.tn, mohamed.benhlima, adel.alimi}@ieee.org

978-1-4799-8400-8/15/$31.00 ©2015 IEEE

Abstract-The logistics industry faces many challenges today, mainly due to the integration of e-commerce and of new data sources such as smartphones, sensors, GPS and other devices. These sources generate a huge quantity of unstructured data every day, and dealing with such complex data requires big data analytics tools. In this context, many recent works have addressed the integration of big data analytics in the logistics industry. In this paper, we review the latest applications of big data analytics in the logistics and transportation industry, and we propose a novel approach to detect and recognize container codes based on a Hadoop big data analytics system.

Keywords-logistics; transportation; big data analytics.

I. INTRODUCTION

Social networks, smartphones, tablets, GPS devices, sensors, log files, and many other devices and sources generate a large quantity of unstructured data every second. Moreover, the quantity of data created each year is much larger than in any previous year, which is why our age is called the age of information (see Fig. 1).

[Figure: chart of worldwide data created each year, 2010 to 2020. Source: International Data Corporation.]
Fig. 1. Exponential growth in digital data during the current decade [1], p. 3.

All these data contain valuable information that could be very useful to governments as well as private companies for in-depth analysis, monitoring, decision making, improving quality of service, etc. However, most traditional data analysis tools, such as relational databases, are unable to store and manage such complex data, called big data.

Four aspects of complexity, called the 4Vs (Volume, Velocity, Variety and Veracity), characterize big data. Volume refers to the huge size of the data, starting from one terabyte; Velocity to the high speed at which data is collected; Variety to the high diversity of data types and formats that must be stored and analyzed together; and Veracity to the uncertainty of the data (see Fig. 2).

[Figure: diagram relating the volume, velocity and variety of travel and transportation data.]
Fig. 2. Large variety of travel and transportation companies data [2], p. 3.

Big data analytics is used in many areas such as machine learning, computer vision, web statistics, medical applications, DNA analysis, data classification and clustering [3], and in public and private industry, including the logistics and transportation industry.

To deal with big data, companies, including those in the logistics and transportation industry, need dedicated tools called big data analytics. These tools make it possible to manage and analyze efficiently the huge volumes of data coming from road and vehicle sensors, GPS devices, customer applications, websites, etc.

Today, many big data analytics solutions are available, but the most widely used is the open source Apache Hadoop framework [4]. Hadoop uses distributed storage and a parallel computation model over a cluster of many commodity machines to handle big data.

The remaining sections of this paper are organized as follows. In Section 2, we give a brief presentation of Apache Hadoop. In Section 3, we present examples of projects using big data analytics in the logistics and transportation industry. In Section 4, we present our proposed system for container code recognition using big data analytics. Finally, we present a conclusion and future perspectives in Section 5.

II. APACHE HADOOP: BIG DATA ANALYTICS

To deal with increasing storage and computation requirements, older systems rely on either scale-up or scale-out solutions. Scale-up solutions use a classic non-parallel architecture with improved resources, but they are very expensive (in cost/performance terms) and limited by a technical barrier. Scale-out solutions use parallel architectures to improve
computation resources at a lower cost, but with a much higher engineering effort. Big data analytics solutions such as Apache Hadoop, on the other hand, are based on a framework that abstracts most of the engineering effort caused by parallel architectures.

A. Presentation of Hadoop

Apache Hadoop is an open source framework written in Java. It is designed to deal with very large data sets using clusters of commodity hardware. It has two main parts: a distributed storage part, the Hadoop Distributed File System (HDFS), and a processing part, the MapReduce programming model. Doug Cutting developed Hadoop in 2005 [5], based on the published Google File System (GFS) and Google MapReduce papers [6], [7].

B. Architecture of Hadoop Framework

Hadoop is composed of two main parts: a storage part managed by HDFS and a processing part managed by the MapReduce programming model or by higher-level programming languages (see Fig. 3).

[Figure: Hadoop 1.0 stack, with MapReduce (cluster resource management and data processing) on top of HDFS (redundant storage).]
Fig. 3. Hadoop Framework main components HDFS and MapReduce [15]

A typical Hadoop system is composed of a master server (with one or two backup mirrors) and many low-cost slave machines (up to thousands) running Linux. The master hosts the "Name Node" and "Job Tracker" components, which respectively manage the "Data Node" (storage task) and "Task Tracker" (processing task) components on the other machines.

C. Hadoop Distributed File System (HDFS)

The Hadoop framework uses a distributed, redundant storage system called HDFS, which stores files in blocks replicated on multiple machines. A main server (master node) manages data splitting and replication across the other chunk servers (worker nodes), which are used for both data storage and processing (see Fig. 4).

Fig. 4. Distributed file system architecture (e.g. Google File System) [17]

D. MapReduce programming model

Hadoop mainly uses MapReduce as a programming model to process large datasets. MapReduce is composed of two functions: "Map" divides a problem into smaller ones and "Reduce" combines the results. The user writes the Map and Reduce functions, and MapReduce takes care of all the details of distributed computation. The main server (master node) is not overloaded by computation: it is responsible only for communicating with the user application and managing the worker nodes. Tasks are sent to the data (rather than data being sent to the worker machines), which improves system performance and, above all, saves bandwidth (see Fig. 5).

[Figure: MapReduce data flow with a Map phase followed by a Reduce phase.]
Fig. 5. MapReduce architecture [18]

E. Other Hadoop elements

Besides programming directly in MapReduce, users can write their code in easier, higher-level programming languages that are translated automatically into Map and Reduce tasks, or use related storage components, such as:
• Hive: data warehouse language using SQL-92 queries.
• Pig: data-flow-oriented language using the Pig Latin programming language.
• HBase: a sparse database for storing large quantities of data.

F. Advantages of big data for logistics industry

Big data analytics solutions (e.g. Hadoop) have many advantages for the logistics and transportation industry, such as:
• Greater efficiency for large real-time logistics applications that collect data from very large networks of sensors and GPS devices
• Storage and processing of very large files
• Improved exploitation of both structured and unstructured logistics data
• Development of powerful logistics projects with smarter strategies based on collected and analyzed data
• Development of efficient real-time traffic monitoring applications
• Development of more accurate logistics predictions that improve service quality for customers and increase companies' revenues.
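
As an illustration of Sections II.C and II.D, the following minimal Python sketch imitates, in memory and on a single machine, how a file is split into replicated blocks and then processed with user-written Map and Reduce functions. It is not Hadoop code and does not use any Hadoop API: the record format (vehicle id and fuel value), the block size, the node names and all function names are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical input: one "vehicle_id,fuel_litres" record per line.
RECORDS = [
    "truck-1,4.0", "truck-2,3.5", "truck-1,2.0",
    "truck-3,5.0", "truck-2,1.5", "truck-1,1.0",
]


def split_into_blocks(records, block_size):
    """Cut the record list into fixed-size blocks (HDFS-style splitting)."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def place_replicas(num_blocks, nodes, replication):
    """Assign each block to `replication` distinct nodes, round-robin.

    Assumes replication <= len(nodes).
    """
    placement = {}
    for block_id in range(num_blocks):
        first = block_id % len(nodes)
        placement[block_id] = [nodes[(first + k) % len(nodes)] for k in range(replication)]
    return placement


def map_block(block):
    """Map: turn each record of one block into a (key, value) pair."""
    pairs = []
    for line in block:
        vehicle_id, fuel = line.split(",")
        pairs.append((vehicle_id, float(fuel)))
    return pairs


def reduce_by_key(pairs):
    """Reduce: combine all values that share a key (here, by summing them)."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_job(records, block_size=2):
    """Run the whole job: split, map every block, then reduce."""
    blocks = split_into_blocks(records, block_size)
    # On a real cluster, each map_block call would run on a node that
    # already stores a replica of the block (tasks are sent to the data).
    mapped = [pair for block in blocks for pair in map_block(block)]
    return reduce_by_key(mapped)


if __name__ == "__main__":
    blocks = split_into_blocks(RECORDS, 2)
    print(place_replicas(len(blocks), ["node-a", "node-b", "node-c"], replication=2))
    print(run_job(RECORDS))
```

Because every block is stored on more than one node, the loss of a single machine in this toy placement never makes a block unavailable, which is the purpose of replication in HDFS.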

III. BIG DATA ANALYTICS FOR LOGISTICS AND TRANSPORTATION INDUSTRY

Many big data analytics research projects have been carried out for the logistics and transportation industry to deal with the huge volumes of data coming from road and vehicle sensors, GPS devices, customer applications, websites, etc.

In the following, we present an overview and a comparison of some of these projects.

A. Real-time vehicles monitoring in India

The main objective of this project is to deploy modern technologies such as big data analytics and Hadoop in order to improve operational efficiency for logistics and transportation firms. These technologies help the firms' managers make better business decisions.

In this project, the authors collect data about fuel, speed, acceleration and GPS location coordinates from vehicles, using vehicle sensors and GPS devices, together with other data such as date, time, driver's id, etc., and then send this information in packets over wireless communication (GPRS) to clustered servers running Hadoop.

All this unstructured data comes from hundreds of vehicles that send packets every 2 seconds to an HDFS system, which stores it. The resulting terabytes of data are then analyzed weekly or monthly using the Hadoop analytics system in order to improve the transportation company's productivity and help reduce costs.

The analysis also makes it possible to monitor driving behavior and to answer questions such as: which vehicles are wasting fuel? Which drivers carry the highest risks? The results of the analysis are displayed as charts and graphs (see Fig. 6 and Fig. 7).

Fig. 6. Locations of trucks with risk factor [8], p. 2368.

Fig. 7. Graph showing locations of idling trucks [8], p. 2368.

B. City of Dublin, Public Transit System

This smart city project was a collaboration between IBM and Dublin City Council; it took place in Ireland from 2010 to 2013. IBM big data analytics helped the city of Dublin improve its public bus transportation network and reduce its growing traffic congestion problem [9], [10].

The main idea was to reduce congestion in Dublin's public bus transport network without making major modifications to the city's historic infrastructure.

TABLE I. SOURCES OF DATA FOR THE PROJECT: CITY OF DUBLIN [11], p. 7.

Municipal Data - City of Dublin

Data Source                  Description                                    Approx. Scale
Bus System                   GPS data, speed data, stop data, fare data     1000 buses, location reported every 20 seconds
Traffic Flow Data            Traffic light data, XML                        24 intersections
CCTV                         Real-time video stream                         Monitoring across the city
Road Weather Conditions      CSV                                            54 stations
Road Works and Maintenance   CSV                                            -
Dublin Event Data            Small attendee events, large attendee events   -

The data is collected from many sources (see Table I), such as bus GPS devices that send data from 1,000 buses
every 20 seconds, bus timetables, cameras and traffic sensors in roads, tramways, and bus lanes. Collecting such enormous amounts of varied and fast-arriving data was not possible with traditional data tools. That is why the data is collected in clustered servers running IBM big data analytics to build a real-time digital map of the Dublin city transportation network.

Advanced analytics on the collected data helped to identify traffic problems and to answer questions such as the optimal time to start bus lanes, the best place to add more bus lanes, etc.

The project helped the city to better monitor and manage traffic in real time, which accelerated decision making and improved traffic flow and mobility in the city.

As a future improvement to the project, a predictive analytics solution will combine meteorological data with other data to ensure good traffic flow in extreme weather conditions.

C. City of Da Nang, Vietnam, Traffic Management System

This project was carried out with IBM smart city technologies in order to reduce traffic congestion and pollution in the city of Da Nang in Vietnam [9], [12].

The objective of the project is to build a water and transportation traffic management system able to cope with the fast-growing population of a city of more than a million inhabitants.

Sensors on Da Nang's buses, roads and highways collect data for the management system, which uses it to optimize traffic light synchronization and reduce traffic congestion.

In addition, data coming from ship sensors gives information about water status, such as turbidity, salinity, pH, chlorine and conductivity levels, which allows customers to receive appropriate alerts if necessary.

All this information is combined in a big data analytics system that provides a real-time summary of traffic status, events and incidents through maps and alerts, for better management of the city transportation network.

The project reduced traffic congestion and energy consumption, created an efficient control and management system that supports the rapid growth of the city population, and resulted in better and safer mobility for commuters.

D. British Airways' Know Me Program

This big data analytics project, started in 2012 with the company Opera Solutions, aims to improve the quality of British Airways services for its customers. The objective is to understand clients' needs better than any competing airline [9], [13].

The project collected different types of information about 20 million customers via websites, smartphone and tablet applications, ratings on blog sites, likes on social media, conversations with call centers, etc.

After that, a big data analytics system stores and analyzes all the structured and unstructured data to identify customers' preferences, characteristics and problems, and to provide them with high-quality services.

E. City of Stockholm Real-Time Intelligent Transportation Services

This is another project based on IBM InfoSphere big data analytics; it aims to improve the quality of the transportation network in the city of Stockholm [14].

120,000 vehicles (taxis and trucks) equipped with GPS devices were used to collect and send a large amount of data every second, combined with a map containing over 600,000 links (see Fig. 8).

[Figure: real-time processing pipeline: GPS data streams, transformation logic, geo mapping, speed and heading estimation, aggregates and statistics.]
Fig. 8. Stockholm real-time monitoring of transportation network flow [14], p. 4.

The big data analytics system combined the collected data with past traffic data and weather forecasts to generate more accurate predictions about future traffic conditions, such as shortest-time routes in real time. The results served the public, police officers, firemen, urban planners, etc.

F. Cloud-Enhanced System Architecture for Logistics Tracking Services

The goal of this project is to build a system that combines Internet of Things, SaaS cloud architecture and big data analytics technologies to set up efficient real-time monitoring of customers' cargoes [16].

The data is collected from mobile phones: 2D QR codes, GPS locations and RFID electronic codes. The data is then sent through wireless networks, using the RSA encryption algorithm to protect customers' privacy. A big data analytics system using HBase as its database stores all this unstructured data (see Fig. 9 and Fig. 10).

Using a traditional computing system to handle such a project is not possible because of the large storage, computation and bandwidth requirements of such a large amount of unstructured logistics information.
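
The encryption step of the cargo tracking project (Section III.F) can be illustrated with a toy sketch. The project [16] is only reported here as using RSA; the key size, the message format and the function names below are assumptions, and the tiny primes make this example insecure by design. A real deployment would rely on a vetted cryptographic library, padded RSA and keys of 2048 bits or more.

```python
# Toy, unpadded ("textbook") RSA applied byte by byte. Illustration only.


def make_keys(p, q, e):
    """Build a public and a private key from two primes and a public exponent."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e modulo phi (Python 3.8+)
    return (e, n), (d, n)


def encrypt(message, public_key):
    """Encrypt a text message one byte at a time with the public key."""
    e, n = public_key
    return [pow(byte, e, n) for byte in message.encode("utf-8")]


def decrypt(cipher, private_key):
    """Recover the text message from the list of encrypted bytes."""
    d, n = private_key
    return bytes(pow(c, d, n) for c in cipher).decode("utf-8")


if __name__ == "__main__":
    public_key, private_key = make_keys(p=61, q=53, e=17)
    # Hypothetical tracking record: cargo id, latitude, longitude.
    record = "cargo-42;34.74;10.76"
    cipher = encrypt(record, public_key)
    print(decrypt(cipher, private_key) == record)
```

Only the holder of the private key can read the record, so a tracking message intercepted on the wireless network does not reveal the customer's cargo position.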

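
For the Stockholm project of Section III.E, the notion of a shortest-time route over a road map can be sketched with Dijkstra's algorithm, where each link carries a travel time that would in practice be estimated from recent GPS speeds. This is a generic illustration and not the method of [14]: the four-node graph, its travel times and the function name are invented for the example.

```python
import heapq


def shortest_time_route(links, source, target):
    """Dijkstra's algorithm on a dict {node: [(neighbour, minutes), ...]}.

    Returns (total_minutes, path), or (float("inf"), []) if unreachable.
    """
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        minutes, node = heapq.heappop(queue)
        if node == target:
            break
        if minutes > best.get(node, float("inf")):
            continue  # stale queue entry, a faster way was already found
        for neighbour, cost in links.get(node, []):
            candidate = minutes + cost
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                previous[neighbour] = node
                heapq.heappush(queue, (candidate, neighbour))
    if target not in best:
        return float("inf"), []
    # Walk back from the target to rebuild the route.
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return best[target], path[::-1]


# Hypothetical road links with current travel times in minutes.
LINKS = {
    "A": [("B", 4.0), ("C", 2.0)],
    "B": [("D", 5.0)],
    "C": [("B", 1.0), ("D", 8.0)],
    "D": [],
}

if __name__ == "__main__":
    print(shortest_time_route(LINKS, "A", "D"))
```

When congestion changes the travel time of a link, calling the function again on the updated dictionary yields the new fastest route, which is the sense in which such routes can be served in real time.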

[Figure: cloud computing platform linking resource providers and resource users.]
Fig. 9. Architecture overview of cloud-enhanced system architecture for logistics tracking services [16], p. 546.

Fig. 10. The cloud computing structure of cloud-enhanced system architecture for logistics tracking services [16], p. 547.

We can classify these transportation projects in two categories: projects that improve operational efficiency and projects that improve customer experience.

IV. PROPOSED BIG DATA SYSTEM FOR CONTAINERS CODE RECOGNITION

Ship transportation is nowadays a very active industry, with a very large number of containers transported every day. To supervise the delivery of all these containers, unique identifier codes are written on each container, but reading these codes manually raises many problems, such as slow speed and a high error rate.

Moreover, unlike vehicle license plate recognition systems, container-code recognition systems face additional challenges because of the low contrast between code and container and the large variety of sizes, colors, positions, spacing and alignment of these codes (see Fig. 11).

Fig. 11. Examples of containers codes [19], p. 2843.

According to [19], automatic container code recognition requires three steps: text detection, character extraction and, finally, text recognition. Because the text detection step is critical for the later steps, we chose a robust method that is insensitive to code contrast and other variable text characteristics: a texture-based text detection method that uses the Haar wavelet transform for text feature extraction and a Support Vector Machine (SVM) to classify these features into text and non-text regions [20].

The high computation time of the employed text detection method [20] and text recognition method [21] is a major problem for real-time and industrial applications, including the ship transportation industry. To overcome it, we decided to use Hadoop MapReduce as a parallel execution model that reduces the computation time of the proposed system.

First, the container code is captured using monitoring cameras or mobile devices and stored on the Hadoop Distributed File System (HDFS). After that, pre-processing and gray-level conversion steps are applied to the captured image. Next, we decompose the gray image into 20x20 pixel blocks. These blocks are analyzed and classified separately on different machines using the MapReduce programming model to extract the text regions. Next, another step separates the code characters from the extracted text regions. Then, Optical Character Recognition (OCR) is applied to the individual characters using the MapReduce programming model. Finally, the container code is recognized by merging these characters (see Fig. 12).

V. CONCLUSION

In this paper, we presented the strong interest of big data analytics for logistics companies, with examples of real projects.

To improve their performance and remain competitive, logistics and transportation companies must shift to big data analytics in order to deal quickly and efficiently with today's large variety of data. Big data analytics also has many benefits, such as traffic congestion reduction, vehicle and driver monitoring, improvement of customer services, etc.

Besides, powerful big data analytics software enhances routing for planes, trains and trucks, which reduces energy consumption in developed countries and decreases the impact of the transportation industry on the environment.

In addition, we proposed a real-time system for container-code recognition based on the Hadoop big data analytics solution.

As future work, we propose to apply a similar big data analytics project to our city, Sfax, in Tunisia, which suffers from a large and increasing population of more than one million
inhabitants and severe pollution caused by its industrial activity and its numerous transportation vehicles. The proposed project will integrate intelligent techniques with the power of big data analytics to design new plans and strategies that reduce the impact of pollution and traffic congestion in the city of Sfax.

Acknowledgment

The authors would like to acknowledge the financial support of this work by grants from the General Direction of Scientific Research (DGRST), Tunisia, under the ARUB program.

References
[1] J. M. Tien, "Big Data: Unleashing information", Journal Syst Sci Syst Eng, 22(2), June 2013, pp. 127-151.
[2] K. Wedgwood and R. Howard, "Big data and analytics in travel and transportation", IBM Big Data and Analytics White Paper, November 2014.
[3] A. Ben Ayed, M. Ben Halima and A. M. Alimi, "Survey on clustering methods: Towards fuzzy clustering for big data", 6th International Conference of Soft Computing and Pattern Recognition (SoCPaR), Tunis, Tunisia, 2014, pp. 331-336.
[4] http://hadoop.apache.org/
[5] http://en.wikipedia.org/wiki/Apache_Hadoop
[6] S. Ghemawat, H. Gobioff and S. T. Leung, "The Google file system", SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles, pp. 29-43, 2003.
[7] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters", OSDI'04: Sixth Symposium on Operating System Design and Implementation, San Francisco, CA, December 2004.
[8] A. P. Sivan, J. Johns and J. Venugopal, "Big Data Intelligence in Logistics Based On Hadoop And Map Reduce", International Conference on Innovations in Engineering and Technology (ICIET'14), 21-22 March, India, 2014.
[9] J. Viegas, "Big data and transport", International Transport Forum, October 2013.
[10] S. D. Galligan and J. O'Keeffe, "Big Data Helps City of Dublin Improve its Public Bus Transportation Network and Reduce Congestion", IBM press, May 2013.
[11] P. Yip, "Transform Industries, Institutions, and Societies with Watson", Smarter Business Summit, Halifax, Canada, 2014.
[12] N. Y. Annonk, "Da Nang, Vietnam Turns to IBM to Transform City Systems", IBM press, August 2013.
[13] H. Shilling, "Big Data Takes the Travel Industry in New Direction", Opera Solutions blog, June 2013.
[14] A. Biem, E. Bouillet, H. Feng, A. Ranganathan, A. Riabov, O. Verscheure, H. Koutsopoulos and C. Moran, "IBM InfoSphere Streams for Scalable, Real-Time, Intelligent Transportation Services", SIGMOD'10, June 6-11, 2010, Indianapolis, Indiana, USA.
[15] http://dataconomy.com/hadoop-components-need-know/
[16] X. Lin and X. Zheng, "A Cloud-Enhanced System Architecture for Logistics Tracking Services", International Conference on Computer, Networks and Communication Engineering (ICCNCE), May 2013, pp. 545-548.
[17] http://en.wikipedia.org/wiki/Google_File_System
[18] http://architects.dzone.com/articles/how-hadoop-MapReduce-works
[19] W. Wu, Z. Liu, M. Chen, X. Yang and X. He, "An automated vision system for container-code recognition", Expert Systems with Applications, 39(3), 2012, pp. 2842-2855.
[20] S. Sayahi and M. Ben Halima, "An intelligent and robust multi-oriented image scene text detection", The 6th International Conference of Soft Computing and Pattern Recognition (IEEE SoCPaR'2014), pp. 418-422.
[21] M. B. Halima, H. Karray and A. M. Alimi, "A comprehensive method for Arabic video text detection, localization, extraction and recognition", Advances in Multimedia Information Processing-PCM 2010, Springer Berlin Heidelberg, 2010, pp. 648-659.

[Figure: flowchart. Start; container code acquisition; pre-processing of the image; RGB to gray-level transformation; image decomposition into small blocks; Map: analysis and classification of blocks (1) to (N); Reduce: extraction of text coordinates; characters extraction; Map: recognition of characters (1) to (n); Reduce: container code recognition; End.]
Fig. 12. Big data system for containers code recognition
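
The first Map and Reduce stages of the flow in Fig. 12 (Section IV) can be sketched in plain Python as follows. This is a simplified stand-in under stated assumptions, not the proposed system: the gray-level weights (ITU-R BT.601), the use of a one-level Haar detail energy as the only feature, the fixed threshold that replaces the SVM of [20], and all function names are choices made for this example, and the character extraction and OCR stages are omitted.

```python
def to_gray(rgb_image):
    """Convert rows of (r, g, b) tuples to gray levels (BT.601 weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def decompose(gray, size=20):
    """Yield ((top, left), block) for every full size x size block."""
    height, width = len(gray), len(gray[0])
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            yield (top, left), [row[left:left + size] for row in gray[top:top + size]]


def haar_detail_energy(block):
    """Mean squared one-level 2D Haar detail coefficients of a block."""
    energy, count = 0.0, 0
    for i in range(0, len(block) - 1, 2):
        for j in range(0, len(block[0]) - 1, 2):
            a, b = block[i][j], block[i][j + 1]
            c, d = block[i + 1][j], block[i + 1][j + 1]
            horizontal = (a - b + c - d) / 4
            vertical = (a + b - c - d) / 4
            diagonal = (a - b - c + d) / 4
            energy += horizontal ** 2 + vertical ** 2 + diagonal ** 2
            count += 1
    return energy / count


def map_block(item, threshold=100.0):
    """Map step: label one block. The threshold stands in for the SVM."""
    position, block = item
    return position, haar_detail_energy(block) > threshold


def reduce_text_coordinates(labelled):
    """Reduce step: keep the coordinates of the blocks labelled as text."""
    return sorted(position for position, is_text in labelled if is_text)


def detect_text_blocks(rgb_image, size=20):
    """Gray conversion, block decomposition, Map, then Reduce."""
    gray = to_gray(rgb_image)
    # On a cluster, each map_block call could run on a different machine.
    return reduce_text_coordinates(map(map_block, decompose(gray, size)))


if __name__ == "__main__":
    # Synthetic 20x40 image: flat gray on the left, fine texture on the right.
    white, black, flat = (255, 255, 255), (0, 0, 0), (128, 128, 128)
    image = [[flat if x < 20 else (white if (x + y) % 2 == 0 else black)
              for x in range(40)] for y in range(20)]
    print(detect_text_blocks(image))
```

Since each block is labelled independently of the others, the Map step needs no communication between machines, which is what makes the block-wise design suitable for MapReduce.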
