Orchestrated Platform For Cyber-Physical Systems
Róbert Lovas, Attila Farkas, Attila Csaba Marosi, Sándor Ács, József Kovács,
Ádám Szalóki, and Botond Kádár
Institute for Computer Science and Control, Hungarian Academy of Sciences (MTA
SZTAKI), P.O.Box 63, H-1518, Budapest, Hungary
{robert.lovas, attila.farkas, attila.marosi, sandor.acs, jozsef.kovacs,
adam.szaloki, botond.kadar}@sztaki.mta.hu
1 Introduction
tracking is done using times at which the various events occur and are sequen-
tially ordered according to their occurrence time. In the modeling phase, the
task of a modeler is to determine the state variables that capture the behavior
of the system, the events that can change the values of those variables and the
logic associated with each event. Executing the logic associated with each event
in a time-ordered sequence produces a simulation of the system. As each event
occurs and expires, it is removed from the sequence, called the event list, and the
next event is activated. This continues until all the events have occurred or a
previously defined time-window limit is reached. Statistics are gathered throughout
the simulation and reported with performance measures. Later in the paper, we
provide the simulation scenarios, which apply the EasySim kernel (see Section
4.1) but the main focus will be on the scenario generation and the evaluation
of the simulation runs, which, contrary to the initial desktop environment, will
run in parallel on the orchestrated back-end platform. With the parallelization
support of this back-end platform, we are able to significantly speed up the
evaluation of different scenarios, which earlier ran only sequentially in desktop
environments. Additionally, it is important to mention that other simulation
engines, even third-party, off-the-shelf simulation software, would have been
suitable to model the scenarios presented in Section 3.2; EasySim was selected
for performance reasons.
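To make the event-list mechanism described above concrete, a minimal DES kernel can be sketched in a few lines of Python. All names here are illustrative; EasySim's actual implementation is not shown, and a binary heap is used only as one plausible choice of event-list data structure:

```python
import heapq

class Simulator:
    """Minimal discrete-event simulation kernel: a time-ordered event list."""

    def __init__(self, time_limit=float("inf")):
        self.now = 0.0
        self.time_limit = time_limit
        self._events = []          # heap of (time, seq, action)
        self._seq = 0              # tie-breaker for events at the same time
        self.stats = []            # statistics gathered during the run

    def schedule(self, delay, action):
        """Insert an event into the event list, ordered by occurrence time."""
        heapq.heappush(self._events, (self.now + delay, self._seq, action))
        self._seq += 1

    def run(self):
        """Pop events in time order until the list is empty or the window ends."""
        while self._events:
            time, _, action = heapq.heappop(self._events)
            if time > self.time_limit:
                break
            self.now = time
            action(self)           # event logic may schedule further events

# Example: a machine that finishes a part every 2 time units.
def finish_part(sim):
    sim.stats.append(sim.now)
    sim.schedule(2.0, finish_part)

sim = Simulator(time_limit=7.0)
sim.schedule(2.0, finish_part)
sim.run()
print(sim.stats)  # [2.0, 4.0, 6.0]
```

The heap keeps insertion and extraction logarithmic in the number of pending events, which matters because, as noted later in the paper, the properties of the event-list data structure strongly influence simulation speed.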
2 Related work
all management software and acts as a frontend for the users. Infinity is the per-
manent storage cluster (based on HDFS). Computing clusters have a lifecycle:
they are created, used for computation and finally, they are removed. All data
must be uploaded to Infinity beforehand. Data can be uploaded to and retrieved
from Infinity via WebHDFS [15] or Cosmos CLI (a command line interface to
WebHDFS). The Big Data GE specifies the use of SQL-like analytics tools like
Hive, Impala [16] or Shark. Although the GE is based on Hadoop [17], it proposes
several alternatives: (i) the Cassandra [18] File System can be used instead of
HDFS; (ii) a distributed NoSQL database like HBase can be installed on top of
HDFS; (iii) an extension or replacement such as Cascading [19] can be used.
DICE [20] is an EU Horizon 2020 research project that aims to provide a
methodology and framework for developing data intensive applications. It offers
a framework consisting of an Eclipse-based IDE and tools, and supports Apache
Spark [21], Storm [22], Hadoop (MapReduce), Cassandra and MongoDB [23].
Its methodology enables architecture enhancement, agile delivery
and testing of batch and stream processing applications.
Building on application containers and orchestration (e.g., via Docker [24]
or Kubernetes [25]), serverless computing is an execution model for cloud
computing where the cloud provider dynamically manages the underlying machine
resources. The pricing model is based on the actual resources consumed during
execution (e.g., CPU execution time, memory, network). All major public
cloud providers support this model, e.g., AWS Lambda, Google Cloud Functions
or Azure Functions. There are several open source implementations like
OpenLambda [26], OpenFaaS [27] (Open Function as a Service), Kubeless [28],
Funktion, Iron Functions, Fission, etc.
Terraform [29] is an open source tool for building, managing and versioning
virtual infrastructures in public or private cloud environments. Terraform allows
defining whole virtual infrastructures via a configuration template. This can con-
tain low-level information like machine types, storage or network configuration,
but also high-level components like SaaS services or DNS entries. Based on the
configuration, Terraform creates an execution plan and a resource graph to build
the defined infrastructure. Topology and Orchestration Specification for Cloud
Applications (TOSCA) [30, 31] is a standard language by OASIS [32] for describ-
ing collections or topologies of services, their relationships, components, etc. It is
similar to Amazon CloudFormation, OpenStack Heat (and Terraform). It aims
to be an open standard that provides a superset of features (and grammar).
Regarding the representation and sharing of industrial data in distributed
systems, several initiatives exist. The National Institute of Standards and
Technology (NIST) initiated the Smart Manufacturing Systems (SMS) Test Bed [33],
in which data is collected from the Manufacturing Lab using the MTConnect
standard. That data is aggregated and published internally
and externally of NIST via web services. Another initiative, from General
Electric, is PREDIX [34], a large platform enabling the collection and analysis
of product- and asset-related data in order to improve and optimize operations.
The SMS Test Bed is a source from where data can be retrieved and analyzed, but the Test
Bed itself does not include solvers or simulators. PREDIX, with its multiple
layers, is designed for data collection and analytics and includes tools for
analysis, but building the models that are later applied in decision support
is still necessary and is, at the same time, one of the most difficult parts;
how to build these models is still not clear from the available sources on
PREDIX. In our solution, to be presented in the simulation scenario, the model
in question is built as a discrete event simulation with a tool developed in an
earlier project at MTA SZTAKI. This model is built automatically based on the
Core Manufacturing Simulation Data (CMSD) standard, specifically designed for
manufacturing simulation studies.
3 Docker@SZTAKI project
The main motivation behind the Docker@SZTAKI project was to elaborate and
demonstrate a Docker software container-based platform that can be formed
on demand in a highly portable way, i.e., according to the complexity and the
actual needs of the CPS application in various IT environments. The supported
environments (see Fig. 1) include the user's laptop, the on-premise servers
of the institute, and a wide range of private/public/community clouds, e.g.,
the Hungarian academic, federated community cloud MTA Cloud (based
on the OpenStack middleware) or the public Amazon Web Services (AWS) cloud.
tools, such as the CQueue manager (see Section 3.2) or the Occopus [35] cloud
orchestrator. CQueue plays a crucial role when the push model of Docker swarm
clustering mechanism cannot be applied, and the pull model is more suitable
for the application (e.g., in the case of EasySim). The Occopus cloud orchestrator is
responsible for creating and managing the required VMs in the selected clouds
when the Docker@SZTAKI user needs extra or 24/7 available IT capacities for
their applications.
The platform has been used for demonstrating two major pillars of the
CPS components: the sensor data back-end and DES simulation.
Fig. 3. Architecture of a web-based data collector application for sensor image data
Tier. The forwarding decision is made in two steps. First, based on a round-robin
algorithm a high-availability proxy and load-balancer (based on HAProxy [36])
is selected. The proxy, in turn, will select an application server with the low-
est load and forward the request to that one. A Data Collector instance in
the Aggregation Tier (shown in Fig. 2) will decode the received data and store
them in the Database Tier (shown in Fig. 2). Besides the Data Collector, other
functionalities are also available and work similarly. Database services are pro-
vided by a Cassandra or MongoDB [23] database cluster, besides an RDBMS
like MySQL. Cassandra is a decentralized structured storage system that is well
suited for storing time-series data like sensor data. As the volume of incoming
data changes, Cassandra allows nodes to be dynamically added to or removed
from the database.
Data submission is initiated by the Client Tier resolving the DNS endpoint
of a given service. The DNS endpoint may contain one or more load-balancer
addresses, which in turn distribute the load among the available Receiver instances.
Using round-robin DNS techniques, it is possible to scale the number of load-
balancer nodes. It is a well-known, simple method for load sharing, fault tolerance
and load distribution for making multiple redundant service hosts available.
After the client has contacted a load balancer through round-robin DNS, the
HAProxy servers are responsible for balancing the load across multiple
application servers (e.g., Data Collectors). HAProxy also continuously monitors
the health and performance of the connected application servers.
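The two-step forwarding decision described above can be sketched as follows. The proxy and server names are invented, and the sketch only mimics the decision logic; in the real deployment the round-robin step is done by DNS and the least-loaded selection by HAProxy itself:

```python
import itertools

class TwoTierBalancer:
    """Sketch of the two-step forwarding decision: round robin over
    the proxies, then the least-loaded application server behind it."""

    def __init__(self, proxies):
        # proxies: {proxy_name: {app_server_name: current_load}}
        self.proxies = proxies
        self._rr = itertools.cycle(sorted(proxies))   # round-robin DNS step

    def route(self):
        proxy = next(self._rr)                        # step 1: round robin
        backends = self.proxies[proxy]
        server = min(backends, key=backends.get)      # step 2: lowest load
        backends[server] += 1                         # the request adds load
        return proxy, server

lb = TwoTierBalancer({
    "haproxy-1": {"collector-a": 0, "collector-b": 2},
    "haproxy-2": {"collector-c": 1},
})
print(lb.route())  # ('haproxy-1', 'collector-a')
print(lb.route())  # ('haproxy-2', 'collector-c')
```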
A data receiver application and connected components are depicted in Fig. 3.
It consists of the following: Chef is used as a deployment orchestrator for boot-
strapping new nodes for the different tiers. The Data Processing component
and Cassandra Connector are implemented using the Flask Web Framework
and Python. The Sensor Metadata Decoder is responsible for interpreting the
incoming data and passing it to the Cassandra Connector. The Cassandra Con-
nector is used to store the decoded metadata in the database cluster. uWSGI [37]
is used as a WSGI [38] application server, and finally, NGINX [39] is connected
to the wire-protocol of uWSGI to achieve a high performance WSGI-based web
frontend.
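The per-request path through the Aggregation Tier can be sketched in plain Python, leaving out the Flask/uWSGI/NGINX layers. The wire format and field names below are illustrative (the real payloads are not documented here), and an in-memory stand-in replaces the Cassandra cluster:

```python
import json

class SensorMetadataDecoder:
    """Sketch of the decoder: turns a raw JSON payload into a row.
    The field names are illustrative, not the real wire format."""

    def decode(self, raw):
        msg = json.loads(raw)
        return {
            "sensor_id": msg["id"],
            "timestamp": msg["ts"],
            "value": float(msg["value"]),
        }

class CassandraConnector:
    """Stand-in for the real connector; stores rows in memory so the
    pipeline can be exercised without a Cassandra cluster."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)

def handle_request(raw, decoder, connector):
    """What the Flask view behind uWSGI/NGINX would do per request."""
    connector.insert(decoder.decode(raw))
    return 201   # HTTP "Created"

connector = CassandraConnector()
status = handle_request('{"id": "s-42", "ts": 1509400000, "value": "21.5"}',
                        SensorMetadataDecoder(), connector)
print(status, connector.rows[0]["value"])  # 201 21.5
```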
Container-based architecture
The original data collector framework is based on virtual machines, and the
components run on separate nodes. This architecture is ideal for scaling the
components out or in based on the application utilization. On the other
hand, this separation might have a negative effect on resource utilization. To
achieve better resource utilization, we have created a Docker version of the data
collector infrastructure with smaller granularity. With the Docker container
technology [24], the components of the architecture can be separated into
containers; therefore, we can run more than one collector component on one
particular virtual machine. The Docker version of the collector provides more
efficient resource utilization than the virtual-machine-based solution.
container start. This configuration is performed by the Docker entrypoint script
at startup (this is the main configuration method in the Docker ecosystem). For
the Cassandra Docker version, the official Cassandra image was selected from
Docker Hub, but we applied some modifications: the official entrypoint script was
extended to support automatic Cassandra cluster creation at start time
on a Docker Swarm cluster. With these images, we created a Docker Compose file
to provide a simple container orchestration method. With the Compose file, the
main part of the collector infrastructure can be deployed by the service operator
on one machine or on a Swarm cluster as a Docker stack, and the containers can
easily be configured through the Compose file with various environment variables.
The service operator can deploy the data collector framework as a Docker stack
from the described Compose file on a cluster managed by Docker Swarm.
Another important feature of Docker Swarm is the overlay network it provides
between the Swarm nodes for the containers. In this network, the containers can
access each other as if they were on one virtual machine. Furthermore, Swarm
provides an ingress routing mesh on this network. With the routing mesh, the
Swarm services can expose their ports on the virtual machines so that they can be
reached on every Swarm node from outside the cluster. With that feature,
Swarm provides an external load balancer between the application containers
within a Docker service. Therefore, we decided to replace HAProxy in the
data collector infrastructure with the above-described routing mesh facility of
Swarm. The resulting architecture is shown in Fig. 4. Prometheus [40] is
used for monitoring the infrastructure with agents deployed on the nodes.
The infrastructure is deployed and managed by the Occopus [35] cloud or-
chestrating tool. Occopus is an open source software providing features to orches-
trate, configure, and manage virtual infrastructures. It allows describing virtual
infrastructures in a cloud-agnostic way. We created the necessary description
files to build and maintain the collector framework. As an additional benefit,
the number of Swarm workers in the framework can be automatically scaled
based on their CPU load.
Extended architecture
In the next iteration of the data collector, we improved the data storing layer
and separated the functions of the data collector layer to address the
shortcomings of the framework. In the first version, all metadata about the sensors
and the measured data were stored in the Cassandra database. This is not an
optimal schema for storing related data in a NoSQL database; therefore, we
separated the stored data into two databases. The information and
the corresponding metadata of the sensors are stored in an SQL database, while
the measurement data are stored in a NoSQL database or distributed file
system. Originally, the data collectors served multiple purposes: they received,
processed and stored the data in a database. These functions have been separated
into distinct components: the receiver components push data to a streaming
component, and dedicated processors store the data for further analytics or process them
in-stream. This greatly reduces the stress on the data collectors and makes the
architecture more flexible. The extended collector architecture is shown in Fig. 5.
Since Docker does not provide a pull model for task execution (Swarm uses a
push execution model), the new CQueue framework provides a lightweight queue
service for processing tasks via application containers. The framework consists
of four main components (see Fig. 6): (i) one or more CQueue server(s), which
act(s) as frontend(s) and receive(s) the container-based task requests; (ii) a queue
server schedules the tasks requests for workers; (iii) CQueue workers that pull
tasks from the queue server; and (iv) a key-value store stores the state and
the output of the finished tasks. Currently, queuing is handled by RabbitMQ,
and Redis is used as the key-value store. The frontend server and the worker
components are written in Go, and they have a shared code base. All of the
components run inside Docker containers and can be scaled based on
their utilization. The design goals of the framework are to use standard interfaces
and components to create a generic job-processing middleware.
The framework is built for executing Docker container-based tasks with their
specific inputs. Also, environment variables and other input parameters can be
specified for each container. CQueue identifies pushed tasks by a unique ID,
which the user has to specify. The ID, the application container and
the inputs of the task must be specified in standard JSON (JavaScript Object
Notation) format. The CQueue server receives the tasks via a REST-like API.
After this step, the server transforms the JSON formatted tasks to standard
AMQP (Advanced Message Queuing Protocol) messages and pushes them to
the queue server. The workers pull the registered tasks from the queue server
via the same AMQP protocol and execute them. One worker processes one task at
a time. After a task is completed, the worker sends a notification to the queue
server, and this task will be removed from the queue. The worker continuously
updates the status (registered, running, finished or failed) of the task with the
task's ID in the key-value store. When the task has finished or failed, the worker
stores the stdout and stderr of the task in the key-value store as well. The
status of a task and the result can be queried from the key-value store through
the CQueue server. The output of the task is not processed; it is stored in the
key-value store in its original format.
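The task lifecycle described above can be sketched as follows. The JSON field names and the in-memory stand-ins for RabbitMQ and Redis are illustrative, not CQueue's actual wire format, and the container execution is replaced by a callback:

```python
import json
import uuid

# A CQueue-style task as it would be submitted to the frontend server.
# The exact field names are illustrative; the text only requires a
# user-supplied ID, the application container and the task inputs.
task = {
    "id": str(uuid.uuid4()),
    "image": "example/easysim-runner:latest",   # hypothetical container
    "env": {"SCENARIO": "3"},
    "args": ["--config", "workers.json"],
}
message = json.dumps(task)    # what the server would push as an AMQP message

class KeyValueStore:
    """Stand-in for Redis: tracks each task's status and output."""
    def __init__(self):
        self.data = {}
    def set(self, key, value):
        self.data[key] = value
    def get(self, key):
        return self.data.get(key)

def worker_execute(message, store, run):
    """One worker processes one task at a time, updating the store:
    registered -> running -> finished or failed."""
    task = json.loads(message)
    store.set(task["id"], {"status": "running"})
    try:
        stdout = run(task)     # would be a `docker run` in the real worker
        store.set(task["id"], {"status": "finished", "stdout": stdout})
    except Exception as err:
        store.set(task["id"], {"status": "failed", "stderr": str(err)})

store = KeyValueStore()
worker_execute(message, store, run=lambda t: "simulation done")
print(store.get(task["id"])["status"])  # finished
```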
Standardized Database
Concerning the implementation of the persistence layer, MySQL has been selected
to store all the necessary back-end information.
Regarding the standardization of manufacturing and logistics systems, there
are different standards approved and offered by different organizations; the
best-known one is ISA-95, provided by the International Society of Automation [41].
After comparing them on the basis of applicability, we finally selected the
Standard for Core Manufacturing Simulation Data (CMSD) [42] in order to have
a standardized system with reusable components. In this way, we applied
standard data formats for representing certain structures of the system related
to the simulation module; namely, SISO-STD-008-2010, the Standard for Core
Manufacturing Simulation Data (SISO CMSD) provided by SISO1, is applied in the
research.
This standard addresses interoperability between simulation systems and
other manufacturing applications. As such, it inherently includes the most
relevant, simulation-related data for the simulation of manufacturing systems. The
CMSD model is a standard representation for core manufacturing simulation
data, providing neutral structures for the efficient exchange of manufacturing
data in a simulation environment. These neutral data structures are applied to
support the integration of simulation software with other manufacturing appli-
cations.
The CMSD standard has several packages, but not all of them are necessary
in this application. As an example, the layout package was not used, since for
the focus of our experiment the layout is not relevant. The standard itself is
described as a UML model; furthermore, there are XML representations as well
as representations in different programming languages. Within the context of
1 http://www.sisostds.org, visited 2017-11-01
the research, the back-end database was designed and implemented as a
relational-database implementation of the CMSD standard, based on
the initial UML version, forming the main data storage of the different
simulation forecasting scenarios. All the data about the resources, the entities
or workpieces traveling in the manufacturing system, the routings, the sequence
of operations, the control logic and the manufacturing orders to be completed
are stored in the database according to the CMSD specification. One of the
non-functional requirements for selecting this solution, namely the direct
implementation of SQL database tables and relations, was the speed of building
and updating simulation models instantly.
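As an illustration of such a direct relational mapping, a small, hypothetical subset of CMSD-like tables can be created and queried as follows. SQLite is used here only to keep the sketch self-contained; the table and column names are invented, while the actual back-end implements the full standard in MySQL:

```python
import sqlite3

# Illustrative subset of CMSD-like tables; all names are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE resource (
    id          TEXT PRIMARY KEY,
    name        TEXT NOT NULL,
    type        TEXT NOT NULL          -- e.g. 'station', 'worker'
);
CREATE TABLE job (
    id          TEXT PRIMARY KEY,
    description TEXT
);
CREATE TABLE process_plan (            -- routing: sequence of operations
    job_id      TEXT REFERENCES job(id),
    step        INTEGER,
    resource_id TEXT REFERENCES resource(id),
    op_time     REAL,                  -- operation time used by the DES
    PRIMARY KEY (job_id, step)
);
""")
conn.execute("INSERT INTO resource VALUES ('WS1', 'Workstation 1', 'station')")
conn.execute("INSERT INTO job VALUES ('J1', 'demo order')")
conn.execute("INSERT INTO process_plan VALUES ('J1', 1, 'WS1', 4.2)")
row = conn.execute(
    "SELECT resource_id, op_time FROM process_plan WHERE job_id='J1'"
).fetchone()
print(row)  # ('WS1', 4.2)
```

The point of such a direct schema is that a simulation model can be (re)built by straightforward SQL queries, which supports the speed requirement mentioned above.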
According to the nature of the data stored in the MySQL database, two types
of tables can be distinguished. On the one hand, the implementation of the CMSD
standard provides the information related to the simulation. On the other hand,
there are tables that store specific information necessary for the application
itself in this new environment.
As mentioned in the introductory part, the reason for using EasySim instead
of any other existing DES tools is the difference in performance. EasySim is a
simulation kernel providing only the core functionality of a DES tool. No
graphical user interface has been developed for it, and one of the most promising
data structures was selected to represent the event list, because its properties
(size, the speed of accessing its content) can highly influence the speed of the
simulation. Furthermore, EasySim was developed for building a DES model in
the most direct way, through programming. The overall intention in developing
EasySim in such a way was to achieve fast simulation runs, and because EasySim
is our own implementation, we could be sure that its integration with the other
tools in this paper would be as convenient as possible. Again, it is
true that the simulation model presented below could have been implemented
in any other simulation software; EasySim was selected because of the flexibility
of integrating it into the back-end platform.
Regarding the simulation model, it implements a production line which contains
eight workstations connected to each other in a linear way, called a flow-shop
line. The modeled production line is part of a real manufacturing factory, and
the operation times for each workstation were provided by the factory.
Additionally, the model implements stochastic behavior, such as failures of
workstations, which can optionally be used in the simulation. This stochastic
capability was realized by integrating a mathematical software package during
the development of EasySim, which ensures proper random number handling and
different mathematical functions to approximate reality as closely as possible.
The operations on each workstation are different and may require the presence
of one or more human workers who perform the manufacturing or assembly task
at the given station. As in a real production
system, the workers are also different: for each specific task at a workstation,
a worker needs to have a specific skill. Moreover, the operators are assigned to
specific shifts, meaning that shift by shift we can have different teams grouping
different workers with different skills. As Fig. 8 illustrates, it is a linear,
acyclic production line which contains eight workstations (WS1, WS2, etc.).
Below each workstation there is a required skill name, which indicates that a
worker can operate the workstation only if the worker has the specific skill.
A worker can have multiple different skills, meaning that he can operate
different workstations. An evident solution is, of course, to have one worker
with the required skills for each workstation, but real factories have fewer
workers available, so the task is to find an optimal worker set which is able to
carry out the order with a minimal number of workers.
The task of the planner is to find the right configuration of workers for
each specific shift. Naturally, the problem can be formulated as a formalized
mathematical problem, but as the operation times are stochastic, i.e., each
operation varies and follows a distribution, and, additionally, failures may
occur unexpectedly at each workstation, a discrete event simulation
tool is more adequate to model the line in question. To provide input, control
the runs of the simulation model and visualize the results of the simulation
scenarios, a web application was developed and integrated with the orchestrated
back-end platform described earlier. This is presented in the upper part of
Fig. 7 and includes the Experiment Manager and the GUI for visualization.
In the example presented in Table 1, Worker1 can work on the WS1, WS4 and
WS6 workstations but cannot work on the WS2, WS3, WS5, WS7 and WS8
workstations. A worker can have multiple different skills, so, considering the
example above, a worker can operate both WS2 and WS3 only if he has both
Skill2 and Skill3.
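The skill-coverage question can be illustrated with a small sketch. The matrix below is invented, and a greedy set-cover heuristic is shown only to make the flavor of the problem concrete; the paper instead evaluates candidate configurations by running the simulation, which also captures stochastic effects:

```python
# Invented skill matrix in the spirit of Table 1: which workstations
# each worker can operate.
skills = {
    "Worker1": {"WS1", "WS4", "WS6"},
    "Worker2": {"WS2", "WS3"},
    "Worker3": {"WS5", "WS7", "WS8"},
    "Worker4": {"WS1", "WS2"},
}
stations = {f"WS{i}" for i in range(1, 9)}

def greedy_team(skills, stations):
    """Greedy set cover: pick a small (not necessarily minimal) team
    that together covers every workstation."""
    uncovered, team = set(stations), []
    while uncovered:
        # pick the worker covering the most still-uncovered stations
        best = max(skills, key=lambda w: len(skills[w] & uncovered))
        if not skills[best] & uncovered:
            raise ValueError("no worker set covers all stations")
        team.append(best)
        uncovered -= skills[best]
    return team

print(sorted(greedy_team(skills, stations)))
# ['Worker1', 'Worker2', 'Worker3']
```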
Fig. 9 shows how the parametrization of the worker skills can be completed
with the help of the high-level GUI. There are ten different workers provided
as columns in the matrix, while in each configuration (which will run in
parallel on the orchestrated back-end platform) separate skill patterns can
be defined for each worker. These are denoted by the names of the workstations,
i.e., SV01, SV02, etc.
the completion of the simulation run, and when all the running configurations
have completed, the GUI can visualize the simulation results. Fig. 10 provides
statistics about the utilization of the workers in configuration number 3. The
blue part in the top region of the figure illustrates the percentage of time the
operator was idle, while the green indicates the time when the operator was
working. With the orders completed in this configuration, we can see that
applying seven operators yields a very under-utilized environment. Fig. 11 gives
an overview
of how the three distinct orders behaved in the system, showing that there
were no significant differences between the five different configurations. As the
main focus of the paper is the orchestrated back-end, we additionally included
some explanatory charts, but many additional Key Performance Indicators can
be visualized within the GUI. Some of them visualize aggregated measures, while
others show specific resource-, buffer- or worker-related ones.
The developed sensor data back-end has been successfully migrated under the
MiCADO [43] (Microservices-based Cloud Application-level Dynamic
Orchestrator) framework, which attempts to unify and also to extend the
previously described tools, including Occopus, CQueue, etc., in the long term.
It allowed us to evaluate the sensor data back-end in a more fine-grained way
using multi-level scaling, i.e., not only at the VM level but also at the
container level. This approach utilized the two control loops of MiCADO, which
led to the presented results.
MiCADO provides automatic scaling on two levels. The microservice-level auto-
scaling deals with keeping the optimal number of container instances for a
particular microservice. The cloud-level auto-scaling deals with allocating the
optimal amount of cloud resources for the entire microservice-based infrastructure.
MiCADO is developed by integrating various tools into a common framework.
For executing the microservice infrastructure, a dynamically extendable and
resizable Swarm [45] cluster is used. Monitoring is performed by Prometheus [40]
with agents on the nodes of the cluster. The communication with the cloud API
and the orchestration of the Swarm cluster are performed by Occopus [35],
mentioned in previous sections. Each of the components is integrated with
future replaceability in mind, in case a better tool appears in its area.
The scaling and optimization logic, as well as the submission interface, is
built by the COLA project. For describing the microservice infrastructure, the
project has chosen the TOSCA [31] specification language, in which the
components, requirements, relations, etc. can be easily defined in a portable
way. The way of describing scaling/optimization policies is developed by the
COLA project as an extension of the TOSCA specification.
The conceptual overview of the two control loops implemented by the
aforementioned components and tools is shown in Fig. 12. In both control loops,
the Policy Keeper performs the control and decision making on scaling, while
Prometheus acts as a sensor to monitor the measured targets. In the microservice
control loop, the targets are the microservice containers realizing the
infrastructure to be controlled. Containers are modified (in number, location,
etc.) by Swarm, acting as the actuator in the loop. A similar control loop is
realized for the cloud resources, represented by virtual machines in our case.
Here, Occopus acts as the actuator to scale the virtual machines (the targets)
up or down. The microservice control loop controls the consumers, while the
cloud-level control loop controls the resources. As a consequence, the
microservice loop affects the cloud loop, since more consumers require more
resources.
The goal of MiCADO control loops is to provide an automatic scaling func-
tionality for an infrastructure built by microservices. For automatic scaling, there
are several different scenarios in which scaling can focus on optimizing the run-
ning infrastructure for various goals. The execution of the microservice infras-
tructure has different requirements and different measurable characteristics. For
example, processing, memory, network bandwidth, disk I/O, etc. are all resources
MiCADO may reserve for the infrastructure, while CPU load, memory usage,
response time or disk usage are measurable characteristics. Beyond optimizing for
some of the characteristics, MiCADO is also being developed towards optimizing
for costs generated by the usage of (commercial) cloud resources.
Beyond optimizing for easily measurable external characteristics, MiCADO is
prepared to monitor some internal parameters of the microservice infrastructure.
For example, monitoring the length of a queue enables MiCADO to perform
optimization in different scenarios, like keeping the number of items at a
certain level, keeping a predefined processing rate of items, or making sure the
items are consumed by a predefined deadline. The different scenarios and
optimization strategies are
Fig. 12. Control loops applied for multi-level auto-scaling of virtual machines and
containers in MiCADO
continuously developed and added to the latest version of MiCADO. The current
version of MiCADO (v3) supports the performance-based policy for containers
and virtual machines.
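A single step of such a performance-based control loop can be sketched as follows. The thresholds, parameter names and band logic are illustrative, not the actual Policy Keeper implementation:

```python
def scaling_decision(measured_load, target=0.5, tolerance=0.1,
                     current=1, minimum=1, maximum=10):
    """One step of a performance-based control loop in the spirit of
    the Policy Keeper: compare a measured characteristic (e.g. mean
    CPU load of the containers) with a target band and return how many
    instances the actuator (Swarm or Occopus) should run."""
    if measured_load > target + tolerance and current < maximum:
        return current + 1          # scale up: consumers are overloaded
    if measured_load < target - tolerance and current > minimum:
        return current - 1          # scale down: resources are wasted
    return current                  # inside the band: keep the level

print(scaling_decision(0.85, current=3))  # 4
print(scaling_decision(0.20, current=3))  # 2
print(scaling_decision(0.55, current=3))  # 3
```

Running the same decision once over container counts and once over VM counts mirrors the two nested control loops of Fig. 12.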
(a) Total number of data receiver containers (b) Total number of all containers (nodes)
Results
The multi-level scaling of the back-end is handled by MiCADO. With MiCADO's
dual control loops, we can scale both the container-based data collectors and
the host virtual machines. The whole data collector infrastructure is deployed
in MiCADO.
This section summarizes further targeted use cases: (i) Connected Cars (see
Section 5.1) and (ii) Precision Agriculture (see Section 5.2); and contains
conclusions for the paper. In both (i-ii) CPS areas, a subset of the presented
back-end framework has already been applied and successfully integrated with
other system components, particularly for research and evaluation purposes, and
also for forming the baseline of new production-level services.
As additional future work, we have started to study and elaborate the adap-
tation of different job-based policies, including deadline and throughput, in Mi-
CADO over the CQueue microservice infrastructure. The integration will lead
to an auto-scalable CQueue job execution framework with different strategies on
scaling. Furthermore, the adaptation of this sensor data ingestion architecture is
already in progress in two further sectors, namely Connected Cars and Precision
Farming, with some positive preliminary results based on the outlined sectoral
demands.
5.3 Conclusions
Acknowledgment
This work was partially funded by the European ”COLA - Cloud Orchestra-
tion at the Level of Application” project, Grant Agreement No. 731574 (H2020-
ICT-2016-1), by the National Research, Development and Innovation Fund of
Hungary under grant No. VKSZ 12-1-2013-0024 (Agrodat.hu), and by the In-
ternational Science & Technology Cooperation Program of China under grant
No. 2015DFE12860. On behalf of Project Occopus, we are grateful for the usage
of MTA Cloud [53], which significantly helped us in achieving the results
published in this paper. The research conducted within the scope of the discrete
event simulation was supported by the European Commission through the H2020
project EPIC (http://www.centre-epic.eu) under grant No. 739592.
References
1. L. Monostori, B. Kádár, T. Bauernhansl, S. Kondoh, S. Kumara, G. Reinhart,
O. Sauer, G. Schuh, W. Sihn, and K. Ueda, “Cyber-physical systems in manu-
facturing,” CIRP Annals-Manufacturing Technology, vol. 65, no. 2, pp. 621–641,
2016.
2. C. Kardos, G. Popovics, B. Kádár, and L. Monostori, “Methodology and data-
structure for a uniform system’s specification in simulation projects,” Procedia
CIRP, vol. 7, pp. 455–460, 2013.
3. A. Gupta, M. Kumar, S. Hansel, and A. K. Saini, “Future of all technologies-the
cloud and cyber physical systems,” Future, vol. 2, no. 2, 2013.
4. I. Mezgár and U. Rauschecker, “The challenge of networked enterprises for cloud
computing interoperability,” Computers in Industry, vol. 65, no. 4, pp. 657–674,
2014.
5. R. Gao, L. Wang, R. Teti, D. Dornfeld, S. Kumara, M. Mori, and M. Helu, “Cloud-
enabled prognosis for manufacturing,” CIRP Annals-Manufacturing Technology,
vol. 64, no. 2, pp. 749–772, 2015.
6. J. Gubbi, R. Buyya, S. Marusic, and M. Palaniswami, “Internet of things (iot): A
vision, architectural elements, and future directions,” Future generation computer
systems, vol. 29, no. 7, pp. 1645–1660, 2013.
7. AWS Internet of Things. Accessed: 2017-10-30. [Online]. Available: http://aws.amazon.com/iot
8. Azure IoT Suite - IoT Cloud Solution. Accessed: 2017-10-30. [Online]. Available:
http://www.microsoft.com/en-us/internet-of-things/azure-iot-suite
9. Google IoT Core. Accessed: 2017-10-30. [Online]. Available: http://cloud.google.com/iot-core
10. FIWARE Architecture Description: Big Data. Accessed: 2017-10-30. [Online]. Available: https://forge.fiware.org/plugins/mediawiki/wiki/fiware/index.php/FIWARE.ArchitectureDescription.Data.BigData
11. N. Marz and J. Warren, Big Data: Principles and best practices of scalable realtime
data systems. Manning Publications Co., 2015.
12. HIGHLY SCALABLE BLOG. In-stream Big Data Processing. Accessed: 2017-10-30. [Online]. Available: https://highlyscalable.wordpress.com/2013/08/20/in-stream-big-data-processing/
13. FIWARE Architecture Description: Big Data. Accessed: 2017-10-30. [Online]. Available: https://forge.fiware.org/plugins/mediawiki/wiki/fiware/index.php/FIWARE.ArchitectureDescription.Data.BigData
14. FIWARE Glossary. Accessed: 2017-10-30. [Online]. Available: https://forge.fiware.org/plugins/mediawiki/wiki/fiware/index.php/FIWARE.Glossary.Global
15. Hadoop: WebHDFS REST API. Accessed: 2017-10-30. [Online]. Available: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/WebHDFS.html
16. M. Kornacker, A. Behm, V. Bittorf, T. Bobrovytsky, C. Ching, A. Choi, J. Erick-
son, M. Grund, D. Hecht, M. Jacobs, I. Joshi, L. Kuff, D. Kumar, A. Leblang,
N. Li, I. Pandis, H. Robinson, D. Rorke, S. Rus, J. Russell, D. Tsirogiannis,
S. Wanderman-Milne, and M. Yoder, “Impala: A modern, open-source sql engine
for hadoop,” in Proceedings of the 7th Biennial Conference on Innovative Data
Systems Research, 2015.
17. K. Shvachko, H. Kuang, S. Radia, and R. Chansler, “The hadoop distributed file
system,” in Mass storage systems and technologies (MSST), 2010 IEEE 26th sym-
posium on. IEEE, 2010, pp. 1–10.
39. W. Reese, “Nginx: The high-performance web server and reverse proxy,” Linux
Journal, vol. 2008, no. 173, p. 2, 2008.
40. The Prometheus monitoring tool. Accessed: 2017-11-22. [Online]. Available:
https://prometheus.io/
41. International Society of Automation. ISA95, Enterprise-Control System Integration. Accessed: 2017-11-20. [Online]. Available: https://www.isa.org/isa95/
42. R. Bloomfield, E. Mazhari, J. Hawkins, and Y.-J. Son, “Interoperability of manu-
facturing applications using the core manufacturing simulation data (cmsd) stan-
dard information model,” Computers & Industrial Engineering, vol. 62, no. 4, pp.
1065–1079, 2012.
43. T. Kiss, P. Kacsuk, J. Kovacs, B. Rakoczi, A. Hajnal, A. Farkas, G. Gesmier,
and G. Terstyanszky, "MiCADO - microservice-based cloud application-level dynamic
orchestrator," Future Generation Computer Systems, 2017. [Online]. Available:
http://www.sciencedirect.com/science/article/pii/S0167739X17310506
44. COLA: Cloud Orchestration at the Level of Application. Accessed: 2017-11-22.
[Online]. Available: http://www.project-cola.eu
45. Swarm mode of Docker. Accessed: 2017-11-22. [Online]. Available: https://docs.docker.com/engine/swarm/
46. W. He, G. Yan, and L. Da Xu, “Developing vehicular data cloud services in the
iot environment,” IEEE Transactions on Industrial Informatics, vol. 10, no. 2, pp.
1587–1595, 2014.
47. A. C. Marosi, R. Lovas, A. Kisari, and E. Simonyi, "A novel iot platform for the
era of connected cars," in Proceedings of the IEEE International Conference on
Future IoT Technologies (Future IoT 2018), Eger, Hungary [in press].
48. Agrodat.hu project website. Accessed: 2017-10-30. [Online]. Available: http://www.agrodat.hu
49. G. Paller, P. Szármes, and G. Élő, "Power consumption considerations of gsm-
connected sensors in the agrodat.hu sensor network," Sensors & Transducers, vol.
189, no. 6, pp. 52–60, 2015.
50. X. Wen, G. Gu, Q. Li, Y. Gao, and X. Zhang, “Comparison of open-source cloud
management platforms: Openstack and opennebula,” 2012, pp. 2457–2461.
51. A. C. Marosi, A. Farkas, and R. Lovas, "An adaptive cloud-based iot back-end
architecture and its applications," in Proceedings of The 26th Euromicro Inter-
national Conference on Parallel, Distributed and Network-Based Processing (PDP
2018), Cambridge, UK [in press].
52. Cloudifacturing Project at EU Cordis portal. Accessed: 2017-10-30. [Online]. Available: http://cordis.europa.eu/project/rcn/211582_en.html
53. MTA Cloud. Accessed: 2017-10-30. [Online]. Available: https://cloud.mta.hu