Lecture 13 IoT Cloud Computing
• The IoT revolution comes with trillions of connected devices; however, the real value
of IoT is in the advanced processing of the collected data.
• By nature, IoT data is more dynamic, heterogeneous and unstructured than typical
business data.
• Exploiting in the Cloud the data obtained in real time from sensors is therefore
a necessity.
• This data processing leads to advanced proactive and intelligent applications and
services.
• The Cloud paradigm can be delivered using essentially three different service models,
namely Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software
as a Service (SaaS).
• A Cloud-based IoT platform is then a dynamic and flexible resource-sharing platform
that delivers IoT services and offers scalable resource and service management.
• The exploitation of IoT data depends on massive resources, which should be available
when needed and scaled back when not needed.
IaaS, PaaS and SaaS Paradigms
A Cloud-based IoT platform usually needs to select one of the three service
models: IaaS, PaaS or SaaS.
• The initial success of the Cloud is due to the possibility of embedding practically any
legacy application within Virtual Machines (VMs), which are managed by an external
stakeholder.
• This IaaS approach relieves the application owner from managing physical infrastructure.
• PaaS, on the other hand, provides a platform allowing customers to develop, run, and
manage applications.
• Finally, SaaS is a software licensing and delivery model in which software is licensed
on a subscription basis and is centrally hosted.
• SaaS is typically accessed by users using a thin client via a web browser.
• Cloud-based IoT platforms are usually based on the SaaS paradigm and provide IoT-
related services using a web interface on a pay-per-use basis.
• For example, a service such as Xively provides a web service with a database able to
store sensor data points. This data is then processed and displayed in various
graphs.
Requirements of IoT BigData Analytics Platform
Distributed
Scalable
• The platform needs to be scalable in order to address the needs of a
variable number of the devices, services and users.
Real-Time
• The platform needs to be able to process data in real-time, i.e.
providing a fast analysis and responses for situations of urgency.
• A real-time data analysis platform needs to be able to prioritize urgent
traffic and processing over non-urgent ones.
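The prioritization of urgent over non-urgent traffic can be illustrated with a priority queue. This is only a minimal sketch: the two priority levels, the class name `PriorityDispatcher` and the sample messages are assumptions made for illustration, not part of any specific platform.

```python
import heapq
import itertools

# Hypothetical priority levels: a lower number is served first.
URGENT, NORMAL = 0, 1


class PriorityDispatcher:
    """Serve urgent messages before non-urgent ones, FIFO within a level."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker that preserves arrival order

    def submit(self, priority, message):
        heapq.heappush(self._heap, (priority, next(self._order), message))

    def next_message(self):
        """Return the most urgent pending message, or None if nothing is queued."""
        if not self._heap:
            return None
        _, _, message = heapq.heappop(self._heap)
        return message


dispatcher = PriorityDispatcher()
dispatcher.submit(NORMAL, "hourly humidity report")
dispatcher.submit(URGENT, "smoke detected in room 4")
dispatcher.submit(NORMAL, "battery level 87%")

served = []
while (msg := dispatcher.next_message()) is not None:
    served.append(msg)
print(served)
```

The urgent message is served first even though it arrived second, while the two non-urgent messages keep their arrival order.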
Programmable
• The platform shall support the programming of IoT business and service
logic, data warehouse schemas, and templates for data and service
models.
Interoperable
• The platform provides interoperability between the different IoT
services and infrastructure.
• The APIs need to follow the existing standards. The components are
published and maintained as Open Source software.
• The target is to deliver a common data model able to exploit both
structured and unstructured data.
• In order to create multimodal and cross-domain smart applications, it
is necessary to move from raw data to linked data and adopt
unambiguous description of relevant information.
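The step from raw data to linked data can be sketched as wrapping a bare reading in a self-describing, JSON-LD style record. The use of the W3C SOSA vocabulary, the `http://example.org/iot/` namespace, the `ex:unit` property and the function name are all illustrative assumptions; the lecture does not prescribe a particular vocabulary.

```python
import json


def to_linked_observation(sensor_id, property_name, value, unit, timestamp):
    """Wrap a raw reading in a JSON-LD style record with explicit semantics."""
    return {
        "@context": {
            "sosa": "http://www.w3.org/ns/sosa/",   # assumed vocabulary choice
            "ex": "http://example.org/iot/",        # hypothetical namespace
        },
        "@type": "sosa:Observation",
        "sosa:madeBySensor": {"@id": f"ex:sensor/{sensor_id}"},
        "sosa:observedProperty": {"@id": f"ex:property/{property_name}"},
        "sosa:hasSimpleResult": value,
        "sosa:resultTime": timestamp,
        "ex:unit": unit,                            # hypothetical property
    }


# Raw data: a bare tuple whose meaning is known only to the producer.
raw = ("node-17", "temperature", 21.5, "Cel", "2024-01-01T12:00:00Z")
record = to_linked_observation(*raw)
print(json.dumps(record, indent=2))
```

Unlike the raw tuple, the resulting record states what was measured, by which sensor, and when, so that applications from other domains can interpret it without private conventions.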
Secure
• The topmost block represents the Cloud platform, the middle one is
the network connectivity while the bottom one is the local
deployment, including gateway and sensors.
The amount of IoT data coming from real-world smart objects with sensing, actuating,
computing and communication capabilities is exploding. Sensors and devices are
increasingly deployed, in more applications and across industries.
• The volume and quality of the data generated by IoT devices are very different from
those of traditional transaction-oriented business data.
• Coming from millions of sensors and sensor-enabled devices, IoT data is more
dynamic, heterogeneous, imperfect, unprocessed, unstructured and real-time than
typical business data.
• As shown in Figure 2.2, Big Data is defined by four “Vs”: Volume, Velocity, Variety and
Veracity.
• The first V refers to a large volume of data: not gigabytes but rather thousands of terabytes.
• The second V refers to data streams and real-time processing.
• The third V refers to the heterogeneity of the data: structured and unstructured data,
diverse data models, query languages, and data sources.
• The fourth V refers to the trustworthiness and quality of the data.
• A cloud-based IoT analytics platform provides IoT-specific analytics that reduce the
time, cost and required expertise to develop analytics-rich, vertical IoT applications.
• The applications of the IoT BigData Platform can be classified into four main categories
i. Deep understanding and insight knowledge
ii. Real time actionable insight
iii. Performance optimization
iv. Proactive and predictive applications.
Batch Processing
• The most widely used tool for this case is Hadoop MapReduce.
• MapReduce is a programming model and Hadoop is an implementation of it, allowing
large data sets to be processed with a parallel, distributed algorithm on a cluster.
• Apache Pig has its own language, Pig Latin, similar to SQL. The Pig engine parses,
optimizes and automatically executes Pig Latin scripts as a series of MapReduce jobs
on a Hadoop cluster.
• Apache Spark provides high-level APIs in Java, Scala, Python and R, and an optimized
engine that supports general execution graphs.
• Spark can be up to a hundred times faster than MapReduce thanks to its capacity to work
in memory: large working datasets can be kept in memory between jobs, which
considerably reduces latency.
• Apache Flink has almost no latency as the data are streamed in real time (row by row).
• It runs on YARN and works with its own extended version of MapReduce.
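The MapReduce programming model mentioned above can be sketched in plain Python. This is a single-process illustration of the map, shuffle and reduce phases, not Hadoop code; the sensor names and values are made up, and a real cluster would run the phases in parallel across machines.

```python
from collections import defaultdict

# Hypothetical input: (sensor id, temperature reading) records.
readings = [
    ("sensor-a", 20.0), ("sensor-b", 31.0),
    ("sensor-a", 22.0), ("sensor-b", 29.0), ("sensor-c", 18.5),
]


def map_phase(record):
    """Emit (key, (value, count)) so that averages can be combined later."""
    sensor_id, value = record
    yield sensor_id, (value, 1)


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine the partial (sum, count) pairs of one key into an average."""
    total = sum(v for v, _ in values)
    count = sum(c for _, c in values)
    return key, total / count


mapped = [pair for record in readings for pair in map_phase(record)]
averages = dict(reduce_phase(k, vs) for k, vs in shuffle(mapped).items())
print(averages)
```

Because each map call looks at one record and each reduce call looks at one key, both phases can be distributed freely, which is what makes the model suitable for batch processing of large IoT data sets.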
Machine Learning
• Machine learning is especially useful in the context of IoT when some properties of the
data collected need to be discovered automatically.
• Apache Spark comes with its own machine learning library, called MLlib. It consists of
common learning algorithms and utilities, including classification, regression,
clustering, collaborative filtering, and dimensionality reduction.
• Algorithms can be grouped into three domains of action: classification, association and
clustering.
• KNIME is an analytics platform that allows the user to process the data in a user-friendly
graphical interface.
• If the workflow is already deployed on Hadoop, the Mahout machine learning library can
be used.
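As an illustration of the clustering domain mentioned above, the following is a minimal one-dimensional k-means sketch in plain Python. It does not use MLlib, KNIME or Mahout; the temperature values and the initial centroids are assumptions chosen for the example.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Plain 1-D k-means: assign each value to its nearest centroid, then
    move each centroid to the mean of the values assigned to it."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Hypothetical readings from rooms at two clearly different temperatures.
temperatures = [20.1, 19.8, 20.5, 35.2, 34.9, 20.0, 35.5]
centroids, clusters = kmeans_1d(temperatures, centroids=[15.0, 40.0])
print(centroids)
print(clusters)
```

Without being told which readings belong together, the algorithm separates the values around 20 from the values around 35, which is the kind of property discovery the lecture refers to.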
• Freeboard offers simple dashboards, which are readily useable sets of widgets able to
display data.
• Tableau Public is a free service that lets anyone publish interactive data to the web.
• Once on the web, anyone can interact with the data, download it, or create their own
visualizations of it.
• No programming skills are required. Tableau allows analysed data to be uploaded
from .csv files, for instance.
• The visualization tool is very powerful and allows a deep exploration of the data.
• Kibana is an open source analytics and visualization platform designed to work with
Elasticsearch.
• Kibana allows searching, viewing, and interacting with data stored in Elasticsearch
indices.
• It can perform advanced data analysis and visualize data in a variety of charts, tables,
and maps.
• Elasticsearch allows big volumes of data to be stored, searched, and analyzed quickly
and in near real time.
• It is designed for real-time analytics and is most commonly used with Flink or Spark
Streaming.
Data Collection Using Low-power, Long-range Radios
• Large-scale deployment of IoT devices is still held back by technical
challenges such as short communication distances.
• During the last decade, low-power but short-range radios such as IEEE 802.15.4
have been considered by the WSN community, with multi-hop routing to overcome the
limited transmission range.
• Future 5G/LTE standards do have an IoT orientation, but these technologies and
standards are not ready yet while the demand is already high.
• Recent so-called Low-Power Wide Area Networks (LPWAN), such as those based on
Sigfox™ or Semtech’s LoRa™ [1] technology, provide a better connectivity answer for
IoT, as several kilometers can be covered without relay nodes to reach a central
gateway or base station.
• Most long-range technologies can achieve a range of 20 km or more in line-of-sight
(LOS) conditions and about 2 km in urban non-line-of-sight (NLOS) conditions [2].
• Given cost and network availability constraints, LoRa technology, which can be privately
deployed in a given area without any operator, has a clear advantage over Sigfox, whose
coverage is entirely operator-managed.
• These low-power, long-range radio technologies will allow a huge number of
sensors to be installed in remote areas, thus increasing the amount of data to be
processed in the IoT Cloud platform.
• In this typical architecture, data captured by end-devices are sent to a gateway, which
pushes the data to well-identified network servers.
• Application servers managed by end-users can then retrieve data from the network
server.
• If encryption is used for confidentiality, the application server can be the place where data are
decrypted and presented to end-users.
• The LoRa Alliance has issued the LoRaWAN specification [3] in an attempt to standardize
public, large-scale LoRa LPWAN infrastructures featuring multiple gateways and a full
network/application server architecture.
• This specification also defines the set of common channels for communications, the packet format,
and the Medium Access Control (MAC) commands.
• This architecture can however be greatly simplified for small, ad-hoc deployment scenarios where
the gateway can directly push data to some servers or IoT-specific cloud platforms if properly
configured.
• The implementation of the full LoRaWAN specification requires gateways to be able to listen on
several channels and LoRa settings simultaneously.
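The data path described above (end-device, gateway, network server, application server) can be sketched as follows. The point of the sketch is that the gateway and network server only relay opaque frames, while decryption happens at the application server. The XOR keystream is a deliberately simple stand-in so that the example runs with the standard library only; it is NOT secure, and LoRaWAN itself relies on AES-128. The key, function names and payload are hypothetical.

```python
import hashlib
import json


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher (not secure): XOR data with a SHA-256 derived keystream.
    Applying it twice with the same key restores the original data."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Hypothetical key shared by the end-device and the application server only.
APP_KEY = b"hypothetical-shared-app-key"


def end_device_send(reading: dict) -> bytes:
    """End-device: serialize and encrypt the reading before transmission."""
    return keystream_xor(APP_KEY, json.dumps(reading).encode())


def gateway_forward(frame: bytes, network_server: list) -> None:
    """Gateway: push the opaque frame to the network server unchanged."""
    network_server.append(frame)


def application_server_fetch(network_server: list) -> list:
    """Application server: retrieve frames and decrypt them for end-users."""
    return [json.loads(keystream_xor(APP_KEY, f).decode()) for f in network_server]


network_server = []  # stands in for the network server's frame store
gateway_forward(end_device_send({"node": 3, "temp_c": 22.4}), network_server)
decoded = application_server_fetch(network_server)
print(decoded)
```

In the simplified ad-hoc scenario, `gateway_forward` would push directly to a cloud platform instead of a network server; the rest of the path stays the same.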
• A cloud is a remote server that hosts applications, services and platforms to provide
services and solutions within an IoT environment.
• The IoT devices producing data may be located anywhere around the globe. So,
a cloud must be able to connect, communicate and direct data and information
on an “Any Time, Any Place & Any Thing” basis.
• The storage, processing and analysis of data on a cloud to derive useful insights are
collectively known as “Cloud Computing”.
• The major activities in cloud computing involve delivering data, applications,
photos, videos and much more over the Internet to the cloud.
• Cloud computing acts as a paradigm for big data storage and analytics. The real
evolution comes from combining IoT with cloud computing. For example, sensory
data can be uploaded and stored with cloud computing, later to be used
intelligently for smart monitoring and actuation with other smart devices.
• Cloud computing in IoT features rapid elasticity which means that users can
readily scale the services to their needs. They can easily and quickly edit the
software setup, add or remove users, increase storage space, etc. These
characteristics further empower IoT by providing elastic computing power,
storage, and networking.
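Rapid elasticity can be illustrated with a simple scaling rule that sizes the number of server instances to the current load. This is a sketch under stated assumptions: the per-instance capacity, the instance limits and the load figures are invented for the example and do not describe any particular cloud provider's autoscaler.

```python
import math


def desired_instances(load_msgs_per_s, capacity_per_instance,
                      min_instances=1, max_instances=20):
    """Return how many instances are needed for the given load,
    clamped to the allowed range."""
    needed = math.ceil(load_msgs_per_s / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


# Hypothetical load profile over a day, in messages per second.
plan = {load: desired_instances(load, capacity_per_instance=500)
        for load in (50, 1200, 4800, 300)}
print(plan)
```

Resources grow when the load rises and shrink again when it falls, so the user pays only for what the IoT workload actually needs.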
IoT cloud Challenges
Any cloud (server) must have the following
capabilities:
1. It must be capable of accessing, storing, analyzing, processing and
monitoring big data.
IoT services may be accessed by billions of devices, which could
have a lot of data to share and exchange over the network and
across the cloud. There are many other real-life examples of
cloud-based services and applications. For example, even Gmail is
a cloud service that is used from computers, laptops and mobile
devices. A Gmail user can register an account, access his/her
account, and send or receive emails from anywhere and any device.
There are billions of Gmail accounts and each account could have
hundreds or thousands of sent and received mails.
2. Data must remain secure and protected on a cloud.
The data stored on a cloud could be private or even confidential. It
must be securely stored on the cloud so that only an authenticated
user or device can access it. Also, while being communicated over the
network, it must remain secure so that it cannot be intercepted or
altered by an attacker.
3. The cloud must be capable of providing inter-service and
inter-device communication.
Big data and big data analytics require abundant IT resources
and efficient software. A cloud must be able to optimize the
requirements of the IT infrastructure, so that the purpose of an IoT
system is fulfilled in the most cost-effective manner.
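The second requirement above, that only an authentic device may be accepted by the cloud, can be sketched with a keyed message authentication code (HMAC). This covers authentication and tamper detection only; confidentiality in transit would additionally need encryption such as TLS. The device ID, the secret key and the in-memory key table are hypothetical.

```python
import hashlib
import hmac

# Hypothetical per-device secrets provisioned at registration time.
DEVICE_KEYS = {"sensor-001": b"hypothetical-secret-key"}


def sign(device_id: str, payload: bytes, key: bytes) -> str:
    """Device side: compute an HMAC-SHA256 tag over the device ID and payload."""
    return hmac.new(key, device_id.encode() + b"|" + payload, hashlib.sha256).hexdigest()


def is_authentic(device_id: str, payload: bytes, tag: str) -> bool:
    """Cloud side: accept the message only if the tag matches a known device key."""
    key = DEVICE_KEYS.get(device_id)
    if key is None:
        return False
    expected = sign(device_id, payload, key)
    return hmac.compare_digest(expected, tag)  # constant-time comparison


payload = b'{"temp_c": 22.4}'
tag = sign("sensor-001", payload, DEVICE_KEYS["sensor-001"])

print(is_authentic("sensor-001", payload, tag))              # genuine message
print(is_authentic("sensor-001", b'{"temp_c": 99.9}', tag))  # tampered payload
print(is_authentic("sensor-999", payload, tag))              # unknown device
```

Only the first call is accepted: a modified payload or an unregistered device ID produces a tag mismatch.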
Architecture of IoT cloud
The architecture outlines all the background processes that
usually remain hidden and abstracted. A typical cloud architecture
can be represented by the following block diagram.
Data Ingestion: The first thing that a cloud needs to do is to collect data from
the network. The process of importing data from IoT devices to the cloud is
called the data ingestion process.
The ingestion layer aggregates the data from IoT devices and distributes it for
further processing. It also works in the reverse direction by distributing data to
the IoT devices. Clouds use specific protocols both when receiving data from the
network and when transmitting data to the devices.
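The two directions of data ingestion can be sketched with a small in-memory hub: uplink messages from devices are distributed to every registered consumer, and downlink commands are queued per device. The class name `IngestionHub`, the message fields and the 30 °C alert condition are assumptions for illustration; a real platform would sit behind a network protocol such as MQTT or HTTP.

```python
from collections import defaultdict, deque


class IngestionHub:
    """Collect device messages and fan them out; queue commands for devices."""

    def __init__(self):
        self._consumers = []                 # processing stages fed by ingestion
        self._downlink = defaultdict(deque)  # pending commands per device

    def subscribe(self, consumer):
        self._consumers.append(consumer)

    def ingest(self, device_id, payload):
        """Uplink: tag the payload with its source and distribute it."""
        message = {"device": device_id, **payload}
        for consumer in self._consumers:
            consumer(message)

    def send_to_device(self, device_id, command):
        """Downlink: queue a command until the device polls for it."""
        self._downlink[device_id].append(command)

    def poll_commands(self, device_id):
        commands = list(self._downlink[device_id])
        self._downlink[device_id].clear()
        return commands


hub = IngestionHub()
storage, alerts = [], []
hub.subscribe(storage.append)
hub.subscribe(lambda m: alerts.append(m) if m.get("temp_c", 0) > 30 else None)

hub.ingest("node-1", {"temp_c": 22.0})
hub.ingest("node-2", {"temp_c": 35.5})
hub.send_to_device("node-2", "start_fan")

print(len(storage), len(alerts), hub.poll_commands("node-2"))
```

Every message reaches storage, only the hot reading reaches the alert consumer, and the command waits in the hub until the device collects it.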
Pipelining:
After the data has been collected, it is the responsibility of the IoT cloud to
manage the data properly. The process of pipelining includes several
tasks, as follows.
Device Registry:
To manage the data on the cloud, devices are registered
with unique IDs so that both users and the cloud can
uniquely identify the devices and access them.
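A device registry can be sketched as a mapping from unique IDs to device metadata. Using random UUIDs as identifiers and keeping the registry in memory are assumptions made for this example; the class and field names are hypothetical.

```python
import uuid


class DeviceRegistry:
    """Assign each device a unique ID and keep its metadata for lookup."""

    def __init__(self):
        self._devices = {}

    def register(self, name, kind):
        """Register a device and return its newly generated unique ID."""
        device_id = str(uuid.uuid4())
        self._devices[device_id] = {"name": name, "kind": kind}
        return device_id

    def lookup(self, device_id):
        """Return the device metadata, or None for an unknown ID."""
        return self._devices.get(device_id)

    def is_registered(self, device_id):
        return device_id in self._devices


registry = DeviceRegistry()
kitchen_id = registry.register("kitchen thermometer", "temperature")
garage_id = registry.register("garage thermometer", "temperature")

print(registry.lookup(kitchen_id))
print(registry.is_registered("not-a-real-id"))
```

Two devices of the same kind still receive different IDs, so data arriving at the cloud can always be attributed to exactly one device.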
Transforming Data:
When the data arrives at the cloud, it may need to be converted to other
formats. For example, the signal voltages acquired from a temperature and
humidity sensor may need to be converted to calibrated units of temperature
and humidity.
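The conversion from raw sensor signals to calibrated units can be sketched as follows. The 10-bit ADC, the 3.3 V reference and the linear calibration constants (0.5 V offset and 10 mV per degree for temperature, a ratiometric output for humidity) are assumptions chosen for illustration; real values come from the sensor's datasheet.

```python
def adc_to_voltage(count, v_ref=3.3, resolution_bits=10):
    """Convert a raw ADC count to volts (assumed 10-bit ADC, 3.3 V reference)."""
    return count * v_ref / (2 ** resolution_bits - 1)


def voltage_to_celsius(voltage, offset_v=0.5, volts_per_degree=0.01):
    """Linear calibration: assumed 0.5 V at 0 degrees C and 10 mV per degree."""
    return (voltage - offset_v) / volts_per_degree


def voltage_to_humidity(voltage, v_ref=3.3):
    """Assumed ratiometric sensor: 0 V is 0 % RH and v_ref is 100 % RH."""
    return max(0.0, min(100.0, 100.0 * voltage / v_ref))


raw_temperature_count = 310   # hypothetical ADC reading
raw_humidity_voltage = 1.65   # hypothetical sensor output in volts

temperature_c = voltage_to_celsius(adc_to_voltage(raw_temperature_count))
humidity_pct = voltage_to_humidity(raw_humidity_voltage)
print(round(temperature_c, 1), round(humidity_pct, 1))
```

With these assumed constants, a count of 310 corresponds to roughly 1.0 V and therefore to about 50 degrees Celsius, and 1.65 V corresponds to about 50 % relative humidity.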
Data storage:
The data from the physical world comes in various forms. Cloud platforms
provide enough storage to store the data in a structured manner. They
provide reliable, high-performance and secure storage space. Bulk
amounts of data can be stored on a cloud and accessed remotely
at any time from anywhere in the world.
Real Time Data Analytics:
After the data is stored, cloud platforms can perform analytics on that
data and convert it into actions for human use. Cloud platforms analyze
the data and process it with some kind of mathematics or logic, i.e.
they convert the data into information in order to perform automated tasks.
After processing the data, the cloud can send the results to other IoT devices so
that they perform tasks based on them, or it can provide the data to other computing
engines.
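Turning analyzed data into an automated action can be sketched with a moving-average rule: the cloud smooths recent readings and sends an on/off command to another device. The window size, the 30 °C threshold and the `fan-1` target are hypothetical values for this example.

```python
from collections import deque


class ThresholdRule:
    """Average the last few readings and decide a command for an actuator."""

    def __init__(self, window=3, threshold_c=30.0):
        self._window = deque(maxlen=window)  # keeps only the latest readings
        self._threshold = threshold_c

    def update(self, temp_c):
        """Add a reading and return the command for the (hypothetical) fan."""
        self._window.append(temp_c)
        average = sum(self._window) / len(self._window)
        command = "on" if average > self._threshold else "off"
        return {"target": "fan-1", "command": command}


rule = ThresholdRule()
commands = [rule.update(t)["command"] for t in (28.0, 29.0, 31.0, 33.0, 34.0)]
print(commands)
```

Averaging over a short window avoids reacting to a single noisy sample: the fan is switched on only once the recent readings are consistently above the threshold.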
IoT Clouds as Service Providers
Clouds are the service providers in an IoT environment. Cloud computing
offers many as-a-Service roles, all of which are cloud
services rendered by the cloud platforms.
Infrastructure as a Service (IaaS):
• IaaS sits at the bottom of the cloud computing services stack. The IaaS
provider delivers the fundamental building blocks of computing resources to
clients over the Internet, such as networking resources, hardware, software,
storage and many other IT infrastructure components.
• The IaaS provider grants clients access to all these resources and is responsible
for housing, operating and maintaining all this equipment.
• IaaS users pay on a per-use basis, typically by the hour, week or month. Some IaaS
providers also charge users according to the storage space they use.
• The vendor/provider is responsible for configuring and maintaining the
infrastructure. So, users or enterprises do not need to worry about infrastructure
upgrades and can focus on the task or application instead.
• Some of the well-known IaaS providers are Amazon Web Services, Microsoft Azure
(formerly Windows Azure) and Google Compute Engine.
Platform as a Service (PaaS):
• PaaS provides a platform for users to develop, run and manage applications
without the complexity of building and dealing with the infrastructure of the
platform.
• PaaS is typically associated with developing and launching an app in a constrained
environment.
• Some examples of PaaS solutions are the “Google App Engine” system, “Heroku”,
which operates on top of Amazon Web Services, and “Force.com”, built as part of
the Salesforce.com Software as a Service offering.
Software as a Service (SaaS):
• The top layer of cloud computing is SaaS, in which the end user
only operates software applications such as email or word
processing. Instead of installing and maintaining software, the
users simply access these applications over the Internet.
• They do not need to worry about complex software
development and hardware management.
Google Cloud IoT is a set of fully managed and integrated services that make it possible
to easily and securely connect, manage and ingest IoT data from globally dispersed devices,
process and analyze the data in real time, implement operational changes and take
actions as needed.
a) End-to-End Security -
It enables end-to-end security with certificate-based authentication and TLS
1.2 transport security.
b) Integrated Services -
It seamlessly moves IoT data across Google Cloud services.
• AWS IoT is a managed cloud platform that lets IoT devices
securely connect to and interact with cloud applications and other
devices.
• AWS IoT can handle up to billions of devices and trillions of messages securely with
low latency and low overhead.
• AWS IoT supports communication protocols and standards such as HTTP, MQTT and
WebSockets.
Note: Some other PaaS Cloud platforms, such as Microsoft Azure IoT Suite, IBM
Watson IoT, Salesforce IoT Cloud and Carriots, are also worth mentioning.
ThingSpeak (SaaS):
• It provides an API to store and retrieve data from IoT devices over HTTP.
• It allows users to upload and visualize real-time data from IoT devices. For
example, a developer can publish the surrounding temperature and humidity data,
together with timestamps, on ThingSpeak.
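Publishing a temperature and humidity reading over HTTP can be sketched as building an update request for a ThingSpeak channel. The mapping of temperature to `field1` and humidity to `field2` and the placeholder API key are assumptions; each channel defines its own fields and write key. The `publish` helper needs network access and a real key, so it is defined here but not called.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

THINGSPEAK_UPDATE_URL = "https://api.thingspeak.com/update"


def build_update_url(write_api_key, temperature_c, humidity_pct):
    """Build the HTTP GET URL that writes one entry to a channel.
    Assumes field1 holds temperature and field2 holds humidity."""
    query = urlencode({
        "api_key": write_api_key,
        "field1": temperature_c,
        "field2": humidity_pct,
    })
    return f"{THINGSPEAK_UPDATE_URL}?{query}"


def publish(write_api_key, temperature_c, humidity_pct, timeout=10):
    """Send one reading; requires network access and a valid write key."""
    url = build_update_url(write_api_key, temperature_c, humidity_pct)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode()


# Placeholder key: replace with the write key of your own channel.
url = build_update_url("YOUR_WRITE_API_KEY", 23.5, 61)
print(url)
```

The platform timestamps each entry on arrival, so the device only has to send the field values.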