IOT Data Management and Analytics
Traditional vs. IOT data management.
Data processing is simply the conversion of raw data into meaningful information.
The life cycle of data within an IoT system proceeds from data production to aggregation,
transfer, optional filtering and preprocessing, and finally storage and archiving.
Querying and analysis are the end points that initiate and consume data production.
Production, collection, aggregation, filtering, and some basic querying and preliminary
processing functionalities are considered online, communication-intensive operations.
Traditional Internet vs. Internet of Things
IOT data lifecycle management.
Storage operations aim at making data available over the long term for constant access and
updates.
Archival is concerned with read-only data.
An IoT system may generate, process, and store data in the network for real-time and localized
services.
Elements of the IoT data life cycle.
Querying.
Production.
Collection.
Aggregation/Fusion.
Delivery
Preprocessing.
Storage, Update and archiving.
Processing/ Analysis.
Output and interpretation.
Production.
Sensing and transfer of data by the edge devices within the IoT framework.
Reporting of this data to interested parties periodically.
Data is usually time-stamped and possibly geo-stamped, and can take the form of simple key-value pairs.
It may also contain rich audio/video/image content, with varying degrees of complexity in
between.
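As a rough illustration of the simple key-value case, the sketch below builds a time-stamped and optionally geo-stamped reading. The function name `make_reading`, the field names, and the sample values are assumptions made for this example, not part of any particular IoT platform.

```python
import json
import time

def make_reading(sensor_id, value, lat=None, lon=None, timestamp=None):
    """Build a time-stamped (and optionally geo-stamped) key-value reading."""
    reading = {
        "sensor_id": sensor_id,
        "value": value,
        # Use the supplied timestamp if given, otherwise the current time.
        "timestamp": time.time() if timestamp is None else timestamp,
    }
    # The geo stamp is optional: attach it only when both coordinates exist.
    if lat is not None and lon is not None:
        reading["location"] = {"lat": lat, "lon": lon}
    return reading

# Hypothetical sensor id and coordinates, for illustration only.
reading = make_reading("temp-01", 21.5, lat=6.9271, lon=79.8612, timestamp=1700000000)
payload = json.dumps(reading)  # serialized form an edge device might report
```

Rich audio, video, or image content would need a different representation; this sketch covers only the key-value end of the range.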
Collection.
Sensors and smart objects within the IoT store the data for a certain time interval or report
it to governing components.
Data may be collected at concentration points or gateways within the network.
It is further filtered and processed into compact forms for efficient transmission.
Collection is the first stage of the cycle and is crucial, since the quality of the data collected will
heavily impact the output.
The collection process needs to ensure that the data gathered are well defined and accurate, so that
subsequent decisions based on the findings are valid.
Aggregation/fusion.
Transmitting all the raw data out of the network in real time is often prohibitively expensive,
given increasing data rates and limited bandwidth.
Summarization and merging operations are deployed in real time to compress the volume of data
to be stored and transmitted.
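One possible form of summarization is sketched below: consecutive fixed-size windows of raw readings are collapsed into one compact summary record each. The window size, the record fields, and the function names are assumptions chosen for illustration; real deployments may use time-based windows or other fusion techniques.

```python
from statistics import mean

def summarize_window(readings):
    """Collapse a window of raw readings into one compact summary record."""
    values = [r["value"] for r in readings]
    return {
        "sensor_id": readings[0]["sensor_id"],
        "start": readings[0]["timestamp"],
        "end": readings[-1]["timestamp"],
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }

def aggregate(readings, window_size):
    """Summarize consecutive fixed-size windows of readings."""
    return [
        summarize_window(readings[i:i + window_size])
        for i in range(0, len(readings), window_size)
    ]

# Six made-up raw readings become two summary records.
raw = [{"sensor_id": "temp-01", "timestamp": t, "value": 20 + t % 3} for t in range(6)]
summaries = aggregate(raw, window_size=3)
```

Only the summaries would then be transmitted, trading detail for a smaller volume of data.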
Delivery
Data is filtered, aggregated, and possibly processed at the concentration points or at the
autonomous virtual units within the IoT.
The results of these processes may need to be sent further up the system for storage and in-depth analysis.
Wired or wireless broadband communication is used to transfer data to permanent
stores.
Preprocessing.
IoT data will come from different sources with varying formats and structures.
Data needs to be preprocessed to handle missing data, remove redundancies, and integrate data
from different sources into a unified schema before being committed to storage.
Preprocessing manipulates data into a form suitable for further analysis and processing.
Preparation is about constructing a data set from one or more data sources to be used for
further exploration and processing.
Analyzing data that has not been carefully screened for problems can produce highly misleading results;
the results depend heavily on the quality of the data prepared.
Data cleaning is a preprocessing procedure used in data mining.
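The three preprocessing tasks named above (handling missing data, removing redundancies, integrating sources into a unified schema) can be sketched as follows. The source field names (`id`, `ts`, `temp_c`) and the policy of dropping incomplete records are assumptions for this example; imputing missing values is an equally valid choice.

```python
def first_present(record, *keys):
    """Return the first non-missing value found under any of the given keys."""
    for key in keys:
        if key in record and record[key] is not None:
            return record[key]
    return None

def to_unified(record):
    """Map source-specific field names onto one unified schema."""
    return {
        "sensor_id": first_present(record, "sensor_id", "id"),
        "timestamp": first_present(record, "timestamp", "ts"),
        "value": first_present(record, "value", "temp_c"),
    }

def preprocess(records):
    """Unify schemas, drop records with missing fields, and remove duplicates."""
    seen = set()
    cleaned = []
    for record in map(to_unified, records):
        if any(v is None for v in record.values()):
            continue  # missing data: dropped here (imputation is an alternative)
        key = (record["sensor_id"], record["timestamp"])
        if key in seen:
            continue  # redundant copy of a reading already kept
        seen.add(key)
        cleaned.append(record)
    return cleaned

# Two hypothetical sources with different field names.
source_a = [{"sensor_id": "s1", "timestamp": 1, "value": 20.0},
            {"sensor_id": "s1", "timestamp": 1, "value": 20.0}]  # duplicate
source_b = [{"id": "s2", "ts": 1, "temp_c": 19.5},
            {"id": "s2", "ts": 2, "temp_c": None}]               # missing value
cleaned = preprocess(source_a + source_b)
```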
Storage/Update and archiving.
Handles the efficient storage and organization of data. Data is continuously updated with new
information as it becomes available. Archiving refers to the offline, long-term storage of
data that is not immediately needed for the system's ongoing operations.
Allows quick access to and retrieval of the processed information, and allows it to be passed
directly to the next stage when needed.
Rather than relying on relational databases for big data analytics, people are moving to
NoSQL databases.
Storage can also be decentralized for autonomous IOT systems.
Due to the limited capabilities of such objects, storage capacity remains limited in
comparison to the centralized storage model.
Processing/Analysis.
This phase involves the ongoing retrieval and analysis operations performed on stored
and archived data in order to gain insights into historical data and predict future trends, or to
detect abnormalities in the data that may trigger further investigation or action.
Data needs to be filtered and cleaned before meaningful operations can take place.
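As one simple way to detect abnormalities in stored data, the sketch below flags values that lie far from the mean of a history. The z-score rule and the threshold of 2.0 are assumptions chosen for illustration; the slides do not prescribe a specific detection method.

```python
from statistics import mean, pstdev

def find_anomalies(values, threshold=2.0):
    """Return indices of values more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical: nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]

# Made-up temperature history with one obvious outlier at index 5.
history = [20.1, 20.3, 19.9, 20.0, 20.2, 35.0, 20.1, 19.8]
anomalies = find_anomalies(history)
```

A flagged index could then trigger the further investigation or action mentioned above.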
Output and interpretation.
The design primitives determine the logical and physical structure of the data
management solution.
Design primitives are organized into three main dimensions.
Data collection.
Targets the discovery and identification of “Things” and subsystems, static or mobile, whose data is
to be fed to the IoT data store.
Data management system design.
Data management system design elements address the architecture of the database management
system and how data is stored and archived.
Processing.
Processing elements deal with the actual access to the stored data.
Data collection element.
Mobility support.
Mobile devices move, so they need to access data stores in a transparent way.
Session-based synchronization systems are used for data exchange and storage. Publish/subscribe-based
systems are used for notification-based data delivery from mobile devices.
Such systems are in place for the vehicular and WSN IoT subspaces.
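The notification-based delivery idea can be sketched with a minimal in-memory publish/subscribe broker. The `Broker` class and the topic names are hypothetical and stand in for a real messaging system; the point is only that publishers and subscribers are decoupled through topics.

```python
from collections import defaultdict

class Broker:
    """Minimal in-memory publish/subscribe broker (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback to be notified of messages on a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver a message to every subscriber of the topic."""
        # Publishers never address subscribers directly, so a mobile
        # device can keep publishing without knowing who is listening.
        for callback in self._subscribers.get(topic, []):
            callback(message)

broker = Broker()
received = []
broker.subscribe("vehicle/42/position", received.append)
broker.publish("vehicle/42/position", {"lat": 6.93, "lon": 79.86, "ts": 1})
broker.publish("vehicle/7/position", {"lat": 7.29, "lon": 80.63, "ts": 1})  # no subscriber
```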
Database system design element.
Federated architecture.
Distributed and federated database systems can be useful for the IOT data management.
Distributed database systems: distribute the database over multiple locations.
Federated architecture: manages independent and heterogeneous data stores at multiple sites.
With federated architectures, there is a trend toward building IoT systems around service-oriented
architecture.
There is a need to join real-time spatio-temporal data residing in the lower IoT layers with historical
data residing in the database server.
For example, e-health.
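A small e-health flavored sketch of joining real-time data with historical data follows. The in-memory SQLite database stands in for the database server, and the table, column names, and patient values are invented for this example.

```python
import sqlite3

# Historical store (stands in for the database server).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE baseline (patient_id TEXT PRIMARY KEY, resting_hr REAL)")
db.executemany("INSERT INTO baseline VALUES (?, ?)", [("p1", 62.0), ("p2", 75.0)])

# Real-time readings as they might arrive from the lower IoT layers.
live = [
    {"patient_id": "p1", "ts": 100, "heart_rate": 110.0},
    {"patient_id": "p2", "ts": 100, "heart_rate": 78.0},
]

def join_with_history(readings, conn):
    """Attach each live reading to its patient's historical baseline."""
    joined = []
    for r in readings:
        row = conn.execute(
            "SELECT resting_hr FROM baseline WHERE patient_id = ?",
            (r["patient_id"],),
        ).fetchone()
        if row is not None:  # skip readings with no historical record
            joined.append({**r, "resting_hr": row[0],
                           "delta": r["heart_rate"] - row[0]})
    return joined

joined = join_with_history(live, db)
```

The `delta` field shows why the join matters: a live value is only meaningful relative to the patient's own history.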
Database system design element.
Schema support.
A database schema formally defines the structure of a database system. In the relational
model, the schema is defined beforehand as tables and relationships linking those tables, and all
data insertions/updates must adhere to that schema.
Query optimization is challenging in schema-less systems because of the lack of knowledge
about indices, table partitions, cardinalities, and statistics about data values.
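The relational case, in which a schema is declared up front and every insertion must adhere to it, can be sketched with Python's built-in `sqlite3` module. The `sensor` and `reading` tables are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# The schema is defined beforehand: tables plus the relationship linking them.
conn.execute("CREATE TABLE sensor (id TEXT PRIMARY KEY, kind TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE reading ("
    " sensor_id TEXT NOT NULL REFERENCES sensor(id),"
    " ts INTEGER NOT NULL,"
    " value REAL NOT NULL)"
)
conn.execute("INSERT INTO sensor VALUES ('s1', 'temperature')")
conn.execute("INSERT INTO reading VALUES ('s1', 1, 20.5)")  # adheres to the schema

def try_insert(sql):
    """Return True if the insert is accepted, False if the schema rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# Both inserts violate the schema: a missing value and an unknown sensor.
ok_missing_value = try_insert("INSERT INTO reading (sensor_id, ts) VALUES ('s1', 2)")
ok_unknown_sensor = try_insert("INSERT INTO reading VALUES ('s9', 3, 21.0)")
```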
Processing Elements.
Access Model.
To access data, query languages have been used for relational systems and were later
adapted to sensor networks. Structured Query Language (SQL) has been the de facto standard
for data access, with standard selection/projection/join/aggregation operations that can be
nested for complex queries.
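The standard operations mentioned above can be seen together in one query. The sketch below uses Python's built-in `sqlite3` module; the tables and sample values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sensor (id TEXT PRIMARY KEY, room TEXT);
CREATE TABLE reading (sensor_id TEXT, ts INTEGER, value REAL);
INSERT INTO sensor VALUES ('s1', 'lab'), ('s2', 'lab'), ('s3', 'office');
INSERT INTO reading VALUES
  ('s1', 1, 20.0), ('s1', 2, 22.0),
  ('s2', 1, 24.0), ('s3', 1, 19.0), ('s3', 2, 5.0);
""")

rows = conn.execute("""
SELECT s.room, AVG(r.value) AS avg_value          -- projection + aggregation
FROM reading AS r
JOIN sensor AS s ON s.id = r.sensor_id            -- join
WHERE r.value > (SELECT MIN(value) FROM reading)  -- selection with a nested query
GROUP BY s.room
ORDER BY s.room
""").fetchall()
```

Here the nested subquery excludes the single lowest reading before the per-room averages are computed.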
Because SQL has grown complex through continuous extensions as new capabilities are
added, developers of the various applications envisioned for IoT will find it hard to learn all
of SQL’s dialects and tricks when they may need only a subset of them. Therefore, it has been
suggested that a more flexible form be used, in which an SQL dialect or predefined
configuration is chosen according to the specific requirements of the scenario at hand.
Q&A
Thank you.