IoT & Its Applications Unit-IV
Structured Data:
Structured data is data that is organized according to a predefined model or schema. All
relational databases hold structured data, and it is usually categorized as quantitative
data: it fits neatly into fixed fields and columns, as in a spreadsheet. Examples of
structured data include names, dates, addresses, credit card numbers, stock information,
geolocation, and more. In relational databases we can input, search, and manipulate
structured data quickly.
Unstructured Data:
Unstructured data has no predefined data model, and much of the data a business
acquires is of this kind. Non-relational, or NoSQL, databases are used for managing
unstructured data. More than 80 percent of all data generated by business processes today
is considered to be unstructured. Advanced analytics is used to work with unstructured
data: for example, data mining, machine learning and natural language processing
techniques are used to analyse unstructured text, video and images.
For example, data from sensors attached to industrial machinery can alert
manufacturers of strange activity ahead of time. With this information, a repair can be made
before the machine suffers a costly breakdown.
Semi structured data:
This is hybrid data that shares attributes of both structured and unstructured
data. It has some schema and consistency, but not a rigid one. Email and JSON are
examples of semi-structured data.
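As a small illustration of the difference, the sketch below holds the same sensor reading
first as a fixed-column row (structured) and then as a JSON document (semi-structured).
The field names and values are invented for the example.

```python
import json

# Structured: every record has the same fixed columns, in the same order.
columns = ("sensor_id", "timestamp", "temperature_c")
row = ("s-01", "2024-01-01T10:00:00", 22.5)
structured = dict(zip(columns, row))

# Semi-structured: JSON carries its own field names, and a record may omit
# fields or add nested ones that other records do not have.
doc_text = '{"sensor_id": "s-01", "temperature_c": 22.5, "tags": {"room": "lab"}}'
semi_structured = json.loads(doc_text)

print(structured["temperature_c"])
print(semi_structured["tags"]["room"])
print("timestamp" in semi_structured)   # this document simply has no timestamp
```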
Data in IoT is handled either as data in transit (motion) or as data at rest. The
data acquired from IoT sensor objects is data in motion. Data in motion is processed by
fog and edge computing, and data is then sent from the fog and edge layers to the data center.
Data in motion:
Data in motion is data that is actively moving from one location to another,
for example data being transferred between two networks.
Data at rest:
Data at rest is data that is not actively moving from device to device or
network to network, such as data stored on a hard drive, laptop, or flash drive (e.g., a
USB drive).
Protecting sensitive data both in transit and at rest is essential for modern
systems, as intruders find increasingly sophisticated ways to steal data. Spark, Storm and
Flink are tools commonly used for analysing streaming data (data in motion). Many tools
are available for processing structured data at rest. Hadoop supports both data processing
and data storage.
IoT data from smart devices can be analysed in several ways. Most IoT
systems deploy descriptive and diagnostic analysis. Prescriptive and predictive analysis
are more complex to implement, but modern businesses are moving towards them.
Descriptive analysis
Diagnostic analysis
Predictive analysis
Prescriptive analysis
Data from IoT sensors poses challenges for relational databases. The
challenges include scaling problems and the volatility of data.
Machine learning:
The data generated by IoT sensors is processed by a set of algorithms and
tools to uncover the relationships within the data. This data processing is carried out by
machine learning. Data obtained from the sensors must be analyzed in order to take proper
decisions.
Machine learning is an important tool for IoT and data analytics. Machine
learning, deep learning, neural networks and convolutional networks are terms related to
this field. Self-driving vehicles, which have a self-learning capacity to make intelligent
decisions while driving, are a result of advancements in machine learning.
Predictions.
Foreseeing.
Both Amazon and Netflix use machine learning to learn our preferences and
deliver a better experience to the customer. The figure below depicts the roles and
responsibilities of ML in IoT and data analytics across various industries.
Figure 4.5: Various Fields Integrated With Deep Intelligence.
Supervised learning:
Unsupervised Learning:
Figure 4.6: Unsupervised Learning
When the given data is unlabeled and the algorithm finds the different
categories in the input by itself, it is said to be unsupervised learning. The algorithm
finds the different groups in the given unlabeled data set. This grouping can be performed
by K-means clustering: the mean of each group is calculated, and each data point is grouped
with the mean it is closest to. The figure depicts three different clusters formed from a
given set of unlabeled data.
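The grouping step can be sketched in a few lines. This is a minimal one-dimensional
K-means, assuming made-up sensor readings and hand-picked starting centroids; practical
implementations choose the starting centroids more carefully and work on multi-dimensional
data.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Group 1-D points around the nearest centroid (plain K-means)."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:   # nothing moved, so the result is stable
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical unlabeled readings and starting centroids.
readings = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 9.0, 9.1, 8.9]
centroids, clusters = kmeans_1d(readings, centroids=[0.0, 4.0, 10.0])
print(clusters)
```

The readings fall into three clusters without any labels being supplied, which is the
behaviour described above.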
Neural networks are an extension of the machine learning approach in which
the system mimics the human brain in order to recognize or differentiate inputs. The
network is formed from a set of layers, namely the input layer, first layer, higher layer,
top layer and output layer. The following figure explains how a system is trained, using a
set of labeled images of animals, to identify a dog by learning to classify them.
At the input layer, an unlabeled image is sent to the pretrained network. The
first layer finds the different shapes; the higher layer identifies complex structures
(features such as a face or an arm); and the top layer identifies highly complex
structures (differentiating the animal categories). The final output layer predicts the
animal based on the training, and the output unit gives the final result with high accuracy.
Neural networks receive much research focus and have been used in various
image processing applications. There are different kinds of neural network, namely the
artificial neural network, convolutional neural network and recurrent neural network.
Deep learning developed the concept further by using a larger number of layers. The
result of one layer is fed into the next layer, and the processing is done quickly at the
intermediate layers. Numerous applications nowadays rely on deep learning and neural
network approaches.
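How the result of one layer is fed into the next can be shown with a tiny forward pass.
This is only a sketch: the weights and biases below are hypothetical hand-picked numbers
(a real network learns them during training), and the sigmoid activation is an assumption
made for the example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dense_layer(inputs, weights, biases):
    """One fully connected layer: weighted sum per neuron, then sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

# Hypothetical (weights, biases) per layer; not trained values.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),  # 2 inputs -> 3 units
    ([[0.7, -0.5, 0.2]], [0.05]),                                 # 3 units -> 1 output
]

activation = [1.0, 2.0]             # values presented at the input layer
for weights, biases in layers:      # the output of one layer feeds the next
    activation = dense_layer(activation, weights, biases)

print(activation)                   # a single score between 0 and 1
```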
Figure 4.7: NETCONF – YANG
Local learning: the data is processed in the sensor node or fog node.
ML for IoT in major domains: A weather sensor can report the pollution
level in a city. Street lights can change their luminosity based on the local light
conditions of the environment. ML integrated with IoT is deployed in various
applications. The following actions are performed by sensors embedded in various
places.
Monitoring: Sensors are used to monitor the environment, for example a
temperature sensor. ML integrated with the sensor can detect failure conditions.
Self healing, Self-optimizing: The system identifies a fault by itself and
finds a corrective action for the identified fault.
Predictive Analytics: This kind of analytics predicts issues that are going
to arise due to faults in the system. Predictive analysis is done to improve the safety
and maintenance of the system. Sensors embedded in machines can predict upcoming
faults with the help of big data analytics.
Data management is handled by big data tools such as Hadoop. Hadoop is the
backbone of various big data applications. The data is collected, stored, manipulated
and analyzed. Big data has three Vs:
Variety: deals with different kinds of data, such as structured, unstructured and
semi-structured data, stored in Hadoop. Data from sensors is an example of structured
data; data from social media is unstructured data.
Volume: deals with the huge volume of data, ranging from gigabytes to
exabytes. Clusters of servers are used for big deployments.
Machine data: Data generated from the sensors embedded in IoT systems
Transaction data: Data obtained from transactions
Social data: Data obtained from social media such as Facebook and Twitter
(a huge amount of data is generated by social media)
Industrial automation and control systems feed their data into relational
databases and historians. Examples of relational databases include Oracle and Microsoft
SQL Server. Historian databases hold the time series data recorded from the sensors.
There are new technologies for handling the data management. They are
NoSQL Databases
Hadoop
Massively Parallel Processing Databases:
The data from enterprises is structured data and is stored in relational
databases. A group of these relational databases together constitutes a data
warehouse. MPP is a concept built on top of relational data warehouses to give
faster access and reduce query time. These systems process data in parallel, which
results in faster query processing. MPP databases are also termed analytic
databases. Refer to the following figure for the MPP shared-nothing architecture. It has
a master node to which all nodes are connected; each node has its own processor, memory
and storage. The whole process is optimized and queried using SQL. Fast
processing is an important aspect of MPP.
NoSQL (“non SQL” or “not only SQL”) databases store data in a format
other than relational tables. Semi-structured and unstructured data are handled by
NoSQL databases. NoSQL databases come in several types, which include document
stores, key-value stores, wide-column stores, and graph stores.
Key value stores: Data is stored in the form of associative arrays, where each
key is paired with a value.
Wide column stores: Store key-value pairs, but the formatting of the values can
vary from row to row.
Graph stores: Describe the relationships between elements. Well suited for natural
language processing and social media.
The data structures used by NoSQL databases are different from those used
by default in relational databases which makes some operations faster in NoSQL.
Schema-free
They don’t require any definition of the schema of the data. They offer
heterogeneous structures of data in the same domain.
Figure 4.9: Difference Between RDBMS And NoSQL DB
Simple API
Distributed
Column-oriented Graph
Graphs based
Document-oriented
Key-value pair storage databases store data as a hash table where each key
is unique, and the value can be a JSON, BLOB (Binary Large Objects), string, etc.
For example, a key-value pair may contain a key like "Website" associated
with a value like "Guru99".
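A key-value store can be imitated with an ordinary hash table. The sketch below uses a
Python dictionary as a stand-in for the database; the `put` and `get` helpers and the
sensor and firmware keys are hypothetical names chosen for illustration.

```python
import json

store = {}                                   # the dict acts as the hash table

def put(key, value):
    store[key] = value                       # keys are unique; a repeat write overwrites

def get(key, default=None):
    return store.get(key, default)

put("Website", "Guru99")                             # string value (example from the text)
put("sensor:42", json.dumps({"temp_c": 21.5}))       # JSON document stored as a value
put("firmware:42", b"\x00\x01\x02")                  # BLOB (raw bytes) stored as a value

print(get("Website"))
print(json.loads(get("sensor:42"))["temp_c"])
print(get("missing"))                                # unknown key returns the default
```

The store never inspects the value; it only looks the key up, which is what keeps reads
and writes simple and fast.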
Column-based
Document-Oriented:
In the diagram, the left side shows rows and columns, and the right side
shows a document database, which has a structure similar to JSON.
The document type is mostly used for CMS systems, blogging platforms,
real-time analytics and e-commerce applications.
A graph type database stores entities as well as the relations amongst those
entities. The entity is stored as a node, with the relationships as edges. An edge gives a
relationship between nodes. Every node and edge has a unique identifier.
Compared to a relational database, where tables are loosely connected, a
graph database is multi-relational in nature.
Graph databases are mostly used for social networks, logistics, and spatial
data.
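The node and edge model can be sketched with two dictionaries keyed by unique
identifiers. The identifiers, names and relationship types below are hypothetical, and a
real graph database adds indexing and a query language on top of this idea.

```python
# Every node and every edge has its own unique identifier.
nodes = {
    "n1": {"label": "Person", "name": "Asha"},
    "n2": {"label": "Person", "name": "Ravi"},
    "n3": {"label": "Device", "name": "Thermostat"},
}
edges = {
    "e1": {"source": "n1", "target": "n2", "type": "FRIEND_OF"},
    "e2": {"source": "n1", "target": "n3", "type": "OWNS"},
}

def neighbours(node_id, edge_type=None):
    """Follow outgoing edges from a node, optionally filtered by relationship type."""
    return [
        edge["target"]
        for edge in edges.values()
        if edge["source"] == node_id and (edge_type is None or edge["type"] == edge_type)
    ]

friends = [nodes[n]["name"] for n in neighbours("n1", "FRIEND_OF")]
print(friends)
print(neighbours("n1"))
```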
Some specific cases when NoSQL databases are a better choice than
RDBMS include the following:
When there is a need to store large amounts of unstructured data with
changing schemas.
Hadoop:
Hadoop Distributed File System: system for storing data across different
nodes.
MapReduce: Processing engine which divides a big task into small ones and runs them in
parallel for faster processing.
Hadoop runs applications using the MapReduce algorithm, where the data is
processed in parallel on different CPU nodes.
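The map and reduce steps can be illustrated with a word count, the usual teaching
example. This sketch runs the phases one after another in a single process; in Hadoop each
chunk would be mapped on a different node in parallel. The function names and input text
are invented for the example.

```python
from collections import defaultdict

def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(mapped_pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine the values of each key into one result."""
    return {key: sum(values) for key, values in groups.items()}

# Each string stands in for a block of a large file held on a different node.
chunks = ["sensor data sensor", "data at rest", "sensor at edge"]
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
word_counts = reduce_phase(shuffle(mapped))
print(word_counts)
```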
The above figure depicts the Hadoop cluster; it includes the name nodes and
the data nodes.
Name Nodes: This node coordinates data adds, deletes and reads on the
HDFS system. The NameNode takes requests from clients and distributes the blocks across
the available nodes. It also instructs the data nodes when to perform replication.
Data nodes: These nodes store the data. The blocks are distributed across
the data nodes, and the same block is copied to one or more nodes as per the
replication policy. This is done to ensure data redundancy.
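The division of work between the name node (which keeps the block map) and the data nodes
(which hold the copies) can be sketched as below. The round-robin placement is an
assumption made to keep the example short; it is not the placement policy HDFS actually
uses, and the block and node names are hypothetical.

```python
def place_blocks(block_ids, data_nodes, replication=2):
    """Return a block map: block id -> list of data nodes holding a copy."""
    if replication > len(data_nodes):
        raise ValueError("replication factor exceeds number of data nodes")
    block_map = {}
    for index, block_id in enumerate(block_ids):
        # Simplified round-robin: each copy goes to the next node in the list.
        block_map[block_id] = [
            data_nodes[(index + offset) % len(data_nodes)]
            for offset in range(replication)
        ]
    return block_map

block_map = place_blocks(["blk_1", "blk_2", "blk_3"], ["dn1", "dn2", "dn3"], replication=2)
print(block_map)
```

With two copies of every block, losing any single data node still leaves one copy of each
block available, which is the redundancy the replication policy is meant to provide.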
Create a DFS
Create a Namenode
Hadoop Architecture
Hadoop Common: These are Java libraries and utilities required by other
Hadoop modules.
Hadoop YARN: This is a framework for job scheduling and cluster resource
management.
Apache projects:
MapReduce:
Hadoop divides the job into tasks. There are two types of
tasks:
Multiple Task Trackers: Acts like slaves, each of them performing the job
From the scheduling and monitoring of jobs running on the cluster. YARN has
replaced the work done by the JobTracker and TaskTracker daemons. YARN is a basic
requirement for Enterprise Hadoop, providing resource management. It delivers
consistent operations, security, and data governance across Hadoop. YARN also extends
the power of Hadoop to new technologies found within the data center, so that they can
take advantage of cost-effective, linear-scale storage and processing. It provides ISVs
and developers a consistent framework for writing data access applications that run in
Hadoop.
Figure 4.17: YARN
YARN Features:
APACHE KAFKA:
Kafka can work with Flume/Flafka, Spark Streaming, Storm, HBase, Flink,
and Spark for real-time ingestion, analysis and processing of streaming data.
Many companies that handle a lot of data use Kafka. LinkedIn uses it, and
Twitter uses it as part of Storm to provide a stream processing infrastructure. It is also
used by other companies such as Spotify, Uber, Tumblr, Goldman Sachs, PayPal, Box, Cisco,
CloudFlare, and Netflix.
Spark can use Hadoop in two ways: for storage and for processing.
Since Spark has its own cluster management, it uses Hadoop for storage
purposes only.
Figure 4.21 SPARK
COMPONENTS OF SPARK
Figure 4.22: Apache spark core
Spark Core is the underlying general execution engine for Spark. It provides
in-memory computing and the ability to reference datasets in external storage systems.
Spark SQL
Spark Streaming
GraphX
Advanced Analytics − Spark supports more than ‘Map’ and ‘Reduce’: it also
supports SQL queries, streaming data, machine learning (ML), and graph algorithms.
Apache Storm and Apache Flink are built for distributed stream
processing and are mainly deployed in IoT systems. Storm takes data from
Kafka and processes it as a stream.
Big data refers to the large volumes of data, much of it unstructured, collected
and stored in the cloud. Big data analytics is performed on the data held in the cloud
data centre as batch jobs. Edge streaming analytics, by contrast, lets you analyse and
monitor streaming data at the edge in order to make timely predictions and decisions.
In edge analytics the data is not analysed at a single edge node; it is analysed across
distributed edge nodes, and the nodes have to communicate with one another. For example,
streaming analytics performed on traffic data gives information to the driver for taking
important decisions. Big data analytics is performed on data at rest; streaming
analytics is performed on data in motion.
Time sensitivity
Filter: Filters out irrelevant data and passes on only the data needed for
processing. This is the work of the filter in the APU (analytics processing unit).
Time: As the data flows in real time, a time frame (window) must be set.
Because the data fluctuates over time, the average value is calculated over each
time interval.
Match patterns: Pattern matching aims at alerting the system if there
is an emergency. For example, a matched pattern may alert a nurse by raising an alarm
notification. Machine learning techniques can be adopted to find the matching
patterns.
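The three functions above (filter, time and match patterns) can be chained as in the
following sketch. It assumes a made-up stream of heart-rate readings, a plausible-value
range for the filter, fixed-size windows of two readings in place of real time intervals,
and a simple threshold in place of a learned pattern.

```python
def filter_readings(readings, low, high):
    """Filter: keep only readings inside the plausible range."""
    return [r for r in readings if low <= r <= high]

def window_averages(readings, window_size):
    """Time: average the readings over consecutive fixed-size windows."""
    averages = []
    for start in range(0, len(readings), window_size):
        window = readings[start:start + window_size]
        averages.append(sum(window) / len(window))
    return averages

def match_pattern(averages, threshold):
    """Match patterns: return the indexes of windows whose average is too high."""
    return [i for i, avg in enumerate(averages) if avg > threshold]

# Hypothetical heart-rate stream; 0 and 500 are sensor glitches.
heart_rate = [72, 75, 0, 74, 130, 135, 500, 140, 76, 74]
valid = filter_readings(heart_rate, low=30, high=220)
averages = window_averages(valid, window_size=2)
alerts = match_pattern(averages, threshold=120)
print(averages)
print(alerts)       # windows that should raise an alarm for the nurse
```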
Advantages
Scalability of analytics.
The amount of bandwidth needed to transmit all the data collected by thousands
of these edge devices will also grow exponentially with the increasing number of
these devices
Network traffic monitoring and profiling: This feature lets you analyze
the network by monitoring the traffic, so that problems can be identified and rectified.
Application traffic monitoring and profiling: This kind of monitoring
covers application protocols such as MQTT, CoAP, and DNP3.
Accounting: For accounting, software such as Cisco Jasper is used to
monitor the flow of data.
Data warehousing and data mining: Data stored in the warehouse is
analyzed for multiservice applications.
Flexible NetFlow Architecture: FNF is used for networks and can be
deployed in the IoT infrastructure. It has the advantages of flexibility and scalability,
it can track network packets, and it monitors network behavior.
FNF Flow monitors (NetFlow cache): A flow monitor holds flow records made up
of key fields and non-key fields (attributes of the flow). It describes the information
stored in the cache, and a flow exporter attached to it sends the information out.
FNF Exporter: defines where the NetFlow data has to be sent
(destination address). The information from the exporter is sent to the NetFlow
reporting collector.
Flow export timers: timers indicate how often flows should be
exported to the server.
NetFlow export format:
NetFlow server for collection and reporting: This is the final destination of the
NetFlow data; the server analyzes it to find problems in the network.
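The relationship between key fields, non-key fields and the exporter can be sketched with
a small in-memory flow cache. Using the five-tuple as the key fields and packet and byte
counters as the non-key fields is an assumption for the example, as are the function
names; this is not Cisco's implementation or export format.

```python
KEY_FIELDS = ("src_ip", "dst_ip", "src_port", "dst_port", "protocol")

flow_cache = {}     # key fields -> non-key fields collected for that flow

def record_packet(src_ip, dst_ip, src_port, dst_port, protocol, size_bytes):
    """Packets with identical key fields are counted in the same flow record."""
    key = (src_ip, dst_ip, src_port, dst_port, protocol)
    entry = flow_cache.setdefault(key, {"packets": 0, "bytes": 0})
    entry["packets"] += 1
    entry["bytes"] += size_bytes

def export_flows():
    """Exporter: hand the cached records to the collector and empty the cache."""
    records = [
        {**dict(zip(KEY_FIELDS, key)), **counters}
        for key, counters in flow_cache.items()
    ]
    flow_cache.clear()
    return records

record_packet("10.0.0.5", "10.0.1.9", 50000, 1883, "tcp", 120)
record_packet("10.0.0.5", "10.0.1.9", 50000, 1883, "tcp", 180)
record_packet("10.0.0.7", "10.0.1.9", 50001, 5683, "udp", 60)
exported = export_flows()
for record in exported:
    print(record)
```

In a real deployment `export_flows` would be driven by the flow export timers and would
send the records to the collection and reporting server.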
Xively Python libraries are used to embed Python code that works with the Xively
APIs. A Xively web interface is available for creating the front end. Xively works
with different programming languages and platforms. HTTP APIs and MQTT are the
protocols used in Xively. All devices are connected to the Xively Cloud for real-time
processing and archiving. IoT application developers can write the front end for
IoT applications as per their requirements. Management of apps is very flexible with
the Xively cloud and other APIs. Xively is popular with companies that deal with IoT
device manufacturing and development. Companies using Xively gain secure
device connectivity and good data management capability.
Xively is an IoT cloud platform that is “an enterprise platform for building,
managing, and deriving business value from connected products”. It is a cloud-based
API with an SDK which simplifies and reduces the time of the development process.
Android
Arduino
Arm mbed
C
Step 4: IoT devices are assigned using the available APIs.
Permissions are given to perform the Create, Update, Delete and Read operations.
4.11 DJANGO:
Tools like Django REST Framework are helpful when developing mobile apps.
This service brings the Alexa Voice Service (AVS) to any connected device. AVS
for AWS IoT reduces the cost and complexity of integrating Alexa.
AVS for AWS IoT enables Alexa built-in functionality on MCUs, such as the
ARM Cortex M class with less than 1 MB embedded RAM. To do so, AVS
offloads memory and compute tasks to a virtual Alexa Built-in device in the
cloud.
A JSON document used to store and retrieve current state information for
a device.
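A state document of this kind (in AWS IoT, the device shadow) separates the state the
application wants from the state the device last reported. The sketch below parses a
simplified document and works out which settings are still pending; the device properties
are hypothetical, and the `pending_changes` helper is written for this example rather
than taken from any AWS SDK.

```python
import json

# Simplified shadow-style document: desired vs. last reported device state.
shadow_text = """
{
  "state": {
    "desired":  {"power": "on", "brightness": 80},
    "reported": {"power": "on", "brightness": 40}
  }
}
"""
shadow = json.loads(shadow_text)

def pending_changes(document):
    """Return the desired settings that the device has not yet reported."""
    desired = document["state"].get("desired", {})
    reported = document["state"].get("reported", {})
    return {key: value for key, value in desired.items() if reported.get(key) != value}

delta = pending_changes(shadow)
print(delta)
```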
Group registry
Jobs service
Lets you define a set of remote operations that are sent to and executed on
devices connected to AWS IoT. For example, you can define a job that instructs a set
of devices to download and install an application, reboot, or perform remote
troubleshooting.
Message broker
Either the MQTT protocol directly or MQTT over WebSocket can be used to
publish and subscribe securely. The HTTP REST interface can be used to publish.
Registry
Rules engine
AWS IoT customers are building industrial IoT applications for predictive
quality and maintenance and to remotely monitor operations.
AWS IoT customers are building connected home applications for home
automation, home security and monitoring, and home networking.
Figure 4.36: Commercial
Businesses have used SNMP for a long time, but it has been used more
for reading device states than for configuring devices. NETCONF and YANG address
the disadvantages of SNMP and add various functionality for network
management, such as:
Configuration transactions
Service provider and enterprise network teams are moving towards a
service-oriented approach for managing their networks. They use the IETF’s Network
Configuration Protocol (NETCONF) and YANG, a data modelling language, to help
remove the time, cost and manual steps involved in network element configuration.
4.14 YANG
YANG can model both configuration data and state data using the 'config'
statement
The toaster YANG module begins with the header information followed by
identity declarations which define various bread types.
Each leaf node definition has a type and optionally a description and
default value.
PART-B
1. Analyze the purpose of the Six-Pillar Approach for Cisco IoT System
2. Examine the features of the IBM IoT platform, and brief on the services
provided in it.
suitable illustration.
5.Summarize about IoT strategy for smart city and design the
layered architecture for