
15CS81 M4 Introduction

engineering maths 4

Uploaded by

Dheeraj Katoch
Copyright
© All Rights Reserved

INTERNET OF THINGS TECHNOLOGY
15CS81
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
BANGALORE INSTITUTE OF TECHNOLOGY
K. R. ROAD, V. V. PURA, BENGALURU-560004.
MODULE 4

Data and Analytics for IoT


Introduction
➢ In the world of IoT, the creation of massive amounts of data from sensors is common and
one of the biggest challenges - not only from a transport perspective but also from a data
management standpoint.

➢ Example: A modern jet engine may be equipped with around 5,000 sensors.

➢ Therefore, a twin-engine commercial aircraft with these engines, operating on average
8 hours a day, will generate over 500 TB of data daily, and this is just the data from the
engines.

➢ Aircraft today have thousands of other sensors connected to the airframe and other
systems.
Figure: Commercial Jet Engine
➢ The amount of IoT data coming just from the commercial airline business is
overwhelming.
➢ This example is but one of many that highlight the big data problem that is
being exacerbated by IoT.
➢ Example: In the utility industry, even moderately sized smart meter networks can
provide over 1 billion data points each day.
➢ Analyzing this amount of data in the most efficient manner possible falls
under the umbrella of data analytics.
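
The two figures above can be sanity-checked with simple arithmetic. The sketch below assumes a rate of about 10 GB of sensor data per second per engine; that rate is an assumption used for illustration and is not stated in these notes.

```python
# Assumed rate: roughly 10 GB of sensor data per second per engine
# (illustrative assumption, not stated in these notes).
GB_PER_SECOND_PER_ENGINE = 10
ENGINES = 2
HOURS_PER_DAY = 8

seconds_per_day = HOURS_PER_DAY * 3600
total_gb = GB_PER_SECOND_PER_ENGINE * ENGINES * seconds_per_day
total_tb = total_gb / 1000  # decimal terabytes

print(f"Engine data per aircraft per day: {total_tb:.0f} TB")

# Smart meter example: 1 billion data points per day as an average rate.
POINTS_PER_DAY = 1_000_000_000
points_per_second = POINTS_PER_DAY / 86_400
print(f"Smart meter network: about {points_per_second:,.0f} data points per second")
```

Under that assumed rate the engines produce 576 TB per day, which is consistent with "over 500 TB", and 1 billion data points per day averages out to roughly 11,574 points every second.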
Structured Versus Unstructured Data

Figure: Comparison between Structured & Unstructured Data


➢ Structured data means that the data follows a model or schema that defines how the
data is represented or organized, meaning it fits well with a traditional relational
database management system (RDBMS).

➢ In many cases we can find structured data in a simple tabular form.

➢ Example: A spreadsheet where data occupies a specific cell and can be explicitly
defined and referenced.

➢ Structured data can be found in most computing systems and includes everything
from banking transactions and invoices to computer log files and router
configurations.
➢ IoT sensor data often uses structured values, such as temperature, pressure, humidity,
and so on, which are all sent in a known format.

➢ Structured data is easily formatted, stored, queried, and processed; for these reasons, it
has been the core type of data used for making business decisions.

➢ Because of the highly organized format of structured data, a wide array of data
analytics tools is readily available for processing this type of data.

➢ Unstructured data lacks a logical schema for understanding and decoding the data
through traditional programming means.
➢ Examples: Text, Speech, Images, and Video.

➢ As a general rule, any data that does not fit neatly into a predefined data model is
classified as unstructured data.

➢ According to some estimates, around 80% of a business’s data is unstructured.

➢ Because of this fact, data analytics methods that can be applied to unstructured data,
such as cognitive computing and machine learning, are deservedly garnering a lot of
attention.

➢ With machine learning applications, such as natural language processing (NLP), we
can decode speech.

➢ With image/facial recognition applications, we can extract critical information from
still images and video.

➢ Smart objects in IoT networks generate both structured and unstructured data.

➢ Structured data is more easily managed and processed due to its well-defined
organization.

➢ On the other hand, unstructured data can be harder to deal with and typically
requires very different analytics tools for processing the data.
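
As a minimal sketch of why structured data is easy to store and query, the example below loads a few sensor readings into an in-memory relational table using Python's built-in sqlite3 module. The table name, column names, and values are hypothetical.

```python
import sqlite3

# Hypothetical sensor readings: every row follows the same schema.
readings = [
    ("sensor-1", "2024-01-01T00:00:00", 21.5, 40.0),
    ("sensor-1", "2024-01-01T00:01:00", 22.1, 41.2),
    ("sensor-2", "2024-01-01T00:00:00", 35.0, 30.5),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings ("
    "sensor_id TEXT, ts TEXT, temperature_c REAL, humidity_pct REAL)"
)
conn.executemany("INSERT INTO readings VALUES (?, ?, ?, ?)", readings)

# Because the format is known in advance, aggregation is a single query.
rows = conn.execute(
    "SELECT sensor_id, AVG(temperature_c) FROM readings "
    "GROUP BY sensor_id ORDER BY sensor_id"
).fetchall()
for sensor_id, avg_temp in rows:
    print(sensor_id, round(avg_temp, 2))
conn.close()
```

No comparable one-line query exists for an image or an audio clip, which is why unstructured data typically needs different tools.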
Semi-Structured Data
➢ Semi-structured data is a hybrid of structured and unstructured data and shares
characteristics of both.

➢ While not relational, semi-structured data contains a certain schema and consistency.

➢ Example: In an email, the header fields are well defined, but the content of the body
field and the attachments is unstructured.

➢ Examples: JavaScript Object Notation (JSON) and Extensible Markup Language
(XML) are common data interchange formats used on the web and in some
IoT data exchanges.
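
The sketch below illustrates the hybrid nature of semi-structured data with a JSON message. The message layout and field names (device_id, readings, note) are hypothetical and chosen only for illustration.

```python
import json

# Hypothetical IoT message: the envelope fields are consistent,
# while "note" carries free-form (unstructured) text.
raw = '''
{
  "device_id": "truck-17",
  "timestamp": "2024-01-01T08:30:00Z",
  "readings": {"engine_temp_c": 96.4, "oil_pressure_kpa": 310},
  "note": "Driver reports a faint burning smell after the hill climb."
}
'''

message = json.loads(raw)

# Structured part: fields can be addressed by name.
engine_temp = message["readings"]["engine_temp_c"]
print(message["device_id"], engine_temp)

# Unstructured part: the text needs different tools (for example NLP);
# here we only do a naive keyword check.
mentions_smell = "smell" in message["note"].lower()
print("note mentions smell:", mentions_smell)
```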
Data in Motion Versus Data at Rest
➢ Data in IoT networks is either in transit (“data in motion”) or being held or stored
(“data at rest”).

➢ Examples:

o Data in motion - Traditional client/server exchanges, such as web browsing,
file transfers, and email.

o Data at rest - Data saved to a hard drive, storage array, or USB drive

➢ From an IoT perspective, the data from smart objects is considered data in motion as
it passes through the network en route to its final destination. This data is often
processed at the edge, using fog computing.
➢ When data is processed at the edge, it may be filtered and deleted or forwarded on for
further processing and possible storage at a fog node or in the data centre.

➢ Data does not come to rest at the edge.

➢ When data arrives at the data centre, it is possible to process it in real-time, just like at
the edge, while it is still in motion.

➢ Data at rest in IoT networks can typically be found in IoT brokers or in some sort of
storage array at the data centre.

➢ Many tools are available for analysing data at rest. The best known of these tools is
Hadoop, which helps not only with data processing but also with data storage.
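
The idea of filtering data in motion at the edge can be sketched as follows. This is a minimal illustration, assuming a simple deadband rule (forward a reading only when it has changed noticeably); the function name and threshold are hypothetical, and real fog nodes may apply very different logic.

```python
from typing import Iterable, Iterator

def edge_filter(readings: Iterable[float], deadband: float = 0.5) -> Iterator[float]:
    """Forward a reading only when it differs from the last forwarded
    value by more than `deadband`; drop the rest at the edge."""
    last_sent = None
    for value in readings:
        if last_sent is None or abs(value - last_sent) > deadband:
            last_sent = value
            yield value  # forwarded toward the fog node or data centre

stream = [20.0, 20.1, 20.2, 21.0, 21.1, 25.0, 25.2]
forwarded = list(edge_filter(stream))
print(forwarded)
```

Because the filter is a generator, each reading is handled as it arrives and nothing is stored at the edge, which matches the point that data does not come to rest there.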
IoT Data Analytics Overview

➢ The true importance of IoT data from smart objects is realized only when the
analysis of the data leads to actionable business intelligence and insights.
➢ Data analysis is typically broken down by the types of results that are
produced.
➢ There are four types of data analysis results:
▪ Descriptive
▪ Diagnostic
▪ Predictive
▪ Prescriptive
Figure: Types of Data Analysis Results
• Descriptive: Descriptive data analysis tells us what is happening, either now or in
the past.
• Example: A thermometer in a truck engine reports temperature values every
second.
• From a descriptive analysis perspective, we can pull this data at any moment
to gain insight into the current operating condition of the truck engine.
• If the temperature value is too high, then there may be a cooling problem or
the engine may be experiencing too much load.
• Diagnostic: When we are interested in the “why”, diagnostic data analysis can
provide the answer.

• Continuing with the example of the temperature sensor in the truck engine, we might
wonder why the truck engine failed.

• Diagnostic analysis might show that the temperature of the engine was too high,
and the engine overheated.

• Applying diagnostic analysis across the data generated by a wide range of smart
objects can provide a clear picture of why a problem or an event occurred.
• Predictive: Predictive analysis aims to foretell problems or issues before
they occur.

• Example: With historical values of temperatures for the truck engine, predictive
analysis could provide an estimate on the remaining life of certain components in
the engine.

• These components could then be proactively replaced before failure occurs.

• Or perhaps if temperature values of the truck engine start to rise slowly over
time, this could indicate the need for an oil change or some other sort of engine
cooling maintenance.
• Prescriptive: Prescriptive analysis goes a step beyond predictive and recommends
solutions for upcoming problems.

• A prescriptive analysis of the temperature data from a truck engine might calculate
various alternatives to cost-effectively maintain the truck.

• These calculations could range from the cost necessary for more frequent oil changes
and cooling maintenance to installing new cooling equipment on the engine or
upgrading to a lease on a model with a more powerful engine.

• Prescriptive analysis looks at a variety of factors and makes the appropriate
recommendation.
➢ Both predictive and prescriptive analyses are more resource intensive and increase
complexity, but the value they provide is much greater than the value from descriptive
and diagnostic analysis.
➢ Descriptive analysis is the least complex and at the same time offers the least
value.
➢ On the other end, prescriptive analysis provides the most value but is the most
complex to implement.
➢ Most data analysis in the IoT space relies on descriptive and diagnostic analysis, but a
shift toward predictive and prescriptive analysis is understandably occurring for most
businesses and organizations.
Figure: Application of Value & Complexity Factors to the Types of Data Analysis
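
The four result types can be illustrated on the truck engine example with a small sketch. All numbers are hypothetical, including the overheating limit and the maintenance costs, and the prescriptive step is deliberately naive (it simply picks the cheapest option); a real system would weigh many more factors.

```python
from statistics import mean

# Hypothetical engine temperatures (deg C), one reading per day.
temps = [90.0, 91.0, 92.0, 93.0, 94.0, 95.0]
LIMIT_C = 100.0  # assumed overheating threshold

# Descriptive: what is happening now or in the past?
current, average = temps[-1], mean(temps)

# Diagnostic: why did something happen? Here, which days
# came within 5 degrees of the limit.
near_limit_days = [day for day, t in enumerate(temps) if t >= LIMIT_C - 5]

# Predictive: fit a least-squares line and estimate when the limit is reached.
days = list(range(len(temps)))
d_mean, t_mean = mean(days), mean(temps)
slope = sum((d - d_mean) * (t - t_mean) for d, t in zip(days, temps)) / sum(
    (d - d_mean) ** 2 for d in days
)
days_until_limit = (LIMIT_C - current) / slope if slope > 0 else None

# Prescriptive: recommend an action; here simply the cheapest option.
options = {"oil change": 120, "new cooling equipment": 900, "lease upgrade": 4000}
recommendation = (
    min(options, key=options.get) if days_until_limit is not None else "no action"
)

print(current, average, near_limit_days, slope, days_until_limit, recommendation)
```

Each step builds on the previous one and needs more modelling effort, which mirrors the rising complexity and value described above.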
IoT Data Analytics Challenges
➢ As IoT has grown and evolved, it has become clear that traditional data analytics
solutions were not always adequate.
➢ Example: Traditional data analytics typically employs a standard RDBMS and
corresponding tools, but the world of IoT is much more demanding.
➢ While relational databases are still used for certain data types and applications,
they often struggle with the nature of IoT data.
➢ IoT data places two specific challenges on a relational database:
▪ Scaling problems
▪ Volatility of data
• Scaling problems
o Due to the large number of smart objects in most IoT
networks that continually send data, relational databases
can grow incredibly large very quickly.
o This can result in performance issues that can be costly to
resolve, often requiring more hardware and architecture
changes.
• Volatility of data
o With relational databases, it is critical that the schema be designed correctly
from the beginning.
o Changing it later can slow or stop the database from operating.
o Due to this lack of flexibility, revisions to the schema must be kept to a
minimum.
o IoT data, however, is volatile in the sense that the data model is likely to
change and evolve over time.
o A dynamic schema is often required so that data model changes can be made
daily or even hourly.
➢ To deal with challenges like scaling and data volatility, a different type of database,
known as NoSQL, is being used.
➢ Structured Query Language (SQL) is the computer language used to communicate
with an RDBMS.
➢ As the name implies, a NoSQL database is a database that does not rely on SQL as its
primary interface (the name is also commonly read as "not only SQL").
➢ It is not set up in the traditional tabular form of a relational database.
➢ NoSQL databases do not enforce a strict schema, and they support a complex,
evolving data model.
➢ These databases are also inherently much more scalable.
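
The schema volatility problem can be sketched as follows. The relational side uses Python's built-in sqlite3 module; the "NoSQL" side is only a plain list of dictionaries standing in for a document collection, not a real NoSQL product. Table and field names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor_id TEXT, temperature_c REAL)")

# A new firmware version starts reporting vibration as well.
new_reading = {"sensor_id": "pump-3", "temperature_c": 61.2, "vibration_hz": 48.0}

# Relational table: the fixed schema rejects the unknown column
# until the schema itself is changed (ALTER TABLE).
try:
    conn.execute(
        "INSERT INTO readings (sensor_id, temperature_c, vibration_hz) "
        "VALUES (:sensor_id, :temperature_c, :vibration_hz)",
        new_reading,
    )
    relational_accepted = True
except sqlite3.OperationalError as err:
    relational_accepted = False
    print("relational insert failed:", err)
conn.close()

# Document-style store (a plain list of dicts stands in for a NoSQL
# collection): records with different fields can sit side by side.
collection = [{"sensor_id": "pump-3", "temperature_c": 60.9}]
collection.append(new_reading)
fields_seen = sorted({key for doc in collection for key in doc})
print(fields_seen)
```

The trade-off is that the application, rather than the database, becomes responsible for coping with records that have different shapes.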
➢ IoT also brings challenges with the live streaming nature of its data and with
managing data at the network level.
➢ Streaming data, which is generated as smart objects transmit data, is
challenging because it is usually of a very high volume, and it is valuable
only if it is possible to analyse and respond to it in real-time.
➢ Real-time analysis of streaming data allows us to detect patterns or
anomalies that could indicate a problem or a situation that needs some kind
of immediate response.
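
Real-time anomaly detection on a stream can be sketched with a sliding window. This is a minimal illustration, assuming a simple rule (flag a reading that lies more than three standard deviations from the mean of the previous five readings); the function name, window size, and threshold are hypothetical choices, not a prescribed method.

```python
from collections import deque
from statistics import mean, pstdev

def detect_anomalies(stream, window=5, threshold=3.0):
    """Yield (index, value) for readings that deviate from the mean of the
    previous `window` readings by more than `threshold` standard deviations."""
    recent = deque(maxlen=window)
    for i, value in enumerate(stream):
        if len(recent) == window:
            mu, sigma = mean(recent), pstdev(recent)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield i, value
        recent.append(value)

readings = [70.0, 71.0, 70.5, 69.5, 70.0, 70.5, 95.0, 70.0, 71.0]
anomalies = list(detect_anomalies(readings))
print(anomalies)
```

Only the last few readings are kept in memory, so the check can run while the data is still in motion. One limitation of this simple rule is that an anomalous value enters the window afterwards and temporarily widens the tolerated range.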
➢ Another challenge that IoT brings to analytics is in the area of network data,
which is referred to as network analytics.
➢ With the large numbers of smart objects in IoT networks that are
communicating and streaming data, it can be challenging to ensure that these
data flows are effectively managed, monitored, and secure.
➢ Network analytics tools such as Flexible NetFlow and IPFIX provide the
capability to detect irregular patterns or other problems in the flow of IoT data
through a network.
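
The kind of check a network analytics tool performs can be sketched on simplified flow records. The records below are hypothetical tuples, not actual Flexible NetFlow or IPFIX exports, and the "10 times the median" rule is an assumed threshold for illustration.

```python
from collections import defaultdict
from statistics import median

# Hypothetical, simplified flow records: (source address, destination port, bytes).
# Real Flexible NetFlow or IPFIX records carry many more fields.
flows = [
    ("10.0.0.11", 8883, 1_200),
    ("10.0.0.12", 8883, 1_150),
    ("10.0.0.13", 8883, 1_300),
    ("10.0.0.14", 8883, 1_250),
    ("10.0.0.13", 23, 250_000),  # unexpected Telnet traffic from a sensor
]

# Total traffic volume per source address.
bytes_per_source = defaultdict(int)
for src, _port, nbytes in flows:
    bytes_per_source[src] += nbytes

typical = median(bytes_per_source.values())
FACTOR = 10  # assumed threshold: 10x the median volume counts as irregular
suspects = sorted(
    src for src, total in bytes_per_source.items() if total > FACTOR * typical
)
print(suspects)
```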
