
IERG4230 Introduction to IoT

Big Data Analytics for IoT

IERG4230: Big Data Analytics for IoT P.1


Big Data

• Lots of data is being collected and warehoused
– Web data, e-commerce
– Purchases at department/grocery stores
– Bank/credit card transactions
– Social networks

Big Data
• “Big Data” is data whose scale, diversity, and complexity require new architecture, techniques, algorithms, and analytics to manage it and extract value and hidden knowledge from it…
• Big data is high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery, and process optimization.
• The challenges include capture, curation, storage, search, sharing, transfer, analysis, and visualization.

Big Data: 3Vs

Big Data: Volume
• Data volume
– 44x increase from 2009 to 2020
– From 0.8 zettabytes (ZB) to 35 ZB
• Data volume is increasing exponentially

[Figure: exponential increase in collected/generated data]

Big Data: Volume

• 12+ TBs of tweet data every day
• 25+ TBs of log data every day
• ? TBs of data every day
• 30 billion RFID tags today (1.3B in 2005)
• 4.6 billion camera phones worldwide
• 100s of millions of GPS-enabled devices sold annually
• 2+ billion people on the Web by end of 2011
• 76 million smart meters in 2009… 200M by 2014
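
The growth figures on the first Volume slide (0.8 ZB in 2009 to 35 ZB in 2020) can be turned into an implied yearly rate. The sketch below assumes smooth compound growth over the 11 years, which is a simplification; the function names are hypothetical helpers, not part of the lecture.

```python
import math

def compound_annual_growth(start: float, end: float, years: int) -> float:
    """Constant yearly growth rate that turns `start` into `end` over `years`."""
    return (end / start) ** (1 / years) - 1

def doubling_time(rate: float) -> float:
    """Years needed to double at a constant yearly growth rate."""
    return math.log(2) / math.log(1 + rate)

start_zb, end_zb = 0.8, 35.0   # zettabytes, as quoted on the slide
years = 2020 - 2009            # 11 years

factor = end_zb / start_zb     # the "44x" on the slide is this ratio, rounded
rate = compound_annual_growth(start_zb, end_zb, years)

print(f"growth factor: {factor:.2f}x")
print(f"implied yearly growth: {rate:.1%}")
print(f"implied doubling time: {doubling_time(rate):.1f} years")
```

Under that assumption the data volume grows by roughly 40% per year, which means it doubles about every two years.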


Big Data: Variety
• Relational data (tables/transactions/legacy data)
• Text data (Web)
• Semi-structured data (XML)
• Graph data
– Social networks, Semantic Web (RDF), …
• Streaming data
– You can only scan the data once
• A single application can be generating/collecting many types of data
• Big public data (online, weather, finance, etc.)
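
The "scan the data once" constraint on streaming data means statistics have to be updated incrementally, without storing past samples. One common way to do this is Welford's method for a running mean and variance. This is a minimal sketch; the class name and the sample readings are made up for illustration.

```python
from typing import Iterator

class RunningStats:
    """Single-pass mean and variance (Welford's method); keeps no past samples."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        """Sample variance; 0.0 until at least two samples have been seen."""
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0

def sensor_stream() -> Iterator[float]:
    """Hypothetical one-shot stream: a generator cannot be rewound."""
    yield from [21.0, 21.5, 22.0, 21.5, 23.0]

stats = RunningStats()
for reading in sensor_stream():
    stats.update(reading)  # each reading is seen exactly once

print(stats.count, stats.mean, stats.variance)
```

Memory use stays constant no matter how long the stream runs, which is the point of a one-pass algorithm.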




Big Data: Types of Data
• Structured data
– Typically stored in databases or spreadsheets and managed in accordance with a standardised storage format and ontology, e.g. names, place names
– e.g. SATAC applications, load, enrolments, FLO usage data
• Unstructured data
– Text, audio, imagery, video
– e.g. student email, chat rooms, questionnaire responses, lecture videos (audio & video)
• Different data types lend themselves to different analytical techniques. Unstructured data often requires pre-processing to enable structured data analysis.
• Unstructured data analysis
– Text: document clustering, topic detection, entity extraction (people, places, locations, dates, times, etc.), sentiment analysis (+, −)
– Audio: speaker identification, language identification, speech to text, keyword spotting
– Video: face recognition, object recognition, target tracking
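
As an illustration of the pre-processing step mentioned above, free text can be turned into a table of term counts that ordinary structured analysis can work on. This is a minimal sketch using only the standard library; the stopword list, the function names, and the two sample responses are invented for the example.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "is", "and", "to", "of", "was", "i"}  # tiny illustrative list

def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOPWORDS]

def to_term_counts(docs: dict[str, str]) -> dict[str, Counter]:
    """One Counter of term frequencies per document id."""
    return {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}

responses = {
    "r1": "The lecture video was clear and the pace was good.",
    "r2": "Audio quality of the lecture was poor.",
}

table = to_term_counts(responses)
vocabulary = sorted(set().union(*table.values()))

# Each document becomes one row with a fixed set of columns (the vocabulary).
for doc_id, counts in table.items():
    print(doc_id, [counts[term] for term in vocabulary])
```

Once every document is a fixed-length row of counts, techniques such as clustering or classification can be applied to it like any other structured table.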


Big Data: Data Types



Big Data: Velocity

• Data is generated fast and needs to be processed fast
• Online data analytics
• Late decisions → missed opportunities
• Examples
– E-promotions: based on your current location, your purchase history, and what you like → send promotions right now for the store next to you
– Healthcare monitoring: sensors monitor your activities and body → any abnormal measurements require immediate reaction
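
The healthcare monitoring example implies checking each measurement as it arrives rather than in a later batch. One simple way to sketch this is a sliding-window check that flags a reading when it lies far from the recent average. The window size, the threshold, and the heart-rate values below are assumptions chosen for illustration, not clinical guidance.

```python
from collections import deque
from statistics import mean, stdev

def abnormal_readings(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent window's mean."""
    recent = deque(maxlen=window)
    for i, value in enumerate(readings):
        if len(recent) == window:
            mu, sigma = mean(recent), stdev(recent)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield i, value
                continue  # keep the outlier out of the baseline window
        recent.append(value)

heart_rate = [72, 74, 71, 73, 72, 75, 73, 140, 74, 72]  # made-up samples
alerts = list(abnormal_readings(heart_rate))
print(alerts)
```

Because the function is a generator, an alert is produced as soon as the abnormal value is seen, with no need to wait for the rest of the stream.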




Big Data: Source of Data

• Mobile devices (tracking all objects all the time)
• Social media and networks (all of us are generating data)
• Scientific instruments (collecting all sorts of data)
• Sensor technology and networks (measuring all kinds of data)

• Progress and innovation are no longer hindered by the ability to collect data
• But by the ability to manage, analyze, summarize, visualize, and discover knowledge from the collected data in a timely manner and in a scalable fashion

Big Data: Data Generation

• The model of generating/consuming data has changed
– Old model: few companies are generating data; all others are consuming data
– New model: all of us are generating data, and all of us are consuming data


Big Data: Sources



Big Data: 4Vs?



Big Data: More Vs?



Big Data: Drivers

- Optimizations and predictive analytics
- Complex statistical analysis
- All types of data, and many sources
- Very large datasets
- More of a real-time

- Ad-hoc querying and reporting
- Data mining techniques
- Structured data, typical sources
- Small to mid-size datasets


Harnessing Big Data

• OLTP: Online Transaction Processing (DBMSs)
• OLAP: Online Analytical Processing (Data Warehousing)
• RTAP: Real-Time Analytics Processing (Big Data Architecture & Technology)
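
The difference between the three processing styles can be sketched on a toy scale. The example below uses Python's built-in `sqlite3` module with an in-memory database: small committed writes stand in for OLTP, a grouped aggregate query stands in for OLAP, and a running total updated per event stands in for RTAP. The table, the function names, and the data are invented for illustration; real systems of each kind are far more elaborate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (id INTEGER PRIMARY KEY, store TEXT, amount REAL)")

def record_purchase(store: str, amount: float) -> None:
    """OLTP-style work: one small write, committed atomically."""
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "INSERT INTO purchases (store, amount) VALUES (?, ?)", (store, amount)
        )

def revenue_by_store() -> list[tuple[str, float]]:
    """OLAP-style work: scan and aggregate many rows at query time."""
    cur = conn.execute(
        "SELECT store, SUM(amount) FROM purchases GROUP BY store ORDER BY store"
    )
    return cur.fetchall()

running_total = 0.0
for store, amount in [("north", 10.0), ("south", 4.5), ("north", 2.5)]:
    record_purchase(store, amount)
    running_total += amount  # RTAP-style: the answer is updated as each event arrives

print(revenue_by_store())
print(running_total)
```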

Challenges in Handling Big Data

• The bottleneck is in technology
– New architecture, algorithms, techniques are needed
• Also in technical skills
– Experts in using the new technology and dealing with big data

Big Data: Use Cases



Big Data: Market



Big Data Technology



Big Data: Enabling Technology



Cloud Computing
• IT resources provided as a service
– Compute, storage, databases, queues
• Clouds leverage economies of scale of commodity hardware
– Cheap storage, high-bandwidth networks & multicore processors
– Geographically distributed data centers
• “Out-sourced” resource management, reduced time to deployment
• Scaling: on-demand provisioning, co-locate data and compute
• Reliability: massive, redundant, shared resources
• Sustainability: hardware not owned


IoT and Cloud

[Figure: layered IoT and cloud architecture, with PaaS, IaaS and SaaS offered by the cloud]
• Cloud server domain (public cloud): public resource management, QoS management, service management invocation, admission control
• Network domain (network control system): location management, service exposure, billing, identity management, service support functions
• Home domain (local cloud, mobile cloud): local resource management, public cloud interaction
• Object domain (indoor objects and outdoor wireless objects, connected via NFC/Bluetooth/ZigBee/WiFi): resource exposure, resource request


Big Data: Computation Architecture

Big Data: Distributed Algorithms on Hadoop

Big Data – Storage Architecture

Big Data – Special-Purpose Database

Big Data – Platform Stack Examples

Big Data Components


Value of Big Data Analytics

• Big data is more real-time in nature than traditional data warehouse (DW) applications
• Traditional DW architectures (e.g. Exadata, Teradata) are not well-suited for big data apps
• Shared-nothing, massively parallel processing, scale-out architectures are well-suited for big data apps
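
The shared-nothing, scale-out idea can be sketched in a few lines: records are hash-partitioned so that each node owns a disjoint shard, every node aggregates only its own shard, and the partial results are merged at the end. In this sketch threads merely stand in for separate machines, and the function names and meter readings are hypothetical.

```python
import zlib
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def partition(records, n_nodes):
    """Route each (key, value) record to exactly one node by hashing its key."""
    shards = [[] for _ in range(n_nodes)]
    for key, value in records:
        shards[zlib.crc32(key.encode()) % n_nodes].append((key, value))
    return shards

def local_sum(shard):
    """Work done by one node using only its own shard (no shared state)."""
    totals = Counter()
    for key, value in shard:
        totals[key] += value
    return totals

def scale_out_sum(records, n_nodes=3):
    """Partition, aggregate each shard independently, then merge the partials."""
    shards = partition(records, n_nodes)
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(local_sum, shards))
    merged = Counter()
    for part in partials:
        merged.update(part)  # Counter.update adds counts key by key
    return dict(merged)

readings = [("meter-1", 3), ("meter-2", 5), ("meter-1", 4), ("meter-3", 1)]
print(scale_out_sum(readings))
```

Because all records with the same key land on the same shard, adding nodes changes how the work is split but not the final totals.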

Big Data: Analytics

• Aggregation and statistics
– Data warehouse and OLAP
• Indexing, searching, and querying
– Keyword-based search
– Pattern matching (XML/RDF)
• Knowledge discovery
– Data mining
– Statistical modeling
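
Keyword-based search is commonly built on an inverted index, which maps each word to the documents that contain it so that a query does not have to scan every document. The sketch below is a minimal in-memory version; the function names and the three sample documents are made up for illustration.

```python
import re
from collections import defaultdict

def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercase word to the set of document ids that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(doc_id)
    return index

def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return ids of documents that contain every query word (AND semantics)."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

docs = {
    "d1": "smart meters report energy usage",
    "d2": "GPS devices report location",
    "d3": "smart phones carry GPS",
}

index = build_index(docs)
print(sorted(search(index, "smart")))
print(sorted(search(index, "gps report")))
```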


Big Data: Analytics
• Learning analytics draws upon techniques from a number of established fields:
– Statistics
– Artificial Intelligence
– Machine Learning
– Data mining
– Social Network Analysis
– Text Mining and Web Analytics
– Operational Research
– Information Visualization
• Application domains such as business intelligence, national security intelligence and learning analytics all have an interest in analysing large volumes of data from disparate data sources and are providing the business cases for the rapid growth in ‘big data’ & data analytics.
• Learning analytics encompasses support to both the business and teaching functions of the learning institution.

Big Data: Analytic Tools

• Data mining
• Statistical analysis
• Predictive analysis
• Correlation
• Regression
• Forecasting
• Process modeling
• Optimization
• Simulation
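
Three of the tools listed above (correlation, regression, forecasting) fit together naturally: measure how strongly two variables move together, fit a line, then extrapolate it. This is a minimal sketch with the standard library only; the data is made up and deliberately exactly linear so the result is easy to check by hand.

```python
from math import sqrt
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx

days = [1, 2, 3, 4, 5]
readings = [10.0, 12.0, 14.0, 16.0, 18.0]  # made-up values on the line y = 2x + 8

r = pearson(days, readings)
slope, intercept = fit_line(days, readings)
forecast_day_6 = slope * 6 + intercept  # extrapolate one step past the data

print(r, slope, intercept, forecast_day_6)
```

With real, noisy data the correlation would be below 1 and the forecast would carry an error that this sketch does not estimate.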


Business Intelligence: BI

Big Data: Analytics

Big Data Analytics


Big Data: Structured Data Analysis

Descriptive statistics – sums, means, standard deviations, basic plotting (graphs, charts, histograms)

Data visualisation – tools that enable humans to see meaningful patterns in data

Machine learning – tools that enable computers to find patterns in data to perform classification, clustering or prediction, e.g. decision trees, neural networks, support vector machines, linear regression, self-organising maps, k-means

Predictive analytics – algorithmic approaches (generally machine learning) for predicting key target variables of interest
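
Two of the techniques named above can be shown on a handful of numbers: descriptive statistics summarise the values, while k-means (one of the listed machine learning methods) finds groups in them. The sketch below is a one-dimensional version of Lloyd's algorithm with caller-supplied starting centroids; the temperature values and the starting points are assumptions for illustration.

```python
from statistics import mean, stdev

def kmeans_1d(values, centroids, iterations=10):
    """Lloyd's algorithm in one dimension with caller-supplied starting centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to its cluster mean (keep it if empty).
        updated = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break  # assignments can no longer change
        centroids = updated
    return centroids, clusters

temps = [20.0, 21.0, 19.0, 35.0, 36.0, 34.0]  # made-up sensor temperatures

# Descriptive statistics: a first summary of the data.
print(sum(temps), mean(temps), round(stdev(temps), 2))

# Clustering: start from two poor guesses and let the algorithm converge.
centroids, clusters = kmeans_1d(temps, centroids=[19.0, 21.0])
print(centroids, clusters)
```

The overall mean sits between the two groups and describes neither of them well, which is the kind of structure that clustering exposes and a single summary number hides.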


Big Data: Visualization
[Figure panels: Structured Data; Unstructured Data]

Big Data: Visualization
[Figure: Combining Structured & Unstructured Data Sources]


Dangers in Analytics

• Privacy
• Security
• Basing decisions on incomplete data
• Basing decisions on inaccurate data
• Using only data that supports our gut decisions
• Drawing the wrong conclusion from the data
– Stock prices example


Big Data, IoT, Analytics

• IoT will enable Big Data
• Big Data needs Analytics
• Analytics will improve processes for more IoT devices
