Iot M4

The document discusses key concepts related to IoT data infrastructure including challenges with relational databases for IoT data, machine learning concepts, big data characteristics, NoSQL and Hadoop ecosystems, streaming data frameworks like Apache Kafka, Apache Spark, Apache Storm and Apache Flink, and the Lambda architecture for combining batch and stream processing.

Uploaded by Siddesh Av


IOT 18CS81 MOD 4 SHORT NOTES

VTU short notes by YouTuber Afnan Marquee. The notes are in simple format for quick
learning. For video explanation, check out my YouTube Channel!
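While relational databases are still used for certain data types and applications, they often
struggle with the nature of IoT data. IoT data places two specific challenges on a relational
database:
Scaling problems: With a large number of smart objects continually sending data, a relational
database can grow very large very quickly, which can cause performance issues.
Volatility of data: A relational schema must be designed correctly from the beginning and is
hard to change later, whereas the IoT data model is likely to change and evolve over time.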

Machine learning (ML) is concerned with any process in which a computer receives a set of data
and processes it to help perform a task more efficiently.
Supervised Learning
In supervised learning, the machine is trained with input for which there is a known correct
answer. For example, suppose that you are training a system to recognize when there is a
human in a mine tunnel.
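
The mine tunnel example can be sketched in a few lines of Python. This is only an illustration, not the method described in the notes: the two features (a heat reading and a motion reading), the sample values, and the nearest-centroid technique are all assumptions chosen to keep the example small.

```python
from statistics import mean

# Hypothetical labeled training data: (heat_reading, motion_reading) -> label.
# Each input comes with a known correct answer, which is what makes this supervised.
TRAINING = [
    ((36.5, 0.9), "human"),
    ((37.0, 0.8), "human"),
    ((36.8, 0.7), "human"),
    ((15.0, 0.1), "no_human"),
    ((14.0, 0.0), "no_human"),
    ((16.0, 0.2), "no_human"),
]


def train(samples):
    """Compute one centroid (mean feature vector) per label."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


centroids = train(TRAINING)
print(predict(centroids, (36.0, 0.6)))    # a warm, moving object
print(predict(centroids, (13.5, 0.05)))   # a cold, still tunnel
```

A real system would use far more training samples and a stronger model, but the flow is the same: learn from labeled inputs, then classify new ones.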

Unsupervised learning uses machine learning algorithms to analyze and cluster unlabeled
datasets. These algorithms discover hidden patterns or data groupings without the need for
human intervention.
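
As a rough illustration of clustering unlabeled data, the sketch below runs a basic k-means loop over one-dimensional temperature readings. The readings, the choice of k-means, and the function name `kmeans_1d` are assumptions made for this example; the notes do not name a specific algorithm.

```python
def kmeans_1d(values, k, iterations=10):
    """Cluster 1-D values into k groups with a basic k-means loop."""
    # Deterministic start: spread the initial centers across the sorted data.
    ordered = sorted(values)
    step = max(k - 1, 1)
    centers = [ordered[i * (len(ordered) - 1) // step] for i in range(k)]

    clusters = []
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical unlabeled temperature readings; no "normal" or "hot" tags are given.
readings = [20.1, 19.8, 20.5, 35.2, 34.9, 20.0, 35.5]
centers, clusters = kmeans_1d(readings, k=2)
print(centers)
print(clusters)
```

No labels are supplied, yet the readings separate into a low group and a high group on their own.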
The three V’s of Big Data are: Volume, Velocity, and Variety.

Massively parallel processing (MPP) databases were built on the concept of relational data
warehouses but are designed to be much faster and more efficient, with reduced query
times.

NoSQL (“not only SQL”) is a class of databases that support semi-structured and unstructured
data.

It covers the following database types: document store, object store, key-value store,
wide-column store, and graph store.
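
To make the difference between two of these types concrete, here is a small sketch using plain Python dictionaries. It does not use any real NoSQL product; the key names, device IDs, and the `find` helper are hypothetical and exist only for illustration.

```python
import json

# Key-value store: an opaque value looked up by a single key.
key_value_store = {}
key_value_store["sensor:42:last_temp"] = "21.5"

# Document store: each record is a self-describing document, and
# documents in the same collection need not share the same fields.
documents = [
    {"device_id": "sensor-42", "temp_c": 21.5},
    {"device_id": "sensor-43", "temp_c": 19.0, "humidity_pct": 40},
    {"device_id": "cam-7", "frame": {"width": 640, "height": 480}},
]


def find(collection, **criteria):
    """Return the documents whose fields match all the given criteria."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


print(key_value_store["sensor:42:last_temp"])
print(json.dumps(find(documents, device_id="sensor-43")))
```

The three documents have different shapes, which is the kind of semi-structured data that a fixed relational schema handles poorly.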

Hadoop is the most recent entrant into the data management market, but it is arguably the most
popular choice as a data repository and processing engine.

A Hadoop cluster has NameNodes, which keep track of where data is stored, and DataNodes, which
store the data itself.

YARN was developed to take over the resource negotiation and job/task tracking, allowing
MapReduce to be responsible only for data processing.
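
The data-processing model that MapReduce is responsible for can be sketched in plain Python. This is a single-process illustration of the map, shuffle, and reduce steps, not Hadoop code; the record format and the function names are assumptions.

```python
from collections import defaultdict

# Hypothetical raw records in "sensor_id,temperature" form.
readings = ["s1,20", "s2,31", "s1,25", "s2,29", "s3,18"]


def map_phase(record):
    """Map: turn one raw record into a (key, value) pair."""
    sensor, value = record.split(",")
    yield sensor, int(value)


def shuffle(pairs):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, max(values)


mapped = [pair for record in readings for pair in map_phase(record)]
result = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(result)  # maximum temperature per sensor
```

In a real cluster the map and reduce functions run in parallel across many nodes, while scheduling and resource allocation are left to YARN.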

Apache Kafka is a distributed publisher-subscriber messaging system that is built to be scalable
and fast. It is composed of topics, or message brokers, where producers write data and
consumers read data from these topics.
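
The publish-subscribe idea can be illustrated with a tiny in-memory model. This sketch does not use the Kafka client API; the `Broker` class, its methods, and the topic and consumer names are hypothetical stand-ins that only mimic the pattern of producers appending to a topic and consumers reading from their own position.

```python
from collections import defaultdict


class Broker:
    """A toy in-memory broker: each topic is an append-only list of messages."""

    def __init__(self):
        self._logs = defaultdict(list)     # topic -> ordered messages
        self._offsets = defaultdict(int)   # (consumer, topic) -> next index to read

    def publish(self, topic, message):
        """Producer side: append a message to the end of a topic."""
        self._logs[topic].append(message)

    def poll(self, consumer, topic):
        """Consumer side: return messages this consumer has not read yet."""
        start = self._offsets[(consumer, topic)]
        messages = self._logs[topic][start:]
        self._offsets[(consumer, topic)] = len(self._logs[topic])
        return messages


broker = Broker()
broker.publish("temperature", {"sensor": "s1", "value": 21.5})
broker.publish("temperature", {"sensor": "s2", "value": 19.0})

first_poll = broker.poll("dashboard", "temperature")
second_poll = broker.poll("dashboard", "temperature")
archive_poll = broker.poll("archiver", "temperature")
print(len(first_poll), len(second_poll), len(archive_poll))
```

Because each consumer tracks its own position, the dashboard and the archiver both receive every message, and neither slows the producer down.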

Apache Spark
Apache Spark is an in-memory distributed data analytics platform designed to accelerate
processes in the Hadoop ecosystem.

Apache Storm and Apache Flink are other Hadoop ecosystem projects designed for distributed
stream processing and are commonly deployed for IoT use cases. Storm can pull data from Kafka
and process it in near real time, and so can Apache Flink. This space is rapidly evolving, and
projects will continue to gain and lose popularity as they evolve.
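
A common operation in stream processing is aggregating events over fixed time windows as they arrive. The sketch below shows that idea in plain Python, without the Storm, Flink, or Spark APIs; the function name, the event format, and the 10-second window are assumptions made for illustration.

```python
def tumbling_window_average(events, window_seconds):
    """Yield (window_start, average) for each fixed-size time window.

    Assumes events are (timestamp, value) pairs arriving in timestamp order.
    """
    current_start = None
    values = []
    for timestamp, value in events:
        window_start = timestamp - (timestamp % window_seconds)
        # A new window has begun: emit the finished one and start over.
        if current_start is not None and window_start != current_start:
            yield current_start, sum(values) / len(values)
            values = []
        current_start = window_start
        values.append(value)
    # Emit the last, possibly partial, window.
    if values:
        yield current_start, sum(values) / len(values)


# Hypothetical sensor events: (seconds since start, temperature).
events = [(0, 20.0), (3, 22.0), (10, 30.0), (14, 34.0), (21, 25.0)]
averages = list(tumbling_window_average(events, 10))
print(averages)
```

Results are produced as each window closes rather than after the whole data set has been collected, which is the main difference from batch processing.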

Lambda Architecture
Ultimately, the key elements of a data infrastructure to support many IoT use cases involve the
collection, processing, and storage of data using multiple technologies. Querying both data in
motion (streaming) and data at rest (batch processing) requires a combination of the Hadoop
ecosystem projects discussed.

One architecture that is currently being leveraged for this functionality is the Lambda
Architecture. Lambda is a data management system that consists of two layers for ingesting
data (Batch and Stream) and one layer for providing the combined data (Serving). These layers
allow for the packages discussed previously, like Spark and MapReduce, to operate on the data
independently, focusing on the key attributes for which they are designed and optimized.
Stream layer: This layer is responsible for near-real-time processing of events.
Batch layer: The Batch layer consists of a batch-processing engine and data store.
Serving layer: The Serving layer is a data store and mediator that decides which of the ingest
layers to query based on the expected result or view into the data.
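
The three layers can be sketched as three small functions. This is a simplified, single-process illustration under stated assumptions: the event format, the per-device counts, and the `view` parameter of `serve` are hypothetical, and a real deployment would place separate engines and data stores behind each layer.

```python
from collections import Counter


def batch_view(master_dataset):
    """Batch layer: recompute event counts from the full historical data set."""
    return Counter(device for device, _ in master_dataset)


def stream_view(recent_events):
    """Stream layer: count events that the batch layer has not absorbed yet."""
    view = Counter()
    for device, _ in recent_events:
        view[device] += 1
    return view


def serve(batch, stream, device, view="combined"):
    """Serving layer: pick the layer to query, or combine both."""
    if view == "batch":
        return batch[device]
    if view == "stream":
        return stream[device]
    return batch[device] + stream[device]


# Hypothetical (device, reading) events.
historical = [("s1", 20), ("s2", 31), ("s1", 25)]
recent = [("s1", 26), ("s3", 18)]

batch = batch_view(historical)
stream = stream_view(recent)
print(serve(batch, stream, "s1"))                  # combined count
print(serve(batch, stream, "s1", view="stream"))   # near-real-time count only
```

The batch and stream functions never call each other, which mirrors how the ingest layers operate on the data independently and leave the merging to the Serving layer.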
