009 Ex 7

EX NO:

DATE:

AIM:

To implement real-time IoT data analytics by collecting IoT device data through Azure IoT Hub, streaming it using Azure Stream Analytics, transforming the data with PySpark scripts in Azure Synapse Spark Pools, and storing the cleaned data in Azure Data Lake Storage Gen2 for downstream analytics.

PROCEDURE:

● First, create an IoT Hub named iothubnum1 and register a device called device1.

● Then, simulate IoT data using a Python script in VS Code that sends random temperature and humidity values to the IoT Hub.

● Create an Azure Data Lake Storage Gen2 account named iotdatalakenum, enabling the hierarchical namespace option during creation.

● Inside the storage account, create two containers named raw-data and processed-data to store incoming raw data and processed data separately.

● Next, set up a Stream Analytics job named iotStreamJob with the IoT Hub (iotInput) as input and the raw-data container of the storage account as output, using the path pattern raw/{date}/{time}.

● Create an Azure Synapse Analytics workspace using basic settings, allow public network access, and complete the workspace creation.

● After that, create an Apache Spark pool inside Synapse named iotSparkPool, choose Small as the node size, and set auto-pause to 5 minutes to save credits.

● Then, in Synapse Studio, create a new notebook, attach it to iotSparkPool, and write a PySpark script to read data from the raw-data container.

● Apply transformations to clean the data and convert the data types (temperature to float and humidity to int) using PySpark commands.

● Write the transformed data to the processed-data container in Parquet format for efficient storage and future analytics.

● Finally, the entire process of ingesting IoT data, transforming it with PySpark, and storing the cleaned data in ADLS Gen2 is completed successfully.
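The telemetry simulation described in the steps above can be sketched as follows. The payload fields match the procedure (random temperature and humidity from device1); the value ranges, message count, and connection-string placeholder are illustrative assumptions, and actually sending messages requires the azure-iot-device package and the device connection string from the IoT Hub portal.

```python
import json
import random
import time

def make_telemetry(device_id="device1"):
    # Build one telemetry message with random temperature/humidity values
    # (the 20-35 C and 40-80 % ranges are illustrative assumptions).
    return {
        "deviceId": device_id,
        "temperature": round(random.uniform(20.0, 35.0), 2),
        "humidity": random.randint(40, 80),
    }

def send_loop(conn_str, count=10, interval=1.0):
    # Requires: pip install azure-iot-device. conn_str is the device
    # connection string copied from the IoT Hub portal for device1.
    from azure.iot.device import IoTHubDeviceClient, Message
    client = IoTHubDeviceClient.create_from_connection_string(conn_str)
    try:
        for _ in range(count):
            client.send_message(Message(json.dumps(make_telemetry())))
            time.sleep(interval)
    finally:
        client.shutdown()

if __name__ == "__main__":
    print(make_telemetry())  # dry run without an IoT Hub connection
```

Running send_loop with a real device connection string would push messages that the Stream Analytics job then routes into the raw-data container under raw/{date}/{time}.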

PYSPARK CODE:

PySpark ETL Job for IoT Data Processing in Azure Synapse:

from pyspark.sql.functions import col

# Read the raw CSV data from the ADLS Gen2 raw-data container
df = spark.read.option("header", "true").csv(
    "abfss://raw-data@iotdatalakenum.dfs.core.windows.net/raw/"
)

# Transformations: cast temperature to float and humidity to int
df_clean = df.withColumn("temperature", col("temperature").cast("float")) \
             .withColumn("humidity", col("humidity").cast("int"))

# Write the cleaned data to the processed-data container in Parquet format
df_clean.write.mode("overwrite").parquet(
    "abfss://processed-data@iotdatalakenum.dfs.core.windows.net/cleaned/"
)
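Outside Spark, the effect of the casting step can be illustrated on a single record. This is a plain-Python sketch, not part of the Synapse job; the raw field values shown are hypothetical, and the field names mirror the telemetry schema used above.

```python
def clean_record(raw):
    # Raw CSV fields arrive as strings; cast them to numeric types,
    # mirroring the PySpark cast("float") and cast("int") calls.
    return {
        "deviceId": raw["deviceId"],
        "temperature": float(raw["temperature"]),
        "humidity": int(raw["humidity"]),
    }

raw_row = {"deviceId": "device1", "temperature": "27.45", "humidity": "63"}
cleaned = clean_record(raw_row)
# cleaned now holds temperature as a float and humidity as an int
```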

OUTPUT:

AZURE DASHBOARD:
PYTHON SCRIPT FOR SENDING SIMULATED IOT TELEMETRY DATA TO AZURE IOT
HUB:

IOT SPARK POOL


BANDWIDTH UTILIZATION REPORT - EGRESS AND INGRESS TRAFFIC

SERVICE MONITORING: END-TO-END LATENCY AND REQUEST SUCCESS RATES

STORAGE – CONTAINER CREATED:


STREAM ANALYTICS TEST QUERY FOR IOT DATA PROCESSING

AZURE SYNAPSE SPARK POOL RESOURCE ALLOCATION:

SYNAPSE ANALYTICS WORKSPACE:


DATA LAKE STORAGE STRUCTURE IN AZURE SYNAPSE:

AZURE SYNAPSE SPARK NOTEBOOK: IOT DATA TRANSFORMATION PIPELINE


AZURE SYNAPSE SPARK JOB MONITORING DASHBOARD

SPARK LOGS:

SPARKPOOL ACTIVE SESSIONS DASHBOARD:

RESULT:

The real-time IoT data analytics system was successfully implemented by streaming IoT data from the IoT Hub to Azure Synapse Analytics, transforming the data using PySpark in Synapse notebooks, and storing the processed data in Azure Data Lake Storage Gen2 for further analysis.
