Spark Streaming Assignment
Aggregation
Background:
You have been provided with a Kafka topic named ads_data that contains
advertisement data in the following format:
{
"ad_id": "12345",
"timestamp": "2023-08-23T12:01:05Z",
"clicks": 5,
"views": 10,
"cost": 50.75
}
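As a rough illustration of the record format above, the following sketch parses one message value into a typed object using only the Python standard library. The names AdEvent and parse_ad_event are hypothetical and not part of the assignment, and the sketch assumes every field is present and that the timestamp always uses the UTC "Z" suffix shown in the sample.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AdEvent:
    """One advertisement record (hypothetical helper type, not from the assignment)."""
    ad_id: str
    timestamp: datetime
    clicks: int
    views: int
    cost: float


def parse_ad_event(raw: str) -> AdEvent:
    """Parse a JSON string in the ads_data format into an AdEvent."""
    record = json.loads(raw)
    # Assumption: timestamps always look like 2023-08-23T12:01:05Z (UTC).
    ts = datetime.strptime(record["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
    return AdEvent(
        ad_id=str(record["ad_id"]),
        timestamp=ts.replace(tzinfo=timezone.utc),
        clicks=int(record["clicks"]),
        views=int(record["views"]),
        cost=float(record["cost"]),
    )


sample = (
    '{"ad_id": "12345", "timestamp": "2023-08-23T12:01:05Z", '
    '"clicks": 5, "views": 10, "cost": 50.75}'
)
event = parse_ad_event(sample)
```

In a Spark job the same field names and types would typically be declared as a schema and applied to the Kafka message value. The plain-Python version here only makes the expected types explicit.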
The goal is to process this real-time data, compute business insights using
window-based aggregation, and write the aggregated results into a Cassandra
table. The aggregation key is ad_id, and aggregated values should update
Tasks:
○ If an entry exists, update the values:
■ Add new clicks/views to the existing counts.
■ Update the average cost per view.
● If an entry doesn't exist, create a new row with the aggregated
values.
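The update-or-create rule above can be sketched in plain Python, with a dictionary standing in for the Cassandra table. This is a minimal sketch under stated assumptions, not the required implementation. The names AdAggregate, merge_aggregate, and upsert are hypothetical. The sketch also assumes that cost in each record is the total cost of that record's views, so that average cost per view means total cost divided by total views.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class AdAggregate:
    """Stored row for one ad_id (hypothetical layout, not from the assignment)."""
    ad_id: str
    clicks: int
    views: int
    avg_cost_per_view: float


def merge_aggregate(existing: Optional[AdAggregate], ad_id: str,
                    new_clicks: int, new_views: int, new_cost: float) -> AdAggregate:
    """Combine a newly aggregated batch with the stored row, if there is one."""
    if existing is None:
        # No entry yet: create a new row from the aggregated values.
        avg = new_cost / new_views if new_views else 0.0
        return AdAggregate(ad_id, new_clicks, new_views, avg)
    # Entry exists: add the counts and recompute the average as a
    # view-weighted mean (assumption: cost is the total cost of the views).
    total_views = existing.views + new_views
    total_cost = existing.avg_cost_per_view * existing.views + new_cost
    avg = total_cost / total_views if total_views else 0.0
    return AdAggregate(ad_id, existing.clicks + new_clicks, total_views, avg)


# In-memory stand-in for the Cassandra table, keyed by ad_id.
table: Dict[str, AdAggregate] = {}


def upsert(ad_id: str, clicks: int, views: int, cost: float) -> AdAggregate:
    """Read the current row, merge the new values, and write the result back."""
    row = merge_aggregate(table.get(ad_id), ad_id, clicks, views, cost)
    table[ad_id] = row
    return row


first = upsert("12345", clicks=5, views=10, cost=50.75)
second = upsert("12345", clicks=3, views=10, cost=49.25)
```

Storing total cost alongside the view count, instead of only the average, would avoid reconstructing the total from a rounded average on every update. Which columns to keep is a design choice the assignment leaves open.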
Submission:
Submit your Spark Streaming application code, along with a brief report
detailing the results and any challenges faced during the assignment.