Big Data Analytics Application
Research area 02
Designing architectures to support streaming analytics requires handling high-velocity, real-time data
efficiently. Novel architectures focus on processing data in motion, reducing latency, scaling
dynamically, and providing timely insights. Here are some cutting-edge architectures to support
streaming analytics:
1. Lambda Architecture 2.0 (Augmented Lambda)
Overview: Extends the traditional Lambda architecture with stronger real-time processing
capabilities. Processing is separated into three layers:
Batch layer: Stores historical data for deep analysis.
Speed layer: Processes real-time data streams.
Serving layer: Combines results from both layers for quick query responses.
Novelty:
• Uses stream-first processing, where real-time data is prioritized over batch jobs.
• Tools such as Apache Beam provide unified APIs for both batch and streaming data (see the sketch below).
Use Cases: Real-time fraud detection and live analytics for social media feeds.
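The stream-first, unified-API idea can be illustrated with a minimal Apache Beam sketch in Python. The
Pub/Sub subscription name and JSON fields below are placeholder assumptions, not part of the
architecture itself; the point is that the same transforms could also read from a bounded batch source,
so the speed and batch layers share one code path.

import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def run(source):
    """Count transactions per account in 60-second windows."""
    opts = PipelineOptions(streaming=True)
    with beam.Pipeline(options=opts) as p:
        (
            p
            | "Read" >> source                                   # streaming or batch source
            | "Parse" >> beam.Map(json.loads)                    # placeholder JSON events
            | "Window" >> beam.WindowInto(beam.window.FixedWindows(60))
            | "KeyByAccount" >> beam.Map(lambda e: (e["account_id"], 1))
            | "CountPerAccount" >> beam.CombinePerKey(sum)
            | "Emit" >> beam.Map(print)                          # stand-in for the serving layer
        )

if __name__ == "__main__":
    # Streaming run against a hypothetical subscription; a batch backfill could
    # instead pass e.g. beam.io.ReadFromText("gs://bucket/transactions*.json").
    run(beam.io.ReadFromPubSub(subscription="projects/demo/subscriptions/transactions"))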
2. Kappa Architecture
Overview: A simplification of the Lambda architecture, focusing only on real-time processing. Instead
of separating batch and speed layers, all data is treated as a stream.
Core Technology:
• Tools like Kafka Streams, Flink, and Apache Pulsar make Kappa feasible.
Novelty:
• Data is processed continuously, even for historical datasets, by replaying event logs (see the sketch below).
• Avoids the complexity of maintaining two separate processing paths (batch and real-time).
Use Cases: Real-time recommendations, IoT data analytics.
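A minimal sketch of the "replay the event log" idea, using the kafka-python client; the topic name, broker
address, and message fields are assumptions. The same consumer logic serves both live processing and
historical reprocessing, because reprocessing is simply reading the topic again from the earliest offset.

import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "iot-sensor-events",                    # hypothetical topic holding the full event log
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",           # start from the beginning: this is the replay
    enable_auto_commit=False,
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

# One continuously updated aggregate per device, rebuilt from scratch on every replay.
running_max = {}
for msg in consumer:
    event = msg.value
    device = event["device_id"]
    running_max[device] = max(running_max.get(device, float("-inf")), event["reading"])
    print(device, running_max[device])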
3. Microservices-based Streaming Architecture
Overview: Leverages microservices to create loosely coupled, independently deployable services for
streaming data.
• Data pipelines are broken into smaller components, with each service processing and forwarding
data (see the sketch at the end of this section).
• Services communicate via event-driven platforms like Apache Kafka, RabbitMQ, or AWS
Kinesis.
Novelty:
• Serverless functions are invoked when new data arrives, ensuring real-time responses.
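As a sketch of one such pipeline stage, the hypothetical service below consumes raw events from an input
topic, enriches them, and forwards them to the next service's topic (kafka-python client; topic names,
broker address, and event fields are placeholder assumptions).

import json
import time
from kafka import KafkaConsumer, KafkaProducer

consumer = KafkaConsumer(
    "orders-raw",                                   # hypothetical upstream topic
    bootstrap_servers="localhost:9092",
    group_id="enrichment-service",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

for msg in consumer:
    order = msg.value
    # This service's single responsibility: attach a timestamp and a derived total,
    # then hand the event off to the next, independently deployable stage.
    order["processed_at"] = time.time()
    order["total"] = order["quantity"] * order["unit_price"]
    producer.send("orders-enriched", order)          # hypothetical downstream topic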
4. Edge Streaming Analytics Architecture
Overview: Pushes stream processing onto edge devices close to where the data is generated, so raw
data does not have to leave the device.
Novelty:
• Preserves privacy by processing data locally and sharing only the necessary outcomes (see the sketch
at the end of this section).
• Reduces network congestion by minimizing data transfer.
Use Cases: Healthcare analytics, smart cities, and distributed IoT networks.
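The sketch below illustrates the two novelty points with standard-library Python only: raw readings stay
on the device, and just a small per-window summary leaves it. The sensor source and publish_summary()
are hypothetical stand-ins for real sensor and uplink code.

import random
import statistics
import time

def read_sensor():
    """Placeholder for a real local sensor read (e.g., a wearable heart-rate monitor)."""
    return random.gauss(72, 5)

def publish_summary(summary):
    """Placeholder for the uplink (MQTT, HTTPS, ...); only aggregates ever leave the edge."""
    print("sync to cloud:", summary)

WINDOW_SECONDS = 10
window, window_start = [], time.time()

while True:
    window.append(read_sensor())
    if time.time() - window_start >= WINDOW_SECONDS:
        publish_summary({
            "count": len(window),
            "mean": statistics.mean(window),
            "max": max(window),
        })
        window, window_start = [], time.time()
    time.sleep(1)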
5. Digital Twin-based Streaming Architecture
Overview: Maintains a central, continuously updated model of a physical system that is kept in sync
with live data streams.
• Edge computing components process data streams locally and sync with the central model (see the
sketch at the end of this section).
Novelty:
• Combines edge analytics with real-time model updates for situational awareness.
• Supports predictive analytics through live simulation of scenarios.
Use Cases: Predictive maintenance in manufacturing, autonomous vehicles, and energy grid
management.
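The single-process sketch below illustrates the edge-to-central-model sync described above: each edge
node keeps a compact running state from its local stream and periodically pushes an update to a central
model, which uses the latest states for a simple what-if check. All class names, fields, and thresholds
are illustrative assumptions.

import random

class CentralModel:
    """Central model holding the latest state reported by each edge node."""

    def __init__(self):
        self.machine_state = {}

    def apply_update(self, machine_id, state):
        self.machine_state[machine_id] = state

    def predict_failure_risk(self, machine_id):
        # Toy "simulation": flag machines whose average vibration trends high.
        state = self.machine_state.get(machine_id, {})
        return state.get("avg_vibration", 0.0) > 0.8

class EdgeNode:
    """Processes its local stream and syncs only a compact state summary."""

    def __init__(self, machine_id, central):
        self.machine_id = machine_id
        self.central = central
        self.readings = []

    def on_reading(self, vibration):
        self.readings.append(vibration)
        if len(self.readings) >= 5:                  # sync every 5 readings
            avg = sum(self.readings) / len(self.readings)
            self.central.apply_update(self.machine_id, {"avg_vibration": avg})
            self.readings.clear()

central = CentralModel()
node = EdgeNode("press-07", central)
for _ in range(20):
    node.on_reading(random.uniform(0.5, 1.0))        # stand-in for a live vibration stream
print("maintenance needed:", central.predict_failure_risk("press-07"))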
These architectures aim to balance scalability, latency, fault tolerance, and ease of integration. Each is
tailored to specific use cases—whether edge computing, privacy-preserving analytics, or high-
performance real-time systems—making them critical for the next generation of streaming analytics.
General Instructions:
❖ Submit a soft copy:
1. A .docx file containing the assignment.
❖ Submit it in the BLC's assignment section.