
STREAM PROCESSING-CCS368

TWO MARKS QUESTIONS AND ANSWERS

1, Stream processing

Stream processing addresses the need for real-time data analysis and decision-making. In scenarios where immediate data processing is crucial, such as fraud detection in banking or real-time monitoring in manufacturing, stream processing allows data to be analyzed as it arrives, enabling instantaneous responses.

The shift from batch processing to stream processing in many domains is driven by the increasing demand for real-time insights and the growing volume and velocity of data.

Batch processing and stream processing are two different approaches to handling data. Batch processing involves processing large volumes of data at once, at scheduled intervals. In contrast, stream processing involves continuously processing data in real time as it arrives.

2, Data Integration
Data integration is the process of combining data from multiple sources to provide a single, unified view. A related activity, data migration, is the process of selecting, preparing, extracting, and transforming data and permanently transferring it from one computer storage system to another; it is a common IT activity.

3, DATA MINING
Data mining is the process of sorting through large data sets to identify patterns and relationships
that can help solve business problems through data analysis. Data mining techniques and tools
help enterprises to predict future trends and make more informed business decisions.

4, Six stages of the data processing cycle


 Step 1: Collection. The collection of raw data is the first step of the data processing
cycle. ...
 Step 2: Preparation. ...
 Step 3: Input. ...
 Step 4: Data Processing. ...
 Step 5: Output. ...
 Step 6: Storage.

5, Data as a service
Data as a service (DaaS) is a business model in which data is made available on demand, regardless of the consumer's location or infrastructure.

UNIT 2

1, Kappa Architecture is a software architecture used for processing streaming data.


The main premise behind the Kappa Architecture is that you can perform both real-time
and batch processing, especially for analytics, with a single technology stack. It is
based on a streaming architecture in which an incoming series of data is first stored in a
messaging engine like Apache Kafka. From there, a stream processing engine will read
the data and transform it into an analyzable format, and then store it into an analytics
database for end users to query.
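
For illustration, here is a minimal sketch of the streaming leg of a Kappa-style pipeline using the Kafka Streams Java API. The topic names, the uppercasing transformation, and the broker address are assumptions invented for this example, not part of the original text.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class KappaPipelineSketch {
    public static void main(String[] args) {
        // Basic configuration: application id and broker address (assumed values).
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "kappa-demo");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        // Read raw events from an input topic, transform them into an
        // "analyzable" form (here simply uppercasing), and write them to an
        // output topic that an analytics database could consume.
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> rawEvents = builder.stream("raw-events");
        rawEvents.mapValues(value -> value.toUpperCase())
                 .to("analytics-ready-events");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        // Close the topology cleanly on shutdown.
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```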

2, Lambda architecture
Lambda architecture is a data deployment model for processing that consists of a traditional batch data pipeline and a fast streaming data pipeline for handling real-time data. In addition to the batch and speed layers, Lambda architecture also includes a data serving layer for responding to user queries.
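
As a rough illustration of the serving layer described above, the sketch below merges a precomputed batch view with an incremental real-time view when answering a query. The view structures, method names, and event keys are invented for the example, not taken from the original text.

```java
import java.util.HashMap;
import java.util.Map;

public class LambdaServingLayerSketch {
    // Batch view: counts recomputed periodically over the full historical data set.
    private final Map<String, Long> batchView = new HashMap<>();
    // Speed (real-time) view: counts for events that arrived after the last batch run.
    private final Map<String, Long> realtimeView = new HashMap<>();

    // The serving layer answers a query by combining both views.
    public long getEventCount(String key) {
        long batch = batchView.getOrDefault(key, 0L);
        long recent = realtimeView.getOrDefault(key, 0L);
        return batch + recent;
    }

    // The speed layer updates its view as each new event arrives.
    public void onNewEvent(String key) {
        realtimeView.merge(key, 1L, Long::sum);
    }

    // The batch layer periodically replaces the batch view and resets the speed layer.
    public void onBatchRecompute(Map<String, Long> freshBatchView) {
        batchView.clear();
        batchView.putAll(freshBatchView);
        realtimeView.clear();
    }
}
```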

3,BIG DATA

Big data refers to extremely large and diverse collections of structured, unstructured, and semi-structured data that continues to grow exponentially over time.

Real-time analytics
Real-time analytics is the discipline that applies logic and mathematics to data to provide insights for making better decisions quickly. For some use cases, real time simply means the analytics is completed within a few seconds or minutes after the arrival of the data.

Stream processing
Stream processing allows applications to respond to new data events at the moment they occur. Rather than grouping data and collecting it at some predetermined interval, as batch processing does, stream processing applications collect and process data immediately as it is generated.

What is a message broker?

A message broker is software that enables applications, systems, and services to communicate with each other and exchange information. The message broker does this by translating messages between formal messaging protocols.

What’s the Difference Between Real-time and Batch ETL?

Real-time and batch ETL are two data extraction approaches. Real-time ETL
(extract, transform, load) extracts, transforms and loads data as soon as it
becomes available. Batch ETL processes data in batches according to a
predetermined schedule or set of conditions.

Both approaches have their pros and cons; which one is best suited for a business depends on factors such as speed requirements, the volume of incoming data, and security needs.
UNIT 3
DATA MODELS AND QUERY LANGUAGES

1, The relational model (RM) is an approach to managing data using a structure and
language consistent with first-order predicate logic, first described in 1969 by English computer
scientist Edgar F. Codd,[1][2] where all data is represented in terms of tuples, grouped
into relations. A database organized in terms of the relational model is a relational database.

The purpose of the relational model is to provide a declarative method for specifying data and
queries: users directly state what information the database contains and what information they
want from it, and let the database management system software take care of describing data
structures for storing the data and retrieval procedures for answering queries

2, Document Object Model (DOM)

The Document Object Model (DOM) connects web pages to scripts or programming languages by representing the structure of a document, such as the HTML representing a web page, in memory. Usually it refers to JavaScript, even though modeling HTML, SVG, or XML documents as objects is not part of the core JavaScript language.

3, KEY VALUE PAIR


A key-value pair consists of two related data elements: a key, which is a constant that defines the data set (e.g., gender, color, price), and a value, which is a variable that belongs to that set (e.g., male/female, green, 100). Fully formed, key-value pairs could look like this: gender = male, color = green.
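
A key-value pair maps directly onto a map or dictionary structure. The short Java sketch below, using the example keys and values mentioned above, shows the idea.

```java
import java.util.HashMap;
import java.util.Map;

public class KeyValueExample {
    public static void main(String[] args) {
        // Each entry is a key-value pair: the key names the data set,
        // the value is the variable that belongs to it.
        Map<String, String> record = new HashMap<>();
        record.put("gender", "male");
        record.put("color", "green");
        record.put("price", "100");

        // Looking up a value by its key.
        System.out.println("color = " + record.get("color"));
    }
}
```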
4, What is a NoSQL database?
NoSQL, also referred to as "not only SQL" or "non-SQL", is an approach to database design that enables the storage and querying of data outside the traditional structures found in relational databases.
5, Object–relational mismatch
Object–relational mismatch is a set of difficulties in moving between data in relational data stores and data in domain-driven object models. Relational database management systems (RDBMS) are the standard method for storing data in a dedicated database, while object-oriented (OO) programming is the default method for business-centric design in programming languages. The problem lies in neither relational databases nor OO programming, but in the conceptual difficulty of mapping between the two logic models. Both logical models are differently implementable using database servers, programming languages, design patterns, or other technologies. Issues range from application to enterprise scale, whenever stored relational data is used in domain-driven object models, and vice versa. Object-oriented data stores can trade this problem for other implementation difficulties.

6. What are one-to-many and many-to-many relationships?

A one-to-many relationship means that each record in one table can be related to many records in a second table, while each record in the second table relates to only one record in the first. A many-to-many relationship means that for each record in one table there can be many records in another table and vice versa; it is usually implemented as two one-to-many relationships through a third (junction) table.
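
To make the "third table" concrete, here is a hypothetical sketch of a many-to-many relationship between students and courses, implemented as two one-to-many relationships through a junction table. The table and column names are invented for the example, the SQL is issued through JDBC, and an in-memory H2 database is assumed to be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class ManyToManySketch {
    public static void main(String[] args) throws Exception {
        // Assumed in-memory H2 database, purely for illustration.
        try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:demo");
             Statement stmt = conn.createStatement()) {

            stmt.executeUpdate("CREATE TABLE students (id INT PRIMARY KEY, name VARCHAR(100))");
            stmt.executeUpdate("CREATE TABLE courses  (id INT PRIMARY KEY, title VARCHAR(100))");

            // Junction table: one row per (student, course) pairing.
            // Each foreign key is the 'many' side of a one-to-many relationship.
            stmt.executeUpdate(
                "CREATE TABLE enrollments (" +
                "  student_id INT REFERENCES students(id)," +
                "  course_id  INT REFERENCES courses(id)," +
                "  PRIMARY KEY (student_id, course_id))");
        }
    }
}
```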

7, What are network data models?

In computing, the network model is a database model conceived as a flexible way of representing
objects and their relationships. Its distinguishing feature is that the schema, viewed as a graph in
which object types are nodes and relationship types are arcs, is not restricted to being a hierarchy
or lattice.

What is a flexible schema?

Schema flexibility means that you can define a table with some columns and then dynamically add more columns at run time, without needing to redefine the table structure.
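
As a small illustration of adding a column at run time, the sketch below issues an ALTER TABLE statement through JDBC. The connection URL, table, and column names are assumptions made for the example, with an in-memory H2 database assumed to be available.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class FlexibleSchemaSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:demo");
             Statement stmt = conn.createStatement()) {

            // Initial table definition with a couple of columns.
            stmt.executeUpdate("CREATE TABLE students (id INT PRIMARY KEY, name VARCHAR(100))");

            // Later, at run time, the schema is extended with a new column
            // without recreating the table.
            stmt.executeUpdate("ALTER TABLE students ADD COLUMN email VARCHAR(255)");
        }
    }
}
```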
Structured query language
Structured query language (SQL) is a programming language for storing and processing information in a
relational database. A relational database stores information in tabular form, with rows and columns
representing different data attributes and the various relationships between the data values.

What Is Data Locality?

The data locality pattern allows us to move computation to the data. The data can live in the database or the file system. The situation is simple as long as our data fits into the disk or memory of our machines: processing can be local and fast.

What is a declarative query language?

In a declarative query language, we specify only what data we want; how to obtain it is delegated to the language engine. For example, in SQL such a query can be written as: Select * from students where department = "computer science".
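
To show a declarative query being executed from application code, here is a small JDBC sketch running the SELECT statement above. The students table, its columns, the sample row, and the connection URL are assumptions for the example (an in-memory H2 database is assumed).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class DeclarativeQuerySketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:demo");
             Statement stmt = conn.createStatement()) {

            stmt.executeUpdate("CREATE TABLE students (name VARCHAR(100), department VARCHAR(100))");
            stmt.executeUpdate("INSERT INTO students VALUES ('Asha', 'computer science')");

            // We state *what* we want; the database engine decides *how* to fetch it.
            try (ResultSet rs = stmt.executeQuery(
                    "SELECT * FROM students WHERE department = 'computer science'")) {
                while (rs.next()) {
                    System.out.println(rs.getString("name"));
                }
            }
        }
    }
}
```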

Graph data model


The graph data model is often referred to as being whiteboard-friendly. Typically, when
designing a data model, people draw example data on the whiteboard and connect it to other data
drawn to show how different items connect. The whiteboard model is then re-formatted and
structured to fit normalized tables for a relational model.

CYPHER QUERY LANGUAGE


Cypher is a declarative graph query language that allows for expressive and efficient data querying in a property graph. Cypher was largely an invention of Andrés Taylor while working at Neo4j (then Neo Technology).
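
For illustration, a minimal sketch of running a Cypher query from Java using the Neo4j driver (version 4.x assumed). The connection details, node labels, and property names are assumptions made for the example, and the org.neo4j.driver library must be on the classpath.

```java
import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

public class CypherQuerySketch {
    public static void main(String[] args) {
        // Assumed local Neo4j instance and credentials.
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                                                  AuthTokens.basic("neo4j", "password"));
             Session session = driver.session()) {

            // Declarative pattern match: find names of people that Alice knows.
            Result result = session.run(
                "MATCH (a:Person {name: 'Alice'})-[:KNOWS]->(friend:Person) " +
                "RETURN friend.name AS name");

            while (result.hasNext()) {
                System.out.println(result.next().get("name").asString());
            }
        }
    }
}
```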

What is a graph query?


Graph queries, for the most part, attempt to identify an explicit pattern within the graph database.
Graph queries have an expressive power to return something at the level of an analytic in a
normal data processing system.
The Semantic Web

The Semantic Web is a vision about an extension of the existing World Wide Web, which
provides software programs with machine-interpretable metadata of the published information
and data. In other words, we add further data descriptors to otherwise existing content and data
on the Web

CODASYL

The Conference/Committee on Data Systems Languages (CODASYL) was a consortium formed in 1959 to guide the development of a standard programming language that could be used on many computers. This effort led to the development of the programming language COBOL, the CODASYL Data Model, and other technical standards.

CODASYL's members were individuals from industry and government involved in data
processing activity. Its larger goal was to promote more effective data systems analysis, design,
and implementation. The organization published specifications for various languages over the
years, handing these over to official standards bodies (ISO, ANSI, or their predecessors) for
formal standardization.

SPARQL

SPARQL, pronounced "sparkle", is the standard query language and protocol for Linked Open Data on the web or for RDF triplestores. SPARQL, short for "SPARQL Protocol and RDF Query Language", enables users to query information from databases or any data source that can be mapped to RDF.
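
As a hedged illustration, the sketch below runs a SPARQL SELECT query against a public endpoint using Apache Jena. The endpoint URL, the FOAF prefix, and the variable names are assumptions chosen for the example, and the Jena libraries are assumed to be on the classpath.

```java
import org.apache.jena.query.Query;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QueryFactory;
import org.apache.jena.query.QuerySolution;
import org.apache.jena.query.ResultSet;

public class SparqlQuerySketch {
    public static void main(String[] args) {
        // A simple SELECT: names of resources typed as persons (FOAF vocabulary).
        String queryString =
            "PREFIX foaf: <http://xmlns.com/foaf/0.1/> " +
            "SELECT ?name WHERE { ?person a foaf:Person ; foaf:name ?name } LIMIT 10";

        Query query = QueryFactory.create(queryString);

        // Assumed public SPARQL endpoint.
        try (QueryExecution qexec =
                 QueryExecutionFactory.sparqlService("https://dbpedia.org/sparql", query)) {
            ResultSet results = qexec.execSelect();
            while (results.hasNext()) {
                QuerySolution soln = results.nextSolution();
                System.out.println(soln.getLiteral("name").getString());
            }
        }
    }
}
```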

UNIT 4
EVENT PROCESSING WITH APACHE KAFKA

Apache Kafka is a distributed event store and stream-processing platform. It is an open-source system developed by the Apache Software Foundation, written in Java and Scala. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds.
Kafka APIs
Apache Kafka® provides five core Java APIs to enable cluster and client management. These APIs can be used to build and manage powerful streaming applications:

 Producer API
 Consumer API
 Admin Client API
 Connect API
 Kafka Streams API

What is the Admin API?

In Kafka, the Admin Client API provides programmatic access to administrative operations on the cluster, such as creating, deleting, and inspecting topics and managing configurations and ACLs.
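
A minimal sketch of the Kafka Admin Client API creating and listing topics is shown below. The broker address, topic name, and partition/replication settings are assumed values for the example.

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class AdminClientSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // Create a topic with 3 partitions and a replication factor of 1.
            NewTopic topic = new NewTopic("payment-events", 3, (short) 1);
            admin.createTopics(Collections.singleton(topic)).all().get();

            // List the topics now present in the cluster.
            System.out.println(admin.listTopics().names().get());
        }
    }
}
```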

What is the Producer API?


Producers publish (write) a stream of events to one or more Kafka topics. The Producer API
enables developers to create their own producers that write data to Kafka topics. The API
provides several options for configuring the behavior of the producer, such as setting the number
of acknowledgments required before considering a message as sent, or setting compression
options to reduce the size of messages.
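
The sketch below uses the Producer API to write a single event to a topic, setting the acknowledgment and compression options mentioned above. The topic name, key, value, and broker address are assumed values for the example.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Wait for all in-sync replicas to acknowledge the write.
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        // Compress message batches to reduce their size on the wire.
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "gzip");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Publish one event to the (assumed) "payment-events" topic.
            producer.send(new ProducerRecord<>("payment-events", "order-42", "PAID"));
            producer.flush();
        }
    }
}
```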
