
e-ISSN (O): 2348-4470

Scientific Journal of Impact Factor (SJIF): 4.72

p-ISSN (P): 2348-6406

International Journal of Advance Engineering and Research Development
Emerging Technologies in the Computer World, January 2017

Improved Distributed Query Processing

Prof. Shilpa Pimpalkar, Department of Computer Engineering, AISSMSs IOIT

Prachi Bhonde1, Saylee Dalu2, Viraj Deshmukh3, Tushar Shedge4

Computer Engineering, AISSMSs IOIT

Abstract -- Distributed message queues and stream processing platforms are used to scale high-velocity data streams
across very large numbers of partitions on commodity hardware. The programming APIs these systems expose are low
level, so applications require more code, which increases the programmer's learning and maintenance time. These systems
also lack the SQL querying capabilities offered by big data systems such as Hive, Impala or Presto. In this paper we define
a minimal set of extensions to standard SQL for querying and manipulating data streams. Streaming SQL is a prototype of
these extensions: a tool that compiles streaming SQL into physical plans executed on Samza, an open-source distributed
stream processing framework. We compare the performance of streaming SQL queries with that of equivalent Samza
applications and discuss the improvements in usability.

Keywords-SamzaSQL, Streaming data, Apache Kafka, Apache Zookeeper, Apache Calcite, YARN

I.INTRODUCTION

Data processing with architectures such as Lambda and Kappa has been time consuming. To overcome this problem, the
proposed system executes queries in less time by using distributed query processing, with the aim of reducing hours of
work to minutes. The query processing model is a scalable and fault-tolerant SQL-based streaming query engine
implemented on top of Apache Samza, with support for interaction with non-streaming data sources. This project explores
a unified framework that enables Kappa-architecture-style data processing pipelines based on well-known standard SQL.
In the proposed architecture, Apache Samza and Apache Calcite provide the core functionality. Samza provides developers
with a Java API similar to Map/Reduce for implementing streaming tasks, and a message serialization and deserialization
API, called the Serde API, to support different message formats (such as Avro, JSON or Thrift). While regular Samza jobs
read from and write to Kafka, Samza provides a separate Java API for plugging in different input and output systems. The
Samza features discussed here are available to the proposed model for executing streaming SQL queries. Samza comes
with a built-in YARN client for submitting Samza jobs to a YARN cluster, and each Samza job has an application master
that performs input partition assignment and task scheduling. The Samza application master also takes care of fault
tolerance. A query is a Samza job with a stream task implementation, specific to the proposed system, that performs the
computation described in the query.
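
The idea of a query running as a per-message task behind a serde can be sketched as follows. This is only an illustration in Python, not Samza's Java API: the names JsonSerde, FilterTask and run_task, the Orders stream and its units field are all assumptions made for this example.

```python
import json


class JsonSerde:
    """Hypothetical JSON serde: converts messages to and from bytes."""

    def to_bytes(self, message):
        return json.dumps(message, sort_keys=True).encode("utf-8")

    def from_bytes(self, payload):
        return json.loads(payload.decode("utf-8"))


class FilterTask:
    """Hypothetical per-message task, loosely modelled on a stream task.

    It performs the computation of a streaming query such as
    SELECT STREAM * FROM Orders WHERE units > 10
    (stream and column names are assumed for illustration).
    """

    def __init__(self, serde, min_units):
        self.serde = serde
        self.min_units = min_units

    def process(self, payload, collector):
        # Deserialize the incoming message, apply the predicate,
        # and forward matching messages to the output collector.
        order = self.serde.from_bytes(payload)
        if order["units"] > self.min_units:
            collector.append(self.serde.to_bytes(order))


def run_task(task, payloads):
    """Feed every incoming payload to the task and return its output."""
    output = []
    for payload in payloads:
        task.process(payload, output)
    return output


serde = JsonSerde()
incoming = [serde.to_bytes({"id": i, "units": u}) for i, u in enumerate([5, 12, 30])]
results = [serde.from_bytes(p) for p in run_task(FilterTask(serde, 10), incoming)]
print(results)
```

In this sketch the task only sees serialized messages, so swapping JsonSerde for another format would not change the query logic.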

II.PROBLEM STATEMENT

Although distributed SQL query engines have been available for Big Data, we still lack support for SQL-based stream
querying capabilities in distributed stream processing systems. In this work, we identify a set of requirements and
propose a standard-SQL-based streaming query model for fast processing of data.

III.LITERATURE REVIEW

Milinda Pathirage [1] Technologies such as distributed message queues and stream processing platforms that can scale
to thousands of data stream partitions on commodity hardware are a response to the need to process high-velocity data
streams. The programming APIs provided by these systems are often low level, which increases the programmer's
overhead in maintaining code and learning new coding standards and syntaxes. They also lack the SQL querying
capabilities that are popular on Big Data systems. The authors define a minimal set of extensions to standard SQL to
support streaming queries.

Dmitry Namiot [2] The goal of this paper is to provide a quick introduction to, and survey of, the technical solutions for
big data stream processing. The survey covers machine-to-machine communications, sensor data in the Internet of Things,
and time series data processing. The author discusses the basic elements behind data stream processing, as well as
existing technical solutions for implementing it.

Oscar Boykin [3] Summingbird is an open-source domain-specific language, implemented in Scala, designed to
integrate online and batch MapReduce computations in a single framework. Hadoop is used for batch processing and
Storm for online processing. Summingbird can also operate in a hybrid processing mode, i.e. it can combine batch and
online processing results. It imposes constraints on the types of aggregations that can be performed, but these
constraints have not been found restrictive for a broad range of analytics tasks at Twitter.

Supun Kamburugamuve [4] A large number of applications store large amounts of data, generated in external
environments, on servers for real-time processing. These applications include stock trading, sensor-based monitoring,
web processing, network monitoring and so on. The data generated from these various sources can be seen as a stream of
events or tuples, and stream-based applications handle data as a sequence of event tuples.

Qian Lin [5] This paper proposes a novel model for stream joins, called join-biclique, which models a large cluster as a
bipartite graph. Join-biclique has several advantages over state-of-the-art techniques, such as memory efficiency,
elasticity and scalability, which are essential features for building scalable and efficient streaming systems. Based on
join-biclique, the authors developed a scalable distributed stream join system called BiStream. It supports full-history
joins, window joins and online data aggregation.

Mariam Kiran [6] This paper combines ideas from database management, cost models, query management and cloud
computing to present a general architecture that could be applied in any scenario where affordable online processing of
Big Datasets is needed. The results show a reduction in cost, and the authors argue the benefits of performing online
analysis and anomaly detection on sensor data.

Leonardo Neumeyer [7] S4 is a general-purpose, distributed, scalable, partially fault-tolerant, pluggable platform that
allows programmers to easily develop applications for processing continuous unbounded streams of data. The authors
show that the S4 design is surprisingly flexible and lends itself to running in large clusters built with commodity hardware.

Matei Zaharia [8] This paper argues that big data applications should act on real-time data, and that they should be
fault tolerant and scalable because they are used at large scale. They should also handle stragglers automatically. The
distributed systems of that time were not fault tolerant in this sense: recovery was too expensive or time consuming, and
stragglers were not handled. The authors therefore proposed a new model, called D-Streams (discretized streams), that
overcomes these challenges.

Daniel J. Abadi [9] This paper describes Aurora's basic processing model and architecture. Aurora is a new system for
managing data streams for monitoring applications. There are various types of monitoring applications, and they produce
a wide variety of data. The paper discusses the Aurora architecture and a stream-oriented set of operators for managing
this data.

Arvind Arasu [10] This paper discusses CQL (Continuous Query Language), which is supported by STREAM (the
Stanford Data Stream Management System). CQL is an SQL-based query language that supports querying data streams as
well as updatable relations. The authors begin by presenting an abstract semantics that relies only on black-box mappings
among streams and relations. From these mappings they define a precise and general interpretation for continuous queries.

IV.PROPOSED SYSTEM

In our system, an SQL query is provided as the input. The system first interacts with SamzaSQL through the SqlLine
library; the query planner converts the query into tuple format and sends it as input to the sliding window operation
logic. Each tuple carries a timestamp; using that timestamp, the system identifies the time interval, then emits the tuples
and sends the result to the output window.
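
The timestamp-driven window step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation. It assumes tuples arrive ordered by timestamp, that a sliding window is intended, and that the aggregate is a sum. The function name sliding_window_sum and the sample data are made up for this example.

```python
from collections import deque


def sliding_window_sum(tuples, window_seconds):
    """Yield (timestamp, running_sum) for each input tuple.

    `tuples` is an iterable of (timestamp, value) pairs ordered by timestamp.
    The sum covers values whose timestamp lies in (t - window_seconds, t].
    """
    window = deque()  # tuples currently inside the window
    total = 0
    for timestamp, value in tuples:
        window.append((timestamp, value))
        total += value
        # Evict tuples whose timestamp has fallen out of the time interval.
        while window[0][0] <= timestamp - window_seconds:
            _, expired = window.popleft()
            total -= expired
        yield timestamp, total


events = [(0, 4), (3, 1), (7, 5), (12, 2)]
sums = list(sliding_window_sum(events, window_seconds=10))
print(sums)
```

Each arriving tuple costs amortized constant time, because every tuple is appended once and evicted at most once.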

Fig. Streaming SQL Architecture

The figure illustrates our proposed system, i.e. the Streaming SQL architecture. Users interact with Streaming SQL
through a special SQL shell built using the SqlLine library and a custom Streaming SQL-specific JDBC driver
implementation. The SamzaSQL shell is a command-line application that runs on the user's desktop. The SamzaSQL JDBC
driver wraps the query planner, which converts a streaming SQL query into a Samza job containing one or more SamzaSQL
tasks. The query planner uses the Samza YARN client to submit streaming jobs to a YARN cluster. It uses Calcite model
files described in JSON format and the Kafka Schema Registry to retrieve the metadata necessary for query planning, and
uses Zookeeper to share metadata and configuration information between the query planner and the SamzaSQL streaming
tasks. Information shared via Zookeeper includes the streaming SQL query to use during task query planning, the Schema
Registry location, message schema details, etc. We are also planning to use Zookeeper to store the SamzaSQL shell's
session information, including running queries, their input/output streams and the status of each query.
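
As a rough illustration of what a model file in JSON format might contain, the sketch below builds one in Python. The top-level layout (version, defaultSchema, and a schemas list whose entries have name, type, factory and operand) follows Calcite's JSON model conventions. The schema name, the factory class and the operand keys are placeholders assumed for this example and are not taken from the paper.

```python
import json


def build_model(schema_name, factory_class, operand):
    """Return a Calcite-style model describing one custom schema."""
    return {
        "version": "1.0",
        "defaultSchema": schema_name,
        "schemas": [
            {
                "name": schema_name,
                "type": "custom",
                "factory": factory_class,
                "operand": operand,
            }
        ],
    }


model = build_model(
    "KAFKA",  # assumed schema name
    "org.example.HypotheticalStreamSchemaFactory",  # placeholder, not a real class
    {
        # Assumed operand keys: where the planner would look up metadata.
        "schemaRegistryUrl": "http://localhost:8081",
        "zookeeper": "localhost:2181",
    },
)
model_json = json.dumps(model, indent=2)
print(model_json)
```

The operand block is passed as-is to the schema factory, so it is a natural place for connection details such as the Schema Registry location.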

V.CONCLUSION

Streaming SQL is an SQL implementation on top of Apache Samza that uses standard SQL as much as possible by
moving streaming-related details away from the language layer to the physical execution layer. Streaming SQL
demonstrates the advantages of having access to checkpointed local storage and to concepts such as bootstrap streams in
a stream processing framework.

REFERENCES

[1] Milinda Pathirage, Julian Hyde, Yi Pan and Beth Plale, "SamzaSQL: Scalable Fast Data Management with
Streaming SQL", 2016 IEEE

[2] Dmitry Namiot, "On Big Data Stream Processing", vol. 3, no. 8, 2015

[3] O. Boykin, S. Ritchie, I. O'Connell, and J. Lin, "Summingbird: A framework for integrating batch and online
MapReduce computations", Proceedings of the VLDB Endowment, vol. 7, no. 13, 2014.

[4] Supun Kamburugamuve, "Survey of Distributed Stream Processing for Large Stream Sources", For the PhD
Qualifying Exam, 2013

[5] Qian Lin, Beng Chin Ooi, Zhengkui Wang, Cui Yu, "Scalable Distributed Stream Join Processing", 2015

[6] Mariam Kiran, Peter Murphy, Inder Monga, Jon Dugan, Sartaj Singh Baveja, "Lambda Architecture for
Cost-effective Batch and Speed Big Data processing", 2015 IEEE International Conference on Big Data

[7] L. Neumeyer, B. Robbins, A. Nair, and A. Kesari, "S4: Distributed stream computing platform", in Data Mining
Workshops (ICDMW), pp. 170-177, International Conference on. IEEE, 2010.

[8] Matei Zaharia, Tathagata Das, Haoyuan Li, Timothy Hunter, Scott Shenker, Ion Stoica, "Discretized Streams:
Fault-Tolerant Streaming Computation at Scale", Nov. 3-6, 2013

[9] D. J. Abadi, D. Carney, U. Çetintemel, M. Cherniack, C. Convey, S. Lee, M. Stonebraker, N. Tatbul, and S.
Zdonik, "Aurora: a new model and architecture for data stream management", The VLDB Journal-The
International Journal on Very Large Data Bases, vol. 12, no. 2, pp. 120-139, 2003.

[10] A. Arasu, S. Babu, and J. Widom, "The CQL continuous query language: Semantic foundations and query
execution", The VLDB Journal, vol. 15, no. 2, pp. 121-142, Jun. 2006
