Moving Queries To The Data, Not Data To The Queries

1. When querying NoSQL databases, it is more efficient to distribute the queries to the nodes holding the data, rather than transferring large datasets to a central processor. This keeps queries fast by minimizing data movement over the network.
2. Hash rings are commonly used to evenly distribute data loads across servers in big data solutions. They determine which node a document should be assigned to based on the leading bits of the document's hash value.
3. Databases use replication to make backup copies of data in real time, allowing horizontal scaling of reads. Replication can cause inconsistencies if a client reads from a replica before an update is replicated. For consistency, reads should go to the same node that handled the write.

Uploaded by

Air

1. Moving queries to the data, not data to the queries:

With the exception of large graph databases, most NoSQL systems use commodity
processors that each hold a subset of the data on their local shared-nothing drives.
When a client wants to send a general query to all nodes that hold data, it’s more
efficient to send the query to each node than it is to transfer large datasets to a
central processor. This simple rule helps you understand how NoSQL databases can
have dramatic performance advantages over systems that weren’t designed to
distribute queries to the data nodes. Keeping all the data within each data node in
the form of logical documents means that only the query itself and the final result
need to be moved over a network. This keeps your big data queries fast.
2. Using hash rings to evenly distribute data on a cluster:
One of the most challenging problems with distributed databases is figuring out a
consistent way of assigning a document to a processing node. A hash ring that
identifies each document by a randomly generated 40-character key is a good way to
spread big data loads evenly over many servers.
Hash rings are common in big data solutions because they consistently determine
how to assign a piece of data to a specific processor. A hash ring takes the leading
bits of a document’s hash value and uses them to determine which node the document
should be assigned to. This allows any node in a cluster to know which node the data
lives on and how to adapt to new assignment methods as your data grows.
Partitioning keys into ranges and assigning different key ranges to specific nodes is
known as keyspace management.
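As a rough sketch of the idea, the function below maps a document key to a node using the leading bits of its hash. The choice of SHA-1 (whose hex digest happens to be 40 characters long), the default of 8 leading bits, and the node names are assumptions for illustration; the notes do not say which hash function or how many bits a real system uses. This is simple range partitioning of the keyspace and leaves out refinements such as virtual nodes.

```python
import hashlib

HASH_BITS = 160  # SHA-1 produces a 160-bit value (40 hex characters)

def node_for(key, nodes, bits=8):
    """Pick a node from the leading `bits` bits of the key's SHA-1 hash."""
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    # Keep only the leading bits of the hash value.
    prefix = int(digest, 16) >> (HASH_BITS - bits)
    # Split the 2**bits possible prefixes into contiguous ranges, one per node.
    return nodes[prefix * len(nodes) // (2 ** bits)]

cluster = ["node-a", "node-b", "node-c"]  # hypothetical node names
print(node_for("customer/1001", cluster))
```

Because the mapping depends only on the key and the node list, any node that knows the list can compute where a document lives without asking a central directory.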
3. Using replication to scale reads:
Databases use replication to make backup copies of data in real time. Replication
also allows you to horizontally scale read requests, because reads can be spread
across the replicas. Figure shows this structure.

This replication strategy works well in most cases. You only need to be concerned
about the lag time between a write to the read/write node and a client reading that
same record from a replica. One of the most common operations after a write is a
read of that same record. If a client does a write and then an immediate read from
that same node, there’s no problem. The problem occurs if the read goes to a replica
node before the update has been copied to it. This is an example of an inconsistent
read. The best way to avoid this type of problem is to send reads that follow a
write to the same node that accepted the write. This logic can be added to a session
or state management system at the application layer. Almost all distributed
databases relax database consistency rules when a large number of nodes permit
writes. If your application needs fast read/write consistency, you must deal with it
at the application layer.
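One way such application-layer logic could look is sketched below. The class name `SessionRouter`, the node names, and especially the `lag_window` parameter (how long after a write a session keeps reading from the write node) are assumptions for illustration; the notes only say that reads after a write should go to the same node, not for how long.

```python
import itertools
import time

class SessionRouter:
    """Route reads to replicas, except right after the same session wrote the key."""

    def __init__(self, primary, replicas, lag_window=1.0, clock=time.monotonic):
        self.primary = primary                      # the read/write node
        self._replicas = itertools.cycle(replicas)  # round-robin over replicas
        self.lag_window = lag_window                # assumed replication lag, in seconds
        self.clock = clock
        self._last_write = {}                       # (session_id, key) -> write time

    def route_write(self, session_id, key):
        """All writes go to the read/write node; remember when they happened."""
        self._last_write[(session_id, key)] = self.clock()
        return self.primary

    def route_read(self, session_id, key):
        """Read from the write node while a replica might still be stale."""
        written = self._last_write.get((session_id, key))
        if written is not None and self.clock() - written < self.lag_window:
            return self.primary
        return next(self._replicas)

router = SessionRouter("primary", ["replica-1", "replica-2"])
router.route_write("session-42", "order/7")
print(router.route_read("session-42", "order/7"))  # read follows the write
```

Reads from other sessions, or reads made after the window has passed, still go to the replicas, so the read-scaling benefit of replication is mostly preserved.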
4. Letting the database distribute queries evenly to data nodes:
In order to get high performance from queries that span multiple nodes, it’s important
to separate the concerns of query evaluation from query execution. Figure shows
this structure:

Figure: NoSQL systems move the query to a data node, but don’t move data to a
query node. In this example, all incoming queries arrive at query analyzer nodes.
These nodes then forward the queries to each data node. If a data node has matches,
the documents are returned to the query node. The query won’t return until every
data node (or a replica standing in for it) has responded to the original query
request. If a data node is down, the query can be redirected to a replica of that
data node.
This approach is somewhat similar to the concept of federated search. Federated
search takes a single query and distributes it to distinct servers and then combines
the results together to give the user the impression they’re searching a single
system.
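The flow described above can be sketched as a small scatter-gather routine. The `DataNode` class, the `NodeDown` exception, and the pairing of each data node with exactly one replica are hypothetical simplifications; a real database would do this over the network and with its own failure detection.

```python
from concurrent.futures import ThreadPoolExecutor

class NodeDown(Exception):
    """Raised when a data node cannot answer a query."""

class DataNode:
    def __init__(self, name, docs, up=True):
        self.name, self.docs, self.up = name, docs, up

    def run(self, predicate):
        """Evaluate the query locally and return only the matching documents."""
        if not self.up:
            raise NodeDown(self.name)
        return [doc for doc in self.docs if predicate(doc)]

def query_shard(shard, predicate):
    """Ask the data node first; redirect to its replica if it is down."""
    node, replica = shard
    try:
        return node.run(predicate)
    except NodeDown:
        return replica.run(predicate)

def scatter_gather(shards, predicate):
    """Send the query to every shard and return once all have responded."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partials = pool.map(lambda shard: query_shard(shard, predicate), shards)
        return [doc for part in partials for doc in part]

docs_a = [{"id": 1, "tag": "x"}, {"id": 2, "tag": "y"}]
docs_b = [{"id": 3, "tag": "x"}]
shards = [
    (DataNode("a", docs_a), DataNode("a-replica", docs_a)),
    (DataNode("b", docs_b, up=False), DataNode("b-replica", docs_b)),
]
print(scatter_gather(shards, lambda doc: doc["tag"] == "x"))
```

The caller sees one combined result list, which is what makes this resemble federated search: the fan-out to nodes and the redirect to a replica stay hidden behind a single query.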
