Moving Queries To The Data, Not Data To The Queries
This replication strategy works well in most cases. There are only a few times when
you must be concerned about the lag time between a write to the read/write node
and a client reading that same record from a replica. One of the most common
operations after a write is a read of that same record. If a client does a write and then
an immediate read from that same node, there’s no problem. The problem occurs if the
read goes to a replica node before the update has been replicated to it. This is an example of
an inconsistent read. The best way to avoid this type of problem is to direct a client’s
reads to the write node after that client has written. This logic can be added to
a session or state management system at the application layer. Almost all distributed
databases relax database consistency rules when a large number of nodes permit
writes. If your application needs fast read/write consistency, you must deal with it at
the application layer.
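One way to picture the application-layer logic described above is a small session-aware router. The sketch below is illustrative only: the class name SessionRouter, the pin_seconds window, and the node names are assumptions made for this example, not part of any particular database or driver. It sends a session's reads to the read/write node for a fixed window after that session writes, and otherwise rotates reads across the replicas.

```python
import time


class SessionRouter:
    """Hypothetical router: pins a session's reads to the write node after a write."""

    def __init__(self, primary, replicas, pin_seconds=5.0, clock=time.monotonic):
        self.primary = primary            # the read/write node
        self.replicas = list(replicas)    # read-only copies
        self.pin_seconds = pin_seconds    # assumed upper bound on replication lag
        self.clock = clock                # injectable so the logic can be tested
        self._last_write = {}             # session_id -> time of last write
        self._next = 0                    # round-robin counter for replicas

    def record_write(self, session_id):
        """Remember when this session wrote; writes always go to the primary."""
        self._last_write[session_id] = self.clock()
        return self.primary

    def node_for_read(self, session_id):
        """Pick the node that should serve this session's next read."""
        last = self._last_write.get(session_id)
        if last is not None and self.clock() - last < self.pin_seconds:
            # The session wrote recently, so a replica may still be stale.
            return self.primary
        if not self.replicas:
            return self.primary
        node = self.replicas[self._next % len(self.replicas)]
        self._next += 1
        return node
```

The fixed window is a simplification. It only helps if pin_seconds is longer than the actual replication lag, so a real system would more likely track a replication position or version than a wall-clock interval.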
4. Letting the database distribute queries evenly to data nodes:
In order to get high performance from queries that span multiple nodes, it’s important
to separate the concerns of query evaluation from query execution. Figure shows
this structure:
Figure NoSQL systems move the query to a data node, but don’t move data to a
query node. In this example, all incoming queries arrive at query analyzer nodes.
These nodes then forward each query to every data node. If a data node has matches, it
returns the matching documents to the query node. The query won’t return until every data
node (or one of its replicas) has responded to the original query request. If
a data node is down, the query can be redirected to a replica of that data node.
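The flow just described can be sketched as a small scatter-gather routine. This is a minimal in-memory illustration, not how any specific NoSQL product implements it: DataNode, NodeDown, query_shard, and scatter_gather are hypothetical names, and each "shard" is simply a list holding a data node followed by its replicas. The query (a predicate) travels to each data node, only matching documents travel back, and a node that is down is skipped in favor of its replica.

```python
from concurrent.futures import ThreadPoolExecutor


class NodeDown(Exception):
    """Raised when a data node cannot answer a query."""


class DataNode:
    """Hypothetical data node holding a list of documents (dicts)."""

    def __init__(self, name, documents, up=True):
        self.name = name
        self.documents = documents
        self.up = up

    def run_query(self, predicate):
        # The query runs where the data lives; only matches are returned.
        if not self.up:
            raise NodeDown(self.name)
        return [doc for doc in self.documents if predicate(doc)]


def query_shard(shard, predicate):
    """Try the shard's data node first, then each replica in order."""
    for node in shard:
        try:
            return node.run_query(predicate)
        except NodeDown:
            continue
    raise NodeDown("no live copy of this shard")


def scatter_gather(shards, predicate):
    """Send the query to every shard in parallel and wait for all answers."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partials = list(pool.map(lambda s: query_shard(s, predicate), shards))
    # Merge the per-shard matches into one result list for the client.
    return [doc for part in partials for doc in part]
```

A usage example under the same assumptions, where the second shard's first node is down and its replica answers instead:

```python
shard_a = [DataNode("a1", [{"id": 1, "color": "red"}, {"id": 2, "color": "blue"}])]
shard_b = [
    DataNode("b1", [{"id": 3, "color": "red"}], up=False),
    DataNode("b2", [{"id": 3, "color": "red"}]),
]
matches = scatter_gather([shard_a, shard_b], lambda doc: doc["color"] == "red")
```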
This approach is somewhat similar to the concept of federated search. Federated
search takes a single query and distributes it to distinct servers and then combines
the results together to give the user the impression they’re searching a single
system.