Datanswers Architecture 12-23-2021
Publishing Information
Authorized users with the DatAnswers User role can use the DatAnswers web
application to:
Search for documents using queries and view the result set
DATANSWERS ARCHITECTURE 1
DatAnswers Components
SyncUnit
DSP Server
Chapter 2 DATANSWERS ARCHITECTURE
The search cluster comprises several machines (nodes) that hold the indexed data.
The index is split among all nodes in the cluster. Each part of the partitioned index
is called a shard. Each shard can have zero or more replicas. Replicas protect
against failures and downtime, and increase the number of concurrent users that
can use DatAnswers. When a search request is received from the web
server, it is distributed to all shards in the cluster. Each shard retrieves the
relevant documents, and the aggregated results are then returned to the web server.
The returned data consists only of documents that the user has permission to read or view.
Documents are evenly distributed among all shards. As the number of documents
grows, additional shards should be added. As the number of queries increases,
additional replicas should be added.
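The request flow described above can be sketched in a few lines of Python. This is only an illustrative model, not the DatAnswers implementation: the class names `Shard` and `Cluster`, the hash-based placement of documents, and the substring match are all assumptions made for the example.

```python
from zlib import crc32


class Shard:
    """Hypothetical shard: holds one part of the partitioned index."""

    def __init__(self):
        self.docs = {}  # document id -> document text

    def index(self, doc_id, text):
        self.docs[doc_id] = text

    def search(self, term):
        # Return the ids of local documents that contain the term.
        term = term.lower()
        return [doc_id for doc_id, text in self.docs.items() if term in text.lower()]


class Cluster:
    """Hypothetical cluster: spreads documents over shards and fans out queries."""

    def __init__(self, num_shards):
        self.shards = [Shard() for _ in range(num_shards)]

    def shard_for(self, doc_id):
        # Assumption: a stable hash of the id spreads documents across shards.
        return self.shards[crc32(doc_id.encode("utf-8")) % len(self.shards)]

    def index(self, doc_id, text):
        self.shard_for(doc_id).index(doc_id, text)

    def search(self, term):
        # Send the query to every shard, then aggregate the partial results.
        hits = []
        for shard in self.shards:
            hits.extend(shard.search(term))
        return sorted(hits)


cluster = Cluster(num_shards=3)
cluster.index("a.txt", "quarterly budget report")
cluster.index("b.txt", "holiday schedule")
cluster.index("c.txt", "budget forecast")
results = cluster.search("budget")
```

In this model, adding shards reduces the number of documents each shard holds, while replicas (not modeled here) would let several copies of the same shard answer queries in parallel.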
SyncUnit
The SyncUnit is used as temporary storage for files that are pending indexing, security
data, and configuration settings.
Each Data Classification Engine service that crawls documents sends the documents
to be indexed to the SyncUnit. Each cluster node then takes the files to be indexed,
indexes them, and deletes them from the SyncUnit.
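The hand-off between the crawler and the cluster nodes can be sketched as a simple drop-and-consume queue. The example below models the SyncUnit as a plain directory, which is an assumption for illustration only; the function names `crawler_submit` and `node_index_pending` are hypothetical and do not correspond to any DatAnswers API.

```python
import tempfile
from pathlib import Path


def crawler_submit(sync_unit: Path, name: str, content: str) -> None:
    """Crawler side: place a document in the SyncUnit to await indexing."""
    (sync_unit / name).write_text(content, encoding="utf-8")


def node_index_pending(sync_unit: Path, index: dict) -> int:
    """Node side: index every pending file, then delete it from the SyncUnit."""
    processed = 0
    for path in sorted(sync_unit.iterdir()):
        if not path.is_file():
            continue
        index[path.name] = path.read_text(encoding="utf-8")  # "index" the file
        path.unlink()  # remove it once it has been indexed
        processed += 1
    return processed


with tempfile.TemporaryDirectory() as tmp:
    sync_unit = Path(tmp)
    crawler_submit(sync_unit, "doc1.txt", "alpha")
    crawler_submit(sync_unit, "doc2.txt", "beta")

    node_index = {}
    processed = node_index_pending(sync_unit, node_index)
    remaining = list(sync_unit.iterdir())
```

After the node has run, the SyncUnit directory is empty again, which mirrors its role as temporary storage rather than a permanent store.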
DSP Server
The DSP framework is used to send general and user configuration settings.
It is also used to calculate the security level for each file in the DW scope. The security
calculations are stored on the SyncUnit and are loaded by each shard. The security
data is then used to filter out files that the user performing the search does not have
read permissions on.
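The filtering step can be illustrated with a minimal sketch. The shape of the security data (a mapping from file name to the set of users with read permission) and the function name `filter_by_permission` are assumptions for the example; the actual format of the security calculations is not described here. The sketch also assumes that a file with no security entry is hidden.

```python
def filter_by_permission(hits, security_data, user):
    """Keep only the hits that the given user is allowed to read."""
    # Assumption: files without a security entry are treated as not readable.
    return [doc for doc in hits if user in security_data.get(doc, set())]


# Hypothetical security data, as a shard might load it from the SyncUnit.
security_data = {
    "a.txt": {"alice", "bob"},
    "c.txt": {"bob"},
}

hits = ["a.txt", "c.txt", "d.txt"]
visible_to_alice = filter_by_permission(hits, security_data, "alice")
visible_to_bob = filter_by_permission(hits, security_data, "bob")
```

Here the same set of hits yields different result sets for different users, so each user sees only the files they have read permissions on.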