A Review of Elastic Search Performance Metrics
Abstract: The most important aspect of a search engine is the search itself. Elastic search is a highly scalable search engine that stores data in a structure optimized for language-based searches. When it comes to operating Elastic search, there are many metrics to track. By using Elastic search to index millions of code repositories as well as critical event data, you can satisfy the search needs of millions of users while providing strategic operational insights in real time that help you iteratively improve customer service. In this paper we study the Elastic search performance metrics to watch, the important Elastic search challenges, and how to deal with them. This should be helpful to anyone new to Elastic search, as well as to experienced users who want a quick start into performance monitoring of Elastic search.
Keywords: Elastic search, Query latency, Index flush, Garbage collection, JVM metrics, Cache metrics.
__________________________________________________*****_________________________________________________
1. INTRODUCTION:

Elastic search is a highly scalable, distributed, open source RESTful search and analytics engine. It is multitenant-capable with an HTTP web interface and schema-free JSON documents. Based on Apache Lucene, Elastic search is one of the most popular enterprise search engines today and is capable of solving a growing number of use cases like log analytics, real-time application monitoring, and click stream analytics. Developed by Shay Banon and released in 2010, it relies heavily on Apache Lucene, a full-text search engine written in Java. Elastic search represents data in the form of structured JSON documents, and makes full-text search accessible via a RESTful API and web clients for languages like PHP, Python, and Ruby. It is also elastic in the sense that it is easy to scale horizontally: simply add more nodes to distribute the load. Today, many companies, including Wikipedia, eBay, GitHub, and Datadog, use it to store, search, and analyze large amounts of data on the fly.

2. ELASTICSEARCH - THE BASIC ELEMENTS

In Elastic search, a cluster is made up of one or more nodes. Each node is a single running instance of Elastic search, and its elasticsearch.yml configuration file designates which cluster it belongs to (cluster.name) and what type of node it can be. Any property set in the configuration file, including the cluster name, can also be specified via a command line argument. The three most common types of nodes in Elastic search are:

Master-eligible nodes
Every master-eligible node is also able to function as a data node. In order to improve reliability in larger clusters, users may launch dedicated master-eligible nodes that do not store any data.

a. Data nodes
Every node that stores data in the form of an index and performs actions related to indexing, searching, and aggregating data is a data node. In larger clusters, you may choose to create dedicated data nodes by adding node.master: false to the config file, ensuring that these nodes have enough resources to handle data-related requests without the additional workload of cluster-related administrative tasks.

b. Client nodes
A client node is designed to act as a load balancer that helps route indexing and search requests. Client nodes help to bear some of the search workload so that data and master-eligible nodes can focus on their core tasks.
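To see which of these roles the nodes in a running cluster are actually playing, the _cat/nodes API can be queried directly. The short Python sketch below is only an illustration: it assumes an unsecured cluster listening on http://localhost:9200 and uses the requests library; the host, port, and security settings will vary from deployment to deployment.

# List each node's name, role, and master flag via the _cat/nodes API.
# Assumes an unsecured cluster at localhost:9200 (adjust as needed).
import requests

resp = requests.get(
    "http://localhost:9200/_cat/nodes",
    params={"v": "true", "h": "name,node.role,master"},
    timeout=10,
)
resp.raise_for_status()
print(resp.text)  # one line per node, e.g. "node-1 dim *"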
Fig: 2 Elastic search data organization

4. ELASTIC SEARCH PERFORMANCE METRICS:
Elasticsearch provides plenty of metrics to detect problems like unreliable nodes, out-of-memory errors, and long garbage collection times. All these metrics are accessible via Elasticsearch's API as well as single-purpose monitoring tools like Elastic's Marvel and universal monitoring services like Datadog.

4.1 Search and indexing performance
In Elasticsearch we have two types of requests, search requests and index requests, which are similar to read and write requests in a traditional database system.

Search performance metrics

Query load: Monitoring the number of queries currently in progress can give you a rough idea of how many requests your cluster is dealing with at any particular moment in time. Consider alerting on unusual spikes or dips that may point to underlying problems. You may also want to monitor the size of the search thread pool queue.

Query latency: Though Elasticsearch does not explicitly provide this metric, monitoring tools can help you use the available metrics to calculate the average query latency by sampling the total number of queries and the total elapsed time at regular intervals. Set an alert if latency exceeds a threshold, and if it fires, look for potential resource bottlenecks, or investigate whether you need to optimize your queries.

Fetch latency: The fetch phase should typically take much less time than the query phase. If this metric is constantly increasing, it could indicate a problem with slow disks, enriching of documents (highlighting relevant text in search results, etc.), or requesting too many results.
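The query latency calculation described above can be sketched in a few lines of Python against the index stats API (_stats/search), which exposes query_total and query_time_in_millis counters. The endpoint and the 60-second sampling interval below are assumptions for illustration; a real monitoring tool would sample continuously and track the fetch counters in the same way.

# Approximate average query latency by sampling the _stats API at two
# points in time and comparing the search counters.
import time
import requests

URL = "http://localhost:9200/_stats/search"   # unsecured local cluster assumed

def search_totals():
    stats = requests.get(URL, timeout=10).json()
    total = stats["_all"]["total"]["search"]
    return total["query_total"], total["query_time_in_millis"]

q1, t1 = search_totals()
time.sleep(60)                                 # illustrative sampling interval
q2, t2 = search_totals()

queries = q2 - q1
if queries > 0:
    print("average query latency: %.1f ms" % ((t2 - t1) / queries))
else:
    print("no queries completed during the sampling window")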
Memory usage
Elasticsearch makes excellent use of any RAM that has not been allocated to the JVM heap. Elasticsearch was designed to rely on the operating system's file system cache to serve requests quickly and reliably. A number of variables determine whether or not Elasticsearch successfully reads from the file system cache. If the segment file was recently written to disk by Elasticsearch, it is already in the cache. However, if a node has been shut off and rebooted, the first time a segment is queried, the information will most likely have to be read from disk. This is one reason why it is important to make sure your cluster remains stable and that nodes do not crash. Generally, it is very important to monitor memory usage on your nodes, and give Elasticsearch as much RAM as possible, so it can leverage the speed of the file system cache without running out of space.

4.4 Cluster health and node availability

Cluster status: If the cluster status is yellow, at least one replica shard is unallocated or missing. Search results will still be complete, but if more shards disappear, you may lose data. If the cluster status is red, at least one primary shard is missing, and you are missing data, which means that searches will return partial results. You will also be blocked from indexing into that shard. Consider setting up an alert to trigger if the status has been yellow for more than 5 min or if the status has been red for the past minute.

Initializing and unassigned shards: When you first create an index, or when a node is rebooted, its shards will briefly be in an "initializing" state before transitioning to a status of "started" or "unassigned", as the master node attempts to assign shards to nodes in the cluster. If shards remain in an initializing or unassigned state too long, it could be a warning sign that the cluster is unstable.
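One simple way to implement such alerts is to poll the _cluster/health API, which reports the overall status together with the number of initializing and unassigned shards. The sketch below is a minimal example assuming an unsecured cluster on localhost; in practice the check would feed whatever alerting system is in use.

# Poll cluster health and flag yellow/red status or stuck shards.
import requests

health = requests.get("http://localhost:9200/_cluster/health", timeout=10).json()

status = health["status"]                      # "green", "yellow", or "red"
initializing = health["initializing_shards"]
unassigned = health["unassigned_shards"]

if status != "green":
    print("WARNING: cluster status is %s" % status)
if initializing or unassigned:
    print("WARNING: %d initializing, %d unassigned shards"
          % (initializing, unassigned))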
4.5 Resource saturation and errors
Elasticsearch nodes use thread pools to manage how threads consume memory and CPU. Since thread pool settings are automatically configured based on the number of processors, it usually does not make sense to tweak them. If the nodes are not able to keep up, we can add more nodes to handle all of the concurrent requests. Fielddata and filter cache usage is another area to monitor, as evictions may point to inefficient queries or signs of memory pressure.

Thread pool queue and rejections
Each node maintains many types of thread pools; the most important ones to monitor are search, index, merge, and bulk. The size of each thread pool's queue represents how many requests are waiting to be served while the node is currently at capacity. The queue allows the node to track and eventually serve these requests instead of discarding them. Thread pool rejections arise once the thread pool's maximum queue size is reached.

Metrics to watch
Thread pool queues: Large queues are not ideal because they use up resources and also increase the risk of losing requests if a node goes down. If you see the number of queued and rejected threads increasing steadily, you may want to try slowing down the rate of requests (if possible), increasing the number of processors on your nodes, or increasing the number of nodes in the cluster. Query load spikes typically correlate with spikes in search thread pool queue size, as the node attempts to keep up with the rate of query requests.

In Elastic search, each field in a document can be stored in one of two forms: as an exact value or as full text. An exact value, such as a timestamp or a year, is stored exactly the way it was indexed, because you do not expect to query 1/1/16 as "January 1st, 2016." If a field is stored as full text, that means it is analyzed; basically, it is broken down into tokens, and, depending on the type of analyzer, punctuation and stop words like "is" or "the" may be removed. The analyzer converts the field into a normalized format that enables it to match a wider range of queries.

Elastic search uses two main types of caches to serve search requests more quickly: fielddata and filter. If caches hog too much of the heap, they may slow things down instead of speeding them up.

Fielddata cache
The fielddata cache is used when sorting or aggregating on a field, a process that basically has to uninvert the inverted index to create an array of every field value per field, in document order.

Filter cache
Filter caches also use JVM heap. In earlier versions, Elastic search automatically cached filtered queries up to a maximum of 10 percent of the heap and evicted the least recently used data. In later versions, Elastic search began optimizing its filter cache based on frequency and segment size (caching only occurs on segments that have fewer than 10,000 documents or less than 3 percent of the total documents in the index).

Cache metrics to watch
Fielddata cache evictions: Ideally, you want to limit the number of fielddata evictions because they are I/O intensive. If you are seeing a lot of evictions and you cannot increase your memory at the moment, Elastic search recommends a temporary fix of limiting the fielddata cache to 20 percent of the heap. Elastic search also recommends using doc values whenever possible because they serve the same purpose as fielddata. However, because they are stored on disk, they do not rely on JVM heap. Although doc values cannot be used for analyzed string fields, they do save fielddata usage when aggregating or sorting on other types of fields.
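To make these thread pool and cache metrics concrete, the sketch below reads per-node queue and rejection counts from the _cat/thread_pool API and fielddata cache size and evictions from the _nodes/stats API. As before, the localhost endpoint is an assumption, and the thresholds at which these numbers become alarming depend on the workload.

# Inspect thread pool pressure and fielddata cache evictions per node.
import requests

BASE = "http://localhost:9200"                 # unsecured local cluster assumed

# Queue sizes and rejection counts for every thread pool on every node.
pools = requests.get(
    BASE + "/_cat/thread_pool",
    params={"h": "node_name,name,queue,rejected", "format": "json"},
    timeout=10,
).json()
for row in pools:
    if row["name"] in ("search", "bulk", "write"):  # the bulk pool is named "write" in newer versions
        print("%(node_name)s %(name)s queue=%(queue)s rejected=%(rejected)s" % row)

# Fielddata cache size and evictions per node.
stats = requests.get(BASE + "/_nodes/stats/indices/fielddata", timeout=10).json()
for node_id, node in stats["nodes"].items():
    fd = node["indices"]["fielddata"]
    print("%s: fielddata %s bytes, %s evictions"
          % (node.get("name", node_id), fd["memory_size_in_bytes"], fd["evictions"]))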
Merging a few segments does not take much time, but merging 10,000 segments all the way down to one segment can take hours. The more merging that must occur, the more resources you take away from fulfilling search requests, which may defeat the purpose of calling a force merge in the first place. It is a good idea to schedule a force merge during non-peak hours, such as overnight, when you do not expect many search or indexing requests.

5.4 Index-heavy workload
Elastic search comes pre-configured with many settings that retain enough resources for searching and indexing data. However, if the usage of Elastic search is heavily skewed towards writes, it makes sense to tweak certain settings to boost indexing performance, even if it means losing some search performance or data replication. One option is to set index.translog.durability to async in the index settings. With this in place, the index will only commit writes to disk upon every sync_interval, rather than after each request, leaving more of its resources free to serve indexing requests.

5.5 Bulk thread pool rejections
Thread pool rejections are typically a sign that you are sending too many requests to your nodes, too quickly. If this is a temporary situation, you can try to slow down the rate of your requests. However, if you want your cluster to be able to sustain the current rate of requests, you will probably need to scale out your cluster by adding more data nodes. In order to utilize the processing power of the increased number of nodes, you should also make sure that your indices contain enough shards to be able to spread the load evenly across all of your nodes.
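As an illustration of the kind of tuning discussed in 5.4, the snippet below relaxes translog durability on a single index through the _settings API. The index name my-index is a placeholder and the endpoint is again assumed to be a local, unsecured cluster; async durability should only be used where losing a few seconds of acknowledged writes after a crash is acceptable.

# Trade some durability for indexing throughput on a write-heavy index.
import requests

resp = requests.put(
    "http://localhost:9200/my-index/_settings",          # my-index is a placeholder
    json={"index": {"translog.durability": "async"}},   # fsync per sync_interval (default 5s) instead of per request
    timeout=10,
)
resp.raise_for_status()
print(resp.json())  # {"acknowledged": true} on success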