ElasticSearch Interview Questions and Answers For Freshers
1) What is Elasticsearch?
Elasticsearch is a distributed, document-oriented NoSQL data store and search engine. It is built on the
Lucene search library and exposes RESTful APIs. It offers simple deployment, high reliability, and easy
management. It also provides advanced queries for detailed analysis, stores all the data centrally, and
executes searches over the documents quickly.
3) What is a Cluster?
A cluster is a collection of one or more nodes that together hold the data and provide federated
indexing and search capabilities across all nodes.
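4) Explain Index
An index is a collection of documents that have similar characteristics. It is stored on the nodes of a
cluster. A node is a single Elasticsearch instance; it is created when an Elasticsearch instance starts.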
13) What Are The Main Operations You Can Perform On A Document?
Here are the important operations performed on documents:
Indexing a document
Fetching documents
Updating documents
Deleting documents
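The four operations map onto REST calls roughly as follows. This is a minimal sketch that only builds the HTTP method, path, and JSON body for each call and does not contact a cluster. The index name `customers`, the document ID, the sample fields, and the helper name `document_requests` are hypothetical, and the paths assume the typeless `_doc` and `_update` endpoints of Elasticsearch 7.x and later.

```python
import json


def document_requests(index, doc_id, doc, partial):
    """Return (method, path, body) tuples for the four basic document operations."""
    return [
        # Index (create or overwrite) a document under an explicit ID.
        ("PUT", f"/{index}/_doc/{doc_id}", json.dumps(doc)),
        # Fetch the document by ID.
        ("GET", f"/{index}/_doc/{doc_id}", None),
        # Partially update the document; changed fields go under "doc".
        ("POST", f"/{index}/_update/{doc_id}", json.dumps({"doc": partial})),
        # Delete the document by ID.
        ("DELETE", f"/{index}/_doc/{doc_id}", None),
    ]


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    for method, path, body in document_requests(
        "customers", 1, {"name": "Ada", "city": "London"}, {"city": "Paris"}
    ):
        print(method, path, body or "")
```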
26) What are the various commands available in Elasticsearch cat API?
Commands available in the cat API include:
cat aliases, cat allocation, cat count, cat fielddata,
cat health, cat indices, cat master, cat pending tasks, cat plugins, cat recovery,
cat repositories, cat snapshots, cat templates
Overview of Elasticsearch
Elasticsearch is an open-source, RESTful, scalable, document-based search engine built on the Apache
Lucene library. It stores, retrieves and manages textual, numerical, geospatial, structured and
unstructured data in the form of JSON documents, using the CRUD REST API or ingestion tools such as
Logstash.
You can use Kibana, an open-source visualization tool, with Elasticsearch to visualize your data and
build interactive dashboards for analysis.
Elasticsearch, built on the Apache Lucene search engine, stores data as JSON documents, which are
indexed for faster searching. Because of indexing, users can search text in JSON documents within 10 seconds.
Q #10) Can you please define Mapping in Elasticsearch?
Answer: Mapping is the schema of the documents stored in an index. It defines how a document and its
fields are indexed and stored by Lucene.
Q #11) What is a Document with respect to Elasticsearch?
Answer: A document is a JSON object that is stored in Elasticsearch. It is comparable to a row in a
relational database table.
Q #12) Can you explain SHARDS with regards to Elasticsearch?
Answer: When the number of documents increases, the hard disk capacity and processing power of a single
node will not be sufficient, and responses to client requests will be delayed. In such a case, the
indexed data is divided into small chunks called shards, which can be spread across nodes and improve
the fetching of results during data search.
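How a document ends up on a particular shard can be sketched as a hash of its routing value taken modulo the number of primary shards. The sketch below is a simplification under stated assumptions: it uses CRC32 from the Python standard library as a stand-in hash (Elasticsearch itself uses a Murmur3 hash), and the helper name `pick_shard` and the sample IDs are hypothetical.

```python
import zlib


def pick_shard(routing_value, number_of_primary_shards):
    """Map a routing value (by default the document ID) to a shard number."""
    # Stand-in hash: Elasticsearch itself uses a Murmur3 hash, not CRC32.
    hashed = zlib.crc32(routing_value.encode("utf-8"))
    return hashed % number_of_primary_shards


if __name__ == "__main__":
    # Hypothetical document IDs spread over three primary shards.
    for doc_id in ["order-1", "order-2", "order-3", "order-4"]:
        print(doc_id, "-> shard", pick_shard(doc_id, 3))
```

Because the result depends on the number of primary shards, the same document would land elsewhere if that number changed, which is one reason the primary shard count is fixed when an index is created.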
Q #13) Can you define REPLICA and what is the advantage of creating a replica?
Answer: A replica is an exact copy of a shard, used to increase query throughput and to achieve high
availability when a node fails or under extreme load conditions. Replicas help to manage requests efficiently.
Q #14) Please explain the procedure to add or create an index in Elasticsearch Cluster?
Answer: To add a new index, the create index API should be used. The parameters that can be supplied
when creating the index are the configuration settings of the index, the field mappings in the index,
and index aliases.
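A create-index request carrying all three parts could look like the sketch below. It only assembles the request and does not send it. The index name `books`, the alias `library`, the field names, and the helper name `create_index_request` are hypothetical; the body layout assumes the typeless mappings of Elasticsearch 7.x and later.

```python
import json


def create_index_request(index, shards, replicas, properties, aliases):
    """Build the method, path, and JSON body for a create-index call."""
    body = {
        # Configuration settings of the index.
        "settings": {"number_of_shards": shards, "number_of_replicas": replicas},
        # Field mappings of the index.
        "mappings": {"properties": properties},
        # Aliases that should point at the new index.
        "aliases": {name: {} for name in aliases},
    }
    return "PUT", f"/{index}", json.dumps(body, indent=2)


if __name__ == "__main__":
    method, path, body = create_index_request(
        "books",
        shards=1,
        replicas=1,
        properties={"title": {"type": "text"}, "isbn": {"type": "keyword"}},
        aliases=["library"],
    )
    print(method, path)
    print(body)
```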
Q #15) What is the syntax or code to delete an index in Elasticsearch?
Answer: You can delete an existing index using the following syntax:
DELETE /<index_name>
Using _all or * in place of the index name deletes all the indices.
Q #16) What is the syntax or code to list all indexes of a Cluster in Elasticsearch?
Answer: You can get the list of indices present in the cluster using the following syntax:
GET /_cat/indices?v
Q #17) Can you tell me the syntax or code to add a Mapping in an Index?
Answer: You can add a mapping to an existing index using the following syntax (versions before 7.0
also take a mapping type in the path):
PUT /<index_name>/_mapping
When we search for a document (a record) in Elasticsearch, we are interested in getting the relevant
information we are looking for. The relevance of each document to the query is calculated by the
Lucene scoring algorithm.
Lucene scores a document based on how frequently the search term appears in that document, how often
the term appears across the index, and how the query is designed using its various parameters.
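The two ingredients named above, term frequency within a document and rarity across the index, can be illustrated with a plain TF-IDF sum. This is a simplified sketch and not Lucene's actual formula (recent Lucene versions default to BM25, which adds length normalization and saturation); the helper name `tf_idf_scores` and the sample documents are hypothetical.

```python
import math


def tf_idf_scores(query, docs):
    """Score each document against the query with a plain TF-IDF sum."""
    tokenized = [doc.lower().split() for doc in docs]
    scores = []
    for tokens in tokenized:
        score = 0.0
        for term in query.lower().split():
            # How often the term occurs in this document.
            tf = tokens.count(term)
            # How many documents in the collection contain the term.
            df = sum(1 for other in tokenized if term in other)
            if tf and df:
                # Terms that are rare across the collection weigh more.
                idf = math.log(1 + len(docs) / df)
                score += tf * idf
        scores.append(score)
    return scores


if __name__ == "__main__":
    docs = ["quick brown fox", "lazy dog", "quick quick fox"]
    for doc, score in zip(docs, tf_idf_scores("quick", docs)):
        print(f"{score:.3f}  {doc}")
```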
Q #20) What are the various possible ways in which we can perform a search in Elasticsearch?
Answer:
Mentioned below are the various possible ways in which we can perform a search in Elasticsearch:
Applying the search API across multiple types and multiple indexes: With the search API, we can
search for an entity across multiple types and indices.
Search request using a Uniform Resource Identifier: We can pass the search parameters along with
the URI (Uniform Resource Identifier).
Search using Query DSL (Domain Specific Language) within the body: The DSL is used to write the
query in the JSON request body.
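The difference between a URI search and a Query DSL search is where the query travels. The sketch below builds both forms for a comma-separated list of indices without sending anything; the index names, the field `city`, and the helper names `uri_search` and `dsl_search` are hypothetical.

```python
import json
from urllib.parse import urlencode


def uri_search(indices, field, value):
    """Build a URI search: the query travels in the `q` parameter."""
    query_string = urlencode({"q": f"{field}:{value}"})
    return "GET", f"/{','.join(indices)}/_search?{query_string}"


def dsl_search(indices, field, value):
    """Build a Query DSL search: the query travels in the JSON body."""
    body = {"query": {"match": {field: value}}}
    return "GET", f"/{','.join(indices)}/_search", json.dumps(body)


if __name__ == "__main__":
    print(uri_search(["customers", "suppliers"], "city", "London"))
    print(dsl_search(["customers", "suppliers"], "city", "London"))
```

The URI form is convenient for quick checks, while the body form scales to compound queries that would be awkward to encode in a query string.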
Q #21) What are the various types of queries that Elasticsearch supports?
Answer: Queries are mainly divided into two types: Full Text or Match Queries and Term-based
Queries.
Text Queries such as basic match, match phrase, multi-match, match phrase prefix, common terms,
query string, simple query string.
Term Queries such as term, exists, type, terms set, range, prefix, ids, wildcard, regexp, and fuzzy.
Q #22) Can you compare Term-based queries and Full-text queries?
Answer: Both are written in the Elasticsearch Query DSL (Domain Specific Language) and sent in the
HTTP request body, which keeps their intent clear and detailed and makes them simpler to tune over time.
Full-text queries analyze the query string before matching, so they suit analyzed text such as the
body of an email.
Term-based queries look up the exact term in the inverted index, a hash map-like data structure,
without analyzing it, so they suit structured values such as keywords, numbers or dates.
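The practical effect of analysis can be seen with a toy inverted index. In this sketch, which is an illustration and not how Elasticsearch is implemented internally, the analyzer only lowercases and splits on whitespace; all function names and sample documents are hypothetical.

```python
def analyze(text):
    """Toy analyzer: split on whitespace and lowercase each token."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each analyzed term to the set of document IDs that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in analyze(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def term_search(index, value):
    """Term-level lookup: the value is used exactly as given."""
    return index.get(value, set())


def match_search(index, text):
    """Full-text lookup: analyze the query first, then OR the term lookups."""
    hits = set()
    for term in analyze(text):
        hits |= index.get(term, set())
    return hits


if __name__ == "__main__":
    idx = build_inverted_index({1: "Quick Brown Fox", 2: "Lazy Dog"})
    print(term_search(idx, "Quick"))       # exact term "Quick" was never indexed
    print(term_search(idx, "quick"))       # the lowercased term was
    print(match_search(idx, "Quick Dog"))  # query is analyzed before lookup
```

A term lookup for `Quick` finds nothing because only the lowercased token was indexed, whereas the full-text lookup analyzes the query the same way the documents were analyzed.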
Q #23) Please explain the working of aggregation in Elasticsearch?
Answer: Aggregations help in collecting and summarizing data from the documents matched by the search
query. Aggregations come in different types for different purposes; metrics aggregations, for example,
include average, minimum, maximum, sum and stats.
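A search body requesting several metrics aggregations on one numeric field might be assembled as in this sketch. The field name `price`, the aggregation names, and the helper name `metrics_aggregation_body` are hypothetical; `"size": 0` asks for the aggregation results only, without the matching documents.

```python
import json


def metrics_aggregation_body(field):
    """Build a search body that computes common metrics over one numeric field."""
    body = {
        "size": 0,  # return aggregation results only, no hits
        "aggs": {
            f"{field}_avg": {"avg": {"field": field}},
            f"{field}_min": {"min": {"field": field}},
            f"{field}_max": {"max": {"field": field}},
            f"{field}_sum": {"sum": {"field": field}},
            # stats bundles count, min, max, avg and sum in one aggregation.
            f"{field}_stats": {"stats": {"field": field}},
        },
    }
    return json.dumps(body, indent=2)


if __name__ == "__main__":
    print(metrics_aggregation_body("price"))
```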
Q #24) Can you tell me data storage functionality in Elasticsearch?
Answer: Elasticsearch is a search engine that is also used as storage: it stores complex data
structures serialized as JSON documents and indexes them for searching.
Q #25) What is an Elasticsearch Analyzer?
Answer: Analyzers are used for text analysis; an analyzer can be either built-in or custom.
An analyzer consists of zero or more character filters, exactly one tokenizer, and zero or more token
filters.
Character filters preprocess the character stream before tokenization, for example by stripping out
HTML tags, by searching the string for keys and replacing them with the related values defined in
a mapping char filter, or by replacing characters that match a specific pattern.
The tokenizer breaks the character stream into tokens. For example, the whitespace tokenizer
breaks the stream whenever it encounters whitespace.
Token filters modify the tokens: they convert them into lower case, remove stop words like ‘a’, ‘an’,
‘the’, or add equivalent synonyms defined by the filter.
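The three stages can be chained as in the following sketch. It imitates the idea in plain Python and is not Elasticsearch's implementation: the HTML stripping is a crude regular expression, the stop word list is limited to the three words mentioned above, and all function names are hypothetical.

```python
import re

STOP_WORDS = {"a", "an", "the"}


def strip_html(text):
    """Character filter: replace anything that looks like an HTML tag with a space."""
    return re.sub(r"<[^>]+>", " ", text)


def whitespace_tokenize(text):
    """Tokenizer: break the character stream on whitespace."""
    return text.split()


def lowercase(tokens):
    """Token filter: convert every token to lower case."""
    return [token.lower() for token in tokens]


def remove_stop_words(tokens):
    """Token filter: drop stop words."""
    return [token for token in tokens if token not in STOP_WORDS]


def analyze(text, char_filters, tokenizer, token_filters):
    """Run character filters, then the single tokenizer, then token filters, in order."""
    for char_filter in char_filters:
        text = char_filter(text)
    tokens = tokenizer(text)
    for token_filter in token_filters:
        tokens = token_filter(tokens)
    return tokens


if __name__ == "__main__":
    print(analyze("<b>The</b> Quick fox", [strip_html], whitespace_tokenize,
                  [lowercase, remove_stop_words]))
```

Order matters in this chain: the stop word filter only recognizes `the` because the lowercase filter ran before it.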
Q #26) Can you list various types of analyzers in Elasticsearch?
Answer: Types of Elasticsearch Analyzer are Built-in and Custom.
Built-in analyzers are further classified as below:
Standard Analyzer: This type of analyzer combines the standard tokenizer, which breaks the character
stream into tokens (splitting any token longer than the configured maximum token length), the lower
case token filter, which converts the tokens into lower case, and the stop token filter, which removes
stop words such as ‘a’, ‘an’, ‘the’ and is disabled by default.
Simple Analyzer: This type of analyzer breaks the character stream into tokens whenever it comes
across a character that is not a letter, such as a number or special character. A simple analyzer
also converts all the tokens into lower case.
Whitespace Analyzer: This type of analyzer breaks the character stream into tokens whenever it comes
across white space. It retains the case of the tokens as it was in the input stream.
Stop Analyzer: This type of analyzer is similar to the simple analyzer, but in addition it removes
stop words such as ‘a’, ‘an’, ‘the’ from the stream. By default it uses the English stop word list.
Keyword Analyzer: This type of analyzer returns the entire character stream as a single token,
unchanged. It can be turned into a custom analyzer by adding filters to it.
Pattern Analyzer: This type of analyzer breaks the character stream into tokens based on the
regular expression defined. The regular expression matches the token separators in the stream, not the
tokens themselves.
Language Analyzer: This type of analyzer is used for analyzing text in a specific language. There
are also plug-ins that support language analysis, such as Stempel, Ukrainian Analysis,
Kuromoji for Japanese, Nori for Korean and the Phonetic plugin. Additional plug-ins provide analyzers
for Indian as well as non-Indian languages, including Asian languages (for example, Japanese,
Vietnamese, Tibetan).
Fingerprint Analyzer: The fingerprint analyzer converts the character stream into lower case,
removes extended characters, sorts and de-duplicates the tokens, and concatenates them into a single token.
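Rough Python approximations of four of these analyzers make the differences concrete. These are sketches of the described behaviour only, not the built-in implementations: the fingerprint version tokenizes with a simple `\w+` pattern instead of the standard tokenizer, and all function names are hypothetical.

```python
import re
import unicodedata


def whitespace_analyzer(text):
    """Split on whitespace and keep the original case."""
    return text.split()


def simple_analyzer(text):
    """Split on anything that is not a letter, then lowercase."""
    return [token.lower() for token in re.findall(r"[^\W\d_]+", text)]


def keyword_analyzer(text):
    """Return the whole input as one unchanged token."""
    return [text]


def fingerprint_analyzer(text):
    """Lowercase, fold to ASCII, de-duplicate, sort, and join into one token."""
    folded = unicodedata.normalize("NFKD", text.lower())
    folded = folded.encode("ascii", "ignore").decode("ascii")
    tokens = sorted(set(re.findall(r"\w+", folded)))
    return [" ".join(tokens)]


if __name__ == "__main__":
    sample = "The 2 QUICK Brown-Foxes"
    for analyzer in (whitespace_analyzer, simple_analyzer, keyword_analyzer, fingerprint_analyzer):
        print(analyzer.__name__, analyzer(sample))
```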
Q #27) How can Elasticsearch Tokenizer be used?
Answer: Tokenizers accept a stream of characters, break it into individual tokens and output a
collection/array of these tokens. Tokenizers are mainly grouped into word-oriented, partial-word, and
structured-text tokenizers.
Q #28) How do Filters work in Elasticsearch?
Answer: Token filters receive text tokens from the tokenizer and can manipulate them so that the
tokens can be compared against search conditions. Filters compare tokens with the searched value,
resulting in a Boolean value, true or false.
The comparison can check whether the value of the search condition matches the filtered token text or
does not match it, matches one of the filtered token texts returned or matches none of the
specified tokens, falls within a given range or outside it, or whether the token text exists in the
search condition or not.
Q #31) What are the functionalities of attributes such as enabled, index and store in Elasticsearch?
Answer:
The enabled attribute is applied when a field should be kept in the stored document but skipped
during indexing. This is done by adding “enabled”: false to the top-level mapping or to an object
field.
The index attribute decides how a field is indexed. In versions before 5.0, a string field accepted
three values:
‘analyzed’, in which the string is analyzed before it is indexed as a full-text field.
‘not_analyzed’, in which the string is indexed as-is to make it searchable, without analyzing it.
‘no’, in which the string is not indexed at all and is not searchable.
From 5.0 onward, index accepts true or false, and the choice between analyzed and unanalyzed strings
is made through the text and keyword field types.
The store attribute controls whether a field value is stored separately so that it can be retrieved on
its own. Even when ‘store’ is set to false, Elasticsearch keeps the original document on disk in the
_source field, so the value can still be retrieved.
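A mapping that uses all three attributes could be written as in this sketch, which assumes the boolean form of `index` found in current versions. The field names `session_data`, `internal_code` and `title` and the helper name `field_options_mapping` are hypothetical.

```python
import json


def field_options_mapping():
    """Build a mapping body that demonstrates enabled, index and store."""
    return {
        "properties": {
            # Kept in _source but neither parsed nor indexed.
            "session_data": {"type": "object", "enabled": False},
            # Not searchable, but still part of _source.
            "internal_code": {"type": "keyword", "index": False},
            # Indexed for full-text search and also stored as a separate field.
            "title": {"type": "text", "store": True},
        }
    }


if __name__ == "__main__":
    print(json.dumps(field_options_mapping(), indent=2))
```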
Q #32) How is a character filter utilized in an Elasticsearch Analyzer?
Answer: A character filter in an Elasticsearch analyzer is not mandatory. These filters manipulate the
input character stream before tokenization, for example by replacing each occurrence of a key with
the corresponding value mapped to it.
We can use the mapping character filter, which takes either a mappings or a mappings_path parameter.
mappings is an array of key => value pairs, whereas mappings_path is the path, absolute or relative
to the config directory, of a file that lists those pairs one per line.
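The replacement behaviour of a mapping character filter can be imitated in a few lines. This sketch is an approximation under stated assumptions, not the filter's real implementation: rules are given in the same `key => value` form, longer keys are tried first, and the helper names and sample rules are hypothetical.

```python
import re


def parse_mappings(rules):
    """Parse rules written as 'key => value' into a dictionary."""
    pairs = {}
    for rule in rules:
        key, value = rule.split("=>")
        pairs[key.strip()] = value.strip()
    return pairs


def apply_mapping_char_filter(text, rules):
    """Replace every occurrence of a key with its mapped value."""
    pairs = parse_mappings(rules)
    if not pairs:
        return text
    # Longest keys first so the longest match at a position wins.
    keys = sorted(pairs, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    return pattern.sub(lambda match: pairs[match.group(0)], text)


if __name__ == "__main__":
    rules = [":) => _happy_", ":( => _sad_"]
    print(apply_mapping_char_filter("I feel :) not :(", rules))
```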
The REST API is platform and language independent; the only constraint is that the data is exchanged
as XML or JSON.
Q #35) While installing Elasticsearch, please explain different packages and their
importance?
Answer: Elasticsearch is available in the following packages:
tar.gz archives for installation on Linux and macOS.
.zip archives for installation on Windows.
deb packages for Debian and Ubuntu-based systems.
rpm packages for Red Hat, CentOS, openSUSE and SLES.
The MSI package for 64-bit Windows systems.
Docker images for running Elasticsearch as Docker containers, which can be downloaded from the Elastic
Docker Registry.
X-Pack API packages are installed along with Elasticsearch and help to get information on the
license, security, migration, and machine learning activities involved in Elasticsearch.
Q #36) What are configuration management tools that are supported by Elasticsearch?
Answer: Ansible, Chef, Puppet and SaltStack are configuration management tools supported by
Elasticsearch and used by DevOps teams.
Q #37) Can you please explain the functionality and importance of the installation of X-Pack for
Elasticsearch?
Answer: X-Pack is an extension that gets installed along with Elasticsearch. Various functionalities of
X-Pack are security (Role-based access, Privileges/Permissions, Roles and User security), monitoring,
reporting, alerting and many more.
Q #38) Can you list X-Pack API types?
Answer: X-Pack API types are listed as below:
(i) Info API: It provides general information on the installed X-Pack features, such as build info,
license info, and features info.
Info API request:
GET /_xpack
(ii) Graph Explore API: The Explore API helps to retrieve and summarize information about the
documents and terms in an Elasticsearch index.
(iii) Licensing APIs: These APIs help to manage licenses: get trial status, start trial, get
basic status, start basic, get license, update license and delete license.
GET /_license
(iv) Machine learning APIs: These APIs perform calendar tasks such as create a calendar,
add and delete a job, add and delete scheduled events, get the calendar, get
scheduled events and delete the calendar; filter tasks such as create, update, get and delete a filter;
and datafeed tasks such as create, update, start, stop, preview and delete a datafeed, and get
datafeed info/statistics.
Job tasks such as create, update, open, close and delete a job, add or delete a job to a calendar, and
get job info/statistics, as well as various other tasks related to model snapshots, results, file
structure and expired data, are also included in the machine learning APIs.
(v) Security APIs: These APIs are used to perform X-Pack security activities, such as authenticate,
clear cache, and privilege and SSL certificate related activities.
(vi) Watcher APIs: These APIs help to watch or observe new documents added into Elasticsearch.
(vii) Rollup APIs: These APIs were introduced as experimental functionality, which may be removed
from Elasticsearch in the future.
(viii) Migration APIs: These APIs help to upgrade X-Pack indices from the previous version to the
latest version.
Q #39) Can you list X-Pack commands?
Answer: X-Pack commands are listed below:
certgen
migrate
setup-passwords
syskeygen
users
Q #40) What is the functionality of cat API in Elasticsearch?
Answer: cat API commands give an overview of the health of an Elasticsearch cluster, including
information on aliases, allocation, indices, and node attributes, to name a few. These cat
commands accept query string parameters and return column headers with their corresponding
values as compact, human-readable text rather than as a JSON document.
Q #41) What are the cat commands from cat API used in Elasticsearch?
Answer:
The cat commands from the cat API are listed below:
(i) Aliases – GET _cat/aliases?v – This command displays the mapping of aliases to indices, as well
as routing and filtering information.
(ii) Allocation – GET _cat/allocation?v – This command displays the disk space allocated for indices
as well as the shard count on each node.
(iii) Count – GET _cat/count?v – This command shows how many documents are present in the
Elasticsearch cluster.
(iv) Fielddata – GET _cat/fielddata?v – This command displays the amount of memory used by each of
the fields per node.
(v) Health – GET _cat/health?v – This command displays the cluster status, such as how long it has
been up and running and how many nodes it has, to help analyze cluster health.
(vi) Indices – GET _cat/indices?v – This command gives information on the number of shards,
documents and deleted documents, and the store sizes of all the shards including their replicas.
(vii) Master – GET _cat/master?v – This command displays information about the node that has been
elected master.
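The request paths above all follow one pattern, which the sketch below captures. It only builds the path string; the helper name `cat_path` is hypothetical, `v` turns on the column headers, and `h` (used here with the `index` and `docs.count` columns of cat indices) restricts the output to the named columns.

```python
from urllib.parse import urlencode

# The cat endpoints listed above.
CAT_ENDPOINTS = ["aliases", "allocation", "count", "fielddata", "health", "indices", "master"]


def cat_path(endpoint, verbose=True, columns=None):
    """Build the request path for a cat command."""
    if endpoint not in CAT_ENDPOINTS:
        raise ValueError(f"unknown cat endpoint: {endpoint}")
    params = {}
    if verbose:
        params["v"] = "true"  # include column headers in the output
    if columns:
        params["h"] = ",".join(columns)  # show only these columns
    path = f"/_cat/{endpoint}"
    if params:
        # Keep commas readable instead of percent-encoding them.
        path += "?" + urlencode(params, safe=",")
    return path


if __name__ == "__main__":
    for name in CAT_ENDPOINTS:
        print("GET", cat_path(name))
    print("GET", cat_path("indices", columns=["index", "docs.count"]))
```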
Interview questions cover nearly every area of Elasticsearch and the ELK stack, including the various
analyzers, filters, token filters and APIs used in Elasticsearch, and this article has given technical
answers to each of them.
We hope you have found the answers to the most frequently asked interview questions. Practice, refer
to and revise these Elasticsearch interview questions and answers to perform confidently in the
technical interview.