Elasticsearch: Getting Started With Elasticsearch

Elasticsearch is an open-source, highly scalable full-text search and analytics engine that allows users to crawl through large volumes of data rapidly. It is document-based, schema-less, and uses JSON documents. Elasticsearch is distributed by nature and uses shards and replicas to improve performance and avoid failures. Some major advantages are its ability to handle structured and unstructured data, scale easily, and support features like fuzzy search and faceted search. It is commonly used for product search, log analysis, and alerting applications. Compared to relational databases, Elasticsearch is more suitable for semi-structured data, uses eventual consistency over ACID transactions, and has no predefined schema.

Uploaded by

Yathindra sheshappa


Elasticsearch

Getting Started with Elasticsearch

Elasticsearch is an open-source, highly scalable full-text search and analytics engine. It lets you
search through large volumes of data rapidly and is generally used in applications that require
complex search. It is developed in Java and was originally released under the Apache License 2.0.
Many large companies around the world use it today.

In this article, we will cover the following topics:

• What is Elasticsearch?
• Elasticsearch Features
• Elasticsearch Architecture
• Advantages of Elasticsearch
• Elasticsearch Use-cases
• Elasticsearch Vs. RDBMS
• Elasticsearch Vs. MongoDB
• Elasticsearch Vs. Solr
• Current Demand and Future of Elasticsearch

What is Elasticsearch?
First, let us understand why Elasticsearch was created. Consider customers looking for product
information in a huge product catalog. If the system takes too long to retrieve information because
of the volume of data, the user experience suffers and potential customers may be lost. An RDBMS
tends to be slow at this kind of search over large amounts of data. Elasticsearch was created to
overcome this problem.

Elasticsearch is a document-based system that stores, manages, and retrieves document-oriented or
semi-structured data. Data is stored as JSON documents. It is also schema-less. It is a NoSQL data
store built on the Lucene search engine library.

Elasticsearch uses a Query Domain Specific Language (Query DSL) to interact with data. Queries are
written in JSON. Query DSL is designed so that complex, real-world search logic can be expressed in
a single query.
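
To make the idea of a JSON query concrete, here is a minimal sketch of a Query DSL request body
built as a Python dictionary. The field names (name, price, in_stock) and the helper
build_product_query are hypothetical; bool, must, filter, match, range, and term are standard
Query DSL clauses. The body is what would be sent to a search endpoint; no server is contacted here.

```python
import json


def build_product_query(text, max_price):
    """Build a Query DSL body: full-text match plus exact filters."""
    return {
        "query": {
            "bool": {
                # "must" clauses contribute to the relevance score.
                "must": [{"match": {"name": text}}],
                # "filter" clauses only include or exclude documents.
                "filter": [
                    {"range": {"price": {"lte": max_price}}},
                    {"term": {"in_stock": True}},
                ],
            }
        },
        "size": 10,
    }


body = build_product_query("wireless headphones", 150)
print(json.dumps(body, indent=2))
```

The full-text condition and the two filters travel together in one request, which is the sense in
which a single query can carry the whole search logic.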

Let us explore Elasticsearch features to understand what it offers.

Elasticsearch Features
Below are the features offered by Elasticsearch:

• Elasticsearch is well suited to both structured and unstructured data.
• Elasticsearch can serve as an alternative document store to MongoDB and RavenDB.
• Elasticsearch relies on denormalization to improve search performance.
• Many large organizations, such as Wikipedia, GitHub, and Stack Overflow, use Elasticsearch for
their search engines.
• It is an open-source technology.
• It is easy to use and developer friendly.
• The Elasticsearch community is very active and works to keep Elasticsearch compatible with a wide
range of tools.

Elasticsearch Architecture
Elasticsearch is not primarily a data store, although technically it can be used as one.
Elasticsearch stores each document together with a version number. If two processes write to the
same document simultaneously, the latest version is kept. It does not support ACID properties the
way a relational database does.
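
The versioning behaviour can be illustrated with a small in-memory sketch. This is not
Elasticsearch code: VersionedStore and its methods are hypothetical, and the optional
expected_version check is only an assumption about how a client could detect a conflicting write
instead of silently overwriting.

```python
class VersionConflict(Exception):
    """Raised when a write is based on a stale version."""


class VersionedStore:
    """Toy document store that bumps a version number on every write."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, document)

    def write(self, doc_id, document, expected_version=None):
        current_version = self._docs.get(doc_id, (0, None))[0]
        # With no expected_version the write always wins (latest version kept).
        if expected_version is not None and expected_version != current_version:
            raise VersionConflict(
                f"expected v{expected_version}, found v{current_version}"
            )
        new_version = current_version + 1
        self._docs[doc_id] = (new_version, dict(document))
        return new_version

    def read(self, doc_id):
        return self._docs[doc_id]


store = VersionedStore()
v1 = store.write("p1", {"price": 100})  # first write creates version 1
v2 = store.write("p1", {"price": 90})   # second write overwrites as version 2
try:
    # A second writer still believes the document is at version 1.
    store.write("p1", {"price": 80}, expected_version=1)
    conflict = False
except VersionConflict:
    conflict = True
print(store.read("p1"), conflict)
```

Without the version check the last writer simply wins; with it, the stale write is rejected and
the caller can re-read and retry.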

Let us understand its architecture by exploring the concepts below.

• Nodes and Clusters

A node is a single instance of Elasticsearch. Usually, one instance runs on each machine. A cluster
is a collection of nodes that communicate with each other to read from and write to an index. A
cluster needs a unique name so that unintended nodes do not join it. A master node manages the whole
cluster and is responsible for cluster-wide changes such as adding or removing a node and creating
or deleting indices. Every cluster and every node has a unique name.

Each node in a cluster contributes to the searching and indexing capabilities of the cluster. For
example, when we run a search query, each node executes it against the data it stores. Each node
supports searching, indexing, and manipulating existing data.
• Documents and Indices
Every data item stored in the cluster is a document. A document is a JSON object, comparable to a
row in database terminology. For example, to store a student, you add one object with name and
standard as its properties. Data is spread across the nodes, so how is it organized? Documents are
stored under indices. An index is a collection of documents that have similar properties or are
logically related, for instance one index each for orders, products, and customers.

Each document has a unique ID, which can be assigned by Elasticsearch or supplied by the user when
the document is added to an index. A document is uniquely identified by its ID and its index. There
is no practical limit on the number of documents that can be added to an index.
Indices are identified by name, and the index name is used when searching for documents.
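
As a sketch of how documents, IDs, and indices relate, the toy class below stores JSON-like objects
per index and assigns an ID when none is supplied. ToyCluster, index_document, and get are
hypothetical names for illustration, not part of any Elasticsearch client; the students index and
its fields follow the student example above, and the sample names are made up.

```python
import uuid


class ToyCluster:
    """In-memory stand-in: index name -> {document ID -> document}."""

    def __init__(self):
        self._indices = {}

    def index_document(self, index, document, doc_id=None):
        # Generate an ID when the caller does not supply one.
        if doc_id is None:
            doc_id = uuid.uuid4().hex
        self._indices.setdefault(index, {})[doc_id] = dict(document)
        return doc_id

    def get(self, index, doc_id):
        # A document is addressed by its index and its ID together.
        return self._indices[index][doc_id]


cluster = ToyCluster()
cluster.index_document("students", {"name": "Asha", "standard": 8}, doc_id="1")
auto_id = cluster.index_document("students", {"name": "Ravi", "standard": 9})

print(cluster.get("students", "1"))
print(cluster.get("students", auto_id)["name"])
```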
• Shards and replicas
Elasticsearch uses Lucene for fast data retrieval, applying the power of the Lucene index in a
distributed system. Each shard is an individual instance of a Lucene index. As data volume grows,
the performance of a single index degrades. To overcome this, Elasticsearch uses shards to divide
an index into multiple pieces. Shards are important for two reasons:
1. Shards let us split the content horizontally.
2. Shards allow operations to run in parallel across multiple nodes, which increases performance.
Replicas exist to protect against unexpected failures. Replica shards, as the name implies, are
copies of an index's shards. Replicas are important in the Elasticsearch architecture for two
reasons:
1. If a shard or node fails, the replica keeps the data available. A replica shard is never
allocated to the same node as its primary shard.
2. Replica shards increase throughput and performance because searches can run on them in parallel
as well.
When creating an index, we can choose the number of shards and replicas. The number of replicas can
also be changed dynamically at any time afterwards.
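
The sketch below illustrates the two ideas in this section under simplifying assumptions: a
document is routed to a primary shard by hashing its ID modulo the number of primary shards, and
each replica is placed on a different node from its primary. zlib.crc32 stands in for the real hash
function, and the round-robin placement in allocate is a hypothetical simplification, not
Elasticsearch's actual allocator. number_of_shards and number_of_replicas are the index settings
chosen at creation time; the node names and document IDs are made up.

```python
import zlib

# Index settings as they would appear in an index-creation request body.
settings = {"settings": {"number_of_shards": 3, "number_of_replicas": 1}}


def route(doc_id, num_primary_shards):
    """Pick the primary shard for a document ID (crc32 is a stand-in hash)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


def allocate(num_shards, num_replicas, nodes):
    """Place every copy of a shard on a distinct node, round-robin."""
    if num_replicas + 1 > len(nodes):
        raise ValueError("need at least as many nodes as copies per shard")
    layout = {}
    for shard in range(num_shards):
        # Copy 0 is the primary; the remaining copies are replicas.
        layout[shard] = [
            nodes[(shard + copy) % len(nodes)] for copy in range(num_replicas + 1)
        ]
    return layout


n_shards = settings["settings"]["number_of_shards"]
n_replicas = settings["settings"]["number_of_replicas"]
layout = allocate(n_shards, n_replicas, ["node-a", "node-b", "node-c"])

for doc_id in ["order-1", "order-2", "order-3"]:
    shard = route(doc_id, n_shards)
    print(doc_id, "-> shard", shard, "on", layout[shard])
```

Because the shard is derived from the ID and the shard count, the same document always lands on
the same shard, and losing any single node still leaves one copy of every shard.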

Elasticsearch Advantages
Below are a few advantages of Elasticsearch:

• Elasticsearch is built on Lucene, a full-featured information retrieval library, so it offers
some of the most efficient and powerful full-text search capabilities of any open-source product.
Lucene is also widely known among developers.
• Elasticsearch implements many features such as faceted search, customized stemming, and
customized tokenization (splitting text into words).
• Elasticsearch supports fuzzy search, so matches are found even when the search text contains
spelling mistakes.
• Elasticsearch supports an IntelliSense-like autocomplete feature that completes your search text
by predicting it from your search history or from existing tags, as Google search does.
• Because Elasticsearch is API driven, any action can be performed through a RESTful API.
• Elasticsearch records data changes in a transaction log, which reduces the risk of data loss.
• Because Elasticsearch is distributed by nature, it is easy to scale and to integrate into an
organization.
• Elasticsearch supports faceted search, which is like applying multiple filters to the data
together with a classification system over them. This makes it more robust than plain text search.
• Elasticsearch handles multi-tenancy well, since multiple tenants can be served from one large
Elasticsearch index.
• With Elasticsearch's Query DSL, it is easy to prepare complex queries and tune them precisely.
Query DSL also provides a way to rank and group the results.
• Because Elasticsearch uses JSON objects, it is easy to integrate with many programming
languages.
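
Fuzzy matching is commonly based on edit distance, the number of single-character edits needed to
turn one word into another. The sketch below implements plain Levenshtein distance and uses it to
tolerate a misspelled search term. The function names, the sample vocabulary, and the max_edits
threshold are illustrative assumptions; this shows the idea, not Elasticsearch's internal
implementation.

```python
def edit_distance(a, b):
    """Levenshtein distance using a rolling row of the DP table."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def fuzzy_search(term, vocabulary, max_edits=2):
    """Return vocabulary words within max_edits of the search term."""
    return [word for word in vocabulary if edit_distance(term, word) <= max_edits]


vocabulary = ["laptop", "desktop", "tablet", "monitor"]
print(fuzzy_search("labtop", vocabulary))  # prints ['laptop']
```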

Elasticsearch Use-cases
Below are a few use-cases for Elasticsearch:

• An online store that lets customers explore all the products it sells. In this case, you can use
Elasticsearch to store the whole product inventory and catalog, and to offer search and
autocomplete.
• A scenario where you need to store logs or transactions in order to analyze trends, summaries,
anomalies, or statistics. In this case, you can use Logstash, a part of the ELK Stack
(Elasticsearch/Logstash/Kibana), to collect and parse your data and feed it into Elasticsearch.
• Have you seen a "Notify me when this item is in stock" or "Notify me when the price of this item
drops" button on e-commerce sites? This feature can be built with Elasticsearch. Using reverse
search, you can watch price or stock movements and send alerts to customers once their conditions
are satisfied.
• A requirement to analyze data quickly and visualize it. In this case, Kibana works well with
Elasticsearch: Elasticsearch stores the data and Kibana visualizes it in custom dashboards. Kibana
is also part of the ELK Stack.
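
The "notify me" use-case inverts the usual flow: the conditions are stored first, and each incoming
update is matched against them. A minimal in-memory sketch of that idea follows. AlertRegistry, its
methods, and the update fields are hypothetical; a real deployment would store the conditions as
queries in Elasticsearch rather than as Python functions.

```python
class AlertRegistry:
    """Stores customer conditions and matches incoming product updates."""

    def __init__(self):
        self._alerts = []  # list of (customer, product_id, condition)

    def register(self, customer, product_id, condition):
        self._alerts.append((customer, product_id, condition))

    def match(self, update):
        """Return customers whose stored condition holds for this update."""
        return [
            customer
            for customer, product_id, condition in self._alerts
            if product_id == update["product_id"] and condition(update)
        ]


registry = AlertRegistry()
# "Notify me if the price falls below 500."
registry.register("alice", "tv-42", lambda u: u["price"] < 500)
# "Notify me when the item is in stock."
registry.register("bob", "tv-42", lambda u: u["stock"] > 0)

update = {"product_id": "tv-42", "price": 480, "stock": 0}
print(registry.match(update))  # prints ['alice']
```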

Elasticsearch Vs. RDBMS

Let us compare how Elasticsearch differs from an RDBMS.

Elasticsearch                        | RDBMS
Semi-structured or unorganized data  | Structured and organized data
Eventual consistency                 | Strong consistency
BASE transactions                    | ACID transactions
No predefined schema                 | Data and relationships stored in tables
Index                                | Database
Shard                                | Partition
Type                                 | Table
Document                             | Row
Field                                | Column
Mapping                              | Schema
Everything is indexed                | Index
Query DSL                            | SQL

Elasticsearch Vs. MongoDB

The following table compares Elasticsearch and MongoDB.

Feature              | Elasticsearch                                          | MongoDB
Flexibility          | Schema-precise                                         | Schema-flexible
Speed                | Speed remains constant irrespective of volume of data  | Speed can be increased by adding more shards, but speed will drop if volume of data increases
Security             | Paid plug-in is required to manage access rights       | User management by roles
Scalability          | Simplified scalability                                 | Horizontal scalability better than RDBMS
Concurrency          | Yes                                                    | Yes
Consistency          | Eventual consistency                                   | Eventual consistency
Replication Methods  | Master-slave replication                               | Yes
Partitioning Methods | Sharding                                               | Sharding
Transaction Concepts | No                                                     | No

Elasticsearch Vs. Solr

Below is the comparison between Elasticsearch and Solr.

Feature                         | Elasticsearch                                                 | Solr
License                         | Open Source                                                   | Open Source
Implementation Language         | Java                                                          | Java
Data Schema                     | Schema Free                                                   | Yes
OS                              | All OS with JVM                                               | All OS with JVM and servlet container
Secondary Indices               | Yes                                                           | Yes
Partitioning Methods            | Sharding                                                      | Sharding
MapReduce                       | With Hadoop Integration                                       | No
Consistency                     | Eventual Consistency                                          | Eventual Consistency
Transaction Concepts            | No                                                            | Optimistic Locking
Concurrency                     | Yes                                                           | Yes
APIs                            | Java, RESTful, HTTP/JSON API                                  | Java, RESTful, HTTP API
Supported Programming Languages | .NET, Java, JavaScript, Perl, Scala, PHP, Python, Ruby, Erlang | .NET, Java, JavaScript, Perl, Scala, PHP, Python, Ruby, Erlang, XML
Indexing/Searching              | Better performance of analytical queries                      | Text-oriented
Documentation                   | Lack in documentation                                         | Very well documented
Installation and Configuration  | More intuitive                                                | Detailed documentation

Current Demand and Future of Elasticsearch

Elasticsearch is one of the most popular open-source, distributed, cross-platform, and scalable
search engines. It has grown rapidly since 2010 and has made a strong impression across the IT
industry. Because of this growth, there is high demand for people with Elasticsearch skills, and IT
professionals who know Elasticsearch are valued and well paid. It is trending in the IT industry and
has a bright future due to its ability to handle large amounts of data and search them quickly.

Conclusion
Elasticsearch stands out from its competitors because it is highly scalable and distributed by
nature. If you have a large volume of data and need fast search, Elasticsearch is a very strong
choice.
