ElasticSearch, A Quick Intro

Elasticsearch, a quick intro

Ahmed El Taweel
@iAhmedeltaweel
[email protected]
Problem, Search!!
Why is search hard?

● Volume
● Complexity
● Diversity
● Search queries made wrong :D
Database Search

● Full scan
○ Slow
○ Complex
○ Slow, Slow, slow

● Full Text search ????


○ Works, but!
■ Auto complete / correct
Inverted index explained!
Theory
ES uses an inverted index to do lookups

● Term dictionary
● Postings list
● Term vector

Diagram reference: here
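
The idea of a term dictionary pointing to postings lists can be sketched in a few lines of Python. This is only a toy illustration, not how Lucene stores its index: the function names `build_inverted_index` and `lookup`, the sample documents, and the whitespace tokenization are all assumptions made for this example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to a sorted postings list of document ids."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive analysis (assumption): lowercase and split on whitespace.
        for term in text.lower().split():
            postings[term].add(doc_id)
    # The sorted keys play the role of the term dictionary.
    return {term: sorted(ids) for term, ids in sorted(postings.items())}


def lookup(index, query):
    """Return ids of documents that contain every query term (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "the quick brown fox",
    2: "the lazy dog",
    3: "quick brown dogs",
}
index = build_inverted_index(docs)
```

A lookup such as `lookup(index, "quick brown")` only intersects two short postings lists, which is why it avoids the full scan described earlier.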


Tokenization 101
Text Analysis

Tokenization

Breaking a text down into smaller chunks, mostly words.
“Hello world from Ahmed” => [hello, world, from, ahmed]

Normalization

the quick brown fox jumps
● ‘Quick’ can be lowercased: ‘quick’.
● ‘foxes’ can be stemmed, or reduced to its root word: ‘fox’.
● ‘jump’ and ‘leap’ are synonyms and can be indexed as a single word: ‘jump’.

Diagram reference: here
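
A minimal sketch of the two analysis steps, tokenization followed by normalization. The `STEMS` and `SYNONYMS` tables are hand-written stand-ins (assumptions for this example) for a real stemmer and synonym filter; Elasticsearch analyzers are configured, not coded like this.

```python
import re

# Hypothetical lookup tables standing in for a real stemmer and synonym filter.
STEMS = {"foxes": "fox", "jumps": "jump", "leaps": "leap"}
SYNONYMS = {"leap": "jump"}


def tokenize(text):
    """Break the text into word tokens."""
    return re.findall(r"\w+", text)


def normalize(token):
    """Lowercase, then stem, then map synonyms to a single term."""
    token = token.lower()
    token = STEMS.get(token, token)
    return SYNONYMS.get(token, token)


def analyze(text):
    """Run tokenization and normalization, as done at index and query time."""
    return [normalize(token) for token in tokenize(text)]
```

With these tables, `analyze("Hello world from Ahmed")` yields `['hello', 'world', 'from', 'ahmed']`, and both ‘jumps’ and ‘leaps’ end up indexed as ‘jump’.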


Elasticsearch,
Really!
What

● 13 years old. Built on Apache Lucene. Java-based.
● Provides a distributed, multitenant-capable full-text search engine.
● HTTP web interface. JSON documents.
● Commonly used for:
○ log analytics.
○ Full-text search.
○ Operational intelligence use cases with Kibana.
Relational DB      Elasticsearch

DB server          ES node

Table              Index

Table schema       Mapping

Row                Document

Column             Field

Diagram reference: here
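
To make the analogy concrete, here is a sketch of a mapping and a document written as Python dictionaries. The `customer` fields (`name`, `age`) and the helper `fields_not_in_mapping` are hypothetical, chosen only for illustration; the mapping layout follows the `mappings` / `properties` body used when creating an index in Elasticsearch 7 and later.

```python
import json

# Hypothetical mapping: plays the role of a table schema.
customer_mapping = {
    "mappings": {
        "properties": {
            "name": {"type": "text"},
            "age": {"type": "integer"},
        }
    }
}

# One document is the counterpart of one row; each key is a field (a column).
customer_doc = {"name": "John Doe", "age": 42}


def fields_not_in_mapping(doc, mapping):
    """List document fields that the mapping does not declare."""
    declared = mapping["mappings"]["properties"]
    return sorted(set(doc) - set(declared))


# Documents travel as JSON over HTTP.
customer_json = json.dumps(customer_doc)
```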


Take care

“There ain't no such thing as a free lunch”

● Complexity
● Resource-intensive
● Data loss risk
● Query optimization
● Security
● Version compatibility

Near real-time: a newly indexed document becomes searchable after about 1 second (the default refresh interval).
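
The near real-time behaviour can be pictured with a toy model: writes land in a buffer and only become visible to search after a refresh, which Elasticsearch runs about once per second by default. The `ToyIndex` class below is purely an assumption for illustration and does not mirror Elasticsearch internals.

```python
class ToyIndex:
    """Toy model: documents are buffered on write and searchable only after refresh()."""

    def __init__(self):
        self._buffer = []       # indexed, but not yet visible to search
        self._searchable = []   # visible to search

    def index(self, doc):
        self._buffer.append(doc)

    def refresh(self):
        # Elasticsearch does this periodically; here it is called by hand.
        self._searchable.extend(self._buffer)
        self._buffer.clear()

    def search(self, word):
        word = word.lower()
        return [doc for doc in self._searchable if word in doc.lower().split()]
```

A test that indexes a document and searches for it straight away may therefore see nothing until a refresh has happened.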


Document Journey
Indexing

Diagram reference: here and here


Searching

Diagram reference: here


API Convention
The Elasticsearch APIs use JSON over HTTP.
API Types

Document APIs      Single & multi-document API

Search APIs        Search across all indices in ES

Aggregation API    Aggregation for searched data

Index APIs         Operation at the index level.

Cluster APIs       Operation at the cluster level.


API Convention
Check the cluster health >>> GET -> /_cat/health?v

List all nodes in the cluster >>> GET -> /_cat/nodes?v

List all indices >>> GET -> /_cat/indices?v

Create an index >>> PUT -> /customer?pretty

Index a document with an id >>> PUT -> /customer/_doc/1?pretty {"name": "John Doe"}

Index a document without an id >>> POST -> /customer/_doc?pretty { ... }

Retrieve a document by id >>> GET -> /customer/_doc/1?pretty

Search documents >>> GET -> /my_index/_search { … }

Delete an index >>> DELETE -> /customer?pretty
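
The same calls can be issued from code with nothing but the Python standard library. This is a sketch under several assumptions: a cluster listening on `http://localhost:9200` without authentication, the `_doc` endpoint of Elasticsearch 7 and later, and a hypothetical `match` query on the `name` field. The helpers `build_request` and `send` are names invented for this example.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local cluster, no authentication


def build_request(method, path, body=None):
    """Build an HTTP request with an optional JSON body."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if body is not None else {}
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def send(request):
    """Send the request and decode the JSON reply (needs a running cluster)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


create_index = build_request("PUT", "/customer")
index_doc = build_request("PUT", "/customer/_doc/1", {"name": "John Doe"})
get_doc = build_request("GET", "/customer/_doc/1")
# Hypothetical query: full-text match on the "name" field.
search = build_request("GET", "/customer/_search", {"query": {"match": {"name": "john"}}})
delete_index = build_request("DELETE", "/customer")
```

Calling `send(index_doc)` against a running cluster would index the document; building the requests separately keeps them inspectable without a cluster.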


Demo

Materials: https://github.com/ahmedeltaweel/elasticsearch-session
Testing
Testing
● Query
○ Accuracy
■ Edge cases
○ Performance
■ Metrics
● Data
○ Consistency
○ Mapping
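
One possible way to put a number on query accuracy is to compare the ids a query returns with a hand-labelled set of relevant ids. The sketch below computes precision and recall; the function name `precision_recall` and the choice to report 0.0 for empty sets are assumptions for this example, not something the session materials prescribe.

```python
def precision_recall(returned_ids, relevant_ids):
    """Compare returned document ids with the ids judged relevant.

    precision = share of returned documents that are relevant
    recall    = share of relevant documents that were returned
    Empty sets yield 0.0 (a convention assumed for this sketch).
    """
    returned, relevant = set(returned_ids), set(relevant_ids)
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Edge cases such as an empty result set or a query with no relevant documents are worth covering explicitly, since they are where such metrics tend to break.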
Q&A
