Elasticsearch, A Quick Intro
intro
Ahmed El Taweel
@iAhmedeltaweel
[email protected]
The Problem: Search!
Why is search hard?
● Volume
● Complexity
● Diversity
● Badly written search queries :D
Database Search
● Full scan
○ Slow
○ Complex
○ Slow, Slow, slow
Inverted Index
● Term dictionary
● Postings list
● Term vector
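To make these three structures concrete, here is a minimal in-memory sketch of an inverted index. It is a toy, not how Elasticsearch or Lucene store data on disk: the function names `build_index` and `search`, the sample documents, and the whitespace-only tokenization are all assumptions made for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Build a toy inverted index from a {doc_id: text} mapping."""
    postings = defaultdict(list)  # term -> list of doc ids (postings list)
    term_vectors = {}             # doc_id -> {term: [positions]} (term vector)
    for doc_id in sorted(docs):
        vector = defaultdict(list)
        # Assumption: lowercase + whitespace split stands in for real analysis.
        for position, term in enumerate(docs[doc_id].lower().split()):
            vector[term].append(position)
        term_vectors[doc_id] = dict(vector)
        for term in vector:
            postings[term].append(doc_id)
    term_dictionary = sorted(postings)  # sorted list of unique terms
    return term_dictionary, dict(postings), term_vectors


def search(postings, query):
    """Return ids of documents that contain every query term (AND)."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(postings.get(terms[0], []))
    for term in terms[1:]:
        result &= set(postings.get(term, []))
    return sorted(result)


# Hypothetical sample documents.
docs = {
    1: "the quick brown fox",
    2: "the lazy dog",
    3: "quick brown dogs",
}
term_dictionary, postings, term_vectors = build_index(docs)
print(term_dictionary)
print(search(postings, "quick brown"))
```

A lookup only touches the postings lists of the query terms, which is why it avoids the full scan described under Database Search.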
Tokenization
Breaking a text down into smaller chunks, mostly words.
“Hello world from Ahmed” => [hello, world, from, ahmed]
Normalization
the quick brown fox jumps
● ‘Quick’ can be lowercased: ‘quick’.
● ‘foxes’ can be stemmed, or reduced to its root word: ‘fox’.
● ‘jump’ and ‘leap’ are synonyms and can be indexed as a single word: ‘jump’.
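The two steps can be sketched as a tiny analysis pipeline. This is only an illustration: the `STEMS` and `SYNONYMS` tables are hand-written, hypothetical stand-ins for a real stemmer and synonym filter, and the function names are not part of any Elasticsearch API.

```python
import re

# Hypothetical lookup tables; real analyzers use stemming algorithms
# and configurable synonym lists instead.
STEMS = {"foxes": "fox", "jumps": "jump", "leaps": "leap"}
SYNONYMS = {"leap": "jump"}


def tokenize(text):
    """Tokenization: break the text into word chunks."""
    return re.findall(r"\w+", text)


def normalize(token):
    """Normalization: lowercase, then stem, then map synonyms."""
    token = token.lower()
    token = STEMS.get(token, token)
    return SYNONYMS.get(token, token)


def analyze(text):
    """Run tokenization followed by normalization."""
    return [normalize(token) for token in tokenize(text)]


print(analyze("Hello world from Ahmed"))
print(analyze("The Quick brown foxes leap"))
```

The same `analyze` step has to run on both the indexed text and the query text, otherwise a query for ‘Foxes’ would never match the indexed term ‘fox’.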
DB server ES node
Table Index
Row Document
Column Field
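As a rough illustration of this mapping, the sketch below turns rows of a table into documents, with the table name standing in for the index name and each column name becoming a field name. The `books` table, its columns, and the helper `row_to_document` are hypothetical and chosen only for the example.

```python
import json


def row_to_document(columns, row):
    """Turn one table row into a document: column names become field names."""
    return dict(zip(columns, row))


# Hypothetical table; its name plays the role of the index name.
index_name = "books"
columns = ("id", "title", "year")
rows = [
    (1, "Search Basics", 2020),
    (2, "Indexing Notes", 2021),
]

documents = [row_to_document(columns, row) for row in rows]
for document in documents:
    # Each printed line is one JSON document destined for the index.
    print(index_name, json.dumps(document))
```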
● Complexity
● Resource-intensive
● Data loss risk
● Query optimization
● Security
● Version compatibility
Materials: https://github.com/ahmedeltaweel/elasticsearch-session
Testing
● Query
○ Accuracy
■ Edge cases
○ Performance
■ Metrics
● Data
○ Consistency
○ Mapping
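One possible way to put a number on query accuracy is precision and recall against a hand-labeled set of relevant documents. The slides do not say which metrics were used, so treat this as an assumption; the function name `precision_recall` and the document ids are made up for the example.

```python
def precision_recall(returned_ids, relevant_ids):
    """Compare what a query returned with what a human marked as relevant."""
    returned, relevant = set(returned_ids), set(relevant_ids)
    hits = len(returned & relevant)
    # Edge cases: an empty result set or an empty relevant set yields 0.0
    # instead of a division by zero.
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ids: the query returned 4 documents, 3 were labeled relevant.
precision, recall = precision_recall([1, 3, 4, 7], [1, 3, 5])
print(precision, recall)
```

Running such a check over a fixed list of queries, including empty and no-match queries, gives a repeatable regression test whenever the mapping or the analyzers change.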
Q&A