Elasticsearch: Ponel
Big Picture
- Full-text search engine
- NoSQL Database
- Apache Lucene Based (like Solr), written in Java
- Inverted Index
- RESTful Interface (HTTP/JSON)
- Schemaless (Not really: every field still gets a mapping, inferred or explicit)
- Near Real Time (NRT)
- Search Multiple Indices
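The inverted index mentioned above can be illustrated with a toy sketch. This is not how Lucene stores its index; the function name `build_inverted_index` and the sample documents are made up for illustration, and tokenization is just lowercasing plus a whitespace split.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "elasticsearch is a search engine",
    2: "lucene is a search library",
}
inverted = build_inverted_index(docs)
print(inverted["search"])  # [1, 2]
print(inverted["lucene"])  # [2]
```

A search for a term becomes a dictionary lookup instead of a scan over every document.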
Terms
RDBMS       Elasticsearch
Database    Index
Table       Type
Row         Document
- Cluster
- Shard/Replica
- Limit per shard is 2,147,483,519 (= Integer.MAX_VALUE - 128) documents
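The document limit can be checked with a few lines of arithmetic. The helper `min_primary_shards` is a hypothetical name, and it only gives a lower bound derived from the hard limit; real clusters are usually sized with far smaller shards.

```python
# Java's Integer.MAX_VALUE is 2^31 - 1.
INTEGER_MAX_VALUE = 2**31 - 1
MAX_DOCS_PER_SHARD = INTEGER_MAX_VALUE - 128


def min_primary_shards(total_docs):
    """Smallest shard count that keeps every shard under the hard limit."""
    return -(-total_docs // MAX_DOCS_PER_SHARD)  # ceiling division


print(MAX_DOCS_PER_SHARD)                 # 2147483519
print(min_primary_shards(5_000_000_000))  # 3
```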
Create Index
PUT /blog
Response:
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "blog"
}
CRUD
- Create / Update
POST /{index}/{type}/{id}
- Read
GET /{index}/{type}/{id}
- Delete
DELETE /{index}/{type}/{id}
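As a rough sketch, the three CRUD calls above could be built from Python with the standard library. The base URL `http://localhost:9200`, the helper name `build_request`, and the `blog`/`post` names are assumptions for illustration; the URL layout follows the `/{index}/{type}/{id}` pattern on this slide, which applies to Elasticsearch versions that still have mapping types. Nothing is sent over the network here.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # assumption: a local single-node cluster


def build_request(method, index, doc_type, doc_id, body=None):
    """Build (but do not send) an HTTP request for /{index}/{type}/{id}."""
    url = f"{BASE_URL}/{index}/{doc_type}/{doc_id}"
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return Request(url, data=data, headers=headers, method=method)


# Hypothetical index "blog", type "post", document id 1.
create = build_request("POST", "blog", "post", 1, {"title": "Hello"})
read = build_request("GET", "blog", "post", 1)
delete = build_request("DELETE", "blog", "post", 1)

for req in (create, read, delete):
    print(req.get_method(), req.full_url)
```

Passing one of these objects to `urllib.request.urlopen` would send it to a running cluster.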
Schema
- The Mapping of an Existing Field Cannot be Updated
- Exceptions
  - New properties can be added to object fields.
  - New multi-fields can be added to existing fields.
  - The ignore_above parameter can be updated.
PUT /{index}/_mapping/{type}
{
  "properties": {
    "title": {
      "type": "text"
    }
  }
}
Analyzed and Not Analyzed
- Text is analyzed by default
- GET /_analyze
- built-in analyzer
- Standard
- Hey man, how are you doing? => hey|man|how|are|you|doing
- Whitespace
- Hey man, how are you doing? => Hey|man,|how|are|you|doing?
- Language
- Hey man, how are you doing? => Hai|man|how|are|you|do
- Custom
- Etc
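The difference between the Standard and Whitespace outputs above can be approximated in a few lines. These functions are simplified stand-ins, not the real analyzers (the actual standard tokenizer follows Unicode text segmentation rules); the names `standard_like` and `whitespace_like` are made up for this sketch.

```python
import re


def standard_like(text):
    """Rough stand-in for the standard analyzer: split on non-word characters, lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def whitespace_like(text):
    """Rough stand-in for the whitespace analyzer: split on whitespace only, keep case and punctuation."""
    return text.split()


sentence = "Hey man, how are you doing?"
print("|".join(standard_like(sentence)))    # hey|man|how|are|you|doing
print("|".join(whitespace_like(sentence)))  # Hey|man,|how|are|you|doing?
```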
Analyzed and Not Analyzed - Custom
PUT /blog
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "char_filter": [
            "html_strip"
          ],
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      }
    }
  }
}
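To show the order in which a custom analyzer applies its parts (character filters, then the tokenizer, then token filters), here is a simplified imitation in Python. Each step is only an approximation of the real `html_strip`, `standard`, `lowercase`, and `asciifolding` components, and the sample input is invented.

```python
import html
import re
import unicodedata


def html_strip(text):
    """Character filter: drop tags and decode HTML entities (approximation)."""
    return html.unescape(re.sub(r"<[^>]+>", " ", text))


def ascii_fold(token):
    """Token filter: remove diacritics by dropping combining marks (approximation)."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def my_custom_analyzer(text):
    """char_filter -> tokenizer -> filters, mirroring the settings above."""
    stripped = html_strip(text)               # char_filter: html_strip
    tokens = re.findall(r"\w+", stripped)     # tokenizer: standard (approximated)
    return [ascii_fold(t.lower()) for t in tokens]  # filters: lowercase, asciifolding


print(my_custom_analyzer("<p>Caf&eacute; <b>D\u00e9j\u00e0</b> Vu</p>"))
# ['cafe', 'deja', 'vu']
```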
Filter and Query
Filter
- Does the document match or not? (yes/no answer)
- Faster (no scoring)
- Relevance doesn’t matter
- Typically for not-analyzed (exact value) data
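One way to run a clause as a filter is to place it in the `filter` section of a `bool` query, where it only decides whether a document matches and contributes nothing to the score. The sketch below only builds the request body; the helper name `filtered_search` and the field `status` with value `published` are hypothetical.

```python
import json


def filtered_search(field, value):
    """Build a search body whose only clause runs in filter context (match or not, no scoring)."""
    return {"query": {"bool": {"filter": [{"term": {field: value}}]}}}


# Hypothetical exact-value field and value.
body = filtered_search("status", "published")
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent as the body of a `GET /{index}/_search` request.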
Query