Elastic DB Engineer
elastic.co/training
An Elastic Training Course
Elasticsearch Engineer II
Course: Elasticsearch Engineer II
Version 6.6.0
© 2015-2019 Elasticsearch BV. All rights reserved. Decompiling, copying, publishing and/or distribution without written consent of Elasticsearch BV is
strictly prohibited.
• In case of problems, try the following steps in order:
‒ create an account
‒ you will need an access token, which the instructor will provide
Agenda and Introductions
About This Training
• Environment
• Introductions
• Agenda...
2 Field Modeling
3 Fixing Data
5 Cluster Management
6 Capacity Planning
7 Document Modeling
8
9
‒ and therefore we will use two datasets in the course…
{
  "publish_date": "2017-11-10T07:00:00.000Z",
  "seo_title": "Apply for an Elastic{ON} Opportunity Grant Today!",
  "category": "News",
  "locales": "de-de,fr-fr",
  "title": "Apply for an Elastic{ON} Opportunity Grant Today!",
  "content": " For the past few years, our developer relations team has been running an informal scholarship program of sorts to help folks from underrepresented groups in technology attend Elastic{ON}. ...",
  "author": "Anna Ossowski",
  "url": "/blog/apply-for-an-elasticon-opportunity-grant-today"
}
  "status_code": 404,
  "runtime_ms": 142,
  "geoip": {
    "country_code3": "US",
    "location": {
      "lon": -122.1206,
      "lat": 47.6801
    },
    "region_name": "Washington",
    "city_name": "Redmond",
    "country_code2": "US",
    "continent_code": "NA"
  },
  "originalUrl": "/blog/2011/08/05/0.17.4-released.html",
  "level": "info"
}
  },
  "category": {
    "type": "text",
    "fields": {
      "keyword": {
        "type": "keyword",
        "ignore_above": 256
      }
    }
  },
  "publish_date": {
    "type": "date"
  },
  ...

  },
  "continent_code": {
    "type": "text",
    "fields": {
      "keyword": {
        "type": "keyword",
        "ignore_above": 256
      }
    }
  },
  ...
Chapter 1
Elasticsearch Internals
• Caching
[Figure: node1 holds shards P0 to P4; document 551 is routed with hash("551") % 5 = 3]
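The routing formula in the figure, hash("551") % 5 = 3, can be sketched in a few lines. This is only an illustration under stated assumptions: `pick_shard` is a hypothetical helper, and `zlib.crc32` is a stand-in hash chosen because it is deterministic, not the hash function Elasticsearch uses internally, so the shard numbers it produces will not match the slide.

```python
import zlib


def pick_shard(routing_value: str, number_of_primary_shards: int) -> int:
    """Map a routing value (by default the document _id) to a shard number."""
    # Stand-in hash: crc32 is an assumption for illustration only.
    hashed = zlib.crc32(routing_value.encode("utf-8"))
    # The modulo keeps the result in the range 0 .. number_of_primary_shards - 1.
    return hashed % number_of_primary_shards


# Route the three document ids used on the following slides to one of 5 shards.
shards = {doc_id: pick_shard(doc_id, 5) for doc_id in ["551", "213", "614"]}
```

Because the shard number depends on the number of primary shards, the same id can land on a different shard if that number changes, which is one way to see why the primary shard count is fixed when an index is created.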
What’s in a shard?
[Figure: PUT blogs/_doc/551, PUT blogs/_doc/213 and PUT blogs/_doc/614 each add an analyzed document to the buffer; when the buffer is full, a Lucene flush writes its contents to segment1]
[Figure: PUT blogs/_doc/5117 and PUT blogs/_doc/31 go into the buffer next to segment1; when the refresh_interval limit (default 1 second) is reached, a Lucene flush creates segment2]
indices.memory.index_buffer_size: 5%
[Figure: groups of segments labeled segment1, segment2 and segment]
}
PUT my_index/_doc/102/?refresh=wait_for
{
  "firstname" : "James",
  "lastname" : "Brown",
  "city" : "Detroit"
}
• A search request is distributed to the shards and on each
[Figure: shard3 contains segment_0, segment_1 and segment_2; the segments hold the inverted indices of multiple documents (IDs shown: 27, 13, 7, 25, 31, 19, 37, 85, 49, 67)]
PUT my_index/_doc/27
{
  "author": "Uri",
  "category": "Releases",
  "title": "Elastic Cloud Enterprise Beta"
}
PUT my_index/_doc/14
{
  "author": "Rasmus",
  "category": "Releases",
  ...
}
[Figure, built up over several slides: the contents of segment_0]
‒ inverted index (term, document count, document IDs): enters 1 14; releases 2 14,27
‒ field names ... the index: author, category, title
‒ the position that the term occurs in each document: 1 (14: 4,5); apm (14: 1); enterprise (27: 2); enters (14: 2)
‒ ... documents are deleted
‒ _source is stored here
‒ single- and multi-dimensional numerics
‒ doc_values
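The postings shown for segment_0 (a term, the documents it occurs in, and the positions inside each document) can be rebuilt with a small sketch. It assumes a naive analyzer that lowercases and splits on whitespace, which is a simplification and not the analyzer Elasticsearch applies; `build_inverted_index` is a hypothetical helper. The title of document 14 is not recoverable from the slides, so only its category is indexed here.

```python
from collections import defaultdict

docs = {
    27: {"category": "Releases", "title": "Elastic Cloud Enterprise Beta"},
    14: {"category": "Releases"},  # title omitted: not recoverable from the slides
}


def analyze(text):
    """Naive stand-in analyzer: lowercase, then split on whitespace."""
    return text.lower().split()


def build_inverted_index(documents, field):
    """Return term -> {doc_id: [positions]} for one field."""
    postings = defaultdict(dict)
    for doc_id, source in sorted(documents.items()):
        for position, term in enumerate(analyze(source.get(field, ""))):
            postings[term].setdefault(doc_id, []).append(position)
    return postings


category_index = build_inverted_index(docs, "category")
title_index = build_inverted_index(docs, "title")
```

With zero-based positions, "enterprise" is found in document 27 at position 2 and "releases" occurs in documents 14 and 27, which lines up with the entries "enterprise (27: 2)" and "releases 2 14,27" in the figure.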
[Figure: shard0 accumulates many segments; when a merge happens, they are combined so that shard0 ends up with fewer segments]
• If you use it, make sure to only use _forcemerge on indices that will never have write operations executed in the future
[Figure: my_index on node1 and node2; a shard made up of segments]
[Figure: PUT blogs/_doc/801 is added to the buffer and recorded in the transaction log, next to segment1 and segment2; an Elasticsearch flush performs a Lucene commit]
get added to the transaction log, how big they are, and when the last flush happened
POST my_index/_flush/synced
[Figure: node1, node2 and node3 holding P0, R0 and R0]
GET blogs/_search
{
  "query": {
    "match": {
      "content": "new releases"
    }
  },
  "sort": {
    "author": {
      "order": "asc"
    }
  }
}
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [author] in order to load ..."
      }
    ],
    ...
author
aaron
alexander
baiera
banon
boness
cam
clint
cohen
…and so on
Not an ideal format for sorting
The inverted index needs to be uninverted somehow if we want to
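To make the idea of "uninverting" concrete, here is a minimal sketch. The inverted index below is hypothetical sample data (the document ids are invented for illustration), and `uninvert` is an illustrative helper, not an Elasticsearch API. It turns a term-to-documents structure into a document-to-terms column, which is the shape a sort needs.

```python
# Hypothetical inverted index for the "author" field: term -> document ids.
inverted = {"aaron": [3], "banon": [1, 4], "clint": [2]}


def uninvert(inverted_index):
    """Build doc_id -> [terms], i.e. a column-oriented view of the field."""
    column = {}
    for term, doc_ids in inverted_index.items():
        for doc_id in doc_ids:
            column.setdefault(doc_id, []).append(term)
    return column


doc_to_terms = uninvert(inverted)
# Sort documents ascending by their smallest term.
sorted_doc_ids = sorted(doc_to_terms, key=lambda doc_id: min(doc_to_terms[doc_id]))
```

Building this column for every sort request is the expensive part; a structure that stores the per-document view at index time avoids repeating the work at query time.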
... fielddata in memory by uninverting the inverted index.
}
],
...
“text”?
• Doc values are fast and awesome
"author": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
}
In blogs, we indexed
GET blogs/_search
{
  "query": {
    "match": {
      "content": "new releases"
    }
  },
  "sort": {
    "author.keyword": {
      "order": "asc"
    }
  }
}
"author": "",
"author": "A.J. Angus",
"author": "Aaron Aldrich",
• Segment level cache uses bit sets…
{
  "query": {
    "bool": {
      "filter": {
        "range": {
          "publish_date": {
            "gte": 2017,
            "lte": 2018
          }
        }
      }
    }
  }
}
better-query-execution-coming-elasticsearch-2-0
frame-of-reference-and-roaring-bitmaps
• Static setting that must be configured on every node:
default is 1% of the heap
    "query_string": {
      "query": "*_source*"
    }
  }
}
The whole JSON body is
fields (uses doc_values).
improve performance
5. What happens if you use ?refresh=wait_for in an index
Chapter 2
Field Modeling
• Controlling Dynamic Fields
"mappings": {
  "_doc": {
    "properties": {
      "author": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "category": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "publish_date": {
        "type": "date"
      },
      ...
Most of the fields are the default “text” and “keyword”, which does not make sense for some fields
      }
    }
  },
  "continent_code": {
    "type": "text",
    "fields": {
      "keyword": {
        "type": "keyword",
        "ignore_above": 256
      }
    }
  },
  ...
"status_code": 200
"status_code": {
  "type": "short"
}
Clean-up some of the string fields:
"code": {
"language": {
} }
"locales": ["de-de","fr-fr"]
"user_agent": "Mozilla/5.0 ... AppleWebKit/537.36 (KHTML, ...
"version": "6.2.1"
‒ Searching an exact version like 6.2.1 would be easy enough
‒ But how would you query the “version” field for “5.4” or “6.x”?
    "major": {
      "type": "byte"
    },
    "minor": {
      "type": "byte"
    },
    "bugfix": {
      "type": "byte"
    }
  }
}
...
{
  "version": {
    "display_name": "6.2.1",
    "major": 6,
    "minor": 2,
    "bugfix": 1
  }
}
          "version.major": 5
        }
      },
      {
        "match": {
          "version.minor": 4
        }
      }
    ]
  }
}
nice solution for ranges like this…
‒ double_range
‒ date_range
‒ ip_range
PUT test_ranges
{
  "mappings": {
    "_doc": {
      "properties": {
        "publish_range": {
          "type": "date_range"
        }
      }
    }
  }
}
PUT test_ranges/_doc/1
{
  "publish_range": {
    "gte": "2017-11-10",
    "lt": "2018-11-10"
  }
}
This document defines both an upper and lower bound for “publish_range”
GET test_ranges/_search
{
  "query": {
    "range": {
      "publish_range": {
        "gte": "2018-04-01"
      }
    }
  }
}
"hits": {
  "total": 1,
  "max_score": 1,
  "hits": [
    {
      "_index": "test_ranges",
      "_type": "_doc",
      "_id": "1",
      "_score": 1,
      "_source": {
        "publish_range": {
          "gte": "2017-11-10",
          "lt": "2018-11-10"
        }
      }
    }
  ]
}
[Figure: two panels, “within” and “contains”, each captioned “Give me all docs where ... query”, comparing a document range D with a query range Q; endpoints shown: 12, 28, 16, 27, 14, 26]
The relation Parameter (Example)
GET my_range_index/_search
{
  "query": {
    "range": {
      "author_age_range": {
        "gte": 23,
        "lte": 43,
        "relation": "intersects"
      }
    }
  }
}
intersects | within | contains
doc  title                   author_age_range
1    “Hello, Kibana”         15 to 24
2    “Where is my log?”      25 to 34
3    “Ingestion problems?”   35 to 44

query range  relation    matching docs
23 to 43     intersects  1,2,3
25 to 45     within      2,3
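The three relation values can be checked against the table with a small sketch. It treats every range as a closed interval of integers, which is an assumption made for illustration; `matches` and `search` are hypothetical helpers, not part of any Elasticsearch client.

```python
def matches(doc_range, query_range, relation="intersects"):
    """Compare a stored range with a query range, both as (low, high) tuples."""
    d_lo, d_hi = doc_range
    q_lo, q_hi = query_range
    if relation == "intersects":   # the two ranges overlap somewhere
        return d_lo <= q_hi and q_lo <= d_hi
    if relation == "within":       # the document range lies inside the query range
        return q_lo <= d_lo and d_hi <= q_hi
    if relation == "contains":     # the document range covers the whole query range
        return d_lo <= q_lo and q_hi <= d_hi
    raise ValueError(f"unknown relation: {relation}")


# author_age_range values of the three documents in the table.
docs = {1: (15, 24), 2: (25, 34), 3: (35, 44)}


def search(query_range, relation):
    return [doc_id for doc_id, rng in docs.items() if matches(rng, query_range, relation)]
```

Running `search((23, 43), "intersects")` returns all three documents and `search((25, 45), "within")` returns documents 2 and 3, the same results the table lists.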
"mappings": {
  "doc": {
    "properties": {
      "originalUrl": {
        "type": "text",
        "analyzer": "my_url_analyzer"
      }
    }
...
An example of a mapping parameter
‒ https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-params.html
"mappings": {
  "doc": {
    "properties": {
      "http_version": {
        "type": "keyword",
        "index": false
      }
      ...
“http_version” will not be indexed
"mappings": {
  "doc": {
    "properties": {
      "http_version": {
        "type": "keyword",
        "doc_values": false
      }
      ...
“http_version” will
PUT my_logs/doc/_mapping
{
  "properties": {
    "url": {
      "enabled": false
    }
  }
}
indexed nor stored in
be stored in _source
}
...
{
  "@timestamp": "2017-05-19T00:47:44.633Z",
  "language": {
    "code": "en-us",
    "url": "/blog/category/releases"
  },
  "runtime": "454ms"
is disabled
PUT blogs
{
    "_all": {
      "enabled": false
    },
    "properties": {
      ...
earlier
"region_name": "Victoria",
"country_name": "Australia",
"city_name": "Surrey Hills"
fields:
using copy_to…
"mappings": {
  "_doc": {
    "properties": {
      "region_name": {
        "type": "keyword",
        "copy_to": "locations_combined"
      },
      "country_name": {
        "type": "keyword",
        "copy_to": "locations_combined"
      },
      "city_name": {
        "type": "keyword",
        "copy_to": "locations_combined"
      },
      "locations_combined": {
        "type": "text"
      }
      ...
During indexing, the values will be copied to the “locations_combined” field
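The effect of copy_to in the mapping above can be pictured with a short sketch. It only models the idea that values are copied into the target field at index time while the original document stays unchanged; `index_document` is a hypothetical helper and says nothing about how Elasticsearch stores the result internally.

```python
mapping = {
    "region_name": {"type": "keyword", "copy_to": "locations_combined"},
    "country_name": {"type": "keyword", "copy_to": "locations_combined"},
    "city_name": {"type": "keyword", "copy_to": "locations_combined"},
    "locations_combined": {"type": "text"},
}


def index_document(source, field_mapping):
    """Return the values that would be indexed per field; source is not modified."""
    indexed = {field: [value] for field, value in source.items()}
    for field, value in source.items():
        target = field_mapping.get(field, {}).get("copy_to")
        if target:
            indexed.setdefault(target, []).append(value)
    return indexed


source = {
    "region_name": "Victoria",
    "country_name": "Australia",
    "city_name": "Surrey Hills",
}
indexed = index_document(source, mapping)
```

After the call, `indexed["locations_combined"]` holds all three location values, so a single query against that field can match any of them, while `source` still has only its three original keys.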
    }
  }
}
"hits": [
  {
    "_index": "weblogs",
    "_type": "_doc",
    "_id": "1",
    "_score": 0.5753642,
    "_source": {
      "region_name": "Victoria",
      "country_name": "Australia",
PUT ratings/_doc/2
{
  "rating": 5.0
}
GET ratings/_search?size=0
{
  "aggs": {
    "average_rating": {
      "avg": {
        "field": "rating"
      }
    }
  }
}
"aggregations": {
  "average_rating": {
    "value": 5
  }
}
Specifying a Default Value for nulls
• Use the null_value parameter to assign a value to a field if
it is null
‒ The _source is not altered, but the value of null_value is
indexed
PUT ratings
{
  "mappings": {
    "_doc": {
      "properties": {
        "rating": {
          "type": "float",
          "null_value": 1.0
        }
      }
    }
  }
}
that field
PUT ratings/_doc/2
{
  "rating": 5.0
}
GET ratings/_search?size=0
{
  "aggs": {
    "average_rating": {
      "avg": {
        "field": "rating"
      }
    }
  }
}
"aggregations": {
  "average_rating": {
    "value": 3
  }
}
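The change in the average from 5 to 3 follows directly from the arithmetic. The sketch below assumes the index holds one document whose rating is null next to the document with rating 5.0; that first document is not visible in the surviving slide text, so treat it as an assumption. `average_rating` is a hypothetical helper.

```python
def average_rating(ratings, null_value=None):
    """Average the indexed values; nulls are skipped unless null_value is set."""
    indexed = []
    for rating in ratings:
        if rating is None:
            if null_value is None:
                continue           # a null without null_value contributes nothing
            rating = null_value    # with null_value, the substitute is indexed
        indexed.append(rating)
    return sum(indexed) / len(indexed) if indexed else None


ratings = [None, 5.0]  # assumed: one document with a null rating, one with 5.0

without_default = average_rating(ratings)             # only 5.0 is counted
with_default = average_rating(ratings, null_value=1.0)  # (1.0 + 5.0) / 2
```

The list of source values is never changed, which mirrors the note that _source is not altered and only the indexed value differs.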
PUT ratings/_doc/1
{
  "rating": 4
}
PUT ratings/_doc/2
{
  "rating": "3"
}
PUT ratings/_doc/3
{
  "rating": 4.5
}
All three PUT commands work fine
A “sum” aggregation on
PUT ratings/_doc/1
{
  "rating": 4
}
Works fine
PUT ratings/_doc/2
{
}
PUT ratings/_doc/3
{
PUT blogs/_mapping/_doc
{
  "_meta" : {
    "blog_mapping_version" : "2.1"
  }
}
JSON here…
mapping based on:
PUT test2
{
  "mappings": {
    "_doc": {
      "dynamic_templates": [
        {
          "my_string_fields": {
            "match_mapping_type": "string",
            "mapping": {
              "type": "keyword"
            }
          }
        }
      ]
    }
  }
}
If the field is a string, map it as “keyword”
POST test2/_doc
{
  "blog_reaction": ":thumbsup:"
}
GET test2/_mapping
"properties": {
  "blog_reaction": {
    "type": "keyword"
  }
}
PUT test2/_doc/_mapping
{
  "dynamic_templates": [
    {
      "my_float_fields": {
        "match": "f_*",
        "mapping": {
          "type": "float"
        }
      }
    }
  ]
}
If an unmapped field name starts with “f_”, then it will be mapped as a float
POST test2/_doc/
{
  "f_avg_response_time": "34.8"
}
"properties": {
  "blog_reaction": {
    "type": "keyword"
  },
  "f_avg_response_time": {
    "type": "float"
  }
}
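How a dynamic template is picked for a new field can be sketched as a first-match lookup. This is an illustration under assumptions: `map_field` and `json_type` are hypothetical helpers, Python's `fnmatchcase` stands in for the wildcard matching of "match", and the template list below is ordered with the name-based rule first purely for the example.

```python
from fnmatch import fnmatchcase

dynamic_templates = [
    {"my_float_fields": {"match": "f_*", "mapping": {"type": "float"}}},
    {"my_string_fields": {"match_mapping_type": "string",
                          "mapping": {"type": "keyword"}}},
]


def json_type(value):
    """Rough equivalent of the JSON type detected for a value."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, str):
        return "string"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    return "object"


def map_field(name, value, templates):
    """Return the mapped type of the first template whose conditions all hold."""
    for template in templates:
        for rule in template.values():
            if "match" in rule and not fnmatchcase(name, rule["match"]):
                continue
            if "match_mapping_type" in rule and rule["match_mapping_type"] != json_type(value):
                continue
            return rule["mapping"]["type"]
    return None  # no template applies: default dynamic mapping would be used
```

With these templates, `map_field("blog_reaction", ":thumbsup:", dynamic_templates)` gives "keyword" and `map_field("f_avg_response_time", "34.8", dynamic_templates)` gives "float", the two types shown in the mapping output above.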
POST blogs/_doc/
{
  "some_new_field": "This is quite unexpected"
}
mapping.
PUT blogs/_doc/_mapping
{
  "dynamic": "strict"
}
already defined in the mapping
POST blogs/_doc/
{
  "some_other_field": "This won't work"
}
{
  "error": {
    "root_cause": [
      {
        "type": "strict_dynamic_mapping_exception",
        "reason": "mapping set to strict, dynamic ... allowed"
      }
    ],
    "type": "strict_dynamic_mapping_exception",
  },
  "status": 400
PUT blogs/_doc/_mapping
{
  "properties": {
    "some_other_field": {
      "type": "text"
    }
  }
}
The POST command on the previous slide will work now
“some_other_field” to the mapping
• Use the “null_value” parameter to assign a value to a field if it is null
• You can control the effect of new fields added to a mapping using
3. How would you map a field that you never need to use for searches or aggregations?
range query?
Chapter 3
Fixing Data
‒ changing an existing mapping
• First, let's look into tools that can help you with the task...
longer available for scripting as of Elasticsearch 6.0
‒ execute a script within an ingest pipeline
• Reindex API
PUT my_index/_doc/1
{
  "blog_id": "h81CKmIBCLh5xF6i7Y2f",
  "num_of_views": 3
}
increment “num_of_views”
  "script": {
    "source": "ctx._source.num_of_views += params.new_views",
    "params": {
      "new_views": 2
    }
  }
}
“source” is the code
“params” is for optional parameters
POST my_index/_doc/1/_update
{
  "script": {
    "id": "add_new_views",
    "params": {
      "new_views": 2
    }
  }
to the script
Updates: use the _source field: ctx._source.field_name
access a field
use doc['field_name']
https://www.elastic.co/guide/en/elasticsearch/painless/current/painless-contexts.html
‒ or set a timeout using script.cache.expire
‒ may be able to avoid this issue by using parameters…
"script": {
  "source": "ctx._source.num_of_views += 2"
}
"script": {
  "source": "ctx._source.num_of_views += 3"
}
Two different scripts, so two compilations needed
"script": {
}
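Why parameters reduce compilations can be shown with a toy cache keyed on the script source. This is a sketch under assumptions: Python's built-in `compile` and `exec` stand in for compiling and running a Painless script, and `ScriptCache` and `run_update` are hypothetical names, not Elasticsearch classes.

```python
class ScriptCache:
    """Toy compile cache keyed on the exact script source text."""

    def __init__(self):
        self.compiled = {}
        self.compilations = 0

    def get(self, source):
        if source not in self.compiled:
            self.compilations += 1  # a new source string forces a compilation
            self.compiled[source] = compile(source, "<script>", "exec")
        return self.compiled[source]


def run_update(cache, doc, source, params=None):
    """Run a cached script against a document dict."""
    exec(cache.get(source), {}, {"doc": doc, "params": params or {}})


# Hard-coded values: every distinct value is a distinct script.
inline_cache, inline_doc = ScriptCache(), {"num_of_views": 3}
run_update(inline_cache, inline_doc, 'doc["num_of_views"] += 2')
run_update(inline_cache, inline_doc, 'doc["num_of_views"] += 3')

# Parameterized: one source text, different params on each call.
param_cache, param_doc = ScriptCache(), {"num_of_views": 3}
source = 'doc["num_of_views"] += params["new_views"]'
run_update(param_cache, param_doc, source, {"new_views": 2})
run_update(param_cache, param_doc, source, {"new_views": 3})
```

Both documents end with the same view count, but the inline variant compiled two scripts while the parameterized variant compiled one and reused it.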
_delete_by_query endpoint as well
"category": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
},
“category” only has 10 distinct values, making it a good candidate for “keyword” only
"content": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
},
“content” is a large amount of text, so “keyword” seems unnecessary
...
      "type": "keyword",
      "ignore_above": 256
    }
  }
},
"category": {
  "type": "keyword"
},
"content": {
  "type": "text"
},
"locales": {
  "type": "keyword"
},
...
and use case better
POST _reindex
{
  "source": {
    "index": "my_source_index",
    "query": {
      ...
    }
  },
  "dest": {
    "index": "my_destination_index"
  }
}
Optional “query” to ... to reindex
POST _reindex
{
  "source": {
    "index": "blogs"
  },
  "dest": {
    "index": "blogs_fixed"
  }
}
      }
    }
  }
}
"buckets": [
  {
    "key": "Engineering",
    "doc_count": 440
  },
  {
    "key": "",
    "doc_count": 333
  },
  {
    "key": "Releases",
    "doc_count": 238
  },
• Let’s discuss a few tips for dealing with document versions and reindexing…
[Figure: reindexing into an index that already contains documents]
source                  destination before       result
_id: 123, _version: 1   _id: 123, _version: 3    overwritten, _version: 4
_version: 2             _version: 2              overwritten, _version: 3
_id: 789                                         created
[Figure: the same reindex where existing documents cause an exception]
source                  destination before       result
_id: 123, _version: 1   _id: 123, _version: 3    exception, _version: 3
_version: 2             _version: 2              exception, _version: 2
_id: 789                                         created
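The two figures can be reproduced with a small model of version checks during a reindex. It is a sketch under assumptions: the document id "456" is invented because the id of the second document is not legible in the slides, `reindex` is a hypothetical helper, and the rule used for external versioning (write only when the incoming version is greater than the stored one) is the simplification this example relies on.

```python
def reindex(source_docs, dest_index, version_type="internal"):
    """source_docs and dest_index map _id -> _version; dest_index is updated in place."""
    outcome = {}
    for doc_id, src_version in source_docs.items():
        existing = dest_index.get(doc_id)
        if existing is None:
            # A new document keeps the source version only with external versioning.
            dest_index[doc_id] = src_version if version_type == "external" else 1
            outcome[doc_id] = "created"
        elif version_type == "internal":
            dest_index[doc_id] = existing + 1      # blind overwrite, version bumped
            outcome[doc_id] = "overwritten"
        elif src_version > existing:
            dest_index[doc_id] = src_version       # incoming document is newer
            outcome[doc_id] = "overwritten"
        else:
            outcome[doc_id] = "version conflict"   # destination is the same or newer
    return outcome


source_docs = {"123": 1, "456": 2, "789": 1}   # "456" is a hypothetical id

internal_dest = {"123": 3, "456": 2}
internal_outcome = reindex(source_docs, internal_dest, "internal")

external_dest = {"123": 3, "456": 2}
external_outcome = reindex(source_docs, external_dest, "external")
```

In the internal run the two existing documents move to versions 4 and 3; in the external run they stay at 3 and 2 and are reported as conflicts, while document 789 is created in both cases.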
  "index": "blogs_fixed",
  "type": "doc",
  "id": "Cc1CKmIBCLh5xF6i7Y2b",
  "cause": {
    "type": "version_conflict_engine_exception",
    "reason": "[doc][Cc1CKmIBCLh5xF6i7Y2b]: ...",
    "index_uuid": "HUkf-mfCQfOghnqSIi5kBQ",
  },
  "status": 409
},
destination
    "index": "blogs"
  },
  "dest": {
    "index": "blogs_fixed",
  },
  "conflicts": "proceed"
}
  "created": 0,
  "deleted": 0,
  "batches": 2,
  "version_conflicts": 1594,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  "throttled_millis": 0,
  "requests_per_second": -1,
  "throttled_until_millis": 0,
  "failures": []
}
POST blogs/_update_by_query
{
  "query": {
    "match": {
      "category.keyword": ""
    }
  },
  "script": {
  }
}
empty string
POST _reindex
{
  "source": {
    "index": "blogs"
  },
  "dest": {
    "index": "blogs_fixed",
    "version_type": "external"
  },
  "conflicts": "proceed"
}
333 blogs were “updated”
{
  "took": 130,
  "timed_out": false,
  "total": 1594,
  "updated": 333,
  "created": 0,
  "deleted": 0,
  "batches": 2,
  "version_conflicts": 1261,
  "noops": 0,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  ...
POST blogs_fixed/_delete_by_query
{
  "query": {
    "range": {
      "publish_date": {
        "lte": "2016"
      }
    }
  }
}
  "total": 1594,
  "updated": 0,
  "created": 812,
  "deleted": 0,
  "batches": 2,
  "version_conflicts": 782,
  "noops": 0,
  "retries": {
    "search": 0
  },
in the target
PUT blogs_fixed/_mapping/_doc
{
  "properties": {
    "content": {
      "type": "text",
      "fields": {
        "english": {
          "type": "text",
          "analyzer": "english"
        }
      }
    }
  }
}
Add a new multi-field
GET blogs_fixed/_search
{
  "query": {
    "match": {
      "content.english": "performance tips"
    }
  }
}
No hits, because no documents have this field yet
POST blogs_fixed/_update_by_query
GET blogs_fixed/_search
{
  "query": {
    "match": {
      "content.english": "performance tips"
    }
  }
}
426 hits
PUT blogs_fixed/_doc/_mapping
{
  "properties": {
    "reindexBatch": {
      "type": "short"
    }
  }
}
reindexing
    }
  },
  "script": {
    "source": """
      if(ctx._source.containsKey("content")) {
        ctx._source.content_length = ctx._source.content.length();
      } else {
        ctx._source.content_length = 0;
      }
      ctx._source.reindexBatch=1;
    """
  }
}
"locales": "de-de,fr-fr"
"locales": {
  "type": "keyword"
}
"locales": ["de-de","fr-fr"]
[Figure: a document arriving at node1]
{
  "field1" : "value1",
  "field2" : "value2",
  "field3" : "value3"
}
[Figure: my_cluster with a Client, ingest_node1 and ingest_node2, and data_node1 to data_node4]
[Figure: ingest_node1 and data_node1 (shard P1), with two versions of a document]
{
  "field1" : "value1",
  "field2" : "value2",
  "field3" : "value3"
}
{
  "a" : "value4",
  "c" : "value6",
  "d" : "value7"
}
PUT _ingest/pipeline/my-pipeline-id
{
  "description" : "DESCRIPTION",
  "processors" : [
    {
      ...
    }
  ],
  "on_failure" : [
    {
      ...
    }
  ]
}
"processors": array of processors
"on_failure": optional array of processors if an error occurs
PUT _ingest/pipeline/my_pipeline
{
  "processors": [
    {
      "set": {
        "field": "number_of_views",
        "value": 0
      }
    }
  ]
}
Adds “number_of_views” if ... 0 if it already does
}
"_source": {
  "number_of_views": 0,
}
PUT my_index/_doc/1?pipeline=my_pipeline
{
  "author": "Monica Sarbu",
  "category": "Brewing in Beats"
}
"_source": {
  "number_of_views": 0,
  "author": "Monica Sarbu",
  "category": "Brewing in Beats"
}
POST _bulk
PUT test_index
{
  "settings": {
    "default_pipeline": "my_pipeline"
  }
}
The pipeline is applied on every document
PUT _ingest/pipeline/blogs_pipeline
{
  "processors": [
    {
      "split": {
        "field": "locales",
        "separator": ","
      }
    }
  ]
}
(separator) Can be a regular expression as well
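The effect of the split processor on a single document can be sketched in Python. This is only an illustration of the idea, not how Elasticsearch implements it; the function name `split_field` and the sample document are assumptions.

```python
import re


def split_field(doc, field, separator):
    """Return a copy of doc in which the string doc[field] becomes a list."""
    result = dict(doc)
    # The separator is treated as a regular expression, as the callout notes.
    result[field] = re.split(separator, doc[field])
    return result


# Hypothetical sample document, modeled on the blogs dataset.
blog = {"category": "News", "locales": "de-de,fr-fr"}
print(split_field(blog, "locales", ","))
```

The original document is left untouched and a modified copy is returned, which makes the before/after comparison easy to inspect.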
once, at index time
      }
    },
    {
      "script": {
        "source": """
          if(ctx.containsKey("content")) {
            ctx.content_length = ctx.content.length();
          } else {
            ctx.content_length = 0;
          }
        """
      }
    }
  ]
}
(ctx) “ctx” is a map, and we check to see if it contains a “content” field
(ctx.content_length) Assigning “ctx.content_length” adds the field to the doc if it is not already defined
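The logic of this script processor can be mirrored in Python, with a plain dict standing in for the `ctx` map. This is a sketch for reasoning about the behavior only; the function name `add_content_length` is an assumption, and the real script is written in Painless.

```python
def add_content_length(ctx):
    """Set ctx['content_length'] to the length of 'content', or 0 if absent."""
    if "content" in ctx:
        # Assigning a new key adds the field to the document.
        ctx["content_length"] = len(ctx["content"])
    else:
        ctx["content_length"] = 0
    return ctx


print(add_content_length({"content": "Elastic Stack"}))
print(add_content_length({"title": "no content here"}))
```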
...
"locales": [
  "de-de",
  "fr-fr",
  "ja-jp",
  "ko-kr"
],
"content_length": 15
...
"locales": [
  "en-en"
],
"content_length": 0
...
POST blogs_fixed/_update_by_query?pipeline=blogs_pipeline
GET blogs_fixed/_search
{
  "locales": [
    "de-de",
    "fr-fr",
    "ja-jp",
    "ko-kr",
    "zh-chs"
  ],
  "category": "News",
  "publish_date": "2017-05-04T06:00:00.000Z",
  "seo_title": "",
  "content": "...",
  "url": "/blog/introducing-machine-learning-for-the-elastic-stack",
  "content_length": 5861
}
• The Update By Query API allows you to reindex a collection of documents into the same index
POST messages/_update_by_query

"processors" : [
  {
    "script" : {
      "source" : "ctx._index=ctx.clientip.country_iso_code.toLowerCase()"
    }
  }
]
Chapter 4
Advanced Search & Aggregations
• Pipeline Aggregations
“I'm looking for all blog posts about 5.x releases”

GET blogs/_search
{
  "query": {
    "wildcard": {
      "title.keyword": {
        ...
      }
    }
  }
}
“I'm looking for all blog … releases”

GET blogs/_search
{
  "query": {
    "regexp" : {
      "title.keyword": ".*5\\.[0-2]\\.[0-9].*"
    }
  }
}
“… written in multiple languages.”

GET blogs/_search
{
  "query": {
    "exists": {
      "field": "locales"
    }
  }
}
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "geoip.region_name"
        }
      }
    }
  }
}
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "geoip"
        }
      }
    }
  }
}
… compute a Boolean value

"hits": {
  "total": 54,
  "max_score": 0,
  "hits": [
    {
      "_source": {
        "locales": [
          "de-de",
          "fr-fr",
          "ja-jp",
          "ko-kr",
          "zh-chs"
        ],
        "category": "News",
  },
  "context": "filter",
  "context_setup": {
    "index": "blogs_fixed",
    "document": {
      ...
    }
  }
(context) Context in which the script is applied
(document) Specify the document on which the script is applied

{
  "result": true
}
        "source": """
          def d = new Date(doc['publish_date'].value.millis);
          return d.toString().substring(0,3);
        """
      }
    }
  }
}
          "Fri"
        ]
      }
    },
    {
      "_index": "blogs_fixed",
      "_type": "doc",
      "_id": "g81CKmIBCLh5xF6i7Y-v",
      "_score": 1,
      "fields": {
        "day_of_week": [
          "Tue"
        ]
      }
    },
  "script_fields": {
    "day_of_week": {
      "script": {
        "source": """
          def d = new Date(doc['publish_date'].value.millis);
          return d.toString().substring(0,3);
        """
      }
    }
  }
}
‒ or modeling your data in a way where scripting can be avoided
‒ avoid repeating code in multiple places
‒ minimize mistakes
POST _scripts/my_search_template
{
  "script": {
    "lang": "mustache",
    "source": {
      "query": {
        "match": {
          "{{my_field}}": "{{my_value}}"
        }
      }
    }
  }
}
(my_search_template) Name of the search template
({{my_field}}, {{my_value}}) Parameters
“I am looking for blogs that have shard in the title.”

GET blogs/_search/template
{
  "id": "my_search_template",
  "params": {
    "my_field": "title",
    "my_value": "shard"
  }
}
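How the parameters end up in the final query can be sketched with a very small substitution routine. This is a simplified stand-in for Mustache rendering, not the engine Elasticsearch uses: it handles only plain `{{name}}` placeholders, does no JSON escaping, and the helper name `render_template` is an assumption.

```python
import json
import re


def render_template(source, params):
    """Replace each {{name}} placeholder with params[name], then parse the JSON."""
    def substitute(match):
        return str(params[match.group(1)])

    return json.loads(re.sub(r"\{\{(\w+)\}\}", substitute, source))


template = '{"query": {"match": {"{{my_field}}": "{{my_value}}"}}}'
query = render_template(template, {"my_field": "title", "my_value": "shard"})
print(json.dumps(query))
```

The rendered result is an ordinary `match` query on the `title` field, which is what the search template request above executes.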
            },
            "should": {
              "multi_match": {
                "query": "{{blog_query}}",
                "fields": ["title","title.*","content","content.*"],
                "type": "phrase"
              }
            }
          }
        }
      }
    }
  }
}
GET blogs_fixed/_search/template
{
  "id": "blogs_webform_search",
  "params": {
    ...
  }
}
{{#param1}}
"This section is skipped if param1 is null or false"
{{/param1}}
          "filter": {
            "range": {
              "publish_date": {"gte": "{{search_date}}"}
            }
          }
          {{/search_date}}
        }
      }
    }
    """
  }
}
… “search_date” parameter is defined
GET blogs_fixed/_search/template
{
  "id": "blogs_with_date_search",
  "params": {
    "search_term": "shay banon"
  }
}
47 hits

GET blogs_fixed/_search/template
{
  ...
  "params": {
    "search_date": "2017-07-01"
  }
}
GET logs_server*/_search
{
  "size": 0,
  "aggs": {
    "runtime_percentiles": {
      "percentiles": {
        "field": "runtime_ms"
      }
    }
  }
}

"runtime_percentiles": {
  "values": {
    "1.0": 0,
    "5.0": 88.00109639047503,
    "25.0": 95.00000000000001,
    "50.0": 103.37306961911929,
    "75.0": 159.88916204500126,
    "95.0": 685.1015874756147,
    "99.0": 4198.930939937213
  }
}
GET logs_server*/_search
{
  "size" : 0,
  "aggs": {
    "runtime_quintiles": {
      "percentiles": {
        "field": "runtime_ms",
        "percents": [
          20,
          40,
          60,
          80,
          100
        ]
      }
    }
  }
}

"runtime_quintiles": {
  "values": {
    "20.0": 93.83826098733662,
    "40.0": 99,
    "60.0": 111.35144881124744,
    ...
  }
}
80% of the responses … 236 ms
GET logs_server*/_search
{
  "size": 0,
  "aggs": {
    "runtime_ranks": {
      "percentile_ranks": {
        "field": "runtime_ms",
        "values": [100, 500]
      }
    }
  }
}

"runtime_ranks": {
  "values": {
    "100.0": 43.74582092577736,
    "500.0": 94.35230182775403
  }
}
… in less than 100 ms
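What a percentile rank means can be sketched with an exact calculation over a small list. Elasticsearch computes these aggregations with approximation algorithms over all matching documents, so this exact version is only meant to build intuition; the sample runtimes are made up and the helper name `percentile_rank` is an assumption.

```python
def percentile_rank(values, threshold):
    """Percentage of values that are less than or equal to threshold."""
    if not values:
        raise ValueError("values must not be empty")
    return 100.0 * sum(1 for v in values if v <= threshold) / len(values)


# Hypothetical response times in milliseconds.
runtimes_ms = [50, 80, 120, 400, 900]
print(percentile_rank(runtimes_ms, 100))
print(percentile_rank(runtimes_ms, 500))
```

Read the result the same way as the response above: the value for 100 is the share of requests that completed within 100 ms.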
  },
  "aggs": {
    "blogs_by_author": {
      "terms": {
        "field": "author.keyword"
      }
    }
  }
}

  "doc_count": 69
},
{
  "key": "Alexander Reelsen",
  "doc_count": 67
},
{
  ...
  "doc_count": 31
},
{
  ...
  "doc_count": 29
},
      },
      "aggs": {
        "logstash_top_hits": {
          "top_hits": {
            "size": 5
          }
        }
      }
    }
  }
}

"buckets": [
  {
    "key": "Suyog Rao",
    "doc_count": 69,
    "logstash_top_hits": {
      "hits": {
        "total": 69,
        "max_score": 6.6510196,
        "hits": [
          {
            "_index": "blogs",
            "_type": "doc",
            "_id": "TM1CKmIBCLh5xF6i7Y2b",
            "_score": 6.6510196,
            "_source": {
              "publish_date": "2016-06-27T06:00:00.000Z",
              "seo_title": "",
              "locales": "",
    },
    "missing_longitude": {
      "missing": {
        "field": "geoip.location.lon"
      }
    }
  }
}

"aggregations": {
  "missing_latitude": {
    "doc_count": 6397
  },
  "missing_longitude": {
    "doc_count": 6397
  }
}
GET blogs/_search
{
  "size": 0,
  "aggs": {
    "author_terms": {
      "terms": {
        "field": "author.keyword"
      }
    }
  }
}
A simple terms agg on a field in the document
        "script": {
          "source": "doc['publish_date'].value.dayOfWeek"
        }
      }
    }
  }
}

"buckets": [
  {
    "key": "2",
    "doc_count": 415
  },
  {
    "key": "1",
    "doc_count": 385
  },
1 = Monday, 2 = Tuesday, etc.
          "source": "doc['author.keyword'].value + '_' + doc['publish_date'].value.dayOfWeek"
        }
      }
    }
  }
}

"buckets": [
  {
    "key": "Alexander Reelsen_3",
    "doc_count": 67
  },
  {
    ...
    "doc_count": 52
  },
  {
    ...
    "doc_count": 42
  },
  {
    "key": "Clinton Gormley_2",
    "doc_count": 38
  },
[Diagram: Query, FG, BG]
- Recommendation
- Fraud Detection
- Defect Detection
And more…
https://www.elastic.co/blog/significant-terms-aggregation
      "aggs": {
        "content_terms": {
          "terms": {
            "field": "content",
            "size": 10
          }
        }
      }
    }
  }
}
  "key": "the",
  "doc_count": 89
},
{
  "key": "to",
  "doc_count": 88
},
{
  "key": "in",
  "doc_count": 86
},
{
  "key": "is",
  "doc_count": 86
},
{
  "key": "this",
Monica likes to blog about “and”, “the”, “to”, “in” and so on.
      "aggs": {
        "content_significant_terms": {
          "significant_terms": {
            "field": "content",
            "size": 10
          }
        }
      }
    }
  }
}
  "bg_count": 133
},
{
  "key": "beat",
  "doc_count": 56,
  "score": 5.3810714135420605,
  "bg_count": 105
},
{
  "key": "beats",
  "doc_count": 84,
  "score": 4.668550553314773,
  "bg_count": 253
},..
        "interval": "month"
      },
      "aggs": {
        "monthly_sum_response": {
          "sum": {
            "field": "response_size"
          }
        }
      }
    }
  }
}
… node, after the results of the input agg are collected
          }
        },
        "cumulative_sum_response": {
          "cumulative_sum": {
            "buckets_path": "monthly_sum_response"
          }
        }
      }
    }
  }
}
(cumulative_sum) “cumulative_sum” is a pipeline agg
(buckets_path) … the result of “monthly_sum_response”

"buckets": [
  {
    "key_as_string": "2017-03-01T00:00:00.000Z",
    "key": 1488326400000,
    "doc_count": 255,
    "monthly_sum_response": {
      "value": 15860968
    },
    "cumulative_sum_response": {
      "value": 15860968
    }
  },
  {
    "key_as_string": "2017-04-01T00:00:00.000Z",
    "key": 1491004800000,
    "doc_count": 467961,
    "monthly_sum_response": {
      "value": 25446117219
    },
    "cumulative_sum_response": {
      "value": 25461978187
    }
  },
  ...
(monthly_sum_response) monthly sum
(cumulative_sum_response) cumulative sum
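The relationship between the two values in each bucket can be checked with a running total. The sketch below uses the two monthly sums shown in the response; the variable names simply mirror the aggregation names and are otherwise arbitrary.

```python
from itertools import accumulate

# Monthly sums copied from the two buckets shown above (March and April 2017).
monthly_sum_response = [15860968, 25446117219]

# A cumulative sum is a running total over the buckets, in bucket order.
cumulative_sum_response = list(accumulate(monthly_sum_response))
print(cumulative_sum_response)
```

The second running total, 25461978187, matches the `cumulative_sum_response` value of the April bucket.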
            "field": "response_size"
          }
        },
        "monthly_max_response": {
          "max": {
            "field": "response_size"
          }
        },
        "cumulative_sum_response": {
          "cumulative_sum": {
            "buckets_path": "monthly_sum_response"
          }
        }
      }
    }
  }
}
In this example, the input … are at the same level
GET logs_server*/_search
{
  "size": 0,
  "aggs": {
    "logs_by_month": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "month"
      },
      "aggs": {
        "monthly_sum_response": {
          "sum": {
            "field": "response_size"
          }
        }
      }
    },
    "max_monthly_sum": {
      "max_bucket": {
        "buckets_path": "logs_by_month>monthly_sum_response"
      }
    }
  }
}
(max_monthly_sum) “max_monthly_sum” is a single output over all buckets and the max value of “monthly_sum_response”
(buckets_path) Use a ‘>’ symbol as a separator between aggregations
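How a `buckets_path` with a ‘>’ separator is followed can be sketched on a response-shaped dict. This is an illustration under simplifying assumptions (exactly one parent aggregation and one metric, no gaps in the data); the function name `max_bucket` and the trimmed-down response are hypothetical.

```python
def max_bucket(aggs, buckets_path):
    """Find the bucket with the largest metric value along parent>metric."""
    parent_name, metric_name = buckets_path.split(">")
    buckets = aggs[parent_name]["buckets"]
    best = max(buckets, key=lambda bucket: bucket[metric_name]["value"])
    return {"keys": [best["key_as_string"]], "value": best[metric_name]["value"]}


# Response-shaped data reusing the two monthly sums from the earlier example.
aggregations = {
    "logs_by_month": {
        "buckets": [
            {"key_as_string": "2017-03-01T00:00:00.000Z",
             "monthly_sum_response": {"value": 15860968}},
            {"key_as_string": "2017-04-01T00:00:00.000Z",
             "monthly_sum_response": {"value": 25446117219}},
        ]
    }
}
print(max_bucket(aggregations, "logs_by_month>monthly_sum_response"))
```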
‒ bucket_script: executes a script
… aggregations:
‒ https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline.html
… documents being aggregated over
5. In pipeline aggregations, when do you use the ‘>’ symbol?
Chapter 5
Cluster Management
• Shard Allocation Awareness
• Forced Awareness
[Diagram: my_cluster with node1, node2, and node3]
[Diagram: my_cluster with node1, node2, and node3 holding my_index: primaries P0 P1 P2 P3 P4 and replicas R4 R0 R1 R2 R3]
[Diagram: my_index1 and my_index2; shards shown: P0 P1 P0 P1 P2 P3 and R2 R3 R0 R1]
• Nodes can be dedicated nodes that only take on a single role...
data    node.data    true
[Diagram: node3 says “I am a dedicated ingest node.”; node4 asks “What am I?”]
[Diagram: my_cluster with node1 to node3, data nodes node4 to node13, and ingest nodes node14 and node15]
‒ machines with high CPU, RAM, and disk resources
‒ machines with low disk, medium RAM, and high CPU resources
‒ perform the gather/reduce phase of search requests
‒ lightens the load on data nodes

[Diagram: node4 says “What am I? Oh, I am a coordinating-only node.”]

node.master: false
node.data: false
node.ingest: false
[Diagram: my_cluster where a Client App connects through coordinating nodes node14 and node15; node2 and node3 are also shown, along with data nodes node4 to node13]
[Diagram: my_cluster where one Client App sends READ requests through coordinating nodes node14 and node15, and another Client App sends WRITE requests through ingest node17; node1 to node3 and the data nodes node4 to node13 are also shown]
‒ Hot nodes
‒ for supporting the indices with new documents being written to
‒ Warm nodes
… frequently
[Diagram: my_cluster with hot_node1, hot_node2, hot_node3, and sample documents]
{
  "volume": 46965,
  "high": 31.56,
  "stock_symbol": "ALL",
  "low": 30.68,
  "close": 30.91,
  "trade_date": "2010-01-15T07:00:00.000Z",
  ...
}
{
  "username" : "kimchy",
  "tweet" : "Search is something that any application should have",
  "tweet_time" : "2010-02-17T23:09:00Z"
}
GET tweets*/_search
{
  "query": {
    "match": {
      "tweet": "elastic"
    }
  }
}
[Diagram: the query shown against hot_node1, hot_node3, and warm_node1 through warm_node6]
… has:
[Diagram: hot_node1 to hot_node3 and warm_node1 to warm_node6]
node.attr.my_temp: hot
… you choose
}
The shards of … on “hot” nodes
… hot nodes
PUT logs-2017-02/_settings
{
  "index.routing.allocation.require.my_temp" : "warm"
}
[Diagram: node1, node2, node3, and node4]
node.attr.my_temp: warm
Put my_index2 on any …

PUT my_index2
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1,
    "index.routing.allocation.include.my_server" : "medium,small",
    "index.routing.allocation.exclude.my_temp" : "hot"
  }
}
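The combined effect of `require`, `include`, and `exclude` filters can be sketched as a predicate over node attributes. This is a simplified model for single-valued attributes, not the actual allocation decider; the function name `node_allowed` and the attribute values assigned to node1 through node4 are hypothetical.

```python
def node_allowed(node_attrs, require=None, include=None, exclude=None):
    """Decide whether an index may be allocated to a node (simplified model)."""
    require, include, exclude = require or {}, include or {}, exclude or {}
    # require: the node must match every listed value.
    for attr, values in require.items():
        if not all(node_attrs.get(attr) == v for v in values.split(",")):
            return False
    # exclude: the node must match none of the listed values.
    for attr, values in exclude.items():
        if node_attrs.get(attr) in values.split(","):
            return False
    # include: the node must match at least one of the listed values.
    if include and not any(node_attrs.get(attr) in values.split(",")
                           for attr, values in include.items()):
        return False
    return True


# Hypothetical attribute assignments for a four-node cluster.
nodes = {
    "node1": {"my_server": "medium", "my_temp": "hot"},
    "node2": {"my_server": "small", "my_temp": "warm"},
    "node3": {"my_server": "large", "my_temp": "warm"},
    "node4": {"my_server": "medium", "my_temp": "warm"},
}
eligible = sorted(name for name, attrs in nodes.items()
                  if node_allowed(attrs,
                                  include={"my_server": "medium,small"},
                                  exclude={"my_temp": "hot"}))
print(eligible)
```

With these made-up attributes, the settings of my_index2 leave only the medium or small nodes that are not hot.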
[Diagram: my_cluster holding my_index; node1 and node3 hold R0 and P1, node2 and node4 hold P0 and R1]
[Diagram: rack1 holds node1 and node2; rack2 holds node3 and node4]
node.attr.my_rack_id=rack1    node.attr.my_rack_id=rack2
… awareness attribute
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "my_rack_id"
  }
}
… awareness
[Diagram: my_cluster holding my_index; node1 and node3 hold P1 and R1, node2 and node4 hold P0 and R0]
[Diagram: my_cluster across rack1 and rack2; node1 and node3 hold P1 and R1, node2 and node4 hold P0 and R0; callout: “Oh no! We … replicas.”]
[Diagram: my_cluster across rack1 and rack2; node1 and node3 hold P1 and R1, node2 and node4 hold P0 and P0; callout: “… replicas.”]
PUT _cluster/settings
{
  "persistent": {
    "cluster": {
      "routing": {
        "allocation.awareness.attributes": "my_rack_id",
        "allocation.awareness.force.my_rack_id.values": "rack1,rack2"
      }
    }
  }
}
(force.my_rack_id) this setting is the name of the attribute
(values) this setting contains the …
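The difference between plain and forced awareness when a whole rack is lost can be sketched with a per-rack copy limit. This is a deliberately simplified model and an assumption about how to reason about it, not the allocator's actual code: each attribute value is allowed at most an even share (rounded up) of the shard copies, and forced awareness keeps the configured values in the count even when no node currently has them. The function name `can_allocate` is hypothetical.

```python
import math


def can_allocate(copies_on_target_rack, total_copies, live_racks, forced_racks=None):
    """Return True if one more copy of a shard may be placed on the target rack."""
    # Forced values count even if no live node carries them.
    known_racks = set(live_racks) | set(forced_racks or [])
    limit = math.ceil(total_copies / len(known_racks))
    return copies_on_target_rack + 1 <= limit


# Scenario: rack2 is down, P0 already lives on rack1, and R0 is looking for a home.
# Total copies of the shard: 1 primary + 1 replica = 2.
print(can_allocate(1, 2, live_racks=["rack1"]))
print(can_allocate(1, 2, live_racks=["rack1"], forced_racks=["rack1", "rack2"]))
```

Without forcing, the replica is rebuilt on the surviving rack; with the forced values `rack1,rack2` it stays unassigned until a node in rack2 is available again, which avoids overloading the surviving rack.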
• You can make Elasticsearch aware of the physical configuration …
4. What is the benefit of forced awareness over simply configuring shard allocation awareness?
node3    server3    dedicated data    warm, rack1
Chapter 6
Capacity Planning
Growing from 1 to 10 … 10 to 100 can be easy…
[Diagram: 1 node cluster; Master Node, Ingest Nodes, Data Nodes]
[Diagram: node1 holding P0]
… scaling for my_index

PUT my_index
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  }
}
[Diagram: node1 holding P0 and P1]
We plan ahead by overallocating my_index

PUT my_index
{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 0
  }
}
[Diagram: node1 and node2; shards shown: P0 P1 P1]
PUT my_index
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 0
  }
}
[Diagram: node1 and node2 holding P0 P1 P2 P3 P4]
‒ optimal depends on the use case
[Diagram: node1 and node2 packed with shards, from P0 up to P500 and P1000]
This is not a good plan
‒ 3 different logs
‒ 5 shards (default)
‒ 6 months retention
[Diagram: index sql-2017.10.01; callout: “too many shards for no …”]
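The shard count implied by these three bullets can be estimated with simple arithmetic. The sketch below assumes one index per log type per day (suggested by the index name sql-2017.10.01) and 30-day months; both are assumptions, as is the function name `primary_shards`, and replicas are not counted.

```python
def primary_shards(log_types, shards_per_index, retention_days):
    """Total primary shards when every log type gets one index per day."""
    return log_types * shards_per_index * retention_days


# 3 different logs, 5 shards per index, roughly 6 months (6 * 30 days) of retention.
print(primary_shards(log_types=3, shards_per_index=5, retention_days=6 * 30))
```

Under these assumptions the cluster ends up with 2700 primary shards before replicas are even considered.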
‒ # of documents
‒ actual documents you are going to index
[Diagram: my_cluster with node1]
[Diagram: my_cluster with node1 holding my_index, shard P0]
… use in production
[Diagram: my_cluster with node1 holding P0]
… “breaks” depends on …
• For searching, the total number of shards is the measurement (agnostic of index)
… primary shards
[Diagram: sample stock documents and node1]
{
  "volume": 46965,
  "high": 31.56,
  "stock_symbol": "ALL",
  "low": 30.68,
  "close": 30.91,
  "trade_date": "2010-01-15T07:00:00.000Z",
  ...
}
… one shard on one node

PUT my_index
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  }
}
[Diagram: node1 holding P0]
PUT my_index/_settings
{
  "number_of_replicas": 1
}
[Diagram: node1 holding P0 and node2 holding R0]
… 50 indices with 1 shard each
PUT my_index
{
  "settings": {
    "number_of_shards": 4
  }
}
[Diagram: node1 and node2 holding P0 P2 P1 P3]

PUT my_index1
{
  "settings": {
    "number_of_shards": 2
  }
}

PUT my_index2
{
  "settings": {
    "number_of_shards": 2
  }
}
[Diagram: node1 and node2 holding P0 P1 and P0 P1]
So does a search over my_index1,my_index2
… grow slowly
‒ time-based data: data that grows rapidly, like log files
PUT hotels
{
  "settings": {
    "number_of_shards": 4,
    "number_of_replicas": 1
  }
}
[Diagram: node1 and node2 hold P0 R2 P1 R3; node3 and node4 hold R0 P2 R1 P3]
PUT hotels/_settings
{
  "number_of_replicas": 2
}
[Diagram: node1 and node2 hold P0 R2 P1 R3; the remaining shard copies shown are R0 R2 R1 R3 R0 P2 R1 P3]
PUT tweets-2017-02-05
{
  "settings": {
    "number_of_shards": 4,
    "number_of_replicas": 1
  }
}
[Diagram: node1 and node2 hold P0 R2 P1 R3; node3 and node4 hold R0 P2 R1 P3]
‒ you do not want indexing to become a bottleneck (it has to keep up)
PUT tweets-2017-02-05
{
  "settings": {
    "number_of_shards": 4,
    "number_of_replicas": 1
  }
}

PUT tweets-2017-02-06
{
  "settings": {
    "number_of_shards": 4,
    "number_of_replicas": 1
  }
}

PUT tweets-2017-02-07
{
  "settings": {
    "number_of_shards": 4,
    "number_of_replicas": 1
  }
}
GET tweets-2017-02*/_search
‒ aliases
If now = 2018-03-22T11:56:22

<logstash-{now/d}>            logstash-2018.03.22
<logstash-{now{YYYY.MM}}>     logstash-2018.03
<logstash-{now/w}>            logstash-2018.03.19

# GET /<logstash-{now/d}>/_search
GET /%3Clogstash-%7Bnow%2Fd%7D%3E/_search
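The three resolved index names and the percent-encoded request path can be reproduced with the Python standard library. This sketch only mimics the results shown above for the given value of now; it is not a date math parser, and the variable names are arbitrary.

```python
from datetime import datetime, timedelta
from urllib.parse import quote

now = datetime(2018, 3, 22, 11, 56, 22)

# now/d rounds down to the day, now{YYYY.MM} formats the month,
# and now/w rounds down to the Monday that starts the week.
daily = now.strftime("logstash-%Y.%m.%d")
monthly = now.strftime("logstash-%Y.%m")
week_start = now - timedelta(days=now.weekday())
weekly = week_start.strftime("logstash-%Y.%m.%d")
print(daily, monthly, weekly)

# Date math characters must be percent-encoded when used in a request URI.
encoded = quote("<logstash-{now/d}>", safe="")
print(f"GET /{encoded}/_search")
```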
POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "<tweets-{now/d}>",
        "alias": "tweets_write"
      }
    },
    {
      "remove": {
        "index": "<tweets-{now/d-1d}>",
        "alias": "tweets_write"
      }
    }
  ]
}
‒ close indices that are no longer being searched
‒ use hot nodes for indexing and warm nodes for querying
… nodes and increasing the number of replicas of your indices
• You can similarly provide scaling by distributing your documents …
… is not a good design.
5. Suppose you calculated the max shard size for your dataset … documents?
Chapter 7
Document Modeling
• The Nested Aggregation
• Parent/Child Relationship
• A flat world has its advantages
‒ Indexing and searching is fast
‒ Nested objects: for working with arrays of objects
‒ Parent/child relationships
‒ no need to perform expensive joins
[Diagram: users and tweets documents; visible fields: "userid" : 1, "state" : "California"]
tweets

PUT tweets/_doc/123
{
  "body" : "My favorite movie is Star Wars",
  "time" : "2017-09-24T02:32:27",
  "user": {
    "userid" : 1,
    "username" : "harrison",
    "city" : "Los Angeles",
    "state" : "California"
  }
}

PUT tweets/_doc/456
{
  "body" : "Laugh it up, fuzzball.",
  "time" : "1980-06-20T00:00:00",
  "user": {
    "userid" : 1,
    "username" : "harrison",
    ...
    "state" : "California"
  }
}
GET tweets/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"body": "movie"}},
        {"match": {"user.username": "harrison"}}
      ]
    }
  }
}

"_source": {
  ...
  "time": "2017-09-24T02:32:27",
  "user": {
    "userid": 1,
    "username": "harrison",
    ...
    "state": "California"
  }
… denormalize a field that does not change, like an _id
[Diagram: users and tweets documents both contain "userid" : 1; callout: “userid” as an authoritative source]
  "filename": "img1.jpg",
  "tags": [
    {"key": "event", "value": "Christmas"},
    {"key": "folder", "value": "December2017"}
  ]
}
A photo can be tagged with any key/value pair…

PUT photos/_doc/2
{
  "filename": "img2.jpg",
  ...
  ]
}
… use an array
        {
          "match": {
            "tags.key": "event"
          }
        },
        {
          "match": {
            "tags.value": "Christmas"
          }
        }
      ]
    }
  }
}
?
GET photos/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "tags.key": "event"
          }
        },
        {
          "match": {
            "tags.value": "Christmas"
          }
        }
      ]
    }
  }
}

"filename": "img1.jpg",
"tags": [
  {"key": "event", "value": "Christmas"},
  {"key": "folder", "value": "December2017"}
]

"filename": "img2.jpg",
"tags": [
  ...
]
{
  "filename" : "img2.jpg",
  ...
}

"mappings": {
  "_doc": {
    "properties": {
      "outer_object": {
        "type": "nested",
        "properties": {
          "inner_field": "TYPE",
          ...
        }
      }
    },
          "type": "nested",
          "properties": {
            "key": {
              "type": "keyword"
            },
            "value": {
              "type": "text"
            }
          }
        }
      }
    }
  }
}
(nested) … maintains the relationship between its nested fields
"_id": 1,
"filename": "img1.jpg",
"tags": ["12", "74"]

// id=12    // id=74
Specify the “path” to the nested object
GET photos/_search
{
  "query": {
    "nested": {
      "path": "tags",
      "query": {
        "bool": {
          "must": [
            {"match": {"tags.key": "event"}},
            {"match": {"tags.value": "Christmas"}}
          ]
        }
      }
    }
  }
}

"filename": "img1.jpg",
"tags": [
]
GET photos/_search
{
  "query": {
    "nested": {
      "path": "tags",
      "query": {
        "match": {
          "tags.value": "Christmas"
        }
      }
    }
  }
}

“img1.jpg”
“img2.jpg”
GET photos/_search
{
  "query": {
    "nested": {
      "path": "tags",
      "query": {
        "match": {
          "tags.value": "Christmas"
        }
      },
      "inner_hits": {}
    }
  }
}

“img1.jpg”   “key”: “event”     “value”: “Christmas”
“img2.jpg”   “key”: “holiday”   “value”: “Christmas”
to a “nested” query
GET photos/_search
{
  "size": 0,
  "aggs": {
    "my_tags": {
      "nested": {
        "path": "tags"
      }
    }
  }
}

"aggregations": {
  "my_tags": {
    "doc_count": 4
  }
}
GET photos/_search
{
  "size": 0,
  "aggs": {
    "my_tags": {
      "nested": {
        "path": "tags"
      },
      "aggs": {
        "tag_terms": {
          "terms": {
            "field": "tags.key",
            "size": 10
          }
        }
      }
    }
  }
}

"buckets": [
  {
    "key": "event",
    "doc_count": 2
  },
  {
    "key": "folder",
    "doc_count": 1
  },
  {
    "key": "holiday",
    "doc_count": 1
  }
]
‒ the parent can be updated without reindexing the children

‒ The parent’s id is used as the routing value for the child document
        "my_join_relation": {
          "type": "join",
          "relations": {
            "company": "employee"
          }
        },
      }
    }
  }
}
join field, but that join field can
This document is a “company” document (“my_join_relation” is the name of the join field)
PUT companies/_doc/c1
{
  "company_name" : "Stark Enterprises",
  "my_join_relation": {
    "name": "company"
  }
}

PUT companies/_doc/c2
{
  "my_join_relation": {
    "name": "company"
  }
}
A child document has to be on the same shard as its parent
This document is an “employee” document (“my_join_relation” is the name of the join field)
PUT companies/_doc/emp1?routing=c1
{
  "first_name" : "Tony",
  "last_name" : "Stark",
  "my_join_relation": {
    "name": "employee",
    "parent": "c1"
  }
}
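Why `routing=c1` puts the child on the parent's shard can be illustrated with a small model. This is a sketch under stated assumptions: the shard is taken to be a hash of the routing value modulo the number of primary shards, the routing value is taken to default to the document id, and `zlib.crc32` is only a stand-in for the hash function Elasticsearch really uses. The shard count of 5 is hypothetical.

```python
import zlib


def shard_for(routing_value, number_of_primary_shards):
    # Stand-in hash for illustration only; Elasticsearch uses its own hash.
    return zlib.crc32(routing_value.encode("utf-8")) % number_of_primary_shards


def shard_for_document(doc_id, number_of_primary_shards, routing=None):
    """Use the explicit routing value if given, otherwise the document id."""
    value = routing if routing is not None else doc_id
    return shard_for(value, number_of_primary_shards)


shards = 5  # hypothetical number of primary shards
parent_shard = shard_for_document("c1", shards)
child_shard = shard_for_document("emp1", shards, routing="c1")

print(parent_shard == child_shard)  # True
```

Because both documents are hashed on the same value (`c1`), they always land on the same shard in this model, whatever hash function is used.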
“Stark Enterprises”
"max_score": 1,
"hits": [
  ...
]
has_parent…
GET my_index/_search
{
  "query": {
    "has_child": {
      "type": "relation_name",
      "query": {}
    }
  }
}
I want all companies who have an employee named “Stark”

GET companies/_search
{
  "query": {
    "has_child": {
      "type": "employee",
      "query": {
        "match": {
          "last_name": "Stark"
        }
      }
    }
  }
}

"hits": [
  {
    "_index": "companies",
    "_type": "_doc",
    "_id": "c1",
    "_score": 1,
    "_source": {
      "company_name": "Stark Enterprises",
      "my_join_relation": {
        "name": "company"
      }
    }
  }
]
    "has_child": {
      "type": "employee",
      "query": {
        "match": {
          "last_name": "Stark"
        }
      },
    }
  }
}

"hits": [
  {
    "_type": "_doc",
    "_id": "emp1",
    "_score": 0.6931472,
    "_routing": "c1",
    "_source": {
      "first_name": "Tony",
      "last_name": "Stark",
      "my_join_relation": {
        "parent": "c1"
      }
    }
  }
]
}
}
}
GET my_index/_search
{
  "query": {
    "has_parent": {
      "parent_type": "relation_name",
      "query": {}
    }
  }
}
the parent relation name
GET companies/_search
{
  "query": {
    "has_parent": {
      "parent_type": "company",
      "query": {
        "match": {
          "company_name": "NBC"
        }
      }
    }
  }
}

"hits": [
  {
    "_index": "companies",
    "_type": "_doc",
    "_id": "emp3",
    "_score": 1,
    "_routing": "c2",
    "_source": {
      "first_name": "Tony",
      "last_name": "Potts",
      "my_join_relation": {
        "name": "employee",
        "parent": "c2"
      }
    }
  }
]
{
  "_index": "companies",
  "_type": "_doc",
  "_id": "emp1",
  "found": false
}

{
  "_index": "companies",
  "_type": "_doc",
  "_id": "emp1",
  "_version": 1,
  "_routing": "c1",
  "found": true,
  "_source": {
    "first_name": "Tony",
    "last_name": "Stark",
    "my_join_relation": {
      "name": "employee",
      "parent": "c1"
    }
  }
}
POST companies/_doc/emp1/_update?routing=c1
{
  "doc" : {
    "first_name" : "Anthony"
  }
}
…and your queries test more than 1 property when matching objects in these arrays: Consider nested docs
unless
Your docs are updated frequently: Consider parent/child
visualize your data in Kibana
• Using a parent/child data type, you can completely
whenever possible

4. True or False: Child objects must be routed to the same
objects to be deleted.
Chapter 8
Monitoring and Alerting
• The Elastic Monitoring Component
• The Monitoring UI
• Alerting
• The above APIs return JSON objects (no surprise)
creates visualizations
GET _cluster/stats

{
  "_nodes": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "cluster_name": "my_cluster",
  "timestamp": 1486701112018,
  "status": "red",
  "indices": {
    "count": 26,
    "shards": {
      "total": 162,
      "primaries": 87,
      "replication": 0.8620689655172413,
      "index": {
        "shards": {
          "min": 2,
          "max": 10,
          "avg": 6.230769230769231
        },
        "primaries": {
          "max": 6,
          "avg": 3.3461538461538463
        },
        "replication": {
GET _nodes/stats
You can specify a list of nodes also

{
  "_nodes": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "cluster_name": "my_cluster",
  "nodes": {
    "OmWJPhToQ0iyNfz-Qd9i8g": {
      "timestamp": 1486701705225,
      "name": "node1",
      "transport_address": "192.168.1.6:9300",
      "host": "192.168.1.6",
      "ip": "192.168.1.6:9300",
      "roles": [
        "master",
        "data"
      ],
      "attributes": {
        "temp": "hot",
        "server_size": "small",
        "zone": "zoneA"
      },
GET my_index/_stats
You can get the stats from all

{
  "_shards": {
    "total": 10,
    "successful": 10,
    "failed": 0
  },
  "_all": {
    "primaries": {
      "docs": {
        "count": 2,
        "deleted": 0
      },
      "store": {
        "size_in_bytes": 7399,
        "throttle_time_in_millis": 0
      },
      "indexing": {
        "index_total": 0,
        "index_time_in_millis": 0,
        "index_current": 0,
        "index_failed": 0,
        "delete_total": 0,
        "delete_time_in_millis": 0,
GET _cluster/pending_tasks

{
  "tasks": [
    {
      "insert_order": 101,
      "priority": "URGENT",
      "source": "create-index [my_index], cause [api]",
      "time_in_queue_millis": 86,
      "time_in_queue": "86ms"
    }
  ]
}
GET _tasks
You can also use this

{
  "nodes": {
    "OmWJPhToQ0iyNfz-Qd9i8g": {
      "name": "node1",
      "transport_address": "192.168.1.6:9300",
      "host": "192.168.1.6",
      "ip": "192.168.1.6:9300",
      "roles": [
        "master",
        "data"
      ],
      "tasks": {
        "OmWJPhToQ0iyNfz-Qd9i8g:37432": {
          "node": "OmWJPhToQ0iyNfz-Qd9i8g",
          "id": 37432,
          "type": "direct",
          "action": "cluster:monitor/tasks/lists[n]",
          "start_time_in_millis": 1486702376488,
          "running_time_in_nanos": 2157349,
          "cancellable": false,
          "parent_task_id": "OmWJPhToQ0iyNfz-Qd9i8g:37431"
        },
GET _cat/nodes
192.168.1.6 32 98 11 1.83 md * node1

GET _cat/nodes?v&h=name,disk.avail,search.query_total,heap.percent
node2 144.4gb 0 26
...
"write": {
  "min": 8,
  "max": 8,
}
...

...
"write": {
  "threads": 8,
  "active": 0,
  "rejected": 0,
  "completed": 177
}
...
node2 index 0 0 0
node2 management 1 0 0
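Rows like the two above are easy to scan programmatically for rejections. The sketch below assumes the five columns are node name, pool name, active, queue and rejected (the column order is an assumption, since the slide shows no header), and the helper names are hypothetical.

```python
def parse_thread_pools(cat_output):
    """Parse whitespace-separated rows: node, pool, active, queue, rejected."""
    rows = []
    for line in cat_output.strip().splitlines():
        node, pool, active, queue, rejected = line.split()
        rows.append({
            "node": node,
            "pool": pool,
            "active": int(active),
            "queue": int(queue),
            "rejected": int(rejected),
        })
    return rows


def pools_with_rejections(rows):
    """Return the rows whose rejected counter is above zero."""
    return [row for row in rows if row["rejected"] > 0]


sample = """
node2 index      0 0 0
node2 management 1 0 0
"""
rows = parse_thread_pools(sample)
print(len(rows))                    # 2
print(pools_with_rejections(rows))  # []
```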
GET _nodes/node1/hot_threads
get hot threads just

Hot threads at 2018-04-24T19:56:36.274Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
thread 'elasticsearch[node1][[timer]]'
org.elasticsearch.threadpool.ThreadPool$CachedTimeThread.run(ThreadPool.java:541)
  "index.indexing.slowlog" : {
    "threshold.index" : {
      "warn" : "10s",
      "info" : "5s",
      "debug" : "2s",
      "trace" : "0s"
    },
    "source" : 1000
  }
}
document’s source will be logged
thresholds
PUT my_index/_settings
{
  "index.search.slowlog": {
    "threshold": {
      "query": {
        "info": "5s"
      },
      "fetch": {
        "info": "800ms"
      }
    },
    "level": "info"
  }
}
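How the warn/info/debug/trace thresholds interact can be sketched in a few lines. This is an illustrative model only, assuming that an operation is logged at the most severe level whose threshold it exceeds; it ignores the separate `level` setting, and the helper names are hypothetical. The threshold values are the ones from the indexing slow log example.

```python
import re

UNITS_IN_MS = {"ms": 1, "s": 1000, "m": 60_000}


def to_millis(value):
    """Convert a time value such as '800ms', '5s' or '1m' to milliseconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m)", value)
    if match is None:
        raise ValueError(f"unsupported time value: {value!r}")
    return int(match.group(1)) * UNITS_IN_MS[match.group(2)]


def slowlog_level(took_ms, thresholds):
    """Return the most severe level whose threshold was exceeded, or None."""
    for level in ("warn", "info", "debug", "trace"):
        if level in thresholds and took_ms > to_millis(thresholds[level]):
            return level
    return None


thresholds = {"warn": "10s", "info": "5s", "debug": "2s", "trace": "0s"}

print(slowlog_level(12_000, thresholds))  # warn
print(slowlog_level(6_000, thresholds))   # info
print(slowlog_level(300, thresholds))     # trace
```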
Enable profiling for this search
GET crimes/_search
{
  "size": 20,
  "profile": true,
  "query": {
    "bool": {
      "filter": {"match": {"incident.description": "handgun"}}
    }
  },
  "aggs": {
    "crimes_with_an_arrest": {
      "aggs": {
        "types_of_handgun_crimes": {
        }
      }
    }
  }
}
      "create_weight": 6393,
      "next_doc": 0,
      "match": 0,
      "create_weight_count": 1,
      "next_doc_count": 0,
      "score_count": 0,
      "build_scorer": 100244,
      "advance": 759117,
      "advance_count": 2037
    }
  },
  {
    "type": "BoostQuery",
    "description": "(ConstantScore(incident.description:handgun))^0.0",
    "time_in_nanos": 1875651,
    "breakdown": {
      "score": 451486,
      ...
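A breakdown like the one above is easier to read once the timing fields are separated from the counters. The sketch below assumes that fields ending in `_count` are invocation counters and the others are timings in nanoseconds; the values are copied from the (incomplete) fragment shown above, and the helper names are hypothetical.

```python
def timing_entries(breakdown):
    """Keep the timing fields and drop the *_count invocation counters."""
    return {k: v for k, v in breakdown.items() if not k.endswith("_count")}


def slowest_phase(breakdown):
    """Return the name of the timing field with the largest value."""
    timings = timing_entries(breakdown)
    return max(timings, key=timings.get)


# Values copied from the breakdown fragment above (the fragment is partial).
breakdown = {
    "create_weight": 6393,
    "next_doc": 0,
    "match": 0,
    "create_weight_count": 1,
    "next_doc_count": 0,
    "score_count": 0,
    "build_scorer": 100244,
    "advance": 759117,
    "advance_count": 2037,
}

phase = slowest_phase(breakdown)
print(phase, breakdown[phase])  # advance 759117
# Average time per advance call, in nanoseconds
print(round(breakdown["advance"] / breakdown["advance_count"]))  # 373
```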
xpack.monitoring.collection.interval
collected. Defaults to 10s
Defaults to 7d
‒ https://www.elastic.co/guide/en/elasticsearch/reference/current/monitoring-settings.html
Dedicated Monitoring Cluster
• Recommend using a dedicated cluster for Monitoring
‒ reduce the load and storage on your other clusters
‒ access to Monitoring even when other clusters are unhealthy
‒ separate security levels from Monitoring and production clusters
[Diagram: Production cluster nodes, each with a Monitoring agent, and a dedicated 1-node monitoring_cluster (node1)]
configure in elasticsearch.yml of each node

xpack.monitoring.exporters:
  id1:
    type: http
    host: ["http://monitoring_cluster:9200"]
    auth.username: username
    auth.password: changeme
[Diagram: prod_cluster_1, prod_cluster_2 and qa_cluster_1 send stats to monitoring_cluster; stats are sent every 10 seconds, converted to JSON and indexed]
[Screenshot: Click here / Turn on monitoring]
[Screenshot callouts: master node; stats update in real-time; details]
‒ https://www.elastic.co/guide/en/elastic-stack-overview/current/xpack-alerting.html
‒ track network activity to detect malicious activity
‒ controls whether the watch actions are executed
• Transform
• Actions
      }
    }
  }
},
"condition": {
},
"actions": {
  "log_error" : {
    "logging" : {
    }
  }
}
}
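To see how the blocks of a watch fit together, the fragment above can be filled in as a Python dictionary and serialized to JSON. This is a hedged sketch, not the watch from the course: the index name `logs`, the `level` field, the 10s interval and the log text are all hypothetical, and the optional transform block is left out.

```python
import json


def build_watch(index, interval, error_level="ERROR"):
    """Assemble a minimal watch body: trigger, input, condition, actions."""
    return {
        # When the watch runs
        "trigger": {"schedule": {"interval": interval}},
        # What data is loaded into the watch payload
        "input": {
            "search": {
                "request": {
                    "indices": [index],
                    "body": {"query": {"match": {"level": error_level}}},
                }
            }
        },
        # Whether the actions are executed
        "condition": {"compare": {"ctx.payload.hits.total": {"gt": 0}}},
        # What happens when the condition is met
        "actions": {
            "log_error": {
                "logging": {"text": "Found {{ctx.payload.hits.total}} errors"}
            }
        },
    }


watch = build_watch("logs", "10s")  # hypothetical index and interval
print(json.dumps(watch, indent=2))
```

The resulting JSON is what would be sent as the body when creating a watch; the sketch only builds and prints it.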
GET .watches/_search
Alerting creates some watches that you can view

‒ You can check the watch history to see execution details:
GET .watcher-history*/_search
{
  "sort" : [
    { "result.execution_time" : "desc" }
  ]
}
‒ watcher_user can view all existing watches

[Screenshot callouts: watch execution; watcher state: firing, error,]

1. create new threshold alert
3. define condition
4. check hits (elasticsearch aggs)
5. define action
6. test action
7. save watch
monitor Elasticsearch

5. Name three of the five watch building blocks.
Watcher UI.
Chapter 9
From Dev to Production
• Cross Cluster Search
• Overview of Upgrades
• Cluster Restart
PUT logs/log/1
{
  "level" : "ERROR",
}
• Or, you can whitelist certain patterns:
PUT _cluster/settings
{
  "persistent": {
    "action.auto_create_index" : ".monitoring-es*,logstash-*"
  }
}
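The effect of such a whitelist can be approximated with simple wildcard matching. This sketch assumes the setting is a comma-separated list of `*` wildcard patterns and that an index may be auto-created when its name matches any of them; it does not model `+` or `-` prefixes, and the function name is hypothetical.

```python
from fnmatch import fnmatchcase


def may_auto_create(index_name, setting):
    """True if the index name matches any comma-separated wildcard pattern."""
    patterns = [p.strip() for p in setting.split(",") if p.strip()]
    return any(fnmatchcase(index_name, pattern) for pattern in patterns)


setting = ".monitoring-es*,logstash-*"

print(may_auto_create("logstash-2019.01.01", setting))  # True
print(may_auto_create("logs", setting))                 # False
```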
Elasticsearch, but not useful for production systems

HTTP: 9200-9299
development.”      environment.”
my_dev_cluster     my_production_cluster
‒ development mode: any bootstrap checks that fail appear as warnings in the Elasticsearch log
‒ https://www.elastic.co/guide/en/elasticsearch/reference/master/bootstrap-checks.html
maximum size virtual memory
disable swapping
maximum number of threads
not use serial collector
file descriptor
server JVM
‒ use separate firewall rules for each kind of traffic
‒ or use a proxy/load-balancer

‒ replicas/software provide HA

• path.data
‒ if you lose one disk, the data on the other disks are preserved
may generate node level watermark issues if disks have different sizes

‒ but, disable concurrent merges
index.merge.scheduler.max_thread_count: 1
‒ https://www.elastic.co/blog/is-your-elasticsearch-trimmed

‒ one Elasticsearch instance can fully consume a machine

• Use shard awareness and forced awareness
    "cluster.routing.allocation.node_concurrent_recoveries": 2
  }
}

• Relocation:
    "cluster.routing.allocation.cluster_concurrent_rebalance" : 2
-Xms30g
-Xmx30g
‒ setting the ES_JAVA_OPTS environment variable
[Diagram: JVM Heap. Labels: completion suggester, cluster state, shard query cache (1%), a query]
‒ set Xmx to no more than 50% of your physical RAM
‒ do not exceed more than 30GB of memory (to not exceed the
‒ https://www.elastic.co/blog/a-heap-of-trouble
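The two sizing rules above combine into a simple calculation. The sketch below is a rule-of-thumb helper, not an official sizing tool: it takes half of the physical RAM, caps it at 30 GB, and prints matching `-Xms`/`-Xmx` options. The function names and the whole-gigabyte rounding are assumptions.

```python
def recommended_heap_gb(physical_ram_gb, ceiling_gb=30):
    """Half of physical RAM, capped at the ceiling, in whole gigabytes."""
    return max(1, int(min(physical_ram_gb / 2, ceiling_gb)))


def jvm_heap_options(physical_ram_gb):
    """Return matching minimum and maximum heap options."""
    heap = recommended_heap_gb(physical_ram_gb)
    return [f"-Xms{heap}g", f"-Xmx{heap}g"]


print(jvm_heap_options(16))   # ['-Xms8g', '-Xmx8g']
print(jvm_heap_options(128))  # ['-Xms30g', '-Xmx30g']
```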
          "range": {
            "publish_date": {
              "gte": 2016,
              "lte": 2018
            }
          }
        }
      ]
}}}
query?
        }
      },
      "filter": {
        "range": {
          "publish_date": {
            "gte": 2016,
            "lte": 2018
          }
        }
      }
    }
subsequent searches
    }
  },
  "aggs": {
    "my_aggs": {
      "range": {
        "field": "runtime_ms",
        "ranges": [
          {
            "from": 0,
            "to": 100
          },
          {
            "from": 100,
            "to": 200
          }]}}}}
requests that returned a 404
"filter": {
  "match": {
    "status_code": 404
  }
},
"aggs": {
  "top_countries": {
    "terms": {
      "field": "geoip.country_name.keyword"
    }
  }
}
‒ and limiting the scope with a query or filter is not a viable option
        "top_countries": {
          "terms": {
            "field": "geoip.country_name.keyword"
          }
        }
      }
    }
  }
}

"aggregations": {
  "my_sample": {
    "doc_count": 1500,
    "top_countries": {
      "doc_count_error_upper_bound": 4,
      "sum_other_doc_count": 251,
      "buckets": [
      ...
table1    index1
table2    index2
table3    index3

table1
table2    index1
table3
• The solution?
      },
      "filter": [
        {
          "script": {
            "script": {
            }
          }
        }
      ]
    }
  }
PUT _ingest/pipeline/comment_length
{
  "processors" : [
    {
      "script": {
        "lang": "painless",
        "source": "ctx.title_length = ctx.title.length();"
      }
    }
  ]
}
using a pipeline)
"query": {                     "query": {
  "regexp": {                    "regexp": {
    "title": "net.*"               "title": ".*work"
  }                              }
}                              }
}                              }
GET blogs/_search
{
  "query": {
  }
}
GET blogs/_search
{
  "query": {
    "regexp": {
      "title.reversed": "krow.*"
    }
  }
}
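The trick behind `title.reversed` is that a pattern with a leading wildcard on the original text becomes a pattern with a trailing wildcard on the reversed text. The sketch below shows the idea with Python's `re` module rather than Elasticsearch; the sample titles and helper names are hypothetical.

```python
import re


def reverse_text(text):
    """Reverse a string, as a reversed sub-field would store it."""
    return text[::-1]


def suffix_to_reversed_prefix(suffix):
    """Turn the pattern '.*<suffix>' into a prefix pattern on reversed text."""
    return re.escape(reverse_text(suffix)) + ".*"


titles = ["network", "framework", "networking"]  # hypothetical terms
pattern = suffix_to_reversed_prefix("work")
print(pattern)  # krow.*

matches = [t for t in titles if re.fullmatch(pattern, reverse_text(t))]
print(matches)  # ['network', 'framework']
```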
[Diagram: a Tribe node with cluster_1, cluster_2 and cluster_3 and their nodes]
A name you assign to the remote cluster (here “germany_cluster”)
PUT _cluster/settings
{
  "persistent": {
    "cluster.remote" : {
      "germany_cluster" : {
        "seeds" : ["my_server:9300","64.33.90.170:9300"]
      }
    }
  }
}
{
  "query": {
    "match": {
      "title": "network"
    }
  }
}
GET blogs,germany_cluster:blogs/_search
{
  "query": {
    "match": {
      "title": "network"
    }
  }
}
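The index expression in the request above mixes a local index with a remote one. As a rough illustration of the `cluster:index` syntax, the sketch below splits such an expression into (cluster, index) pairs, using `None` for the local cluster. It is a simplified parser with a hypothetical function name, not how Elasticsearch resolves expressions internally.

```python
def parse_index_expression(expression):
    """Split 'a,cluster:b' into [(None, 'a'), ('cluster', 'b')]."""
    targets = []
    for part in expression.split(","):
        cluster, separator, index = part.partition(":")
        if separator:
            targets.append((cluster, index))
        else:
            # No cluster prefix: the index lives on the local cluster
            targets.append((None, part))
    return targets


print(parse_index_expression("blogs,germany_cluster:blogs"))
# [(None, 'blogs'), ('germany_cluster', 'blogs')]
```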
1. A search request is sent to node3
2. node3 fetches information about remote indices and shards
the relevant local shards… on the relevant remote shards
4. node3 gets “size” hits from each shard and determines the top hits
5. The top hits are fetched by node3 and returned to the client

[Diagram: Client, my_cluster (node1, node2, node3) and germany_cluster (node1, node2, node3)]
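Steps 4 and 5 amount to a merge of per-shard candidates. The sketch below models only that merge, under the assumption that each shard returns its own top `size` hits and the coordinating node keeps the overall best `size`; the hits and scores are made up for illustration.

```python
import heapq


def top_hits_per_shard(shard_hits, size):
    """Each shard returns at most `size` of its best-scoring hits."""
    return heapq.nlargest(size, shard_hits, key=lambda hit: hit["_score"])


def merge_top_hits(all_shards, size):
    """The coordinating node merges the candidates and keeps the top `size`."""
    candidates = []
    for shard_hits in all_shards:
        candidates.extend(top_hits_per_shard(shard_hits, size))
    return heapq.nlargest(size, candidates, key=lambda hit: hit["_score"])


# Hypothetical hits from one local and one remote shard
local_shard = [{"_id": "a", "_score": 4.5}, {"_id": "b", "_score": 1.2}]
remote_shard = [{"_id": "c", "_score": 3.9}, {"_id": "d", "_score": 5.1}]

top = merge_top_hits([local_shard, remote_shard], size=2)
print([hit["_id"] for hit in top])  # ['d', 'a']
```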
      "title": "Using Nmap + Logstash to Gain Insight Into Your Network",
      ...
    }
  },
  {
    "_index": "blogs",
    "_type": "doc",
    "_id": "Mc1CKmIBCLh5xF6i7Y",
    "_score": 4.561167,
    "_source": {
      ...
    }
  },
GET blogs,*:blogs/_search
{
  "query": {
    "match": {
      "title": "network"
    }
  }
}
‒ 6.x can use indices created in 5.x
‒ ...
indices to be reindexed
can read
‒ a full cluster restart is required (some downtime)
<= 2.x    6.y    Full cluster restart
6.x       6.y    Rolling upgrade
[Diagram: 5.3 to 5.6 by rolling upgrade, 5.6 to 6.x by rolling upgrade; 5.3 to 6.x by full cluster restart]
If upgrading to 6.3 or later, no need to reinstall X-Pack
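The upgrade paths above can be captured in a small lookup. This sketch only encodes the cases visible in this section (6.x and 5.6 can do a rolling upgrade to 6.x, older versions need a full cluster restart) and says nothing about reindexing requirements; the function name and the version strings are hypothetical.

```python
def upgrade_strategy(current_version):
    """Suggest how to reach 6.x from the given version string."""
    major, minor = (int(part) for part in current_version.split(".")[:2])
    if major == 6:
        return "rolling upgrade"
    if major == 5 and minor == 6:
        return "rolling upgrade"
    return "full cluster restart"


print(upgrade_strategy("6.2.4"))   # rolling upgrade
print(upgrade_strategy("5.6.16"))  # rolling upgrade
print(upgrade_strategy("5.3.0"))   # full cluster restart
print(upgrade_strategy("2.4.6"))   # full cluster restart
```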
‒ and we build a list of the steps you should follow
‒ https://www.elastic.co/products/upgrade_guide
• A cluster restart will be part of any list, so let's talk about it...
3. stop and update one node
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.enable" : "none"
  }
}

POST _flush/synced
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.enable" : "all"
  }
}
• then wait for the cluster to be green again:
‒ if green is not possible, check there are no initializing or
GET _cat/health
[Diagram: my_cluster with node1, node2 and node3. Speech bubbles: “I am upgraded. Who is next?” and “I will go next!”]
4. shutdown and update all nodes
downtime is here
• Cross cluster search allows you to search multiple clusters within the same request.

5. True or False: You can search and index documents
a node restart?
Conclusions
Resources
• https://www.elastic.co/learn
‒ https://www.elastic.co/training
‒ https://www.elastic.co/community
‒ https://www.elastic.co/docs
• https://discuss.elastic.co
FOUNDATION
Immersive Learning
Lab-based exercises and knowledge checks to help master new skills
Solution-based Curriculum
Real-world examples and common use cases
Experienced Instructors
everything Elastic
SPECIALIZATIONS
Performance-based Certification
FLEXIBLE SCOPING
Shifts resource as your requirements change
PHASE-BASED PACKAGES
GLOBAL CAPABILITY
Provide expert, trusted services worldwide
Thank you!
Please complete the online survey
Quiz Answers
Chapter 1 Quiz Answers
1. True
2. Field names, term dictionary, term frequency, term proximity, deleted documents, stored fields, normalization factors
3. False! Only call _forcemerge on read-only indices
4. True
5. The response is only returned after the document is searchable (in a segment)
7. Any time that you will not have more writes and would like

4. Set “dynamic” to “strict”
5. “intersect”

query, so all documents are hits
clientip.country_iso_code

and allow users to only execute a few predefined queries
6. True

across “zones” that you define

5. 9 or 10, depending upon if you want some extra buffer room. And that depending upon other requirements it may
shards

5. False
6. True

thread_pool or _cat/thread_pool

6. It greatly speeds up the recovery time of indices that have