Elasticsearch: Why Big Systems Need You

Elasticsearch is a distributed, RESTful search and analytics engine for real-time search and analysis of datasets, with use cases including full-text search, structured search, and analytics. It is scalable and schema-free, and it exposes indexing, search, and analytics through REST APIs. It can be installed and started with basic commands and requires the JAVA_HOME variable to be set. Configuration is done through YAML or JSON files, and logging uses Log4j. Sharding and replication improve performance and availability. Mapping defines how fields are indexed and searched, and analyzers preprocess text for searching. Faceting allows aggregation of search results.

Elasticsearch

Why big systems need you


by Trần Kim Hiếu
a Mobile & Web Developer @ Silicon Straits Saigon
Searching is hard and important

Functional requirements

• Find the right things (effectiveness)

Non-functional requirements

• Find the things right (efficiency)


Speed is useless without relevance
Biggest problem: Search is highly subjective
Search by term
Search by ID
Suggestion
Instant suggestion
Highlight
What is elasticsearch?

• distributed RESTful search and analytics

• real-time data

• real-time analytics

• distributed

• document oriented

• conflict management

• schema free

• RESTful API

• per-operation
Installing

• After downloading the latest release and extracting it, elasticsearch can be started using:

$ bin/elasticsearch

• To run it in the foreground:

$ bin/elasticsearch -f

• The $JAVA_HOME environment variable must be set.


Using
Insert

$ curl -X PUT 'http://localhost:9200/products/product/1' \
    -d '{ "name" : "high quality search engine" }'

Searching

$ curl -X POST 'http://localhost:9200/products/product/_search' \
    -d '{ "query" : { "match" : { "name" : "search" } } }'
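The same two calls can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `build_index_request` and `build_search_request` are hypothetical, and the node address is assumed to be the local default used in the curl commands. The requests are only built, not sent, so the sketch runs without a server. Newer Elasticsearch releases may also expect a Content-Type header, which is left out here to mirror the commands on the slide.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: a local node on the default port


def build_index_request(index, doc_type, doc_id, document):
    """Build (but do not send) the PUT request that stores one document."""
    url = f"{BASE_URL}/{index}/{doc_type}/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(url, data=body, method="PUT")


def build_search_request(index, doc_type, field, text):
    """Build (but do not send) a POST request carrying a match query."""
    url = f"{BASE_URL}/{index}/{doc_type}/_search"
    body = json.dumps({"query": {"match": {field: text}}}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


if __name__ == "__main__":
    insert = build_index_request("products", "product", 1,
                                 {"name": "high quality search engine"})
    search = build_search_request("products", "product", "name", "search")
    for request in (insert, search):
        print(request.get_method(), request.full_url, request.data.decode("utf-8"))
    # With a node running, urllib.request.urlopen(request) would send each one.
```

Keeping request construction separate from sending makes it easy to inspect the URL and JSON body before anything touches the cluster.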


Configuration

• config/elasticsearch.yml or config/elasticsearch.json

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-configuration.html

• Related keywords:

• YAML - http://www.yaml.org

• JSON - http://www.json.org
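As a rough illustration of the JSON variant, the sketch below renders a small settings tree as text that could be saved as config/elasticsearch.json. The values are invented for the example, and the setting names (cluster name, node name, shard and replica counts) are assumed from the Elasticsearch documentation of the period rather than taken from the slides. Later releases may not accept index-level defaults in the node configuration file.

```python
import json

# Assumed example values; none of these names or numbers come from the slides.
settings = {
    "cluster": {"name": "demo-cluster"},
    "node": {"name": "demo-node-1"},
    "index": {"number_of_shards": 5, "number_of_replicas": 1},
}


def render_config(config):
    """Serialize nested settings as the text of a JSON configuration file."""
    return json.dumps(config, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(render_config(settings))
```

The YAML file expresses the same tree with indentation instead of braces, so the choice between the two formats is mostly a matter of taste.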
Logging

• Uses the well-known Java logging library Log4j

• Separate logging configuration (simplified Log4j format): config/logging.yml

• Keywords

• Log4j - http://logging.apache.org/log4j
Sharding & Replication

Sharding is index partitioning

• Split logical data into physically smaller parts

• Control data flows

Replication keeps copies of the same data on several machines

• Increases throughput through concurrency

• Allows nodes to fail without data loss

Sharding & Replication (diagram): one row is a replica, one column is a shard.
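A toy model can make the two ideas concrete. Elasticsearch documents its routing rule as a hash of the routing value (by default the document ID) modulo the number of primary shards. The sketch below follows that shape but uses CRC32 as a stand-in hash, so the shard numbers will not match a real cluster. The `place_copies` function is a hypothetical simplification: real shard allocation is decided by the cluster, not by round-robin.

```python
import zlib


def pick_shard(routing_key, number_of_primary_shards):
    """Map a routing key (for example a document ID) to a primary shard number."""
    # Stand-in hash: CRC32 is deterministic across runs, unlike Python's hash().
    digest = zlib.crc32(routing_key.encode("utf-8"))
    return digest % number_of_primary_shards


def place_copies(shard, number_of_replicas, nodes):
    """Spread a shard's primary and replicas over distinct nodes, round-robin."""
    copies = min(number_of_replicas + 1, len(nodes))
    return [nodes[(shard + offset) % len(nodes)] for offset in range(copies)]


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c"]  # assumed node names
    for doc_id in ["1", "2", "3"]:
        shard = pick_shard(doc_id, 5)
        print(doc_id, "-> shard", shard, "on", place_copies(shard, 1, nodes))
```

Because the shard number depends on the shard count, changing the number of primary shards would send existing documents to different shards, which is one reason that count is normally fixed when an index is created.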
Indexing

Import data into elasticsearch


Mapping

• Each JSON field can be mapped to a specific core type.

JSON itself already provides some typing, with its support for:

• string

• integer/long

• float/double

• boolean

• null
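To see how much typing JSON already carries, the following sketch parses a document and guesses a core type name for each field. The helper `infer_core_type` is hypothetical and only loosely imitates what automatic mapping does. It assumes whole numbers are treated as long and fractional numbers as double, the wider of each pair in the list above.

```python
import json


def infer_core_type(value):
    """Guess a core type name for one parsed JSON value (hypothetical helper)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    raise TypeError(f"not a simple JSON value: {value!r}")


def infer_mapping(json_text):
    """Return a field-to-type dictionary for a flat JSON document."""
    document = json.loads(json_text)
    return {field: infer_core_type(value) for field, value in document.items()}


if __name__ == "__main__":
    sample = '{"name": "laptop", "price": 499.5, "stock": 12, "active": true, "note": null}'
    print(infer_mapping(sample))
```

An explicit mapping is still worth writing when the guess is not what you want, for example when a string field should be matched exactly rather than analyzed.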
Mapping

Other types that elasticsearch supports

• core types

• array type

• object type

• root object type

• nested type

• multi field type

• ip type

• geo point type

• geo shape type

• attachment type

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-types.html
Analyzers
How analyzers work

A simple way to understand analyzers
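One simple way to picture an analyzer is as a small pipeline: split the text into tokens, normalize them, and filter out noise words. The sketch below is an assumption-laden toy, not any of Elasticsearch's built-in analyzers. The stop list is invented and the tokenizer is a plain regular expression.

```python
import re

STOPWORDS = {"a", "an", "and", "is", "of", "the", "to"}  # assumed tiny stop list


def analyze(text):
    """Tokenize on non-word characters, lowercase, then drop stopwords."""
    tokens = re.findall(r"\w+", text)
    lowered = [token.lower() for token in tokens]
    return [token for token in lowered if token not in STOPWORDS]


if __name__ == "__main__":
    print(analyze("The High-Quality Search Engine!"))
```

The same pipeline is applied to the text being indexed and, for analyzed queries, to the query text, which is why a search for "search" can match a document containing "Search".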


How indexing works

A simple way to understand indexing
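Indexing can be pictured as building an inverted index: a table from each term to the documents that contain it. The sketch below is a deliberately small model with made-up documents. It splits on whitespace and lowercases, whereas a real index stores the output of the configured analyzer along with positions and other statistics.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercase term to the sorted IDs of documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search(index, term):
    """Look up one term; unknown terms yield an empty result list."""
    return index.get(term.lower(), [])


if __name__ == "__main__":
    docs = {  # assumed sample documents
        1: "high quality search engine",
        2: "fast search",
        3: "quality laptop",
    }
    index = build_inverted_index(docs)
    print(search(index, "search"))
    print(search(index, "quality"))
```

Looking up a term is a dictionary access rather than a scan over every document, which is what makes term search fast on large collections.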


Searching

Different ways of searching

• Search queries

match, term, prefix, id, fuzzy

• Counting only, Geo-based queries

• More like this, Highlighting

• Faceting, Percolation, Scripting

• Suggestions
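The query types listed above differ mainly in the JSON body sent to the _search endpoint. The sketch below builds those bodies as Python dictionaries. The helper function names are hypothetical, and the field names and values are examples only. The "id" entry on the slide is assumed to refer to the ids query.

```python
import json


def match_query(field, text):
    """Analyzed full-text match on one field."""
    return {"query": {"match": {field: text}}}


def term_query(field, value):
    """Exact, non-analyzed term lookup."""
    return {"query": {"term": {field: value}}}


def prefix_query(field, prefix):
    """Terms that start with the given prefix."""
    return {"query": {"prefix": {field: prefix}}}


def fuzzy_query(field, value):
    """Terms within a small edit distance of the value."""
    return {"query": {"fuzzy": {field: value}}}


def ids_query(ids):
    """Documents selected by their IDs."""
    return {"query": {"ids": {"values": list(ids)}}}


if __name__ == "__main__":
    for body in (match_query("name", "search"), term_query("name", "search"),
                 prefix_query("name", "sea"), fuzzy_query("name", "serch"),
                 ids_query(["1", "2"])):
        print(json.dumps(body))
```

Each printed line can be passed as the -d argument of a search request like the one in the "Using" section.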
Faceting

• Faceting allows aggregation of search results

• Term: Group results by a term

• Range: Group by price or date ranges

• Histogram: Group results in equally sized buckets; also available as a date histogram

• Statistical: Include statistical data such as min, max, sum, avg, and more
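What each facet type computes can be sketched locally over a list of search hits. The functions below are illustrative stand-ins with hypothetical names and made-up laptop data. They do not call Elasticsearch and only show the grouping logic behind term, range, histogram, and statistical facets.

```python
from collections import Counter


def term_facet(docs, field):
    """Count hits per distinct value of a field."""
    return dict(Counter(doc[field] for doc in docs))


def range_facet(docs, field, ranges):
    """Count hits per (low, high) range, with low inclusive and high exclusive."""
    return {(low, high): sum(1 for doc in docs if low <= doc[field] < high)
            for low, high in ranges}


def histogram_facet(docs, field, interval):
    """Count hits per equally sized bucket, keyed by the bucket's lower bound."""
    return dict(Counter((doc[field] // interval) * interval for doc in docs))


def statistical_facet(docs, field):
    """Summarize a numeric field over all hits."""
    values = [doc[field] for doc in docs]
    total = sum(values)
    return {"count": len(values), "min": min(values), "max": max(values),
            "total": total, "mean": total / len(values)}


if __name__ == "__main__":
    laptops = [  # assumed sample hits
        {"brand": "Dell", "price": 500},
        {"brand": "Asus", "price": 750},
        {"brand": "Dell", "price": 1200},
    ]
    print(term_facet(laptops, "brand"))
    print(range_facet(laptops, "price", [(0, 600), (600, 1000), (1000, 2000)]))
    print(histogram_facet(laptops, "price", 500))
    print(statistical_facet(laptops, "price"))
```

In a real search the engine computes these counts across all matching documents on every shard, so the client receives only the aggregated buckets.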
Faceting @ Vật Giá: laptop attributes faceting
Pluggable architecture

• Modularized architecture

• Plugins are simple zip files with a predefined layout

• Different plugin use-cases:

  • Lucene features

  • Monitoring

  • Scripting languages

  • Rivers
Clients & integrations

• Many languages are already supported:

Perl, Python, Ruby, PHP, JavaScript, .NET, Scala, Clojure, Erlang

• Many integrations are available:

Grails, Play Framework (1, 2), Spring & Spring Data, Django, Haystack, Catalyst, Node, Mongoose, WordPress, Drupal, Symfony2, CakePHP, Nagios, Munin, collectd, MCollective, Chef
Resources

• Elasticsearch official site - elasticsearch.org

• Introduction: Getting down and dirty with elasticsearch (Clinton Gormley) - http://www.slideshare.net/clintongormley/down-and-dirty-with-elasticsearch

• Explore your data with elasticsearch by Elasticsearch Inc - https://speakerdeck.com/elasticsearch/explore-your-data-with-elasticsearch

• Extending Elasticsearch by Alexander Reelsen - https://speakerdeck.com/spinscale/extending-elasticsearch
Thank you

Q&A
