Solr in Depth
Edustill 2023
Getting Started
● What is a search engine?
● Information retrieval
● Full text search
● How does a search engine work?
○ Crawling
○ Indexing
○ Searching
○ Ranking
Introduction to Solr
● What is Solr?
● Features
● Lucene
● Architecture
● SolrCloud
Starting Solr
● Solr in Standalone mode
● Solr in SolrCloud mode
○ Role of ZooKeeper
○ External ZooKeeper
○ ZooKeeper ensemble
● Configuring for production
○ Different directories in Solr: Solr home, data dir
○ Starting Solr with a custom data dir
○ Starting ZooKeeper with a custom data dir
Concepts of collection, schema, and documents
● Collection
● Schema
● Field
● Document
● Shard
● Replica
● Segment
Creating a collection
● Creating a collection using the Admin UI
● Creating a collection using the command line
● Collections API
Schema design
● Concepts
○ Solr types
○ Solr Field types
○ Analyzers
○ Tokenizers
○ Filters
● Designing Schema
○ Selecting tokenizer
○ Selecting filters
○ Stemming
● Analysis tab
● Use of tokenizers
○ Standard tokenizer
○ Whitespace tokenizer
○ Word delimiter graph filter
● Copy field
● Dynamic fields
Querying Solr
● Select handler
● Query handler
● Basic query syntax
○ Query
○ Solr scoring
■ Factors affecting score
○ Filter query
○ Default field
● Advanced Query
○ Proximity search
○ Fuzzy search
○ Boolean operators
○ Dismax and edismax
○ Query parsers
○ Boosting
○ Phonetic search
● Query parsers and their use
○ Edismax
○ Surround
○ Terms
○ Constant score
○ Combining multiple query parsers
● Search relevance tuning
○ Field boost
○ Proximity boost
○ Minimum match
○ Edismax recipes
● Tagger handler
● Caveats
○ Phrase match when position length changes
○ Surround and terms query parsers
Query assistance
● Autosuggestions
○ Configuring autosuggest
○ Tuning autosuggest
● Spellchecker
○ Configuration
Customizing Results
● Document fields
○ Field lists
○ Field aliasing
○ Pagination
○ Deep pagination
○ Fielded search
○ Document transformer
● Highlighting
○ Highlighter
○ Caveats
Faceting
● Simple faceting
● JSON faceting
● How faceting works
● Faceting caveats
○ Sparse distribution
○ Refine and overrefine
○ Unique count
■ Getting a correct unique count in SolrCloud mode
○ Different types of facets
● Advanced faceting
○ Nested facets
○ Date range
○ Query facet
○ Change domain
● Stats component
○ Getting unique count using stats
Export
● Export handler
● Streaming
● Query performance tips
● Highlighting, faceting, and export performance
Deleting documents
● Internals
● Segment merging
Managing Solr
● Managing config
● Logging
● Backup
● Collection aliasing
Security
● HTTPS
● Authorization and authentication
● Securing ZooKeeper