How We Improved Our Performance Using ElasticSearch Plugins - Part 1 - by Xiaohu Li - Tinder Tech Blog - Medium
Problems
The Tinder Eng team has recently been working on integrating machine
learning (ML) algorithms into the Tinder recommendation system, which provides users with recommendations that they can then like or pass on using the Swipe Right and Swipe Left features. This recommendation system is discussed in the blog post: Powering Tinder® — The Method Behind Our Matching.
To start with, we came up with several potential options, but they all relied on many more features (or user characteristics) than the algorithms we were using at the time. When we tested these ML algorithms, they were not as fast as the non-ML ones: it took Elasticsearch (ES) much longer to return results given the many features we were querying. Moreover, Painless scripts, which we used for the ML queries, have a hard limit of 16,384 characters (this was changed to a configurable limit at the time of this writing, but it was not the case when we were working on it), and we were closely approaching it. We also noticed that Painless had other issues, such as the lack of static variables or methods, which led to a performance penalty because it forced ES to re-instantiate the same objects over and over.
To solve the character limit issue, we tried to split a large script into multiple smaller scripts, but noticed that the query performance got worse when we did. We were aware of an alternative to Painless scripts: ES plugins, which allow us to install new functionality on the ES side. That way, we could put the functions in Java code and install it instead of using Painless scripts.
However, we could not afford to use the plugin functionality as is, because
each update to the plugin would require a complete cluster restart, which is
not only costly, but also reduces the reliability and operability of our
systems.
Goals
Our goal in this project was to improve the current recommendation system,
so that we could support the ML algorithms without a performance penalty.
In addition, we wanted to be able to iterate on new algorithms often, and have updates be painless.
Solution
Main idea
To overcome the character limit and performance issues, our main idea was
to leverage the speed of the ES recommendation plugin. Since we couldn’t
afford to restart the cluster too often, the second idea was to design a system
that would be able to add and update new matching algorithms without a
mandatory restart.
Architecture
In this section, we provide some background on Java and ES, and how we
leveraged these technologies to build a script management system that can
load matching algorithms at runtime.
Like C, Java is a compiled language, but unlike C, the Java compiler doesn’t
transform code into a binary, but into bytecode instead. This bytecode is
then handled at runtime by the Java Virtual Machine (JVM). The JVM makes it possible to define new classes, or to reload new versions of existing classes, at runtime. This is why Java is called dynamically compiled. The main class responsible for loading and defining classes in the JVM is the ClassLoader. This class can be extended to control the loading logic, and it can be called anywhere in the code to request that a new class be loaded.
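A minimal sketch of such a custom ClassLoader, assuming the class bytes have already been read out of a jar (all names here are illustrative, not the actual implementation):

```java
import java.util.Map;

// Defines classes from raw bytecode, e.g. bytes extracted from jar entries.
public class ScriptClassLoader extends ClassLoader {
    private final Map<String, byte[]> classBytes; // class name -> bytecode

    public ScriptClassLoader(Map<String, byte[]> classBytes, ClassLoader parent) {
        super(parent);
        this.classBytes = classBytes;
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] bytes = classBytes.get(name);
        if (bytes == null) {
            throw new ClassNotFoundException(name);
        }
        // defineClass turns raw bytecode into a Class visible to this loader
        return defineClass(name, bytes, 0, bytes.length);
    }
}
```

Because each loader instance defines its own classes, loading a new jar with a fresh loader instance effectively gives the JVM a new version of the same class name.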
ES Background
Elasticsearch is the indexing system that stores the user documents we use to search and provide recommendations. ES is open source, and the different code versions can be found on GitHub. There are many ways to
query Elasticsearch. One of the simplest ways is to store scripts, or search
algorithms, in Elasticsearch, and then send queries that reference the script.
That way, when the query is interpreted by Elasticsearch, it knows what
algorithm to use to search and return results.
Searches in Elasticsearch happen roughly in two steps: filter and sort. In the
filter step, all the documents that don’t match the filter criteria are excluded
from the results. In the sorting step, all the documents that fit the filter
criteria are assigned a relevance factor, ordered from highest to lowest, and
put in the response to the caller.
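Reduced to plain Java, the two phases look roughly like this (the documents' fields and the relevance function are invented for illustration):

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class TwoPhaseSearch {
    public record Doc(String id, int age, double distanceKm) {}

    // Toy relevance: closer documents score higher (assumption for illustration).
    static double relevance(Doc d) {
        return 1.0 / (1.0 + d.distanceKm());
    }

    // Phase 1 (filter): exclude documents that fail the criteria.
    // Phase 2 (sort): assign each survivor a relevance factor and order
    // from highest to lowest before returning.
    public static List<Doc> search(List<Doc> docs, int minAge) {
        return docs.stream()
                .filter(d -> d.age() >= minAge)
                .sorted(Comparator.comparingDouble(TwoPhaseSearch::relevance).reversed())
                .collect(Collectors.toList());
    }
}
```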
The type of plugin that interested us most was the script engine or script
plugin. This type of plugin allows us to customize the way the relevance
assignment is done for the documents.
In the following paragraphs, we discuss some details of script plugins.
We used Elasticsearch 6.3; the vocabulary, names, and logic can change from
version to version and might not apply to future versions of Elasticsearch.
ScriptEngine Overview
Notes:

@Override
public <T> T compile() {
    …
    scriptAndVersion = fetchScriptAndVersion(scriptSource);
    SearchScript.Factory wrappedFactory = (p, l) -> {
        SearchScript.Factory delegate = factoryCache.getScript(scriptAndVersion);
        return delegate.newFactory(p, l);
    };
    return …
}
Loading a new script is equivalent to loading a java class from a Jar file that
we get from a storage system. If the class already exists, but we need a new
version of it, we overload the class definition with the new class definition.
We needed to write a custom class loader to overload the classes in the
current JVM with their new definition.
For instance, let’s assume the current JVM has the classes MyScript.class (v1) and A.class (v1) defined, from a previous jar. In a new jar, we have MyScript.class (v2), which depends on A.class (v2) and B.class (v2).
When we request MyScript.class from the new jar, the ClassLoader will
check in the new jar for the definition of MyScript.class.
Then, the ClassLoader will overwrite the current definition, same for
A.class, and it will add B.class from the new jar. At the end of the operation,
the JVM will have MyScript.class (v2), A.class (v2) and B.class (v2).
Once we are done loading the script class, we store it in a cache. We store
scripts by name and version in the cache, using the “source” field of the
query to pass the name and the version that we want to compute with. We
used a simple Guava LoadingCache in the script manager. A cache is needed because loading a script from a jar on disk or in remote storage cannot scale to several thousand QPS. Some scripts might get deprecated, or go unused for a long period of time, and the LoadingCache supports custom eviction logic via the CacheBuilder for this purpose.
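The production code uses Guava's LoadingCache; as a rough stdlib stand-in (structure assumed, not the actual plugin code), the idea of loading each script at most once per name-plus-version key looks like this:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Stand-in for the Guava LoadingCache described above: scripts are keyed
// by "name:version" and the expensive load runs at most once per key.
public class ScriptCache {
    private final ConcurrentMap<String, Object> cache = new ConcurrentHashMap<>();
    private int loads = 0; // counts how often the expensive load actually ran

    // In production this would load the class from the jar via the custom
    // ClassLoader; here it just fabricates a value.
    private Object loadScript(String nameAndVersion) {
        loads++;
        return "compiled:" + nameAndVersion;
    }

    public Object get(String name, int version) {
        return cache.computeIfAbsent(name + ":" + version, this::loadScript);
    }

    public int loadCount() { return loads; }
}
```

Guava's CacheBuilder adds the eviction policies (size limits, expire-after-access) mentioned above, which this sketch omits.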
In some cases, running the same code for each document is a waste of resources; instead, we need to run it once per query and reuse the result of that computation when computing the relevance of each document afterwards.
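This once-per-query pattern can be sketched as follows (the class, fields, and scoring math are our illustration, not the actual plugin code):

```java
import java.util.Map;

// Expensive work happens once per query in the constructor; the result is
// then reused for every document that gets scored.
public class QueryScopedScorer {
    private int expensiveCalls = 0;       // how often the setup actually ran
    private final double queryConstant;   // computed once per query

    public QueryScopedScorer(Map<String, Object> params) {
        this.queryConstant = expensivePerQuerySetup(params);
    }

    private double expensivePerQuerySetup(Map<String, Object> params) {
        expensiveCalls++;
        // stand-in for e.g. deserializing model weights out of params
        return ((Number) params.getOrDefault("boost", 1)).doubleValue() * 10.0;
    }

    // Cheap per-document work reuses the per-query result.
    public double score(double docFeature) {
        return queryConstant + docFeature;
    }

    public int expensiveCallCount() { return expensiveCalls; }
}
```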
Since params is a map sent in the JSON format by the ES client, we can
customize behavior by changing the content of params. For instance, if the query contains the param “use_new_algorithm”, we can fork and use a different matching algorithm without coupling ES to a dynamic flag system/manager.
The plugin then uses the correct script version and sorts the results.
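A minimal sketch of this params-based fork (the param name comes from the example above; the two scoring functions are invented for illustration):

```java
import java.util.Map;
import java.util.function.DoubleUnaryOperator;

// Picks a scoring algorithm based on what the client put in params,
// so no dynamic flag system needs to live inside ES itself.
public class AlgorithmSelector {
    static final DoubleUnaryOperator LEGACY = raw -> raw;       // old scoring
    static final DoubleUnaryOperator NEW    = raw -> raw * raw; // new scoring

    public static DoubleUnaryOperator pick(Map<String, Object> params) {
        boolean useNew = Boolean.TRUE.equals(params.get("use_new_algorithm"));
        return useNew ? NEW : LEGACY;
    }
}
```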
Observability
ES is a vital part of the recommendations framework, so it is essential that
the plugin is highly observable. Although ES itself has its own set of system
metrics, there is not a simple way to add our plugin-specific metrics. We use
Prometheus for monitoring our microservices, so it makes sense for easier
operational integration to use it for the plugin as well. For microservices,
each machine hosts a Prometheus server that exposes a “_metrics” endpoint.
An external client, which can access individual machines behind the load balancer, calls this endpoint and aggregates the results. However, we want to
keep ES decoupled from third-party services such as Prometheus, so we
developed a custom solution.
ES already has a set of _cat APIs included for monitoring its system metrics.
For example, if the _cat/nodes API is accessed from any query node, it will
aggregate metrics from all nodes in the cluster using TCP and return the
results. We leveraged this existing pattern by adding our own
_cat/pluginmetrics API using an ActionPlugin, which is used to create
custom APIs on ES. This way, instead of hosting a Prometheus server on each
node and requiring a client to have access to individual nodes, the
Prometheus client can simply use the new pluginmetrics API using the load
balancer endpoint. This API returns a response equivalent to querying each
individual machine in the cluster while maintaining the same format as the
Prometheus server, so it was simple for the operations team to set up the monitoring.
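To illustrate the response shape, here is a sketch that renders plugin counters in the Prometheus text exposition format, the same format the pluginmetrics handler would return (the metric name is invented; this is not the actual handler code):

```java
import java.util.Map;

// Renders counters as Prometheus text exposition format:
// a "# TYPE" line followed by "name value" for each metric.
public class PrometheusFormatter {
    public static String render(Map<String, Long> counters) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : counters.entrySet()) {
            sb.append("# TYPE ").append(e.getKey()).append(" counter\n");
            sb.append(e.getKey()).append(' ').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }
}
```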
Security
We are downloading jar data from the jar storage system. This jar has access
to sensitive data that we store on Elasticsearch. Even if we control the
storage, we must assume the jar might have been tampered with when we
receive it. For security purposes, we implemented 3 steps that allow us to verify that the code in the jar file is from a reliable source:
The jar is signed with a private key that is stored in a key vault
That way, we control the authenticity of the code that will be loaded at
runtime.
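With java.security, the signing and verification steps can be sketched like this (the algorithm choice and structure are our assumptions; in practice the private key never leaves the vault):

```java
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Sign the jar bytes with the vault-held private key; verify the signature
// with the matching public key before loading any class from the jar.
public class JarVerifier {
    public static byte[] sign(byte[] jarBytes, PrivateKey key) throws Exception {
        Signature sig = Signature.getInstance("SHA256withRSA");
        sig.initSign(key);
        sig.update(jarBytes);
        return sig.sign();
    }

    public static boolean verify(byte[] jarBytes, byte[] signature, PublicKey key) throws Exception {
        Signature sig = Signature.getInstance("SHA256withRSA");
        sig.initVerify(key);
        sig.update(jarBytes);
        return sig.verify(signature);
    }
}
```

Any tampering with the jar bytes after signing makes verification fail, so a tampered jar is rejected before its classes are loaded.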
Overall architecture
Rolling out
On the very first release of our plugin, we chose to re-implement the same matching script as we had with Painless, so we could get an apples-to-apples comparison. Since the syntax of Painless is pretty similar to Java, it was straightforward to convert it to native Java code with minor modifications.
Simply by doing so, we saw a solid improvement in latency, from over 500ms to less than 400ms.
To control the quality of our work, we set up two pipelines for our staging
and production environments respectively, and one more test ES cluster. Here is what they look like:
Each time we would like to push a new sorting script, here is the process:
1. Manual staging test: We use the Jenkins staging pipeline to build the jar,
upload to file storage system, and deploy our server side code in staging
env to invoke the newest version of the script. This step is to check for any obvious syntax/loading errors in our new script, make sure it can be executed successfully, and confirm that the actual calculated relevance is as expected.
3. Dark run in production: Once the first 2 steps are done, we now have
high confidence in the correctness of our script. However, the runtime performance, especially latency, is still unclear. To avoid running a script with long latency and hurting our user experience, we set up a dark-run step in prod to send queries from the production server to the production Elasticsearch cluster, with the script loaded, in a fire-and-forget manner. By doing this we are able to collect performance metrics and decide if we should fully roll it out. Usually we want to keep the dark run going for a few days, because some performance issues (e.g. memory leaks) are more likely to be exposed over a longer run.
4. Full cutover: If all the previous steps look good, we slowly dial up the traffic to use the new query script.
Summary
We built a whole new infrastructure to support continuous development and integration of Elasticsearch plugins that is also highly secure and observable. Thanks to it, we are able to apply much more sophisticated matching models at runtime. However, we are not done yet — in part 2 of this blog, we will cover some of the most ingenious ideas our engineers implemented on top of this pipeline, which greatly improved our query performance. Stay tuned.
7 min read · Sep 20, 2019