Threshold Optimization

After setting up SemanticRouter or SemanticCache it's best to tune the distance_threshold to get the most performance out of your system. RedisVL provides helper classes to make this light weight optimization easy.

Note: Threshold optimization relies on python > 3.9.

CacheThresholdOptimizer

Let's say you setup the following semantic cache with a distance_threshold of X and store the entries:

prompt: what is the capital of france? response: paris
prompt: what is the capital of morocco? response: rabat

from redisvl.extensions.llmcache import SemanticCache

sem_cache = SemanticCache(
    name="sem_cache",                    # underlying search index name
    redis_url="redis://localhost:6379",  # redis connection url string
    distance_threshold=0.5               # semantic cache distance threshold
)

paris_key = sem_cache.store(prompt="what is the capital of france?", response="paris")
rabat_key = sem_cache.store(prompt="what is the capital of morocco?", response="rabat")

This works well but we want to make sure the cache only applies for the appropriate questions. If we test the cache with a question we don't want a response to we see that the current distance_threshold is too high.

sem_cache.check("what's the capital of britain?")

[{'entry_id': 'c990cc06e5e77570e5f03360426d2b7f947cbb5a67daa8af8164bfe0b3e24fe3',
  'prompt': 'what is the capital of france?',
  'response': 'paris',
  'vector_distance': 0.421104669571,
  'inserted_at': 1741039231.99,
  'updated_at': 1741039231.99,
  'key': 'sem_cache:c990cc06e5e77570e5f03360426d2b7f947cbb5a67daa8af8164bfe0b3e24fe3'}]

Define test_data and optimize

With the CacheThresholdOptimizer you can quickly tune the distance threshold by providing some test data in the form:

[
    {
        "query": "What's the capital of Britain?",
        "query_match": ""
    },
    {
        "query": "What's the capital of France??",
        "query_match": paris_key
    },
    {
        "query": "What's the capital city of Morocco?",
        "query_match": rabat_key
    },
]

The threshold optimizer will then efficiently execute and score different threshold against the what is currently populated in your cache and automatically update the threshold of the cache to the best setting

from redisvl.utils.optimize import CacheThresholdOptimizer

test_data = [
    {
        "query": "What's the capital of Britain?",
        "query_match": ""
    },
    {
        "query": "What's the capital of France??",
        "query_match": paris_key
    },
    {
        "query": "What's the capital city of Morocco?",
        "query_match": rabat_key
    },
]

print(f"Distance threshold before: {sem_cache.distance_threshold} \n")
optimizer = CacheThresholdOptimizer(sem_cache, test_data)
optimizer.optimize()
print(f"Distance threshold after: {sem_cache.distance_threshold} \n")

Distance threshold before: 0.5 

Distance threshold after: 0.13050847457627118

We can also see that we no longer match on the incorrect example:

sem_cache.check("what's the capital of britain?")

[]

But still match on highly relevant prompts:

sem_cache.check("what's the capital city of france?")

[{'entry_id': 'c990cc06e5e77570e5f03360426d2b7f947cbb5a67daa8af8164bfe0b3e24fe3',
  'prompt': 'what is the capital of france?',
  'response': 'paris',
  'vector_distance': 0.0835866332054,
  'inserted_at': 1741039231.99,
  'updated_at': 1741039231.99,
  'key': 'sem_cache:c990cc06e5e77570e5f03360426d2b7f947cbb5a67daa8af8164bfe0b3e24fe3'}]

RouterThresholdOptimizer

Very similar to the caching case, you can optimize your router.

Define the routes

from redisvl.extensions.router import Route

routes = [
        Route(
            name="greeting",
            references=["hello", "hi"],
            metadata={"type": "greeting"},
            distance_threshold=0.5,
        ),
        Route(
            name="farewell",
            references=["bye", "goodbye"],
            metadata={"type": "farewell"},
            distance_threshold=0.5,
        ),
    ]

Initialize the SemanticRouter

import os
from redisvl.extensions.router import SemanticRouter
from redisvl.utils.vectorize import HFTextVectorizer

os.environ["TOKENIZERS_PARALLELISM"] = "false"

# Initialize the SemanticRouter
router = SemanticRouter(
    name="greeting-router",
    vectorizer=HFTextVectorizer(),
    routes=routes,
    redis_url="redis://localhost:6379",
    overwrite=True # Blow away any other routing index with this name
)

Provide test_data

test_data = [
    # Greetings
    {"query": "hello", "query_match": "greeting"},
    {"query": "hi", "query_match": "greeting"},
    {"query": "hey", "query_match": "greeting"},
    {"query": "greetings", "query_match": "greeting"},
    {"query": "good morning", "query_match": "greeting"},
    {"query": "good afternoon", "query_match": "greeting"},
    {"query": "good evening", "query_match": "greeting"},
    {"query": "howdy", "query_match": "greeting"},
    {"query": "what's up", "query_match": "greeting"},
    {"query": "yo", "query_match": "greeting"},
    {"query": "hiya", "query_match": "greeting"},
    {"query": "salutations", "query_match": "greeting"},
    {"query": "how's it going", "query_match": "greeting"},
    {"query": "how are you", "query_match": "greeting"},
    {"query": "nice to meet you", "query_match": "greeting"},
    # Farewells
    {"query": "goodbye", "query_match": "farewell"},
    {"query": "bye", "query_match": "farewell"},
    {"query": "see you later", "query_match": "farewell"},
    {"query": "take care", "query_match": "farewell"},
    {"query": "farewell", "query_match": "farewell"},
    {"query": "have a good day", "query_match": "farewell"},
    {"query": "see you soon", "query_match": "farewell"},
    {"query": "catch you later", "query_match": "farewell"},
    {"query": "so long", "query_match": "farewell"},
    {"query": "peace out", "query_match": "farewell"},
    {"query": "later", "query_match": "farewell"},
    {"query": "all the best", "query_match": "farewell"},
    {"query": "take it easy", "query_match": "farewell"},
    {"query": "have a good one", "query_match": "farewell"},
    {"query": "cheerio", "query_match": "farewell"},
    # Null matches
    {"query": "what's the capital of britain?", "query_match": ""},
    {"query": "what does laffy taffy taste like?", "query_match": ""},
]

Optimize

Note: by default route distance threshold optimization will use a random search to find the best threshold since, unlike caching, there are many thresholds to optimize concurrently.

from redisvl.utils.optimize import RouterThresholdOptimizer

print(f"Route thresholds before: {router.route_thresholds} \n")
optimizer = RouterThresholdOptimizer(router, test_data)
optimizer.optimize()

Route thresholds before: {'greeting': 0.5, 'farewell': 0.5} 

Eval metric F1: start 0.438, end 0.719 
Ending thresholds: {'greeting': 1.0858585858585856, 'farewell': 0.5545454545454545}

Test it out

# Query the router with a statement
route_match = router("hi there")
route_match

RouteMatch(name='greeting', distance=0.295984119177)

Cleanup

router.delete()
sem_cache.delete()

Products

Tools

Get Redis

Connect

Learn

Latest

See how it works