Design Key-Value Database + Real Usecase

Key-value databases are commonly used for caching, real-time recommendations, and queueing systems due to their fast read/write capabilities. The document discusses designing keys, structuring values, limitations of key-value databases, and common design patterns including time to live, emulating tables, aggregates, atomic aggregates, enumerable keys, and indexes. It also provides examples of when to use key-value databases such as for user sessions in web applications or powering recommendations and advertising.

Design Key-value database

Nguyen Huu Cam


Table of contents
• Key design and partitioning

• Designing structured values


• Limitations of key-value databases

• Design patterns for key-value databases

• Use cases and when not to use a key-value database
1. Key design and partitioning
• How you design your keys affects how easy your key-value database is to work
with.

• Avoid meaningless names, such as ‘laklsjfdjjd’

• Keys should have a logical structure that makes code more readable and extensible,
but they should also be designed with storage efficiency in mind
General guidelines for naming keys
• Use meaningful and unambiguous naming components

• Use range-based components when you would like to retrieve ranges of values

• Use a common delimiter when appending components to make a key. Normally, “:”
is used as the delimiter

• Keep keys as short as possible without sacrificing the other characteristics mentioned
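The guidelines above can be sketched as a small key-building helper. This is only an illustration: the function name `make_key` and the `cust` entity with its identifier are hypothetical and do not come from the slides.

```python
def make_key(*components: object, delimiter: str = ":") -> str:
    """Join naming components (entity, identifier, attribute, ...) into one key."""
    parts = [str(component) for component in components]
    for part in parts:
        # An empty part, or one that contains the delimiter, would make the key ambiguous.
        if not part or delimiter in part:
            raise ValueError(f"invalid key component: {part!r}")
    return delimiter.join(parts)


name_key = make_key("cust", 12387, "name")      # hypothetical customer key
log_key = make_key("ticketLog", "20140617", 3)  # same shape as the ticketLog keys used later
```

Building every key through one helper keeps the delimiter and component order consistent across the application.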


Keys Must Take into Account Implementation
Limitations
• Different key-value databases have different limitations. Consider those limitations
when choosing your key-value database

• Some key-value databases restrict the size of keys. For example, FoundationDB limits the
size of keys to 10,000 bytes
Valid datatypes in Redis
Redis datatypes
• The variety of data types supported by Redis allows you more flexibility when
creating keys

• Redis keys can be up to 512MB in length


How keys are used in partitioning
• Keys are used in range partitioning

• Range partitioning works by grouping contiguous values and sending them to the
same node in a cluster

• A sort order must be defined over the keys. For example, you can partition by customer
number, date, or part identifier.
Range partitioning example
If you decide to use range partitioning, carefully consider how your data volumes may grow. If you need to
restructure your partitioning scheme, some keys may be reassigned to different nodes and data will have to
migrate between nodes.
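A minimal sketch of range partitioning, assuming zero-padded customer numbers so that string order matches numeric order. The boundaries and node names are hypothetical; a real cluster would manage them itself.

```python
from bisect import bisect_right

# Hypothetical split points: keys below "cust:3000" go to node-1,
# keys from "cust:3000" up to (not including) "cust:6000" go to node-2, the rest to node-3.
BOUNDARIES = ["cust:3000", "cust:6000"]
NODES = ["node-1", "node-2", "node-3"]


def node_for_key(key: str) -> str:
    """Return the node that owns the contiguous key range containing this key."""
    return NODES[bisect_right(BOUNDARIES, key)]


owner_low = node_for_key("cust:0042")
owner_mid = node_for_key("cust:3000")
owner_high = node_for_key("cust:9999")
```

Changing `BOUNDARIES` later reassigns some keys to other nodes, which is the data migration cost described above.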
2. Designing structured values
• Store frequently used values in memory. This is limited by the amount of memory
allocated to caching

• Store commonly used attribute values together. For example, storing a customer's
name and address in one value reduces the number of disk seeks needed to read all the
required data, compared with storing the name and address separately.
Designing structured values
• If the customer name is used far more often than the address, also store the name
under its own key. This duplicates the customer name in your key-value database, but
that should not be considered a problem.

• The application can then use the “name” key when only that attribute is needed, and
the keys for both “name” and “address” when both are needed.
Large values can lead to inefficient Read/Write
Operations

• Using structured data types such as lists and sets can improve overall efficiency of
applications

• It is important to also consider how increasing the size of a value can adversely
impact read and write operations

• Consider the following data structure (do not do this in your database): a single
value stored under the key ordID:781379 that holds both the customer information
and the list of cart items.
How this structure is built
• When the customer adds her first item to her cart, the list is created and the
customer name and address are copied from the customer database.

• An order array is created and a list with the item identifier, quantity, description, and
price is added to the array. The key value is hashed and the entire data structure is
written to disk.

• The customer then adds another item to the cart, and a new entry is added to the
array of ordered items. Because the value is treated as an atomic unit, the entire list
(for example, customer information and ordered items) is written to the disk again.

• This process continues for each of the additional items.
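The cost of rewriting the whole value for each added item can be estimated with a short sketch. It assumes JSON serialization and hypothetical customer and item data; real storage engines will differ in the details.

```python
import json


def size(obj) -> int:
    """Bytes needed to write obj as UTF-8 encoded JSON."""
    return len(json.dumps(obj).encode("utf-8"))


def bytes_written_single_value(customer, items) -> int:
    """Whole order (customer info + cart so far) is rewritten after every added item."""
    total, cart = 0, []
    for item in items:
        cart.append(item)
        total += size({"customer": customer, "items": cart})
    return total


def bytes_written_separate_keys(customer, items) -> int:
    """Customer info is written once; each item is written once under its own key."""
    return size(customer) + sum(size(item) for item in items)


customer = {"name": "Jane Doe", "address": "1 Main St, Springfield"}  # hypothetical
items = [{"id": n, "qty": 1, "descr": "item", "price": 9.99} for n in range(1, 4)]

single = bytes_written_single_value(customer, items)
separate = bytes_written_separate_keys(customer, items)
```

The gap between `single` and `separate` grows with every item added, because the single-value design keeps rewriting data that has not changed.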


3. Limitations of key-value databases
• The only way to look up a value is by its key. If you need to find a record by
something stored inside the value, a key-only lookup is inconvenient.

• Some key-value databases do not support range queries, for example searching for
data within a date range.

• There is no standard query language comparable to SQL. Typically only basic
operations such as GET and SET are available.
3. Design patterns for key-value databases
• Several patterns are worth considering when designing for a key-value database:

• Time to Live (TTL)
• Emulating tables
• Aggregates
• Atomic aggregates
• Enumerable keys
• Indexes
3.1. Time to live
• Time to live (TTL) describes a temporary object that has a start time and an
expiration time.

• It is used when caching data on a server with limited memory, or when keys hold a
resource for some specified period of time

• When the current time exceeds the expiration time, the data is removed, freeing
memory to store something else.
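The TTL behaviour described above can be sketched with an in-memory dictionary. The class `TTLStore`, the injected clock, and the key name are hypothetical and only illustrate the idea; stores such as Redis offer expiration natively (for example through the EXPIRE command).

```python
import time


class TTLStore:
    """In-memory store whose entries expire after a given number of seconds."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be demonstrated without sleeping
        self._data = {}      # key -> (value, expires_at)

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazy expiration: the entry is removed when it is next read
            return default
        return value


now = [0.0]                                  # fake clock for the demonstration
store = TTLStore(clock=lambda: now[0])
store.set("hold:seat:J38", "cust:42", ttl_seconds=300)  # hypothetical seat hold
before_expiry = store.get("hold:seat:J38")
now[0] = 301.0
after_expiry = store.get("hold:seat:J38")
```

This sketch only evicts on read; a real cache would also reclaim memory for keys that are never read again.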
3.2. Emulating tables (table simulation)
• Implement a table-like data structure using only two methods, SET and GET

• Emulating tables streamlines the getting and setting of multiple attributes related to
a single instance of an entity, but should not be overused

• Frequent use of emulating tables can indicate a misuse of a key-value database

• It is helpful when you routinely get or set a related set of attributes. This pattern is
useful when you are dealing with a small number of emulated tables.
Sample implementation
[Figure: SET and GET functions]
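As a sketch of the emulated-table pattern, the two functions below set and get a related group of attributes for one entity instance. The `cust` entity, its attribute names, and the sample data are assumptions, and a plain dictionary stands in for the key-value store.

```python
CUSTOMER_ATTRS = ("name", "street", "city")  # hypothetical "columns" of the emulated table


def set_customer(store, cust_id, name, street, city):
    """Write one emulated row: one key per attribute, all sharing the same prefix."""
    values = {"name": name, "street": street, "city": city}
    for attr in CUSTOMER_ATTRS:
        store[f"cust:{cust_id}:{attr}"] = values[attr]


def get_customer(store, cust_id):
    """Read the emulated row back as a dictionary of attribute -> value."""
    return {attr: store.get(f"cust:{cust_id}:{attr}") for attr in CUSTOMER_ATTRS}


store = {}
set_customer(store, 12387, "Jane Doe", "1 Main St", "Springfield")
row = get_customer(store, 12387)
```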
3.3. Aggregate
• Aggregation is a pattern that supports different attributes for different subtypes of an
entity

• In a relational database, you can handle subtypes in a couple of different ways. You
could create a single table with all attributes across all subtypes.

• You could also create a table with the attributes common to all subtypes and then
create an additional table for each of the subtypes.
Example data
Design in a relational database
Possible design: a separate value structure for each subtype (stadium, small venue,
festival)
• Each of these ticket types can be stored under a single namespace, such as ConcertApp.
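A minimal sketch of the aggregate pattern, with one dictionary standing in for the ConcertApp namespace. The ticket keys, the sample values, and most attribute names (`conDate`, `startDate`, `endDate`, the "general admission" wording) are hypothetical; only `locDescr` and `assgnSeat` appear elsewhere in the slides.

```python
# Each subtype stores only the attributes it needs, all in one namespace.
concert_app = {
    "ticket:1001": {"type": "stadium", "conDate": "20140617",
                    "locDescr": "Springfield Civic Center", "assgnSeat": "J38"},
    "ticket:1002": {"type": "small venue", "conDate": "20140618",
                    "locDescr": "Corner Club"},
    "ticket:1003": {"type": "festival", "startDate": "20140620",
                    "endDate": "20140622", "locDescr": "River Park"},
}


def describe(ticket):
    """Format a ticket using the attributes that exist for its subtype."""
    kind = ticket["type"]
    if kind == "stadium":
        return f"{ticket['locDescr']} on {ticket['conDate']}, seat {ticket['assgnSeat']}"
    if kind == "festival":
        return f"{ticket['locDescr']} from {ticket['startDate']} to {ticket['endDate']}"
    return f"{ticket['locDescr']} on {ticket['conDate']}, general admission"


stadium_text = describe(concert_app["ticket:1001"])
```

The application code, not the database, decides which attributes to expect for each subtype.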
3.4. Atomic Aggregate
• Atomic aggregates contain all values that must be updated together or not at all.

• The atomic aggregate pattern uses a single assignment statement to save multiple values

• For example, if the concert ticket application logs a record each time a stadium ticket is
purchased, it should record the date, location, and seat assignment.

• A single assignment saves all three values or none at all. Critical data would be missing if
only one attribute were saved.
Atomic Aggregate (2)
• If each attribute is logged separately, some attributes may be updated while others are not

• The best approach is to store all the information under a single key

• For example, if the server writing this data to disk failed after writing the locDescr attribute
but before writing the assgnSeat attribute, you would lose a critical piece of data.
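The failure scenario above can be simulated in a few lines. This is a sketch under stated assumptions: a dictionary stands in for the store, `SimulatedCrash` is a made-up exception, and `conDate` is a hypothetical attribute name alongside the `locDescr` and `assgnSeat` attributes named above.

```python
class SimulatedCrash(Exception):
    """Stands in for a server failure part-way through a series of writes."""


def log_separately(store, ticket_id, fields, crash_after):
    """One write per attribute; a crash can leave only some attributes saved."""
    for written, (name, value) in enumerate(fields.items()):
        if written == crash_after:
            raise SimulatedCrash(f"failed before writing {name}")
        store[f"ticketLog:{ticket_id}:{name}"] = value


def log_atomically(store, ticket_id, fields):
    """Single assignment: all attributes are saved together under one key."""
    store[f"ticketLog:{ticket_id}"] = dict(fields)


fields = {"conDate": "20140617", "locDescr": "Springfield Civic Center", "assgnSeat": "J38"}

partial = {}
try:
    log_separately(partial, 1, fields, crash_after=2)
except SimulatedCrash:
    pass  # partial now holds conDate and locDescr, but no assgnSeat

whole = {}
log_atomically(whole, 1, fields)
```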
3.5. Enumerable Keys
• Keys that use counters or sequences to generate new keys

• This on its own would not be too useful; however, when combined with other
attributes, this can be helpful when working with groups of keys
Example
• The entity name “ticketLog” plus a counter is used as the key

• The counter starts at 1 and increases each time a ticket is sold. For example:
'ticketLog:20140617:1', 'ticketLog:20140617:2', 'ticketLog:20140617:3'

• However, for ranges of values, such as from one date to another, this key naming
convention does not work well on its own.
Example (2)
• If the day is required for logging, then a key format that combines the date and the
counter is acceptable

• 20140617: sold on June 17th, 2014

• 10: the tenth ticket

• A range of tickets can be retrieved by generating a series of keys, for example by
incrementing the counter until the specified number of keys is reached
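Generating a series of keys and testing for their existence might look like the sketch below. The helper names and the stored values are hypothetical; the key format follows the ticketLog examples above.

```python
def ticket_log_keys(date, first, last):
    """Generate the keys for tickets number first..last sold on the given date."""
    return [f"ticketLog:{date}:{n}" for n in range(first, last + 1)]


def fetch_existing(store, keys):
    """Return the values for the generated keys that actually exist in the store."""
    return [store[key] for key in keys if key in store]


# Three tickets were sold on 2014-06-17 (hypothetical values).
store = {f"ticketLog:20140617:{n}": f"sale #{n}" for n in range(1, 4)}

found = fetch_existing(store, ticket_log_keys("20140617", 2, 5))
```

Keys 4 and 5 are generated but do not exist, so they are skipped. This is the crude range functionality that enumerable keys provide.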
3.6. Index
• Inverted indexes are sets of key-value pairs that allow for looking up keys or values by
other attribute values of the same entity

• For example

• This is useful for tracking all seats assigned across concerts, but it is not easy to list
only seats assigned in a particular location unless your key-value database provides
search capabilities
Perform searching
• To do that, write a function that searches by key and, when an entry matches, appends
it to the end of a list and returns the result

• When the function is first called, it would set the value of
ConcertApp['Springfield Civic Center'] to {'J38'}, and so on for the other values
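One possible shape for such an index is sketched below, assuming a dictionary of sets keyed by location. The function names `add_seat` and `seats_at` are hypothetical and are not the function shown on the original slide.

```python
def add_seat(index, location, seat):
    """Record a seat under its location, creating the set on first use."""
    index.setdefault(location, set()).add(seat)


def seats_at(index, location):
    """List the seats assigned at one location, in sorted order."""
    return sorted(index.get(location, set()))


concert_app_index = {}
add_seat(concert_app_index, "Springfield Civic Center", "J38")
after_first_call = set(concert_app_index["Springfield Civic Center"])

add_seat(concert_app_index, "Springfield Civic Center", "A12")
springfield_seats = seats_at(concert_app_index, "Springfield Civic Center")
```

The application must keep this index up to date on every write, because the key-value store does not maintain it.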
When to use design patterns
• The Time to Live pattern is useful when you have operations that may be disrupted
and can be safely ignored after some period of inactivity or inability to finish the
operation

• Emulating tables streamlines the getting and setting of multiple attributes related to
a single instance of an entity, but should not be overused. Frequent use of emulating
tables can indicate a misuse of a key-value database

• Aggregates provide a means for working with entities that need to manage subtypes
and different attributes associated with each subtype.
When to use design patterns (2)
• The atomic aggregate pattern is used when you have multiple attributes that should
be set together

• Enumerable keys provide a crude range functionality by allowing a program to
generate and test for the existence of keys

• Indexes allow you to look up attribute values starting with something other than a
key
5. Common use cases
• Web applications may store user session details and preferences in a key-value store.
All the information is accessible via the user key, and key-value stores lend themselves
to fast reads and writes.
• Real-time recommendations and advertising are often powered by key-value stores
because the stores can quickly access and present new recommendations or ads as a
web visitor moves throughout a site.
• On the technical side, key-value stores are commonly used for in-memory data
caching to speed up applications by minimizing reads and writes to slower disk-based
systems.
Common use cases (2)
• Queues: any application that deals with traffic congestion, messaging, data
gathering, job management, or packet routing

• IP address tracking: Redis Sets are a great tool for developers who want to analyze all
of the IP addresses that visited a specific website page or blog post

• Scoreboards: Redis Sorted Sets can maintain high score lists, because scores can be
repeated, but the strings that contain the unique user details cannot.
5. Applications using Redis
• Handling user information

• Handling user sessions

• Supporting personalization
Handling user session
• Generally, every web session is unique and is assigned a unique sessionId value after
the user successfully logs in.

• Applications that store the sessionId on disk or in an RDBMS will greatly benefit from
moving to a key-value DB, since everything about the session can be stored with a single
PUT request or retrieved with a single GET.

• This single-request operation makes it very fast, because everything about the session
is stored in a single object under one key
Example session storage
• Store the sessionId in the key
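A minimal sketch of session storage, assuming a dictionary in place of the key-value store and JSON as the serialization format. The `session:` key prefix and the session fields are hypothetical.

```python
import json
import uuid


def create_session(store, user_id):
    """Create a session and save everything about it with a single write."""
    session_id = uuid.uuid4().hex
    session = {"userId": user_id, "cart": []}  # hypothetical session contents
    store[f"session:{session_id}"] = json.dumps(session)
    return session_id


def load_session(store, session_id):
    """Fetch the whole session with a single read, or None if it does not exist."""
    raw = store.get(f"session:{session_id}")
    return json.loads(raw) if raw is not None else None


store = {}
session_id = create_session(store, user_id=42)
session = load_session(store, session_id)
```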
Supporting personalization
• Users receive different personalized views of the same data from the database. These
view preferences need to be saved somewhere

• There is no need to store personalization settings in the main database, because they
are specific to the frontend application

• A key-value store can use the userId in the key and store the personalization settings
as the value. This lets the system respond quickly and avoids slow performance.
Example personalization storage

User:<userId>:preferences = {
    item_per_page: 10,
    theme: 'light',
    notifications: {
        comments: ['push', 'email', 'sms'],
        tags: ['push', 'email', 'sms']
    }
}
5.2. High-speed data caching
• Cache data that takes a long time to produce or that can be reused, instead of querying
the same data over and over again, for example the authenticated user's info.

• High-speed in-memory caching provides this caching capability without the need for
a separate application-level caching layer.

• This reduces total cost of ownership and makes developing well-performing
applications quicker and easier.
Example
• Laravel_news:posts = [
    { Id: 1, Name: 'abc' },
    { Id: 2, Name: 'def' }
]
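The caching idea can be sketched as a read-through helper: look in the cache first, and run the slow query only on a miss. The function names are hypothetical, a dictionary stands in for the cache, and `slow_query` stands in for a database call.

```python
def get_posts(cache, load_posts, key="Laravel_news:posts"):
    """Return cached posts, running the slow loader only when the key is missing."""
    if key not in cache:
        cache[key] = load_posts()
    return cache[key]


calls = []  # records how many times the slow query actually ran


def slow_query():
    calls.append(1)
    return [{"Id": 1, "Name": "abc"}, {"Id": 2, "Name": "def"}]


cache = {}
first = get_posts(cache, slow_query)   # cache miss: runs the query
second = get_posts(cache, slow_query)  # cache hit: served from the cache
```

In practice this is usually combined with a time to live so that stale entries are eventually refreshed.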
Redis use case: Queues
• For example: send an email to all subscribers. We can use a queue to handle sending
the email to each person who subscribed
Queues (2)
• Each time a user subscribes to a channel, the new user's data is serialized and
recorded in both Redis and the DB (for persistence). In Redis it is appended to the end
of a LIST using RPUSH. After the email has been sent, LPOP removes that user from the
head of the list.
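The queue flow can be sketched without a Redis server by letting a `collections.deque` stand in for the Redis LIST: `append` plays the role of RPUSH and `popleft` the role of LPOP. The function names and subscriber data are hypothetical, and for simplicity this sketch pops each subscriber before sending rather than after.

```python
import json
from collections import deque


def rpush(queue, subscriber):
    """Serialize the subscriber and append it to the tail of the queue."""
    queue.append(json.dumps(subscriber))


def lpop(queue):
    """Remove and return the subscriber at the head of the queue, or None if empty."""
    return json.loads(queue.popleft()) if queue else None


def send_all(queue, send_email):
    """Drain the queue in first-in, first-out order, sending one email per subscriber."""
    sent = 0
    while (subscriber := lpop(queue)) is not None:
        send_email(subscriber["email"])
        sent += 1
    return sent


queue = deque()
rpush(queue, {"id": 1, "email": "a@example.com"})
rpush(queue, {"id": 2, "email": "b@example.com"})

outbox = []  # stands in for a real mail service
sent_count = send_all(queue, outbox.append)
```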
When not to use
• Relationships among data
• If you need relationships between different sets of data, or need to correlate the data
between different sets of keys, key-value stores are not the best solution.

• Multi-operation transactions
• If you are saving multiple keys, one of them fails to save, and you want to revert or
roll back the rest of the operations, key-value stores are not the best solution.
When not to use (2)
• Query by data
• If you need to search the keys based on something found in the value part of the
key-value pairs, key-value stores are not going to perform well for you.

• Operations by sets
• Since operations are limited to one key at a time, there is no way to operate on multiple
keys at the same time. If you need to operate on multiple keys, you have to handle this
on the client side
