0% found this document useful (0 votes)
10 views

10 Data Structures That Make Databases Fast and Scalable

Uploaded by

kdevesh.2099
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
10 views

10 Data Structures That Make Databases Fast and Scalable

Uploaded by

kdevesh.2099
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 12

Read in the Substack app Open app

10 Data Structures That Make Databases Fast


and Scalable
From B-Trees to Bloom Filters

ASHISH PRATAP SINGH


NOV 21, 2024

Have you ever wondered why modern databases are so fast and efficient, even when
managing terabytes of data?

The answer lies in their underlying data structures and indexing techniques that enable
efficient storage, retrieval, and management of data.

In this article, we'll look at 10 important data structures that make modern databases
fast, reliable, and scalable.

📣 Design, develop and manage distributed software


better (Sponsored)
Multiplayer auto-documents your system, from the high-level logical architecture down
to the individual components, APIs, dependencies, and environments. Perfect for teams
who want to speed up their workflows and consolidate their technical assets.

1. Hash Indexes
A hash index is a data structure that maps keys to values using a hash function.

The hash function converts a key into an integer, which is used as an index in a hash
table (buckets) to locate the corresponding value.

It is designed for fast insertion and lookup, such as:


Insert/Find a new record with id = 123.

Visualized using Multiplayer

This structure provides O(1) average-time complexity for insertions, deletions, and
lookups.

Hash indexes are widely used in key-value stores (eg., DynamoDB), and caching
systems (eg., Redis).

2. B-Trees
A B-tree is a self-balancing tree data structure designed to store sorted data in a way
that optimizes reads, writes, and queries on large datasets.

It minimizes disk I/O by storing multiple keys in a single node and automatically
balances itself during insertions and deletions.
Visualized using Multiplayer

Unlike binary search trees, where each node has at most two children, B-Trees allow
multiple children per node. The number of children is defined by the order of the B-
Tree.

Internal nodes contain keys and pointers to child nodes and leaf nodes contain keys and
pointers to the actual data.

Keys in each node are stored in sorted order, enabling fast binary searches.

B-Trees are widely used for indexing in relational databases (eg., MySQL).

While many NoSQL databases favor LSM Trees for write-heavy workloads, some use
B-Trees for read-heavy scenarios or as part of their indexing strategy.

3. Skip Lists
A skip list is a probabilistic data structure that extends the functionality of linked lists
by adding multiple levels of "shortcuts" to enable fast search, insertion, and deletion
operations.

It is designed to offer performance comparable to balanced binary trees or B-Trees


while being simpler to implement and manage.
Visualized using Multiplayer

A skip list consists of multiple levels, with each level being a subset of the level below.

The bottom-most layer contains all the elements in sorted order, and higher layers are
sparser subsets that provide shortcuts to quickly navigate the lower layers.

Nodes are promoted to higher levels probabilistically, ensuring an even distribution


without the need for rebalancing.

Skip lists are particularly well-suited for in-memory storage and dynamic datasets
where updates are frequent.

Redis uses skip lists to implement it’s sorted sets (ZSET), enabling fast insertions,
deletions, and range queries while maintaining sorted order.

4. Memtables
A memtable is an in-memory data structure used in modern databases to temporarily
store write operations before they are flushed to disk.

It plays a critical role in optimizing write performance and ensuring data durability,
especially in databases designed for high-throughput workloads, such as Cassandra,
RocksDB, and HBase.
Memtables are typically implemented as a sorted structure like a Red-Black Tree or Skip
List, enabling efficient lookups and ordered writes to disk.

When data is written to the database: it is logged in the Write-Ahead Log (to persist the
change) and added to the memtable.

For recent writes, the memtable is checked first. If the key is not found in the memtable,
the query searches on-disk SSTables (Sorted String Table) or other storage files.

When the memtable reaches its size limit, it is flushed to disk as an immutable SSTable
and a new memtable is initialized for subsequent writes.

5. SSTables
An SSTable (Sorted String Table) is an immutable, on-disk data structure used in
modern databases like Cassandra, RocksDB, and HBase to store sorted key-value pairs.

SSTables are primarily used in Log-Structured Merge Tree (LSM Tree) databases to
optimize read and write performance.

They enable efficient sequential writes, fast lookups, and range queries.
Visualized using Multiplayer

Key Characteristics of SSTables:


Once written, an SSTable is never modified. New writes are handled by creating
new SSTables.

SSTables are written as sequential blocks to disk, minimizing fragmentation and


improving I/O performance.

SSTables include an in-memory index or Bloom filter to quickly determine whether


a key might exist without scanning the entire table.

Older SSTables are periodically merged into larger tables, removing duplicates and
reclaiming storage space.

Subscribe to receive new articles every week.

Type your email... Subscribe


6. Inverted Index
An inverted index is a data structure that maps terms (words or tokens) to the
documents or locations where they appear.

It is called "inverted" because it reverses the conventional relationship of an index:


instead of mapping documents to the terms they contain, it maps terms to the
documents that contain them.

This structure allows for optimal handling of full-text searches like:

Find all document which contain the terms: “database powerful“

Visualized using Multiplayer

How Inverted Index is Created:


1. Tokenization: Text is split into individual tokens (words or terms).

Example: "Database systems are powerful" → ["database", "systems", "are",


"powerful"]

2. Normalization: Tokens are standardized (e.g., lowercased, stemmed, or lemmatized).

Example: "Databases" → "database"

3. Index Construction and Storage: For each term, a postings list is created or updated
with the document ID and metadata (e.g., term frequency, positions).
Inverted indexes are widely used in databases, search engines, and information
retrieval systems to enable efficient keyword lookups, Boolean queries, and relevance
ranking.

7. Bloom Filters
A Bloom Filter is a space-efficient, probabilistic data structure that answers the
question: "Does this element exist in a set?"

It starts as a bit array of size m, initialized with all bits set to 0. It also requires k
independent hash functions, each of which maps an element to one of the m positions in
the bit array.

Visualized using Multiplayer

To insert an element into the Bloom filter, you pass it through each of the k hash
functions to get k positions in the bit array. The bits at these positions are set to 1.

To check if an element is in the set, you again pass it through the k hash functions to get
k positions.
If all the bits at these positions are set to 1, the element is probably in the set
(though there's a chance it might be a false positive).

If any bit at these positions is 0, the element is definitely not in the set.

Unlike traditional data structures, it does not store the actual elements, making it
extremely memory-efficient.

Bloom filters allow databases to quickly check if a key might exist in a specific data
structure (e.g., an SSTable or a database partition). They avoid unnecessary disk
lookups in places where the key is guaranteed to be absent.

8. Bitmap Indexes
A bitmap index is a specialized indexing technique that encodes the values of a column
as a series of bitmaps, where each bitmap corresponds to a unique value in the column.

Each bit in the bitmap represents whether a row in the dataset contains that value.

Consider a dataset with a column Color:

In the bitmap index:

Each row corresponds to a bit in the bitmap.

A 1 indicates that the value is present in the row, while a 0 indicates its absence.
Bitmap indexes use bitwise operations (AND, OR, NOT, XOR) to efficiently filter data.

Example Query: Find rows with Color = 'Red' OR Color = 'Green'

Retrieve the bitmap for 'Red' → 1 0 0 1 0

Retrieve the bitmap for 'Green' → 0 0 1 0 1

Perform a bitwise OR → 1 0 1 1 1

Bitmap indexes are widely used in data warehouses, columnar databases, and OLAP
(Online Analytical Processing) systems for their ability to speed up complex queries
like filtering, aggregations, and joins when dealing with large datasets containing
low-cardinality columns (columns with few unique values).

9. R-trees
An R-Tree (short for Rectangle Tree) is a tree-based data structure designed for indexing
multidimensional spatial data, such as geographic locations, geometric shapes, or
bounding boxes.

It is particularly effective for queries involving spatial relationships, such as


intersections, containment, and nearest neighbors.

Each node is represented by an Minimum Bounding Rectangle (MBR) that encloses all
its child nodes or objects. Leaf nodes store the actual spatial objects.

PostGIS, an extension of PostgreSQL, uses R-Trees to index spatial data for queries like:
Find all locations within a rectangular region.

SELECT * FROM locations


WHERE ST_Intersects(
geometry,
ST_MakeEnvelope(-74.02, 40.70, -73.93, 40.80)
);

10. Write-Ahead Logs (WAL)


A Write-Ahead Log is an append-only persistent log file that records all changes made
to a database before they are applied to the actual database.

This ensures that even if the system crashes during a write operation, the database can
recover to a consistent state by replaying or rolling back these logged changes.

Structure of a WAL Entry:


Transaction ID (TXID): A unique identifier for the transaction.

Operation Type: The action performed (INSERT, UPDATE, DELETE).

Table Name: The table where the operation occurred.

Key: The primary key or unique identifier of the record.

Old Value: The previous value of the record (used in UPDATE and DELETE).

New Value: The new value of the record (used in INSERT and UPDATE).

Timestamp: The time the operation was logged.

Sample WAL Record:

TXID: 1002
Operation: UPDATE
Table: users
Key: id=1
Old Value: {"id": 1, "name": "Alice", "email":
"[email protected]", "age": 30}
New Value: {"id": 1, "name": "Alice", "email":
"[email protected]", "age": 31}
Timestamp: 2024-11-19T10:05:00Z

By logging changes before they are applied to the main database, WAL enables
databases to recover from crashes and maintain ACID (Atomicity, Consistency,
Isolation, Durability) properties.

Periodically, the database truncates or archives old log entries after ensuring that the
changes are safely written to the main database file.

You might also like