Hypertable An Open Source, High Performance, Scalable Database
Doug Judd
Zvents, Inc.
Background
hypertable.org
Web 2.0 = Data Explosion
[Figure: Web 1.0 vs. Web 2.0]
Traditional Tools Don’t Scale Well
Designed for a single machine
Typical scaling solutions
ad-hoc
manual/static resource allocation
The Google Stack
Google File System (GFS)
MapReduce
Bigtable
Architectural Overview
What is Hypertable?
An open source, high-performance, scalable database modeled after Google's Bigtable
Not relational
Does not support transactions
Hypertable Improvements Over Traditional RDBMS
Scalable
High random insert, update, and delete rate
Data Model
Sparse, two-dimensional table with cell versions
Cells are identified by a 4-part key
Row
Column Family
Column Qualifier
Timestamp
Table: Visual Representation
Table: Actual Representation
Anatomy of a Key
Row key is \0 terminated
Column Family is represented with 1 byte
Column qualifier is \0 terminated
Timestamp is stored big-endian, ones' complement
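The key layout above can be sketched in a few lines. This is a minimal illustration, not Hypertable's actual code: `encode_key` and `key_less` are hypothetical names, and the 64-bit timestamp width is an assumption (the slide does not give one). A likely reason for the inverted, big-endian timestamp is that a plain byte-wise comparison then places newer versions of a cell ahead of older ones.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not Hypertable's API): serialize the 4-part key as
// row \0 | 1-byte column family | qualifier \0 | inverted big-endian timestamp.
std::string encode_key(const std::string& row, std::uint8_t column_family,
                       const std::string& qualifier, std::uint64_t timestamp) {
    // Row and qualifier are \0-terminated, so they must not contain \0.
    assert(row.find('\0') == std::string::npos);
    assert(qualifier.find('\0') == std::string::npos);

    std::string key;
    key.append(row);
    key.push_back('\0');
    key.push_back(static_cast<char>(column_family));
    key.append(qualifier);
    key.push_back('\0');

    // Ones' complement, written most significant byte first (big-endian).
    // The 64-bit width is an assumption of this sketch.
    const std::uint64_t inverted = ~timestamp;
    for (int shift = 56; shift >= 0; shift -= 8) {
        key.push_back(static_cast<char>((inverted >> shift) & 0xFF));
    }
    return key;
}

// Byte-wise ordering that treats bytes as unsigned, like memcmp.
bool key_less(const std::string& a, const std::string& b) {
    return std::lexicographical_compare(
        a.begin(), a.end(), b.begin(), b.end(), [](char x, char y) {
            return static_cast<unsigned char>(x) < static_cast<unsigned char>(y);
        });
}
```

With this encoding, two versions of the same cell differ only in the trailing eight bytes, and the version with the larger timestamp compares as smaller.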
Concurrency
Bigtable uses copy-on-write
Hypertable uses a form of MVCC
(multi-version concurrency control)
Deletes are carried out by inserting “delete” records
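A simplified picture of the idea: nothing is overwritten, every write (including a delete) adds a timestamped record, and a reader picks the newest record at or before its snapshot time. The sketch below is illustrative only; `VersionedCell` and its methods are hypothetical names, and Hypertable's real delete semantics may differ in detail.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>

// One record per write; a delete is just another record with a flag set.
struct Version {
    bool is_delete;
    std::string value;
};

// Hypothetical single-cell model of multi-version reads.
class VersionedCell {
public:
    void put(std::uint64_t ts, const std::string& value) {
        versions_[ts] = Version{false, value};
    }

    // A delete inserts a "delete" record instead of removing data.
    void erase(std::uint64_t ts) { versions_[ts] = Version{true, ""}; }

    // Returns true and fills *out if the cell is visible at snapshot_ts.
    bool get(std::uint64_t snapshot_ts, std::string* out) const {
        assert(out != nullptr);
        // Keys are sorted newest first, so lower_bound yields the newest
        // record whose timestamp is <= snapshot_ts.
        auto it = versions_.lower_bound(snapshot_ts);
        if (it == versions_.end() || it->second.is_delete) return false;
        *out = it->second.value;
        return true;
    }

private:
    std::map<std::uint64_t, Version, std::greater<std::uint64_t>> versions_;
};
```

Readers with an older snapshot still see the value that was current at that time, even after a later delete record has been inserted.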
CellStore
Sequence of 65K blocks of compressed key/value pairs
System Overview
Range Server
Manages ranges of table data
Caches updates in memory (CellCache)
Periodically spills (compacts) cached updates to disk (CellStore)
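The CellCache/CellStore interplay can be sketched as a sorted in-memory map that is periodically frozen into an immutable sorted run. This is a toy model under stated assumptions: `ToyRange` is a hypothetical name, the spill trigger is a simple entry count, and the "stores" live in memory rather than on a distributed filesystem.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, std::string> Cell;
typedef std::vector<Cell> SortedRun;  // stands in for an on-disk CellStore

class ToyRange {
public:
    explicit ToyRange(std::size_t max_cached) : max_cached_(max_cached) {
        assert(max_cached > 0);
    }

    // Updates land in the in-memory cache first.
    void insert(const std::string& key, const std::string& value) {
        cache_[key] = value;
        if (cache_.size() >= max_cached_) spill();
    }

    // Check the cache, then the stores from newest to oldest.
    bool lookup(const std::string& key, std::string* out) const {
        assert(out != nullptr);
        auto hit = cache_.find(key);
        if (hit != cache_.end()) {
            *out = hit->second;
            return true;
        }
        for (auto run = stores_.rbegin(); run != stores_.rend(); ++run) {
            auto pos = std::lower_bound(
                run->begin(), run->end(), key,
                [](const Cell& cell, const std::string& k) { return cell.first < k; });
            if (pos != run->end() && pos->first == key) {
                *out = pos->second;
                return true;
            }
        }
        return false;
    }

    std::size_t store_count() const { return stores_.size(); }
    std::size_t cached_count() const { return cache_.size(); }

private:
    // Freeze the cache (already sorted by key) into a new immutable run.
    void spill() {
        stores_.emplace_back(cache_.begin(), cache_.end());
        cache_.clear();
    }

    std::size_t max_cached_;
    std::map<std::string, std::string> cache_;
    std::vector<SortedRun> stores_;
};
```

Because newer data is always consulted first, an update that is still in the cache shadows an older value already spilled to a store.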
Client API
class Client {
Caching
Block Cache
Caches CellStore blocks
Blocks are cached uncompressed
Query Cache
Caches query results
TBD
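A block cache of this kind might look like the sketch below, which keeps uncompressed block contents in memory up to a byte budget. The slides do not say which eviction policy Hypertable uses; least-recently-used is an assumption here, and `ToyBlockCache` and the string block identifiers are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Toy cache of uncompressed blocks with LRU eviction (assumed policy).
class ToyBlockCache {
public:
    explicit ToyBlockCache(std::size_t capacity_bytes)
        : capacity_(capacity_bytes), used_(0) {}

    void put(const std::string& block_id, const std::string& data) {
        auto it = index_.find(block_id);
        if (it != index_.end()) {  // replace an existing entry
            used_ -= it->second->second.size();
            lru_.erase(it->second);
            index_.erase(it);
        }
        lru_.emplace_front(block_id, data);
        index_[block_id] = lru_.begin();
        used_ += data.size();
        // Evict from the least recently used end until within budget.
        while (used_ > capacity_ && !lru_.empty()) {
            used_ -= lru_.back().second.size();
            index_.erase(lru_.back().first);
            lru_.pop_back();
        }
    }

    bool get(const std::string& block_id, std::string* out) {
        assert(out != nullptr);
        auto it = index_.find(block_id);
        if (it == index_.end()) return false;
        // Move the entry to the front to mark it most recently used.
        lru_.splice(lru_.begin(), lru_, it->second);
        *out = it->second->second;
        return true;
    }

private:
    typedef std::list<std::pair<std::string, std::string>> Entries;
    std::size_t capacity_;
    std::size_t used_;
    Entries lru_;
    std::unordered_map<std::string, Entries::iterator> index_;
};
```

Caching blocks uncompressed trades memory for CPU: a hit avoids both the disk read and the decompression step.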
Bloom Filter
Negative Cache
Probabilistic data structure
Indicates if key is not present
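A minimal Bloom filter illustrates the "negative cache" behaviour: a `false` answer is definitive, while a `true` answer only means the key might be present. The class name, the bit-array size, and the double-hashing scheme built on `std::hash` are choices made for this sketch, not details from Hypertable.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

class ToyBloomFilter {
public:
    ToyBloomFilter(std::size_t num_bits, std::size_t num_hashes)
        : bits_(num_bits, false), num_hashes_(num_hashes) {
        assert(num_bits > 0 && num_hashes > 0);
    }

    void add(const std::string& key) {
        for (std::size_t i = 0; i < num_hashes_; ++i) bits_[index(key, i)] = true;
    }

    // false: key was definitely never added. true: key may have been added.
    bool may_contain(const std::string& key) const {
        for (std::size_t i = 0; i < num_hashes_; ++i) {
            if (!bits_[index(key, i)]) return false;
        }
        return true;
    }

private:
    // Double hashing: derive the i-th probe position from two base hashes.
    std::size_t index(const std::string& key, std::size_t i) const {
        const std::size_t h1 = std::hash<std::string>()(key);
        const std::size_t h2 = std::hash<std::string>()(key + '#') | 1;
        return (h1 + i * h2) % bits_.size();
    }

    std::vector<bool> bits_;
    std::size_t num_hashes_;
};
```

A range server could consult such a filter before touching a CellStore and skip the disk read whenever `may_contain` returns `false`.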
Scaling (part I)
Scaling (part II)
Scaling (part III)
Access Groups
Provides control of physical data layout: hybrid row/column oriented
Improves performance by minimizing I/O
CREATE TABLE crawldb {
Title MAX_VERSIONS=3,
Content MAX_VERSIONS=3,
PageRank MAX_VERSIONS=10,
ClickRank MAX_VERSIONS=10,
ACCESS GROUP default (Title, Content),
ACCESS GROUP ranking (PageRank, ClickRank)
};
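To see why this layout minimizes I/O, consider which access groups a query has to read. The sketch below hard-codes the `crawldb` schema from the statement above and computes the set of groups touched by a list of requested column families. The function names are hypothetical, and the model assumes each access group is stored separately, so untouched groups are never read.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Column family -> access group, mirroring the crawldb schema above.
std::map<std::string, std::string> crawldb_access_groups() {
    return {
        {"Title", "default"},
        {"Content", "default"},
        {"PageRank", "ranking"},
        {"ClickRank", "ranking"},
    };
}

// Which access groups must be read to serve the requested column families?
std::set<std::string> groups_to_read(
    const std::map<std::string, std::string>& schema,
    const std::vector<std::string>& columns) {
    std::set<std::string> groups;
    for (const auto& column : columns) {
        auto it = schema.find(column);
        assert(it != schema.end());  // every column must be in the schema
        groups.insert(it->second);
    }
    return groups;
}
```

A ranking job that asks only for `PageRank` and `ClickRank` touches the `ranking` group alone and never reads the much larger `Content` data.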
Filesystem Broker
Architecture
Hypertable can run on top of any distributed filesystem (e.g. Hadoop, KFS)
Keys To Performance
C++
Asynchronous communication
C++ vs. Java
Hypertable is CPU intensive
Manages large in-memory key/value map
Alternate compression codecs (e.g. BMZ)
Hypertable is memory intensive
Java uses 2-3 times the amount of memory to manage a large in-memory map (e.g. TreeMap)
Poor processor cache performance
Performance Test
(AOL Query Logs)
75,274,825 inserted cells
8 node cluster
1 x 1.8 GHz dual-core Opteron
4 GB RAM
3 x 7200 RPM SATA drives
Average row key: 7 bytes
Average value: 15 bytes
Replication factor: 3
4 simultaneous insert clients
500K random inserts/s
680K scanned cells/s
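For a rough sense of scale, the figures above can be broken down per node. The helpers below are a back-of-the-envelope sketch that assumes the reported rates are cluster-wide totals and counts only the average row key and value bytes (column, timestamp, and replication overhead are ignored). Under those assumptions, 500K inserts/s over 8 nodes is 62,500 inserts/s per node, 680K scanned cells/s is 85,000 per node, and the 75,274,825 cells carry about 1.66 GB of key and value payload.

```cpp
#include <cassert>
#include <cstdint>

// Cluster-wide rate divided evenly across nodes (an assumption).
double per_node_rate(double cluster_rate, int nodes) {
    assert(nodes > 0);
    return cluster_rate / nodes;
}

// Raw payload only: average row key bytes plus average value bytes per cell.
std::uint64_t payload_bytes(std::uint64_t cells, std::uint64_t avg_key_bytes,
                            std::uint64_t avg_value_bytes) {
    return cells * (avg_key_bytes + avg_value_bytes);
}
```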
Performance Test II
Simulated AOL query log data
1TB data
9 node cluster
1 x 2.33 GHz quad-core Intel
16 GB RAM
3 x 7200 RPM SATA drives
Average row key: 9 bytes
Average value: 18 bytes
Replication factor: 3
4 simultaneous insert clients
Over 1M random inserts/s (sustained)
Weaknesses
Each range's data is managed by a single range server
If that server fails, no data is lost, but the range can be unavailable for a period
Can be mitigated with a client-side cache or memcached
Project Status
Currently in “alpha”
Just released version 0.9.0.7
Will release “beta” version end of August
Waiting on Hadoop JIRA 1700
License
GPL 2.0
Why not Apache?
Questions?
www.hypertable.org