Hypertable: An Open Source, High Performance, Scalable Database

This document introduces Hypertable, an open source scalable database modeled after Google's Bigtable. It provides a high-level overview of Hypertable's architecture, data model, scaling capabilities, and performance. Key points include that Hypertable uses a sparse multi-dimensional table structure, supports high insert/update rates, and scales horizontally across clusters through partitioning of data into ranges managed by range servers.

Uploaded by

Oleksiy Kovyrin

Hypertable

Doug Judd
Zvents, Inc.
Background

hypertable.org
Web 2.0 = Data Explosion

[Figure: Web 1.0 vs. Web 2.0]
Traditional Tools Don’t Scale Well
• Designed for a single machine
• Typical scaling solutions
  • ad-hoc
  • manual/static resource allocation

The Google Stack
• Google File System (GFS)
• MapReduce
• Bigtable

Architectural Overview

What is Hypertable?
• An open source, high performance, scalable database modeled after Google's Bigtable
• Not relational
• Does not support transactions

Hypertable Improvements Over Traditional RDBMS
• Scalable
• High random insert, update, and delete rate

Data Model
• Sparse, two-dimensional table with cell versions
• Cells are identified by a 4-part key
  • Row
  • Column Family
  • Column Qualifier
  • Timestamp

Table: Visual Representation

Table: Actual Representation

Anatomy of a Key
• Row key is \0 terminated
• Column family is represented with 1 byte
• Column qualifier is \0 terminated
• Timestamp is stored big-endian, ones' complement
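
The key layout above can be sketched in a few lines of C++. This is an illustration only, not Hypertable's actual serializer: `encode_key` and `key_less` are hypothetical names, and the 8-byte timestamp width is an assumption. It shows one consequence of storing the timestamp as a big-endian ones' complement: under a plain bytewise comparison, newer versions of the same cell sort before older ones.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of Hypertable): lays a key out as
//   row '\0' | 1-byte column family | qualifier '\0' | 8-byte big-endian ~timestamp
// The 8-byte timestamp width is an assumption made for this sketch.
std::string encode_key(const std::string &row, std::uint8_t column_family,
                       const std::string &qualifier, std::uint64_t timestamp) {
  std::string key;
  key.append(row);
  key.push_back('\0');                              // row key terminator
  key.push_back(static_cast<char>(column_family));  // family as a single byte
  key.append(qualifier);
  key.push_back('\0');                              // qualifier terminator
  const std::uint64_t inverted = ~timestamp;        // ones' complement
  for (int shift = 56; shift >= 0; shift -= 8)      // most significant byte first
    key.push_back(static_cast<char>((inverted >> shift) & 0xFF));
  return key;
}

// Bytewise comparison on unsigned bytes, the order memcmp would produce.
bool key_less(const std::string &a, const std::string &b) {
  return std::lexicographical_compare(
      a.begin(), a.end(), b.begin(), b.end(), [](char x, char y) {
        return static_cast<unsigned char>(x) < static_cast<unsigned char>(y);
      });
}
```

With this layout, two keys that differ only in timestamp compare in reverse time order, so a scan that walks keys in sorted order meets the most recent version of a cell first.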

Concurrency
• Bigtable uses copy-on-write
• Hypertable uses a form of MVCC (multi-version concurrency control)
• Deletes are carried out by inserting “delete” records
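
As a rough illustration of the two ideas above (versioned cells and deletes recorded as inserted markers), the sketch below keeps every version of a single cell keyed by timestamp. It is not Hypertable's implementation: `VersionedCell`, `Version`, and the `read` signature are hypothetical, and real delete semantics may differ.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <string>

// Hypothetical types for illustration; not taken from Hypertable.
struct Version {
  bool is_delete;     // true for a "delete" record
  std::string value;  // unused when is_delete is true
};

class VersionedCell {
 public:
  // Writers never overwrite in place; each write adds a new version.
  void set(std::uint64_t ts, const std::string &value) {
    versions_[ts] = Version{false, value};
  }

  // A delete is just another inserted record.
  void set_delete(std::uint64_t ts) { versions_[ts] = Version{true, ""}; }

  // Reads see the newest version with timestamp <= snapshot.
  // Returns false if there is no such version or if it is a delete record.
  bool read(std::uint64_t snapshot, std::string &out) const {
    // With std::greater ordering, lower_bound finds the first ts <= snapshot.
    auto it = versions_.lower_bound(snapshot);
    if (it == versions_.end() || it->second.is_delete) return false;
    out = it->second.value;
    return true;
  }

 private:
  std::map<std::uint64_t, Version, std::greater<std::uint64_t>> versions_;
};
```

Because a reader works from a fixed snapshot timestamp, later writes and deletes do not disturb what it sees, which is the property MVCC is after.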

CellStore
• Sequence of 65K blocks of compressed key/value pairs

System Overview

Range Server
• Manages ranges of table data
• Caches updates in memory (CellCache)
• Periodically spills (compacts) cached updates to disk (CellStore)
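
The cache-then-spill cycle described above might look roughly like the following. This is a simplified sketch under stated assumptions: `MiniRange` is a hypothetical class, the "CellStore" is modeled as an in-memory sorted vector rather than compressed blocks on a DFS, and the spill trigger is a simple byte threshold.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Stand-in for an immutable, sorted on-disk CellStore (assumption: kept in memory here).
typedef std::vector<std::pair<std::string, std::string> > SortedRun;

class MiniRange {
 public:
  explicit MiniRange(std::size_t spill_threshold_bytes)
      : threshold_(spill_threshold_bytes), cached_bytes_(0) {}

  // Updates land in the in-memory sorted map (the "CellCache").
  void put(const std::string &key, const std::string &value) {
    std::map<std::string, std::string>::iterator it = cache_.find(key);
    if (it != cache_.end()) cached_bytes_ -= it->first.size() + it->second.size();
    cache_[key] = value;
    cached_bytes_ += key.size() + value.size();
    if (cached_bytes_ >= threshold_) spill();
  }

  // Newest data wins: check the cache first, then stores from newest to oldest.
  bool get(const std::string &key, std::string &value) const {
    std::map<std::string, std::string>::const_iterator it = cache_.find(key);
    if (it != cache_.end()) { value = it->second; return true; }
    for (std::vector<SortedRun>::const_reverse_iterator run = stores_.rbegin();
         run != stores_.rend(); ++run) {
      SortedRun::const_iterator pos = std::lower_bound(
          run->begin(), run->end(), key,
          [](const std::pair<std::string, std::string> &cell,
             const std::string &k) { return cell.first < k; });
      if (pos != run->end() && pos->first == key) { value = pos->second; return true; }
    }
    return false;
  }

  std::size_t store_count() const { return stores_.size(); }
  std::size_t cached_cells() const { return cache_.size(); }

 private:
  // Write the cache out as one sorted run, then start with an empty cache.
  void spill() {
    stores_.push_back(SortedRun(cache_.begin(), cache_.end()));
    cache_.clear();
    cached_bytes_ = 0;
  }

  std::size_t threshold_;
  std::size_t cached_bytes_;
  std::map<std::string, std::string> cache_;
  std::vector<SortedRun> stores_;
};
```

Since the cache is already sorted, each spill is a sequential write of a sorted run, which is what lets random inserts be turned into sequential disk I/O.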

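Client API
class Client {
public:
  // Create a table from a schema definition
  void create_table(const String &name, const String &schema);
  // Open an existing table for reads and writes
  Table *open_table(const String &name);
  // Return the schema of the named table
  String get_schema(const String &name);
  // Fill 'tables' with the names of all tables
  void get_tables(vector<String> &tables);
  // Drop a table; 'if_exists' suppresses the error for a missing table
  void drop_table(const String &name, bool if_exists);
};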
Client API (cont.)
class Table {
public:
  TableMutator *create_mutator();
  TableScanner *create_scanner(ScanSpec &scan_spec);
};

class TableMutator {
public:
  void set(KeySpec &key, const void *value, int value_len);
  void set_delete(KeySpec &key);
  void flush();
};

class TableScanner {
public:
  bool next(CellT &cell);
};

Language Bindings
• Currently C++ only
• Thrift Broker
Write Ahead Commit Log
• Persists all modifications (inserts and deletes)
• Written into underlying DFS

Range Meta-Operation Log
• Facilitates range meta-operations
  • Loads
  • Splits
  • Moves
• Part of Master and RangeServer
• Ensures range state and location consistency

Compression
• Cell Stores store compressed blocks of key/value pairs
• Commit Log stores compressed blocks of updates
• Supported compression schemes
  • zlib (--best and --fast)
  • lzo
  • quicklz
  • bmz
  • none

Caching
• Block Cache
  • Caches CellStore blocks
  • Blocks are cached uncompressed
• Query Cache
  • Caches query results
  • TBD

Bloom Filter
• Negative Cache
• Probabilistic data structure
• Indicates if key is not present
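
A minimal Bloom filter, to make the "negative cache" idea concrete. This is a generic sketch, not Hypertable's filter: the class name, the sizing parameters, and the use of `std::hash` with a salted second hash are all assumptions chosen for brevity.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Generic illustration; names and hashing scheme are not from Hypertable.
class BloomFilter {
 public:
  // num_bits must be > 0; num_hashes is the number of probes per key.
  BloomFilter(std::size_t num_bits, std::size_t num_hashes)
      : bits_(num_bits, false), num_hashes_(num_hashes) {}

  void insert(const std::string &key) {
    for (std::size_t i = 0; i < num_hashes_; ++i) bits_[index(key, i)] = true;
  }

  // false: the key was definitely never inserted.
  // true:  the key may have been inserted (false positives are possible).
  bool may_contain(const std::string &key) const {
    for (std::size_t i = 0; i < num_hashes_; ++i)
      if (!bits_[index(key, i)]) return false;
    return true;
  }

 private:
  // Double hashing: probe i uses h1 + i * h2, with h2 forced odd.
  std::size_t index(const std::string &key, std::size_t i) const {
    const std::size_t h1 = std::hash<std::string>()(key);
    const std::size_t h2 = std::hash<std::string>()(key + '#') | 1;
    return (h1 + i * h2) % bits_.size();
  }

  std::vector<bool> bits_;
  std::size_t num_hashes_;
};
```

A reader can consult such a filter before touching a store on disk: a `false` answer means the store can be skipped entirely, while a `true` answer still requires the real lookup.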

Scaling (part I)

Scaling (part II)

Scaling (part III)

Access Groups
• Provides control of physical data layout -- hybrid row/column oriented
• Improves performance by minimizing I/O

CREATE TABLE crawldb {
  Title MAX_VERSIONS=3,
  Content MAX_VERSIONS=3,
  PageRank MAX_VERSIONS=10,
  ClickRank MAX_VERSIONS=10,
  ACCESS GROUP default (Title, Content),
  ACCESS GROUP ranking (PageRank, ClickRank)
};

Filesystem Broker Architecture
• Hypertable can run on top of any distributed filesystem (e.g. Hadoop, KFS)

Keys To Performance
• C++
• Asynchronous communication

C++ vs. Java
• Hypertable is CPU intensive
  • Manages large in-memory key/value map
  • Alternate compression codecs (e.g. BMZ)
• Hypertable is memory intensive
  • Java uses 2-3 times the amount of memory to manage large in-memory map (e.g. TreeMap)
  • Poor processor cache performance

Performance Test (AOL Query Logs)
• 75,274,825 inserted cells
• 8 node cluster
  • 1 x 1.8 GHz dual-core Opteron
  • 4 GB RAM
  • 3 x 7200 RPM SATA drives
• Average row key: 7 bytes
• Average value: 15 bytes
• Replication factor: 3
• 4 simultaneous insert clients
• 500K random inserts/s
• 680K scanned cells/s

Performance Test II
• Simulated AOL query log data
• 1 TB data
• 9 node cluster
  • 1 x 2.33 GHz quad-core Intel
  • 16 GB RAM
  • 3 x 7200 RPM SATA drives
• Average row key: 9 bytes
• Average value: 18 bytes
• Replication factor: 3
• 4 simultaneous insert clients
• Over 1M random inserts/s (sustained)

Weaknesses
• Range data managed by a single range server
  • Though no data loss, can cause periods of unavailability
  • Can be mitigated with client-side cache or memcached

Project Status
• Currently in “alpha”
• Just released version 0.9.0.7
• Will release “beta” version end of August
• Waiting on Hadoop JIRA 1700

License
• GPL 2.0
• Why not Apache?

Questions?
• www.hypertable.org
