Horizontal Scaling With HiveDB Presentation

This document discusses scaling a product catalog database at Cafepress.com using HiveDB. Key points:
• Cafepress had hundreds of millions of products with millions added weekly, requiring a database that could scale with accelerating growth.
• HiveDB uses key-based partitioning to distribute data across nodes without broadcasting or repartitioning. This allows easy addition of capacity and relocation of records.
• Testing showed HiveDB met Cafepress' performance goals, with read throughput of 2250/s (goal 1500/s) and response time of 8ms (goal 100ms).
• HiveDB provided the highest database uptime at Cafepress, handling billions of

Uploaded by

yejr

Scaling with HiveDB

Project Genesis
• Cafepress.com Product Catalog
• Hundreds of Millions of Products
• Millions of new products every week

• Accelerating growth
Enter Jeremy and HiveDB
Our Requirements
• OLTP Optimized
• Constant response time is more
important than low latency
• Related sets vary wildly in size
• Growth hotspots
• Usage hotspots
Partition by key
Directory
• No broadcasting
• No re-partitioning
• Easy to relocate records
• Easy to add capacity
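The directory approach listed above can be sketched in a few lines. This is a minimal in-memory illustration, not HiveDB's actual API: the `Directory` class, its method names, and the "fewest keys" placement policy are all assumptions made for the example. It shows why adding a node needs no re-partitioning and why relocating a record is a single directory update.

```python
class Directory:
    """Hypothetical directory: maps each partition key to the node holding it."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.key_to_node = {}

    def assign(self, key):
        # Assumed placement policy: put a new key on the node with the fewest keys.
        if key not in self.key_to_node:
            counts = {node: 0 for node in self.nodes}
            for node in self.key_to_node.values():
                counts[node] += 1
            self.key_to_node[key] = min(self.nodes, key=lambda n: counts[n])
        return self.key_to_node[key]

    def lookup(self, key):
        # One directory read tells the application which node to query,
        # so no query is ever broadcast to every node.
        return self.key_to_node[key]

    def add_node(self, node):
        # Existing keys keep their placement: adding capacity moves nothing.
        self.nodes.append(node)

    def relocate(self, key, node):
        # Moving a record only changes its directory entry
        # (copying the data itself is out of scope for this sketch).
        if node not in self.nodes:
            raise ValueError(f"unknown node: {node!r}")
        self.key_to_node[key] = node


directory = Directory(["node0", "node1"])
for store_id in range(4):
    directory.assign(store_id)
directory.add_node("node2")
directory.assign(4)
```

A hash-based scheme would have to move existing keys when the node count changes; here only new keys land on the new node.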
Disadvantages
• Intelligence moved from the
database to the application
(Queries have to be planned and
indexed)
• Can’t join across partitions
• NO OLAP! (We consider this a
benefit.)
• Directory is a bottleneck
Original Design
• Smallest possible implementation
• HiveDB was just a JDBC Gatekeeper
for applications.
Development
Complexity Problem
• You have to maintain
synchronization between the
directory and the data nodes.
• Lots of code for simple operations
• Data access objects have to be re-implemented
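To make the synchronization problem concrete, here is a minimal sketch of one way an application might keep the directory and a data node consistent on insert. It is not taken from HiveDB: `DataNode`, `insert`, and the "register first, undo on failure" ordering are assumptions chosen for illustration.

```python
class DataNode:
    """Hypothetical stand-in for one partition database."""

    def __init__(self, name, fail_writes=False):
        self.name = name
        self.rows = {}
        self.fail_writes = fail_writes  # lets the example simulate an outage

    def write(self, key, record):
        if self.fail_writes:
            raise OSError(f"{self.name} unavailable")
        self.rows[key] = record


def insert(directory, nodes, key, record, target):
    """Register the key in the directory, then write the record to its node.

    If the node write fails, the directory entry is removed again so the
    directory never points at a record that does not exist.
    """
    if key in directory:
        raise KeyError(f"{key!r} is already registered")
    directory[key] = target
    try:
        nodes[target].write(key, record)
    except Exception:
        del directory[key]
        raise


directory = {}
nodes = {"node0": DataNode("node0"), "node1": DataNode("node1", fail_writes=True)}
insert(directory, nodes, 1, {"name": "mug"}, "node0")
```

Even this toy version needs explicit compensation logic for a single insert, which is the "lots of code for simple operations" problem in miniature.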
Enter Hibernate Shards
• Partitioned Hibernate from Google
• Why did we write this thing
again?
• Oh wait, you have to tell it how
to look things up...we’re good at
that.
Benefits of Shards
• Unified data access layer
• Result set aggregation across
partitions
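Result set aggregation across partitions can be pictured as "run the same query everywhere, then merge". The sketch below is a generic illustration in plain Python, not the Hibernate Shards API; `query_all` and the list-of-dicts partitions are hypothetical.

```python
import heapq


def query_all(partitions, predicate, sort_key):
    """Apply the same filter to every partition and merge the sorted results."""
    per_partition = [
        sorted((row for row in rows if predicate(row)), key=sort_key)
        for rows in partitions
    ]
    # heapq.merge combines already-sorted inputs without re-sorting everything.
    return list(heapq.merge(*per_partition, key=sort_key))


partitions = [
    [{"id": 3, "price": 10}, {"id": 1, "price": 25}],
    [{"id": 2, "price": 30}, {"id": 4, "price": 5}],
]
rows = query_all(partitions, lambda r: r["price"] >= 10, lambda r: r["id"])
```

The caller sees one ordered result set and does not need to know how many partitions were consulted.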
• Everyone in the Java world knows
Hibernate.
Show don’t tell!
Competitive Landscape
• Clustered Relational Databases
• Non-relational Structured Databases
• Non-relational, Unstructured Storage
Competitive Landscape
Clustered        Non-relational  Unstructured
Oracle RAC       Hypertable      Hadoop
MS SQL Server    HBase           MogileFS
MySQL Cluster    CouchDB         S3
DB2              SimpleDB
Teradata (OLAP)
HiveDB
Competitive Landscape
System         Storage         Interface  Partitioning         Expansion             Node Types  Maturity
Oracle RAC     Shared          SQL        Transparent          No downtime           Identical   7 years
MySQL Cluster  Memory / Local  SQL        Transparent          Requires Restart      Mixed       3 years
Hypertable     Local           HQL        Transparent          No downtime           Mixed       ? (released 2/08)
DB2            Local           SQL        Fixed Hash           Degraded Performance  Identical   3 years (25 years total)
HiveDB         Local           SQL        Key-based Directory  No downtime           Mixed       18 months (+13 years!)
Case Study:
CafePress
• Leader in User-generated Commerce
• Same number of products as eBay
(>150,000,000)
• 24/7/365
Case Study:
Performance Requirements
• Thousands of queries/sec
• 10 : 1 read/write ratio
• Geographically distributed
Case Study:
Test Environment
• Real schema
• Production-class hardware
• Partial data (~40M records)
CafePress HiveDB 2007
Performance Test Environment
[Diagram, modified on April 09 2007]
• Command & control: Measurement Workstation (JMeter, 1 thread, client.jar) and Test Controller Workstation (JMeter, no threads, client.jar), on a 100MBit switch
• Load generators: two Dell 2950 / 2x2 Xeon, 16GB, 6x72GB 15k, each running JMeter (100s of threads) with client.jar
• Network: 48GB backplane non-blocking gigabit switch; hardware LB
• Web service (hivedb): three Dell 1950 / 2x2 Xeon, Tomcat 5.5
• Databases (mysql): Directory, Partition 0, Partition 1, each on a Dell 2950 / 2x2 Xeon, 16GB, 6x72GB 15k

Case Study:
Performance Goals
• Large object reads: 1500/s
• Large object writes: 200/s
• Response time: 100ms
Case Study:
Performance Results
• Large object reads: 1500/s
Actual result: 2250/s
• Large object writes: 200/s
Actual result: 300/s
• Response time: 100ms
Actual result: 8ms
Case Study:
Performance Results
• Max read throughput
Actual result: 4100/s
(CPU limited in Java layer;
MySQL <25% utilized)
Case Study:
Organizational Results
• Billions of queries served
• Highest DB uptime at CafePress
• Hundreds of millions of updates performed
High Availability & Replication
• We don’t specify a failover
strategy
• We delegate to MySQL replication
Non-Java Deployment Options
• Web service
• JVM Dynamic Languages
HiveDB Accessories
Class Generation
• Automatically generate Data
Transfer Objects from interfaces
(and soon web services).
Blobject
• Gets around the problem of ALTER
statements
• Compression
• The hive can contain multiple
versions of a serialized record.
• No data set of this size can be
transformed synchronously.
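The Blobject idea above (compressed, serialized records that carry their own version) can be sketched as follows. This is not HiveDB's actual format: the JSON envelope, the field names, and the v1-to-v2 rename are all hypothetical, chosen only to show how old and new record versions can coexist without an ALTER statement.

```python
import json
import zlib

CURRENT_VERSION = 2  # assumed current schema version for this example


def to_blob(record, version=CURRENT_VERSION):
    """Serialize a record with its schema version, then compress it."""
    payload = json.dumps({"v": version, "data": record}).encode("utf-8")
    return zlib.compress(payload)


def upgrade_v1(data):
    # Hypothetical schema change: v1 stored "name", v2 stores "title".
    data = dict(data)
    data["title"] = data.pop("name")
    return data


def from_blob(blob):
    """Decompress a blob and upgrade older versions as they are read."""
    envelope = json.loads(zlib.decompress(blob).decode("utf-8"))
    data = envelope["data"]
    if envelope["v"] == 1:
        data = upgrade_v1(data)
    return data


old_blob = to_blob({"name": "mug"}, version=1)
new_blob = to_blob({"title": "shirt"})
```

Because each blob is upgraded when it is read, the data set never has to be transformed in one synchronous pass.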
Features Teaser
• We’re taking over HA...you’re still
on your own for replication.
• Generated Web Services
• Monitoring & RRD stats (with graphs!)

• Query/transform tool
• Record migration & balancing tools
Contributing
• Post to the mailing list
http://groups.google.com/group/hivedb-dev
• Comment on our site
http://www.hivedb.org
• File a bug
http://hivedb.lighthouseapp.com
• Submit a patch / pull request
git clone git://github.com/britt/hivedb.git
Photo Credits
• http://www.flickr.com/photos/7362313@N07/1240245941/sizes/o
• http://www.flickr.com/photos/99287245@N00/2229322675/sizes/o
