Developer(s) | Yahoo! Inc.
---|---
Development status | Active
Written in | C++, PHP
Operating system | Linux
Type | Key-value store
Sherpa is Yahoo!'s next-generation cloud storage platform: a hosted, distributed, and geographically replicated key-value datastore. It is a NoSQL system developed by Yahoo to address the scalability, availability, and latency needs of Yahoo! websites. Sherpa offers elastic growth, multi-tenancy, a global footprint for low-latency local access, asynchronous replication, REST-based web service APIs, per-record consistency controls, high availability, compression, secondary indexes, and record-level replication.
Sherpa is a multi-tenant system. An application stores data in a table, which is a collection of records. A table is sharded into smaller pieces called tablets; data is sharded either by the hash value of the key or by range partitioning. Tablets are stored on nodes called storage units. A software routing layer tracks the mapping between an application's tablets and storage units. Applications send requests to the router, which forwards them to the correct storage unit based on the tablet map. Clients can get, set, delete, and scan records by their unique primary keys.
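The hash-sharded routing path above can be sketched as follows. This is a minimal illustration with invented names and parameters (`NUM_TABLETS`, `tablet_map`, the MD5-based hash), not Sherpa's actual internals: the key is hashed to a tablet, and the router's tablet map decides which storage unit serves the request.

```python
import hashlib

NUM_TABLETS = 8  # assumed fixed number of tablets for this sketch

# Router's tablet map: tablet id -> storage unit currently hosting it.
tablet_map = {t: f"storage-unit-{t % 4}" for t in range(NUM_TABLETS)}

def tablet_for_key(key: str) -> int:
    """Hash partitioning: map a record's primary key to a tablet id."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_TABLETS

def route(key: str) -> str:
    """Return the storage unit that should serve requests for this key."""
    return tablet_map[tablet_for_key(key)]
```

Because routing is deterministic in the key, every request for the same record reaches the same storage unit until the tablet map changes.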
Sherpa's data model is a key-value store in which records are stored as JSON blobs. Data is organized into tables; primary key uniqueness can be enforced, but otherwise there are no fixed schemas. Sherpa supports single-table scans with predicates. Customers can choose among several table types: Distributed Hash Table, Distributed Ordered Table, and Mastered and Merging Tables. Application-specific access patterns determine which table type is suitable, and expected query patterns should inform how record keys are defined.
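The schemaless JSON-blob model can be sketched with an in-memory table. The `put`/`scan` names and the dict-backed store are invented for illustration, not Sherpa's real client API; the point is that records under one primary key space need not share fields, and a single-table scan filters decoded records with a predicate.

```python
import json

table = {}  # primary key -> JSON-encoded record blob

def put(key: str, record: dict) -> None:
    """Store a record as a JSON blob; only key uniqueness is enforced."""
    table[key] = json.dumps(record)

def scan(predicate):
    """Single-table scan with a predicate over the decoded records."""
    for key, blob in table.items():
        record = json.loads(blob)
        if predicate(record):
            yield key, record

put("user:1", {"name": "Ada", "city": "London"})
put("user:2", {"name": "Lin", "likes": ["go"]})  # different fields: no fixed schema
matches = [k for k, r in scan(lambda r: r.get("city") == "London")]
```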
Sherpa scales by partitioning data. In Sherpa, data partitions are called tablets: each customer-defined table is partitioned into tablets, which serve as the unit of both work assignment and tenancy. Each tablet contains a range of records. Sherpa can scale to very large numbers of tables, tablets, and records.
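For an ordered table, the tablet that owns a key can be found from the sorted split points of the key space. This is a sketch under assumed boundaries (the `boundaries` list is illustrative, not Sherpa's actual layout): tablet `i` holds keys from `boundaries[i-1]` (inclusive) up to `boundaries[i]` (exclusive), and the last tablet holds everything past the final split.

```python
import bisect

# Assumed split points of the key space for this sketch.
boundaries = ["g", "n", "t"]

def tablet_for_key(key: str) -> int:
    """Range partitioning: locate the tablet whose key range contains `key`."""
    return bisect.bisect_right(boundaries, key)
```

Keeping the ranges contiguous and sorted is what makes ordered scans over a key range touch only the tablets that intersect it.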
The system scales horizontally as new machines are added, with no downtime for applications. Other elasticity operations include data partition assignment, reassignment, and splitting.
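A tablet split can be sketched as dividing one tablet's key range at a chosen key and assigning the upper half to a new storage unit. All names here (`tablets`, `split`, the id suffix) are invented for illustration; the essential idea is that the routing map is updated in place so clients keep routing correctly.

```python
tablets = {
    # tablet id -> (inclusive start key, exclusive end key, storage unit)
    "t1": ("a", "z", "storage-unit-1"),
}

def split(tablet_id: str, split_key: str, new_unit: str) -> None:
    """Split a tablet at `split_key`, moving the upper range to `new_unit`."""
    start, end, unit = tablets[tablet_id]
    assert start < split_key < end, "split key must fall inside the range"
    tablets[tablet_id] = (start, split_key, unit)          # lower half stays put
    tablets[tablet_id + "-r"] = (split_key, end, new_unit)  # upper half moves

split("t1", "m", "storage-unit-2")
```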
Data is automatically replicated to multiple nodes for fault tolerance, and replication across multiple data centers is supported. Single-node failures are transparent to applications. Sherpa relies on a reliable transaction message bus for replicating transactions; this bus guarantees at-least-once delivery of transaction messages.
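At-least-once delivery means a replica may receive the same transaction message more than once, so applying updates must be idempotent. One common technique, sketched here with invented names rather than Sherpa's actual code, is to track the highest applied sequence number per record and ignore replays.

```python
applied_seq = {}  # record key -> highest sequence number applied so far
replica = {}      # record key -> current replicated value

def apply_update(key: str, seq: int, value) -> bool:
    """Apply a replicated write unless it is a duplicate redelivery."""
    if applied_seq.get(key, -1) >= seq:
        return False  # already applied; safe to drop the duplicate
    replica[key] = value
    applied_seq[key] = seq
    return True

apply_update("k", 1, "v1")
apply_update("k", 1, "v1")  # duplicate delivery is a no-op
apply_update("k", 2, "v2")
```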
Sherpa supports multiple levels of consistency, ranging from per-record timeline consistency, in which all writes to a record are serialized through a master copy, down to eventual consistency.
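Per-record timeline consistency can be sketched as follows (class names and fields are illustrative, not Sherpa's implementation): the record's master serializes all writes and stamps each with the next version number, and replicas apply versions in order, so every replica moves through the same timeline of values and never goes backwards.

```python
class RecordMaster:
    """Master copy of one record: serializes writes and assigns versions."""
    def __init__(self):
        self.version = 0
        self.value = None

    def write(self, value) -> int:
        self.version += 1
        self.value = value
        return self.version  # replicas apply (version, value) in this order

class Replica:
    """Read replica: applies the master's writes strictly in version order."""
    def __init__(self):
        self.version = 0
        self.value = None

    def apply(self, version: int, value) -> None:
        if version == self.version + 1:  # in-order apply keeps the timeline
            self.version, self.value = version, value

master = RecordMaster()
replica = Replica()
v1 = master.write("a")
v2 = master.write("b")
replica.apply(v1, "a")
replica.apply(v2, "b")
```

A replica that has applied fewer versions simply serves an older point on the same timeline, which is what distinguishes timeline consistency from arbitrary eventual divergence.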
Replication granularity can be configured at the record level as well as at the table level.
The backup feature allows multiple old copies of a full table to be saved in offline storage, from which customers can retrieve old versions of individual records.
Many applications need to access data via fields other than the primary key. The next release of Sherpa will add support for asynchronous secondary indexes.
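An asynchronous secondary index can be sketched as follows (all names invented for illustration): a write updates the base table immediately and enqueues the index change, which a background step applies later, so an index lookup may briefly miss a just-written record.

```python
from collections import defaultdict, deque

base = {}                 # primary key -> record
index = defaultdict(set)  # secondary field value -> set of primary keys
pending = deque()         # queued index updates, applied asynchronously

def put(key: str, record: dict) -> None:
    """Write the base record now; defer the index update."""
    base[key] = record
    pending.append((key, record.get("city")))

def drain_index() -> None:
    """Background step: apply all queued secondary-index updates."""
    while pending:
        key, city = pending.popleft()
        if city is not None:
            index[city].add(key)

put("u1", {"name": "Ada", "city": "London"})
stale = "u1" in index["London"]   # index lags: the entry is not there yet
drain_index()
fresh = "u1" in index["London"]   # visible after asynchronous maintenance
```

The trade-off shown is the usual one for asynchronous indexes: writes stay fast because they do not wait on index maintenance, at the cost of briefly stale index reads.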