URL Shortener
Functional Requirements
Given a long URL, generate a unique short URL
Analytics
Identify microservices from requirements
1. URL shortener microservice
Dissect Microservice
Each microservice is a set of tiers
Tiers
Distributed system
Replication (generic)
Consistency (generic)
Single server
Tiers
Application server
Cache tier
APIs
Row oriented: pro: write friendly; con: selecting a small number of fields is
inefficient when the value is arbitrarily large
Column oriented: pro: efficient selection of a small number of fields when the
value is arbitrarily large; con: not write friendly
Write row oriented data in a memtable, and then merge lazily into column
stores (LSM trees)
In our case, we will go with row oriented, with primary key index on the keys
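The memtable-then-merge idea noted above can be sketched roughly as follows. This is a toy illustration only, not a real LSM tree (no sorted runs, no compaction, no durability); the class name TinyStore, its methods, and the flush_threshold parameter are all hypothetical.

```python
class TinyStore:
    """Toy store: row-oriented memtable, lazily merged into per-column maps."""

    def __init__(self, flush_threshold=2):
        self.memtable = {}   # key -> full row dict (cheap to write)
        self.columns = {}    # column name -> {key: value} (cheap to read one field)
        self.flush_threshold = flush_threshold  # assumed knob, for illustration

    def put(self, key, row):
        # A write touches only the in-memory row map.
        self.memtable[key] = dict(row)
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Lazy merge: split each buffered row into its columns.
        for key, row in self.memtable.items():
            for column, value in row.items():
                self.columns.setdefault(column, {})[key] = value
        self.memtable.clear()

    def get_field(self, key, column):
        # The newest data lives in the memtable, so check it first.
        row = self.memtable.get(key)
        if row is not None and column in row:
            return row[column]
        return self.columns.get(column, {}).get(key)


store = TinyStore(flush_threshold=2)
store.put("a", {"long_url": "https://example.com/a", "clicks": 1})
store.put("b", {"long_url": "https://example.com/b", "clicks": 5})  # triggers flush
```

Reading a single field after the flush touches only that column's map, which is the column-oriented advantage listed above; the write path stays row oriented.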
Cache
Hashmap of keys and values
Go to source of truth
Assign a unique id
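A minimal sketch of how the cache, the source of truth, and the unique id could fit together. Everything named here is an assumption for illustration: the Shortener class, the in-memory dict standing in for the source of truth, the local counter as the id generator, and base62 as the encoding of the id into a short key.

```python
import itertools
import string

# 62 symbols: 0-9, a-z, A-Z (the choice of base62 is an assumption)
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base62(n):
    """Encode a non-negative integer id as a base62 string."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


class Shortener:
    def __init__(self):
        self.store = {}                   # stand-in for the source of truth
        self.cache = {}                   # hashmap of keys and values
        self.next_id = itertools.count(1) # stand-in for a unique id generator

    def shorten(self, long_url):
        # Assign a unique id, then derive the short key from it.
        key = encode_base62(next(self.next_id))
        self.store[key] = long_url
        return key

    def resolve(self, key):
        # Serve from the cache if possible.
        if key in self.cache:
            return self.cache[key]
        # Otherwise go to the source of truth and remember the answer.
        long_url = self.store.get(key)
        if long_url is not None:
            self.cache[key] = long_url
        return long_url


svc = Shortener()
short_key = svc.shorten("https://example.com/some/long/path")
print(short_key, "->", svc.resolve(short_key))
```

In a distributed deployment the counter would have to be replaced by something that stays unique across servers; that part is outside this sketch.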
Cluster manager
Config store
Mapping of data to partitions
Horizontal sharding
Partitioning by key: Subsets of keys with full values in a single shard or bucket
Vertical sharding
Less common
Hash based: pro: uniform distribution; con: splitting or merging shards is hard
Mapping of data to partitions for this problem
Horizontal, hash-based partitioning is the best fit
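A small sketch of hash-based horizontal partitioning, assuming a simple "hash modulo shard count" placement (the notes do not say which scheme is used). The helper names shard_for and moved_fraction are hypothetical. It also shows why splitting or merging shards is hard under this scheme: changing the shard count moves most keys.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def moved_fraction(keys, old_shards, new_shards):
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)


keys = [f"key{i}" for i in range(1000)]

# Count how many keys land on each of 4 shards.
counts = [0, 0, 0, 0]
for k in keys:
    counts[shard_for(k, 4)] += 1
print("keys per shard:", counts)

# Growing from 4 to 5 shards relocates a large share of the keys.
print("fraction moved 4 -> 5:", moved_fraction(keys, 4, 5))
```

Schemes such as consistent hashing are commonly used to reduce this movement, but they are not covered in these notes.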
CAP theorem
Consistency, availability, and network partition tolerance cannot all three be
guaranteed together
Document search problem
Requirements
Given a search string of terms, return the ids of all documents that contain all
the terms in the string
K-V
API: search(string)
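One way to read the K-V note is as an inverted index: key = term, value = set of document ids. The sketch below assumes that reading, whitespace tokenization, and case-insensitive matching; the InvertedIndex class and its add method are hypothetical, and only search(string) comes from the notes.

```python
from collections import defaultdict


class InvertedIndex:
    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document ids

    def add(self, doc_id, text):
        # Index every whitespace-separated term of the document.
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return ids of documents that contain every term in the query."""
        terms = query.lower().split()
        if not terms:
            return set()
        result = None
        for term in terms:
            docs = self.postings.get(term, set())
            # Intersect posting sets so only documents with all terms survive.
            result = set(docs) if result is None else result & docs
        return result


index = InvertedIndex()
index.add(1, "distributed systems and caching")
index.add(2, "caching strategies for web systems")
index.add(3, "stream processing basics")
print(index.search("caching systems"))
```

Intersecting the smallest posting set first would reduce work on large indexes; this sketch keeps the query order for simplicity.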
Stream Processing
Problem statement
Imagine a data center having hundreds of servers each emitting thousands of
statistics per second (such as CPU utilization, memory utilization)
1. Given a server id, return min, max, avg of all stats within a time window of 30
minutes
2. Given a statistic id, return min, max, avg of all hosts within a time window of
30 minutes
3. Given a server id and a time range, return min, max, avg of all stats
4. Given a statistic id and a time range, return min, max, avg of all hosts
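The four queries above can be sketched with per-minute rollups, assuming each bucket keeps min, max, sum, and count so that averages can be merged across buckets. The MinuteRollups class, its method names, and the linear scan over buckets are assumptions for illustration. Following the wording of the queries literally, a server query aggregates across all of that server's stats.

```python
class MinuteRollups:
    def __init__(self):
        # (server_id, stat_id, minute) -> [min, max, sum, count]
        self.buckets = {}

    def record(self, server_id, stat_id, ts_seconds, value):
        key = (server_id, stat_id, int(ts_seconds // 60))
        bucket = self.buckets.get(key)
        if bucket is None:
            self.buckets[key] = [value, value, value, 1]
        else:
            bucket[0] = min(bucket[0], value)
            bucket[1] = max(bucket[1], value)
            bucket[2] += value
            bucket[3] += 1

    def _aggregate(self, matches, start_s, end_s):
        # Time ranges are resolved at whole-minute granularity.
        lo, hi = int(start_s // 60), int(end_s // 60)
        picked = [
            b for (srv, stat, minute), b in self.buckets.items()
            if lo <= minute <= hi and matches(srv, stat)
        ]
        if not picked:
            return None
        total = sum(b[2] for b in picked)
        count = sum(b[3] for b in picked)
        return (min(b[0] for b in picked), max(b[1] for b in picked), total / count)

    def by_server(self, server_id, start_s, end_s):
        """min, max, avg over all stats of one server in [start_s, end_s]."""
        return self._aggregate(lambda srv, stat: srv == server_id, start_s, end_s)

    def by_stat(self, stat_id, start_s, end_s):
        """min, max, avg of one statistic over all hosts in [start_s, end_s]."""
        return self._aggregate(lambda srv, stat: stat == stat_id, start_s, end_s)


rollups = MinuteRollups()
rollups.record("s1", "cpu", 0, 10.0)
rollups.record("s1", "cpu", 30, 30.0)
rollups.record("s2", "cpu", 10, 90.0)
print(rollups.by_stat("cpu", 0, 30 * 60 - 1))  # last-30-minutes style window
```

The 30-minute window queries (1 and 2) are the same calls with start and end set to the current time minus 30 minutes and the current time.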
APIs
Table_1 minute:
Time series
How distributed
Horizontal hash
News Feed, Uber, Netflix
Recommendations
Workflow
OLTP: Simple workloads but with high concurrency
Tweet generation