Design Google Drive/Dropbox

The document outlines the design of a cloud storage service like Google Drive or Dropbox. It discusses requirements like file uploads, sharing, and synchronization. It describes the data storage needs of billions of users storing petabytes of data across multiple data centers. The document then describes the major components needed like the uploader service, metadata service, sync service, and clients.

Uploaded by

kirankaranth

FR:
- Upload files/media
- Upload limit per file
- Shareable
- User auth
- CRUD operations on uploaded files
- Synchronization

NFR:
- Data availability
- Data integrity
- Fast downloads

Data storage:
- 1B users, average 15 GB/user
- 15 EB raw * 3x replication = 45 EB (1B * 15 GB is 15 EB, not 15 PB)
- Need CRUD on files; need ACID guarantees for sync
- SQL database for file metadata, S3 for blob storage
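
The estimate above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes decimal units (1 GB = 10**9 bytes); the constant names are illustrative, and the inputs are the figures from the notes.

```python
# Back-of-the-envelope storage estimate (decimal units: 1 GB = 10**9 bytes).
USERS = 1_000_000_000             # 1B users (from the notes)
AVG_BYTES_PER_USER = 15 * 10**9   # 15 GB average per user (from the notes)
REPLICATION_FACTOR = 3            # 3x replication (from the notes)

EB = 10**18  # one exabyte in bytes

raw_bytes = USERS * AVG_BYTES_PER_USER
replicated_bytes = raw_bytes * REPLICATION_FACTOR

print(f"raw: {raw_bytes / EB:.0f} EB")
print(f"replicated: {replicated_bytes / EB:.0f} EB")
```

At this size the blobs clearly cannot live in the SQL database; only the metadata does.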

Components:
- Uploader service
  - Uploads to blob storage in chunks
  - Updates the Metadata service with what was uploaded; this would probably be a gRPC call
  - Receives individual chunks from the client and uploads them to blob storage
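
A rough sketch of the uploader flow described above: receive one chunk, write it to blob storage, then tell the Metadata service what was uploaded. Everything here is hypothetical and in-memory: `InMemoryBlobStore` stands in for S3, `MetadataClient` stands in for a gRPC stub, and keying blobs by their SHA-256 hash is an assumption, not something the notes specify.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class InMemoryBlobStore:
    """Hypothetical stand-in for S3: maps a blob key to its bytes."""
    blobs: dict = field(default_factory=dict)

    def put(self, key: str, data: bytes) -> None:
        self.blobs[key] = data


@dataclass
class MetadataClient:
    """Hypothetical stand-in for the gRPC stub of the Metadata service."""
    uploaded: list = field(default_factory=list)

    def chunk_uploaded(self, file_id: str, index: int, chunk_hash: str) -> None:
        # A real client would make a remote call here; this just records it.
        self.uploaded.append((file_id, index, chunk_hash))


@dataclass
class UploaderService:
    blob_store: InMemoryBlobStore
    metadata: MetadataClient

    def upload_chunk(self, file_id: str, index: int, data: bytes) -> str:
        """Store one chunk, then report it to the Metadata service."""
        chunk_hash = hashlib.sha256(data).hexdigest()  # assumed blob key
        self.blob_store.put(chunk_hash, data)
        self.metadata.chunk_uploaded(file_id, index, chunk_hash)
        return chunk_hash


uploader = UploaderService(InMemoryBlobStore(), MetadataClient())
first = uploader.upload_chunk("file-1", 0, b"hello ")
second = uploader.upload_chunk("file-1", 1, b"world")
print(len(uploader.blob_store.blobs), len(uploader.metadata.uploaded))
```

Writing the blob before notifying the Metadata service means the metadata never points at a chunk that is not stored yet.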

- Metadata service
  - Talks to the metadata DB
  - Gets input from clients about the chunks and metadata they have
  - Gets input from the Uploader service about the chunks and metadata uploaded from the client
  - Offloads syncing of updated metadata to all devices to the Sync service
- Sync service
  - Powered by a message queue
  - Communicates only the diff
  - Push-pull model for all types of documents
    - For a file, push the metadata change and the client can pull what it doesn’t have
    - For a large file, pull the entire file.
- Metadata DB
- Replication service
  - Offline replication of all shards in all zones
- Clients
  - Store some metadata about the state of each file on that client
    - List of all files
    - Chunk info of each file
    - Locations
    - Versions
    - Last updated time
  - Have a chunker that does the actual chunking work
  - Have an indexer that stores which chunk goes where and re-creates the index when there is a change on the local client; it talks to the sync component
  - Send data to the indexer whenever there is a change on the local client
- Inline deduplication to avoid storing the same file on the server more than once
- Metadata partitioning for scale
- Caching of hot files
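
To make chunking, inline deduplication, and metadata partitioning concrete, here is a minimal sketch. It assumes fixed-size chunks identified by SHA-256 (the notes do not say how chunks are cut or named) and hash-based sharding by file ID; `chunk_hashes`, `chunks_to_upload`, and `shard_for` are illustrative names, not part of any real API.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; tiny for the demo, a real client would use far larger chunks


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Split data into fixed-size chunks and return each chunk's SHA-256 hex digest."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def chunks_to_upload(data: bytes, server_hashes: set, chunk_size: int = CHUNK_SIZE) -> list:
    """Return (index, chunk) pairs whose content the server does not already hold."""
    seen = set(server_hashes)  # copy so the caller's set is not mutated
    missing = []
    for index, start in enumerate(range(0, len(data), chunk_size)):
        chunk = data[start:start + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in seen:
            missing.append((index, chunk))
            seen.add(digest)  # dedupe repeated chunks within the same file
    return missing


def shard_for(file_id: str, num_shards: int) -> int:
    """Pick a metadata shard from a stable hash of the file ID (assumed scheme)."""
    digest = hashlib.sha256(file_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


old_version = b"aaaabbbbcccc"
new_version = b"aaaaXXXXcccc"
on_server = set(chunk_hashes(old_version))
print(chunks_to_upload(new_version, on_server))  # only the changed middle chunk
```

Only the chunk that changed is sent, which is the "communicate only the diff" idea applied at chunk granularity. Python's built-in `hash()` is avoided for sharding because it is randomized per process for strings.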
