Report System Design
Cassandra is suitable when data must be distributed across different
regions; shard (partition) the data for faster access
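A minimal sketch of the sharding idea above: a stable hash of the key picks the shard, so the same key always lands on the same shard. The function name `shard_for` and the hash-modulo scheme are assumptions for illustration; Cassandra's own partitioner differs in detail.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (illustrative only)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # hashlib gives the same digest on every machine and every run,
    # unlike the built-in hash(), which is randomized per process for str.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


shard = shard_for("user:42", 4)  # always the same index for this key
```

Plain modulo hashing moves most keys when the shard count changes, which is why production systems tend to prefer consistent hashing.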
A POST request is used when submitting data to the DB
A GET request is used when you need to read data from the DB
Load balancers are essential when traffic to the server is very high: run
multiple servers and let the load balancer divide the traffic among them
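A minimal sketch of how a load balancer might divide traffic, assuming a simple round-robin policy. The class name and the server names are hypothetical; real balancers also track health and load.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out servers in a fixed rotating order."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(servers)

    def next_server(self):
        # Each call returns the next server, wrapping around at the end.
        return next(self._rotation)


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
picks = [balancer.next_server() for _ in range(4)]
# picks == ["app-1", "app-2", "app-3", "app-1"]
```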
Each response must carry an appropriate HTTP status code so that the client
can handle errors
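A small sketch of client-side handling driven by the status code. The function names and the category labels are assumptions chosen for illustration; only the numeric ranges come from the HTTP standard.

```python
from http import HTTPStatus


def classify_status(status_code: int) -> str:
    """Return a coarse category the client can branch on."""
    if 100 <= status_code < 200:
        return "informational"
    if 200 <= status_code < 300:
        return "success"
    if 300 <= status_code < 400:
        return "redirect"
    if 400 <= status_code < 500:
        return "client_error"  # the request itself needs fixing
    if 500 <= status_code < 600:
        return "server_error"  # retrying later may help
    raise ValueError(f"not an HTTP status code: {status_code}")


def describe(status_code: int) -> str:
    """Human-readable line, e.g. for client logs."""
    phrase = HTTPStatus(status_code).phrase
    return f"{status_code} {phrase}: {classify_status(status_code)}"
```

For example, `describe(404)` yields `"404 Not Found: client_error"`.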
APIs are endpoints on the server that the client can access to request and
submit data
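To make the POST/GET split concrete, here is a sketch of two endpoint handlers over an in-memory store. The names `handle_post` and `handle_get` and the dictionary standing in for the DB are hypothetical; no web framework is involved.

```python
# In-memory stand-in for the database (assumption for illustration).
store = {}


def handle_post(body: dict):
    """POST: submit new data; returns (status_code, response_body)."""
    record_id = len(store) + 1
    store[record_id] = body
    return 201, {"id": record_id}


def handle_get(record_id: int):
    """GET: read existing data without modifying it."""
    if record_id not in store:
        return 404, {"error": "not found"}
    return 200, store[record_id]
```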
Since databases are essentially files stored on disk, reading from memory is
much faster, so keep frequently used data in a cache and apply an
appropriate cache eviction policy (for example LRU)
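One common eviction policy is least recently used (LRU). The sketch below is a minimal in-process version built on `collections.OrderedDict`; the class name and capacity are assumptions, and a real deployment would more likely rely on the cache service's built-in policy.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-size cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._items = OrderedDict()

    def get(self, key, default=None):
        if key not in self._items:
            return default  # cache miss: caller falls back to the DB
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # drop the oldest entry
```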
HTTP status codes can also be used for tracking and data analytics purposes
301 redirect (permanent): the browser caches the response, so less load on
the server but no analytics
302 redirect (temporary): the browser does not cache the response, so more
load on the server, but this allows data collection
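The trade-off between the two redirects can be sketched as a single decision. The function name and the `track_clicks` flag are hypothetical.

```python
def build_redirect(target_url: str, track_clicks: bool):
    """Return (status_code, headers) for a redirect response."""
    # 302: every visit reaches the server, so it can be counted.
    # 301: browsers may cache the redirect and skip the server next time.
    status = 302 if track_clicks else 301
    return status, {"Location": target_url}
```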
A single centralized cache is a single point of failure, and directing all
traffic to it can overload it: use multiple instances of a cache service
like Redis
Running multiple instances can produce duplicate tokens: use a token service
to assign a distinct token range to each instance
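A minimal sketch of a token service that hands out non-overlapping ranges, so instances never issue the same token. The class name, the range size of 1000, and the single-process lock are assumptions; a real service would persist its counter and run as a separate component.

```python
import threading


class TokenRangeService:
    """Allocate disjoint, consecutive integer ranges to callers."""

    def __init__(self, range_size: int = 1000):
        if range_size <= 0:
            raise ValueError("range_size must be positive")
        self._range_size = range_size
        self._next_start = 0
        self._lock = threading.Lock()

    def allocate(self) -> range:
        # The lock makes read-and-advance atomic across threads.
        with self._lock:
            start = self._next_start
            self._next_start += self._range_size
        return range(start, start + self._range_size)


service = TokenRangeService(range_size=1000)
first = service.allocate()   # range(0, 1000)
second = service.allocate()  # range(1000, 2000)
```

Each instance then issues tokens only from its own range and asks for a new range when it runs out.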
Use Kafka for analytics; instead of writing to Kafka on every request, use
batch writes. A failed batch write loses some events, but some data loss is
acceptable for analytics
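The batching idea can be sketched without a Kafka client: events are buffered and sent in one call once the buffer is full. `send_batch` is a hypothetical callable standing in for whatever producer call is actually used, and the class name and batch size are assumptions.

```python
class BatchingWriter:
    """Buffer events and hand them to send_batch in groups."""

    def __init__(self, send_batch, batch_size: int = 100):
        self._send_batch = send_batch  # hypothetical sink, e.g. a producer wrapper
        self._batch_size = batch_size
        self._buffer = []
        self.dropped = 0  # events lost to failed batch writes

    def record(self, event):
        self._buffer.append(event)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self):
        if not self._buffer:
            return
        batch, self._buffer = self._buffer, []
        try:
            self._send_batch(batch)
        except Exception:
            # Best effort: a failed write drops the whole batch,
            # which the notes accept for analytics data.
            self.dropped += len(batch)
```

Calling `flush()` on shutdown sends whatever is still buffered.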
Cassandra is a good NoSQL choice for write-heavy workloads
WebSockets keep a persistent two-way connection between client and server,
which the server uses to relay messages between users in real time
A chat application needs to be highly available and scalable because it
serves a large number of users across different regions
Combine regional traffic on websockets for faster delivery of messages
Compress data such as multimedia, text, and documents before storing it on
the server to save space
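A minimal sketch of compress-before-store using the standard `zlib` module. The function names and the compression level are assumptions.

```python
import zlib


def compress_for_storage(payload: bytes) -> bytes:
    """Compress bytes before writing them to storage."""
    return zlib.compress(payload, 6)  # level 6: balanced speed and ratio


def restore_from_storage(stored: bytes) -> bytes:
    """Reverse compress_for_storage."""
    return zlib.decompress(stored)


text = b"hello world " * 1000
stored = compress_for_storage(text)
saved = len(text) - len(stored)  # bytes saved for this repetitive input
```

Repetitive text compresses well; media that is already compressed (JPEG, MP4) usually gains little from a second pass.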
Move old data to cheaper storage meant for infrequently accessed data to
lower costs
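A small sketch of the tiering decision, assuming data counts as "old" once it has gone unread for a fixed period. The 90-day threshold, the tier names, and the function name are all assumptions.

```python
from datetime import datetime, timedelta, timezone


def storage_tier(last_accessed: datetime, now: datetime,
                 cold_after: timedelta = timedelta(days=90)) -> str:
    """Pick "cold" for data not accessed within cold_after, else "hot"."""
    return "cold" if now - last_accessed >= cold_after else "hot"


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
tier = storage_tier(now - timedelta(days=200), now)  # "cold"
```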
Use hashes (checksums) or error correction codes when delivering messages to
ensure that the delivered data has not been corrupted
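A minimal sketch of the hashing approach: the sender attaches a SHA-256 digest and the receiver recomputes it. The function names are assumptions.

```python
import hashlib
import hmac


def checksum(payload: bytes) -> str:
    """Hex SHA-256 digest sent alongside the message."""
    return hashlib.sha256(payload).hexdigest()


def is_intact(payload: bytes, expected_digest: str) -> bool:
    """True if the received payload matches the digest it was sent with."""
    return hmac.compare_digest(checksum(payload), expected_digest)
```

A hash only detects corruption, so the receiver has to request the message again; error correction codes can also repair a limited number of errors.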
Depending on traffic, use either a pull or a push mechanism for delivery of
messages: push delivers immediately but puts a huge load on the server in
high-traffic scenarios, while pull delays delivery until the client requests
the data and thus puts a low load on the server
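The two delivery modes can be sketched in one in-memory broker: connected users get messages pushed through a callback, and everyone else has messages queued until they pull. The class and method names are hypothetical, and the network layer is left out entirely.

```python
from collections import defaultdict, deque


class MessageBroker:
    """Push to connected users, queue for users who pull later."""

    def __init__(self):
        self._callbacks = {}                   # user -> push callback
        self._mailboxes = defaultdict(deque)   # user -> pending messages

    def connect(self, user, callback):
        """Register a push callback, e.g. backed by an open connection."""
        self._callbacks[user] = callback

    def send(self, user, message):
        callback = self._callbacks.get(user)
        if callback is not None:
            callback(message)  # push: delivered immediately
        else:
            self._mailboxes[user].append(message)  # held until pulled

    def pull(self, user):
        """Return and clear all pending messages for the user."""
        mailbox = self._mailboxes[user]
        messages = list(mailbox)
        mailbox.clear()
        return messages
```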