Saved Search Service Architecture
Separation of Concerns
1. Insertion Service is responsible for creating, updating, and deleting Saved Searches. Only this
service is allowed to write to the Saved Searches database, and only this service will have
access to the MongoDB master.
2. Extraction Service is responsible for reading Saved Searches. This service will interact with a
secondary node because it only needs to read data. The only parameter in the query will be
email_address; this field will be indexed for faster retrieval.
3. Compute Abstract & Send Alert Service: the Compute Service will consume the main
dubizzle.com listings search service and save listing search results for each email_address.
Since one user can have many saved searches, these will be inserted into MongoDB as
denormalized data. Two databases are used:
1. SavedSearch Database
2. Alerts Database
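The split above can be illustrated with a minimal sketch (the field names and the pymongo-style calls mentioned in comments are assumptions, not a final schema): the Extraction Service only ever filters on the indexed email_address, and saved searches are denormalized into one document per user.

```python
# Sketch of the document model and the read query (illustrative only;
# real code would use pymongo with a secondary read preference).

# One denormalized record per user: all saved searches live under one email.
saved_search_doc = {
    "email_address": "user@example.com",   # indexed field, sole query parameter
    "saved_searches": [
        {"category": "cars", "keywords": ["toyota", "2018"]},
        {"category": "laptops", "keywords": ["thinkpad"]},
    ],
}

# Index the Extraction Service relies on
# (pymongo equivalent: collection.create_index("email_address")).
index_spec = [("email_address", 1)]

def build_extraction_query(email: str) -> dict:
    """Build the only filter the Extraction Service ever issues."""
    return {"email_address": email}

query = build_extraction_query("user@example.com")
```

Because the query shape never varies, the single index on email_address covers every read the Extraction Service makes.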
Scalability Principles
1. Having separate infrastructure, i.e. load balancers and app servers, for the insertion service,
extraction service, and compute service.
2. Using sharding on MongoDB from the get-go with the hashed sharding technique, which allows
for uniform load distribution when writing data (https://www.mongodb.com/blog/post/
performance-best-practices-sharding).
3. Never disturbing the main dubizzle.com database and never writing to it in any way; only
requesting read access within the compute service to generate listing alerts and save them to a
different database, "Alerts".
4. Writing to the master via RabbitMQ: giving write access to only one service for the SavedSearch
database via RabbitMQ to cater for millions of requests daily, thus freeing the master up, and
using secondary/slave nodes to read the data.
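The write path in the last principle can be sketched as follows. This is a runnable simulation, not the production setup: the RabbitMQ broker (which a real deployment would reach through a client such as pika) is stood in for by a stdlib queue, and the MongoDB master by a plain dict.

```python
import json
import queue

# Simulation of the queued write path: the HTTP layer publishes, and the
# Insertion Service is the only component that touches the master.

write_queue = queue.Queue()  # stands in for the RabbitMQ queue

def publish_saved_search(email: str, search: dict) -> None:
    """HTTP layer: serialize the request and hand it to the broker."""
    message = json.dumps({"email_address": email, "saved_search": search})
    write_queue.put(message)

def consume_one(db: dict) -> None:
    """Insertion Service: the only component allowed to write to the master."""
    payload = json.loads(write_queue.get())
    db.setdefault(payload["email_address"], []).append(payload["saved_search"])

database = {}  # stands in for the SavedSearch MongoDB master
publish_saved_search("user@example.com", {"category": "cars", "keywords": ["toyota"]})
consume_one(database)
```

The queue absorbs write bursts so the master is never hit directly by millions of concurrent HTTP requests; the consumer drains it at the rate the database can sustain.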
Insert Service
We will use separate top-level nodes (app servers) for reading data and writing data. In the
interest of scalability, ease of development, documentation, and testing, and openness of
architecture, we will have one API for inserting new SavedSearches into the
system and a separate API for extracting existing SavedSearches from the system. This approach:
1. Ensures openness of architecture by allowing more data to be transmitted. For example, if in the
future dubizzle.com wants to allow a user to take a picture of a laptop or car, post it to the
website, and say "send me new listings which contain a laptop like this", an AI method will
then match and find similar listings. The POST call will still work because it can carry far more
data than a GET call, which is limited to data in the URL.
2. Developers will have to make fewer choices when writing code; they will know that by default all
calls on the SavedSearch API are POST calls, making their work easier.
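The POST-versus-GET argument can be made concrete with a sketch. The match_mode field and the image-search payload are assumptions about the hypothetical future feature, not part of the current API; the point is only that a JSON body carries arbitrarily large fields, while a GET request must squeeze everything into the URL.

```python
import base64
import json
from urllib.parse import urlencode

# A hypothetical future payload: the user attaches a photo and asks for
# similar listings. The image easily exceeds practical URL length limits,
# so it can only travel in a POST body.
fake_image = base64.b64encode(b"\x00" * 50_000).decode()  # ~66 KB of data

post_body = json.dumps({
    "email_address": "user@example.com",
    "match_mode": "similar_image",   # assumed field for the image-search feature
    "image": fake_image,
})

# The same data forced into a GET query string would blow far past the
# ~2 KB URL limit that many browsers and proxies enforce.
get_url = "https://dubizzle.com/savedsearch?" + urlencode({"image": fake_image})
```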
1. A load balancer serving HTTP servers (for the insertion API) for horizontal scaling.
2. The HTTP server code forwards the data to RabbitMQ, which fills all key-value pairs and converts
them to the search API data representation (leveraging MongoDB's dynamic fields).
3. This "create data representations for all products on the fly" operation will contain code
which generates all products' data representations given any one product's data.
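Step 3 above might look like the following sketch. The product names, key prefixes, and the trivial copy transform are assumptions for illustration: given one product's payload, key-value pairs are derived for every known product and merged into the same record, relying on MongoDB's dynamic fields to accept whatever keys appear.

```python
# Known products and the key prefix each one expects; in the real system
# these names would come from the static class described below.
PRODUCT_KEY_PREFIXES = {
    "cars": "car_search",
    "laptops": "laptop_search",
    "property": "property_search",
}

def build_all_representations(source_product: str, data: dict) -> dict:
    """Given one product's data, emit key-value pairs for every product.

    MongoDB's dynamic fields mean new keys need no schema migration.
    """
    record = {}
    for product, prefix in PRODUCT_KEY_PREFIXES.items():
        for key, value in data.items():
            record[f"{prefix}_{key}"] = value
        record[f"{prefix}_source"] = source_product  # provenance of the data
    return record

record = build_all_representations("cars", {"keywords": "toyota 2018"})
```

Adding a new product is then a one-line change to the prefix table; every existing record gains a representation for it the next time it flows through this step.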
Compute Service
Openness of Architecture
By creating data representations for all (new and existing) products on the fly, we will ensure that
if any number of new products are added to the system, their respective data representations will
automatically become available. This will also allow backward compatibility if we want to
change the data representation in the next version of any of our products.
This is essentially a script which creates new key-value pairs for the other products with the same
email address in the same MongoDB record.
To get the key names for the other products, this script will read data from a static class.
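That static class could be sketched like this (the class name, product set, and key names are all assumptions): it is a read-only registry the on-the-fly script consults to learn which keys to create for the other products sharing the email address.

```python
class ProductKeys:
    """Static registry of the key names each product stores per record.

    The on-the-fly script reads this class (it is never instantiated) to
    learn which key-value pairs to create for the other products.
    """
    CARS = ("car_make", "car_model", "car_year")
    LAPTOPS = ("laptop_brand", "laptop_model")
    PROPERTY = ("property_city", "property_bedrooms")

    @classmethod
    def all_products(cls) -> dict:
        """Map product name -> key names, for iteration by the script."""
        return {"cars": cls.CARS, "laptops": cls.LAPTOPS, "property": cls.PROPERTY}

def keys_for_other_products(current: str) -> dict:
    """Key names the script must add for every product except the current one."""
    return {p: k for p, k in ProductKeys.all_products().items() if p != current}

other = keys_for_other_products("cars")
```

Supporting a new product then means adding one attribute and one entry to all_products(); the script picks it up with no other changes.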