⚡ Wp-aggregator-api-document

The WP Aggregator API is a wrapper over ElasticSearch that allows querying WordPress content stored in an ElasticSearch cluster, addressing issues like double setup, advanced search capabilities, and content duplication. It supports auto-replication of content from parent to child WordPress sites and provides various search strategies for efficient data retrieval. The current version is design-v3, and it is built using PHP 8.0, Laravel 9, and ElasticSearch 7.2.

Uploaded by

anotherandres
Copyright
© All Rights Reserved
Wp-aggregator-api-document

WP Aggregator API
WP Aggregator API (WP AGG API) is a wrapper over an ElasticSearch datasource that runs queries on WordPress content data.
Using WP AGG API, we can get the WordPress content data.
WordPress data is stored in an ElasticSearch cluster, and WP AGG API queries the ES cluster to fetch the WP content data.

What problem does it solve?

Double Setup Problem
We have two setups: WordPress and the product site.
Using WP AGG API, we can serve the WordPress content on the product site.
Thus, we can remove the duplication of work across the two setups.
Advanced Search Problem
By using ElasticSearch as the datasource, we can use its advanced full-text and fuzzy search queries.
An ElasticSearch cluster is more scalable than MySQL (the original WordPress datasource).

Content Migration and Auto Replication | Duplicate Content Problem

Sites that have a parent-child relationship usually share the same content.
Thus, the same data gets created on different WordPress instances.
Using an ES index, we can share the same data across different sites,
and use WP AGG API for search and other queries.

Current Version
code version : design-v3
deployment version : D-V2

What is Auto-replication?
WordPress sites can have a parent-child relationship.
When a site is a child of a parent site, all changes to posts/articles on the parent are replicated to the child sites after a day.
Auto-replication is the process of replicating post/article content from the parent site to its child sites.

What is SearchAPI?
SearchAPI is an API similar to WP AGG API. It searches the WordPress index data in ElasticSearch.
Types of SearchAPI endpoints
webresult : used in the Ad API
smart.content : used in the product site to show articles
qna : used in the QnA widget

text
# smart.content
http://search-api.proxy.sem.infra/smartContentResults?keyword=eczema+treatment&strategy=es7
http://es.petradigital.agency/fetch?keyword=site%3Asmart.content+eczema+treatment&limit=1&fc=1&strategy=es7

# qna
http://search-api.proxy.sem.infra/qnaResults?keyword=eucrisa+atopic+dermatitis&limit=6&offset=0&source=default&strategy=es7
http://es.petradigital.agency/fetch?keyword=site%3Aqanda.widget+eucrisa+atopic+dermatitis&limit=6&offset=0&strategy=es7

# webresults
http://es.petradigital.agency/fetch/?&keyword=site:knowledgedesk.net%20car%20insurance%20quotes&limit=10&ver=10&strategy=es7
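
Comparing each pair of URLs above suggests that a SearchAPI request is forwarded to the ES fetch service with the keyword prefixed by `site:<source>`. The following is a minimal Python sketch of that composition, not the actual SearchAPI code: the helper name `build_fetch_url` is hypothetical, and the host and parameter names are simply copied from the smart.content example.

```python
from urllib.parse import urlencode

# Host taken from the example URLs in this section.
ES_FETCH_URL = "http://es.petradigital.agency/fetch"


def build_fetch_url(source: str, keyword: str, **params) -> str:
    """Build a fetch URL whose keyword carries a `site:<source>` prefix.

    Hypothetical helper for illustration; not part of SearchAPI.
    """
    query = {"keyword": f"site:{source} {keyword}", **params}
    # urlencode percent-encodes ':' as %3A and turns spaces into '+'.
    return f"{ES_FETCH_URL}?{urlencode(query)}"


url = build_fetch_url("smart.content", "eczema treatment",
                      limit=1, fc=1, strategy="es7")
print(url)
```

With these arguments the sketch reproduces the second smart.content URL shown above.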

WP AGG API is meant to solve a different problem, but the similarity between the two is hard to ignore. So we decided to replace SearchAPI by including its functionality in WP AGG API.

Tech Stack
PHP 8.0
Laravel 9
ElasticSearch 7.2

Class Diagram

class diagram image

Terminology and API Structure


What does the request body look like?

| field name | description | is_mandatory |
|---|---|---|
| site | name of the site/domain on which the index is created | yes |
| strategy | specifies the way in which to query ES. The fields in query change depending on the strategy | yes |
| query | the fields inside this depend on the chosen strategy | depends on strategy |
| limit | number of documents to fetch. default is 10 | no |
| offset | number of initial documents to skip. default is 0 | no |
| orderBy | sort the searched documents on an ES field. default is score. [post_date, post_modified] | no |
| setup | response structure change. default is new. use old if you want a SearchAPI-like response | no |
| additional_filters | can specify extra parameters while searching | no |
| language | used to get article results in the respective language | no |
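
As a rough illustration of the fields described above, here is a minimal Python sketch that assembles a request body and applies the documented defaults. The top-level field names come from this document; the helper `build_request_body` and the keys inside `query` (`keyword`, `searchModel`) are assumptions made for the example, since the real query fields per strategy are defined in the Postman collection.

```python
import json

ORDER_BY_VALUES = {"score", "post_date", "post_modified"}
SETUP_VALUES = {"new", "old"}


def build_request_body(site, strategy, query, *, limit=10, offset=0,
                       order_by=None, setup=None,
                       additional_filters=None, language=None):
    """Assemble a WP AGG API request body (hypothetical helper)."""
    if not site or not strategy:
        raise ValueError("site and strategy are mandatory")
    if order_by is not None and order_by not in ORDER_BY_VALUES:
        raise ValueError(f"orderBy must be one of {sorted(ORDER_BY_VALUES)}")
    if setup is not None and setup not in SETUP_VALUES:
        raise ValueError(f"setup must be one of {sorted(SETUP_VALUES)}")

    body = {"site": site, "strategy": strategy, "query": query,
            "limit": limit, "offset": offset}
    optional = {"orderBy": order_by, "setup": setup,
                "additional_filters": additional_filters,
                "language": language}
    # Leave out optional fields so the API falls back to its own defaults.
    body.update({k: v for k, v in optional.items() if v is not None})
    return body


body = build_request_body(
    "vervily.com", "articles",
    {"keyword": "eczema treatment", "searchModel": "t1c1t0c0"},  # assumed keys
    limit=5, order_by="post_date",
)
print(json.dumps(body, indent=2))
```

Optional fields are omitted rather than sent as null, on the assumption that the API applies its defaults when a field is absent.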

Additional Filters :

text
- The additional filters feature is excluded for strategies like category_tree, links, categories, tags.
- Currently it supports 3 filters, i.e. post_modified, post_date, siteurl.
# post_modified & post_date support 4 params (gt, gte, lt, lte); the value must be in YYYY-MM-DD format.
*note: combining gt with gte, or lt with lte, won't work
# siteurl must be of type string.
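
The rules above can be checked on the client before a request is sent. The sketch below is one possible way to do that in Python. It assumes that `additional_filters` is a JSON object keyed by filter name, with date filters holding an object of range params; that layout and the function `validate_additional_filters` are hypothetical, not taken from the API.

```python
import re
from datetime import datetime

DATE_FILTERS = {"post_modified", "post_date"}
RANGE_PARAMS = {"gt", "gte", "lt", "lte"}
EXCLUDED_STRATEGIES = {"category_tree", "links", "categories", "tags"}


def is_valid_date(value) -> bool:
    """True only for real calendar dates written strictly as YYYY-MM-DD."""
    if not isinstance(value, str) or not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True


def validate_additional_filters(strategy: str, filters: dict) -> list[str]:
    """Return a list of problems; an empty list means the filters look valid."""
    if strategy in EXCLUDED_STRATEGIES:
        return [f"additional_filters is not supported for strategy '{strategy}'"]
    errors = []
    for name, value in filters.items():
        if name in DATE_FILTERS:
            if not isinstance(value, dict):
                errors.append(f"{name}: expected an object of range params")
                continue
            params = set(value)
            for unknown in sorted(params - RANGE_PARAMS):
                errors.append(f"{name}: unsupported param '{unknown}'")
            if {"gt", "gte"} <= params:
                errors.append(f"{name}: gt and gte cannot be combined")
            if {"lt", "lte"} <= params:
                errors.append(f"{name}: lt and lte cannot be combined")
            for param, date in value.items():
                if param in RANGE_PARAMS and not is_valid_date(date):
                    errors.append(f"{name}.{param}: value must be YYYY-MM-DD")
        elif name == "siteurl":
            if not isinstance(value, str):
                errors.append("siteurl: must be a string")
        else:
            errors.append(f"{name}: unsupported filter")
    return errors


print(validate_additional_filters(
    "articles",
    {"post_date": {"gte": "2023-01-01", "lt": "2023-02-01"},
     "siteurl": "vervily.com"},
))
```

Returning a list of messages instead of raising on the first problem lets a caller report every issue in one pass.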

What is Strategy?

text
- A strategy in WP AGG API is one of a number of ways in which we can search and fetch WP content data.
- Each strategy has its own query fields, unique to it. You can see them in the Postman collection.

| strategy | description | query_type | response class |
|---|---|---|---|
| articles | get articles by SearchModel and keyword | search | WPResponse |
| articles_by_ids | get articles by a list of ids | search | WPResponse |
| articles_by_slugs | get articles by a list of slugs | search | WPResponse |
| articles_by_category | get articles by a category | search | WPResponse |
| articles_by_tag | get articles by a tag | search | WPResponse |
| articles_by_author | get articles by an author name | search | WPResponse |
| links | Internal. Gives a result depending on the ComputeModel | search | WPLinksResponse |
| category_tree | get the category_tree of categories | search | WPCatTreeResponse |
| categories | get category and count data | aggregation | WPTaxResponse |
| tags | get tag and count data | aggregation | WPTaxResponse |
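
For client code it can help to keep the strategy table as data. The Python sketch below copies the query type and response class names from the table; the lookup helper `describe_strategy` is hypothetical and only illustrates how a caller might decide which response shape to expect.

```python
# strategy -> (query_type, response class), copied from the table above.
STRATEGIES = {
    "articles": ("search", "WPResponse"),
    "articles_by_ids": ("search", "WPResponse"),
    "articles_by_slugs": ("search", "WPResponse"),
    "articles_by_category": ("search", "WPResponse"),
    "articles_by_tag": ("search", "WPResponse"),
    "articles_by_author": ("search", "WPResponse"),
    "links": ("search", "WPLinksResponse"),
    "category_tree": ("search", "WPCatTreeResponse"),
    "categories": ("aggregation", "WPTaxResponse"),
    "tags": ("aggregation", "WPTaxResponse"),
}


def describe_strategy(strategy: str) -> dict:
    """Look up the query type and response class for a strategy name."""
    try:
        query_type, response_class = STRATEGIES[strategy]
    except KeyError:
        raise ValueError(f"unknown strategy: {strategy!r}") from None
    return {"strategy": strategy,
            "query_type": query_type,
            "response_class": response_class}


print(describe_strategy("categories"))
```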

What are SearchModels for the articles strategy?

text
- A SearchModel tells the articles strategy how to search the keyword on ES fields.
- The important ES fields to search are post_title/title, post_content/content, post_tags/tags and category.
- With these 4 fields, we can search ES effectively for WP content data.
- So we built the SearchModel on these 4 fields.
- A SearchModel name is a combination of the 4 fields, e.g. t0c0t0c0.
- In t0c0t0c0, the letters stand for title, content, tags and category respectively, while the number following each letter represents the boost for that field.

| name | description | available |
|---|---|---|
| t0c0t0c0 | keyword is not searched on any ES field | yes |
| t1c1t0c0 | keyword is searched on title and content | yes |
| t1c1t1c1 | keyword is searched on title, content, tags and categories | no |
| t1c0t3c2 | keyword is searched on title, on tags with boost 3 and on categories with boost 2 | no |
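
The naming scheme is regular enough to decode mechanically. The following Python sketch parses a SearchModel name into per-field boosts. The function names are hypothetical, and treating a boost of 0 as "field not searched" is an interpretation based on the descriptions in the table.

```python
import re

# Field order encoded in a SearchModel name: title, content, tags, category.
FIELDS = ("title", "content", "tags", "category")
_NAME = re.compile(r"t(\d+)c(\d+)t(\d+)c(\d+)")


def parse_search_model(name: str) -> dict[str, int]:
    """Map a name such as 't1c0t3c2' to a boost per field."""
    match = _NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a SearchModel name: {name!r}")
    return dict(zip(FIELDS, map(int, match.groups())))


def searched_fields(name: str) -> dict[str, int]:
    """Keep only the fields the keyword is searched on (boost > 0)."""
    return {f: b for f, b in parse_search_model(name).items() if b > 0}


print(parse_search_model("t1c0t3c2"))
print(searched_fields("t1c1t0c0"))
```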

Secondary APIs
There are a few other APIs, for purposes other than search and aggregation.

text
health check - /ping
# if you cache using the WP AGG API cache, you should not use it for production usage. use Varnish or another HTTP cache.
flush_cache - /flush_cache
# we cache the aggregation query_type, as aggregation queries are heavy on ES. Generally, aggregations don't change much. But if a lot of data has changed, then you should flush this cache.
flush_cache for aggregation - /flush_cache
# we use a Redis cache and the TTL is 7 days.
# for more details, see the Postman collection.
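
To make the caching behaviour described above concrete, here is a small Python sketch of a cache that stores only aggregation results and expires them after 7 days. It is an in-memory stand-in written for illustration, not the Redis-backed implementation: the class `AggregationCache`, its methods, and the `(site, strategy)` cache key are all assumptions.

```python
import time

CACHE_TTL_SECONDS = 7 * 24 * 60 * 60  # 7 days, as stated above
# Strategies whose query_type is aggregation.
AGGREGATION_STRATEGIES = {"categories", "tags"}


class AggregationCache:
    """In-memory stand-in for the Redis cache (hypothetical)."""

    def __init__(self, clock=time.time):
        self._store = {}
        self._clock = clock  # injectable so expiry can be tested

    def put(self, site, strategy, value) -> bool:
        """Cache a result; search strategies are never cached."""
        if strategy not in AGGREGATION_STRATEGIES:
            return False
        expires_at = self._clock() + CACHE_TTL_SECONDS
        self._store[(site, strategy)] = (value, expires_at)
        return True

    def get(self, site, strategy):
        """Return the cached value, or None if missing or expired."""
        entry = self._store.get((site, strategy))
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[(site, strategy)]
            return None
        return value

    def flush(self):
        """Drop everything, mirroring what /flush_cache is described to do."""
        self._store.clear()


cache = AggregationCache()
cache.put("vervily.com", "tags", {"health": 12})
print(cache.get("vervily.com", "tags"))
```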

Deployment Diagram

deployment diagram image

Index Patterns
text
- base prefix : wp-es7-plugin
- for consistency, replace any special characters in the domain name with -
- ex. vervily.com -> vervily-com , content.vervily.com -> content-vervily-com, vervily-com
- special case : WordPress domain aliases
- if the domain is content.vervily.com or blog.vervily.com
- then the alias will be [wp-es7-plugin-vervily-com, wp-es7-plugin-content-vervily-com]

| type | pattern | description | name | sample |
|---|---|---|---|---|
| base prefix | wp-es7-plugin | all others should be prefixed with this | | |
| wordpress domain index | wp-es7-plugin-site-{domain} | RTB should use this pattern while creating the index | vervily.com | wp-es7-plugin-site-vervily-com |
| wordpress domain alias | wp-es7-plugin-api-{domain} | generated automatically by the wp es7 plugin | vervily.com | wp-es7-plugin-api-vervily-com |
| wordpress domain index with lang | wp-es7-plugin-site-{domain}_{code} | generated automatically by the es7 plugin | vervily.com | wp-es7-plugin-site-vervily-com_es |
| wordpress domain alias with lang | wp-es7-plugin-api-{domain}_{code} | generated automatically | vervily.com | wp-es7-plugin-api-vervily-com_es |
| category tree index | wp-es7-plugin_cat-{domain} | generated automatically | vervily.com | wp-es7-plugin_cat-vervily-com |
| Index Mapper Alias (alias) | wp-es7-plugin-alias-{domain} | generated from the index mapper, alias over category domain indices | vervily.com | wp-es7-plugin-alias-vervily-com |

Postman API Document


wp agg api's api document
how to import postman collection

Future Plan
Add Support for Multiple ComputeModel for LinksAPI
Add Support for Multi-Indexed Alias

Ref Links
wp agg api prod url
wp agg api stage url
wp agg api repo
project status csv
class diagram and deployment diagram
varnish APS wp agg api url
APS wp agg api url

Grafana Links
redis server
consul
server monitor
request-response
loki
