OpenDoor Assignment
Overreliance on dashboards leads to information overload, making it difficult for teams to extract meaningful insights.
Dashboards are unable to provide detailed explanations for the observed data trends, limiting the teams' ability to address root causes.
The impracticality of scaling analytics without scaling analysts hinders the ability to provide timely and detailed insights.
Unified Anomaly Detection
Recognizing the company-wide challenge, we acknowledged the necessity for a comprehensive anomaly
detection product to address issues at scale. The following options were considered for its development.
In-house development
● Expertise
● Advanced algorithms
● Scalability
Third-party tools
● Cost
● Vendor lock-in
● Customization limitations
1 Feedback Gathering
We collaborated with various analytics teams to gather their insights and understand
the specific needs across the organization.
2 Proof of Concept
Leveraging a strategic partnership with a third-party company, we were able to roll out a
proof of concept within six months.
3 Platform Development
The collaborative effort led to the development and refinement of a robust and scalable
anomaly detection platform, addressing the challenges faced by multiple teams.
High level Architecture
[Architecture diagram. Components: Input data (client data, metadata of KPIs and dimensions), auto-scale analytics worker pool, TS Module, AD Module, AS Module, Combine stage, Insights, Web Interface.]
High-level architecture of the anomaly detection pipeline showing its key components: the Time Series (TS)
module, the Anomaly Detection (AD) module and the Anomaly Summarisation (AS) module.
High level Architecture - TS Module
The Time Series (TS) module reads the business data and converts it into trackers. The trackers are then
used as input for the AD module.
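The conversion of business data into trackers can be sketched as below. This is only an illustration under stated assumptions: the slides do not define the tracker format, so here a tracker is assumed to be one time series per (KPI, dimension, dimension value) combination, and the function `build_trackers`, the row schema and the field names are all hypothetical.

```python
from collections import defaultdict

def build_trackers(rows, kpis, dimensions):
    """Turn flat business records into trackers (assumed shape): one time
    series per (KPI, dimension, dimension value) combination."""
    # tracker key -> {date -> summed KPI value}
    sums = defaultdict(lambda: defaultdict(float))
    for row in rows:
        for kpi in kpis:
            for dim in dimensions:
                key = (kpi, dim, row[dim])
                sums[key][row["date"]] += row[kpi]
    # Sort each series by date so downstream modules see ordered points.
    return {key: sorted(series.items()) for key, series in sums.items()}

# Hypothetical input rows; the schema is an assumption for this sketch.
rows = [
    {"date": "2024-01-01", "country": "DE", "channel": "web", "orders": 10},
    {"date": "2024-01-01", "country": "DE", "channel": "app", "orders": 5},
    {"date": "2024-01-02", "country": "DE", "channel": "web", "orders": 12},
    {"date": "2024-01-02", "country": "FR", "channel": "web", "orders": 7},
]
trackers = build_trackers(rows, kpis=["orders"], dimensions=["country", "channel"])
```

Each value in `trackers` is a date-ordered list of (date, value) pairs, which is the shape the later sketches assume as AD module input.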
High level Architecture - AD Module
The Anomaly Detection (AD) module performs anomaly detection on all the trackers created by the TS
module, using statistical analysis and machine learning algorithms. The results of the anomaly detection are
then stored in blob storage for further processing by the AS module.
High level Architecture - AS Module
The Anomaly Summarisation (AS) module reads the AD output and performs anomaly summarisation,
noise reduction and conditional probability analysis to bring out “what happened and why it happened”
for each insight as clearly as possible.
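One possible reading of the conditional probability analysis is sketched below: for every dimension value, estimate P(anomaly | dimension = value) and compare it with the overall anomaly rate, so the values most associated with anomalies surface first. The slides do not describe the actual method, so the record layout, the `rank_drivers` function and the lift ranking are assumptions made for illustration.

```python
from collections import Counter

def rank_drivers(records, dimensions):
    """Rank (dimension, value) pairs by P(anomaly | dimension = value).

    `records` is an assumed AD output format: one dict per tracker point,
    holding its dimension values and an `is_anomaly` flag.
    """
    totals, anomalous = Counter(), Counter()
    for rec in records:
        for dim in dimensions:
            key = (dim, rec[dim])
            totals[key] += 1
            if rec["is_anomaly"]:
                anomalous[key] += 1
    # Overall anomaly rate, used as the baseline for the lift.
    base_rate = sum(rec["is_anomaly"] for rec in records) / len(records)
    ranked = []
    for (dim, value), count in totals.items():
        p = anomalous[(dim, value)] / count
        lift = p / base_rate if base_rate else 0.0
        ranked.append({"dimension": dim, "value": value, "p_anomaly": p, "lift": lift})
    # Highest conditional probability first; ties broken alphabetically.
    ranked.sort(key=lambda r: (-r["p_anomaly"], r["dimension"], r["value"]))
    return base_rate, ranked

# Hypothetical AD output.
records = [
    {"country": "DE", "channel": "web", "is_anomaly": True},
    {"country": "DE", "channel": "app", "is_anomaly": True},
    {"country": "FR", "channel": "web", "is_anomaly": False},
    {"country": "FR", "channel": "app", "is_anomaly": False},
]
base_rate, ranked = rank_drivers(records, dimensions=["country", "channel"])
```

In this toy input every anomaly sits in country DE while the channels are split evenly, so the country dimension explains the anomalies and the channel dimension does not.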
AD module - Overview
The Anomaly Detection (AD) module is made up of three sub-modules:
1. Pre-processor
2. Meta learners
3. Detectors
[Diagram. Components: Tracker, Pre-processor (data imputation, trend & seasonality extraction, forecastability assessment), Meta learners (feature extraction, dynamic sensitivity selection, model selection, hyperparameter selection), Trained Model, Detectors (dynamic batch forecasting, forecast bands generation, anomaly detection), Output.]
AD module - Preprocessor
● Data Imputation: Fill missing values with interpolation or imputation techniques to minimize their impact on analysis.
● Trend & Seasonality Extraction: Time series are decomposed into their constituent components, which include trend, seasonal patterns and residuals.
● Forecastability Assessment: Perform tests (white noise, random walk, etc.) to assess time series forecastability, ensuring suitability for modeling.
[Diagram: AD module overview, which on this slide also includes a Change Point detection box.]
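The three pre-processing steps can be illustrated with the minimal sketch below. It is not the platform's implementation: linear interpolation, a classical additive decomposition (odd periods only) and a lag-1 autocorrelation band are stand-ins chosen for brevity, and all function names are hypothetical.

```python
import math

def interpolate_missing(values):
    """Fill None gaps by linear interpolation; edge gaps copy the nearest known value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled

def decompose_additive(values, period):
    """Additive decomposition: centered moving-average trend, per-position
    seasonal means (centered to sum to zero), and the remaining residual."""
    n = len(values)
    if period < 3 or period % 2 == 0:
        raise ValueError("this sketch supports odd periods >= 3 only")
    if n < 2 * period - 1:
        raise ValueError("series too short for this period")
    half = period // 2
    trend = [None] * n  # undefined at the first and last `half` points
    for t in range(half, n - half):
        trend[t] = sum(values[t - half:t + half + 1]) / period
    buckets = [[] for _ in range(period)]
    for t in range(half, n - half):
        buckets[t % period].append(values[t] - trend[t])
    means = [sum(b) / len(b) for b in buckets]
    offset = sum(means) / period
    seasonal = [means[t % period] - offset for t in range(n)]
    residual = [None if trend[t] is None else values[t] - trend[t] - seasonal[t]
                for t in range(n)]
    return trend, seasonal, residual

def lag1_autocorrelation(values):
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        return 0.0
    return sum((values[t] - mean) * (values[t - 1] - mean) for t in range(1, n)) / denom

def looks_like_white_noise(values):
    """Crude check: lag-1 autocorrelation inside the approximate 95% band
    of +/- 1.96 / sqrt(n) that white noise would satisfy."""
    return abs(lag1_autocorrelation(values)) <= 1.96 / math.sqrt(len(values))
```

A series that passes the white-noise check has little structure for a model to learn, which is one reason to screen trackers before modeling. A production test would look at more than one lag.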
AD module - Meta learners
● Feature Extraction: Extract relevant features from time series data for model selection.
● Dynamic Sensitivity Selection: Meta-learners algorithmically detect the nature of the data and optimize model sensitivity.
● Model Selection: A classification algorithm identifies the best predictive model for each time series, ensuring accuracy.
● Hyperparameter Selection: Selects optimal hyperparameters and fine-tunes models. The algorithm suite used by the AD module includes …
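Feature extraction followed by model selection as a classification problem can be sketched as below. The features (autocorrelation at lag 1 and at the seasonal lag), the 1-nearest-neighbour classifier, the labelled examples and the model names are all assumptions for illustration; the slides do not say which features or classifier the platform uses. Dynamic sensitivity and hyperparameter selection are not covered here.

```python
import math

def autocorrelation(values, lag):
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0 or lag >= n:
        return 0.0
    num = sum((values[t] - mean) * (values[t - lag] - mean) for t in range(lag, n))
    return num / denom

def extract_features(values, period):
    """Tiny feature vector: short-term memory and seasonal strength."""
    return (autocorrelation(values, 1), autocorrelation(values, period))

def select_model(features, labelled_examples):
    """1-nearest-neighbour classifier: return the label of the closest example."""
    nearest = min(labelled_examples, key=lambda ex: math.dist(features, ex[0]))
    return nearest[1]

# Hypothetical training set: (features, model that performed best offline).
LABELLED_EXAMPLES = [
    ((0.9, 0.3), "linear_trend"),
    ((0.0, 0.9), "seasonal_naive"),
    ((0.0, 0.0), "moving_average"),
]

weekly = [10, 12, 11, 13, 12, 30, 28] * 6   # strong weekly pattern
trend = [float(t) for t in range(42)]       # steady growth

weekly_model = select_model(extract_features(weekly, 7), LABELLED_EXAMPLES)
trend_model = select_model(extract_features(trend, 7), LABELLED_EXAMPLES)
```

The point of the design is that a cheap classifier over features replaces fitting every candidate model on every tracker, which matters when trackers number in the thousands.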
AD module - Detectors
● Dynamic batch forecasting: Forecasts are generated in batch sizes that are determined dynamically, balancing compute-optimized execution against the highest quality forecasts.
● Anomaly detection: Z-scores between forecasts and actuals identify anomalies, subject to user-defined limits for anomaly detection.
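The z-score step can be sketched as follows. The sketch assumes the z-score is the forecast error divided by the standard deviation of historical forecast errors, and that the user-defined limit is a threshold on its absolute value. The slides confirm neither detail, and the function names are hypothetical.

```python
import statistics

def forecast_band(forecast, residual_std, z_limit=3.0):
    """Band around a forecast; actuals outside it count as anomalous."""
    return (forecast - z_limit * residual_std, forecast + z_limit * residual_std)

def detect_anomalies(actuals, forecasts, residual_std, z_limit=3.0):
    """Flag points whose forecast error, measured in units of residual_std,
    exceeds the user-defined z_limit."""
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must have the same length")
    if residual_std <= 0:
        raise ValueError("residual_std must be positive")
    results = []
    for index, (actual, forecast) in enumerate(zip(actuals, forecasts)):
        z = (actual - forecast) / residual_std
        results.append({"index": index, "z": z, "is_anomaly": abs(z) > z_limit})
    return results

# Hypothetical historical forecast errors (actual minus forecast).
history_errors = [1.0, -2.0, 0.5, -0.5, 2.0, -1.0]
sigma = statistics.stdev(history_errors)

results = detect_anomalies(
    actuals=[100.0, 103.0, 120.0],
    forecasts=[101.0, 102.0, 104.0],
    residual_std=sigma,
    z_limit=3.0,
)
flags = [r["is_anomaly"] for r in results]
```

Raising `z_limit` makes detection less sensitive. This is the kind of knob that the dynamic sensitivity selection described earlier would presumably tune per tracker.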
Driving Measurable Business Impact
Embraced by 17 teams across diverse functions, including pricing, marketing, operations, fraud, and CRM.
Implemented over 50 use cases, yielding substantial value and annual cost savings of $7 million.
Enabled teams to quickly identify and mitigate issues, leading to improved customer experiences and enhanced operational efficiency.
Marketing: Identified and mitigated promo code abuse, resulting in cost savings and improved customer experiences.
Pricing: Identified pricing glitches and shipping subsidy issues to mitigate financial impact.
Operations: Detected lost parcels of a particular brand, enabling prompt corrective action and enhancing operational efficiency.
Fraud: Uncovered unknown abuse patterns, minimizing financial losses and bolstering platform security.
Thank You