Regime Detection via Unsupervised Learning
Problem Statement :: Regime Detection via Unsupervised Learning from Order Book and
Volume Data
You may not be familiar with many of the technical terms and jargon used here. Feel free to use ChatGPT!
Objective :: Segment the market into distinct behavioral regimes depending on 3 factors:
1. Trending vs Mean-reverting
2. Volatile vs Stable
3. Liquid vs Illiquid
The segmentation should use unsupervised learning on features built from real-time order book and volume data.
Note: This is an example breakdown. You can structure your solution differently or build
more on top of it; doing so will count as a plus.
1. Feature Engineering
Extract meaningful, hand-crafted features from each timestamp's data. For example:
Volume Features:
Volume imbalance (buy volume vs sell volume)
Cumulative volume over the last 10s and 30s
VWAP shift (change in VWAP over short windows)
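A minimal sketch of the volume features above, using only the Python standard library. The `Trade` record, its field names, and the convention that `side` is the aggressor side are assumptions made for illustration; adapt them to whatever the actual data provides.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    ts: float     # trade time in seconds (assumed unit)
    price: float
    size: float
    side: str     # "buy" or "sell", assumed to be the aggressor side


def window(trades, now, seconds):
    """Trades with timestamp in the half-open interval (now - seconds, now]."""
    return [t for t in trades if now - seconds < t.ts <= now]


def volume_imbalance(trades):
    """(buy - sell) / (buy + sell), in [-1, 1]; 0.0 if there is no volume."""
    buy = sum(t.size for t in trades if t.side == "buy")
    sell = sum(t.size for t in trades if t.side == "sell")
    total = buy + sell
    return (buy - sell) / total if total else 0.0


def cumulative_volume(trades):
    """Total traded size in the given list of trades."""
    return sum(t.size for t in trades)


def vwap(trades):
    """Volume-weighted average price, or None if there is no volume."""
    vol = sum(t.size for t in trades)
    return sum(t.price * t.size for t in trades) / vol if vol else None


def vwap_shift(trades, now, seconds):
    """VWAP of the latest window minus VWAP of the window just before it."""
    cur = vwap(window(trades, now, seconds))
    prev = vwap(window(trades, now - seconds, seconds))
    if cur is None or prev is None:
        return 0.0
    return cur - prev
```

For example, `volume_imbalance(window(trades, now, 10))` and `cumulative_volume(window(trades, now, 30))` give the 10s imbalance and 30s cumulative volume at time `now`.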
Derived Features:
Sloped depth: quantify how quickly size decays away from the top of the book
Trade wipe level: average number of price levels wiped by trades over a window (10s, 30s, etc.)
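One possible way to compute the two derived features is sketched below. Both definitions are assumptions, since the brief does not fix them: sloped depth is taken as the least-squares slope of log(size) against level index, and a level counts as "wiped" when a trade consumes its full displayed size. The function names and input shapes are hypothetical.

```python
import math


def depth_slope(sizes):
    """Least-squares slope of log(size) vs. level index (0 = top of book).

    A more negative slope means size decays faster away from the top.
    Levels with non-positive size are skipped.
    """
    pts = [(i, math.log(s)) for i, s in enumerate(sizes) if s > 0]
    n = len(pts)
    if n < 2:
        return 0.0
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return sxy / sxx


def levels_wiped(trade_size, book_sizes):
    """Number of levels fully consumed by one trade.

    book_sizes lists the sizes on the side being hit, best level first,
    as they were just before the trade (an assumed input).
    """
    remaining = trade_size
    wiped = 0
    for size in book_sizes:
        if remaining < size:
            break
        remaining -= size
        wiped += 1
    return wiped


def trade_wipe_level(events):
    """Average levels wiped over (trade_size, book_sizes) pairs in a window."""
    if not events:
        return 0.0
    return sum(levels_wiped(q, book) for q, book in events) / len(events)
```

The caller is expected to select the events that fall inside the 10s or 30s window before calling `trade_wipe_level`.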
2. Data Normalization
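The brief does not prescribe a normalization method. As one common option, the sketch below standardizes each feature column to zero mean and unit variance (z-scores), so that features on different scales contribute comparably to distance-based clustering. The function name and the row-per-timestamp layout are assumptions.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column of a list of equal-length feature rows.

    Constant columns (zero standard deviation) are divided by 1.0 instead,
    so they map to 0.0 rather than raising a division error.
    """
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) or 1.0 for c in cols]
    return [
        [(v - m) / s for v, m, s in zip(row, mus, sigmas)]
        for row in rows
    ]
```

For real-time use, the means and standard deviations would normally be estimated on past data only and then applied to new observations, to avoid look-ahead.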
3. Clustering
Average volatility
Typical spread and liquidity
Price movement directionality
Name and describe each regime (e.g., "Trending & Liquid & Stable", "Mean Reverting &
Illiquid & Volatile")
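As a rough illustration of clustering and then characterizing each cluster, the sketch below implements a small k-means in plain Python and computes per-cluster feature averages. K-means is only one possible algorithm (the brief does not name one), the number of clusters is an assumption you would need to choose, and the feature names passed to `cluster_profile` are hypothetical. In practice a library implementation such as scikit-learn's `KMeans` would be the usual choice.

```python
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain k-means on a list of equal-length numeric rows.

    Returns (labels, centers). Initial centers are k distinct rows drawn
    with a seeded RNG, so results are reproducible for a given seed.
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, new_labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centers


def cluster_profile(rows, labels, names):
    """Mean of each named feature within each cluster."""
    profile = {}
    for c in sorted(set(labels)):
        members = [r for r, l in zip(rows, labels) if l == c]
        profile[c] = {
            name: sum(col) / len(members)
            for name, col in zip(names, zip(*members))
        }
    return profile
```

Clustering would typically run on the normalized features, while `cluster_profile` is more readable when applied to the raw features, since averages in original units (volatility, spread, and so on) are easier to turn into regime names.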
Check whether regime changes are correlated, i.e., estimate the probability that one regime
follows another (for example, that “Trending & Liquid & Stable” follows “Mean Reverting & Illiquid & Volatile”).
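A minimal sketch of estimating these transition probabilities from the sequence of regime labels, one label per timestamp. The function names are hypothetical. `collapse_runs` is an optional step for when only actual regime changes should be counted, ignoring how long each regime persisted.

```python
from collections import Counter, defaultdict


def transition_matrix(labels):
    """Empirical P(next = b | current = a) from consecutive label pairs.

    Returns a nested dict {a: {b: probability}}; each row sums to 1.
    """
    counts = defaultdict(Counter)
    for a, b in zip(labels, labels[1:]):
        counts[a][b] += 1
    return {
        a: {b: n / sum(row.values()) for b, n in row.items()}
        for a, row in counts.items()
    }


def collapse_runs(labels):
    """Drop consecutive repeats so that only regime changes remain."""
    out = []
    for label in labels:
        if not out or out[-1] != label:
            out.append(label)
    return out
```

Calling `transition_matrix(labels)` includes self-transitions (regime persistence), while `transition_matrix(collapse_runs(labels))` conditions on a change having occurred.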
Deliverables: