QuantConnect Research Environment (Python)
Table of Contents
1 Key Concepts
1.1 Getting Started
1.2 Research Engine
2 Initialization
3 Datasets
3.1 Key Concepts
3.2 US Equity
3.3 Equity Fundamental Data
3.8.2 Universes
3.8.3 Individual Contracts
3.9 Forex
3.10 CFD
3.11 Indices
3.12 Index Options
3.12.1 Key Concepts
3.12.2 Universes
3.12.3 Individual Contracts
3.13 Alternative Data
3.14 Custom Data
4 Charting
4.1 Bokeh
4.2 Matplotlib
4.3 Plotly
4.4 Seaborn
4.5 Plotly.NET
5 Universes
6 Indicators
6.1 Data Point Indicators
6.2 Bar Indicators
6.3 Trade Bar Indicators
6.4 Combining Indicators
6.5 Custom Indicators
10 Meta Analysis
10.1 Key Concepts
10.2 Backtest Analysis
10.3 Optimization Analysis
10.4 Live Analysis
10.5 Live Deployment Automation
11 Applying Research
11.1 Key Concepts
11.2 Mean Reversion
11.3 Random Forest Regression
11.4 Uncorrelated Assets
11.5 Kalman Filters and Stat Arb
11.6 PCA and Pairs Trading
11.7 Hidden Markov Models
11.8 Long Short-Term Memory
11.9 Airline Buybacks
11.10 Sparse Optimization
Key Concepts
Getting Started
Introduction
The Research Environment is a Jupyter-notebook-based, interactive command-line environment where you can
access our data through the QuantBook class. The environment supports both Python and C#. If you use Python,
you can import code from the code files in your project into the Research Environment to aid development.
Before you run backtests, we recommend testing your hypothesis in the Research Environment. It's easier to
perform data analysis and produce plots in the Research Environment than in a backtest.
Before backtesting or live trading with machine learning models, you may find it beneficial to train them in the
Research Environment, save them in the Object Store, and then load them from the Object Store into the
backtesting and live trading environment.
In the Research Environment, you can also use the QuantConnect API to import your backtest results for further
analysis.
Example
The following snippet demonstrates how to use the Research Environment to plot the price and Bollinger Bands of
the S&P 500 index ETF, SPY:
PY
# Create a QuantBook
qb = QuantBook()
# Add an asset.
symbol = qb.add_equity("SPY").symbol
Open Notebooks
The process to open notebooks depends on whether you use the Algorithm Lab, Local Platform, or the CLI.
Notebooks are a collection of cells where you can write code snippets or Markdown. To execute a cell, press
Shift+Enter.
The following describes some helpful keyboard shortcuts to speed up your research:
Stop Nodes
The process to stop Research Environment nodes depends on whether you use the Algorithm Lab, Local Platform, or the
CLI.
Add Notebooks
The process to add notebook files depends on whether you use the Algorithm Lab, Local Platform, or the CLI.
Rename Notebooks
The process to rename notebook files depends on whether you use the Algorithm Lab, Local Platform, or the CLI.
Delete Notebooks
The process to delete notebooks depends on whether you use the Algorithm Lab, Local Platform, or the CLI.
Learn Jupyter
Key Concepts
Research Engine
Introduction
The backtesting environment is an event-based simulation of the market. Backtests aim to provide an accurate
representation of whether a strategy would have performed well in the past, but they are generally slow and aren't
the most efficient way to test the foundational ideas behind strategies. You should only use backtests to verify an idea.
The Research Environment lets you build a strategy by starting with a central hypothesis about the market. For
example, you might hypothesize that an increase in sunshine hours will increase the production of oranges, which
will lead to an increase in the supply of oranges and a decrease in the price of Orange Juice Futures. You can
attempt to confirm this working hypothesis by analyzing weather data, production of oranges data, and the price
of Orange Juice futures. If the hypothesis is confirmed with a degree of statistical significance, you can be
confident in the hypothesis and translate it into an algorithm you can backtest.
Jupyter Notebooks
Jupyter notebooks support interactive data science and scientific computing across various programming
languages. We carry on that philosophy by providing an environment for you to perform exploratory research and
brainstorm new ideas for algorithms. A Jupyter notebook installed in QuantConnect allows you to directly explore
the massive amounts of data available in the Dataset Market and analyze it with Python or C# commands.
We call this exploratory notebook environment the Research Environment.
Open Notebooks
To open a notebook, open one of the .ipynb files in your cloud projects or see Running Local Research Environment.
Execute Code
The notebook allows you to run code in a safe and disposable environment. It's composed of independent cells
where you can write, edit, and execute code. The notebooks support Python, C#, and Markdown code.
Keyboard Shortcuts
If you use the Research Environment in QuantConnect Cloud, to terminate a research session, stop the research
node in the Resources panel . If you use the local Research Environment, see Managing Kernels and Terminals in
the JupyterLab documentation.
To analyze data in a research notebook, create an instance of the QuantBook class. QuantBook is a wrapper on
QCAlgorithm , which means QuantBook allows you to access all the methods available to QCAlgorithm and some
additional methods. The following table describes the helper methods of the QuantBook class that aren't available
in the QCAlgorithm class:
QuantBook gives you access to the vast amounts of data in the Dataset Market. Similar to backtesting, you can
access that data using history calls. You can also create indicators, consolidate data, and access charting
features. However, keep in mind that event-driven features available in backtesting, like universe selection and
OnData events, are not available in research. After you analyze a dataset in the Research Environment, you can
easily transfer the logic to the backtesting environment. For example, consider the following code in the Research
Environment:
PY
# Initialize QuantBook
qb = QuantBook()
One of the drawbacks of the Research Environment is that you may need to rewrite code you've already written in
a file in the backtesting environment. Instead of rewriting that code, you can import the methods from the
backtesting environment into the Research Environment to reduce development time. To import a method from one
of your project files into your research notebook, use an import statement.
If you adjust the file that you import, restart the Research Environment session to import the latest version of the
file. To restart the Research Environment, stop the research node and then open the notebook again.
Import C# Libraries
Initialization
Introduction
Before you request and manipulate historical data in the Research Environment, you should set the notebook dates.
Set Dates
The start date of your QuantBook determines the latest date of data you get from history requests . By default, the
start date is the current day. To change the start date, call the set_start_date method.
PY
qb.set_start_date(2022, 1, 1)
The end date of your QuantBook should be greater than the start date. By default, the end date is the current day. To change the end date, call the set_end_date method.
PY
qb.set_end_date(2022, 8, 15)
Add Data
You can subscribe to asset, fundamental, alternative, and custom data. The Dataset Market provides 400TB of
data that you can easily import into your notebooks.
Asset Data
To subscribe to asset data, call one of the asset subscription methods like add_equity or add_forex . Each asset
class has its own method to create subscriptions. For more information about how to create subscriptions for each
asset class, see the Create Subscriptions section of an asset class in the Datasets chapter.
Alternative Data
To add alternative datasets to your notebooks, call the add_data method. For a full example, see Alternative Data .
Custom Data
To add custom data to your notebooks, call the add_data method. For more information about custom data, see
Custom Data .
Limitations
There is no official limit to how much data you can add to your notebooks, but there are practical resource
limitations. Each security subscription requires about 5MB of RAM, so larger machines let you request more data.
For more information about our cloud nodes, see Research Nodes .
Set Time Zone
The notebook time zone determines which time zone the datetime objects are in when you make a history request
based on a defined period of time. When your history request returns a DataFrame , the timestamps in the
DataFrame are based on the data time zone . When your history request returns a TradeBars , QuoteBars , Ticks ,
or Slice object, the time properties of these objects are based on the notebook time zone, but the end_time
properties of the individual TradeBar , QuoteBar , and Tick objects are based on the data time zone.
The default time zone is Eastern Time (ET), which is UTC-4 in summer and UTC-5 in winter. To set a different time
zone, call the set_time_zone method. This method accepts either a string following the IANA Time Zone database
convention or a NodaTime.DateTimeZone object. If you pass a string, the method converts it to a
NodaTime.DateTimeZone object. The TimeZones class provides the following helper attributes to create
NodaTime.DateTimeZone objects:
PY
qb.set_time_zone("Europe/London")
qb.set_time_zone(TimeZones.CHICAGO)
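To illustrate what these IANA identifiers resolve to, here is a small, self-contained sketch using Python's standard zoneinfo module (independent of QuantConnect, shown only to clarify the offsets stated above):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Eastern Time is UTC-5 in winter and UTC-4 in summer (daylight saving time)
winter = datetime(2022, 1, 15, 12, 0, tzinfo=ZoneInfo("America/New_York"))
summer = datetime(2022, 7, 15, 12, 0, tzinfo=ZoneInfo("America/New_York"))
print(winter.utcoffset())  # UTC-5
print(summer.utcoffset())  # UTC-4

# Any IANA name, such as "Europe/London", resolves the same way
london = datetime(2022, 1, 15, 12, 0, tzinfo=ZoneInfo("Europe/London"))
print(london.utcoffset())  # UTC+0 (GMT) in winter
```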
Datasets
Key Concepts
Introduction
You can access most of the data from the Dataset Market in the Research Environment. The data includes Equity,
Crypto, Forex, and derivative data going back as far as 1998. Similar to backtesting, to access the data, create a
QuantBook instance. The historical data API has many different options to give you the greatest flexibility in how
to apply it to your algorithm.
You can request historical data based on a trailing number of bars, a trailing period of time, or a defined period of
time. If you request data in a defined period of time, the datetime objects you provide are based in the notebook
time zone .
Return Formats
Each asset class supports slightly different data formats. When you make a history request, consider what data
type it returns. Depending on how you request the data, history requests return a specific data type. For example, if you
don't provide Symbol objects, you get Slice objects that contain all of the assets you created subscriptions for in
the notebook.
The most popular return type is a DataFrame. If you request a DataFrame, LEAN unpacks the data from Slice
objects to populate the DataFrame. If you intend to use the data in the DataFrame to create TradeBar or QuoteBar
objects, request that the history request returns the data type you need. Otherwise, LEAN wastes computational resources populating the DataFrame.
Time Index
When your history request returns a DataFrame , the timestamps in the DataFrame are based on the data time zone
. When your history request returns a TradeBars , QuoteBars , Ticks , or Slice object, the time properties of these
objects are based on the notebook time zone, but the end_time properties of the individual TradeBar , QuoteBar ,
and Tick objects are based on the data time zone . The end_time is the end of the sampling period and when the
data is actually available. For daily US Equity data, this results in data points appearing on Saturday and skipping
Monday.
Request Data
The simplest form of history request is for a known set of Symbol objects. History requests return slightly different
data depending on the overload you call. The data that returns is in ascending order from oldest to newest.
To request history for a single asset, pass the asset Symbol to the history method. The return type of the method
call depends on the history request [Type] . The following table describes the return type of each request [Type] :
Request [Type]   Return type
(no argument)    DataFrame
TradeBar         List[TradeBar]
QuoteBar         List[QuoteBar]
Tick             List[Tick]
Each row of the DataFrame represents the prices at a point in time. Each column of the DataFrame is a property of
that price data (for example, open, high, low, and close (OHLC)). If you request a DataFrame object and pass
TradeBar as the first argument, the DataFrame that returns only contains the OHLC and volume columns. If you
request a DataFrame object and pass QuoteBar as the first argument, the DataFrame that returns contains the
OHLC of the bid and ask and it contains OHLC columns, which are the respective means of the bid and ask OHLC
values. If you request a DataFrame and don't pass TradeBar or QuoteBar as the first arugment, the DataFrame that
returns contains columns for all of the data that's available for the given resolution.
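As a rough illustration of this DataFrame shape, the following self-contained sketch builds a stand-in from synthetic data (not an actual history call): the index is (symbol, end_time) and the columns are the price properties.

```python
import pandas as pd

# Synthetic stand-in for a history DataFrame: (symbol, time) multi-index, OHLC columns
index = pd.MultiIndex.from_product(
    [["SPY", "TLT"], pd.date_range("2022-01-03", periods=3, freq="D")],
    names=["symbol", "time"])
history = pd.DataFrame({
    "open":  [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "high":  [1.5, 2.5, 3.5, 4.5, 5.5, 6.5],
    "low":   [0.5, 1.5, 2.5, 3.5, 4.5, 5.5],
    "close": [1.2, 2.2, 3.2, 4.2, 5.2, 6.2],
}, index=index)

# Each row is one sample for one symbol; select a single symbol's rows with .loc
spy_rows = history.loc["SPY"]
print(spy_rows["close"].tolist())  # [1.2, 2.2, 3.2]
```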
PY
# Important Note: Period history requests are relative to "now" notebook time.
To request history for multiple symbols at a time, pass an array of Symbol objects to the same API methods shown
in the preceding section. The return type of the method call depends on the history request [Type]. The following
table describes the return type of each request [Type]:
Request [Type]   Return type
(no argument)    DataFrame
TradeBar         List[TradeBars]
QuoteBar         List[QuoteBars]
Tick             List[Ticks]
PY
# EXAMPLE 7: Requesting By Bar Count for Multiple Symbols: 2 bars at the security resolution:
vix = qb.add_data(CBOE, "VIX", Resolution.DAILY).symbol
v3m = qb.add_data(CBOE, "VIX3M", Resolution.DAILY).symbol
cboe_data = qb.history[CBOE]([vix, v3m], 2)
PY
# EXAMPLE 8: Requesting By Bar Count for Multiple Symbols: 5 bars with a specific resolution:
trade_bars_list = qb.history[TradeBar]([ibm, aapl], 5, Resolution.DAILY)
quote_bars_list = qb.history[QuoteBar]([ibm, aapl], 5, Resolution.MINUTE)
PY
# EXAMPLE 10: Requesting By Defined Period: 3 days of data at the security resolution:
trade_bars = qb.history[TradeBar]([btc_symbol], start_time, end_time)
quote_bars = qb.history[QuoteBar]([btc_symbol], start_time, end_time)
ticks = qb.history[Tick]([eth_symbol], start_time, end_time)
trade_bars_df = qb.history(TradeBar, btc_symbol, start_time, end_time)
quote_bars_df = qb.history(QuoteBar, btc_symbol, start_time, end_time)
ticks_df = qb.history(Tick, eth_symbol, start_time, end_time)
df = qb.history([btc_symbol], start_time, end_time) # Includes trade and quote data
If you request data for multiple securities and you use the TICK request type, each Ticks object in the list of results
only contains the last tick of each security for that particular timeslice .
You can request history for all the securities you have created subscriptions for in your notebook session. The
parameters are very similar to other history method calls, but the return type is an array of Slice objects. The Slice
object holds all of the results in a sorted enumerable collection that you can iterate over with a loop.
PY
# EXAMPLE 11: Requesting 5 bars for all securities at their respective resolution:
# Create subscriptions
qb.add_equity("IBM", Resolution.DAILY)
qb.add_equity("AAPL", Resolution.DAILY)
# timedelta history requests are relative to "now" in notebook time. If you request this data at 16:05,
# it returns an empty array because the market is closed.
Additional Options
The history method accepts the following additional argument:
Argument                Data Type        Description                                                     Default Value
extended_market_hours   bool/NoneType    True to include extended market hours data; otherwise, False.   None
PY
future = qb.add_future(Futures.Currencies.BTC)
history = qb.history(
    tickers=[future.symbol],
    start=qb.time - timedelta(days=15),
    end=qb.time,
    resolution=Resolution.MINUTE,
    fill_forward=False,
    extended_market_hours=False,
    data_mapping_mode=DataMappingMode.OPEN_INTEREST,
    data_normalization_mode=DataNormalizationMode.RAW,
    contract_depth_offset=0)
Resolutions
Resolution is the duration of time that's used to sample a data source. The Resolution enumeration has the
following members:
The default resolution for market data is MINUTE. To set the resolution for a security, pass a resolution argument when you create the security subscription.
PY
qb.add_equity("SPY", Resolution.DAILY)
When you request historical data, the history method uses the resolution of your security subscription. To get
historical data with a different resolution, pass a resolution argument to the history method.
Markets
The datasets integrated into the Dataset Market cover many markets. The Market enumeration has the following
members:
LEAN can usually determine the correct market based on the ticker you provide when you create the security
subscription. To manually set the market for a security, pass a market argument when you create the security
subscription.
PY
qb.add_equity("SPY", market=Market.USA)
Fill Forward
Fill forward means that if there is no data point for the current sample, LEAN uses the previous data point. Fill forward
is the default data setting. To disable fill forward for a security, set the fill_forward argument to False when you
create the security subscription.
PY
qb.add_equity("SPY", fill_forward=False)
When you request historical data, the history method uses the fill forward setting of your security subscription.
To get historical data with a different fill forward setting, pass a fill_forward argument to the history method.
Extended Market Hours
By default, your security subscriptions only cover regular trading hours. To subscribe to pre-market and post-market
trading hours for a specific asset, enable the extended_market_hours argument when you create the security
subscription.
PY
qb.add_equity("SPY", extended_market_hours=True)
You only receive extended market hours data if you create the subscription with minute, second, or tick resolution.
If you create the subscription with daily or hourly resolution, the bars only reflect the regular trading hours.
When you request historical data, the history method uses the extended market hours setting of your security
subscription. To get historical data with a different extended market hours setting, pass an
extended_market_hours argument to the history method.
Look-Ahead Bias
In the Research Environment, all the historical data is directly available. In backtesting, you can only access the
data that is at or before the algorithm time. If you make a history request for the previous 10 days of data in the
Research Environment, you get the previous 10 days of data from today's date. If you request the same data in a
backtest, you get the previous 10 days of data from the algorithm time.
Consolidate Data
History requests usually return data in one of the standard resolutions . To analyze data on custom time frames like
5-minute bars or 4-hour bars, you need to aggregate it. Consider an example where you make a history call for
minute resolution data and want to create 5-minute resolution data.
PY
qb = QuantBook()
symbol = qb.add_equity("SPY").symbol
start_date = datetime(2018, 4, 1)
end_date = datetime(2018, 7, 15)
history = qb.history(symbol, start_date, end_date, Resolution.MINUTE)
Consolidators
PY
# Create a 5-minute consolidator and a RollingWindow to hold its output
consolidator = TradeBarConsolidator(timedelta(minutes=5))
window = RollingWindow[TradeBar](100)
# Attach a consolidation handler method that saves the consolidated bars in the RollingWindow
def on_data_consolidated(sender, bar):
    window.add(bar)
consolidator.data_consolidated += on_data_consolidated
# Iterate the historical market data and feed each bar into the consolidator
for bar in history.itertuples():
    tradebar = TradeBar(bar.Index[1], bar.Index[0], bar.open, bar.high, bar.low, bar.close, bar.volume)
    consolidator.update(tradebar)
Resample Method
The resample method converts the frequency of a time series DataFrame into a custom frequency. The method
only works on DataFrame objects that have a datetime index. The history method returns a DataFrame with a
multi-index. The first index is a Symbol index for each security and the second index is a time index for the
timestamps of each row of data. To make the DataFrame compatible with the resample method, drop the Symbol
index level so that only the time index remains.
The resample method returns a Resampler object, which needs to be downsampled using one of the pandas
downsampling computations . For example, you can use the Resampler.ohlc downsampling method to aggregate
price data.
When you resample a DataFrame with the ohlc downsampling method, it creates an OHLC row for each column in
the DataFrame. To just calculate the OHLC of the close column, select the close column before you resample the
DataFrame. A resample offset of 5T corresponds to a 5-minute resample. Other resampling offsets include 2D = 2
days, 10H = 10 hours, and 5S = 5 seconds.
PY
close_prices = history["close"].droplevel(0)  # drop the Symbol index level; resample needs a datetime index
offset = "5T"
close_5min_ohlc = close_prices.resample(offset).ohlc()
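The same resample-and-ohlc pattern can be sketched end to end with synthetic minute data, no QuantBook required (note that recent pandas versions prefer the "5min" alias over "5T"):

```python
import numpy as np
import pandas as pd

# Synthetic 1-minute close prices with a datetime index
times = pd.date_range("2022-01-03 09:31", periods=15, freq="min")
close_prices = pd.Series(np.arange(15, dtype=float) + 100.0, index=times)

# Aggregate the 1-minute closes into 5-minute OHLC bars
close_5min_ohlc = close_prices.resample("5min").ohlc()
print(close_5min_ohlc)
# Each 5-minute bar's open is the first close in the window and its close is the last
```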
Common Errors
If the history request returns an empty DataFrame and you try to slice it, it throws an exception. To avoid issues, check if the DataFrame contains data before you index it.
PY
def get_safe_history_closes(symbols):
    if not symbols:
        print('No symbols')
        return False, None
    df = qb.history(symbols, 100, Resolution.DAILY)
    if df.empty:
        print(f'Empty history for {symbols}')
        return False, None
    return True, df.close.unstack(0)
If you run the Research Environment on your local machine and history requests return no data, check if your data
directory contains the data you request. To download datasets, see Download .
Datasets > US Equity
Introduction
This page explains how to request, manipulate, and visualize historical US Equity data.
Create Subscriptions
1. Create a QuantBook .
PY
qb = QuantBook()
2. Call the add_equity method with a ticker and then save a reference to the US Equity Symbol .
PY
spy = qb.add_equity("SPY").symbol
tlt = qb.add_equity("TLT").symbol
To view the supported assets in the US Equities dataset, see the Data Explorer .
You need a subscription before you can request historical data for a security. On the time dimension, you can
request an amount of historical data based on a trailing number of bars, a trailing period of time, or a defined
period of time. On the security dimension, you can request historical data for a single US Equity, a subset of the US
Equities you created subscriptions for in your notebook, or all of the US Equities in your notebook.
To get historical data for a number of trailing bars, call the history method with the Symbol object(s) and an
integer.
PY
# Slice objects
all_history_slice = qb.history(10)
# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](spy, 10)
subset_history_trade_bars = qb.history[TradeBar]([spy, tlt], 10)
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), 10)
# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](spy, 10)
subset_history_quote_bars = qb.history[QuoteBar]([spy, tlt], 10)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), 10)
The preceding calls return the most recent bars, excluding periods of time when the exchange was closed.
To get historical data for a trailing period of time, call the history method with the Symbol object(s) and a
timedelta .
PY
# Slice objects
all_history_slice = qb.history(timedelta(days=3))
# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](spy, timedelta(days=3))
subset_history_trade_bars = qb.history[TradeBar]([spy, tlt], timedelta(days=3))
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), timedelta(days=3))
# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](spy, timedelta(days=3), Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([spy, tlt], timedelta(days=3), Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), timedelta(days=3),
Resolution.MINUTE)
# Tick objects
single_history_ticks = qb.history[Tick](spy, timedelta(days=3), Resolution.TICK)
subset_history_ticks = qb.history[Tick]([spy, tlt], timedelta(days=3), Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), timedelta(days=3), Resolution.TICK)
The preceding calls return the most recent bars or ticks, excluding periods of time when the exchange was closed.
To get historical data for a specific period of time, call the history method with the Symbol object(s), a start
datetime , and an end datetime . The start and end times you provide are based in the notebook time zone .
PY
start_time = datetime(2021, 1, 1)
end_time = datetime(2021, 2, 1)
# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](spy, start_time, end_time)
subset_history_trade_bars = qb.history[TradeBar]([spy, tlt], start_time, end_time)
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), start_time, end_time)
# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](spy, start_time, end_time, Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([spy, tlt], start_time, end_time, Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), start_time, end_time,
Resolution.MINUTE)
# Tick objects
single_history_ticks = qb.history[Tick](spy, start_time, end_time, Resolution.TICK)
subset_history_ticks = qb.history[Tick]([spy, tlt], start_time, end_time, Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), start_time, end_time, Resolution.TICK)
The preceding calls return the bars or ticks that have a timestamp within the defined period of time.
Resolutions
The following table shows the available resolutions and data formats for Equity subscriptions:
TICK
SECOND
MINUTE
HOUR
DAILY
Markets
LEAN groups all of the US Equity exchanges under Market.USA .
Data Normalization
The data normalization mode defines how historical data is adjusted for corporate actions. By default, LEAN
adjusts US Equity data for splits and dividends to produce a smooth price curve, but several other data
normalization modes are available.
If you use ADJUSTED , SPLIT_ADJUSTED , or TOTAL_RETURN , we use the entire split and dividend history to adjust
historical prices. This process ensures you get the same adjusted prices, regardless of the QuantBook time. If you
use SCALED_RAW, we use the split and dividend history before the QuantBook's end date to adjust historical prices.
To set the data normalization mode for a security, pass a data_normalization_mode argument to the add_equity
method.
When you request historical data, the history method uses the data normalization of your security subscription.
To get historical data with a different data normalization, pass a data_normalization_mode argument to the
history method.
Wrangle Data
You need some historical data to perform wrangling operations. The process to manipulate the historical data
depends on its data type. To display pandas objects, run a cell in a notebook with the pandas object as the last line.
DataFrame Objects
If the history method returns a DataFrame , the first level of the DataFrame index is the encoded Equity Symbol
and the second level is the end_time of the data sample. The columns of the DataFrame are the data properties.
To select the historical data of a single Equity, index the loc property of the DataFrame with the Equity Symbol .
PY
all_history_df.loc[spy] # or all_history_df.loc['SPY']
PY
all_history_df.loc[spy]['close']
If you request historical data for multiple Equities, you can transform the DataFrame so that it's a time series of
close values for all of the Equities. To transform the DataFrame, select the column you want to display for each
Equity and then call the unstack method.
PY
all_history_df['close'].unstack(level=0)
The DataFrame is transformed so that the column indices are the Symbol of each Equity and each row contains the
close value.
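A minimal, self-contained illustration of this unstack transformation, with synthetic data standing in for a real history DataFrame:

```python
import pandas as pd

# (symbol, time) multi-index with a close column, mimicking a history DataFrame
index = pd.MultiIndex.from_product(
    [["SPY", "TLT"], pd.date_range("2022-01-03", periods=2, freq="D")],
    names=["symbol", "time"])
all_history_df = pd.DataFrame({"close": [460.0, 461.0, 148.0, 147.5]}, index=index)

# Move the symbol level into the columns: one close-price column per security
close_series = all_history_df["close"].unstack(level=0)
print(close_series)
# close_series["SPY"].tolist() -> [460.0, 461.0]
# close_series["TLT"].tolist() -> [148.0, 147.5]
```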
If you prefer to display the ticker of each Symbol instead of the string representation of the SecurityIdentifier,
follow these steps:
1. Create a dictionary where the keys are the string representations of each SecurityIdentifier and the
values are the tickers.
2. Get the values of the symbol level of the DataFrame index and create a list of tickers.
3. Set the values of the symbol level of the DataFrame index to the list of tickers.
PY
all_history_df.loc[spy.value] # or all_history_df.loc["SPY"]
After the index renaming, the unstacked DataFrame has the following format:
Slice Objects
If the history method returns Slice objects, iterate through the Slice objects to get each one. The Slice objects
may not have data for all of your Equity subscriptions. To avoid issues, check if the Slice contains data for your
Equity before you index it with the Equity Symbol .
You can also iterate through each TradeBar and QuoteBar in the Slice .
TradeBar Objects
If the history method returns TradeBar objects, iterate through the TradeBar objects to get each one.
If the history method returns TradeBars , iterate through the TradeBars to get the TradeBar of each Equity. The
TradeBars may not have data for all of your Equity subscriptions. To avoid issues, check if the TradeBars object
contains data for your security before you index it with the Equity Symbol .
QuoteBar Objects
If the history method returns QuoteBar objects, iterate through the QuoteBar objects to get each one.
If the history method returns QuoteBars , iterate through the QuoteBars to get the QuoteBar of each Equity. The
QuoteBars may not have data for all of your Equity subscriptions. To avoid issues, check if the QuoteBars object
contains data for your security before you index it with the Equity Symbol .
Tick Objects
If the history method returns Tick objects, iterate through the Tick objects to get each one.
If the history method returns Ticks, iterate through the Ticks to get the Tick of each Equity. The Ticks may not
have data for all of your Equity subscriptions. To avoid issues, check if the Ticks object contains data for your
security before you index it with the Equity Symbol.
The Ticks objects only contain the last tick of each security for that particular timeslice.
Plot Data
You need some historical Equity data to produce plots. You can use many of the supported plotting libraries to
visualize data in various formats. For example, you can plot candlestick and line charts.
Candlestick Chart
PY
import plotly.graph_objects as go
PY
candlestick = go.Candlestick(x=history.index,
                             open=history['open'],
                             high=history['high'],
                             low=history['low'],
                             close=history['close'])
4. Create a Layout .
PY
fig.show()
Candlestick charts display the open, high, low, and close prices of the security.
Line Chart
PY
history = qb.history([spy, tlt], datetime(2021, 11, 23), datetime(2021, 12, 8), Resolution.DAILY)
PY
volume = history['volume'].unstack(level=0)
PY
plt.show()
Line charts display the value of the property you selected in a time series.
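For reference, here is a self-contained matplotlib sketch of this kind of line chart, using synthetic volume data in place of a real history call (the Agg backend is set so it runs headless; in a notebook, plt.show() renders the figure inline):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs outside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Synthetic daily volume for two tickers, mimicking history['volume'].unstack(level=0)
times = pd.date_range("2021-11-23", periods=10, freq="D")
volume = pd.DataFrame({"SPY": range(10, 20), "TLT": range(5, 15)}, index=times)

# One line per column; pandas labels the legend with the column names
ax = volume.plot(title="Volume", ylabel="Volume", figsize=(8, 4))
ax.figure.savefig("volume.png")  # use plt.show() in a notebook instead
```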
Common Errors
Some factor files have INF split values, which indicate that the stock has so many splits that prices can't be
calculated with correct numerical precision. To allow history requests with these symbols, we need to move the
starting date forward when reading the data or use raw data normalization . If there are numerical precision errors
in the factor files for a security in your history request, LEAN throws the following error:
"Warning: when performing history requests, the start date will be adjusted if there are numerical precision errors
in the factor files."
Datasets > Equity Fundamental Data
Introduction
This page explains how to request, manipulate, and visualize historical Equity Fundamental data.
Create Subscriptions
1. Create a QuantBook .
PY
qb = QuantBook()
2. Call the add_equity method with a ticker and then save a reference to the Equity Symbol .
PY
symbols = [
qb.add_equity(ticker, Resolution.DAILY).symbol
for ticker in [
"AAL", # American Airlines Group, Inc.
"ALGT", # Allegiant Travel Company
"ALK", # Alaska Air Group, Inc.
"DAL", # Delta Air Lines, Inc.
"LUV", # Southwest Airlines Company
"SKYW", # SkyWest, Inc.
"UAL" # United Air Lines
]
]
You need a subscription before you can request historical fundamental data for US Equities. On the time
dimension, you can request an amount of historical data based on a trailing number of bars, a trailing period of
time, or a defined period of time. On the security dimension, you can request historical data for a single US Equity,
a set of US Equities, or all of the US Equities in the US Fundamental dataset.
When you call the history method, you can request Fundamental or Fundamentals objects. If you use
Fundamental , the method returns all fundamental properties for the Symbol object(s) you provide. If you use
Fundamentals , the method returns all fundamental properties for all the US Equities in the US Fundamental dataset
that were trading during the time period you request, including companies that no longer trade.
To get historical data for a number of trailing trading days, call the history method with an integer. If you didn't use Resolution.DAILY when you subscribed to the US Equities, pass it as the last argument to the history method.
PY
# Fundamental objects
single_fundamental_history = qb.history[Fundamental](symbols[0], 10)
set_fundamental_history = qb.history[Fundamental](symbols, 10)
all_fundamental_history = qb.history[Fundamental](qb.securities.keys(), 10)
# Fundamentals objects
all_fundamentals_history = qb.history[Fundamentals](qb.securities.keys(), 10)
The preceding calls return fundamental data for the 10 most recent trading days.
To get historical data for a trailing period of time, call the history method with a timedelta object.
PY
# Fundamental objects
single_fundamental_history = qb.history[Fundamental](symbols[0], timedelta(days=10))
set_fundamental_history = qb.history[Fundamental](symbols, timedelta(days=10))
all_fundamental_history = qb.history[Fundamental](qb.securities.keys(), timedelta(days=10))
# Fundamentals objects
all_fundamentals_history = qb.history[Fundamentals](timedelta(days=10))
The preceding calls return fundamental data for the most recent trading days.
To get the historical data of all the fundamental properties over a specific period of time, call the history method
with a start datetime and an end datetime . To view the possible fundamental properties, see the Fundamental
attributes in Data Point Attributes . The start and end times you provide to these methods are based in the
notebook time zone .
PY
start_date = datetime(2021, 1, 1)
end_date = datetime(2021, 2, 1)
# Fundamental objects
single_fundamental_history = qb.history[Fundamental](symbols[0], start_date, end_date)
set_fundamental_history = qb.history[Fundamental](symbols, start_date, end_date)
all_fundamental_history = qb.history[Fundamental](qb.securities.keys(), start_date, end_date)
# Fundamentals objects
all_fundamentals_history = qb.history[Fundamentals](qb.securities.keys(), start_date, end_date)
The preceding method returns the fundamental property values that are timestamped within the defined period of
time.
Wrangle Data
You need some historical data to perform wrangling operations. To display pandas objects, run a cell in a notebook
with the pandas object as the last line. To display other data formats, call the print method.
DataFrame Objects
The history method returns a multi-index DataFrame where the first level is the Equity Symbol and the second
level is the end_time of the trading day. The columns of the DataFrame are the names of the fundamental
properties. The following image shows the first 4 columns of an example DataFrame:
To access an attribute from one of the cells in the DataFrame, select the value in the cell and then access the
object's property.
PY
single_fundamental_df.iloc[0].companyprofile.share_class_level_shares_outstanding
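A minimal stand-in shows the cell-then-attribute access pattern, using a SimpleNamespace in place of LEAN's fundamental objects (the names and values are invented):

```python
import pandas as pd
from types import SimpleNamespace

# Hypothetical stand-in for a LEAN fundamental object stored in a cell.
profile = SimpleNamespace(share_class_level_shares_outstanding=1_000_000)

index = pd.MultiIndex.from_tuples(
    [("AAL", pd.Timestamp("2021-01-04"))], names=["symbol", "end_time"]
)
df = pd.DataFrame({"companyprofile": [profile]}, index=index)

# Select the row, then the cell, then read the object's attribute.
shares = df.iloc[0].companyprofile.share_class_level_shares_outstanding
print(shares)
```

The same pattern applies to any fundamental column whose cells hold objects rather than scalars.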
Fundamental Objects
If you pass a Symbol to the history[Fundamental] method, run the following code to get the fundamental properties over time:
If you pass a list of Symbol objects to the history[Fundamental] method, run the following code to get the
fundamental properties over time:
PY
Fundamentals Objects
If you request all fundamental properties for all US Equities with the history[Fundamentals] method, run the following code to get the fundamental properties over time:
PY
Plot Data
You need some historical Equity fundamental data to produce plots. You can use many of the supported plotting
libraries to visualize data in various formats. For example, you can plot line charts.
PY
data = {}
for fundamental_dict in history: # Iterate trading days
for symbol, fundamental in fundamental_dict.items(): # Iterate Symbols
datum = data.get(symbol, dict())
datum['index'] = datum.get('index', [])
datum['index'].append(fundamental.end_time)
datum['pe_ratio'] = datum.get('pe_ratio', [])
datum['pe_ratio'].append(fundamental.valuation_ratios.pe_ratio)
data[symbol] = datum
df = pd.DataFrame()
for symbol, datum in data.items():
df_symbol = pd.DataFrame({symbol: pd.Series(datum['pe_ratio'], index=datum['index'])})
df = pd.concat([df, df_symbol], axis=1)
PY
PY
plt.show()
Line charts display the value of the property you selected in a time series.
Datasets > Equity Options
Datasets
Equity Options
Equity Options
Key Concepts
Introduction
Equity Options are financial derivatives that give the holder the right (but not the obligation) to buy or sell the
underlying Equity, such as Apple, at the stated exercise price. This page explains the basics of Equity Option data
in the Research Environment. To get some data, see Universes or Individual Contracts . For more information about
the specific datasets we use, see the US Equity Options and US Equity Option Universe dataset listings.
Resolutions
The following table shows the available resolutions and data formats for Equity Option contract subscriptions:
TICK
SECOND
MINUTE
HOUR
DAILY
Markets
LEAN groups all of the US Equity Option exchanges under Market.USA , so you don't need to pass a Market to the add_option method.
Data Normalization
The data normalization mode doesn't affect data from history requests. By default, LEAN doesn't adjust Equity
Options data for splits and dividends of their underlying. If you change the data normalization mode, it won't
affect the results.
Equity Options
Universes
Introduction
This page explains how to request historical data for a universe of Equity Option contracts.
Create Subscriptions
1. Create a QuantBook .
PY
qb = QuantBook()
2. Subscribe to the underlying Equity with raw data normalization and save a reference to the Equity Symbol .
PY
equity_symbol = qb.add_equity("SPY", data_normalization_mode=DataNormalizationMode.RAW).symbol
To view the supported underlying assets in the US Equity Options dataset, see the Data Explorer .
3. Call the add_option method with the underlying Equity Symbol .
PY
option = qb.add_option(equity_symbol)
Price History
The contract filter determines which Equity Option contracts are in your universe each trading day. To set the filter, call the set_filter method.
PY
# Set the contract filter to select contracts that have the strike price
# within 1 strike level and expire within 90 days.
option.set_filter(-1, 1, 0, 90)
To get the prices and volumes for all of the Equity Option contracts that pass your filter during a specific period of
time, call the option_history method with the underlying Equity Symbol object, a start datetime , and an end
datetime .
PY
option_history = qb.option_history(
equity_symbol, datetime(2024, 1, 1), datetime(2024, 1, 5), Resolution.MINUTE,
fill_forward=False, extended_market_hours=False
)
To convert the OptionHistory object to a DataFrame that contains the trade and quote information of each contract, use the data_frame property.
PY
option_history.data_frame
To get the expiration dates of all the contracts in an OptionHistory object, call the get_expiry_dates method.
PY
option_history.get_expiry_dates()
To get the strike prices of all the contracts in an OptionHistory object, call the get_strikes method.
PY
option_history.get_strikes()
To get daily data on all the tradable contracts for a given date, call the history method with the canonical Option
Symbol, a start date, and an end date. This method returns the entire Option chain for each trading day, not the
subset of contracts that pass your universe filter. The daily Option chains contain the prices, volume, open
interest, implied volatility, and Greeks of each contract.
PY
# DataFrame format
history_df = qb.history(option.symbol, datetime(2024, 1, 1), datetime(2024, 1, 5), flatten=True)
# OptionUniverse objects
history = qb.history[OptionUniverse](option.symbol, datetime(2024, 1, 1), datetime(2024, 1, 5))
for chain in history:
end_time = chain.end_time
filtered_chain = [contract for contract in chain if contract.greeks.delta > 0.3]
for contract in filtered_chain:
price = contract.price
iv = contract.implied_volatility
The method represents each contract with an OptionUniverse object, which has the following properties:
Datasets > Equity Options > Individual Contracts
Equity Options
Individual Contracts
Introduction
This page explains how to request historical data for individual Equity Option contracts. The history requests on
this page only return the prices and open interest of the Option contracts, not their implied volatility or Greeks. For
information about history requests that return the daily implied volatility and Greeks, see Universes .
Create Subscriptions
1. Create a QuantBook .
PY
qb = QuantBook()
PY
To view the supported underlying assets in the US Equity Options dataset, see the Data Explorer .
3. Set the start date to a date in the past that you want to use as the analysis date.
PY
qb.set_start_date(2024, 1, 1)
The method that you call in the next step returns data on all the contracts that were tradable on this date.
PY
# Get the Option contracts that were tradable on January 1st, 2024.
chain = qb.option_chain(underlying_symbol, flatten=True)
This method returns an OptionChain object, which represents an entire chain of Option contracts for a single
underlying security. You can even format the chain data into a DataFrame where each row represents a single contract.
5. Sort and filter the data to select the specific contract(s) you want to analyze.
PY
# Select a contract.
expiry = chain.expiry.min()
contract_symbol = chain[
# Select call contracts with the closest expiry.
(chain.expiry == expiry) &
(chain.right == OptionRight.CALL) &
# Select contracts with a 0.3-0.7 delta.
(chain.delta > 0.3) &
(chain.delta < 0.7)
# Select the contract with the largest open interest.
].sort_values('openinterest').index[-1]
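The sort-and-filter step can be reproduced on a hypothetical flattened chain DataFrame (the tickers, deltas, and open interest values below are invented for illustration):

```python
import pandas as pd

# Hypothetical flattened chain: one row per contract, like the DataFrame
# that option_chain(..., flatten=True) returns.
chain = pd.DataFrame(
    {
        "expiry": pd.to_datetime(
            ["2024-01-19", "2024-01-19", "2024-01-19", "2024-02-16"]
        ),
        "right": ["call", "put", "call", "call"],  # stand-in for OptionRight
        "delta": [0.55, -0.45, 0.35, 0.40],
        "openinterest": [1200, 800, 3000, 5000],
    },
    index=["C450", "P450", "C455", "C455_FEB"],  # hypothetical contract ids
)

# Closest expiry, calls only, delta between 0.3 and 0.7,
# then take the largest open interest.
expiry = chain.expiry.min()
contract = chain[
    (chain.expiry == expiry)
    & (chain.right == "call")
    & (chain.delta > 0.3)
    & (chain.delta < 0.7)
].sort_values("openinterest").index[-1]
print(contract)  # C455: the passing call with the largest open interest
```

Sorting ascending and taking index[-1] is equivalent to picking the row with the maximum open interest among the filtered contracts.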
6. Call the add_option_contract method with an OptionContract Symbol and disable fill-forward.
PY
Disable fill-forward because there are only a few OpenInterest data points per day.
Trade History
TradeBar objects are price bars that consolidate individual trades from the exchanges. They contain the open,
high, low, close, and volume of trading activity over a period of time.
To get trade data, call the history or history[TradeBar] method with the contract Symbol object(s).
PY
# DataFrame format
history_df = qb.history(TradeBar, contract_symbol, timedelta(3))
display(history_df)
# TradeBar objects
history = qb.history[TradeBar](contract_symbol, timedelta(3))
for trade_bar in history:
print(trade_bar)
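The consolidation idea behind TradeBar objects can be sketched in plain Python (the ticks below are invented):

```python
# Hypothetical trade ticks: (price, quantity) pairs from an exchange feed.
ticks = [(450.10, 100), (450.25, 50), (449.90, 200), (450.05, 75)]

# Consolidate the ticks into a single OHLCV bar, the way a TradeBar
# summarizes all trades within its period.
prices = [price for price, _ in ticks]
bar = {
    "open": prices[0],          # first trade price in the period
    "high": max(prices),
    "low": min(prices),
    "close": prices[-1],        # last trade price in the period
    "volume": sum(quantity for _, quantity in ticks),
}
print(bar)
```

A real consolidator also tracks the bar's time boundaries, but the OHLCV aggregation is the core of it.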
Quote History
QuoteBar objects are bars that consolidate NBBO quotes from the exchanges. They contain the open, high, low,
and close prices of the bid and ask. The open , high , low , and close properties of the QuoteBar object are the
mean of the respective bid and ask prices. If the bid or ask portion of the QuoteBar has no data, the open , high ,
low , and close properties of the QuoteBar copy the values of either the bid or ask instead of taking their mean.
To get quote data, call the history or history[QuoteBar] method with the contract Symbol object(s).
PY
# DataFrame format
history_df = qb.history(QuoteBar, contract_symbol, timedelta(3))
display(history_df)
# QuoteBar objects
history = qb.history[QuoteBar](contract_symbol, timedelta(3))
for quote_bar in history:
print(quote_bar)
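The bid/ask averaging rule described above can be sketched with a hypothetical helper (it mirrors the documented behavior; it is not LEAN's implementation):

```python
def mid_or_side(bid, ask):
    """Mean of bid and ask; if one side has no data, copy the other side.

    Hypothetical helper mirroring how a QuoteBar's open/high/low/close
    properties behave when the bid or ask portion is missing.
    """
    if bid is None and ask is None:
        return None
    if bid is None:
        return ask          # no bid data: copy the ask
    if ask is None:
        return bid          # no ask data: copy the bid
    return (bid + ask) / 2  # both sides present: take the mean

print(mid_or_side(100.0, 100.2))  # mean of bid and ask, about 100.1
print(mid_or_side(None, 100.2))   # falls back to the ask
```

This is why a QuoteBar's close can equal the raw bid or ask when one side of the book produced no quotes in the period.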
Open interest is the number of outstanding contracts that haven't been settled. It provides a measure of investor
interest and market liquidity, so it's a popular metric to use for contract selection. Open interest is calculated once per day.
To get open interest data, call the history or history[OpenInterest] method with the contract Symbol object(s).
PY
# DataFrame format
history_df = qb.history(OpenInterest, contract_symbol, timedelta(3))
display(history_df)
# OpenInterest objects
history = qb.history[OpenInterest](contract_symbol, timedelta(3))
for open_interest in history:
print(open_interest)
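As a toy illustration of using open interest for contract selection, the following picks the most liquid contract from a hypothetical open interest snapshot (the contract names and values are invented):

```python
# Hypothetical open interest snapshot keyed by contract name.
open_interest = {
    "SPY 240119C00470000": 15_000,
    "SPY 240119C00475000": 32_000,
    "SPY 240119C00480000": 9_500,
}

# Select the most liquid contract: the one with the largest open interest.
most_liquid = max(open_interest, key=open_interest.get)
print(most_liquid)  # SPY 240119C00475000
```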
The Greeks are measures that describe the sensitivity of an Option's price to various factors like underlying price
changes (Delta), time decay (Theta), volatility (Vega), and interest rates (Rho), while Implied Volatility (IV)
represents the market's expectation of the underlying asset's volatility over the life of the Option.
PY
mirror_contract_symbol = Symbol.create_option(
option_contract.underlying.symbol, contract_symbol.id.market, option_contract.style,
OptionRight.CALL if option_contract.right == OptionRight.PUT else OptionRight.PUT,
option_contract.strike_price, option_contract.expiry
)
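The mirror contract pairs each call with the matching put at the same strike and expiry. For European options, put-call parity links their prices, which is why the pair carries consistent volatility information; a quick sketch with invented values:

```python
import math

# Put-call parity sketch for European options (all inputs hypothetical):
# C - P = S - K * e^(-r * T)
S = 100.0   # spot price of the underlying
K = 100.0   # strike price
r = 0.05    # risk-free rate
T = 0.5     # years to expiry

call_price = 6.89  # hypothetical observed call price

# The parity-implied put price for the mirror contract.
put_price = call_price - S + K * math.exp(-r * T)
print(put_price)  # roughly 4.42
```

Because parity ties the two prices together, an implied volatility computed from the call and from its mirror put should agree, which makes the pair useful for the IV and Greeks indicators.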
2. Set up the risk free interest rate , dividend yield , and Option pricing models.
In our research , we found the Forward Tree model to be the best pricing model for indicators.
PY
risk_free_rate_model = qb.risk_free_interest_rate_model
dividend_yield_model = DividendYieldProvider(underlying_symbol)
option_model = OptionPricingModelType.FORWARD_TREE
3. Define a method to return the IV & Greeks indicator values for each contract.
PY
return pd.DataFrame({
'iv_call': get_values(ImpliedVolatility, call, put),
'iv_put': get_values(ImpliedVolatility, put, call),
'delta_call': get_values(Delta, call, put),
'delta_put': get_values(Delta, put, call),
'gamma_call': get_values(Gamma, call, put),
'gamma_put': get_values(Gamma, put, call),
'rho_call': get_values(Rho, call, put),
'rho_put': get_values(Rho, put, call),
'vega_call': get_values(Vega, call, put),
'vega_put': get_values(Vega, put, call),
'theta_call': get_values(Theta, call, put),
'theta_put': get_values(Theta, put, call),
})
PY
The DataFrame can have NaN entries if there is no data for the contracts or the underlying asset at a moment in
time.
Examples
The following examples demonstrate some common practices for analyzing individual Equity Option contracts in
the Research Environment.
The following notebook plots the historical prices of an SPY Equity Option contract using Plotly :
PY
import plotly.graph_objects as go
The following notebook plots the historical open interest of a TSLA Equity Option contract using Matplotlib :
PY
Datasets
Crypto
Introduction
This page explains how to request, manipulate, and visualize historical Crypto data.
Create Subscriptions
1. Create a QuantBook .
PY
qb = QuantBook()
2. (Optional) Set the time zone to UTC.
PY
qb.set_time_zone(TimeZones.UTC)
3. Call the add_crypto method with a ticker and then save a reference to the Crypto Symbol .
PY
btcusd = qb.add_crypto("BTCUSD").symbol
ethusd = qb.add_crypto("ETHUSD").symbol
To view the supported assets in the Crypto datasets, see the Supported Assets section of the CoinAPI dataset
listings .
You need a subscription before you can request historical data for a security. On the time dimension, you can
request an amount of historical data based on a trailing number of bars, a trailing period of time, or a defined
period of time. On the security dimension, you can request historical data for a single Cryptocurrency, a subset of
the Cryptocurrencies you created subscriptions for in your notebook, or all of the Cryptocurrencies in your
notebook.
To get historical data for a number of trailing bars, call the history method with the Symbol object(s) and an
integer.
PY
# Slice objects
all_history_slice = qb.history(10)
# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](btcusd, 10)
subset_history_trade_bars = qb.history[TradeBar]([btcusd, ethusd], 10)
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), 10)
# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](btcusd, 10)
subset_history_quote_bars = qb.history[QuoteBar]([btcusd, ethusd], 10)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), 10)
To get historical data for a trailing period of time, call the history method with the Symbol object(s) and a
timedelta .
PY
# Slice objects
all_history_slice = qb.history(timedelta(days=3))
# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](btcusd, timedelta(days=3))
subset_history_trade_bars = qb.history[TradeBar]([btcusd, ethusd], timedelta(days=3))
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), timedelta(days=3))
# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](btcusd, timedelta(days=3), Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([btcusd, ethusd], timedelta(days=3), Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), timedelta(days=3),
Resolution.MINUTE)
# Tick objects
single_history_ticks = qb.history[Tick](btcusd, timedelta(days=3), Resolution.TICK)
subset_history_ticks = qb.history[Tick]([btcusd, ethusd], timedelta(days=3), Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), timedelta(days=3), Resolution.TICK)
To get historical data for a specific period of time, call the history method with the Symbol object(s), a start
datetime , and an end datetime . The start and end times you provide are based in the notebook time zone .
PY
start_time = datetime(2021, 1, 1)
end_time = datetime(2021, 2, 1)
# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](btcusd, start_time, end_time)
subset_history_trade_bars = qb.history[TradeBar]([btcusd, ethusd], start_time, end_time)
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), start_time, end_time)
# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](btcusd, start_time, end_time, Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([btcusd, ethusd], start_time, end_time,
Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), start_time, end_time,
Resolution.MINUTE)
# Tick objects
single_history_ticks = qb.history[Tick](btcusd, start_time, end_time, Resolution.TICK)
subset_history_ticks = qb.history[Tick]([btcusd, ethusd], start_time, end_time, Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), start_time, end_time, Resolution.TICK)
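To see how a notebook-time-zone start time maps to UTC, here is a plain stdlib sketch (it assumes a New York notebook time zone; QuantBook is not involved):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# A wall-clock start time interpreted in the notebook's time zone
# (New York assumed here for illustration).
start_time = datetime(2021, 1, 1, tzinfo=ZoneInfo("America/New_York"))

# The same instant expressed in UTC: midnight EST is 05:00 UTC.
print(start_time.astimezone(ZoneInfo("UTC")))
```

Keeping this offset in mind avoids off-by-hours surprises when comparing history request boundaries against UTC-stamped Crypto data.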
Resolutions
The following table shows the available resolutions and data formats for Crypto subscriptions:
TICK
SECOND
MINUTE
HOUR
DAILY
Markets
The data normalization mode doesn't affect data from history requests. If you change the data normalization mode, it won't affect the results.
Wrangle Data
You need some historical data to perform wrangling operations. The process to manipulate the historical data
depends on its data type. To display pandas objects, run a cell in a notebook with the pandas object as the last line.
DataFrame Objects
If the history method returns a DataFrame , the first level of the DataFrame index is the encoded Crypto Symbol
and the second level is the end_time of the data sample. The columns of the DataFrame are the data properties.
To select the historical data of a single Crypto, index the loc property of the DataFrame with the Crypto Symbol .
PY
all_history_df.loc[btcusd] # or all_history_df.loc['BTCUSD']
PY
all_history_df.loc[btcusd]['close']
If you request historical data for multiple Crypto pairs, you can transform the DataFrame so that it's a time series of
close values for all of the Crypto pairs. To transform the DataFrame , select the column you want to display for
each Crypto pair and then call the unstack method.
PY
all_history_df['close'].unstack(level=0)
The DataFrame is transformed so that the column indices are the Symbol of each Crypto pair and each row is a moment in time.
Slice Objects
If the history method returns Slice objects, iterate through the Slice objects to get each one. The Slice objects
may not have data for all of your Crypto subscriptions. To avoid issues, check if the Slice contains data for your
Crypto pair before you index it with the Crypto Symbol .
You can also iterate through each TradeBar and QuoteBar in the Slice .
PY
TradeBar Objects
If the history method returns TradeBar objects, iterate through the TradeBar objects to get each one.
PY
If the history method returns TradeBars , iterate through the TradeBars to get the TradeBar of each Crypto pair.
The TradeBars may not have data for all of your Crypto subscriptions. To avoid issues, check if the TradeBars
object contains data for your security before you index it with the Crypto Symbol .
PY
PY
QuoteBar Objects
If the history method returns QuoteBar objects, iterate through the QuoteBar objects to get each one.
PY
If the history method returns QuoteBars , iterate through the QuoteBars to get the QuoteBar of each Crypto pair.
The QuoteBars may not have data for all of your Crypto subscriptions. To avoid issues, check if the QuoteBars
object contains data for your security before you index it with the Crypto Symbol .
PY
PY
Tick Objects
If the history method returns Tick objects, iterate through the Tick objects to get each one.
PY
If the history method returns Ticks , iterate through the Ticks to get the Tick of each Crypto pair. The Ticks may
not have data for all of your Crypto subscriptions. To avoid issues, check if the Ticks object contains data for your
security before you index it with the Crypto Symbol .
PY
PY
The Ticks objects only contain the last tick of each security for that particular timeslice.
Plot Data
You need some historical Crypto data to produce plots. You can use many of the supported plotting libraries to
visualize data in various formats. For example, you can plot candlestick and line charts.
Candlestick Chart
PY
import plotly.graph_objects as go
3. Create a Candlestick .
PY
candlestick = go.Candlestick(x=history.index,
open=history['open'],
high=history['high'],
low=history['low'],
close=history['close'])
4. Create a Layout .
PY
PY
PY
fig.show()
Candlestick charts display the open, high, low, and close prices of the security.
Line Chart
PY
PY
volume = history['volume'].unstack(level=0)
PY
PY
plt.show()
Line charts display the value of the property you selected in a time series.
Datasets > Crypto Futures
Datasets
Crypto Futures
Introduction
This page explains how to request, manipulate, and visualize historical Crypto Futures data.
Create Subscriptions
1. Create a QuantBook .
PY
qb = QuantBook()
2. (Optional) Set the time zone to UTC.
PY
qb.set_time_zone(TimeZones.UTC)
3. Call the add_crypto_future method with a ticker and then save a reference to the Crypto Future Symbol .
PY
btcusd = qb.add_crypto_future("BTCUSD").symbol
ethusd = qb.add_crypto_future("ETHUSD").symbol
To view the supported assets in the Crypto Futures datasets, see the Data Explorer .
You need a subscription before you can request historical data for a security. You can request an amount of
historical data based on a trailing number of bars, a trailing period of time, or a defined period of time. You can also
request historical data for a single contract, a subset of the contracts you created subscriptions for in your
notebook, or all of the contracts in your notebook.
To get historical data for a number of trailing bars, call the history method with the Symbol object(s) and an
integer.
PY
# Slice objects
all_history_slice = qb.history(10)
# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](btcusd, 10)
subset_history_trade_bars = qb.history[TradeBar]([btcusd, ethusd], 10)
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), 10)
# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](btcusd, 10)
subset_history_quote_bars = qb.history[QuoteBar]([btcusd, ethusd], 10)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), 10)
To get historical data for a trailing period of time, call the history method with the Symbol object(s) and a
timedelta .
PY
# Slice objects
all_history_slice = qb.history(timedelta(days=3))
# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](btcusd, timedelta(days=3))
subset_history_trade_bars = qb.history[TradeBar]([btcusd, ethusd], timedelta(days=3))
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), timedelta(days=3))
# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](btcusd, timedelta(days=3), Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([btcusd, ethusd], timedelta(days=3), Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), timedelta(days=3),
Resolution.MINUTE)
# Tick objects
single_history_ticks = qb.history[Tick](btcusd, timedelta(days=3), Resolution.TICK)
subset_history_ticks = qb.history[Tick]([btcusd, ethusd], timedelta(days=3), Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), timedelta(days=3), Resolution.TICK)
To get historical data for a specific period of time, call the history method with the Symbol object(s), a start
datetime , and an end datetime . The start and end times you provide are based in the notebook time zone .
PY
start_time = datetime(2021, 1, 1)
end_time = datetime(2021, 2, 1)
# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](btcusd, start_time, end_time)
subset_history_trade_bars = qb.history[TradeBar]([btcusd, ethusd], start_time, end_time)
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), start_time, end_time)
# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](btcusd, start_time, end_time, Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([btcusd, ethusd], start_time, end_time,
Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), start_time, end_time,
Resolution.MINUTE)
# Tick objects
single_history_ticks = qb.history[Tick](btcusd, start_time, end_time, Resolution.TICK)
subset_history_ticks = qb.history[Tick]([btcusd, ethusd], start_time, end_time, Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), start_time, end_time, Resolution.TICK)
Resolutions
The following table shows the available resolutions and data formats for Crypto Futures contract subscriptions:
TICK
SECOND
MINUTE
HOUR
DAILY
Markets
The data normalization mode doesn't affect data from history requests. If you change the data normalization mode, it won't affect the results.
Wrangle Data
You need some historical data to perform wrangling operations. The process to manipulate the historical data
depends on its data type. To display pandas objects, run a cell in a notebook with the pandas object as the last line.
DataFrame Objects
If the history method returns a DataFrame , the first level of the DataFrame index is the encoded Crypto Future
Symbol and the second level is the end_time of the data sample. The columns of the DataFrame are the data
properties.
To select the historical data of a single Crypto Future, index the loc property of the DataFrame with the Crypto
Future Symbol .
PY
all_history_df.loc[btcusd] # or all_history_df.loc['BTCUSD']
PY
all_history_df.loc[btcusd]['close']
If you request historical data for multiple Crypto Futures contracts, you can transform the DataFrame so that it's a
time series of close values for all of the Crypto Futures contracts. To transform the DataFrame , select the column
you want to display for each Crypto Futures contract and then call the unstack method.
PY
all_history_df['close'].unstack(level=0)
The DataFrame is transformed so that the column indices are the Symbol of each Crypto Futures contract and each row is a moment in time.
Slice Objects
If the history method returns Slice objects, iterate through the Slice objects to get each one. The Slice objects
may not have data for all of your Crypto Future subscriptions. To avoid issues, check if the Slice contains data for
your Crypto Futures contract before you index it with the Crypto Future Symbol .
You can also iterate through each TradeBar and QuoteBar in the Slice .
PY
TradeBar Objects
If the history method returns TradeBar objects, iterate through the TradeBar objects to get each one.
PY
If the history method returns TradeBars , iterate through the TradeBars to get the TradeBar of each Crypto
Futures contract. The TradeBars may not have data for all of your Crypto Future subscriptions. To avoid issues,
check if the TradeBars object contains data for your security before you index it with the Crypto Future Symbol .
PY
PY
QuoteBar Objects
If the history method returns QuoteBar objects, iterate through the QuoteBar objects to get each one.
PY
If the history method returns QuoteBars , iterate through the QuoteBars to get the QuoteBar of each Crypto
Futures contract. The QuoteBars may not have data for all of your Crypto Future subscriptions. To avoid issues,
check if the QuoteBars object contains data for your security before you index it with the Crypto Future Symbol .
PY
PY
Tick Objects
If the history method returns Tick objects, iterate through the Tick objects to get each one.
PY
If the history method returns Ticks , iterate through the Ticks to get the Tick of each Crypto Futures contract.
The Ticks may not have data for all of your Crypto Future subscriptions. To avoid issues, check if the Ticks object
contains data for your security before you index it with the Crypto Future Symbol .
PY
PY
The Ticks objects only contain the last tick of each security for that particular timeslice.
Plot Data
You need some historical Crypto Futures data to produce plots. You can use many of the supported plotting
libraries to visualize data in various formats. For example, you can plot candlestick and line charts.
Candlestick Chart
PY
import plotly.graph_objects as go
3. Create a Candlestick .
PY
candlestick = go.Candlestick(x=history.index,
open=history['open'],
high=history['high'],
low=history['low'],
close=history['close'])
4. Create a Layout .
PY
PY
PY
fig.show()
Candlestick charts display the open, high, low, and close prices of the security.
Line Chart
PY
PY
history = qb.history([btcusd, ethusd], datetime(2021, 11, 23), datetime(2021, 12, 8), Resolution.DAILY)
PY
volume = history['volume'].unstack(level=0)
PY
PY
plt.show()
Line charts display the value of the property you selected in a time series.
Datasets > Futures
Datasets
Futures
Introduction
This page explains how to request, manipulate, and visualize historical Futures data.
Create Subscriptions
1. Create a QuantBook .
PY
qb = QuantBook()
2. Call the add_future method with a ticker, resolution, and contract rollover settings .
PY
future = qb.add_future(Futures.Indices.SP_500_E_MINI)
To view the available tickers in the US Futures dataset, see Supported Assets .
If you omit any of the arguments after the ticker, see the following table for their default values:
resolution Resolution.MINUTE
data_normalization_mode DataNormalizationMode.ADJUSTED
data_mapping_mode DataMappingMode.OPEN_INTEREST
contract_depth_offset 0
PY
future.set_filter(0, 90)
If you don't call the set_filter method, the future_history method won't return historical data.
If you want historical data on individual contracts and their OpenInterest , follow these steps to subscribe to the individual Future contracts:
1. Call the get_future_contract_list method with the underlying Future Symbol and a datetime .
PY
start_date = datetime(2021,12,20)
symbols = qb.future_chain_provider.get_future_contract_list(future.symbol, start_date)
This method returns a list of Symbol objects that reference the Future contracts that were trading at the given
time. If you set a contract filter with set_filter , it doesn't affect the results of get_future_contract_list .
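Selecting the closest expiry from such a list can be sketched with stand-in (name, expiry) pairs; in LEAN the expiry lives at symbol.id.date, but the objects below are invented:

```python
from datetime import date

# Hypothetical contract list standing in for the Symbol objects that
# get_future_contract_list returns.
contracts = [
    ("ES 2022-03", date(2022, 3, 18)),
    ("ES 2021-12", date(2021, 12, 17)),
    ("ES 2022-06", date(2022, 6, 17)),
]

# The front contract is the one with the closest (minimum) expiry date.
front_contract = min(contracts, key=lambda c: c[1])
print(front_contract[0])  # ES 2021-12
```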
2. Select the Symbol of the FutureContract object(s) for which you want to get historical data.
For example, select the Symbol of the contract with the closest expiry.
PY
contract_symbol = sorted(symbols, key=lambda s: s.id.date)[0]
3. Call the add_future_contract method with a FutureContract Symbol and disable fill-forward.
PY
Disable fill-forward because there are only a few OpenInterest data points per day.
You need a subscription before you can request historical data for Futures contracts. On the time dimension, you
can request an amount of historical data based on a trailing number of bars, a trailing period of time, or a defined
period of time. On the contract dimension, you can request historical data for a single contract, a subset of the
contracts you created subscriptions for in your notebook, or all of the contracts in your notebook.
These history requests return the prices and open interest of the Option contracts. They don't provide the implied
volatility or Greeks. To get the implied volatility and Greeks, call the option_chain method or create some
indicators .
Before you request historical data, call the set_start_date method with a datetime to reduce the risk of look-ahead bias.
PY
qb.set_start_date(start_date)
If you call the set_start_date method, the date that you pass to the method is the latest date for which your
history requests will return data.
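The motivation for set_start_date can be illustrated with plain pandas: restricting a series to samples at or before the analysis date ensures later data can't leak into the study. This sketch uses synthetic data, not the QuantConnect API:

```python
import pandas as pd

# Synthetic daily close series standing in for a history request.
closes = pd.Series(
    range(10), dtype=float,
    index=pd.date_range("2021-12-15", periods=10, freq="D"))

# Emulate the effect of set_start_date: discard every sample that
# comes after the analysis date so it can't leak into the study.
analysis_date = pd.Timestamp("2021-12-20")
visible = closes[closes.index <= analysis_date]
print(visible.index.max())  # last visible sample is the analysis date
```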
To get historical data for a number of trailing bars, call the history method with the contract Symbol object(s) and an integer.
PY
# Slice objects
all_history_slice = qb.history(10)
# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](contract_symbol, 10)
subset_history_trade_bars = qb.history[TradeBar]([contract_symbol], 10)
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), 10)
# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](contract_symbol, 10)
subset_history_quote_bars = qb.history[QuoteBar]([contract_symbol], 10)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), 10)
# OpenInterest objects
single_history_open_interest = qb.history[OpenInterest](contract_symbol, 400)
subset_history_open_interest = qb.history[OpenInterest]([contract_symbol], 400)
all_history_open_interest = qb.history[OpenInterest](qb.securities.keys(), 400)
The preceding calls return the most recent bars, excluding periods of time when the exchange was closed.
To get historical data for the continuous Futures contract, in the preceding history requests, replace contract_symbol with future.symbol.
To get historical data for a trailing period of time, call the history method with the contract Symbol object(s) and a
timedelta .
PY
# Slice objects
all_history_slice = qb.history(timedelta(days=3))
# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](contract_symbol, timedelta(days=3))
subset_history_trade_bars = qb.history[TradeBar]([contract_symbol], timedelta(days=3))
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), timedelta(days=3))
# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](contract_symbol, timedelta(days=3), Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([contract_symbol], timedelta(days=3), Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), timedelta(days=3), Resolution.MINUTE)
# Tick objects
single_history_ticks = qb.history[Tick](contract_symbol, timedelta(days=3), Resolution.TICK)
subset_history_ticks = qb.history[Tick]([contract_symbol], timedelta(days=3), Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), timedelta(days=3), Resolution.TICK)
# OpenInterest objects
single_history_open_interest = qb.history[OpenInterest](contract_symbol, timedelta(days=2))
subset_history_open_interest = qb.history[OpenInterest]([contract_symbol], timedelta(days=2))
all_history_open_interest = qb.history[OpenInterest](qb.securities.keys(), timedelta(days=2))
The preceding calls return the most recent bars or ticks, excluding periods of time when the exchange was closed.
To get historical data for the continuous Futures contract, in the preceding history requests, replace contract_symbol with future.symbol.
To get historical data for individual Futures contracts during a specific period of time, call the history method with the Futures contract Symbol object(s), a start datetime , and an end datetime . The start and end times you provide are based in the notebook time zone .
PY
start_time = datetime(2021, 1, 1)
end_time = datetime(2021, 2, 1)
# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](contract_symbol, start_time, end_time)
subset_history_trade_bars = qb.history[TradeBar]([contract_symbol], start_time, end_time)
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), start_time, end_time)
# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](contract_symbol, start_time, end_time, Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([contract_symbol], start_time, end_time, Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), start_time, end_time, Resolution.MINUTE)
# Tick objects
single_history_ticks = qb.history[Tick](contract_symbol, start_time, end_time, Resolution.TICK)
subset_history_ticks = qb.history[Tick]([contract_symbol], start_time, end_time, Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), start_time, end_time, Resolution.TICK)
# OpenInterest objects
single_history_open_interest = qb.history[OpenInterest](contract_symbol, start_time, end_time)
subset_history_open_interest = qb.history[OpenInterest]([contract_symbol], start_time, end_time)
all_history_open_interest = qb.history[OpenInterest](qb.securities.keys(), start_time, end_time)
The preceding calls return the bars or ticks that have a timestamp within the defined period of time.
To get historical data for the continuous Futures contract, in the preceding history requests, replace contract_symbol with future.symbol.
To get historical data for all of the Futures contracts that pass your filter during a specific period of time, call the
future_history method with the Symbol object of the continuous Future, a start datetime , and an end datetime .
PY
The preceding calls return data that have a timestamp within the defined period of time.
Resolutions
The following table shows the available resolutions and data formats for Futures subscriptions:
Resolution TradeBar QuoteBar Trade Tick Quote Tick
TICK
SECOND
MINUTE
HOUR
DAILY
Markets
Data Normalization
The data normalization mode doesn't affect data from history requests for Futures contracts. If you change the data normalization mode, it won't change the results.
The following data normalization modes are available for continuous Futures contracts :
Wrangle Data
You need some historical data to perform wrangling operations. The process to manipulate the historical data
depends on its data type. To display pandas objects, run a cell in a notebook with the pandas object as the last line.
DataFrame Objects
If your history request returns a DataFrame , the DataFrame has the following index levels:
1. Contract expiry
2. Encoded contract Symbol
3. The end_time of the data sample
The columns of the DataFrame are the data properties. Depending on how you request data, the DataFrame may contain data for the continuous Futures contract. The continuous contract doesn't expire, so the default expiry date of December 30, 1899 doesn't have any practical meaning.
PY
If you remove the first index level, you can index the DataFrame with just the contract Symbol , similar to how you
would with non-derivative asset classes. To remove the first index level, call the droplevel method.
PY
all_history_df.index = all_history_df.index.droplevel(0)
To select the historical data of a single Futures contract, index the loc property of the DataFrame with the contract
Symbol .
PY
all_history_df.loc[contract_symbol]
To select a column of the DataFrame , index it with the column name.
PY
all_history_df.loc[contract_symbol]['close']
If you request historical data for multiple Futures contracts, you can transform the DataFrame so that it's a time
series of close values for all of the Futures contracts. To transform the DataFrame , select the column you want to
display for each Futures contract and then call the unstack method.
PY
all_history_df['close'].unstack(level=0)
The DataFrame is transformed so that the column indices are the Symbol of each security and each row contains the close value.
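The droplevel, loc, and unstack operations above can be demonstrated on a small synthetic DataFrame with the same (expiry, symbol, time) index shape. The symbols, expiries, and prices below are illustrative, not real data:

```python
import pandas as pd
from datetime import datetime, timedelta

# Build a frame shaped like a Futures history DataFrame:
# index levels are (contract expiry, contract symbol, sample end_time).
times = [datetime(2022, 1, 3) + timedelta(days=i) for i in range(3)]
rows = [(expiry, symbol, t)
        for expiry, symbol in [(datetime(2022, 3, 18), "ES H22"),
                               (datetime(2022, 6, 17), "ES M22")]
        for t in times]
df = pd.DataFrame(
    {"close": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]},
    index=pd.MultiIndex.from_tuples(rows, names=["expiry", "symbol", "time"]))

# Drop the expiry level so the frame is indexed by (symbol, time).
df.index = df.index.droplevel(0)

# Select one contract's close history, then pivot the whole frame
# into a time series with one column per contract.
single = df.loc["ES H22"]["close"]
closes = df["close"].unstack(level=0)
print(closes)
```

After `unstack(level=0)`, each column is one contract and each row is one timestamp, which is the shape the plotting sections below rely on.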
Slice Objects
If the history method returns Slice objects, iterate through the Slice objects to get each one. The Slice objects may not have data for all of your Futures subscriptions. To avoid issues, check if the Slice contains data for your Futures contract before you index it with the Futures Symbol .
You can also iterate through each TradeBar and QuoteBar in the Slice .
PY
TradeBar Objects
If the history method returns TradeBar objects, iterate through the TradeBar objects to get each one.
PY
If the history method returns TradeBars , iterate through the TradeBars to get the TradeBar of each Futures
contract. The TradeBars may not have data for all of your Futures subscriptions. To avoid issues, check if the
TradeBars object contains data for your security before you index it with the Futures Symbol .
PY
PY
QuoteBar Objects
If the history method returns QuoteBar objects, iterate through the QuoteBar objects to get each one.
PY
If the history method returns QuoteBars , iterate through the QuoteBars to get the QuoteBar of each Futures
contract. The QuoteBars may not have data for all of your Futures subscriptions. To avoid issues, check if the
QuoteBars object contains data for your security before you index it with the Futures Symbol .
PY
PY
Tick Objects
If the history method returns Tick objects, iterate through the Tick objects to get each one.
PY
If the history method returns Ticks , iterate through the Ticks to get the Tick of each Futures contract. The Ticks
may not have data for all of your Futures subscriptions. To avoid issues, check if the Ticks object contains data for
your security before you index it with the Futures Symbol .
PY
PY
The Ticks objects only contain the last tick of each security for that particular timeslice.
OpenInterest Objects
If the history method returns OpenInterest objects, iterate through the OpenInterest objects to get each one.
PY
If the history method returns a dictionary of OpenInterest objects, iterate through the dictionary to get the
OpenInterest of each Futures contract. The dictionary of OpenInterest objects may not have data for all of your
Futures contract subscriptions. To avoid issues, check if the dictionary contains data for your contract before you index it with the contract Symbol .
PY
PY
FutureHistory Objects
The future_history method returns a FutureHistory object. To get each slice in the FutureHistory object,
iterate through it.
PY
To convert the FutureHistory object to a DataFrame that contains the trade and quote information of each contract, call the get_all_data method.
PY
future_history.get_all_data()
To get the expiration dates of all the contracts in a FutureHistory object, call the get_expiry_dates method.
PY
future_history.get_expiry_dates()
Plot Data
You need some historical Futures data to produce plots. You can use many of the supported plotting libraries to
visualize data in various formats. For example, you can plot candlestick and line charts.
Candlestick Chart
PY
PY
import plotly.graph_objects as go
4. Create a Candlestick .
PY
candlestick = go.Candlestick(x=history.index,
open=history['open'],
high=history['high'],
low=history['low'],
close=history['close'])
5. Create a Layout .
PY
PY
PY
fig.show()
Candlestick charts display the open, high, low, and close prices of the contract.
Line Chart
PY
PY
history.index = history.index.droplevel(0)
PY
closing_prices = history['close'].unstack(level=0)
PY
PY
plt.show()
Line charts display the value of the property you selected in a time series.
Datasets > Futures Options
Datasets
Futures Options
Key Concepts
Introduction
Future Option contracts give the buyer a window of opportunity to buy or sell the underlying Future contract at a
specific price. This page explains the basics of Future Option data in the Research Environment. To get some data,
see Universes or Individual Contracts . For more information about the specific datasets we use, see the US Future
Options dataset listing.
Resolutions
The following table shows the available resolutions and data formats for Future Option contract subscriptions:
TICK
SECOND
MINUTE
HOUR
DAILY
Markets
The following Market enumeration members are available for Future Options:
Data Normalization
The data normalization mode doesn't affect data from history requests. If you change the data normalization mode, it won't change the results.
Futures Options
Universes
Introduction
This page explains how to request historical data for a universe of Future Option contracts.
Create Subscriptions
1. Create a QuantBook .
PY
qb = QuantBook()
2. Call the add_future method with a ticker.
PY
future = qb.add_future(Futures.Indices.SP_500_E_MINI)
To view the available underlying Futures in the US Future Options dataset, see Supported Assets .
Price History
The contract filter determines which Future Option contracts are in your universe each trading day. The default
To get the prices and volumes for all of the Future Option contracts that pass your filter during a specific period of time, get the underlying Future contract and then call the option_history method with the Future contract's Symbol , a start datetime , and an end datetime .
PY
start_date = datetime(2024, 1, 1)
# Select an underlying Futures contract. For example, get the front-month contract.
futures_contract = sorted(
qb.future_chain_provider.get_future_contract_list(future.symbol, start_date),
key=lambda symbol: symbol.id.date
)[0]
To convert the OptionHistory object to a DataFrame that contains the trade and quote information of each contract, use the data_frame property.
PY
option_history.data_frame
To get the expiration dates of all the contracts in an OptionHistory object, call the get_expiry_dates method.
PY
option_history.get_expiry_dates()
To get the strike prices of all the contracts in an OptionHistory object, call the get_strikes method.
PY
option_history.get_strikes()
Datasets > Futures Options > Individual Contracts
Futures Options
Individual Contracts
Introduction
This page explains how to request historical data for individual Future Option contracts. The history requests on
this page only return the prices and open interest of the Option contracts, not their implied volatility or Greeks.
Create Subscriptions
1. Create a QuantBook .
PY
qb = QuantBook()
2. Add the underlying Future and subscribe to one of its contracts. For example, select the contract with the closest expiry.
PY
future = qb.add_future(Futures.Indices.SP_500_E_MINI)
start_date = datetime(2023, 12, 20)
futures_contract_symbol = sorted(
qb.future_chain_provider.get_future_contract_list(future.symbol, start_date),
key=lambda s: s.id.date
)[0]
qb.add_future_contract(futures_contract_symbol, fill_forward=False)
To view the available underlying Futures in the US Future Options dataset, see Supported Assets .
3. Set the start date to a date in the past that you want to use as the analysis date.
PY
qb.set_start_date(futures_contract_symbol.id.date - timedelta(5))
The method that you call in the next step returns data on all the contracts that were tradable on this date.
4. Call the option_chain method with the underlying Futures contract Symbol .
PY
chain = qb.option_chain(futures_contract_symbol, flatten=True).data_frame
This method returns an OptionChain object, which represents an entire chain of Option contracts for a single
underlying security. You can even format the chain data into a DataFrame where each row in the DataFrame
represents a single contract.
5. Sort and filter the data to select the specific Futures Options contract(s) you want to analyze.
PY
# Select a contract.
expiry = chain.expiry.min()
fop_contract_symbol = chain[
# Select call contracts with the closest expiry.
(chain.expiry == expiry) &
(chain.right == OptionRight.CALL)
# Select the contract with a strike price near the middle.
].sort_values('strike').index[150]
6. Call the add_future_option_contract method with an OptionContract Symbol and disable fill-forward.
PY
qb.add_future_option_contract(fop_contract_symbol, fill_forward=False)
Disable fill-forward because there are only a few OpenInterest data points per day.
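The filter in step 5 is ordinary pandas boolean indexing. The synthetic chain below (hypothetical strikes, expiries, and contract labels) reproduces the same select, sort, and index pattern on a smaller scale:

```python
import pandas as pd
from datetime import datetime

# Hypothetical flattened chain: one row per contract, with the
# expiry / right / strike columns the filtering relies on.
CALL, PUT = 0, 1  # stand-ins for OptionRight.CALL / OptionRight.PUT
chain = pd.DataFrame(
    {"expiry": [datetime(2024, 3, 15)] * 4 + [datetime(2024, 6, 21)] * 2,
     "right":  [CALL, CALL, PUT, CALL, CALL, PUT],
     "strike": [4900, 5000, 5000, 5100, 5000, 5000]},
    index=[f"contract_{i}" for i in range(6)])

# Select call contracts with the closest expiry, sort by strike,
# and take the contract in the middle of the strike ladder.
expiry = chain.expiry.min()
calls = chain[(chain.expiry == expiry) & (chain.right == CALL)]
middle = calls.sort_values("strike").index[len(calls) // 2]
print(middle)  # contract_1
```

The hard-coded `index[150]` in step 5 plays the same role as `index[len(calls) // 2]` here: it picks a strike near the middle of the sorted ladder.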
Trade History
TradeBar objects are price bars that consolidate individual trades from the exchanges. They contain the open,
high, low, close, and volume of trading activity over a period of time.
To get trade data, call the history or history[TradeBar] method with the contract Symbol object(s).
PY
# DataFrame format
history_df = qb.history(TradeBar, fop_contract_symbol, timedelta(3))
display(history_df)
# TradeBar objects
history = qb.history[TradeBar](fop_contract_symbol, timedelta(3))
for trade_bar in history:
print(trade_bar)
Quote History
QuoteBar objects are bars that consolidate NBBO quotes from the exchanges. They contain the open, high, low,
and close prices of the bid and ask. The open , high , low , and close properties of the QuoteBar object are the
mean of the respective bid and ask prices. If the bid or ask portion of the QuoteBar has no data, the open , high ,
low , and close properties of the QuoteBar copy the values of either the bid or ask instead of taking their mean.
To get quote data, call the history or history[QuoteBar] method with the contract Symbol object(s).
PY
# DataFrame format
history_df = qb.history(QuoteBar, fop_contract_symbol, timedelta(3))
display(history_df)
# QuoteBar objects
history = qb.history[QuoteBar](fop_contract_symbol, timedelta(3))
for quote_bar in history:
print(quote_bar)
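The bid/ask averaging rule described above (mean when both sides have data, otherwise copy the populated side) can be sketched as a plain function. This illustrates the behavior, not QuantConnect's implementation:

```python
def quote_bar_close(bid_close, ask_close):
    """Mid price when both sides have data; otherwise copy the side
    that does. None marks a missing bid or ask portion."""
    if bid_close is not None and ask_close is not None:
        return (bid_close + ask_close) / 2
    return ask_close if bid_close is None else bid_close

print(quote_bar_close(10.0, 12.0))  # 11.0 (mean of bid and ask)
print(quote_bar_close(None, 12.0))  # 12.0 (only the ask has data)
```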
Open Interest History
Open interest is the number of outstanding contracts that haven't been settled. It provides a measure of investor interest and market liquidity, so it's a popular metric to use for contract selection. Open interest is calculated once per day.
PY
# DataFrame format
history_df = qb.history(OpenInterest, fop_contract_symbol, timedelta(3))
display(history_df)
# OpenInterest objects
history = qb.history[OpenInterest](fop_contract_symbol, timedelta(3))
for open_interest in history:
print(open_interest)
Greeks and Implied Volatility History
The Greeks are measures that describe the sensitivity of an Option's price to various factors like underlying price changes (Delta), time decay (Theta), volatility (Vega), and interest rates (Rho), while Implied Volatility (IV) represents the market's expectation of the underlying asset's volatility over the life of the Option.
1. Create the mirror contract Symbol .
PY
mirror_contract_symbol = Symbol.create_option(
option_contract.underlying.symbol, fop_contract_symbol.id.market, option_contract.style,
OptionRight.CALL if option_contract.right == OptionRight.PUT else OptionRight.PUT,
option_contract.strike_price, option_contract.expiry
)
2. Set up the risk-free interest rate, dividend yield, and Option pricing models.
In our research, we found the Forward Tree model to be the best pricing model for indicators.
PY
risk_free_rate_model = qb.risk_free_interest_rate_model
dividend_yield_model = DividendYieldProvider(futures_contract_symbol)
option_model = OptionPricingModelType.FORWARD_TREE
3. Define a method to return the IV & Greeks indicator values for each contract.
PY
return pd.DataFrame({
'iv_call': get_values(ImpliedVolatility, call, put),
'iv_put': get_values(ImpliedVolatility, put, call),
'delta_call': get_values(Delta, call, put),
'delta_put': get_values(Delta, put, call),
'gamma_call': get_values(Gamma, call, put),
'gamma_put': get_values(Gamma, put, call),
'rho_call': get_values(Rho, call, put),
'rho_put': get_values(Rho, put, call),
'vega_call': get_values(Vega, call, put),
'vega_put': get_values(Vega, put, call),
'theta_call': get_values(Theta, call, put),
'theta_put': get_values(Theta, put, call),
})
PY
The DataFrame can have NaN entries if there is no data for the contracts or the underlying asset at a moment in
time.
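As a self-contained illustration of one of the sensitivities described above (using the Black-Scholes formula, not the Forward Tree model the indicators use), the delta of a European call is N(d1). All inputs below are illustrative:

```python
from math import erf, log, sqrt

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call_delta(spot, strike, rate, vol, years):
    """Black-Scholes delta of a European call (illustration only)."""
    d1 = (log(spot / strike) + (rate + vol ** 2 / 2) * years) / (vol * sqrt(years))
    return norm_cdf(d1)

# An at-the-money call has a delta a little above 0.5; a deep
# in-the-money call approaches 1.0.
delta = bs_call_delta(spot=100, strike=100, rate=0.05, vol=0.2, years=0.5)
print(round(delta, 4))
```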
Examples
The following examples demonstrate some common practices for analyzing individual Future Option contracts in
the Research Environment.
Example 1: Contract Mid-Price History
The following notebook plots the historical mid-prices of an E-mini S&P 500 Future Option contract using Plotly :
PY
import plotly.graph_objects as go
# Get the Future Option chain as of 5 days before the underlying Future's expiry date.
qb.set_start_date(futures_contract_symbol.id.date - timedelta(5))
chain = qb.option_chain(futures_contract_symbol, flatten=True).data_frame
Datasets
Forex
Introduction
This page explains how to request, manipulate, and visualize historical Forex data.
Create Subscriptions
1. Create a QuantBook .
PY
qb = QuantBook()
2. (Optional) Set the time zone to the data time zone .
PY
qb.set_time_zone(TimeZones.UTC)
3. Call the add_forex method with a ticker and then save a reference to the Forex Symbol .
PY
eurusd = qb.add_forex("EURUSD").symbol
gbpusd = qb.add_forex("GBPUSD").symbol
You need a subscription before you can request historical data for a security. On the time dimension, you can
request an amount of historical data based on a trailing number of bars, a trailing period of time, or a defined
period of time. On the security dimension, you can request historical data for a single Forex pair, a subset of the
pairs you created subscriptions for in your notebook, or all of the pairs in your notebook.
To get historical data for a number of trailing bars, call the history method with the Symbol object(s) and an
integer.
PY
# DataFrame
single_history_df = qb.history(eurusd, 10)
subset_history_df = qb.history([eurusd, gbpusd], 10)
all_history_df = qb.history(qb.securities.keys(), 10)
# Slice objects
all_history_slice = qb.history(10)
# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](eurusd, 10)
subset_history_quote_bars = qb.history[QuoteBar]([eurusd, gbpusd], 10)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), 10)
The preceding calls return the most recent bars, excluding periods of time when the exchange was closed.
To get historical data for a trailing period of time, call the history method with the Symbol object(s) and a
timedelta .
PY
# Slice objects
all_history_slice = qb.history(timedelta(days=3))
# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](eurusd, timedelta(days=3), Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([eurusd, gbpusd], timedelta(days=3), Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), timedelta(days=3), Resolution.MINUTE)
# Tick objects
single_history_ticks = qb.history[Tick](eurusd, timedelta(days=3), Resolution.TICK)
subset_history_ticks = qb.history[Tick]([eurusd, gbpusd], timedelta(days=3), Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), timedelta(days=3), Resolution.TICK)
The preceding calls return the most recent bars or ticks, excluding periods of time when the exchange was closed.
To get historical data for a specific period of time, call the history method with the Symbol object(s), a start
datetime , and an end datetime . The start and end times you provide are based in the notebook time zone .
PY
start_time = datetime(2021, 1, 1)
end_time = datetime(2021, 2, 1)
# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](eurusd, start_time, end_time, Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([eurusd, gbpusd], start_time, end_time, Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), start_time, end_time, Resolution.MINUTE)
# Tick objects
single_history_ticks = qb.history[Tick](eurusd, start_time, end_time, Resolution.TICK)
subset_history_ticks = qb.history[Tick]([eurusd, gbpusd], start_time, end_time, Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), start_time, end_time, Resolution.TICK)
The preceding calls return the bars or ticks that have a timestamp within the defined period of time.
Resolutions
The following table shows the available resolutions and data formats for Forex subscriptions:
TICK
SECOND
MINUTE
HOUR
DAILY
Markets
Data Normalization
The data normalization mode doesn't affect data from history requests. If you change the data normalization mode, it won't change the results.
Wrangle Data
You need some historical data to perform wrangling operations. The process to manipulate the historical data
depends on its data type. To display pandas objects, run a cell in a notebook with the pandas object as the last line.
To display other data formats, call the print method.
DataFrame Objects
If the history method returns a DataFrame , the first level of the DataFrame index is the encoded Forex Symbol and
the second level is the end_time of the data sample. The columns of the DataFrame are the data properties.
To select the historical data of a single Forex, index the loc property of the DataFrame with the Forex Symbol .
PY
all_history_df.loc[eurusd] # or all_history_df.loc['EURUSD']
To select a column of the DataFrame , index it with the column name.
PY
all_history_df.loc[eurusd]['close']
If you request historical data for multiple Forex pairs, you can transform the DataFrame so that it's a time series of close values for all of the Forex pairs. To transform the DataFrame , select the column you want to display for each Forex pair and then call the unstack method.
PY
all_history_df['close'].unstack(level=0)
The DataFrame is transformed so that the column indices are the Symbol of each Forex pair and each row contains the close value.
Slice Objects
If the history method returns Slice objects, iterate through the Slice objects to get each one. The Slice objects
may not have data for all of your Forex subscriptions. To avoid issues, check if the Slice contains data for your
Forex pair before you index it with the Forex Symbol .
PY
QuoteBar Objects
If the history method returns QuoteBar objects, iterate through the QuoteBar objects to get each one.
PY
If the history method returns QuoteBars , iterate through the QuoteBars to get the QuoteBar of each Forex pair.
The QuoteBars may not have data for all of your Forex subscriptions. To avoid issues, check if the QuoteBars
object contains data for your security before you index it with the Forex Symbol .
PY
PY
Tick Objects
If the history method returns Tick objects, iterate through the Tick objects to get each one.
PY
If the history method returns Ticks , iterate through the Ticks to get the Tick of each Forex pair. The Ticks may
not have data for all of your Forex subscriptions. To avoid issues, check if the Ticks object contains data for your
security before you index it with the Forex Symbol .
PY
PY
Plot Data
You need some historical Forex data to produce plots. You can use many of the supported plotting libraries to
visualize data in various formats. For example, you can plot candlestick and line charts.
Candlestick Chart
PY
PY
import plotly.graph_objects as go
3. Create a Candlestick .
PY
candlestick = go.Candlestick(x=history.index,
open=history['open'],
high=history['high'],
low=history['low'],
close=history['close'])
4. Create a Layout .
PY
PY
PY
fig.show()
Candlestick charts display the open, high, low, and close prices of the security.
Line Chart
PY
PY
pct_change = history['close'].unstack(0).pct_change().dropna()
PY
PY
plt.show()
Line charts display the value of the property you selected in a time series.
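The pct_change transformation behind this line chart can be checked on synthetic closes. The pairs and values below are illustrative, shaped like the output of `history['close'].unstack(0)`:

```python
import pandas as pd

# Synthetic daily closes: one column per Forex pair, one row per day.
closes = pd.DataFrame(
    {"EURUSD": [1.10, 1.12, 1.11], "GBPUSD": [1.30, 1.30, 1.33]},
    index=pd.date_range("2021-11-26", periods=3, freq="D"))

# Daily percent change; the first row is NaN, so drop it.
pct_change = closes.pct_change().dropna()
print(pct_change.round(4))
```

Plotting this frame yields one line of daily returns per pair, which is what the Line Chart steps above display.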
Datasets > CFD
Datasets
CFD
Introduction
This page explains how to request, manipulate, and visualize historical CFD data.
Create Subscriptions
1. Create a QuantBook .
PY
qb = QuantBook()
2. (Optional) Set the time zone to the data time zone .
PY
qb.set_time_zone(TimeZones.UTC)
3. Call the add_cfd method with a ticker and then save a reference to the CFD Symbol .
PY
spx = qb.add_cfd("SPX500USD").symbol
usb = qb.add_cfd("USB10YUSD").symbol
You need a subscription before you can request historical data for a security. On the time dimension, you can
request an amount of historical data based on a trailing number of bars, a trailing period of time, or a defined
period of time. On the security dimension, you can request historical data for a single CFD contract, a subset of the
contracts you created subscriptions for in your notebook, or all of the contracts in your notebook.
To get historical data for a number of trailing bars, call the history method with the Symbol object(s) and an
integer.
PY
# DataFrame
single_history_df = qb.history(spx, 10)
subset_history_df = qb.history([spx, usb], 10)
all_history_df = qb.history(qb.securities.keys(), 10)
# Slice objects
all_history_slice = qb.history(10)
# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](spx, 10)
subset_history_quote_bars = qb.history[QuoteBar]([spx, usb], 10)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), 10)
The preceding calls return the most recent bars, excluding periods of time when the exchange was closed.
To get historical data for a trailing period of time, call the history method with the Symbol object(s) and a
timedelta .
PY
# Slice objects
all_history_slice = qb.history(timedelta(days=3))
# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](spx, timedelta(days=3), Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([spx, usb], timedelta(days=3), Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), timedelta(days=3), Resolution.MINUTE)
# Tick objects
single_history_ticks = qb.history[Tick](spx, timedelta(days=3), Resolution.TICK)
subset_history_ticks = qb.history[Tick]([spx, usb], timedelta(days=3), Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), timedelta(days=3), Resolution.TICK)
The preceding calls return the most recent bars or ticks, excluding periods of time when the exchange was closed.
To get historical data for a specific period of time, call the history method with the Symbol object(s), a start
datetime , and an end datetime . The start and end times you provide are based in the notebook time zone .
PY
start_time = datetime(2021, 1, 1)
end_time = datetime(2021, 2, 1)
# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](spx, start_time, end_time, Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([spx, usb], start_time, end_time, Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), start_time, end_time, Resolution.MINUTE)
# Tick objects
single_history_ticks = qb.history[Tick](spx, start_time, end_time, Resolution.TICK)
subset_history_ticks = qb.history[Tick]([spx, usb], start_time, end_time, Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), start_time, end_time, Resolution.TICK)
The preceding calls return the bars or ticks that have a timestamp within the defined period of time.
Resolutions
The following table shows the available resolutions and data formats for CFD subscriptions:
TICK
SECOND
MINUTE
HOUR
DAILY
Markets
Data Normalization
The data normalization mode doesn't affect data from history requests. If you change the data normalization mode, it won't change the results.
Wrangle Data
You need some historical data to perform wrangling operations. The process to manipulate the historical data
depends on its data type. To display pandas objects, run a cell in a notebook with the pandas object as the last line.
To display other data formats, call the print method.
DataFrame Objects
If the history method returns a DataFrame , the first level of the DataFrame index is the encoded CFD Symbol and
the second level is the end_time of the data sample. The columns of the DataFrame are the data properties.
To select the historical data of a single CFD, index the loc property of the DataFrame with the CFD Symbol .
PY
all_history_df.loc[spx] # or all_history_df.loc['SPX500USD']
To select a column of the DataFrame , index it with the column name.
PY
all_history_df.loc[spx]['close']
If you request historical data for multiple CFD contracts, you can transform the DataFrame so that it's a time series
of close values for all of the CFD contracts. To transform the DataFrame , select the column you want to display for
each CFD contract and then call the unstack method.
PY
all_history_df['close'].unstack(level=0)
The DataFrame is transformed so that the column indices are the Symbol of each CFD contract and each row
contains the close value.
Slice Objects
If the history method returns Slice objects, iterate through the Slice objects to get each one. The Slice objects
may not have data for all of your CFD subscriptions. To avoid issues, check if the Slice contains data for your CFD
contract before you index it with the CFD Symbol .
PY
QuoteBar Objects
If the history method returns QuoteBar objects, iterate through the QuoteBar objects to get each one.
PY
If the history method returns QuoteBars , iterate through the QuoteBars to get the QuoteBar of each CFD contract.
The QuoteBars may not have data for all of your CFD subscriptions. To avoid issues, check if the QuoteBars object
contains data for your security before you index it with the CFD Symbol .
PY
PY
Tick Objects
If the history method returns Tick objects, iterate through the Tick objects to get each one.
PY
If the history method returns Ticks , iterate through the Ticks to get the Tick of each CFD contract. The Ticks may not have data for all of your CFD subscriptions. To avoid issues, check if the Ticks object contains data for your security before you index it with the CFD Symbol .
PY
PY
Plot Data
You need some historical CFD data to produce plots. You can use many of the supported plotting libraries to
visualize data in various formats. For example, you can plot candlestick and line charts.
Candlestick Chart
PY
PY
import plotly.graph_objects as go
3. Create a Candlestick .
PY
candlestick = go.Candlestick(x=history.index,
open=history['open'],
high=history['high'],
low=history['low'],
close=history['close'])
4. Create a Layout .
PY
PY
PY
fig.show()
Candlestick charts display the open, high, low, and close prices of the security.
Line Chart
PY
history = qb.history([spx, usb], datetime(2021, 11, 26), datetime(2021, 12, 8), Resolution.DAILY)
PY
pct_change = history['close'].unstack(0).pct_change().dropna()
PY
PY
plt.show()
Line charts display the value of the property you selected in a time series.
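The line-chart steps above can be sketched end to end with synthetic data and Matplotlib; the tickers and prices below are made up, and the headless Agg backend stands in for an interactive notebook:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic daily close prices for two made-up tickers.
dates = pd.date_range("2021-11-26", periods=10, freq="D")
closes = pd.DataFrame(
    {"SPX500USD": np.linspace(4500, 4600, 10),
     "US30USD": np.linspace(35000, 35500, 10)},
    index=dates,
)

# Daily percent change, mirroring history['close'].unstack(0).pct_change().dropna().
pct_change = closes.pct_change().dropna()

# Draw one line per column; in a notebook, plt.show() would render the figure.
ax = pct_change.plot(title="Daily % change")
ax.set_ylabel("return")
```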
Datasets > Indices
Datasets
Indices
Introduction
This page explains how to request, manipulate, and visualize historical Index data.
Create Subscriptions
1. Create a QuantBook .
PY
qb = QuantBook()
2. Call the add_index method with a ticker and then save a reference to the Index Symbol .
PY
spx = qb.add_index("SPX").symbol
vix = qb.add_index("VIX").symbol
You need a subscription before you can request historical data for a security. On the time dimension, you can
request an amount of historical data based on a trailing number of bars, a trailing period of time, or a defined
period of time. On the security dimension, you can request historical data for a single Index, a subset of the Indices
you created subscriptions for in your notebook, or all of the Indices in your notebook.
To get historical data for a number of trailing bars, call the history method with the Symbol object(s) and an
integer.
PY
# DataFrame
single_history_df = qb.history(spx, 10)
single_history_trade_bar_df = qb.history(TradeBar, spx, 10)
subset_history_df = qb.history([spx, vix], 10)
subset_history_trade_bar_df = qb.history(TradeBar, [spx, vix], 10)
all_history_df = qb.history(qb.securities.keys(), 10)
all_history_trade_bar_df = qb.history(TradeBar, qb.securities.keys(), 10)
# Slice objects
all_history_slice = qb.history(10)
# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](spx, 10)
subset_history_trade_bars = qb.history[TradeBar]([spx, vix], 10)
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), 10)
The preceding calls return the most recent bars, excluding periods of time when the exchange was closed.
To get historical data for a trailing period of time, call the history method with the Symbol object(s) and a
timedelta .
PY
# Slice objects
all_history_slice = qb.history(timedelta(days=3))
# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](spx, timedelta(days=3))
subset_history_trade_bars = qb.history[TradeBar]([spx, vix], timedelta(days=3))
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), timedelta(days=3))
# Tick objects
single_history_ticks = qb.history[Tick](spx, timedelta(days=3), Resolution.TICK)
subset_history_ticks = qb.history[Tick]([spx, vix], timedelta(days=3), Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), timedelta(days=3), Resolution.TICK)
The preceding calls return the most recent bars or ticks, excluding periods of time when the exchange was closed.
To get historical data for a specific period of time, call the history method with the Symbol object(s), a start
datetime , and an end datetime . The start and end times you provide are based in the notebook time zone .
PY
start_time = datetime(2021, 1, 1)
end_time = datetime(2021, 2, 1)
# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](spx, start_time, end_time)
subset_history_trade_bars = qb.history[TradeBar]([spx, vix], start_time, end_time)
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), start_time, end_time)
# Tick objects
single_history_ticks = qb.history[Tick](spx, start_time, end_time, Resolution.TICK)
subset_history_ticks = qb.history[Tick]([spx, vix], start_time, end_time, Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), start_time, end_time, Resolution.TICK)
The preceding calls return the bars or ticks that have a timestamp within the defined period of time.
Resolutions
The following resolutions are available for Index subscriptions: Tick, Second, Minute, Hour, and Daily.
Markets
Data Normalization
The data normalization mode doesn't affect data from history requests. If you change the data normalization mode,
it won't change the outcome.
Wrangle Data
You need some historical data to perform wrangling operations. The process to manipulate the historical data
depends on its data type. To display pandas objects, run a cell in a notebook with the pandas object as the last line.
If the history method returns a DataFrame , the first level of the DataFrame index is the encoded Index Symbol and
the second level is the end_time of the data sample. The columns of the DataFrame are the data properties.
To select the historical data of a single Index, index the loc property of the DataFrame with the Index Symbol .
PY
all_history_df.loc[spx] # or all_history_df.loc['SPX']
PY
all_history_df.loc[spx]['close']
If you request historical data for multiple Indices, you can transform the DataFrame so that it's a time series of close
values for all of the Indices. To transform the DataFrame, select the column you want to display for each Index and
then call the unstack method.
PY
all_history_df['close'].unstack(level=0)
The DataFrame is transformed so that the column indices are the Symbol of each Index and each row contains the
close value.
Slice Objects
If the history method returns Slice objects, iterate through the Slice objects to get each one. The Slice objects
may not have data for all of your Index subscriptions. To avoid issues, check if the Slice contains data for your
Index before you index it with the Index Symbol.
PY
TradeBar Objects
If the history method returns TradeBar objects, iterate through the TradeBar objects to get each one.
PY
If the history method returns TradeBars , iterate through the TradeBars to get the TradeBar of each Index. The
TradeBars may not have data for all of your Index subscriptions. To avoid issues, check if the TradeBars object
contains data for your security before you index it with the Index Symbol .
PY
PY
Tick Objects
If the history method returns Tick objects, iterate through the Tick objects to get each one.
PY
If the history method returns Ticks, iterate through the Ticks to get the Tick of each Index. The Ticks may not
have data for all of your Index subscriptions. To avoid issues, check if the Ticks object contains data for your
Index before you index it with the Index Symbol.
PY
PY
Plot Data
You need some historical Index data to produce plots. You can use many of the supported plotting libraries to
visualize data in various formats. For example, you can plot candlestick and line charts.
Candlestick Chart
PY
PY
import plotly.graph_objects as go
3. Create a Candlestick .
PY
candlestick = go.Candlestick(x=history.index,
open=history['open'],
high=history['high'],
low=history['low'],
close=history['close'])
4. Create a Layout .
PY
5. Create a Figure .
PY
PY
fig.show()
Candlestick charts display the open, high, low, and close prices of the security.
Line Chart
PY
history = qb.history([spx, vix], datetime(2021, 11, 24), datetime(2021, 12, 8), Resolution.DAILY)
PY
pct_change = history['close'].unstack(0).pct_change().dropna()
PY
PY
plt.show()
Line charts display the value of the property you selected in a time series.
Datasets > Index Options
Datasets
Index Options
Index Options
Key Concepts
Introduction
An Index Option is a financial derivative that gives the holder the right (but not the obligation) to buy or sell the
value of an underlying Index, such as the S&P 500 index, at the stated exercise price. No actual assets are bought
or sold. This page explains the basics of Index Option data in the Research Environment. To get some data, see
Universes or Individual Contracts. For more information about the specific datasets we use, see the US Index Options dataset listing.
Resolutions
The following resolutions are available for Index Option contract subscriptions: Tick, Second, Minute, Hour, and Daily.
Markets
Data Normalization
The data normalization mode doesn't affect data from history requests. If you change the data normalization mode,
it won't change the outcome.
Datasets > Index Options > Universes
Index Options
Universes
Introduction
This page explains how to request historical data for a universe of Index Option contracts.
Create Subscriptions
1. Create a QuantBook .
PY
qb = QuantBook()
PY
3. Call the add_index_option method with the underlying Index Symbol and, if you want non-standard Index
Options, the target Option ticker.
PY
option = qb.add_index_option(index_symbol)
Price History
The contract filter determines which Index Option contracts are in your universe each trading day. To adjust the
default filter, call the set_filter method.
PY
# Set the contract filter to select contracts that have the strike price
# within 1 strike level and expire within 90 days.
option.set_filter(-1, 1, 0, 90)
To get the prices and volumes for all of the Index Option contracts that pass your filter during a specific period of
time, call the option_history method with the underlying Index Symbol object, a start datetime , and an end
datetime .
PY
option_history = qb.option_history(
index_symbol, datetime(2024, 1, 1), datetime(2024, 1, 5), Resolution.MINUTE,
fill_forward=False, extended_market_hours=False
)
To convert the OptionHistory object to a DataFrame that contains the trade and quote information of each
contract, use the data_frame property.
PY
option_history.data_frame
To get the expiration dates of all the contracts in an OptionHistory object, call the get_expiry_dates method.
PY
option_history.get_expiry_dates()
To get the strike prices of all the contracts in an OptionHistory object, call the get_strikes method.
PY
option_history.get_strikes()
Daily Price and Greeks History
To get daily data on all the tradable contracts for a given date, call the history method with the canonical Option
Symbol, a start date, and an end date. This method returns the entire Option chain for each trading day, not the
subset of contracts that pass your universe filter. The daily Option chains contain the prices, volume, open
interest, implied volatility, and Greeks of each contract.
PY
# DataFrame format
history_df = qb.history(option.symbol, datetime(2024, 1, 1), datetime(2024, 1, 5), flatten=True)
# OptionUniverse objects
history = qb.history[OptionUniverse](option.symbol, datetime(2024, 1, 1), datetime(2024, 1, 5))
for chain in history:
    end_time = chain.end_time
    filtered_chain = [contract for contract in chain if contract.greeks.delta > 0.3]
    for contract in filtered_chain:
        price = contract.price
        iv = contract.implied_volatility
The method represents each contract with an OptionUniverse object, which has the following properties:
Datasets > Index Options > Individual Contracts
Index Options
Individual Contracts
Introduction
This page explains how to request historical data for individual Index Option contracts. The history requests on this
page only return the prices and open interest of the Option contracts, not their implied volatility or Greeks. For
information about history requests that return the daily implied volatility and Greeks, see Universes .
Create Subscriptions
1. Create a QuantBook .
PY
qb = QuantBook()
PY
3. Set the start date to a date in the past that you want to use as the analysis date.
PY
qb.set_start_date(2024, 1, 1)
The method that you call in the next step returns data on all the contracts that were tradable on this date.
4. Call the option_chain method to get the contracts that were tradable on the analysis date.
PY
# Get the Option contracts that were tradable on January 1st, 2024.
# Option A: Standard contracts.
chain = qb.option_chain(
Symbol.create_canonical_option(underlying_symbol, Market.USA, "?SPX"), flatten=True
).data_frame
5. Sort and filter the data to select the specific contract(s) you want to analyze.
PY
# Select a contract.
expiry = chain.expiry.min()
contract_symbol = chain[
# Select call contracts with the closest expiry.
(chain.expiry == expiry) &
(chain.right == OptionRight.CALL) &
# Select contracts with a 0.3-0.7 delta.
(chain.delta > 0.3) &
(chain.delta < 0.7)
# Select the contract with the largest open interest.
].sort_values('openinterest').index[-1]
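Step 5's boolean filtering is plain pandas, so the same select-sort-take-last pattern can be sketched on a toy chain; the column names mirror the flattened chain DataFrame, but the contracts and values below are invented:

```python
import pandas as pd

# Toy stand-in for the flattened Option chain DataFrame.
chain = pd.DataFrame({
    "expiry": pd.to_datetime(["2024-01-19", "2024-01-19", "2024-02-16"]),
    "right": ["CALL", "CALL", "CALL"],   # stand-in for OptionRight values
    "delta": [0.55, 0.35, 0.50],
    "openinterest": [120, 480, 300],
}, index=["SPX_C_4700", "SPX_C_4800", "SPX_C_4750"])  # hypothetical contract ids

# Closest expiry, calls only, 0.3 < delta < 0.7, then largest open interest.
expiry = chain.expiry.min()
contract_symbol = chain[
    (chain.expiry == expiry)
    & (chain.right == "CALL")
    & (chain.delta > 0.3)
    & (chain.delta < 0.7)
].sort_values("openinterest").index[-1]
print(contract_symbol)  # the in-range contract with the most open interest
```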
6. Call the add_index_option_contract method with an OptionContract Symbol and disable fill-forward.
PY
Disable fill-forward because there are only a few OpenInterest data points per day.
Trade History
TradeBar objects are price bars that consolidate individual trades from the exchanges. They contain the open,
high, low, close, and volume of trading activity over a period of time.
To get trade data, call the history or history[TradeBar] method with the contract Symbol object(s).
PY
# DataFrame format
history_df = qb.history(TradeBar, contract_symbol, timedelta(3))
display(history_df)
# TradeBar objects
history = qb.history[TradeBar](contract_symbol, timedelta(3))
for trade_bar in history:
    print(trade_bar)
Quote History
QuoteBar objects are bars that consolidate NBBO quotes from the exchanges. They contain the open, high, low,
and close prices of the bid and ask. The open , high , low , and close properties of the QuoteBar object are the
mean of the respective bid and ask prices. If the bid or ask portion of the QuoteBar has no data, the open , high ,
low , and close properties of the QuoteBar copy the values of either the bid or ask instead of taking their mean.
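The bid/ask fallback rule described above can be sketched as a plain helper function; this is an illustration of the rule, not the QuoteBar API itself:

```python
def mid_or_side(bid_close, ask_close):
    """Mean of the bid and ask closes; fall back to whichever side has data."""
    if bid_close is None and ask_close is None:
        return None
    if bid_close is None:
        return ask_close
    if ask_close is None:
        return bid_close
    return (bid_close + ask_close) / 2

print(mid_or_side(99.0, 101.0))  # both sides present -> 100.0
print(mid_or_side(None, 101.0))  # ask only -> 101.0
```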
To get quote data, call the history or history[QuoteBar] method with the contract Symbol object(s).
PY
# DataFrame format
history_df = qb.history(QuoteBar, contract_symbol, timedelta(3))
display(history_df)
# QuoteBar objects
history = qb.history[QuoteBar](contract_symbol, timedelta(3))
for quote_bar in history:
    print(quote_bar)
QuoteBar objects have the following properties:
Open Interest History
Open interest is the number of outstanding contracts that haven't been settled. It provides a measure of investor
interest and market liquidity, so it's a popular metric to use for contract selection. Open interest is calculated
once per day.
To get open interest data, call the history or history[OpenInterest] method with the contract Symbol object(s).
PY
# DataFrame format
history_df = qb.history(OpenInterest, contract_symbol, timedelta(3))
display(history_df)
# OpenInterest objects
history = qb.history[OpenInterest](contract_symbol, timedelta(3))
for open_interest in history:
    print(open_interest)
Greeks and Implied Volatility History
The Greeks are measures that describe the sensitivity of an Option's price to various factors like underlying price
changes (Delta), time decay (Theta), volatility (Vega), and interest rates (Rho), while Implied Volatility (IV)
represents the market's expectation of the underlying asset's volatility over the life of the Option.
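As a rough, self-contained illustration of Delta only, here is the closed-form Black-Scholes call delta; this is a textbook sketch, not the pricing model used in the steps below, and all inputs are invented:

```python
from math import erf, exp, log, sqrt

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call_delta(spot, strike, t, r, sigma, q=0.0):
    """Black-Scholes delta of a European call (illustrative sketch only)."""
    d1 = (log(spot / strike) + (r - q + 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
    return exp(-q * t) * norm_cdf(d1)

# An at-the-money call with positive rates has delta slightly above 0.5.
delta = bs_call_delta(spot=100, strike=100, t=0.25, r=0.05, sigma=0.2)
print(round(delta, 3))
```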
1. Get the Option contract's mirror contract Symbol.
PY
mirror_contract_symbol = Symbol.create_option(
option_contract.underlying.symbol, contract_symbol.id.market, option_contract.style,
OptionRight.CALL if option_contract.right == OptionRight.PUT else OptionRight.PUT,
option_contract.strike_price, option_contract.expiry
)
2. Set up the risk free interest rate , dividend yield , and Option pricing models.
In our research , we found the Forward Tree model to be the best pricing model for indicators.
PY
risk_free_rate_model = qb.risk_free_interest_rate_model
dividend_yield_model = DividendYieldProvider(underlying_symbol)
option_model = OptionPricingModelType.FORWARD_TREE
3. Define a method to return the IV & Greeks indicator values for each contract.
PY
return pd.DataFrame({
'iv_call': get_values(ImpliedVolatility, call, put),
'iv_put': get_values(ImpliedVolatility, put, call),
'delta_call': get_values(Delta, call, put),
'delta_put': get_values(Delta, put, call),
'gamma_call': get_values(Gamma, call, put),
'gamma_put': get_values(Gamma, put, call),
'rho_call': get_values(Rho, call, put),
'rho_put': get_values(Rho, put, call),
'vega_call': get_values(Vega, call, put),
'vega_put': get_values(Vega, put, call),
'theta_call': get_values(Theta, call, put),
'theta_put': get_values(Theta, put, call),
})
PY
The DataFrame can have NaN entries if there is no data for the contracts or the underlying asset at a moment in
time.
Examples
The following examples demonstrate some common practices for analyzing individual Index Option contracts in
the Research Environment.
Example 1: Contract Trade History
The following notebook plots the historical prices of an SPX Index Option contract using Plotly :
PY
import plotly.graph_objects as go
Example 2: Contract Open Interest History
The following notebook plots the historical open interest of a VIXW Index Option contract using Matplotlib :
PY
Datasets > Alternative Data
Datasets
Alternative Data
Introduction
This page explains how to request, manipulate, and visualize historical alternative data. This tutorial uses the CBOE
VIX dataset as the example dataset.
Create Subscriptions
Follow these steps to subscribe to an alternative dataset from the Dataset Market :
1. Create a QuantBook .
PY
qb = QuantBook()
2. Call the add_data method with the dataset class, a ticker, and a resolution and then save a reference to the
dataset Symbol.
PY
To view the arguments that the add_data method accepts for each dataset, see the dataset listing .
If you don't pass a resolution argument, the default resolution of the dataset is used by default. To view the
supported resolutions and the default resolution of each dataset, see the dataset listing .
You need a subscription before you can request historical data for a dataset. On the time dimension, you can
request an amount of historical data based on a trailing number of bars, a trailing period of time, or a defined
period of time. On the dataset dimension, you can request historical data for a single dataset subscription, a subset
of the dataset subscriptions you created in your notebook, or all of the dataset subscriptions in your notebook.
To get historical data for a number of trailing bars, call the history method with the Symbol object(s) and an
integer.
PY
# DataFrame
single_history_df = qb.history(vix, 10)
subset_history_df = qb.history([vix, v3m], 10)
all_history_df = qb.history(qb.securities.keys(), 10)
# Slice objects
all_history_slice = qb.history(10)
# CBOE objects
single_history_data_objects = qb.history[CBOE](vix, 10)
subset_history_data_objects = qb.history[CBOE]([vix, v3m], 10)
all_history_data_objects = qb.history[CBOE](qb.securities.keys(), 10)
The preceding calls return the most recent bars, excluding periods of time when the exchange was closed.
To get historical data for a trailing period of time, call the history method with the Symbol object(s) and a
timedelta .
PY
# DataFrame
single_history_df = qb.history(vix, timedelta(days=3))
subset_history_df = qb.history([vix, v3m], timedelta(days=3))
all_history_df = qb.history(qb.securities.keys(), timedelta(days=3))
# Slice objects
all_history_slice = qb.history(timedelta(days=3))
# CBOE objects
single_history_data_objects = qb.history[CBOE](vix, timedelta(days=3))
subset_history_data_objects = qb.history[CBOE]([vix, v3m], timedelta(days=3))
all_history_data_objects = qb.history[CBOE](qb.securities.keys(), timedelta(days=3))
The preceding calls return the most recent bars or ticks, excluding periods of time when the exchange was closed.
To get historical data for a specific period of time, call the history method with the Symbol object(s), a start
datetime , and an end datetime . The start and end times you provide are based in the notebook time zone .
PY
start_time = datetime(2021, 1, 1)
end_time = datetime(2021, 3, 1)
# DataFrame
single_history_df = qb.history(vix, start_time, end_time)
subset_history_df = qb.history([vix, v3m], start_time, end_time)
all_history_df = qb.history(qb.securities.keys(), start_time, end_time)
# Slice objects
all_history_slice = qb.history(start_time, end_time)
# CBOE objects
single_history_data_objects = qb.history[CBOE](vix, start_time, end_time)
subset_history_data_objects = qb.history[CBOE]([vix, v3m], start_time, end_time)
all_history_data_objects = qb.history[CBOE](qb.securities.keys(), start_time, end_time)
The preceding calls return the bars or ticks that have a timestamp within the defined period of time.
If you do not pass a resolution to the history method, the history method uses the resolution that the add_data
method used when you created the subscription .
Wrangle Data
You need some historical data to perform wrangling operations. The process to manipulate the historical data
depends on its data type. To display pandas objects, run a cell in a notebook with the pandas object as the last line.
DataFrame Objects
If the history method returns a DataFrame , the first level of the DataFrame index is the encoded dataset Symbol
and the second level is the end_time of the data sample. The columns of the DataFrame are the data properties.
To select the historical data of a single dataset, index the loc property of the DataFrame with the dataset Symbol .
PY
all_history_df.loc[vix] # or all_history_df.loc['VIX']
PY
all_history_df.loc[vix]['close']
If you request historical data for multiple tickers, you can transform the DataFrame so that it's a time series of close
values for all of the tickers. To transform the DataFrame , select the column you want to display for each ticker and
then call the unstack method.
PY
all_history_df['close'].unstack(level=0)
The DataFrame is transformed so that the column indices are the Symbol of each ticker and each row contains the
close value.
Slice Objects
If the history method returns Slice objects, iterate through the Slice objects to get each one. The Slice objects
may not have data for all of your dataset subscriptions. To avoid issues, check if the Slice contains data for your
ticker before you index it with the dataset Symbol .
Plot Data
You need some historical alternative data to produce plots. You can use many of the supported plotting libraries to
visualize data in various formats. For example, you can plot candlestick and line charts.
Candlestick Chart
You can only create candlestick charts for alternative datasets that have open, high, low, and close properties.
PY
import plotly.graph_objects as go
3. Create a Candlestick .
PY
candlestick = go.Candlestick(x=history.index,
open=history['open'],
high=history['high'],
low=history['low'],
close=history['close'])
4. Create a Layout .
PY
5. Create a Figure .
PY
PY
fig.show()
Candlestick charts display the open, high, low, and close prices of the alternative data.
Line Chart
PY
PY
values = history['close'].unstack(0)
PY
PY
plt.show()
Line charts display the value of the property you selected in a time series.
Datasets > Custom Data
Datasets
Custom Data
Introduction
This page explains how to request, manipulate, and visualize historical user-defined custom data.
Define Custom Data
The data in the file must be in chronological order before you define the custom data class.
To define a custom data class, extend the PythonData class and override the get_source and reader methods.
PY
class Nifty(PythonData):
    '''NIFTY Custom Data Class'''

    def get_source(self, config: SubscriptionDataConfig, date: datetime, is_live_mode: bool) -> SubscriptionDataSource:
        url = "http://cdn.quantconnect.com.s3.us-east-1.amazonaws.com/uploads/CNXNIFTY.csv"
        return SubscriptionDataSource(url, SubscriptionTransportMedium.REMOTE_FILE)

    def reader(self, config: SubscriptionDataConfig, line: str, date: datetime, is_live_mode: bool) -> BaseData:
        # Skip header rows and blank lines.
        if not (line.strip() and line[0].isdigit()):
            return None
        index = Nifty()
        index.symbol = config.symbol
        try:
            # Example File Format:
            # Date,       Open    High    Low     Close   Volume     Turnover
            # 2011-09-13, 7792.9, 7799.9, 7722.65, 7748.7, 116534670, 6107.78
            data = line.split(',')
            index.time = datetime.strptime(data[0], "%Y-%m-%d")
            index.end_time = index.time + timedelta(days=1)
            index.value = float(data[4])
            index["Open"] = float(data[1])
            index["High"] = float(data[2])
            index["Low"] = float(data[3])
            index["Close"] = float(data[4])
        except ValueError:
            # Skip malformed rows.
            return None
        return index
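Stripped of the LEAN types, the reader's parsing logic amounts to the plain function below; the helper name and the dict it returns are ours for illustration, not part of the API:

```python
from datetime import datetime, timedelta

def parse_nifty_line(line):
    """Parse one 'Date,Open,High,Low,Close,...' CSV row into a dict, or None."""
    if not (line.strip() and line[0].isdigit()):
        return None  # skip headers and blank lines, like the reader does
    data = line.split(",")
    time = datetime.strptime(data[0], "%Y-%m-%d")
    return {
        "time": time,
        "end_time": time + timedelta(days=1),  # daily bar spans one day
        "open": float(data[1]),
        "high": float(data[2]),
        "low": float(data[3]),
        "close": float(data[4]),
        "value": float(data[4]),
    }

row = parse_nifty_line("2011-09-13,7792.9,7799.9,7722.65,7748.7,116534670,6107.78")
print(row["close"])  # 7748.7
```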
Create Subscriptions
You need to define a custom data class before you can subscribe to it.
1. Create a QuantBook .
PY
qb = QuantBook()
2. Call the add_data method with a ticker and then save a reference to the data Symbol .
PY
Custom data has its own resolution, so you don't need to specify it.
You need a subscription before you can request historical data for a security. You can request an amount of
historical data based on a trailing number of bars, a trailing period of time, or a defined period of time.
Before you request data, call the set_start_date method with a datetime to reduce the risk of look-ahead bias .
PY
qb.set_start_date(2014, 7, 29)
If you call the set_start_date method, the date that you pass to the method is the latest date for which your
history requests return data.
Call the history method with a symbol, integer, and resolution to request historical data based on the given
number of trailing bars and resolution.
PY
This method returns the most recent bars, excluding periods of time when the exchange was closed.
Call the history method with a symbol, timedelta , and resolution to request historical data based on the given
trailing period of time and resolution.
PY
This method returns the most recent bars, excluding periods of time when the exchange was closed.
Call the history method with a symbol, start datetime , end datetime , and resolution to request historical data
based on the defined period of time and resolution. The start and end times you provide are based in the notebook
time zone .
PY
This method returns the bars that are timestamped within the defined period of time.
In all of the cases above, the history method returns a DataFrame with a MultiIndex .
Download Method
To download the data directly from the remote file location instead of using your custom data class, call the
download method with the data URL.
PY
content = qb.download("http://cdn.quantconnect.com.s3.us-east-1.amazonaws.com/uploads/CNXNIFTY.csv")
PY
2. Create a StringIO .
PY
data = StringIO(content)
PY
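Once download returns the file contents as a string, pandas can parse it from a StringIO; the two-row CSV text below is a synthetic stand-in for the real file contents:

```python
from io import StringIO
import pandas as pd

# Synthetic stand-in for the string that a download call would return.
content = (
    "Date,Open,High,Low,Close\n"
    "2011-09-13,7792.9,7799.9,7722.65,7748.7\n"
    "2011-09-14,7750.0,7810.0,7740.0,7790.3\n"
)

# Wrap the string in a file-like object and parse it into a DataFrame.
data = StringIO(content)
df = pd.read_csv(data, index_col="Date", parse_dates=True)
print(df["Close"].iloc[0])  # 7748.7
```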
Wrangle Data
You need some historical data to perform wrangling operations. To display pandas objects, run a cell in a notebook
with the pandas object as the last line. To display other data formats, call the print method.
The DataFrame that the history method returns has the following index levels:
1. Dataset Symbol
2. The end_time of the data sample
To select the data of a single dataset, index the loc property of the DataFrame with the data Symbol .
PY
history.loc[symbol]
To select a column of the DataFrame , index it with the column name.
PY
history.loc[symbol]['close']
Plot Data
You need some historical custom data to produce plots. You can use many of the supported plotting libraries to
visualize data in various formats. For example, you can plot candlestick and line charts.
Candlestick Chart
PY
import plotly.graph_objects as go
3. Create a Candlestick .
PY
candlestick = go.Candlestick(x=history.index,
open=history['open'],
high=history['high'],
low=history['low'],
close=history['close'])
4. Create a Layout .
PY
5. Create a Figure .
PY
PY
fig.show()
Candlestick charts display the open, high, low, and close prices of the security.
Line Chart
PY
values = history['value'].unstack(level=0)
PY
PY
plt.show()
Line charts display the value of the property you selected in a time series.
Charting
Charting
The Research Environment is centered around analyzing and understanding data. One way to gain a more intuitive
understanding of the existing relationships in our data is to visualize it using charts. There are many different
libraries that allow you to chart our data in different ways. Sometimes the right chart can illuminate an interesting
relationship in the data. Click one of the following libraries to learn more about it:
Bokeh
Matplotlib
Plotly
Seaborn
Plotly NET
See Also
Supported Libraries
Algorithm Charting
Charting > Bokeh
Charting
Bokeh
Introduction
bokeh is a Python library you can use to create interactive visualizations. It helps you build beautiful graphics,
ranging from simple plots to complex dashboards with streaming datasets. With bokeh , you can create JavaScript-powered visualizations without writing any JavaScript.
Import Libraries
PY
PY
output_notebook()
PY
import numpy as np
Get some historical market data to produce the plots. For example, to get data for a bank sector ETF and some
banking securities, run:
PY
qb = QuantBook()
tickers = ["XLF", # Financial Select Sector SPDR Fund
"COF", # Capital One Financial Corporation
"GS", # Goldman Sachs Group, Inc.
"JPM", # J P Morgan Chase & Co
"WFC"] # Wells Fargo & Company
symbols = [qb.add_equity(ticker, Resolution.DAILY).symbol for ticker in tickers]
history = qb.history(symbols, datetime(2021, 1, 1), datetime(2022, 1, 1))
Create Candlestick Chart
You must import the plotting libraries and get some historical data to create candlestick charts.
In this example, you create a candlestick chart that shows the open, high, low, and close prices of one of the
banking securities. Follow these steps to create the candlestick chart:
1. Select a Symbol .
PY
symbol = symbols[0]
2. Slice the history DataFrame with the symbol .
PY
data = history.loc[symbol]
3. Divide the data into days with positive returns and days with negative returns.
PY
4. Call the figure function with a title, axis labels and x-axis type.
PY
5. Call the segment method with the data timestamps, high prices, low prices, and a color.
PY
6. Call the vbar method for the up and down days with the data timestamps, open prices, close prices, and a
color.
PY
width = 12*60*60*1000
plot.vbar(up_days.index, width, up_days['open'], up_days['close'],
fill_color="green", line_color="green")
plot.vbar(down_days.index, width, down_days['open'], down_days['close'],
fill_color="red", line_color="red")
show(plot)
Create Line Chart
You must import the plotting libraries and get some historical data to create line charts.
In this example, you create a line chart that shows the closing price for one of the banking securities. Follow these
steps to create the line chart:
1. Select a Symbol .
PY
symbol = symbols[0]
2. Slice the history DataFrame with the symbol and then select the close column.
PY
close_prices = history.loc[symbol]['close']
3. Call the figure function with a title, axis labels, and x-axis type.
PY
4. Call the line method with the timestamps, close_prices , and some design settings.
PY
plot.line(close_prices.index, close_prices,
legend_label=symbol.value, color="blue", line_width=2)
PY
show(plot)
Create Scatter Plot
You must import the plotting libraries and get some historical data to create scatter plots.
In this example, you create a scatter plot that shows the relationship between the daily returns of two banking
securities. Follow these steps to create the scatter plot:
1. Select 2 Symbol s.
For example, to select the Symbol s of the first 2 bank stocks, run:
PY
symbol1 = symbols[1]
symbol2 = symbols[2]
2. Slice the history DataFrame with each Symbol and then select the close column.
PY
close_price1 = history.loc[symbol1]['close']
close_price2 = history.loc[symbol2]['close']
3. Call the pct_change method on each Series and then call the dropna method.
PY
daily_returns1 = close_price1.pct_change().dropna()
daily_returns2 = close_price2.pct_change().dropna()
4. Call the polyfit method with the daily_returns1 , daily_returns2 , and a degree.
PY
This method call returns the slope and intercept of the ordinary least squares regression line.
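A quick self-contained check of what polyfit with degree 1 returns; the data below is noise-free synthetic input, so the slope and intercept are recovered exactly:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0  # exactly linear, so OLS recovers the coefficients

# Degree-1 polyfit returns (slope, intercept) of the least-squares line.
m, b = np.polyfit(x, y, deg=1)
print(round(m, 6), round(b, 6))  # 2.0 1.0
```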
5. Call the linspace method with the minimum and maximum values on the x-axis.
PY
x = np.linspace(daily_returns1.min(), daily_returns1.max())
6. Calculate the y-axis values of the regression line.
PY
y = m*x + b
PY
8. Call the line method with x- and y-axis values, a color, and a line width.
PY
PY
PY
show(plot)
Create Histogram
You must import the plotting libraries and get some historical data to create histograms.
In this example, you create a histogram that shows the distribution of the daily percent returns of the bank sector
ETF. In addition to the bins in the histogram, you overlay a normal distribution curve for comparison. Follow these
steps to create the histogram:
1. Select a Symbol .
PY
symbol = symbols[0]
2. Slice the history DataFrame with the symbol and then select the close column.
PY
close_prices = history.loc[symbol]['close']
3. Call the pct_change method and then call the dropna method.
PY
daily_returns = close_prices.pct_change().dropna()
4. Call the histogram method with the daily_returns , the density argument enabled, and a number of bins.
PY
hist : The value of the probability density function at each bin, normalized such that the integral over the
range is 1.
edges : The x-axis value of the edges of each bin.
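The density normalization described above can be verified with NumPy alone; the returns below are synthetic draws, not market data:

```python
import numpy as np

# Synthetic daily returns drawn from a normal distribution.
rng = np.random.default_rng(0)
daily_returns = rng.normal(loc=0.0, scale=0.01, size=1000)

# density=True scales bin heights so the area under the histogram is 1.
hist, edges = np.histogram(daily_returns, density=True, bins=20)

# Riemann sum of height * bin width recovers the total probability mass.
area = np.sum(hist * np.diff(edges))
print(round(area, 6))  # 1.0
```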
PY
Call the quad method with the coordinates of the bins and some design settings.
PY
PY
mean = daily_returns.mean()
std = daily_returns.std()
Call the linspace method with the lower limit, upper limit, and number data points for the x-axis of the normal
distribution curve.
PY
PY
Call the line method with the data and style of the normal distribution PDF curve.
PY
PY
show(plot)
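The numerical pieces of this recipe, plotting aside, can be sketched with NumPy alone (synthetic data, hypothetical parameter choices):

```python
import numpy as np

rng = np.random.default_rng(seed=0)
daily_returns = rng.normal(loc=0.0005, scale=0.01, size=250)  # synthetic daily returns

# Probability-density histogram: the integral of hist over the bins is 1
hist, edges = np.histogram(daily_returns, density=True, bins=20)
bin_widths = np.diff(edges)

# Normal distribution curve with the sample mean and standard deviation
mean = daily_returns.mean()
std = daily_returns.std()
x = np.linspace(mean - 3 * std, mean + 3 * std, 1000)
pdf = 1 / (std * np.sqrt(2 * np.pi)) * np.exp(-(x - mean) ** 2 / (2 * std ** 2))
```

`hist`/`edges` feed the quad glyph and `x`/`pdf` feed the line glyph.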
Create Bar Chart
You must import the plotting libraries and get some historical data to create bar charts.
In this example, you create a bar chart that shows the average daily percent return of the banking securities.
Follow these steps to create the bar chart:
1. Select the close column and then call the unstack method.
PY
close_prices = history['close'].unstack(level=0)
2. Call the pct_change method.
PY
daily_returns = close_prices.pct_change()
3. Call the mean method.
PY
avg_daily_returns = daily_returns.mean()
4. Call the DataFrame constructor with the data Series and then call the reset_index method.
PY
avg_daily_returns = pd.DataFrame(avg_daily_returns).reset_index()
5. Call the figure function with a title, x-axis values, and axis labels.
PY
6. Call the vbar method with the avg_daily_returns , x- and y-axis column names, and a bar width.
PY
7. Rotate the x-axis label and then call the show function.
PY
plot.xaxis.major_label_orientation = 0.6
show(plot)
You must import the plotting libraries and get some historical data to create heat maps.
In this example, you create a heat map that shows the correlation between the daily returns of the banking
securities. Follow these steps to create the heat map:
1. Select the close column and then call the unstack method.
PY
close_prices = history['close'].unstack(level=0)
2. Call the pct_change method.
PY
daily_returns = close_prices.pct_change()
3. Call the corr method.
PY
corr_matrix = daily_returns.corr()
4. Set the index and columns of the corr_matrix to the ticker of each security and then set the names of the index and columns.
PY
corr_matrix.index = corr_matrix.columns = [symbol.value for symbol in corr_matrix.columns]
corr_matrix.index.name = 'symbol'
corr_matrix.columns.name = 'stocks'
5. Call the stack method, name the Series value , and then call the reset_index method.
PY
corr_matrix = corr_matrix.stack().rename("value").reset_index()
6. Call the figure function with a title, axis ticks, and some design settings.
PY
plot = figure(title=f"Banking Stocks and Bank Sector ETF Correlation Heat Map",
x_range=list(corr_matrix.symbol.drop_duplicates()),
y_range=list(corr_matrix.stocks.drop_duplicates()),
toolbar_location=None,
tools="",
x_axis_location="above")
7. Select a color palette and then call the LinearColorMapper constructor with the color palette, the minimum correlation, and the maximum correlation.
PY
colors = Category20c[len(corr_matrix.columns)]
mapper = LinearColorMapper(palette=colors, low=corr_matrix.value.min(),
high=corr_matrix.value.max())
8. Call the rect method with the correlation plot data and design settings.
PY
plot.rect(source=ColumnDataSource(corr_matrix),
x="stocks",
y="symbol",
width=1,
height=1,
line_color=None,
fill_color=transform('value', mapper))
9. Call the ColorBar constructor with the mapper , a location, and a BaseTicker .
PY
color_bar = ColorBar(color_mapper=mapper,
location=(0, 0),
ticker=BasicTicker(desired_num_ticks=len(colors)))
This snippet creates a color bar to represent the correlation coefficients of the heat map cells.
10. Call the add_layout method with the color_bar and a location.
PY
plot.add_layout(color_bar, 'right')
This method call plots the color bar to the right of the heat map.
11. Call the show function.
PY
show(plot)
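The reshaping in the middle of this recipe is plain pandas; a sketch with made-up tickers showing how stack/rename/reset_index produce the symbol, stocks, and value columns the rect glyph expects:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=1)
tickers = ['XLF', 'COF', 'GS']  # hypothetical tickers
daily_returns = pd.DataFrame(rng.normal(size=(100, 3)), columns=tickers)

# Square correlation matrix of the three return series
corr_matrix = daily_returns.corr()

# Name the row and column indexes so reset_index produces named columns
corr_matrix.index.name = 'symbol'
corr_matrix.columns.name = 'stocks'

# Long format: one row per (symbol, stocks) pair with its correlation
long_form = corr_matrix.stack().rename('value').reset_index()
```

Each row of `long_form` becomes one cell of the heat map.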
You must import the plotting libraries and get some historical data to create pie charts.
In this example, you create a pie chart that shows the weights of the banking securities in a portfolio if you allocate
to them based on their inverse volatility. Follow these steps to create the pie chart:
1. Select the close column and then call the unstack method.
PY
close_prices = history['close'].unstack(level=0)
2. Call the pct_change method.
PY
daily_returns = close_prices.pct_change()
3. Call the var method, take the inverse, and then normalize the result.
PY
inverse_variance = 1 / daily_returns.var()
inverse_variance /= np.sum(inverse_variance) # Normalization
inverse_variance *= np.pi*2 # For a full circle circumference in radian
4. Call the DataFrame constructor with the inverse_variance Series and then call the reset_index method.
PY
inverse_variance = pd.DataFrame(inverse_variance).reset_index()
5. Add a color for each security to the inverse_variance DataFrame .
PY
inverse_variance['color'] = Category20c[len(inverse_variance.index)]
6. Call the figure function with a title.
PY
plot = figure(title='Banking Stocks and Bank Sector ETF Allocation')
7. Call the wedge method with design settings and the inverse_variance DataFrame .
PY
8. Call the show function.
PY
show(plot)
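The weighting logic behind the pie chart, separate from any Bokeh call, can be sketched as follows (synthetic data, hypothetical tickers):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=2)
# Three synthetic return series with increasing volatility
daily_returns = pd.DataFrame(
    rng.normal(scale=[0.01, 0.02, 0.04], size=(250, 3)),
    columns=['XLF', 'COF', 'GS'])

# Inverse-variance weights: lower-volatility assets get larger slices
inverse_variance = 1 / daily_returns.var()
inverse_variance /= inverse_variance.sum()   # normalize so the weights sum to 1
angles = inverse_variance * 2 * np.pi        # wedge angles for a full circle, in radians
```

The cumulative sums of `angles` give the start and end angle of each wedge.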
Charting
Matplotlib
Introduction
matplotlib is the most popular 2D charting library for Python. It allows you to easily create histograms, scatter plots, and various other charts. In addition, pandas is integrated with matplotlib , so you can seamlessly move between data manipulation and data visualization. This makes matplotlib great for quickly producing a chart to visualize your data.
Import Libraries
Follow these steps to import the libraries that you need:
PY
import matplotlib.pyplot as plt
import mplfinance
import numpy as np
Get some historical market data to produce the plots. For example, to get data for a bank sector ETF and some banking companies over 2021, run:
PY
qb = QuantBook()
tickers = ["XLF", # Financial Select Sector SPDR Fund
"COF", # Capital One Financial Corporation
"GS", # Goldman Sachs Group, Inc.
"JPM", # J P Morgan Chase & Co
"WFC"] # Wells Fargo & Company
symbols = [qb.add_equity(ticker, Resolution.DAILY).symbol for ticker in tickers]
history = qb.history(symbols, datetime(2021, 1, 1), datetime(2022, 1, 1))
You must import the plotting libraries and get some historical data to create candlestick charts.
In this example, we'll create a candlestick chart that shows the open, high, low, and close prices of one of the
banking securities. Follow these steps to create the candlestick chart:
1. Select a Symbol .
PY
symbol = symbols[0]
2. Slice the history DataFrame with the symbol .
PY
data = history.loc[symbol]
PY
4. Call the plot method with the data , chart type, style, title, y-axis label, and figure size.
PY
mplfinance.plot(data,
type='candle',
style='charles',
title=f'{symbol.value} OHLC',
ylabel='Price ($)',
figratio=(15, 10))
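mplfinance expects a DataFrame with open/high/low/close columns and a DatetimeIndex; a sketch building one from synthetic minute prices via pandas resample (all data here is made up, not QuantConnect history):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=8)
index = pd.date_range('2021-01-04 09:30', periods=390, freq='min')
prices = pd.Series(100 + np.cumsum(rng.normal(scale=0.05, size=390)), index=index)

# Aggregate minute prices into hourly OHLC bars, the shape mplfinance.plot expects
data = prices.resample('60min').ohlc().dropna()
```

Passing `data` to `mplfinance.plot(data, type='candle')` would then render one candle per hourly bar.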
You must import the plotting libraries and get some historical data to create line charts.
In this example, you create a line chart that shows the closing price for one of the banking securities. Follow these steps to create the line chart:
1. Select a Symbol .
PY
symbol = symbols[0]
2. Slice the history DataFrame with symbol and then select the close column.
PY
data = history.loc[symbol]['close']
3. Call the plot method with a title and the y-axis label.
PY
data.plot(title=f"{symbol} Close Price", ylabel="Price ($)")
You must import the plotting libraries and get some historical data to create scatter plots.
In this example, you create a scatter plot that shows the relationship between the daily returns of two banking securities. Follow these steps to create the scatter plot:
1. Select 2 Symbol s.
For example, to select the Symbol s of the first 2 bank stocks, run:
PY
symbol1 = symbols[1]
symbol2 = symbols[2]
2. Slice the history DataFrame with each Symbol and then select the close column.
PY
close_price1 = history.loc[symbol1]['close']
close_price2 = history.loc[symbol2]['close']
3. Call the pct_change method and then call the dropna method.
PY
daily_returns1 = close_price1.pct_change().dropna()
daily_returns2 = close_price2.pct_change().dropna()
4. Call the polyfit method with the daily_returns1 , daily_returns2 , and a degree.
PY
m, b = np.polyfit(daily_returns1, daily_returns2, deg=1)
This method call returns the slope and intercept of the ordinary least squares regression line.
5. Call the linspace method with the minimum and maximum values on the x-axis.
PY
x = np.linspace(daily_returns1.min(), daily_returns1.max())
6. Calculate the y-axis values of the regression line.
PY
y = m*x + b
7. Call the plot method with the coordinates and color of the regression line.
PY
plt.plot(x, y, color='red')
8. In the same cell that you called the plot method, call the scatter method with the 2 daily return series.
PY
plt.scatter(daily_returns1, daily_returns2)
9. In the same cell that you called the scatter method, call the title , xlabel , and ylabel methods with a title
and axis labels.
PY
plt.title(f'{symbol1} vs {symbol2} Daily % Returns')
plt.xlabel(f'{symbol1} % Returns')
plt.ylabel(f'{symbol2} % Returns');
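For a degree-1 fit, polyfit's slope and intercept match the textbook covariance formulas; a sketch verifying that equivalence on synthetic returns (hypothetical data):

```python
import numpy as np

rng = np.random.default_rng(seed=3)
daily_returns1 = rng.normal(size=200)
daily_returns2 = 0.8 * daily_returns1 + rng.normal(scale=0.1, size=200)

# np.polyfit with deg=1 performs an ordinary least squares line fit
m, b = np.polyfit(daily_returns1, daily_returns2, deg=1)

# Closed-form OLS: slope = cov(x, y) / var(x), intercept = mean(y) - slope * mean(x)
slope = np.cov(daily_returns1, daily_returns2, ddof=1)[0, 1] / np.var(daily_returns1, ddof=1)
intercept = daily_returns2.mean() - slope * daily_returns1.mean()
```

Either pair of coefficients draws the same regression line over the scatter plot.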
You must import the plotting libraries and get some historical data to create histograms.
In this example, you create a histogram that shows the distribution of the daily percent returns of the bank sector
ETF. In addition to the bins in the histogram, you overlay a normal distribution curve for comparison. Follow these
steps to create the histogram:
1. Select the Symbol of the bank sector ETF.
PY
symbol = symbols[0]
2. Slice the history DataFrame with the symbol and then select the close column.
PY
close_prices = history.loc[symbol]['close']
3. Call the pct_change method and then call the dropna method.
PY
daily_returns = close_prices.pct_change().dropna()
4. Calculate the mean and standard deviation of the daily_returns .
PY
mean = daily_returns.mean()
std = daily_returns.std()
5. Call the linspace method with the lower limit, upper limit, and number of data points for the x-axis of the normal
distribution curve.
PY
x = np.linspace(mean - 3*std, mean + 3*std, 1000)
6. Calculate the y-axis values of the normal distribution curve.
PY
pdf = 1/(std * np.sqrt(2*np.pi)) * np.exp(-(x - mean)**2 / (2*std**2))
7. Call the plot method with the data for the normal distribution curve.
PY
plt.plot(x, pdf)
8. In the same cell that you called the plot method, call the hist method with the daily return data and the
number of bins.
PY
plt.hist(daily_returns, bins=20)
9. In the same cell that you called the hist method, call the title , xlabel , and ylabel methods with a title and axis labels.
PY
plt.title(f'{symbol} Daily Return Distribution')
plt.xlabel('Return')
plt.ylabel('Density');
You must import the plotting libraries and get some historical data to create bar charts.
In this example, you create a bar chart that shows the average daily percent return of the banking securities.
Follow these steps to create the bar chart:
1. Select the close column and then call the unstack method.
PY
close_prices = history['close'].unstack(level=0)
2. Call the pct_change method.
PY
daily_returns = close_prices.pct_change()
3. Call the mean method.
PY
avg_daily_returns = daily_returns.mean()
4. Call the figure method with a figure size.
PY
plt.figure(figsize=(15, 10))
5. Call the bar method with the x-axis and y-axis values.
PY
plt.bar(avg_daily_returns.index, avg_daily_returns)
6. In the same cell that you called the bar method, call the title , xlabel , and ylabel methods with a title and axis labels.
PY
plt.title('Banking Stocks Average Daily % Returns')
plt.xlabel('Tickers')
plt.ylabel('%');
You must import the plotting libraries and get some historical data to create heat maps.
In this example, you create a heat map that shows the correlation between the daily returns of the banking securities. Follow these steps to create the heat map:
1. Select the close column and then call the unstack method.
PY
close_prices = history['close'].unstack(level=0)
2. Call the pct_change method.
PY
daily_returns = close_prices.pct_change()
3. Call the corr method.
PY
corr_matrix = daily_returns.corr()
4. Call the imshow method with the correlation matrix, a color map, and an interpolation method.
PY
plt.imshow(corr_matrix, cmap='hot', interpolation='nearest')
5. In the same cell that you called the imshow method, call the title , xticks , and yticks methods with a title and the axis tick labels.
PY
plt.title('Banking Stocks and Bank Sector ETF Correlation Heat Map')
plt.xticks(range(len(corr_matrix.columns)), [str(s) for s in corr_matrix.columns], rotation=90)
plt.yticks(range(len(corr_matrix.index)), [str(s) for s in corr_matrix.index])
6. In the same cell that you called the imshow method, call the colorbar method.
PY
plt.colorbar();
You must import the plotting libraries and get some historical data to create pie charts.
In this example, you create a pie chart that shows the weights of the banking securities in a portfolio if you allocate
to them based on their inverse volatility. Follow these steps to create the pie chart:
1. Select the close column and then call the unstack method.
PY
close_prices = history['close'].unstack(level=0)
2. Call the pct_change method.
PY
daily_returns = close_prices.pct_change()
3. Call the var method and then take the inverse.
PY
inverse_variance = 1 / daily_returns.var()
4. Call the pie method with the inverse_variance Series , the plot labels, and a display format.
PY
plt.pie(inverse_variance, labels=[str(s) for s in inverse_variance.index], autopct='%1.1f%%')
5. In the cell that you called the pie method, call the title method with a title.
PY
plt.title('Banking Stocks and Bank Sector ETF Allocation');
Create 3D Chart
You must import the plotting libraries and get some historical data to create 3D charts.
In this example, you create a 3D chart that shows the price correlation of 3 assets, with the price of one asset on each dimension. Follow these steps to create the 3D chart:
1. Select 3 Symbol s and slice the history DataFrame with each of them.
PY
x, y, z = symbols[:3]
x_hist = history.loc[x].close
y_hist = history.loc[y].close
z_hist = history.loc[z].close
3. Create the figure and a 3D axes object.
PY
fig = plt.figure(figsize=(8, 8))
ax = fig.add_subplot(projection='3d')
4. Call the ax.scatter method with the 3 price series to plot the graph.
PY
ax.scatter(x_hist, y_hist, z_hist)
5. Set the axis labels.
PY
ax.set_xlabel(f"{x} Price")
ax.set_ylabel(f"{y} Price")
ax.set_zlabel(f"{z} Price")
6. Display the 3D chart. Note that you need to zoom the chart to avoid z-axis cut off.
PY
ax.set_box_aspect(None, zoom=0.85)
plt.show()
Charting
Plotly
Introduction
plotly is an online charting tool with a Python API. It offers the ability to create rich and interactive graphs.
Import Libraries
PY
import plotly.express as px
import plotly.graph_objects as go
Get some historical market data to produce the plots. For example, to get data for a bank sector ETF and some banking companies over 2021, run:
PY
qb = QuantBook()
tickers = ["XLF", # Financial Select Sector SPDR Fund
"COF", # Capital One Financial Corporation
"GS", # Goldman Sachs Group, Inc.
"JPM", # J P Morgan Chase & Co
"WFC"] # Wells Fargo & Company
symbols = [qb.add_equity(ticker, Resolution.DAILY).symbol for ticker in tickers]
history = qb.history(symbols, datetime(2021, 1, 1), datetime(2022, 1, 1))
You must import the plotting libraries and get some historical data to create candlestick charts.
In this example, you create a candlestick chart that shows the open, high, low, and close prices of one of the banking securities. Follow these steps to create the candlestick chart:
1. Select a Symbol .
PY
symbol = symbols[0]
2. Slice the history DataFrame with the symbol .
PY
data = history.loc[symbol]
3. Call the Candlestick constructor with the time and open, high, low, and close price Series .
PY
candlestick = go.Candlestick(x=data.index,
open=data['open'],
high=data['high'],
low=data['low'],
close=data['close'])
4. Call the Layout constructor with a title and axis labels.
PY
layout = go.Layout(title=go.layout.Title(text=f'{symbol.value} OHLC'),
                   xaxis_title='Date',
                   yaxis_title='Price',
                   xaxis_rangeslider_visible=False)
5. Call the Figure constructor with the candlestick and layout .
PY
fig = go.Figure(data=[candlestick], layout=layout)
6. Call the show method.
PY
fig.show()
You must import the plotting libraries and get some historical data to create line charts.
In this example, you create a line chart that shows the closing price for one of the banking securities. Follow these steps to create the line chart:
1. Select a Symbol .
PY
symbol = symbols[0]
2. Slice the history DataFrame with the symbol and then select the close column.
PY
data = history.loc[symbol]['close']
3. Call the DataFrame constructor with the data Series and then call the reset_index method.
PY
data = pd.DataFrame(data).reset_index()
4. Call the line method with data , the column names of the x- and y-axis in data , and the plot title.
PY
fig = px.line(data, x='time', y='close', title=f'{symbol} Close Price')
5. Call the show method.
PY
fig.show()
You must import the plotting libraries and get some historical data to create scatter plots.
In this example, you create a scatter plot that shows the relationship between the daily returns of two banking securities. Follow these steps to create the scatter plot:
1. Select 2 Symbol s.
For example, to select the Symbol s of the first 2 bank stocks, run:
PY
symbol1 = symbols[1]
symbol2 = symbols[2]
2. Slice the history DataFrame with each Symbol and then select the close column.
PY
close_price1 = history.loc[symbol1]['close']
close_price2 = history.loc[symbol2]['close']
3. Call the pct_change method and then call the dropna method.
PY
daily_return1 = close_price1.pct_change().dropna()
daily_return2 = close_price2.pct_change().dropna()
4. Call the scatter method with the 2 return Series , the trendline option, and axes labels.
PY
fig = px.scatter(x=daily_return1, y=daily_return2, trendline='ols',
                 labels={'x': f'{symbol1} % Returns', 'y': f'{symbol2} % Returns'})
5. Call the update_layout method with a title.
PY
fig.update_layout(title=f'{symbol1} vs {symbol2} Daily % Returns')
6. Call the show method.
PY
fig.show()
You must import the plotting libraries and get some historical data to create histograms.
In this example, you create a histogram that shows the distribution of the daily percent returns of the bank sector
ETF. Follow these steps to create the histogram:
1. Select the Symbol of the bank sector ETF.
PY
symbol = symbols[0]
2. Slice the history DataFrame with the symbol and then select the close column.
PY
data = history.loc[symbol]['close']
3. Call the pct_change method and then call the dropna method.
PY
daily_returns = data.pct_change().dropna()
4. Call the DataFrame constructor with the data Series and then call the reset_index method.
PY
daily_returns = pd.DataFrame(daily_returns).reset_index()
5. Call the histogram method with the daily_returns DataFrame, the x-axis label, a title, and the number of
bins.
PY
fig = px.histogram(daily_returns, x='close', nbins=20,
                   title=f'{symbol} Daily Return Distribution')
6. Call the show method.
PY
fig.show()
You must import the plotting libraries and get some historical data to create bar charts.
In this example, you create a bar chart that shows the average daily percent return of the banking securities. Follow these steps to create the bar chart:
1. Select the close column and then call the unstack method.
PY
close_prices = history['close'].unstack(level=0)
2. Call the pct_change method.
PY
daily_returns = close_prices.pct_change()
3. Call the mean method.
PY
avg_daily_returns = daily_returns.mean()
4. Call the DataFrame constructor with the avg_daily_returns Series and then call the reset_index method.
PY
avg_daily_returns = pd.DataFrame(avg_daily_returns, columns=['avg_daily_ret']).reset_index()
5. Call the bar method with the avg_daily_returns and the axes column names.
PY
fig = px.bar(avg_daily_returns, x='symbol', y='avg_daily_ret')
6. Call the update_layout method with a title.
PY
fig.update_layout(title='Banking Stocks Average Daily % Returns')
7. Call the show method.
PY
fig.show()
You must import the plotting libraries and get some historical data to create heat maps.
In this example, you create a heat map that shows the correlation between the daily returns of the banking securities. Follow these steps to create the heat map:
1. Select the close column and then call the unstack method.
PY
close_prices = history['close'].unstack(level=0)
2. Call the pct_change method.
PY
daily_returns = close_prices.pct_change()
3. Call the corr method.
PY
corr_matrix = daily_returns.corr()
4. Call the imshow method with the corr_matrix and the axes labels.
PY
fig = px.imshow(corr_matrix,
                x=[str(s) for s in corr_matrix.columns],
                y=[str(s) for s in corr_matrix.index])
5. Call the update_layout method with a title.
PY
fig.update_layout(title='Banking Stocks and Bank Sector ETF Correlation Heat Map')
6. Call the show method.
PY
fig.show()
You must import the plotting libraries and get some historical data to create pie charts.
In this example, you create a pie chart that shows the weights of the banking securities in a portfolio if you allocate
to them based on their inverse volatility. Follow these steps to create the pie chart:
1. Select the close column and then call the unstack method.
PY
close_prices = history['close'].unstack(level=0)
2. Call the pct_change method.
PY
daily_returns = close_prices.pct_change()
3. Call the var method and then take the inverse.
PY
inverse_variance = 1 / daily_returns.var()
4. Call the DataFrame constructor with the inverse_variance Series and then call the reset_index method.
PY
inverse_variance = pd.DataFrame(inverse_variance, columns=['inverse variance']).reset_index()
5. Call the pie method with the inverse_variance DataFrame , the column name of the values, and the column name of the labels.
PY
fig = px.pie(inverse_variance,
             values='inverse variance',
             names='symbol',
             title='Banking Stocks and Bank Sector ETF Allocation')
6. Call the show method.
PY
fig.show()
Create 3D Chart
You must import the plotting libraries and get some historical data to create 3D charts.
In this example, you create a 3D chart that shows the price of an asset on each dimension. Follow these steps to
create the 3D chart:
1. Select 3 Symbol s.
PY
x, y, z = symbols[:3]
2. Call the Scatter3d constructor with the data for the x, y, and z axes.
PY
scatter = go.Scatter3d(
x=history.loc[x].close,
y=history.loc[y].close,
z=history.loc[z].close,
mode='markers',
marker=dict(
size=2,
opacity=0.8
)
)
3. Call the Layout constructor with the axes titles and chart dimensions.
PY
layout = go.Layout(
scene=dict(
xaxis_title=f'{x.value} Price',
yaxis_title=f'{y.value} Price',
zaxis_title=f'{z.value} Price'
),
width=700,
height=700
)
4. Call the Figure constructor with the scatter and layout variables.
PY
fig = go.Figure(data=[scatter], layout=layout)
5. Call the show method.
PY
fig.show()
Charting
Seaborn
Introduction
seaborn is a data visualization library based on matplotlib . It makes it easier to create more complicated plots
and allows us to create much more visually-appealing charts than matplotlib charts.
Import Libraries
Follow these steps to import the libraries that you need:
PY
import seaborn as sns
import matplotlib.pyplot as plt
Get some historical market data to produce the plots. For example, to get data for a bank sector ETF and some
banking companies over 2021, run:
PY
qb = QuantBook()
tickers = ["XLF", # Financial Select Sector SPDR Fund
"COF", # Capital One Financial Corporation
"GS", # Goldman Sachs Group, Inc.
"JPM", # J P Morgan Chase & Co
"WFC"] # Wells Fargo & Company
symbols = [qb.add_equity(ticker, Resolution.DAILY).symbol for ticker in tickers]
history = qb.history(symbols, datetime(2021, 1, 1), datetime(2022, 1, 1))
Seaborn does not currently support candlestick charts. Use one of the other plotting libraries to create candlestick
charts.
You must import the plotting libraries and get some historical data to create line charts.
In this example, you create a line chart that shows the closing price for one of the banking securities. Follow these steps to create the line chart:
1. Select a Symbol .
PY
symbol = symbols[0]
2. Slice the history DataFrame with the symbol and then select the close column.
PY
data = history.loc[symbol]['close']
3. Call the DataFrame constructor with the data Series and then call the reset_index method.
PY
data = pd.DataFrame(data).reset_index()
4. Call the lineplot method with the data Series and the column name of each axis.
PY
plot = sns.lineplot(data=data,
x='time',
y='close')
5. In the same cell that you called the lineplot method, call the set method with the y-axis label and a title.
PY
plot.set(ylabel='Price ($)', title=f'{symbol} Close Price')
You must import the plotting libraries and get some historical data to create scatter plots.
In this example, you create a scatter plot that shows the relationship between the daily returns of two banking securities. Follow these steps to create the scatter plot:
1. Select 2 Symbol s.
For example, to select the Symbol s of the first 2 bank stocks, run:
PY
symbol1 = symbols[1]
symbol2 = symbols[2]
2. Select the close column of the history DataFrame, call the unstack method, and then select the symbol1 and
symbol2 columns.
PY
close_prices = history['close'].unstack(0)[[symbol1, symbol2]]
3. Call the pct_change method and then call the dropna method.
PY
daily_returns = close_prices.pct_change().dropna()
4. Call the regplot method with the daily_returns DataFrame and the column names.
PY
plot = sns.regplot(data=daily_returns,
x=daily_returns.columns[0],
y=daily_returns.columns[1])
5. In the same cell that you called the regplot method, call the set method with the axis labels and a title.
PY
plot.set(xlabel=f'{daily_returns.columns[0]} % Returns',
ylabel=f'{daily_returns.columns[1]} % Returns',
title=f'{symbol1} vs {symbol2} Daily % Returns');
Create Histogram
You must import the plotting libraries and get some historical data to create histograms.
In this example, you create a histogram that shows the distribution of the daily percent returns of the bank sector ETF. Follow these steps to create the histogram:
1. Select the Symbol of the bank sector ETF.
PY
symbol = symbols[0]
2. Slice the history DataFrame with the symbol and then select the close column.
PY
data = history.loc[symbol]['close']
3. Call the pct_change method and then call the dropna method.
PY
daily_returns = data.pct_change().dropna()
4. Call the DataFrame constructor with the daily_returns Series and then call the reset_index method.
PY
daily_returns = pd.DataFrame(daily_returns).reset_index()
5. Call the histplot method with the daily_returns , the close column name, and the number of bins.
PY
plot = sns.histplot(daily_returns, x='close', bins=20)
6. In the same cell that you called the histplot method, call the set method with the axis labels and a title.
PY
plot.set(xlabel='Return',
ylabel='Frequency',
title=f'{symbol} Daily Return of Close Price Distribution');
You must import the plotting libraries and get some historical data to create bar charts.
In this example, you create a bar chart that shows the average daily percent return of the banking securities. Follow these steps to create the bar chart:
1. Select the close column and then call the unstack method.
PY
close_prices = history['close'].unstack(level=0)
2. Call the pct_change method.
PY
daily_returns = close_prices.pct_change()
3. Call the mean method.
PY
avg_daily_returns = daily_returns.mean()
4. Call the DataFrame constructor with the avg_daily_returns Series and then call the reset_index method.
PY
avg_daily_returns = pd.DataFrame(avg_daily_returns, columns=['avg_daily_ret']).reset_index()
5. Call the barplot method with the avg_daily_returns DataFrame and the axes column names.
PY
plot = sns.barplot(data=avg_daily_returns, x='symbol', y='avg_daily_ret')
6. In the same cell that you called the barplot method, call the set method with the axis labels and a title.
PY
plot.set(xlabel='Tickers',
ylabel='%',
title='Banking Stocks Average Daily % Returns')
7. In the same cell that you called the set method, call the tick_params method to rotate the x-axis labels.
PY
plot.tick_params(axis='x', rotation=90)
You must import the plotting libraries and get some historical data to create heat maps.
In this example, you create a heat map that shows the correlation between the daily returns of the banking securities. Follow these steps to create the heat map:
1. Select the close column and then call the unstack method.
PY
close_prices = history['close'].unstack(level=0)
2. Call the pct_change method.
PY
daily_returns = close_prices.pct_change()
3. Call the corr method.
PY
corr_matrix = daily_returns.corr()
4. Call the heatmap method with the corr_matrix and the annotation argument enabled.
PY
plot = sns.heatmap(corr_matrix, annot=True)
5. In the same cell that you called the heatmap method, call the set method with a title.
PY
plot.set(title='Banking Stocks and Bank Sector ETF Correlation Heat Map');
You must import the plotting libraries and get some historical data to create pie charts.
In this example, you create a pie chart that shows the weights of the banking securities in a portfolio if you allocate
to them based on their inverse volatility. Follow these steps to create the pie chart:
1. Select the close column and then call the unstack method.
PY
close_prices = history['close'].unstack(level=0)
2. Call the pct_change method.
PY
daily_returns = close_prices.pct_change()
3. Call the var method and then take the inverse.
PY
inverse_variance = 1 / daily_returns.var()
4. Call the color_palette method with a palette name and then truncate the returned colors so that you have one color for each security.
PY
colors = sns.color_palette('pastel')[:len(inverse_variance.index)]
5. Call the pie method with the security weights, labels, and colors.
PY
plt.pie(inverse_variance,
        labels=[str(s) for s in inverse_variance.index],
        colors=colors,
        autopct='%1.1f%%')
6. In the same cell that you called the pie method, call the title method with a title.
PY
plt.title('Banking Stocks and Bank Sector ETF Allocation');
Charting
Plotly NET
Introduction
Plotly.NET provides functions for generating and rendering plotly.js charts in .NET programming languages. Our Research Environment supports Plotly.NET for creating charts in C#.
Import Libraries
1. Load the assembly files and data types in their own cell.
2. Load the necessary assembly files.
Get some historical market data to produce the plots. For example, to get data for a bank sector ETF and some
You must import the plotting libraries and get some historical data to create candlestick charts.
In this example, you create a candlestick chart that shows the open, high, low, and close prices of one of the
1. Select a Symbol .
2. Call the Chart2D.Chart.Candlestick constructor with the time and open, high, low, and close price
IEnumerable .
3. Call the Layout constructor and set the title , xaxis , and yaxis properties as the title and axes label
objects.
You must import the plotting libraries and get some historical data to create line charts.
In this example, you create a line chart that shows the volume of a security. Follow these steps to create the chart:
1. Select a Symbol .
3. Create a Layout .
You must import the plotting libraries and get some historical data to create scatter plots.
In this example, you create a scatter plot that shows the relationship between the daily price of two securities.
Follow these steps to create the scatter plot:
2. Call the Chart2D.Chart.Point constructor with the closing prices of both securities.
3. Create a Layout .
4. Assign the Layout to the chart.
You must import the plotting libraries and get some historical data to create heat maps.
In this example, you create a heat map that shows the correlation between the daily returns of the banking
securities. Follow these steps to create the heat map:
4. Create a Layout .
You must import the plotting libraries and get some historical data to create 3D charts.
In this example, you create a 3D chart that shows the price correlation of 3 assets, with the price of one asset on each dimension. Follow these steps to create the 3D chart:
2. Call the Chart3D.Chart.Point3D constructor with the closing price series of each security.
Universes
Introduction
Universe selection is the process of selecting a basket of assets to research. Dynamic universe selection increases the breadth of your research and helps you avoid selection bias.
Universes are data types. To get historical data for a universe, pass the universe data type to the UniverseHistory
method. The object that returns contains a universe data collection for each day. With this object, you can iterate
through each day and then iterate through the universe data objects of each day to analyze the universe
constituents.
For example, follow these steps to get the US Equity Fundamental data for a specific universe:
1. Create a QuantBook .
PY
qb = QuantBook()
2. Define a universe.
The following example defines a dynamic universe that contains the 10 Equities with the lowest PE ratios in
the market. To see all the Fundamental attributes you can use to define a filter function for a Fundamental
universe, see Data Point Attributes . To create the universe, call the add_universe method with the filter
function.
PY
def filter_function(fundamentals):
sorted_by_pe_ratio = sorted(
[f for f in fundamentals if not np.isnan(f.valuation_ratios.pe_ratio)],
key=lambda fundamental: fundamental.valuation_ratios.pe_ratio
)
return [fundamental.symbol for fundamental in sorted_by_pe_ratio[:10]]
universe = qb.add_universe(filter_function)
3. Call the universe_history method with the universe, a start date, and an end date.
PY
The end date argument is optional. If you omit it, the method returns Fundamental data between the start date and the current day.
The result is a Series where each row contains a list of Fundamental objects. The following image shows the first 5 rows of an example Series:
PY
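The filter function defined in step 2 is ordinary Python; a sketch with hypothetical stand-in objects (no QuantBook needed) showing the NaN filter and the PE-ratio sort:

```python
import numpy as np
from types import SimpleNamespace

def make_fundamental(symbol, pe_ratio):
    # Hypothetical stand-in mirroring the attributes the filter reads
    return SimpleNamespace(
        symbol=symbol,
        valuation_ratios=SimpleNamespace(pe_ratio=pe_ratio))

fundamentals = [
    make_fundamental('AAA', 35.0),
    make_fundamental('BBB', float('nan')),  # dropped by the NaN filter
    make_fundamental('CCC', 8.0),
    make_fundamental('DDD', 15.0),
]

def filter_function(fundamentals, count=2):
    # Drop NaN PE ratios, sort ascending, keep the lowest-PE symbols
    sorted_by_pe_ratio = sorted(
        [f for f in fundamentals if not np.isnan(f.valuation_ratios.pe_ratio)],
        key=lambda fundamental: fundamental.valuation_ratios.pe_ratio)
    return [fundamental.symbol for fundamental in sorted_by_pe_ratio[:count]]

selected = filter_function(fundamentals)
```

In the Research Environment the same function receives LEAN Fundamental objects instead of these stand-ins.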
Available Universes
To get universe data for other types of universes, you usually just need to replace Fundamental in the preceding
code snippets with the universe data type. The following table shows the datasets that support universe selection
and their respective data type. For more information about universe selection with these datasets and the data
points you can use in the filter function, see the dataset's documentation.
Dataset Name | Universe Type(s) | Documentation
Corporate Buybacks | SmartInsiderIntentionUniverse , SmartInsiderTransactionUniverse | Learn more
Brain Sentiment Indicator | BrainSentimentIndicatorUniverse | Learn more
Brain Company Filing Language Metrics | BrainCompanyFilingLangua… | Learn more
US Government Contracts | QuiverGovernmentContractUniverse | Learn more
To get universe data for Futures and Options, use the future_history and option_history methods,
respectively.
Indicators
Indicators
Indicators let you analyze market data in an abstract form rather than in its raw form. For example, indicators like the RSI tell you, based on price and volume data, if the market is overbought or oversold. Because indicators can extract overall market trends from price data, sometimes, you may want to look for correlations between indicators and the market, instead of between raw price data and the market. To view all of the indicators and candlestick patterns that LEAN supports, see Supported Indicators and Candlestick Patterns .
Bar Indicators
Combining Indicators
Custom Indicators
Custom Resolutions
See Also
Key Concepts
Indicators > Data Point Indicators
Indicators
Data Point Indicators
Introduction
This page explains how to create, update, and visualize LEAN data-point indicators.
Create Subscriptions
You need to subscribe to some market data in order to calculate indicator values.
PY
qb = QuantBook()
symbol = qb.add_equity("SPY").symbol
You need to subscribe to some market data and create an indicator in order to calculate a timeseries of indicator values. In this example, use a 20-period BollingerBands indicator with a multiplier of 2.
PY
bb = BollingerBands(20, 2)
You can create the indicator timeseries with the Indicator helper method or you can manually create the
timeseries.
To create an indicator timeseries with the helper method, call the Indicator method.
PY
# Create a dataframe with a date index, and columns are indicator values.
bb_dataframe = qb.indicator(bb, symbol, 50, Resolution.DAILY)
To manually create the indicator timeseries, follow these steps:
1. Get some historical data.
PY
history = qb.history[TradeBar](symbol, 70, Resolution.DAILY)
2. Set the indicator window.size for each attribute of the indicator to hold their values.
PY
bb.window.size = 50
bb.lower_band.window.size = 50
bb.middle_band.window.size = 50
bb.upper_band.window.size = 50
bb.band_width.window.size = 50
bb.percent_b.window.size = 50
bb.standard_deviation.window.size = 50
bb.price.window.size = 50
3. Iterate through the historical market data and update the indicator.
PY
for bar in history:
    bb.update(bar.end_time, bar.close)
4. Populate a DataFrame with the indicator values.
PY
bb_dataframe = pd.DataFrame({
"current": pd.Series({x.end_time: x.value for x in bb}),
"lowerband": pd.Series({x.end_time: x.value for x in bb.lower_band}),
"middleband": pd.Series({x.end_time: x.value for x in bb.middle_band}),
"upperband": pd.Series({x.end_time: x.value for x in bb.upper_band}),
"bandwidth": pd.Series({x.end_time: x.value for x in bb.band_width}),
"percentb": pd.Series({x.end_time: x.value for x in bb.percent_b}),
"standarddeviation": pd.Series({x.end_time: x.value for x in bb.standard_deviation}),
"price": pd.Series({x.end_time: x.value for x in bb.price})
}).sort_index()
Plot Indicators
PY
PY
PY
plt.show()
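Outside LEAN, the same Bollinger quantities can be cross-checked with pandas rolling statistics; a sketch using a 20-period window, a multiplier of 2, and synthetic prices (all names and data hypothetical):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=4)
close = pd.Series(100 + np.cumsum(rng.normal(size=120)))  # synthetic random-walk prices

period, k = 20, 2
middle = close.rolling(period).mean()            # middle band: simple moving average
std = close.rolling(period).std(ddof=0)          # population standard deviation
upper = middle + k * std
lower = middle - k * std
percent_b = (close - lower) / (upper - lower)    # %B: position of price within the bands

bands = pd.DataFrame(
    {'price': close, 'lower': lower, 'middle': middle, 'upper': upper}).dropna()
```

The resulting `bands` DataFrame has the same shape of columns as the `bb_dataframe` built above.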
Indicators > Bar Indicators
Indicators
Bar Indicators
Introduction
This page explains how to create, update, and visualize LEAN bar indicators.
Create Subscriptions
You need to subscribe to some market data in order to calculate indicator values.
PY
qb = QuantBook()
symbol = qb.add_equity("SPY").symbol
You need to subscribe to some market data and create an indicator in order to calculate a timeseries of indicator values. In this example, use a 20-period AverageTrueRange indicator.
PY
atr = AverageTrueRange(20)
You can create the indicator timeseries with the Indicator helper method or you can manually create the
timeseries.
To create an indicator timeseries with the helper method, call the Indicator method.
PY
# Create a dataframe with a date index, and columns are indicator values.
atr_dataframe = qb.indicator(atr, symbol, 50, Resolution.DAILY)
To manually create the indicator timeseries, follow these steps:
1. Get some historical data.
PY
history = qb.history[TradeBar](symbol, 70, Resolution.DAILY)
2. Set the indicator window.size for each attribute of the indicator to hold their values.
PY
atr.window.size = 50
atr.true_range.window.size = 50
3. Iterate through the historical market data and update the indicator.
PY
for bar in history:
    atr.update(bar)
4. Populate a DataFrame with the indicator values.
PY
atr_dataframe = pd.DataFrame({
"current": pd.Series({x.end_time: x.value for x in atr}),
"truerange": pd.Series({x.end_time: x.value for x in atr.true_range})
}).sort_index()
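The true range, and a moving average of it, can be reproduced with pandas as a sanity check; a sketch on synthetic bars (this uses a plain rolling mean rather than LEAN's Wilder smoothing, so values will differ slightly):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=5)
close = pd.Series(100 + np.cumsum(rng.normal(size=60)))
high = close + rng.uniform(0.1, 1.0, size=60)   # synthetic highs above close
low = close - rng.uniform(0.1, 1.0, size=60)    # synthetic lows below close

prev_close = close.shift(1)
# True range: greatest of high-low, |high - prev close|, |low - prev close|
true_range = pd.concat([
    high - low,
    (high - prev_close).abs(),
    (low - prev_close).abs(),
], axis=1).max(axis=1)

atr = true_range.rolling(20).mean()  # simple-average ATR over 20 bars
```

Comparing this series against the `truerange` column of `atr_dataframe` is a quick consistency check.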
Plot Indicators
PY
plt.show()
Indicators > Trade Bar Indicators
Indicators
Trade Bar Indicators
Introduction
This page explains how to create, update, and visualize LEAN TradeBar indicators.
Create Subscriptions
You need to subscribe to some market data in order to calculate indicator values.
PY
qb = QuantBook()
symbol = qb.add_equity("SPY").symbol
You need to subscribe to some market data and create an indicator in order to calculate a timeseries of indicator values. In this example, use a 20-period VolumeWeightedAveragePriceIndicator .
PY
vwap = VolumeWeightedAveragePriceIndicator(20)
You can create the indicator timeseries with the Indicator helper method or you can manually create the
timeseries.
To create an indicator timeseries with the helper method, call the Indicator method.
PY
# Create a dataframe with a date index, and columns are indicator values.
vwap_dataframe = qb.indicator(vwap, symbol, 50, Resolution.DAILY)
To manually create the indicator timeseries, follow these steps:
1. Get some historical data.
PY
history = qb.history[TradeBar](symbol, 70, Resolution.DAILY)
2. Set the indicator window.size for each attribute of the indicator to hold their values.
PY
vwap.window.size = 50
3. Iterate through the historical market data and update the indicator.
PY
for bar in history:
    vwap.update(bar)
4. Populate a DataFrame with the indicator values.
PY
vwap_dataframe = pd.DataFrame({
"current": pd.Series({x.end_time: x.value for x in vwap})
}).sort_index()
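A rolling VWAP over the same window length can be approximated directly in pandas; a sketch with synthetic trade bars (this treats each bar's close as its average trade price, which is an assumption):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=6)
close = pd.Series(100 + np.cumsum(rng.normal(size=60)))
volume = pd.Series(rng.integers(1_000, 10_000, size=60).astype(float))

period = 20
# Rolling VWAP: sum(price * volume) / sum(volume) over the window
vwap = (close * volume).rolling(period).sum() / volume.rolling(period).sum()
```

Because VWAP is a weighted mean of prices in the window, every value lies between the window's low and high close.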
Plot Indicators
PY
PY
plt.show()
Indicators > Combining Indicators
Indicators
Combining Indicators
Introduction
This page explains how to create, update, and visualize LEAN Composite indicators.
Create Subscriptions
You need to subscribe to some market data in order to calculate indicator values.
PY
qb = QuantBook()
symbol = qb.add_equity("SPY").symbol
You need to subscribe to some market data and create a composite indicator in order to calculate a timeseries of indicator values. In this example, create a 10-period SimpleMovingAverage of a 14-period RelativeStrengthIndex .
PY
rsi = RelativeStrengthIndex(14)
sma_of_rsi = IndicatorExtensions.of(SimpleMovingAverage(10), rsi)
2. Create a RollingWindow for each attribute of the indicator to hold their values.
PY
PY
# Define an update function to add the indicator values to the RollingWindow object.
def update_sma_of_rsi_window(sender: object, updated: IndicatorDataPoint) -> None:
indicator = sender
window['time'].add(updated.end_time)
window["SMA Of RSI"].add(updated.value)
window["rollingsum"].add(indicator.rolling_sum.current.value)
sma_of_rsi.updated += update_sma_of_rsi_window
When the indicator receives new data, the preceding handler method adds the new IndicatorDataPoint values into the respective RollingWindow.
4. Iterate through the historical market data to update the indicators and the RollingWindow objects.
PY
PY
sma_of_rsi_dataframe = pd.DataFrame(window).set_index('time')
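The window-to-DataFrame step works on any dict of equal-length columns; a sketch with hypothetical values:

```python
import pandas as pd
from datetime import datetime

# Hypothetical RollingWindow contents, flattened to plain lists for illustration
window = {
    'time': [datetime(2024, 1, 2), datetime(2024, 1, 3)],
    'SMA Of RSI': [55.1, 54.8],
    'rollingsum': [551.0, 548.0],
}
sma_of_rsi_dataframe = pd.DataFrame(window).set_index('time')
```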
Plot Indicators
PY
PY
PY
plt.show()
Indicators > Custom Indicators
Indicators
Custom Indicators
Introduction
Create Subscriptions
You need to subscribe to some market data in order to calculate indicator values.
PY
qb = QuantBook()
symbol = qb.add_equity("SPY").symbol
You need to subscribe to some market data in order to calculate a timeseries of indicator values.
PY
2. Define a custom indicator class. Note the PythonIndicator superclass inheritance, Value attribute, and Update method.
In this tutorial, create an ExpectedShortfallPercent indicator that uses Monte Carlo to calculate the expected shortfall of the recent prices.
import math
import numpy as np

class ExpectedShortfallPercent(PythonIndicator):
    def __init__(self, period: int, alpha: float) -> None:
        self.Value = None   # Attribute that represents the indicator's current value
        self.ValueAtRisk = None
        self.alpha = alpha
        self.window = RollingWindow[float](period)

    # Override the IsReady attribute to flag when all attribute values are ready.
    @property
    def IsReady(self) -> bool:
        return self.Value and self.ValueAtRisk

    # Method to update the indicator values. Note that it only receives 1 IBaseData
    # object (Tick, TradeBar, or QuoteBar) argument.
    def Update(self, input: BaseData) -> bool:
        count = self.window.Count
        self.window.Add(input.Close)

        # Update the Value and other attributes as the indicator's current value.
        if count >= 2:
            cutoff = math.ceil(self.alpha * count)
            lowest = sorted(list(self.window))[:cutoff]
            self.Value = np.mean(lowest)
            self.ValueAtRisk = lowest[-1]

        # Return a bool to indicate whether the indicator is ready.
        return count >= 2
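The tail arithmetic inside Update can be verified outside LEAN with plain Python (toy prices, hypothetical alpha):

```python
import math
import numpy as np

# Toy price window and a hypothetical alpha; mirrors the Update logic above
prices = [100.0, 101.0, 99.0, 98.0, 102.0, 97.0]
alpha = 0.5
count = len(prices)

cutoff = math.ceil(alpha * count)            # number of tail observations
lowest = sorted(prices)[:cutoff]             # worst `cutoff` prices
expected_shortfall = float(np.mean(lowest))  # mean of the tail
value_at_risk = lowest[-1]                   # boundary of the tail
```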
PY
4. Create a RollingWindow for each attribute of the indicator to hold their values.
PY
5. Attach a handler method to the indicator that updates the RollingWindow objects.
When the indicator receives new data, the preceding handler method adds the new IndicatorDataPoint
values into the respective RollingWindow .
6. Iterate through the historical market data and update the indicator.
PY
# The Updated event handler is not available for custom indicators in Python, so the RollingWindows must be updated here.
if custom.is_ready:
window['time'].add(bar.end_time)
window['expectedshortfall'].add(custom.value)
window['valueatrisk'].add(custom.value_at_risk)
PY
custom_dataframe = pd.DataFrame(window).set_index('time')
Plot Indicators
PY
custom_dataframe.plot()
PY
plt.show()
Indicators > Custom Resolutions
Indicators
Custom Resolutions
Introduction
This page explains how to create and update indicators with data of a custom resolution.
Create Subscriptions
You need to subscribe to some market data in order to calculate indicator values.
PY
qb = QuantBook()
symbol = qb.add_equity("SPY").symbol
You need to subscribe to some market data and create an indicator in order to calculate a timeseries of indicator
values.
PY
PY
bb = BollingerBands(20, 2)
3. Create a RollingWindow for each attribute of the indicator to hold their values.
PY
4. Attach a handler method to the indicator that updates the RollingWindow objects.
PY
# Define an update function to add the indicator values to the RollingWindow object.
def update_bollinger_band_window(sender: object, updated: IndicatorDataPoint) -> None:
indicator = sender
window['time'].add(updated.end_time)
window["bollingerbands"].add(updated.value)
window["lowerband"].add(indicator.lower_band.current.value)
window["middleband"].add(indicator.middle_band.current.value)
window["upperband"].add(indicator.upper_band.current.value)
window["bandwidth"].add(indicator.band_width.current.value)
window["percentb"].add(indicator.percent_b.current.value)
window["standarddeviation"].add(indicator.standard_deviation.current.value)
window["price"].add(indicator.price.current.value)
bb.updated += update_bollinger_band_window
When the indicator receives new data, the preceding handler method adds the new IndicatorDataPoint
values into the respective RollingWindow .
PY
consolidator = TradeBarConsolidator(timedelta(days=7))
6. Attach a handler method that feeds data into the consolidator and updates the indicator with the consolidated bars.
PY
When the consolidator receives 7 days of data, the handler generates a 7-day TradeBar and updates the indicator.
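Outside LEAN, a 7-day consolidation can be mimicked with a pandas resample; this sketch uses synthetic daily closes in place of the TradeBarConsolidator:

```python
import pandas as pd

# Synthetic daily bars; resample("7D") plays the role of TradeBarConsolidator(timedelta(days=7))
idx = pd.date_range("2024-01-01", periods=14, freq="D")
daily = pd.DataFrame({"close": range(1, 15)}, index=idx)

# Each 7-day bin collapses into one bar: open/high/low/close analogues
weekly = daily["close"].resample("7D").agg(["first", "max", "min", "last"])
```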
7. Iterate through the historical market data and update the indicator.
PY
PY
bb_dataframe = pd.DataFrame(window).set_index('time')
Plot Indicators
PY
PY
df.plot()
PY
plt.show()
Object Store
Object Store
Introduction
The Object Store is a file system that you can use in your algorithms to save, read, and delete data. The Object
Store is organization-specific, so you can save or read data from the same Object Store in all of your
organization's projects. The Object Store works like a key-value storage system where you can store regular
strings, JSON encoded strings, XML encoded strings, and bytes. You can access the data you store in the Object
Store from backtests, the Research Environment, and live algorithms.
To get all of the keys and values in the Object Store, iterate through the object_store property.
PY
To iterate through just the keys in the Object Store, iterate through the keys property.
PY
1. Create a string.
PY
PY
Save Data
The Object Store saves objects under a key-value system. If you save objects in backtests, you can access them
from the Research Environment.
If you run algorithms in QuantConnect Cloud, you need storage create permissions to save data in the Object
Store.
You can save Bytes and string objects in the Object Store.
Strings
PY
Bytes
To save a Bytes object (for example, zipped data), call the save_bytes method.
PY
Read Data
To read data from the Object Store, you need to provide the key you used to store the object.
You can load Bytes and string objects from the Object Store.
Before you read data from the Object Store, check if the key exists.
PY
if qb.object_store.contains_key(key):
# Read data
Strings
PY
string_data = qb.object_store.read(f"{qb.project_id}/string_key")
Bytes
PY
byte_data = qb.object_store.read_bytes(f"{qb.project_id}/bytes_key")
Delete Data
Delete objects in the Object Store to remove objects that you no longer need. If you use the Research Environment
in QuantConnect Cloud, you need storage delete permissions to delete data from the Object Store.
To delete objects from the Object Store, call the delete method. Before you delete data, check if the key exists. If
you try to delete an object with a key that doesn't exist in the Object Store, the method raises an exception.
PY
if qb.object_store.contains_key(key):
qb.object_store.delete(key)
To delete all of the content in the Object Store, iterate through all the stored data.
PY
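The deletion loop has roughly this shape; the sketch below runs against a minimal stand-in for the ObjectStore API, since qb.object_store is only available inside LEAN:

```python
# Stand-in for the LEAN ObjectStore API; only illustrates the deletion loop's shape
class FakeObjectStore(dict):
    def contains_key(self, key):
        return key in self
    def delete(self, key):
        del self[key]

store = FakeObjectStore({"a": "1", "b": "2"})
for key in list(store):  # snapshot the keys before mutating the store
    store.delete(key)
```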
Cache Data
When you write to or read from the Object Store, the notebook caches the data. The cache speeds up the
notebook execution because if you try to read the Object Store data again with the same key, it returns the cached
data instead of downloading the data again. The cache speeds up execution, but it can cause problems if you are
trying to share data between two nodes under the same Object Store key. For example, consider the following
scenario:
1. You open project A and save data under the key 123 .
2. You open project B and save new data under the same key 123 .
3. In project A, you read the Object Store data under the key 123 , expecting the data from project B, but you get the data from step 1.
You get the data from step 1 instead of step 2 because the cache still holds the data from step 1. To clear the cache, call the clear method.
PY
qb.object_store.clear()
To get the file path for a specific key in the Object Store, call the get_file_path method. If the key you pass to the
method doesn't already exist in the Object Store, it's added to the Object Store.
PY
file_path = qb.object_store.get_file_path(key)
Storage Quotas
If you use the Research Environment locally, you can store as much data as your hardware will allow. If you use
the Research Environment in QuantConnect Cloud, you must stay within your storage quota. If you need more
space, you can edit your organization's storage plan.
Follow these steps to create a DataFrame, save it into the Object Store, and load it from the Object Store:
PY
spy = qb.add_equity("SPY").symbol
df = qb.history(qb.securities.keys, 360, Resolution.DAILY)
2. Get the file path for a specific key in the Object Store.
PY
file_path = qb.object_store.get_file_path("df_to_csv")
3. Call the to_csv method to save the DataFrame in the Object Store as a CSV file.
PY
4. Call the read_csv method to load the CSV file from the Object Store.
PY
reread = pd.read_csv(file_path)
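The full round trip can be sketched outside LEAN, with a temporary directory standing in for get_file_path:

```python
import os
import tempfile
import pandas as pd

df = pd.DataFrame({"close": [1.0, 2.0, 3.0]})

# Stand-in for qb.object_store.get_file_path("df_to_csv")
file_path = os.path.join(tempfile.mkdtemp(), "df_to_csv")

df.to_csv(file_path)                          # step 3: save
reread = pd.read_csv(file_path, index_col=0)  # step 4: load
```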
pandas supports saving and loading DataFrame objects in the following additional formats:
XML
PY
file_path = qb.object_store.get_file_path("df_to_xml")
df.to_xml(file_path) # File size: 87816 bytes
reread = pd.read_xml(file_path)
JSON
PY
file_path = qb.object_store.get_file_path("df_to_json")
df.to_json(file_path) # File size: 125250 bytes
reread = pd.read_json(file_path)
Parquet
PY
file_path = qb.object_store.get_file_path("df_to_parquet")
df.to_parquet(file_path) # File size: 23996 bytes
reread = pd.read_parquet(file_path)
Pickle
PY
file_path = qb.object_store.get_file_path("df_to_pickle")
df.to_pickle(file_path) # File size: 19868 bytes
reread = pd.read_pickle(file_path)
You can use the Object Store to plot data from your backtests and live algorithm in the Research Environment. The
following example demonstrates how to plot a Simple Moving Average indicator that's generated during a backtest.
1. Create an algorithm, add a data subscription, and add a simple moving average indicator.
PY
class ObjectStoreChartingAlgorithm(QCAlgorithm):
def initialize(self):
self.add_equity("SPY")
self.content = ''
self._sma = self.sma("SPY", 22)
PY
3. In the OnEndOfAlgorithm method, save the indicator data to the Object Store.
PY
def on_end_of_algorithm(self):
self.object_store.save('sma_values_python', self.content)
PY
qb = QuantBook()
content = qb.object_store.read("sma_values_python")
The key you provide must be the same key you used to save the object.
PY
data = {}
for line in content.split('\n'):
csv = line.split(',')
if len(csv) > 1:
data[csv[0]] = float(csv[1])
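With a small synthetic payload in the same time,value format, the parsing loop above produces a plottable dict:

```python
# Synthetic payload in the same "time,value" CSV shape the algorithm saves
content = "2024-01-02,470.5\n2024-01-03,471.2\n"

data = {}
for line in content.split('\n'):
    csv = line.split(',')
    if len(csv) > 1:
        data[csv[0]] = float(csv[1])
```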
Machine Learning
Machine Learning
Key Concepts
Introduction
Machine learning is a field of study that combines statistics and computer science to build intelligent systems that
predict outcomes. Quant researchers commonly use machine learning models to optimize portfolios, make trading
signals, and manage risk. These models can find relationships in datasets that are subtle, too complex, or that
humans otherwise struggle to find. You can use machine learning techniques in your research notebooks.
Supported Libraries
To request a new library, contact us. We will add the library to the queue for review and deployment. Since the
libraries run on our servers, we need to ensure they are secure and won't cause harm. The process of adding new
libraries takes 2-4 weeks to complete. View the list of libraries currently under review on the Issues list of the Lean
GitHub repository .
Transfer Models
You can load machine learning models from the Object Store or a custom data file like pickle. If you train a model in
the Research Environment, you can also save it into the Object Store to transfer it to the backtesting and live
trading environment.
Machine Learning > Popular Libraries
Machine Learning
Popular Libraries
These are examples of using some of the most common machine learning libraries in an algorithm. Click one to
learn more.
Aesara
GPlearn
Hmmlearn
Keras
PyTorch
Scikit-Learn
Stable Baselines
TensorFlow
Tslearn
XGBoost
Machine Learning > Popular Libraries > Aesara
Popular Libraries
Aesara
Introduction
This page explains how to build, train, test, and store Aesara models.
Import Libraries
PY
import aesara
import aesara.tensor as at
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
import joblib
Get some historical market data to train and test the model. For example, to get data for the SPY ETF during 2020 and 2021, run:
PY
qb = QuantBook()
symbol = qb.add_equity("SPY", Resolution.DAILY).symbol
history = qb.history(symbol, datetime(2020, 1, 1), datetime(2022, 1, 1)).loc[symbol]
Prepare Data
You need some historical data to prepare the data for the model. If you have historical data, manipulate it to train
and test the model. In this example, use the following features and labels:
The following image shows the time difference between the features and labels:
Follow these steps to prepare the data:
PY
close = history['close']
returns = history['close'].pct_change().shift(-1)[lookback*2-1:-1].reset_index(drop=True)
labels = pd.Series([1 if y > 0 else 0 for y in returns]) # binary class
PY
lookback = 5
lookback_series = []
for i in range(1, lookback + 1):
df = history['close'].shift(i)[lookback:-1]
df.name = f"close-{i}"
lookback_series.append(df)
X = pd.concat(lookback_series, axis=1)
# Normalize using the 5 day interval
X = MinMaxScaler().fit_transform(X.T).T[4:]
PY
X = np.array(features)
y = np.array(labels)
PY
Train Models
You need to prepare the historical data for training before you train the model. If you have prepared the data, build
and train the model. In this example, build a Logistic Regression model with log loss cross entropy and square error
1. Generate a dataset.
PY
# D = (input_values, target_class)
D = (np.array(X_train), np.array(y_train))
2. Initialize variables.
PY
# Initialize the weight vector w randomly using shared so the model coefficients
# keep their values between training iterations (updates).
rng = np.random.default_rng(100)
w = aesara.shared(rng.standard_normal(X.shape[1]), name="w")
PY
PY
train = aesara.function(
inputs=[x, y],
outputs=[prediction, xent],
updates=((w, w - 0.1 * gw), (b, b - 0.1 * gb)))
predict = aesara.function(inputs=[x], outputs=prediction)
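The update rule that the train function applies can be checked in plain numpy; this sketch runs one logistic-regression fit on a tiny separable dataset (toy data; w and b stand in for the aesara shared variables, with the same 0.1 learning rate):

```python
import numpy as np

# Tiny separable dataset
X = np.array([[0.0], [1.0]])
y = np.array([0.0, 1.0])

w = np.zeros(1)
b = 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid prediction
    gw = X.T @ (p - y) / len(y)             # cross-entropy gradient wrt w
    gb = float(np.mean(p - y))              # gradient wrt b
    w -= 0.1 * gw                           # same 0.1 learning rate as above
    b -= 0.1 * gb

prediction = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(float)
```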
Test Models
You need to build and train the model before you test its performance. If you have trained the model, test it on the
out-of-sample data. Follow these steps to test the model:
1. Call the predict method with the features of the testing period.
PY
y_hat = predict(np.array(X_test))
PY
Store Models
You can save and load Aesara models using the Object Store.
Save Models
1. Set the key name of the model to be stored in the Object Store.
PY
model_key = "model"
PY
file_name = qb.object_store.get_file_path(model_key)
This method returns the file path where the model will be stored.
3. Call the dump method with the model and file path.
PY
joblib.dump(predict, file_name)
If you dump the model using the joblib module before you save the model, you don't need to retrain the
model.
Load Models
You must save a model into the Object Store before you can load it from the Object Store. If you saved a model,
PY
qb.object_store.contains_key(model_key)
This method returns a boolean that represents if the model_key is in the Object Store. If the Object Store does
not contain the model_key , save the model using the model_key before you proceed.
PY
file_name = qb.object_store.get_file_path(model_key)
PY
loaded_model = joblib.load(file_name)
Popular Libraries
GPlearn
Introduction
This page introduces how to build, train, test, and store GPlearn models.
Import Libraries
PY
You need the sklearn library to prepare the data and the joblib library to store models.
Get some historical market data to train and test the model. For example, to get data for the SPY ETF during 2020 and 2021, run:
PY
qb = QuantBook()
symbol = qb.add_equity("SPY", Resolution.DAILY).symbol
history = qb.history(symbol, datetime(2020, 1, 1), datetime(2022, 1, 1)).loc[symbol]
Prepare Data
You need some historical data to prepare the data for the model. If you have historical data, manipulate it to train
and test the model. In this example, use the following features and labels:
Labels Daily percent return of the SPY over the next day
The following image shows the time difference between the features and labels:
Follow these steps to prepare the data:
1. Call the pct_change method and then drop the first row.
PY
daily_returns = history['close'].pct_change()[1:]
2. Loop through the daily_returns DataFrame and collect the features and labels.
PY
n_steps = 5
features = []
labels = []
for i in range(len(daily_returns)-n_steps):
features.append(daily_returns.iloc[i:i+n_steps].values)
labels.append(daily_returns.iloc[i+n_steps])
PY
X = np.array(features)
y = np.array(labels)
PY
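The last step is typically a train/test split; a sketch using sklearn's train_test_split with shuffle=False to keep the split chronological (toy arrays):

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20, dtype=float).reshape(10, 2)
y = np.arange(10, dtype=float)

# shuffle=False keeps the split chronological, which avoids look-ahead bias
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)
```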
Train Models
You need to prepare the historical data for training before you train the model. If you have prepared the data, build
and train the model. In this example, create a Symbolic Transformer to generate new non-linear features and then
build a Symbolic Regressor model. Follow these steps to create the model:
1. Declare a set of functions to use for feature engineering.
PY
PY
gp_transformer = SymbolicTransformer(function_set=function_set,
random_state=0,
verbose=1)
3. Call the fit method with the training features and labels.
PY
gp_transformer.fit(X_train, y_train)
PY
gp_features_train = gp_transformer.transform(X_train)
5. Call the hstack method with the original features and the transformed features.
PY
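hstack simply places the engineered feature columns beside the originals (toy arrays; gp_features_train is a stand-in for the transformer output):

```python
import numpy as np

X_train = np.array([[1.0, 2.0], [3.0, 4.0]])           # original features
gp_features_train = np.array([[10.0], [20.0]])         # stand-in engineered features
new_X_train = np.hstack((X_train, gp_features_train))  # columns side by side
```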
7. Call the fit method with the engineered features and the original labels.
PY
gp_regressor.fit(new_X_train, y_train)
Test Models
You need to build and train the model before you test its performance. If you have trained the model, test it on the
out-of-sample data. Follow these steps to test the model:
PY
gp_features_test = gp_transformer.transform(X_test)
new_X_test = np.hstack((X_test, gp_features_test))
2. Call the predict method with the engineered testing set data.
PY
y_predict = gp_regressor.predict(new_X_test)
PY
PY
r2 = gp_regressor.score(new_X_test, y_test)
print(f"The explained variance of the GP model: {r2*100:.2f}%")
Store Models
You can save and load GPlearn models using the Object Store.
Save Models
1. Set the key names of the models to be stored in the Object Store.
PY
transformer_key = "transformer"
regressor_key = "regressor"
PY
transformer_file = qb.object_store.get_file_path(transformer_key)
regressor_file = qb.object_store.get_file_path(regressor_key)
This method returns the file paths where the models will be stored.
3. Call the dump method with the models and file paths.
PY
joblib.dump(gp_transformer, transformer_file)
joblib.dump(gp_regressor, regressor_file)
If you dump the model using the joblib module before you save the model, you don't need to retrain the
model.
Load Models
You must save a model into the Object Store before you can load it from the Object Store. If you saved a model,
PY
qb.object_store.contains_key(transformer_key)
qb.object_store.contains_key(regressor_key)
This method returns a boolean that represents if the model_key is in the Object Store. If the Object Store does
not contain the model_key , save the model using the model_key before you proceed.
PY
transformer_file = qb.object_store.get_file_path(transformer_key)
regressor_file = qb.object_store.get_file_path(regressor_key)
PY
loaded_transformer = joblib.load(transformer_file)
loaded_regressor = joblib.load(regressor_file)
Popular Libraries
Hmmlearn
Introduction
This page explains how to build, train, test, and store Hmmlearn models.
Import Libraries
PY
Get some historical market data to train and test the model. For example, to get data for the SPY ETF during 2020 and 2021, run:
PY
qb = QuantBook()
symbol = qb.add_equity("SPY", Resolution.DAILY).symbol
history = qb.history(symbol, datetime(2020, 1, 1), datetime(2022, 1, 1)).loc[symbol]
Prepare Data
You need some historical data to prepare the data for the model. If you have historical data, manipulate it to train
and test the model. Follow these steps to prepare the data:
PY
closes = history['close']
2. Call the pct_change method and then drop the first row.
PY
daily_returns = closes.pct_change().iloc[1:]
X = daily_returns.values.reshape(-1, 1)
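The reshape produces the (n_samples, n_features) array that hmmlearn expects; a sketch with toy returns:

```python
import numpy as np

daily_returns = np.array([0.01, -0.02, 0.005])
X = daily_returns.reshape(-1, 1)  # (n_samples, 1): one feature per sample
```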
Train Models
You need to prepare the historical data for training before you train the model. If you have prepared the data, build
and train the model. In this example, assume the market has only 2 regimes and the market returns follow a
Gaussian distribution. Therefore, create a 2-component Hidden Markov Model with Gaussian emissions, which is
equivalent to a Gaussian mixture model with 2 means. Follow these steps to create the model:
1. Call the GaussianHMM constructor with the number of components, a covariance type, and the number of
iterations.
PY
PY
model.fit(X)
Test Models
You need to build and train the model before you test its performance. If you have trained the model, test it on the
out-of-sample data. Follow these steps to test the model:
PY
y = model.predict(X)
PY
plt.figure(figsize=(15, 10))
plt.scatter(daily_returns.index, [f'Regime {n+1}' for n in y])
plt.title(f'{symbol} market regime')
plt.xlabel("time")
plt.show()
Store Models
You can save and load Hmmlearn models using the Object Store.
Save Models
1. Set the key name of the model to be stored in the Object Store.
PY
model_key = "model"
PY
file_name = qb.object_store.get_file_path(model_key)
This method returns the file path where the model will be stored.
3. Call the dump method with the model and file path.
PY
joblib.dump(model, file_name)
If you dump the model using the joblib module before you save the model, you don't need to retrain the
model.
Load Models
You must save a model into the Object Store before you can load it from the Object Store. If you saved a model,
PY
qb.object_store.contains_key(model_key)
This method returns a boolean that represents if the model_key is in the Object Store. If the Object Store does
not contain the model_key , save the model using the model_key before you proceed.
PY
file_name = qb.object_store.get_file_path(model_key)
PY
loaded_model = joblib.load(file_name)
Popular Libraries
Keras
Introduction
This page explains how to build, train, test, and store keras models.
Import Libraries
PY
Get some historical market data to train and test the model. For example, to get data for the SPY ETF during 2020 and 2021, run:
PY
qb = QuantBook()
symbol = qb.add_equity("SPY", Resolution.DAILY).symbol
history = qb.history(symbol, datetime(2020, 1, 1), datetime(2022, 1, 1)).loc[symbol]
Prepare Data
You need some historical data to prepare the data for the model. If you have historical data, manipulate it to train
and test the model. In this example, use the following features and labels:
Labels Daily percent return of the SPY over the next day
The following image shows the time difference between the features and labels:
Follow these steps to prepare the data:
PY
daily_pct_change = history.pct_change().dropna()
2. Loop through the daily_pct_change DataFrame and collect the features and labels.
PY
n_steps = 5
features = []
labels = []
for i in range(len(daily_pct_change)-n_steps):
features.append(daily_pct_change.iloc[i:i+n_steps].values)
labels.append(daily_pct_change['close'].iloc[i+n_steps])
PY
features = np.array(features)
labels = np.array(labels)
PY
Train Models
You need to prepare the historical data for training before you train the model. If you have prepared the data, build
and train the model. In this example, build a neural network model that predicts the future return of the SPY. Follow
PY
Set the input_shape of the first layer to (5, 5) because each sample contains the percent change of 5
factors (percent change of the open, high, low, close, and volume) over the previous 5 days. Call the Flatten
constructor because the input is 2-dimensional but the output is just a single value.
2. Call the compile method with a loss function, an optimizer, and a list of metrics to monitor.
PY
model.compile(loss='mse',
optimizer=RMSprop(0.001),
metrics=['mae', 'mse'])
3. Call the fit method with the features and labels of the training dataset and a number of epochs.
PY
Test Models
You need to build and train the model before you test its performance. If you have trained the model, test it on the
1. Call the predict method with the features of the testing period.
PY
y_hat = model.predict(X_test)
PY
You can save and load keras models using the Object Store.
Save Models
1. Set the key name of the model to be stored in the Object Store.
The key must end with a .keras extension for the native Keras format (recommended) or a .h5 extension.
PY
model_key = "model.keras"
PY
file_name = qb.object_store.get_file_path(model_key)
This method returns the file path where the model will be stored.
PY
model.save(file_name)
Load Models
You must save a model into the Object Store before you can load it from the Object Store. If you saved a model,
PY
qb.object_store.contains_key(model_key)
This method returns a boolean that represents if the model_key is in the Object Store. If the Object Store does
not contain the model_key , save the model using the model_key before you proceed.
PY
file_name = qb.object_store.get_file_path(model_key)
PY
loaded_model = load_model(file_name)
Popular Libraries
PyTorch
Introduction
This page explains how to build, train, test, and store PyTorch models.
Import Libraries
PY
import torch
from torch import nn
from sklearn.model_selection import train_test_split
import joblib
You need the sklearn library to prepare the data and the joblib library to store models.
Get some historical market data to train and test the model. For example, to get data for the SPY ETF during 2020 and 2021, run:
PY
qb = QuantBook()
symbol = qb.add_equity("SPY", Resolution.DAILY).symbol
history = qb.history(symbol, datetime(2020, 1, 1), datetime(2022, 1, 1)).loc[symbol]
Prepare Data
You need some historical data to prepare the data for the model. If you have historical data, manipulate it to train
and test the model. In this example, use the following features and labels:
The following image shows the time difference between the features and labels:
Follow these steps to prepare the data:
PY
Fractional differencing helps make the data stationary yet retains the variance information.
2. Loop through the df DataFrame and collect the features and labels.
PY
n_steps = 5
features = []
labels = []
for i in range(len(df)-n_steps):
features.append(df.iloc[i:i+n_steps].values)
labels.append(df.iloc[i+n_steps])
PY
features = np.array(features)
labels = np.array(labels)
PY
Train Models
You need to prepare the historical data for training before you train the model. If you have prepared the data, build
and train the model. In this example, create a deep neural network with 2 hidden layers. Follow these steps to
In this example, use the ReLU activation function for each layer.
PY
class NeuralNetwork(nn.Module):
# Model Structure
def __init__(self):
super(NeuralNetwork, self).__init__()
self.flatten = nn.Flatten()
self.linear_relu_stack = nn.Sequential(
nn.Linear(5, 5), # input size, output size of the layer
nn.ReLU(), # Relu non-linear transformation
nn.Linear(5, 5),
nn.ReLU(),
nn.Linear(5, 1), # Output size = 1 for regression
)
# Feed-forward training/prediction
def forward(self, x):
x = torch.from_numpy(x).float() # Convert to tensor in type float
result = self.linear_relu_stack(x)
return result
2. Create an instance of the model and set its configuration to train on the GPU if it's available.
PY
In this example, use the mean squared error as the loss function and stochastic gradient descent as the
optimizer.
PY
loss_fn = nn.MSELoss()
learning_rate = 0.001
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
epochs = 5
for t in range(epochs):
print(f"Epoch {t+1}\n-------------------------------")
# Since we're using SGD, we'll be using the size of data as batch number.
for batch, (X, y) in enumerate(zip(X_train, y_train)):
# Compute prediction and loss
pred = model(X)
real = torch.from_numpy(np.array(y).flatten()).float()
loss = loss_fn(pred, real)
# Backpropagation
optimizer.zero_grad()
loss.backward()
optimizer.step()
if batch % 100 == 0:
loss, current = loss.item(), batch
print(f"loss: {loss:.5f} [{current:5d}/{len(X_train):5d}]")
Test Models
You need to build and train the model before you test its performance. If you have trained the model, test it on the
out-of-sample data. Follow these steps to test the model:
PY
predict = model(X_test)
y_predict = predict.detach().numpy() # Convert tensor to numpy ndarray
PY
r2 = 1 - np.sum(np.square(y_test.flatten() - y_predict.flatten())) / np.sum(np.square(y_test.flatten() - y_test.mean()))
print(f"The explained variance by the model (r-square): {r2*100:.2f}%")
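The r-squared arithmetic can be sanity-checked with plain numpy on toy values:

```python
import numpy as np

y_test = np.array([1.0, 2.0, 3.0, 4.0])
y_predict = np.array([1.0, 2.0, 3.0, 5.0])  # one prediction off by 1

ss_res = np.sum(np.square(y_test - y_predict))      # residual sum of squares
ss_tot = np.sum(np.square(y_test - y_test.mean()))  # total sum of squares
r2 = 1 - ss_res / ss_tot
```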
Store Models
You can save and load PyTorch models using the Object Store.
Save Models
Don't use the torch.save method to save models, because the tensor data will be lost, which corrupts the save. Follow these steps instead:
1. Set the key name of the model to be stored in the Object Store.
PY
model_key = "model"
2. Call the get_file_path method with the key.
PY
file_name = qb.object_store.get_file_path(model_key)
This method returns the file path where the model will be stored.
3. Call the dump method with the model and file path.
PY
joblib.dump(model, file_name)
If you dump the model using the joblib module before you save the model, you don't need to retrain the
model.
Load Models
You must save a model into the Object Store before you can load it from the Object Store. If you saved a model,
follow these steps to load it:
PY
qb.object_store.contains_key(model_key)
This method returns a boolean that represents if the model_key is in the Object Store. If the Object Store does
not contain the model_key , save the model using the model_key before you proceed.
PY
file_name = qb.object_store.get_file_path(model_key)
PY
loaded_model = joblib.load(file_name)
Popular Libraries
Scikit-Learn
Introduction
This page explains how to build, train, test, and store Scikit-Learn / sklearn models.
Import Libraries
PY
Get some historical market data to train and test the model. For example, to get data for the SPY ETF during 2020 and 2021, run:
PY
qb = QuantBook()
symbol = qb.add_equity("SPY", Resolution.DAILY).symbol
history = qb.history(symbol, datetime(2020, 1, 1), datetime(2022, 1, 1)).loc[symbol]
Prepare Data
You need some historical data to prepare the data for the model. If you have historical data, manipulate it to train
and test the model. In this example, use the following features and labels:
Labels Daily percent return of the SPY over the next day
The following image shows the time difference between the features and labels:
Follow these steps to prepare the data:
1. Call the pct_change method and then drop the first row.
PY
daily_returns = history['close'].pct_change()[1:]
2. Loop through the daily_returns DataFrame and collect the features and labels.
PY
n_steps = 5
features = []
labels = []
for i in range(len(daily_returns)-n_steps):
features.append(daily_returns.iloc[i:i+n_steps].values)
labels.append(daily_returns.iloc[i+n_steps])
PY
X = np.array(features)
y = np.array(labels)
PY
Train Models
You need to prepare the historical data for training before you train the model. If you have prepared the data, build
and train the model. In this example, build a Support Vector Regressor model and optimize its hyperparameters
with grid search cross-validation. Follow these steps to create the model:
1. Set the choices of hyperparameters used for grid search testing.
PY
2. Call the GridSearchCV constructor with the SVR model, the parameter grid, a scoring method, and the number of
cross-validation folds.
PY
3. Call the fit method and then select the best estimator.
PY
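A minimal end-to-end sketch of steps 1-3 with an illustrative parameter grid and toy data (the grid values are assumptions, not the document's):

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV

# Illustrative hyperparameter choices; tune these for real data
param_grid = {'C': [0.1, 1.0], 'epsilon': [0.01, 0.1]}

X_train = np.arange(20, dtype=float).reshape(10, 2)
y_train = X_train.sum(axis=1)

gsc = GridSearchCV(SVR(), param_grid, scoring='neg_mean_squared_error', cv=2)
model = gsc.fit(X_train, y_train).best_estimator_
```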
Test Models
You need to build and train the model before you test its performance. If you have trained the model, test it on the
1. Call the predict method with the features of the testing period.
PY
y_hat = model.predict(X_test)
PY
You can save and load sklearn models using the Object Store.
Save Models
1. Set the key name of the model to be stored in the Object Store.
PY
model_key = "model"
PY
file_name = qb.object_store.get_file_path(model_key)
This method returns the file path where the model will be stored.
3. Call the dump method with the model and file path.
PY
joblib.dump(model, file_name)
If you dump the model using the joblib module before you save the model, you don't need to retrain the
model.
Load Models
You must save a model into the Object Store before you can load it from the Object Store. If you saved a model,
follow these steps to load it:
PY
qb.object_store.contains_key(model_key)
This method returns a boolean that represents if the model_key is in the Object Store. If the Object Store does
not contain the model_key , save the model using the model_key before you proceed.
PY
file_name = qb.object_store.get_file_path(model_key)
PY
loaded_model = joblib.load(file_name)
Popular Libraries
Stable Baselines
Introduction
This page introduces how to use the Stable Baselines library in Python for building, training, saving in the Object Store, and loading reinforcement learning (RL) models, through an example of a Proximal Policy Optimization (PPO) model.
Import Libraries
PY
import gym
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize
Get some historical market data to train and test the model. For example, to get data for different asset classes, run:
PY
qb = QuantBook()
symbols = [
qb.add_equity("SPY", Resolution.DAILY).symbol,
qb.add_equity("GLD", Resolution.DAILY).symbol,
qb.add_equity("TLT", Resolution.DAILY).symbol,
qb.add_equity("USO", Resolution.DAILY).symbol,
qb.add_equity("UUP", Resolution.DAILY).symbol
]
df = qb.history(symbols, datetime(2010, 1, 1), datetime(2024, 1, 1))
Prepare Data
You need some historical data to prepare the data for the model. If you have historical data, manipulate it to train
and test the model. In this example, extract the close price series as the outcome and obtain the partial-differenced price series as the model input.
PY
history = df.unstack(0)
# We arbitrarily select a weight of 0.5 here, but ideally one should strike a
# balance between variance retained and stationarity.
partial_diff = (history.diff() * 0.5 + history * 0.5).iloc[1:].fillna(0)
history = history.close.iloc[1:]
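The 0.5/0.5 blend above mixes first differences (stationarity) with price levels (variance retained); it is a crude one-parameter stand-in for true fractional differencing. A self-contained sketch of the same blend on a synthetic series:

```python
import numpy as np
import pandas as pd

# Synthetic random-walk price series standing in for one column of `history`.
prices = pd.Series(np.cumsum(np.random.default_rng(0).normal(0, 1, 50)) + 100)

# Blend first differences with price levels; weight trades stationarity for variance.
weight = 0.5
partial_diff = (prices.diff() * weight + prices * (1 - weight)).iloc[1:].fillna(0)
```

Raising `weight` toward 1 makes the result closer to a fully differenced (stationary) series.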
Train Models
You need to prepare the historical data for training before you train the model. If you have prepared the data, build
and train the environment and the model. In this example, create a gym environment to initialize the training
environment, agent and reward. Then, create an RL model with the PPO algorithm. Follow these steps to create the model:
1. Split the data for training and testing to evaluate our model.
PY
X_train = partial_diff.iloc[:-100].values
X_test = partial_diff.iloc[-100:].values
y_train = history.iloc[:-100].values
y_test = history.iloc[-100:].values
In this example, create a custom environment that uses the previous 5 days of partial-differenced OHLCV price data as the observation:
class PortfolioEnv(gym.Env):
def __init__(self, data, prediction, num_stocks):
super(PortfolioEnv, self).__init__()
self.data = data
self.prediction = prediction
self.num_stocks = num_stocks
self.current_step = 5
self.portfolio_value = []
self.portfolio_weights = np.ones(num_stocks) / num_stocks
def reset(self):
self.current_step = 5
self.portfolio_value = []
self.portfolio_weights = np.ones(self.num_stocks) / self.num_stocks
return self._get_observation()
# Update portfolio value based on the new weights and the market prices less fee
self.portfolio_value.append(np.dot(self.portfolio_weights, value) - fees)
def _get_observation(self):
# Return the last 5 partial differencing OHLCV as the observation
return self.data[self.current_step-5:self.current_step]
@property
def _neg_max_drawdown(self):
# Return max drawdown within 20 days in portfolio value as reward (negated since max reward is preferred)
portfolio_value_20d = np.array(self.portfolio_value[-min(len(self.portfolio_value), 20):])
acc_max = np.maximum.accumulate(portfolio_value_20d)
return -(portfolio_value_20d - acc_max).min()
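The reward property above can be exercised in isolation; a sketch mirroring its logic as a plain function (the function name and `window` parameter are ours). Note that it returns the magnitude of the trailing-window maximum drawdown:

```python
import numpy as np

def neg_max_drawdown(portfolio_value, window=20):
    # Trailing-window max drawdown, matching the reward property above.
    recent = np.array(portfolio_value[-min(len(portfolio_value), window):])
    acc_max = np.maximum.accumulate(recent)  # running peak of the portfolio value
    return -(recent - acc_max).min()
```

For a monotonically rising equity curve the drawdown is 0; a dip from 2.0 to 1.5 yields 0.5.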
In this example, create a RL model and train with MLP-policy PPO algorithm.
PY
Test Models
You need to build and train the model before you test its performance. If you have trained the model, test it on the
out-of-sample data. Follow these steps to test the model:
1. Initialize a return series to calculate performance and a list to store the equity value at each timestep.
PY
test = np.log(y_test[1:]/y_test[:-1])
equity = [1]
PY
equity.append((1+value) * equity[i-5])
PY
plt.figure(figsize=(15, 10))
plt.title("Equity Curve")
plt.xlabel("timestep")
plt.ylabel("equity")
plt.plot(equity)
plt.show()
Store Models
You can save and load stable baselines models using the Object Store.
Save Models
1. Set the key name of the model to be stored in the Object Store.
PY
model_key = "model"
PY
file_name = qb.object_store.get_file_path(model_key)
This method returns the file path where the model will be stored.
PY
model.save(file_name)
Load Models
You must save a model into the Object Store before you can load it from the Object Store. If you saved a model,
follow these steps to load it:
PY
qb.object_store.contains_key(model_key)
This method returns a boolean that represents if the model_key is in the Object Store. If the Object Store does
not contain the model_key , save the model using the model_key before you proceed.
PY
file_name = qb.object_store.get_file_path(model_key)
3. Call the load method with the file path, environment and policy.
PY
Popular Libraries
TensorFlow
Introduction
This page explains how to build, train, test, and store Tensorflow models.
Import Libraries
PY
import tensorflow as tf
from sklearn.model_selection import train_test_split
Get some historical market data to train and test the model. For example, to get data for the SPY ETF during 2020 and 2021, run:
PY
qb = QuantBook()
symbol = qb.add_equity("SPY", Resolution.DAILY).symbol
history = qb.history(symbol, datetime(2020, 1, 1), datetime(2022, 1, 1)).loc[symbol]
Prepare Data
You need some historical data to prepare the data for the model. If you have historical data, manipulate it to train
and test the model. In this example, use the following features and labels:
1. Loop through the DataFrame of historical prices and collect the features.
PY
data = history
lookback = 5
lookback_series = []
for i in range(1, lookback + 1):
df = data['close'].diff(i)[lookback:-1]
df.name = f"close-{i}"
lookback_series.append(df)
X = pd.concat(lookback_series, axis=1).reset_index(drop=True).dropna()
X
2. Select the close column and then call the shift method to collect the labels.
PY
Y = data['close'].diff(-1)
3. Drop the first 5 samples and then call the reset_index method.
PY
Y = Y[lookback:-1].reset_index(drop=True)
For example, to use the last third of data to test the model, run:
PY
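The split described above was left without code; a minimal sketch, assuming a chronological split so the test set stays out-of-sample (the toy frames stand in for the `X` and `Y` built earlier):

```python
import numpy as np
import pandas as pd

# Toy feature/label frames standing in for the X and Y built above.
X = pd.DataFrame(np.random.default_rng(0).normal(size=(90, 5)))
Y = pd.Series(np.random.default_rng(1).normal(size=90))

# Keep the first two thirds for training and the last third for testing,
# preserving chronological order.
splitter = int(len(X) * 2 / 3)
X_train, X_test = X.iloc[:splitter].values, X.iloc[splitter:].values
y_train, y_test = Y.iloc[:splitter].values, Y.iloc[splitter:].values
```

The `train_test_split` helper imported earlier can do the same with `test_size=1/3, shuffle=False`.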
Train Models
You need to prepare the historical data for training before you train the model. If you have prepared the data, build
and train the model. In this example, build a neural network model that predicts the future price of the SPY.
1. Set the number of layers, their numbers of nodes, the number of epochs, and the learning rate.
PY
num_factors = X_test.shape[1]
num_neurons_1 = 10
num_neurons_2 = 10
num_neurons_3 = 5
epochs = 20
learning_rate = 0.0001
2. Create hidden layers with the set number of layers and their corresponding numbers of nodes.
In this example, we construct the model with the built-in Keras API, using the ReLU activation for non-linearity.
PY
model = tf.keras.Sequential([
    tf.keras.layers.Dense(num_neurons_1, activation=tf.nn.relu, input_shape=(num_factors,)),  # input shape required
    tf.keras.layers.Dense(num_neurons_2, activation=tf.nn.relu),
    tf.keras.layers.Dense(num_neurons_3, activation=tf.nn.relu),
    tf.keras.layers.Dense(1)
])
3. Select an optimizer.
We're using the Adam optimizer in this example. You may also consider others, like SGD.
PY
optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)
In the context of numerical regression, we use MSE as our objective function. If you're doing classification, consider a cross-entropy loss instead.
PY
loss_mse = tf.keras.losses.MeanSquaredError()
Iteratively train the model for the set number of epochs. The model trains adaptively using the gradients provided by tf.GradientTape.
PY
for i in range(epochs):
    with tf.GradientTape() as t:
        loss = loss_mse(y_train, model(X_train))
    gradients = t.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
Test Models
To test the model, we'll set up a method to plot test-set predictions on top of the SPY price.
PY
plt.figure(figsize=(16, 6))
plt.plot(actual, label="Actual")
plt.plot(prediction, label="Prediction")
plt.title(title)
plt.xlabel("Time step")
plt.ylabel("SPY Price")
plt.legend()
plt.show()
Store Models
You can save and load TensorFlow models using the Object Store.
Save Models
1. Set the key name of the model to be stored in the Object Store.
PY
model_key = "model.keras"
PY
file_name = qb.object_store.get_file_path(model_key)
This method returns the file path where the model will be stored.
3. Call the save method with the model and file path.
PY
model.save(file_name)
PY
qb.object_store.save(model_key)
Load Models
You must save a model into the Object Store before you can load it from the Object Store. If you saved a model,
PY
file_name = qb.object_store.get_file_path(model_key)
PY
model = tf.keras.models.load_model(file_name)
Machine Learning > Popular Libraries > Tslearn
Popular Libraries
Tslearn
Introduction
This page explains how to build, train, test, and store tslearn models.
Import Libraries
PY
Get some historical market data to train and test the model. For example, get data for the securities shown in the
following table:
PY
qb = QuantBook()
tickers = ["SPY", "QQQ", "DIA",
"AAPL", "MSFT", "TSLA",
"IEF", "TLT", "SHV", "SHY",
"GLD", "IAU", "SLV",
"USO", "XLE", "XOM"]
symbols = [qb.add_equity(ticker, Resolution.DAILY).symbol for ticker in tickers]
history = qb.history(symbols, datetime(2020, 1, 1), datetime(2022, 2, 20))
Prepare Data
You need some historical data to prepare the data for the model. If you have historical data, manipulate it to train
and test the model. In this example, standardize the log close price time-series of the securities. Follow these
PY
close = history.unstack(0).close
PY
log_close = np.log(close)
PY
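The standardization step above was left without code; a sketch assuming column-wise z-scoring of the log prices (the synthetic frame stands in for the `log_close` built earlier):

```python
import numpy as np
import pandas as pd

# Synthetic log close prices for three of the tickers requested above.
rng = np.random.default_rng(0)
log_close = pd.DataFrame(np.log(rng.uniform(50, 150, size=(100, 3))),
                         columns=["SPY", "QQQ", "DIA"])

# Standardize each security's series to zero mean and unit variance, column by column.
standard_close = (log_close - log_close.mean()) / log_close.std()
```

`standard_close` is the frame the later clustering and plotting steps operate on.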
Train Models
Instead of using real-time comparison, we can apply a technique called Dynamic Time Warping (DTW) with Barycenter Averaging (DBA). Intuitively, it averages a few time-series into a single one without losing much of their information. Since not all time-series move in lockstep, as an idealized EMH assumption would suggest, this allows similarity analysis of time-series with sticky lags. Check the technical details on the tslearn documentation page.
We can then separate the series into clusters with k-means after DBA.
PY
Test Models
PY
labels = km.predict(standard_close.T)
PY
def plot_helper(ts):
# plot all points of the data set
for i in range(ts.shape[0]):
plt.plot(ts[i, :], "k-", alpha=.2)
PY
j = 1
plt.figure(figsize=(15, 10))
for i in set(labels):
# Select the series in the i-th cluster.
X = standard_close.iloc[:, [n for n, k in enumerate(labels) if k == i]].values
j += 1
plt.show()
4. Display the groupings.
PY
for i in set(labels):
print(f"Cluster {i+1}: {standard_close.columns[[n for n, k in enumerate(labels) if k == i]]}")
Store Models
You can save and load tslearn models using the Object Store.
Save Models
1. Set the key name of the model to be stored in the Object Store.
PY
model_key = "model"
file_name = qb.object_store.get_file_path(model_key)
This method returns the file path where the model will be stored.
3. Delete the current file to avoid a FileExistsError error when you save the model.
PY
import os
os.remove(file_name)
PY
km.to_hdf5(file_name + ".hdf5")
Load Models
You must save a model into the Object Store before you can load it from the Object Store. If you saved a model,
PY
qb.object_store.contains_key(model_key)
This method returns a boolean that represents if the model_key is in the Object Store. If the Object Store does
not contain the model_key , save the model using the model_key before you proceed.
PY
file_name = qb.object_store.get_file_path(model_key)
PY
Reference
F. Petitjean, A. Ketterlin, P. Gançarski. (2011). A global averaging method for dynamic time warping, with applications to clustering. Pattern Recognition, 44(3), 678-693.
Popular Libraries
XGBoost
Introduction
This page explains how to build, train, test, and store XGBoost models.
Import Libraries
PY
You need the sklearn library to prepare the data and the joblib library to save models.
Get some historical market data to train and test the model. For example, to get data for the SPY ETF during 2020
PY
qb = QuantBook()
symbol = qb.add_equity("SPY", Resolution.DAILY).symbol
history = qb.history(symbol, datetime(2020, 1, 1), datetime(2022, 1, 1)).loc[symbol]
Prepare Data
You need some historical data to prepare the data for the model. If you have historical data, manipulate it to train
and test the model. In this example, use the following features and labels:
The following image shows the time difference between the features and labels:
Follow these steps to prepare the data:
PY
Fractional differencing helps make the data stationary yet retains the variance information.
2. Loop through the df DataFrame and collect the features and labels.
PY
n_steps = 5
features = []
labels = []
for i in range(len(df)-n_steps):
features.append(df.iloc[i:i+n_steps].values)
labels.append(df.iloc[i+n_steps])
PY
features = np.array(features)
labels = np.array(labels)
PY
Train Models
We're about to train a gradient-boosted random forest for future price prediction.
1. Split the data for training and testing to evaluate our model.
PY
PY
PY
params = {
'booster': 'gbtree',
'colsample_bynode': 0.8,
'learning_rate': 0.1,
'lambda': 0.1,
'max_depth': 5,
'num_parallel_tree': 100,
'objective': 'reg:squarederror',
'subsample': 0.8,
}
model = xgb.train(params, dtrain, num_boost_round=10)
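Step 1's split above was left without code; a minimal sketch, assuming a chronological 70/30 split (the synthetic arrays are stand-ins for the features and labels built in Prepare Data):

```python
import numpy as np

# Synthetic arrays standing in for the features and labels built above.
rng = np.random.default_rng(0)
features = rng.normal(size=(120, 5))
labels = rng.normal(size=120)

# Chronological 70/30 split so the test set stays out-of-sample.
split = int(len(features) * 0.7)
X_train, y_train = features[:split], labels[:split]
X_test, y_test = features[split:], labels[split:]
```

The arrays are then wrapped as `xgb.DMatrix(X_train, label=y_train)` and `xgb.DMatrix(X_test, label=y_test)` to produce the `dtrain` and `dtest` objects used by `xgb.train` and `model.predict` above.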
Test Models
We then make predictions on the testing data set and plot the predicted values against the expected values to see if the model has predictive power.
PY
PY
y_predict = model.predict(dtest)
Store Models
We dump the model using the joblib module and save it to the Object Store file path. This way, the model doesn't need to be retrained the next time we load it.
1. Set the key name of the model to be stored in the Object Store.
PY
model_key = "model"
2. Call get_file_path with the key's name to get the file path.
PY
file_name = qb.object_store.get_file_path(model_key)
3. Call dump with the model and file path to save the model to the file path.
PY
joblib.dump(model, file_name)
Load Models
Let's retrieve the model from the Object Store file path and load it with joblib .
PY
qb.object_store.contains_key(model_key)
This method returns a boolean that represents if the model_key is in the Object Store. If the Object Store does
not contain the model_key , save the model using the model_key before you proceed.
2. Call get_file_path with the key's name to get the file path.
PY
file_name = qb.object_store.get_file_path(model_key)
3. Call load with the file path to fetch the saved model.
PY
loaded_model = joblib.load(file_name)
To ensure loading the model was successful, let's test it.
PY
y_pred = loaded_model.predict(dtest)
df = pd.DataFrame({'Real': y_test.flatten(), 'Predicted': y_pred.flatten()})
df.plot(title='Model Performance: predicted vs actual closing price', figsize=(15, 10))
Machine Learning > Hugging Face
Machine Learning
Hugging Face
Key Concepts
Introduction
Debugging
Introduction
The debugger is a built-in tool to help you debug coding errors while in the Research Environment. The debugger
enables you to slow down the code execution, step through the program line-by-line, and inspect the variables to understand the internal state of the notebook.
Breakpoints
Breakpoints are lines in your notebook where execution pauses. You need at least one breakpoint in your
notebook to start the debugger. Open a project to start adjusting its breakpoints.
Add Breakpoints
2. In the Run and Debug panel, hover over the Breakpoints section and then click the Toggle Active
Breakpoints icon.
Remove Breakpoints
1. In the right navigation menu, click the Run and Debug icon.
2. In the Run and Debug panel, hover over the Breakpoints section and then click the Remove All
Breakpoints icon.
Launch Debugger
4. In the top-left corner of the cell, click the drop-down arrow and then click Debug Cell .
If the Run and Debug panel is not open, it opens when the first breakpoint is hit.
Control Debugger
After you launch the debugger, you can use the following buttons to control it:
Button Name Description Default Keyboard Shortcut
Inspect Variables
After you launch the debugger, you can inspect the state of your notebook as it executes each line of code. You
can inspect local variables or custom expressions. The values of variables in your notebook are formatted in the
IDE to improve readability. For example, if you inspect a variable that references a DataFrame, the debugger
represents the variable value as the following:
Local Variables
The Variables section of the Run and Debug panel shows the local variables at the current breakpoint. If a variable
in the panel is an object, click it to see its members. The panel updates as the notebook runs.
1. In the Run and Debug panel, right-click a variable and then click Set Value .
Custom Expressions
The Watch section of the Run and Debug panel shows any custom expressions you add. Follow these steps to add a custom expression:
1. Hover over the Watch section and then click the plus icon that appears.
Meta Analysis
Key Concepts
Introduction
Understanding your strategy's trades in detail is key to attributing performance and determining areas to focus on for improvement. This analysis can be done with the QuantConnect API, which enables you to load backtest, optimization, and live trading results into the Research Environment.
Backtest Analysis
Load your backtest results into the Research Environment to analyze trades and easily compare them against the
raw backtesting data. For more information on loading and manipulating backtest results, see Backtest Analysis .
Optimization Analysis
Load your optimization results into the Research Environment to analyze how different combinations of parameters
affect the algorithm's performance. For more information on loading and manipulating optimizations results, see
Optimization Analysis .
Live Analysis
Load your live trading results into the Research Environment to compare live trading performance against
simulated backtest results, or analyze your trades to improve your slippage and fee models. For more information on loading and manipulating live trading results, see Live Analysis .
Meta Analysis
Backtest Analysis
Introduction
Load your backtest results into the Research Environment to analyze trades and easily compare them against the
raw backtesting data. Compare backtests from different projects to find uncorrelated strategies to combine for
better performance.
Loading your backtest trades allows you to plot fills against detailed data, or locate the source of profits. Similarly
you can search for periods of high churn to reduce turnover and trading fees.
To get the results of a backtest, call the read_backtest method with the project Id and backtest ID.
PY
The following table provides links to documentation that explains how to get the project Id and backtest Id, depending on the platform you use.
Note that this method returns a snapshot of the backtest at the current moment. If the backtest is still executing, the results may change.
The read_backtest method returns a Backtest object, which has the following attributes:
PY
The read_backtest_orders method returns a list of Order objects, which have the following properties:
2. Organize the trade times and prices for each security into a dictionary.
PY
class OrderData:
def __init__(self):
self.buy_fill_times = []
self.buy_fill_prices = []
self.sell_fill_times = []
self.sell_fill_prices = []
order_data_by_symbol = {}
for order in orders:
    if order.symbol not in order_data_by_symbol:
        order_data_by_symbol[order.symbol] = OrderData()
    order_data = order_data_by_symbol[order.symbol]
    is_buy = order.quantity > 0
    (order_data.buy_fill_times if is_buy else order_data.sell_fill_times).append(order.last_fill_time.date())
    (order_data.buy_fill_prices if is_buy else order_data.sell_fill_prices).append(order.price)
PY
qb = QuantBook()
start_date = datetime.max.date()
end_date = datetime.min.date()
for order_data in order_data_by_symbol.values():
    fill_times = order_data.buy_fill_times + order_data.sell_fill_times
    if fill_times:
        start_date = min(start_date, min(fill_times))
        end_date = max(end_date, max(fill_times))
start_date -= timedelta(days=3)
all_history = qb.history(list(order_data_by_symbol.keys()), start_date, end_date, Resolution.DAILY)
4. Create a candlestick plot for each security and annotate each plot with buy and sell markers.
PY
import plotly.express as px
import plotly.graph_objects as go
candlestick = go.Candlestick(x=history.index,
open=history['open'],
high=history['high'],
low=history['low'],
close=history['close'],
name='Price')
layout = go.Layout(title=go.layout.Title(text=f'{symbol.value} Trades'),
xaxis_title='Date',
yaxis_title='Price',
xaxis_rangeslider_visible=False,
height=600)
# Plot buys
fig.add_trace(go.Scatter(
x=order_data.buy_fill_times,
y=order_data.buy_fill_prices,
marker=go.scatter.Marker(color='aqua', symbol='triangle-up', size=10),
mode='markers',
name='Buys',
))
# Plot sells
fig.add_trace(go.Scatter(
x=order_data.sell_fill_times,
y=order_data.sell_fill_prices,
name='Sells',
))
fig.show()
Note: The preceding plots only show the last fill of each trade. If your trade has partial fills, the plots only display the last fill.
Plot Metadata
Follow these steps to plot the equity curve, benchmark, and drawdown of a backtest:
PY
The following table provides links to documentation that explains how to get the project Id and backtest Id, depending on the platform you use.
PY
3. Get the "Equity", "Equity Drawdown", and "Benchmark" Series from the preceding charts.
PY
equity = equity_chart.series["Equity"].values
drawdown = drawdown_chart.series["Equity Drawdown"].values
benchmark = benchmark_chart.series["Benchmark"].values
PY
df = pd.DataFrame({
"Equity": pd.Series({value.TIME: value.CLOSE for value in equity}),
"Drawdown": pd.Series({value.TIME: value.Y for value in drawdown}),
"Benchmark": pd.Series({value.TIME: value.Y for value in benchmark})
}).ffill()
PY
# Plot the benchmark on the same plot, scale by using another y-axis
ax2 = ax[0].twinx()
ax2.plot(df.index, df["Benchmark"], color="grey")
ax2.set_ylabel("Benchmark Price ($)", color="grey")
The following table shows all the chart series you can plot:
Chart Series Description
Benchmark Benchmark Time series of the benchmark closing price (SPY, by default)
Plot Insights
PY
The following table provides links to documentation that explains how to get the project Id and backtest Id,
depending on the platform you use:
The read_backtest_insights method returns an InsightResponse object, which has the following properties:
import pytz
def _eastern_time(unix_timestamp):
return unix_timestamp.replace(tzinfo=pytz.utc)\
.astimezone(pytz.timezone('US/Eastern')).replace(tzinfo=None)
insight_df = pd.DataFrame(
[
{
'Symbol': i.symbol,
'Direction': i.direction,
'Generated Time': _eastern_time(i.generated_time_utc),
'Close Time': _eastern_time(i.close_time_utc),
'Weight': i.weight
}
for i in insight_response.insights
]
)
PY
symbols = list(insight_df['Symbol'].unique())
qb = QuantBook()
history = qb.history(
symbols, insight_df['Generated Time'].min()-timedelta(1),
insight_df['Close Time'].max(), Resolution.DAILY
)['close'].unstack(0)
PY
Meta Analysis
Optimization Analysis
Introduction
Load your optimization results into the Research Environment to analyze how different combinations of parameters
affect the algorithm's performance.
To get the results of an optimization, call the read_optimization method with the optimization Id.
PY
optimization = api.read_optimization(optimization_id)
The following table provides links to documentation that explains how to get the optimization Id, depending on the
Platform Optimization Id
CLI
The read_optimization method returns an Optimization object, which has the following attributes:
Meta Analysis > Live Analysis
Meta Analysis
Live Analysis
Introduction
Load your live trading results into the Research Environment to compare live trading performance against
To get the results of a live algorithm, call the read_live_algorithm method with the project Id and deployment ID.
PY
The following table provides links to documentation that explains how to get the project Id and deployment Id, depending on the platform you use.
The read_live_algorithm method returns a LiveAlgorithmResults object, which has the following attributes:
Reconciliation
Reconciliation is a way to quantify the difference between an algorithm's live performance and its out-of-sample (OOS) backtest performance.
Seeing the difference between live performance and OOS performance gives you a way to determine if the algorithm is making unrealistic assumptions, exploiting data differences, or merely exhibiting behavior that is impractical in live trading.
A perfectly reconciled algorithm has an exact overlap between its live equity and OOS backtest curves. Any deviation means that the performance of the algorithm has differed for some reason. Several factors can contribute to this deviation.
Dynamic Time Warp (DTW) Distance quantifies the difference between two time-series. It is an algorithm that
measures the shortest path between the points of two time-series. It uses Euclidean distance as a measurement of
point-to-point distance and returns an overall measurement of the distance on the scale of the initial time-series
values. We apply DTW to the returns curves of the live and OOS performance, so the DTW distance is measured on the scale of the daily returns:

DTW(X, Y) = min_{π ∈ P_{N×M}} Σ_{l=1}^{L} (x_{m_l} − y_{n_l})²
For the reasons outlined in our research notebook on the topic (linked below), QuantConnect annualizes the daily
DTW. An annualized distance provides a user with a measurement of the annual difference in the magnitude of
returns between the two curves. A perfect score is 0, meaning the returns for each day were precisely the same. A
DTW score of 0 is nearly impossible to achieve, and we consider anything below 0.2 to be a decent score. A
distance of 0.2 means the returns between an algorithm's live and OOS performance deviated by 20% over a year.
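The DTW distance described above can be computed with a standard dynamic-programming recurrence; a minimal sketch using the squared Euclidean point distance from the formula (without the annualization QuantConnect applies):

```python
import numpy as np

def dtw_distance(x, y):
    """Dynamic Time Warping distance between two 1-D series."""
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (x[i - 1] - y[j - 1]) ** 2  # squared point-to-point distance
            # Extend the cheapest of the three admissible warping steps.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]
```

Identical series score 0; the larger the score, the greater the difference in the magnitude of the two return curves.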
Returns correlation is the simple Pearson correlation between the live and OOS returns. Correlation gives us a rudimentary understanding of how the returns move together. Do they trend up and down at the same time? Do they move in opposite directions?
ρ_XY = cov(X, Y) / (σ_X σ_Y)
An algorithm's returns correlation should be as close to 1 as possible. We consider a good score to be 0.8 or
above, meaning that there is a strong positive correlation. This indicates that the returns move together most of
the time and that for any given return you see from one of the curves, the other curve usually has a similar
direction return (positive or negative).
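The returns correlation above is just the Pearson coefficient; a sketch with numpy on synthetic return series (the noise level is an arbitrary illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
live_returns = rng.normal(0, 0.01, 252)                  # one year of daily "live" returns
oos_returns = live_returns + rng.normal(0, 0.002, 252)   # OOS as a noisy copy of live

# Pearson correlation between the two return series.
rho = np.corrcoef(live_returns, oos_returns)[0, 1]
```

Because the OOS series is a lightly perturbed copy, `rho` lands close to 1, comfortably above the 0.8 threshold discussed above.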
Each measurement provides insight into distinct elements of time-series similarity, but neither measurement alone
gives us the whole picture. Returns correlation tells us whether or not the live and OOS returns move together, but
it doesn't account for the possible differences in the magnitude of the returns. DTW distance measures the
difference in magnitude of returns but provides no insight into whether or not the returns move in the same
direction. It is possible for there to be two cases of equity curve similarity where both pairs have the same DTW
distance, but one has perfectly negatively correlated returns, and the other has a perfectly positive correlation.
Similarly, it is possible for two pairs of equity curves to each have perfect correlation but substantially different
DTW distance. Having both measurements provides us with a more comprehensive understanding of the actual
similarity between live and OOS performance. We outline several interesting cases and go into more depth on the topic in our research notebook.
Follow these steps to plot the daily order fills of a live algorithm:
PY
orders = api.read_live_orders(project_id)
The following table provides links to documentation that explains how to get the project Id, depending on the
Platform Project Id
By default, the method returns the orders with an ID between 0 and 100. To get orders with an ID greater than 100, pass start
and end arguments to the read_live_orders method. Note that end - start must be less than 100.
PY
The read_live_orders method returns a list of Order objects, which have the following properties:
2. Organize the trade times and prices for each security into a dictionary.
PY
class OrderData:
def __init__(self):
self.buy_fill_times = []
self.buy_fill_prices = []
self.sell_fill_times = []
self.sell_fill_prices = []
order_data_by_symbol = {}
for order in orders:
    if order.symbol not in order_data_by_symbol:
        order_data_by_symbol[order.symbol] = OrderData()
    order_data = order_data_by_symbol[order.symbol]
    is_buy = order.quantity > 0
    (order_data.buy_fill_times if is_buy else order_data.sell_fill_times).append(order.last_fill_time.date())
    (order_data.buy_fill_prices if is_buy else order_data.sell_fill_prices).append(order.price)
PY
qb = QuantBook()
start_date = datetime.max.date()
end_date = datetime.min.date()
for order_data in order_data_by_symbol.values():
    fill_times = order_data.buy_fill_times + order_data.sell_fill_times
    if fill_times:
        start_date = min(start_date, min(fill_times))
        end_date = max(end_date, max(fill_times))
start_date -= timedelta(days=3)
all_history = qb.history(list(order_data_by_symbol.keys()), start_date, end_date, Resolution.DAILY)
4. Create a candlestick plot for each security and annotate each plot with buy and sell markers.
PY
import plotly.express as px
import plotly.graph_objects as go
candlestick = go.Candlestick(x=history.index,
open=history['open'],
high=history['high'],
low=history['low'],
close=history['close'],
name='Price')
yaxis_title='Price',
xaxis_rangeslider_visible=False,
height=600)
# Plot buys
fig.add_trace(go.Scatter(
x=order_data.buy_fill_times,
y=order_data.buy_fill_prices,
name='Buys',
))
# Plot sells
fig.add_trace(go.Scatter(
x=order_data.sell_fill_times,
y=order_data.sell_fill_prices,
name='Sells',
))
fig.show()
Note: The preceding plots only show the last fill of each trade. If your trade has partial fills, the plots only display the last fill.
Plot Metadata
Follow these steps to plot the equity curve, benchmark, and drawdown of a live algorithm:
PY
The following table provides links to documentation that explains how to get the project Id and deployment Id, depending on the platform you use.
PY
results = live_algorithm.live_results.results
PY
4. Get the "Equity", "Equity Drawdown", and "Benchmark" Series from the preceding charts.
PY
equity = equity_chart.series["Equity"].values
drawdown = drawdown_chart.series["Equity Drawdown"].values
benchmark = benchmark_chart.series["Benchmark"].values
PY
df = pd.DataFrame({
"Equity": pd.Series({value.TIME: value.CLOSE for value in equity}),
"Drawdown": pd.Series({value.TIME: value.Y for value in drawdown}),
"Benchmark": pd.Series({value.TIME: value.Y for value in benchmark})
}).ffill()
# Plot the benchmark on the same plot, scale by using another y-axis
ax2 = ax[0].twinx()
ax2.plot(df.index, df["Benchmark"], color="grey")
ax2.set_ylabel("Benchmark Price ($)", color="grey")
The following table shows all the chart series you can plot:
Chart Series Description
Benchmark Benchmark Time series of the benchmark closing price (SPY, by default)
Meta Analysis
Live Deployment Automation
Introduction
This page explains how to use the QuantConnect API in an interactive notebook to deploy and stop a set of live trading
algorithms in QC Cloud.
To automate live deployments for multiple projects, save the projects under a single directory in QuantConnect
Cloud. This tutorial assumes you save all the projects under a /Live directory.
Follow the below steps to get the project Ids of all projects under the /Live directory:
PY
list_project_response = api.list_projects()
PY
project_ids = [
project.project_id for project in list_project_response.projects
if project.name.split("/")[0] == "Live"
]
Follow these steps to programmatically deploy the preceding projects with the QuantConnect API:
1. Compile all the projects and cache the compilation Ids with a dictionary.
PY
compile_id_by_project_id = {}
for project_id in project_ids:
compile_response = api.create_compile(project_id)
if not compile_response.success:
print(f"Errors compiling project {project_id}: \n{compile_response.errors}")
else:
compile_id_by_project_id[project_id] = compile_response.compile_id
2. Get the Ids of all the live nodes that are available and sort them by their speed.
PY
live_nodes = []
node_response = api.read_project_nodes(project_ids[0])
if not node_response.success:
print(f"Error getting nodes: \n{node_response.errors}")
else:
nodes = sorted(
[node for node in node_response.nodes.live_nodes if not node.busy],
key=lambda node: node.speed,
reverse=True
)
node_ids = [node.id for node in nodes]
Check the length of node_ids is greater than 0 to ensure there are live nodes available.
PY
base_live_algorithm_settings = {
"id": "QuantConnectBrokerage",
"user": "",
"password": "",
"environment": "paper",
"account": ""
}
version_id = "-1" # Master branch
4. Deploy the projects and cache the project Ids of the successful deployments.
PY
deployed_ids = []
for project_id, compile_id in compile_id_by_project_id.items():
    # Deploy each compiled project on the fastest available node
    live_response = api.create_live_algorithm(project_id, compile_id, node_ids[0],
                                              base_live_algorithm_settings, version_id)
    if not live_response.success:
        print(f"Errors deploying project {project_id}: \n{live_response.errors}")
    else:
        print(f"Deployed {project_id}")
        deployed_ids.append(project_id)
To stop multiple live algorithms from an interactive notebook through the QuantConnect API, call the stop_live_algorithm method with each of the deployed project Ids.
Applying Research
Applying Research
Key Concepts
Introduction
The ultimate goal of research is to produce a strategy that you can backtest and eventually trade live. Once you've
developed a hypothesis that you're confident in, you can start working towards exporting your research into
backtesting. To export the code, you need to replace QuantBook() with self and replace the QuantBook methods with their QCAlgorithm counterparts.
Workflow
Imagine that you've developed the following hypothesis: stocks that are below 1 standard deviation of their 30-day
mean are due to revert and increase in value. The following Research Environment code picks out such stocks
from a preselected basket of stocks:
PY
import numpy as np
qb = QuantBook()
symbols = {}
assets = ["SHY", "TLT", "SHV", "TLH", "EDV", "BIL",
"SPTL", "TBT", "TMF", "TMV", "TBF", "VGSH", "VGIT",
"VGLT", "SCHO", "SCHR", "SPTS", "GOVT"]
for asset in assets:
    symbols[asset] = qb.add_equity(asset, Resolution.MINUTE).symbol
# Calculate the truth value of the most recent price being less than 1 std away from the mean
classifier = df.le(df.mean().subtract(df.std())).tail(1)
# Column labels where the classifier is True
classifier_indexes = classifier.columns[classifier.iloc[0]]
# Get the std values for the True values (used for magnitude)
magnitude = df.std().transpose()[classifier_indexes].values
Once you are confident in your hypothesis, you can export this code into the backtesting environment. The
algorithm will ultimately go long on the stocks that pass the classifier logic. One way to accommodate this model
into a backtest is to create a Scheduled Event that uses the model to pick stocks and place orders.
PY
self.set_portfolio_construction(EqualWeightingPortfolioConstructionModel())
self.set_execution(ImmediateExecutionModel())
self.symbols = {}
Now that the initialize method of the algorithm is set, export the model into the Scheduled Event method. You
just need to replace qb with self and replace QuantBook methods with their QCAlgorithm counterparts. In this
example, you don't need to change any methods because the model only uses methods that also exist in QCAlgorithm.
PY
def every_day_after_market_open(self):
qb = self
# Fetch history on our universe
df = qb.history(qb.securities.keys(), 5, Resolution.DAILY)
# Calculate the truth value of the most recent price being less than 1 std away from the mean
classifier = df.le(df.mean().subtract(df.std())).tail(1)
# Column labels where the classifier is True
classifier_indexes = classifier.columns[classifier.iloc[0]]
# Get the std values for the True values (used for magnitude)
magnitude = df.std().transpose()[classifier_indexes].values
# ==============================
insights = []
self.emit_insights(insights)
With the Research Environment model now in the backtesting environment, you can further analyze its
performance with its backtesting metrics . If you are confident in the backtest, you can eventually live trade this
strategy.
To view full examples of this Research to Production workflow, see the examples in the menu.
Contribute Tutorials
If you contribute Research to Production tutorials, you'll get the following benefits:
A QCC reward
You'll learn the Research to Production methodology to improve your own strategy research and
development
To view the topics the community wants Research to Production tutorials for, see the issues with the WishList tag
in the Research GitHub repository. If you find a topic you want to create a tutorial for, make a pull request to the repository.
Applying Research
Mean Reversion
Introduction
This page explains how you can use the Research Environment to develop and test a Mean Reversion
hypothesis, then put the hypothesis in production.
Create Hypothesis
Imagine that we've developed the following hypothesis: stocks that are below 1 standard deviation of their 30-day mean are due to revert and increase in value, with around an 85% chance statistically, if we assume the return series is stationary and the price series is a random process. We've developed the following code in research to pick out such stocks from a preselected basket of stocks.
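The "around 85%" figure can be sanity-checked with the normal CDF: under the stationarity and normality assumptions, a price more than 1 standard deviation below its mean lies below roughly the 16th percentile, so an upward reversion has probability norm.cdf(1). A quick check (not part of the original notebook):

```python
from scipy.stats import norm

# Probability that a draw lies above one standard deviation below the mean,
# i.e. the chance the price reverts upward under the normality assumption
p_revert = norm.cdf(1)
print(round(p_revert, 4))   # 0.8413
```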
Import Libraries
We'll need to import libraries to help with data processing. Import the numpy and scipy libraries:
PY
import numpy as np
from scipy.stats import norm, zscore
1. Instantiate a QuantBook .
PY
qb = QuantBook()
PY
3. Call the add_equity method with the tickers, and their corresponding resolution.
PY
for asset in assets:
    qb.add_equity(asset, Resolution.MINUTE)
If you do not pass a resolution argument, Resolution.MINUTE is used by default.
4. Call the history method with qb.securities.keys for all tickers, time argument(s), and resolution to request historical data for the symbols.
PY
Prepare Data
We'll have to process our data to get a measure of how far each ticker's price has deviated from its norm.
1. Select the close column and then call the unstack method.
PY
df = history['close'].unstack(level=0)
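The history DataFrame is indexed by (symbol, time), and unstack(level=0) pivots the symbol level into columns. A minimal sketch with synthetic data (the tickers and prices are made up):

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["SHY", "TLT"], pd.date_range("2021-01-04", periods=3)],
    names=["symbol", "time"])
close = pd.Series([100.0, 100.5, 100.2, 150.0, 151.0, 149.5], index=index, name="close")

# Pivot the symbol level into columns: one column per ticker, one row per timestamp
df = close.unstack(level=0)
print(df.shape)   # (3, 2)
```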
2. Calculate the truth value of the most recent price being less than 1 standard deviation away from the mean
price.
PY
3. Get the z-score for the True values, then compute the expected return and probability (used for Insight magnitude and confidence).
PY
z_score = df.apply(zscore)[classifier]
magnitude = -z_score * df.rolling(30).std() / df.shift(1)
confidence = (-z_score).apply(norm.cdf)
4. Call fillna to fill NaNs with 0.
PY
magnitude.fillna(0, inplace=True)
confidence.fillna(0, inplace=True)
5. Get our trading weights. We take a long-only portfolio, normalized so the total weight equals 1.
PY
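The weighting code itself is elided above, so here is a minimal sketch of one plausible long-only scheme (the signal values are hypothetical): drop non-positive signals and normalize the rest to sum to 1.

```python
import pandas as pd

# Hypothetical per-asset signals (magnitude * confidence would come from the prior steps)
signal = pd.Series({"SHY": 0.02, "TLT": 0.05, "SHV": 0.0, "TLH": 0.03})

# Long-only: zero out non-positive signals, then normalize so the weights sum to 1
weight = signal.clip(lower=0)
weight = weight / weight.sum()
print(weight.sum())   # 1.0
```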
Test Hypothesis
We test the performance of this strategy. To do so, we make use of the calculated weights for portfolio optimization.
PY
PY
PY
total_ret.index = weight.index
PY
Once we are confident in our hypothesis, we can export this code into backtesting. One way to accommodate this model in a backtest is to create a Scheduled Event that uses our model to pick stocks and goes long.
PY
self.set_portfolio_construction(InsightWeightingPortfolioConstructionModel())
self.set_execution(ImmediateExecutionModel())
Now we export our model into the Scheduled Event method. We replace qb with self and replace methods with their QCAlgorithm counterparts as needed. In this example, this is not an issue because all the methods we used in the Research Environment also exist in QCAlgorithm.
PY
# Calculate the truth value of the most recent price being less than 1 std away from the mean
classifier = df.le(df.mean().subtract(df.std())).iloc[-1]
if not classifier.any(): return
# Get the z-score for the True values, then compute the expected return and probability
z_score = df.apply(zscore)[[classifier.index[i] for i in range(classifier.size) if classifier.iloc[i]]]
# ==============================
insights = []
self.emit_insights(insights)
Applying Research
Random Forest Regression
Introduction
This page explains how you can use the Research Environment to develop and test a Random Forest Regression hypothesis, then put the hypothesis in production.
Create Hypothesis
We've assumed the price data is a time series with some autoregressive property (i.e., its expectation is related to past price information). Therefore, by using past information, we could predict the next price level. One way to do so is Random Forest Regression, a supervised machine learning algorithm in which an ensemble of decision trees is fitted to the data.
Import Libraries
We'll need to import libraries to help with data processing and machine learning. Import the sklearn, numpy and
matplotlib libraries:
PY
1. Instantiate a QuantBook .
PY
qb = QuantBook()
PY
symbols = {}
assets = ["SHY", "TLT", "SHV", "TLH", "EDV", "BIL",
"SPTL", "TBT", "TMF", "TMV", "TBF", "VGSH", "VGIT",
"VGLT", "SCHO", "SCHR", "SPTS", "GOVT"]
3. Call the add_equity method with the tickers, and their corresponding resolution. Then store their Symbol s.
PY
for asset in assets:
    symbols[asset] = qb.add_equity(asset, Resolution.MINUTE).symbol
4. Call the history method with qb.securities.keys for all tickers, time argument(s), and resolution to request historical data for the symbols.
PY
Prepare Data
We'll have to process our data and build the ML model before testing the hypothesis. Our methodology is
to use the fractionally differenced close price as the input data in order to (1) provide stationarity and (2) retain a
sufficient amount of the variance of the previous price information. We assume d=0.5 is the right balance to do so.
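Fractional differencing generalizes the ordinary first difference: for order d, the lag-k weight follows the recursion w_k = -w_{k-1} * (d - k + 1) / k with w_0 = 1. A numpy sketch of the weights (the window size of 5 is an arbitrary illustration; at d = 1 the weights collapse to the ordinary first difference):

```python
import numpy as np

def frac_diff_weights(d, size):
    """Weights w_k = -w_{k-1} * (d - k + 1) / k for fractional differencing of order d."""
    w = [1.0]
    for k in range(1, size):
        w.append(-w[-1] * (d - k + 1) / k)
    return np.array(w)

w = frac_diff_weights(0.5, 5)    # d = 0.5, as assumed in the text
w1 = frac_diff_weights(1.0, 5)   # d = 1 recovers weights [1, -1, 0, 0, 0]
print(np.round(w, 4))
```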
1. Select the close column and then call the unstack method.
PY
df = history['close'].unstack(level=0)
PY
output = df.shift(-1).iloc[:-1]
PY
PY
PY
regressor.fit(X_train, y_train)
Test Hypothesis
We test the performance of this ML model to see if it can accurately predict the 1-step-forward price. To do so,
we compare the predicted and actual prices.
PY
predictions = regressor.predict(X_test)
PY
PY
y_test[col].plot(label="Actual")
predictions[col].plot(label="Prediction")
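Since most of the pipeline above is elided, here is a self-contained sketch of the same idea on a synthetic series (all names and parameters are illustrative): build lagged-price features, fit a RandomForestRegressor, and predict one step ahead.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
# Synthetic autoregressive path standing in for the fractionally differenced series
prices = 100 + np.cumsum(rng.normal(0, 1, 300))

# Features: the previous 5 values; label: the next value
lags = 5
X = np.array([prices[i - lags:i] for i in range(lags, len(prices))])
y = prices[lags:]

split = int(len(X) * 0.8)
X_train, X_test, y_train, y_test = X[:split], X[split:], y[:split], y[split:]

regressor = RandomForestRegressor(n_estimators=100, random_state=0)
regressor.fit(X_train, y_train)
predictions = regressor.predict(X_test)
print(predictions.shape)
```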
Set Up Algorithm
Once we are confident in our hypothesis, we can export this code into backtesting. One way to accommodate this
model in a backtest is to create a Scheduled Event that uses our model to predict the expected return. Since we
can calculate the expected return, we use Mean-Variance Optimization for portfolio construction.
PY
self.set_portfolio_construction(
    MeanVarianceOptimizationPortfolioConstructionModel(portfolio_bias=PortfolioBias.LONG, period=252))
self.set_execution(ImmediateExecutionModel())
We'll also need to create a function to train and update our model from time to time.
PY
# Select the close column and then call the unstack method.
df = history['close'].unstack(level=0)
Now we export our model into the Scheduled Event method. We replace qb with self and replace methods with their QCAlgorithm counterparts as needed. In this example, this is not an issue because all the methods we used in the Research Environment also exist in QCAlgorithm.
qb = self
# Fetch history on our universe
df = qb.history(qb.securities.keys, 2, Resolution.DAILY)
if df.empty: return
# ==============================
insights = []
for i in range(len(predictions)):
    insights.append(Insight.price(self.assets[i], timedelta(days=1), InsightDirection.UP, predictions[i]))
self.emit_insights(insights)
Applying Research
Uncorrelated Assets
Introduction
This page explains how you can use the Research Environment to develop and test an Uncorrelated Assets hypothesis, then put the hypothesis in production.
Create Hypothesis
According to Modern Portfolio Theory, asset combinations with negative or very low correlation can have lower
total portfolio variance for the same level of return. Thus, uncorrelated assets allow you to find a portfolio that
will, theoretically, be more diversified and resilient to extreme market events. We're testing this statement in a real-life
scenario, hypothesizing that a portfolio of uncorrelated assets can be a consistent portfolio. In this
example, we'll compare the performance of a 5-least-correlated-asset portfolio (proposed) and a 5-most-correlated-asset portfolio (benchmark).
Import Libraries
We'll need to import libraries to help with data processing and visualization. Import the numpy and matplotlib libraries:
PY
import numpy as np
from matplotlib import pyplot as plt
1. Instantiate a QuantBook .
PY
qb = QuantBook()
PY
3. Call the add_equity method with the tickers, and their corresponding resolution.
PY
for asset in assets:
    qb.add_equity(asset, Resolution.MINUTE)
4. Call the history method with qb.securities.keys for all tickers, time argument(s), and resolution to request historical data for the symbols.
PY
Prepare Data
We'll have to process our data to get the assets' correlations and select the least- and most-correlated ones.
1. Select the close column and then call the unstack method, then call pct_change to compute the daily return.
PY
returns = history['close'].unstack(level=0).pct_change().iloc[1:]
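The selection code is elided above, so here is one plausible selection rule as a hedged sketch (the tickers and returns are synthetic, and the ranking rule is an assumption): rank each asset by the sum of its absolute pairwise correlations, then take the five smallest sums as the "least correlated" candidates and the five largest as the benchmark.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
# Synthetic daily returns for 12 hypothetical tickers
returns = pd.DataFrame(rng.normal(0, 0.01, (250, 12)),
                       columns=[f"ASSET{i}" for i in range(12)])

correlation = returns.corr()
# Rank each asset by the sum of its absolute correlations with all the others
corr_rank = correlation.abs().sum().sort_values()

selected = list(corr_rank.index[:5])      # least-correlated candidates
benchmark = list(corr_rank.index[-5:])    # most-correlated candidates
print(len(selected), len(benchmark))
```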
Test Hypothesis
To test the hypothesis, our desired outcome is a consistent, low-fluctuation equity curve.
1. Construct an equal-weighting portfolio for the 5-uncorrelated-asset portfolio and the 5-correlated-asset
portfolio (benchmark).
PY
PY
PY
plt.figure(figsize=(15, 10))
total_ret.plot(label='Proposed')
total_ret_bench.plot(label='Benchmark')
plt.title('Equity Curve')
plt.legend()
plt.show()
We can clearly see from the results that the proposed uncorrelated-asset portfolio has lower variance/fluctuation,
and is thus more consistent, than the benchmark. This supports our hypothesis.
Set Up Algorithm
Once we are confident in our hypothesis, we can export this code into backtesting. One way to accommodate this
model in a backtest is to create a Scheduled Event that uses our model to pick stocks and goes long.
PY
self.set_portfolio_construction(EqualWeightingPortfolioConstructionModel())
self.set_execution(ImmediateExecutionModel())
# Set Scheduled Event Method For Our Model. In this example, we'll rebalance every month.
self.schedule.on(self.date_rules.month_start(),
self.time_rules.before_market_close("SHY", 5),
self.every_day_before_market_close)
Now we export our model into the Scheduled Event method. We replace qb with self and replace methods with their QCAlgorithm counterparts as needed. In this example, this is not an issue because all the methods we used in the Research Environment also exist in QCAlgorithm.
# Select the close column, call the unstack method, then call pct_change to compute the daily return.
returns = history['close'].unstack(level=0).pct_change().iloc[1:]
# Get correlation
correlation = returns.corr()
# ==============================
insights = []
self.emit_insights(insights)
Applying Research
Kalman Filters and Stat Arb
Introduction
This page explains how you can use the Research Environment to develop and test a Kalman Filters and Statistical Arbitrage hypothesis, then put the hypothesis in production.
Create Hypothesis
In finance, we can often observe that 2 stocks with similar backgrounds and fundamentals (e.g., AAPL vs MSFT, SPY
vs QQQ) move in a similar manner. They could be correlated, although not necessarily, but their price difference/sum
(spread) is stationary. We call this cointegration. Thus, we could hypothesize that an extreme spread provides a
chance for arbitrage, just like a mean reversion of the spread. This is known as pairs trading. Likewise, this can also
be applied to more than 2 assets; this is known as statistical arbitrage.
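The cointegration idea can be seen in a toy numpy example: two individually non-stationary random walks whose difference is stationary noise (the series are synthetic):

```python
import numpy as np

rng = np.random.default_rng(6)
# A random walk and a second series cointegrated with it
x = np.cumsum(rng.normal(0, 1, 1000))
y = x + rng.normal(0, 0.5, 1000)   # the spread y - x is stationary noise

spread = y - x
# Each price series wanders, but the spread's dispersion stays bounded
print(round(float(np.std(spread)), 2))
```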
However, although the fluctuation of the spread is stationary, the mean of the spread can change over time
for different reasons. Thus, it is important to update our expectation of the spread in order to get in and out of
the market in time, as the profit margin of this type of short-window trading is tight. A Kalman Filter comes in
handy in this situation. We can consider it an updater of the expectation of the underlying return Markov Chain.
In this example, we hypothesize that trading the spread of cointegrated assets is profitable. We'll be
using the forex pairs EURUSD, GBPUSD, USDCAD, USDHKD and USDJPY for this example, skipping the normalization step.
Import Libraries
We'll need to import libraries to help with data processing, model building, validation and visualization. Import the arch, pykalman, scipy, statsmodels, numpy, matplotlib and pandas libraries:
PY
import numpy as np
from matplotlib import pyplot as plt
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()
PY
qb = QuantBook()
PY
3. Call the add_forex method with the tickers and their corresponding resolution.
PY
for asset in assets:
    qb.add_forex(asset, Resolution.MINUTE)
4. Call the history method with qb.securities.keys for all tickers, time argument(s), and resolution to request historical data for the symbols.
PY
Cointegration
We'll have to test if the assets are cointegrated. If so, we'll have to obtain the cointegration vector(s).
Cointegration Testing
1. Select the close column and then call the unstack method.
PY
df = history['close'].unstack(level=0)
2. Call np.log to convert the close prices into a log-price series to eliminate the compounding effect.
PY
log_price = np.log(df)
PY
It shows a p-value < 0.05 for the test, with lag level 0. This proves the log-price series are cointegrated. The spread of the 5 forex pairs is stationary.
1. Initialize a VECM model following the test's parameters, then fit it to our data.
PY
2. Obtain the Beta attribute. This is the cointegration subspaces' unit vectors.
PY
beta = vecm_result.beta
PY
Although the 4 cointegration subspaces do not look stationary, we can optimize for a mean-reverting
portfolio by putting various weights on the different subspaces. We use the Portmanteau statistic as a proxy for the
portfolio's mean-reversion strength, where s is the spread and v is a predetermined desirable variance level (the larger, the higher the profit, but the lower the trading
frequency).
1. Set the weight on each vector to be between -1 and 1, with an overall sum of 0.
PY
PY
PY
opt.x = opt.x/np.sum(abs(opt.x))
for i in range(len(opt.x)):
print(f"The weight put on {i+1}th normalized cointegrating subspace: {opt.x[i]}")
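The optimization setup itself is elided above, so here is a hedged sketch with scipy.optimize.minimize. The tutorial optimizes a Portmanteau statistic; as a stand-in, this sketch minimizes the variance of the weighted spread, with the stated bounds and zero-sum constraint, plus a unit-norm constraint to rule out the trivial all-zero solution (the spread data is synthetic):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
# Synthetic spread series for 4 hypothetical cointegration subspaces
spreads = rng.normal(0, 1, (500, 4)) * np.array([1.0, 2.0, 0.5, 1.5])

def objective(w):
    # Stand-in objective: variance of the weighted spread
    return np.var(spreads @ w)

constraints = [
    {"type": "eq", "fun": lambda w: np.sum(w)},          # weights sum to 0
    {"type": "eq", "fun": lambda w: np.sum(w**2) - 1},   # unit norm (avoids w = 0)
]
opt = minimize(objective, x0=np.array([0.5, -0.5, 0.5, -0.5]),
               bounds=[(-1, 1)] * 4, constraints=constraints)

# Normalize so the absolute weights sum to 1, as in the snippet above
w = opt.x / np.sum(abs(opt.x))
print(np.round(w, 3))
```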
PY
The weighted spread looks more stationary. However, its fluctuations take a long time to cross zero (a long half-life). We aim
to trade as often as we can to maximize the profit of this strategy, and this is where the Kalman Filter comes into play. It
modifies the expectation of the next step by smoothing the predicted and actual probability distributions of the
return.
Image Source: Understanding Kalman Filters, Part 3: An Optimal State Estimator. Melda Ulusoy (2017). MathWorks. Retrieved from: https://www.mathworks.com/videos/understanding-kalman-filters-part-3-optimal-state-estimator--1490710645421.html
1. Initialize a KalmanFilter .
In this example, we use the first 20 data points to optimize its initial state. We assume the market has no
regime change, so the transition matrix and observation matrix are both [1].
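The KalmanFilter construction itself is elided above. Because the transition and observation matrices are both [1], the filter reduces to a pair of scalar update equations, which a numpy-only sketch makes explicit (the observations and noise variances here are assumed values; pykalman computes the equivalent):

```python
import numpy as np

rng = np.random.default_rng(3)
# Noisy observations of a spread whose true mean is 0.5
observations = 0.5 + rng.normal(0, 0.2, 200)

# Scalar Kalman filter: transition matrix and observation matrix are both [1]
mean, cov = observations[0], 1.0   # initial state estimate and its variance
Q, R = 1e-4, 0.04                  # assumed process and observation noise variances
for z in observations[1:]:
    cov_pred = cov + Q             # predict: state unchanged, uncertainty grows
    K = cov_pred / (cov_pred + R)  # Kalman gain: how much to trust the observation
    mean = mean + K * (z - mean)   # update the state estimate
    cov = (1 - K) * cov_pred       # update its uncertainty

print(round(mean, 2))
```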
PY
PY
currentMean = filtered_state_means[-1, :]
currentCov = filtered_state_covariances[-1, :]
3. Initialize a mean series for spread normalization using the KalmanFilter 's results.
PY
mean_series = np.array([None]*(new_spread.shape[0]-100))
PY
PY
PY
plt.figure(figsize=(15, 10))
plt.plot(normalized_spread, label="Processed spread")
plt.title("Normalized spread series")
plt.ylabel("Spread - Expectation")
plt.legend()
plt.show()
Determine Trading Threshold
Now we need to determine the threshold of entry. We want to maximize the profit from each trade (the variance of the spread) times the trading frequency. To do so, we smooth the empirical trading-frequency curve $\bar{f}$ with a penalized regression:

$$\underset{f}{\text{minimize}} \quad \lVert \bar{f} - f \rVert_2^2 + \lambda \lVert Df \rVert_2^2$$

where $f \in \mathbb{R}^{j}$ and $D$ is the first-order difference matrix

$$D = \begin{bmatrix} 1 & -1 & & & \\ & 1 & -1 & & \\ & & \ddots & \ddots & \\ & & & 1 & -1 \end{bmatrix} \in \mathbb{R}^{(j-1) \times j}$$

so the closed-form solution is $f^* = (I + \lambda D^\top D)^{-1} \bar{f}$.
PY
PY
f_bar = np.array([None]*50)
for i in range(50):
    f_bar[i] = len(normalized_spread.values[normalized_spread.values > s0[i]]) / normalized_spread.shape[0]
3. Set trading frequency matrix.
PY
D = np.zeros((49, 50))
for i in range(D.shape[0]):
D[i, i] = 1
D[i, i+1] = -1
PY
l = 1.0
PY
PY
threshold = s0[s_star.index(max(s_star))]
print(f"The optimal threshold is {threshold}")
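The smoothing steps above have a closed form, f* = (I + λDᵀD)⁻¹f̄, which can be checked end-to-end with a hypothetical frequency curve (the exponential decay for f_bar and the profit proxy are illustrative assumptions):

```python
import numpy as np

j = 50
s0 = np.linspace(0.0, 2.0, j)   # candidate entry thresholds
f_bar = np.exp(-s0)             # hypothetical empirical trading frequencies

# First-order difference matrix D in R^{(j-1) x j}
D = np.zeros((j - 1, j))
for i in range(j - 1):
    D[i, i], D[i, i + 1] = 1, -1

l = 1.0
# Closed-form smoother: f* = (I + l * D^T D)^{-1} f_bar
f_star = np.linalg.solve(np.eye(j) + l * D.T @ D, f_bar)

# Expected-profit proxy: threshold (profit per trade) times smoothed frequency
s_star = s0 * f_star
threshold = s0[np.argmax(s_star)]
print(round(float(threshold), 2))
```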
PY
plt.figure(figsize=(15, 10))
plt.plot(s0, s_star)
plt.title("Profit of mean-reversion trading")
plt.xlabel("Threshold")
plt.ylabel("Profit")
plt.show()
Test Hypothesis
1. Set the trading weight. We want the portfolio's absolute total weight to be 1 when trading.
PY
PY
3. Set the buy and sell periods for when the spread exceeds the threshold.
PY
PY
value = equity.cumprod()
PY
Set Up Algorithm
Once we are confident in our hypothesis, we can export this code into backtesting. One way to accommodate this
model in a backtest is to create a Scheduled Event that uses our model to predict the expected return.
PY
# Set a Scheduled Event to recalibrate our model every week.
self.schedule.on(self.date_rules.week_start(),
self.time_rules.at(0, 0),
self.recalibrate)
We'll also need to create a function to train and update our model from time to time. We replace qb with self
and replace methods with their QCAlgorithm counterparts as needed. In this example, this is not an issue because all the methods we used in the Research Environment also exist in QCAlgorithm.
PY
# Select the close column and then call the unstack method
data = history['close'].unstack(level=0)
# Obtain the Beta attribute. This is the cointegration subspaces' unit vectors.
beta = vecm_result.beta
# Initialize a mean series for spread normalization using the Kalman Filter's results.
mean_series = np.array([None]*(new_spread.shape[0]-20))
# Set the trading weight. We want the portfolio's absolute total weight to be 1 when trading.
trading_weight = beta @ opt.x
self.trading_weight = trading_weight / np.sum(abs(trading_weight))
Now we export our model into the Scheduled Event method for trading. We replace qb with self and replace
methods with their QCAlgorithm counterparts as needed. In this example, this is not an issue because all the methods we used in the Research Environment also exist in QCAlgorithm.
PY
# Get the real-time log close price for all assets and store in a Series
series = pd.Series(dtype=float)
for symbol in qb.securities.keys:
    series[symbol] = np.log(qb.securities[symbol].close)
# ==============================
# Mean-reversion
if normalized_spread < -self.threshold:
orders = []
for i in range(len(self.assets)):
orders.append(PortfolioTarget(self.assets[i], self.trading_weight[i]))
self.set_holdings(orders)
self.state = 1
self.state = -1
self.state = 0
Reference
1. A Signal Processing Perspective on Financial Engineering. Y. Feng, D. P. Palomar (2016). Foundations and Trends in Signal Processing.
Applying Research
PCA and Pairs Trading
Introduction
This page explains how you can use the Research Environment to develop and test a Principal Component Analysis hypothesis, then put the hypothesis in production.
Create Hypothesis
Principal Component Analysis (PCA) is a way of mapping an existing dataset into a new "space" in which the
dimensions of the new data are linearly independent, orthogonal vectors, so PCA eliminates the problem of
multicollinearity. Flipping that thought around: can we make use of the collinearity it implies to find the most correlated assets to trade as a pair?
Import Libraries
We'll need to import libraries to help with data processing, validation and visualization. Import sklearn , arch ,
PY
1. Instantiate a QuantBook .
PY
qb = QuantBook()
PY
symbols = {}
assets = ["SHY", "TLT", "SHV", "TLH", "EDV", "BIL",
"SPTL", "TBT", "TMF", "TMV", "TBF", "VGSH", "VGIT",
"VGLT", "SCHO", "SCHR", "SPTS", "GOVT"]
3. Call the add_equity method with the tickers, and their corresponding resolution. Then store their Symbol s.
PY
for asset in assets:
    symbols[asset] = qb.add_equity(asset, Resolution.MINUTE).symbol
4. Call the history method with qb.securities.keys for all tickers, time argument(s), and resolution to request historical data for the symbols.
PY
Prepare Data
We'll have to process our data to get the principal component unit vector that explains the most variance, then find
the highest- and lowest-absolute-weighted assets as the pair, since the lowest one's variance is mostly explained
by the highest.
1. Select the close column and then call the unstack method.
PY
close_price = history['close'].unstack(level=0)
PY
returns = close_price.pct_change().iloc[1:]
3. Initialize a PCA model, then get the principal components by maximum likelihood.
PY
pca = PCA()
pca.fit(returns)
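The explained-variance step can be seen in isolation on synthetic data in which one common factor drives all the assets (the factor structure and loadings here are made up); the first principal component then absorbs nearly all the variance:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)
# One common factor drives 6 hypothetical assets, plus small idiosyncratic noise
factor = rng.normal(0, 0.02, 500)
loadings = np.array([1.0, 0.9, 1.1, 0.8, 1.2, 1.0])
returns = np.outer(factor, loadings) + rng.normal(0, 0.002, (500, 6))

pca = PCA()
pca.fit(returns)

# Ratio of total variance explained by each component; the first dominates
explained_variance_pct = pca.explained_variance_ratio_
print(np.round(explained_variance_pct[0], 2))
```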
4. Get the number of principal components in a list, and their corresponding explained variance ratios.
PY
PY
plt.figure(figsize=(15, 10))
plt.bar(components, explained_variance_pct)
plt.title("Ratio of Explained Variance")
plt.xlabel("Principal Component #")
plt.ylabel("%")
plt.show()
We can see that over 95% of the variance is explained by the first principal component. We can conclude that collinearity
exists and that most assets' returns are correlated. Now we can extract the most correlated pair.
PY
first_component = pca.components_[0, :]
7. Select the highest- and lowest-absolute-weighted asset.
PY
highest = assets[abs(first_component).argmax()]
lowest = assets[abs(first_component).argmin()]
print(f'The highest-absolute-weighted asset: {highest}\nThe lowest-absolute-weighted asset: {lowest}')
PY
plt.figure(figsize=(15, 10))
plt.bar(assets, first_component)
plt.title("Weightings of each asset in the first component")
plt.xlabel("Assets")
plt.ylabel("Weighting")
plt.xticks(rotation=30)
plt.show()
Test Hypothesis
We have now selected 2 assets as candidates for pairs trading. Hence, we're going to test whether they are cointegrated and their spread is stationary.
PY
PY
coint_vector = coint_result.cointegrating_vector[:2]
PY
PY
PY
Set Up Algorithm
Pairs trading is exactly a 2-asset version of statistical arbitrage. Thus, we can just modify the algorithm from the
Kalman Filter and Statistical Arbitrage tutorial, except we're using only a single cointegrating unit vector, so no
optimization of the cointegration subspace is needed.
PY
def initialize(self):
def recalibrate(self):
qb = self
history = qb.history(self.assets, 252*2, Resolution.DAILY)
if history.empty: return
# Select the close column and then call the unstack method
data = history['close'].unstack(level=0)
# Initialize a mean series for spread normalization using the Kalman Filter's results.
mean_series = np.array([None]*(spread.shape[0]-20))
# Set the trading weight. We want the portfolio's absolute total weight to be 1 when trading.
self.trading_weight = coint_vector / np.sum(abs(coint_vector))
def every_day_before_market_close(self):
qb = self
# Get the real-time log close price for all assets and store in a Series
series = pd.Series()
for symbol in qb.securities.keys:
    series[symbol] = np.log(qb.securities[symbol].close)
# ==============================
# Mean-reversion
if normalized_spread < -self.threshold:
orders = []
for i in range(len(self.assets)):
orders.append(PortfolioTarget(self.assets[i], self.trading_weight[i]))
self.set_holdings(orders)
self.state = 1
self.state = -1
self.state = 0
Applying Research
Hidden Markov Models
Introduction
This page explains how you can use the Research Environment to develop and test a Hidden Markov Model hypothesis, then put the hypothesis in production.
Create Hypothesis
A Markov process is a stochastic process in which the probability of switching to another state depends only on the
current state, via the current state's probability distribution (usually represented by a state
transition matrix). It is history-independent, or memoryless. While a Markov process's states are often observable,
the states of a Hidden Markov Model (HMM) are not. This means the input(s) and output(s) are observable, but the intermediate states are hidden.
A 3-state HMM example, where S are the hidden states, O are the observable states and a are the probabilities of
state transition.
Image source: Modeling Strategic Use of Human Computer Interfaces with Novel Hidden Markov Models. L. J.
Mariano, et al. (2015). Frontiers in Psychology 6:919. DOI:10.3389/fpsyg.2015.00919
In finance, HMMs are particularly useful in determining the market regime, usually classified into "Bull" and "Bear"
markets. Another popular classification is "Volatile" vs "Involatile" markets, such that we can avoid entering the
market when it is too risky. We hypothesize that an HMM would be able to do the latter, so we can produce a SPY-out-of-market signal for high-volatility periods.
Import Libraries
We'll need to import libraries to help with data processing, validation and visualization. Import statsmodels , scipy
1. Instantiate a QuantBook .
PY
qb = QuantBook()
PY
asset = "SPX"
3. Call the add_index method with the tickers, and their corresponding resolution.
PY
qb.add_index(asset, Resolution.MINUTE)
4. Call the history method with qb.securities.keys for all tickers, time argument(s), and resolution to request
historical data for the symbol.
PY
We'll have to process our data to get the volatility of the market for classification.
1. Select the close column and then call the unstack method.
PY
close_price = history['close'].unstack(level=0)
PY
returns = close_price.pct_change().iloc[1:]
3. Initialize the HMM, then fit it with the daily return data. Note that we're using variance as the switching regime, so set the switching_variance argument to True.
PY
The p-value of the switching variance is smaller than 0.05, indicating the model should be able to classify the data into 2 different volatility regimes.
Test Hypothesis
We now verify if the model can detect high and low volatility period effectively.
1. Get the regime as a column, 1 as Low Variance Regime, 2 as High Variance Regime.
PY
regime = pd.Series(model.smoothed_marginal_probabilities.values.argmax(axis=1)+1, index=returns.index, name='regime')
df_1 = close_price.loc[returns.index][regime == 1]
df_2 = close_price.loc[returns.index][regime == 2]
2. Get the mean and covariance matrix of the 2 regimes; assume 0 covariance between them.
PY
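The per-regime statistics step is elided above, so here is a sketch on synthetic labelled returns (the regime labels and volatilities are made up): compute each regime's mean and variance, then assemble the zero-covariance (diagonal) matrix.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
# Synthetic daily returns labelled with a hypothetical regime (1 = low var, 2 = high var)
regime = pd.Series(rng.choice([1, 2], size=500))
returns = pd.Series(np.where(regime == 1,
                             rng.normal(0, 0.005, 500),
                             rng.normal(0, 0.02, 500)))

# Per-regime mean and variance; zero covariance between regimes gives a diagonal matrix
stats = returns.groupby(regime).agg(["mean", "var"])
cov = np.diag(stats["var"].values)
print(cov.shape)   # (2, 2)
```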
3. Fit a 2-dimensional multivariate normal distribution with the 2 means and the covariance matrix.
PY
PY
PY
df_1.index = pd.to_datetime(df_1.index)
df_1 = df_1.sort_index()
df_2.index = pd.to_datetime(df_2.index)
df_2 = df_2.sort_index()
plt.figure(figsize=(15, 10))
plt.scatter(df_1.index, df_1, color='blue', label="Low Variance Regime")
plt.scatter(df_2.index, df_2, color='red', label="High Variance Regime")
plt.title("Price series")
plt.ylabel("Price ($)")
plt.xlabel("Date")
plt.legend()
plt.show()
PY
PY
plt.figure(figsize=(12, 8))
plt.contourf(X, Y, pdf, cmap = 'viridis')
plt.xlabel("Low Volatility Regime")
plt.ylabel("High Volatility Regime")
plt.title('Bivariate normal distribution of the Regimes')
plt.tight_layout()
plt.show()
We can clearly see from the results that the Low Volatility Regime has much lower variance than the High Volatility Regime.
Set Up Algorithm
Once we are confident in our hypothesis, we can export this code into backtesting. One way to accommodate this
model in a backtest is to create a Scheduled Event that uses our model to predict the expected return. Since we
can calculate the expected return, we use Mean-Variance Optimization for portfolio construction.
PY
self.assets = ["SPY", "TLT"] # "TLT" as fixed income during out-of-market periods (high volatility)
Now we export our model into the Scheduled Event method. We replace qb with self and replace methods with their QCAlgorithm counterparts as needed. In this example, this is not an issue because all the methods we used in the Research Environment also exist in QCAlgorithm.
PY
# Get history
history = qb.history(["SPY"], datetime(2010, 1, 1), datetime.now(), Resolution.DAILY)
# ==============================
if regime == 0:
self.set_holdings([PortfolioTarget("TLT", 0.), PortfolioTarget("SPY", 1.)])
else:
self.set_holdings([PortfolioTarget("TLT", 1.), PortfolioTarget("SPY", 0.)])
Applying Research
Long Short-Term Memory
Introduction
This page explains how you can use the Research Environment to develop and test a Long Short-Term Memory hypothesis, then put the hypothesis in production.
Recurrent neural networks (RNNs) are a powerful tool in deep learning. These models quite accurately mimic how
humans process sequential information and learn. Unlike traditional feedforward neural networks, RNNs have
memory: information fed into them persists, and the network can draw on it to make inferences.
Long Short-Term Memory (LSTM) is a type of RNN. Instead of one layer, LSTM cells generally have four, three of
which are part of "gates" -- ways to optionally let information through. The three gates are commonly referred to
as the forget, input, and output gates. The forget gate layer is where the model decides what information to keep
from prior states. At the input gate layer, the model decides which values to update. Finally, the output gate layer is
where the final output of the cell state is decided. Essentially, the LSTM separately decides what to remember and the rate at which it should update.
An example of an LSTM cell: x is the input data, c is the long-term memory, h is the current state serving as short-term memory, and σ and tanh are the non-linear activation functions of the gates.
Create Hypothesis
LSTM models have produced some great results when applied to time-series prediction. One of the central
challenges with conventional time-series models is that, despite trying to account for trends or other non-
stationary elements, it is almost impossible to truly predict an outlier like a recession, flash crash, liquidity crisis,
etc. By having a long memory, LSTM models are better able to capture these difficult trends in the data without
suffering from the level of overfitting a conventional model would need in order to capture the same data.
For a very basic application, we hypothesize that an LSTM can offer an accurate prediction of future prices.
Import Libraries
We'll need to import libraries to help with data processing, validation and visualization. Import the keras, sklearn, numpy, pandas and matplotlib libraries by the following:
PY
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from keras.layers import LSTM, Dense, Dropout
from keras.models import Sequential
1. Instantiate a QuantBook .
PY
qb = QuantBook()
2. Select the desired ticker for research.
PY
asset = "SPY"
3. Call the add_equity method with the ticker and its corresponding resolution.
PY
qb.add_equity(asset, Resolution.MINUTE)
4. Call the history method with qb.securities.keys, time argument(s), and resolution to request historical data for the symbol.
PY
Prepare Data
We'll have to process our data as well as build the LSTM model before testing the hypothesis. We would scale our data onto the range [0, 1] so the model converges faster.
1. Select the close column and then call the unstack method.
PY
close_price = history['close'].unstack(level=0)
2. Initialize a MinMaxScaler and scale the close price data onto the range [0, 1].
PY
scaler = MinMaxScaler(feature_range=(0, 1))
df = pd.DataFrame(scaler.fit_transform(close_price), index=close_price.index)
PY
output = df.shift(-1).iloc[:-1]
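The shift(-1) call pairs each scaled price with the next period's price as its label, and iloc[:-1] drops the final row, which has no next period. A toy illustration with hypothetical values:

```python
import pandas as pd

df = pd.DataFrame({"scaled_close": [0.10, 0.25, 0.40, 0.55]})
output = df.shift(-1).iloc[:-1]   # each row's label is the next period's price
print(output["scaled_close"].tolist())   # [0.25, 0.4, 0.55]
```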
PY
7. Build feature and label sets (using number of steps 60, and feature rank 1).
PY
features_set = []
labels = []
for i in range(60, X_train.shape[0]):
    features_set.append(X_train.iloc[i-60:i].values.reshape(-1, 1))
    labels.append(y_train.iloc[i])
features_set, labels = np.array(features_set), np.array(labels)
features_set = np.reshape(features_set, (features_set.shape[0], features_set.shape[1], 1))
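As a sanity check on the windowing above, the same loop with a hypothetical 4-step window over 10 prices produces arrays of shape (samples, steps, features):

```python
import numpy as np
import pandas as pd

steps = 4
X_train = pd.DataFrame({"price": np.arange(10, dtype=float)})  # toy series 0..9
y_train = X_train["price"]

features_set, labels = [], []
for i in range(steps, X_train.shape[0]):
    features_set.append(X_train.iloc[i-steps:i].values.reshape(-1, 1))
    labels.append(y_train.iloc[i])
features_set, labels = np.array(features_set), np.array(labels)
features_set = np.reshape(features_set, (features_set.shape[0], features_set.shape[1], 1))
print(features_set.shape, labels.shape)   # (6, 4, 1) (6,)
```

Each sample holds the previous 4 prices and its label is the price that immediately follows the window.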
Build Model
PY
model = Sequential()
# Illustrative architecture: two stacked LSTM layers with dropout; tune the sizes for your use case
model.add(LSTM(units=50, return_sequences=True, input_shape=(features_set.shape[1], 1)))
model.add(Dropout(0.2))
model.add(LSTM(units=50))
model.add(Dropout(0.2))
model.add(Dense(units=1))
PY
We use Adam as the optimizer for adaptive step size and MSE as the loss function since it is continuous data.
PY
model.compile(optimizer='adam', loss='mean_squared_error', metrics=['mae', 'acc'])
PY
model.summary()
Note that results will differ between training sessions since the batches are randomly selected.
PY
model.fit(features_set, labels, epochs=20, batch_size=100)   # illustrative hyperparameters
Test Hypothesis
We'll test the performance of this model to see if it can precisely predict the 1-step-forward price. To do so, we'll compare the predicted and actual prices.
1. Build the testing feature set with the same rolling-window shape as the training set.
PY
test_features = []
for i in range(60, X_test.shape[0]):
    test_features.append(X_test.iloc[i-60:i].values.reshape(-1, 1))
test_features = np.array(test_features)
test_features = np.reshape(test_features, (test_features.shape[0], test_features.shape[1], 1))
2. Make predictions.
PY
predictions = model.predict(test_features)
predictions = scaler.inverse_transform(predictions)
actual = scaler.inverse_transform(y_test.values)
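scaler.inverse_transform simply undoes the [0, 1] min-max mapping fit earlier, recovering prices in the original scale. The round trip can be sketched without sklearn, using hypothetical prices:

```python
import numpy as np

prices = np.array([[300.0], [310.0], [350.0], [330.0]])   # hypothetical close prices
lo, hi = prices.min(), prices.max()
scaled = (prices - lo) / (hi - lo)        # what scaler.fit_transform produces
recovered = scaled * (hi - lo) + lo       # what scaler.inverse_transform does
print(np.allclose(recovered, prices))     # True
```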
3. Plot the results.
PY
plt.figure(figsize=(15, 10))
plt.plot(actual[60:], color='blue', label='Actual')
plt.plot(predictions, color='red', label='Prediction')
plt.title('Price vs Predicted Price')
plt.legend()
plt.show()
Set Up Algorithm
Once we are confident in our hypothesis, we can export this code into backtesting. One way to accommodate this model in a backtest is to create a scheduled event that uses our model to predict the expected return. If we predict the price will go up, we long SPY; otherwise, we short it.
PY
self.asset = "SPY"
# Set Scheduled Event Method For Our Model Retraining every month
self.schedule.on(self.date_rules.month_start(),
self.time_rules.at(0, 0),
self.build_model)
We'll also need to create a function to train and update our model from time to time.
PY
# Select the close column and then call the unstack method.
close = history['close'].unstack(level=0)
# Build feature and label sets (using number of steps 60, and feature rank 1)
features_set = []
labels = []
for i in range(60, input_.shape[0]):
    features_set.append(input_.iloc[i-60:i].values.reshape(-1, 1))
    labels.append(output.iloc[i])
features_set, labels = np.array(features_set), np.array(labels)
features_set = np.reshape(features_set, (features_set.shape[0], features_set.shape[1], 1))
# Compile the model. We use Adam as the optimizer for adaptive step size and MSE as the loss
# function since it is continuous data.
self.model.compile(optimizer = 'adam', loss = 'mean_squared_error', metrics=['mae', 'acc'])
Now we export our model into the scheduled event method. We will switch qb with self and replace methods with their QCAlgorithm counterparts as needed. In this example, this is not an issue because all the methods we used are also available in QCAlgorithm.
# Prediction
prediction = self.model.predict(input_)
# ==============================
Applying Research
Airline Buybacks
Introduction
This page explains how you can use the Research Environment to develop and test an Airline Buybacks hypothesis, then put the hypothesis in production.
Create Hypothesis
A buyback is when a company buys back its own stock in the market, usually because (1) management is confident in the company's future, and (2) it wants more control over the company's development. Since buybacks are usually large in scale and on a schedule, they could move the stock price.
Airlines form one of the largest buyback sectors. Major US airlines have used over 90% of their free cash flow to buy back their own stock in recent years. [1] Therefore, we can use airline companies to test the hypothesis that buybacks cause price action. In this particular example, we're hypothesizing that the difference between the buyback price and the close price suggests a price change in a certain direction. (We don't assume in advance which direction the forward return will take.)
Import Libraries
We'll need to import libraries to help with data processing, validation and visualization. Import
SmartInsiderTransaction class, statsmodels , sklearn , numpy , pandas and seaborn libraries by the following:
PY
from QuantConnect.DataSource import SmartInsiderTransaction
import statsmodels.api as sm
from sklearn.metrics import confusion_matrix
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt
1. Instantiate a QuantBook .
PY
qb = QuantBook()
2. Select the airline tickers for research.
PY
assets = ["LUV", "DAL", "UAL", "AAL", "SKYW", "ALGT", "ALK"]
3. Call the add_equity method with the tickers and their corresponding resolution. Then call add_data with
SmartInsiderTransaction to subscribe to their buyback transaction data. Save the Symbol s into a dictionary.
PY
symbols = {}
for ticker in assets:
    symbol = qb.add_equity(ticker, Resolution.MINUTE).symbol
    symbols[symbol] = qb.add_data(SmartInsiderTransaction, symbol).symbol
4. Call the history method with a list of Symbol s for all tickers, time argument(s), and resolution to request historical data for the symbols.
PY
PY
6. Call the history method with a list of SmartInsiderTransaction Symbol s for all tickers, time argument(s), and resolution to request buyback data for the symbols.
PY
We'll have to process our data to get the buyback premium/discount% vs forward return data.
1. Select the close column and then call the unstack method.
PY
df = history['close'].unstack(level=0)
spy_close = spy['close'].unstack(level=0)
2. Call pct_change to get the daily return of close price, then shift 1-step backward as prediction.
PY
ret = df.pct_change().shift(-1).iloc[:-1]
ret_spy = spy_close.pct_change().shift(-1).iloc[:-1]
PY
4. Select the ExecutionPrice column and then call the unstack method to get the buyback dataframe.
PY
df_buybacks = history_buybacks['executionprice'].unstack(level=0)
df_buybacks = df_buybacks.groupby(df_buybacks.index.date).mean()
df_buybacks.columns = df.columns
PY
df_close = df.reindex(df_buybacks.index)[~df_buybacks.isna()]
df_buybacks = (df_buybacks - df_close)/df_close
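The premium/discount computation above is just (execution price − close) / close, per symbol and per day. With hypothetical numbers (the tickers and prices below are illustrative):

```python
import pandas as pd

close = pd.Series([50.0, 40.0], index=["LUV", "DAL"])     # hypothetical close prices
buyback = pd.Series([55.0, 38.0], index=["LUV", "DAL"])   # hypothetical execution prices
premium = (buyback - close) / close                       # +10% premium, -5% discount
print(premium.tolist())
```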
7. Create a Dataframe to hold the buyback and 1-day forward return data.
PY
data = pd.DataFrame(columns=["Buybacks", "Return"])
8. Call dropna to remove any rows with missing data.
PY
data.dropna(inplace=True)
Test Hypothesis
We'll test (1) whether buybacks have a statistically significant effect on the return direction, and (2) whether the buyback premium/discount can predict returns.
1. Convert the returns into binary labels: 1 for a positive return and 0 for a negative one.
PY
binary_ret = data["Return"].copy()
binary_ret[binary_ret < 0] = 0
binary_ret[binary_ret > 0] = 1
2. Construct a logistic regression model.
PY
model = sm.Logit(binary_ret.values, data["Buybacks"].values).fit()
3. Display the regression results.
PY
display(model.summary())
We can see a p-value of < 0.05 in the logistic regression model, meaning the separation of positive and negative returns by the buyback premium/discount is statistically significant.
PY
plt.figure(figsize=(10, 6))
sns.regplot(x=data["Buybacks"]*100, y=binary_ret, logistic=True, ci=None, line_kws={'label': "Logistic Regression Line"})
plt.plot([-50, 50], [0.5, 0.5], "r--", label="Selection Cutoff Line")
plt.title("Buyback premium vs Profit/Loss")
plt.xlabel("Buyback premium %")
plt.xlim([-50, 50])
plt.ylabel("Profit/Loss")
plt.legend()
plt.show()
Interestingly, from the logistic regression line, we observe that when the airlines bought back their stock at a premium, the price tended to go down, while the opposite held when they bought back at a discount.
PY
predictions = model.predict(data["Buybacks"].values)
for i in range(len(predictions)):
    predictions[i] = 1 if predictions[i] > 0.5 else 0
PY
cm = confusion_matrix(binary_ret, predictions)
PY
df_result = pd.DataFrame(cm,
                         index=pd.MultiIndex.from_tuples([("Prediction", "Positive"), ("Prediction", "Negative")]),
                         columns=pd.MultiIndex.from_tuples([("Actual", "Positive"), ("Actual", "Negative")]))
The logistic regression has 55.8% accuracy (55% sensitivity and 56.3% specificity). This suggests a win rate above 50% before friction costs, supporting our hypothesis.
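The quoted accuracy, sensitivity, and specificity come straight from the 2×2 confusion matrix. A sketch with hypothetical counts, laid out as in df_result above (rows are predictions, columns are actuals):

```python
import numpy as np

# Rows: prediction (Positive, Negative); columns: actual (Positive, Negative) -- hypothetical counts
cm = np.array([[55, 35],
               [45, 45]])
tp, fp = cm[0]
fn, tn = cm[1]
accuracy = (tp + tn) / cm.sum()   # share of all predictions that were correct
sensitivity = tp / (tp + fn)      # share of actual positives caught
specificity = tn / (tn + fp)      # share of actual negatives caught
print(round(accuracy, 3), sensitivity, specificity)
```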
Set Up Algorithm
Once we are confident in our hypothesis, we can export this code into backtesting. One way to accommodate this model in a backtest is to create a scheduled event that uses our model to predict the expected return.
PY
self.set_portfolio_construction(EqualWeightingPortfolioConstructionModel())
self.set_execution(ImmediateExecutionModel())
# Call the AddEquity method with the tickers and their corresponding resolution. Then call AddData
# with SmartInsiderTransaction to subscribe to their buyback transaction data.
for ticker in assets:
    symbol = self.add_equity(ticker, Resolution.MINUTE).symbol
    self.symbols[symbol] = self.add_data(SmartInsiderTransaction, symbol).symbol
self.add_equity("SPY")
# Set Scheduled Event Method For Our Model Recalibration every month
self.schedule.on(self.date_rules.month_start(), self.time_rules.at(0, 0), self.build_model)
We'll also need to create a function to train and update the logistic regression model from time to time.
PY
# Call the History method with the list of buyback tickers, time argument(s), and resolution to
# request buyback data for the symbols.
history_buybacks = self.History(list(self.symbols.values()), datetime(2015, 1, 1), datetime.now(), Resolution.Daily)
# Select the close column and then call the unstack method to get the close price dataframe.
df = history['close'].unstack(level=0)
spy_close = spy['close'].unstack(level=0)
# Call pct_change to get the daily return of close price, then shift 1-step backward as prediction.
ret = df.pct_change().shift(-1).iloc[:-1]
ret_spy = spy_close.pct_change().shift(-1).iloc[:-1]
# Select the ExecutionPrice column and then call the unstack method to get the dataframe.
df_buybacks = history_buybacks['executionprice'].unstack(level=0)
# Create a dataframe to hold the buyback and 1-day forward return data
data = pd.DataFrame(columns=["Buybacks", "Return"])
Now we export our model into the scheduled event method. We will switch qb with self and replace methods with their QCAlgorithm counterparts as needed. In this example, this is not an issue because all the methods we used are also available in QCAlgorithm.
# Select the ExecutionPrice column and then call the unstack method to get the dataframe.
df_buybacks = history_buybacks['executionprice'].unstack(level=0)
# ==============================
insights = []
# Iterate the buyback data, then pass it to the model for prediction
row = df_buybacks.iloc[-1]
for i in range(len(row)):
    prediction = self.model.predict(row[i])
    # Long if the model predicts the price will go up, short otherwise. Do the opposite for SPY (active return)
    if prediction > 0.5:
        insights.append( Insight.Price(row.index[i].split(".")[0], timedelta(days=1), InsightDirection.Up) )
        insights.append( Insight.Price("SPY", timedelta(days=1), InsightDirection.Down) )
    else:
        insights.append( Insight.Price(row.index[i].split(".")[0], timedelta(days=1), InsightDirection.Down) )
        insights.append( Insight.Price("SPY", timedelta(days=1), InsightDirection.Up) )
self.EmitInsights(insights)
Reference
US Airlines Spent 96% of Free Cash Flow on Buybacks: Chart. B. Kochkodin (17 March 2020). Bloomberg.
Applying Research
Sparse Optimization
Introduction
This page explains how you can use the Research Environment to develop and test a Sparse Optimization Index
Tracking hypothesis, then put the hypothesis in production.
Create Hypothesis
Passive index fund portfolio managers buy the constituents of an index at their corresponding weights. The main idea is to allow market participants to trade an index at a lower cost. Their performance is measured by Tracking Error (TE), which is the standard deviation of the active return of the portfolio versus its benchmark index. A lower TE means the portfolio tracks the index accurately and consistently.
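A minimal sketch of the TE computation as defined above, using hypothetical daily returns (the annualization factor of 252 trading days is a common convention):

```python
import numpy as np

portfolio = np.array([0.010, -0.004, 0.006, 0.002])   # hypothetical daily portfolio returns
benchmark = np.array([0.009, -0.005, 0.007, 0.001])   # hypothetical daily benchmark returns
active = portfolio - benchmark                        # active return series
te = active.std(ddof=1)                               # tracking error
annualized_te = te * np.sqrt(252)                     # scaled to annual, assuming daily data
print(te)
```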
A technique called Sparse Optimization comes into play when portfolio managers want to cut their cost even further by trading less frequently and with more liquid stocks. They select a desired group of (or all) constituents from an index and try to strike a balance between the number of stocks in the portfolio and the TE, similar to the idea of L1/L2-regularization.
On the other hand, long-only active funds aim to beat the benchmark index. Their performance is measured by the mean-adjusted tracking error, which also takes the mean active return into account, so better funds can be distinguished.
We can combine the two ideas. In this tutorial, we'll generate our own active fund and use Sparse Optimization to try to beat QQQ. However, we need a new measure for the active fund in this technique: Downward Risk (DR). This is a measure just like the tracking error, but it excludes the periods when the index moves down, i.e. we only want to model the index's upward return, not its downward loss. For a more robust regression, we also use the Huber function as our loss function. This is known as Huber Downward Risk (HDR). Please refer to
Optimization Methods for Financial Index Tracking: From Theory to Practice. K. Benidis, Y. Feng, D. P. Palomar.
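The Huber function referenced above behaves quadratically for small residuals and linearly for large ones, which damps the influence of outliers in the regression. A sketch (the threshold M below is a hypothetical choice; the doc's optimization uses its own M parameter):

```python
import numpy as np

def huber(r, M):
    """Quadratic inside |r| <= M, linear outside -- robust to outliers."""
    r = np.asarray(r, dtype=float)
    small = np.abs(r) <= M
    return np.where(small, r**2, 2 * M * np.abs(r) - M**2)

r = np.array([0.5, 2.0])
print(huber(r, M=1.0))   # [0.25 3.  ]
```

At |r| = M the two branches agree (M² = 2M·M − M²), so the loss is continuous; HDR applies this loss only to the periods where the index moves up.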
Import Libraries
We'll need to import libraries to help with data processing and visualization. Import numpy , matplotlib and pandas
PY
import numpy as np
from matplotlib import pyplot as plt
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()
Get Historical Data
class ETFUniverse:
"""
A class to create a universe of equities from the constituents of an ETF
"""
def __init__(self, etf_ticker, universe_date):
"""
Input:
- etf_ticker
Ticker of the ETF
- universe_date
The date to gather the constituents of the ETF
"""
self.etf_ticker = etf_ticker
self.universe_date = universe_date
Input:
- qb
The QuantBook instance inside the DatasetAnalyzer
Input:
- qb
The QuantBook instance inside the DatasetAnalyzer
- etf_ticker
Ticker of the ETF
- universe_date
The date to gather the constituents of the ETF
2. Instantiate a QuantBook .
PY
qb = QuantBook()
PY
qqq = qb.add_equity("QQQ").symbol
PY
5. Prepare the historical return data of the constituents and the benchmark index to track.
PY
6. Call the history method with a list of the constituent Symbol s, time argument(s), and resolution to request historical data for the symbols.
PY
Prepare Data
We'll have to process our data and construct the proposed sparse index tracking portfolio.
PY
m = pctChangePortfolio.shape[0]; n = pctChangePortfolio.shape[1]
2. Set up optimization parameters (penalty of exceeding bounds, Huber statistics M-value, penalty weight).
PY
p = 0.5
M = 0.0001
l = 0.01
3. Set up convergence tolerance, maximum iteration of optimization, iteration counter and HDR as minimization
indicator.
PY
tol = 0.001
maxIter = 20
iters = 1
hdr = 10000
PY
w_ = np.array([1/n] * n).reshape(n, 1)
weights = pd.Series()
a = np.array([None] * m).reshape(m, 1)
c = np.array([None] * m).reshape(m, 1)
d = np.array([None] * n).reshape(n, 1)
# Else, we increase the iteration count and use the current weights for the next iteration.
iters += 1
hdr = hdr_
PY
for i in range(n):
    weights[pctChangePortfolio.columns[i]] = w_[i]
PY
To test the hypothesis, we wish to see that (1) the portfolio outcompetes the benchmark, and (2) the active return is consistent in both the in-sample and out-of-sample periods.
1. Obtain the equity curve of our portfolio and normalized benchmark for comparison.
PY
PY
proposed_ret = proposed.pct_change().iloc[1:]
benchmark_ret = benchmark.pct_change().iloc[1:]
active_ret = proposed_ret - benchmark_ret.values
PY
fig, ax = plt.subplots(1, 1)
active_ret["Mean"] = float(active_ret.mean())
active_ret.plot(figsize=(15, 5), title="Active Return", ax=ax)
plt.show()
We can see from the plots that, in both the in-sample and out-of-sample periods, the proposed portfolio outperforms the benchmark while remaining highly correlated with it. Although the active return might not be very consistent, it is a stationary series above zero. So, in the long run, it consistently outcompetes the QQQ benchmark!
Set Up Algorithm
Once we are confident in our hypothesis, we can export this code into backtesting.
PY
# Add our ETF constituents of the index that we would like to track.
self.QQQ = self.add_equity("QQQ", Resolution.MINUTE).symbol
self.universe_settings.asynchronous = True
self.universe_settings.resolution = Resolution.MINUTE
self.add_universe(self.universe.etf(self.QQQ, self.universe_settings, self.etf_selection))
self.set_benchmark("QQQ")
# Set up variables to flag the recalibration time and hold the constituents.
self.time = datetime.min
self.assets = []
We'll also need to create a function for getting the ETF constituents.
PY
Now we export our model into the on_data method. We will switch qb with self and replace methods with their QCAlgorithm counterparts as needed. In this example, this is not an issue because all the methods we used are also available in QCAlgorithm.
PY
# Prepare the historical return data of the constituents and the ETF (as index to track).
history = self.History(self.assets, 252, Resolution.Daily)
if history.empty: return
historyPortfolio = history.close.unstack(0)
pctChangePortfolio = np.log(historyPortfolio/historyPortfolio.shift(1)).dropna()
m = pctChangePortfolio.shape[0]; n = pctChangePortfolio.shape[1]
# Set up convergence tolerance, maximum iteration of optimization, iteration counter and Huber
# downward risk as minimization indicator.
tol = 0.001; maxIter = 20; iters = 1; hdr = 10000
# Else, we increase the iteration count and use the current weights for the next iteration.
iters += 1
hdr = hdr_
# -----------------------------------------------------------------------------------------
orders = []
for i in range(n):
    orders.append(PortfolioTarget(pctChangePortfolio.columns[i], float(w_[i])))
self.SetHoldings(orders)
Reference
Optimization Methods for Financial Index Tracking: From Theory to Practice. K. Benidis, Y. Feng, D. P. Palomar.