QuantConnect Research Environment (Python)

Learn to use QuantConnect and Explore Features

RESEARCH ENVIRONMENT

Powerful notebooks attached to our massive data repository. Our data repository is preformatted and ready to go. Skip expensive and tedious data processing and get to work.
Table of Contents

1 Key Concepts
1.1 Getting Started
1.2 Research Engine
2 Initialization
3 Datasets
3.1 Key Concepts
3.2 US Equity
3.3 Equity Fundamental Data
3.4 Equity Options
3.4.1 Key Concepts
3.4.2 Universes
3.4.3 Individual Contracts
3.5 Crypto
3.6 Crypto Futures
3.7 Futures
3.8 Futures Options
3.8.1 Key Concepts
3.8.2 Universes
3.8.3 Individual Contracts
3.9 Forex
3.10 CFD
3.11 Indices
3.12 Index Options
3.12.1 Key Concepts
3.12.2 Universes
3.12.3 Individual Contracts
3.13 Alternative Data
3.14 Custom Data
4 Charting
4.1 Bokeh
4.2 Matplotlib
4.3 Plotly
4.4 Seaborn
4.5 Plotly.NET
5 Universes
6 Indicators
6.1 Data Point Indicators
6.2 Bar Indicators
6.3 Trade Bar Indicators
6.4 Combining Indicators
6.5 Custom Indicators
6.6 Custom Resolutions
7 Object Store
8 Machine Learning
8.1 Key Concepts
8.2 Popular Libraries
8.2.1 Aesara
8.2.2 GPlearn
8.2.3 Hmmlearn
8.2.4 Keras
8.2.5 PyTorch
8.2.6 Scikit-Learn
8.2.7 Stable Baselines
8.2.8 TensorFlow
8.2.9 Tslearn
8.2.10 XGBoost
8.3 Hugging Face
8.3.1 Key Concepts
9 Debugging
10 Meta Analysis
10.1 Key Concepts
10.2 Backtest Analysis
10.3 Optimization Analysis
10.4 Live Analysis
10.5 Live Deployment Automation
11 Applying Research
11.1 Key Concepts
11.2 Mean Reversion
11.3 Random Forest Regression
11.4 Uncorrelated Assets
11.5 Kalman Filters and Stat Arb
11.6 PCA and Pairs Trading
11.7 Hidden Markov Models
11.8 Long Short-Term Memory
11.9 Airline Buybacks
11.10 Sparse Optimization
Key Concepts

Key Concepts > Getting Started

Key Concepts
Getting Started

Introduction

The Research Environment is a Jupyter notebook-based, interactive command-line environment where you can access our data through the QuantBook class. The environment supports both Python and C#. If you use Python, you can import code from the code files in your project into the Research Environment to aid development.

Before you run backtests, we recommend testing your hypothesis in the Research Environment. It's easier to
perform data analysis and produce plots in the Research Environment than in a backtest.

Before backtesting or live trading with machine learning models, you may find it beneficial to train them in the Research Environment, save them in the Object Store, and then load them from the Object Store into the backtesting and live trading environment.

In the Research Environment, you can also use the QuantConnect API to import your backtest results for further
analysis.

Example

The following snippet demonstrates how to use the Research Environment to plot the price and Bollinger Bands of
the S&P 500 index ETF, SPY:

PY

# Create a QuantBook.
qb = QuantBook()

# Add an asset.
symbol = qb.add_equity("SPY").symbol

# Request some historical data.
history = qb.history(symbol, 360, Resolution.DAILY)

# Calculate the Bollinger Bands.
bbdf = qb.indicator(BollingerBands(30, 2), symbol, 360, Resolution.DAILY)

# Plot the data.
bbdf[['price', 'lowerband', 'middleband', 'upperband']].plot();

Open Notebooks

The process to open notebooks depends on whether you use the Algorithm Lab, Local Platform, or the CLI.

Run Notebook Cells

Notebooks are a collection of cells where you can write code snippets or Markdown. To execute a cell, press Shift+Enter.
The following describes some helpful keyboard shortcuts to speed up your research:

Keyboard Shortcut Description

Shift+Enter Run the selected cell.

a Insert a cell above the selected cell.

b Insert a cell below the selected cell.

x Cut the selected cell.

v Paste the copied or cut cell.

z Undo cell actions.

Stop Nodes

The process to stop Research Environment nodes depends on whether you use the Algorithm Lab, Local Platform, or the CLI.

Add Notebooks

The process to add notebook files depends on whether you use the Algorithm Lab, Local Platform, or the CLI.

Rename Notebooks

The process to rename notebook files depends on whether you use the Algorithm Lab, Local Platform, or the CLI.

Delete Notebooks

The process to delete notebooks depends on whether you use the Algorithm Lab, Local Platform, or the CLI.

Learn Jupyter

The following table lists some helpful resources to learn Jupyter:


Type    Name                                                Producer
Text    Jupyter Tutorial                                    tutorialspoint
Text    Jupyter Notebook Tutorial: The Definitive Guide     DataCamp
Text    An Introduction to DataFrame                        Microsoft Developer Blogs


Key Concepts > Research Engine

Key Concepts
Research Engine

Introduction

The Research Environment is a Jupyter notebook-based, interactive command-line environment where you can access our data through the QuantBook class. The environment supports both Python and C#. If you use Python, you can import code from the code files in your project into the Research Environment to aid development.

Before you run backtests, we recommend testing your hypothesis in the Research Environment. It's easier to
perform data analysis and produce plots in the Research Environment than in a backtest.

Before backtesting or live trading with machine learning models, you may find it beneficial to train them in the Research Environment, save them in the Object Store, and then load them from the Object Store into the backtesting and live trading environment.

In the Research Environment, you can also use the QuantConnect API to import your backtest results for further
analysis.

Batch vs Stream Analysis

The backtesting environment is an event-based simulation of the market. Backtests aim to provide an accurate
representation of whether a strategy would have performed well in the past, but they are generally slow and aren't

the most efficient way to test the foundational ideas behind strategies. You should only use backtests to verify an

idea after you have already tested it with statistical analysis.

The Research Environment lets you build a strategy by starting with a central hypothesis about the market. For
example, you might hypothesize that an increase in sunshine hours will increase the production of oranges, which

will lead to an increase in the supply of oranges and a decrease in the price of Orange Juice Futures. You can
attempt to confirm this working hypothesis by analyzing weather data, production of oranges data, and the price

of Orange Juice futures. If the hypothesis is confirmed with a degree of statistical significance, you can be
confident in the hypothesis and translate it into an algorithm you can backtest.

Jupyter Notebooks

Jupyter notebooks support interactive data science and scientific computing across various programming languages. We carry on that philosophy by providing an environment for you to perform exploratory research and brainstorm new ideas for algorithms. A Jupyter notebook installed in QuantConnect allows you to directly explore the massive amounts of data that are available in the Dataset Market and analyze it with Python or C# commands. We call this exploratory notebook environment the Research Environment.

Open Notebooks

To open a notebook, open one of the .ipynb files in your cloud projects, or see Running Local Research Environment.

Execute Code

The notebook allows you to run code in a safe and disposable environment. It's composed of independent cells
where you can write, edit, and execute code. The notebooks support Python, C#, and Markdown code.

Keyboard Shortcuts

The following table describes some useful keyboard shortcuts:

Shortcut Description

Shift+Enter Run the selected cell

a Insert a cell above the selected cell

b Insert a cell below the selected cell

x Cut the selected cell

v Paste the copied or cut cell

z Undo cell actions

Terminate Research Sessions

If you use the Research Environment in QuantConnect Cloud, to terminate a research session, stop the research

node in the Resources panel . If you use the local Research Environment, see Managing Kernels and Terminals in
the JupyterLab documentation.

Your Research and LEAN

To analyze data in a research notebook, create an instance of the QuantBook class. QuantBook is a wrapper on
QCAlgorithm , which means QuantBook allows you to access all the methods available to QCAlgorithm and some

additional methods. The following table describes the helper methods of the QuantBook class that aren't available
in the QCAlgorithm class:
Method              Description
universe_history    Get historical data for a universe.
future_history      Get the expiration, open interest, and price data of the contracts in a Futures chain.
option_history      Get the strike, expiration, open interest, option right, and price data of the contracts in an Options chain.
indicator           Get the values of an indicator for an asset over time.

QuantBook gives you access to the vast amounts of data in the Dataset Market. Similar to backtesting, you can

access that data using history calls. You can also create indicators, consolidate data, and access charting
features. However, keep in mind that event-driven features available in backtesting, like universe selection and

OnData events, are not available in research. After you analyze a dataset in the Research Environment, you can
easily transfer the logic to the backtesting environment. For example, consider the following code in the Research

Environment:

PY

# Initialize QuantBook
qb = QuantBook()

# Subscribe to SPY data with QuantBook
symbol = qb.add_equity("SPY").symbol

# Make history call with QuantBook
history = qb.history(symbol, timedelta(days=10), Resolution.DAILY)

To use the preceding code in a backtest, replace QuantBook() with self .

PY

def initialize(self) -> None:
    # Set qb to instance of QCAlgorithm
    qb = self

    # Subscribe to SPY data with QCAlgorithm
    symbol = qb.add_equity("SPY").symbol

    # Make history call with QCAlgorithm
    history = qb.history(symbol, timedelta(days=10), Resolution.DAILY)

Import Project Code

One of the drawbacks of using the Research Environment you may encounter is the need to rewrite code you've

already written in a file in the backtesting environment. Instead of rewriting the code, you can import the methods
from the backtesting environment into the Research Environment to reduce development time. For example, say

you have the following helpers.py file in your project:


PY

def add(a, b):
    return a + b

To import the preceding method into your research notebook, use the import statement.

PY

from helpers import add

# Reuse the method from helpers.py
add(3, 4)

If you adjust the file that you import, restart the Research Environment session to import the latest version of the

file. To restart the Research Environment, stop the research node and then open the notebook again.

Import C# Libraries

This section is reserved for C# notebooks.


Initialization

Introduction

Before you request and manipulate historical data in the Research Environment, you should set the notebook

dates, add data subscriptions, and set the time zone.

Set Dates

The start date of your QuantBook determines the latest date of data you get from history requests. By default, the start date is the current day. To change the start date, call the set_start_date method.

PY

qb.set_start_date(2022, 1, 1)

The end date of your QuantBook should be greater than the start date. By default, the end date is the current day. To change the end date, call the set_end_date method.

PY

qb.set_end_date(2022, 8, 15)

Add Data

You can subscribe to asset, fundamental, alternative, and custom data. The Dataset Market provides 400TB of
data that you can easily import into your notebooks.

Asset Data

To subscribe to asset data, call one of the asset subscription methods like add_equity or add_forex . Each asset

class has its own method to create subscriptions. For more information about how to create subscriptions for each
asset class, see the Create Subscriptions section of an asset class in the Datasets chapter.

PY

qb.add_equity("SPY") # Add Apple 1 minute bars (minute by default)


qb.add_forex("EURUSD", Resolution.SECOND) # Add EURUSD 1 second bars

Alternative Data

To add alternative datasets to your notebooks, call the add_data method. For a full example, see Alternative Data .

Custom Data

To add custom data to your notebooks, call the add_data method. For more information about custom data, see

Custom Data .
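The add_data call is the same in both cases. As an illustrative sketch (not from the documentation), a minimal custom data type might look like the following; the class name, URL, and CSV layout are hypothetical placeholders:

PY

# A minimal custom data sketch. MyCustomData, the URL, and the two-column
# CSV layout are hypothetical; adapt them to your own data source.
class MyCustomData(PythonData):
    def get_source(self, config, date, is_live):
        return SubscriptionDataSource(
            "https://data.example.com/my_data.csv",  # placeholder URL
            SubscriptionTransportMedium.REMOTE_FILE)

    def reader(self, config, line, date, is_live):
        if not line or not line[0].isdigit():
            return None  # skip headers and malformed lines
        cells = line.split(',')
        custom = MyCustomData()
        custom.symbol = config.symbol
        custom.time = datetime.strptime(cells[0], '%Y-%m-%d')  # date column
        custom.value = float(cells[1])                         # value column
        return custom

symbol = qb.add_data(MyCustomData, "MY_DATA", Resolution.DAILY).symbol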
Limitations

There is no official limit to how much data you can add to your notebooks, but there are practical resource
limitations. Each security subscription requires about 5MB of RAM, so larger machines let you request more data.

For more information about our cloud nodes, see Research Nodes .

Set Time Zone

The notebook time zone determines which time zone the datetime objects are in when you make a history request

based on a defined period of time. When your history request returns a DataFrame , the timestamps in the
DataFrame are based on the data time zone . When your history request returns a TradeBars , QuoteBars , Ticks ,

or Slice object, the time properties of these objects are based on the notebook time zone, but the end_time

properties of the individual TradeBar , QuoteBar , and Tick objects are based on the data time zone.

The default time zone is Eastern Time (ET), which is UTC-4 in summer and UTC-5 in winter. To set a different time
zone, call the set_time_zone method. This method accepts either a string following the IANA Time Zone database

convention or a NodaTime .DateTimeZone object. If you pass a string, the method converts it to a

NodaTime.DateTimeZone object. The TimeZones class provides the following helper attributes to create

NodaTime.DateTimeZone objects:

PY

qb.set_time_zone("Europe/London")
qb.set_time_zone(TimeZones.CHICAGO)
Datasets

Datasets > Key Concepts

Datasets
Key Concepts

Introduction

You can access most of the data from the Dataset Market in the Research Environment. The data includes Equity,

Crypto, Forex, and derivative data going back as far as 1998. Similar to backtesting, to access the data, create a

security subscription and then make a history request.

Key History Concepts

The historical data API has many different options to give you the greatest flexibility in how to apply it to your
algorithm.

Time Period Options

You can request historical data based on a trailing number of bars, a trailing period of time, or a defined period of

time. If you request data in a defined period of time, the datetime objects you provide are based in the notebook
time zone .

Return Formats

Each asset class supports slightly different data formats. When you make a history request, consider what data it returns. Depending on how you request the data, history requests return a specific data type. For example, if you don't provide Symbol objects, you get Slice objects that contain all of the assets you created subscriptions for in the notebook.

The most popular return type is a DataFrame . If you request a DataFrame , LEAN unpacks the data from Slice

objects to populate the DataFrame . If you intend to use the data in the DataFrame to create TradeBar or QuoteBar
objects, request that the history request returns the data type you need. Otherwise, LEAN will waste computational

resources populating the DataFrame .

Time Index

When your history request returns a DataFrame , the timestamps in the DataFrame are based on the data time zone

. When your history request returns a TradeBars , QuoteBars , Ticks , or Slice object, the time properties of these
objects are based on the notebook time zone, but the end_time properties of the individual TradeBar , QuoteBar ,

and Tick objects are based on the data time zone . The end_time is the end of the sampling period and when the
data is actually available. For daily US Equity data, this results in data points appearing on Saturday and skipping

Monday.

Request Data

The simplest form of history request is for a known set of Symbol objects. History requests return slightly different

data depending on the overload you call. The data that returns is in ascending order from oldest to newest.

Single Symbol History Requests

To request history for a single asset, pass the asset Symbol to the history method. The return type of the method

call depends on the history request [Type] . The following table describes the return type of each request [Type] :

Request Type                        Return Data Type
No argument                         DataFrame
TradeBar                            List[TradeBars]
QuoteBar                            List[QuoteBars]
Tick                                List[Ticks]
alternativeDataClass (ex: CBOE)     List[alternativeDataClass] (ex: List[CBOE])

Each row of the DataFrame represents the prices at a point in time. Each column of the DataFrame is a property of that price data (for example, open, high, low, and close (OHLC)). If you request a DataFrame object and pass TradeBar as the first argument, the DataFrame that returns only contains the OHLC and volume columns. If you request a DataFrame object and pass QuoteBar as the first argument, the DataFrame that returns contains the OHLC of the bid and ask, and it contains OHLC columns, which are the respective means of the bid and ask OHLC values. If you request a DataFrame and don't pass TradeBar or QuoteBar as the first argument, the DataFrame that returns contains columns for all of the data that's available for the given resolution.

PY

# EXAMPLE 1: Requesting By Bar Count: 5 bars at the security resolution:


vix_symbol = qb.add_data(CBOE, "VIX", Resolution.DAILY).symbol
cboe_data = qb.history[CBOE](vix_symbol, 5)

btc_symbol = qb.add_crypto("BTCUSD", Resolution.MINUTE).symbol


trade_bars = qb.history[TradeBar](btc_symbol, 5)
quote_bars = qb.history[QuoteBar](btc_symbol, 5)
trade_bars_df = qb.history(TradeBar, btc_symbol, 5)
quote_bars_df = qb.history(QuoteBar, btc_symbol, 5)
df = qb.history(btc_symbol, 5) # Includes trade and quote data
PY

# EXAMPLE 2: Requesting By Bar Count: 5 bars with a specific resolution:


trade_bars = qb.history[TradeBar](btc_symbol, 5, Resolution.DAILY)
quote_bars = qb.history[QuoteBar](btc_symbol, 5, Resolution.MINUTE)
trade_bars_df = qb.history(TradeBar, btc_symbol, 5, Resolution.MINUTE)
quote_bars_df = qb.history(QuoteBar, btc_symbol, 5, Resolution.MINUTE)
df = qb.history(btc_symbol, 5, Resolution.MINUTE) # Includes trade and quote data

PY

# EXAMPLE 3: Requesting By a Trailing Period: 3 days of data at the security resolution:


eth_symbol = qb.add_crypto('ETHUSD', Resolution.TICK).symbol
ticks = qb.history[Tick](eth_symbol, timedelta(days=3))
ticks_df = qb.history(eth_symbol, timedelta(days=3))

vix_data = qb.history[CBOE](vix_symbol, timedelta(days=3))


trade_bars = qb.history[TradeBar](btc_symbol, timedelta(days=3))
quote_bars = qb.history[QuoteBar](btc_symbol, timedelta(days=3))
trade_bars_df = qb.history(TradeBar, btc_symbol, timedelta(days=3))
quote_bars_df = qb.history(QuoteBar, btc_symbol, timedelta(days=3))
df = qb.history(btc_symbol, timedelta(days=3)) # Includes trade and quote data

PY

# EXAMPLE 4: Requesting By a Trailing Period: 3 days of data with a specific resolution:


trade_bars = qb.history[TradeBar](btc_symbol, timedelta(days=3), Resolution.DAILY)
quote_bars = qb.history[QuoteBar](btc_symbol, timedelta(days=3), Resolution.MINUTE)
ticks = qb.history[Tick](eth_symbol, timedelta(days=3), Resolution.TICK)

trade_bars_df = qb.history(TradeBar, btc_symbol, timedelta(days=3), Resolution.DAILY)


quote_bars_df = qb.history(QuoteBar, btc_symbol, timedelta(days=3), Resolution.MINUTE)
ticks_df = qb.history(eth_symbol, timedelta(days=3), Resolution.TICK)
df = qb.history(btc_symbol, timedelta(days=3), Resolution.HOUR) # Includes trade and quote data

# Important Note: Period history requests are relative to "now" notebook time.
PY

# EXAMPLE 5: Requesting By a Defined Period: 3 days of data at the security resolution:


start_time = datetime(2022, 1, 1)
end_time = datetime(2022, 1, 4)

vix_data = qb.history[CBOE](vix_symbol, start_time, end_time)


trade_bars = qb.history[TradeBar](btc_symbol, start_time, end_time)
quote_bars = qb.history[QuoteBar](btc_symbol, start_time, end_time)
ticks = qb.history[Tick](eth_symbol, start_time, end_time)

trade_bars_df = qb.history(TradeBar, btc_symbol, start_time, end_time)


quote_bars_df = qb.history(QuoteBar, btc_symbol, start_time, end_time)
ticks_df = qb.history(Tick, eth_symbol, start_time, end_time)
df = qb.history(btc_symbol, start_time, end_time) # Includes trade and quote data

PY

# EXAMPLE 6: Requesting By a Defined Period: 3 days of data with a specific resolution:


trade_bars = qb.history[TradeBar](btc_symbol, start_time, end_time, Resolution.DAILY)
quote_bars = qb.history[QuoteBar](btc_symbol, start_time, end_time, Resolution.MINUTE)
ticks = qb.history[Tick](eth_symbol, start_time, end_time, Resolution.TICK)

trade_bars_df = qb.history(TradeBar, btc_symbol, start_time, end_time, Resolution.DAILY)


quote_bars_df = qb.history(QuoteBar, btc_symbol, start_time, end_time, Resolution.MINUTE)
ticks_df = qb.history(eth_symbol, start_time, end_time, Resolution.TICK)
df = qb.history(btc_symbol, start_time, end_time, Resolution.HOUR) # Includes trade and quote data

Multiple Symbol History Requests

To request history for multiple symbols at a time, pass an array of Symbol objects to the same API methods shown

in the preceding section. The return type of the method call depends on the history request [Type] . The following

table describes the return type of each request [Type] :

Request Type                        Return Data Type
No argument                         DataFrame
TradeBar                            List[TradeBars]
QuoteBar                            List[QuoteBars]
Tick                                List[Ticks]
alternativeDataClass (ex: CBOE)     List[Dict[Symbol, alternativeDataClass]] (ex: List[Dict[Symbol, CBOE]])
PY

# EXAMPLE 7: Requesting By Bar Count for Multiple Symbols: 2 bars at the security resolution:
vix = qb.add_data[CBOE]("VIX", Resolution.DAILY).symbol
v3m = qb.add_data[CBOE]("VIX3M", Resolution.DAILY).symbol
cboe_data = qb.history[CBOE]([vix, v3m], 2)

ibm = qb.add_equity("IBM", Resolution.MINUTE).symbol


aapl = qb.add_equity("AAPL", Resolution.MINUTE).symbol
trade_bars_list = qb.history[TradeBar]([ibm, aapl], 2)
quote_bars_list = qb.history[QuoteBar]([ibm, aapl], 2)

trade_bars_df = qb.history(TradeBar, [ibm, aapl], 2)


quote_bars_df = qb.history(QuoteBar, [ibm, aapl], 2)
df = qb.history([ibm, aapl], 2) # Includes trade and quote data

PY

# EXAMPLE 8: Requesting By Bar Count for Multiple Symbols: 5 bars with a specific resolution:
trade_bars_list = qb.history[TradeBar]([ibm, aapl], 5, Resolution.DAILY)
quote_bars_list = qb.history[QuoteBar]([ibm, aapl], 5, Resolution.MINUTE)

trade_bars_df = qb.history(TradeBar, [ibm, aapl], 5, Resolution.DAILY)


quote_bars_df = qb.history(QuoteBar, [ibm, aapl], 5, Resolution.MINUTE)
df = qb.history([ibm, aapl], 5, Resolution.DAILY) # Includes trade data only; no quote data for daily equity bars

PY

# EXAMPLE 9: Requesting By Trailing Period: 3 days of data at the security resolution:


ticks = qb.history[Tick]([eth_symbol], timedelta(days=3))

trade_bars = qb.history[TradeBar]([btc_symbol], timedelta(days=3))


quote_bars = qb.history[QuoteBar]([btc_symbol], timedelta(days=3))
trade_bars_df = qb.history(TradeBar, [btc_symbol], timedelta(days=3))
quote_bars_df = qb.history(QuoteBar, [btc_symbol], timedelta(days=3))
df = qb.history([btc_symbol], timedelta(days=3)) # Includes trade and quote data
PY

# EXAMPLE 10: Requesting By Defined Period: 3 days of data at the security resolution:
trade_bars = qb.history[TradeBar]([btc_symbol], start_time, end_time)
quote_bars = qb.history[QuoteBar]([btc_symbol], start_time, end_time)
ticks = qb.history[Tick]([eth_symbol], start_time, end_time)
trade_bars_df = qb.history(TradeBar, btc_symbol, start_time, end_time)
quote_bars_df = qb.history(QuoteBar, btc_symbol, start_time, end_time)
ticks_df = qb.history(Tick, eth_symbol, start_time, end_time)
df = qb.history([btc_symbol], start_time, end_time) # Includes trade and quote data

If you request data for multiple securities and you use the Tick request type, each Ticks object in the list of results only contains the last tick of each security for that particular timeslice.

All Symbol History Requests

You can request history for all the securities you have created subscriptions for in your notebook session. The

parameters are very similar to other history method calls, but the return type is an array of Slice objects. The Slice

object holds all of the results in a sorted enumerable collection that you can iterate over with a loop.

PY

# EXAMPLE 11: Requesting 5 bars for all securities at their respective resolution:

# Create subscriptions
qb.add_equity("IBM", Resolution.DAILY)
qb.add_equity("AAPL", Resolution.DAILY)

# Request history data and enumerate results


slices = qb.history(5)
for s in slices:
    print(str(s.time) + " AAPL:" + str(s.bars["AAPL"].close) + " IBM:" + str(s.bars["IBM"].close))
PY

# EXAMPLE 12: Requesting 5 minutes for all securities:

slices = qb.history(timedelta(minutes=5), Resolution.MINUTE)
for s in slices:
    print(str(s.time) + " AAPL:" + str(s.bars["AAPL"].close) + " IBM:" + str(s.bars["IBM"].close))

# timedelta history requests are relative to "now" in notebook time. If you request this data at 16:05, it returns an empty array because the market is closed.

Assumed Default Values

The following table describes the assumptions of the History API:

Argument      Assumption
Resolution    LEAN guesses the resolution you request by looking at the securities you already have in your notebook. If you have a security subscription in your notebook with a matching Symbol, the history request uses the same resolution as the subscription. If you don't have a security subscription in your notebook with a matching Symbol, Resolution.MINUTE is the default.
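For example, a minimal sketch of how this default applies (the tickers are illustrative):

PY

qb = QuantBook()
spy = qb.add_equity("SPY", Resolution.DAILY).symbol

# SPY has a DAILY subscription, so this request returns daily bars.
spy_history = qb.history(spy, 5)

# AAPL has no subscription in this notebook, so Resolution.MINUTE is assumed.
aapl = Symbol.create("AAPL", SecurityType.EQUITY, Market.USA)
aapl_history = qb.history(aapl, 5)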

Additional Options

The history method accepts the following additional arguments:


fill_forward (bool/NoneType, default None): True to fill forward missing data. Otherwise, false. If you don't provide a value, it uses the fill forward mode of the security subscription.

extended_market_hours (bool/NoneType, default None): True to include extended market hours data. Otherwise, false.

data_mapping_mode (DataMappingMode/NoneType, default None): The contract mapping mode to use for the security history request.

data_normalization_mode (DataNormalizationMode/NoneType, default None): The price scaling mode to use for US Equities or continuous Futures contracts. If you don't provide a value, it uses the data normalization mode of the security subscription.

contract_depth_offset (int/NoneType, default None): The desired offset from the current front month for continuous Futures contracts.

PY

future = qb.add_future(Futures.Currencies.BTC)
history = qb.history(
    tickers=[future.symbol],
    start=qb.time - timedelta(days=15),
    end=qb.time,
    resolution=Resolution.MINUTE,
    fill_forward=False,
    extended_market_hours=False,
    data_mapping_mode=DataMappingMode.OPEN_INTEREST,
    data_normalization_mode=DataNormalizationMode.RAW,
    contract_depth_offset=0)

Resolutions

Resolution is the duration of time that's used to sample a data source. The Resolution enumeration has the following members: TICK, SECOND, MINUTE, HOUR, and DAILY.

The default resolution for market data is MINUTE . To set the resolution for a security, pass the resolution

argument when you create the security subscription.

PY

qb.add_equity("SPY", Resolution.DAILY)
When you request historical data, the history method uses the resolution of your security subscription. To get

historical data with a different resolution, pass a resolution argument to the history method.

PY

history = qb.history(spy, 10, Resolution.MINUTE)

Markets

The datasets integrated into the Dataset Market cover many markets. The Market enumeration has a member for each supported market, such as Market.USA.

LEAN can usually determine the correct market based on the ticker you provide when you create the security

subscription. To manually set the market for a security, pass a market argument when you create the security
subscription.

PY

qb.add_equity("SPY", market=Market.USA)

Fill Forward

Fill forward means if there is no data point for the current sample, LEAN uses the previous data point. Fill forward

is the default data setting. To disable fill forward for a security, set the fill_forward argument to false when you

create the security subscription.

PY

qb.add_equity("SPY", fill_forward=False)

When you request historical data, the history method uses the fill forward setting of your security subscription.

To get historical data with a different fill forward setting, pass a fill_forward argument to the history method.

PY

history = qb.history(qb.securities.keys(), qb.time-timedelta(days=10), qb.time, fill_forward=True)

Extended Market Hours

By default, your security subscriptions only cover regular trading hours. To subscribe to pre- and post-market trading hours for a specific asset, enable the extended_market_hours argument when you create the security subscription.

PY

self.add_equity("SPY", extended_market_hours=True)

You only receive extended market hours data if you create the subscription with minute, second, or tick resolution.

If you create the subscription with daily or hourly resolution, the bars only reflect the regular trading hours.
When you request historical data, the history method uses the extended market hours setting of your security

subscription. To get historical data with a different extended market hours setting, pass an

extended_market_hours argument to the history method.

PY

history = qb.history(qb.securities.keys(), qb.time-timedelta(days=10), qb.time, extended_market_hours=False)

Look-Ahead Bias

In the Research Environment, all the historical data is directly available. In backtesting, you can only access the

data that is at or before the algorithm time. If you make a history request for the previous 10 days of data in the

Research Environment, you get the previous 10 days of data from today's date. If you request the same data in a

backtest, you get the previous 10 days of data from the algorithm time.
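For example, the same trailing-period request resolves to different date ranges in each environment. A minimal sketch:

PY

# Research Environment: the window ends at the QuantBook time (today by default).
history = qb.history(symbol, timedelta(days=10))

# Backtesting: the same call is anchored to the algorithm time, so it can
# never return data from after self.time.
# history = self.history(symbol, timedelta(days=10))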

Consolidate Data

History requests usually return data in one of the standard resolutions . To analyze data on custom time frames like

5-minute bars or 4-hour bars, you need to aggregate it. Consider an example where you make a history call for
minute resolution data and want to create 5-minute resolution data.

PY

qb = QuantBook()
symbol = qb.add_equity("SPY").symbol
start_date = datetime(2018, 4, 1)
end_date = datetime(2018, 7, 15)
history = qb.history(symbol, start_date, end_date, Resolution.MINUTE)

To aggregate the data, use a consolidator or the pandas resample method.

Consolidators

The following snippet demonstrates how to use a consolidator to aggregate data:


PY

# Set up a consolidator and a RollingWindow to save the data
consolidator = TradeBarConsolidator(timedelta(7))
window = RollingWindow[TradeBar](20)

# Attach a consolidation handler method that saves the consolidated bars in the RollingWindow
def on_data_consolidated(sender, bar):
    window.add(bar)
consolidator.data_consolidated += on_data_consolidated

# Iterate the historical market data and feed each bar into the consolidator.
# With a multi-index DataFrame, bar.Index[0] is the Symbol and bar.Index[1] is the time.
for bar in history.itertuples():
    tradebar = TradeBar(bar.Index[1], bar.Index[0], bar.open, bar.high, bar.low, bar.close, bar.volume)
    consolidator.update(tradebar)

Resample Method

The resample method converts the frequency of a time series DataFrame into a custom frequency. The method

only works on DataFrame objects that have a datetime index. The history method returns a DataFrame with a
multi-index. The first index is a Symbol index for each security and the second index is a time index for the

timestamps of each row of data. To make the DataFrame compatible with the resample method, call the

reset_index method to drop the Symbol index.

PY

# Drop level 0 index (Symbol index) from the DataFrame
history.reset_index(level=0, drop=True, inplace=True)

The resample method returns a Resampler object, which needs to be downsampled using one of the pandas
downsampling computations . For example, you can use the Resampler.ohlc downsampling method to aggregate

price data.

When you resample a DataFrame with the ohlc downsampling method, it creates an OHLC row for each column in

the DataFrame. To just calculate the OHLC of the close column, select the close column before you resample the

DataFrame. A resample offset of 5T corresponds to a 5-minute resample. Other resampling offsets include 2D = 2

days, 5H = 5 hours, and 3S = 3 seconds.


PY

close_prices = history["close"]

offset = "5T"
close_5min_ohlc = close_prices.resample(offset).ohlc()

Common Errors

If the history request returns an empty DataFrame and you try to slice it, it throws an exception. To avoid issues,

check if the DataFrame contains data before slicing it.

PY

df = qb.history(symbol, 10).close # raises an exception if the request is empty

def get_safe_history_closes(symbols):
    if not symbols:
        print('No symbols')
        return False, None
    df = qb.history(symbols, 100, Resolution.DAILY)
    if df.empty:
        print(f'Empty history for {symbols}')
        return False, None
    return True, df.close.unstack(0)
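For example, a usage sketch of the preceding helper with the symbols already subscribed in the notebook:

PY

success, closes = get_safe_history_closes(list(qb.securities.keys()))
if success:
    print(closes.head())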

If you run the Research Environment on your local machine and history requests return no data, check if your data

directory contains the data you request. To download datasets, see Download .
Datasets > US Equity

Datasets
US Equity

Introduction

This page explains how to request, manipulate, and visualize historical US Equity data.

Create Subscriptions

Follow these steps to subscribe to a US Equity security:

1. Create a QuantBook .

PY

qb = QuantBook()

2. Call the add_equity method with a ticker and then save a reference to the US Equity Symbol .

PY

spy = qb.add_equity("SPY").symbol
tlt = qb.add_equity("TLT").symbol

To view the supported assets in the US Equities dataset, see the Data Explorer .

Get Historical Data

You need a subscription before you can request historical data for a security. On the time dimension, you can

request an amount of historical data based on a trailing number of bars, a trailing period of time, or a defined

period of time. On the security dimension, you can request historical data for a single US Equity, a subset of the US
Equities you created subscriptions for in your notebook, or all of the US Equities in your notebook.

Trailing Number of Bars

To get historical data for a number of trailing bars, call the history method with the Symbol object(s) and an

integer.
PY

# DataFrame of trade and quote data


single_history_df = qb.history(spy, 10)
subset_history_df = qb.history([spy, tlt], 10)
all_history_df = qb.history(qb.securities.keys(), 10)

# DataFrame of trade data


single_history_trade_bar_df = qb.history(TradeBar, spy, 10)
subset_history_trade_bar_df = qb.history(TradeBar, [spy, tlt], 10)
all_history_trade_bar_df = qb.history(TradeBar, qb.securities.keys(), 10)

# DataFrame of quote data


single_history_quote_bar_df = qb.history(QuoteBar, spy, 10)
subset_history_quote_bar_df = qb.history(QuoteBar, [spy, tlt], 10)
all_history_quote_bar_df = qb.history(QuoteBar, qb.securities.keys(), 10)

# Slice objects
all_history_slice = qb.history(10)

# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](spy, 10)
subset_history_trade_bars = qb.history[TradeBar]([spy, tlt], 10)
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), 10)

# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](spy, 10)
subset_history_quote_bars = qb.history[QuoteBar]([spy, tlt], 10)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), 10)

The preceding calls return the most recent bars, excluding periods of time when the exchange was closed.

Trailing Period of Time

To get historical data for a trailing period of time, call the history method with the Symbol object(s) and a

timedelta .
PY

# DataFrame of trade and quote data


single_history_df = qb.history(spy, timedelta(days=3))
subset_history_df = qb.history([spy, tlt], timedelta(days=3))
all_history_df = qb.history(qb.securities.keys(), timedelta(days=3))

# DataFrame of trade data


single_history_trade_bar_df = qb.history(TradeBar, spy, timedelta(days=3))
subset_history_trade_bar_df = qb.history(TradeBar, [spy, tlt], timedelta(days=3))
all_history_trade_bar_df = qb.history(TradeBar, qb.securities.keys(), timedelta(days=3))

# DataFrame of quote data


single_history_quote_bar_df = qb.history(QuoteBar, spy, timedelta(days=3))
subset_history_quote_bar_df = qb.history(QuoteBar, [spy, tlt], timedelta(days=3))
all_history_quote_bar_df = qb.history(QuoteBar, qb.securities.keys(), timedelta(days=3))

# DataFrame of tick data


single_history_tick_df = qb.history(spy, timedelta(days=3), Resolution.TICK)
subset_history_tick_df = qb.history([spy, tlt], timedelta(days=3), Resolution.TICK)
all_history_tick_df = qb.history(qb.securities.keys(), timedelta(days=3), Resolution.TICK)

# Slice objects
all_history_slice = qb.history(timedelta(days=3))

# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](spy, timedelta(days=3))
subset_history_trade_bars = qb.history[TradeBar]([spy, tlt], timedelta(days=3))
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), timedelta(days=3))

# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](spy, timedelta(days=3), Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([spy, tlt], timedelta(days=3), Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), timedelta(days=3),
Resolution.MINUTE)

# Tick objects
single_history_ticks = qb.history[Tick](spy, timedelta(days=3), Resolution.TICK)
subset_history_ticks = qb.history[Tick]([spy, tlt], timedelta(days=3), Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), timedelta(days=3), Resolution.TICK)

The preceding calls return the most recent bars or ticks, excluding periods of time when the exchange was closed.

Defined Period of Time

To get historical data for a specific period of time, call the history method with the Symbol object(s), a start

datetime , and an end datetime . The start and end times you provide are based in the notebook time zone .
PY

start_time = datetime(2021, 1, 1)
end_time = datetime(2021, 2, 1)

# DataFrame of trade and quote data


single_history_df = qb.history(spy, start_time, end_time)
subset_history_df = qb.history([spy, tlt], start_time, end_time)
all_history_df = qb.history(qb.securities.keys(), start_time, end_time)

# DataFrame of trade data


single_history_trade_bar_df = qb.history(TradeBar, spy, start_time, end_time)
subset_history_trade_bar_df = qb.history(TradeBar, [spy, tlt], start_time, end_time)
all_history_trade_bar_df = qb.history(TradeBar, qb.securities.keys(), start_time, end_time)

# DataFrame of quote data


single_history_quote_bar_df = qb.history(QuoteBar, spy, start_time, end_time)
subset_history_quote_bar_df = qb.history(QuoteBar, [spy, tlt], start_time, end_time)
all_history_quote_bar_df = qb.history(QuoteBar, qb.securities.keys(), start_time, end_time)

# DataFrame of tick data


single_history_tick_df = qb.history(spy, start_time, end_time, Resolution.TICK)
subset_history_tick_df = qb.history([spy, tlt], start_time, end_time, Resolution.TICK)
all_history_tick_df = qb.history(qb.securities.keys(), start_time, end_time, Resolution.TICK)

# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](spy, start_time, end_time)
subset_history_trade_bars = qb.history[TradeBar]([spy, tlt], start_time, end_time)
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), start_time, end_time)

# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](spy, start_time, end_time, Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([spy, tlt], start_time, end_time, Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), start_time, end_time,
Resolution.MINUTE)

# Tick objects
single_history_ticks = qb.history[Tick](spy, start_time, end_time, Resolution.TICK)
subset_history_ticks = qb.history[Tick]([spy, tlt], start_time, end_time, Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), start_time, end_time, Resolution.TICK)

The preceding calls return the bars or ticks that have a timestamp within the defined period of time.

Resolutions

The following table shows the available resolutions and data formats for Equity subscriptions:

Resolution   TradeBar   QuoteBar   Trade Tick   Quote Tick
TICK         -          -          ✓            ✓
SECOND       ✓          ✓          -            -
MINUTE       ✓          ✓          -            -
HOUR         ✓          -          -            -
DAILY        ✓          -          -            -

Markets
LEAN groups all of the US Equity exchanges under Market.USA .

Data Normalization

The data normalization mode defines how historical data is adjusted for corporate actions . By default, LEAN

adjusts US Equity data for splits and dividends to produce a smooth price curve, but the following data

normalization modes are available:

If you use ADJUSTED, SPLIT_ADJUSTED, or TOTAL_RETURN, we use the entire split and dividend history to adjust historical prices. This process ensures you get the same adjusted prices, regardless of the QuantBook time. If you use SCALED_RAW, we use the split and dividend history before the QuantBook's end date to adjust historical prices.

To set the data normalization mode for a security, pass a data_normalization_mode argument to the add_equity

method.

PY

spy = qb.add_equity("SPY", data_normalization_mode=DataNormalizationMode.RAW).symbol

When you request historical data, the history method uses the data normalization of your security subscription.

To get historical data with a different data normalization, pass a data_normalization_mode argument to the

history method.

PY

history = qb.history(qb.securities.keys(), qb.time-timedelta(days=10), qb.time, data_normalization_mode=DataNormalizationMode.SPLIT_ADJUSTED)

Wrangle Data

You need some historical data to perform wrangling operations. The process to manipulate the historical data

depends on its data type. To display pandas objects, run a cell in a notebook with the pandas object as the last line.

To display other data formats, call the print method.

DataFrame Objects

If the history method returns a DataFrame , the first level of the DataFrame index is the encoded Equity Symbol

and the second level is the end_time of the data sample. The columns of the DataFrame are the data properties.

To select the historical data of a single Equity, index the loc property of the DataFrame with the Equity Symbol .
PY

all_history_df.loc[spy] # or all_history_df.loc['SPY']

To select a column of the DataFrame , index it with the column name.

PY

all_history_df.loc[spy]['close']

If you request historical data for multiple Equities, you can transform the DataFrame so that it's a time series of
close values for all of the Equities. To transform the DataFrame , select the column you want to display for each

Equity and then call the unstack method.

PY

all_history_df['close'].unstack(level=0)

The DataFrame is transformed so that the column indices are the Symbol of each Equity and each row contains the

close value.
If you prefer to display the ticker of each Symbol instead of the string representation of the SecurityIdentifier ,

follow these steps:

1. Create a dictionary where the keys are the string representations of each SecurityIdentifier and the

values are the ticker.

PY

tickers_by_id = {str(x.id): x.value for x in qb.securities.keys()}

2. Get the values of the symbol level of the DataFrame index and create a list of tickers.

PY

tickers = set([tickers_by_id[x] for x in all_history_df.index.get_level_values('symbol')])

3. Set the values of the symbol level of the DataFrame index to the list of tickers.

PY

all_history_df.index.set_levels(tickers, 'symbol', inplace=True)

The new DataFrame is keyed by the ticker.

PY

all_history_df.loc[spy.value] # or all_history_df.loc["SPY"]

Slice Objects

If the history method returns Slice objects, iterate through the Slice objects to get each one. The Slice objects

may not have data for all of your Equity subscriptions. To avoid issues, check if the Slice contains data for your
Equity before you index it with the Equity Symbol .

You can also iterate through each TradeBar and QuoteBar in the Slice .

PY

for slice in all_history_slice:
    for kvp in slice.bars:
        symbol = kvp.key
        trade_bar = kvp.value
    for kvp in slice.quote_bars:
        symbol = kvp.key
        quote_bar = kvp.value
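As noted above, the Slice objects may not contain data for every subscription. A minimal sketch of the check before indexing:

PY

for slice in all_history_slice:
    if slice.contains_key(spy):
        data_for_spy = slice[spy]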

TradeBar Objects

If the history method returns TradeBar objects, iterate through the TradeBar objects to get each one.

PY

for trade_bar in single_history_trade_bars:
    print(trade_bar)

If the history method returns TradeBars , iterate through the TradeBars to get the TradeBar of each Equity. The

TradeBars may not have data for all of your Equity subscriptions. To avoid issues, check if the TradeBars object

contains data for your security before you index it with the Equity Symbol .

PY

for trade_bars in all_history_trade_bars:
    if trade_bars.contains_key(spy):
        trade_bar = trade_bars[spy]

You can also iterate through each of the TradeBars .


PY

for trade_bars in all_history_trade_bars:
    for kvp in trade_bars:
        symbol = kvp.key
        trade_bar = kvp.value

QuoteBar Objects

If the history method returns QuoteBar objects, iterate through the QuoteBar objects to get each one.

PY

for quote_bar in single_history_quote_bars:
    print(quote_bar)

If the history method returns QuoteBars , iterate through the QuoteBars to get the QuoteBar of each Equity. The
QuoteBars may not have data for all of your Equity subscriptions. To avoid issues, check if the QuoteBars object

contains data for your security before you index it with the Equity Symbol .

PY

for quote_bars in all_history_quote_bars:
    if quote_bars.contains_key(spy):
        quote_bar = quote_bars[spy]

You can also iterate through each of the QuoteBars .

PY

for quote_bars in all_history_quote_bars:
    for kvp in quote_bars:
        symbol = kvp.key
        quote_bar = kvp.value

Tick Objects

If the history method returns Tick objects, iterate through the Tick objects to get each one.

PY

for tick in single_history_ticks:
    print(tick)

If the history method returns Ticks, iterate through the Ticks to get the Tick of each Equity. The Ticks may not have data for all of your Equity subscriptions. To avoid issues, check if the Ticks object contains data for your security before you index it with the Equity Symbol.

PY

for ticks in all_history_ticks:
    if ticks.contains_key(spy):
        spy_ticks = ticks[spy]  # use a new name to avoid clobbering the loop variable

You can also iterate through each of the Ticks .


PY

for ticks in all_history_ticks:
    for kvp in ticks:
        symbol = kvp.key
        tick = kvp.value

The Ticks objects only contain the last tick of each security for that particular timeslice.

Plot Data

You need some historical Equity data to produce plots. You can use many of the supported plotting libraries to

visualize data in various formats. For example, you can plot candlestick and line charts.

Candlestick Chart

Follow these steps to plot candlestick charts:

1. Get some historical data .

PY

history = qb.history(spy, datetime(2021, 11, 23), datetime(2021, 12, 8), Resolution.DAILY).loc[spy]

2. Import the plotly library.

PY

import plotly.graph_objects as go

3. Create a Candlestick chart.

PY

candlestick = go.Candlestick(x=history.index,
                             open=history['open'],
                             high=history['high'],
                             low=history['low'],
                             close=history['close'])

4. Create a Layout .

PY

layout = go.Layout(title=go.layout.Title(text='SPY OHLC'),
                   xaxis_title='Date',
                   yaxis_title='Price',
                   xaxis_rangeslider_visible=False)

5. Create the Figure .

PY

fig = go.Figure(data=[candlestick], layout=layout)

6. Show the plot.


PY

fig.show()

Candlestick charts display the open, high, low, and close prices of the security.

Line Chart

Follow these steps to plot line charts using built-in methods :

1. Get some historical data.

PY

history = qb.history([spy, tlt], datetime(2021, 11, 23), datetime(2021, 12, 8), Resolution.DAILY)

2. Select the data to plot.

PY

volume = history['volume'].unstack(level=0)

3. Call the plot method on the pandas object.

PY

volume.plot(title="Volume", figsize=(15, 10))

4. Show the plot.


PY

plt.show()

Line charts display the value of the property you selected in a time series.

Common Errors

Some factor files have INF split values, which indicate that the stock has so many splits that prices can't be
calculated with correct numerical precision. To allow history requests with these symbols, we need to move the

starting date forward when reading the data or use raw data normalization . If there are numerical precision errors

in the factor files for a security in your history request, LEAN throws the following error:

"Warning: when performing history requests, the start date will be adjusted if there are numerical precision errors
in the factor files."
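If you encounter this, a sketch of the raw data normalization workaround mentioned above (the symbol and dates are illustrative):

PY

# Request raw prices so the factor file isn't applied at all.
history = qb.history(
    symbol, datetime(2020, 1, 1), datetime(2021, 1, 1),
    data_normalization_mode=DataNormalizationMode.RAW)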
Datasets > Equity Fundamental Data

Datasets
Equity Fundamental Data

Introduction

This page explains how to request, manipulate, and visualize historical Equity Fundamental data. Corporate

fundamental data is available through the US Fundamental Data from Morningstar .

Create Subscriptions

Follow these steps to subscribe to an Equity security:

1. Create a QuantBook .

PY

qb = QuantBook()

2. Call the add_equity method with a ticker and then save a reference to the Equity Symbol .

PY

symbols = [
    qb.add_equity(ticker, Resolution.DAILY).symbol
    for ticker in [
        "AAL",   # American Airlines Group, Inc.
        "ALGT",  # Allegiant Travel Company
        "ALK",   # Alaska Air Group, Inc.
        "DAL",   # Delta Air Lines, Inc.
        "LUV",   # Southwest Airlines Company
        "SKYW",  # SkyWest, Inc.
        "UAL"    # United Air Lines
    ]
]

Get Historical Data

You need a subscription before you can request historical fundamental data for US Equities. On the time

dimension, you can request an amount of historical data based on a trailing number of bars, a trailing period of

time, or a defined period of time. On the security dimension, you can request historical data for a single US Equity,

a set of US Equities, or all of the US Equities in the US Fundamental dataset. On the property dimension, you can

call the history method to get all fundamental properties .

When you call the history method, you can request Fundamental or Fundamentals objects. If you use

Fundamental , the method returns all fundamental properties for the Symbol object(s) you provide. If you use

Fundamentals , the method returns all fundamental properties for all the US Equities in the US Fundamental dataset

that were trading during that time period you request, including companies that no longer trade.

Trailing Number of Trading Days


To get historical data for a number of trailing trading days, call the history method with the number of trading

days. If you didn't use Resolution.DAILY when you subscribed to the US Equities, pass it as the last argument to

the history method.

PY

# DataFrame of fundamental data


single_fundamental_df = qb.history(Fundamental, symbols[0], 10, flatten=True)
set_fundamental_df = qb.history(Fundamental, symbols, 10, flatten=True)
all_fundamental_df = qb.history(Fundamental, qb.securities.keys(), 10, flatten=True)
all_fundamentals_df = qb.history(Fundamentals, 10, flatten=True)

# Fundamental objects
single_fundamental_history = qb.history[Fundamental](symbols[0], 10)
set_fundamental_history = qb.history[Fundamental](symbols, 10)
all_fundamental_history = qb.history[Fundamental](qb.securities.keys(), 10)

# Fundamentals objects
all_fundamentals_history = qb.history[Fundamentals](qb.securities.keys(), 10)

The preceding calls return fundamental data for the 10 most recent trading days.

Trailing Period of Time

To get historical data for a trailing period of time, call the history method with a timedelta object.

PY

# DataFrame of fundamental data


single_fundamental_df = qb.history(Fundamental, symbols[0], timedelta(days=10), flatten=True)
set_fundamental_df = qb.history(Fundamental, symbols, timedelta(days=10), flatten=True)
all_fundamental_df = qb.history(Fundamental, qb.securities.keys(), timedelta(days=10), flatten=True)
all_fundamentals_df = qb.history(Fundamentals, timedelta(5), flatten=True)

# Fundamental objects
single_fundamental_history = qb.history[Fundamental](symbols[0], timedelta(days=10))
set_fundamental_history = qb.history[Fundamental](symbols, timedelta(days=10))
all_fundamental_history = qb.history[Fundamental](qb.securities.keys(), timedelta(days=10))

# Fundamentals objects
all_fundamentals_history = qb.history[Fundamentals](timedelta(days=10))

The preceding calls return fundamental data for the most recent trading days.

Defined Period of Time

To get the historical data of all the fundamental properties over specific period of time, call the history method

with a start datetime and an end datetime . To view the possible fundamental properties, see the Fundamental

attributes in Data Point Attributes . The start and end times you provide to these methods are based in the

notebook time zone .


PY

start_date = datetime(2021, 1, 1)
end_date = datetime(2021, 2, 1)

# DataFrame of all fundamental properties


single_fundamental_df = qb.history(Fundamental, symbols[0], start_date, end_date, flatten=True)
set_fundamental_df = qb.history(Fundamental, symbols, start_date, end_date, flatten=True)
all_fundamental_df = qb.history(Fundamental, qb.securities.keys(), start_date, end_date, flatten=True)
all_fundamentals_df = qb.history(Fundamentals, start_date, end_date, flatten=True)

# Fundamental objects
single_fundamental_history = qb.history[Fundamental](symbols[0], start_date, end_date)
set_fundamental_history = qb.history[Fundamental](symbols, start_date, end_date)
all_fundamental_history = qb.history[Fundamental](qb.securities.keys(), start_date, end_date)

# Fundamentals objects
all_fundamentals_history = qb.history[Fundamentals](qb.securities.keys(), start_date, end_date)

The preceding method returns the fundamental property values that are timestamped within the defined period of

time.

Wrangle Data

You need some historical data to perform wrangling operations. To display pandas objects, run a cell in a notebook

with the pandas object as the last line. To display other data formats, call the print method.

DataFrame Objects

The history method returns a multi-index DataFrame where the first level is the Equity Symbol and the second level is the end_time of the trading day. The columns of the DataFrame are the names of the fundamental properties.

To access an attribute from one of the cells in the DataFrame, select the value in the cell and then access the

object's property.

PY

single_fundamental_df.iloc[0].companyprofile.share_class_level_shares_outstanding

Fundamental Objects

If you pass a Symbol to the history[Fundamental] method, run the following code to get the fundamental

properties over time:


PY

for fundamental in single_fundamental_history:
    symbol = fundamental.symbol
    end_time = fundamental.end_time
    pe_ratio = fundamental.valuation_ratios.pe_ratio

If you pass a list of Symbol objects to the history[Fundamental] method, run the following code to get the
fundamental properties over time:

PY

for fundamental_dict in set_fundamental_history: # Iterate trading days
    for symbol, fundamental in fundamental_dict.items(): # Iterate Symbols
        end_time = fundamental.end_time
        pe_ratio = fundamental.valuation_ratios.pe_ratio

Fundamentals Objects

If you request all fundamental properties for all US Equities with the history[Fundamentals] method, run the

following code to get the fundamental properties over time:

PY

for fundamentals_dict in all_fundamentals_history: # Iterate trading days
    fundamentals = list(fundamentals_dict.values)[0]
    end_time = fundamentals.end_time
    for fundamental in fundamentals.data: # Iterate Symbols
        if not fundamental.has_fundamental_data:
            continue
        symbol = fundamental.symbol
        pe_ratio = fundamental.valuation_ratios.pe_ratio

Plot Data

You need some historical Equity fundamental data to produce plots. You can use many of the supported plotting

libraries to visualize data in various formats. For example, you can plot line charts.

Follow these steps to plot line charts using built-in methods :

1. Get some historical data .

PY

history = qb.history[Fundamental](symbols, datetime(2014, 1, 1), datetime(2015, 1, 1))

2. Convert to pandas DataFrame.


PY

data = {}
for fundamental_dict in history: # Iterate trading days
for symbol, fundamental in fundamental_dict.items(): # Iterate Symbols
datum = data.get(symbol, dict())
datum['index'] = datum.get('index', [])
datum['index'].append(fundamental.end_time)
datum['pe_ratio'] = datum.get('pe_ratio', [])
datum['pe_ratio'].append(fundamental.valuation_ratios.pe_ratio)
data[symbol] = datum

df = pd.DataFrame()
for symbol, datum in data.items():
df_symbol = pd.DataFrame({symbol: pd.Series(datum['pe_ratio'], index=datum['index'])})
df = pd.concat([df, df_symbol], axis=1)

3. Call the plot method on the DataFrame.

PY

df.plot(title='PE Ratio Over Time', figsize=(15, 8))

4. Show the plot.

PY

plt.show()

Line charts display the value of the property you selected in a time series.
Datasets > Equity Options

Datasets
Equity Options

Datasets > Equity Options > Key Concepts

Equity Options
Key Concepts

Introduction

An Equity Option is a financial derivative that gives the holder the right (but not the obligation) to buy or sell the

underlying Equity, such as Apple, at the stated exercise price. This page explains the basics of Equity Option data

in the Research Environment. To get some data, see Universes or Individual Contracts . For more information about

the specific datasets we use, see the US Equity Options and US Equity Option Universe dataset listings.

Resolutions

The following table shows the available resolutions and data formats for Equity Option contract subscriptions:

Resolution   TradeBar   QuoteBar   Trade Tick   Quote Tick

TICK         -          -          -            -

SECOND       -          -          -            -

MINUTE       ✓          ✓          -            -

HOUR         ✓          ✓          -            -

DAILY        ✓          ✓          -            -

Markets

LEAN groups all of the US Equity Option exchanges under Market.USA , so you don't need to pass a Market to the

add_option or add_option_contract methods.

Data Normalization

The data normalization mode doesn't affect data from history requests. By default, LEAN doesn't adjust Equity
Options data for splits and dividends of their underlying. If you change the data normalization mode, it won't
change the outcome.


Datasets > Equity Options > Universes

Equity Options
Universes

Introduction

This page explains how to request historical data for a universe of Equity Option contracts.

Create Subscriptions

Follow these steps to subscribe to an Equity Option universe:

1. Create a QuantBook .

PY

qb = QuantBook()

2. Subscribe to the underlying Equity with raw data normalization and save a reference to the Equity Symbol .

PY

equity_symbol = qb.add_equity("SPY", data_normalization_mode=DataNormalizationMode.RAW).symbol

To view the supported underlying assets in the US Equity Options dataset, see the Data Explorer .

3. Call the add_option method with the underlying Equity Symbol .

PY

option = qb.add_option(equity_symbol)

Price History

The contract filter determines which Equity Option contracts are in your universe each trading day. The default

filter selects the contracts with the following characteristics:

Standard type (exclude weeklys)

Within 1 strike price of the underlying asset price

Expire within 35 days

To change the filter, call the set_filter method.

PY

# Set the contract filter to select contracts that have the strike price
# within 1 strike level and expire within 90 days.
option.set_filter(-1, 1, 0, 90)
To get the prices and volumes for all of the Equity Option contracts that pass your filter during a specific period of

time, call the option_history method with the underlying Equity Symbol object, a start datetime , and an end

datetime .

PY

option_history = qb.option_history(
equity_symbol, datetime(2024, 1, 1), datetime(2024, 1, 5), Resolution.MINUTE,
fill_forward=False, extended_market_hours=False
)

To convert the OptionHistory object to a DataFrame that contains the trade and quote information of each

contract and the underlying, use the data_frame property.

PY

option_history.data_frame

To get the expiration dates of all the contracts in an OptionHistory object, call the get_expiry_dates method.

PY

option_history.get_expiry_dates()

To get the strike prices of all the contracts in an OptionHistory object, call the get_strikes method.

PY

option_history.get_strikes()

Daily Price and Greeks History

To get daily data on all the tradable contracts for a given date, call the history method with the canonical Option

Symbol, a start date, and an end date. This method returns the entire Option chain for each trading day, not the

subset of contracts that pass your universe filter. The daily Option chains contain the prices, volume, open
interest, implied volatility, and Greeks of each contract.

PY

# DataFrame format
history_df = qb.history(option.symbol, datetime(2024, 1, 1), datetime(2024, 1, 5), flatten=True)

# OptionUniverse objects
history = qb.history[OptionUniverse](option.symbol, datetime(2024, 1, 1), datetime(2024, 1, 5))
for chain in history:
end_time = chain.end_time
filtered_chain = [contract for contract in chain if contract.greeks.delta > 0.3]
for contract in filtered_chain:
price = contract.price
iv = contract.implied_volatility

The method represents each contract with an OptionUniverse object, which has the following properties:
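As a rough sketch of reading those properties, the following loop pulls a few values from each OptionUniverse object in the history[OptionUniverse] result above. The price, implied_volatility, and greeks names appear in the preceding example; volume and open_interest are assumptions based on LEAN's Python naming convention:

PY

for chain in history:
    for contract in chain:
        symbol = contract.symbol                 # contract Symbol
        price = contract.price                   # daily price
        volume = contract.volume                 # daily volume (assumed name)
        open_interest = contract.open_interest   # daily open interest (assumed name)
        iv = contract.implied_volatility         # implied volatility
        delta = contract.greeks.delta            # Greeks object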
Datasets > Equity Options > Individual Contracts

Equity Options
Individual Contracts

Introduction

This page explains how to request historical data for individual Equity Option contracts. The history requests on

this page only return the prices and open interest of the Option contracts, not their implied volatility or Greeks. For

information about history requests that return the daily implied volatility and Greeks, see Universes .

Create Subscriptions

Follow these steps to subscribe to individual Equity Option contracts:

1. Create a QuantBook .

PY

qb = QuantBook()

2. Add the underlying Equity with raw data normalization .

PY

underlying_symbol = qb.add_equity("SPY", data_normalization_mode=DataNormalizationMode.RAW).symbol

To view the supported underlying assets in the US Equity Options dataset, see the Data Explorer .

3. Set the start date to a date in the past that you want to use as the analysis date.

PY

qb.set_start_date(2024, 1, 1)

The method that you call in the next step returns data on all the contracts that were tradable on this date.

4. Call the option_chain method with the underlying Equity Symbol .

PY

# Get the Option contracts that were tradable on January 1st, 2024.
chain = qb.option_chain(underlying_symbol, flatten=True)

This method returns an OptionChain object, which represents an entire chain of Option contracts for a single underlying security. You can format the chain data into a DataFrame where each row represents a single contract.

5. Sort and filter the data to select the specific contract(s) you want to analyze.
PY

# Get the contracts available to trade (in DataFrame format).


chain = chain.data_frame

# Select a contract.
expiry = chain.expiry.min()
contract_symbol = chain[
# Select call contracts with the closest expiry.
(chain.expiry == expiry) &
(chain.right == OptionRight.CALL) &
# Select contracts with a 0.3-0.7 delta.
(chain.delta > 0.3) &
(chain.delta < 0.7)
# Select the contract with the largest open interest.
].sort_values('openinterest').index[-1]

6. Call the add_option_contract method with an OptionContract Symbol and disable fill-forward.

PY

# Subscribe to the target contract.


option_contract = qb.add_option_contract(contract_symbol, fill_forward=False)

Disable fill-forward because there are only a few OpenInterest data points per day.

Trade History

TradeBar objects are price bars that consolidate individual trades from the exchanges. They contain the open,

high, low, close, and volume of trading activity over a period of time.

To get trade data, call the history or history[TradeBar] method with the contract Symbol object(s).
PY

# DataFrame format
history_df = qb.history(TradeBar, contract_symbol, timedelta(3))
display(history_df)

# TradeBar objects
history = qb.history[TradeBar](contract_symbol, timedelta(3))
for trade_bar in history:
print(trade_bar)

TradeBar objects have the following properties:
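As an illustration, this minimal sketch reads the common TradeBar fields from the history[TradeBar] result above; the property names follow LEAN's Python convention:

PY

for trade_bar in history:
    symbol = trade_bar.symbol        # contract Symbol
    end_time = trade_bar.end_time    # time the bar closed
    o, h, l, c = trade_bar.open, trade_bar.high, trade_bar.low, trade_bar.close
    volume = trade_bar.volume        # contracts traded during the bar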

Quote History

QuoteBar objects are bars that consolidate NBBO quotes from the exchanges. They contain the open, high, low,

and close prices of the bid and ask. The open , high , low , and close properties of the QuoteBar object are the

mean of the respective bid and ask prices. If the bid or ask portion of the QuoteBar has no data, the open , high ,

low , and close properties of the QuoteBar copy the values of either the bid or ask instead of taking their mean.

To get quote data, call the history or history[QuoteBar] method with the contract Symbol object(s).

PY

# DataFrame format
history_df = qb.history(QuoteBar, contract_symbol, timedelta(3))
display(history_df)

# QuoteBar objects
history = qb.history[QuoteBar](contract_symbol, timedelta(3))
for quote_bar in history:
print(quote_bar)

QuoteBar objects have the following properties:
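As an illustration, this minimal sketch reads the bid and ask sides of each QuoteBar from the history[QuoteBar] result above. Note that the bid or ask Bar can be None when that side had no quotes:

PY

for quote_bar in history:
    mid_close = quote_bar.close                                 # mean of bid and ask close
    bid_close = quote_bar.bid.close if quote_bar.bid else None  # bid-side Bar
    ask_close = quote_bar.ask.close if quote_bar.ask else None  # ask-side Bar
    bid_size = quote_bar.last_bid_size                          # size of the last bid quote
    ask_size = quote_bar.last_ask_size                          # size of the last ask quote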


Open Interest History

Open interest is the number of outstanding contracts that haven't been settled. It provides a measure of investor

interest and the market liquidity, so it's a popular metric to use for contract selection. Open interest is calculated

once per day.

To get open interest data, call the history or history[OpenInterest] method with the contract Symbol object(s).

PY

# DataFrame format
history_df = qb.history(OpenInterest, contract_symbol, timedelta(3))
display(history_df)

# OpenInterest objects
history = qb.history[OpenInterest](contract_symbol, timedelta(3))
for open_interest in history:
print(open_interest)

OpenInterest objects have the following properties:
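As an illustration, this minimal sketch reads each OpenInterest data point from the history[OpenInterest] result above:

PY

for open_interest in history:
    symbol = open_interest.symbol      # contract Symbol
    end_time = open_interest.end_time  # time of the reading
    value = open_interest.value        # number of outstanding contracts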

Greeks and IV History

The Greeks are measures that describe the sensitivity of an Option's price to various factors like underlying price

changes (Delta), time decay (Theta), volatility (Vega), and interest rates (Rho), while Implied Volatility (IV)

represents the market's expectation of the underlying asset's volatility over the life of the Option.

Follow these steps to get the Greeks and IV data:

1. Create the mirror contract Symbol .

PY

mirror_contract_symbol = Symbol.create_option(
option_contract.underlying.symbol, contract_symbol.id.market, option_contract.style,
OptionRight.CALL if option_contract.right == OptionRight.PUT else OptionRight.PUT,
option_contract.strike_price, option_contract.expiry
)

2. Set up the risk-free interest rate, dividend yield, and Option pricing models.

In our research, we found the Forward Tree model to be the best pricing model for indicators.

PY

risk_free_rate_model = qb.risk_free_interest_rate_model
dividend_yield_model = DividendYieldProvider(underlying_symbol)
option_model = OptionPricingModelType.FORWARD_TREE

3. Define a method to return the IV & Greeks indicator values for each contract.
PY

def greeks_and_iv(contracts, period, risk_free_rate_model, dividend_yield_model, option_model):


# Get the call and put contract.
call, put = sorted(contracts, key=lambda s: s.id.option_right)

def get_values(indicator_class, contract, mirror_contract):


return qb.indicator_history(
indicator_class(contract, risk_free_rate_model, dividend_yield_model, mirror_contract,
option_model),
[contract, mirror_contract, contract.underlying],
period
).data_frame.current

return pd.DataFrame({
'iv_call': get_values(ImpliedVolatility, call, put),
'iv_put': get_values(ImpliedVolatility, put, call),
'delta_call': get_values(Delta, call, put),
'delta_put': get_values(Delta, put, call),
'gamma_call': get_values(Gamma, call, put),
'gamma_put': get_values(Gamma, put, call),
'rho_call': get_values(Rho, call, put),
'rho_put': get_values(Rho, put, call),
'vega_call': get_values(Vega, call, put),
'vega_put': get_values(Vega, put, call),
'theta_call': get_values(Theta, call, put),
'theta_put': get_values(Theta, put, call),
})

4. Call the preceding method and display the results.

PY

greeks_and_iv([contract_symbol, mirror_contract_symbol], 15, risk_free_rate_model,
              dividend_yield_model, option_model)

The DataFrame can have NaN entries if there is no data for the contracts or the underlying asset at a moment in

time.
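If you only want fully populated rows, you can drop the NaN entries. A minimal sketch, assuming greeks_and_iv_df is a name introduced here to hold the DataFrame returned by the preceding call:

PY

greeks_and_iv_df = greeks_and_iv([contract_symbol, mirror_contract_symbol], 15,
                                 risk_free_rate_model, dividend_yield_model, option_model)
# Keep only the rows where every indicator produced a value.
greeks_and_iv_df = greeks_and_iv_df.dropna()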

Examples

The following examples demonstrate some common practices for analyzing individual Equity Option contracts in

the Research Environment.


Example 1: Contract Trade History

The following notebook plots the historical prices of an SPY Equity Option contract using Plotly :

PY

import plotly.graph_objects as go

# Get the SPY Option chain for January 1, 2024.


qb = QuantBook()
underlying_symbol = qb.add_equity("SPY", data_normalization_mode=DataNormalizationMode.RAW).symbol
qb.set_start_date(2024, 1, 1)
chain = qb.option_chain(underlying_symbol, flatten=True).data_frame

# Select a contract from the chain.


expiry = chain.expiry.min()
contract_symbol = chain[
(chain.expiry == expiry) &
(chain.right == OptionRight.CALL) &
(chain.delta > 0.3) &
(chain.delta < 0.7)
].sort_values('openinterest').index[-1]

# Add the target contract.


qb.add_option_contract(contract_symbol)

# Get the contract history.


history = qb.history(contract_symbol, timedelta(3))

# Plot the price history.


go.Figure(
data=go.Candlestick(
x=history.index.levels[4],
open=history['open'],
high=history['high'],
low=history['low'],
close=history['close']
),
layout=go.Layout(
title=go.layout.Title(text=f'{contract_symbol.value} OHLC'),
xaxis_title='Date',
yaxis_title='Price',
xaxis_rangeslider_visible=False
)
).show()
Example 2: Contract Open Interest History

The following notebook plots the historical open interest of a TSLA Equity Option contract using Matplotlib :

PY

# Get the TSLA Option chain for January 1, 2024.


qb = QuantBook()
underlying_symbol = qb.add_equity("TSLA", data_normalization_mode=DataNormalizationMode.RAW).symbol
qb.set_start_date(2024, 1, 1)
chain = qb.option_chain(underlying_symbol, flatten=True).data_frame

# Select a contract from the chain.


strike_distance = (chain.strike - chain.underlyinglastprice).abs()
target_strike_distance = strike_distance.min()
chain = chain.loc[strike_distance[strike_distance == target_strike_distance].index]
contract_symbol = chain.sort_values('impliedvolatility').index[-1]

# Add the target contract.


qb.add_option_contract(contract_symbol, fill_forward=False)

# Get the contract's open interest history.


history = qb.history(OpenInterest, contract_symbol, timedelta(90))
history.index = history.index.droplevel([0, 1, 2])
history = history['openinterest'].unstack(0)[contract_symbol]

# Plot the open interest history.


history.plot(title=f'{contract_symbol.value} Open Interest')
plt.show()
Datasets > Crypto

Datasets
Crypto

Introduction

This page explains how to request, manipulate, and visualize historical Crypto data.

Create Subscriptions

Follow these steps to subscribe to a Crypto security:

1. Create a QuantBook .

PY

qb = QuantBook()

2. (Optional) Set the time zone to the data time zone .

PY

qb.set_time_zone(TimeZones.UTC)

3. Call the add_crypto method with a ticker and then save a reference to the Crypto Symbol .

PY

btcusd = qb.add_crypto("BTCUSD").symbol
ethusd = qb.add_crypto("ETHUSD").symbol

To view the supported assets in the Crypto datasets, see the Supported Assets section of the CoinAPI dataset

listings .

Get Historical Data

You need a subscription before you can request historical data for a security. On the time dimension, you can

request an amount of historical data based on a trailing number of bars, a trailing period of time, or a defined

period of time. On the security dimension, you can request historical data for a single Cryptocurrency, a subset of

the Cryptocurrencies you created subscriptions for in your notebook, or all of the Cryptocurrencies in your

notebook.

Trailing Number of Bars

To get historical data for a number of trailing bars, call the history method with the Symbol object(s) and an

integer.
PY

# DataFrame of trade and quote data


single_history_df = qb.history(btcusd, 10)
subset_history_df = qb.history([btcusd, ethusd], 10)
all_history_df = qb.history(qb.securities.keys(), 10)

# DataFrame of trade data


single_history_trade_bar_df = qb.history(TradeBar, btcusd, 10)
subset_history_trade_bar_df = qb.history(TradeBar, [btcusd, ethusd], 10)
all_history_trade_bar_df = qb.history(TradeBar, qb.securities.keys(), 10)

# DataFrame of quote data


single_history_quote_bar_df = qb.history(QuoteBar, btcusd, 10)
subset_history_quote_bar_df = qb.history(QuoteBar, [btcusd, ethusd], 10)
all_history_quote_bar_df = qb.history(QuoteBar, qb.securities.keys(), 10)

# Slice objects
all_history_slice = qb.history(10)

# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](btcusd, 10)
subset_history_trade_bars = qb.history[TradeBar]([btcusd, ethusd], 10)
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), 10)

# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](btcusd, 10)
subset_history_quote_bars = qb.history[QuoteBar]([btcusd, ethusd], 10)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), 10)

Trailing Period of Time

To get historical data for a trailing period of time, call the history method with the Symbol object(s) and a

timedelta .
PY

# DataFrame of trade and quote data


single_history_df = qb.history(btcusd, timedelta(days=3))
subset_history_df = qb.history([btcusd, ethusd], timedelta(days=3))
all_history_df = qb.history(qb.securities.keys(), timedelta(days=3))

# DataFrame of trade data


single_history_trade_bar_df = qb.history(TradeBar, btcusd, timedelta(days=3))
subset_history_trade_bar_df = qb.history(TradeBar, [btcusd, ethusd], timedelta(days=3))
all_history_trade_bar_df = qb.history(TradeBar, qb.securities.keys(), timedelta(days=3))

# DataFrame of quote data


single_history_quote_bar_df = qb.history(QuoteBar, btcusd, timedelta(days=3))
subset_history_quote_bar_df = qb.history(QuoteBar, [btcusd, ethusd], timedelta(days=3))
all_history_quote_bar_df = qb.history(QuoteBar, qb.securities.keys(), timedelta(days=3))

# DataFrame of tick data


single_history_tick_df = qb.history(btcusd, timedelta(days=3), Resolution.TICK)
subset_history_tick_df = qb.history([btcusd, ethusd], timedelta(days=3), Resolution.TICK)
all_history_tick_df = qb.history(qb.securities.keys(), timedelta(days=3), Resolution.TICK)

# Slice objects
all_history_slice = qb.history(timedelta(days=3))

# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](btcusd, timedelta(days=3))
subset_history_trade_bars = qb.history[TradeBar]([btcusd, ethusd], timedelta(days=3))
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), timedelta(days=3))

# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](btcusd, timedelta(days=3), Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([btcusd, ethusd], timedelta(days=3), Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), timedelta(days=3),
Resolution.MINUTE)

# Tick objects
single_history_ticks = qb.history[Tick](btcusd, timedelta(days=3), Resolution.TICK)
subset_history_ticks = qb.history[Tick]([btcusd, ethusd], timedelta(days=3), Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), timedelta(days=3), Resolution.TICK)

Defined Period of Time

To get historical data for a specific period of time, call the history method with the Symbol object(s), a start

datetime , and an end datetime . The start and end times you provide are based in the notebook time zone .
PY

start_time = datetime(2021, 1, 1)
end_time = datetime(2021, 2, 1)

# DataFrame of trade and quote data


single_history_df = qb.history(btcusd, start_time, end_time)
subset_history_df = qb.history([btcusd, ethusd], start_time, end_time)
all_history_df = qb.history(qb.securities.keys(), start_time, end_time)

# DataFrame of trade data


single_history_trade_bar_df = qb.history(TradeBar, btcusd, start_time, end_time)
subset_history_trade_bar_df = qb.history(TradeBar, [btcusd, ethusd], start_time, end_time)
all_history_trade_bar_df = qb.history(TradeBar, qb.securities.keys(), start_time, end_time)

# DataFrame of quote data


single_history_quote_bar_df = qb.history(QuoteBar, btcusd, start_time, end_time)
subset_history_quote_bar_df = qb.history(QuoteBar, [btcusd, ethusd], start_time, end_time)
all_history_quote_bar_df = qb.history(QuoteBar, qb.securities.keys(), start_time, end_time)

# DataFrame of tick data


single_history_tick_df = qb.history(btcusd, start_time, end_time, Resolution.TICK)
subset_history_tick_df = qb.history([btcusd, ethusd], start_time, end_time, Resolution.TICK)
all_history_tick_df = qb.history(qb.securities.keys(), start_time, end_time, Resolution.TICK)

# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](btcusd, start_time, end_time)
subset_history_trade_bars = qb.history[TradeBar]([btcusd, ethusd], start_time, end_time)
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), start_time, end_time)

# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](btcusd, start_time, end_time, Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([btcusd, ethusd], start_time, end_time,
Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), start_time, end_time,
Resolution.MINUTE)

# Tick objects
single_history_ticks = qb.history[Tick](btcusd, start_time, end_time, Resolution.TICK)
subset_history_ticks = qb.history[Tick]([btcusd, ethusd], start_time, end_time, Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), start_time, end_time, Resolution.TICK)

Resolutions

The following table shows the available resolutions and data formats for Crypto subscriptions:

Resolution   TradeBar   QuoteBar   Trade Tick   Quote Tick

TICK         -          -          ✓            ✓

SECOND       ✓          ✓          -            -

MINUTE       ✓          ✓          -            -

HOUR         ✓          ✓          -            -

DAILY        ✓          ✓          -            -

Markets

The following Market enumeration members are available for Crypto:
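For example, to subscribe to a pair on a specific venue, pass the market argument to add_crypto. A minimal sketch, using Market.BITFINEX as one such member:

PY

# Subscribe to BTCUSD on a specific venue.
btcusd_bitfinex = qb.add_crypto("BTCUSD", market=Market.BITFINEX).symbol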


Data Normalization

The data normalization mode doesn't affect data from history request. If you change the data normalization mode,

it won't change the outcome.

Wrangle Data

You need some historical data to perform wrangling operations. The process to manipulate the historical data

depends on its data type. To display pandas objects, run a cell in a notebook with the pandas object as the last line.

To display other data formats, call the print method.

DataFrame Objects

If the history method returns a DataFrame , the first level of the DataFrame index is the encoded Crypto Symbol

and the second level is the end_time of the data sample. The columns of the DataFrame are the data properties.

To select the historical data of a single Crypto, index the loc property of the DataFrame with the Crypto Symbol .

PY

all_history_df.loc[btcusd] # or all_history_df.loc['BTCUSD']

To select a column of the DataFrame , index it with the column name.

PY

all_history_df.loc[btcusd]['close']
If you request historical data for multiple Crypto pairs, you can transform the DataFrame so that it's a time series of

close values for all of the Crypto pairs. To transform the DataFrame , select the column you want to display for

each Crypto pair and then call the unstack method.

PY

all_history_df['close'].unstack(level=0)

The DataFrame is transformed so that the column indices are the Symbol of each Crypto pair and each row

contains the close value.

Slice Objects

If the history method returns Slice objects, iterate through the Slice objects to get each one. The Slice objects

may not have data for all of your Crypto subscriptions. To avoid issues, check if the Slice contains data for your

Crypto pair before you index it with the Crypto Symbol .

You can also iterate through each TradeBar and QuoteBar in the Slice .
PY

for slice in all_history_slice:


for kvp in slice.bars:
symbol = kvp.key
trade_bar = kvp.value
for kvp in slice.quote_bars:
symbol = kvp.key
quote_bar = kvp.value

TradeBar Objects

If the history method returns TradeBar objects, iterate through the TradeBar objects to get each one.

PY

for trade_bar in single_history_trade_bars:


print(trade_bar)

If the history method returns TradeBars , iterate through the TradeBars to get the TradeBar of each Crypto pair.

The TradeBars may not have data for all of your Crypto subscriptions. To avoid issues, check if the TradeBars

object contains data for your security before you index it with the Crypto Symbol .

PY

for trade_bars in all_history_trade_bars:


if trade_bars.contains_key(btcusd):
trade_bar = trade_bars[btcusd]

You can also iterate through each of the TradeBars .

PY

for trade_bars in all_history_trade_bars:


for kvp in trade_bars:
symbol = kvp.Key
trade_bar = kvp.Value

QuoteBar Objects

If the history method returns QuoteBar objects, iterate through the QuoteBar objects to get each one.

PY

for quote_bar in single_history_quote_bars:


print(quote_bar)

If the history method returns QuoteBars , iterate through the QuoteBars to get the QuoteBar of each Crypto pair.

The QuoteBars may not have data for all of your Crypto subscriptions. To avoid issues, check if the QuoteBars

object contains data for your security before you index it with the Crypto Symbol .
PY

for quote_bars in all_history_quote_bars:


if quote_bars.contains_key(btcusd):
quote_bar = quote_bars[btcusd]

You can also iterate through each of the QuoteBars .

PY

for quote_bars in all_history_quote_bars:


for kvp in quote_bars:
symbol = kvp.key
quote_bar = kvp.value

Tick Objects

If the history method returns Tick objects, iterate through the Tick objects to get each one.

PY

for tick in single_history_ticks:


print(tick)

If the history method returns Ticks, iterate through the Ticks to get the Tick of each Crypto pair. The Ticks may

not have data for all of your Crypto subscriptions. To avoid issues, check if the Ticks object contains data for your

security before you index it with the Crypto Symbol .

PY

for ticks in all_history_ticks:


if ticks.contains_key(btcusd):
    btcusd_ticks = ticks[btcusd]

You can also iterate through each of the Ticks .

PY

for ticks in all_history_ticks:


for kvp in ticks:
symbol = kvp.key
tick = kvp.value

The Ticks objects only contain the last tick of each security for that particular timeslice.

Plot Data

You need some historical Crypto data to produce plots. You can use many of the supported plotting libraries to

visualize data in various formats. For example, you can plot candlestick and line charts.

Candlestick Chart

Follow these steps to plot candlestick charts:

1. Get some historical data .


PY

history = qb.history(btcusd, datetime(2020, 12, 27), datetime(2021, 12, 21),
                     Resolution.DAILY).loc[btcusd]

2. Import the plotly library.

PY

import plotly.graph_objects as go

3. Create a Candlestick .

PY

candlestick = go.Candlestick(x=history.index,
open=history['open'],
high=history['high'],
low=history['low'],
close=history['close'])

4. Create a Layout .

PY

layout = go.Layout(title=go.layout.Title(text='BTCUSD OHLC'),
                   xaxis_title='Date',
                   yaxis_title='Price',
                   xaxis_rangeslider_visible=False)

5. Create the Figure .

PY

fig = go.Figure(data=[candlestick], layout=layout)

6. Show the Figure .

PY

fig.show()

Candlestick charts display the open, high, low, and close prices of the security.
Line Chart

Follow these steps to plot line charts using built-in methods :

1. Get some historical data.

PY

history = qb.history([btcusd, ethusd], datetime(2020, 12, 27), datetime(2021, 12, 21),
                     Resolution.DAILY)

2. Select the data to plot.

PY

volume = history['volume'].unstack(level=0)

3. Call the plot method on the pandas object.

PY

volume.plot(title="Volume", figsize=(15, 10))

4. Show the plot.

PY

plt.show()

Line charts display the value of the property you selected in a time series.
Datasets > Crypto Futures

Datasets
Crypto Futures

Introduction

This page explains how to request, manipulate, and visualize historical Crypto Futures data.

Create Subscriptions

Follow these steps to subscribe to a perpetual Crypto Futures contract:

1. Create a QuantBook .

PY

qb = QuantBook()

2. (Optional) Set the time zone to the data time zone .

PY

qb.set_time_zone(TimeZones.UTC)

3. Call the add_crypto_future method with a ticker and then save a reference to the Crypto Future Symbol .

PY

btcusd = qb.add_crypto_future("BTCUSD").symbol
ethusd = qb.add_crypto_future("ETHUSD").symbol

To view the supported assets in the Crypto Futures datasets, see the Data Explorer .

Get Historical Data

You need a subscription before you can request historical data for a security. You can request an amount of

historical data based on a trailing number of bars, a trailing period of time, or a defined period of time. You can also

request historical data for a single contract, a subset of the contracts you created subscriptions for in your
notebook, or all of the contracts in your notebook.

Trailing Number of Bars

To get historical data for a number of trailing bars, call the history method with the Symbol object(s) and an

integer.
PY

# DataFrame of trade and quote data


single_history_df = qb.history(btcusd, 10)
subset_history_df = qb.history([btcusd, ethusd], 10)
all_history_df = qb.history(qb.securities.keys(), 10)

# DataFrame of trade data


single_history_trade_bar_df = qb.history(TradeBar, btcusd, 10)
subset_history_trade_bar_df = qb.history(TradeBar, [btcusd, ethusd], 10)
all_history_trade_bar_df = qb.history(TradeBar, qb.securities.keys(), 10)

# DataFrame of quote data


single_history_quote_bar_df = qb.history(QuoteBar, btcusd, 10)
subset_history_quote_bar_df = qb.history(QuoteBar, [btcusd, ethusd], 10)
all_history_quote_bar_df = qb.history(QuoteBar, qb.securities.keys(), 10)

# Slice objects
all_history_slice = qb.history(10)

# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](btcusd, 10)
subset_history_trade_bars = qb.history[TradeBar]([btcusd, ethusd], 10)
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), 10)

# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](btcusd, 10)
subset_history_quote_bars = qb.history[QuoteBar]([btcusd, ethusd], 10)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), 10)

Trailing Period of Time

To get historical data for a trailing period of time, call the history method with the Symbol object(s) and a
timedelta .
PY

# DataFrame of trade and quote data


single_history_df = qb.history(btcusd, timedelta(days=3))
subset_history_df = qb.history([btcusd, ethusd], timedelta(days=3))
all_history_df = qb.history(qb.securities.keys(), timedelta(days=3))

# DataFrame of trade data


single_history_trade_bar_df = qb.history(TradeBar, btcusd, timedelta(days=3))
subset_history_trade_bar_df = qb.history(TradeBar, [btcusd, ethusd], timedelta(days=3))
all_history_trade_bar_df = qb.history(TradeBar, qb.securities.keys(), timedelta(days=3))

# DataFrame of quote data


single_history_quote_bar_df = qb.history(QuoteBar, btcusd, timedelta(days=3))
subset_history_quote_bar_df = qb.history(QuoteBar, [btcusd, ethusd], timedelta(days=3))
all_history_quote_bar_df = qb.history(QuoteBar, qb.securities.keys(), timedelta(days=3))

# DataFrame of tick data


single_history_tick_df = qb.history(btcusd, timedelta(days=3), Resolution.TICK)
subset_history_tick_df = qb.history([btcusd, ethusd], timedelta(days=3), Resolution.TICK)
all_history_tick_df = qb.history(qb.securities.keys(), timedelta(days=3), Resolution.TICK)

# Slice objects
all_history_slice = qb.history(timedelta(days=3))

# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](btcusd, timedelta(days=3))
subset_history_trade_bars = qb.history[TradeBar]([btcusd, ethusd], timedelta(days=3))
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), timedelta(days=3))

# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](btcusd, timedelta(days=3), Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([btcusd, ethusd], timedelta(days=3), Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), timedelta(days=3),
Resolution.MINUTE)

# Tick objects
single_history_ticks = qb.history[Tick](btcusd, timedelta(days=3), Resolution.TICK)
subset_history_ticks = qb.history[Tick]([btcusd, ethusd], timedelta(days=3), Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), timedelta(days=3), Resolution.TICK)

Defined Period of Time

To get historical data for a specific period of time, call the History method with the Symbol object(s), a start

datetime , and an end datetime . The start and end times you provide are based in the notebook time zone .
PY

start_time = datetime(2021, 1, 1)
end_time = datetime(2021, 2, 1)

# DataFrame of trade and quote data


single_history_df = qb.history(btcusd, start_time, end_time)
subset_history_df = qb.history([btcusd, ethusd], start_time, end_time)
all_history_df = qb.history(qb.securities.keys(), start_time, end_time)

# DataFrame of trade data


single_history_trade_bar_df = qb.history(TradeBar, btcusd, start_time, end_time)
subset_history_trade_bar_df = qb.history(TradeBar, [btcusd, ethusd], start_time, end_time)
all_history_trade_bar_df = qb.history(TradeBar, qb.securities.keys(), start_time, end_time)

# DataFrame of quote data


single_history_quote_bar_df = qb.history(QuoteBar, btcusd, start_time, end_time)
subset_history_quote_bar_df = qb.history(QuoteBar, [btcusd, ethusd], start_time, end_time)
all_history_quote_bar_df = qb.history(QuoteBar, qb.securities.keys(), start_time, end_time)

# DataFrame of tick data


single_history_tick_df = qb.history(btcusd, start_time, end_time, Resolution.TICK)
subset_history_tick_df = qb.history([btcusd, ethusd], start_time, end_time, Resolution.TICK)
all_history_tick_df = qb.history(qb.securities.keys(), start_time, end_time, Resolution.TICK)

# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](btcusd, start_time, end_time)
subset_history_trade_bars = qb.history[TradeBar]([btcusd, ethusd], start_time, end_time)
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), start_time, end_time)

# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](btcusd, start_time, end_time, Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([btcusd, ethusd], start_time, end_time,
Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), start_time, end_time,
Resolution.MINUTE)

# Tick objects
single_history_ticks = qb.history[Tick](btcusd, start_time, end_time, Resolution.TICK)
subset_history_ticks = qb.history[Tick]([btcusd, ethusd], start_time, end_time, Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), start_time, end_time, Resolution.TICK)

Resolutions

The following table shows the available resolutions and data formats for Crypto Futures contract subscriptions:

Resolution   TradeBar   QuoteBar   Trade Tick   Quote Tick

TICK         -          -          ✓            ✓

SECOND       ✓          ✓          -            -

MINUTE       ✓          ✓          -            -

HOUR         ✓          ✓          -            -

DAILY        ✓          ✓          -            -

Markets

The following Market enumeration members are available for Crypto Futures:
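For example, to subscribe to a contract on a specific venue, pass the market argument to add_crypto_future. A minimal sketch; Market.BINANCE and the BTCUSDT ticker here are illustrative choices:

PY

# Subscribe to a perpetual contract on a specific venue.
btcusdt = qb.add_crypto_future("BTCUSDT", market=Market.BINANCE).symbol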


Data Normalization

The data normalization mode doesn't affect data from history request. If you change the data normalization mode,

it won't change the outcome.

Wrangle Data

You need some historical data to perform wrangling operations. The process to manipulate the historical data

depends on its data type. To display pandas objects, run a cell in a notebook with the pandas object as the last line.

To display other data formats, call the print method.

DataFrame Objects

If the history method returns a DataFrame , the first level of the DataFrame index is the encoded Crypto Future

Symbol and the second level is the end_time of the data sample. The columns of the DataFrame are the data

properties.

To select the historical data of a single Crypto Future, index the loc property of the DataFrame with the Crypto

Future Symbol .

PY

all_history_df.loc[btcusd] # or all_history_df.loc['BTCUSD']

To select a column of the DataFrame , index it with the column name.

PY

all_history_df.loc[btcusd]['close']
If you request historical data for multiple Crypto Futures contracts, you can transform the DataFrame so that it's a

time series of close values for all of the Crypto Futures contracts. To transform the DataFrame , select the column

you want to display for each Crypto Futures contract and then call the unstack method.

PY

all_history_df['close'].unstack(level=0)

The DataFrame is transformed so that the column indices are the Symbol of each Crypto Futures contract and each

row contains the close value.

Slice Objects

If the history method returns Slice objects, iterate through the Slice objects to get each one. The Slice objects

may not have data for all of your Crypto Future subscriptions. To avoid issues, check if the Slice contains data for

your Crypto Futures contract before you index it with the Crypto Future Symbol .

You can also iterate through each TradeBar and QuoteBar in the Slice .
PY

for slice in all_history_slice:


for kvp in slice.bars:
symbol = kvp.key
trade_bar = kvp.value
for kvp in slice.quote_bars:
symbol = kvp.key
quote_bar = kvp.value

TradeBar Objects

If the history method returns TradeBar objects, iterate through the TradeBar objects to get each one.

PY

for trade_bar in single_history_trade_bars:


print(trade_bar)

If the history method returns TradeBars , iterate through the TradeBars to get the TradeBar of each Crypto

Futures contract. The TradeBars may not have data for all of your Crypto Future subscriptions. To avoid issues,

check if the TradeBars object contains data for your security before you index it with the Crypto Future Symbol .

PY

for trade_bars in all_history_trade_bars:


if trade_bars.contains_key(btcusd):
trade_bar = trade_bars[btcusd]

You can also iterate through each of the TradeBars .

PY

for trade_bars in all_history_trade_bars:


for kvp in trade_bars:
symbol = kvp.Key
trade_bar = kvp.Value

QuoteBar Objects

If the history method returns QuoteBar objects, iterate through the QuoteBar objects to get each one.

PY

for quote_bar in single_history_quote_bars:


print(quote_bar)

If the history method returns QuoteBars , iterate through the QuoteBars to get the QuoteBar of each Crypto

Futures contract. The QuoteBars may not have data for all of your Crypto Future subscriptions. To avoid issues,
check if the QuoteBars object contains data for your security before you index it with the Crypto Future Symbol .
PY

for quote_bars in all_history_quote_bars:


if quote_bars.contains_key(btcusd):
quote_bar = quote_bars[btcusd]

You can also iterate through each of the QuoteBars .

PY

for quote_bars in all_history_quote_bars:


for kvp in quote_bars:
symbol = kvp.key
quote_bar = kvp.value

Tick Objects

If the history method returns Tick objects, iterate through the Tick objects to get each one.

PY

for tick in single_history_ticks:


print(tick)

If the history method returns Ticks, iterate through the Ticks to get the Tick of each Crypto Futures contract.

The Ticks may not have data for all of your Crypto Future subscriptions. To avoid issues, check if the Ticks object

contains data for your security before you index it with the Crypto Future Symbol .

PY

for ticks in all_history_ticks:


if ticks.contains_key(btcusd):
    btcusd_ticks = ticks[btcusd]

You can also iterate through each of the Ticks .

PY

for ticks in all_history_ticks:


for kvp in ticks:
symbol = kvp.key
tick = kvp.value

The Ticks objects only contain the last tick of each security for that particular timeslice.

Plot Data

You need some historical Crypto Futures data to produce plots. You can use many of the supported plotting

libraries to visualize data in various formats. For example, you can plot candlestick and line charts.

Candlestick Chart

Follow these steps to plot candlestick charts:

1. Get some historical data .


PY

history = qb.history(btcusd, datetime(2021, 11, 23), datetime(2021, 12, 8),
                     Resolution.DAILY).loc[btcusd]

2. Import the plotly library.

PY

import plotly.graph_objects as go

3. Create a Candlestick .

PY

candlestick = go.Candlestick(x=history.index,
open=history['open'],
high=history['high'],
low=history['low'],
close=history['close'])

4. Create a Layout .

PY

layout = go.Layout(title=go.layout.Title(text='BTCUSD 18R OHLC'),
                   xaxis_title='Date',
                   yaxis_title='Price',
                   xaxis_rangeslider_visible=False)

5. Create the Figure .

PY

fig = go.Figure(data=[candlestick], layout=layout)

6. Show the Figure .

PY

fig.show()

Candlestick charts display the open, high, low, and close prices of the security.
Line Chart

Follow these steps to plot line charts using built-in methods :

1. Get some historical data.

PY

history = qb.history([btcusd, ethusd], datetime(2021, 11, 23), datetime(2021, 12, 8),
                     Resolution.DAILY)


2. Select the data to plot.

PY

volume = history['volume'].unstack(level=0)

3. Call the plot method on the pandas object.

PY

volume.plot(title="Volume", figsize=(15, 10))

4. Show the plot.

PY

plt.show()
Line charts display the value of the property you selected in a time series.
Datasets > Futures

Datasets
Futures

Introduction

This page explains how to request, manipulate, and visualize historical Futures data.

Create Subscriptions

Follow these steps to subscribe to a Future security:

1. Create a QuantBook .

PY

qb = QuantBook()

2. Call the add_future method with a ticker, resolution, and contract rollover settings .

PY

future = qb.add_future(Futures.Indices.SP_500_E_MINI, Resolution.MINUTE,
                       data_normalization_mode=DataNormalizationMode.BACKWARDS_RATIO,
                       data_mapping_mode=DataMappingMode.LAST_TRADING_DAY,
                       contract_depth_offset=0)

To view the available tickers in the US Futures dataset, see Supported Assets .

If you omit any of the arguments after the ticker, see the following table for their default values:

Argument                   Default Value

resolution                 Resolution.MINUTE

data_normalization_mode    DataNormalizationMode.ADJUSTED

data_mapping_mode          DataMappingMode.OPEN_INTEREST

contract_depth_offset      0
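For example, if the defaults suit your research, the call reduces to just the ticker. A minimal sketch:

PY

# Minute resolution, adjusted prices, open-interest mapping, front-month contract.
future = qb.add_future(Futures.Indices.SP_500_E_MINI)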

3. (Optional) Set a contract filter .

PY

future.set_filter(0, 90)

If you don't call the set_filter method, the future_history method won't return historical data.
If you want historical data on individual contracts and their OpenInterest , follow these steps to subscribe to

individual Future contracts:

1. Call the future_chain_provider.get_future_contract_list method with the underlying Future Symbol and a datetime.

PY

start_date = datetime(2021,12,20)
symbols = qb.future_chain_provider.get_future_contract_list(future.symbol, start_date)

This method returns a list of Symbol objects that reference the Future contracts that were trading at the given

time. If you set a contract filter with set_filter , it doesn't affect the results of get_future_contract_list .

2. Select the Symbol of the FutureContract object(s) for which you want to get historical data.

For example, select the Symbol of the contract with the closest expiry.

PY

contract_symbol = sorted(symbols, key=lambda s: s.id.date)[0]

3. Call the add_future_contract method with a FutureContract Symbol and disable fill-forward.

PY

qb.add_future_contract(contract_symbol, fill_forward = False)

Disable fill-forward because there are only a few OpenInterest data points per day.

Get Historical Data

You need a subscription before you can request historical data for Futures contracts. On the time dimension, you

can request an amount of historical data based on a trailing number of bars, a trailing period of time, or a defined

period of time. On the contract dimension, you can request historical data for a single contract, a subset of the

contracts you created subscriptions for in your notebook, or all of the contracts in your notebook.

These history requests return the prices and open interest of the Futures contracts. They don't provide implied
volatility or Greeks. To get the implied volatility and Greeks of Futures Options, call the option_chain method or
create some indicators.

Before you request historical data, call the set_start_date method with a datetime to reduce the risk of look-ahead bias.

PY

qb.set_start_date(start_date)

If you call the set_start_date method, the date that you pass to the method is the latest date for which your
history requests will return data.

Trailing Number of Bars


To get historical data for a number of trailing bars, call the history method with the contract Symbol object(s) and

an integer.

PY

# DataFrame of trade and quote data


single_history_df = qb.history(contract_symbol, 10)
subset_history_df = qb.history([contract_symbol], 10)
all_history_df = qb.history(qb.securities.keys(), 10)

# DataFrame of trade data


single_history_trade_bar_df = qb.history(TradeBar, contract_symbol, 10)
subset_history_trade_bar_df = qb.history(TradeBar, [contract_symbol], 10)
all_history_trade_bar_df = qb.history(TradeBar, qb.securities.keys(), 10)

# DataFrame of quote data


single_history_quote_bar_df = qb.history(QuoteBar, contract_symbol, 10)
subset_history_quote_bar_df = qb.history(QuoteBar, [contract_symbol], 10)
all_history_quote_bar_df = qb.history(QuoteBar, qb.securities.keys(), 10)

# DataFrame of open interest data


single_history_open_interest_df = qb.history(OpenInterest, contract_symbol, 400)
subset_history_open_interest_df = qb.history(OpenInterest, [contract_symbol], 400)
all_history_open_interest_df = qb.history(OpenInterest, qb.securities.keys(), 400)

# Slice objects
all_history_slice = qb.history(10)

# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](contract_symbol, 10)
subset_history_trade_bars = qb.history[TradeBar]([contract_symbol], 10)
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), 10)

# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](contract_symbol, 10)
subset_history_quote_bars = qb.history[QuoteBar]([contract_symbol], 10)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), 10)

# OpenInterest objects
single_history_open_interest = qb.history[OpenInterest](contract_symbol, 400)
subset_history_open_interest = qb.history[OpenInterest]([contract_symbol], 400)
all_history_open_interest = qb.history[OpenInterest](qb.securities.keys(), 400)

The preceding calls return the most recent bars, excluding periods of time when the exchange was closed.

To get historical data for the continuous Futures contract, replace contract_symbol with future.symbol in the
preceding history requests, as the sketch below shows.
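PY

# Trailing-bar history for the continuous Futures contract (a minimal sketch).
continuous_history_df = qb.history(future.symbol, 10)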

Trailing Period of Time

To get historical data for a trailing period of time, call the history method with the contract Symbol object(s) and a

timedelta .
PY

# DataFrame of trade and quote data


single_history_df = qb.history(contract_symbol, timedelta(days=3))
subset_history_df = qb.history([contract_symbol], timedelta(days=3))
all_history_df = qb.history(qb.securities.keys(), timedelta(days=3))

# DataFrame of trade data


single_history_trade_bar_df = qb.history(TradeBar, contract_symbol, timedelta(days=3))
subset_history_trade_bar_df = qb.history(TradeBar, [contract_symbol], timedelta(days=3))
all_history_trade_bar_df = qb.history(TradeBar, qb.securities.keys(), timedelta(days=3))

# DataFrame of quote data


single_history_quote_bar_df = qb.history(QuoteBar, contract_symbol, timedelta(days=3))
subset_history_quote_bar_df = qb.history(QuoteBar, [contract_symbol], timedelta(days=3))
all_history_quote_bar_df = qb.history(QuoteBar, qb.securities.keys(), timedelta(days=3))

# DataFrame of open interest data


single_history_open_interest_df = qb.history(OpenInterest, contract_symbol, timedelta(days=3))
subset_history_open_interest_df = qb.history(OpenInterest, [contract_symbol], timedelta(days=3))
all_history_open_interest_df = qb.history(OpenInterest, qb.securities.keys(), timedelta(days=3))

# Slice objects
all_history_slice = qb.history(timedelta(days=3))

# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](contract_symbol, timedelta(days=3))
subset_history_trade_bars = qb.history[TradeBar]([contract_symbol], timedelta(days=3))
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), timedelta(days=3))

# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](contract_symbol, timedelta(days=3), Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([contract_symbol], timedelta(days=3),
Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), timedelta(days=3),
Resolution.MINUTE)

# Tick objects
single_history_ticks = qb.history[Tick](contract_symbol, timedelta(days=3), Resolution.TICK)
subset_history_ticks = qb.history[Tick]([contract_symbol], timedelta(days=3), Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), timedelta(days=3), Resolution.TICK)

# OpenInterest objects
single_history_open_interest = qb.history[OpenInterest](contract_symbol, timedelta(days=2))
subset_history_open_interest = qb.history[OpenInterest]([contract_symbol], timedelta(days=2))
all_history_open_interest = qb.history[OpenInterest](qb.securities.keys(), timedelta(days=2))

The preceding calls return the most recent bars, excluding periods of time when the exchange was closed.

To get historical data for the continuous Futures contract, replace contract_symbol with future.symbol in the
preceding history requests.

Defined Period of Time

To get historical data for individual Futures contracts during a specific period of time, call the history method with

the Futures contract Symbol object(s), a start datetime , and an end datetime . The start and end times you

provide are based in the notebook time zone .


PY

start_time = datetime(2021, 12, 1)


end_time = datetime(2021, 12, 31)

# DataFrame of trade and quote data


single_history_df = qb.history(contract_symbol, start_time, end_time)
subset_history_df = qb.history([contract_symbol], start_time, end_time)
all_history_df = qb.history(qb.securities.keys(), start_time, end_time)

# DataFrame of trade data


single_history_trade_bar_df = qb.history(TradeBar, contract_symbol, start_time, end_time)
subset_history_trade_bar_df = qb.history(TradeBar, [contract_symbol], start_time, end_time)
all_history_trade_bar_df = qb.history(TradeBar, qb.securities.keys(), start_time, end_time)

# DataFrame of quote data


single_history_quote_bar_df = qb.history(QuoteBar, contract_symbol, start_time, end_time)
subset_history_quote_bar_df = qb.history(QuoteBar, [contract_symbol], start_time, end_time)
all_history_quote_bar_df = qb.history(QuoteBar, qb.securities.keys(), start_time, end_time)

# DataFrame of open interest data


single_history_open_interest_df = qb.history(OpenInterest, contract_symbol, start_time, end_time)
subset_history_open_interest_df = qb.history(OpenInterest, [contract_symbol], start_time, end_time)
all_history_trade_open_interest_df = qb.history(OpenInterest, qb.securities.keys(), start_time,
end_time)

# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](contract_symbol, start_time, end_time)
subset_history_trade_bars = qb.history[TradeBar]([contract_symbol], start_time, end_time)
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), start_time, end_time)

# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](contract_symbol, start_time, end_time,
Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([contract_symbol], start_time, end_time,
Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), start_time, end_time,
Resolution.MINUTE)

# Tick objects
single_history_ticks = qb.history[Tick](contract_symbol, start_time, end_time, Resolution.TICK)
subset_history_ticks = qb.history[Tick]([contract_symbol], start_time, end_time, Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), start_time, end_time, Resolution.TICK)

# OpenInterest objects
single_history_open_interest = qb.history[OpenInterest](contract_symbol, start_time, end_time)
subset_history_open_interest = qb.history[OpenInterest]([contract_symbol], start_time, end_time)
all_history_open_interest = qb.history[OpenInterest](qb.securities.keys(), start_time, end_time)

To get historical data for the continuous Futures contract, replace contract_symbol with future.symbol in the
preceding history requests.

To get historical data for all of the Futures contracts that pass your filter during a specific period of time, call the

future_history method with the Symbol object of the continuous Future, a start datetime , and an end datetime .

PY

future_history = qb.future_history(future.symbol, end_time - timedelta(days=2), end_time,
                                   Resolution.MINUTE, fill_forward=False, extended_market_hours=False)

The preceding calls return data that have a timestamp within the defined period of time.

Resolutions

The following table shows the available resolutions and data formats for Futures subscriptions:
Resolution   TradeBar   QuoteBar   Trade Tick   Quote Tick

TICK         -          -          ✓            ✓

SECOND       ✓          ✓          -            -

MINUTE       ✓          ✓          -            -

HOUR         ✓          ✓          -            -

DAILY        ✓          ✓          -            -

Markets

The following Market enumeration members are available for Futures:
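For example, to add a Future from a specific exchange, pass the market argument to add_future. A minimal sketch; Futures.Metals.GOLD and Market.COMEX here are illustrative members:

PY

# Subscribe to COMEX Gold.
gold = qb.add_future(Futures.Metals.GOLD, market=Market.COMEX)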

Data Normalization

The data normalization mode doesn't affect data from history requests for Futures contracts. If you change the data

normalization mode, it won't change the outcome.

The following data normalization modes are available for continuous Futures contracts:

Wrangle Data

You need some historical data to perform wrangling operations. The process to manipulate the historical data
depends on its data type. To display pandas objects, run a cell in a notebook with the pandas object as the last line.

To display other data formats, call the print method.

DataFrame Objects

If your history request returns a DataFrame , the DataFrame has the following index levels:

1. Contract expiry
2. Encoded contract Symbol

3. The end_time of the data sample

The columns of the DataFrame are the data properties. Depending on how you request data, the DataFrame may
contain data for the continuous Futures contract. The continuous contract doesn't expire, so the default expiry

date of December 30, 1899 doesn't have any practical meaning.


To select the rows of the contract(s) that expire at a specific time, index the loc property of the DataFrame with the
expiry time.

PY

all_history_df.loc[datetime(2022, 3, 18, 13, 30)]

If you remove the first index level, you can index the DataFrame with just the contract Symbol, similar to how you

would with non-derivative asset classes. To remove the first index level, call the droplevel method.

PY

all_history_df.index = all_history_df.index.droplevel(0)

To select the historical data of a single Futures contract, index the loc property of the DataFrame with the contract

Symbol .

PY

all_history_df.loc[contract_symbol]
To select a column of the DataFrame , index it with the column name.

PY

all_history_df.loc[contract_symbol]['close']

If you request historical data for multiple Futures contracts, you can transform the DataFrame so that it's a time
series of close values for all of the Futures contracts. To transform the DataFrame , select the column you want to

display for each Futures contract and then call the unstack method.

PY

all_history_df['close'].unstack(level=0)

The DataFrame is transformed so that the column indices are the Symbol of each security and each row contains

the close value.

Slice Objects

If the history method returns Slice objects, iterate through the Slice objects to get each one. The Slice objects
may not have data for all of your Futures subscriptions. To avoid issues, check if the Slice contains data for your

Futures contract before you index it with the Futures Symbol .

You can also iterate through each TradeBar and QuoteBar in the Slice .
PY

for slice in all_history_slice:


for kvp in slice.bars:
symbol = kvp.key
trade_bar = kvp.value
for kvp in slice.quote_bars:
symbol = kvp.key
quote_bar = kvp.value

TradeBar Objects

If the history method returns TradeBar objects, iterate through the TradeBar objects to get each one.

PY

for trade_bar in single_history_trade_bars:


print(trade_bar)

If the history method returns TradeBars , iterate through the TradeBars to get the TradeBar of each Futures
contract. The TradeBars may not have data for all of your Futures subscriptions. To avoid issues, check if the

TradeBars object contains data for your security before you index it with the Futures Symbol .

PY

for trade_bars in all_history_trade_bars:


if trade_bars.contains_key(contract_symbol):
trade_bar = trade_bars[contract_symbol]

You can also iterate through each of the TradeBars .

PY

for trade_bars in all_history_trade_bars:


for kvp in trade_bars:
    symbol = kvp.key
    trade_bar = kvp.value

QuoteBar Objects

If the history method returns QuoteBar objects, iterate through the QuoteBar objects to get each one.

PY

for quote_bar in single_history_quote_bars:


print(quote_bar)

If the history method returns QuoteBars , iterate through the QuoteBars to get the QuoteBar of each Futures

contract. The QuoteBars may not have data for all of your Futures subscriptions. To avoid issues, check if the
QuoteBars object contains data for your security before you index it with the Futures Symbol .
PY

for quote_bars in all_history_quote_bars:


if quote_bars.contains_key(contract_symbol):
quote_bar = quote_bars[contract_symbol]

You can also iterate through each of the QuoteBars .

PY

for quote_bars in all_history_quote_bars:


for kvp in quote_bars:
symbol = kvp.key
quote_bar = kvp.value

Tick Objects

If the history method returns Tick objects, iterate through the Tick objects to get each one.

PY

for tick in single_history_ticks:


print(tick)

If the history method returns Ticks, iterate through the Ticks to get the Tick of each Futures contract. The Ticks
may not have data for all of your Futures subscriptions. To avoid issues, check if the Ticks object contains data for
your security before you index it with the Futures Symbol.

PY

for ticks in all_history_ticks:


if ticks.contains_key(contract_symbol):
ticks = ticks[contract_symbol]

You can also iterate through each of the Ticks .

PY

for ticks in all_history_ticks:


for kvp in ticks:
symbol = kvp.key
tick = kvp.value

The Ticks objects only contain the last tick of each security for that particular timeslice.

OpenInterest Objects

If the history method returns OpenInterest objects, iterate through the OpenInterest objects to get each one.

PY

for open_interest in single_history_open_interest:


print(open_interest)

If the history method returns a dictionary of OpenInterest objects, iterate through the dictionary to get the
OpenInterest of each Futures contract. The dictionary of OpenInterest objects may not have data for all of your

Futures contract subscriptions. To avoid issues, check if the dictionary contains data for your contract before you

index it with the Futures contract Symbol .

PY

for open_interest_dict in all_history_open_interest:


if open_interest_dict.contains_key(contract_symbol):
open_interest = open_interest_dict[contract_symbol]

You can also iterate through each of the OpenInterest dictionaries.

PY

for open_interest_dict in all_history_open_interest:


for kvp in open_interest_dict:
symbol = kvp.key
open_interest = kvp.value

FutureHistory Objects

The future_history method returns a FutureHistory object. To get each slice in the FutureHistory object,
iterate through it.

PY

for slice in future_history:


for continuous_contract_symbol, chain in slice.futures_chains.items():
for contract in chain:
pass

To convert the FutureHistory object to a DataFrame that contains the trade and quote information of each
contract, call the get_all_data method.

PY

future_history.get_all_data()

To get the expiration dates of all the contracts in a FutureHistory object, call the get_expiry_dates method.

PY

future_history.get_expiry_dates()
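For example, you can combine these two helpers to select the rows of the contracts with the nearest expiry. The following is a sketch that assumes the expiry index level of the DataFrame is named 'expiry':

PY

df = future_history.get_all_data()
nearest_expiry = min(future_history.get_expiry_dates())
nearest_df = df[df.index.get_level_values('expiry') == nearest_expiry]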

Plot Data

You need some historical Futures data to produce plots. You can use many of the supported plotting libraries to

visualize data in various formats. For example, you can plot candlestick and line charts.

Candlestick Chart

Follow these steps to plot candlestick charts:

1. Get some historical data .


PY

history = qb.history(contract_symbol, datetime(2021, 12, 1), datetime(2021, 12, 31),


Resolution.DAILY)

2. Drop the first two index levels.

PY

history.index = history.index.droplevel([0, 1])

3. Import the plotly library.

PY

import plotly.graph_objects as go

4. Create a Candlestick .

PY

candlestick = go.Candlestick(x=history.index,
open=history['open'],
high=history['high'],
low=history['low'],
close=history['close'])

5. Create a Layout .

PY

layout = go.Layout(title=go.layout.Title(text=f'{contract_symbol.value} OHLC'),


xaxis_title='Date',
yaxis_title='Price',
xaxis_rangeslider_visible=False)

6. Create the Figure .

PY

fig = go.Figure(data=[candlestick], layout=layout)

7. Show the Figure .

PY

fig.show()

Candlestick charts display the open, high, low, and close prices of the contract.
Line Chart

Follow these steps to plot line charts using built-in methods :

1. Get some historical data.

PY

history = qb.history(symbols, datetime(2021, 12, 1), datetime(2021, 12, 31), Resolution.DAILY)

2. Drop the first index level.

PY

history.index = history.index.droplevel(0)

3. Select data to plot.

PY

closing_prices = history['close'].unstack(level=0)

4. Rename the columns to be the Symbol of each contract.

PY

closing_prices.columns = [Symbol.get_alias(SecurityIdentifier.parse(x)) for x in


closing_prices.columns]

5. Call the plot method on the pandas object.

PY

closing_prices.plot(title="Close", figsize=(15, 8))

6. Show the plot.


PY

plt.show()

Line charts display the value of the property you selected in a time series.
Datasets > Futures Options

Datasets
Futures Options

Datasets > Futures Options > Key Concepts

Futures Options
Key Concepts

Introduction

Future Option contracts give the buyer the right, but not the obligation, to buy or sell the underlying Future
contract at a specific price. This page explains the basics of Future Option data in the Research Environment. To get some data,

see Universes or Individual Contracts . For more information about the specific datasets we use, see the US Future
Options dataset listing.

Resolutions

The following table shows the available resolutions and data formats for Future Option contract subscriptions:

Resolution  TradeBar  QuoteBar  Trade Tick  Quote Tick
TICK        -         -         -           -
SECOND      -         -         -           -
MINUTE      ✓         ✓         -           -
HOUR        ✓         ✓         -           -
DAILY       ✓         ✓         -           -

Markets

The following Market enumeration members are available for Future Options:

Data Normalization

The data normalization mode doesn't affect data from history requests. If you change the data normalization mode,
it won't change the outcome.


Datasets > Futures Options > Universes

Futures Options
Universes

Introduction

This page explains how to request historical data for a universe of Future Option contracts.

Create Subscriptions

Follow these steps to subscribe to a Futures Options universe:

1. Create a QuantBook .

PY

qb = QuantBook()

2. Add the underlying Future .

PY

future = qb.add_future(Futures.Indices.SP_500_E_MINI)

To view the available underlying Futures in the US Future Options dataset, see Supported Assets .

Price History

The contract filter determines which Future Option contracts are in your universe each trading day. The default

filter selects the contracts with the following characteristics:

Standard type (weeklies and non-standard contracts are not available)


Within 1 strike price of the underlying asset price

Expire within 35 days
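To change the filter, pass a filter function when you add the Future Option universe. The following sketch widens the default universe, assuming the add_future_option filter API:

PY

# Select contracts within 5 strike levels of the underlying price
# that expire within 90 days.
qb.add_future_option(
    future.symbol,
    lambda universe: universe.strikes(-5, 5).expiration(0, 90)
)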

To get the prices and volumes for all of the Future Option contracts that pass your filter during a specific period of
time, get the underlying Future contract and then call the option_history method with the Future contract's

Symbol object, a start datetime , and an end datetime .


PY

start_date = datetime(2024, 1, 1)

# Select an underlying Futures contract. For example, get the front-month contract.
futures_contract = sorted(
qb.future_chain_provider.get_future_contract_list(future.symbol, start_date),
key=lambda symbol: symbol.id.date
)[0]

# Get the Options data for the selected Futures contract.


option_history = qb.option_history(
futures_contract, start_date, futures_contract.id.date, Resolution.HOUR,
fill_forward=False, extended_market_hours=False
)

To convert the OptionHistory object to a DataFrame that contains the trade and quote information of each

contract and the underlying, use the data_frame property.

PY

option_history.data_frame

To get the expiration dates of all the contracts in an OptionHistory object, call the get_expiry_dates method.

PY

option_history.get_expiry_dates()

To get the strike prices of all the contracts in an OptionHistory object, call the get_strikes method.

PY

option_history.get_strikes()
Datasets > Futures Options > Individual Contracts

Futures Options
Individual Contracts

Introduction

This page explains how to request historical data for individual Future Option contracts. The history requests on

this page only return the prices and open interest of the Option contracts, not their implied volatility or Greeks.

Create Subscriptions

Follow these steps to subscribe to individual Futures Option contracts:

1. Create a QuantBook .

PY

qb = QuantBook()

2. Add the underlying Futures contract .

PY

future = qb.add_future(Futures.Indices.SP_500_E_MINI)
start_date = datetime(2023, 12, 20)
futures_contract_symbol = sorted(
qb.future_chain_provider.get_future_contract_list(future.symbol, start_date),
key=lambda s: s.id.date
)[0]
qb.add_future_contract(futures_contract_symbol, fill_forward=False)

To view the available underlying Futures in the US Future Options dataset, see Supported Assets .

3. Set the start date to a date in the past that you want to use as the analysis date.

PY

qb.set_start_date(futures_contract_symbol.id.date - timedelta(5))

The method that you call in the next step returns data on all the contracts that were tradable on this date.

4. Call the option_chain method with the underlying Futures contract Symbol .

PY

chain = qb.option_chain(futures_contract_symbol, flatten=True).data_frame

This method returns an OptionChain object, which represents an entire chain of Option contracts for a single
underlying security. The flatten=True argument formats the chain data into a DataFrame where each row
represents a single contract.
5. Sort and filter the data to select the specific Futures Options contract(s) you want to analyze.

PY

# Select a contract.
expiry = chain.expiry.min()
fop_contract_symbol = chain[
# Select call contracts with the closest expiry.
(chain.expiry == expiry) &
(chain.right == OptionRight.CALL)
# Select the contract with a strike price near the middle.
].sort_values('strike').index[150]

6. Call the add_future_option_contract method with an OptionContract Symbol and disable fill-forward.

PY

option_contract = qb.add_future_option_contract(fop_contract_symbol, fill_forward=False)

Disable fill-forward because there are only a few OpenInterest data points per day.

Trade History

TradeBar objects are price bars that consolidate individual trades from the exchanges. They contain the open,

high, low, close, and volume of trading activity over a period of time.

To get trade data, call the history or history[TradeBar] method with the contract Symbol object(s).
PY

# DataFrame format
history_df = qb.history(TradeBar, fop_contract_symbol, timedelta(3))
display(history_df)

# TradeBar objects
history = qb.history[TradeBar](fop_contract_symbol, timedelta(3))
for trade_bar in history:
print(trade_bar)

TradeBar objects have the following properties:

Quote History

QuoteBar objects are bars that consolidate NBBO quotes from the exchanges. They contain the open, high, low,

and close prices of the bid and ask. The open , high , low , and close properties of the QuoteBar object are the

mean of the respective bid and ask prices. If the bid or ask portion of the QuoteBar has no data, the open , high ,
low , and close properties of the QuoteBar copy the values of either the bid or ask instead of taking their mean.

To get quote data, call the history or history[QuoteBar] method with the contract Symbol object(s).

PY

# DataFrame format
history_df = qb.history(QuoteBar, fop_contract_symbol, timedelta(3))
display(history_df)

# QuoteBar objects
history = qb.history[QuoteBar](fop_contract_symbol, timedelta(3))
for quote_bar in history:
print(quote_bar)
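Because the quote history DataFrame contains the raw bid and ask columns, you can also compute derived values like the closing mid-price. The following is a minimal sketch, assuming the bidclose and askclose column names:

PY

# Mid-price is the mean of the closing bid and ask prices.
mid_price = (history_df['bidclose'] + history_df['askclose']) / 2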

QuoteBar objects have the following properties:


Open Interest History

Open interest is the number of outstanding contracts that haven't been settled. It provides a measure of investor
interest and the market liquidity, so it's a popular metric to use for contract selection. Open interest is calculated

once per day.

To get open interest data, call the history or history[OpenInterest] method with the contract Symbol object(s).

PY

# DataFrame format
history_df = qb.history(OpenInterest, fop_contract_symbol, timedelta(3))
display(history_df)

# OpenInterest objects
history = qb.history[OpenInterest](fop_contract_symbol, timedelta(3))
for open_interest in history:
print(open_interest)

OpenInterest objects have the following properties:

Greeks and IV History

The Greeks are measures that describe the sensitivity of an Option's price to various factors like underlying price

changes (Delta), time decay (Theta), volatility (Vega), and interest rates (Rho), while Implied Volatility (IV)
represents the market's expectation of the underlying asset's volatility over the life of the Option.

Follow these steps to get the Greeks and IV data:

1. Create the mirror contract Symbol .

PY

mirror_contract_symbol = Symbol.create_option(
option_contract.underlying.symbol, fop_contract_symbol.id.market, option_contract.style,
    OptionRight.CALL if option_contract.right == OptionRight.PUT else OptionRight.PUT,
option_contract.strike_price, option_contract.expiry
)

2. Set up the risk free interest rate , dividend yield , and Option pricing models.

In our research, we found the Forward Tree model to be the best pricing model for indicators.

PY

risk_free_rate_model = qb.risk_free_interest_rate_model
dividend_yield_model = DividendYieldProvider(futures_contract_symbol)
option_model = OptionPricingModelType.FORWARD_TREE

3. Define a method to return the IV & Greeks indicator values for each contract.
PY

def greeks_and_iv(contracts, period, risk_free_rate_model, dividend_yield_model, option_model):


# Get the call and put contract.
call, put = sorted(contracts, key=lambda s: s.id.option_right)

def get_values(indicator_class, contract, mirror_contract):


return qb.indicator_history(
indicator_class(contract, risk_free_rate_model, dividend_yield_model, mirror_contract,
option_model),
[contract, mirror_contract, contract.underlying],
period
).data_frame.current

return pd.DataFrame({
'iv_call': get_values(ImpliedVolatility, call, put),
'iv_put': get_values(ImpliedVolatility, put, call),
'delta_call': get_values(Delta, call, put),
'delta_put': get_values(Delta, put, call),
'gamma_call': get_values(Gamma, call, put),
'gamma_put': get_values(Gamma, put, call),
'rho_call': get_values(Rho, call, put),
'rho_put': get_values(Rho, put, call),
'vega_call': get_values(Vega, call, put),
'vega_put': get_values(Vega, put, call),
'theta_call': get_values(Theta, call, put),
'theta_put': get_values(Theta, put, call),
})

4. Call the preceding method and display the results.

PY

greeks_and_iv([fop_contract_symbol, mirror_contract_symbol], 15, risk_free_rate_model,


dividend_yield_model, option_model)

The DataFrame can have NaN entries if there is no data for the contracts or the underlying asset at a moment in
time.
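If you prefer to discard those rows, you can call the pandas dropna method on the result. The following is a sketch, where results is a hypothetical name for the returned DataFrame:

PY

results = greeks_and_iv([fop_contract_symbol, mirror_contract_symbol], 15, risk_free_rate_model,
                        dividend_yield_model, option_model)
results = results.dropna()  # Drop rows with missing indicator values.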

Examples

The following examples demonstrate some common practices for analyzing individual Future Option contracts in
the Research Environment.
Example 1: Contract Mid-Price History

The following notebook plots the historical mid-prices of an E-mini S&P 500 Future Option contract using Plotly :

PY

import plotly.graph_objects as go

# Add the underlying Future contract


# (the front-month ES Future contract as of December 20, 2023).
qb = QuantBook()
future = qb.add_future(Futures.Indices.SP_500_E_MINI)
futures_contract_symbol = sorted(
qb.future_chain_provider.get_future_contract_list(future.symbol, datetime(2023, 12, 20)),
key=lambda s: s.id.date
)[0]
qb.add_future_contract(futures_contract_symbol, fill_forward=False)

# Get the Future Option chain as of 5 days before the underlying Future's expiry date.
qb.set_start_date(futures_contract_symbol.id.date - timedelta(5))
chain = qb.option_chain(futures_contract_symbol, flatten=True).data_frame

# Select a Future Option contract from the chain.


expiry = chain.expiry.min()
fop_contract_symbol = chain[
(chain.expiry == expiry) & (chain.right == OptionRight.CALL)
].sort_values('strike').index[50]

# Add the target Future Option contract.


qb.add_future_option_contract(fop_contract_symbol)

# Get the Future Option contract quote history.


history = qb.history(QuoteBar, fop_contract_symbol, datetime(2024, 2, 22), datetime(2024, 2, 23))

# Plot the mid-price values of the quote history.


go.Figure(
data=go.Candlestick(
x=history.index.levels[4],
open=history['open'],
high=history['high'],
low=history['low'],
close=history['close']
),
layout=go.Layout(
title=go.layout.Title(text=f'{fop_contract_symbol.value} OHLC'),
xaxis_title='Date',
yaxis_title='Price',
xaxis_rangeslider_visible=False
)
).show()
Datasets > Forex

Datasets
Forex

Introduction

This page explains how to request, manipulate, and visualize historical Forex data.

Create Subscriptions

Follow these steps to subscribe to a Forex security:

1. Create a QuantBook .

PY

qb = QuantBook()

2. (Optional) Set the time zone to the data time zone .

PY

qb.set_time_zone(TimeZones.UTC)

3. Call the add_forex method with a ticker and then save a reference to the Forex Symbol .

PY

eurusd = qb.add_forex("EURUSD").symbol
gbpusd = qb.add_forex("GBPUSD").symbol

To view all of the available Forex pairs, see Supported Assets .

Get Historical Data

You need a subscription before you can request historical data for a security. On the time dimension, you can
request an amount of historical data based on a trailing number of bars, a trailing period of time, or a defined

period of time. On the security dimension, you can request historical data for a single Forex pair, a subset of the
pairs you created subscriptions for in your notebook, or all of the pairs in your notebook.

Trailing Number of Bars

To get historical data for a number of trailing bars, call the history method with the Symbol object(s) and an
integer.
PY

# DataFrame
single_history_df = qb.history(eurusd, 10)
subset_history_df = qb.history([eurusd, gbpusd], 10)
all_history_df = qb.history(qb.securities.keys(), 10)

# Slice objects
all_history_slice = qb.history(10)

# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](eurusd, 10)
subset_history_quote_bars = qb.history[QuoteBar]([eurusd, gbpusd], 10)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), 10)

The preceding calls return the most recent bars, excluding periods of time when the exchange was closed.

Trailing Period of Time

To get historical data for a trailing period of time, call the history method with the Symbol object(s) and a

timedelta .

PY

# DataFrame of quote data (Forex data doesn't have trade data)


single_history_df = qb.history(eurusd, timedelta(days=3))
subset_history_df = qb.history([eurusd, gbpusd], timedelta(days=3))
all_history_df = qb.history(qb.securities.keys(), timedelta(days=3))

# DataFrame of tick data


single_history_tick_df = qb.history(eurusd, timedelta(days=3), Resolution.TICK)
subset_history_tick_df = qb.history([eurusd, gbpusd], timedelta(days=3), Resolution.TICK)
all_history_tick_df = qb.history(qb.securities.keys(), timedelta(days=3), Resolution.TICK)

# Slice objects
all_history_slice = qb.history(timedelta(days=3))

# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](eurusd, timedelta(days=3), Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([eurusd, gbpusd], timedelta(days=3), Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), timedelta(days=3),
Resolution.MINUTE)

# Tick objects
single_history_ticks = qb.history[Tick](eurusd, timedelta(days=3), Resolution.TICK)
subset_history_ticks = qb.history[Tick]([eurusd, gbpusd], timedelta(days=3), Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), timedelta(days=3), Resolution.TICK)

The preceding calls return the most recent bars or ticks, excluding periods of time when the exchange was closed.

Defined Period of Time

To get historical data for a specific period of time, call the history method with the Symbol object(s), a start
datetime , and an end datetime . The start and end times you provide are based in the notebook time zone .
PY

start_time = datetime(2021, 1, 1)
end_time = datetime(2021, 2, 1)

# DataFrame of quote data (Forex data doesn't have trade data)


single_history_df = qb.history(eurusd, start_time, end_time)
subset_history_df = qb.history([eurusd, gbpusd], start_time, end_time)
all_history_df = qb.history(qb.securities.keys(), start_time, end_time)

# DataFrame of tick data


single_history_tick_df = qb.history(eurusd, start_time, end_time, Resolution.TICK)
subset_history_tick_df = qb.history([eurusd, gbpusd], start_time, end_time, Resolution.TICK)
all_history_tick_df = qb.history(qb.securities.keys(), start_time, end_time, Resolution.TICK)

# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](eurusd, start_time, end_time, Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([eurusd, gbpusd], start_time, end_time,
Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), start_time, end_time,
Resolution.MINUTE)

# Tick objects
single_history_ticks = qb.history[Tick](eurusd, start_time, end_time, Resolution.TICK)
subset_history_ticks = qb.history[Tick]([eurusd, gbpusd], start_time, end_time, Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), start_time, end_time, Resolution.TICK)

The preceding calls return the bars or ticks that have a timestamp within the defined period of time.

Resolutions

The following table shows the available resolutions and data formats for Forex subscriptions:

Resolution  TradeBar  QuoteBar  Trade Tick  Quote Tick
TICK        -         -         -           ✓
SECOND      -         ✓         -           -
MINUTE      -         ✓         -           -
HOUR        -         ✓         -           -
DAILY       -         ✓         -           -

Markets

The only market available for Forex pairs is Market.OANDA .

Data Normalization

The data normalization mode doesn't affect data from history requests. If you change the data normalization mode,
it won't change the outcome.

Wrangle Data

You need some historical data to perform wrangling operations. The process to manipulate the historical data
depends on its data type. To display pandas objects, run a cell in a notebook with the pandas object as the last line.
To display other data formats, call the print method.

DataFrame Objects

If the history method returns a DataFrame , the first level of the DataFrame index is the encoded Forex Symbol and
the second level is the end_time of the data sample. The columns of the DataFrame are the data properties.

To select the historical data of a single Forex, index the loc property of the DataFrame with the Forex Symbol .

PY

all_history_df.loc[eurusd] # or all_history_df.loc['EURUSD']

To select a column of the DataFrame , index it with the column name.

PY

all_history_df.loc[eurusd]['close']
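Because Forex history contains quote data, the DataFrame also includes bid and ask columns. For example, the following sketch computes the closing bid-ask spread of EURUSD, assuming the bidclose and askclose column names:

PY

eurusd_history = all_history_df.loc[eurusd]
spread = eurusd_history['askclose'] - eurusd_history['bidclose']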
If you request historical data for multiple Forex pairs, you can transform the DataFrame so that it's a time series of
close values for all of the Forex pairs. To transform the DataFrame , select the column you want to display for each

Forex pair and then call the unstack method.

PY

all_history_df['close'].unstack(level=0)

The DataFrame is transformed so that the column indices are the Symbol of each Forex pair and each row contains

the close value.

Slice Objects

If the history method returns Slice objects, iterate through the Slice objects to get each one. The Slice objects

may not have data for all of your Forex subscriptions. To avoid issues, check if the Slice contains data for your
Forex pair before you index it with the Forex Symbol .

You can also iterate through each QuoteBar in the Slice .

PY

for slice in all_history_slice:


for kvp in slice.quote_bars:
symbol = kvp.key
quote_bar = kvp.value
QuoteBar Objects

If the history method returns QuoteBar objects, iterate through the QuoteBar objects to get each one.

PY

for quote_bar in single_history_quote_bars:


print(quote_bar)

If the history method returns QuoteBars , iterate through the QuoteBars to get the QuoteBar of each Forex pair.
The QuoteBars may not have data for all of your Forex subscriptions. To avoid issues, check if the QuoteBars

object contains data for your security before you index it with the Forex Symbol .

PY

for quote_bars in all_history_quote_bars:


if quote_bars.contains_key(eurusd):
quote_bar = quote_bars[eurusd]

You can also iterate through each of the QuoteBars .

PY

for quote_bars in all_history_quote_bars:


for kvp in quote_bars:
symbol = kvp.key
quote_bar = kvp.value

Tick Objects

If the history method returns Tick objects, iterate through the Tick objects to get each one.

PY

for tick in single_history_ticks:


print(tick)

If the history method returns Ticks, iterate through the Ticks to get the Tick of each Forex pair. The Ticks may
not have data for all of your Forex subscriptions. To avoid issues, check if the Ticks object contains data for your
security before you index it with the Forex Symbol.

PY

for ticks in all_history_ticks:


if ticks.contains_key(eurusd):
ticks = ticks[eurusd]

You can also iterate through each of the Ticks .

PY

for ticks in all_history_ticks:


for kvp in ticks:
symbol = kvp.key
tick = kvp.value
The Ticks objects only contain the last tick of each security for that particular timeslice.

Plot Data

You need some historical Forex data to produce plots. You can use many of the supported plotting libraries to
visualize data in various formats. For example, you can plot candlestick and line charts.

Candlestick Chart

Follow these steps to plot candlestick charts:

1. Get some historical data .

PY

history = qb.history(eurusd, datetime(2021, 11, 26), datetime(2021, 12, 8),


Resolution.DAILY).loc[eurusd]

2. Import the plotly library.

PY

import plotly.graph_objects as go

3. Create a Candlestick .

PY

candlestick = go.Candlestick(x=history.index,
open=history['open'],
high=history['high'],
low=history['low'],
close=history['close'])

4. Create a Layout .

PY

layout = go.Layout(title=go.layout.Title(text='EURUSD OHLC'),


xaxis_title='Date',
yaxis_title='Price',
xaxis_rangeslider_visible=False)

5. Create the Figure .

PY

fig = go.Figure(data=[candlestick], layout=layout)

6. Show the Figure .

PY

fig.show()

Candlestick charts display the open, high, low, and close prices of the security.
Line Chart

Follow these steps to plot line charts using built-in methods :

1. Get some historical data.

PY

history = qb.history([eurusd, gbpusd], datetime(2021, 11, 26), datetime(2021, 12, 8),


Resolution.DAILY)

2. Select the data to plot.

PY

pct_change = history['close'].unstack(0).pct_change().dropna()

3. Call the plot method on the pandas object.

PY

pct_change.plot(title="Close Price %Change", figsize=(15, 10))

4. Show the plot.

PY

plt.show()

Line charts display the value of the property you selected in a time series.
Datasets > CFD

Datasets
CFD

Introduction

This page explains how to request, manipulate, and visualize historical CFD data.

Create Subscriptions

Follow these steps to subscribe to a CFD security:

1. Create a QuantBook .

PY

qb = QuantBook()

2. (Optional) Set the time zone to the data time zone .

PY

qb.set_time_zone(TimeZones.UTC)

3. Call the add_cfd method with a ticker and then save a reference to the CFD Symbol .

PY

spx = qb.add_cfd("SPX500USD").symbol
usb = qb.add_cfd("USB10YUSD").symbol

To view all of the available contracts, see Supported Assets .

Get Historical Data

You need a subscription before you can request historical data for a security. On the time dimension, you can
request an amount of historical data based on a trailing number of bars, a trailing period of time, or a defined

period of time. On the security dimension, you can request historical data for a single CFD contract, a subset of the
contracts you created subscriptions for in your notebook, or all of the contracts in your notebook.

Trailing Number of Bars

To get historical data for a number of trailing bars, call the history method with the Symbol object(s) and an

integer.
PY

# DataFrame
single_history_df = qb.history(spx, 10)
subset_history_df = qb.history([spx, usb], 10)
all_history_df = qb.history(qb.securities.keys(), 10)

# Slice objects
all_history_slice = qb.history(10)

# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](spx, 10)
subset_history_quote_bars = qb.history[QuoteBar]([spx, usb], 10)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), 10)

The preceding calls return the most recent bars, excluding periods of time when the exchange was closed.

Trailing Period of Time

To get historical data for a trailing period of time, call the history method with the Symbol object(s) and a

timedelta .

PY

# DataFrame of quote data (CFD data doesn't have trade data)


single_history_df = qb.history(spx, timedelta(days=3))
subset_history_df = qb.history([spx, usb], timedelta(days=3))
all_history_df = qb.history(qb.securities.keys(), timedelta(days=3))

# DataFrame of tick data


single_history_tick_df = qb.history(spx, timedelta(days=3), Resolution.TICK)
subset_history_tick_df = qb.history([spx, usb], timedelta(days=3), Resolution.TICK)
all_history_tick_df = qb.history(qb.securities.keys(), timedelta(days=3), Resolution.TICK)

# Slice objects
all_history_slice = qb.history(timedelta(days=3))

# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](spx, timedelta(days=3), Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([spx, usb], timedelta(days=3), Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), timedelta(days=3),
Resolution.MINUTE)

# Tick objects
single_history_ticks = qb.history[Tick](spx, timedelta(days=3), Resolution.TICK)
subset_history_ticks = qb.history[Tick]([spx, usb], timedelta(days=3), Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), timedelta(days=3), Resolution.TICK)

The preceding calls return the most recent bars or ticks, excluding periods of time when the exchange was closed.

Defined Period of Time

To get historical data for a specific period of time, call the history method with the Symbol object(s), a start
datetime , and an end datetime . The start and end times you provide are based in the notebook time zone .
PY

start_time = datetime(2021, 1, 1)
end_time = datetime(2021, 2, 1)

# DataFrame of quote data (CFD data doesn't have trade data)


single_history_df = qb.history(spx, start_time, end_time)
subset_history_df = qb.history([spx, usb], start_time, end_time)
all_history_df = qb.history(qb.securities.keys(), start_time, end_time)

# DataFrame of tick data


single_history_tick_df = qb.history(spx, start_time, end_time, Resolution.TICK)
subset_history_tick_df = qb.history([spx, usb], start_time, end_time, Resolution.TICK)
all_history_tick_df = qb.history(qb.securities.keys(), start_time, end_time, Resolution.TICK)

# QuoteBar objects
single_history_quote_bars = qb.history[QuoteBar](spx, start_time, end_time, Resolution.MINUTE)
subset_history_quote_bars = qb.history[QuoteBar]([spx, usb], start_time, end_time, Resolution.MINUTE)
all_history_quote_bars = qb.history[QuoteBar](qb.securities.keys(), start_time, end_time,
Resolution.MINUTE)

# Tick objects
single_history_ticks = qb.history[Tick](spx, start_time, end_time, Resolution.TICK)
subset_history_ticks = qb.history[Tick]([spx, usb], start_time, end_time, Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), start_time, end_time, Resolution.TICK)

The preceding calls return the bars or ticks that have a timestamp within the defined period of time.

Resolutions

The following table shows the available resolutions and data formats for CFD subscriptions:

Resolution  TradeBar  QuoteBar  Trade Tick  Quote Tick
TICK        -         -         -           ✓
SECOND      -         ✓         -           -
MINUTE      -         ✓         -           -
HOUR        -         ✓         -           -
DAILY       -         ✓         -           -

Markets

The only market available for CFD contracts is Market.OANDA .

Data Normalization

The data normalization mode doesn't affect data from history requests. If you change the data normalization mode,
it won't change the outcome.

Wrangle Data

You need some historical data to perform wrangling operations. The process to manipulate the historical data
depends on its data type. To display pandas objects, run a cell in a notebook with the pandas object as the last line.
To display other data formats, call the print method.

DataFrame Objects

If the history method returns a DataFrame , the first level of the DataFrame index is the encoded CFD Symbol and

the second level is the end_time of the data sample. The columns of the DataFrame are the data properties.

To select the historical data of a single CFD, index the loc property of the DataFrame with the CFD Symbol .

PY

all_history_df.loc[spx] # or all_history_df.loc['SPX500USD']

To select a column of the DataFrame , index it with the column name.

PY

all_history_df.loc[spx]['close']
If you request historical data for multiple CFD contracts, you can transform the DataFrame so that it's a time series

of close values for all of the CFD contracts. To transform the DataFrame , select the column you want to display for
each CFD contract and then call the unstack method.

PY

all_history_df['close'].unstack(level=0)

The DataFrame is transformed so that the column indices are the Symbol of each CFD contract and each row
contains the close value.

Slice Objects

If the history method returns Slice objects, iterate through the Slice objects to get each one. The Slice objects

may not have data for all of your CFD subscriptions. To avoid issues, check if the Slice contains data for your CFD
contract before you index it with the CFD Symbol .

You can also iterate through each QuoteBar in the Slice .

PY

for slice in all_history_slice:


for kvp in slice.quote_bars:
symbol = kvp.key
quote_bar = kvp.value
QuoteBar Objects

If the history method returns QuoteBar objects, iterate through the QuoteBar objects to get each one.

PY

for quote_bar in single_history_quote_bars:


print(quote_bar)

If the history method returns QuoteBars , iterate through the QuoteBars to get the QuoteBar of each CFD contract.

The QuoteBars may not have data for all of your CFD subscriptions. To avoid issues, check if the QuoteBars object
contains data for your security before you index it with the CFD Symbol .

PY

for quote_bars in all_history_quote_bars:


if quote_bars.contains_key(spx):
quote_bar = quote_bars[spx]

You can also iterate through each of the QuoteBars .

PY

for quote_bars in all_history_quote_bars:


for kvp in quote_bars:
symbol = kvp.key
quote_bar = kvp.value

Tick Objects

If the history method returns Tick objects, iterate through the Tick objects to get each one.

PY

for tick in single_history_ticks:


print(tick)

If the history method returns Ticks, iterate through the Ticks to get the Tick of each CFD contract. The Ticks
may not have data for all of your CFD subscriptions. To avoid issues, check if the Ticks object contains data for
your security before you index it with the CFD Symbol.

PY

for ticks in all_history_ticks:


if ticks.contains_key(spx):
ticks = ticks[spx]

You can also iterate through each of the Ticks .

PY

for ticks in all_history_ticks:


for kvp in ticks:
symbol = kvp.key
tick = kvp.value
The Ticks objects only contain the last tick of each security for that particular timeslice.
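For tick-level analysis, the DataFrame format is often easier to work with. For example, the following sketch computes the average bid-ask spread from the tick history, assuming the bidprice and askprice column names of the tick DataFrame:

PY

spread = single_history_tick_df['askprice'] - single_history_tick_df['bidprice']
average_spread = spread.mean()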

Plot Data

You need some historical CFD data to produce plots. You can use many of the supported plotting libraries to

visualize data in various formats. For example, you can plot candlestick and line charts.

Candlestick Chart

Follow these steps to plot candlestick charts:

1. Get some historical data .

PY

history = qb.history(spx, datetime(2021, 11, 26), datetime(2021, 12, 8), Resolution.DAILY).loc[spx]

2. Import the plotly library.

PY

import plotly.graph_objects as go

3. Create a Candlestick .

PY

candlestick = go.Candlestick(x=history.index,
open=history['open'],
high=history['high'],
low=history['low'],
close=history['close'])

4. Create a Layout .

PY

layout = go.Layout(title=go.layout.Title(text='SPX CFD OHLC'),


xaxis_title='Date',
yaxis_title='Price',
xaxis_rangeslider_visible=False)

5. Create the Figure .

PY

fig = go.Figure(data=[candlestick], layout=layout)

6. Show the Figure .

PY

fig.show()

Candlestick charts display the open, high, low, and close prices of the security.
Line Chart

Follow these steps to plot line charts using built-in methods :

1. Get some historical data.

PY

history = qb.history([spx, usb], datetime(2021, 11, 26), datetime(2021, 12, 8), Resolution.DAILY)

2. Select the data to plot.

PY

pct_change = history['close'].unstack(0).pct_change().dropna()

3. Call the plot method on the pandas object.

PY

pct_change.plot(title="Close Price %Change", figsize=(15, 10))

4. Show the plot.

PY

plt.show()

Line charts display the value of the property you selected in a time series.
Datasets > Indices

Datasets
Indices

Introduction

This page explains how to request, manipulate, and visualize historical Index data.

Create Subscriptions

Follow these steps to subscribe to an Index security:

1. Create a QuantBook .

PY

qb = QuantBook()

2. Call the add_index method with a ticker and then save a reference to the Index Symbol .

PY

spx = qb.add_index("SPX").symbol
vix = qb.add_index("VIX").symbol

To view all of the available indices, see Supported Indices .

Get Historical Data

You need a subscription before you can request historical data for a security. On the time dimension, you can
request an amount of historical data based on a trailing number of bars, a trailing period of time, or a defined

period of time. On the security dimension, you can request historical data for a single Index, a subset of the Indices
you created subscriptions for in your notebook, or all of the Indices in your notebook.

Trailing Number of Bars

To get historical data for a number of trailing bars, call the history method with the Symbol object(s) and an

integer.
PY

# DataFrame
single_history_df = qb.history(spx, 10)
single_history_trade_bar_df = qb.history(TradeBar, spx, 10)
subset_history_df = qb.history([spx, vix], 10)
subset_history_trade_bar_df = qb.history(TradeBar, [spx, vix], 10)
all_history_df = qb.history(qb.securities.keys(), 10)
all_history_trade_bar_df = qb.history(TradeBar, qb.securities.keys(), 10)

# Slice objects
all_history_slice = qb.history(10)

# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](spx, 10)
subset_history_trade_bars = qb.history[TradeBar]([spx, vix], 10)
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), 10)

The preceding calls return the most recent bars, excluding periods of time when the exchange was closed.

Trailing Period of Time

To get historical data for a trailing period of time, call the history method with the Symbol object(s) and a

timedelta .

PY

# DataFrame of trade data (indices don't have quote data)


single_history_df = qb.history(spx, timedelta(days=3))
subset_history_df = qb.history([spx, vix], timedelta(days=3))
all_history_df = qb.history(qb.securities.keys(), timedelta(days=3))

# DataFrame of tick data


single_history_tick_df = qb.history(spx, timedelta(days=3), Resolution.TICK)
subset_history_tick_df = qb.history([spx, vix], timedelta(days=3), Resolution.TICK)
all_history_tick_df = qb.history(qb.securities.keys(), timedelta(days=3), Resolution.TICK)

# Slice objects
all_history_slice = qb.history(timedelta(days=3))

# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](spx, timedelta(days=3))
subset_history_trade_bars = qb.history[TradeBar]([spx, vix], timedelta(days=3))
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), timedelta(days=3))

# Tick objects
single_history_ticks = qb.history[Tick](spx, timedelta(days=3), Resolution.TICK)
subset_history_ticks = qb.history[Tick]([spx, vix], timedelta(days=3), Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), timedelta(days=3), Resolution.TICK)

The preceding calls return the most recent bars or ticks, excluding periods of time when the exchange was closed.

Defined Period of Time

To get historical data for a specific period of time, call the history method with the Symbol object(s), a start

datetime , and an end datetime . The start and end times you provide are based in the notebook time zone .
PY

start_time = datetime(2021, 1, 1)
end_time = datetime(2021, 2, 1)

# DataFrame of trade data (indices don't have quote data)


single_history_df = qb.history(spx, start_time, end_time)
subset_history_df = qb.history([spx, vix], start_time, end_time)
all_history_df = qb.history(qb.securities.keys(), start_time, end_time)

# DataFrame of tick data


single_history_tick_df = qb.history(spx, start_time, end_time, Resolution.TICK)
subset_history_tick_df = qb.history([spx, vix], start_time, end_time, Resolution.TICK)
all_history_tick_df = qb.history(qb.securities.keys(), start_time, end_time, Resolution.TICK)

# TradeBar objects
single_history_trade_bars = qb.history[TradeBar](spx, start_time, end_time)
subset_history_trade_bars = qb.history[TradeBar]([spx, vix], start_time, end_time)
all_history_trade_bars = qb.history[TradeBar](qb.securities.keys(), start_time, end_time)

# Tick objects
single_history_ticks = qb.history[Tick](spx, start_time, end_time, Resolution.TICK)
subset_history_ticks = qb.history[Tick]([spx, vix], start_time, end_time, Resolution.TICK)
all_history_ticks = qb.history[Tick](qb.securities.keys(), start_time, end_time, Resolution.TICK)

The preceding calls return the bars or ticks that have a timestamp within the defined period of time.

Resolutions

The following table shows the available resolutions and data formats for Index subscriptions:

Resolution  TradeBar  QuoteBar  Trade Tick  Quote Tick
TICK        -         -         ✓           -
SECOND      ✓         -         -           -
MINUTE      ✓         -         -           -
HOUR        ✓         -         -           -
DAILY       ✓         -         -           -

Markets

The only market available for Indices is Market.USA .

Data Normalization

The data normalization mode doesn't affect data from history requests. If you change the data normalization mode,
it won't change the outcome.

Wrangle Data

You need some historical data to perform wrangling operations. The process to manipulate the historical data
depends on its data type. To display pandas objects, run a cell in a notebook with the pandas object as the last line.

To display other data formats, call the print method.


DataFrame Objects

If the history method returns a DataFrame , the first level of the DataFrame index is the encoded Index Symbol and
the second level is the end_time of the data sample. The columns of the DataFrame are the data properties.

To select the historical data of a single Index, index the loc property of the DataFrame with the Index Symbol .

PY

all_history_df.loc[spx] # or all_history_df.loc['SPX']

To select a column of the DataFrame , index it with the column name.

PY

all_history_df.loc[spx]['close']
If you request historical data for multiple Indices, you can transform the DataFrame so that it's a time series of close
values for all of the Indices. To transform the DataFrame , select the column you want to display for each Index and

then call the unstack method.

PY

all_history_df['close'].unstack(level=0)

The DataFrame is transformed so that the column indices are the Symbol of each Index and each row contains the

close value.
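You can then apply standard pandas operations to the transformed DataFrame. For example, the following sketch computes the correlation between the daily returns of the Indices:

PY

daily_returns = all_history_df['close'].unstack(level=0).pct_change().dropna()
correlations = daily_returns.corr()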

Slice Objects

If the history method returns Slice objects, iterate through the Slice objects to get each one. The Slice objects
may not have data for all of your Index subscriptions. To avoid issues, check if the Slice contains data for your

Index before you index it with the Index Symbol .

You can also iterate through each TradeBar in the Slice .

PY

for slice in all_history_slice:


for kvp in slice.bars:
symbol = kvp.key
trade_bar = kvp.value
TradeBar Objects

If the history method returns TradeBar objects, iterate through the TradeBar objects to get each one.

PY

for trade_bar in single_history_trade_bars:


print(trade_bar)

If the history method returns TradeBars , iterate through the TradeBars to get the TradeBar of each Index. The

TradeBars may not have data for all of your Index subscriptions. To avoid issues, check if the TradeBars object

contains data for your security before you index it with the Index Symbol .

PY

for trade_bars in all_history_trade_bars:


if trade_bars.contains_key(spx):
trade_bar = trade_bars[spx]

You can also iterate through each of the TradeBars .

PY

for trade_bars in all_history_trade_bars:


for kvp in trade_bars:
    symbol = kvp.key
    trade_bar = kvp.value

Tick Objects

If the history method returns Tick objects, iterate through the Tick objects to get each one.

PY

for tick in single_history_ticks:


print(tick)

If the history method returns Ticks, iterate through the Ticks to get the Tick of each Index. The Ticks may not
have data for all of your Index subscriptions. To avoid issues, check if the Ticks object contains data for your
security before you index it with the Index Symbol.

PY

for ticks in all_history_ticks:


if ticks.contains_key(spx):
ticks = ticks[spx]

You can also iterate through each of the Ticks .

PY

for ticks in all_history_ticks:


for kvp in ticks:
symbol = kvp.key
tick = kvp.value
The Ticks objects only contain the last tick of each security for that particular timeslice.

Plot Data

You need some historical Indices data to produce plots. You can use many of the supported plotting libraries to

visualize data in various formats. For example, you can plot candlestick and line charts.

Candlestick Chart

Follow these steps to plot candlestick charts:

1. Get some historical data .

PY

history = qb.history(spx, datetime(2021, 11, 24), datetime(2021, 12, 8), Resolution.DAILY).loc[spx]

2. Import the plotly library.

PY

import plotly.graph_objects as go

3. Create a Candlestick .

PY

candlestick = go.Candlestick(x=history.index,
open=history['open'],
high=history['high'],
low=history['low'],
close=history['close'])

4. Create a Layout .

PY

layout = go.Layout(title=go.layout.Title(text='SPX OHLC'),


xaxis_title='Date',
yaxis_title='Price',
xaxis_rangeslider_visible=False)

5. Create a Figure .

PY

fig = go.Figure(data=[candlestick], layout=layout)

6. Show the Figure .

PY

fig.show()

Candlestick charts display the open, high, low, and close prices of the security.
Line Chart

Follow these steps to plot line charts using built-in methods :

1. Get some historical data.

PY

history = qb.history([spx, vix], datetime(2021, 11, 24), datetime(2021, 12, 8), Resolution.DAILY)

2. Select the data to plot.

PY

pct_change = history['close'].unstack(0).pct_change().dropna()

3. Call the plot method on the pandas object.

PY

pct_change.plot(title="Close Price %Change", figsize=(15, 10))

4. Show the plot.

PY

plt.show()

Line charts display the value of the property you selected in a time series.
Datasets > Index Options

Datasets
Index Options

Datasets > Index Options > Key Concepts

Index Options
Key Concepts

Introduction

Index Options are a financial derivative that gives the holder the right (but not the obligation) to buy or sell the
value of an underlying Index, such as the S&P 500 index, at the stated exercise price. No actual assets are bought

or sold. This page explains the basics of Index Option data in the Research Environment. To get some data, see
Universes or Individual Contracts . For more information about the specific datasets we use, see the US Index

Options and US Index Option Universe dataset listings.

Resolutions

The following table shows the available resolutions and data formats for Index Option contract subscriptions:

Resolution  TradeBar  QuoteBar  Trade Tick  Quote Tick
TICK        -         -         -           -
SECOND      -         -         -           -
MINUTE      ✓         ✓         -           -
HOUR        ✓         ✓         -           -
DAILY       ✓         ✓         -           -

Markets

The only market available for Index Options is Market.USA .

Data Normalization

The data normalization mode doesn't affect data from history requests. If you change the data normalization mode,
it won't change the outcome.
Datasets > Index Options > Universes

Index Options
Universes

Introduction

This page explains how to request historical data for a universe of Index Option contracts.

Create Subscriptions

Follow these steps to subscribe to an Index Option universe:

1. Create a QuantBook .

PY

qb = QuantBook()

2. Add the underlying Index .

PY

index_symbol = qb.add_index("SPX", Resolution.MINUTE).symbol

To view the available Indices, see Supported Assets .

If you do not pass a resolution argument, Resolution.MINUTE is used by default.

3. Call the add_index_option method with the underlying Index Symbol and, if you want non-standard Index

Options, the target Option ticker .

PY

option = qb.add_index_option(index_symbol)

Price History

The contract filter determines which Index Option contracts are in your universe each trading day. The default filter

selects the contracts with the following characteristics:

Standard type (excludes weeklys)


Within 1 strike price of the underlying asset price

Expire within 35 days

To change the filter, call the set_filter method.


PY

# Set the contract filter to select contracts that have the strike price
# within 1 strike level and expire within 90 days.
option.set_filter(-1, 1, 0, 90)

To get the prices and volumes for all of the Index Option contracts that pass your filter during a specific period of

time, call the option_history method with the underlying Index Symbol object, a start datetime , and an end
datetime .

PY

option_history = qb.option_history(
index_symbol, datetime(2024, 1, 1), datetime(2024, 1, 5), Resolution.MINUTE,
fill_forward=False, extended_market_hours=False
)

To convert the OptionHistory object to a DataFrame that contains the trade and quote information of each

contract and the underlying, use the data_frame property.

PY

option_history.data_frame

To get the expiration dates of all the contracts in an OptionHistory object, call the get_expiry_dates method.

PY

option_history.get_expiry_dates()

To get the strike prices of all the contracts in an OptionHistory object, call the get_strikes method.

PY

option_history.get_strikes()
Daily Price and Greeks History

To get daily data on all the tradable contracts for a given date, call the history method with the canonical Option
Symbol, a start date, and an end date. This method returns the entire Option chain for each trading day, not the
subset of contracts that pass your universe filter. The daily Option chains contain the prices, volume, open
interest, implied volatility, and Greeks of each contract.

PY

# DataFrame format
history_df = qb.history(option.symbol, datetime(2024, 1, 1), datetime(2024, 1, 5), flatten=True)

# OptionUniverse objects
history = qb.history[OptionUniverse](option.symbol, datetime(2024, 1, 1), datetime(2024, 1, 5))
for chain in history:
end_time = chain.end_time
filtered_chain = [contract for contract in chain if contract.greeks.delta > 0.3]
for contract in filtered_chain:
price = contract.price
iv = contract.implied_volatility
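With the flatten=True DataFrame format, the equivalent contract filter is a plain pandas operation. The following is a sketch, assuming the delta column of the flattened output:

PY

filtered_df = history_df[history_df.delta > 0.3]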

The method represents each contract with an OptionUniverse object, which has the following properties:
Datasets > Index Options > Individual Contracts

Index Options
Individual Contracts

Introduction

This page explains how to request historical data for individual Index Option contracts. The history requests on this
page only return the prices and open interest of the Option contracts, not their implied volatility or Greeks. For

information about history requests that return the daily implied volatility and Greeks, see Universes .

Create Subscriptions

Follow these steps to subscribe to individual Index Option contracts:

1. Create a QuantBook .

PY

qb = QuantBook()

2. Add the underlying Index .

PY

underlying_symbol = qb.add_index("SPX", Resolution.MINUTE).symbol

To view the available Indices, see Supported Assets .

If you do not pass a resolution argument, Resolution.MINUTE is used by default.

3. Set the start date to a date in the past that you want to use as the analysis date.

PY

qb.set_start_date(2024, 1, 1)

The method that you call in the next step returns data on all the contracts that were tradable on this date.

4. Call the option_chain method with the underlying Index Symbol .

PY

# Get the Option contracts that were tradable on January 1st, 2024.
# Option A: Standard contracts.
chain = qb.option_chain(
Symbol.create_canonical_option(underlying_symbol, Market.USA, "?SPX"), flatten=True
).data_frame

# Option B: Weekly contracts.


#chain = qb.option_chain(
# Symbol.create_canonical_option(underlying_symbol, "SPXW", Market.USA, "?SPXW"), flatten=True
#).data_frame
This method returns an OptionChain object, which represents an entire chain of Option contracts for a single
underlying security. The flatten=True argument formats the chain data into a DataFrame where each row
represents a single contract.

5. Sort and filter the data to select the specific contract(s) you want to analyze.

PY

# Select a contract.
expiry = chain.expiry.min()
contract_symbol = chain[
# Select call contracts with the closest expiry.
(chain.expiry == expiry) &
(chain.right == OptionRight.CALL) &
# Select contracts with a 0.3-0.7 delta.
(chain.delta > 0.3) &
(chain.delta < 0.7)
# Select the contract with the largest open interest.
].sort_values('openinterest').index[-1]

6. Call the add_index_option_contract method with an OptionContract Symbol and disable fill-forward.

PY

option_contract = qb.add_index_option_contract(contract_symbol, fill_forward=False)

Disable fill-forward because there are only a few OpenInterest data points per day.

Trade History

TradeBar objects are price bars that consolidate individual trades from the exchanges. They contain the open,

high, low, close, and volume of trading activity over a period of time.
To get trade data, call the history or history[TradeBar] method with the contract Symbol object(s).

PY

# DataFrame format
history_df = qb.history(TradeBar, contract_symbol, timedelta(3))
display(history_df)

# TradeBar objects
history = qb.history[TradeBar](contract_symbol, timedelta(3))
for trade_bar in history:
print(trade_bar)

TradeBar objects have the following properties:

Quote History

QuoteBar objects are bars that consolidate NBBO quotes from the exchanges. They contain the open, high, low,

and close prices of the bid and ask. The open , high , low , and close properties of the QuoteBar object are the
mean of the respective bid and ask prices. If the bid or ask portion of the QuoteBar has no data, the open , high ,

low , and close properties of the QuoteBar copy the values of either the bid or ask instead of taking their mean.

To get quote data, call the history or history[QuoteBar] method with the contract Symbol object(s).

PY

# DataFrame format
history_df = qb.history(QuoteBar, contract_symbol, timedelta(3))
display(history_df)

# QuoteBar objects
history = qb.history[QuoteBar](contract_symbol, timedelta(3))
for quote_bar in history:
print(quote_bar)
QuoteBar objects have the following properties:

Open Interest History

Open interest is the number of outstanding contracts that haven't been settled. It provides a measure of investor
interest and the market liquidity, so it's a popular metric to use for contract selection. Open interest is calculated

once per day.

To get open interest data, call the history or history[OpenInterest] method with the contract Symbol object(s).

PY

# DataFrame format
history_df = qb.history(OpenInterest, contract_symbol, timedelta(3))
display(history_df)

# OpenInterest objects
history = qb.history[OpenInterest](contract_symbol, timedelta(3))
for open_interest in history:
print(open_interest)

OpenInterest objects have the following properties:

Greeks and IV History

The Greeks are measures that describe the sensitivity of an Option's price to various factors like underlying price

changes (Delta), time decay (Theta), volatility (Vega), and interest rates (Rho), while Implied Volatility (IV)
represents the market's expectation of the underlying asset's volatility over the life of the Option.

Follow these steps to get the Greeks and IV data:

1. Create the mirror contract Symbol .

PY

mirror_contract_symbol = Symbol.create_option(
option_contract.underlying.symbol, contract_symbol.id.market, option_contract.style,
    OptionRight.CALL if option_contract.right == OptionRight.PUT else OptionRight.PUT,
option_contract.strike_price, option_contract.expiry
)

2. Set up the risk free interest rate , dividend yield , and Option pricing models.

In our research, we found the Forward Tree model to be the best pricing model for indicators.

PY

risk_free_rate_model = qb.risk_free_interest_rate_model
dividend_yield_model = DividendYieldProvider(underlying_symbol)
option_model = OptionPricingModelType.FORWARD_TREE

3. Define a method to return the IV & Greeks indicator values for each contract.
PY

def greeks_and_iv(contracts, period, risk_free_rate_model, dividend_yield_model, option_model):
    # Get the call and put contract.
    call, put = sorted(contracts, key=lambda s: s.id.option_right)

    def get_values(indicator_class, contract, mirror_contract):
        return qb.indicator_history(
            indicator_class(contract, risk_free_rate_model, dividend_yield_model, mirror_contract, option_model),
            [contract, mirror_contract, contract.underlying],
            period
        ).data_frame.current

    return pd.DataFrame({
        'iv_call': get_values(ImpliedVolatility, call, put),
        'iv_put': get_values(ImpliedVolatility, put, call),
        'delta_call': get_values(Delta, call, put),
        'delta_put': get_values(Delta, put, call),
        'gamma_call': get_values(Gamma, call, put),
        'gamma_put': get_values(Gamma, put, call),
        'rho_call': get_values(Rho, call, put),
        'rho_put': get_values(Rho, put, call),
        'vega_call': get_values(Vega, call, put),
        'vega_put': get_values(Vega, put, call),
        'theta_call': get_values(Theta, call, put),
        'theta_put': get_values(Theta, put, call),
    })

4. Call the preceding method and display the results.

PY

greeks_and_iv([contract_symbol, mirror_contract_symbol], 15, risk_free_rate_model,
              dividend_yield_model, option_model)

The DataFrame can have NaN entries if there is no data for the contracts or the underlying asset at a moment in
time.
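If your analysis requires complete rows, one option (a minimal sketch, reusing the method defined above) is to drop the samples with missing values:

PY

# A minimal sketch: remove the rows that have missing Greek or IV values.
greeks_df = greeks_and_iv([contract_symbol, mirror_contract_symbol], 15, risk_free_rate_model,
                          dividend_yield_model, option_model)
clean_df = greeks_df.dropna()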

Examples

The following examples demonstrate some common practices for analyzing individual Index Option contracts in
the Research Environment.
Example 1: Contract Trade History

The following notebook plots the historical prices of an SPX Index Option contract using Plotly:

PY

import plotly.graph_objects as go

# Get the SPX Option chain for January 1, 2024.
qb = QuantBook()
underlying_symbol = qb.add_index("SPX").symbol
qb.set_start_date(2024, 1, 1)
chain = qb.option_chain(
    Symbol.create_canonical_option(underlying_symbol, Market.USA, "?SPX"), flatten=True
).data_frame

# Select a contract from the chain.
expiry = chain.expiry.min()
contract_symbol = chain[
    (chain.expiry == expiry) &
    (chain.right == OptionRight.CALL) &
    (chain.delta > 0.3) &
    (chain.delta < 0.7)
].sort_values('openinterest').index[-1]

# Add the target contract.
qb.add_index_option_contract(contract_symbol)

# Get the contract history.
history = qb.history(contract_symbol, timedelta(3))

# Plot the price history.
go.Figure(
    data=go.Candlestick(
        x=history.index.levels[4],
        open=history['open'],
        high=history['high'],
        low=history['low'],
        close=history['close']
    ),
    layout=go.Layout(
        title=go.layout.Title(text=f'{contract_symbol.value} OHLC'),
        xaxis_title='Date',
        yaxis_title='Price',
        xaxis_rangeslider_visible=False
    )
).show()
Example 2: Contract Open Interest History

The following notebook plots the historical open interest of a VIXW Index Option contract using Matplotlib:

PY

import matplotlib.pyplot as plt

# Get the VIX weekly Option chain for January 1, 2024.
qb = QuantBook()
underlying_symbol = qb.add_index("VIX").symbol
qb.set_start_date(2024, 1, 1)
chain = qb.option_chain(
    Symbol.create_canonical_option(underlying_symbol, "VIXW", Market.USA, "?VIXW"), flatten=True
).data_frame

# Select a contract from the chain.
strike_distance = (chain.strike - chain.underlyinglastprice).abs()
target_strike_distance = strike_distance.min()
chain = chain.loc[strike_distance[strike_distance == target_strike_distance].index]
contract_symbol = chain.sort_values('openinterest').index[-1]

# Add the target contract.
qb.add_index_option_contract(contract_symbol, fill_forward=False)

# Get the contract's open interest history.
history = qb.history(OpenInterest, contract_symbol, timedelta(90))
history.index = history.index.droplevel([0, 1, 2])
history = history['openinterest'].unstack(0)[contract_symbol]

# Plot the open interest history.
history.plot(title=f'{contract_symbol.value} Open Interest')
plt.show()
Datasets > Alternative Data

Datasets
Alternative Data

Introduction

This page explains how to request, manipulate, and visualize historical alternative data. This tutorial uses the VIX Daily Price dataset from the CBOE as the example dataset.

Create Subscriptions

Follow these steps to subscribe to an alternative dataset from the Dataset Market :

1. Create a QuantBook .

PY

qb = QuantBook()

2. Call the add_data method with the dataset class, a ticker, and a resolution, and then save a reference to the alternative data Symbol.

PY

vix = qb.add_data(CBOE, "VIX", Resolution.DAILY).symbol
v3m = qb.add_data(CBOE, "VIX3M", Resolution.DAILY).symbol

To view the arguments that the add_data method accepts for each dataset, see the dataset listing .

If you don't pass a resolution argument, the dataset's default resolution is used. To view the supported resolutions and the default resolution of each dataset, see the dataset listing .

Get Historical Data

You need a subscription before you can request historical data for a dataset. On the time dimension, you can request an amount of historical data based on a trailing number of bars, a trailing period of time, or a defined period of time. On the dataset dimension, you can request historical data for a single dataset subscription, a subset of the dataset subscriptions you created in your notebook, or all of the dataset subscriptions in your notebook.

Trailing Number of Bars

To get historical data for a number of trailing bars, call the history method with the Symbol object(s) and an
integer.
PY

# DataFrame
single_history_df = qb.history(vix, 10)
subset_history_df = qb.history([vix, v3m], 10)
all_history_df = qb.history(qb.securities.keys, 10)

# Slice objects
all_history_slice = qb.history(10)

# CBOE objects
single_history_data_objects = qb.history[CBOE](vix, 10)
subset_history_data_objects = qb.history[CBOE]([vix, v3m], 10)
all_history_data_objects = qb.history[CBOE](qb.securities.keys, 10)

The preceding calls return the most recent bars, excluding periods of time when the exchange was closed.

Trailing Period of Time

To get historical data for a trailing period of time, call the history method with the Symbol object(s) and a
timedelta .

PY

# DataFrame
single_history_df = qb.History(vix, timedelta(days=3))
subset_history_df = qb.History([vix, v3m], timedelta(days=3))
all_history_df = qb.History(qb.Securities.Keys, timedelta(days=3))

# Slice objects
all_history_slice = qb.History(timedelta(days=3))

# CBOE objects
single_history_data_objects = qb.History[CBOE](vix, timedelta(days=3))
subset_history_data_objects = qb.History[CBOE]([vix, v3m], timedelta(days=3))
all_history_data_objects = qb.History[CBOE](qb.Securities.Keys, timedelta(days=3))

The preceding calls return the most recent bars or ticks, excluding periods of time when the exchange was closed.

Defined Period of Time

To get historical data for a specific period of time, call the history method with the Symbol object(s), a start
datetime , and an end datetime . The start and end times you provide are based in the notebook time zone .

PY

start_time = datetime(2021, 1, 1)
end_time = datetime(2021, 3, 1)

# DataFrame
single_history_df = qb.History(vix, start_time, end_time)
subset_history_df = qb.History([vix, v3m], start_time, end_time)
all_history_df = qb.History(qb.Securities.Keys, start_time, end_time)

# Slice objects
all_history_slice = qb.History(start_time, end_time)

# CBOE objects
single_history_data_objects = qb.History[CBOE](vix, start_time, end_time)
subset_history_data_objects = qb.History[CBOE]([vix, v3m], start_time, end_time)
all_history_data_objects = qb.History[CBOE](qb.Securities.Keys, start_time, end_time)
The preceding calls return the bars or ticks that have a timestamp within the defined period of time.

If you do not pass a resolution to the history method, the history method uses the resolution that the add_data
method used when you created the subscription .

Wrangle Data

You need some historical data to perform wrangling operations. The process to manipulate the historical data
depends on its data type. To display pandas objects, run a cell in a notebook with the pandas object as the last line.

To display other data formats, call the print method.

DataFrame Objects

If the history method returns a DataFrame , the first level of the DataFrame index is the encoded dataset Symbol
and the second level is the end_time of the data sample. The columns of the DataFrame are the data properties.

To select the historical data of a single dataset, index the loc property of the DataFrame with the dataset Symbol .

PY

all_history_df.loc[vix] # or all_history_df.loc['VIX']

To select a column of the DataFrame , index it with the column name.

PY

all_history_df.loc[vix]['close']
If you request historical data for multiple tickers, you can transform the DataFrame so that it's a time series of close values for all of the tickers. To transform the DataFrame, select the column you want to display for each ticker and then call the unstack method.

PY

all_history_df['close'].unstack(level=0)

The DataFrame is transformed so that the column indices are the Symbol of each ticker and each row contains the
close value.
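For example, the following sketch uses the transformed DataFrame to compute the spread between the VIX3M and VIX values (assuming the vix and v3m subscriptions and the all_history_df request above):

PY

# A minimal sketch: compute the VIX3M - VIX spread from the unstacked close values.
close_values = all_history_df['close'].unstack(level=0)
spread = close_values[v3m] - close_values[vix]
spread.tail()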

Slice Objects

If the history method returns Slice objects, iterate through the Slice objects to get each one. The Slice objects may not have data for all of your dataset subscriptions. To avoid issues, check if the Slice contains data for your ticker before you index it with the dataset Symbol.
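For example, the following sketch (assuming the vix subscription above) iterates through the Slice objects and only indexes a Slice after checking that it contains data for the ticker:

PY

# A minimal sketch: iterate the Slice objects and guard each index operation.
all_history_slice = qb.history(10)
for slice_ in all_history_slice:
    # Only index the Slice if it has data for the VIX subscription.
    if slice_.contains_key(vix):
        data_point = slice_[vix]
        print(f"{data_point.end_time}: {data_point.value}")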

Plot Data

You need some historical alternative data to produce plots. You can use many of the supported plotting libraries to
visualize data in various formats. For example, you can plot candlestick and line charts.

Candlestick Chart

You can only create candlestick charts for alternative datasets that have open, high, low, and close properties.

Follow these steps to plot candlestick charts:

1. Get some historical data .

PY

history = qb.history(vix, datetime(2021, 1, 1), datetime(2021, 2, 1)).loc[vix]

2. Import the plotly library.


PY

import plotly.graph_objects as go

3. Create a Candlestick .

PY

candlestick = go.Candlestick(x=history.index,
open=history['open'],
high=history['high'],
low=history['low'],
close=history['close'])

4. Create a Layout .

PY

layout = go.Layout(title=go.layout.Title(text='VIX from CBOE OHLC'),


xaxis_title='Date',
yaxis_title='Price',
xaxis_rangeslider_visible=False)

5. Create a Figure .

PY

fig = go.Figure(data=[candlestick], layout=layout)

6. Show the Figure .

PY

fig.show()

Candlestick charts display the open, high, low, and close prices of the alternative data.
Line Chart

Follow these steps to plot line charts using built-in methods :

1. Get some historical data.

PY

history = qb.history([vix, v3m], datetime(2021, 1, 1), datetime(2021, 2, 1))

2. Select the data to plot.

PY

values = history['close'].unstack(0)

3. Call the plot method on the pandas object.

PY

values.plot(title='Close', figsize=(15, 10))

4. Show the plot.

PY

import matplotlib.pyplot as plt
plt.show()

Line charts display the value of the property you selected in a time series.
Datasets > Custom Data

Datasets
Custom Data

Introduction

This page explains how to request, manipulate, and visualize historical user-defined custom data.

Define Custom Data

You must format the data file into chronological order before you define the custom data class.

To define a custom data class, extend the PythonData class and override the get_source and reader methods.

PY

class Nifty(PythonData):
    '''NIFTY Custom Data Class'''

    def get_source(self, config: SubscriptionDataConfig, date: datetime, is_live_mode: bool) -> SubscriptionDataSource:
        url = "http://cdn.quantconnect.com.s3.us-east-1.amazonaws.com/uploads/CNXNIFTY.csv"
        return SubscriptionDataSource(url, SubscriptionTransportMedium.REMOTE_FILE)

    def reader(self, config: SubscriptionDataConfig, line: str, date: datetime, is_live_mode: bool) -> BaseData:
        if not (line.strip() and line[0].isdigit()):
            return None

        # Create a new Nifty object.
        index = Nifty()
        index.symbol = config.symbol

        try:
            # Example file format:
            # Date, Open, High, Low, Close, Volume, Turnover
            # 2011-09-13, 7792.9, 7799.9, 7722.65, 7748.7, 116534670, 6107.78
            data = line.split(',')
            index.time = datetime.strptime(data[0], "%Y-%m-%d")
            index.end_time = index.time + timedelta(days=1)
            index.value = float(data[4])
            index["Open"] = float(data[1])
            index["High"] = float(data[2])
            index["Low"] = float(data[3])
            index["Close"] = float(data[4])
        except Exception:
            # Skip malformed lines.
            pass

        return index

Create Subscriptions

You need to define a custom data class before you can subscribe to it.

Follow these steps to subscribe to custom dataset:

1. Create a QuantBook .
PY

qb = QuantBook()

2. Call the add_data method with a ticker and then save a reference to the data Symbol .

PY

symbol = qb.add_data(Nifty, "NIFTY").symbol

Custom data has its own resolution, so you don't need to specify it.

Get Historical Data

You need a subscription before you can request historical data for a security. You can request an amount of historical data based on a trailing number of bars, a trailing period of time, or a defined period of time.

Before you request data, call the set_start_date method with a datetime to reduce the risk of look-ahead bias .

PY

qb.set_start_date(2014, 7, 29)

If you call the set_start_date method, the date that you pass to the method is the latest date for which your history requests will return data.
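For example, the following sketch (assuming the subscription above) shows that the most recent sample ends on or before the start date you set:

PY

# A minimal sketch: all returned samples end on or before the start date.
qb.set_start_date(2014, 7, 29)
history = qb.history(symbol, 10)
print(history.loc[symbol].index.max())  # On or before 2014-07-29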

Trailing Number of Bars

Call the history method with a symbol, integer, and resolution to request historical data based on the given
number of trailing bars and resolution.

PY

history = qb.history(symbol, 10)

This method returns the most recent bars, excluding periods of time when the exchange was closed.

Trailing Period of Time

Call the history method with a symbol, timedelta , and resolution to request historical data based on the given
trailing period of time and resolution.

PY

history = qb.history(symbol, timedelta(days=10))

This method returns the most recent bars, excluding periods of time when the exchange was closed.

Defined Period of Time

Call the history method with a symbol, start datetime , end datetime , and resolution to request historical data
based on the defined period of time and resolution. The start and end times you provide are based in the notebook
time zone .

PY

start_time = datetime(2013, 7, 29)
end_time = datetime(2014, 7, 29)
history = qb.history(symbol, start_time, end_time)

This method returns the bars that are timestamped within the defined period of time.

In all of the cases above, the history method returns a DataFrame with a MultiIndex .
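For example, the following sketch (assuming the subscription above) inspects the index levels of the returned DataFrame:

PY

# A minimal sketch: inspect the MultiIndex of the returned DataFrame.
history = qb.history(symbol, 10)
print(history.index.nlevels)  # 2 levels: the Symbol and the sample end time
print(history.index.names)
history.loc[symbol]           # Select the rows of a single Symbol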

Download Method

To download the data directly from the remote file location instead of using your custom data class, call the
download method with the data URL.

PY

content = qb.download("http://cdn.quantconnect.com.s3.us-east-1.amazonaws.com/uploads/CNXNIFTY.csv")

Follow these steps to convert the content to a DataFrame :

1. Import the StringIO class from the io library.

PY

from io import StringIO

2. Create a StringIO .

PY

data = StringIO(content)

3. Call the read_csv method.

PY

dataframe = pd.read_csv(data, index_col=0)


Wrangle Data

You need some historical data to perform wrangling operations. To display pandas objects, run a cell in a notebook
with the pandas object as the last line. To display other data formats, call the print method.

The DataFrame that the history method returns has the following index levels:

1. Dataset Symbol
2. The end_time of the data sample

The columns of the DataFrame are the data properties.

To select the data of a single dataset, index the loc property of the DataFrame with the data Symbol .

PY

history.loc[symbol]
To select a column of the DataFrame , index it with the column name.

PY

history.loc[symbol]['close']

Plot Data

You need some historical custom data to produce plots. You can use many of the supported plotting libraries to
visualize data in various formats. For example, you can plot candlestick and line charts.

Candlestick Chart

Follow these steps to plot candlestick charts:

1. Get some historical data .

PY

history = qb.history(symbol, datetime(2013, 7, 1), datetime(2014, 7, 31)).loc[symbol]

2. Import the plotly library.


PY

import plotly.graph_objects as go

3. Create a Candlestick .

PY

candlestick = go.Candlestick(x=history.index,
open=history['open'],
high=history['high'],
low=history['low'],
close=history['close'])

4. Create a Layout .

PY

layout = go.Layout(title=go.layout.Title(text=f'{symbol} OHLC'),


xaxis_title='Date',
yaxis_title='Price',
xaxis_rangeslider_visible=False)

5. Create a Figure .

PY

fig = go.Figure(data=[candlestick], layout=layout)

6. Show the Figure .

PY

fig.show()

Candlestick charts display the open, high, low, and close prices of the security.
Line Chart

Follow these steps to plot line charts using built-in methods :

1. Get some historical data and select the values to plot.

PY

history = qb.history(symbol, datetime(2013, 7, 1), datetime(2014, 7, 31))
values = history['value'].unstack(level=0)

2. Call the plot method on the pandas object.

PY

values.plot(title="Value", figsize=(15, 10))

3. Show the plot.

PY

import matplotlib.pyplot as plt
plt.show()

Line charts display the value of the property you selected in a time series.
Charting

Charting

The Research Environment is centered around analyzing and understanding data. One way to gain a more intuitive understanding of the existing relationships in our data is to visualize it using charts. There are many different libraries that allow you to chart our data in different ways. Sometimes the right chart can illuminate an interesting relationship in the data. Click one of the following libraries to learn more about it:

Bokeh

Matplotlib

Plotly

Seaborn

Plotly NET

See Also

Supported Libraries
Algorithm Charting
Charting > Bokeh

Charting
Bokeh

Introduction

bokeh is a Python library you can use to create interactive visualizations. It helps you build beautiful graphics, ranging from simple plots to complex dashboards with streaming datasets. With bokeh, you can create JavaScript-powered visualizations without writing any JavaScript.

Import Libraries

Follow these steps to import the libraries that you need:

1. Import the bokeh library.

PY

from bokeh.plotting import figure, show
from bokeh.models import BasicTicker, ColorBar, ColumnDataSource, LinearColorMapper
from bokeh.palettes import Category20c
from bokeh.transform import cumsum, transform
from bokeh.io import output_notebook

2. Call the output_notebook method.

PY

output_notebook()

3. Import the numpy library.

PY

import numpy as np

Get Historical Data

Get some historical market data to produce the plots. For example, to get data for a bank sector ETF and some banking companies over 2021, run:

PY

qb = QuantBook()
tickers = ["XLF", # Financial Select Sector SPDR Fund
"COF", # Capital One Financial Corporation
"GS", # Goldman Sachs Group, Inc.
"JPM", # J P Morgan Chase & Co
"WFC"] # Wells Fargo & Company
symbols = [qb.add_equity(ticker, Resolution.DAILY).symbol for ticker in tickers]
history = qb.history(symbols, datetime(2021, 1, 1), datetime(2022, 1, 1))
Create Candlestick Chart

You must import the plotting libraries and get some historical data to create candlestick charts.

In this example, you create a candlestick chart that shows the open, high, low, and close prices of one of the
banking securities. Follow these steps to create the candlestick chart:

1. Select a Symbol .

PY

symbol = symbols[0]

2. Slice the history DataFrame with the symbol .

PY

data = history.loc[symbol]

3. Divide the data into days with positive returns and days with negative returns.

PY

up_days = data[data['close'] > data['open']]
down_days = data[data['open'] > data['close']]

4. Call the figure function with a title, axis labels and x-axis type.

PY

plot = figure(title=f"{symbol} OHLC", x_axis_label='Date', y_axis_label='Price',


x_axis_type='datetime')

5. Call the segment method with the data timestamps, high prices, low prices, and a color.

PY

plot.segment(data.index, data['high'], data.index, data['low'], color="black")

This method call plots the candlestick wicks.

6. Call the vbar method for the up and down days with the data timestamps, open prices, close prices, and a color.

PY

width = 12*60*60*1000
plot.vbar(up_days.index, width, up_days['open'], up_days['close'],
fill_color="green", line_color="green")
plot.vbar(down_days.index, width, down_days['open'], down_days['close'],
fill_color="red", line_color="red")

This method call plots the candlestick bodies.

7. Call the show function.


PY

show(plot)

The Jupyter Notebook displays the candlestick chart.

Create Line Plot

You must import the plotting libraries and get some historical data to create line charts.

In this example, you create a line chart that shows the closing price for one of the banking securities. Follow these
steps to create the line chart:

1. Select a Symbol .

PY

symbol = symbols[0]

2. Slice the history DataFrame with the symbol and then select the close column.

PY

close_prices = history.loc[symbol]['close']

3. Call the figure function with a title, axis labels, and x-axis type.
PY

plot = figure(title=f"{symbol} Close Price", x_axis_label='Date', y_axis_label='Price',


x_axis_type='datetime')

4. Call the line method with the timestamps, close_prices , and some design settings.

PY

plot.line(close_prices.index, close_prices,
legend_label=symbol.value, color="blue", line_width=2)

5. Call the show function.

PY

show(plot)

The Jupyter Notebook displays the line plot.

Create Scatter Plot

You must import the plotting libraries and get some historical data to create scatter plots.

In this example, you create a scatter plot that shows the relationship between the daily returns of two banking
securities. Follow these steps to create the scatter plot:

1. Select 2 Symbol s.
For example, to select the Symbol s of the first 2 bank stocks, run:

PY

symbol1 = symbols[1]
symbol2 = symbols[2]

2. Slice the history DataFrame with each Symbol and then select the close column.

PY

close_price1 = history.loc[symbol1]['close']
close_price2 = history.loc[symbol2]['close']

3. Call the pct_change and dropna methods on each Series .

PY

daily_returns1 = close_price1.pct_change().dropna()
daily_returns2 = close_price2.pct_change().dropna()

4. Call the polyfit method with the daily_returns1 , daily_returns2 , and a degree.

PY

m, b = np.polyfit(daily_returns1, daily_returns2, deg=1)

This method call returns the slope and intercept of the ordinary least squares regression line.

5. Call the linspace method with the minimum and maximum values on the x-axis.

PY

x = np.linspace(daily_returns1.min(), daily_returns1.max())

6. Calculate the y-axis coordinates of the regression line.

PY

y = m*x + b

7. Call the figure function with a title and axis labels.

PY

plot = figure(title=f"{symbol1} vs {symbol2} Daily Return",


x_axis_label=symbol1.value, y_axis_label=symbol2.value)

8. Call the line method with x- and y-axis values, a color, and a line width.

PY

plot.line(x, y, color="red", line_width=2)

This method call plots the regression line.


9. Call the dot method with the daily_returns1 , daily_returns2 , and some design settings.

PY

plot.dot(daily_returns1, daily_returns2, size=20, color="navy", alpha=0.5)

This method call plots the scatter plot dots.

10. Call the show function.

PY

show(plot)

The Jupyter Notebook displays the scatter plot.

Create Histogram

You must import the plotting libraries and get some historical data to create histograms.

In this example, you create a histogram that shows the distribution of the daily percent returns of the bank sector ETF. In addition to the bins in the histogram, you overlay a normal distribution curve for comparison. Follow these steps to create the histogram:

1. Select the Symbol .


PY

symbol = symbols[0]

2. Slice the history DataFrame with the symbol and then select the close column.

PY

close_prices = history.loc[symbol]['close']

3. Call the pct_change method and then call the dropna method.

PY

daily_returns = close_prices.pct_change().dropna()

4. Call the histogram method with the daily_returns , the density argument enabled, and a number of bins.

PY

hist, edges = np.histogram(daily_returns, density=True, bins=20)

This method call returns the following objects:

hist : The values of the probability density function at each bin, normalized such that the integral over the range is 1.

edges : The x-axis values of the edges of each bin.

5. Call the figure function with a title and axis labels.

PY

plot = figure(title=f"{symbol} Daily Return Distribution",


x_axis_label='Return', y_axis_label='Frequency')

6. Call the quad method with the coordinates of the bins and some design settings.

PY

plot.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:],
          fill_color="navy", line_color="white", alpha=0.5)

This method call plots the histogram bins.

7. Call the mean and std methods.

PY

mean = daily_returns.mean()
std = daily_returns.std()

8. Call the linspace method with the lower limit, upper limit, and number of data points for the x-axis of the normal distribution curve.
PY

x = np.linspace(-3*std, 3*std, 1000)

9. Calculate the y-axis values of the normal distribution curve.

PY

pdf = 1/(std * np.sqrt(2*np.pi)) * np.exp(-(x-mean)**2 / (2*std**2))

10. Call the line method with the data and style of the normal distribution PDF curve.

PY

plot.line(x, pdf, color="red", line_width=4,
          alpha=0.7, legend_label="Normal Distribution PDF")

This method call plots the normal distribution PDF curve.

11. Call the show function.

PY

show(plot)

The Jupyter Notebook displays the histogram.

Create Bar Chart


You must import the plotting libraries and get some historical data to create bar charts.

In this example, you create a bar chart that shows the average daily percent return of the banking securities.
Follow these steps to create the bar chart:

1. Select the close column and then call the unstack method.

PY

close_prices = history['close'].unstack(level=0)

2. Call the pct_change method and then multiply by 100.

PY

daily_returns = close_prices.pct_change() * 100

3. Call the mean method.

PY

avg_daily_returns = daily_returns.mean()

4. Call the DataFrame constructor with the data Series and then call the reset_index method.

PY

avg_daily_returns = pd.DataFrame(avg_daily_returns, columns=['avg_return']).reset_index()

5. Call the figure function with a title, x-axis values, and axis labels.

PY

plot = figure(title='Banking Stocks Average Daily % Returns', x_range=avg_daily_returns['symbol'],
              x_axis_label='Stocks', y_axis_label='%')

6. Call the vbar method with the avg_daily_returns , x- and y-axis column names, and a bar width.

PY

plot.vbar(source=avg_daily_returns, x='symbol', top='avg_return', width=0.8)

7. Rotate the x-axis label and then call the show function.

PY

plot.xaxis.major_label_orientation = 0.6
show(plot)

The Jupyter Notebook displays the bar chart.


Create Heat Map

You must import the plotting libraries and get some historical data to create heat maps.

In this example, you create a heat map that shows the correlation between the daily returns of the banking
securities. Follow these steps to create the heat map:

1. Select the close column and then call the unstack method.

PY

close_prices = history['close'].unstack(level=0)

2. Call the pct_change method.

PY

daily_returns = close_prices.pct_change()

3. Call the corr method.

PY

corr_matrix = daily_returns.corr()

4. Set the index and columns of the corr_matrix to the ticker of each security and then set the name of the column and row indices.


PY

corr_matrix.index = corr_matrix.columns = [symbol.value for symbol in symbols]
corr_matrix.index.name = 'symbol'
corr_matrix.columns.name = "stocks"

5. Call the stack , rename , and reset_index methods.

PY

corr_matrix = corr_matrix.stack().rename("value").reset_index()

6. Call the figure function with a title, axis ticks, and some design settings.

PY

plot = figure(title=f"Banking Stocks and Bank Sector ETF Correlation Heat Map",
x_range=list(corr_matrix.symbol.drop_duplicates()),
y_range=list(corr_matrix.stocks.drop_duplicates()),
toolbar_location=None,
tools="",
x_axis_location="above")

7. Select a color palette and then call the LinearColorMapper constructor with the color palette, the minimum correlation, and the maximum correlation.

PY

colors = Category20c[len(corr_matrix.columns)]
mapper = LinearColorMapper(palette=colors, low=corr_matrix.value.min(),
high=corr_matrix.value.max())

8. Call the rect method with the correlation plot data and design settings.

PY

plot.rect(source=ColumnDataSource(corr_matrix),
x="stocks",
y="symbol",
width=1,
height=1,
line_color=None,
fill_color=transform('value', mapper))

9. Call the ColorBar constructor with the mapper, a location, and a BasicTicker.

PY

color_bar = ColorBar(color_mapper=mapper,
location=(0, 0),
ticker=BasicTicker(desired_num_ticks=len(colors)))

This snippet creates a color bar to represent the correlation coefficients of the heat map cells.

10. Call the add_layout method with the color_bar and a location.
PY

plot.add_layout(color_bar, 'right')

This method call plots the color bar to the right of the heat map.

11. Call the show function.

PY

show(plot)

The Jupyter Notebook displays the heat map.

Create Pie Chart

You must import the plotting libraries and get some historical data to create pie charts.

In this example, you create a pie chart that shows the weights of the banking securities in a portfolio if you allocate to them based on their inverse volatility. Follow these steps to create the pie chart:

1. Select the close column and then call the unstack method.

PY

close_prices = history['close'].unstack(level=0)
2. Call the pct_change method.

PY

daily_returns = close_prices.pct_change()

3. Call the var method, take the inverse, and then normalize the result.

PY

inverse_variance = 1 / daily_returns.var()
inverse_variance /= np.sum(inverse_variance) # Normalization
inverse_variance *= np.pi*2 # For a full circle circumference in radian

4. Call the DataFrame constructor with the inverse_variance Series and then call the reset_index method.

PY

inverse_variance = pd.DataFrame(inverse_variance, columns=["inverse variance"]).reset_index()

5. Add a color column to the inverse_variance DataFrame .

PY

inverse_variance['color'] = Category20c[len(inverse_variance.index)]

6. Call the figure function with a title.

PY

plot = figure(title=f"Banking Stocks and Bank Sector ETF Allocation")

7. Call the wedge method with design settings and the inverse_variance DataFrame .

PY

plot.wedge(x=0, y=1, radius=0.6, start_angle=cumsum('inverse variance', include_zero=True),
           end_angle=cumsum('inverse variance'), line_color="white", fill_color='color',
           legend_field='symbol', source=inverse_variance)

8. Call the show function.

PY

show(plot)

The Jupyter Notebook displays the pie chart.


Charting > Matplotlib

Charting
Matplotlib

Introduction

matplotlib is the most popular 2D charting library for Python. It allows you to easily create histograms, scatter plots, and various other charts. In addition, pandas is integrated with matplotlib, so you can seamlessly move between data manipulation and data visualization. This makes matplotlib great for quickly producing a chart to visualize your data.

Import Libraries

Follow these steps to import the libraries that you need:

1. Import the matplotlib , mplfinance , and numpy libraries.

PY

import matplotlib.pyplot as plt
import mplfinance
import numpy as np

2. Import, and then call, the register_matplotlib_converters method.

PY

from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()

Get Historical Data

Get some historical market data to produce the plots. For example, to get data for a bank sector ETF and some

banking companies over 2021, run:

PY

qb = QuantBook()
tickers = ["XLF", # Financial Select Sector SPDR Fund
"COF", # Capital One Financial Corporation
"GS", # Goldman Sachs Group, Inc.
"JPM", # J P Morgan Chase & Co
"WFC"] # Wells Fargo & Company
symbols = [qb.add_equity(ticker, Resolution.DAILY).symbol for ticker in tickers]
history = qb.history(symbols, datetime(2021, 1, 1), datetime(2022, 1, 1))

Create Candlestick Chart

You must import the plotting libraries and get some historical data to create candlestick charts.

In this example, you create a candlestick chart that shows the open, high, low, and close prices of one of the banking securities. Follow these steps to create the candlestick chart:

1. Select a Symbol .

PY

symbol = symbols[0]

2. Slice the history DataFrame with the symbol .

PY

data = history.loc[symbol]

3. Rename the columns.

PY

data.columns = ['Close', 'High', 'Low', 'Open', 'Volume']

4. Call the plot method with the data , chart type, style, title, y-axis label, and figure size.

PY

mplfinance.plot(data,
type='candle',
style='charles',
title=f'{symbol.value} OHLC',
ylabel='Price ($)',
figratio=(15, 10))

The Jupyter Notebook displays the candlestick chart.


Create Line Plot

You must import the plotting libraries and get some historical data to create line charts.

In this example, you create a line chart that shows the closing price for one of the banking securities. Follow these steps to create the line chart:

1. Select a Symbol .

PY

symbol = symbols[0]

2. Slice the history DataFrame with symbol and then select the close column.

PY

data = history.loc[symbol]['close']

3. Call the plot method with a title and figure size.

PY

data.plot(title=f"{symbol} Close Price", figsize=(15, 10));

The Jupyter Notebook displays the line plot.


Create Scatter Plot

You must import the plotting libraries and get some historical data to create scatter plots.

In this example, you create a scatter plot that shows the relationship between the daily returns of two banking securities. Follow these steps to create the scatter plot:

1. Select the 2 Symbol s.

For example, to select the Symbol s of the first 2 bank stocks, run:

PY

symbol1 = symbols[1]
symbol2 = symbols[2]

2. Slice the history DataFrame with each Symbol and then select the close column.

PY

close_price1 = history.loc[symbol1]['close']
close_price2 = history.loc[symbol2]['close']

3. Call the pct_change and dropna methods on each Series .

PY

daily_returns1 = close_price1.pct_change().dropna()
daily_returns2 = close_price2.pct_change().dropna()

4. Call the polyfit method with the daily_returns1 , daily_returns2 , and a degree.
PY

m, b = np.polyfit(daily_returns1, daily_returns2, deg=1)

This method call returns the slope and intercept of the ordinary least squares regression line.

5. Call the linspace method with the minimum and maximum values on the x-axis.

PY

x = np.linspace(daily_returns1.min(), daily_returns1.max())

6. Calculate the y-axis coordinates of the regression line.

PY

y = m*x + b

7. Call the plot method with the coordinates and color of the regression line.

PY

plt.plot(x, y, color='red')

8. In the same cell that you called the plot method, call the scatter method with the 2 daily return series.

PY

plt.scatter(daily_returns1, daily_returns2)

9. In the same cell that you called the scatter method, call the title , xlabel , and ylabel methods with a title
and axis labels.

PY

plt.title(f'{symbol1} vs {symbol2} daily returns Scatter Plot')
plt.xlabel(symbol1.value)
plt.ylabel(symbol2.value);

The Jupyter Notebook displays the scatter plot.


Create Histogram

You must import the plotting libraries and get some historical data to create histograms.

In this example, you create a histogram that shows the distribution of the daily percent returns of the bank sector ETF. In addition to the bins in the histogram, you overlay a normal distribution curve for comparison. Follow these steps to create the histogram:

1. Select the Symbol .

PY

symbol = symbols[0]

2. Slice the history DataFrame with the symbol and then select the close column.

PY

close_prices = history.loc[symbol]['close']

3. Call the pct_change method and then call the dropna method.

PY

daily_returns = close_prices.pct_change().dropna()

4. Call the mean and std methods.


PY

mean = daily_returns.mean()
std = daily_returns.std()

5. Call the linspace method with the lower limit, upper limit, and number of data points for the x-axis of the normal distribution curve.

PY

x = np.linspace(-3*std, 3*std, 1000)

6. Calculate the y-axis values of the normal distribution curve.

PY

pdf = 1/(std * np.sqrt(2*np.pi)) * np.exp(-(x-mean)**2 / (2*std**2))

7. Call the plot method with the data for the normal distribution curve.

PY

plt.plot(x, pdf, label="Normal Distribution")

8. In the same cell that you called the plot method, call the hist method with the daily return data and the

number of bins.

PY

plt.hist(daily_returns, bins=20)

9. In the same cell that you called the hist method, call the title , xlabel , and ylabel methods with a title and

the axis labels.

PY

plt.title(f'{symbol} Return Distribution')
plt.xlabel('Daily Return')
plt.ylabel('Count');

The Jupyter Notebook displays the histogram.


Create Bar Chart

You must import the plotting libraries and get some historical data to create bar charts.

In this example, you create a bar chart that shows the average daily percent return of the banking securities.
Follow these steps to create the bar chart:

1. Select the close column and then call the unstack method.

PY

close_prices = history['close'].unstack(level=0)

2. Call the pct_change method and then multiply by 100.

PY

daily_returns = close_prices.pct_change() * 100

3. Call the mean method.

PY

avg_daily_returns = daily_returns.mean()

4. Call the figure method with a figure size.

PY

plt.figure(figsize=(15, 10))
5. Call the bar method with the x-axis and y-axis values.

PY

plt.bar(avg_daily_returns.index, avg_daily_returns)

6. In the same cell that you called the bar method, call the title , xlabel , and ylabel methods with a title and

the axis labels.

PY

plt.title('Banking Stocks Average Daily % Returns')
plt.xlabel('Tickers')
plt.ylabel('%');

The Jupyter Notebook displays the bar chart.

Create Heat Map

You must import the plotting libraries and get some historical data to create heat maps.

In this example, you create a heat map that shows the correlation between the daily returns of the banking securities. Follow these steps to create the heat map:

1. Select the close column and then call the unstack method.

PY

close_prices = history['close'].unstack(level=0)
2. Call the pct_change method.

PY

daily_returns = close_prices.pct_change()

3. Call the corr method.

PY

corr_matrix = daily_returns.corr()

4. Call the imshow method with the correlation matrix, a color map, and an interpolation method.

PY

plt.imshow(corr_matrix, cmap='hot', interpolation='nearest')

5. In the same cell that you called the imshow method, call the title , xticks , and yticks , methods with a title

and the axis tick labels.

PY

plt.title('Banking Stocks and Bank Sector ETF Correlation Heat Map')
plt.xticks(np.arange(len(tickers)), labels=tickers)
plt.yticks(np.arange(len(tickers)), labels=tickers)

6. In the same cell that you called the imshow method, call the colorbar method.

PY

plt.colorbar();

The Jupyter Notebook displays the heat map.


Create Pie Chart

You must import the plotting libraries and get some historical data to create pie charts.

In this example, you create a pie chart that shows the weights of the banking securities in a portfolio if you allocate to them based on their inverse volatility. Follow these steps to create the pie chart:

1. Select the close column and then call the unstack method.

PY

close_prices = history['close'].unstack(level=0)

2. Call the pct_change method.

PY

daily_returns = close_prices.pct_change()

3. Call the var method and then take the inverse.

PY

inverse_variance = 1 / daily_returns.var()

4. Call the pie method with the inverse_variance Series , the plot labels, and a display format.
PY

plt.pie(inverse_variance, labels=inverse_variance.index, autopct='%1.1f%%')

5. In the same cell that you called the pie method, call the title method with a title.

PY

plt.title('Banking Stocks and Bank Sector ETF Allocation');

The Jupyter Notebook displays the pie chart.

Create 3D Chart

You must import the plotting libraries and get some historical data to create 3D charts.

In this example, you create a 3D chart that shows the price of a different asset on each axis, i.e., the price relationship of 3 different symbols. Follow these steps to create the 3D chart:

1. Select the asset to plot on each dimension.

PY

x, y, z = symbols[:3]

2. Select the close price series of each symbol.


PY

x_hist = history.loc[x].close
y_hist = history.loc[y].close
z_hist = history.loc[z].close

3. Construct the basic 3D plot layout.

PY

fig = plt.figure(figsize=(8, 8))
ax = fig.add_subplot(projection='3d')

4. Call the ax.scatter method with the 3 price series to plot the graph.

PY

ax.scatter(x_hist, y_hist, z_hist)

5. Update the x, y, and z axis labels.

PY

ax.set_xlabel(f"{x} Price")
ax.set_ylabel(f"{y} Price")
ax.set_zlabel(f"{z} Price")

6. Display the 3D chart. Note that you need to zoom the chart to avoid the z-axis being cut off.

PY

ax.set_box_aspect(None, zoom=0.85)
plt.show()

The Jupyter Notebook displays the 3D chart.


Charting > Plotly

Charting
Plotly

Introduction

plotly is an online charting tool with a Python API. It offers the ability to create rich and interactive graphs.

Import Libraries

Import the plotly library.

PY

import plotly.express as px
import plotly.graph_objects as go

Get Historical Data

Get some historical market data to produce the plots. For example, to get data for a bank sector ETF and some banking companies over 2021, run:

PY

qb = QuantBook()
tickers = ["XLF", # Financial Select Sector SPDR Fund
"COF", # Capital One Financial Corporation
"GS", # Goldman Sachs Group, Inc.
"JPM", # J P Morgan Chase & Co
"WFC"] # Wells Fargo & Company
symbols = [qb.add_equity(ticker, Resolution.DAILY).symbol for ticker in tickers]
history = qb.history(symbols, datetime(2021, 1, 1), datetime(2022, 1, 1))

Create Candlestick Chart

You must import the plotting libraries and get some historical data to create candlestick charts.

In this example, you create a candlestick chart that shows the open, high, low, and close prices of one of the banking securities. Follow these steps to create the candlestick chart:

1. Select a Symbol .

PY

symbol = symbols[0]

2. Slice the history DataFrame with the symbol .

PY

data = history.loc[symbol]
3. Call the Candlestick constructor with the time and open, high, low, and close price Series .

PY

candlestick = go.Candlestick(x=data.index,
open=data['open'],
high=data['high'],
low=data['low'],
close=data['close'])

4. Call the Layout constructor with a title and axes labels.

PY

layout = go.Layout(title=go.layout.Title(text=f'{symbol.value} OHLC'),
                   xaxis_title='Date',
                   yaxis_title='Price',
                   xaxis_rangeslider_visible=False)

5. Call the Figure constructor with the candlestick and layout .

PY

fig = go.Figure(data=[candlestick], layout=layout)

6. Call the show method.

PY

fig.show()

The Jupyter Notebook displays the candlestick chart.

Create Line Chart

You must import the plotting libraries and get some historical data to create line charts.
In this example, you create a line chart that shows the closing price for one of the banking securities. Follow these steps to create the line chart:

1. Select a Symbol .

PY

symbol = symbols[0]

2. Slice the history DataFrame with the symbol and then select the close column.

PY

data = history.loc[symbol]['close']

3. Call the DataFrame constructor with the data Series and then call the reset_index method.

PY

data = pd.DataFrame(data).reset_index()

4. Call the line method with data , the column names of the x- and y-axis in data , and the plot title.

PY

fig = px.line(data, x='time', y='close', title=f'{symbol} Close price')

5. Call the show method.

PY

fig.show()

The Jupyter Notebook displays the line chart.


Create Scatter Plot

You must import the plotting libraries and get some historical data to create scatter plots.

In this example, you create a scatter plot that shows the relationship between the daily returns of two banking securities. Follow these steps to create the scatter plot:

1. Select 2 Symbol s.

For example, to select the Symbol s of the first 2 bank stocks, run:

PY

symbol1 = symbols[1]
symbol2 = symbols[2]

2. Slice the history DataFrame with each Symbol and then select the close column.

PY

close_price1 = history.loc[symbol1]['close']
close_price2 = history.loc[symbol2]['close']

3. Call the pct_change and dropna methods on each Series .

PY

daily_return1 = close_price1.pct_change().dropna()
daily_return2 = close_price2.pct_change().dropna()

4. Call the scatter method with the 2 return Series , the trendline option, and axes labels.

PY

fig = px.scatter(x=daily_return1, y=daily_return2, trendline='ols',
                 labels={'x': symbol1.value, 'y': symbol2.value})

5. Call the update_layout method with a title.

PY

fig.update_layout(title=f'{symbol1.value} vs {symbol2.value} Daily % Returns');

6. Call the show method.

PY

fig.show()

The Jupyter Notebook displays the scatter plot.


Create Histogram

You must import the plotting libraries and get some historical data to create histograms.

In this example, you create a histogram that shows the distribution of the daily percent returns of the bank sector
ETF. Follow these steps to create the histogram:

1. Select the Symbol .

PY

symbol = symbols[0]

2. Slice the history DataFrame with the symbol and then select the close column.

PY

data = history.loc[symbol]['close']

3. Call the pct_change method and then call the dropna method.

PY

daily_returns = data.pct_change().dropna()

4. Call the DataFrame constructor with the data Series and then call the reset_index method.

PY

daily_returns = pd.DataFrame(daily_returns).reset_index()

5. Call the histogram method with the daily_returns DataFrame, the x-axis label, a title, and the number of

bins.
PY

fig = px.histogram(daily_returns, x='close',
                   title=f'{symbol} Daily Return of Close Price Distribution',
                   nbins=20)

6. Call the show method.

PY

fig.show()

The Jupyter Notebook displays the histogram.

Create Bar Chart

You must import the plotting libraries and get some historical data to create bar charts.

In this example, you create a bar chart that shows the average daily percent return of the banking securities.

Follow these steps to create the bar chart:

1. Select the close column and then call the unstack method.

PY

close_prices = history['close'].unstack(level=0)

2. Call the pct_change method and then multiply by 100.

PY

daily_returns = close_prices.pct_change() * 100

3. Call the mean method.


PY

avg_daily_returns = daily_returns.mean()

4. Call the DataFrame constructor with the avg_daily_returns Series and then call the reset_index method.

PY

avg_daily_returns = pd.DataFrame(avg_daily_returns, columns=["avg_daily_ret"]).reset_index()

5. Call the bar method with the avg_daily_returns and the axes column names.

PY

fig = px.bar(avg_daily_returns, x='symbol', y='avg_daily_ret')

6. Call the update_layout method with a title.

PY

fig.update_layout(title='Banking Stocks Average Daily % Returns');

7. Call the show method.

PY

fig.show()

The Jupyter Notebook displays the bar plot.

Create Heat Map

You must import the plotting libraries and get some historical data to create heat maps.
In this example, you create a heat map that shows the correlation between the daily returns of the banking securities. Follow these steps to create the heat map:

1. Select the close column and then call the unstack method.

PY

close_prices = history['close'].unstack(level=0)

2. Call the pct_change method.

PY

daily_returns = close_prices.pct_change()

3. Call the corr method.

PY

corr_matrix = daily_returns.corr()

4. Call the imshow method with the corr_matrix and the axes labels.

PY

fig = px.imshow(corr_matrix, x=tickers, y=tickers)

5. Call the update_layout method with a title.

PY

fig.update_layout(title='Banking Stocks and bank sector ETF Correlation Heat Map');

6. Call the show method.

PY

fig.show()

The Jupyter Notebook displays the heat map.


Create Pie Chart

You must import the plotting libraries and get some historical data to create pie charts.

In this example, you create a pie chart that shows the weights of the banking securities in a portfolio if you allocate to them based on their inverse volatility. Follow these steps to create the pie chart:

1. Select the close column and then call the unstack method.

PY

close_prices = history['close'].unstack(level=0)

2. Call the pct_change method.

PY

daily_returns = close_prices.pct_change()

3. Call the var method and then take the inverse.

PY

inverse_variance = 1 / daily_returns.var()

4. Call the DataFrame constructor with the inverse_variance Series and then call the reset_index method.

PY

inverse_variance = pd.DataFrame(inverse_variance, columns=["inverse variance"]).reset_index()

5. Call the pie method with the inverse_variance DataFrame , the column name of the values, and the column

name of the names.


PY

fig = px.pie(inverse_variance, values='inverse variance', names='symbol')

6. Call the update_layout method with a title.

PY

fig.update_layout(title='Asset Allocation of bank stocks and bank sector ETF');

7. Call the show method.

PY

fig.show()

The Jupyter Notebook displays the pie chart.

Create 3D Chart

You must import the plotting libraries and get some historical data to create 3D charts.

In this example, you create a 3D chart that shows the price of an asset on each dimension. Follow these steps to
create the 3D chart:

1. Select the asset to plot on each dimension.

PY

x, y, z = symbols[:3]

2. Call the Scatter3d constructor with the data for the x, y, and z axes.
PY

scatter = go.Scatter3d(
x=history.loc[x].close,
y=history.loc[y].close,
z=history.loc[z].close,
mode='markers',
marker=dict(
size=2,
opacity=0.8
)
)

3. Call the Layout constructor with the axes titles and chart dimensions.

PY

layout = go.Layout(
scene=dict(
xaxis_title=f'{x.value} Price',
yaxis_title=f'{y.value} Price',
zaxis_title=f'{z.value} Price'
),
width=700,
height=700
)

4. Call the Figure constructor with the scatter and layout variables.

PY

fig = go.Figure(scatter, layout)

5. Call the show method.

PY

fig.show()

The Jupyter Notebook displays the 3D chart.


Charting > Seaborn

Charting
Seaborn

Introduction

seaborn is a data visualization library based on matplotlib. It makes it easier to create complicated plots and produces more visually appealing charts than matplotlib alone.

Import Libraries

Follow these steps to import the libraries that you need:

1. Import the seaborn and matplotlib libraries.

PY

import seaborn as sns
import matplotlib.pyplot as plt

2. Import, and then call, the register_matplotlib_converters method.

PY

from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()

Get Historical Data

Get some historical market data to produce the plots. For example, to get data for a bank sector ETF and some
banking companies over 2021, run:

PY

qb = QuantBook()
tickers = ["XLF", # Financial Select Sector SPDR Fund
"COF", # Capital One Financial Corporation
"GS", # Goldman Sachs Group, Inc.
"JPM", # J P Morgan Chase & Co
"WFC"] # Wells Fargo & Company
symbols = [qb.add_equity(ticker, Resolution.DAILY).symbol for ticker in tickers]
history = qb.history(symbols, datetime(2021, 1, 1), datetime(2022, 1, 1))

Create Candlestick Chart

Seaborn does not currently support candlestick charts. Use one of the other plotting libraries to create candlestick charts.

Create Line Chart

You must import the plotting libraries and get some historical data to create line charts.
In this example, you create a line chart that shows the closing price for one of the banking securities. Follow these steps to create the chart:

1. Select a Symbol .

PY

symbol = symbols[0]

2. Slice the history DataFrame with the symbol and then select the close column.

PY

data = history.loc[symbol]['close']

3. Call the DataFrame constructor with the data Series and then call the reset_index method.

PY

data = pd.DataFrame(data).reset_index()

4. Call the lineplot method with the data Series and the column name of each axis.

PY

plot = sns.lineplot(data=data,
x='time',
y='close')

5. In the same cell that you called the lineplot method, call the set method with the y-axis label and a title.

PY

plot.set(ylabel="price", title=f"{symbol} Price Over Time");

The Jupyter Notebook displays the line chart.


Create Scatter Plot

You must import the plotting libraries and get some historical data to create scatter plots.

In this example, you create a scatter plot that shows the relationship between the daily returns of two banking securities. Follow these steps to create the scatter plot:

1. Select 2 Symbol s.

For example, to select the Symbol s of the first 2 bank stocks, run:

PY

symbol1 = symbols[1]
symbol2 = symbols[2]

2. Select the close column of the history DataFrame, call the unstack method, and then select the symbol1 and
symbol2 columns.

PY

close_prices = history['close'].unstack(0)[[symbol1, symbol2]]

3. Call the pct_change method and then call the dropna method.

PY

daily_returns = close_prices.pct_change().dropna()

4. Call the regplot method with the daily_returns DataFrame and the column names.
PY

plot = sns.regplot(data=daily_returns,
x=daily_returns.columns[0],
y=daily_returns.columns[1])

5. In the same cell that you called the regplot method, call the set method with the axis labels and a title.

PY

plot.set(xlabel=f'{daily_returns.columns[0]} % Returns',
ylabel=f'{daily_returns.columns[1]} % Returns',
title=f'{symbol1} vs {symbol2} Daily % Returns');

The Jupyter Notebook displays the scatter plot.

Create Histogram

You must import the plotting libraries and get some historical data to create histograms.

In this example, you create a histogram that shows the distribution of the daily percent returns of the bank sector ETF. Follow these steps to create the histogram:

1. Select the Symbol .

PY

symbol = symbols[0]

2. Slice the history DataFrame with the symbol and then select the close column.

PY

data = history.loc[symbol]['close']
3. Call the pct_change method and then call the dropna method.

PY

daily_returns = data.pct_change().dropna()

4. Call the DataFrame constructor with the daily_returns Series and then call the reset_index method.

PY

daily_returns = pd.DataFrame(daily_returns).reset_index()

5. Call the histplot method with the daily_returns , the close column name, and the number of bins.

PY

plot = sns.histplot(daily_returns, x='close', bins=20)

6. In the same cell that you called the histplot method, call the set method with the axis labels and a title.

PY

plot.set(xlabel='Return',
ylabel='Frequency',
title=f'{symbol} Daily Return of Close Price Distribution');

The Jupyter Notebook displays the histogram.

Create Bar Chart

You must import the plotting libraries and get some historical data to create bar charts.

In this example, you create a bar chart that shows the average daily percent return of the banking securities.

Follow these steps to create the bar chart:


1. Select the close column and then call the unstack method.

PY

close_prices = history['close'].unstack(level=0)

2. Call the pct_change method and then multiply by 100.

PY

daily_returns = close_prices.pct_change() * 100

3. Call the mean method.

PY

avg_daily_returns = daily_returns.mean()

4. Call the DataFrame constructor with the avg_daily_returns Series and then call the reset_index method.

PY

avg_daily_returns = pd.DataFrame(avg_daily_returns, columns=["avg_daily_ret"]).reset_index()

5. Call barplot method with the avg_daily_returns Series and the axes column names.

PY

plot = sns.barplot(data=avg_daily_returns, x='symbol', y='avg_daily_ret')

6. In the same cell that you called the barplot method, call the set method with the axis labels and a title.

PY

plot.set(xlabel='Tickers',
ylabel='%',
title='Banking Stocks Average Daily % Returns')

7. In the same cell that you called the set method, call the tick_params method to rotate the x-axis labels.

PY

plot.tick_params(axis='x', rotation=90)

The Jupyter Notebook displays the bar chart.


Create Heat Map

You must import the plotting libraries and get some historical data to create heat maps.

In this example, you create a heat map that shows the correlation between the daily returns of the banking securities. Follow these steps to create the heat map:

1. Select the close column and then call the unstack method.

PY

close_prices = history['close'].unstack(level=0)

2. Call the pct_change method.

PY

daily_returns = close_prices.pct_change()

3. Call the corr method.

PY

corr_matrix = daily_returns.corr()

4. Call the heatmap method with the corr_matrix and the annotation argument enabled.
PY

plot = sns.heatmap(corr_matrix, annot=True)

5. In the same cell that you called the heatmap method, call the set method with a title.

PY

plot.set(title='Bank Stocks and Bank Sector ETF Correlation Coefficients');

The Jupyter Notebook displays the heat map.

Create Pie Chart

You must import the plotting libraries and get some historical data to create pie charts.

In this example, you create a pie chart that shows the weights of the banking securities in a portfolio if you allocate

to them based on their inverse volatility. Follow these steps to create the pie chart:

1. Select the close column and then call the unstack method.

PY

close_prices = history['close'].unstack(level=0)

2. Call the pct_change method.

PY

daily_returns = close_prices.pct_change()

3. Call the var method and then take the inverse.


PY

inverse_variance = 1 / daily_returns.var()

4. Call the color_palette method with a palette name and then truncate the returned colors so that you have
one color for each security.

PY

colors = sns.color_palette('pastel')[:len(inverse_variance.index)]

5. Call the pie method with the security weights, labels, and colors.

PY

plt.pie(inverse_variance, labels=inverse_variance.index, colors=colors, autopct='%1.1f%%')

6. In the same cell that you called the pie method, call the title method with a title.

PY

plt.title('Banking Stocks and Bank Sector ETF Allocation');

The Jupyter Notebook displays the pie chart.


Charting > Plotly NET

Charting
Plotly NET

Introduction

Plotly.NET provides functions for generating and rendering plotly.js charts in .NET programming languages. Our

.NET interactive notebooks support its C# implementation.

Import Libraries

Follow these steps to import the libraries that you need:

1. Load the assembly files and data types in their own cell.
2. Load the necessary assembly files.

3. Import the QuantConnect , Plotly.NET , and Accord packages.

Get Historical Data

Get some historical market data to produce the plots. For example, you can request data for a bank sector ETF and
some banking companies over 2021.

Create Candlestick Chart

You must import the plotting libraries and get some historical data to create candlestick charts.

In this example, you create a candlestick chart that shows the open, high, low, and close prices of one of the

banking securities. Follow these steps to create the candlestick chart:

1. Select a Symbol .

2. Call the Chart2D.Chart.Candlestick constructor with the time and open, high, low, and close price

IEnumerable .

3. Call the Layout constructor and set the title , xaxis , and yaxis properties as the title and axes label

objects.

4. Assign the Layout to the chart.

5. Show the plot.

The Jupyter Notebook displays the candlestick chart.


Create Line Chart

You must import the plotting libraries and get some historical data to create line charts.

In this example, you create a line chart that shows the volume of a security. Follow these steps to create the chart:

1. Select a Symbol .

2. Call the Chart2D.Chart.Line constructor with the timestamps and volumes.

3. Create a Layout .

4. Assign the Layout to the chart.


5. Show the plot.

The Jupyter Notebook displays the line chart.


Create Scatter Plot

You must import the plotting libraries and get some historical data to create scatter plots.

In this example, you create a scatter plot that shows the relationship between the daily price of two securities.
Follow these steps to create the scatter plot:

1. Select two Symbol objects.

2. Call the Chart2D.Chart.Point constructor with the closing prices of both securities.

3. Create a Layout .
4. Assign the Layout to the chart.

5. Show the plot.

The Jupyter Notebook displays the scatter plot.


Create Heat Map

You must import the plotting libraries and get some historical data to create heat maps.

In this example, you create a heat map that shows the correlation between the daily returns of the banking
securities. Follow these steps to create the heat map:

1. Compute the daily returns.

2. Call the Measures.Correlation method.


3. Call the Plotly.NET.Chart2D.Chart.Heatmap constructor with the correlation matrix.

4. Create a Layout .

5. Assign the Layout to the chart.

6. Show the plot.

The Jupyter Notebook displays the heat map.


Create 3D Chart

You must import the plotting libraries and get some historical data to create 3D charts.

In this example, you create a 3D chart that plots the price of one asset on each axis, showing how the prices of 3
different symbols move together. Follow these steps to create the 3D chart:

1. Select three Symbol objects.

2. Call the Chart3D.Chart.Point3D constructor with the closing price series of each security.

3. Create a Layout to add title and axis labels.

4. Assign the Layout to the chart.


5. Show the plot.

The Jupyter Notebook displays the 3D chart.


Universes

Universes

Introduction

Universe selection is the process of selecting a basket of assets to research. Dynamic universe selection increases

diversification and decreases selection bias in your analysis.

Get Universe Data

Universes are data types. To get historical data for a universe, pass the universe data type to the universe_history
method. The returned object contains a universe data collection for each day. With this object, you can iterate
through each day and then iterate through that day's universe data objects to analyze the universe constituents.

For example, follow these steps to get the US Equity Fundamental data for a specific universe:

1. Create a QuantBook .

PY

qb = QuantBook()

2. Define a universe.

The following example defines a dynamic universe that contains the 10 Equities with the lowest PE ratios in
the market. To see all the Fundamental attributes you can use to define a filter function for a Fundamental

universe, see Data Point Attributes . To create the universe, call the add_universe method with the filter

function.

PY

def filter_function(fundamentals):
    sorted_by_pe_ratio = sorted(
        [f for f in fundamentals if not np.isnan(f.valuation_ratios.pe_ratio)],
        key=lambda fundamental: fundamental.valuation_ratios.pe_ratio
    )
    return [fundamental.symbol for fundamental in sorted_by_pe_ratio[:10]]

universe = qb.add_universe(filter_function)

3. Call the universe_history method with the universe, a start date, and an end date.

PY

universe_history = qb.universe_history(universe, datetime(2023, 11, 6), datetime(2023, 11, 13))

The end date argument is optional. If you omit it, the method returns Fundamental data between the start
date and the current day.


The universe_history method returns a Series where the multi-index is the universe Symbol and the time
when universe selection would occur in a backtest. Each row in the data column contains a list of
Fundamental objects.

4. Iterate through the Series to access the universe data.

PY

universe_history = universe_history.droplevel('symbol', axis=0)

for date, fundamentals in universe_history.items():
    for fundamental in fundamentals:
        symbol = fundamental.symbol
        price = fundamental.price
        if fundamental.has_fundamental_data:
            pe_ratio = fundamental.valuation_ratios.pe_ratio
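
If you want to analyze the constituents further, you can arrange the data into a pandas object. For example, the
following small extension of the preceding loop (an illustrative addition, not part of the original steps) collects
each selection day's PE ratios into a DataFrame:

PY

# Collect the PE ratio of every constituent on each selection day.
# Rows are selection times and columns are Symbols.
pe_by_day = pd.DataFrame({
    date: {f.symbol: f.valuation_ratios.pe_ratio for f in fundamentals}
    for date, fundamentals in universe_history.items()
}).T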

Available Universes

To get universe data for other types of universes, you usually just need to replace Fundamental in the preceding
code snippets with the universe data type. The following table shows the datasets that support universe selection
and their respective data types. For more information about universe selection with these datasets and the data
points you can use in the filter function, see the dataset's documentation.
Dataset Name                                  Universe Type(s)

US Fundamental Data                           Fundamental
US ETF Constituents                           ETFConstituentUniverse
Crypto Price Data                             CryptoUniverse
US Congress Trading                           QuiverQuantCongressUniverse
WallStreetBets                                QuiverWallStreetBetsUniverse
Corporate Buybacks                            SmartInsiderIntentionUniverse, SmartInsiderTransactionUniverse
Brain Sentiment Indicator                     BrainSentimentIndicatorUniverse
Brain ML Stock Ranking                        BrainStockRankingUniverse
Brain Language Metrics on Company Filings     BrainCompanyFilingLanguageMetricsUniverseAll, BrainCompanyFilingLanguageMetricsUniverse10K
CNBC Trading                                  QuiverCNBCsUniverse
US Government Contracts                       QuiverGovernmentContractUniverse
Corporate Lobbying                            QuiverLobbyingUniverse
Insider Trading                               QuiverInsiderTradingUniverse

To get universe data for Futures and Options, use the future_history and option_history methods,

respectively.
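
For example, the following minimal sketch requests the historical universe data of E-mini S&P 500 Futures
contracts. The get_all_data call is an assumption based on the FutureHistory helper object that the method
returns:

PY

qb = QuantBook()
future = qb.add_future(Futures.Indices.SP_500_E_MINI)
# Request the Futures universe history over one week.
future_history = qb.future_history(future.symbol, datetime(2023, 12, 1), datetime(2023, 12, 8))
# Flatten the result into a DataFrame of all the contracts' data (assumed helper method).
history_df = future_history.get_all_data()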
Indicators

Indicators

Indicators let you analyze market data in an abstract form rather than in its raw form. For example, indicators like

the RSI tell you, based on price and volume data, if the market is overbought or oversold. Because indicators can

extract overall market trends from price data, sometimes, you may want to look for correlations between indicators
and the market, instead of between raw price data and the market. To view all of the indicators and candlestick

patterns we provide, see the Supported Indicators .
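
For example, the following minimal sketch (using the indicator helper method that the pages in this chapter cover
in detail) produces a timeseries of RSI values that you can compare against future returns:

PY

qb = QuantBook()
symbol = qb.add_equity("SPY").symbol
# Request a DataFrame with a date index and one column per RSI property.
rsi_dataframe = qb.indicator(RelativeStrengthIndex(14), symbol, 100, Resolution.DAILY)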

Data Point Indicators

Indicators that process IndicatorDataPoint objects

Bar Indicators

Indicators that process Bar objects

Trade Bar Indicators

Indicators that process TradeBar objects

Combining Indicators

Chain indicators together

Custom Indicators

Create your own

Custom Resolutions

Beyond the standard resolutions

See Also

Key Concepts
Indicators > Data Point Indicators

Indicators
Data Point Indicators

Introduction

This page explains how to create, update, and visualize LEAN data-point indicators.

Create Subscriptions

You need to subscribe to some market data in order to calculate indicator values.

PY

qb = QuantBook()
symbol = qb.add_equity("SPY").symbol

Create Indicator Timeseries

You need to subscribe to some market data and create an indicator in order to calculate a timeseries of indicator

values. In this example, use a 20-period 2-standard-deviation BollingerBands indicator.

PY

bb = BollingerBands(20, 2)

You can create the indicator timeseries with the Indicator helper method or you can manually create the

timeseries.

Indicator Helper Method

To create an indicator timeseries with the helper method, call the Indicator method.

PY

# Create a dataframe with a date index, and columns are indicator values.
bb_dataframe = qb.indicator(bb, symbol, 50, Resolution.DAILY)

Manually Create the Indicator Timeseries

Follow these steps to manually create the indicator timeseries:


1. Get some historical data .

PY

# Request historical trading data with the daily resolution.


history = qb.history[TradeBar](symbol, 70, Resolution.DAILY)

2. Set the indicator window.size for each attribute of the indicator to hold their values.

PY

# Set the window.size to the desired timeseries length


bb.window.size=50
bb.lower_band.window.size=50
bb.middle_band.window.size=50
bb.upper_band.window.size=50
bb.band_width.window.size=50
bb.percent_b.window.size=50
bb.standard_deviation.window.size=50
bb.price.window.size=50

3. Iterate through the historical market data and update the indicator.

PY

for bar in history:
    bb.update(bar.end_time, bar.close)

4. Populate a DataFrame with the data in the Indicator object.

PY

bb_dataframe = pd.DataFrame({
"current": pd.Series({x.end_time: x.value for x in bb}),
"lowerband": pd.Series({x.end_time: x.value for x in bb.lower_band}),
"middleband": pd.Series({x.end_time: x.value for x in bb.middle_band}),
"upperband": pd.Series({x.end_time: x.value for x in bb.upper_band}),
"bandwidth": pd.Series({x.end_time: x.value for x in bb.band_width}),
"percentb": pd.Series({x.end_time: x.value for x in bb.percent_b}),
"standarddeviation": pd.Series({x.end_time: x.value for x in bb.standard_deviation}),
"price": pd.Series({x.end_time: x.value for x in bb.price})
}).sort_index()

Plot Indicators

You need to create an indicator timeseries to plot the indicator values.

Follow these steps to plot the indicator values:


1. Select the columns/features to plot.

PY

bb_plot = bb_dataframe[["upperband", "middleband", "lowerband", "price"]]

2. Call the plot method.

PY

bb_plot.plot(figsize=(15, 10), title="SPY BB(20,2)")

3. Show the plots.

PY

plt.show()
Indicators > Bar Indicators

Indicators
Bar Indicators

Introduction

This page explains how to create, update, and visualize LEAN bar indicators.

Create Subscriptions

You need to subscribe to some market data in order to calculate indicator values.

PY

qb = QuantBook()
symbol = qb.add_equity("SPY").symbol

Create Indicator Timeseries

You need to subscribe to some market data and create an indicator in order to calculate a timeseries of indicator

values. In this example, use a 20-period AverageTrueRange indicator.

PY

atr = AverageTrueRange(20)

You can create the indicator timeseries with the Indicator helper method or you can manually create the

timeseries.

Indicator Helper Method

To create an indicator timeseries with the helper method, call the Indicator method.

PY

# Create a dataframe with a date index, and columns are indicator values.
atr_dataframe = qb.indicator(atr, symbol, 50, Resolution.DAILY)

Manually Create the Indicator Timeseries

Follow these steps to manually create the indicator timeseries:


1. Get some historical data .

PY

# Request historical trading data with the daily resolution.


history = qb.history[TradeBar](symbol, 70, Resolution.DAILY)

2. Set the indicator window.size for each attribute of the indicator to hold their values.

PY

# Set the window.size to the desired timeseries length


atr.window.size = 50
atr.true_range.window.size = 50

3. Iterate through the historical market data and update the indicator.

PY

for bar in history:
    atr.update(bar)

4. Populate a DataFrame with the data in the Indicator object.

PY

atr_dataframe = pd.DataFrame({
"current": pd.Series({x.end_time: x.value for x in atr}),
"truerange": pd.Series({x.end_time: x.value for x in atr.true_range})
}).sort_index()

Plot Indicators

You need to create an indicator timeseries to plot the indicator values.

Follow these steps to plot the indicator values:

1. Call the plot method.

PY

atr_dataframe.plot(title="SPY ATR(20)", figsize=(15, 10))

2. Show the plots.


PY

plt.show()
Indicators > Trade Bar Indicators

Indicators
Trade Bar Indicators

Introduction

This page explains how to create, update, and visualize LEAN TradeBar indicators.

Create Subscriptions

You need to subscribe to some market data in order to calculate indicator values.

PY

qb = QuantBook()
symbol = qb.add_equity("SPY").symbol

Create Indicator Timeseries

You need to subscribe to some market data and create an indicator in order to calculate a timeseries of indicator

values. In this example, use a 20-period VolumeWeightedAveragePriceIndicator indicator.

PY

vwap = VolumeWeightedAveragePriceIndicator(20)

You can create the indicator timeseries with the Indicator helper method or you can manually create the

timeseries.

Indicator Helper Method

To create an indicator timeseries with the helper method, call the Indicator method.

PY

# Create a dataframe with a date index, and columns are indicator values.
vwap_dataframe = qb.indicator(vwap, symbol, 50, Resolution.DAILY)

Manually Create the Indicator Timeseries

Follow these steps to manually create the indicator timeseries:


1. Get some historical data .

PY

# Request historical trading data with the daily resolution.


history = qb.history[TradeBar](symbol, 70, Resolution.DAILY)

2. Set the indicator window.size for each attribute of the indicator to hold their values.

PY

# Set the window.size to the desired timeseries length


vwap.window.size = 50

3. Iterate through the historical market data and update the indicator.

PY

for bar in history:
    vwap.update(bar)

4. Populate a DataFrame with the data in the Indicator object.

PY

vwap_dataframe = pd.DataFrame({
"current": pd.Series({x.end_time: x.value for x in vwap}))
}).sort_index()

Plot Indicators

Follow these steps to plot the indicator values:

1. Call the plot method.

PY

vwap_dataframe.plot(title="SPY VWAP(20)", figsize=(15, 10))

2. Show the plots.

PY

plt.show()
Indicators > Combining Indicators

Indicators
Combining Indicators

Introduction

This page explains how to create, update, and visualize LEAN Composite indicators.

Create Subscriptions

You need to subscribe to some market data in order to calculate indicator values.

PY

qb = QuantBook()
symbol = qb.add_equity("SPY").symbol

Create Indicator Timeseries

You need to subscribe to some market data and create a composite indicator in order to calculate a timeseries of

indicator values. In this example, use a 10-period SimpleMovingAverage of a 10-period RelativeStrengthIndex

indicator.

PY

# Create 10-period RSI and 10-period SMA indicator objects.


rsi = RelativeStrengthIndex(10)
sma = SimpleMovingAverage(10)
# Create a composite indicator by feeding the value of 10-period RSI to the 10-period SMA indicator.
sma_of_rsi = IndicatorExtensions.of(sma, rsi)

Follow these steps to create an indicator timeseries:

1. Get some historical data .

PY

# Request historical trading data with the daily resolution.


history = qb.history[TradeBar](symbol, 70, Resolution.DAILY)

2. Create a RollingWindow for each attribute of the indicator to hold their values.

In this example, save 50 data points.

PY

# Create a window dictionary to store RollingWindow objects.


window = {}
# Store the RollingWindow objects, index by key is the property of the indicator.
window['time'] = RollingWindow[DateTime](50)
window["SMA Of RSI"] = RollingWindow[float](50)
window["rollingsum"] = RollingWindow[float](50)
3. Attach a handler method to the indicator that updates the RollingWindow objects.

PY

# Define an update function to add the indicator values to the RollingWindow objects.
def update_sma_of_rsi_window(sender: object, updated: IndicatorDataPoint) -> None:
    indicator = sender
    window['time'].add(updated.end_time)
    window["SMA Of RSI"].add(updated.value)
    window["rollingsum"].add(indicator.rolling_sum.current.value)

sma_of_rsi.updated += update_sma_of_rsi_window

When the indicator receives new data, the preceding handler method adds the new IndicatorDataPoint

values into the respective RollingWindow .

4. Iterate through the historical market data to update the indicators and the RollingWindow objects.

PY

for bar in history:
    # Update the base indicator; the composite indicator updates automatically when the base indicator is updated.
    rsi.update(bar.end_time, bar.close)

5. Populate a DataFrame with the data in the RollingWindow objects.

PY

sma_of_rsi_dataframe = pd.DataFrame(window).set_index('time')

Plot Indicators

Follow these steps to plot the indicator values:

1. Select the columns/features to plot.

PY

sma_of_rsi_plot = sma_of_rsi_dataframe[["SMA Of RSI"]]

2. Call the plot method.

PY

sma_of_rsi_plot.plot(title="SPY SMA(10) of RSI(10)", figsize=(15, 10))


3. Show the plots.

PY

plt.show()
Indicators > Custom Indicators

Indicators
Custom Indicators

Introduction

This page explains how to create and update custom indicators.

Create Subscriptions

You need to subscribe to some market data in order to calculate indicator values.

PY

qb = QuantBook()
symbol = qb.add_equity("SPY").symbol

Create Indicator Timeseries

You need to subscribe to some market data in order to calculate a timeseries of indicator values.

Follow these steps to create an indicator timeseries:

1. Get some historical data .

PY

# Request historical trading data with the daily resolution.


history = qb.history[TradeBar](symbol, 70, Resolution.DAILY)

2. Define a custom indicator class. Note that the PythonIndicator superclass inheritance, the value attribute, and
the update method are mandatory.

In this tutorial, create an ExpectedShortfallPercent indicator that calculates the expected shortfall of returns
from a trailing window of historical returns.


PY

import math
import numpy as np

class ExpectedShortfallPercent(PythonIndicator):

    def __init__(self, period, alpha):
        self.value = None   # Attribute that represents the indicator value.
        self.value_at_risk = None

        self.alpha = alpha
        self.window = RollingWindow[float](period)

    # Override the is_ready attribute to flag when all attribute values are ready.
    @property
    def is_ready(self) -> bool:
        return self.value and self.value_at_risk

    # Method to update the indicator values. Note that it only receives one IBaseData object (Tick, TradeBar, or QuoteBar) argument.
    def update(self, input: BaseData) -> bool:
        count = self.window.count
        self.window.add(input.close)

        # Update the value and the other attributes with the indicator's current values.
        if count >= 2:
            cutoff = math.ceil(self.alpha * count)
            ret = [(self.window[i] - self.window[i+1]) / self.window[i+1] for i in range(count-1)]
            lowest = sorted(ret)[:cutoff]

            self.value = np.mean(lowest)
            self.value_at_risk = lowest[-1]

        # Return a boolean to indicate is_ready.
        return count >= 2

3. Initialize a new instance of the custom indicator.

PY

custom = ExpectedShortfallPercent(50, 0.05)

4. Create a RollingWindow for each attribute of the indicator to hold their values.

In this example, save 20 data points.

PY

# Create a window dictionary to store RollingWindow objects.


window = {}
# Store the RollingWindow objects, index by key is the property of the indicator.
window['time'] = RollingWindow[DateTime](20)
window['expectedshortfall'] = RollingWindow[float](20)
window['valueatrisk'] = RollingWindow[float](20)

5. The updated event handler isn't available for custom indicators in Python, so there is no handler method to
attach. Instead, add the new values to the RollingWindow objects as you update the indicator in the next step.

6. Iterate through the historical market data and update the indicator.
PY

for bar in history:
    custom.update(bar)

    # The updated event handler is not available for custom indicators in Python, so update the RollingWindows here.
    if custom.is_ready:
        window['time'].add(bar.end_time)
        window['expectedshortfall'].add(custom.value)
        window['valueatrisk'].add(custom.value_at_risk)

7. Populate a DataFrame with the data in the RollingWindow objects.

PY

custom_dataframe = pd.DataFrame(window).set_index('time')

Plot Indicators

Follow these steps to plot the indicator values:

1. Call the plot method.

PY

custom_dataframe.plot()

2. Show the plot.

PY

plt.show()
Indicators > Custom Resolutions

Indicators
Custom Resolutions

Introduction

This page explains how to create and update indicators with data of a custom resolution.

Create Subscriptions

You need to subscribe to some market data in order to calculate indicator values.

PY

qb = QuantBook()
symbol = qb.add_equity("SPY").symbol

Create Indicator Timeseries

You need to subscribe to some market data and create an indicator in order to calculate a timeseries of indicator

values.

Follow these steps to create an indicator timeseries:

1. Get some historical data .

PY

# Request historical trading data with the daily resolution.


history = qb.history[TradeBar](symbol, 70, Resolution.DAILY)

2. Create a data-point indicator.

In this example, use a 20-period 2-standard-deviation BollingerBands indicator.

PY

bb = BollingerBands(20, 2)

3. Create a RollingWindow for each attribute of the indicator to hold their values.
PY

# Create a window dictionary to store RollingWindow objects.


window = {}
# Store the RollingWindow objects, index by key is the property of the indicator.
window['time'] = RollingWindow[DateTime](50)
window["bollingerbands"] = RollingWindow[float](50)
window["lowerband"] = RollingWindow[float](50)
window["middleband"] = RollingWindow[float](50)
window["upperband"] = RollingWindow[float](50)
window["bandwidth"] = RollingWindow[float](50)
window["percentb"] = RollingWindow[float](50)
window["standarddeviation"] = RollingWindow[float](50)
window["price"] = RollingWindow[float](50)

4. Attach a handler method to the indicator that updates the RollingWindow objects.

PY

# Define an update function to add the indicator values to the RollingWindow objects.
def update_bollinger_band_window(sender: object, updated: IndicatorDataPoint) -> None:
    indicator = sender
    window['time'].add(updated.end_time)
    window["bollingerbands"].add(updated.value)
    window["lowerband"].add(indicator.lower_band.current.value)
    window["middleband"].add(indicator.middle_band.current.value)
    window["upperband"].add(indicator.upper_band.current.value)
    window["bandwidth"].add(indicator.band_width.current.value)
    window["percentb"].add(indicator.percent_b.current.value)
    window["standarddeviation"].add(indicator.standard_deviation.current.value)
    window["price"].add(indicator.price.current.value)

bb.updated += update_bollinger_band_window

When the indicator receives new data, the preceding handler method adds the new IndicatorDataPoint
values into the respective RollingWindow .

5. Create a TradeBarConsolidator to consolidate data into the custom resolution.

PY

consolidator = TradeBarConsolidator(timedelta(days=7))

6. Attach a handler method to feed data into the consolidator and updates the indicator with the consolidated

bars.

PY

def on_data_consolidated(sender, consolidated):
    bb.update(consolidated.end_time, consolidated.close)

consolidator.data_consolidated += on_data_consolidated

When the consolidator receives 7 days of data, it emits a 7-day TradeBar and the handler updates the
indicator.

7. Iterate through the historical market data and update the indicator.
PY

for bar in history:
    consolidator.update(bar)

8. Populate a DataFrame with the data in the RollingWindow objects.

PY

bb_dataframe = pd.DataFrame(window).set_index('time')

Plot Indicators

Follow these steps to plot the indicator values:

1. Select the columns to plot.

PY

df = bb_dataframe[['lowerband', 'middleband', 'upperband', 'price']]

2. Call the plot method.

PY

df.plot()

3. Show the plot.

PY

plt.show()
Object Store

Object Store

Introduction

The Object Store is a file system that you can use in your algorithms to save, read, and delete data. The Object

Store is organization-specific, so you can save or read data from the same Object Store in all of your

organization's projects. The Object Store works like a key-value storage system where you can store regular

strings, JSON encoded strings, XML encoded strings, and bytes. You can access the data you store in the Object
Store from backtests, the Research Environment, and live algorithms.

Get All Stored Data

To get all of the keys and values in the Object Store, iterate through the object_store property.

PY

for kvp in qb.object_store:
    key = kvp.key
    value = kvp.value

To iterate through just the keys in the Object Store, iterate through the keys property.

PY

for key in qb.object_store.keys:
    continue

Create Sample Data

You need some data to store data in the Object Store.

Follow these steps to create some sample data:

1. Create a string .

PY

string_sample = "My string"

2. Create a Bytes object.

PY

bytes_sample = str.encode("My String")
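
3. You can also create a JSON-encoded string with the standard library json module (an illustrative addition;
you save and read it like any other string).

PY

import json
# Encode a dictionary as a JSON string so you can store it with the save or save_string method.
json_sample = json.dumps({"symbol": "SPY", "weight": 0.5})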

Save Data

The Object Store saves objects under a key-value system. If you save objects in backtests, you can access them
from the Research Environment.
If you run algorithms in QuantConnect Cloud, you need storage create permissions to save data in the Object

Store.

If you don't have data to store, create some sample data .

You can save Bytes and string objects in the Object Store.

Strings

To save a string object, call the save or save_string method.

PY

save_successful = qb.object_store.save(f"{qb.project_id}/string_key", string_sample)

Bytes

To save a Bytes object (for example, zipped data), call the save_bytes method.

PY

save_successful = qb.object_store.save_bytes(f"{qb.project_id}/bytes_key", bytes_sample)

zipped_data_sample = Compression.zip_bytes(bytes(string_sample, "utf-8"), "data")


zip_save_successful = qb.object_store.save_bytes(f"{qb.project_id}/bytesKey.zip", zipped_data_sample)

Read Data

To read data from the Object Store, you need to provide the key you used to store the object.

You can load Bytes and string objects from the Object Store.

Before you read data from the Object Store, check if the key exists.

PY

if qb.object_store.contains_key(key):
    pass  # Read the data here.

Strings

To read a string object, call the read or read_string method.

PY

string_data = qb.object_store.read(f"{qb.project_id}/string_key")

Bytes

To read a Bytes object, call the read_bytes method.

PY

byte_data = qb.object_store.read_bytes(f"{qb.project_id}/bytes_key")
Delete Data

Delete objects in the Object Store to remove objects that you no longer need. If you use the Research Environment

in QuantConnect Cloud, you need storage delete permissions to delete data from the Object Store.

To delete objects from the Object Store, call the delete method. Before you delete data, check if the key exists. If

you try to delete an object with a key that doesn't exist in the Object Store, the method raises an exception.

PY

if qb.object_store.contains_key(key):
    qb.object_store.delete(key)

To delete all of the content in the Object Store, iterate through all the stored data.

PY

for kvp in qb.object_store:
    qb.object_store.delete(kvp.key)

Cache Data

When you write to or read from the Object Store, the notebook caches the data. The cache speeds up the

notebook execution because if you try to read the Object Store data again with the same key, it returns the cached
data instead of downloading the data again. The cache speeds up execution, but it can cause problems if you are

trying to share data between two nodes under the same Object Store key. For example, consider the following

scenario:

1. You open project A and save data under the key 123 .
2. You open project B and save new data under the same key 123 .

3. In project A, you read the Object Store data under the key 123 , expecting the data from project B, but you get

the original data you saved in step #1 instead.

You get the data from step 1 instead of step 2 because the cache contains the data from step 1.

To clear the cache, call the clear method.

PY

qb.object_store.clear()

Get File Path

To get the file path for a specific key in the Object Store, call the get_file_path method. If the key you pass to the

method doesn't already exist in the Object Store, it's added to the Object Store.

PY

file_path = qb.object_store.get_file_path(key)

Storage Quotas
If you use the Research Environment locally, you can store as much data as your hardware will allow. If you use

the Research Environment in QuantConnect Cloud, you must stay within your storage quota . If you need more

storage space, edit your storage plan .

Example for DataFrames

Follow these steps to create a DataFrame, save it into the Object Store, and load it from the Object Store:

1. Get some historical data.

PY

spy = qb.add_equity("SPY").symbol
df = qb.history(qb.securities.keys, 360, Resolution.DAILY)

2. Get the file path for a specific key in the Object Store.

PY

file_path = qb.object_store.get_file_path("df_to_csv")

3. Call the to_csv method to save the DataFrame in the Object Store as a CSV file.

PY

df.to_csv(file_path) # File size: 32721 bytes

4. Call the read_csv method to load the CSV file from the Object Store.

PY

reread = pd.read_csv(file_path)

pandas supports saving and loading DataFrame objects in the following additional formats:

XML

PY

file_path = qb.object_store.get_file_path("df_to_xml")
df.to_xml(file_path) # File size: 87816 bytes
reread = pd.read_xml(file_path)

JSON

PY

file_path = qb.object_store.get_file_path("df_to_json")
df.to_json(file_path) # File size: 125250 bytes
reread = pd.read_json(file_path)

Parquet
PY

file_path = qb.object_store.get_file_path("df_to_parquet")
df.to_parquet(file_path) # File size: 23996 bytes
reread = pd.read_parquet(file_path)

Pickle

PY

file_path = qb.object_store.get_file_path("df_to_pickle")
df.to_pickle(file_path) # File size: 19868 bytes
reread = pd.read_pickle(file_path)

Example for Plotting

You can use the Object Store to plot data from your backtests and live algorithm in the Research Environment. The
following example demonstrates how to plot a Simple Moving Average indicator that's generated during a backtest.

1. Create an algorithm, add a data subscription, and add a simple moving average indicator.

PY

class ObjectStoreChartingAlgorithm(QCAlgorithm):
    def initialize(self):
        self.add_equity("SPY")

        self.content = ''
        self._sma = self.sma("SPY", 22)

The algorithm will save self.content to the Object Store.

2. Save the indicator data as string in self.content .

PY

def on_data(self, data: Slice):
    self.plot('SMA', 'Value', self._sma.current.value)
    self.content += f'{self._sma.current.end_time},{self._sma.current.value}\n'

3. In the on_end_of_algorithm method, save the indicator data to the Object Store.

PY

def on_end_of_algorithm(self):
    self.object_store.save('sma_values_python', self.content)

4. Open the Research Environment and create a QuantBook .

PY

qb = QuantBook()

5. Read the indicator data from the Object Store.


PY

content = qb.object_store.read("sma_values_python")

The key you provide must be the same key you used to save the object.

6. Convert the data to a pandas object and create a chart.

PY

data = {}
for line in content.split('\n'):
    csv = line.split(',')
    if len(csv) > 1:
        data[csv[0]] = float(csv[1])

series = pd.Series(data, index=data.keys())
series.plot()



Machine Learning

Machine Learning

Machine Learning > Key Concepts

Machine Learning
Key Concepts

Introduction

Machine learning is a field of study that combines statistics and computer science to build intelligent systems that

predict outcomes. Quant researchers commonly use machine learning models to optimize portfolios, make trading

signals, and manage risk. These models can find relationships in datasets that humans struggle to find, are subtle,

or are too complex. You can use machine learning techniques in your research notebooks.

Supported Libraries

The following machine learning libraries are supported, each with its own research tutorial and documentation:

Keras, TensorFlow, Scikit-Learn, hmmlearn, gplearn, PyTorch, Stable Baselines, tslearn, and XGBoost

Add New Libraries

To request a new library, contact us . We will add the library to the queue for review and deployment. Since the

libraries run on our servers, we need to ensure they are secure and won't cause harm. The process of adding new
libraries takes 2-4 weeks to complete. View the list of libraries currently under review on the Issues list of the Lean
GitHub repository .

Transfer Models

You can load machine learning models from the Object Store or a custom data file like pickle. If you train a model in
the Research Environment, you can also save it into the Object Store to transfer it to the backtesting and live

trading environment.
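
For example, the following minimal sketch (assuming model is a trained object that the joblib serializer supports,
such as a scikit-learn model) saves it under a key that a backtest can later use to load it:

PY

import joblib
# Serialize the trained model to the Object Store file path for its key.
model_key = "my_model"
file_name = qb.object_store.get_file_path(model_key)
joblib.dump(model, file_name)

# Later, in the backtesting or live trading environment, load it back with the same key.
loaded_model = joblib.load(qb.object_store.get_file_path(model_key))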
Machine Learning > Popular Libraries

Machine Learning
Popular Libraries

These are examples of using some of the most common machine learning libraries in an algorithm. Click one to

learn more.

Aesara

GPlearn

Hmmlearn

Keras

PyTorch

Scikit-Learn

Stable Baselines

TensorFlow

Tslearn

XGBoost
Machine Learning > Popular Libraries > Aesara

Popular Libraries
Aesara

Introduction

This page explains how to build, train, test, and store Aesara models.

Import Libraries

Import the aesara and sklearn libraries.

PY

import aesara
import aesara.tensor as at
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
import joblib

You need the joblib library to store models.

Get Historical Data

Get some historical market data to train and test the model. For example, to get data for the SPY ETF during 2020

and 2021, run:

PY

qb = QuantBook()
symbol = qb.add_equity("SPY", Resolution.DAILY).symbol
history = qb.history(symbol, datetime(2020, 1, 1), datetime(2022, 1, 1)).loc[symbol]

Prepare Data

You need some historical data to prepare the data for the model. If you have historical data, manipulate it to train

and test the model. In this example, use the following features and labels:

Data Category    Description
Features         Normalized close price of the SPY over the last 5 days
Labels           Return direction of the SPY over the next day

Each label is measured one day after the window of feature observations.
Follow these steps to prepare the data:

1. Obtain the close price and return direction series.

PY

lookback = 5  # Length of the feature window
close = history['close']
returns = close.pct_change().shift(-1)[lookback*2-1:-1].reset_index(drop=True)
labels = pd.Series([1 if y > 0 else 0 for y in returns]) # Binary class labels

2. Loop through the close Series and collect the features.

PY

lookback_series = []
for i in range(1, lookback + 1):
    df = close.shift(i)[lookback:-1]
    df.name = f"close-{i}"
    lookback_series.append(df)
X = pd.concat(lookback_series, axis=1)
# Normalize using the 5-day interval
X = MinMaxScaler().fit_transform(X.T).T[4:]

3. Convert the features and labels into numpy arrays.

PY

X = np.array(X)
y = np.array(labels)

4. Split the data into training and testing periods.

PY

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

Train Models
You need to prepare the historical data for training before you train the model. If you have prepared the data, build
and train the model. In this example, build a Logistic Regression model with log loss cross entropy and square error

as cost function. Follow these steps to create the model:

1. Generate a dataset.

PY

# D = (input_values, target_class)
D = (np.array(X_train), np.array(y_train))

2. Initialize variables.

PY

# Declare Aesara symbolic variables


x = at.dmatrix("x")
y = at.dvector("y")

# Initialize the weight vector w randomly, using shared so the model coefficients
# keep their values between training iterations (updates)
rng = np.random.default_rng(100)
w = aesara.shared(rng.standard_normal(X.shape[1]), name="w")

# initialize the bias term


b = aesara.shared(0., name="b")

3. Construct the model graph.

PY

# Construct Aesara expression graph


p_1 = 1 / (1 + at.exp(-at.dot(x, w) - b))            # Logistic transformation
prediction = p_1 > 0.5                               # The prediction, thresholded
xent = -y * at.log(p_1) - (1 - y) * at.log(1 - p_1)  # Cross-entropy log-loss function
cost = xent.mean() + 0.01 * (w ** 2).sum()           # The cost to minimize, with L2 regularization
gw, gb = at.grad(cost, [w, b])                       # Compute the gradient of the cost

4. Compile the model.

PY

train = aesara.function(
inputs=[x, y],
outputs=[prediction, xent],
updates=((w, w - 0.1 * gw), (b, b - 0.1 * gb)))
predict = aesara.function(inputs=[x], outputs=prediction)

5. Train the model with training dataset.


PY

pred, err = train(D[0], D[1])

# We can also inspect the final outcome


print("Final model:")
print(w.get_value())
print(b.get_value())
print("target values for D:")
print(D[1])
print("prediction on D:")
print(predict(D[0])) # whether > 0.5 or not
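
The preceding train call performs a single gradient update. In practice, you would usually loop over the training
function for a number of steps; the step count below is an arbitrary assumption:

PY

# Run repeated gradient updates; the shared variables w and b keep their values between calls.
training_steps = 1000
for _ in range(training_steps):
    pred, err = train(D[0], D[1])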

Test Models

You need to build and train the model before you test its performance. If you have trained the model, test it on the
out-of-sample data. Follow these steps to test the model:

1. Call the predict method with the features of the testing period.

PY

y_hat = predict(np.array(X_test))

2. Plot the actual and predicted labels of the testing period.


PY

df = pd.DataFrame({'y': y_test, 'y_hat': y_hat}).astype(int)
df.plot(title='Model Performance: predicted vs actual return direction in closing price', figsize=(12, 5))

3. Calculate the prediction accuracy.

PY

correct = sum([1 if x==y else 0 for x, y in zip(y_test, y_hat)])
print(f"Accuracy: {correct}/{y_test.shape[0]} ({100*correct/y_test.shape[0]:.2f}%)")

Store Models

You can save and load Aesara models using the Object Store.

Save Models

Follow these steps to save models in the Object Store:

1. Set the key name of the model to be stored in the Object Store.

PY

model_key = "model"

2. Call the get_file_path method with the key.

PY

file_name = qb.object_store.get_file_path(model_key)

This method returns the file path where the model will be stored.

3. Call the dump method with the model and file path.
PY

joblib.dump(predict, file_name)

If you dump the model using the joblib module before you save it, you don't need to retrain the model when
you load it later.

Load Models

You must save a model into the Object Store before you can load it from the Object Store. If you saved a model,

follow these steps to load it:

1. Call the contains_key method with the model key.

PY

qb.object_store.contains_key(model_key)

This method returns a boolean that represents if the model_key is in the Object Store. If the Object Store does

not contain the model_key , save the model using the model_key before you proceed.

2. Call the get_file_path method with the key.

PY

file_name = qb.object_store.get_file_path(model_key)

This method returns the path where the model is stored.

3. Call the load method with the file path.

PY

loaded_model = joblib.load(file_name)

This method returns the saved model.


Machine Learning > Popular Libraries > GPlearn

Popular Libraries
GPlearn

Introduction

This page explains how to build, train, test, and store GPlearn models.

Import Libraries

Import the GPlearn library.

PY

from gplearn.genetic import SymbolicRegressor, SymbolicTransformer


from sklearn.model_selection import train_test_split
import joblib

You need the sklearn library to prepare the data and the joblib library to store models.

Get Historical Data

Get some historical market data to train and test the model. For example, to get data for the SPY ETF during 2020

and 2021, run:

PY

qb = QuantBook()
symbol = qb.add_equity("SPY", Resolution.DAILY).symbol
history = qb.history(symbol, datetime(2020, 1, 1), datetime(2022, 1, 1)).loc[symbol]

Prepare Data

You need some historical data to prepare the data for the model. If you have historical data, manipulate it to train

and test the model. In this example, use the following features and labels:

Data Category    Description
Features         Daily percent change of the open, high, low, close, and volume of the SPY over the last 5 days
Labels           Daily percent return of the SPY over the next day

Each label is measured one day after the window of feature observations.
Follow these steps to prepare the data:

1. Call the pct_change method and then drop the first row.

PY

daily_returns = history['close'].pct_change()[1:]

2. Loop through the daily_returns DataFrame and collect the features and labels.

PY

n_steps = 5
features = []
labels = []
for i in range(len(daily_returns)-n_steps):
    features.append(daily_returns.iloc[i:i+n_steps].values)
    labels.append(daily_returns.iloc[i+n_steps])

3. Convert the lists of features and labels into numpy arrays.

PY

X = np.array(features)
y = np.array(labels)

4. Split the data into training and testing periods.

PY

X_train, X_test, y_train, y_test = train_test_split(X, y)

Train Models

You need to prepare the historical data for training before you train the model. If you have prepared the data, build

and train the model. In this example, create a Symbolic Transformer to generate new non-linear features and then

build a Symbolic Regressor model. Follow these steps to create the model:
1. Declare a set of functions to use for feature engineering.

PY

function_set = ['add', 'sub', 'mul', 'div',
                'sqrt', 'log', 'abs', 'neg', 'inv',
                'max', 'min']

2. Call the SymbolicTransformer constructor with the preceding set of functions.

PY

gp_transformer = SymbolicTransformer(function_set=function_set,
random_state=0,
verbose=1)

3. Call the fit method with the training features and labels.

PY

gp_transformer.fit(X_train, y_train)

Because verbose=1, this method logs the progress of each generation as it trains.

4. Call the transform method with the original features.

PY

gp_features_train = gp_transformer.transform(X_train)

5. Call the hstack method with the original features and the transformed features.

PY

new_X_train = np.hstack((X_train, gp_features_train))

6. Call the SymbolicRegressor constructor.


PY

gp_regressor = SymbolicRegressor(random_state=0, verbose=1)

7. Call the fit method with the engineered features and the original labels.

PY

gp_regressor.fit(new_X_train, y_train)

Test Models

You need to build and train the model before you test its performance. If you have trained the model, test it on the
out-of-sample data. Follow these steps to test the model:

1. Feature engineer the testing set data.

PY

gp_features_test = gp_transformer.transform(X_test)
new_X_test = np.hstack((X_test, gp_features_test))

2. Call the predict method with the engineered testing set data.

PY

y_predict = gp_regressor.predict(new_X_test)

3. Plot the actual and predicted labels of the testing period.

PY

df = pd.DataFrame({'Real': y_test.flatten(), 'Predicted': y_predict.flatten()})
df.plot(title='Model Performance: predicted vs actual daily return', figsize=(15, 10))
plt.show()
4. Calculate the R-square value.

PY

r2 = gp_regressor.score(new_X_test, y_test)
print(f"The explained variance of the GP model: {r2*100:.2f}%")

Store Models

You can save and load GPlearn models using the Object Store.

Save Models

Follow these steps to save models in the Object Store:

1. Set the key names of the models to be stored in the Object Store.

PY

transformer_key = "transformer"
regressor_key = "regressor"

2. Call the get_file_path method with the key names.

PY

transformer_file = qb.object_store.get_file_path(transformer_key)
regressor_file = qb.object_store.get_file_path(regressor_key)
This method returns the file paths where the models will be stored.

3. Call the dump method with the models and file paths.

PY

joblib.dump(gp_transformer, transformer_file)
joblib.dump(gp_regressor, regressor_file)

If you dump the models using the joblib module before you save them, you don't need to retrain them when
you load them later.

Load Models

You must save a model into the Object Store before you can load it from the Object Store. If you saved a model,

follow these steps to load it:

1. Call the contains_key method.

PY

qb.object_store.contains_key(transformer_key)
qb.object_store.contains_key(regressor_key)

This method returns a boolean that indicates whether the key is in the Object Store. If the Object Store doesn't
contain a key, save the corresponding model using that key before you proceed.

2. Call the get_file_path method with the keys.

PY

transformer_file = qb.object_store.get_file_path(transformer_key)
regressor_file = qb.object_store.get_file_path(regressor_key)

This method returns the paths where the models are stored.

3. Call the load method with the file paths.

PY

loaded_transformer = joblib.load(transformer_file)
loaded_regressor = joblib.load(regressor_file)

This method returns the saved models.


Machine Learning > Popular Libraries > Hmmlearn

Popular Libraries
Hmmlearn

Introduction

This page explains how to build, train, test, and store Hmmlearn models.

Import Libraries

Import the Hmmlearn library.

PY

from hmmlearn import hmm


import joblib

You need the joblib library to store models.

Get Historical Data

Get some historical market data to train and test the model. For example, to get data for the SPY ETF during 2020

and 2021, run:

PY

qb = QuantBook()
symbol = qb.add_equity("SPY", Resolution.DAILY).symbol
history = qb.history(symbol, datetime(2020, 1, 1), datetime(2022, 1, 1)).loc[symbol]

Prepare Data

You need some historical data to prepare the data for the model. If you have historical data, manipulate it to train

and test the model. Follow these steps to prepare the data:

1. Select the close column of the historical data DataFrame.

PY

closes = history['close']

2. Call the pct_change method and then drop the first row.

PY

daily_returns = closes.pct_change().iloc[1:]

3. Call the reshape method.


PY

X = daily_returns.values.reshape(-1, 1)

Train Models

You need to prepare the historical data for training before you train the model. If you have prepared the data, build

and train the model. In this example, assume the market has only 2 regimes and the market returns follow a

Gaussian distribution. Therefore, create a 2-component Hidden Markov Model with Gaussian emissions, which is
equivalent to a Gaussian mixture model with 2 means. Follow these steps to create the model:

1. Call the GaussianHMM constructor with the number of components, a covariance type, and the number of

iterations.

PY

model = hmm.GaussianHMM(n_components=2, covariance_type="full", n_iter=100)

2. Call the fit method with the training data.

PY

model.fit(X)

Test Models

You need to build and train the model before you test its performance. If you have trained the model, test it on the
out-of-sample data. Follow these steps to test the model:

1. Call the predict method with the testing dataset.

PY

y = model.predict(X)

2. Plot the regimes in a scatter plot.

PY

plt.figure(figsize=(15, 10))
plt.scatter(daily_returns.index, [f'Regime {n+1}' for n in y])
plt.title(f'{symbol} market regime')
plt.xlabel("time")
plt.show()
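
To help interpret the regimes, you can inspect the fitted parameters of each hidden state. This is a small
addition that uses the means_ and covars_ attributes from the hmmlearn API:

PY

# The fitted mean daily return of each hidden state.
print("Regime means:", model.means_.flatten())
# The fitted variance of each hidden state (one full covariance matrix per state).
print("Regime variances:", np.array([np.diag(c) for c in model.covars_]).flatten())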
Store Models

You can save and load Hmmlearn models using the Object Store.

Save Models

Follow these steps to save models in the Object Store:

1. Set the key name of the model to be stored in the Object Store.

PY

model_key = "model"

2. Call the get_file_path method with the key.

PY

file_name = qb.object_store.get_file_path(model_key)

This method returns the file path where the model will be stored.

3. Call the dump method with the model and file path.

PY

joblib.dump(model, file_name)

If you dump the model using the joblib module before you save it, you don't need to retrain the model when
you load it later.

Load Models

You must save a model into the Object Store before you can load it from the Object Store. If you saved a model,

follow these steps to load it:

1. Call the contains_key method.

PY

qb.object_store.contains_key(model_key)

This method returns a boolean that represents if the model_key is in the Object Store. If the Object Store does

not contain the model_key , save the model using the model_key before you proceed.

2. Call the get_file_path method with the key.

PY

file_name = qb.object_store.get_file_path(model_key)

This method returns the path where the model is stored.

3. Call the load method with the file path.

PY

loaded_model = joblib.load(file_name)

This method returns the saved model.


Machine Learning > Popular Libraries > Keras

Popular Libraries
Keras

Introduction

This page explains how to build, train, test, and store keras models.

Import Libraries

Import the keras libraries.

PY

from tensorflow.keras import utils, models


from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.saving import load_model

Get Historical Data

Get some historical market data to train and test the model. For example, to get data for the SPY ETF during 2020

and 2021, run:

PY

qb = QuantBook()
symbol = qb.add_equity("SPY", Resolution.DAILY).symbol
history = qb.history(symbol, datetime(2020, 1, 1), datetime(2022, 1, 1)).loc[symbol]

Prepare Data

You need some historical data to prepare the data for the model. If you have historical data, manipulate it to train
and test the model. In this example, use the following features and labels:

Data Category    Description
Features         Daily percent change of the open, high, low, close, and volume of the SPY over the last 5 days
Labels           Daily percent return of the SPY over the next day

Each label is measured one day after the window of feature observations.
Follow these steps to prepare the data:

1. Call the pct_change and dropna methods.

PY

daily_pct_change = history.pct_change().dropna()

2. Loop through the daily_pct_change DataFrame and collect the features and labels.

PY

n_steps = 5
features = []
labels = []
for i in range(len(daily_pct_change)-n_steps):
    features.append(daily_pct_change.iloc[i:i+n_steps].values)
    labels.append(daily_pct_change['close'].iloc[i+n_steps])

3. Convert the lists of features and labels into numpy arrays.

PY

features = np.array(features)
labels = np.array(labels)

4. Split the data into training and testing periods.

PY

train_length = int(len(features) * 0.7)


X_train = features[:train_length]
X_test = features[train_length:]
y_train = labels[:train_length]
y_test = labels[train_length:]

Train Models

You need to prepare the historical data for training before you train the model. If you have prepared the data, build
and train the model. In this example, build a neural network model that predicts the future return of the SPY. Follow

these steps to create the model:

1. Call the Sequential constructor with a list of layers.

PY

model = Sequential([Dense(10, input_shape=(5,5), activation='relu'),
                    Dense(10, activation='relu'),
                    Flatten(),
                    Dense(1)])

Set the input_shape of the first layer to (5, 5) because each sample contains the percent change of 5

factors (percent change of the open, high, low, close, and volume) over the previous 5 days. Call the Flatten
constructor because the input is 2-dimensional but the output is just a single value.

2. Call the compile method with a loss function, an optimizer, and a list of metrics to monitor.

PY

model.compile(loss='mse',
optimizer=RMSprop(0.001),
metrics=['mae', 'mse'])

3. Call the fit method with the features and labels of the training dataset and a number of epochs.

PY

model.fit(X_train, y_train, epochs=5)

Test Models

You need to build and train the model before you test its performance. If you have trained the model, test it on the

out-of-sample data. Follow these steps to test the model:

1. Call the predict method with the features of the testing period.

PY

y_hat = model.predict(X_test)

2. Plot the actual and predicted labels of the testing period.

PY

results = pd.DataFrame({'y': y_test.flatten(), 'y_hat': y_hat.flatten()})


results.plot(title='Model Performance: predicted vs actual %change in closing price')
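
You can also quantify the out-of-sample error with the metrics you compiled the model with. The following brief
addition assumes the loss and metrics from the compile step above, so evaluate returns the mse loss followed by
the mae and mse metrics:

PY

# Evaluate the model on the testing set; returns [loss (mse), mae, mse].
test_metrics = model.evaluate(X_test, y_test, verbose=0)
print(f"Out-of-sample MSE: {test_metrics[0]:.6f}, MAE: {test_metrics[1]:.6f}")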
Store Models

You can save and load keras models using the Object Store.

Save Models

Follow these steps to save models in the Object Store:

1. Set the key name of the model to be stored in the Object Store.

The key must end with a .keras extension for the native Keras format (recommended) or a .h5 extension.

PY

model_key = "model.keras"

2. Call the get_file_path method with the key.

PY

file_name = qb.object_store.get_file_path(model_key)

This method returns the file path where the model will be stored.

3. Call the save method with the file path.

PY

model.save(file_name)
Load Models

You must save a model into the Object Store before you can load it from the Object Store. If you saved a model,

follow these steps to load it:

1. Call the contains_key method with the model key.

PY

qb.object_store.contains_key(model_key)

This method returns a boolean that represents if the model_key is in the Object Store. If the Object Store does

not contain the model_key , save the model using the model_key before you proceed.

2. Call the get_file_path method with the key name.

PY

file_name = qb.object_store.get_file_path(model_key)

This method returns the path where the model is stored.

3. Call the load_model method with the file path.

PY

loaded_model = load_model(file_name)

This method returns the saved model.


Machine Learning > Popular Libraries > PyTorch

Popular Libraries
PyTorch

Introduction

This page explains how to build, train, test, and store PyTorch models.

Import Libraries

Import the torch, sklearn, and joblib libraries.

PY

import torch
from torch import nn
from sklearn.model_selection import train_test_split
import joblib

You need the sklearn library to prepare the data and the joblib library to store models.

Get Historical Data

Get some historical market data to train and test the model. For example, to get data for the SPY ETF during 2020

and 2021, run:

PY

qb = QuantBook()
symbol = qb.add_equity("SPY", Resolution.DAILY).symbol
history = qb.history(symbol, datetime(2020, 1, 1), datetime(2022, 1, 1)).loc[symbol]

Prepare Data

You need some historical data to prepare the data for the model. If you have historical data, manipulate it to train

and test the model. In this example, use the following features and labels:

Data Category Description

Features The last 5 closing prices

Labels The following day's closing price

The following image shows the time difference between the features and labels:
Follow these steps to prepare the data:

1. Perform fractional differencing on the historical data.

PY

df = (history['close'] * 0.5 + history['close'].diff() * 0.5)[1:]

Fractional differencing helps make the data stationary yet retains the variance information.
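
To see why, note that the expression is a fixed-weight blend of the price level and its first difference:

$$\tilde{x}_t = 0.5\,x_t + 0.5\,(x_t - x_{t-1}) = x_t - 0.5\,x_{t-1}$$

The difference term removes most of the trend, while the level term retains the scale of the original series.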

2. Loop through the df DataFrame and collect the features and labels.

PY

n_steps = 5
features = []
labels = []
for i in range(len(df)-n_steps):
    features.append(df.iloc[i:i+n_steps].values)
    labels.append(df.iloc[i+n_steps])

3. Convert the lists of features and labels into numpy arrays.

PY

features = np.array(features)
labels = np.array(labels)

4. Standardize the features and labels.

PY

X = (features - features.mean()) / features.std()
y = (labels - labels.mean()) / labels.std()

5. Split the data into training and testing periods.


PY

X_train, X_test, y_train, y_test = train_test_split(X, y)

Train Models

You need to prepare the historical data for training before you train the model. If you have prepared the data, build

and train the model. In this example, create a deep neural network with 2 hidden layers. Follow these steps to

create the model:

1. Define a subclass of nn.Module to be the model.

In this example, use the ReLU activation function for each layer.

PY

class NeuralNetwork(nn.Module):
    # Model Structure
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(5, 5),   # input size, output size of the layer
            nn.ReLU(),         # ReLU non-linear transformation
            nn.Linear(5, 5),
            nn.ReLU(),
            nn.Linear(5, 1),   # Output size = 1 for regression
        )

    # Feed-forward training/prediction
    def forward(self, x):
        x = torch.from_numpy(x).float()   # Convert to tensor in type float
        result = self.linear_relu_stack(x)
        return result

2. Create an instance of the model and set its configuration to train on the GPU if it's available.

PY

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = NeuralNetwork().to(device)

3. Set the loss and optimization functions.

In this example, use the mean squared error as the loss function and stochastic gradient descent as the

optimizer.

PY

loss_fn = nn.MSELoss()
learning_rate = 0.001
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

4. Train the model.

In this example, train the model through 5 epochs.


PY

epochs = 5
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")

    # Since we're using SGD, we'll be using the size of data as batch number.
    for batch, (X, y) in enumerate(zip(X_train, y_train)):
        # Compute prediction and loss
        pred = model(X)
        real = torch.from_numpy(np.array(y).flatten()).float()
        loss = loss_fn(pred, real)

        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if batch % 100 == 0:
            loss, current = loss.item(), batch
            print(f"loss: {loss:.5f} [{current:5d}/{len(X_train):5d}]")

Test Models

You need to build and train the model before you test its performance. If you have trained the model, test it on the
out-of-sample data. Follow these steps to test the model:

1. Predict with the testing data.

PY

predict = model(X_test)
y_predict = predict.detach().numpy() # Convert tensor to numpy ndarray

2. Plot the actual and predicted values of the testing period.


PY

df = pd.DataFrame({'Real': y_test.flatten(), 'Predicted': y_predict.flatten()})
df.plot(title='Model Performance: predicted vs actual standardized fractional return', figsize=(15, 10))
plt.show()

3. Calculate the R-square value.

PY

r2 = 1 - (np.sum(np.square(y_test.flatten() - y_predict.flatten())) /
          np.sum(np.square(y_test.flatten() - y_test.mean())))
print(f"The explained variance by the model (r-square): {r2*100:.2f}%")

Store Models

You can save and load PyTorch models using the Object Store.

Save Models

Don't use the torch.save method to save models, because the tensor data would be lost and the saved file corrupted. Follow these steps to save models in the Object Store:

1. Set the key name of the model to be stored in the Object Store.

PY

model_key = "model"
2. Call the get_file_path method with the key.

PY

file_name = qb.object_store.get_file_path(model_key)

This method returns the file path where the model will be stored.

3. Call the dump method with the model and file path.

PY

joblib.dump(model, file_name)

If you dump the model using the joblib module before you save the model, you don't need to retrain the
model.

Load Models

You must save a model into the Object Store before you can load it from the Object Store. If you saved a model,
follow these steps to load it:

1. Call the contains_key method.

PY

qb.object_store.contains_key(model_key)

This method returns a boolean that represents if the model_key is in the Object Store. If the Object Store does

not contain the model_key , save the model using the model_key before you proceed.

2. Call the get_file_path method with the key.

PY

file_name = qb.object_store.get_file_path(model_key)

This method returns the path where the model is stored.

3. Call the load method with the file path.

PY

loaded_model = joblib.load(file_name)

This method returns the saved model.
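
As a quick sanity check, you can feed a few testing samples through the loaded model. This is a minimal sketch, assuming X_test from the earlier steps is still in scope:

PY

# The forward method converts numpy input to a float tensor, so pass X_test directly.
check = loaded_model(X_test[:5])
print(check.detach().numpy())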


Machine Learning > Popular Libraries > Scikit-Learn

Popular Libraries
Scikit-Learn

Introduction

This page explains how to build, train, test, and store Scikit-Learn / sklearn models.

Import Libraries

Import the sklearn and joblib libraries.

PY

from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import train_test_split
import joblib

You need the joblib library to store models.

Get Historical Data

Get some historical market data to train and test the model. For example, to get data for the SPY ETF during 2020

and 2021, run:

PY

qb = QuantBook()
symbol = qb.add_equity("SPY", Resolution.DAILY).symbol
history = qb.history(symbol, datetime(2020, 1, 1), datetime(2022, 1, 1)).loc[symbol]

Prepare Data

You need some historical data to prepare the data for the model. If you have historical data, manipulate it to train

and test the model. In this example, use the following features and labels:

Data Category    Description

Features         Daily percent change of the open, high, low, close, and volume of the SPY over the last 5 days

Labels           Daily percent return of the SPY over the next day

The following image shows the time difference between the features and labels:
Follow these steps to prepare the data:

1. Call the pct_change method and then drop the first row.

PY

daily_returns = history['close'].pct_change()[1:]

2. Loop through the daily_returns DataFrame and collect the features and labels.

PY

n_steps = 5
features = []
labels = []
for i in range(len(daily_returns)-n_steps):
    features.append(daily_returns.iloc[i:i+n_steps].values)
    labels.append(daily_returns.iloc[i+n_steps])

3. Convert the lists of features and labels into numpy arrays.

PY

X = np.array(features)
y = np.array(labels)

4. Split the data into training and testing periods.

PY

X_train, X_test, y_train, y_test = train_test_split(X, y)

Train Models

You need to prepare the historical data for training before you train the model. If you have prepared the data, build

and train the model. In this example, build a Support Vector Regressor model and optimize its hyperparameters

with grid search cross-validation. Follow these steps to create the model:
1. Set the choices of hyperparameters used for grid search testing.

PY

param_grid = {'C': [.05, .1, .5, 1, 5, 10],
              'epsilon': [0.001, 0.005, 0.01, 0.05, 0.1],
              'gamma': ['auto', 'scale']}

2. Call the GridSearchCV constructor with the SVR model, the parameter grid, a scoring method, and the number of cross-validation folds.

PY

gsc = GridSearchCV(SVR(), param_grid, scoring='neg_mean_squared_error', cv=5)

3. Call the fit method and then select the best estimator.

PY

model = gsc.fit(X_train, y_train).best_estimator_
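
Before moving on, you can inspect which hyperparameter combination the grid search selected. This is a minimal sketch using GridSearchCV's fitted attributes:

PY

# best_params_ holds the winning combination; best_score_ is its cross-validation score.
print(gsc.best_params_)
print(gsc.best_score_)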

Test Models

You need to build and train the model before you test its performance. If you have trained the model, test it on the

out-of-sample data. Follow these steps to test the model:

1. Call the predict method with the features of the testing period.

PY

y_hat = model.predict(X_test)

2. Plot the actual and predicted labels of the testing period.

PY

df = pd.DataFrame({'y': y_test.flatten(), 'y_hat': y_hat.flatten()})
df.plot(title='Model Performance: predicted vs actual %change in closing price', figsize=(15, 10))
Store Models

You can save and load sklearn models using the Object Store.

Save Models

Follow these steps to save models in the Object Store:

1. Set the key name of the model to be stored in the Object Store.

PY

model_key = "model"

2. Call the get_file_path method with the key.

PY

file_name = qb.object_store.get_file_path(model_key)

This method returns the file path where the model will be stored.

3. Call the dump method with the model and file path.

PY

joblib.dump(model, file_name)

If you dump the model using the joblib module before you save the model, you don't need to retrain the
model.

Load Models

You must save a model into the Object Store before you can load it from the Object Store. If you saved a model,
follow these steps to load it:

1. Call the contains_key method with the model key.

PY

qb.object_store.contains_key(model_key)

This method returns a boolean that represents if the model_key is in the Object Store. If the Object Store does
not contain the model_key , save the model using the model_key before you proceed.

2. Call the get_file_path method with the key.

PY

file_name = qb.object_store.get_file_path(model_key)

This method returns the path where the model is stored.

3. Call the load method with the file path.

PY

loaded_model = joblib.load(file_name)

This method returns the saved model.


Machine Learning > Popular Libraries > Stable Baselines

Popular Libraries
Stable Baselines

Introduction

This page explains how to use the Stable Baselines library in Python to build, train, and test a reinforcement learning (RL) model, and to save it in and load it from the Object Store, through an example of a Proximal Policy Optimization (PPO) portfolio optimization trading bot.

Import Libraries

Import the stable_baselines3 and gym libraries.

PY

import gym
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize

Get Historical Data

Get some historical market data to train and test the model. For example, to get data for several different asset-class ETFs from 2010 through 2023, run:

PY

qb = QuantBook()
symbols = [
qb.add_equity("SPY", Resolution.DAILY).symbol,
qb.add_equity("GLD", Resolution.DAILY).symbol,
qb.add_equity("TLT", Resolution.DAILY).symbol,
qb.add_equity("USO", Resolution.DAILY).symbol,
qb.add_equity("UUP", Resolution.DAILY).symbol
]
df = qb.history(symbols, datetime(2010, 1, 1), datetime(2024, 1, 1))

Prepare Data

You need some historical data to prepare the data for the model. If you have historical data, manipulate it to train

and test the model. In this example, extract the close price series as the outcome and obtain the partial-differenced

time-series of OHLCV values as the observation.

PY

history = df.unstack(0)
# We arbitrarily select weight 0.5 here, but ideally one should strike a balance
# between variance retained and stationarity.
partial_diff = (history.diff() * 0.5 + history * 0.5).iloc[1:].fillna(0)
history = history.close.iloc[1:]

Train Models
You need to prepare the historical data for training before you train the model. If you have prepared the data, build and train the environment and the model. In this example, create a gym environment to initialize the training environment, agent, and reward. Then, create an RL model with the PPO algorithm. Follow these steps to create the environment and the model:

1. Split the data for training and testing to evaluate our model.

PY

X_train = partial_diff.iloc[:-100].values
X_test = partial_diff.iloc[-100:].values
y_train = history.iloc[:-100].values
y_test = history.iloc[-100:].values

2. Create a custom gym environment class.

In this example, create a custom environment with previous 5 OHLCV partial-differenced price data as the

observation and the lowest maximum drawdown as the reward.


PY

class PortfolioEnv(gym.Env):
    def __init__(self, data, prediction, num_stocks):
        super(PortfolioEnv, self).__init__()

        self.data = data
        self.prediction = prediction
        self.num_stocks = num_stocks

        self.current_step = 5
        self.portfolio_value = []
        self.portfolio_weights = np.ones(num_stocks) / num_stocks

        # Define your action and observation spaces
        self.action_space = gym.spaces.Box(low=-1.0, high=1.0, shape=(num_stocks, ), dtype=np.float32)
        self.observation_space = gym.spaces.Box(low=-np.inf, high=np.inf, shape=(5, data.shape[1]))

    def reset(self):
        self.current_step = 5
        self.portfolio_value = []
        self.portfolio_weights = np.ones(self.num_stocks) / self.num_stocks
        return self._get_observation()

    def step(self, action):
        # Normalize the portfolio weights
        sum_weights = np.sum(np.abs(action))
        if sum_weights > 1:
            action /= sum_weights

        # Deduct transaction fee
        value = self.prediction[self.current_step]
        fees = np.abs(self.portfolio_weights - action) @ value

        # Update portfolio weights based on the chosen action
        self.portfolio_weights = action

        # Update portfolio value based on the new weights and the market prices less fee
        self.portfolio_value.append(np.dot(self.portfolio_weights, value) - fees)

        # Move to the next time step
        self.current_step += 1

        # Check if the episode is done (end of data)
        done = self.current_step >= len(self.data) - 1

        # Calculate the reward; here, we use max drawdown
        reward = self._neg_max_drawdown

        return self._get_observation(), reward, done, {}

    def _get_observation(self):
        # Return the last 5 partial differencing OHLCV as the observation
        return self.data[self.current_step-5:self.current_step]

    @property
    def _neg_max_drawdown(self):
        # Return max drawdown within 20 days in portfolio value as reward (negate since max reward is preferred)
        portfolio_value_20d = np.array(self.portfolio_value[-min(len(self.portfolio_value), 20):])
        acc_max = np.maximum.accumulate(portfolio_value_20d)
        return -(portfolio_value_20d - acc_max).min()

    def render(self, mode='human'):
        # Implement rendering if needed
        pass

3. Initialize the environment.


PY

# Initialize the environment
env = PortfolioEnv(X_train, y_train, 5)

# Wrap the environment in a vectorized environment
env = DummyVecEnv([lambda: env])

# Normalize the observation space
env = VecNormalize(env, norm_obs=True, norm_reward=False)

4. Train the model.

In this example, create a RL model and train with MLP-policy PPO algorithm.

PY

# Define the PPO agent
model = PPO("MlpPolicy", env, verbose=0)

# Train the agent
model.learn(total_timesteps=100000)

Test Models

You need to build and train the model before you test its performance. If you have trained the model, test it on the
out-of-sample data. Follow these steps to test the model:

1. Initialize a return series to calculate performance and a list to store the equity value at each timestep.

PY

test = np.log(y_test[1:]/y_test[:-1])
equity = [1]

2. Iterate each testing data point for prediction and trading.

PY

for i in range(5, X_test.shape[0]-1):
    action, _ = model.predict(X_test[i-5:i], deterministic=True)
    sum_weights = np.sum(np.abs(action))
    if sum_weights > 1:
        action /= sum_weights
    value = test[i] @ action.T
    equity.append((1+value) * equity[i-5])

3. Plot the result.

PY

plt.figure(figsize=(15, 10))
plt.title("Equity Curve")
plt.xlabel("timestep")
plt.ylabel("equity")
plt.plot(equity)
plt.show()
Store Models

You can save and load stable baselines models using the Object Store.

Save Models

1. Set the key name of the model to be stored in the Object Store.

PY

model_key = "model"

2. Call the get_file_path method with the key.

PY

file_name = qb.object_store.get_file_path(model_key)

This method returns the file path where the model will be stored.

3. Call the save method with the file path.

PY

model.save(file_name)

Load Models

You must save a model into the Object Store before you can load it from the Object Store. If you saved a model,
follow these steps to load it:

1. Call the contains_key method.

PY

qb.object_store.contains_key(model_key)

This method returns a boolean that represents if the model_key is in the Object Store. If the Object Store does
not contain the model_key , save the model using the model_key before you proceed.

2. Call the get_file_path method with the key.

PY

file_name = qb.object_store.get_file_path(model_key)

This method returns the path where the model is stored.

3. Call the load method with the file path, environment and policy.

PY

loaded_model = PPO.load(file_name, env=env, policy="MlpPolicy")

This method returns the saved model.
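
As a quick check, you can query the loaded agent for an action on a recent observation. This is a minimal sketch, assuming X_test from the earlier steps is still in scope:

PY

# Ask the loaded agent for portfolio weights over the last 5 observations.
obs = X_test[-5:]
action, _ = loaded_model.predict(obs, deterministic=True)
print(action)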


Machine Learning > Popular Libraries > TensorFlow

Popular Libraries
TensorFlow

Introduction

This page explains how to build, train, test, and store TensorFlow models.

Import Libraries

Import the tensorflow and sklearn libraries.

PY

import tensorflow as tf
from sklearn.model_selection import train_test_split

You need the sklearn library to prepare the data.

Get Historical Data

Get some historical market data to train and test the model. For example, to get data for the SPY ETF during 2020

and 2021, run:

PY

qb = QuantBook()
symbol = qb.add_equity("SPY", Resolution.DAILY).symbol
history = qb.history(symbol, datetime(2020, 1, 1), datetime(2022, 1, 1)).loc[symbol]

Prepare Data

You need some historical data to prepare the data for the model. If you have historical data, manipulate it to train

and test the model. In this example, use the following features and labels:

Data Category    Description

Features         The differences between the current close price and each of the last 5 close prices

Labels           The following day's price change

Follow these steps to prepare the data:

1. Loop through the DataFrame of historical prices and collect the features.
PY

data = history

lookback = 5
lookback_series = []
for i in range(1, lookback + 1):
    df = data['close'].diff(i)[lookback:-1]
    df.name = f"close-{i}"
    lookback_series.append(df)
X = pd.concat(lookback_series, axis=1).reset_index(drop=True).dropna()
X

The following image shows the format of the features DataFrame:

2. Select the close column and then call the diff method with -1 to collect the labels.

PY

Y = data['close'].diff(-1)

3. Drop the first 5 samples and then call the reset_index method.

PY

Y = Y[lookback:-1].reset_index(drop=True)

This method aligns the history of the features and labels.

4. Split the data into training and testing datasets.

For example, to use the last third of data to test the model, run:

PY

X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, shuffle=False)

Train Models
You need to prepare the historical data for training before you train the model. If you have prepared the data, build

and train the model. In this example, build a neural network model that predicts the future price of the SPY.

Build the Model

Follow these steps to build the model:

1. Set the number of layers, their number of nodes, the number of epochs, and the learning rate.

PY

num_factors = X_test.shape[1]
num_neurons_1 = 10
num_neurons_2 = 10
num_neurons_3 = 5
epochs = 20
learning_rate = 0.0001

2. Create hidden layers with the set number of layer and their corresponding number of nodes.

In this example, construct the model with the built-in Keras API, using the ReLU activation function for the non-linear activation of each layer.

PY

model = tf.keras.Sequential([
    tf.keras.layers.Dense(num_neurons_1, activation=tf.nn.relu, input_shape=(num_factors,)),  # input shape required
    tf.keras.layers.Dense(num_neurons_2, activation=tf.nn.relu),
    tf.keras.layers.Dense(num_neurons_3, activation=tf.nn.relu),
    tf.keras.layers.Dense(1)
])

3. Select an optimizer.

We're using the Adam optimizer in this example. You may also consider others, like SGD.

PY

optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)

4. Define the loss function.

In the context of numerical regression, we use MSE as our objective function. If you're doing classification,

cross entropy would be more suitable.

PY

def loss_mse(target_y, predicted_y):
    return tf.reduce_mean(tf.square(target_y - predicted_y))

Train the Model

Iteratively train the model for the set number of epochs. In each epoch, the optimizer updates the model weights using the gradient of the loss function.


PY

for i in range(epochs):
    with tf.GradientTape() as t:
        loss = loss_mse(y_train, model(X_train))

    train_loss = loss_mse(y_train, model(X_train))
    test_loss = loss_mse(y_test, model(X_test))
    print(f"""Epoch {i+1}:
Training loss = {train_loss.numpy()}. Test loss = {test_loss.numpy()}""")

    jac = t.gradient(loss, model.trainable_weights)
    optimizer.apply_gradients(zip(jac, model.trainable_weights))

Test Models

To test the model, we'll setup a method to plot test set predictions ontop of the SPY price.

PY

def test_model(actual, title, X):
    prediction = model(X).numpy()
    prediction = prediction.reshape(-1, 1)

    plt.figure(figsize=(16, 6))
    plt.plot(actual, label="Actual")
    plt.plot(prediction, label="Prediction")
    plt.title(title)
    plt.xlabel("Time step")
    plt.ylabel("SPY Price")
    plt.legend()
    plt.show()

test_model(y_test, "Test Set Results from Original Model", X_test)

Store Models

You can save and load TensorFlow models using the Object Store.

Save Models

Follow these steps to save models in the Object Store:

1. Set the key name of the model to be stored in the Object Store.
PY

model_key = "model.keras"

Note that the key must have the .keras suffix.

2. Call the get_file_path method with the key.

PY

file_name = qb.object_store.get_file_path(model_key)

This method returns the file path where the model will be stored.

3. Call the save method with the file path.

PY

model.save(file_name)

4. Save the model to the file path.

PY

qb.object_store.save(model_key)

Load Models

You must save a model into the Object Store before you can load it from the Object Store. If you saved a model,

follow these steps to load it:

1. Get the file path from the Object Store.

PY

file_name = qb.object_store.get_file_path(model_key)

2. Restore the TensorFlow model from the saved path.

PY

model = tf.keras.models.load_model(file_name)
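
To verify the restore, you can run the testing features through the restored model. A minimal sketch, assuming X_test from the earlier steps is still in scope:

PY

# Predictions from the restored model should match the pre-save model's output.
restored_predictions = model(X_test).numpy()
print(restored_predictions[:5])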
Machine Learning > Popular Libraries > Tslearn

Popular Libraries
Tslearn

Introduction

This page explains how to build, train, test, and store tslearn models.

Import Libraries

Import the tslearn libraries.

PY

from tslearn.barycenters import softdtw_barycenter
from tslearn.clustering import TimeSeriesKMeans

Get Historical Data

Get some historical market data to train and test the model. For example, get data for the securities shown in the

following table:

Group Name Tickers

Overall US market SPY, QQQ, DIA

Tech companies AAPL, MSFT, TSLA

Long-term US Treasury ETFs IEF, TLT

Short-term US Treasury ETFs SHV, SHY

Heavy metal ETFs GLD, IAU, SLV

Energy sector USO, XLE, XOM

PY

qb = QuantBook()
tickers = ["SPY", "QQQ", "DIA",
"AAPL", "MSFT", "TSLA",
"IEF", "TLT", "SHV", "SHY",
"GLD", "IAU", "SLV",
"USO", "XLE", "XOM"]
symbols = [qb.add_equity(ticker, Resolution.DAILY).symbol for ticker in tickers]
history = qb.history(symbols, datetime(2020, 1, 1), datetime(2022, 2, 20))

Prepare Data

You need some historical data to prepare the data for the model. If you have historical data, manipulate it to train
and test the model. In this example, standardize the log close price time-series of the securities. Follow these

steps to prepare the data:

1. Unstack the historical DataFrame and select the close column.

PY

close = history.unstack(0).close

2. Take the logarithm of the historical time series.

PY

log_close = np.log(close)

Taking the logarithm eases the compounding effect.
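
In particular, log prices make multi-period growth additive,

$$\log P_t - \log P_0 = \sum_{i=1}^{t} \log\frac{P_i}{P_{i-1}}$$

so a geometrically compounding series becomes roughly linear in log space before standardization.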

3. Standardize the data.

PY

standard_close = (log_close - log_close.mean()) / log_close.std()

Train Models

Instead of using real-time comparison, we could apply a technique called Dynamic Time Warping (DTW) with Barycenter Averaging (DBA). Intuitively, it is a technique for averaging a few time-series into a single one without losing much of their information. Since not all time-series move efficiently, as the ideal EMH assumption suggests, this allows similarity analysis of different time-series with sticky lags. Check the technical details in the tslearn documentation page.

We can then separate the series into clusters with k-means after DBA.

PY

# Set up the Time Series KMean model with soft DBA.
km = TimeSeriesKMeans(n_clusters=6,     # We have 6 main groups
                      metric="softdtw", # soft for differentiable
                      random_state=0)

# Fit the model.
km.fit(standard_close.T)

Test Models

We visualize the clusters and their corresponding underlying series.

1. Predict the cluster label of each series.

PY

labels = km.predict(standard_close.T)

2. Create a helper function to aid plotting.

PY

def plot_helper(ts):
    # Plot all points of the data set
    for i in range(ts.shape[0]):
        plt.plot(ts[i, :], "k-", alpha=.2)

    # Plot the given barycenter of them
    barycenter = softdtw_barycenter(ts, gamma=1.)
    plt.plot(barycenter, "r-", linewidth=2)

3. Plot the results.

PY

j = 1
plt.figure(figsize=(15, 10))
for i in set(labels):
    # Select the series in the i-th cluster.
    X = standard_close.iloc[:, [n for n, k in enumerate(labels) if k == i]].values

    # Plot the series and barycenter-averaged series.
    plt.subplot(len(set(labels)) // 3 + (1 if len(set(labels))%3 != 0 else 0), 3, j)
    plt.title(f"Cluster {i+1}")
    plot_helper(X.T)

    j += 1

plt.show()
4. Display the groupings.

PY

for i in set(labels):
    print(f"Cluster {i+1}: {standard_close.columns[[n for n, k in enumerate(labels) if k == i]]}")

Store Models

You can save and load tslearn models using the Object Store.

Save Models

Follow these steps to save models in the Object Store:

1. Set the key name of the model to be stored in the Object Store.

PY

model_key = "model"

2. Call the get_file_path method with the key.


PY

file_name = qb.object_store.get_file_path(model_key)

This method returns the file path where the model will be stored.

3. Delete the current file, if any, to avoid a FileExistsError when you save the model.

PY

import os
os.remove(file_name)

4. Call the to_hdf5 method with the file path.

PY

km.to_hdf5(file_name + ".hdf5")

Load Models

You must save a model into the Object Store before you can load it from the Object Store. If you saved a model,

follow these steps to load it:

1. Call the contains_key method.

PY

qb.object_store.contains_key(model_key)

This method returns a boolean that represents if the model_key is in the Object Store. If the Object Store does

not contain the model_key , save the model using the model_key before you proceed.

2. Call the get_file_path method with the key.

PY

file_name = qb.object_store.get_file_path(model_key)

This method returns the path where the model is stored.

3. Call the from_hdf5 method with the file path.

PY

loaded_model = TimeSeriesKMeans.from_hdf5(file_name + ".hdf5")

This method returns the saved model.
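
To confirm the round trip, you can re-run the cluster assignment with the loaded model. A minimal sketch, assuming standard_close and labels from the earlier steps are still in scope:

PY

# The loaded model should reproduce the original cluster labels.
reloaded_labels = loaded_model.predict(standard_close.T)
print((reloaded_labels == labels).all())  # Expected: True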

Reference

F. Petitjean, A. Ketterlin, P. Gancarski (2011). A global averaging method for dynamic time warping, with applications to clustering. Pattern Recognition, 44(3), 678-693. Retrieved from https://lig-membres.imag.fr/bisson/cours/M2INFO-AIW-ML/papers/PetitJean11.pdf
Machine Learning > Popular Libraries > XGBoost

Popular Libraries
XGBoost

Introduction

This page explains how to build, train, test, and store XGBoost models.

Import Libraries

Import the xgboost, sklearn, and joblib libraries.

PY

import xgboost as xgb
from sklearn.model_selection import train_test_split
import joblib

You need the sklearn library to prepare the data and the joblib library to save models.

Get Historical Data

Get some historical market data to train and test the model. For example, to get data for the SPY ETF during 2020

and 2021, run:

PY

qb = QuantBook()
symbol = qb.add_equity("SPY", Resolution.DAILY).symbol
history = qb.history(symbol, datetime(2020, 1, 1), datetime(2022, 1, 1)).loc[symbol]

Prepare Data

You need some historical data to prepare the data for the model. If you have historical data, manipulate it to train

and test the model. In this example, use the following features and labels:

Data Category Description

Features The last 5 closing prices

Labels The following day's closing price

The following image shows the time difference between the features and labels:
Follow these steps to prepare the data:

1. Perform fractional differencing on the historical data.

PY

df = (history['close'] * 0.5 + history['close'].diff() * 0.5)[1:]

Fractional differencing helps make the data stationary yet retains the variance information.

2. Loop through the df DataFrame and collect the features and labels.

PY

n_steps = 5
features = []
labels = []
for i in range(len(df)-n_steps):
    features.append(df.iloc[i:i+n_steps].values)
    labels.append(df.iloc[i+n_steps])

3. Convert the lists of features and labels into numpy arrays.

PY

features = np.array(features)
labels = np.array(labels)

4. Standardize the features and labels.

PY

X = (features - features.mean()) / features.std()
y = (labels - labels.mean()) / labels.std()

5. Split the data into training and testing periods.


PY

X_train, X_test, y_train, y_test = train_test_split(X, y)

Train Models

We're about to train a gradient-boosted random forest for future price prediction.

1. Split the data for training and testing to evaluate our model.

PY

X_train, X_test, y_train, y_test = train_test_split(X, y)

2. Format training set into XGBoost matrix.

PY

dtrain = xgb.DMatrix(X_train, label=y_train)

3. Train the model with parameters.

PY

params = {
    'booster': 'gbtree',
    'colsample_bynode': 0.8,
    'learning_rate': 0.1,
    'lambda': 0.1,
    'max_depth': 5,
    'num_parallel_tree': 100,
    'objective': 'reg:squarederror',
    'subsample': 0.8,
}
model = xgb.train(params, dtrain, num_boost_round=10)

Test Models

We then make predictions on the testing data set and compare the predicted values with the actual values by plotting both to see if our model has predictive power.

1. Format testing set into XGBoost matrix.

PY

dtest = xgb.DMatrix(X_test, label=y_test)

2. Predict with the testing set data.

PY

y_predict = model.predict(dtest)

3. Plot the result.


PY

df = pd.DataFrame({'Real': y_test.flatten(), 'Predicted': y_predict.flatten()})
df.plot(title='Model Performance: predicted vs actual closing price', figsize=(15, 10))
plt.show()

Store Models

Saving the Model

We dump the model using the joblib module and save it to the Object Store file path. This way, the model doesn't need to be retrained, saving time and computational resources.

1. Set the key name of the model to be stored in the Object Store.

PY

model_key = "model"

2. Call the get_file_path method with the key's name to get the file path.

PY

file_name = qb.object_store.get_file_path(model_key)

3. Call the dump method with the model and file path to save the model to the file path.

PY

joblib.dump(model, file_name)
Loading the Model

Let's retrieve the model from the Object Store file path and load it with joblib.

1. Call the contains_key method.

PY

qb.object_store.contains_key(model_key)

This method returns a boolean that represents if the model_key is in the Object Store. If the Object Store does
not contain the model_key , save the model using the model_key before you proceed.

2. Call the get_file_path method with the key's name to get the file path.

PY

file_name = qb.object_store.get_file_path(model_key)

3. Call the load method with the file path to fetch the saved model.

PY

loaded_model = joblib.load(file_name)

To ensure the model loaded successfully, let's test it.

PY

y_pred = loaded_model.predict(dtest)
df = pd.DataFrame({'Real': y_test.flatten(), 'Predicted': y_pred.flatten()})
df.plot(title='Model Performance: predicted vs actual closing price', figsize=(15, 10))
Machine Learning > Hugging Face

Machine Learning
Hugging Face

Machine Learning > Hugging Face > Key Concepts

Hugging Face
Key Concepts

Introduction
Debugging

Debugging

Introduction

The debugger is a built-in tool to help you debug coding errors while in the Research Environment. The debugger

enables you to slow down the code execution, step through the program line-by-line, and inspect the variables to

understand the internal state of the notebook.

The Research Environment debugger isn't currently available for C#.

Breakpoints

Breakpoints are lines in your notebook where execution pauses. You need at least one breakpoint in your

notebook to start the debugger. Open a project to start adjusting its breakpoints.

Add Breakpoints

Click to the left of a line to add a breakpoint on that line.

Edit Breakpoint Conditions

Follow these steps to customize what happens when a breakpoint is hit:

1. Right-click the breakpoint and then click Edit Breakpoint... .

2. Click one of the options in the following table:

Option       Additional Steps                           Description

Expression   Enter an expression and then press Enter   The breakpoint only pauses the notebook when the expression is true

Hit Count    Enter an integer and then press Enter      The breakpoint doesn't pause the notebook until it's hit the number of times you specify
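
For example, suppose a cell accumulates values in a loop. A minimal sketch: placing a breakpoint on the line inside the loop with the Expression condition i == 50 pauses execution only on that iteration instead of every pass:

PY

prices = [100.0 + 0.1 * i for i in range(200)]
total = 0.0
for i, price in enumerate(prices):
    total += price  # Breakpoint here with Expression "i == 50" pauses once
print(total)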

Enable and Disable Breakpoints

To enable a breakpoint, right-click it and then click Enable Breakpoint .

To disable a breakpoint, right-click it and then click Disable Breakpoint .

Follow these steps to enable and disable all breakpoints:


1. In the right navigation menu, click the Run and Debug icon.

2. In the Run and Debug panel, hover over the Breakpoints section and then click the Toggle Active
Breakpoints icon.

Remove Breakpoints

To remove a breakpoint, right-click it and then click Remove Breakpoint .

Follow these steps to remove all breakpoints:

1. In the right navigation menu, click the Run and Debug icon.

2. In the Run and Debug panel, hover over the Breakpoints section and then click the Remove All

Breakpoints icon.

Launch Debugger

Follow these steps to launch the debugger:

1. Open the project you want to debug.

2. Open the notebook file in your project.


3. In a notebook cell, add at least one breakpoint.

4. In the top-left corner of the cell, click the drop-down arrow and then click Debug Cell .

If the Run and Debug panel is not open, it opens when the first breakpoint is hit.

Control Debugger

After you launch the debugger, you can use the following buttons to control it:

Name         Default Keyboard Shortcut   Description

Continue                                 Continue execution until the next breakpoint

Step Over    Alt+F10                     Step to the next line of code in the current or parent scope

Step Into    Alt+F11                     Step into the definition of the function call on the current line

Restart      Shift+F11                   Restart the debugger

Disconnect   Shift+F5                    Exit the debugger

Inspect Variables

After you launch the debugger, you can inspect the state of your notebook as it executes each line of code. You can inspect local variables or custom expressions. The values of variables in your notebook are formatted in the IDE to improve readability. For example, if you inspect a variable that references a DataFrame, the debugger displays its value in a structured, readable format.

Local Variables

The Variables section of the Run and Debug panel shows the local variables at the current breakpoint. If a variable

in the panel is an object, click it to see its members. The panel updates as the notebook runs.

Follow these steps to update the value of a variable:

1. In the Run and Debug panel, right-click a variable and then click Set Value .

2. Enter the new value and then press Enter .

Custom Expressions

The Watch section of the Run and Debug panel shows any custom expressions you add. For example, you can add

an expression to show a datetime object.

Follow these steps to add a custom expression:

1. Hover over the Watch section and then click the plus icon that appears.

2. Enter an expression and then press Enter .


Meta Analysis

Meta Analysis

Meta Analysis > Key Concepts

Meta Analysis
Key Concepts

Introduction

Understanding your strategy trades in detail is key to attributing performance and determining areas to focus on for improvement. This analysis can be done with the QuantConnect API. We enable you to load backtest, optimization, and live trading results into the Research Environment.

Backtest Analysis

Load your backtest results into the Research Environment to analyze trades and easily compare them against the
raw backtesting data. For more information on loading and manipulating backtest results, see Backtest Analysis .

Optimization Analysis

Load your optimization results into the Research Environment to analyze how different combinations of parameters

affect the algorithm's performance. For more information on loading and manipulating optimizations results, see

Optimization Analysis .

Live Analysis

Load your live trading results into the Research Environment to compare live trading performance against

simulated backtest results, or analyze your trades to improve your slippage and fee models. For more information

on loading and manipulating live trading results, see Live Analysis .


Meta Analysis > Backtest Analysis

Meta Analysis
Backtest Analysis

Introduction

Load your backtest results into the Research Environment to analyze trades and easily compare them against the
raw backtesting data. Compare backtests from different projects to find uncorrelated strategies to combine for

better performance.

Loading your backtest trades allows you to plot fills against detailed data, or locate the source of profits. Similarly
you can search for periods of high churn to reduce turnover and trading fees.

Read Backtest Results

To get the results of a backtest, call the read_backtest method with the project Id and backtest Id.

PY

backtest = api.read_backtest(project_id, backtest_id)

The following table provides links to documentation that explains how to get the project Id and backtest Id,

depending on the platform you use:

Platform Project Id Backtest Id

Cloud Platform Get Project Id Get Backtest Id

Local Platform Get Project Id Get Backtest Id

CLI Get Project Id Get Backtest Id

Note that this method returns a snapshot of the backtest at the current moment. If the backtest is still executing,

the result won't include all of the backtest data.

The read_backtest method returns a Backtest object, which has the following attributes:
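
As an illustration of working with the returned object, the sketch below prints a couple of summary fields; the name and statistics attribute names are assumptions here, so verify them against the current API reference:

PY

# Hypothetical usage: print the backtest's name and its summary statistics.
print(backtest.name)
print(backtest.statistics)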

Plot Order Fills

Follow these steps to plot the daily order fills of a backtest:

1. Get the backtest orders.

PY

orders = api.read_backtest_orders(project_id, backtest_id)


The following table provides links to documentation that explains how to get the project Id and backtest Id,
depending on the platform you use:

Platform Project Id Backtest Id

Cloud Platform Get Project Id Get Backtest Id

Local Platform Get Project Id Get Backtest Id

CLI Get Project Id Get Backtest Id

The read_backtest_orders method returns a list of Order objects, which have the following properties:

2. Organize the trade times and prices for each security into a dictionary.

PY

class OrderData:
    def __init__(self):
        self.buy_fill_times = []
        self.buy_fill_prices = []
        self.sell_fill_times = []
        self.sell_fill_prices = []

order_data_by_symbol = {}

for order in [x.order for x in orders]:
    if order.symbol not in order_data_by_symbol:
        order_data_by_symbol[order.symbol] = OrderData()
    order_data = order_data_by_symbol[order.symbol]

    is_buy = order.quantity > 0
    (order_data.buy_fill_times if is_buy else order_data.sell_fill_times).append(order.last_fill_time.date())
    (order_data.buy_fill_prices if is_buy else order_data.sell_fill_prices).append(order.price)

3. Get the price history of each security you traded.

PY

qb = QuantBook()

start_date = datetime.max.date()
end_date = datetime.min.date()

for symbol, order_data in order_data_by_symbol.items():
    if order_data.buy_fill_times:
        start_date = min(start_date, min(order_data.buy_fill_times))
        end_date = max(end_date, max(order_data.buy_fill_times))

    if order_data.sell_fill_times:
        start_date = min(start_date, min(order_data.sell_fill_times))
        end_date = max(end_date, max(order_data.sell_fill_times))

start_date -= timedelta(days=3)
all_history = qb.history(list(order_data_by_symbol.keys()), start_date, end_date, Resolution.DAILY)
4. Create a candlestick plot for each security and annotate each plot with buy and sell markers.

PY

import plotly.express as px
import plotly.graph_objects as go

for symbol, order_data in order_data_by_symbol.items():
    history = all_history.loc[symbol]

    # Plot security price candlesticks
    candlestick = go.Candlestick(x=history.index,
                                 open=history['open'],
                                 high=history['high'],
                                 low=history['low'],
                                 close=history['close'],
                                 name='Price')
    layout = go.Layout(title=go.layout.Title(text=f'{symbol.value} Trades'),
                       xaxis_title='Date',
                       yaxis_title='Price',
                       xaxis_rangeslider_visible=False,
                       height=600)
    fig = go.Figure(data=[candlestick], layout=layout)

    # Plot buys
    fig.add_trace(go.Scatter(
        x=order_data.buy_fill_times,
        y=order_data.buy_fill_prices,
        marker=go.scatter.Marker(color='aqua', symbol='triangle-up', size=10),
        mode='markers',
        name='Buys',
    ))

    # Plot sells
    fig.add_trace(go.Scatter(
        x=order_data.sell_fill_times,
        y=order_data.sell_fill_prices,
        marker=go.scatter.Marker(color='indigo', symbol='triangle-down', size=10),
        mode='markers',
        name='Sells',
    ))

    fig.show()
Note: The preceding plots only show the last fill of each trade. If your trade has partial fills, the plots only

display the last fill.

Plot Metadata

Follow these steps to plot the equity curve, benchmark, and drawdown of a backtest:

1. Get the backtest instance.

PY

backtest = api.read_backtest(project_id, backtest_id)

The following table provides links to documentation that explains how to get the project Id and backtest Id,

depending on the platform you use:


Platform Project Id Backtest Id

Cloud Platform Get Project Id Get Backtest Id

Local Platform Get Project Id Get Backtest Id

CLI Get Project Id Get Backtest Id

2. Get the "Strategy Equity", "Drawdown", and "Benchmark" Chart objects.

PY

equity_chart = backtest.charts["Strategy Equity"]


drawdown_chart = backtest.charts["Drawdown"]
benchmark_chart = backtest.charts["Benchmark"]

3. Get the "Equity", "Equity Drawdown", and "Benchmark" Series from the preceding charts.

PY

equity = equity_chart.series["Equity"].values
drawdown = drawdown_chart.series["Equity Drawdown"].values
benchmark = benchmark_chart.series["Benchmark"].values

4. Create a pandas.DataFrame from the series values.

PY

df = pd.DataFrame({
    "Equity": pd.Series({value.TIME: value.CLOSE for value in equity}),
    "Drawdown": pd.Series({value.TIME: value.Y for value in drawdown}),
    "Benchmark": pd.Series({value.TIME: value.Y for value in benchmark})
}).ffill()

5. Plot the performance chart.

PY

# Create subplots to plot series on same/different plots
fig, ax = plt.subplots(2, 1, figsize=(12, 12), sharex=True, gridspec_kw={'height_ratios': [2, 1]})

# Plot the equity curve
ax[0].plot(df.index, df["Equity"])
ax[0].set_title("Strategy Equity Curve")
ax[0].set_ylabel("Portfolio Value ($)")

# Plot the benchmark on the same plot, scale by using another y-axis
ax2 = ax[0].twinx()
ax2.plot(df.index, df["Benchmark"], color="grey")
ax2.set_ylabel("Benchmark Price ($)", color="grey")

# Plot the drawdown on another plot
ax[1].plot(df.index, df["Drawdown"], color="red")
ax[1].set_title("Drawdown")
ax[1].set_xlabel("Time")
ax[1].set_ylabel("%")
The following table shows all the chart series you can plot:

Chart             Series                       Description

Strategy Equity   Equity                       Time series of the equity curve
Strategy Equity   Daily Performance            Time series of daily percentage change
Capacity          Strategy Capacity            Time series of strategy capacity snapshots
Drawdown          Equity Drawdown              Time series of equity peak-to-trough value
Benchmark         Benchmark                    Time series of the benchmark closing price (SPY, by default)
Exposure          SecurityType - Long Ratio    Time series of the overall ratio of SecurityType long positions of the whole portfolio if any SecurityType is ever in the universe
Exposure          SecurityType - Short Ratio   Time series of the overall ratio of SecurityType short positions of the whole portfolio if any SecurityType is ever in the universe
Custom Chart      Custom Series                Time series of a Series in a custom chart

Plot Insights

Follow these steps to display the insights of each asset in a backtest:

1. Get the insights.

PY

insight_response = api.read_backtest_insights(project_id, backtest_id)

The following table provides links to documentation that explains how to get the project Id and backtest Id,
depending on the platform you use:

Platform Project Id Backtest Id

Cloud Platform Get Project Id Get Backtest Id

Local Platform Get Project Id Get Backtest Id

CLI Get Project Id Get Backtest Id

The read_backtest_insights method returns an InsightResponse object, which has the following properties:

2. Organize the insights into a DataFrame.


PY

import pytz

def _eastern_time(unix_timestamp):
    return unix_timestamp.replace(tzinfo=pytz.utc)\
        .astimezone(pytz.timezone('US/Eastern')).replace(tzinfo=None)

insight_df = pd.DataFrame(
    [
        {
            'Symbol': i.symbol,
            'Direction': i.direction,
            'Generated Time': _eastern_time(i.generated_time_utc),
            'Close Time': _eastern_time(i.close_time_utc),
            'Weight': i.weight
        }
        for i in insight_response.insights
    ]
)

3. Get the price history of each security that has an insight.

PY

symbols = list(insight_df['Symbol'].unique())
qb = QuantBook()
history = qb.history(
    symbols, insight_df['Generated Time'].min()-timedelta(1),
    insight_df['Close Time'].max(), Resolution.DAILY
)['close'].unstack(0)

4. Plot the price and insights of each asset.

PY

colors = ['yellow', 'green', 'red']
fig, axs = plt.subplots(len(symbols), 1, sharex=True)
for i, symbol in enumerate(symbols):
    ax = axs[i]
    history[symbol].plot(ax=ax)
    for _, insight in insight_df[insight_df['Symbol'] == symbol].iterrows():
        ax.axvspan(
            insight['Generated Time'], insight['Close Time'],
            color=colors[insight['Direction']], alpha=0.3
        )
    ax.set_title(f'Insights for {symbol.value}')
    ax.set_xlabel('Date')
    ax.set_ylabel('Price')
plt.tight_layout()
plt.show()
Meta Analysis > Optimization Analysis

Meta Analysis
Optimization Analysis

Introduction

Load your optimization results into the Research Environment to analyze how different combinations of parameters
affect the algorithm's performance.

Read Optimization Results

To get the results of an optimization, call the read_optimization method with the optimization Id.

PY

optimization = api.read_optimization(optimization_id)

The following table provides links to documentation that explains how to get the optimization Id, depending on the

platform you use:

Platform Optimization Id

Cloud Platform Get Optimization Id

Local Platform Get Optimization Id

CLI

The read_optimization method returns an Optimization object, which has the following attributes:
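
As an example of what you can do with the result, the sketch below tabulates each tested parameter set against its reported statistics; the backtests, parameter_set, and statistics attribute names are assumptions here, so verify them against the current API reference:

PY

# Hypothetical sketch: collect parameter combinations and their statistics.
rows = [
    {'parameters': backtest.parameter_set, 'statistics': backtest.statistics}
    for backtest in optimization.backtests.values()
]
results_df = pd.DataFrame(rows)
print(results_df.head())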
Meta Analysis > Live Analysis

Meta Analysis
Live Analysis

Introduction

Load your live trading results into the Research Environment to compare live trading performance against

simulated backtest results.

Read Live Results

To get the results of a live algorithm, call the read_live_algorithm method with the project Id and deployment Id.

PY

live_algorithm = api.read_live_algorithm(project_id, deploy_id)

The following table provides links to documentation that explains how to get the project Id and deployment Id,

depending on the platform you use:

Platform Project Id Deployment Id

Cloud Platform Get Project Id Get Deployment Id

Local Platform Get Project Id Get Deployment Id

CLI Get Project Id

The read_live_algorithm method returns a LiveAlgorithmResults object, which has the following attributes:

Reconciliation

Reconciliation is a way to quantify the difference between an algorithm's live performance and its out-of-sample

(OOS) performance (a backtest run over the live deployment period).

Seeing the difference between live performance and OOS performance gives you a way to determine if the

algorithm is making unrealistic assumptions, exploiting data differences, or merely exhibiting behavior that is

impractical or impossible in live trading.

A perfectly reconciled algorithm has an exact overlap between its live equity and OOS backtest curves. Any

deviation means that the performance of the algorithm has differed for some reason. Several factors can

contribute to this, often stemming from the algorithm design.


Reconciliation is scored using two metrics: returns correlation and dynamic time warping (DTW) distance.

What is DTW Distance?

Dynamic Time Warp (DTW) Distance quantifies the difference between two time-series. It is an algorithm that

measures the shortest path between the points of two time-series. It uses Euclidean distance as a measurement of

point-to-point distance and returns an overall measurement of the distance on the scale of the initial time-series
values. We apply DTW to the returns curve of the live and OOS performance, so the DTW distance measurement is

on the scale of percent returns.

$$\mathrm{DTW}(X, Y) = \min_{\pi \in \mathcal{P}_{N \times M}} \sqrt{\sum_{l=1}^{L} \left( x_{m_l} - y_{n_l} \right)^2}$$

where $\pi = \left( (m_1, n_1), \ldots, (m_L, n_L) \right)$ ranges over the valid warping paths $\mathcal{P}_{N \times M}$ aligning the two series.
For the reasons outlined in our research notebook on the topic (linked below), QuantConnect annualizes the daily
DTW. An annualized distance provides a user with a measurement of the annual difference in the magnitude of

returns between the two curves. A perfect score is 0, meaning the returns for each day were precisely the same. A

DTW score of 0 is nearly impossible to achieve, and we consider anything below 0.2 to be a decent score. A

distance of 0.2 means the returns between an algorithm's live and OOS performance deviated by 20% over a year.

What is Returns Correlation?

Returns correlation is the simple Pearson correlation between the live and OOS returns. Correlation gives us a

rudimentary understanding of how the returns move together. Do they trend up and down at the same time? Do

they deviate in direction or timing?

$$\rho_{XY} = \frac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y}$$

An algorithm's returns correlation should be as close to 1 as possible. We consider a good score to be 0.8 or

above, meaning that there is a strong positive correlation. This indicates that the returns move together most of

the time and that for any given return you see from one of the curves, the other curve usually has a similar
direction return (positive or negative).

Why Do We Need Both DTW and Returns Correlation?

Each measurement provides insight into distinct elements of time-series similarity, but neither measurement alone

gives us the whole picture. Returns correlation tells us whether or not the live and OOS returns move together, but

it doesn't account for the possible differences in the magnitude of the returns. DTW distance measures the

difference in magnitude of returns but provides no insight into whether or not the returns move in the same
direction. It is possible for there to be two cases of equity curve similarity where both pairs have the same DTW

distance, but one has perfectly negatively correlated returns, and the other has a perfectly positive correlation.

Similarly, it is possible for two pairs of equity curves to each have perfect correlation but substantially different

DTW distance. Having both measurements provides us with a more comprehensive understanding of the actual
similarity between live and OOS performance. We outline several interesting cases and go into more depth on the

topic of reconciliation in research we have published.
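
The sketch below shows how you could compute both scores yourself for two aligned daily return series. It's a minimal example: live_returns and oos_returns are placeholder arrays, and QuantConnect's production implementation (including the annualization step) may differ:

PY

import numpy as np

def dtw_distance(x, y):
    # Classic O(n*m) dynamic program with squared point-to-point distance.
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (x[i - 1] - y[j - 1]) ** 2
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return np.sqrt(cost[n, m])

# Placeholder return series; in practice, use the live and OOS daily returns.
rng = np.random.default_rng(0)
live_returns = rng.normal(0, 0.01, 252)
oos_returns = live_returns + rng.normal(0, 0.002, 252)

returns_correlation = np.corrcoef(live_returns, oos_returns)[0, 1]
daily_dtw = dtw_distance(live_returns, oos_returns)
print(f"Returns correlation: {returns_correlation:.4f}")
print(f"Daily DTW distance: {daily_dtw:.4f}")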

Plot Order Fills

Follow these steps to plot the daily order fills of a live algorithm:

1. Get the live trading orders.

PY

orders = api.read_live_orders(project_id)

The following table provides links to documentation that explains how to get the project Id, depending on the

platform you use:

Platform Project Id

Cloud Platform Get Project Id

Local Platform Get Project Id

CLI Get Project Id

By default, the method returns the orders with an Id between 0 and 100. To get orders with an Id greater than 100, pass start and end arguments to the read_live_orders method. Note that end - start must be less than 100.

PY

orders = api.read_live_orders(project_id, 100, 150)

The read_live_orders method returns a list of Order objects, which have the following properties:

2. Organize the trade times and prices for each security into a dictionary.
PY

class OrderData:
    def __init__(self):
        self.buy_fill_times = []
        self.buy_fill_prices = []
        self.sell_fill_times = []
        self.sell_fill_prices = []

order_data_by_symbol = {}

for order in [x.order for x in orders]:
    if order.symbol not in order_data_by_symbol:
        order_data_by_symbol[order.symbol] = OrderData()
    order_data = order_data_by_symbol[order.symbol]

    is_buy = order.quantity > 0
    (order_data.buy_fill_times if is_buy else order_data.sell_fill_times).append(order.last_fill_time.date())
    (order_data.buy_fill_prices if is_buy else order_data.sell_fill_prices).append(order.price)

3. Get the price history of each security you traded.

PY

qb = QuantBook()

start_date = datetime.max.date()
end_date = datetime.min.date()

for symbol, order_data in order_data_by_symbol.items():
    if order_data.buy_fill_times:
        start_date = min(start_date, min(order_data.buy_fill_times))
        end_date = max(end_date, max(order_data.buy_fill_times))

    if order_data.sell_fill_times:
        start_date = min(start_date, min(order_data.sell_fill_times))
        end_date = max(end_date, max(order_data.sell_fill_times))

start_date -= timedelta(days=3)
all_history = qb.history(list(order_data_by_symbol.keys()), start_date, end_date, Resolution.DAILY)

4. Create a candlestick plot for each security and annotate each plot with buy and sell markers.
PY

import plotly.express as px
import plotly.graph_objects as go

for symbol, order_data in order_data_by_symbol.items():
    history = all_history.loc[symbol]

    # Plot security price candlesticks
    candlestick = go.Candlestick(x=history.index,
                                 open=history['open'],
                                 high=history['high'],
                                 low=history['low'],
                                 close=history['close'],
                                 name='Price')
    layout = go.Layout(title=go.layout.Title(text=f'{symbol.value} Trades'),
                       xaxis_title='Date',
                       yaxis_title='Price',
                       xaxis_rangeslider_visible=False,
                       height=600)
    fig = go.Figure(data=[candlestick], layout=layout)

    # Plot buys
    fig.add_trace(go.Scatter(
        x=order_data.buy_fill_times,
        y=order_data.buy_fill_prices,
        marker=go.scatter.Marker(color='aqua', symbol='triangle-up', size=10),
        mode='markers',
        name='Buys',
    ))

    # Plot sells
    fig.add_trace(go.Scatter(
        x=order_data.sell_fill_times,
        y=order_data.sell_fill_prices,
        marker=go.scatter.Marker(color='indigo', symbol='triangle-down', size=10),
        mode='markers',
        name='Sells',
    ))

    fig.show()
Note: The preceding plots only show the last fill of each trade. If your trade has partial fills, the plots only

display the last fill.

Plot Metadata

Follow these steps to plot the equity curve, benchmark, and drawdown of a live algorithm:

1. Get the live algorithm instance.

PY

live_algorithm = api.read_live_algorithm(project_id, deploy_id)

The following table provides links to documentation that explains how to get the project Id and deployment Id,

depending on the platform you use:


Platform Project Id Deployment Id

Cloud Platform Get Project Id Get Deployment Id

Local Platform Get Project Id Get Deployment Id

CLI Get Project Id

2. Get the results of the live algorithm.

PY

results = live_algorithm.live_results.results

3. Get the "Strategy Equity", "Drawdown", and "Benchmark" Chart objects.

PY

equity_chart = results.charts["Strategy Equity"]


drawdown_chart = results.charts["Drawdown"]
benchmark_chart = results.charts["Benchmark"]

4. Get the "Equity", "Equity Drawdown", and "Benchmark" Series from the preceding charts.

PY

equity = equity_chart.series["Equity"].values
drawdown = drawdown_chart.series["Equity Drawdown"].values
benchmark = benchmark_chart.series["Benchmark"].values

5. Create a pandas.DataFrame from the series values.

PY

df = pd.DataFrame({
    "Equity": pd.Series({value.TIME: value.CLOSE for value in equity}),
    "Drawdown": pd.Series({value.TIME: value.Y for value in drawdown}),
    "Benchmark": pd.Series({value.TIME: value.Y for value in benchmark})
}).ffill()

6. Plot the performance chart.


PY

# Create subplots to plot series on same/different plots
fig, ax = plt.subplots(2, 1, figsize=(12, 12), sharex=True, gridspec_kw={'height_ratios': [2, 1]})

# Plot the equity curve
ax[0].plot(df.index, df["Equity"])
ax[0].set_title("Strategy Equity Curve")
ax[0].set_ylabel("Portfolio Value ($)")

# Plot the benchmark on the same plot, scale by using another y-axis
ax2 = ax[0].twinx()
ax2.plot(df.index, df["Benchmark"], color="grey")
ax2.set_ylabel("Benchmark Price ($)", color="grey")

# Plot the drawdown on another plot
ax[1].plot(df.index, df["Drawdown"], color="red")
ax[1].set_title("Drawdown")
ax[1].set_xlabel("Time")
ax[1].set_ylabel("%")

The following table shows all the chart series you can plot:

Chart            Series                       Description
Strategy Equity  Equity                       Time series of the equity curve
Strategy Equity  Daily Performance            Time series of daily percentage change
Capacity         Strategy Capacity            Time series of strategy capacity snapshots
Drawdown         Equity Drawdown              Time series of equity peak-to-trough value
Benchmark        Benchmark                    Time series of the benchmark closing price (SPY, by default)
Exposure         SecurityType - Long Ratio    Time series of the overall ratio of SecurityType long positions of the whole portfolio, if any SecurityType is ever in the universe
Exposure         SecurityType - Short Ratio   Time series of the overall ratio of SecurityType short positions of the whole portfolio, if any SecurityType is ever in the universe
Custom Chart     Custom Series                Time series of a Series in a custom chart

Meta Analysis > Live Deployment Automation

Meta Analysis
Live Deployment Automation

Introduction

This page explains how to use the QuantConnect API in an interactive notebook to deploy and stop a set of live trading algorithms in QuantConnect Cloud.

Get Project Ids

To automate live deployments for multiple projects, save the projects under a single directory in QuantConnect

Cloud. This tutorial assumes you save all the projects under a /Live directory.

Follow these steps to get the project Ids of all the projects under the /Live directory:

1. Open a research notebook.

2. Call the list_projects method to get a list of all project responses.

PY

list_project_response = api.list_projects()

3. Obtain the project Ids for the projects in /Live directory.

PY

project_ids = [
    project.project_id for project in list_project_response.projects
    if project.name.split("/")[0] == "Live"
]

Deploy Live Algorithms

Follow these steps to programmatically deploy the preceding projects with the QuantConnect API:

1. Compile all the projects and cache the compilation Ids with a dictionary.

PY

compile_id_by_project_id = {}
for project_id in project_ids:
    compile_response = api.create_compile(project_id)
    if not compile_response.success:
        print(f"Errors compiling project {project_id}: \n{compile_response.errors}")
    else:
        compile_id_by_project_id[project_id] = compile_response.compile_id

2. Get the Ids of all the live nodes that are available and sort them by their speed.
PY

live_nodes = []
node_response = api.read_project_nodes(project_ids[0])

if not node_response.success:
    print(f"Error getting nodes: \n{node_response.errors}")
else:
    nodes = sorted(
        [node for node in node_response.nodes.live_nodes if not node.busy],
        key=lambda node: node.speed,
        reverse=True
    )
    node_ids = [node.id for node in nodes]

Check that the length of node_ids is greater than 0 to ensure live nodes are available.
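A minimal guard of our own, sketched here, stops before deploying when the list is empty:

PY

# Abort early if there is no idle live node to deploy to (our own addition).
if not node_ids:
    raise Exception("No live nodes available: stop an existing deployment or add a node.")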

3. Configure your brokerage and environment.

For example, to use the QC paper brokerage, run:

PY

base_live_algorithm_settings = {
    "id": "QuantConnectBrokerage",
    "user": "",
    "password": "",
    "environment": "paper",
    "account": ""
}
version_id = "-1"  # Master branch

4. Deploy the projects and cache the project Ids of the successful deployments.

PY

deployed_ids = []

for project_id, compile_id in compile_id_by_project_id.items():
    # Deploy live algorithm
    node_id = node_ids[len(deployed_ids)]  # Fastest node available
    live_response = api.create_live_algorithm(project_id, compile_id, node_id,
                                              base_live_algorithm_settings, version_id)

    if not live_response.success:
        print(f"Errors deploying project {project_id}: \n{live_response.errors}")
    else:
        print(f"Deployed {project_id}")
        deployed_ids.append(project_id)

Stop Live Algorithms

To stop multiple live algorithms from an interactive notebook through the QuantConnect API, call the

api.stop_live_algorithm method with each project Id.


PY

for project_id in project_ids:
    stop_response = api.stop_live_algorithm(project_id)
    if not stop_response.success:
        print(f"Errors stopping live algorithm {project_id}: \n{stop_response.errors}")
    else:
        print(f"Successfully stopped live algorithm {project_id}")
Applying Research

Applying Research

Applying Research > Key Concepts

Applying Research
Key Concepts

Introduction

The ultimate goal of research is to produce a strategy that you can backtest and eventually trade live. Once you've

developed a hypothesis that you're confident in, you can start working towards exporting your research into
backtesting. To export the code, you need to replace QuantBook() with self and replace the QuantBook methods

with their QCAlgorithm counterparts.

Workflow

Imagine that you've developed the following hypothesis: stocks that are below 1 standard deviation of their 30-day

mean are due to revert and increase in value. The following Research Environment code picks out such stocks
from a preselected basket of stocks:
PY

import numpy as np
qb = QuantBook()

symbols = {}
assets = ["SHY", "TLT", "SHV", "TLH", "EDV", "BIL",
          "SPTL", "TBT", "TMF", "TMV", "TBF", "VGSH", "VGIT",
          "VGLT", "SCHO", "SCHR", "SPTS", "GOVT"]
for i in range(len(assets)):
    symbols[assets[i]] = qb.add_equity(assets[i], Resolution.MINUTE).symbol

# Fetch history on our universe
df = qb.history(qb.securities.keys(), 30, Resolution.DAILY)

# Make all of them into a single time index.
df = df.close.unstack(level=0)

# Calculate the truth value of the most recent price being less than 1 std away from the mean
classifier = df.le(df.mean().subtract(df.std())).tail(1)

# Get indexes of the True values
classifier_indexes = np.where(classifier)[1]

# Get the Symbols for the True values
classifier = classifier.transpose().iloc[classifier_indexes].index.values

# Get the std values for the True values (used for magnitude)
magnitude = df.std().transpose()[classifier_indexes].values

# Zip together to iterate over later
selected = zip(classifier, magnitude)

Once you are confident in your hypothesis, you can export this code into the backtesting environment. The

algorithm will ultimately go long on the stocks that pass the classifier logic. One way to accommodate this model

into a backtest is to create a Scheduled Event that uses the model to pick stocks and place orders.

PY

def initialize(self) -> None:
    self.set_start_date(2014, 1, 1)
    self.set_cash(1000000)
    self.set_benchmark("SPY")

    self.set_portfolio_construction(EqualWeightingPortfolioConstructionModel())
    self.set_execution(ImmediateExecutionModel())

    self.assets = ["IEF", "SHY", "TLT", "IEI", "SHV", "TLH", "EDV", "BIL",
                   "SPTL", "TBT", "TMF", "TMV", "TBF", "VGSH", "VGIT",
                   "VGLT", "SCHO", "SCHR", "SPTS", "GOVT"]

    self.symbols = {}

    # Add Equity ------------------------------------------------
    for i in range(len(self.assets)):
        self.symbols[self.assets[i]] = self.add_equity(self.assets[i], Resolution.MINUTE).symbol

    # Set the Scheduled Event method
    self.schedule.on(self.date_rules.every(DayOfWeek.MONDAY),
                     self.time_rules.after_market_open("IEF", 1),
                     self.every_day_after_market_open)

Now that the initialize method of the algorithm is set, export the model into the Scheduled Event method. You just need to replace qb with self and replace QuantBook methods with their QCAlgorithm counterparts. In this example, you don't need to switch any methods because the model only uses methods that exist in QCAlgorithm.
PY

def every_day_after_market_open(self):
    qb = self
    # Fetch history on our universe
    df = qb.history(qb.securities.keys(), 5, Resolution.DAILY)

    # Make all of them into a single time index.
    df = df.close.unstack(level=0)

    # Calculate the truth value of the most recent price being less than 1 std away from the mean
    classifier = df.le(df.mean().subtract(df.std())).tail(1)

    # Get indexes of the True values
    classifier_indexes = np.where(classifier)[1]

    # Get the Symbols for the True values
    classifier = classifier.transpose().iloc[classifier_indexes].index.values

    # Get the std values for the True values (used for magnitude)
    magnitude = df.std().transpose()[classifier_indexes].values

    # Zip together to iterate over later
    selected = zip(classifier, magnitude)

    # ==============================

    insights = []
    for symbol, magnitude in selected:
        insights.append(Insight.price(symbol, timedelta(days=5), InsightDirection.UP, magnitude))

    self.emit_insights(insights)

With the Research Environment model now in the backtesting environment, you can further analyze its performance with its backtesting metrics. If you are confident in the backtest, you can eventually live trade this strategy.

To view full examples of this Research to Production workflow, see the examples in the menu.

Contribute Tutorials

If you contribute Research to Production tutorials, you'll get the following benefits:

A QCC reward
You'll learn the Research to Production methodology to improve your own strategy research and development
Your contribution will be featured in the community forum

To view the topics the community wants Research to Production tutorials for, see the issues with the WishList tag in the Research GitHub repository. If you find a topic you want to create a tutorial for, make a pull request to the repository with your tutorial and we will review it.

To request new tutorial topics, contact us.


Applying Research > Mean Reversion

Applying Research
Mean Reversion

Introduction

This page explains how you can use the Research Environment to develop and test a Mean Reversion hypothesis, then put the hypothesis in production.

Create Hypothesis

Imagine that we've developed the following hypothesis: stocks that are below 1 standard deviation of their 30-day mean are due to revert and increase in value, with roughly an 85% chance if we assume the return series is stationary and the price series is a random process. We've developed the following code in research to pick out such stocks from a preselected basket of stocks.

Import Libraries

We'll need to import libraries to help with data processing. Import the numpy and scipy libraries as follows:

PY

import numpy as np
from scipy.stats import norm, zscore

Get Historical Data

To begin, we retrieve historical data for researching.

1. Instantiate a QuantBook .

PY

qb = QuantBook()

2. Select the desired tickers for research.

PY

assets = ["SHY", "TLT", "SHV", "TLH", "EDV", "BIL",


"SPTL", "TBT", "TMF", "TMV", "TBF", "VGSH", "VGIT",
"VGLT", "SCHO", "SCHR", "SPTS", "GOVT"]

3. Call the add_equity method with the tickers, and their corresponding resolution.

PY

for i in range(len(assets)):
    qb.add_equity(assets[i], Resolution.MINUTE)

If you do not pass a resolution argument, Resolution.MINUTE is used by default.

4. Call the history method with qb.securities.keys for all tickers, time argument(s), and resolution to request

historical data for the symbol.

PY

history = qb.history(qb.securities.keys(), datetime(2021, 1, 1), datetime(2021, 12, 31),
                     Resolution.DAILY)

Prepare Data

We'll have to process our data to get a signal of how far each ticker has deviated from its norm.

1. Select the close column and then call the unstack method.

PY

df = history['close'].unstack(level=0)

2. Calculate the truth value of the most recent price being less than 1 standard deviation away from the mean

price.

PY

classifier = df.le(df.rolling(30).mean() - df.rolling(30).std())

3. Get the z-score for the True values, then compute the expected return and probability (used for Insight

magnitude and confidence).

PY

z_score = df.apply(zscore)[classifier]
magnitude = -z_score * df.rolling(30).std() / df.shift(1)
confidence = (-z_score).apply(norm.cdf)
4. Call fillna to fill NaNs with 0.

PY

magnitude.fillna(0, inplace=True)
confidence.fillna(0, inplace=True)

5. Get our trading weight. We take a long-only portfolio, normalized so the total weight equals 1.

PY

weight = confidence - 1 / (magnitude + 1)
weight = weight[weight > 0].fillna(0)
sum_ = np.sum(weight, axis=1)
for i in range(weight.shape[0]):
    if sum_[i] > 0:
        weight.iloc[i] = weight.iloc[i] / sum_[i]
    else:
        weight.iloc[i] = 0
weight = weight.iloc[:-1]

Test Hypothesis

We would test the performance of this strategy. To do so, we would make use of the calculated weight for portfolio
optimization.

1. Get the total daily return series.

PY

ret = pd.Series(index=range(df.shape[0] - 1))
for i in range(df.shape[0] - 1):
    ret[i] = weight.iloc[i] @ df.pct_change().iloc[i + 1].T

2. Call cumprod to get the cumulative return.

PY

total_ret = (ret + 1).cumprod()

3. Set index for visualization.

PY

total_ret.index = weight.index

4. Display the result.

PY

total_ret.plot(title='Strategy Equity Curve', figsize=(15, 10))
plt.show()
Set Up Algorithm

Once we are confident in our hypothesis, we can export this code into backtesting. One way to accommodate this model in a backtest is to create a scheduled event that uses our model to pick stocks and go long.

PY

def initialize(self) -> None:

    #1. Required: Five years of backtest history
    self.set_start_date(2014, 1, 1)

    #2. Required: Alpha Streams Models:
    self.set_brokerage_model(BrokerageName.ALPHA_STREAMS)

    #3. Required: Significant AUM Capacity
    self.set_cash(1000000)

    #4. Required: Benchmark to SPY
    self.set_benchmark("SPY")

    self.set_portfolio_construction(InsightWeightingPortfolioConstructionModel())
    self.set_execution(ImmediateExecutionModel())

    self.assets = ["SHY", "TLT", "IEI", "SHV", "TLH", "EDV", "BIL",
                   "SPTL", "TBT", "TMF", "TMV", "TBF", "VGSH", "VGIT",
                   "VGLT", "SCHO", "SCHR", "SPTS", "GOVT"]

    # Add Equity ------------------------------------------------
    for i in range(len(self.assets)):
        self.add_equity(self.assets[i], Resolution.MINUTE)

    # Set Scheduled Event Method For Our Model
    self.schedule.on(self.date_rules.every_day(),
                     self.time_rules.before_market_close("SHY", 5),
                     self.every_day_before_market_close)

Now we export our model into the scheduled event method. We replace qb with self and replace methods with their QCAlgorithm counterparts as needed. In this example, this is not an issue because all the methods we used in research also exist in QCAlgorithm.

PY

def EveryDayBeforeMarketClose(self) -> None:
    qb = self
    # Fetch history on our universe
    df = qb.History(qb.Securities.Keys, 30, Resolution.Daily)
    if df.empty: return

    # Make all of them into a single time index.
    df = df.close.unstack(level=0)

    # Calculate the truth value of the most recent price being less than 1 std away from the mean
    classifier = df.le(df.mean().subtract(df.std())).iloc[-1]
    if not classifier.any(): return

    # Get the z-score for the True values, then compute the expected return and probability
    z_score = df.apply(zscore)[[classifier.index[i] for i in range(classifier.size) if classifier.iloc[i]]]
    magnitude = -z_score * df.std() / df
    confidence = (-z_score).apply(norm.cdf)

    # Get the latest values
    magnitude = magnitude.iloc[-1].fillna(0)
    confidence = confidence.iloc[-1].fillna(0)

    # Get the weights, then zip together to iterate over later
    weight = confidence - 1 / (magnitude + 1)
    weight = weight[weight > 0].fillna(0)
    sum_ = np.sum(weight)
    if sum_ > 0:
        weight = weight / sum_
        selected = zip(weight.index, magnitude, confidence, weight)
    else:
        return

    # ==============================

    insights = []
    for symbol, magnitude, confidence, weight in selected:
        insights.append(Insight.Price(symbol, timedelta(days=1), InsightDirection.Up, magnitude,
                                      confidence, None, weight))

    self.EmitInsights(insights)

Clone Example Project


Applying Research > Random Forest Regression

Applying Research
Random Forest Regression

Introduction

This page explains how you can use the Research Environment to develop and test a Random Forest Regression hypothesis, then put the hypothesis in production.

Create Hypothesis

We've assumed the price data is a time series with some autoregressive property (i.e., its expectation is related to past price information). Therefore, by using past information, we could predict the next price level. One way to do so is Random Forest Regression, a supervised machine learning algorithm whose parameters are fitted in a non-linear, high-dimensional space. A quick check of the autoregressive assumption is sketched below.
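Before building the model, a simple sanity check is to inspect the lag-1 autocorrelation of a close-price series. This is a minimal sketch of our own, where close_series is a hypothetical pandas Series of daily closes:

PY

# Lag-1 autocorrelation of a daily close series (close_series is hypothetical);
# values near 1 suggest the next price depends strongly on the current price.
lag1 = close_series.autocorr(lag=1)
print(f"Lag-1 autocorrelation: {lag1:.4f}")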

Import Libraries

We'll need to import libraries to help with data processing and machine learning. Import the sklearn, numpy and matplotlib libraries as follows:

PY

from sklearn.ensemble import RandomForestRegressor


import numpy as np
from matplotlib import pyplot as plt

Get Historical Data

To begin, we retrieve historical data for researching.

1. Instantiate a QuantBook .

PY

qb = QuantBook()

2. Select the desired tickers for research.

PY

symbols = {}
assets = ["SHY", "TLT", "SHV", "TLH", "EDV", "BIL",
"SPTL", "TBT", "TMF", "TMV", "TBF", "VGSH", "VGIT",
"VGLT", "SCHO", "SCHR", "SPTS", "GOVT"]

3. Call the add_equity method with the tickers, and their corresponding resolution. Then store their Symbol s.
PY

for i in range(len(assets)):
    symbols[assets[i]] = qb.add_equity(assets[i], Resolution.MINUTE).symbol

If you do not pass a resolution argument, Resolution.MINUTE is used by default.

4. Call the history method with qb.securities.keys for all tickers, time argument(s), and resolution to request

historical data for the symbol.

PY

history = qb.history(qb.securities.keys(), datetime(2019, 1, 1), datetime(2021, 12, 31),
                     Resolution.DAILY)

Prepare Data

We'll have to process our data and build the ML model before testing the hypothesis. Our methodology is to use the fractionally differenced close price as the input data in order to (1) provide stationarity, and (2) retain a sufficient amount of variance from the previous price information. We assume d=0.5 strikes the right balance, as the sketch below shows.
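For reference, the fixed-window fractional-difference weights follow the binomial expansion w_0 = 1, w_k = -w_{k-1}(d - k + 1) / k. Truncating at two terms with d = 0.5 gives weights [1, -0.5], i.e. price - 0.5 × lagged price, which is exactly the df.diff() * 0.5 + df * 0.5 expression used in step 2 below. This derivation is our own sketch, not part of the original code:

PY

def frac_diff_weights(d, size):
    # Binomial-expansion weights of (1 - B)^d, truncated to `size` terms.
    weights = [1.0]
    for k in range(1, size):
        weights.append(-weights[-1] * (d - k + 1) / k)
    return weights

print(frac_diff_weights(0.5, 2))  # [1.0, -0.5]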

1. Select the close column and then call the unstack method.

PY

df = history['close'].unstack(level=0)

2. Feature engineer the data as fractional differencing for input.

PY

input_ = df.diff() * 0.5 + df * 0.5
input_ = input_.iloc[1:]

3. Shift the data 1 step backward to use as the training output.


PY

output = df.shift(-1).iloc[:-1]

4. Split the data into training and testing sets.

PY

splitter = int(input_.shape[0] * 0.8)
X_train = input_.iloc[:splitter]
X_test = input_.iloc[splitter:]
y_train = output.iloc[:splitter]
y_test = output.iloc[splitter:]

5. Initialize a Random Forest Regressor.

PY

regressor = RandomForestRegressor(n_estimators=100, min_samples_split=5, random_state = 1990)

6. Fit the regressor.

PY

regressor.fit(X_train, y_train)

Test Hypothesis

We'll test whether this ML model can precisely predict the 1-step-forward price. To do so, we compare the predicted and actual prices.

1. Predict the testing set.

PY

predictions = regressor.predict(X_test)

2. Convert result into DataFrame .

PY

predictions = pd.DataFrame(predictions, index=y_test.index, columns=y_test.columns)

3. Plot the result for comparison.

PY

for col in y_test.columns:
    plt.figure(figsize=(15, 10))
    y_test[col].plot(label="Actual")
    predictions[col].plot(label="Prediction")
    plt.title(f"{col} Regression Result")
    plt.legend()
    plt.show()
    plt.clf()

For more plots, please clone the project and run the notebook.

Set Up Algorithm

Once we are confident in our hypothesis, we can export this code into backtesting. One way to accommodate this model in a backtest is to create a scheduled event that uses our model to predict the expected return. Since we can calculate the expected return, we'll use Mean-Variance Optimization for portfolio construction.
PY

def initialize(self) -> None:

    #1. Required: Five years of backtest history
    self.set_start_date(2014, 1, 1)

    #2. Required: Alpha Streams Models:
    self.set_brokerage_model(BrokerageName.ALPHA_STREAMS)

    #3. Required: Significant AUM Capacity
    self.set_cash(1000000)

    #4. Required: Benchmark to SPY
    self.set_benchmark("SPY")

    self.set_portfolio_construction(MeanVarianceOptimizationPortfolioConstructionModel(portfolio_bias=PortfolioBias.LONG,
                                                                                       period=252))
    self.set_execution(ImmediateExecutionModel())

    self.assets = ["SHY", "TLT", "IEI", "SHV", "TLH", "EDV", "BIL",
                   "SPTL", "TBT", "TMF", "TMV", "TBF", "VGSH", "VGIT",
                   "VGLT", "SCHO", "SCHR", "SPTS", "GOVT"]

    # Add Equity ------------------------------------------------
    for i in range(len(self.assets)):
        self.add_equity(self.assets[i], Resolution.MINUTE)

    # Initialize the timer to train the Machine Learning model
    self.time = datetime.min

    # Set Scheduled Event Method For Our Model
    self.schedule.on(self.date_rules.every_day(),
                     self.time_rules.before_market_close("SHY", 5),
                     self.every_day_before_market_close)

We'll also need to create a function to train and update our model from time to time.

PY

def build_model(self) -> None:
    # Initialize the Random Forest Regressor
    self.regressor = RandomForestRegressor(n_estimators=100, min_samples_split=5, random_state=1990)

    # Get historical data
    history = self.history(self.securities.keys, 360, Resolution.DAILY)

    # Select the close column and then call the unstack method.
    df = history['close'].unstack(level=0)

    # Feature engineer the data for input.
    input_ = df.diff() * 0.5 + df * 0.5
    input_ = input_.iloc[1:].ffill().fillna(0)

    # Shift the data for 1-step backward as training output result.
    output = df.shift(-1).iloc[:-1].ffill().fillna(0)

    # Fit the regressor
    self.regressor.fit(input_, output)

Now we export our model into the scheduled event method. We replace qb with self and replace methods with their QCAlgorithm counterparts as needed. In this example, this is not an issue because all the methods we used in research also exist in QCAlgorithm.


PY

def EveryDayBeforeMarketClose(self) -> None:
    # Retrain the regressor every month
    if self.time < self.Time:
        self.BuildModel()
        self.time = Expiry.EndOfMonth(self.Time)

    qb = self
    # Fetch history on our universe
    df = qb.History(qb.Securities.Keys, 2, Resolution.Daily)
    if df.empty: return

    # Make all of them into a single time index.
    df = df.close.unstack(level=0)

    # Feature engineer the data for input
    input_ = df.diff() * 0.5 + df * 0.5
    input_ = input_.iloc[-1].fillna(0).values.reshape(1, -1)

    # Predict the expected price
    predictions = self.regressor.predict(input_)

    # Get the expected return
    predictions = (predictions - df.iloc[-1].values) / df.iloc[-1].values
    predictions = predictions.flatten()

    # ==============================

    insights = []
    for i in range(len(predictions)):
        insights.append(Insight.Price(self.assets[i], timedelta(days=1), InsightDirection.Up, predictions[i]))

    self.EmitInsights(insights)

Clone Example Project



Applying Research > Uncorrelated Assets

Applying Research
Uncorrelated Assets

Introduction

This page explains how you can use the Research Environment to develop and test an Uncorrelated Assets hypothesis, then put the hypothesis in production.

Create Hypothesis

According to Modern Portfolio Theory, asset combinations with negative or very low correlation can have lower total portfolio variance for the same level of return. Thus, uncorrelated assets allow you to find a portfolio that will, theoretically, be more diversified and resilient to extreme market events. We're testing this statement in a real-life scenario, hypothesizing that a portfolio of uncorrelated assets could be a more consistent portfolio. In this example, we'll compare the performance of a 5-least-correlated-asset portfolio (proposed) and a 5-most-correlated-asset portfolio (benchmark), both equally weighted. A quick numerical illustration of the underlying variance formula follows.
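The diversification claim follows from the two-asset portfolio variance formula, σ_p² = w₁²σ₁² + w₂²σ₂² + 2w₁w₂ρσ₁σ₂, which shrinks as the correlation ρ falls. A minimal sketch of our own with made-up volatilities:

PY

import numpy as np

def portfolio_vol(w1, w2, vol1, vol2, corr):
    # Standard two-asset portfolio volatility.
    variance = (w1 * vol1)**2 + (w2 * vol2)**2 + 2 * w1 * w2 * corr * vol1 * vol2
    return np.sqrt(variance)

# Equal-weight portfolio of two 20%-vol assets: volatility falls from
# 20% at corr = 1.0 to 10% at corr = -0.5.
for corr in [1.0, 0.5, 0.0, -0.5]:
    print(corr, round(portfolio_vol(0.5, 0.5, 0.2, 0.2, corr), 4))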

Import Libraries

We'll need to import libraries to help with data processing and visualization. Import the numpy and matplotlib libraries as follows:

PY

import numpy as np
from matplotlib import pyplot as plt

Get Historical Data

To begin, we retrieve historical data for researching.

1. Instantiate a QuantBook .

PY

qb = QuantBook()

2. Select the desired tickers for research.

PY

assets = ["SHY", "TLT", "SHV", "TLH", "EDV", "BIL",


"SPTL", "TBT", "TMF", "TMV", "TBF", "VGSH", "VGIT",
"VGLT", "SCHO", "SCHR", "SPTS", "GOVT"]

3. Call the add_equity method with the tickers, and their corresponding resolution.
PY

for i in range(len(assets)):
    qb.add_equity(assets[i], Resolution.MINUTE)

If you do not pass a resolution argument, Resolution.MINUTE is used by default.

4. Call the history method with qb.securities.keys for all tickers, time argument(s), and resolution to request

historical data for the symbol.

PY

history = qb.history(qb.securities.keys(), datetime(2021, 1, 1), datetime(2021, 12, 31),
                     Resolution.DAILY)

Prepare Data

We'll have to process our data to get their correlation and select the least and most related ones.

1. Select the close column and then call the unstack method, then call pct_change to compute the daily return.

PY

returns = history['close'].unstack(level=0).pct_change().iloc[1:]

2. Write a function to obtain the least and most correlated 5 assets.


PY

def get_uncorrelated_assets(returns, num_assets):
    # Get correlation
    correlation = returns.corr()

    # Find assets with lowest and highest absolute sum correlation
    selected = []
    for index, row in correlation.items():
        corr_rank = row.abs().sum()
        selected.append((index, corr_rank))

    # Sort and take the top num_assets
    sort_ = sorted(selected, key=lambda x: x[1])
    uncorrelated = sort_[:num_assets]
    correlated = sort_[-num_assets:]

    return uncorrelated, correlated

selected, benchmark = get_uncorrelated_assets(returns, 5)

Test Hypothesis

To test the hypothesis, our desired outcome is a consistent, low-fluctuation equity curve compared with the benchmark.

1. Construct an equal-weighting portfolio for the 5-uncorrelated-asset portfolio and the 5-correlated-asset portfolio (benchmark).

PY

port_ret = returns[[x[0] for x in selected]] / 5
bench_ret = returns[[x[0] for x in benchmark]] / 5

2. Call cumprod to get the cumulative return.

PY

total_ret = (np.sum(port_ret, axis=1) + 1).cumprod()
total_ret_bench = (np.sum(bench_ret, axis=1) + 1).cumprod()

3. Plot the result.

PY

plt.figure(figsize=(15, 10))
total_ret.plot(label='Proposed')
total_ret_bench.plot(label='Benchmark')
plt.title('Equity Curve')
plt.legend()
plt.show()

We can clearly see from the results that the proposed uncorrelated-asset portfolio has lower variance/fluctuation and is thus more consistent than the benchmark. This supports our hypothesis.

Set Up Algorithm

Once we are confident in our hypothesis, we can export this code into backtesting. One way to accommodate this model in a backtest is to create a scheduled event that uses our model to pick stocks and go long.
PY

def initialize(self) -> None:

    #1. Required: Five years of backtest history
    self.set_start_date(2014, 1, 1)

    #2. Required: Alpha Streams Models:
    self.set_brokerage_model(BrokerageName.ALPHA_STREAMS)

    #3. Required: Significant AUM Capacity
    self.set_cash(1000000)

    #4. Required: Benchmark to SPY
    self.set_benchmark("SPY")

    self.set_portfolio_construction(EqualWeightingPortfolioConstructionModel())
    self.set_execution(ImmediateExecutionModel())

    self.assets = ["SHY", "TLT", "IEI", "SHV", "TLH", "EDV", "BIL",
                   "SPTL", "TBT", "TMF", "TMV", "TBF", "VGSH", "VGIT",
                   "VGLT", "SCHO", "SCHR", "SPTS", "GOVT"]

    # Add Equity ------------------------------------------------
    for i in range(len(self.assets)):
        self.add_equity(self.assets[i], Resolution.MINUTE)

    # Set Scheduled Event Method For Our Model. In this example, we'll rebalance every month.
    self.schedule.on(self.date_rules.month_start(),
                     self.time_rules.before_market_close("SHY", 5),
                     self.every_day_before_market_close)

Now we export our model into the scheduled event method. We replace qb with self and replace methods with their QCAlgorithm counterparts as needed. In this example, this is not an issue because all the methods we used in research also exist in QCAlgorithm.


PY

def every_day_before_market_close(self) -> None:
    qb = self
    # Fetch history on our universe
    history = qb.history(qb.securities.keys(), 252*2, Resolution.DAILY)
    if history.empty: return

    # Select the close column, call the unstack method, then call pct_change to compute the daily return.
    returns = history['close'].unstack(level=0).pct_change().iloc[1:]

    # Get correlation
    correlation = returns.corr()

    # Find 5 assets with lowest absolute sum correlation
    selected = []
    for index, row in correlation.items():
        corr_rank = row.abs().sum()
        selected.append((index, corr_rank))

    sort_ = sorted(selected, key=lambda x: x[1])
    selected = [x[0] for x in sort_[:5]]

    # ==============================

    insights = []
    for symbol in selected:
        insights.append(Insight.price(symbol, Expiry.END_OF_MONTH, InsightDirection.UP))

    self.emit_insights(insights)

Clone Example Project



Applying Research > Kalman Filters and Stat Arb

Applying Research
Kalman Filters and Stat Arb

Introduction

This page explains how you can use the Research Environment to develop and test a Kalman Filters and Statistical Arbitrage hypothesis, then put the hypothesis in production.

Create Hypothesis

In finance, we can often observe that 2 stocks with similar backgrounds and fundamentals (e.g. AAPL vs MSFT, SPY vs QQQ) move in a similar manner. They could be correlated, although not necessarily, but their price difference/sum (spread) is stationary. We call this cointegration. Thus, we could hypothesize that an extreme spread provides a chance for arbitrage, much like a mean reversion of the spread. This is known as pairs trading. Likewise, this can be applied to more than 2 assets, which is known as statistical arbitrage.

However, although the fluctuation of the spread is stationary, the mean of the spread can change over time for various reasons. Thus, it is important to update our expectation of the spread in order to enter and exit the market in time, as the profit margin of this type of short-window trading is tight. A Kalman Filter comes in handy here. We can consider it an updater of the expectation of the underlying return Markov chain, while assuming the price series is a random process.

In this example, we hypothesize that trading the spread of cointegrated assets is profitable. We'll be using the forex pairs EURUSD, GBPUSD, USDCAD, USDHKD and USDJPY for this example, skipping the normalized price difference selection. The synthetic sketch below illustrates what a cointegrated pair looks like.
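To illustrate the idea before touching real data, the following sketch (entirely synthetic, of our own making) simulates a cointegrated pair: both series share a random-walk component, so each price is non-stationary, yet their difference (the spread) is stationary noise:

PY

import numpy as np

np.random.seed(0)
common = np.cumsum(np.random.normal(0, 1, 1000))   # Shared random-walk component
asset_a = common + np.random.normal(0, 0.5, 1000)  # Non-stationary price
asset_b = common + np.random.normal(0, 0.5, 1000)  # Non-stationary price
spread = asset_a - asset_b                         # Stationary around 0

print(spread.mean(), spread.std())  # Hovers near 0 with stable variance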

Import Libraries

We'll need to import libraries to help with data processing, model building, validation and visualization. Import the arch, pykalman, scipy, statsmodels, numpy, matplotlib and pandas libraries as follows:

PY

from arch.unitroot.cointegration import engle_granger


from pykalman import KalmanFilter
from scipy.optimize import minimize
from statsmodels.tsa.vector_ar.vecm import VECM

import numpy as np
from matplotlib import pyplot as plt
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()

Get Historical Data

To begin, we retrieve historical data for researching.


1. Instantiate a QuantBook .

PY

qb = QuantBook()

2. Select the desired tickers for research.

PY

assets = ["EURUSD", "GBPUSD", "USDCAD", "USDHKD", "USDJPY"]

3. Call the add_forex method with the tickers, and their corresponding resolution. Then store their Symbol s.

PY

for i in range(len(assets)):
    qb.add_forex(assets[i], Resolution.MINUTE)

If you do not pass a resolution argument, Resolution.MINUTE is used by default.

4. Call the history method with qb.securities.keys for all tickers, time argument(s), and resolution to request

historical data for the symbol.

PY

history = qb.history(qb.securities.keys(), datetime(2021, 1, 1), datetime(2021, 12, 31),
                     Resolution.DAILY)

Cointegration

We'll have to test if the assets are cointegrated. If so, we'll have to obtain the cointegration vector(s).

Cointegration Testing

1. Select the close column and then call the unstack method.

PY

data = history['close'].unstack(level=0)

2. Call np.log to convert the close prices into log-price series to eliminate the compounding effect.

PY

log_price = np.log(data)

3. Apply Engle Granger Test to check if the series are cointegrated.

PY

coint_result = engle_granger(log_price.iloc[:, 0], log_price.iloc[:, 1:], trend='n', method='bic')

It shows a p-value < 0.05 for the unit-root test, with lag level 0. This indicates the log-price series are cointegrated: the spread of the 5 forex pairs is stationary.

Get Cointegration Vectors

We would use a VECM model to obtain the cointegrated vectors.

1. Initialize a VECM model by following the unit test parameters, then fit to our data.

PY

vecm_result = VECM(log_price, k_ar_diff=0, coint_rank=len(assets)-1, deterministic='n').fit()

2. Obtain the Beta attribute. This is the cointegration subspaces' unit vectors.

PY

beta = vecm_result.beta

3. Check the spread of different cointegration subspaces.

PY

spread = log_price @ beta

4. Plot the results.


PY

fig, axs = plt.subplots(beta.shape[1], figsize=(15, 15))


fig.suptitle('Spread for various cointegrating vectors')
for i in range(beta.shape[1]):
axs[i].plot(spread.iloc[:, i])
axs[i].set_title(f"The {i+1}th normalized cointegrating subspace")
plt.show()

Optimization of Cointegration Subspaces

Although the 4 cointegration subspaces do not look stationary, we can optimize for a mean-reverting portfolio by putting different weights on the different subspaces. We use the Portmanteau statistic as a proxy for mean reversion. So we formulate:


$$\begin{aligned}
\underset{w}{\text{minimize}} \quad & \left( \frac{w^T M_1 w}{w^T M_0 w} \right)^2 \\
\text{subject to} \quad & w^T M_0 w = \nu \\
& \mathbf{1}^T w = 0
\end{aligned}$$

where $M_i \triangleq \operatorname{Cov}(s_t, s_{t+i}) = E[(s_t - E[s_t])(s_{t+i} - E[s_{t+i}])^T]$,

with $s$ the spread and $\nu$ a predetermined desirable variance level (the larger, the higher the profit, but the lower the trading frequency).

1. Constrain the weight on each vector to be between -1 and 1, while the overall sum is 0.

PY

x0 = np.array([(-1)**i / beta.shape[1] for i in range(beta.shape[1])])
bounds = tuple((-1, 1) for i in range(beta.shape[1]))
constraints = [{'type': 'eq', 'fun': lambda x: np.sum(x)}]

2. Optimize the Portmanteau statistics.

PY

opt = minimize(lambda w: ((w.T @ np.cov(spread.T, spread.shift(1).fillna(0).T)[spread.shape[1]:, :spread.shape[1]] @ w) / (w.T @ np.cov(spread.T) @ w))**2,
               x0=x0,
               bounds=bounds,
               constraints=constraints,
               method="SLSQP")

3. Normalize the result.

PY

opt.x = opt.x / np.sum(abs(opt.x))
for i in range(len(opt.x)):
    print(f"The weight put on {i+1}th normalized cointegrating subspace: {opt.x[i]}")

4. Plot the weighted spread.

PY

new_spread = spread @ opt.x
new_spread.plot(title="Weighted spread", figsize=(15, 10))
plt.ylabel("Spread")
plt.show()
Kalman Filter

The weighted spread looks more stationary. However, the half-life of its fluctuations across zero is very long, and we aim to trade as frequently as possible to maximize the profit of this strategy. This is where the Kalman Filter comes into play: it can update the expectation of the next step by smoothing between the predicted and observed probability distributions of the return.

Image Source: Understanding Kalman Filters, Part 3: An Optimal State Estimator. Melda Ulusoy (2017). MathWorks. Retrieved from: https://www.mathworks.com/videos/understanding-kalman-filters-part-3-optimal-state-estimator--1490710645421.html
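For intuition, the scalar filter used here blends its prior mean with each new observation through the Kalman gain. In standard textbook notation for this 1-dimensional, identity-transition model (not specific to this library):

$$\hat{x}_t = \hat{x}_{t-1} + K_t \, (z_t - \hat{x}_{t-1}), \qquad K_t = \frac{P_{t-1} + Q}{P_{t-1} + Q + R}$$

where $z_t$ is the observed spread, $P$ the state covariance, $Q$ the transition covariance, and $R$ the observation covariance. A large $R$ shrinks the gain so the mean estimate adapts slowly; a large $Q$ does the opposite.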
1. Initialize a KalmanFilter .

In this example, we use the first 20 data points to optimize its initial state. We assume the market has no regime change, so the transition matrix and observation matrix are [1].

PY

kalmanFilter = KalmanFilter(transition_matrices=[1],
                            observation_matrices=[1],
                            initial_state_mean=new_spread.iloc[:20].mean(),
                            observation_covariance=new_spread.iloc[:20].var(),
                            em_vars=['transition_covariance', 'initial_state_covariance'])
kalmanFilter = kalmanFilter.em(new_spread.iloc[:20], n_iter=5)
(filtered_state_means, filtered_state_covariances) = kalmanFilter.filter(new_spread.iloc[:20])

2. Obtain the current Mean and Covariance Matrix expectations.

PY

currentMean = filtered_state_means[-1, :]
currentCov = filtered_state_covariances[-1, :]

3. Initialize a mean series for spread normalization using the KalmanFilter 's results.

PY

mean_series = np.array([None]*(new_spread.shape[0]-100))

4. Roll over the Kalman Filter to obtain the mean series.

PY

for i in range(100, new_spread.shape[0]):
    (currentMean, currentCov) = kalmanFilter.filter_update(filtered_state_mean=currentMean,
                                                           filtered_state_covariance=currentCov,
                                                           observation=new_spread.iloc[i])
    mean_series[i-100] = float(currentMean)

5. Obtain the normalized spread series.

PY

normalized_spread = (new_spread.iloc[100:] - mean_series)

6. Plot the normalized spread series.

PY

plt.figure(figsize=(15, 10))
plt.plot(normalized_spread, label="Processed spread")
plt.title("Normalized spread series")
plt.ylabel("Spread - Expectation")
plt.legend()
plt.show()
Determine Trading Threshold

Now we need to determine the entry threshold. We want to maximize (profit from each trade, i.e. the variance of the spread) × (frequency of entry). To do so, we formulate:

$$\underset{f}{\text{minimize}} \quad \| \bar{f} - f \|_2^2 + \lambda \| Df \|_2^2$$

where $\bar{f}_j = \frac{1}{T} \sum_{t=1}^{T} \mathbf{1}\{\text{spread}_t > \text{set level}_j\}$ and

$$D = \begin{bmatrix} 1 & -1 & & \\ & 1 & -1 & \\ & & \ddots & \ddots \\ & & & 1 \ \ -1 \end{bmatrix} \in \mathbb{R}^{(j-1) \times j}$$

so $f^* = (I + \lambda D^T D)^{-1} \bar{f}$

1. Initialize 50 set levels for testing.

PY

s0 = np.linspace(0, max(normalized_spread), 50)

2. Calculate the profit levels using the 50 set levels.

PY

f_bar = np.array([None]*50)
for i in range(50):
    f_bar[i] = len(normalized_spread.values[normalized_spread.values > s0[i]]) / normalized_spread.shape[0]
3. Set trading frequency matrix.

PY

D = np.zeros((49, 50))
for i in range(D.shape[0]):
    D[i, i] = 1
    D[i, i+1] = -1

4. Set level of lambda.

PY

l = 1.0

5. Obtain the normalized profit level.

PY

f_star = np.linalg.inv(np.eye(50) + l * D.T@D) @ f_bar.reshape(-1, 1)
s_star = [f_star[i]*s0[i] for i in range(50)]

6. Get the maximum profit level as threshold.

PY

threshold = s0[s_star.index(max(s_star))]
print(f"The optimal threshold is {threshold}")

7. Plot the result.

PY

plt.figure(figsize=(15, 10))
plt.plot(s0, s_star)
plt.title("Profit of mean-revertion trading")
plt.xlabel("Threshold")
plt.ylabel("Profit")
plt.show()
Test Hypothesis

To test the hypothesis, we wish to obtain a profitable strategy.

1. Set the trading weight. We would like the portfolio's absolute total weight to be 1 when trading.

PY

trading_weight = beta @ opt.x
trading_weight /= np.sum(abs(trading_weight))

2. Set up the trading data.

PY

testing_ret = data.pct_change().iloc[1:].shift(-1)  # Shift 1 step backward as forward return result
equity = pd.DataFrame(np.ones((testing_ret.shape[0], 1)), index=testing_ret.index, columns=["Daily value"])

3. Set the buy and sell periods as when the spread exceeds the threshold.

PY

buy_period = normalized_spread[normalized_spread < -threshold].index
sell_period = normalized_spread[normalized_spread > threshold].index

4. Trade the portfolio.


PY

equity.loc[buy_period, "Daily value"] = testing_ret.loc[buy_period] @ trading_weight + 1
equity.loc[sell_period, "Daily value"] = testing_ret.loc[sell_period] @ -trading_weight + 1

5. Get the total portfolio value.

PY

value = equity.cumprod()

6. Plot the result.

PY

value.plot(title="Equity Curve", figsize=(15, 10))
plt.ylabel("Portfolio Value")
plt.show()

Set Up Algorithm

Once we are confident in our hypothesis, we can export this code into backtesting. One way to accommodate this model in a backtest is to create a scheduled event that uses our model to predict the expected return.
PY

def initialize(self) -> None:

    #1. Required: Five years of backtest history
    self.set_start_date(2014, 1, 1)

    #2. Required: Alpha Streams Models:
    self.set_brokerage_model(BrokerageName.ALPHA_STREAMS)

    #3. Required: Significant AUM Capacity
    self.set_cash(1000000)

    #4. Required: Benchmark to SPY
    self.set_benchmark("SPY")

    self.assets = ["EURUSD", "GBPUSD", "USDCAD", "USDHKD", "USDJPY"]

    # Add Forex ------------------------------------------------
    for i in range(len(self.assets)):
        self.add_forex(self.assets[i], Resolution.MINUTE)

    # Instantiate our model
    self.recalibrate()

    # Set a variable to indicate the trading bias of the portfolio
    self.state = 0

    # Set Scheduled Event Method For Recalibrating Our Model Every Week.
    self.schedule.on(self.date_rules.week_start(),
                     self.time_rules.at(0, 0),
                     self.recalibrate)

    # Set Scheduled Event Method For Kalman Filter updating.
    self.schedule.on(self.date_rules.every_day(),
                     self.time_rules.before_market_close("EURUSD"),
                     self.every_day_before_market_close)

We'll also need to create a function to train and update our model from time to time. We replace qb with self and replace methods with their QCAlgorithm counterparts as needed. In this example, this is not an issue because all the methods we used in research also exist in QCAlgorithm.

PY

def Recalibrate(self) -> None:
    qb = self
    history = qb.History(self.assets, 252*2, Resolution.Daily)
    if history.empty: return

    # Select the close column and then call the unstack method
    data = history['close'].unstack(level=0)

    # Convert into log-price series to eliminate compounding effect
    log_price = np.log(data)

    ### Get Cointegration Vectors

    # Initialize a VECM model following the unit test parameters, then fit to our data.
    vecm_result = VECM(log_price, k_ar_diff=0, coint_rank=len(self.assets)-1, deterministic='n').fit()

    # Obtain the Beta attribute. This is the cointegration subspaces' unit vectors.
    beta = vecm_result.beta

    # Check the spread of different cointegration subspaces.
    spread = log_price @ beta

    ### Optimization of Cointegration Subspaces

    # Constrain the weight on each vector to be between -1 and 1, while the overall sum is 0.
    x0 = np.array([(-1)**i / beta.shape[1] for i in range(beta.shape[1])])
    bounds = tuple((-1, 1) for i in range(beta.shape[1]))
    constraints = [{'type': 'eq', 'fun': lambda x: np.sum(x)}]

    # Optimize the Portmanteau statistic
    opt = minimize(lambda w: ((w.T @ np.cov(spread.T, spread.shift(1).fillna(0).T)[spread.shape[1]:, :spread.shape[1]] @ w) / (w.T @ np.cov(spread.T) @ w))**2,
                   x0=x0,
                   bounds=bounds,
                   constraints=constraints,
                   method="SLSQP")

    # Normalize the result
    opt.x = opt.x / np.sum(abs(opt.x))
    new_spread = spread @ opt.x

    ### Kalman Filter

    # Initialize a Kalman Filter. Use the first 20 data points to optimize its initial state.
    # We assume the market has no regime change, so the transition and observation matrices are [1].
    self.kalmanFilter = KalmanFilter(transition_matrices=[1],
                                     observation_matrices=[1],
                                     initial_state_mean=new_spread.iloc[:20].mean(),
                                     observation_covariance=new_spread.iloc[:20].var(),
                                     em_vars=['transition_covariance', 'initial_state_covariance'])
    self.kalmanFilter = self.kalmanFilter.em(new_spread.iloc[:20], n_iter=5)
    (filtered_state_means, filtered_state_covariances) = self.kalmanFilter.filter(new_spread.iloc[:20])

    # Obtain the current Mean and Covariance Matrix expectations.
    self.currentMean = filtered_state_means[-1, :]
    self.currentCov = filtered_state_covariances[-1, :]

    # Initialize a mean series for spread normalization using the Kalman Filter's results.
    mean_series = np.array([None]*(new_spread.shape[0]-20))

    # Roll over the Kalman Filter to obtain the mean series.
    for i in range(20, new_spread.shape[0]):
        (self.currentMean, self.currentCov) = self.kalmanFilter.filter_update(filtered_state_mean=self.currentMean,
                                                                              filtered_state_covariance=self.currentCov,
                                                                              observation=new_spread.iloc[i])
        mean_series[i-20] = float(self.currentMean)

    # Obtain the normalized spread series.
    normalized_spread = (new_spread.iloc[20:] - mean_series)

    ### Determine Trading Threshold

    # Initialize 50 set levels for testing.
    s0 = np.linspace(0, max(normalized_spread), 50)

    # Calculate the profit levels using the 50 set levels.
    f_bar = np.array([None]*50)
    for i in range(50):
        f_bar[i] = len(normalized_spread.values[normalized_spread.values > s0[i]]) / normalized_spread.shape[0]

    # Set trading frequency matrix.
    D = np.zeros((49, 50))
    for i in range(D.shape[0]):
        D[i, i] = 1
        D[i, i+1] = -1

    # Set level of lambda.
    l = 1.0

    # Obtain the normalized profit level.
    f_star = np.linalg.inv(np.eye(50) + l * D.T@D) @ f_bar.reshape(-1, 1)
    s_star = [f_star[i]*s0[i] for i in range(50)]
    self.threshold = s0[s_star.index(max(s_star))]

    # Set the trading weight. We would like the portfolio's absolute total weight to be 1 when trading.
    trading_weight = beta @ opt.x
    self.trading_weight = trading_weight / np.sum(abs(trading_weight))
Now we export our model into the scheduled event method for trading. We replace qb with self and replace methods with their QCAlgorithm counterparts as needed. In this example, this is not an issue because all the methods we used in research also exist in QCAlgorithm.

PY

def EveryDayBeforeMarketClose(self) -> None:
    qb = self

    # Get the real-time log close price for all assets and store in a Series
    series = pd.Series(dtype=float)
    for symbol in qb.Securities.Keys:
        series[symbol] = np.log(qb.Securities[symbol].Close)

    # Get the spread
    spread = series @ self.trading_weight

    # Update the Kalman Filter with the Series
    (self.currentMean, self.currentCov) = self.kalmanFilter.filter_update(filtered_state_mean=self.currentMean,
                                                                          filtered_state_covariance=self.currentCov,
                                                                          observation=spread)

    # Obtain the normalized spread.
    normalized_spread = spread - self.currentMean

    # ==============================

    # Mean-reversion
    if normalized_spread < -self.threshold:
        orders = []
        for i in range(len(self.assets)):
            orders.append(PortfolioTarget(self.assets[i], self.trading_weight[i]))
        self.SetHoldings(orders)
        self.state = 1

    elif normalized_spread > self.threshold:
        orders = []
        for i in range(len(self.assets)):
            orders.append(PortfolioTarget(self.assets[i], -1 * self.trading_weight[i]))
        self.SetHoldings(orders)
        self.state = -1

    # Out of position if spread recovered
    elif (self.state == 1 and normalized_spread > -self.threshold) or (self.state == -1 and normalized_spread < self.threshold):
        self.Liquidate()
        self.state = 0

Reference

1. Y. Feng, D. P. Palomar (2016). A Signal Processing Perspective on Financial Engineering. Foundations and Trends in Signal Processing, 9(1-2), pp. 173-200.

Clone Example Project


Applying Research > PCA and Pairs Trading

Applying Research
PCA and Pairs Trading

Introduction

This page explains how you can use the Research Environment to develop and test a Principal Component Analysis hypothesis, then put the hypothesis in production.

Create Hypothesis

Principal Component Analysis (PCA) is a way of mapping an existing dataset into a new "space" whose dimensions are linearly independent, orthogonal vectors, thereby eliminating the problem of multicollinearity. Turning that thought around: can we make use of the collinearity PCA reveals to find collinear assets for pairs trading?

Import Libraries

We'll need to import libraries to help with data processing, validation and visualization. Import the sklearn, arch, statsmodels, numpy and matplotlib libraries as follows:

PY

from sklearn.decomposition import PCA


from arch.unitroot.cointegration import engle_granger
from statsmodels.tsa.stattools import adfuller
import numpy as np
from matplotlib import pyplot as plt

Get Historical Data

To begin, we retrieve historical data for researching.

1. Instantiate a QuantBook .

PY

qb = QuantBook()

2. Select the desired tickers for research.

PY

symbols = {}
assets = ["SHY", "TLT", "SHV", "TLH", "EDV", "BIL",
"SPTL", "TBT", "TMF", "TMV", "TBF", "VGSH", "VGIT",
"VGLT", "SCHO", "SCHR", "SPTS", "GOVT"]

3. Call the add_equity method with the tickers, and their corresponding resolution. Then store their Symbol s.
PY

for i in range(len(assets)):
    symbols[assets[i]] = qb.add_equity(assets[i], Resolution.MINUTE).symbol

If you do not pass a resolution argument, Resolution.MINUTE is used by default.

4. Call the history method with qb.securities.keys for all tickers, time argument(s), and resolution to request

historical data for the symbol.

PY

history = qb.history(qb.securities.keys(), datetime(2021, 1, 1), datetime(2021, 12, 31),
                     Resolution.DAILY)

Prepare Data

We'll have to process our data to get the principal component unit vector that explains the most variance, then select the highest- and lowest-absolute-weighted assets as the pair, since the lowest one's variance is mostly explained by the highest.

1. Select the close column and then call the unstack method.

PY

close_price = history['close'].unstack(level=0)

2. Call pct_change to compute the daily return.

PY

returns = close_price.pct_change().iloc[1:]

3. Initialize a PCA model, then get the principal components by maximum likelihood.
PY

pca = PCA()
pca.fit(returns)

4. Get the principal component numbers in a list, and their corresponding explained variance ratios.

PY

components = [str(x + 1) for x in range(pca.n_components_)]
explained_variance_pct = pca.explained_variance_ratio_ * 100

5. Plot the principal components' explained variance ratios.

PY

plt.figure(figsize=(15, 10))
plt.bar(components, explained_variance_pct)
plt.title("Ratio of Explained Variance")
plt.xlabel("Principal Component #")
plt.ylabel("%")
plt.show()

We can see over 95% of the variance is explained by the first principal component. We can conclude that collinearity exists and most assets' returns are correlated. Now, we can extract the most correlated pair of assets.

6. Get the weighting of each asset in the first principal component.

PY

first_component = pca.components_[0, :]
7. Select the highest- and lowest-absolute-weighted assets.

PY

highest = assets[abs(first_component).argmax()]
lowest = assets[abs(first_component).argmin()]
print(f'The highest-absolute-weighting asset: {highest}\nThe lowest-absolute-weighting asset: {lowest}')

8. Plot their weightings.

PY

plt.figure(figsize=(15, 10))
plt.bar(assets, first_component)
plt.title("Weightings of each asset in the first component")
plt.xlabel("Assets")
plt.ylabel("Weighting")
plt.xticks(rotation=30)
plt.show()

Test Hypothesis

We have now selected 2 assets as candidates for pairs trading. Next, we test whether they are cointegrated and whether their spread is stationary.

1. Call np.log to get the log price of the pair.


PY

log_price = np.log(close_price[[highest, lowest]])

2. Test cointegration by Engle Granger Test.

PY

coint_result = engle_granger(log_price.iloc[:, 0], log_price.iloc[:, 1], trend="c", lags=0)
display(coint_result)

3. Get their cointegrating vector.

PY

coint_vector = coint_result.cointegrating_vector[:2]

4. Calculate the spread.

PY

spread = log_price @ coint_vector

5. Use Augmented Dickey Fuller test to test its stationarity.

PY

pvalue = adfuller(spread, maxlag=0)[1]
print(f"The ADF test p-value is {pvalue}, so it is {'' if pvalue < 0.05 else 'not '}stationary.")

6. Plot the spread.

PY

spread.plot(figsize=(15, 10), title=f"Spread of {highest} and {lowest}")
plt.ylabel("Spread")
plt.show()

The results show that the pair is cointegrated and their spread is stationary, so they are a potential pair for pairs trading.

Set Up Algorithm

Pairs trading is exactly a 2-asset version of statistical arbitrage. Thus, we can just modify the algorithm from the Kalman Filter and Statistical Arbitrage tutorial, except we use only a single cointegrating unit vector, so no optimization of the cointegration subspace is needed.

PY

def initialize(self):

    #1. Required: Five years of backtest history
    self.set_start_date(2014, 1, 1)

    #2. Required: Alpha Streams Models:
    self.set_brokerage_model(BrokerageName.ALPHA_STREAMS)

    #3. Required: Significant AUM Capacity
    self.set_cash(1000000)

    #4. Required: Benchmark to SPY
    self.set_benchmark("SPY")

    self.assets = ["SCHO", "SHY"]

    # Add Equity ------------------------------------------------
    for i in range(len(self.assets)):
        self.add_equity(self.assets[i], Resolution.MINUTE).symbol

    # Instantiate our model
    self.recalibrate()

    # Set a variable to indicate the trading bias of the portfolio
    self.state = 0

    # Set Scheduled Event Method For Recalibrating Our Model Every Week.
    self.schedule.on(self.date_rules.week_start(),
                     self.time_rules.at(0, 0),
                     self.recalibrate)

    # Set Scheduled Event Method For Kalman Filter updating.
    self.schedule.on(self.date_rules.every_day(),
                     self.time_rules.before_market_close("SHY"),
                     self.every_day_before_market_close)

def recalibrate(self):
    qb = self
    history = qb.history(self.assets, 252*2, Resolution.DAILY)
    if history.empty: return

    # Select the close column and then call the unstack method
    data = history['close'].unstack(level=0)

    # Convert into log-price series to eliminate compounding effect
    log_price = np.log(data)

    ### Get Cointegration Vectors

    # Get the cointegration vector
    coint_result = engle_granger(log_price.iloc[:, 0], log_price.iloc[:, 1], trend="c", lags=0)
    coint_vector = coint_result.cointegrating_vector[:2]

    # Get the spread
    spread = log_price @ coint_vector

    ### Kalman Filter

    # Initialize a Kalman Filter. Use the first 20 data points to optimize its initial state.
    # We assume the market has no regime change, so the transition and observation matrices are [1].
    self.kalman_filter = KalmanFilter(transition_matrices=[1],
                                      observation_matrices=[1],
                                      initial_state_mean=spread.iloc[:20].mean(),
                                      observation_covariance=spread.iloc[:20].var(),
                                      em_vars=['transition_covariance', 'initial_state_covariance'])
    self.kalman_filter = self.kalman_filter.em(spread.iloc[:20], n_iter=5)
    (filtered_state_means, filtered_state_covariances) = self.kalman_filter.filter(spread.iloc[:20])

    # Obtain the current Mean and Covariance Matrix expectations.
    self.current_mean = filtered_state_means[-1, :]
    self.current_cov = filtered_state_covariances[-1, :]

    # Initialize a mean series for spread normalization using the Kalman Filter's results.
    mean_series = np.array([None]*(spread.shape[0]-20))

    # Roll over the Kalman Filter to obtain the mean series.
    for i in range(20, spread.shape[0]):
        (self.current_mean, self.current_cov) = self.kalman_filter.filter_update(filtered_state_mean=self.current_mean,
                                                                                 filtered_state_covariance=self.current_cov,
                                                                                 observation=spread.iloc[i])
        mean_series[i-20] = float(self.current_mean)

    # Obtain the normalized spread series.
    normalized_spread = (spread.iloc[20:] - mean_series)

    ### Determine Trading Threshold

    # Initialize 50 set levels for testing.
    s0 = np.linspace(0, max(normalized_spread), 50)

    # Calculate the profit levels using the 50 set levels.
    f_bar = np.array([None]*50)
    for i in range(50):
        f_bar[i] = len(normalized_spread.values[normalized_spread.values > s0[i]]) / normalized_spread.shape[0]

    # Set trading frequency matrix.
    D = np.zeros((49, 50))
    for i in range(D.shape[0]):
        D[i, i] = 1
        D[i, i+1] = -1

    # Set level of lambda.
    l = 1.0

    # Obtain the normalized profit level.
    f_star = np.linalg.inv(np.eye(50) + l * D.T@D) @ f_bar.reshape(-1, 1)
    s_star = [f_star[i]*s0[i] for i in range(50)]
    self.threshold = s0[s_star.index(max(s_star))]

    # Set the trading weight. We would like the portfolio's absolute total weight to be 1 when trading.
    self.trading_weight = coint_vector / np.sum(abs(coint_vector))

def every_day_before_market_close(self):
    qb = self

    # Get the real-time log close price for all assets and store in a Series
    series = pd.Series(dtype=float)
    for symbol in qb.securities.keys():
        series[symbol] = np.log(qb.securities[symbol].close)

    # Get the spread
    spread = np.sum(series * self.trading_weight)

    # Update the Kalman Filter with the Series
    (self.current_mean, self.current_cov) = self.kalman_filter.filter_update(filtered_state_mean=self.current_mean,
                                                                             filtered_state_covariance=self.current_cov,
                                                                             observation=spread)

    # Obtain the normalized spread.
    normalized_spread = spread - self.current_mean

    # ==============================

    # Mean-reversion
    if normalized_spread < -self.threshold:
        orders = []
        for i in range(len(self.assets)):
            orders.append(PortfolioTarget(self.assets[i], self.trading_weight[i]))
        self.set_holdings(orders)
        self.state = 1

    elif normalized_spread > self.threshold:
        orders = []
        for i in range(len(self.assets)):
            orders.append(PortfolioTarget(self.assets[i], -1 * self.trading_weight[i]))
        self.set_holdings(orders)
        self.state = -1

    # Out of position if spread recovered
    elif (self.state == 1 and normalized_spread > -self.threshold) or (self.state == -1 and normalized_spread < self.threshold):
        self.liquidate()
        self.state = 0

Applying Research > Hidden Markov Models

Applying Research
Hidden Markov Models

Introduction

This page explains how you can use the Research Environment to develop and test a Hidden Markov Model
hypothesis, then put the hypothesis in production.

Create Hypothesis

A Markov process is a stochastic process in which the probability of switching to another state depends only on the
current state of the model, via that state's probability distribution (usually represented by a state transition matrix).
It is history-independent, or memoryless. While a Markov process's state is often observable, the states of a Hidden
Markov Model (HMM) are not. This means the input(s) and output(s) are observable, but their intermediate, the
state, is non-observable/hidden.

A 3-state HMM example, where S are the hidden states, O are the observable states, and a are the probabilities of
state transition.

Image source: Modeling Strategic Use of Human Computer Interfaces with Novel Hidden Markov Models. L. J.
Mariano et al. (2015). Frontiers in Psychology 6:919. DOI:10.3389/fpsyg.2015.00919

In finance, HMMs are particularly useful in determining the market regime, usually classified into "Bull" and "Bear"
markets. Another popular classification is "Volatile" vs. "Non-volatile" markets, so that we can avoid entering the
market when it is too risky. We hypothesize an HMM can do the latter, so we can produce a SPY-outperforming
portfolio (positive alpha).
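
To make the state-transition idea concrete, below is a minimal sketch that simulates a two-state regime sequence. The 2x2 transition matrix values are illustrative assumptions, not estimates from data:

PY

# Illustrative only: simulate a two-state Markov chain with an assumed
# transition matrix. Row i holds P(next state | current state i).
import numpy as np

P = np.array([[0.95, 0.05],   # calm regime is persistent
              [0.10, 0.90]])  # volatile regime is slightly less so

rng = np.random.default_rng(0)
states = [0]
for _ in range(251):
    # The next state depends only on the current one (memorylessness).
    states.append(rng.choice(2, p=P[states[-1]]))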

Import Libraries

We'll need to import libraries to help with data processing, validation and visualization. Import the statsmodels,
scipy, numpy, pandas and matplotlib libraries as follows:


PY

from statsmodels.tsa.regime_switching.markov_regression import MarkovRegression
from scipy.stats import multivariate_normal
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()

Get Historical Data

To begin, we retrieve historical data for researching.

1. Instantiate a QuantBook .

PY

qb = QuantBook()

2. Select the desired index for research.

PY

asset = "SPX"

3. Call the add_index method with the ticker and its corresponding resolution.

PY

qb.add_index(asset, Resolution.MINUTE)

If you do not pass a resolution argument, Resolution.MINUTE is used by default.

4. Call the history method with qb.securities.keys for all tickers, time argument(s), and resolution to request
historical data for the symbol.

PY

history = qb.history(qb.securities.keys(), datetime(2019, 1, 1), datetime(2021, 12, 31), Resolution.DAILY)
Prepare Data

We'll have to process our data to get the volatility of the market for classification.

1. Select the close column and then call the unstack method.

PY

close_price = history['close'].unstack(level=0)

2. Call pct_change to compute the daily return.

PY

returns = close_price.pct_change().iloc[1:]

3. Initialize the HMM, then fit it with the daily return data. Note that we're using variance as the switching
regime, so the switching_variance argument is set to True.

PY

model = MarkovRegression(returns, k_regimes=2, switching_variance=True).fit()
display(model.summary())
All p-values of the regime self-transition coefficients and the regime transition probability matrix's coefficients
are smaller than 0.05, indicating the model should be able to classify the data into 2 different volatility regimes.
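
As an optional sanity check, the fitted statsmodels results object also exposes how persistent each regime is; a short sketch:

PY

# Average number of consecutive periods the model expects to stay in each regime.
display(model.expected_durations)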

Test Hypothesis

We now verify whether the model can detect high and low volatility periods effectively.

1. Get the regime as a column: 1 as the Low Variance Regime, 2 as the High Variance Regime.

PY

regime = pd.Series(model.smoothed_marginal_probabilities.values.argmax(axis=1)+1,
                   index=returns.index, name='regime')
df_1 = close_price.loc[returns.index][regime == 1]
df_2 = close_price.loc[returns.index][regime == 2]

2. Get the mean and covariance matrix of the 2 regimes, assuming 0 covariance between them.

PY

mean = np.array([returns.loc[df_1.index].mean(), returns.loc[df_2.index].mean()])
cov = np.array([[returns.loc[df_1.index].var(), 0], [0, returns.loc[df_2.index].var()]])

3. Fit a 2-dimensional multivariate normal distribution using the 2 means and the covariance matrix.
PY

dist = multivariate_normal(mean=mean.flatten(), cov=cov)
mean_1, mean_2 = mean[0], mean[1]
sigma_1, sigma_2 = cov[0,0], cov[1,1]

4. Evaluate the joint probability density of the fitted distribution over a grid.

PY

x = np.linspace(-0.05, 0.05, num=100)
y = np.linspace(-0.05, 0.05, num=100)
X, Y = np.meshgrid(x, y)
pdf = np.zeros(X.shape)
for i in range(X.shape[0]):
    for j in range(X.shape[1]):
        pdf[i, j] = dist.pdf([X[i, j], Y[i, j]])

5. Plot the probability of data in different regimes.

PY

fig, axes = plt.subplots(2, figsize=(15, 10))


ax = axes[0]
ax.plot(model.smoothed_marginal_probabilities[0])
ax.set(title='Smoothed probability of Low Variance Regime')
ax = axes[1]
ax.plot(model.smoothed_marginal_probabilities[1])
ax.set(title='Smoothed probability of High Variance Regime')
fig.tight_layout()
plt.show()

6. Plot the price series by regime.


PY

df_1.index = pd.to_datetime(df_1.index)
df_1 = df_1.sort_index()
df_2.index = pd.to_datetime(df_2.index)
df_2 = df_2.sort_index()
plt.figure(figsize=(15, 10))
plt.scatter(df_1.index, df_1, color='blue', label="Low Variance Regime")
plt.scatter(df_2.index, df_2, color='red', label="High Variance Regime")
plt.title("Price series")
plt.ylabel("Price ($)")
plt.xlabel("Date")
plt.legend()
plt.show()

7. Plot the distribution surface.

PY

fig = plt.figure(figsize=(20, 10))


ax = fig.add_subplot(122, projection = '3d')
ax.plot_surface(X, Y, pdf, cmap = 'viridis')
ax.axes.zaxis.set_ticks([])
plt.xlabel("Low Volatility Regime")
plt.ylabel("High Volatility Regime")
plt.title('Bivariate normal distribution of the Regimes')
plt.tight_layout()
plt.show()
8. Plot the contour.

PY

plt.figure(figsize=(12, 8))
plt.contourf(X, Y, pdf, cmap = 'viridis')
plt.xlabel("Low Volatility Regime")
plt.ylabel("High Volatility Regime")
plt.title('Bivariate normal distribution of the Regimes')
plt.tight_layout()
plt.show()
We can clearly see from the results that the Low Volatility Regime has much lower variance than the High Volatility
Regime, showing the model works.

Set Up Algorithm

Once we are confident in our hypothesis, we can export this code into backtesting. One way to accommodate this
model in a backtest is to create a scheduled event which uses our model to predict the current market regime,
then hold equities in the low-variance regime and rotate into fixed income otherwise.
PY

def initialize(self) -> None:

#1. Required: Five years of backtest history


self.set_start_date(2008, 1, 1)
self.set_end_date(2021, 1, 1)

#2. Required: Alpha Streams Models:


self.set_brokerage_model(BrokerageName.ALPHA_STREAMS)

#3. Required: Significant AUM Capacity


self.set_cash(1000000)

#4. Required: Benchmark to SPY


self.set_benchmark("SPY")

self.assets = ["SPY", "TLT"] # "TLT" as fix income in out-of-market period (high volatility)

# Add Equity ------------------------------------------------


for ticker in self.assets:
self.add_equity(ticker, Resolution.MINUTE)

# Set Scheduled Event Method For Our Model.


self.schedule.on(self.date_rules.every_day(),
self.time_rules.before_market_close("SPY", 5),
self.every_day_before_market_close)

Now we export our model into the scheduled event method. We will switch qb with self and replace methods with
their QCAlgorithm counterparts as needed. In this example, this is not an issue because all the methods we used in
research also exist in QCAlgorithm.

PY

def every_day_before_market_close(self) -> None:


qb = self

# Get history
history = qb.history(["SPY"], datetime(2010, 1, 1), datetime.now(), Resolution.DAILY)

# Get the close price daily return.


close = history['close'].unstack(level=0)

# Call pct_change to obtain the daily return


returns = close.pct_change().iloc[1:]

# Initialize the HMM, then fit it with the daily return data.


model = MarkovRegression(returns, k_regimes=2, switching_variance=True).fit()

# Obtain the market regime


regime = model.smoothed_marginal_probabilities.values.argmax(axis=1)[-1]

# ==============================

if regime == 0:
    self.set_holdings([PortfolioTarget("TLT", 0.), PortfolioTarget("SPY", 1.)])
else:
    self.set_holdings([PortfolioTarget("TLT", 1.), PortfolioTarget("SPY", 0.)])

Applying Research > Long Short-Term Memory

Applying Research
Long Short-Term Memory

Introduction

This page explains how you can use the Research Environment to develop and test a Long Short-Term Memory
hypothesis, then put the hypothesis in production.

Recurrent neural networks (RNN) are a powerful tool in deep learning. These models quite accurately mimic how
humans process sequential information and learn. Unlike traditional feedforward neural networks, RNNs have
memory. That is, information fed into them persists and the network is able to draw on this to make inferences.

Long Short-Term Memory (LSTM) is a type of RNN. Instead of one layer, LSTM cells generally have four, three of
which are part of "gates" -- ways to optionally let information through. The three gates are commonly referred to
as the forget, input, and output gates. The forget gate layer is where the model decides what information to keep
from prior states. At the input gate layer, the model decides which values to update. Finally, the output gate layer
is where the final output of the cell state is decided. Essentially, an LSTM separately decides what to remember
and the rate at which it should update.

An example of an LSTM cell: x is the input data, c is the long-term memory, h is the current state serving as
short-term memory, and σ and tanh are the non-linear activation functions of the gates.

Image source: https://en.wikipedia.org/wiki/Long_short-term_memory#/media/File:LSTM_Cell.svg
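
To make the gate mechanics above concrete, here is a minimal NumPy sketch of a single LSTM cell step. The toy dimensions and random weights are illustrative assumptions; Keras handles all of this internally when we build the actual model below:

PY

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    # W, U, b hold the parameters of the forget (f), input (i),
    # output (o) and candidate (g) layers.
    f = sigmoid(W['f'] @ x_t + U['f'] @ h_prev + b['f'])  # what to keep from c_prev
    i = sigmoid(W['i'] @ x_t + U['i'] @ h_prev + b['i'])  # which values to update
    o = sigmoid(W['o'] @ x_t + U['o'] @ h_prev + b['o'])  # what to output
    g = np.tanh(W['g'] @ x_t + U['g'] @ h_prev + b['g'])  # candidate memory
    c_t = f * c_prev + i * g    # update the long-term memory
    h_t = o * np.tanh(c_t)      # short-term memory / cell output
    return h_t, c_t

# Toy dimensions: 4 hidden units, 1 input feature (illustrative assumption).
rng = np.random.default_rng(0)
W = {k: rng.normal(size=(4, 1)) for k in 'fiog'}
U = {k: rng.normal(size=(4, 4)) for k in 'fiog'}
b = {k: np.zeros(4) for k in 'fiog'}
h, c = np.zeros(4), np.zeros(4)
h, c = lstm_step(np.array([0.5]), h, c, W, U, b)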

Create Hypothesis

LSTM models have produced some great results when applied to time-series prediction. One of the central
challenges with conventional time-series models is that, despite trying to account for trends or other non-
stationary elements, it is almost impossible to truly predict an outlier like a recession, flash crash, liquidity crisis,
etc. By having a long memory, LSTM models are better able to capture these difficult trends in the data without
suffering from the level of overfitting a conventional model would need in order to capture the same data.

For a very basic application, we're hypothesizing that an LSTM can offer an accurate prediction of future prices.

Import Libraries

We'll need to import libraries to help with data processing, validation and visualization. Import the keras, sklearn,
numpy, pandas and matplotlib libraries as follows:

PY

from keras.layers import LSTM, Dense, Dropout
from keras.models import Sequential
from keras.callbacks import EarlyStopping
from sklearn.preprocessing import MinMaxScaler
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

Get Historical Data

To begin, we retrieve historical data for researching.

1. Instantiate a QuantBook .

PY

qb = QuantBook()

2. Select the desired asset for research.

PY

asset = "SPY"

3. Call the add_equity method with the ticker and its corresponding resolution.

PY

qb.add_equity(asset, Resolution.MINUTE)

If you do not pass a resolution argument, Resolution.MINUTE is used by default.

4. Call the history method with qb.securities.keys for all tickers, time argument(s), and resolution to request

historical data for the symbol.

PY

history = qb.history(qb.securities.keys(), datetime(2019, 1, 1), datetime(2021, 12, 31), Resolution.DAILY)
Prepare Data

We'll have to process our data as well as build the LSTM model before testing the hypothesis. We scale our data
for better convergence.

1. Select the close column and then call the unstack method.

PY

close_price = history['close'].unstack(level=0)

2. Initialize MinMaxScaler to scale the data onto [0,1].

PY

scaler = MinMaxScaler(feature_range = (0, 1))

3. Transform our data.

PY

df = pd.DataFrame(scaler.fit_transform(close_price), index=close_price.index)

4. Select the input data.

PY

input_ = df.iloc[1:]

5. Shift the data 1-step backward to use as the training output.

PY

output = df.shift(-1).iloc[:-1]

6. Split the data into training and testing sets.


In this example, we use the first 80% of the data for training, and the last 20% for testing.

PY

splitter = int(input_.shape[0] * 0.8)


X_train = input_.iloc[:splitter]
X_test = input_.iloc[splitter:]
y_train = output.iloc[:splitter]
y_test = output.iloc[splitter:]

7. Build feature and label sets (using number of steps 60, and feature rank 1).

PY

features_set = []
labels = []
for i in range(60, X_train.shape[0]):
    features_set.append(X_train.iloc[i-60:i].values.reshape(-1, 1))
    labels.append(y_train.iloc[i])
features_set, labels = np.array(features_set), np.array(labels)
features_set = np.reshape(features_set, (features_set.shape[0], features_set.shape[1], 1))

Build Model

We construct the LSTM model.

1. Build a Sequential keras model.

PY

model = Sequential()

2. Create the model infrastructure.

PY

# Add our first LSTM layer - 50 nodes.


model.add(LSTM(units = 50, return_sequences=True, input_shape=(features_set.shape[1], 1)))
# Add Dropout layer to avoid overfitting
model.add(Dropout(0.2))
# Add additional layers
model.add(LSTM(units=50, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(units=50))
model.add(Dropout(0.2))
model.add(Dense(units = 5))
model.add(Dense(units = 1))

3. Compile the model.

We use Adam as the optimizer for its adaptive step size, and MSE as the loss function since the data is continuous.

PY

model.compile(optimizer = 'adam', loss = 'mean_squared_error', metrics=['mae', 'acc'])

4. Set early stopping callback method.


PY

callback = EarlyStopping(monitor='loss', patience=3, verbose=1, restore_best_weights=True)

5. Display the model structure.

PY

model.summary()

6. Fit the model to our data, running 20 training epochs.

Note that results will differ between training sessions since batches are randomly selected.

PY

model.fit(features_set, labels, epochs = 20, batch_size = 100, callbacks=[callback])


Test Hypothesis

We test the performance of this ML model to see if it can precisely predict the 1-step-forward price. To do so,
we compare the predicted and actual prices.

1. Get testing set features for input.

PY

test_features = []
for i in range(60, X_test.shape[0]):
    test_features.append(X_test.iloc[i-60:i].values.reshape(-1, 1))
test_features = np.array(test_features)
test_features = np.reshape(test_features, (test_features.shape[0], test_features.shape[1], 1))

2. Make predictions.

PY

predictions = model.predict(test_features)

3. Transform predictions back to original data-scale.


PY

predictions = scaler.inverse_transform(predictions)
actual = scaler.inverse_transform(y_test.values)

4. Plot the results.

PY

plt.figure(figsize=(15, 10))
plt.plot(actual[60:], color='blue', label='Actual')
plt.plot(predictions , color='red', label='Prediction')
plt.title('Price vs Predicted Price ')
plt.legend()
plt.show()

Set Up Algorithm

Once we are confident in our hypothesis, we can export this code into backtesting. One way to accommodate this
model in a backtest is to create a scheduled event which uses our model to predict the next price. If we predict
the price will go up, we long SPY; otherwise, we short it.
PY

def initialize(self) -> None:

#1. Required: Five years of backtest history


self.set_start_date(2016, 1, 1)

#2. Required: Alpha Streams Models:


self.set_brokerage_model(BrokerageName.ALPHA_STREAMS)

#3. Required: Significant AUM Capacity


self.set_cash(1000000)

#4. Required: Benchmark to SPY


self.set_benchmark("SPY")

self.asset = "SPY"

# Add Equity ------------------------------------------------


self.add_equity(self.asset, Resolution.MINUTE)

# Initialize the LSTM model


self.build_model()

# Set Scheduled Event Method For Our Model


self.schedule.on(self.date_rules.every_day(),
self.time_rules.before_market_close("SPY", 5),
self.every_day_before_market_close)

# Set Scheduled Event Method For Our Model Retraining every month
self.schedule.on(self.date_rules.month_start(),
self.time_rules.at(0, 0),
self.build_model)

We'll also need to create a function to train and update our model from time to time.
PY

def build_model(self) -> None:


qb = self

### Preparing Data


# Get historical data
history = qb.history(qb.securities.keys(), 252*2, Resolution.DAILY)

# Select the close column and then call the unstack method.
close = history['close'].unstack(level=0)

# Scale data onto [0,1]


self.scaler = MinMaxScaler(feature_range = (0, 1))

# Transform our data


df = pd.DataFrame(self.scaler.fit_transform(close), index=close.index)

# Feature engineer the data for input.


input_ = df.iloc[1:]

# Shift the data for 1-step backward as training output result.


output = df.shift(-1).iloc[:-1]

# Build feature and label sets (using number of steps 60, and feature rank 1)
features_set = []
labels = []
for i in range(60, input_.shape[0]):
    features_set.append(input_.iloc[i-60:i].values.reshape(-1, 1))
    labels.append(output.iloc[i])
features_set, labels = np.array(features_set), np.array(labels)
features_set = np.reshape(features_set, (features_set.shape[0], features_set.shape[1], 1))

### Build Model


# Build a Sequential keras model
self.model = Sequential()

# Add our first LSTM layer - 50 nodes


self.model.add(LSTM(units = 50, return_sequences=True, input_shape=(features_set.shape[1], 1)))
# Add Dropout layer to avoid overfitting
self.model.add(Dropout(0.2))
# Add additional layers
self.model.add(LSTM(units=50, return_sequences=True))
self.model.add(Dropout(0.2))
self.model.add(LSTM(units=50))
self.model.add(Dropout(0.2))
self.model.add(Dense(units = 5))
self.model.add(Dense(units = 1))

# Compile the model. We use Adam as the optimizer for its adaptive step size and MSE as the
# loss function since the data is continuous.
self.model.compile(optimizer = 'adam', loss = 'mean_squared_error', metrics=['mae', 'acc'])

# Set early stopping callback method


callback = EarlyStopping(monitor='loss', patience=3, restore_best_weights=True)

# Fit the model to our data, running 20 training epochs


self.model.fit(features_set, labels, epochs = 20, batch_size = 1000, callbacks=[callback])

Now we export our model into the scheduled event method. We will switch qb with self and replace methods with
their QCAlgorithm counterparts as needed. In this example, this is not an issue because all the methods we used in
research also exist in QCAlgorithm.


PY

def every_day_before_market_close(self) -> None:
    qb = self
    # Fetch history on our universe.
    history = qb.history(qb.securities.keys(), 60, Resolution.DAILY)
    if history.empty: return

    # Make all of them into a single time index.
    close = history.close.unstack(level=0)

    # Scale our data.
    df = pd.DataFrame(self.scaler.transform(close), index=close.index)

    # Feature engineer the data for input.
    input_ = []
    input_.append(df.values.reshape(-1, 1))
    input_ = np.array(input_)
    input_ = np.reshape(input_, (input_.shape[0], input_.shape[1], 1))

    # Prediction
    prediction = self.model.predict(input_)

    # Revert the scaling into price.
    prediction = self.scaler.inverse_transform(prediction)

    # ==============================

    if prediction > qb.securities[self.asset].price:
        self.set_holdings(self.asset, 1.)
    else:
        self.set_holdings(self.asset, -1.)

Applying Research > Airline Buybacks

Applying Research
Airline Buybacks

Introduction

This page explains how you can use the Research Environment to develop and test an Airline Buybacks
hypothesis, then put the hypothesis in production.

Create Hypothesis

A buyback is a company repurchasing its own stock in the market, because (1) management is confident in its own
future, and (2) it wants more control over its development. Since buybacks are usually large-scale and on a
schedule, the repurchasing often causes price fluctuation.

Airlines form one of the largest buyback sectors. Major US airlines used over 90% of their free cash flow to buy
back their own stock in recent years. [1] Therefore, we can use airline companies to test the hypothesis that
buybacks cause price action. In this particular example, we're hypothesizing that the difference between the
buyback price and the close price suggests a price change in a certain direction. (We don't know whether the
forward return would be momentum or mean-reversion in this case!)
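
Concretely, the signal we study below is the buyback premium/discount: how far the buyback execution price sits from the same-day market close. A one-line sketch of the definition used later in Prepare Data (buyback_premium is a hypothetical helper name):

PY

# Premium (+) or discount (-) of the buyback execution price vs. the market close.
def buyback_premium(execution_price, close_price):
    return (execution_price - close_price) / close_price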

Import Libraries

We'll need to import libraries to help with data processing, validation and visualization. Import the
SmartInsiderTransaction class and the statsmodels, sklearn, numpy, pandas, matplotlib and seaborn libraries as follows:

PY

from QuantConnect.DataSource import SmartInsiderTransaction
from statsmodels.discrete.discrete_model import Logit
from sklearn.metrics import confusion_matrix
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns

Get Historical Data

To begin, we retrieve historical data for researching.

1. Instantiate a QuantBook .

PY

qb = QuantBook()

2. Select the airline tickers for research.


PY

assets = ["LUV", # Southwest Airlines


"DAL", # Delta Airlines
"UAL", # United Airlines Holdings
"AAL", # American Airlines Group
"SKYW", # SkyWest Inc.
"ALGT", # Allegiant Travel Co.
"ALK" # Alaska Air Group Inc.
]

3. Call the add_equity method with the tickers and their corresponding resolution. Then call add_data with
SmartInsiderTransaction to subscribe to their buyback transaction data. Save the Symbols into a dictionary.

PY

symbols = {}
for ticker in assets:
    symbol = qb.add_equity(ticker, Resolution.MINUTE).symbol
    symbols[symbol] = qb.add_data(SmartInsiderTransaction, symbol).symbol

If you do not pass a resolution argument, Resolution.MINUTE is used by default.

4. Call the history method with a list of Symbol s for all tickers, time argument(s), and resolution to request

historical data for the symbols.

PY

history = qb.history(list(symbols.keys()), datetime(2019, 1, 1), datetime(2021, 12, 31), Resolution.DAILY)

5. Request SPY history as a reference.

PY

spy = qb.history(qb.add_equity("SPY").symbol, datetime(2019, 1, 1), datetime(2021, 12, 31), Resolution.DAILY)

6. Call the history method with a list of SmartInsiderTransaction Symbol s for all tickers, time argument(s),

and resolution to request historical data for the symbols.

PY

history_buybacks = qb.history(list(symbols.values()), datetime(2019, 1, 1), datetime(2021, 12, 31), Resolution.DAILY)
Prepare Data

We'll have to process our data to get the buyback premium/discount% vs forward return data.

1. Select the close column and then call the unstack method.

PY

df = history['close'].unstack(level=0)
spy_close = spy['close'].unstack(level=0)

2. Call pct_change to get the daily return of close price, then shift 1-step backward as prediction.

PY

ret = df.pct_change().shift(-1).iloc[:-1]
ret_spy = spy_close.pct_change().shift(-1).iloc[:-1]

3. Get the active forward return.

PY

active_ret = ret.sub(ret_spy.values, axis=0)

4. Select the ExecutionPrice column and then call the unstack method to get the buyback dataframe.

PY

df_buybacks = history_buybacks['executionprice'].unstack(level=0)

5. Convert buyback history into daily mean data.


PY

df_buybacks = df_buybacks.groupby(df_buybacks.index.date).mean()
df_buybacks.columns = df.columns

6. Get the buyback premium/discount %.

PY

df_close = df.reindex(df_buybacks.index)[~df_buybacks.isna()]
df_buybacks = (df_buybacks - df_close)/df_close

7. Create a DataFrame to hold the buyback and 1-day forward return data.

PY

data = pd.DataFrame(columns=["Buybacks", "Return"])

8. Append the data into the DataFrame.

PY

for row, row_buyback in zip(active_ret.reindex(df_buybacks.index).itertuples(),
                            df_buybacks.itertuples()):
    index = row[0]
    for i in range(1, df_buybacks.shape[1]+1):
        if row_buyback[i] != 0:
            data = pd.concat([data, pd.DataFrame({"Buybacks": row_buyback[i], "Return": row[i]},
                                                 index=[index])])

9. Call dropna to drop NaNs.

PY

data.dropna(inplace=True)

Test Hypothesis

We test (1) whether buybacks have a statistically significant effect on return direction, and (2) whether the
buyback premium/discount could be a return predictor.

1. Get binary return (+/-).

PY

binary_ret = data["Return"].copy()
binary_ret[binary_ret < 0] = 0
binary_ret[binary_ret > 0] = 1
2. Construct a logistic regression model.

PY

model = Logit(binary_ret.values, data["Buybacks"].values).fit()

3. Display logistic regression results.

PY

display(model.summary())

We can see a p-value < 0.05 in the logistic regression model, meaning the separation of positive and negative
returns using the buyback premium/discount % is statistically significant.

4. Plot the results.

PY

plt.figure(figsize=(10, 6))
sns.regplot(x=data["Buybacks"]*100, y=binary_ret, logistic=True, ci=None,
            line_kws={'label': "Logistic Regression Line"})
plt.plot([-50, 50], [0.5, 0.5], "r--", label="Selection Cutoff Line")
plt.title("Buyback premium vs Profit/Loss")
plt.xlabel("Buyback premium %")
plt.xlim([-50, 50])
plt.ylabel("Profit/Loss")
plt.legend()
plt.show()
Interestingly, from the logistic regression line, we observe that when the airlines bought their stock at a premium,
the price tended to go down, and the opposite when they bought back at a discount.

Let's also study how good the logistic regression is.

5. Get in-sample prediction result.

PY

predictions = model.predict(data["Buybacks"].values)
for i in range(len(predictions)):
    predictions[i] = 1 if predictions[i] > 0.5 else 0

6. Call confusion_matrix to contrast the results.

PY

cm = confusion_matrix(binary_ret, predictions)

7. Display the result.

PY

df_result = pd.DataFrame(cm,
                         index=pd.MultiIndex.from_tuples([("Actual", "Negative"),
                                                          ("Actual", "Positive")]),
                         columns=pd.MultiIndex.from_tuples([("Prediction", "Negative"),
                                                            ("Prediction", "Positive")]))

The logistic regression has a 55.8% accuracy (55% sensitivity and 56.3% specificity). This suggests
a > 50% win rate before friction costs, supporting our hypothesis.
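
As a quick check, the headline numbers above follow directly from the confusion matrix. Note that scikit-learn orders the matrix as [[TN, FP], [FN, TP]], with rows as actual classes:

PY

# Recover accuracy, sensitivity and specificity from the confusion matrix above.
tn, fp, fn, tp = cm.ravel()
accuracy = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)   # true positive rate
specificity = tn / (tn + fp)   # true negative rate
print(accuracy, sensitivity, specificity)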
Set Up Algorithm

Once we are confident in our hypothesis, we can export this code into backtesting. One way to accommodate this
model in a backtest is to create a scheduled event which uses our model to predict the expected return direction.

PY

def initialize(self) -> None:

#1. Required: Five years of backtest history


self.set_start_date(2017, 1, 1)

#2. Required: Alpha Streams Models:


self.set_brokerage_model(BrokerageName.ALPHA_STREAMS)

#3. Required: Significant AUM Capacity


self.set_cash(1000000)

#4. Required: Benchmark to SPY


self.set_benchmark("SPY")

self.set_portfolio_construction(EqualWeightingPortfolioConstructionModel())
self.set_execution(ImmediateExecutionModel())

# Set our strategy to take 5% profit and 5% stop loss.


self.add_risk_management(MaximumUnrealizedProfitPercentPerSecurity(0.05))
self.add_risk_management(MaximumDrawdownPercentPerSecurity(0.05))

# Select the airline tickers for research.


self.symbols = {}
assets = ["LUV", # Southwest Airlines
"DAL", # Delta Airlines
"UAL", # United Airlines Holdings
"AAL", # American Airlines Group
"SKYW", # SkyWest Inc.
"ALGT", # Allegiant Travel Co.
"ALK" # Alaska Air Group Inc.
]

# Call the add_equity method with the tickers and their corresponding resolution. Then call
# add_data with SmartInsiderTransaction to subscribe to their buyback transaction data.
for ticker in assets:
    symbol = self.add_equity(ticker, Resolution.MINUTE).symbol
    self.symbols[symbol] = self.add_data(SmartInsiderTransaction, symbol).symbol

self.add_equity("SPY")

# Initialize the model


self.build_model()

# Set Scheduled Event Method For Our Model Recalibration every month
self.schedule.on(self.date_rules.month_start(), self.time_rules.at(0, 0), self.build_model)

# Set Scheduled Event Method For Trading


self.schedule.on(self.date_rules.every_day(), self.time_rules.before_market_close("SPY", 5),
self.every_day_before_market_close)

We'll also need to create a function to train and update the logistic regression model from time to time.
PY

def build_model(self) -> None:
    qb = self
    # Call the history method with a list of tickers, time argument(s), and resolution to request
    # historical data for the symbols.
    history = qb.history(list(self.symbols.keys()), datetime(2015, 1, 1), datetime.now(), Resolution.DAILY)

    # Request SPY history as a reference.
    spy = qb.history(["SPY"], datetime(2015, 1, 1), datetime.now(), Resolution.DAILY)

    # Call the history method with a list of buyback tickers, time argument(s), and resolution to
    # request buyback data for the symbols.
    history_buybacks = qb.history(list(self.symbols.values()), datetime(2015, 1, 1), datetime.now(), Resolution.DAILY)

    # Select the close column and then call the unstack method to get the close price dataframe.
    df = history['close'].unstack(level=0)
    spy_close = spy['close'].unstack(level=0)

    # Call pct_change to get the daily return of the close price, then shift 1-step backward as prediction.
    ret = df.pct_change().shift(-1).iloc[:-1]
    ret_spy = spy_close.pct_change().shift(-1).iloc[:-1]

    # Get the active return.
    active_ret = ret.sub(ret_spy.values, axis=0)

    # Select the ExecutionPrice column and then call the unstack method to get the dataframe.
    df_buybacks = history_buybacks['executionprice'].unstack(level=0)

    # Convert buyback history into daily mean data.
    df_buybacks = df_buybacks.groupby(df_buybacks.index.date).mean()
    df_buybacks.columns = df.columns

    # Get the buyback premium/discount.
    df_close = df.reindex(df_buybacks.index)[~df_buybacks.isna()]
    df_buybacks = (df_buybacks - df_close)/df_close

    # Create a dataframe to hold the buyback and 1-day forward return data.
    data = pd.DataFrame(columns=["Buybacks", "Return"])

    # Append the data into the dataframe.
    for row, row_buyback in zip(active_ret.reindex(df_buybacks.index).itertuples(),
                                df_buybacks.itertuples()):
        index = row[0]
        for i in range(1, df_buybacks.shape[1]+1):
            if row_buyback[i] != 0:
                data = pd.concat([data, pd.DataFrame({"Buybacks": row_buyback[i], "Return": row[i]},
                                                     index=[index])])

    # Call dropna to drop NaNs.
    data.dropna(inplace=True)

    # Get binary return (+/-).
    binary_ret = data["Return"].copy()
    binary_ret[binary_ret < 0] = 0
    binary_ret[binary_ret > 0] = 1

    # Construct a logistic regression model.
    self.model = Logit(binary_ret.values, data["Buybacks"].values).fit()

Now we export our model into the scheduled event method. We will switch qb with self and replace methods with
their QCAlgorithm counterparts as needed. In this example, this is not an issue because all the methods we used in
research also exist in QCAlgorithm.


PY

def every_day_before_market_close(self) -> None:
    qb = self
    # Get any buyback event today.
    history_buybacks = qb.history(list(self.symbols.values()), timedelta(days=1), Resolution.DAILY)
    if history_buybacks.empty or "executionprice" not in history_buybacks.columns: return

    # Select the ExecutionPrice column and then call the unstack method to get the dataframe.
    df_buybacks = history_buybacks['executionprice'].unstack(level=0)

    # Convert buyback history into daily mean data.
    df_buybacks = df_buybacks.groupby(df_buybacks.index.date).mean()

    # ==============================

    insights = []

    # Iterate the buyback data, then pass it to the model for prediction.
    row = df_buybacks.iloc[-1]
    for i in range(len(row)):
        prediction = self.model.predict(row[i])

        # Long if the prediction says the price goes up, short otherwise. Do the opposite for SPY
        # (active return).
        if prediction > 0.5:
            insights.append(Insight.price(row.index[i].split(".")[0], timedelta(days=1), InsightDirection.UP))
            insights.append(Insight.price("SPY", timedelta(days=1), InsightDirection.DOWN))
        else:
            insights.append(Insight.price(row.index[i].split(".")[0], timedelta(days=1), InsightDirection.DOWN))
            insights.append(Insight.price("SPY", timedelta(days=1), InsightDirection.UP))

    self.emit_insights(insights)

Reference

US Airlines Spent 96% of Free Cash Flow on Buybacks: Chart. B. Kochkodin (17 March 2020). Bloomberg.
Retrieved from: https://www.bloomberg.com/news/articles/2020-03-16/u-s-airlines-spent-96-of-free-cash-flow-on-buybacks-chart

Applying Research > Sparse Optimization

Applying Research
Sparse Optimization

Introduction

This page explains how you can use the Research Environment to develop and test a Sparse Optimization Index
Tracking hypothesis, then put the hypothesis in production.

Create Hypothesis

Passive index fund portfolio managers buy the corresponding weightings of stocks from an index's constituents.
The main idea is to allow market participants to trade an index at a lower cost. Their performance is measured by
Tracking Error (TE), which is the standard deviation of the active return of the portfolio versus its benchmark
index. A lower TE means the portfolio tracks the index more accurately and consistently.

A technique called Sparse Optimization comes into play as portfolio managers want to cut their costs even lower
by trading less frequently and with more liquid stocks. They select a desired group of (or all) constituents from an
index and try to strike a balance between the number of stocks in the portfolio and the TE, similar to the idea of
L1/L2-regularization.

On the other hand, long-only active funds aim to beat the benchmark index. Their performance is measured by
the mean-adjusted tracking error, which also takes the mean active return into account, so a better fund can be
identified as one consistently beating the index by n%.

We can combine the 2 ideas. In this tutorial, we generate our own active fund and try to use Sparse Optimization
to beat QQQ. However, we need a new measure on the active fund for this technique -- Downward Risk (DR).
This is a measure just like the tracking error, but it leaves out the downward periods of the index, i.e. we only
want to model the index's upward return, not its downward loss. For a more robust regression, we also combine
it with the Huber function as our loss function. This is known as Huber Downward Risk (HDR). Please refer to
Optimization Methods for Financial Index Tracking: From Theory to Practice. K. Benidis, Y. Feng, D. P. Palomar
(2018) for technical details.
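
To make the two error measures concrete, here is a minimal sketch, assuming portfolio_ret and index_ret are aligned pandas Series of daily returns. The downward-risk formulation shown (penalizing only the days the portfolio falls short of the index) is one common variant; see the paper above for the exact definitions:

PY

import numpy as np

def tracking_error(portfolio_ret, index_ret):
    # Standard deviation of the active return.
    return np.std(portfolio_ret - index_ret)

def downward_risk(portfolio_ret, index_ret):
    # Penalize only the downward deviations, i.e. the days the
    # portfolio fails to capture the index's upward moves.
    active = portfolio_ret - index_ret
    return np.sqrt(np.mean(np.minimum(active, 0)**2))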

Import Libraries

We'll need to import libraries to help with data processing and visualization. Import numpy , matplotlib and pandas

libraries by the following:

PY

import numpy as np
from matplotlib import pyplot as plt
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()
Get Historical Data

To begin, we retrieve historical data for researching.

1. Create a class to get the index/ETF constituents on a particular date.


PY

class ETFUniverse:
"""
A class to create a universe of equities from the constituents of an ETF
"""
def __init__(self, etf_ticker, universe_date):
"""
Input:
- etf_ticker
Ticker of the ETF
- universe_date
The date to gather the constituents of the ETF
"""
self.etf_ticker = etf_ticker
self.universe_date = universe_date

def get_symbols(self, qb):


"""
Subscribes to the universe constituents and returns a list of symbols and their timezone

Input:
- qb
The QuantBook instance inside the DatasetAnalyzer

Returns a list of symbols and their timezone


"""
etf_symbols = self._get_etf_constituents(qb, self.etf_ticker, self.universe_date)
security_timezone = None
security_symbols = []

# Subscribe to the universe price data
for symbol in etf_symbols:
    security = qb.add_security(symbol, Resolution.DAILY)
    security_timezone = security.exchange.time_zone
    security_symbols.append(symbol)

return security_symbols, security_timezone

def _get_etf_constituents(self, qb, etf_ticker, date):


"""
A helper method to retrieve the ETF constituents on a given date

Input:
- qb
The QuantBook instance inside the DatasetAnalyzer
- etf_ticker
Ticker of the ETF
- universe_date
The date to gather the constituents of the ETF

Returns a list of symbols


"""
date_str = date.strftime("%Y%m%d")
filename = f"/data/equity/usa/universes/etf/{etf_ticker.lower()}/{date_str}.csv"
try:
    df = pd.read_csv(filename)
except:
    print("Error: The ETF universe file does not exist")
    return
security_ids = df[df.columns[1]].values
symbols = [qb.symbol(security_id) for security_id in security_ids]
return symbols

2. Instantiate a QuantBook .
PY

qb = QuantBook()

3. Subscribe to the index/ETF.

In this tutorial, we'll be using QQQ.

PY

qqq = qb.add_equity("QQQ").symbol

4. Select all the constituents for research.

In this tutorial, we select the constituents of QQQ on 2020-12-31.

PY

assets, _ = ETFUniverse("QQQ", datetime(2020, 12, 31)).get_symbols(qb)

5. Prepare the historical return data of the constituents and the benchmark index to track.

PY

history = qb.history(assets, datetime(2020, 1, 1), datetime(2021, 3, 31), Resolution.DAILY)
historyPortfolio = history.close.unstack(0).loc[:"2021-01-01"]
pctChangePortfolio = np.log(historyPortfolio/historyPortfolio.shift(1)).dropna()

historyQQQ_ = qb.history(qqq, datetime(2020, 1, 1), datetime(2021, 3, 31), Resolution.DAILY)
historyQQQ = historyQQQ_.close.unstack(0).loc[:"2021-01-01"]
pctChangeQQQ = np.log(historyQQQ/historyQQQ.shift(1)).loc[pctChangePortfolio.index]

Prepare Data

We'll have to process our data and construct the proposed sparse index tracking portfolio.

1. Get the dimensional sizes.

PY

m = pctChangePortfolio.shape[0]; n = pctChangePortfolio.shape[1]

2. Set up optimization parameters (penalty of exceeding bounds, Huber statistics M-value, penalty weight).
PY

p = 0.5
M = 0.0001
l = 0.01

3. Set up the convergence tolerance, the maximum number of optimization iterations, the iteration counter, and
HDR as the minimization indicator.

PY

tol = 0.001
maxIter = 20
iters = 1
hdr = 10000

4. Initial weightings and placeholders.

PY

w_ = np.array([1/n] * n).reshape(n, 1)
weights = pd.Series()
a = np.array([None] * m).reshape(m, 1)
c = np.array([None] * m).reshape(m, 1)
d = np.array([None] * n).reshape(n, 1)

5. Iterate the minimization algorithm to minimize the HDR.


PY

while iters < maxIter:
    x_k = (pctChangeQQQ.values - pctChangePortfolio.values @ w_)
    for i in range(n):
        w = w_[i]
        d[i] = d_ = 1/(np.log(1+l/p)*(p+w))
    for i in range(m):
        xk = float(x_k[i])
        if xk < 0:
            a[i] = M / (M - 2*xk)
            c[i] = xk
        else:
            c[i] = 0
            if 0 <= xk <= M:
                a[i] = 1
            else:
                a[i] = M/abs(xk)

    L3 = 1/m * pctChangePortfolio.T.values @ np.diagflat(a.T) @ pctChangePortfolio.values
    eigVal, eigVec = np.linalg.eig(L3.astype(float))
    eigVal = np.real(eigVal); eigVec = np.real(eigVec)
    q3 = 1/max(eigVal) * (2 * (L3 - max(eigVal) * np.eye(n)) @ w_ + eigVec @ d - 2/m *
                          pctChangePortfolio.T.values @ np.diagflat(a.T) @ (c - pctChangeQQQ.values))

    # We want to keep the upper bound of each asset's weight at 0.1.
    u = 0.1
    mu = float(-(np.sum(q3) + 2)/n); mu_ = 0
    while mu > mu_:
        mu = mu_
        index1 = [i for i, q in enumerate(q3) if mu + q < -u*2]
        index2 = [i for i, q in enumerate(q3) if -u*2 < mu + q < 0]
        mu_ = float(-(np.sum([q3[i] for i in index2]) + 2 - len(index1)*u*2)/len(index2))

    # Obtain the weights and HDR of this iteration.
    w_ = np.amax(np.concatenate((-(mu + q3)/2, u*np.ones((n, 1))), axis=1), axis=1).reshape(-1, 1)
    w_ = w_/np.sum(abs(w_))
    hdr_ = float(w_.T @ w_ + q3.T @ w_)

    # If the HDR converges, we take the current weights.
    if abs(hdr - hdr_) < tol:
        break

    # Else, increase the iteration count and use the current weights for the next iteration.
    iters += 1
    hdr = hdr_

6. Save the final weights.

PY

for i in range(n):
    weights[pctChangePortfolio.columns[i]] = w_[i]

7. Get the historical return of the proposed portfolio.

PY

histPort = historyPortfolio.dropna() @ np.array([weights[pctChangePortfolio.columns[i]]
                                                 for i in range(pctChangePortfolio.shape[1])])
Test Hypothesis

To test the hypothesis, we wish to see (1) the portfolio outcompete the benchmark, and (2) the active return stay
consistent across the in- and out-of-sample periods.

1. Obtain the equity curve of our portfolio and normalized benchmark for comparison.

PY

proposed = history.close.unstack(0).dropna() @ np.array([weights[pctChangePortfolio.columns[i]]
                                                         for i in range(pctChangePortfolio.shape[1])])
benchmark = historyQQQ_.close.unstack(0).loc[proposed.index]
normalized_benchmark = benchmark / (float(benchmark.iloc[0])/float(proposed.iloc[0]))

2. Obtain the active return.

PY

proposed_ret = proposed.pct_change().iloc[1:]
benchmark_ret = benchmark.pct_change().iloc[1:]
active_ret = proposed_ret - benchmark_ret.values

3. Plot the result.

PY

fig = plt.figure(figsize=(15, 10))
plt.plot(proposed, label="Proposed Portfolio")
plt.plot(normalized_benchmark, label="Normalized Benchmark")
min_ = min(min(proposed.values), min(normalized_benchmark.values))
max_ = max(max(proposed.values), max(normalized_benchmark.values))
plt.plot([pd.to_datetime("2021-01-01")]*100, np.linspace(min_, max_, 100), "r--",
         label="in- and out-of-sample separation")
plt.title("Equity Curve")
plt.legend()
plt.show()
plt.clf()

fig, ax = plt.subplots(1, 1)
active_ret["Mean"] = float(active_ret.mean())
active_ret.plot(figsize=(15, 5), title="Active Return", ax=ax)
plt.show()
We can see from the plots that in both the in- and out-of-sample periods, the proposed portfolio outperforms the
benchmark while remaining highly correlated with it. Although the active return might not be very consistent, it is
a stationary series above zero. So, in the long run, it consistently outcompetes the QQQ benchmark!
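
We can back the stationarity claim with an Augmented Dickey-Fuller test from statsmodels; a short sketch, assuming active_ret still holds the daily active returns together with the helper "Mean" column added above:

PY

from statsmodels.tsa.stattools import adfuller

series = active_ret.drop(columns=["Mean"]).iloc[:, 0].dropna()
adf_stat, p_value = adfuller(series)[:2]
print(f"ADF statistic: {adf_stat:.3f}, p-value: {p_value:.4f}")
# A p-value below 0.05 rejects the unit root, supporting stationarity.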

Set Up Algorithm

Once we are confident in our hypothesis, we can export this code into backtesting.
PY

def initialize(self) -> None:


self.set_start_date(2017, 1, 1)
self.set_brokerage_model(BrokerageName.ALPHA_STREAMS)
self.set_cash(1000000)

# Add our ETF constituents of the index that we would like to track.
self.QQQ = self.add_equity("QQQ", Resolution.MINUTE).symbol
self.universe_settings.asynchronous = True
self.universe_settings.resolution = Resolution.MINUTE
self.add_universe(self.universe.etf(self.QQQ, self.universe_settings, self.etf_selection))

self.set_benchmark("QQQ")

# Set up variables to flag the time to recalibrate and hold the constituents.
self.rebalance_time = datetime.min
self.assets = []

We'll also need to create a function for getting the ETF constituents.

PY

def etf_selection(self, constituents: ETFConstituentUniverse) -> List[Symbol]:
    # We want all constituents to be considered.
    self.assets = [x.symbol for x in constituents]
    return self.assets

Now we export our model into the on_data method. We will switch qb with self and replace methods with their
QCAlgorithm counterparts as needed. In this example, this is not an issue because all the methods we used in
research also exist in QCAlgorithm.

PY

def on_data(self, slice: Slice) -> None:
    qb = self
    if self.rebalance_time > self.time:
        return

    # Prepare the historical return data of the constituents and the ETF (as the index to track).
    history = qb.history(self.assets, 252, Resolution.DAILY)
    if history.empty: return

    historyPortfolio = history.close.unstack(0)
    pctChangePortfolio = np.log(historyPortfolio/historyPortfolio.shift(1)).dropna()

    historyQQQ = qb.history(self.add_equity("QQQ").symbol, 252, Resolution.DAILY)
    historyQQQ = historyQQQ.close.unstack(0)
    pctChangeQQQ = np.log(historyQQQ/historyQQQ.shift(1)).loc[pctChangePortfolio.index]

    m = pctChangePortfolio.shape[0]; n = pctChangePortfolio.shape[1]

    # Set up optimization parameters.
    p = 0.5; M = 0.0001; l = 0.01

    # Set up convergence tolerance, maximum iteration of optimization, iteration counter and Huber
    # downward risk as the minimization indicator.
    tol = 0.001; maxIter = 20; iters = 1; hdr = 10000

    # Initial weightings and placeholders.
    w_ = np.array([1/n] * n).reshape(n, 1)
    self.weights = pd.Series()
    a = np.array([None] * m).reshape(m, 1)
    c = np.array([None] * m).reshape(m, 1)
    d = np.array([None] * n).reshape(n, 1)

    # Iterate to minimize the HDR.
    while iters < maxIter:
        x_k = (pctChangeQQQ.values - pctChangePortfolio.values @ w_)
        for i in range(n):
            w = w_[i]
            d[i] = d_ = 1/(np.log(1+l/p)*(p+w))
        for i in range(m):
            xk = float(x_k[i])
            if xk < 0:
                a[i] = M / (M - 2*xk)
                c[i] = xk
            else:
                c[i] = 0
                if 0 <= xk <= M:
                    a[i] = 1
                else:
                    a[i] = M/abs(xk)

        L3 = 1/m * pctChangePortfolio.T.values @ np.diagflat(a.T) @ pctChangePortfolio.values
        eigVal, eigVec = np.linalg.eig(L3.astype(float))
        eigVal = np.real(eigVal); eigVec = np.real(eigVec)
        q3 = 1/max(eigVal) * (2 * (L3 - max(eigVal) * np.eye(n)) @ w_ + eigVec @ d - 2/m *
                              pctChangePortfolio.T.values @ np.diagflat(a.T) @ (c - pctChangeQQQ.values))

        # We want to keep the upper bound of each asset's weight at 0.1.
        u = 0.1
        mu = float(-(np.sum(q3) + 2)/n); mu_ = 0
        while mu > mu_:
            mu = mu_
            index1 = [i for i, q in enumerate(q3) if mu + q < -u*2]
            index2 = [i for i, q in enumerate(q3) if -u*2 < mu + q < 0]
            mu_ = float(-(np.sum([q3[i] for i in index2]) + 2 - len(index1)*u*2)/len(index2))

        # Obtain the weights and HDR of this iteration.
        w_ = np.amax(np.concatenate((-(mu + q3)/2, u*np.ones((n, 1))), axis=1), axis=1).reshape(-1, 1)
        w_ = w_/np.sum(abs(w_))
        hdr_ = float(w_.T @ w_ + q3.T @ w_)

        # If the HDR converges, we take the current weights.
        if abs(hdr - hdr_) < tol:
            break

        # Else, increase the iteration count and use the current weights for the next iteration.
        iters += 1
        hdr = hdr_

    # -----------------------------------------------------------------------------------------
    orders = []
    for i in range(n):
        orders.append(PortfolioTarget(pctChangePortfolio.columns[i], float(w_[i])))
    self.set_holdings(orders)

    # Recalibrate on quarter end.
    self.rebalance_time = Expiry.END_OF_QUARTER(self.time)

Reference

Optimization Methods for Financial Index Tracking: From Theory to Practice. K. Benidis, Y. Feng, D. P. Palomar
(2018). Foundations and Trends in Signal Processing. 3-3. pp. 171-279.
