A Beginners Guide To Python Programming For Traders

Uploaded by

Nebojsa Vidic


A Beginners

Guide to Python
Programming for
Traders

Connors Research, LLC

Steve Jost

Publishing
Copyright © 2020, Connors Research, LLC.

Published by The Connors Group, Inc., 185 Hudson St., Suite 2500, Jersey City, NJ
07311

ALL RIGHTS RESERVED. No part of this publication may be reproduced, stored in a
retrieval system, or transmitted, in any form or by any means, electronic, mechanical,
photocopying, recording, or otherwise, without the prior written permission of the
publisher and the author.

This publication is designed to provide accurate and authoritative information in regard
to the subject matter covered. It is sold with the understanding that the author and the
publisher are not engaged in rendering legal, accounting, or other professional service.

Authorization to photocopy items for internal or personal use, or in the internal or
personal use of specific clients, is granted by The Connors Group, Inc., provided that
the U.S. $7.00 per page fee is paid directly to The Connors Group, Inc.,
1-973-494-7311.

Publisher’s Notice

The publisher has provided this eBook to you without Digital Rights Management (DRM)
software applied so that you can enjoy reading it on your personal devices. This eBook
is for your personal use only. You may not print or post this eBook, or make this eBook
publicly available in any way. You may not copy, reproduce, or upload this eBook
except to read it on your personal devices.

Copyright infringement is against the law. If you believe the copy of this eBook you are
reading infringes on the author’s copyright, please notify the publisher at
[email protected].

ISBN 978-0-578-68440-6

Printed in the United States of America.

Disclaimer
The Connors Group, Inc., Connors Research, LLC, Steve, and Laurence A. Connors (collectively referred
to as “Company") are not investment advisory services, nor registered investment advisors or
broker-dealers and do not purport to tell or suggest which securities or currencies customers should buy
or sell for themselves. The analysts and employees or affiliates of Company may hold positions in the
stocks, currencies, or industries discussed here. You understand and acknowledge that there is a very
high degree of risk involved in trading securities and/or currencies. The Company, the authors, the
publisher, and all affiliates of Company assume no responsibility or liability for your trading and
investment results. Factual statements on the Company's website, or in its publications, are made as of
the date stated and are subject to change without notice.

It should not be assumed that the methods, techniques, or indicators presented in these products will be
profitable or that they will not result in losses. Past results of any individual trader or trading system
published by Company are not indicative of future returns by that trader or system, and are not indicative
of future returns which may be realized by you. In addition, the indicators, strategies, columns, articles and all
other features of Company's products (collectively, the "Information") are provided for informational and
educational purposes only and should not be construed as investment advice. Examples presented on
Company's website are for educational purposes only. Such set-ups are not solicitations of any order to
buy or sell. Accordingly, you should not rely solely on the Information in making any investment. Rather,
you should use the Information only as a starting point for doing additional independent research in order
to allow you to form your own opinion regarding investments. You should always check with your licensed
financial advisor and tax advisor to determine the suitability of any investment.

HYPOTHETICAL OR SIMULATED PERFORMANCE RESULTS HAVE CERTAIN INHERENT
LIMITATIONS. UNLIKE AN ACTUAL PERFORMANCE RECORD, SIMULATED RESULTS DO NOT
REPRESENT ACTUAL TRADING AND MAY NOT BE IMPACTED BY BROKERAGE AND OTHER
SLIPPAGE FEES. ALSO, SINCE THE TRADES HAVE NOT ACTUALLY BEEN EXECUTED, THE
RESULTS MAY HAVE UNDER- OR OVER-COMPENSATED FOR THE IMPACT, IF ANY, OF CERTAIN
MARKET FACTORS, SUCH AS LACK OF LIQUIDITY. SIMULATED TRADING PROGRAMS IN
GENERAL ARE ALSO SUBJECT TO THE FACT THAT THEY ARE DESIGNED WITH THE BENEFIT OF
HINDSIGHT. NO REPRESENTATION IS BEING MADE THAT ANY ACCOUNT WILL OR IS LIKELY TO
ACHIEVE PROFITS OR LOSSES SIMILAR TO THOSE SHOWN.

The Connors Group, Inc.

185 Hudson St.
Suite 2500
Jersey City, NJ 07311

Connors Research, LLC

185 Hudson St.
Suite 2500
Jersey City, NJ 07311

Table of Contents

Chapter 1 - Introduction
● Why is Python taking over Quantitative Finance?
● What can Python do for your trading and your career?
● My Journey
● About this e-book

Chapter 2 - Introduction to Quantopian.com

● Setting up your Quantopian Account
● Research/Notebook Environment
● Algorithm Environment

Chapter 3 - Getting Started with Python - The Basics

● Variables
○ What are variables?
○ How to assign variables in Python
● Basic Math – Using Python as a Calculator
● Comparison Operators
● Comments

Chapter 4 - Data Types in Python

● What are DataTypes and Why Do I Care?
● type() Function
● Integers
● Floats
● Strings
● Booleans
● Python Lists
○ len() function
● Python Dictionaries

Chapter 5 - Control Flow Statements

● What are Control Flow Statements and Why Do I Care
● White Space Matters
● If Statements
● If … Else Statements
● If … Elif … Else Statements
● Loops
○ For Loops
○ While Loops

Chapter 6 - Intro To Pandas: The Most Important Library for Quantitative Trading

● What is Pandas and why do I care?
● Pandas Series and Pandas DataFrames
● Pandas Indexing and subsetting
○ Selecting a column
○ Selecting multiple columns
○ Selecting a row
■ .loc[]
■ .iloc[]
○ Selecting multiple rows
○ Selecting columns and rows
○ Creating a new column
○ Boolean Indexing
● Some Basic DataFrame Methods
○ What is a method?
○ .head()
○ .tail()
○ .shape
○ .sort_values()
○ .dropna()
○ .shift()
○ .mean()
○ .std()
○ .max()
○ .min()
○ .info()
○ .describe()
○ .value_counts()
● Resampling Data in Pandas
○ What does Resample mean?
○ .resample() method

Chapter 7 - Case Study: Using Pandas to Conduct Quantitative Financial Research

● Finding historical edges based on different RSI values

Chapter 8 - Introduction to Zipline and Quantopian

● What is Quantopian?
● What is Zipline?
● The initialize() function
● Scheduling other functions
● Writing a function containing trading logic and execution
● Ordering
● Accessing portfolio information

Chapter 9 – Case Study: Writing your first Zipline Algorithm and Analyzing the Results

● Case Study: writing your first Zipline algorithm
○ Logic
○ Set up our initialize function
○ Writing our trading logic
○ Sending Orders / Managing the Portfolio
○ Test Results - Analyzing your Backtest with Pyfolio

Chapter 10 – Conclusion and Next Steps

● Python Programming for Traders Course
● What is covered in the Python Programming for Traders Course?

Chapter 1 - Introduction

Python is the hottest programming language on the planet, not just in the Quantitative Finance
field but also in many industries. Python continues to gain more users than any of its
competitors and is soon projected to be the most popular programming language in the world.

Stackoverflow.com, the most popular website for coding-related questions, reports that the
number of questions regarding the Python programming language is growing much faster than
that of any other language, as the following graph shows, indicating Python’s recent rise in
popularity:

Python is behind some of the largest companies and most important projects in the world, from
helping Netflix stream videos to more than 100 million homes worldwide, to powering the very
popular photo-sharing phenomenon Instagram, to helping NASA in space exploration.

In fact, it was Python that recently aided the Event Horizon Telescope team in stitching together
the first image of a black hole some 500 million trillion kilometers away!

While Python is popular in many industries, it is especially popular in the Data Science arena in
general and Quantitative finance in particular.

Why is Python Taking Over Quantitative Finance?

Not only is Python growing faster than other programming languages for general use, it's
already the most used language for quantitative finance and trading on the professional level.

There are many reasons for Python’s rise to the preeminent language used in quantitative
finance and data science. Some of these reasons include:

● Simple and intuitive syntax

● Python’s vast collection of libraries designed specifically for data manipulation and
analysis, which are directly applicable for Finance and Quant trading.
● Python is the most popular language for statistical modeling and machine learning
● There are already several backtesting engines written in Python, which allow you to
backtest trading ideas and build sophisticated trading models. One example is Zipline,
the backtesting engine that powers Quantopian.com and the tool we will be working with
for this e-book.
● Being a general-purpose language, Python can be used throughout the entire quant
finance pipeline - from research to backtesting to trade execution.
● Python has an enormous user community that continues to grow rapidly. The
advantages of this include significant collaboration opportunities and the ability to get
your coding-related questions answered.

What Can Python Do For Your Trading and Your Career?

Don’t take our word for this; just do a quick search for any trading-related job posting by the
largest and most sophisticated investment banks and hedge funds in the world.

You will see that all of these institutions require Python knowledge to get your foot in the door.
These companies don’t care that you can code in retail products like Amibroker and
TradeStation!

These institutions realize that Python makes them more efficient and, more importantly, more
profitable! They require that any new hires be well versed in the Python programming language.

Check out some recent job postings below:

Here is the bottom line - if you want to use the same tools as the largest, most sophisticated
hedge funds and investment banks in the world, you need to learn Python. If you want to get a
job in the industry, you need to know Python. If you want to backtest your trading ideas in a
robust and professional manner, you need to know Python. To keep up with the times, and not
become outdated, you need to learn Python!

My Python / Coding / Systematic Trading Journey

My trading and strategy development skills were greatly enhanced when I learned how to
program in Python. Below is a quick summary of my coding and systematic trading journey so
far.

Before joining Connors Research, I was an electrical engineer for 35 years, retiring in
2019. Aside from engineering, my other passion is finance and trading.

As an avid reader of stock trading books, I applied techniques I read about to managing my own
money but never really achieved much success as a discretionary trader.

I was also drawn to systematic trading and viewed it as a way to possibly achieve the success
that eluded me as a discretionary trader. If I could develop one or more quantified approaches
that suited my personality and risk tolerance, I would be much more likely to have confidence in
my trading and be able to stick it out through the inevitable drawdowns. In short, having realistic
expectations based on quantified results and a set of rules to follow would enable me to control
my emotions and to become a better trader.

I started my journey to systematic trading using Excel, but it quickly became apparent that if I
was going to make any real progress, I had to learn how to program.

My first venture into programming was using a community-based platform called WealthLab in
2003. As an open source platform, WealthLab had an active community that would freely share
code. Not all of it was great, as there was a lot of curve fitting and one-upmanship, but I did take
away from this experience a few good ideas and I learned the basics of how to program. Fidelity
bought WealthLab in 2004 and that was the end of the experience for me.

I drifted away from systematic trading for lack of an inexpensive platform, but eventually found
one that I could afford with Amibroker and Norgate Data. About that time I also found Connors
Research and purchased a number of their guidebooks on short-term trading. I enjoyed
implementing the rules in Amibroker code and obtained good results trading the systems.

Enter Python

Amibroker worked out well for me for a number of years, but it had its limitations.

Because it is a closed-source language, I was at the mercy of the developers at Amibroker to
advance the language and extend its functionality. I also missed the community-based
experience that I had years earlier with WealthLab.

Moving Into The Big Leagues

I needed to step up my game. While Amibroker was fine, at the end of the day it is a retail
product. I wanted to enter the big leagues and learn a more flexible, open-source, professionally
used coding language. A language that the biggest and best banks and hedge funds on Wall
Street were using.

I began reading how Python was taking over quantitative finance and how it could be used for
trading and backtesting. The decision from there was easy; I set out to learn Python and to find
another platform to develop and backtest trading algorithms.

How Python Made Me a Better Trader

Fast forward nine months and Python is now my primary tool for trading - ranging from
developing complete trading strategies to analyzing my backtests, to finding new trading edges.

Python has greatly expanded my skill set, ultimately making me a better, more profitable trader.
Below are some of the ways I use Python in my trading and research. I can now...

● Code any strategy I can think of with much greater efficiency and improved flexibility.

● Code portfolio level trading strategies, not just strategies applied to one security at a
time. This is a required skill for professional trading system development.

● Professionally manage hundreds and even thousands of individual securities (think US
Stocks) in a dynamically changing universe and use that universe to create strategies.

● Do advanced number-crunching - Python has largely replaced Excel for me!

● Perform backtests on Futures Contracts, using continuous futures with several “roll”
options for realistic simulations.

● Test individual trading signals for historical edges before incorporating that signal into a
complete strategy, answering questions such as “every time the RSI has been below
10, what has happened over the next 3 business days?”

● Utilize hundreds of fundamental data points in my trading strategies such as price/book
ratio, return on equity, and gross margins to name a few.

● Perform deep analysis of backtested results, including custom metrics.

● Run statistical and machine learning models, which are already pre-programmed in
Python.

● Write custom code to monitor the performance of my portfolio in real-time.

● Make custom charts and plots of any data I can get my hands on - this includes things
like line charts, bar charts, scatter plots, and correlation matrices.

● Analyze data not just from the Quantopian database, but also from any website, program
with a Python API (there are tons of them), Excel spreadsheets or CSV files.

● Interact with a large community of traders, data scientists, developers, and researchers
using Python. This has huge benefits including always having somebody available to
help with questions and generous sharing of code throughout the community.

● Seek advanced career opportunities in quantitative finance. If you look at the
professional job boards, almost every trading job, quant job, researcher, or portfolio
manager now requires you to know Python. Nobody requires you to be able to code in
EasyLanguage or AmiBroker!

The bottom line is that learning Python advanced my professional knowledge, made my work
more efficient, helped me find new and better trading edges, greatly expanded my backtesting
ability, and led to me being a better, more profitable trader.

Python can do the same for you!

About This e-Book

This e-book is designed to help start you on your journey to using the power of Python to
improve your research, trading, and career prospects. It is not meant to be a complete course
but rather is meant to start you down the right path.

There are many great sources teaching the basics of Python. With this e-book, we don’t intend
to just put another resource out there that covers the same topics. Instead, our aim is to teach
some of the most important skills you will need to know in Python specifically for quantitative
finance, research, and systematic trading.

In this e-book, you will first learn the basics of Python, setting the foundation for some of the
more advanced topics we cover later in the book. This includes topics such as basic math in
Python, variable assignments, comparison operators, data types, and control flow statements.
While this serves as a general introduction to the language, special attention is paid to Python
techniques useful for trading and quantitative research.

After the basics have been covered, we will move on to the most important Python package
used for quantitative financial research and trading – Pandas.

After spending some time getting to know the basics of the Pandas package, we will go on to
the heart of the book, demonstrating some quantitative research and trading strategy
development using our new Python and Pandas skills.

We will start with a highly practical quantitative research case study. In this chapter, we will
walk you through how to check for edges in the marketplace by observing the future
performance of a security based on the reading of a technical analysis indicator. For our
example, we will use the popular technical analysis indicator RSI (relative strength index).

After gaining the knowledge provided in this chapter, you will be able to take this general
framework and inspect any trading indicator (or combination of indicators) to statistically see if it
has any predictive power.

We will then move on to an introduction to Zipline - the backtesting engine that powers
Quantopian.com. You will learn the basics of the Zipline API, the structure of a trading
algorithm, how to send orders, and how to access portfolio level information.

Finally, we will end the book with a complete walkthrough of how to code our first full-blown
trading strategy using the research we conducted in Chapter 7. We will go step by step,
explaining the logic of the trading strategy complete with code snippets. At the end of the
chapter, we will provide the complete source code, which is also available for download.

Last but not least, we will inspect our test results using Pyfolio - the open-source Python
package designed to analyze historical backtested results.

I want this to be a highly practical book, specifically designed to start you on your journey to
leveling up your trading by showing real-world research examples and developing a profitable
trading strategy from scratch.

My hope is that you walk away from this book excited by the possibilities Python can offer an
aspiring, or even an established, quantitative trader or investment manager.

Let's get started!

Steve Jost and Connors Research, LLC

March 2020

Chapter 2 - Introduction to Quantopian.com

In this chapter, we are going to quickly cover the tool we will be utilizing throughout the book -
Quantopian.com. While we will be working in Quantopian for the purposes of this book,
remember all of the Python skills you will learn can be applied to your local environment as well.

Setting Up Your Quantopian Account

The first step to get going with Quantopian is to set up a Quantopian account. This is done
simply and easily and should take all of five minutes to complete.

1. Go to Quantopian.com and click on the “Sign Up” button on the top right-hand corner.

Images provided courtesy of Quantopian.com

2. From there, simply fill in your name, email address, and create a password.
3. A confirmation email will be sent to the email address provided. Open this email and
confirm that it is you. After that you are all set: just go back to Quantopian.com, enter your
credentials under “Log In” (top right-hand corner), and you are in. Welcome to
Quantopian!

Research/Notebook Environment

The first environment, or place where you write code, we will cover in Quantopian.com is what is
known as the “research” or “notebook” environment. This environment contains embedded
Jupyter Notebooks. These Jupyter Notebooks have quickly become the ubiquitous tool for
quants everywhere, with some going as far as saying that Jupyter Notebooks are the new
Excel!

These notebooks are primarily designed to facilitate iterative coding. This means having the
ability to observe your data, write some code, and inspect the results in a step-by-step fashion.
This ensures that your code is doing what you intend it to do and greatly helps the development
process.

To enter the research/notebook environment, click on the “Research” dropdown menu at the top
of Quantopian.com (after you have logged in to your account) and select “Notebooks” from the
dropdown menu.


This will take you to a screen containing your existing notebooks. To create a new notebook,
click on the blue plus sign ( + ) at the top right hand side of the page.


This leads you to a blank Notebook where you can write Python code, pull in data provided by
Quantopian, do research, make charts, and much more!


Algorithm Environment

The other environment we will be working in is known as the “algorithm” environment. This is
where you construct complete trading models and produce historical (backtested) test results.
We will be working in this environment in Chapters 8 and 9, where we construct a trading model
from scratch.

To access this environment, again go to the “Research” tab at the top of the page. This time
select “Algorithms” from the dropdown menu.


This takes you to a screen displaying all of your saved Algorithms. To create a new Algorithm,
select “New Algorithm” at the top right-hand side of the screen.

After you name your new Algorithm, this takes you to the Algorithm environment, where you can
construct full trading models. There will be some pre-populated sample code to get you started.
As a beginner, I would suggest deleting that code and starting from scratch. I will guide you
through writing a complete trading Algorithm in Chapters 8 and 9.

Now that you are familiar with Quantopian.com, let's go on to some Python basics. All of the
example code provided over the next couple of chapters has been created in the
research/notebook environment. We will then go on to code in the Algorithm environment to
conclude the book.

Chapter 3 - Getting Started with Python - The Basics

Variables

What are variables?

A variable can best be thought of as a named container used to store a value.

As a practical example, let's say you wanted to code a moving average. You first have to
choose a length for your moving average (50-day, 200-day, etc). You could hard code this into
your code, or you can make a variable called something like “ma_length” which contains the
length of that moving average.

The big advantage of the latter approach is that you can easily go back and change the length
of the moving average by changing the value of “ma_length”. All moving averages in your code
that use “ma_length” will then change as well.

This makes your code much more dynamic and robust, and it is considered a best-practice
methodology.
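
As a rough illustration of this idea, the sketch below computes a simple moving average whose length is controlled by a single variable. The price values and the names “closes” and “latest_ma” are made up for this example and are not from the original text; it also uses a Python list, which is covered in Chapter 4.

```python
# Hypothetical closing prices, invented for illustration only.
closes = [101.0, 102.5, 101.8, 103.2, 104.0, 103.5]

# The moving-average length lives in one place.
ma_length = 3

# Average of the most recent ma_length closing prices.
latest_ma = sum(closes[-ma_length:]) / ma_length
print(latest_ma)
```

Changing “ma_length” to another value changes the calculation everywhere the variable is used, without touching the rest of the code.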

How to assign variables in Python

Before we use a variable in our code, we first have to assign it. Assigning variables in Python is
straightforward: we just use the equal sign (=). This stores whatever is to the right of the equal
sign in the variable named on the left.

From there, we can just reference the variable name in our code, and Python will know what
value is assigned to that variable.

Examples of Setting Variables:
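
A minimal sketch of such assignments, reconstructed from the description that follows rather than copied from the original listing:

```python
# Assign the value 50 to the variable ma_length, then display it.
ma_length = 50
print(ma_length)   # prints 50

# Re-assign the same variable to a new value.
ma_length = 200
print(ma_length)   # prints 200
```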

In the above examples, we first set the variable called “ma_length” to the number 50. We then
printed out “ma_length” and Python told us this variable is set to the number 50.

In the next example, we set “ma_length” to the number 200. We then printed out “ma_length”
and Python told us this variable is now set to 200.

Basic Math – Using Python as a Calculator

Doing basic math in Python is very straightforward. The following table displays some of the
basic mathematical operations that can be done in the language.

You might be thinking: why do I need Python for this? Doesn’t a simple calculator or a
spreadsheet such as Microsoft Excel have this functionality?

Yes, it does, but as will become clearer as your Python skills advance, Python can get
tasks done far more efficiently than those other tools.

Below, find some examples of using Python for basic math operations.
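
The following is an illustrative sketch of the standard Python arithmetic operators. The values 5 and 2 and the variable names “a” and “b” are chosen only for this example, and the division result assumes Python 3.

```python
a = 5
b = 2

print(a + b)    # addition: 7
print(a - b)    # subtraction: 3
print(a * b)    # multiplication: 10
print(a / b)    # division: 2.5 in Python 3
print(a ** b)   # exponent (5 squared): 25
print(a % b)    # remainder after division (modulo): 1
```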

Comparison Operators in Python

Comparison operators are what they sound like - these are statements comparing one thing to
another. Common comparison operators include “greater than”, “less than”, “equal to” etc.

Find a table below outlining common comparison operators in Python, which will be very useful
in future trading logic.

One thing trips up some new programmers: the code that checks whether something is equal to
something else. Notice that for this logic, we need a double equal sign (==).

This is because a single equal sign is already used to assign a variable in Python, as we
learned earlier.

When you ask Python to evaluate comparison operations, you will get a “True” or “False” value
back. When your Python skills improve, you will learn how to use this output to control your
trading logic.

For now, some simple code examples using comparison operators are below.
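
A short sketch of the standard comparison operators follows. The variable “rsi” and its value are hypothetical, picked only to give the comparisons a trading flavor.

```python
rsi = 8  # hypothetical indicator reading

print(rsi < 10)    # less than: True
print(rsi > 10)    # greater than: False
print(rsi == 8)    # equal to (double equal sign): True
print(rsi != 8)    # not equal to: False
print(rsi >= 8)    # greater than or equal to: True
print(rsi <= 5)    # less than or equal to: False
```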
Data and analysis tools provided by Quantopian.com.

Comments

Comments are an important part of programming, and it is best to get in the habit of
commenting your code liberally right from the beginning.

Comments are lines in your code that are not read by the computer and instead are
designed to be read by an actual human. It is best practice to use comments throughout your
code to explain what the code is doing. This has a dual benefit.

First, it helps you understand what some code you previously wrote is trying to do when you
come back to it sometime in the future.

Second, it helps others to understand what your code is doing. This is especially necessary if
you are collaborating on a team, as your comments help another team member understand the
code.

Comments in Python are done in one of two ways. You can either start a line with a hashtag (#)
or wrap text in triple quotes (''').

A hashtag comments out a single line, while triple quotes comment out every line that follows
until another triple quote is encountered. Strictly speaking, a triple-quoted block is a string
that Python evaluates and discards rather than a true comment, but it is commonly used this way.
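
As a sketch of both styles (the comment wording here is illustrative, not the original listing):

```python
# This line starts with a hashtag, so Python ignores it.
ma_length = 50  # a hashtag comment can also follow code on the same line

'''
Everything between the two sets of triple quotes
is plain English text, spanning as many lines as needed.
'''
print(ma_length)   # prints 50
```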


Notice that in both code snippets, the lines either beginning with hashtags (#) or
surrounded by triple quotes (''') are simply English text and not Python code. Since these lines
are commented out, the computer won’t run them as code and thus won’t throw an error message.

Chapter 4 - Data Types in Python

What are DataTypes and Why Do I Care?

DataTypes are best thought of as categories of data in Python. The DataType of a value affects
how Python treats it in calculations and scripts.

It’s not immediately clear to novice programmers why they need to even care about what
DataType a certain variable is. After you have been programming for a while, however, it will
become obvious why we need to care – because Python treats different DataTypes in different
ways!

A failure to understand this can cause your code to fail or, worse, cause it to behave
differently than you intended.

The main DataTypes we will be covering in this section are

1. Integers
2. Floats
3. Strings
4. Booleans
5. Python Lists
6. Python Dictionaries

type() function

Before we dive into different DataTypes, we first must cover the type() function. This function
simply returns the DataType of whatever object you pass into it.

If you are unsure of what DataType something is, pass it into the type() function and Python will
tell you.

Integers

An integer, sometimes called an “int” for short, is simply a whole number. Said
another way, an integer is a number with no decimal point.
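
A minimal sketch consistent with the description that follows (the printed form shown in the comments assumes Python 3):

```python
# Pass a whole number directly into type().
print(type(3))    # <class 'int'> in Python 3

# Assign a whole number to a variable, then check the variable's type.
x = 10
print(type(x))    # <class 'int'> in Python 3
```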

In the example above, the first piece of code just passes the whole number 3 into the type()
function. Python then told us 3 is an integer (“int”).

In the second piece of code, we first assigned the whole number 10 to the variable “x”. We then
passed “x” into the type() function and Python again told us that “x” is an integer, since it is set to
10.

Floats

A float is simply a number that contains a decimal point.
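
A minimal sketch consistent with the description that follows (comments assume Python 3 output):

```python
# A number written with a decimal point is a float, even if it is a whole value.
print(type(5.0))   # <class 'float'> in Python 3

x = 0.35
print(type(x))     # <class 'float'> in Python 3
```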


Here we pass the number 5.0 into the type() function. Since 5.0 has a decimal point,
Python tells us this is a float. We then assign the number 0.35 to the variable “x” and pass “x” into
the type() function. Python tells us this is a float as well.

Strings

A string in Python is technically anything in quotes. Typically, but not always, strings are words
as opposed to numbers.
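
A minimal sketch consistent with the description that follows (comments assume Python 3 output):

```python
# Anything wrapped in quotes is a string.
print(type('python is cool'))   # <class 'str'> in Python 3

x = 'python is cool'
print(type(x))                  # <class 'str'> in Python 3
```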


Here we pass the phrase ‘python is cool’ into the type() function. Python tells us this is a string.
Notice the quotes around ‘python is cool’, indicating that this is, in fact, a string. We then assign
‘python is cool’ to the variable “x” and pass “x” into the type() function. Python again tells us “x” is
a string.

Booleans

Booleans can best be thought of as a type of “on” or “off” switch in Python. Technically,
booleans are “True” or “False” values. Notice the capitalization of “True” and “False.” These
are special Python words, stored as Boolean values.

You can either set a variable to “True” or “False” directly, or write a logical expression that
Python evaluates to either “True” or “False”, similar to what we saw previously.
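
A minimal sketch consistent with the description that follows (comments assume Python 3 output):

```python
# True and False are built-in Boolean values.
print(type(True))    # <class 'bool'> in Python 3
print(type(False))   # <class 'bool'> in Python 3

# A comparison evaluates to a Boolean, which can be stored in a variable.
x = 5 > 3
print(type(x))       # <class 'bool'> in Python 3
print(x)             # True
```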

In the code above, we first checked the type of the words “True” and “False”. Python told us these
were of the “bool” datatype, short for Boolean. We then set the variable “x” to “5 > 3” and
checked the type of “x”, which was “bool” again. Finally, we printed out “x”, which Python told us is
True (because 5 is greater than 3).

Python Lists

A Python list is a collection of objects, written inside square brackets “[]”.
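
A minimal sketch consistent with the description that follows. The variable name “my_first_list” comes from the text, but the five numbers are assumed here for illustration.

```python
# Check the type of a collection of numbers in square brackets.
print(type([1, 2, 3, 4, 5]))   # <class 'list'> in Python 3

# Assign the list to a variable (the values are illustrative).
my_first_list = [1, 2, 3, 4, 5]
print(type(my_first_list))     # <class 'list'> in Python 3
print(my_first_list)           # [1, 2, 3, 4, 5]
```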


In the code above, we first checked the type of a collection of numbers in square brackets,
and Python told us this is a list. We then assigned the list to a variable, “my_first_list”, and
checked the type of “my_first_list”. Python again told us this variable was a list. Finally, we
printed out “my_first_list” and Python returned the collection of numbers.

The len() function

The len() function is a quick way to check the length of an object. In this case, we are working
with Python lists, which are collections of objects in Python, as we just learned.

To check how many objects are in our Python list, we simply pass the list into the len() function.


In the code above, notice that Python returned the integer “5” after we passed our list
(my_first_list) into the len() function. This is because there is a total of 5 objects in this Python
list.
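
A minimal sketch of the list and len() examples follows. The five numbers are placeholders chosen for illustration; only the variable name “my_first_list” and the length of 5 come from the text.

```python
# A list literal is written with square brackets
print(type([1, 2, 3, 4, 5]))     # <class 'list'>

# Assign the list to a variable (the values here are illustrative)
my_first_list = [1, 2, 3, 4, 5]
print(type(my_first_list))       # <class 'list'>
print(my_first_list)             # [1, 2, 3, 4, 5]

# len() returns how many objects the list contains
list_length = len(my_first_list)
print(list_length)               # 5
```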

The len() function will come in handy when we are building trading algorithms, as you will see
later in this book.

Python Dictionary

A Python dictionary is a collection of key:value pairs. Python dictionaries are designed so you
can look up a key and get back the associated value. This is similar to a regular dictionary, where
you look up the word (the “key”) and it tells you the definition (the “value”).

Python dictionaries are set up using curly brackets “{}” and key-value pairs are separated by a
colon with a comma in between each set of pairs.

In the code above we first set the variable “my_first_dictionary” to a dictionary we made in curly
brackets, containing three key:value pairs of three countries and their corresponding largest cities.

We then checked the type of the variable “my_first_dictionary”. Python told us this is a
dictionary.

We then printed out the dictionary and Python displays it for us.

Finally, to show the practical functionality of Python dictionaries, we referenced our dictionary
and followed that by a key in square brackets. Python returns to us the value associated with
that key.

In this case “NYC” is returned after we looked up the key “USA”.
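
A sketch of the dictionary example is shown here. Only the “USA”: “NYC” pair is given in the text; the other two country/city pairs are assumed for illustration.

```python
# Keys are countries, values are their largest cities
# (only "USA": "NYC" is from the text; the other pairs are assumed)
my_first_dictionary = {"USA": "NYC", "Japan": "Tokyo", "France": "Paris"}

print(type(my_first_dictionary))    # <class 'dict'>
print(my_first_dictionary)

# Look up a value by placing its key in square brackets
largest_city = my_first_dictionary["USA"]
print(largest_city)                 # NYC
```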

Chapter 5 - Control Flow Statements

What are Control Flow Statements?

Control flow statements are a very important topic, specifically for trading purposes. These
statements control the “flow” of the code and are used to determine which lines of code get
executed and which do not.

As a practical example, it’s easy to imagine that we want to code logic such that if the price is
above a moving average we will buy, and if it is below a moving average we will sell. The most
straightforward way to achieve this would be an “if” statement.

White Space Matters

In Python, white space matters! Python uses indentation to group lines of code into blocks. Code
that is indented under a statement such as “if” is executed only when that statement’s condition
is True; otherwise the indented code is skipped. If this is confusing to you, the code examples to
follow should clear up any confusion.

If statements

If statements first check whether a condition or conditions are True. The code indented under the
if statement is executed if the preceding condition is True and skipped if it is False.


In the code above, we first ask Python to check a comparison operation: is 5 greater than 2? If
this is true, which of course it is, the code indented under the if statement is executed. In this
case, the code prints out “yes, this is true” as can be seen above.


In this next code snippet, we asked Python if 5 is less than 2. Of course, this is not true. As
such, since this line of code is False, the code indented under it which prints out “yes, this is
true” is not executed, which is why you don’t see it printed out.
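
The two if statements described above can be sketched as follows. The “executed” list is an addition for illustration only; it records which indented blocks actually ran.

```python
executed = []  # illustrative helper: records which indented blocks ran

# 5 > 2 is True, so the indented code runs
if 5 > 2:
    print("yes, this is true")
    executed.append("5 > 2")

# 5 < 2 is False, so the indented code is skipped
if 5 < 2:
    print("yes, this is true")
    executed.append("5 < 2")

print(executed)   # ['5 > 2']
```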

If … Else Statements

If … else statements are exactly what they sound like. Python first checks if the condition
following the “if” is true. If so, it executes the code under that condition. If not, then it executes
the code under the “else” statement.


In this code, Python first checks if 5 is greater than 2. Since it is, the code under that is
executed, so “yes, this is true” is printed.

Since the first if statement is true, the code under the else statement is not executed.


In this next example, the first line of code is False (5 is not less than 2). As such, the code under
the Else statement is executed, printing out “no, this is not true”.
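
A minimal sketch of both if … else examples is given here. The messages are stored in variables (an addition for illustration) before being printed.

```python
# First example: the condition is True, so the "if" branch runs
if 5 > 2:
    first_message = "yes, this is true"
else:
    first_message = "no, this is not true"
print(first_message)    # yes, this is true

# Second example: the condition is False, so the "else" branch runs
if 5 < 2:
    second_message = "yes, this is true"
else:
    second_message = "no, this is not true"
print(second_message)   # no, this is not true
```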

If … Elif … Else Statements

The next control flow statement we will learn is if … elif … else. This statement first checks if
the condition following the if statement is True. If it is, the code indented under the if statement
gets executed.

If not, then Python checks whether the condition following “elif” (short for “else if”) is True. If
that condition is True, the code indented under it is executed.

If both the if statement and the elif statement are false, then the code under the “else” statement
is executed. The following code examples demonstrate this concept.


In the above code, we set the integer 30 to the variable “x”. In the first if statement, we check if
“x” is greater than 20. Since it is, the indented code under the first if statement is executed,
printing out “the first if statement is true”.


In this next code example, the variable “x” is set to the integer 19. As such, the first if statement
is False. Python then goes to the next elif statement – is “x” greater than 15? Since it is (19 is
greater than 15), the indented code which prints out “the elif statement is true” gets executed.


Finally, in our last code example, we set the variable “x” to be the integer 10. As such, the first if
statement and the second elif statement are both False, since 10 is not greater than 20 and 10
is also not greater than 15. Since these lines are both False, Python executes the code under
the else statement – printing “neither are true”.
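
The three cases above can be sketched in one place. The function “describe” is a hypothetical helper, not part of the original examples; it simply lets the same if … elif … else logic run for each value of “x”.

```python
def describe(x):
    """Apply the same if / elif / else logic to any value of x."""
    if x > 20:
        return "the first if statement is true"
    elif x > 15:
        return "the elif statement is true"
    else:
        return "neither are true"

print(describe(30))   # the first if statement is true
print(describe(19))   # the elif statement is true
print(describe(10))   # neither are true
```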

Loops

Loops are an important topic in any programming language. Loops are used to iterate through a
collection of items, such as a Python list or a Python dictionary. There are two basic types of
loops – “for” loops and “while” loops.

A “for” loop iterates through each item in a collection one by one, executing whatever desired
code you wish to implement on each item in the collection. Once the loop works through each
item in the collection, the loop is stopped.

A “while” loop continues to execute the loop while a condition is True, only stopping when that
condition is False.

We will show examples of both types of loops below.

For Loops

In our first example, we will just make a Python list containing a collection of numbers. We will
then write a for loop to iterate through the list, simply printing out each item.

Notice in the code above, we set up a for loop by typing the Python keyword “for” followed by an
iterative variable “x”. This iterative variable can be named anything; for simplicity we will stick
with “x” as the iterative variable name. We next type “in” followed by the name of the collection
we wish to iterate through, “my_list”. This is followed by a colon.

The indented code is what happens inside the loop. In this simple example, all we do is iterate
through the list and print out the contents one by one as shown above.

Here is another example. This time we will iterate through the same list, add 5 to each item
and print out the result.


Notice in this example that Python takes each integer in our Python list “my_list”, adds 5 to it
then prints out the result.
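
Both for loops can be sketched as follows. The contents of “my_list” are assumed for illustration, and the “plus_five” list is an addition that collects the results alongside printing them.

```python
my_list = [1, 2, 3, 4, 5]   # illustrative values

# First loop: print each item one by one
for x in my_list:
    print(x)

# Second loop: add 5 to each item and print the result
plus_five = []
for x in my_list:
    print(x + 5)
    plus_five.append(x + 5)
```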

While Loops

A while loop executes the loop while a certain condition is True. Only when that condition fails
to be True does the loop stop.


In the code above, we first set the variable “y” equal to the integer 0. We then begin our while
loop. Our while loop first prints out the current value of “y”, then adds 1 to that value.

Notice that the loop body executes only while the logical statement, in this case “y is less than
or equal to 5”, is True. Once the value of y becomes greater than 5, the loop stops. This is
why the integer 6 is not printed out.
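
A minimal sketch of the while loop described above:

```python
y = 0

# The body runs as long as y is less than or equal to 5
while y <= 5:
    print(y)      # prints 0, 1, 2, 3, 4, 5 on separate lines
    y = y + 1     # without this line the loop would never end

# After the loop, y is 6, which is why 6 was never printed
print("final value of y:", y)
```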
Chapter 6 - Intro to Pandas: The Most Important Library for
Quantitative Trading

What is Pandas and Why Do I Care?

Now that we have some Python basics down, let’s get into more useful python programming as
it applies to Quantitative Finance and Trading. The most useful Python library for quantitative
research and trading is called Pandas.

Pandas is a software library written in Python designed specifically for data manipulation and
analysis. The Pandas library uses data in tabular format.

Tabular data is the main structure of data you will encounter in Finance and Trading. Pandas
was built specifically to work with this type of data, and more specifically, Pandas was built to
handle time series data.

Time series data refers to data that has an associated timestamp, such as a date or time (or
both). This is extremely common in Finance. Think of any price data you have encountered, for
example. This data will contain prices, maybe open, high, low and close along with volume,
accompanied by a timestamp. That is an example of time series data.

Pandas was designed to effectively deal with this type of data.

Let’s now jump into some Pandas coding examples. After we cover the basics of Pandas, we
will use our new-found skills to do some quantitative research, observing historical edges
provided by the popular technical analysis indicator RSI. Finally, we will go on to build our first
complete trading model in Python, incorporating the conclusions from our research project!
Let's get started.

Pandas DataFrames and Pandas Series

The two main data structures of the Pandas library are known as Pandas DataFrames and
Pandas Series.

Most of your work will be on Pandas DataFrames. Pandas DataFrames are simply tabular data,
similar to an Excel spreadsheet, only much more powerful. A Pandas DataFrame will always
have labeled rows, called the index, and labeled columns. These row and column labels will be
used heavily in your Pandas code.

A Pandas Series is similar to a Pandas DataFrame, except a Pandas Series only contains one
column. This is in contrast to Pandas DataFrames, which can contain multiple columns. Just
like DataFrames, Pandas Series will also have labeled rows, called an index.
Indexes and Column Names

As mentioned above, Pandas is designed to deal with tabular data. All data in Pandas
DataFrames will have labels for both the rows and the columns. We will use these labels to write
Pandas code.

The row labels in a Pandas Series or a Pandas DataFrame are called the index. Whenever we
refer to the index of a Series or DataFrame, we mean the label associated with each row. If
no explicit row labels are provided, the index will be an integer index starting with 0 (so 0, 1, 2,
3, etc.).

Having an integer index, however, is not very useful and is not a best practice. As you will soon
see, a lot of Pandas code is based on referencing the index, so it's best to have an index that
makes sense and actually means something, like a date for instance.

For finance and trading, the most common index is some kind of date or time (or both). For
example, it is very common to be working with pricing data, which almost all trading models will
require. This data will be of the time series variety, with each data point corresponding to a date
or time.

Column names are the other labels you need to know to write efficient Pandas code. For pricing
data, this is often a label such as “open”, “high”, “low”, “close”, “volume”, etc.

Getting Data into The Quantopian Notebook Environment

In the rest of the examples in this chapter, we will be working in the notebook environment on
the Quantopian website. This is an embedded Jupyter notebook within Quantopian where we
can grab data and practice our coding. All you have to do is sign up for Quantopian to access
this tool, no downloading required. Go ahead and create an account on Quantopian.com, which
should take all of 5 minutes, to follow along.

To grab some data to practice our Pandas coding skills, we need to use the built-in Quantopian
function “get_pricing”. This function takes several arguments (values you pass into the
function). These arguments control what data we are grabbing, how much data, the frequency of
the data, etc.
Here we will use the get_pricing function to grab the prices for the ETF “SPY” from 08/16/2019
to 08/22/2019 using daily frequency (daily bars).


Notice the arguments we passed into our “get_pricing” function. First the security we want data
for - “SPY”. Next, we specified a start date and an end date using the arguments “start_date”
and “end_date” respectively. Finally, we specified the frequency of our data using the
“frequency” argument.

This results in a Pandas DataFrame with the dates being the index (row labels) and
“open_price”, ”high”, ”low”, ”close_price”, ”volume” and ”price” being the column names.

We will use this small sample DataFrame for the rest of this chapter to demonstrate some basic
Pandas code.

First, let's save our new DataFrame as a variable. This is accomplished the same way all other
variables are assigned in Python, simply with the equal sign (=).

We will set our new DataFrame to the variable named “df”.


Let’s quickly check the datatype of our newly formed object we named “df”. We will do this the
same way we did earlier in this book, using the type() function:


Notice that Python tells us that “df” is a Pandas DataFrame. Let's check out some of the things
we can do with our newly created DataFrame.
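
Because get_pricing is built into the Quantopian notebook, it is not available in a plain Python session. The sketch below shows the call as a comment (the date-string format is an assumption) and then builds a stand-in DataFrame with the same column names so the steps above can be followed anywhere. All numbers are illustrative, not actual SPY prices.

```python
import pandas as pd

# Inside the Quantopian notebook the call would look roughly like this
# (argument names are from the text; the date format is assumed):
# df = get_pricing("SPY", start_date="2019-08-16",
#                  end_date="2019-08-22", frequency="daily")

# Stand-in DataFrame with the same layout; all values are illustrative
df = pd.DataFrame(
    {
        "open_price": [286.5, 292.2, 291.8, 292.5, 293.2],
        "high": [289.3, 293.08, 292.4, 292.9, 293.9],
        "low": [284.7, 291.4, 290.0, 291.7, 290.4],
        "close_price": [288.9, 292.3, 290.1, 292.5, 292.4],
        "volume": [52e6, 41e6, 38e6, 35e6, 36e6],
        "price": [288.9, 292.3, 290.1, 292.5, 292.4],
    },
    index=pd.to_datetime(
        ["2019-08-16", "2019-08-19", "2019-08-20", "2019-08-21", "2019-08-22"]
    ),
)

print(type(df))   # a Pandas DataFrame
print(df)
```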
Pandas Indexing and Subsetting

We will begin our Pandas coding by learning how to grab rows and columns.

Selecting a column

To select a column of a DataFrame, we simply pass the name of the column (as a string) into
square brackets after the name of the DataFrame.


In the code above, we simply passed ‘price’ into square brackets after the name of our
DataFrame (df). Remember ‘price’ is the name of one of the columns in our DataFrame. This
returns a Pandas Series containing just the ‘price’ column.

Selecting multiple columns

To select multiple columns from a DataFrame, you pass a Python list into square brackets after
the name of the DataFrame, with the list containing the names of the columns you wish to
select.

Remember a Python list is itself in square brackets. So, since we are passing a list into square
brackets after the name of the DataFrame, this actually results in double square brackets.

See the code snippet below.
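
Single- and multiple-column selection can be sketched as follows, using a stand-in DataFrame whose values are illustrative rather than actual SPY prices:

```python
import pandas as pd

# Stand-in for the chapter's 5-row DataFrame; all values are illustrative
df = pd.DataFrame(
    {
        "open_price": [286.5, 292.2, 291.8, 292.5, 293.2],
        "high": [289.3, 293.08, 292.4, 292.9, 293.9],
        "low": [284.7, 291.4, 290.0, 291.7, 290.4],
        "close_price": [288.9, 292.3, 290.1, 292.5, 292.4],
        "volume": [52e6, 41e6, 38e6, 35e6, 36e6],
        "price": [288.9, 292.3, 290.1, 292.5, 292.4],
    },
    index=pd.to_datetime(
        ["2019-08-16", "2019-08-19", "2019-08-20", "2019-08-21", "2019-08-22"]
    ),
)

# One column name in square brackets returns a Pandas Series
price = df["price"]
print(type(price))

# A list of column names (hence the double brackets) returns a DataFrame
high_low = df[["high", "low"]]
print(high_low)
```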


Selecting a Row

To select a row in Pandas, it is best practice to use either .loc[] or .iloc[] after calling the
DataFrame name.

.loc[] and .iloc[]

Both .loc[] and .iloc[] can be used to select rows in Pandas, though they do so in slightly
different ways.

.loc[] is used to select rows based on their index (row labels). As such, you have to pass in the
name of the row or rows you want to select.

.iloc[] uses integer-based location. So instead of passing in the name or names of the rows you
want to select, you pass in the integer location of the row or rows you want to select. This will
become clearer once you see the examples and is very handy for trading purposes as we will
soon see.

Selecting a Row using .loc[]

In the code below, we select the row containing the data for the day “2019-08-21” using .loc[].
This works because “2019-08-21” is in the index (row labels) of our DataFrame.

Notice that the values in this row are returned.

Selecting a Row using .iloc[]

To select rows via their integer location instead of the row names, use .iloc[].

It is important to remember that Python indexes are 0 based, meaning they start with 0. So, the
first row is actually row 0 and the second row is row 1.


Notice above that df.iloc[0] returns the first row of our DataFrame (08/16/2019) and df.iloc[1]
returns the second row (08/19/2019).

A handy thing about using .iloc[] is you can count backwards using negative numbers. This is
something you will find yourself doing a lot when coding trading models.

For example, df.iloc[-1] would result in the last row. In trading, this is often the most recent data.
If you want to reference the most recent moving average value, for example, you would use
.iloc[-1].
Notice that using .iloc[-1] here returns the last row of our DataFrame (08/22/2019)
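
The single-row selections from this section can be sketched together, again on a stand-in DataFrame with illustrative values:

```python
import pandas as pd

# Stand-in for the chapter's 5-row DataFrame; all values are illustrative
df = pd.DataFrame(
    {
        "open_price": [286.5, 292.2, 291.8, 292.5, 293.2],
        "high": [289.3, 293.08, 292.4, 292.9, 293.9],
        "low": [284.7, 291.4, 290.0, 291.7, 290.4],
        "close_price": [288.9, 292.3, 290.1, 292.5, 292.4],
        "volume": [52e6, 41e6, 38e6, 35e6, 36e6],
        "price": [288.9, 292.3, 290.1, 292.5, 292.4],
    },
    index=pd.to_datetime(
        ["2019-08-16", "2019-08-19", "2019-08-20", "2019-08-21", "2019-08-22"]
    ),
)

row_by_label = df.loc["2019-08-21"]   # label-based: the row for that date
first_row = df.iloc[0]                # integer-based: first row
second_row = df.iloc[1]               # second row
last_row = df.iloc[-1]                # negative index: last (most recent) row

print(row_by_label)
print(last_row)
```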


Selecting multiple Rows using .loc[]

In the code below, we select the rows ‘2019-08-19’ to ‘2019-08-21’ using .loc[]. Notice the use
of the colon here: we are telling Pandas we want all data from the first date to the last date.


Finally, we can tell Pandas to give us all the rows from the beginning of our DataFrame to a
date, or from a date to the end of our DataFrame. To achieve this, we use .loc[] and either
leave the left side of the colon blank (start at the beginning) or the right side of the colon blank
(go to the end).

The examples below should make this clear.



Notice in our first example, we used df.loc[ : ’2019-08-19’ ] to select all of the rows from the
beginning of the DataFrame to the row corresponding to ‘2019-08-19’.

In our second example, we used df.loc[ ’2019-08-19’ : ] to select all of the rows from
‘2019-08-19’ to the last row of the DataFrame.

Selecting multiple Rows using .iloc[]

Selecting multiple rows works the same way with .iloc[]; just pass in the integer locations as
opposed to the labels.

Here we grab rows corresponding to the integer index #3 (the 4th row) to the end:
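
The label-based slices from the previous section and the integer-based slice just described can be sketched together. The DataFrame is a stand-in with illustrative prices; only the “price” column is included to keep the sketch short.

```python
import pandas as pd

# Stand-in DataFrame; values are illustrative
df = pd.DataFrame(
    {"price": [288.9, 292.3, 290.1, 292.5, 292.4]},
    index=pd.to_datetime(
        ["2019-08-16", "2019-08-19", "2019-08-20", "2019-08-21", "2019-08-22"]
    ),
)

# .loc[] slices by label and includes BOTH endpoints
middle = df.loc["2019-08-19":"2019-08-21"]   # 3 rows
from_start = df.loc[:"2019-08-19"]           # beginning through 2019-08-19
to_end = df.loc["2019-08-19":]               # 2019-08-19 through the end

# .iloc[] slices by integer position: row 3 (the 4th row) to the end
tail_rows = df.iloc[3:]

print(middle)
print(tail_rows)
```

Note the difference in endpoints: a .loc[] slice includes the final label, while an .iloc[] slice such as df.iloc[0:3] stops before position 3, just like a regular Python list slice.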


Selecting rows and columns

To select both rows and columns, we will also make use of .loc[] and .iloc[]. The general
structure here is .loc/.iloc[row/rows , column/columns].

To select different rows and columns, simply pass in the rows you want to select, followed by a
comma, followed by the columns you want to select.

The functionality of .iloc[] and .loc[] remains the same, namely .iloc[] uses integer-based selection
while .loc[] uses label-based selection.

See some examples below:

First let’s observe our whole DataFrame:


Let’s now use .loc[] to select a row and a column. In this example we will grab data from the
row corresponding to the date 08/19/2019 and the column “high”:


Notice that only 293.079999 is returned, which is the high from 2019-08-19.

We can do the same thing using .iloc[]. This time we have to pass in the integer locations for
“2019-08-19” and “high” instead of the names.


Grabbing multiple rows and multiple columns works the same way as we saw before,
specifically you can use a colon to tell Pandas “I want this row to that row”, for example.

Here we grab rows “2019-08-19” to “2019-08-20” and columns “high” to “low”:
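
Row-and-column selection can be sketched as follows. The DataFrame is a stand-in; apart from the 293.08 high on 2019-08-19 (rounded from the value quoted above), the numbers are illustrative.

```python
import pandas as pd

# Stand-in for the chapter's 5-row DataFrame; values are illustrative
df = pd.DataFrame(
    {
        "open_price": [286.5, 292.2, 291.8, 292.5, 293.2],
        "high": [289.3, 293.08, 292.4, 292.9, 293.9],
        "low": [284.7, 291.4, 290.0, 291.7, 290.4],
        "close_price": [288.9, 292.3, 290.1, 292.5, 292.4],
    },
    index=pd.to_datetime(
        ["2019-08-16", "2019-08-19", "2019-08-20", "2019-08-21", "2019-08-22"]
    ),
)

# Label-based: row "2019-08-19", column "high"
high_by_label = df.loc["2019-08-19", "high"]

# Integer-based: row 1 (second row), column 1 (second column, "high")
high_by_position = df.iloc[1, 1]

# Ranges of rows and columns, both given as label slices
block = df.loc["2019-08-19":"2019-08-20", "high":"low"]

print(high_by_label, high_by_position)
print(block)
```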


Boolean Indexing

A useful technique to know, which will come in handy when writing trading strategies, is what is
known as Boolean indexing. Boolean indexing is used to filter the data in our DataFrame or
Series based on a logical expression as opposed to row/column labels or integer locations.

Remember when we asked Python to evaluate a logical operation, such as whether something is
greater or less than something else? Python then returned a Boolean value, a True or False value
such as the simple example below:


We can do that same thing for a row or column in a Pandas DataFrame. Pandas will return a
Series of True and False values.

We will continue to use our small 5 row DataFrame (“df”) we have been working with. As a
reminder, here is the full DataFrame:

In the code below, we grab the “close_price” column and ask Pandas whether the close_price for
each day is above $292. Notice Pandas returns a collection of True/False (boolean) values.

This collection of Boolean values is itself a Pandas Series:


We can visually inspect this and see that the first and third row have close prices below $292
while the second, fourth and fifth rows have close prices above $292.

You might be thinking, “who cares, I could have just looked to see if that was true.” Well, here is
where the magic comes in.

If we pass that expression into square brackets after the name of the DataFrame, Pandas will
return to us only the rows that are “True”. In this case, it would be only the rows (days) with a
close value above $292.

Notice in the code above that only 3 days are returned since there are only three days with a
close above $292.

While this simple example is just to demonstrate the technique, think for a minute how useful
this technique could be from a trading perspective.

Let's say we had a universe of 500 stocks, for example, and we wanted to do a quick filter of all
the stocks which have a current price greater than some moving average. This technique of
Boolean filtering can get this logic done for us quickly and easily, without the need to write for
loops!

Boolean filtering isn’t limited to just one condition either. You can use multiple conditions as
well. You can also save those conditions as their own variables and use those variables to
conduct your Boolean filtering.

We will show this in the following examples. Here we print our original DataFrame for reference,
then create two variables named “condition_1” and “condition_2” which contain logical
operations.

● condition_1 : Is the price greater than $292?

● condition_2: Is the volume greater than 40,000,000?

Notice the series of Boolean (True/False) values returned when we print out “condition_1” and
“condition_2”.

Now let’s combine these conditions to filter our DataFrame.

We can combine the conditions using:

● “&” - meaning “and”
● “|” meaning “or”


In the code above, we combined both conditions using “&”. Notice that this only returned one
row, as only one day satisfied the conditions that both the price was greater than $292 AND the
volume was greater than 40,000,000.


Here we combined the conditions using “|” meaning “or”. Notice that four days are now
returned, corresponding to the days where either the price was greater than $292 OR the
volume is greater than 40,000,000.

Boolean filtering is a powerful technique, allowing you an unlimited number of filtering conditions
in just a few lines of code!
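
The Boolean indexing steps in this section can be sketched end to end. The stand-in values below are illustrative and were chosen only so that the row counts match the ones described in the text (three rows above $292, one row for “&”, four rows for “|”).

```python
import pandas as pd

# Stand-in DataFrame; values are illustrative
df = pd.DataFrame(
    {
        "close_price": [288.9, 292.3, 290.1, 292.5, 292.4],
        "volume": [52e6, 41e6, 38e6, 35e6, 36e6],
        "price": [288.9, 292.3, 290.1, 292.5, 292.4],
    },
    index=pd.to_datetime(
        ["2019-08-16", "2019-08-19", "2019-08-20", "2019-08-21", "2019-08-22"]
    ),
)

# A comparison on a column returns a Series of True/False values
above_292 = df["close_price"] > 292
print(above_292)

# Passing that Series into square brackets keeps only the True rows
filtered = df[above_292]

# Conditions can be stored in variables and combined
condition_1 = df["price"] > 292
condition_2 = df["volume"] > 40_000_000
both = df[condition_1 & condition_2]     # "and"
either = df[condition_1 | condition_2]   # "or"

print(both)
print(either)
```

When the conditions are written inline instead of stored in variables, each one needs its own parentheses, for example df[(df["price"] > 292) & (df["volume"] > 40_000_000)].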

Some Basic DataFrame Methods

What is a method?

A method in Python is similar to a function, with the key difference being that a method belongs
to the object you call it on. Integers, for example, have different methods available than strings
do. This is another example of why knowing datatypes matters in your Python coding.

The general syntax for methods is below:

OBJECT.SOME_METHOD()

In this section, we will learn about a bunch of useful methods for Pandas DataFrames. This
certainly isn’t an exhaustive list of all methods available, but these are the most common
methods used in trading and quantitative finance.

For the examples in this section, we are going to use the following DataFrame. This DataFrame
contains daily pricing data over a 17-day time frame.

There are 17 rows (for the 17 days in our sample) and six columns to start with, named
“open_price”, “high”, “low”, “close_price”, “volume” and “price”.

See the sample DataFrame we will be using in this section below:


.info() and .describe()

To view some summary statistics of our DataFrames, we can use the .info() and .describe()
methods.

.info() displays general information about our DataFrame such as the names of the columns,
how many rows we have, what the DataTypes of the columns are and how much memory this
DataFrame is currently using.

Inspecting the output of our df.info() call, we can see that our DataFrame has 17 entries (17
rows) and 6 columns. We can also see the names of the columns and the datatypes of the data
contained in that column.

For our DataFrame, all columns contain data of the datatype “float”, which is a non-whole
number (a number with a decimal place).

The .describe() method displays statistical summaries of the columns in our DataFrame such as
count, mean, max, min, etc.


Inspecting our df.describe() output we see the count, mean, std, min, max and percentile values
for every column in our DataFrame.

.rolling(), .mean() and adding a new column to our DataFrame

In our first example, we will cover the technique used to add a new column to an existing
DataFrame.

The general structure for adding a new column is as follows:

DataFrame[ “Name of new column “ ] = What is in the new column

In this example, we will also introduce the first two DataFrame methods we will cover- .rolling()
and .mean().

.rolling() is used to grab a rolling window of data. This is very useful in calculating things like
moving averages, which we will do here.

The number of periods in the rolling window is controlled by an integer argument you pass into the
parentheses after .rolling. For example, if you want a five-day rolling window, you would use
.rolling(5).

.mean() is self-explanatory. It computes the mean (average) of a column in your DataFrame. If
you add .mean() after a .rolling() call, the mean will be the average of the rolling window.

We are going to do that in the following code. Here we add a new column we will name “SMA”
which calculates the 5-period simple moving average of the closes.

Notice here we referenced the close_price column, grabbed a rolling 5-period window of data
using .rolling(5), then calculated the mean using .mean(). We set the result as a new column
titled “SMA” - all in one line of code!
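
A minimal sketch of adding the “SMA” column follows. The 17 closing prices and the dates are made up (a simple run from 100 to 116 on consecutive business days) so the averages are easy to verify by hand.

```python
import pandas as pd

# 17 illustrative daily closes: 100.0, 101.0, ..., 116.0
closes = [float(100 + i) for i in range(17)]
df = pd.DataFrame(
    {"close_price": closes},
    index=pd.bdate_range("2019-08-01", periods=17),  # assumed dates
)

# New column = 5-period simple moving average of the closes
df["SMA"] = df["close_price"].rolling(5).mean()

# The first four rows are NaN; the fifth is the mean of 100..104
print(df.head(6))
```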

.head() and .tail()

These two methods return the first or last X rows of a DataFrame for .head() and .tail()
respectively. By default, .head() and .tail() return the first and last 5 rows.
Passing an integer into the parentheses controls how many rows are returned (if you leave it
blank, 5 will be returned).

Example:


Notice the head() and tail() methods returned the first and last 5 rows of our DataFrame,
respectively, by default. If we explicitly wanted to have it return the last 2 rows, we would pass
in the integer 2.
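
A short sketch of .head() and .tail(), using 17 made-up closing prices on assumed dates:

```python
import pandas as pd

# 17 illustrative daily closes on assumed business days
df = pd.DataFrame(
    {"close_price": [float(100 + i) for i in range(17)]},
    index=pd.bdate_range("2019-08-01", periods=17),
)

first_five = df.head()    # default: first 5 rows
last_five = df.tail()     # default: last 5 rows
last_two = df.tail(2)     # explicit: last 2 rows

print(first_five)
print(last_two)
```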

.median() .std() .max() .min()

To calculate standard statistical measures on our data, we can grab a column or columns and
use .median() to get the median, .std() to get the standard deviation, .max() to get the largest
and .min() to get the smallest.
This is similar to what we already saw with .mean()
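
These statistical methods can be sketched on a single column. The prices are illustrative.

```python
import pandas as pd

# Illustrative prices
df = pd.DataFrame({"price": [288.9, 292.3, 290.1, 292.5, 292.4]})

median_price = df["price"].median()   # middle value
std_price = df["price"].std()         # sample standard deviation
max_price = df["price"].max()         # largest value
min_price = df["price"].min()         # smallest value

print(median_price, std_price, max_price, min_price)
```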


.pct_change()

A common task in quantitative finance is to calculate the percent changes of a security over a
certain time period.

Pandas makes this very easy by including a .pct_change() method. This method calculates the
percentage change on a column of data.

If we don’t pass in any arguments to our .pct_change() method, Pandas automatically calculates
the percent change for one row (in this case one day). This is very useful if we quickly want to
change daily closing prices into daily returns.

If we want to calculate the percent change over other time periods, we would pass the number
of rows we want into our .pct_change() method.

If, for example, we want to get the rolling percent change for the last 5 days, we would use
pct_change(5).
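
Both uses of .pct_change() can be sketched as follows. The closes are made up so the results are easy to check, and the column names “return_1d” and “return_5d” are hypothetical.

```python
import pandas as pd

# Illustrative closes rising by 2 each day
df = pd.DataFrame({"close_price": [100.0, 102.0, 104.0, 106.0, 108.0, 110.0, 112.0]})

# One-row (daily) percent change: (today / yesterday) - 1
df["return_1d"] = df["close_price"].pct_change()

# Five-row percent change: (today / close five rows earlier) - 1
df["return_5d"] = df["close_price"].pct_change(5)

# Results are fractions, so 0.02 means 2%
print(df)
```

The first row of “return_1d” and the first five rows of “return_5d” are NaN because there is no earlier price to compare against.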

.dropna()

Notice that in our DataFrame, there are a few data points in the “SMA” column that show
up as NaN (not a number).

This happens because we are calculating a 5-day moving average, which needs at least 5 days’
worth of data. For the first four rows in our DataFrame, we don’t have enough data points to
calculate the average, so Pandas returns NaN.

These NaN values can, at times, mess up the calculations for our code. As such, Pandas has
the handy .dropna() method, which by default drops (deletes) any row that contains a “NaN”
value.

See the code snippet below. The first piece of code just prints out our whole DataFrame (notice the
“NaN” values in the “SMA” column for the first four rows). In the second piece of code, we add
the .dropna() method, which results in those four rows containing NaN values being dropped.
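
A sketch of this before-and-after comparison, using 17 made-up closes on assumed dates:

```python
import pandas as pd

# 17 illustrative daily closes with a 5-period SMA column
df = pd.DataFrame(
    {"close_price": [float(100 + i) for i in range(17)]},
    index=pd.bdate_range("2019-08-01", periods=17),
)
df["SMA"] = df["close_price"].rolling(5).mean()

print(df)             # first four SMA values are NaN

# dropna() returns a new DataFrame without the rows containing NaN
cleaned = df.dropna()
print(cleaned)        # 13 rows remain
```

Note that .dropna() returns a new DataFrame and leaves “df” unchanged, so the result needs to be assigned to a variable to be kept.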
.shift()

A useful technique in quantitative research and trading is to shift values. If your index is a date,
which is typical in finance, this has the effect of shifting prices forward or backward in time.

How many rows to shift the value is controlled by the argument passed into the .shift() method.

In the code example below, we add a new column “Tomorrows_Close” by shifting the “price”
column back in time by one day (one row in this case).


Using this technique, we can observe the future return of a security given some factor, technical
analysis indicator, or anything else. We will do this in the next section.
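
The “Tomorrows_Close” column can be sketched as follows, with illustrative prices. A negative argument to .shift() moves later values up to earlier rows.

```python
import pandas as pd

# Stand-in DataFrame; values are illustrative
df = pd.DataFrame(
    {"price": [288.9, 292.3, 290.1, 292.5, 292.4]},
    index=pd.to_datetime(
        ["2019-08-16", "2019-08-19", "2019-08-20", "2019-08-21", "2019-08-22"]
    ),
)

# shift(-1) places the NEXT row's price on the current row
df["Tomorrows_Close"] = df["price"].shift(-1)

# The last row has no "tomorrow", so its value is NaN
print(df)
```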

.sort_values()

To sort our DataFrame by a specific column, we use the .sort_values() method. In this
method, we need to pass in an argument to tell Pandas which column or columns we want to sort by.
There are also arguments for this method that control whether you want to sort from high to low
or low to high.

Notice that the code below sorts the DataFrame from low to high based on the “price” column.
Notice that the index (dates) is no longer in chronological order.
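
A sketch of sorting by the “price” column, with illustrative values:

```python
import pandas as pd

# Stand-in DataFrame; values are illustrative
df = pd.DataFrame(
    {"price": [288.9, 292.3, 290.1, 292.5, 292.4]},
    index=pd.to_datetime(
        ["2019-08-16", "2019-08-19", "2019-08-20", "2019-08-21", "2019-08-22"]
    ),
)

# Default order is low to high (ascending)
low_to_high = df.sort_values("price")

# ascending=False sorts from high to low
high_to_low = df.sort_values("price", ascending=False)

print(low_to_high)   # the dates are no longer in chronological order
```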

Resampling Data in Pandas

Resampling data in Pandas is a very important technique for trading. Resampling refers to
changing the frequency of your price data. For example, if you have daily price data, like we do
in this example, and you want to change it to weekly frequency (weekly bars) you would employ
the .resample() method.

For the resample method, we need to pass in two things. First would be the time frequency we
want to resample the data to. For example, to resample to weekly data we would need to pass
in “w” to the resample method. Find all the available resample frequencies below:

The next thing we need is an aggregation method of our resampled data, letting Pandas know
how we want it to aggregate the data. For example, we can use .mean() along with the
.resample() method to return the average (mean) price for every week.

While this is useful for some tasks, what is more common in finance is to resample and take the
last data point. For example, if we had daily data and wanted to resample to weekly closes, we
would have to chain .last() after our .resample() method.

See the example below, where we resample our DataFrame from the daily frequency to the
weekly frequency, taking the weekly closes as our aggregation method using .last().

Notice that the dates are now the end of every week, as opposed to every day.
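
A minimal sketch of both aggregations, assuming two weeks of made-up daily prices (the values are illustrative only):

```python
import pandas as pd

# Hypothetical daily closes for ten business days (two full weeks).
dates = pd.bdate_range("2019-01-07", periods=10)  # Mon 7 Jan to Fri 18 Jan 2019
daily = pd.DataFrame({"price": [float(p) for p in range(100, 110)]}, index=dates)

weekly_mean = daily.resample("W").mean()   # average price in each week
weekly_close = daily.resample("W").last()  # last price in each week

print(weekly_close)
```

By default the "W" frequency labels each week with its Sunday date. If you prefer the labels to fall on Fridays, "W-FRI" can be passed instead.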
Chapter 7 - Case Study: Using Pandas to Conduct Quantitative
Financial Research

Finding historical edges based on different RSI values

We are now going to transition to conducting preliminary quantitative research that can be
implemented using our new Python/Pandas skills.

We will go over this step by step, explaining what we are doing and showing the code
examples.

All the techniques we are going to use in this example we have already covered in our
introduction to Python and Pandas.

What Is The Research Question?

Let’s say, for example, you aim to develop a trading strategy that uses the popular RSI technical
analysis indicator. You hypothesize that low RSI readings will lead to higher-than-average
returns over the next couple of days, since securities showing low RSI readings are oversold
and likely to revert to the mean.

You visually inspect some charts and this does seem to be the case. You observe consistent
snap-back rallies after low RSI readings.

Visually inspecting charts, however, is not nearly enough. Let's write some code to see if our
observation in fact holds up quantitatively.

Our goal for this research is to observe the future percent changes for a given security after a
certain RSI value is reached. This will show us if this indicator has any predictive power and will
guide us as to how we use this indicator in our potential trading strategy.

Steps to Conduct This Analysis

The steps necessary to conduct this analysis are:

1. Grab data for the security in question.

2. Make a new column which calculates the RSI indicator value for every day in our
sample.
3. Make an additional new column which displays the future 3-day percent changes for
that security. We will break this down into two steps: first, we shift the closing prices back
in time (to observe the future prices), and second, we calculate the 3-day percentage
change of the future prices.
4. Use Boolean filtering to filter our DataFrame into RSI readings in different buckets.
5. Finally, observe the future 3-day percent changes given the different RSI values.

Step #1 - Grab Data for the Security in Question

In the code below, we grab data for SPY, the most popular ETF in the world. We will use this as
our example security.

We get this data via the “get_pricing” function in our embedded Jupyter notebook on the
Quantopian website. Notice we pass in the arguments for:
● The security we want data for, in this case “SPY”
● Our start date
● Our end date
● The frequency of our data

We then inspect the first 5 rows using the .head() method.

The result is a DataFrame, which we saved as “df”, spanning January 2003 to August 2019
containing data for SPY in the daily frequency.

Note: this data is total return data, so it is adjusted for dividends and other corporate actions.

Data and analysis tools provided by Quantopian.com.

Let’s get some quick information about our new DataFrame - “df.” We use the .info() method
here to give us a macro view of the data in this DataFrame.
Data and analysis tools provided by Quantopian.com.

Here we can observe some information about our DataFrame, including the number of rows
(4,184 rows/days), the six column names, and the date range (01/02/2003 - 08/15/2019) to
name a few.

Step #2 - Make a new column which calculates the RSI value for every day in our sample

Next we have to calculate our technical analysis indicator, RSI. While it certainly is possible to
code the actual math to calculate the RSI in Python, this indicator already comes prepackaged
in a third-party package called “TA-LIB.” Lucky for us, TA-LIB is already available on the
Quantopian website for us to use.

Before we use TA-LIB, however, we must import it into our environment and give it a name. We
do this via the “import” keyword. We name this package “ta”, which will be how we refer to it in
our code.

Now that we have TA-LIB imported, we can use it to calculate the RSI values per day. We will
add this statistic to our existing DataFrame, putting it in a new column called “RSI.”
Data and analysis tools provided by Quantopian.com.

In the code above, we told Pandas we are making a new column called “RSI” which contains
the 4-period RSI values per day. We used TA-LIB to calculate this, passing in the column we
want to use for our calculation (in this case the “close_price” column) and the length of our RSI
(in this case 4).

This will result in NaN values for the first four rows, since a 4-period RSI needs four prior price
changes before it can produce a value. We will use the .dropna() method to drop any rows that
now contain NaN values.

The argument “inplace=True” tells Pandas to modify the existing DataFrame “df” directly,
dropping the NaN rows from it, rather than returning a new copy.
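
For readers without TA-LIB installed, here is a rough sketch of coding the RSI math by hand. It assumes the common Wilder-smoothing definition of RSI. The helper name wilder_rsi and the sample prices are hypothetical, and the output is not guaranteed to match TA-LIB exactly.

```python
import numpy as np
import pandas as pd

def wilder_rsi(close, period=4):
    """Hand-rolled RSI sketch (Wilder smoothing); the first `period` values are NaN."""
    close = np.asarray(close, dtype=float)
    rsi = np.full(close.shape, np.nan)
    if len(close) <= period:
        return rsi
    change = np.diff(close)
    gains = np.where(change > 0, change, 0.0)
    losses = np.where(change < 0, -change, 0.0)
    # Seed with a simple average of the first `period` changes.
    avg_gain = gains[:period].mean()
    avg_loss = losses[:period].mean()
    for i in range(period, len(close)):
        if i > period:
            # Wilder smoothing: blend the previous average with the newest change.
            avg_gain = (avg_gain * (period - 1) + gains[i - 1]) / period
            avg_loss = (avg_loss * (period - 1) + losses[i - 1]) / period
        total = avg_gain + avg_loss
        # Convention assumed here: report 0 when prices have not moved at all.
        rsi[i] = 100.0 * avg_gain / total if total > 0 else 0.0
    return rsi

# Hypothetical closes, for illustration only.
df = pd.DataFrame({"close_price": [10.0, 11.0, 10.5, 11.5, 12.0, 11.0, 11.5]})
df["RSI"] = wilder_rsi(df["close_price"], period=4)
df.dropna(inplace=True)  # modifies df itself; no new DataFrame is returned

print(df)
```

Only the rows that have a valid RSI value remain after .dropna().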

Step #3 - Make a new column displaying the future 3-day percent changes.

The next step is to make a new column displaying the future 3-day percent changes for our
security - SPY. If you have never done this before it may seem a bit tricky at first, but once you
get the hang of this technique it will become second nature.

We can easily get this done in one line of code, but for the sake of clarity, we will do this in two
steps, making two new columns.

The first step is to shift the close prices back in time. We can do this via the .shift() method.
Remember, the goal here is to observe future price changes, and shifting the data back in time
allows us to achieve this.
Data and analysis tools provided by Quantopian.com.

In the code above, we made a new column in our DataFrame titled “Future_3_day_close.” We
populate this column using the close_price column shifted back 3 rows (3 days).

We used .shift(-3) to achieve the desired result. To inspect the last 10 rows of our DataFrame,
we used .tail(10).

The next step is to take our new “Future_3_day_close” column and calculate the 3-day
percentage change. This will result in a column which contains the 3-day future percent
changes, which is what we are after.

Data and analysis tools provided by Quantopian.com.

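Here we made our new column, “Future_3_day_pct_ch”, which we calculated by calling the
.pct_change() method on the “Future_3_day_close” column, passing in “3” for 3 days. Because
that column is already shifted, each row compares the close three days ahead with the current
day’s close.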

Finally, since these calculations resulted in some NaN values for very recent days (we can’t tell
the future, after all), we will add our .dropna() method to drop those rows.

Again, the “inplace=True” argument applies the change “in place”, meaning the NaN rows are
dropped from “df” itself rather than from a new copy.

Data and analysis tools provided by Quantopian.com.

We now have our complete DataFrame ready to go.
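
The two-step calculation can be sketched on a tiny made-up price series. The prices are illustrative assumptions. The fill_method=None argument is our addition, used so that Pandas does not forward-fill the trailing NaN values before computing the change.

```python
import pandas as pd

# Hypothetical closes, for illustration only.
df = pd.DataFrame(
    {"close_price": [100.0, 102.0, 101.0, 105.0, 104.0, 108.0, 110.0, 109.2]}
)

# Step 1: row t now holds the close from row t + 3 (three days in the future).
df["Future_3_day_close"] = df["close_price"].shift(-3)

# Step 2: 3-row percent change of the shifted column, which works out to
# close_price[t + 3] / close_price[t] - 1 on row t.
df["Future_3_day_pct_ch"] = df["Future_3_day_close"].pct_change(3, fill_method=None)

# One-line alternative: compute the 3-day change first, then shift it back.
one_line = df["close_price"].pct_change(3).shift(-3)

df.dropna(inplace=True)  # drops the rows with NaN at both ends
print(df)
```

The one-line version produces the same numbers on the rows that survive .dropna(). It also keeps values for the first three rows, which the two-step version loses.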

To summarize, we first grabbed data using Quantopian’s built-in “get_pricing” function. We then
calculated the 4-period RSI values for every day using ta-lib and added that as a new column
titled “RSI”. Finally, we calculated the future 3-day percent changes by shifting the closing
prices back in time by 3 days and calculating the 3-day percent change.

Now all that is left to do is filter our DataFrame using Boolean filtering and observe the results.

Step #4 - Use Boolean filtering to filter our DataFrame using different RSI readings

Our next step is to filter our DataFrame for different RSI readings and observe the future
percent changes.

Before we do this, however, it's helpful to get a baseline. Let's observe all the days and the
average 3-day return for the 4,000+ days in our sample. The column we care about here is
“Future_3_day_pct_ch.” We will use the .describe() method to get some summary statistics for
this column.
Data and analysis tools provided by Quantopian.com.

We can see that we have 4,174 days in our sample and the average 3-day percentage return
was 0.0012 (0.12%).

Let’s now observe the future 3-day percent changes when the RSI is below 10. To do this we
first have to code a logical expression that checks if the RSI each day is below 10.

We then pass that logical expression into square brackets after our DataFrame name to filter
our DataFrame to return only rows (days) with an RSI value below 10. We name this new
DataFrame “filtered_df” and observe the last 5 rows using .tail(). Notice that all the values in
the RSI column are below 10.

Finally we grab the “Future_3_day_pct_ch” column in our new “filtered_df” DataFrame and use
.describe() as before.

Data and analysis tools provided by Quantopian.com.

Wow, the average 3-day return after an RSI reading of less than 10 was 0.0089 (0.89%), much
higher than the average 3-day return of the whole sample (0.12%). In fairness, the sample size
here is small, only 46 days as we can observe via the “count” statistic.

Let's do one more example to make sure we are clear. Here we are going to look for RSI
values greater than 10 but less than 20. We follow the same code structure as before; the only
difference is that this time we pass two logical expressions, (RSI > 10) and (RSI < 20), into the
square brackets, each wrapped in parentheses and joined with “&”.

Data and analysis tools provided by Quantopian.com.

This time, after filtering for RSI values above 10 but below 20, we observe 211 trading days and
an average 3-day future return of 0.0037 (0.37%). Not as high as when RSI is below 10, but still
about 3 times higher than the average 3-day return for the whole sample (0.12%).
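
Here is a minimal sketch of both filters on a small made-up DataFrame. The RSI readings and returns are illustrative assumptions, not the SPY results quoted above.

```python
import pandas as pd

# Hypothetical RSI readings and future returns, for illustration only.
df = pd.DataFrame({
    "RSI": [5.0, 15.0, 45.0, 8.0, 18.0, 72.0],
    "Future_3_day_pct_ch": [0.010, 0.004, 0.001, 0.006, 0.002, -0.003],
})

# One condition: keep only the rows where RSI is below 10.
filtered_df = df[df["RSI"] < 10]

# Two conditions: wrap each one in parentheses and join them with "&".
filtered_df_2 = df[(df["RSI"] > 10) & (df["RSI"] < 20)]

print(filtered_df["Future_3_day_pct_ch"].describe())
print(filtered_df_2["Future_3_day_pct_ch"].describe())
```

The parentheses matter because Python evaluates "&" before the comparison operators, so leaving them out changes what is being compared.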

The table below provides more statistics using this technique. Average 3-day future returns are
listed for 10 buckets of RSI as well as the whole sample. The returns for each RSI bucket are
then plotted as a histogram.

It looks like low RSI readings do lead to higher average future 3-day returns. As such, we can
use this signal as a potential input to a complete trading model, which we will do in the next
chapter.
Data and analysis tools provided by Quantopian.com
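
One possible way to build such a bucket table is with pd.cut() and .groupby(). This is a sketch under our own assumptions: the bucket edges (0 to 100 in steps of 10) and the sample data are made up, and it is not necessarily how the table above was produced.

```python
import pandas as pd

# Hypothetical RSI readings and future returns, for illustration only.
df = pd.DataFrame({
    "RSI": [5.0, 15.0, 45.0, 8.0, 18.0, 72.0, 95.0],
    "Future_3_day_pct_ch": [0.010, 0.004, 0.001, 0.006, 0.002, -0.003, -0.001],
})

# Ten equal-width buckets with edges 0, 10, 20, ..., 100.
edges = list(range(0, 101, 10))
df["RSI_bucket"] = pd.cut(df["RSI"], bins=edges, include_lowest=True)

# Average future 3-day return per bucket (only buckets that contain data).
bucket_means = df.groupby("RSI_bucket", observed=True)["Future_3_day_pct_ch"].mean()

print(bucket_means)
# bucket_means.plot(kind="bar") would chart the result (requires matplotlib).
```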

Chapter 8 - Introduction to Zipline and Quantopian

In this chapter, we will be walking through the steps necessary to create our first trading
algorithm. This chapter is for educational purposes and meant to teach the basics of the
Quantopian/Zipline backtesting environment. The purpose is not to develop a world-beating
trading strategy.

Let’s get started!

What is Zipline?

Zipline is an event-driven, open-source backtesting engine written in Python. It is one of the
most popular and full-featured Python backtesters available. Zipline is the backend that
powers Quantopian.com.

Since Zipline is open source, you can download it to your local machine and use your own data
sources, though this takes some technical maneuvering to get it working properly.

Since this book is meant to introduce you to Python programming as it applies to quantitative
trading, we will be working on the Quantopian website, which doesn’t require any downloads.
Keep in mind that everything we learn regarding writing Zipline algorithms can be applied to
Zipline on your local environment as well.

The initialize() function

We will begin our introduction to Zipline by introducing the mandatory initialize() function. The
initialize() function is a required function for any trading strategy developed using Zipline.

What does the initialize() function do? The initialize() function is used to “initialize” our
algorithm, meaning it is used to “set up” our algorithm. The initialize function is a required
method that is called only once, at the beginning of a backtest.

As such, some of the things you code in the initialize() function include:

● Determining what securities you want your algo to trade.
● Setting up any global variables you want to use (this will be explained later).
● Scheduling other functions to run.
● Setting up assumptions for transaction costs and slippage.
● Setting a benchmark to compare your algo against.
● Registering a pipeline, Zipline’s way of handling a dynamically changing universe of
securities.

The initialize function always takes one argument - the important “context” argument.

What is context?

An important and often confusing topic for those new to Zipline is the “context” object. The
context object is technically an augmented Python dictionary that is used to maintain state
throughout your algorithm.

What does this actually mean?

For practical purposes, you can think of the context object as a way to maintain global variables
throughout your algorithm, as well as to reference portfolio-level statistics such as the number
of current positions, the cost basis of your positions, and the current cash available, just to
name three.

Global variables refer to variables that can be referenced anywhere in your algorithm, not just in
the current function you are working in. The maximum number of positions your strategy will
hold, the length of a moving average, and the lookback for RSI are three examples of settings
that suit global variables.

The context object is an augmented Python dictionary, which means its properties can be
accessed using dot notation.

As a simple example, let’s say you wanted to set a moving average length of 200 days. You
can set up the global variable “context.ma_len” in your initialize function:

Data and analysis tools provided by Quantopian.com

From there, in your trading logic, you can reference that global variable when doing calculations.
You can also easily change the value of your variable, in this case the length of a moving
average, by changing just one number in your algorithm – “context.ma_len”.

This makes your code much more maintainable, dynamic, and robust.
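
The idea can be sketched outside Quantopian as follows. SimpleNamespace is only a stand-in for the context object that Zipline supplies, and the moving_average helper is a hypothetical function written for this illustration.

```python
from types import SimpleNamespace

def initialize(context):
    # Strategy-wide settings live on the context object.
    context.ma_len = 200  # moving-average length, in days

def moving_average(context, closes):
    # Reads the window length from context instead of hard-coding 200.
    window = closes[-context.ma_len:]
    return sum(window) / len(window)

# SimpleNamespace stands in for the context object Zipline would supply.
context = SimpleNamespace()
initialize(context)

print(moving_average(context, [1.0] * 300))
```

Changing context.ma_len in one place changes the window used everywhere the setting is referenced.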

If all of this seems confusing to you, fear not. Seeing an example will be a big help, and actually
setting up your own trading algos will be an even bigger learning experience.
Setting up Securities to Trade

Another task we must do in the initialize function is to set up the securities (stocks, ETFs, etc.)
that our trading algorithm will trade.

There are several ways to set this up.

One way would be to use Zipline’s pipeline API to dynamically bring in securities, such as the
500 most liquid US stocks on a monthly basis. Covering the pipeline, however, is a bit beyond
the scope of this beginner e-book, so we are going to stick with manually setting up securities
here.

As your Python and Zipline skills advance, you will find the pipeline to be a useful and powerful
tool in developing sophisticated trading strategies.

A straightforward way to set up a security to trade is by coding:

context.SOME_NAME = sid(SOME_SID_NUMBER)

SID stands for “security ID” and is the way Quantopian will know what exact security you want to
trade.

To look up the correct SID for the security you want to transact in, simply type sid(). This will
then bring up a dynamic dropdown list for you to search for and choose the security you are
looking for.

Data and analysis tools provided by Quantopian.com.

Notice that we first set up a variable - context.spy. We then set it equal to the correct SID for
the SPY ETF; in this case, SPY’s SID is 8554. There is no need to remember the actual SID
numbers, since all securities are easily found via the dropdown menu.

The finished code looks like this:

Now if we want to reference SPY, either in our trading logic or to put in an order, we just
reference “context.spy”.

We must set the security to context.SOMETHING, as this context object is the way to set global
variables in our algorithm.

If we want to trade a basket of stocks or ETFs, another way to set this up is to create a Python
list, like we learned about in chapter 2. This is useful if we want to iterate through the securities
in our list using a for loop, which is common.

The following code sets up a list of securities called context.assets.

Data and analysis tools provided by Quantopian.com.

Notice the square brackets around the list of SIDs. Remember all Python lists have square
brackets.
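
A sketch of both setups is shown below. So that it runs outside Quantopian, sid() is replaced by a small stand-in that only records the number. Apart from SPY's 8554, the SID numbers are placeholders, not real SIDs.

```python
from types import SimpleNamespace

def sid(number):
    # Stand-in for Quantopian's built-in sid(); it only records the number.
    return SimpleNamespace(sid=number)

def initialize(context):
    # A single security stored as a global variable on context.
    context.spy = sid(8554)  # SPY, the SID quoted above
    # A basket of securities as a Python list. 111 and 222 are placeholder
    # numbers, not real SIDs; look real ones up with the sid() dropdown.
    context.assets = [sid(8554), sid(111), sid(222)]

context = SimpleNamespace()  # stand-in for the context object
initialize(context)

# Iterating through the list with a for loop.
for asset in context.assets:
    print(asset.sid)
```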

Scheduling Other Functions

Another task that is implemented in the initialize function is scheduling other functions to run.
For example, let's say you wanted your trading logic to run once a day, 5 minutes before the
market close. You would schedule that function (you give it a name) to run in your initialize
function.

To schedule another function to run, you use schedule_function().

schedule_function() takes three main arguments:

● Name of your function
● Date rules - when you want your function to run (daily, weekly, etc.)
● Time rules - what time you want your function to run (market open, market close, etc.)

Here is an example:

Data and analysis tools provided by Quantopian.com

The above “schedule_function” function schedules a function called “trade” to run every day, 10
minutes before the market closes.

Another example:

Data and analysis tools provided by Quantopian.com

This “schedule_function” function schedules a function called “trade_weekly” which runs on the
last business day of every week, 5 minutes after the market opens.
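
The two schedules described above can be sketched as follows. On Quantopian, schedule_function, date_rules and time_rules are provided for you. The small stand-ins defined at the top are our own assumptions, included only so the sketch runs elsewhere and records what was scheduled.

```python
from types import SimpleNamespace

# --- Stand-ins so the sketch runs outside Quantopian (assumptions). ---
scheduled = []

def schedule_function(func, date_rule, time_rule):
    scheduled.append((func.__name__, date_rule, time_rule))

date_rules = SimpleNamespace(
    every_day=lambda: "every_day",
    week_end=lambda: "week_end",
)
time_rules = SimpleNamespace(
    market_open=lambda minutes=0: ("market_open", minutes),
    market_close=lambda minutes=0: ("market_close", minutes),
)
# ----------------------------------------------------------------------

def trade(context, data):
    pass  # daily trading logic would go here

def trade_weekly(context, data):
    pass  # weekly trading logic would go here

def initialize(context):
    # Every day, 10 minutes before the close.
    schedule_function(trade, date_rules.every_day(),
                      time_rules.market_close(minutes=10))
    # Last trading day of each week, 5 minutes after the open.
    schedule_function(trade_weekly, date_rules.week_end(),
                      time_rules.market_open(minutes=5))

initialize(SimpleNamespace())
print(scheduled)
```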

Writing a function containing trading logic and execution

Once we set up our initialize function, including selecting which securities we want to trade and
scheduling other functions (which will contain our trading logic) to run, let's now explore how to
write some actual trading logic.

This is where we can put our newfound Python/Pandas skills to good use!

Let's assume we scheduled the function “trade” as shown below:

Data and analysis tools provided by Quantopian.com

From there, we would need to write this “trade” function. All our scheduled functions take two
arguments - context and data.

Data and analysis tools provided by Quantopian.com

Let’s now cover some common tasks required to do calculations, write trading logic, and place
trades.
Getting Current Data - data.current

Getting current and historical data is an obviously important step in almost any trading strategy.

Now by “current” here, we mean the most recent data point at that point in the backtest. We
do not mean the actual most current data as of the date you are writing your Python code.

This, after all, would be cheating. We can’t know the future, so our algorithm needs to take the
“current” (at that point in that backtest) statistics when executing trading logic and entering
orders. This avoids look-ahead bias.

To get the current price of a security, we use the data object and use dot notation to reference
the most current data - data.current.

Arguments passed into data.current include the security you want current data for and what field
you are requesting.

In the code below, we are pulling in the current price for SPY and saving it as a variable -
current_price.

Data and analysis tools provided by Quantopian.com

Getting Historical Data - data.history()

To access a historical window of data, which we would need to calculate most if not all technical
analysis indicators (such as moving averages and RSI), we also need to use the data object,
this time data.history.

data.history accepts four arguments:

1. The asset or assets we want historical data for.

2. The field we want - such as “close”, “high”, etc.
3. The number of bars to be returned - think of this as a rolling window through time.
4. The frequency of the bars - “1d” for daily and “1m” for minute.

Here is an example that grabs a 10-day history of closes for SPY:

Data and analysis tools provided by Quantopian.com.

We now have both current and historical data we can then reference, transform, and use to
calculate indicators, control trading logic, and much more.
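
To show the shape of these two calls outside Quantopian, the sketch below uses a hypothetical FakeData class as a stand-in for the data object. It mimics only the two calls discussed here, and the prices are made up.

```python
import pandas as pd

class FakeData:
    """Stand-in for Zipline's data object, backed by one price Series."""

    def __init__(self, closes):
        self._closes = closes

    def current(self, asset, field):
        # Most recent value available at this point in the backtest.
        return self._closes.iloc[-1]

    def history(self, asset, field, bar_count, frequency):
        # Trailing window of the last `bar_count` bars.
        return self._closes.iloc[-bar_count:]

def trade(context, data):
    spy = "SPY"  # placeholder for the asset stored on context
    current_price = data.current(spy, "price")
    closes_history = data.history(spy, "close", 10, "1d")
    return current_price, closes_history

closes = pd.Series([float(p) for p in range(280, 295)])  # 15 hypothetical closes
current_price, closes_history = trade(None, FakeData(closes))

print(current_price)
print(closes_history)
```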

Ordering

So how do you actually send an order to be executed in Quantopian?

There are several ways to do this, depending on what you are trying to achieve. Do you have
a number of shares you know you want to buy/sell? Or would you rather buy or sell based on
dollar value, and have Quantopian figure out how many shares that equates to? Or would you
want to control your trades by percentage of your portfolio - logic such as “buy XYZ stock with
10% of my portfolio”?

All of this is easily facilitated in the Zipline API.

Below are a couple of examples of ordering in Quantopian. This is not an exhaustive list, but
something to get you started.

Data and analysis tools provided by Quantopian.com.

In the above examples, the first line, using the “order” function, simply buys a specified number
of shares of SPY. The second line, using “order_value”, buys a specified amount of SPY
expressed in dollars. Finally, the last line, using “order_percent”, buys SPY based on a
percentage of your portfolio.

Note: if you want to sell, or sell short, simply change the values to negative numbers.

Other interesting ordering methods include the ability to “target” a specified amount. Instead of
just buying, say, $2,000 worth of SPY every time your logic is satisfied, this targets a position of
$2,000 worth of SPY in your portfolio.
Appropriate trades will then be taken to get your position value to $2,000, whether that involves
buying more or selling to get your allocation back in line. This obviously depends on your
current positions, which Zipline automatically checks for you.
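
The difference between a plain value order and a target order comes down to simple arithmetic, sketched below in plain Python. Both helper functions are hypothetical illustrations, not Zipline functions, and Zipline's own rounding may differ.

```python
def shares_for_value(dollars, price):
    # Whole shares that a dollar amount buys (fractions are dropped).
    return int(dollars / price)

def shares_to_reach_target_value(target_dollars, shares_held, price):
    # Shares to buy (positive) or sell (negative) to reach the target position.
    return int((target_dollars - shares_held * price) / price)

price = 250.0  # hypothetical share price

# A plain $2,000 order buys 8 shares every time it is sent.
print(shares_for_value(2000, price))

# A $2,000 target depends on what we already hold.
print(shares_to_reach_target_value(2000, 0, price))   # flat: buy 8
print(shares_to_reach_target_value(2000, 8, price))   # already at target: 0
print(shares_to_reach_target_value(2000, 12, price))  # overweight: sell 4
```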

Here are some examples:

Data and analysis tools provided by Quantopian.com.

What about limit orders?

Other order types, such as limit orders, stop orders, etc. are controlled through an argument
called “style” that you pass into any of the order examples we previously covered.

Here is an example of an order targeting 20% exposure to SPY, but only if the price gets to a
certain level (limit order).

Data and analysis tools provided by Quantopian.com.

This order would target a 20% allocation to SPY only if SPY traded at or below $285.
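
The shape of such a call is sketched below. LimitOrder and order_target_percent here are simplified stand-ins written for this illustration so the code runs outside Quantopian. The stand-in order function only records the request, and limit_buy_can_fill is a hypothetical helper showing the price condition.

```python
orders = []

class LimitOrder:
    # Stand-in for Zipline's LimitOrder execution style.
    def __init__(self, limit_price):
        self.limit_price = limit_price

def order_target_percent(asset, target, style=None):
    # Stand-in that records the request instead of sending an order.
    orders.append((asset, target, style))

def limit_buy_can_fill(price, limit_price):
    # A buy limit order is only eligible to fill at or below its limit price.
    return price <= limit_price

spy = "SPY"  # placeholder for context.spy
order_target_percent(spy, 0.20, style=LimitOrder(285.0))

print(limit_buy_can_fill(284.50, 285.0))  # price is below the limit
print(limit_buy_can_fill(286.00, 285.0))  # price is above the limit
```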

Accessing Portfolio Information

Accessing portfolio level information is an important part of coding a complete trading strategy.
After all, we don't want our algorithm to just continually buy a security if a condition is met if we
don't have the capital available to make the purchase, right?

Accessing information like this is again done through the context object, using dot notation. We
won’t cover every bit of information you can pull here, just some common ones to get you started:
Data and analysis tools provided by Quantopian.com.

In the code above, notice we referenced the context object followed by .portfolio. From there,
we can see all of our options for things to reference in our algorithm.

Most of these are self-explanatory. It is easy to imagine we would want to reference such
information in our algorithms.

For example, let's say we wanted to check how much capital is currently being used by our
strategy to determine whether we can take a new trade. This is easily accessed via
context.portfolio.capital_used.

Once you get the hang of referencing these portfolio level statistics it will become second
nature.

To reference statistics regarding individual positions, simply pass that security into square
brackets after “context.portfolio.positions” to see the options available to you.

For example, the code below retrieves the number of shares we are long or short for a specific
security, in this case SPY. This is useful to check if we already have a position in that security or
not, which we will use in our example in the next chapter.

Data and analysis tools provided by Quantopian.com.
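
The lookup can be sketched with stand-in objects. SimpleNamespace and a plain dictionary replace the real context and positions objects here, and the 50-share holding is a made-up value.

```python
from types import SimpleNamespace

# Stand-in for context.portfolio.positions: maps each asset to a position
# object whose .amount is the number of shares held (negative when short).
spy = "SPY"  # placeholder for context.spy
context = SimpleNamespace(
    spy=spy,
    portfolio=SimpleNamespace(positions={spy: SimpleNamespace(amount=50)}),
)

# Pass the security into square brackets, then read the share count.
shares_held = context.portfolio.positions[context.spy].amount
is_flat = shares_held == 0

print(shares_held, is_flat)
```

Note that the plain dictionary used in this stand-in raises a KeyError for a security with no entry, so the sketch only looks up a security that is present.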

Putting It All Together

In our final chapter, we are going to walk through our first trading strategy step by step. We will
then provide the whole source code followed by an inspection of the backtested results.
Chapter 9 – Case Study: Writing Your First Zipline Algorithm

In this chapter, we will put everything we learned together. We will use our new Python/Pandas
skills as well as our knowledge of the Zipline API to create our first trading strategy. As this is
an introductory book, we will keep the strategy relatively straightforward.

We will go step by step, explaining what we are doing and offering code snippets. At the end of
the chapter, we will present the entire source code of the strategy. In the next chapter, we will
present the backtested results.

In this chapter we will:

1. State the rules of our strategy

2. Set up our initialize function
3. Write our other functions containing the trading logic

Let’s get started!

Coding Techniques We Will Use

For our example strategy, we are going to continue with the research we conducted in chapter
7.

Remember how we observed higher future returns when the 4-period RSI values are low? We
will use this insight to construct a simple trading model.

I want to reiterate; this model isn’t meant to be some world-beating algo. Instead, what we are
trying to do is tie in some of the techniques we have learned so far in this book.

Some of the Python/Pandas techniques we are going to utilize here include:

● For loops
● If statements
● Python Lists
● Certain specific functions and methods such as len() and .mean()

We will also utilize some specific Quantopian/Zipline API code:

● Scheduling functions
● Pulling both current and historical data
● Utilizing order functions including “order_percent” and “order_target_percent”
● Pulling portfolio level information - in this case checking the number of shares we are
long/short in a security

Trading Strategy Rules

Below are the rules for our strategy.

● Our algo will trade 10 US Sector ETFs:

○ XLY - Consumer Discretionary
○ XLK - Technology
○ XLP - Consumer Staples
○ XLF - Financials
○ XLV - Health Care
○ XLU - Utilities
○ XLI - Industrials
○ XLB - Materials
○ XLE - Energy
○ VNQ - Real Estate
● If the 4-period RSI is below 20 and the price of the ETF is above its 200-day moving
average, we will buy that ETF.
● If we are long an ETF, we will sell when the 4-period RSI is above 70
● Any capital not allocated to a sector ETF will be in high-quality bonds. We will use the
ETF “AGG” for this purpose (US Aggregate Bond Market).

Taking a step back and thinking about this strategy - what we are basically doing is investing in
high quality bonds (AGG) unless there is a mean reversion opportunity in one of the sector
ETFs.

When there is a mean reversion opportunity in one of the sector ETFs, as measured by an
RSI(4) value under 20 and the ETF's current price being above its 200-day moving average, we
sell some of the bonds and invest in that sector. We hold that sector until its 4-period RSI is
above 70.

Given we are in bonds most of the time, we should expect this strategy to have relatively lower
returns and much lower risk, in the form of volatility as well as max drawdown, compared to SPY
itself.

There you have it - simple and intuitive.

Let’s now code it up step-by-step!

Importing packages we are going to use

In this trading strategy, we are again going to utilize the third-party package “TA-LIB” to
calculate our 4-period RSI value, just like we did in chapter 7.

We have to import this library to make it available for us to use. To do this, we simply use
“import” followed by the name of the package. We then save it as a name, in this case “ta”, to
be used in our code.

Now when we reference “ta” in our code, we are utilizing the ta-lib package. This makes
calculating technical indicators much easier.

Setting up our initialize() function

Remember, the initialize() function is a required function for our algorithm to run and is used to
set up the algorithm.

We are going to do a few things in the initialize() function for this algorithm, including setting up
the securities we want to trade and scheduling other functions to run, which will contain our
actual trading logic as well as place our orders.

Setting Up Securities to Trade

We are going to set up a Python list called “context.assets” which contains our 10 US sector
ETFs. We are also going to set up our bond ETF, which I am calling “context.bonds”.

Data and analysis tools provided by Quantopian.com.

Scheduling Functions

The next thing we are going to need to do is set up other functions to run which will contain our
trading logic. For this strategy, we are going to set up three additional functions called “entries”,
“exits” and “trade_bonds”, whose purpose is self-explanatory given their names.

We are going to schedule “exits” to run every day, 15 minutes before the close. We are then
going to schedule “entries” to run every day, 10 minutes before the close. Finally, we are going
to schedule “trade_bonds” which will run 5 minutes before the close.

Data and analysis tools provided by Quantopian.com.

That's all we are going to need to do for the initialize function. Our final initialize function looks
like this:

Data and analysis tools provided by Quantopian.com.

Setting Up Our Entries() Function:

Next up will be our entries() function. This function will control whether our buy logic for the US
sector ETFs is met.

To achieve this, we are first going to set up a for loop. Our for loop will iterate through all 10
ETFs in our context.assets universe one by one. We will then pull current and historical data for
each, calculate the 200-day moving average and 4-period RSI, then execute our trading logic.

First things first, let's set up a for loop to iterate through the Python list context.assets which
contains our 10 sector ETFs.
Data and analysis tools provided by Quantopian.com.

Notice the use of the loop variable, which I named “x”. This “x” variable will hold each ETF in
turn as the loop iterates through the Python list “context.assets”. You can name this variable
whatever you want.

The next thing we will do is pull both the current price and the trailing 200 days of historical
closes using data.current() and data.history().

Data and analysis tools provided by Quantopian.com.

We saved the most recent price as the variable “current_price” and we saved the trailing
200-day window of closes as “closes_history”.

The next thing to do would be to calculate the 200-day moving average and the 4-period RSI.

Since we are automatically grabbing a trailing 200-day window of historical data, all we have to
do is take the average of that data to get the moving average. To achieve this, simply add on
the .mean() method.

For RSI, we are going to lean on the ta-lib library here. Since we already imported “talib as ta”
we write the ta-lib function for RSI - ta.RSI().

ta.RSI() takes two arguments: the prices we are using to calculate the RSI and the period. We
pass in closes_history and the number 4 here.
Data and analysis tools provided by Quantopian.com.

We saved the 200-day moving average as the variable “sma_200_day” and the 4-period RSI as
“rsi”.

Notice that we typed [-1] after the RSI calculation. This is so we grab the most recent RSI value
in our logic.

Why don’t we use .iloc[-1] here like we learned? The reason is that this isn't a Pandas
DataFrame or Series; it's actually a NumPy array (which we didn’t cover). In NumPy, which is
the underlying package that Pandas was built on, to reference the last value in an array you
simply use [-1].

This is a quirk of ta-lib, so don’t let it trip you up. Once you get used to this subtle nuance, it's
not a big deal.
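
The two ways of grabbing the last value, along with the .mean() call used for the moving average, can be sketched on a short made-up window of closes:

```python
import numpy as np
import pandas as pd

closes_history = pd.Series([10.0, 11.0, 12.0, 13.0])  # hypothetical closes

# Moving average: the mean of the whole trailing window.
sma = closes_history.mean()

# On a Pandas Series, use .iloc[-1] for the last value by position.
last_from_series = closes_history.iloc[-1]

# On a NumPy array, plain [-1] indexing does the same job.
values = closes_history.to_numpy()
last_from_array = values[-1]

print(sma, last_from_series, last_from_array)
```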

Finally, the last thing to do is code our trading logic. We know that our rules are to buy each
sector ETF if its current price is greater than its 200-day moving average and its RSI(4) is below
20.

We are going to add one more rule to the code - make sure our position is currently flat (i.e. we
don’t hold this position already). This will help control the flow of the algorithm and make sure
that we aren't adding to our long positions if the conditions are again met the next day. We
reference “context” to pull this portfolio level information as we showed in the last chapter.

Data and analysis tools provided by Quantopian.com.

Notice the use of the “if” statement and the “and” keyword. For the if statement to be “True”, all
three of our conditions have to be met:

● RSI less than 20

● Current price greater than the 200-day MA
● Our current position is flat

If all of those conditions are true, we use “order_percent” to send an order. Our “order_percent”
call buys 10% of our portfolio in that ETF. The math is simple: there are 10 possible
sector ETFs, so we allocate 10% to each.
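
The three entry conditions can be sketched as a plain Python function that runs anywhere, without the backtesting platform. The function name should_enter() and its argument names are our own, chosen for illustration. In the real algorithm the values come from the price history, ta.RSI(), and the context object.

```python
def should_enter(current_price, sma_200_day, rsi, shares_held):
    """Return True only when all three entry conditions are met."""
    return (rsi < 20) and (current_price > sma_200_day) and (shares_held == 0)


# Each of the 10 sector ETFs gets an equal slice of the portfolio.
NUMBER_OF_SECTOR_ETFS = 10
PERCENT_PER_POSITION = 1.0 / NUMBER_OF_SECTOR_ETFS  # 0.10, i.e. 10%

# Made-up values: oversold, above the 200-day average, and currently flat.
print(should_enter(current_price=62.0, sma_200_day=58.5, rsi=14.2, shares_held=0))

# Same signal, but we already hold the ETF, so we do not add to it.
print(should_enter(current_price=62.0, sma_200_day=58.5, rsi=14.2, shares_held=100))
```

Because the conditions are joined with “and”, a single False is enough to block the order.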

Here is the whole entries() function:

Data and analysis tools provided by Quantopian.com.

Setting Up Our Exits() Function:

Next up is our exits() function. This function will, you guessed it, check if we need to
exit any current long positions in the sector ETFs.

Remember our rules are such that if we are long a US sector ETF, we will simply exit if the
RSI(4) is greater than 70.

To achieve this, we will again set up a for loop. This time we will iterate through all the ETFs in
our python list - “context.assets”. We then grab a window of historical data and calculate our
RSI the same way we did before.

From there, we again set up an if statement. This time two conditions have to be satisfied for
our if statement to be True:

● We are long the security

● The RSI(4) is greater than 70.
To check if we are long, we use the context object again to get portfolio level information, this
time checking if our current number of shares is greater than 0. If so, we know we are long that
ETF.

If both of those conditions are True, we sell the ETF using order_target_percent() and passing
in the desired allocation - 0%.

Data and analysis tools provided by Quantopian.com.

Note the use of order_target_percent() here. Using this method ensures that we are
getting out of the whole position, no matter how long or short we are.
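
Here is a rough sketch of the exit check, plus a simplified model of why a target of 0% closes the whole position. Both should_exit() and shares_to_trade() are hypothetical helpers we wrote for illustration. They are not the platform's implementation of order_target_percent(). They only mimic the idea that a target order trades the difference between where we are and where we want to be.

```python
def should_exit(shares_held, rsi):
    """Return True when we are long the ETF and RSI(4) is above 70."""
    return (shares_held > 0) and (rsi > 70)


def shares_to_trade(target_percent, portfolio_value, price, shares_held):
    """Simplified model of a target-percent order.

    Returns the signed number of shares to trade (negative means sell).
    """
    target_shares = int(target_percent * portfolio_value / price)
    return target_shares - shares_held


# Made-up example: we hold 120 shares and RSI(4) has risen to 76.
if should_exit(shares_held=120, rsi=76.0):
    order_size = shares_to_trade(0.0, portfolio_value=100000.0, price=50.0, shares_held=120)
    print(order_size)  # sells exactly the 120 shares we hold
```

With a 0% target, the order size is always the opposite of the current holding, whatever that holding happens to be.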

Here is the whole code for our exits() function:

Data and analysis tools provided by Quantopian.com.

Setting Up Our trade_bonds() Function:

The last thing we are going to need to do is adjust our bond position. Remember, we are
allocating all unused cash here to high-quality bonds, in this case, AGG. This function runs
every day, 5 minutes before the close. It is purposefully the last function to run, allowing our
entries and exits functions to do their transactions first before we allocate the rest of the capital
to AGG.

The main thing we need to do in this function is to calculate the percentage of capital to put into
AGG. To illustrate how this works, imagine that we are long four US sector ETFs - let’s say
XLF, XLV, XLI, and XLK.
Since we are allocating 10% of capital to each position, that would mean we are allocating 40%
of capital here to sector ETFs. As such, we need to put the remaining 60% of our capital in
AGG, which is easily done via an order_target_percent() order.

Using the above example, we would first need to check how many sector ETFs we are currently
long. We would then multiply that number by 10% (0.10) and subtract it from 100% (1.0) to
calculate how much capital to put in AGG.

So, if we have 4 sector positions, then 40% (0.4) is allocated to the sectors:

4 * 0.10 = 0.40

That leaves 60% (0.6) to be allocated to bonds:

1.0 - 0.40 = 0.60

We would have to put 60% of our capital in AGG, in this case.
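
The same arithmetic can be written as a small function. This is a minimal sketch, and the name bond_allocation() is ours, not something from the algorithm. We round the result only to keep floating-point noise out of the printed numbers.

```python
def bond_allocation(number_of_sector_positions, percent_per_position=0.10):
    """Percent of capital left over for bonds after the sector positions."""
    allocated_to_sectors = number_of_sector_positions * percent_per_position
    return 1.0 - allocated_to_sectors


# Walk through every possible number of sector positions, 0 through 10.
for n in range(11):
    print(n, round(bond_allocation(n), 2))
```

With no sector positions, everything goes to bonds. With all 10 sector positions open, nothing is left for bonds.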

An easy way to check the number of positions we are long is to use the len() function and pass
in “context.portfolio.positions”. This will give us our total number of current positions:

Data and analysis tools provided by Quantopian.com.

But Wait...

There is only one small problem with this approach. The code above will give us our total
number of long positions. This could, and most likely will, include the bond ETF we are long -
AGG.

We don't want to count the bond ETF however. We only want to know how many US Sector
ETFs we are long.

We can easily solve this problem by just checking if we are long AGG, which in our algo is
called “context.bonds”. If we are long AGG, we simply subtract 1 from the result of the
len(context.portfolio.positions) call.
The code looks like this:

Data and analysis tools provided by Quantopian.com.

Once we have the number of sector ETFs we are long, saved under the variable
“amount_of_current_positions”, we simply subtract that from 10 and multiply by 0.10.

This will get us to our desired number - the percentage to allocate to AGG which we
appropriately call “percent_to_allocate_to_bonds”.
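
To see the counting step in isolation, here is a minimal sketch that uses a plain Python dictionary as a stand-in for context.portfolio.positions. The tickers and share counts are made up, and the string "AGG" stands in for context.bonds. The variable names match the ones used in the text.

```python
# Stand-in for context.portfolio.positions: ticker -> shares held (made-up data).
positions = {"XLF": 150, "XLV": 80, "XLI": 60, "XLK": 40, "AGG": 500}
bonds = "AGG"  # stand-in for context.bonds

# len() counts every open position, including the bond ETF.
amount_of_current_positions = len(positions)

# If we are long the bond ETF, take it back out of the count.
if bonds in positions:
    amount_of_current_positions -= 1

# (10 - sector positions) * 10% is what is left over for bonds.
percent_to_allocate_to_bonds = (10 - amount_of_current_positions) * 0.10

print(amount_of_current_positions)
print(round(percent_to_allocate_to_bonds, 2))
```

This reproduces the earlier example: four sector positions leave 60% of the portfolio for AGG.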

Finally, we use order_target_percent along with context.bonds and
percent_to_allocate_to_bonds, which is the percent of our portfolio to put in AGG:

Data and analysis tools provided by Quantopian.com.

The Complete Code

There you have it. Find the complete code for our first algorithm below:
Data and analysis tools provided by Quantopian.com.

Results

In this section, we will display the test results of our sample trading algorithm. For such a simple
strategy, the results are quite good.

To calculate these performance statistics and make these charts, I utilized the open-source
Python library Pyfolio.

Pyfolio is a robust, open-source Python package used to analyze backtested results, also
developed by the good folks at Quantopian. You will find it to be a very useful package, allowing
you to do a deep dive into the historical test results of your strategy.
In the spirit of brevity, we won’t go into the code required to calculate these statistics and make
these performance charts. Just know that there is much less coding work here than you
probably assume, as Pyfolio has most of the statistics and charts already pre-programmed.

Our backtest spans the beginning of October 2004 to the end of August 2019.

Let's first take a look at some summary statistics, such as returns, volatility, max drawdown, and
Sharpe Ratio:

Data and analysis tools provided by Quantopian.com.

Here is the cumulative return chart of our strategy, or the theoretical growth of $1,000:

Data and analysis tools provided by Quantopian.com.

Here is a look at the drawdown plot, which shows how far (in percentage terms) our strategy
was below its all-time high at each point in time:

Data and analysis tools provided by Quantopian.com.
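
If you are curious what a drawdown calculation looks like under the hood, here is a minimal sketch in plain Pandas. This is not Pyfolio's code. It is just the standard definition applied to a tiny, made-up series of returns, and the function name drawdown_series() is ours.

```python
import pandas as pd


def drawdown_series(returns):
    """Percent below the running all-time high of the equity curve."""
    equity = (1 + returns).cumprod()   # growth of $1
    running_high = equity.cummax()     # highest equity seen so far
    return equity / running_high - 1   # 0 at a new high, negative otherwise


# Made-up returns: up 10%, down 50%, then up 100%.
returns = pd.Series([0.10, -0.50, 1.00])

drawdowns = drawdown_series(returns)
max_drawdown = drawdowns.min()

print(drawdowns.round(2).tolist())
print(round(max_drawdown, 2))
```

The middle value shows the equity curve sitting 50% below its high, and the final value returns to zero once the loss has been recovered.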

Here are the details of the five worst drawdowns:

Data and analysis tools provided by Quantopian.com.

How about a monthly return table:

Data and analysis tools provided by Quantopian.com.

Yearly returns in bar graph form:

Data and analysis tools provided by Quantopian.com.

Here is a look at the number of positions we held through time. This is useful to see how often
we were taking mean reversion trades in the US Sector ETFs.
Data and analysis tools provided by Quantopian.com.

Trade Level Results

Let's now take a look at the results on a trade-by-trade basis. For this analysis, I excluded all
the transactions done in AGG, as this wouldn't make much sense to analyze. These trade
statistics only include mean reversion trades we did in the US Sector ETFs:

Data and analysis tools provided by Quantopian.com.

This shows we had 516 total trades in the US Sector ETFs, with a win rate of 74%. Note that a
“trade” here is a round trip trade, meaning a buy and a sell.

This table shows us the duration statistics of our trades:

Data and analysis tools provided by Quantopian.com.

We can see that the average duration of our trades was 18 days with the median being 14 days.
It looks like we had one trade that took a while for us to exit - 72 days to be exact. Maybe
adding a stop rule would help that situation - something to potentially do more research on!
We can create a quick histogram to visualize the distribution of our trade durations:

Data and analysis tools provided by Quantopian.com.

A quick visual inspection of the chart shows that only a few trades lasted more than 50 days.
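
Duration statistics like these take only a few lines of Pandas and Numpy. The sketch below uses a handful of made-up trade durations, not the backtest's actual trades, so the numbers it prints will not match the tables above. It shows how the mean, median, maximum, and histogram bin counts are computed.

```python
import numpy as np
import pandas as pd

# Made-up trade durations in days (not the real backtest results).
durations = pd.Series([3, 5, 8, 14, 14, 20, 25, 72])

average_duration = durations.mean()
median_duration = durations.median()
longest_trade = durations.max()

# Bin the durations into 10-day buckets, as a histogram would.
bin_edges = np.arange(0, 90, 10)  # 0, 10, 20, ..., 80
counts, _ = np.histogram(durations, bins=bin_edges)

# How many trades lasted more than 50 days?
long_trades = int((durations > 50).sum())

print(average_duration, median_duration, longest_trade)
print(counts.tolist())
print(long_trades)
```

Notice that a single long trade pulls the average well above the median, which is why it helps to look at both.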

There are many more statistics and charts we can show, but I wanted to give you a taste of the
power of Pyfolio.
Chapter 10 – Conclusion and Next Steps

There you have it. I hope this e-book succeeded in giving you an introduction to Python
programming applied to the world of quantitative finance and trading. I hope you now realize the
potential power of this tool.

Notice that we used the same language, Python, to do our initial RSI research, code a full-blown
trading model, and analyze our backtested results. This is one of the many advantages of the
language: you can use Python from the beginning of your quant workflow all the way to the end.

I want to reiterate that as an introductory e-book, this just scratches the surface of what can be
done. Our trading model here was relatively simple, and we only used a static universe of
securities. The power of Python and the Zipline API allows for the creation of complex
strategies complete with a dynamically changing universe of securities, such as the 500 most
liquid US stocks at any given time, for example. Not only that, we can also use fundamental
data in our algorithms, such as various measures of profitability, valuations, and debt/leverage,
something we didn’t touch in this book.

Python Programming for Traders Course

If you would like to go much deeper into Python, quantitative research, and trading strategy
development, we encourage you to check out our Python class – Python Programming for
Traders. In the course, we go much deeper into all topics related to Python programming for
quantitative research and trading.

What is covered in the Python Programming for Traders Course?

Below are some topics we cover in detail during our 5-week “Python Programming for Traders”
course:

● Python basics - data types, math, loops etc…

● The Numpy Library - numerical array-based computing in Python
● An in-depth look at the Pandas Library - the most important library for Quant finance.
● Conducting initial quantitative research in Python to see if your trading idea has any
validity.
● Code several complete trading models together:
○ Mean reversion applied to ETFs
○ Mean reversion applied to individual stocks
○ Trend following with ETFs
○ Pairs trading
○ Volatility trading
○ Dealing with a universe of hundreds and potentially thousands of stocks at once.
○ Using fundamental analysis in your trading models
○ Modeling factors such as quality, value, momentum and low volatility
● Code to inspect your backtested results in detail
● Monte Carlo simulations in Python to check for the robustness of your strategy.
● Statistical modeling using the Python library statsmodels
● Introduction to machine learning in Python using Sci-Kit Learn, the most popular
machine learning package in multiple industries.

Register Today For our live Python Programming For Traders course. Please Call
Toll Free 1-888-484-8220 ext. 616 (outside the U.S. please dial 973-494-7311 ext. 616)

Click Here to Register Online Now
