Python For Financial Analysis Ebook 2021
Table of Contents
Summary
06 – A Simple Portfolio
Learning objectives
The simple portfolio
50-50 split without rebalance
Project – An annual rebalance
Step 1
Step 2
Step 3
Step 4
Step 5
A word of warning and how to avoid it
Summary
07 – The Asset/Hedge Split
Learning objectives
The calculations
Project – Sharpe Ratio
Step 1
Step 2
Step 3
Summary
08 – Backtesting an Active Strategy
Learning objectives
Getting started with an active strategy
The strategy explained
Getting started
The signal line
Log returns of the strategy
The cumulative return of the strategy
Project – Backtesting different periods and visualize results
Step 1
Step 2
Step 3
The learning from 2008 backtesting
Summary
09 – Backtesting Another Strategy
Learning objectives
The 12% solution described
Implementing the 12% solution
Project – Backtesting the 12% solution
Step 1
Step 2
Step 3
Summary
Next Steps – Free Video Courses
Free Online video courses
Python for Finance: Risk and Return
Financial Data Analysis with Python
A 21-hour course Python for Finance
Feedback
Disclaimer
00 – Pandas DataFrames
In this chapter we will get acquainted with the main library when working with financial data in
Python: Pandas.
Note: Pandas is part of the Anaconda distribution and does not need to be installed separately.
Learning objectives
• Open the project in Jupyter Notebook.
• How to execute the code.
• Load financial data from a CSV file.
• Learn about the main data type in Pandas, DataFrame.
• Make your first project where you calculate the Simple Moving Average of the Close price.
What is Pandas?
Pandas (derived from panel and data) contains powerful and easy-to-use tools for doing practical,
real world data analysis in Python.
The best way to learn about new things is to relate them to something similar.
Let’s try to load some data into Pandas and see how similar it is to an Excel spreadsheet.
Note: The code snippets are available directly on GitHub. You can either download the
Notebooks or run them directly in Colab from the links provided (see GitHub).
If you are new to Jupyter Notebook – this section is for you. Otherwise, you can skip to the next
section, where we will learn about the code above.
What is Jupyter Notebook?
Jupyter Notebook is an open-source web application that allows you to create and share
documents that contain live code, equations, visualizations and narrative text.
Said differently.
• It works in your browser.
• You can write your Python code in it.
• It can execute the Python code for you and show the results.
• You can write text in it.
• It can create visualizations with charts.
Jupyter Notebooks are very popular among beginners and Data Scientists.
• It is easy to learn programming in a Notebook.
• The needs of a Data Scientist are well served in a Notebook.
• Notebooks are good to explore data and programming in an easy way.
Financial analysis can be seen as a Data Science task.
• You work with financial data.
• You explore the data and make calculations.
• You visualize the data.
Jupyter Notebook seems to be a perfect match for financial analysis.
How to open the Jupyter Notebooks from this eBook
How to get started.
• Install Anaconda Individual Edition.
• Download the zip file from GitHub.
• Unzip the content of the zip-file in a location you can find.
• Launch Anaconda Navigator from your main menu (Launchpad or Windows menu).
• Inside Anaconda Navigator launch Jupyter Notebook.
This should bring you into your main browser with the Jupyter Notebook Dashboard (see below).
The above cell contains 1 line of code. The import statement of the main library in this case.
A cell has been executed if it has a number, like the above In [1]:
If the cell hasn’t been executed it would not contain a number.
A cell can be executed in various ways. The normal way is to mark the cell (with the mouse or
the keyboard) and press Shift+Enter.
To edit a cell, just mark it and press enter. Then you enter edit-mode.
This is all you need to know to follow along this eBook.
If you would like to learn more about the specific elements within the Notebook Editor, you can go
through the user interface tour by selecting Help in the menu bar then selecting User Interface
Tour.
The first code cell, import pandas as pd, can look intimidating at first.
This ensures access to a library called pandas (the main library we use – read more about Pandas
here). You use the as pd for easy access, as we see in the next cell.
The real magic happens in the next cell. We define the location to the CSV file we want to read
(csv_file = “…”).
This CSV file was downloaded from Yahoo! Finance. You can download it by finding the Apple
stock, going to Historical Data, and using Download.
What is a CSV file?
A CSV file is a Comma Separated Values file.
The one we use here looks like this (or the beginning of it).
As you can see, the first row contains the names of each column. Then each line is separated with
commas for each value corresponding to the name in the first row.
Back on track
The second cell.
Here we define where the CSV file is located. Now, this is set to be the location in the repository. If
you execute it locally, you can change the csv_file to the following.
csv_file = "AAPL.csv"
This will load it from your local computer and not from the repository.
Note If you run the code in Colab you need to load it from the repository.
Finally, the following line.
Here we see that the column Date has data type object, while Open, High, Low, Close, and Adj Close
have data type float64, and Volume has data type int64.
A DataFrame has an index.
The index_col="Date" sets the index to be the Date column. And parse_dates=True ensures that
the dates (in this case in the Date column) are parsed as dates.
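The effect of the two arguments can be tried on a small sample. The following is a minimal sketch: the CSV text is a made-up sample in the same column layout (the values are not real prices), and it is read from a string instead of a file so the example is self-contained.

```python
import io
import pandas as pd

# Made-up sample in the Yahoo! Finance column layout (not real prices).
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2020-01-02,74.0,75.2,73.8,75.1,74.3,135480400
2020-01-03,74.3,75.1,74.1,74.4,73.6,146322800
2020-01-06,73.4,75.0,73.2,74.9,74.2,118387200
"""

# Without extra arguments, Date is an ordinary column holding text.
plain = pd.read_csv(io.StringIO(csv_text))
print(plain.dtypes)

# With index_col and parse_dates, Date becomes the index and is parsed as dates.
data = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)
print(data.index)
print(data.dtypes)
```

With a real file you would pass the file name (for example csv_file) instead of the io.StringIO object.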
As the above shows, we now have the Date column as the index.
The data types are looking similar, except we do not have the Date column there anymore. This is
because it is the index now.
The loc can access a row in our DataFrame by using the “index name”. Here, it is a date, hence, we
can write the date to access it.
Notice that the output is formatted a bit differently than when we used head().
Here the data is simply put in a column-like fashion. Also, notice that it only has one data type
(float64).
This is because it converts the row to a Series (the other main data type from the Pandas library).
Don’t worry too much about that now.
You can access a date interval by using the slicing technique above.
Notice that because we use DatetimeIndex, it can actually figure out what the date range is, even
though we have used dates with no data (it was weekend the 27th and 28th).
The iloc is an integer location. The iloc[0] will access the first row.
Similarly, iloc[1] the second row and so forth.
We can access the data from the back with negative indexing. Just like Python lists.
Finally, we can make slicing from row 0 to 4 (where row 4 is excluded). Hence, we see the rows of
index 0, 1, 2, and 3.
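The access patterns described above can be collected in one small sketch. The DataFrame here is made up for illustration (four business days around a weekend), so the numbers are not market data.

```python
import pandas as pd

# Made-up rows; the 27th and 28th of June 2020 were a weekend, so they have no data.
index = pd.to_datetime(["2020-06-25", "2020-06-26", "2020-06-29", "2020-06-30"])
data = pd.DataFrame(
    {"Close": [91.2, 88.4, 90.4, 91.2], "Volume": [137, 205, 130, 140]},
    index=index,
)

# loc uses the index name, here a date. The row comes back as a Series.
row = data.loc["2020-06-26"]
print(row)

# Slicing by date works even if the start date has no data.
window = data.loc["2020-06-27":"2020-06-30"]
print(window)

# iloc uses the integer position.
first = data.iloc[0]     # first row
last = data.iloc[-1]     # last row, negative indexing like Python lists
head = data.iloc[0:3]    # rows 0, 1, and 2 (row 3 is excluded)
```

Notice that row has a single data type, even though the columns had two different ones.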
If we take the average of the Close price of the 9th and 10th of March, then we get the value given
by rolling(2).mean() for the 10th of March.
Similarly, for 10th and 11th of March, as the above example demonstrates.
Hence, the rolling(2).mean() applies the mean() function on the window of the last 2 rows.
Similarly, the rolling(5).mean() applies the mean() function on the window of the last 5 rows.
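A minimal sketch of the rolling window, using a short made-up price series so the averages can be checked by hand:

```python
import pandas as pd

# Made-up Close prices.
close = pd.Series([10.0, 12.0, 11.0, 13.0, 15.0])

# Mean over a window of the last 2 rows. The first row has no previous row,
# so its value is NaN. The second value is (10 + 12) / 2 = 11.
ma2 = close.rolling(2).mean()
print(ma2)

# Mean over a window of the last 5 rows. Only the last row has a full window.
ma5 = close.rolling(5).mean()
print(ma5)
```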
Step 3
Now we can calculate the simple moving average over any number of days.
In this example we will calculate the simple moving average of 10 days.
Often, the simple moving average is called the moving average (MA). On financial pages, like
Yahoo! Finance, they use the abbreviation MA for the simple moving average.
Summary
Now we have learnt what a DataFrame is. How to read data properly from financial pages like
Yahoo! Finance into our DataFrame. How to access data rows based on a date index and integer
value index. Also, how to slice data, i.e., showing subsets of data rows.
Finally, we made a project where we calculated the simple moving average of the Close price and
added it to our DataFrame.
What is an API
“An application programming interface (API) is a computing interface that defines interactions
between multiple software or mixed hardware-software intermediaries.” – Wikipedia.org
For our purpose, it helps us read historic stock prices directly from Jupyter Notebook without
manually downloading a CSV file from a page like Yahoo! Finance.
As you see, the start date is defined by dt.datetime(2010, 1, 1). This creates a datetime object
representing the year 2010, month 1 (January), and day 1.
Then we call the Pandas Datareader with pdr.get_data_yahoo(…), and specify the ticker AAPL,
which is the ticker (stock symbol) for Apple.
You can find any available ticker on Yahoo! Finance by searching the name of the company you
wish to have historic stock prices from.
The good thing is, that it already has the Date as the index and it is a DatetimeIndex.
All ready to use.
In this case we want all the data from January 1st, 2010 until January 1st, 2020.
We can investigate the start and end data as follows.
Notice, that the output of the DataFrame (aapl) is by default shortened with … in the middle, and
not displaying the full set of 2,516 rows.
This is convenient for speed and to lower the number of requests you make to the API.
The data is structured as follows.
Later we will learn how to access the data in a convenient way, when we work with data from
multiple tickers.
This simple statement takes all the Adj Close prices from the DataFrame.
Here you see the advantage of the two-layer column names that the Pandas Datareader returns.
It makes it easy to get the Adj Close from all tickers.
If you are unfamiliar with adjusted close (Adj Close):
“The adjusted closing price amends a stock's closing price to reflect that stock's value after
accounting for any corporate actions. It is often used when examining historical returns or doing a
detailed analysis of past performance.” – Investopedia.org
Step 2
Now we need to normalize the data.
The result?
As you see, all the start prices are adjusted to 1. This is the normalization.
Then the relative change can be seen from day 2010-01-05 and forward.
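One way to normalize is to divide every row by the first row. The following is a sketch with made-up prices and hypothetical tickers AAA and BBB, not the data used in the project.

```python
import pandas as pd

# Made-up Adj Close prices for two hypothetical tickers.
data = pd.DataFrame(
    {"AAA": [20.0, 22.0, 25.0], "BBB": [50.0, 45.0, 55.0]},
    index=pd.to_datetime(["2010-01-04", "2010-01-05", "2010-01-06"]),
)

# Divide each column by its first value, so every ticker starts at 1.
norm = data / data.iloc[0]
print(norm)
```

After this, a value of 1.25 means the price is 25% above the start price.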
Step 3
Let’s assume that our portfolio consists of 25% of each ticker.
How can we model that?
Then we get a weighted distribution of our portfolio. Starting with 25% for each ticker.
Step 4
Next we calculate the sum of that and add it to the DataFrame. Notice that we sum with axis=1, which
means along the rows.
Then the Total column shows how the weighted portfolio evolves.
Step 5
Let’s assume we invested 100,000 USD. This can be modeled quite easily as you see here.
Our portfolio looks like this in the end (tail() shows the last 5 rows by default).
Hence, this shows that if we invested 100,000 USD the opening day of 2010 in a portfolio
consisting of 25% of each ticker, then we would have 2,977,540 USD on March 11th, 2021.
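Steps 3 to 5 can be sketched on a small example. The normalized values and the tickers AAA to DDD below are made up, so the result differs from the figure above. The names weights, weighted, and value are chosen for this illustration.

```python
import pandas as pd

# Made-up normalized prices for four hypothetical tickers (all start at 1).
norm = pd.DataFrame(
    {
        "AAA": [1.0, 1.2, 1.5],
        "BBB": [1.0, 0.9, 1.1],
        "CCC": [1.0, 1.0, 2.0],
        "DDD": [1.0, 1.1, 1.4],
    }
)

# Step 3: 25% of the portfolio in each ticker.
weights = [0.25, 0.25, 0.25, 0.25]
weighted = norm * weights

# Step 4: sum along the rows (axis=1) to get the full portfolio.
weighted["Total"] = weighted.sum(axis=1)

# Step 5: scale by the amount invested.
value = weighted * 100_000
print(value.tail())
```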
Summary
In this chapter we learned how to read historic stock prices directly from the Yahoo! Finance API
using Pandas Datareader. How to set the start and end date. Also, how to read multiple tickers at
the same time.
In the project we learned how to normalize data to calculate the return of a portfolio.
02 – Visualize DataFrames
In this chapter we will learn how to visualize historic stock prices on a chart using Matplotlib, as
well as values calculated and added to the DataFrames.
“Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations
in Python.” – matplotlib.org
Learning objectives
• Learn how to make a chart of historic stock prices.
• Different modes to visualize charts in Jupyter Notebook.
• Have multiple charts on the same axis.
• Have multiple axes in one figure.
The first line (import pandas as pd) is familiar and imports the Pandas library.
Then the second line imports the Matplotlib library, where we choose the pyplot module and
access it with the shorthand plt.
It is always a good habit to check that the data is as expected to avoid surprises later.
This is the object-oriented way to use Matplotlib. You might be familiar with rendering a chart
directly by one line of code.
In this case, the advantage is small, but still it is there. If you render another chart from the same
dataset, then this will be added to the existing figure. That leads to a lot of confusion. Here, we
avoid that.
A few notes on the above code.
The plt.subplots() returns a figure fig and an axis ax. We need to pass the axis ax to the plot method
of the DataFrame (the second line .plot(ax=ax)).
The figure is the entire image of one or more charts (axes). The axis is where the chart is
drawn. You can have multiple axes in one figure as we will see later.
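The object-oriented pattern can be sketched as follows. The data is made up, and the Agg backend is selected only so the sketch also runs outside a Notebook; in Jupyter Notebook you would leave that line out.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, not needed in a Notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up Close prices.
data = pd.DataFrame(
    {"Close": [10.0, 11.0, 10.5, 12.0]},
    index=pd.date_range("2020-01-01", periods=4),
)

# Create a new figure and axis, then draw the chart on that axis.
fig, ax = plt.subplots()
data["Close"].plot(ax=ax)
ax.set_title("Close price")
```

Because each chart is drawn on its own axis, a second call to plt.subplots() gives a fresh figure instead of adding lines to the old one.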
The zoom function is given by the zoom button in the toolbar. Press it and mark the area you want to zoom in on.
You can move around the chart by pressing the pan button, and navigate in your history of changes by
using the back and forward arrows.
This makes it easy to explore the chart interactively.
Multi-line chart
Let’s try to add the simple moving average to our chart. This requires that we calculate them and
add them to our DataFrame.
Notice, that the DataFrame (aapl) can take a list of columns and that list will be the columns it
makes available.
This will give a chart as follows.
In your Jupyter Notebook it will be shown with the interactive setup. Here we only see the chart.
Notice that the legend in the upper left corner shows the color coding of the lines on the chart.
Notice how easy it is to do. Simply by adding a .loc['2020':] you tell the DataFrame to take the
data from the beginning of 2020 and forward.
Then we have some lines of the form ax[0, 0].legend(), which again are optional. They only tell the
axis to have the legend visible.
You can also call ax[0, 0].set_title("My title") to set a title on your axis.
Finally, we call plt.tight_layout(). All that does is to arrange the axes (charts) nicely to avoid overlap
of the text of the axes. You can try to comment it out and see the difference.
Step 2
We will compare the investments using the Adj Close (adjusted close) price, as it includes dividend
payout and other corporate adjustments.
Step 3
To make an easy comparison we normalize the data.
Step 4
Now to the fun stuff.
Visualization.
Now let’s take a minute and see what we can conclude from the figure.
Because we have normalized the data, both Apple and Microsoft start at the same point at the start of
2020.
This is convenient, because if we invested either in Apple or Microsoft, it will show the relative
difference on how our investment would evolve.
Said differently, for each dollar invested in Apple and for each dollar invested in Microsoft, then
we can see how it evolves over time.
Here at this stage, we are only interested in the return of our investment.
Hence, we only look at what we get at the end.
Later we will learn about volatility and maximum drawdown, which are important factors when
evaluating the risk of potential investments.
But for now, let’s investigate the return.
Step 5
Let’s calculate the return and visualize it.
As we have normalized data, we can get the return from the last day. We subtract 1 to get the
percentage growth of the investment.
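As a small sketch of this step, with made-up normalized values for two hypothetical tickers:

```python
import pandas as pd

# Made-up normalized prices (both start at 1).
norm = pd.DataFrame({"AAA": [1.0, 1.1, 1.8], "BBB": [1.0, 0.95, 1.4]})

# The last row holds the final value of each dollar invested.
# Subtracting 1 gives the growth, here 80% for AAA and 40% for BBB.
returns = norm.iloc[-1] - 1
print(returns)
```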
Summary
In this chapter we have learned how to use Matplotlib to visualize historic stock prices. How to
use it with multiple lines in one chart or multiple axes in one figure. Also, how to create bar charts,
which we will use later in this eBook.
We will read from beginning of 1999, and you will know why later in this chapter.
As you see, this looks good. Nice growth in the second half.
Second half?
What about the first?
Maximum Drawdown
“A maximum drawdown (MDD) is the maximum observed loss from a peak to a trough of a
portfolio, before a new peak is attained. Maximum drawdown is an indicator of downside risk over
a specified time period.” – Investopedia.org
In this first stage we will calculate the maximum drawdown over the full period of 10 years. Later
we will calculate it as a rolling value over a period of 1 year.
For the specific study here, it will not make any difference.
How do we get the maximum and minimum value cumulated over the period?
That is actually what we are looking for.
The rolling_max will at all times keep the maximum value seen so far.
The daily_drawdown keeps the current maximum drawdown.
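A minimal sketch of a maximum drawdown calculation over the full period. It uses the names rolling_max and daily_drawdown from the text, but the exact code is an assumption, and the prices are made up.

```python
import pandas as pd

# Made-up prices: a peak at 120 followed by a drop to 90.
prices = pd.Series([100.0, 120.0, 90.0, 110.0, 130.0, 104.0])

# The highest price seen so far at each point in time.
rolling_max = prices.cummax()

# How far the current price is below the highest price seen so far.
daily_drawdown = prices / rolling_max - 1

# The worst drop from a peak over the full period.
max_drawdown = daily_drawdown.min()
print(max_drawdown)
```

Here the worst drop is from 120 to 90, which is a loss of 25%.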
Why? Because the maximum drawdown tells you how much money you can lose, based on
historic values.
In the next chapter we will learn about volatility of a stock.
The full picture will be to maximize the return (CAGR) and minimize the maximum drawdown and
volatility of the stock.
Project – Calculate the CAGR and Maximum Drawdown of S&P 500 from
2010 to 2020
In this project we will make the same calculations for the 10-year period from 2010 to 2020.
Step 1
First, we need the data for that period.
Step 2
Then we calculate the CAGR.
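CAGR is the compound annual growth rate: the constant yearly return that takes the start value to the end value. A sketch of the standard formula follows; the function name cagr and the numbers are chosen for illustration only.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate from start value, end value, and number of years."""
    return (end_value / start_value) ** (1 / years) - 1


# Made-up example: an investment that doubles in 10 years.
growth = cagr(100.0, 200.0, 10)
print(f"{growth:.2%}")
```

Growing by this rate every year for 10 years gives back the doubling.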
Summary
In this chapter we have started our journey to learn more about investing. We have explored the
advice of Buy-and-Hold a low-cost fund tracking the S&P 500. This led us to discover that 10 years
might pass without any earnings, like the “lost decade” 1999 to 2009.
We have formalized the return calculations CAGR, which is a primary interest when investing. To
learn about the risk side of an investment, we introduced the concept of maximum drawdown.
The maximum drawdown tells us how much money we might lose.
In the next chapter we will learn about volatility of a stock, which is the second risk measure we
use when evaluating an investment strategy.
04 – Volatility
In this chapter we will learn about the volatility of a stock.
“Volatility is a statistical measure of the dispersion of returns for a given security or market index.
In most cases, the higher the volatility, the riskier the security. Volatility is often measured as either
the standard deviation or variance between returns from that same security or market index.” –
Investopedia.org
A volatile stock is considered risky. It is common to assume that volatility is a good measure for
risk in investment.
Learning objectives
• Different measures of volatility.
• Learn about log return – needed for the calculations.
• How to calculate the volatility of an investment.
• A way to visualize the volatility for better understanding.
Log returns
What about this log?
Why bother about the log?
What is the log?
All good questions.
The answer. They make our calculations easier and faster.
It all boils down to two things about the log returns.
log(𝑎 × 𝑏) = log(𝑎) + log (𝑏)
And
What we use is the daily change. This can be accessed by taking the current price divided by the
previous price.
If we take the logarithm of that, we get the daily log return (called the Log returns in below).
The shift() shifts the data one place forward. Hence, the rows are shifted one day forward.
Here the Adj Close price of the current day is divided by that of the previous day. This is done for all
days.
Then we have taken the np.log, which applies the logarithm on all rows.
Now let’s see how this connects to the return as we know how to calculate.
If we apply the exponential function on that value we get the same return.
Wow. Check that out. We can add the daily log returns together and get the same return.
This still seems a bit counterintuitive to do.
Later when we shift our investment from one stock to another, it comes in handy to just add the
daily log returns together to find the full return.
That makes it really powerful.
It makes it easy to calculate the return of investment strategies, which have different investments
portfolios from day to day.
If it is not entirely clear now, don’t worry. We’ll get there later and you will see the advantage of
using log returns (daily log returns).
Then we calculate the exponential value of the cumulated sum of the log return.
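The whole chain can be checked on a few made-up prices. This is a sketch; the column name Normalized is chosen for this illustration.

```python
import numpy as np
import pandas as pd

# Made-up Adj Close prices.
data = pd.DataFrame({"Adj Close": [100.0, 102.0, 99.0, 105.0]})

# Daily log return: logarithm of the current price divided by the previous price.
data["Log returns"] = np.log(data["Adj Close"] / data["Adj Close"].shift())

# The return as we know it: last price divided by first price, minus 1.
simple_return = data["Adj Close"].iloc[-1] / data["Adj Close"].iloc[0] - 1

# Adding the daily log returns and applying the exponential function gives the same.
total_log_return = data["Log returns"].sum()
print(simple_return, np.exp(total_log_return) - 1)

# Exponential of the cumulated sum gives the normalized price for each day.
data["Normalized"] = data["Log returns"].cumsum().apply(np.exp)
print(data)
```

The first row has no previous price, so its log return is NaN and is skipped in the sums.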
Volatility calculation
This is all we have been waiting for, right?
What?
The annualized standard deviation is used as volatility measure.
Let’s break it down.
The data[‘Log returns’].std() returns the daily standard deviation.
To annualize it we need to multiply by the square root of the number of trading days in a year.
There are on average 252 trading days in a year. Hence, 252**0.5 is the square root of 252, and
we get the annualized standard deviation, which we use as the volatility measure.
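As a sketch, with a handful of made-up daily log returns:

```python
import pandas as pd

# Made-up daily log returns.
log_returns = pd.Series([0.01, -0.02, 0.015, -0.005, 0.0])

# Daily standard deviation.
daily_std = log_returns.std()

# Annualize with the square root of 252 trading days.
volatility = daily_std * 252 ** 0.5
print(f"{volatility:.1%}")
```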
How to interpret the result?
It is a comparison. The lower the better.
As you will see later, we will use the S&P 500 index as our basis for the return, maximum
drawdown and volatility. If figures are worse than the S&P 500 in all aspects, we gain nothing.
The goal is to maximize the return (CAGR), and minimize the maximum drawdown and volatility.
Step 2
Convert the volatility into a string.
We use a histogram (hist) with 50 bins, an alpha of 0.6 (this makes it a bit
transparent, not strictly needed), and the color blue.
We also set labels (set_xlabel and set_ylabel) and titles (set_title).
The volatility says something about how these bins are spread and how close the majority of data
is to 0.0.
Remember that we are looking at the daily log returns here. The further the daily log returns are
from 0.0, the more volatile the stock is. Hence, if we have a lot of values that are far from 0.0, then
it is volatile. If all values are close to 0.0, it is not as volatile.
Summary
In this chapter we learned about two important concepts: the log return and volatility. First, the
log return is a great way to calculate returns. It will be handy when we calculate returns of
changing investments. Second, the volatility of an investment is a measure of risk.
We now have the tools to evaluate an investment strategy. The return (CAGR), which should be
optimized, the maximum drawdown and volatility, which should be minimized.
05 – Correlation
In this chapter we will explore an important concept when talking about investment strategies.
“Correlation, in the finance and investment industries, is a statistic that measures the degree to
which two securities move in relation to each other. Correlations are used in advanced portfolio
management, computed as the correlation coefficient, which has a value that must fall between
-1.0 and +1.0.” – Investopedia.org
Our goal is to find ways to minimize the maximum drawdown and volatility.
A great way is to find investments with negative correlation.
Why?
Because when the price of one goes down, the other is expected to go up.
Learning objectives
• Understand negative and positive correlation.
• How negative correlation can minimize the maximum drawdown and volatility.
• How to calculate the correlation.
• Visualize negatively correlated investments.
We will explore them over the 10 years from 2008 to 2018 (last day of December 2017).
As we know, we use the Adj Close.
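To get a feeling for how a correlation is calculated, here is a minimal sketch. It assumes the correlation is taken on the daily log returns, which is one common choice. The prices are made up to move in opposite directions and are not real SPY and TLT prices.

```python
import numpy as np
import pandas as pd

# Made-up Adj Close prices that move in opposite directions (not market data).
data = pd.DataFrame(
    {
        "SPY": [100.0, 102.0, 99.0, 103.0, 101.0],
        "TLT": [100.0, 99.0, 101.5, 99.5, 100.5],
    }
)

# Daily log returns of both columns.
log_returns = np.log(data / data.shift())

# Correlation matrix. The value for SPY and TLT is between -1.0 and +1.0.
corr = log_returns.corr()
print(corr)
```

A value close to -1.0 means that when one goes up, the other tends to go down.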
This period looks almost too good to be true. While the SPY goes down in 2008 and recovers in 2011,
the TLT goes almost opposite.
“A hedge is an investment that is made with the intention of reducing the risk of adverse price
movements in an asset.” – Investopedia.org
In the next chapter we will look at how to make a portfolio based on SPY and TLT.
This shows that the annual return of SPY is 8.5% and 6.3% for TLT.
Step 2
To calculate the maximum drawdown, we first define a function.
This shows that the maximum drawdown is 51.8% for SPY and 26.6% for TLT.
Step 3
We calculate the volatility of the log returns.
This shows that the volatility of SPY is 20.3% and 15.2% for TLT.
In the next chapter we will see how a combination of SPY and TLT affects the values.
Summary
In this chapter we have explored correlation. If two stocks are negatively correlated, we expect
that if the price of one goes up, the price of the other goes down.
We found that SPY and TLT are negatively correlated.
06 – A Simple Portfolio
In the last chapter we realized that SPY and TLT are negatively correlated. Here we will explore if a
simple portfolio of a 50-50 percentage split of SPY and TLT will improve our investment in terms of
return (CAGR), maximum drawdown, and volatility.
The project in this chapter will investigate how an annual rebalancing will affect the result.
Learning objectives
• Calculate the return (CAGR), maximum drawdown, and volatility of a portfolio
• Compare the result to SPY.
• Investigate the portfolio visually.
• Make a rebalancing approach and compare it.
The maximum drawdown of SPY is scary, losing 51.9% is not what we want to experience.
This makes the period of 2008 interesting. How can we avoid a maximum drawdown like that?
In this chapter we will explore if a 50-50 percent split between SPY and TLT would improve on the
above figures.
First, we will make a simple Buy-and-hold approach with the split. Then in the project we will
make an annual rebalance of the split. This helps to keep the split from drifting far away from the
initial 50-50 percentage split.
Say we start with a 50-50 percentage split of SPY and TLT. Imagine SPY drops 50% and TLT
rises 50%. Then the split is 25-75 percent between SPY and TLT.
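The arithmetic behind this example can be written out in a few lines. The variable names are chosen for this illustration.

```python
# Start with half of the portfolio in each.
spy = 0.5
tlt = 0.5

# SPY drops 50% and TLT rises 50%.
spy = spy * (1 - 0.50)
tlt = tlt * (1 + 0.50)

# The share of each in the portfolio after the move.
total = spy + tlt
spy_share = spy / total
tlt_share = tlt / total
print(spy_share, tlt_share)
```

The portfolio is worth the same in total, but SPY now makes up 25% and TLT 75%.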
Notice the .sum(axis=1), it sums along the rows. Hence, it takes the values of the portfolios and
sums each row up. This results in the normalized values of the portfolio.
Resulting in this figure.
Now that is a great improvement. The SPY had 51.9% and here we get 18.3%.
On a top level we iterate over the years and make a 50-50 percentage split of the portfolio.
The only thing we need to adjust for is the return from previous year. This is the if statement (as
we should not do it the first year).
Hence after the for-loop we have a list concat where we have the yearly rebalanced portfolio
adjusted from previous year return.
The pd.concat(.) simply concatenates the list concat into one Series.
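The loop described above could look like the following sketch. It follows the description (iterate over the years, a 50-50 split each year, adjust for the previous years, concatenate), but the exact code is an assumption and the prices are made up. Note that each year is normalized to its first trading day, so any price change between the last day of one year and the first day of the next is ignored.

```python
import pandas as pd

# Made-up prices for two years (not market data).
index = pd.to_datetime(
    ["2008-01-02", "2008-06-30", "2008-12-31", "2009-01-02", "2009-06-30", "2009-12-31"]
)
data = pd.DataFrame(
    {
        "SPY": [100.0, 90.0, 80.0, 80.0, 88.0, 100.0],
        "TLT": [100.0, 110.0, 130.0, 130.0, 117.0, 104.0],
    },
    index=index,
)

concat = []
for year in data.index.year.unique():
    # Normalize the year on its own and make a fresh 50-50 split.
    year_data = data.loc[str(year)]
    norm = year_data / year_data.iloc[0]
    portfolio = (norm * [0.5, 0.5]).sum(axis=1)

    # Adjust for the return of the previous years (not needed the first year).
    if concat:
        portfolio = portfolio * concat[-1].iloc[-1]
    concat.append(portfolio)

# One Series with the yearly rebalanced portfolio.
rebalanced = pd.concat(concat)
print(rebalanced)
```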
Now that was nice.
Step 2
We can do as we did above to calculate the CAGR.
Step 3
The maximum drawdown.
Step 4
The volatility.
Step 5
Visualize it.
We see a difference already. At the far end of 2018 we end up at the same point as SPY (or fairly
close).
Did we lose our advantage?
Let’s compare it here.
Summary
In this chapter we explored our first strategy with a 50-50 percent split between SPY and TLT.
With backtesting we saw that this strategy keeps a decent return (CAGR), while lowering the risk
significantly (maximum drawdown and volatility). This can be further improved if we add an
annual rebalancing of the portfolio.
Finally, we looked at the importance of testing (backtesting) presented strategies yourself. Often
strategies are presented under the best circumstances and do not show the full picture.
The calculations
The calculations in this section can be done for any type of asset/hedge split. We will continue our
journey with the SPY and TLT over the 10-year period from 2008 to 2018.
As we need to evaluate different splits and are interested in the same calculations we will create a
function which returns the results.
The calculations are the same we have done in previous chapters.
The function takes the data and split. The split is a value assigning the ratio to the asset.
Notice, that we do not make the rebalancing here. That is a great exercise you can do on your
own.
To see the 50-50 split simply call the function as follows.
What we want here, is to calculate these 3 values for all possible splits.
This can be done as follows.
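A sketch of such a function and the loop over the splits. The function name calculate, the column names Asset and Hedge, the steps of 5 percentage points, and the simulated prices are all assumptions made for this illustration. As in the text, there is no rebalancing.

```python
import numpy as np
import pandas as pd


def calculate(data, split):
    """Return (CAGR, maximum drawdown, volatility) for a buy-and-hold split.

    split is the ratio assigned to the asset (first column); the rest goes
    to the hedge (second column).
    """
    norm = data / data.iloc[0]
    portfolio = (norm * [split, 1 - split]).sum(axis=1)

    years = (portfolio.index[-1] - portfolio.index[0]).days / 365.25
    cagr = portfolio.iloc[-1] ** (1 / years) - 1

    rolling_max = portfolio.cummax()
    drawdown = (portfolio / rolling_max - 1).min()

    log_returns = np.log(portfolio / portfolio.shift())
    volatility = log_returns.std() * 252 ** 0.5
    return cagr, drawdown, volatility


# Simulated prices (random walks with a fixed seed, not market data).
rng = np.random.default_rng(0)
index = pd.bdate_range("2008-01-01", periods=500)
data = pd.DataFrame(
    {
        "Asset": 100 * np.exp(rng.normal(0.0003, 0.012, 500).cumsum()),
        "Hedge": 100 * np.exp(rng.normal(0.0002, 0.008, 500).cumsum()),
    },
    index=index,
)

# Evaluate the splits 0%, 5%, ..., 100% to the asset.
splits = [i / 100 for i in range(0, 101, 5)]
results = pd.DataFrame(
    [calculate(data, split) for split in splits],
    index=splits,
    columns=["CAGR", "Max drawdown", "Volatility"],
)
print(results)
```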
To get the optimal maximal drawdown we should have a 45-55 percentage split to SPY-TLT.
We can get a small window of the numbers here.
One thing to notice, the Sharpe Ratio does not take the maximum drawdown into account.
Step 1
Calculate the Sharpe Ratio.
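In its common form the Sharpe Ratio is the return above the risk-free return, divided by the volatility. The following is a sketch under assumptions: the risk-free return is set to 0, both figures are annualized with 252 trading days, and the log returns are made up.

```python
import pandas as pd

# Made-up daily log returns of a portfolio.
log_returns = pd.Series([0.01, -0.005, 0.007, 0.002, -0.003, 0.004])

# Assumption: a risk-free return of 0 to keep the example simple.
risk_free = 0.0

annual_return = log_returns.mean() * 252
annual_volatility = log_returns.std() * 252 ** 0.5

sharpe = (annual_return - risk_free) / annual_volatility
print(sharpe)
```

A higher value means more return for each unit of risk taken.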
Hence, this is the 50-50 percent split (we see it in next step).
Step 3
Visualize and show the value of the optimal split.
This shows the chart of the Sharpe Ratio according to the different splits. The x-axis is showing the
same as previous.
We see the optimal values above (the index from step 2).
What can we conclude?
Well, the theory of Sharpe Ratio also confirms that a split somewhere in the middle seems to be
optimal. Remember, Sharpe Ratio tries to optimize the return with regard to risk.
Summary
In this chapter we have explored if our findings concur with the financial advisors of a 60-40
percentage split for asset-hedge. On the concrete calculations we found a split in the middle to
be optimal. This means that the minimal risk is somewhere in the middle.
To give a precise general split would require more calculations and is outside the scope of this
eBook. That said, our findings are not contradicting the general advice.
We use a longer time span than usual so that we can make different backtesting frames.
The signal-line
How do we model the strategy?
We can create a signal-line.
We first calculate the moving average (ma), then we let the signal line be the difference
between the price of SPY and ma. This makes signal_line positive when we should have our
holdings in SPY and negative when we should have our holdings in TLT.
The apply(np.sign) simply transforms the value to either 1 or -1. Hence, if 1, we hold SPY, and if -1,
we hold TLT.
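The signal line described above can be sketched as follows. The 200-day window is an assumption (the text does not state the length of the moving average), and the prices are synthetic stand-ins rather than real SPY data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in prices (NOT real market data)
rng = np.random.default_rng(1)
dates = pd.bdate_range('2007-01-01', '2010-12-31')
data = pd.DataFrame({
    'SPY': 100 * np.exp(np.cumsum(rng.normal(0.0003, 0.012, len(dates)))),
    'TLT': 100 * np.exp(np.cumsum(rng.normal(0.0002, 0.008, len(dates)))),
}, index=dates)

window = 200  # assumed look-back length for the moving average

# Moving average of SPY; the first window-1 values are NaN
ma = data['SPY'].rolling(window).mean()

# Positive above the moving average (hold SPY), negative below (hold TLT)
signal_line = (data['SPY'] - ma).apply(np.sign)

print(signal_line.value_counts())
```

Until the moving average has enough data, signal_line is NaN, so the strategy has no position for the first window-1 days.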
Log returns of the strategy
Then we calculate the log returns.
As you see, it is similar but still a bit different. The second term subtracts the values for TLT. This is
needed because signal_line is negative there and we want to add the log return of TLT.
Notice that we also use clip(upper=0) to get all the negative signal values (all the -1’s).
Now we have the log return of the strategy.
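To make the role of the two clip calls concrete, here is a small sketch with hand-made numbers. The values are invented, and shifting the signal by one period (so that yesterday's signal decides today's holding) is an assumption of this sketch.

```python
import numpy as np
import pandas as pd

# Invented log returns and signal values, for illustration only
log_return = pd.DataFrame({
    'SPY': [0.01, -0.02, 0.03, -0.01],
    'TLT': [-0.005, 0.01, -0.002, 0.004],
})
signal_line = pd.Series([1.0, -1.0, -1.0, 1.0])

# Assumption: the signal from the previous period decides the current holding
position = signal_line.shift()

# clip(lower=0) keeps the +1's (SPY periods) and turns the -1's into 0.
# clip(upper=0) keeps the -1's (TLT periods) and turns the +1's into 0,
# so subtracting that term adds the TLT log return in those periods.
rtn = (position.clip(lower=0) * log_return['SPY']
       - position.clip(upper=0) * log_return['TLT'])

print(rtn)
```

In the second period the strategy holds SPY and gets -0.02; in the last two periods it holds TLT and gets -0.002 and 0.004.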
The cumulative return of the strategy
To calculate the cumulative return of the strategy you can use cumsum() and apply(np.exp).
This accumulates the log returns and applies the exponential function to the running sum, which
gives the normalized return of the strategy.
You can do the same for SPY.
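A tiny sketch with invented prices shows why this works: the exponential of the accumulated log returns is exactly the price divided by the starting price.

```python
import numpy as np
import pandas as pd

# Invented prices, for illustration only
prices = pd.Series([100.0, 110.0, 99.0, 121.0])

# Log returns; the first value is NaN because there is no previous price
log_returns = np.log(prices / prices.shift())

# Accumulate the log returns and map back with the exponential function
cumulative = log_returns.cumsum().apply(np.exp)

print(cumulative)
print(prices / prices.iloc[0])
```

Apart from the first row, the two printed series hold the same values.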
Hence, it looks like our strategy performs better than SPY.
Let’s make a project and learn more about it.
Step 2
Let’s call the function with SPY for the period 2008 to the end of 2017.
Now this looks pretty good: a better return of 10.2% CAGR, 25.3% maximum drawdown, and 14.5%
volatility.
Step 3
Now for the big step.
How do we visualize this in a good way?
Let’s jump into it and try.
That was a big loss and it took a long time to recover. Many investors lost a great deal of money. Imagine
you had your life savings of 1,000,000 USD at the start of 2008 and were ready to retire. Later that year
you would most likely have had less than 500,000 USD.
That hurts.
Hence, it is no great surprise that backtesting strategies over 2008 has been of great interest. How
do you avoid such a big loss?
This is the maximum drawdown factor at play. We want to minimize it to lower the risk.
What this eBook should teach you is to do backtesting on your own and not trust others’
findings.
When a backtest performs well over 2008 it actually has a great advantage.
Why?
Let’s just look at the backtesting we just did and redo it over another period (yes, we made that
easy to do).
Let’s see what 2011 until the end of 2020 looks like.
Summary
In this chapter we have explored our first active strategy. We learned a great way to calculate the
return of a strategy and used this return to do the actual backtesting. Additionally, we created a
nice way to visualize a backtest with a SPY comparison.
Finally, we discussed how strategies that beat the 2008 market collapse have a great
advantage in the following period.
Notice the interval='m', which reads monthly data instead of daily data.
When we create our strategy, we should also have the option to hold cash. This can be modeled
by adding a column with no growth.
Here we call the column Short and assign the value 1 to the full column. This gives a return of 0
and no growth if we keep money in it.
For convenience, we reorder the columns on the second line.
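A minimal sketch of the cash column, using a few invented monthly prices. The tickers other than TLT and the column order are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Invented monthly prices, for illustration only
months = pd.date_range('2020-01-01', periods=4, freq='MS')
data = pd.DataFrame({
    'SPY': [100.0, 102.0, 99.0, 104.0],
    'QQQ': [200.0, 206.0, 198.0, 210.0],
    'TLT': [140.0, 141.0, 143.0, 142.0],
}, index=months)

# Cash modeled as an asset with a constant price of 1
data['Short'] = 1.0
# Reorder so the candidate holdings come first and the hedge last
data = data[['SPY', 'QQQ', 'Short', 'TLT']]

# Log returns; the Short column is 0 everywhere after the first row
log_returns = np.log(data / data.shift())
print(log_returns)
```

Because the price of Short never changes, its log return is 0 in every month, which is exactly what holding cash should give.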
As usual, we calculate the log returns.
We simply calculate the rolling sum over the last 3 months for each of them. Notice that we have
excluded TLT from this calculation, as it will always be a 40% holding and will not change.
This makes us ready to calculate the monthly return.
Yes, that is it. On the first line we simply take the maximum of the rolling_sum, which contains the
return of the last 3 months. We shift it, as we can only make the decision at the end of the month (we
still do not know how to look into the future).
We multiply it by 0.6 and add the log return of TLT multiplied by 0.4 to make the split.
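The following sketch shows one possible reading of these steps, not the original listing: each month we find the candidate with the highest 3-month rolling sum, shift that choice by one month, and earn the chosen candidate's log return in the following month. The numbers and tickers are invented, and adding weighted log returns is the same approximation the text uses.

```python
import numpy as np
import pandas as pd

# Invented monthly log returns, for illustration only
months = pd.date_range('2020-01-01', periods=7, freq='MS')
log_returns = pd.DataFrame({
    'SPY':   [0.01, 0.02, 0.03, -0.06, -0.04, 0.01, 0.02],
    'QQQ':   [0.02, 0.01, 0.01, -0.06, -0.03, 0.02, 0.03],
    'Short': [0.0] * 7,
    'TLT':   [0.00, 0.01, -0.01, 0.02, 0.01, 0.00, 0.01],
}, index=months)

candidates = ['SPY', 'QQQ', 'Short']  # TLT is excluded: it is always held at 40%

# Return of each candidate over the last 3 months
rolling_sum = log_returns[candidates].rolling(3).sum().dropna()

# Best candidate at the end of each month, applied to the NEXT month
winner = rolling_sum.idxmax(axis=1).shift()

# Log return actually earned by the chosen candidate in each month
picked = pd.Series(
    [log_returns.at[date, name] if isinstance(name, str) else np.nan
     for date, name in winner.items()],
    index=winner.index,
)

# 60% in the chosen candidate, 40% in TLT
rtn = 0.6 * picked + 0.4 * log_returns['TLT']
print(rtn)
```

In this toy data SPY has the best 3-month sum at the end of March, so the strategy holds SPY in April; after that both SPY and QQQ have negative sums and the strategy moves to cash.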
Let’s evaluate this.
Step 1
Adjust the calculation function to use monthly data.
We make sure the drawdown looks at the full picture and the volatility is now calculated over 12
months instead of 252 trading days.
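A hedged sketch of such an adjustment, taking monthly log returns as input. The function name calculate_monthly, the invented returns, and the choice to annualize with the square root of 12 are assumptions for illustration.

```python
import numpy as np
import pandas as pd


def calculate_monthly(log_returns):
    """Return (CAGR, max drawdown, volatility) from a Series of monthly log returns."""
    # Normalized value of the strategy, starting from 1
    cumulative = log_returns.cumsum().apply(np.exp)

    # CAGR: 12 monthly periods per year
    years = len(log_returns) / 12
    cagr = cumulative.iloc[-1] ** (1 / years) - 1

    # Drawdown over the full history; the starting capital of 1 counts as a peak
    peak = cumulative.cummax().clip(lower=1.0)
    max_drawdown = -(cumulative / peak - 1).min()

    # Annualized volatility from monthly data (12 instead of 252 periods)
    volatility = log_returns.std() * np.sqrt(12)

    return cagr, max_drawdown, volatility


# Invented monthly log returns, for illustration only
rtn = pd.Series([0.10, -0.20, 0.05, 0.05])
cagr, max_drawdown, volatility = calculate_monthly(rtn)
print(f"CAGR {cagr:.1%}, max drawdown {max_drawdown:.1%}, volatility {volatility:.1%}")
```

In this toy series the log returns sum to zero, so the CAGR is zero, while the drop from the first month's peak to the second month's low sets the maximum drawdown.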
Step 2
Copy the code of the visualization from last chapter.
Well, on paper it looks pretty solid here: a return of 10.2%, a low maximum drawdown (7.2%) and
low volatility (8.0%).
Not the full 12% the solution is named after, but we have left out some details, like rebalancing,
which can actually add some.
Let’s try it for the 10 years starting in 2011.
Still good. A return of 11.7%, though during this 10-year period the SPY returned 13.8% (calculated
on a monthly basis). Low drawdown and volatility.
Remember the comments from the last chapter about evaluating over 2008. Beating 2008 gives a
great advantage.
Summary
In this chapter we have explored a version of the 12% solution (not the final version in the book).
We have found that it does not by default outperform the return of SPY over the last 10 years, but it
still keeps the risk, in the form of maximum drawdown and volatility, lower than SPY.
Learn Python for Finance with Risk and Return with Pandas and NumPy (Python libraries) in this
free 2.5-hour, 8-lesson online course.
The 8 lessons will get you started with technical analysis for Risk and Return using Python with
Pandas and NumPy.
The 8 lessons
• Introduction to Pandas and NumPy – Portfolios and Returns
• Risk and Volatility of a stock – Average True Range
• Risk and Return – Sharpe Ratio
• Monte Carlo Simulation – Optimize portfolio with Risk and Return
• Correlation – How to balance portfolio with Correlation
• Linear Regression – how X causes Y
Learn Python for Financial Data Analysis with Pandas (Python library) in this free 2-hour, 8-lesson
online course.
The 8 lessons will get you started with technical analysis using Python and Pandas.
The 8 lessons
• Get to know Pandas with Python – how to get historical stock price data.
• Learn about Series from Pandas – how to make calculations with the data.
• Learn about DataFrames from Pandas – add, remove and enrich the data.
• Start visualizing data with Matplotlib – the best way to understand price data.
• Read data from APIs – read data directly from pages like Yahoo! Finance the right way.
• Calculate the Volatility and Moving Average of a stock.
• Technical indicators: MACD and Stochastic Oscillator – easy with Pandas.
• Export it all into Excel – in multiple sheets with color formatted cells and charts.
Read more on the course page and see the video lectures.
The Code is available on GitHub.
Did you know that the No.1 killer of investment return is emotion?
Investors should not let fear or greed control their decisions.
How do you get your emotions out of your investment decisions?
A simple way is to perform objective financial analysis and automate it with Python!
Why?
• Performing financial analysis makes your decisions objective - you are not buying
companies that your analysis did not recommend.
• Automating them with Python ensures that you do not compromise because you get tired
of analyzing.
• Finally, it ensures that you get all the calculations done correctly in the same way.
Does this sound interesting?
• Do you want to learn how to use Python for financial analysis?
• Find stocks to invest in and evaluate whether they are underpriced or overvalued?
• Buy and sell at the right time?
This course will teach you how to use Python to automate the process of financial analysis on
multiple companies simultaneously and evaluate how much they are worth (the intrinsic value).
You will get started in the financial investment world and use data science on financial data.
Read more on the course page.
Feedback
Any feedback and suggestions are welcome.
You can contact me on
• Twitter: @PythonWithRune
• Facebook: @learnpythonwithrune
• Email: [email protected]
Disclaimer
The content of this eBook and any associated resources are provided for educational purposes
only. You assume all risks and costs associated with any trading you choose to take.
Version history
Version 0.9.2.1 – updates (2021.03.20)
Version 0.9.2 – updates (2021.03.15)
Version 0.9.1 – updates (2021.03.14)
Version 0.9 – first release (2021.03.13)