FILE - 3284 Compressed
We at OQAM, a quantitative asset management firm, are fascinated by the latest technological developments. Recent advances in the field of natural language processing (NLP) have led to new ways of systematically analysing vast amounts of text data and hence created new ways of generating actionable signals for the investment process.

In this whitepaper, we explore how to use NLP in the investment process. We are primarily targeting readers with an interest in finance who want to understand, or get an update on, the recent developments within natural language processing. We start by introducing the subject. From there we move on to show how to build a sentiment index by analysing press releases. Lastly, we finish by presenting a quantitative investment strategy based on the sentiment index.

The strategy we create manages to avoid the pandemic shock in March 2020 and increases the risk-adjusted return compared to the benchmark index, thus showing promising results for using automatically analysed text data as an input in the investment process.

What is NLP?
NLP, or natural language processing, is a field within computer science concerned with the task of giving computers the ability to understand written human language. This is mainly achieved with the help of artificial intelligence and statistical methods.

In recent years the field has seen a high level of research activity and many advancements. One breakthrough was Google's release of a new text-based model called BERT in 2018, which significantly improved on the capacity of past models to correctly understand context. This is the model we have been using at OQAM and in this whitepaper. Last month OpenAI opened access to an NLP model called GPT-3, giving developers and researchers access to one of the most prominent language models created, showing that the field is still evolving at a fast pace.

NLP at OQAM
We at OQAM decided to initiate our first NLP research project in the summer of 2020. The project was carried out as a summer project by two interns. Given that most of the work within NLP is conducted on the English language, which is thus quite thoroughly researched, we decided to focus on the Swedish language.

Our ambition was to investigate if we could predict the category and sentiment of stock-specific press releases by using NLP. We believe that NLP is an important tool for incorporating alternative data sources such as news and press releases into the quantitative investment process. Currently, we are evaluating a lab trading strategy based on NLP which has been implemented with a low risk allocation. The strategy has been live since mid-2021 and we are continuously monitoring its characteristics and evaluating potential next steps.
We are using press releases from large- and mid-cap companies listed on the Swedish stock exchange. The universe is a snapshot of today's constituents, and some selection bias may therefore occur in the index. The total data set consists of roughly 60,000 unique press releases from 2010 onwards. The first step in creating a sentiment index is to get the individual sentiment of each press release.

Make BERT work for us
One could manually go through each press release, label it as either positive, negative, or neutral, and aggregate the labels into an index. But going through 60,000 samples takes a lot of time, and it becomes tedious if the index is to be updated regularly. Hence, we manually go through only a subset of the press releases, fine-tune a BERT model on these, and then let it do the work for us on the rest of the samples. This procedure makes it possible to classify new press releases efficiently without human supervision.

The sentiment index
There are multiple ways of constructing an index, but we decided to keep it simple. For each day we count the number of positive, negative, and neutral press releases. We calculate the difference between the numbers of positive and negative press releases and divide it by the total number of press releases that day. Lastly, we take the 25-day (5 weeks) moving average and shift all data backward one day. This smooths out the time series and makes sure that any look-ahead bias is avoided. The resulting sentiment index can be seen in the figure below.

The sentiment index reaches its lowest value during the initial market reaction to the Covid-19 pandemic and increases during the summer, consistent with how the market moved during the same time. The sentiment index also seems to correlate with the 1-month return of the stock index.
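The index construction described in this section can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the code used at OQAM: the function names and the label strings are hypothetical, days without any press release are simply skipped, and the one-day shift is interpreted as a one-day lag, so that the value for a given day only uses press releases from earlier days.

```python
from collections import Counter, defaultdict
from datetime import date


def daily_net_sentiment(labelled_releases):
    """Return {day: (positive - negative) / total} for each day with releases.

    `labelled_releases` is an iterable of (day, label) pairs, where label is
    "positive", "negative" or "neutral" (hypothetical label names).
    """
    counts = defaultdict(Counter)
    for day, label in labelled_releases:
        counts[day][label] += 1
    return {
        day: (c["positive"] - c["negative"]) / sum(c.values())
        for day, c in sorted(counts.items())
    }


def sentiment_index(daily_scores, window=25, lag=1):
    """Trailing moving average of the daily scores, lagged by `lag` days.

    The value for day t averages the `window` daily scores ending at day
    t - lag, so it never uses information from day t itself (assumed reading
    of the one-day shift). Days without enough history are left out.
    """
    days = list(daily_scores)
    values = [daily_scores[d] for d in days]
    index = {}
    for i in range(window - 1 + lag, len(days)):
        end = i - lag + 1  # exclusive end of the averaging window
        index[days[i]] = sum(values[end - window:end]) / window
    return index


# Tiny made-up example with a 2-day window instead of 25 days.
releases = [
    (date(2021, 1, 4), "positive"),
    (date(2021, 1, 4), "negative"),
    (date(2021, 1, 4), "positive"),
    (date(2021, 1, 5), "neutral"),
    (date(2021, 1, 5), "negative"),
    (date(2021, 1, 7), "positive"),
]
daily = daily_net_sentiment(releases)
index = sentiment_index(daily, window=2, lag=1)
```

In the example, the only index value is for 7 January, and it is the average of the daily scores for 4 and 5 January; the score for 7 January itself is not used.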
[Figure: Sentiment index (left-hand axis) and 1-month return of the stock index (right-hand axis)]
Sentiment index from January 2019 to September 2021 compared to monthly returns in a broad Swedish stock index
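One way to put a number on the visual relationship between the sentiment index and the 1-month return is a plain Pearson correlation. The sketch below is an assumption-laden illustration rather than the analysis behind the figure: the function names are hypothetical, and a month is approximated by 21 trading days.

```python
from math import sqrt


def trailing_return(prices, horizon=21):
    """Simple return over the past `horizon` observations; None at the start."""
    return [
        None if i < horizon else prices[i] / prices[i - horizon] - 1.0
        for i in range(len(prices))
    ]


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def index_return_correlation(sentiment, prices, horizon=21):
    """Correlate sentiment with trailing returns on days where both exist."""
    returns = trailing_return(prices, horizon)
    pairs = [
        (s, r) for s, r in zip(sentiment, returns)
        if s is not None and r is not None
    ]
    return pearson([s for s, _ in pairs], [r for _, r in pairs])


# Made-up numbers, with a 2-day horizon to keep the example short.
prices = [100.0, 102.0, 101.0, 105.0, 107.0, 104.0]
sentiment = [0.10, 0.20, 0.15, 0.25, 0.30, 0.10]
corr = index_return_correlation(sentiment, prices, horizon=2)
```

Both series are smoothed over overlapping windows, so a correlation computed this way tends to look stronger than the number of independent observations justifies and should be read with care.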
Improving the risk-adjusted return with sentiment
With the help of the sentiment index, we take it one step further and investigate if it is possible to use it to generate better risk-adjusted returns. We limit ourselves to one tradable asset, a broad benchmark index for the Swedish stock market, and the only two alternatives are full exposure or no exposure.

By looking at the sentiment index compared to the monthly return of our stock index it is possible to spot some correlation between the returns and the sentiment. This correlation is most prominent during the pandemic shock in March 2020.

We continue the analysis by dividing trading days into three groups: one where the sentiment index has increased over the past month, one where it has stayed flat, and one where it has decreased. By doing this it is possible to investigate if there is any difference in the distribution of daily returns between the groups.

Sentiment   Average return   Std of returns   Group size   Return per std
Decrease    0.04%            1.61%            246          0.025
Flat        0.01%            1.05%            227          0.010
Increase    0.29%            0.93%            196          0.312

As seen in the table, there are more days where the sentiment is flat or decreasing. Further, days with rising sentiment have a higher average return, while the standard deviation is smaller compared to days with decreasing sentiment. The ratio between the mean return and the standard deviation is more than 10 times greater for days with increasing sentiment than for days with decreasing sentiment (0.312 versus 0.025).

This analysis supports the thesis of a negative correlation between a rise in sentiment and risk in the market. Hence, we want to build a model with exposure to the stock index when the sentiment is rising and no exposure when the change in sentiment is flat or negative.

We implement this strategy by using a moving average approach. If the 25-day moving average sentiment value is above the 50-day moving average sentiment value, we have 100% exposure to the stock index, and otherwise no exposure at all. The result of the strategy is seen in the figure below.

[Figure: Stock index with and without sentiment filter]
Trading strategy based on sentiment. Stock index with and without sentiment filter

[Figure: Trading strategy based on a sentiment risk filter. Series: Stock Index; Stock Index with Sentiment Filter; Stock Index with Sentiment Filter and 160% exposure]
Trading strategy based on sentiment. Stock index with and without sentiment filter as well as increased exposure to even risk between strategies

The strategy performs on par with the underlying index in terms of total return. Since the total time in market is lower for the strategy, it reduces the
overall market risk, while keeping approximately the same upside. The strategy successfully managed to stay out of the market during the drawdown in March 2020 and later took on exposure again when the sentiment increased.

Although the strategy shows some attractive characteristics during this period, it performs worse if we extend the time window back to 2015, which gives the reader some perspective. As seen in the figure above, the strategy misses a lot of the performance in exchange for downside protection. Until mid-2019 the strategy yields near-zero returns.

Because the stock index exhibits more variance than the strategy, it is possible to adjust the exposure in the sentiment strategy to match the variance of the index. This yields an increase in exposure during on-signals from 100% to 160%. The sentiment strategy with increased exposure can be seen in the figure above. The increase in exposure gives a higher total return, while still limiting the downside.

Summary
The last couple of years' advancements in the field of NLP have opened new possibilities that were hardly imaginable in the past.

In this whitepaper, we have shown how to create a simple sentiment filter to increase the risk-adjusted returns of an index. Though the strategy needs more work before deployment, it gives a glimpse of what can be accomplished with the help of data and the latest NLP technologies. In general, interest in NLP within finance has been growing rapidly over the last couple of years and potential use cases keep expanding. This article focused on a simple use case deploying a risk sentiment filter. In real life, investors analyse sentiment in real time, using many different sorts of data (Twitter/social media, news, economics, etc.) to help them manage their risk. Possible use cases could also be found, for instance, within the processes dealing with corporate actions, hopefully making life smoother for asset managers focusing on large equity universes.

Read more
A good and illustrative guide on how BERT works:
https://jalammar.github.io/illustrated-bert/

The research team at the National Library of Sweden, a must if you want to stay up to date with cutting-edge Swedish language models:
https://kb-labb.github.io/

Collection of state-of-the-art AI models, free to use:
https://huggingface.co/

Google's AI blog, a must-follow for everyone interested in the field:
https://ai.googleblog.com/
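For readers who want to experiment, the moving-average sentiment filter and the variance-matched exposure described in the strategy section can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the production strategy: all function names are hypothetical, the sentiment series is assumed to be already lagged by one day (as in the index construction), and the exposure multiplier is taken to be the ratio of the standard deviations of the index returns and the filtered returns.

```python
from statistics import pstdev


def moving_average(values, window):
    """Trailing moving average; None until `window` values are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out


def exposure_signal(sentiment, fast=25, slow=50):
    """1.0 when the fast average is above the slow average, else 0.0."""
    fast_ma = moving_average(sentiment, fast)
    slow_ma = moving_average(sentiment, slow)
    return [
        1.0 if f is not None and s is not None and f > s else 0.0
        for f, s in zip(fast_ma, slow_ma)
    ]


def strategy_returns(index_returns, signal, leverage=1.0):
    """Daily strategy returns: exposure times the index return.

    Assumes the sentiment behind `signal` is already lagged one day, so the
    signal for a day is known before that day's return is earned.
    """
    return [leverage * s * r for s, r in zip(signal, index_returns)]


def variance_matching_leverage(index_returns, filtered_returns):
    """Exposure multiplier that gives the filtered strategy the index's std."""
    return pstdev(index_returns) / pstdev(filtered_returns)


# Made-up data, with 2- and 3-day windows instead of 25 and 50 days.
sentiment = [0.0, 0.1, 0.2, 0.3, 0.2, 0.1, 0.0]
index_returns = [0.01, -0.02, 0.015, 0.005, -0.01, -0.03, 0.02]

signal = exposure_signal(sentiment, fast=2, slow=3)
filtered = strategy_returns(index_returns, signal)
leverage = variance_matching_leverage(index_returns, filtered)
scaled = strategy_returns(index_returns, signal, leverage)
```

In this toy example the filter is invested on days three to five only, and scaling the exposure by `leverage` brings the standard deviation of the strategy's daily returns in line with that of the index, which is the same idea as raising the exposure from 100% to 160% in the text.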