
IIMB Management Review (2019) 31, 157–166

available at www.sciencedirect.com

ScienceDirect

journal homepage: www.elsevier.com/locate/iimb

News-based supervised sentiment analysis for prediction of futures buying behaviour
Ritu Yadav a,1, A. Vinay Kumar b,*, Ashwani Kumar c,2

a Management Information Systems Area, Indian Institute of Management Rohtak, Rohtak, Haryana, India
b Finance & Accounting Area, Indian Institute of Management Lucknow, Lucknow, Uttar Pradesh, India
c IT and Systems Area, Indian Institute of Management Lucknow, Lucknow, Uttar Pradesh, India

Received 1 August 2016; revised form 5 April 2017; accepted 27 March 2019; Available online 2 April 2019

KEYWORDS Sentiment analysis; Real-time news; Trade direction; Vector space model; Support vector machines; Prediction; Futures market; Net buying pressure

Abstract This study examines the predictability of real-time news data on investors’ buying behaviour in the futures market, using supervised sentiment analysis. Market sentiment or traders’ buying behaviour is captured at the bid-ask stage of price formation using the net buying pressure (NBP). Any significant change in NBP patterns defines an “interesting market event”. Real-time news headlines are automatically labelled using interesting market events, assuming a lag between the market information and its impact on buying behaviour. News was found to have an impact on the market buying behaviour of the S&P NIFTY index futures with an optimal lag of 5 minutes. Manual labelling of the news data validated this empirical finding.
© 2019 Published by Elsevier Ltd on behalf of Indian Institute of Management Bangalore. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

* Corresponding Author. Phone: 91-522-6696645; Fax: 91-522-2734025.
E-mail addresses: [email protected] (R. Yadav), [email protected] (A.V. Kumar), [email protected] (A. Kumar).
1 Phone: 91-1262-228548; Fax: 91-1262-274051.
2 Phone: 91-522-6696660; Fax: 91-522-2734025.
https://doi.org/10.1016/j.iimb.2019.03.006
0970-3896 © 2019 Published by Elsevier Ltd on behalf of Indian Institute of Management Bangalore. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Sentiment analysis can be used to determine the impact of unstructured market news on the emotions of investors, which is referred to as market sentiment. Prior studies have established the predictability of the impact of news on market sentiment in the spot market context. This study aims to capture market sentiment at the earliest price formation stage, i.e., when investors reveal their bid-ask quotes. Bid-ask quotes are outcomes based on the information available to the market participants. Investors who have superior information would reveal the same through their intentions in the form of bid-ask quotes. In this study, we perform supervised sentiment analysis using high-frequency, real-time news data and the behaviour of index futures traders at the level of bid-ask quotes.

Understanding the behaviour of futures market traders is pertinent as the futures market leads the spot markets in reacting to market events (Kumar and Jaiswal, 2013; Vipul, 2009). Net buying pressure (NBP), which is the difference between the number of buyer-initiated trades and the number of seller-initiated trades calibrated from the bid-ask quotes, proxies traders’ sentiment as it reveals the direction of the trades. The objective of this study is to predict the trends in NBP in the futures market by using supervised sentiment analysis of real-time news headlines. Understanding the impact of news on market trade directions as well as the time lag between the arrival of news and the anticipated trade direction is critical to this objective.

There are two main approaches to sentiment analysis: the unsupervised dictionary-based approach and the supervised approach. In the unsupervised dictionary-based
158 R. Yadav et al.

approach, market sentiment is extracted from the news text directly. A dictionary of sentiment words is used to count the different sentiment-related or emotions-related textual cues (Antweiler & Frank, 2004; Das & Chen, 2007; Garcia, 2013; Tetlock, 2007). In supervised sentiment analysis, market sentiment is learnt by using historical market trends and news patterns (Wuthrich et al., 1998; Lavrenko et al., 2000; Fung et al., 2003). The efficacy of dictionary-based sentiment analysis depends on the accuracy of the sentiment dictionary that is used and the suitability of the dictionary in a specific context.

In the supervised sentiment analysis approach, training data is created by either manual labelling or automatic labelling of historical news. Although manual labelling has higher precision, this method cannot be adopted for larger data sets. Automatic labelling involves temporally aligning “interesting” or significant market trends with the news that might have caused those trends. In this study, we use the supervised sentiment analysis approach to examine high-frequency news headlines using a vector space model (VSM). We also compare the predictability of dictionary-based sentiment metrics and that of a VSM.

We assume that markets are inefficient and take time to discount market information. Hence, we consider a trend to be related to the news stories that are released within an assumed alignment window. The time taken to discount market information varies according to the type of market or the asset under consideration. Prior studies on spot market sentiment analysis found empirically that new market information takes around 20 minutes to reflect in the stock prices (Gidofalvi, 2001). However, the futures market leads the spot market, and bid-ask quotes reflect the investors’ sentiment faster than price trends. Hence, in the context of the index futures’ sentiment at the bid-ask level, the arrival of news information has a much faster impact on the quotes. This study tests this hypothesis empirically by using different values for the alignment window. The results are validated by manually labelling a small sample of the news data. The context of this study is the National Stock Exchange (NSE) futures market, with index futures as the target asset and their trade direction as the measure of investor sentiment.

The rest of this paper is organised as follows. The subsequent section presents a literature review of sentiment analysis studies conducted in a financial market setting. The third section presents the supervised sentiment analysis methodology used in this study, followed by the data analysis and the results in the fourth section. To the best of our knowledge, this is one of the first sentiment analysis studies conducted in the futures market in the Indian context. The insights gained from this study could prove useful in high-frequency algorithmic trading and in understanding buying behaviour in the futures market in general. The implications of this research for academics and financial investments are discussed in the fifth section.

Literature review

Prior studies have established the predictability of the impact of news on market sentiment in the spot market context. Supervised sentiment analysis is a text classification task where the impact of textual market information on financial markets is learnt by using labelled training data. The creation of the training data involves labelling the news instances according to their impact on the markets. The news instances can be labelled manually according to the news discourse (Davis et al., 2006; Bozic & Seese, 2011) or automatically based on the corresponding market trends (Mittermayer & Knolmayer, 2006). Labelling news instances manually is a precise but labour-intensive task; hence, this method is not suitable for high-frequency news analytics. Automatically aligning news instances with the corresponding market trends is one of the most challenging tasks in sentiment analysis. Yoo et al. (2005), Mittermayer & Knolmayer (2006), Nikfarjam et al. (2010), and Nassirtoussi et al. (2014) reviewed studies that have used the supervised sentiment analysis approach. A brief review of some of the major concerns that our study deals with follows.

News-trend alignment is one of the most important parts of a supervised sentiment analysis model for testing and contextualising market efficiency. The accuracy of the news-trend alignment procedure constrains the efficacy of the supervised sentiment analysis. An inaccurate alignment results in noisy news labelling and, consequently, erroneous training data. Some of the reasons for a noisy training data set are:

1. The news stories may have been published out of time with the market trends.
2. A news story may have co-occurred with a trend by chance.
3. A news story may have contained contrary information to a market trend; for example, a positive story might have co-occurred with a negative trend.
4. A market trend may have been illusory because the market could move without the news information (Drury et al., 2012; Chan, 2003).

An efficient market requires news to be aligned with the price trends since all the information that arrives into the market is instantaneously discounted during price discovery. However, studies on sentiment analysis (Chan, 2003; Leinweber & Sisk, 2011) and behavioural finance refute the efficient market hypothesis (Baker & Wurgler, 2007). Markets frequently under-react to the post-news drifts, and often feature momentum trading (Chan, 2003; Baker & Wurgler, 2007).

Momentum trading has been associated with stale news in the extant literature (Tetlock, 2011). Long-term momentums can be filtered out to understand the impact of news on long-term market trends (Uhl et al., 2015). However, for high-frequency sentiment analysis, news headlines have been a preferred choice for analysis (Kohara et al., 1997; Bunningen, 2004; Takahashi et al., 2007; Chan, 2003). Headlines are one of the earliest forms in which information arrives into the markets (Takahashi et al., 2007). Analysing news-trend patterns using headlines alone can significantly improve classification performance compared to results using the whole news body or sentence-level analysis (Bunningen, 2004). Therefore, in this study, we conduct supervised sentiment analysis on real-time news headlines.

The time lag in news-trend alignment signifies the impact of news on the market, i.e., once news has arrived in the market, how much time does it take for the prices to reflect or discount the information? Prior studies refer to this lag
News Based Supervised Sentiment Analysis 159

period as the window of influence (Gidofalvi, 2001) or the reaction time of news (Cheng, 2010). In this paper, we refer to it as the alignment window as this time lag helps in aligning market trends with the probable impact of news.

For news stories that get published out of time with the corresponding trend, the time lag or a brief window of influence helps in aligning such news stories with their respective market trends. However, if the length of the window of influence is not chosen appropriately, it may introduce the risk of confounding news. Confounding news instances are instances that occur in an opportunity window of either an unrelated market trend or contradicting market trends. Such instances introduce noise while labelling the training data.

One method of handling confounding news instances is to ignore such instances altogether (Gidofalvi, 2001). Another way is to use an optimal alignment lag in aligning the news with the market trends that alleviates the confounding effect to a great extent. In the case of inefficient markets, the news needs to be aligned with the observed market trend with a certain time lag (Lavrenko et al., 2000; Gidofalvi, 2001). Lavrenko et al. (2000), one of the first studies of high-frequency news sentiment analysis, tested four different lags, namely, 0, 1, 5, and 10 hours to calibrate the impact of news on stock market trends. The predictability of the news features was found to decrease with an increase in lag, with shorter lags delivering significantly better prediction accuracy. Gidofalvi (2001) empirically tested different lengths of the window of influence and found that the impact of news is significantly high in [-20 minutes, 0] and [0, 20 minutes].

In tertiary duration studies, Li et al. (2010) found the optimal lag to be 15 days. While tertiary lags reflect the overall impact of news on long-term market trends, high-frequency time lags indicate the alignment lag over which the impact of the news can be profitably arbitraged (Lavrenko et al., 2000; Gidofalvi, 2001; Robertson et al., 2007). Gidofalvi’s 20-minute window of influence has been the de facto standard in prior sentiment analysis studies conducted on stock markets (Gidofalvi, 2001; Schumaker & Chen, 2008). The larger the assumed opportunity window, the more profound is the presence of the confounding news. Although a large window of influence helps in accounting for more news records, resulting in a larger training data set, the data set is obtained at the cost of inducing noise in the training data. Finding an optimal window of influence is a challenging task.

Trends in the futures markets often lead the trends in the stock markets (Vipul, 2009). These trends can be best captured at the price formation stage with bid-ask quotes. This paper presents a sentiment analysis study of the futures market using real-time news headlines to represent market information and net buying pressure to represent market trading behaviour. The study has two objectives: to test the predictability of news on the sentiment in the futures market, and to find the optimal time lag in which the futures market reacts to news.

Methodology

Supervised sentiment analysis is essentially a text classification task that can be divided into the following steps:

1. Market trend identification, which defines and identifies the significant market patterns of a given asset.
2. News-trend alignment scheme, which is used to create the training data set
3. News representation scheme, which converts the unstructured news data into structured data
4. Classification and evaluation

Market trend identification

This study defined significant NBP instances as market trends and the corresponding change points as market events. For a given asset, a market nugget m_i at time t_i contains the price and trade direction event information.

$$m_i = \{p_i, e_i\} \mid t_i - t_{i-1} = d$$

where p_i is the price traded, and e_i is the trade direction trend {+b, -s, NaN} at time t_i. Here, +1 refers to the number of buyer-initiated trades in the given time interval d, and -1 refers to the number of seller-initiated trades.

To calculate e_i, the trade direction is calculated using real-time bid-ask quotes in an event time scale with one-second precision data following Lee and Ready’s (1991) algorithm. For each minute, the total number of buyer-initiated trades and the number of seller-initiated trades are calculated to get the net difference, i.e., the net buying pressure. In this one-minute NBP time series, an event is defined as significant or interesting if the NBP value and its first moment are significantly above average. In this study, an event is defined using NBP and NBP log returns time series in the following way. At time t_i, let the net buying pressure be nbp_i, and the net buying pressure return be nbpr_i, and let zscore_nbp_i and zscore_nbpr_i be their respective z-scores.

$$e_i = \begin{cases} +1 & \text{if } zscore\_nbp_i > 1 \text{ AND } zscore\_nbpr_i > 1 \\ -1 & \text{if } zscore\_nbp_i < -1 \text{ AND } zscore\_nbpr_i < -1 \\ \text{NaN} & \text{otherwise} \end{cases}$$

Hence, a positive trade direction event can be characterised using those time instances where the number of buyer-initiated trades significantly exceeds the number of seller-initiated trades. Using NBP log returns ensures that only the change points are captured and not the time instances that could have been caused by the momentum of past news.

News-trend alignment

The objective of the news-trend alignment task is to create a labelled news data set for training the sentiment classifier. To label a news story, one needs to identify the market trends that the news story is capable of triggering in the market. Given that markets are not efficient, they take time to discount market information. Hence, the news stories released within a given time lag of an interesting market trend are labelled according to that trend. The lag or the amount of time that the market takes in discounting the public information varies with the markets and the assets. The optimal lag period can be found empirically from the sentiment analysis model.

Let D be the optimal lag period for a given asset. A trend e_i observed at time t_i is the result of the news released during

the time period between t_{i-D-1} and t_{i-1}. Time period [t_{i-D-1}, t_{i-1}] is the alignment window for trend e_i, and the set of news stories released in this alignment window is labelled with the sentiment s_i as per the trend e_i. News stories that occurred simultaneously were labelled with the same sentiment, even though they might be unrelated. This is one of the major limitations of the news-trend alignment methodology. To handle confounding labelling instances, i.e., instances that were aligned with contradictory sentiments, such instances were removed from the training data set.

News representation

A real-time news source typically contains two types of news: headline alerts and news stories. Headline alerts have only a headline and no textual discourse. A news story contains a headline, leading paragraphs, and the rest of the story. A market event is generally reported using headline alerts first, followed by the full-body news versions. In journalism, a full-body news story follows an inverted triangle layout, where the value of information is the maximum in the headline and decreases with the subsequent paragraphs (Van Dijk, 1988). A headline alert is a summarised abstraction of the full-body news story that reaches the markets before the main news does. News headlines have been the preferred choice in studies using supervised sentiment analysis (Kohara et al., 1997; Bunningen, 2004; Takahashi et al., 2007; Chan, 2003). This paper conducts sentiment analysis of objective information about news events using news headlines.

To capture subjectivity in the text of a news story, this study used dictionary-based sentiment metrics extracted from the full news text using Loughran and McDonald’s (2011) finance sentiment dictionary. For a given news story, the number of positive, negative, and risk-related sentiment words were counted using the dictionary. These counts were aggregated and normalised to obtain the positive, negative, and uncertainty sentiment scores.

This study used a vector space model (VSM) with binary weights to convert the unstructured news headlines to a structured feature vector. One of the major challenges involved in using a VSM is the large size of the feature space. There are three ways to reduce the feature space: semantic normalisation methods such as stemming and lemmatisation, unsupervised feature selection criteria such as term frequency and TF-IDF (term frequency-inverse document frequency [Joachims, 1998]) based thresholds, and supervised feature selection methods such as chi-square, info-gain, and Gini-index. This study used lemmatisation, stop-word removal, and lowercase conversion to normalise the textual data. In addition, we normalised abbreviations to handle different variations such as “US,” “U.S.,” “United States,” etc. using a rule base and a domain knowledge base. Further, the feature space was reduced using term frequency, chi-square, and info-gain-based selection criteria to enhance the classification task.

For a given time lag period D, a news data nugget N_i at time t_i is a set of news stories released during [t_{i-D-1}, t_{i-1}].

$$N_i = \{N_{i1}, N_{i2}, \ldots, s_i\}$$

where s_i is the sentiment labelled with the news-trend alignment, N_{ij} = [f_{ij1}, f_{ij2}, ..., f_{ijV}, s_i] or [F_{ij}, s_{ij}], released at t_{ij} | t_{i-D-1} <= t_{ij} <= t_{i-1}, and f_{ijk} is 1 if feature f_{ijk} is present in news N_{ij}; else, it is 0. V is the size of the feature space in the training data vocabulary.

If two market trends occur within a period D, their alignment windows might overlap and confound the labelling of the news stories. We removed such confounding news stories from the training data set. An optimal alignment window D helps to alleviate confounding news stories to an extent. We empirically tested different alignment windows to observe the optimal choice for futures buying behaviour. The training data thus formed is the set of labelled news stories N_T, where

$$N_T = \{N_1, N_2, \ldots\}$$

$$N_t = [f_{t1}, f_{t2}, \ldots, f_{tV}, s_t] \text{ or } [F_t, s_t]$$

Classification and evaluation

Supervised sentiment classification is a text classification problem often characterised by a huge feature space and noisy training data. Some of the most popular choices for sentiment analysis and for resolving the text classification problem are naive Bayesian classifiers and support vector machines (SVM). A naive Bayesian classifier is a generative probabilistic classifier that assumes that features belonging to different classes are independent. Even though features belonging to different sentiment classes are far from being independent, naive Bayesian classifiers have proved to be a robust choice for text classification tasks (Zhang, 2004). Support vector machines are discriminative classifiers that translate noisy n-dimensional space to a higher dimensional plane, identifying the best hyperplanes for distinguishing the sentiment classes. They are well suited for large and noisy text classification tasks (Joachims, 1998). We explored the predictability of both naive Bayesian classifiers and soft-margin SVMs for sentiment classification.

The sentiment analysis was conducted over a period of one year. For validation purposes, this study used a (10+1) fold cross-validation data set using data from the first 11 months of the study period. The data from the last month was used as the test data set to verify the efficacy of the proposed algorithm on unseen data. For each of the sentiment classes, false positives were costlier than the false negatives; therefore, precision was more important than recall in the sentiment classification task. Detection error trade-off (DET) curves were used to compare the different sentiment analysis models to understand the impact of different parameters such as the alignment window size. The DET curve is a convenient way to visualise the trade-off between missed detection rate and false alarm rate on normal deviate scales.

Data analysis and results

This study was set in the Indian futures market, and the data sampled for the study was from the year 2009. The year 2009 was an eventful year for Indian financial markets because it was the year immediately following the 2008 global credit crisis. In 2009, Indian financial markets went through a number of troughs and crests with the Satyam bankruptcy crisis (NSE Guide, 2009), a weakening rupee, and the euphoria following the election results (Economic Times, 2009). The duration of the study was from January 1, 2009 to December 31, 2009.
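The alignment and representation steps described in the methodology can be illustrated with a minimal Python sketch. It is not the authors' Java implementation: the function names, the toy headlines, and the event times are hypothetical, tokenisation is a plain whitespace split rather than lemmatisation, and the rule that a headline falling in the windows of both a positive and a negative event is dropped is one assumed reading of how confounding instances were removed.

```python
from typing import Dict, List, Tuple


def label_headlines(
    news: List[Tuple[int, str]],
    events: Dict[int, int],
    lag: int,
) -> List[Tuple[str, int]]:
    """Label headlines by the event(s) whose alignment window contains them.

    news   : (release minute t_ij, headline) pairs
    events : event minute t_i -> trend e_i (+1 or -1)
    lag    : alignment lag D in minutes; the window is [t_i - D - 1, t_i - 1]
    """
    labelled = []
    for t_news, headline in news:
        sentiments = {
            trend
            for t_event, trend in events.items()
            if t_event - lag - 1 <= t_news <= t_event - 1
        }
        # One sentiment: keep. None: no aligned event, skip.
        # Both +1 and -1: confounded (assumed rule), skip.
        if len(sentiments) == 1:
            labelled.append((headline, sentiments.pop()))
    return labelled


def binary_vector(headline: str, vocabulary: List[str]) -> List[int]:
    """Binary VSM weights: 1 if the vocabulary term occurs in the headline."""
    tokens = set(headline.lower().split())
    return [1 if term in tokens else 0 for term in vocabulary]


# Hypothetical toy data, not taken from the study.
news = [
    (100, "Satyam shares plunge"),
    (107, "Vedanta output rises"),
    (110, "Gail wins order"),   # inside the windows of both a +1 and a -1 event
    (120, "Rupee steady"),      # inside no window
]
events = {104: -1, 112: +1, 114: -1}
vocabulary = ["gail", "plunge", "satyam", "vedanta"]

labelled = label_headlines(news, events, lag=5)
training = [(binary_vector(h, vocabulary), s) for h, s in labelled]
```

With a lag of 5 minutes, only the first two toy headlines receive a label; widening the lag pulls more headlines into the training set but also makes overlapping windows, and hence dropped or noisy instances, more likely.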

The real-time trades and quotes data were collected from the NSE futures market from January 1, 2009 to December 31, 2009 for the S&P CNX NIFTY Index futures. The S&P CNX NIFTY Index, also referred to as the NIFTY 50 or simply NIFTY, includes 50 companies that cover over 22 sectors of the Indian economy. The trading hours for the NSE futures market are from 9:15 a.m. to 3:30 p.m. Trading data for the quantitative data analysis was considered from 10:00 a.m. to 3:30 p.m. to avoid early trading-hour incoherence.

The news representation module for this study was developed in Java 2.0 using Eclipse IDE. Pre-processing of the news feeds, i.e., tokenisation, lemmatisation, and part of speech (POS) tagging was done using Stanford CoreNLP (v3.2) libraries (Klein & Manning, 2003). The market quantitative data was pre-processed in Matlab R2012b. For classification, we used Matlab’s naive Bayesian package and the SVMLight (Joachims, 1998) package. The SVMLight package can handle a large number of features and training records with sparse vector representations with fast optimisation algorithms. The data processing was performed using a 64-bit Intel® Core™ i7-3770 Processor, CPU @ 3.40 GHz, 4 cores, with 8 GB RAM, running on Microsoft Windows 8 Pro.

Real-time news was collected from the Bloomberg real-time news service. News stories related either to the NIFTY companies or to the general Indian financial, political, or economic context were extracted. As per these criteria, there were 100,078 news stories collected, of which 52% were company-related news stories, and 48% were socio-economic and political news stories. In the real-time news feeds, four types of news reporting formats were observed: headline alerts, full-story news feeds, summary feeds, and structured news feeds. For a given event, headline alerts are the first reports to the market, followed by full-story news feeds. A full-story news feed has three main elements: the headline, the leading paragraph that contains the most important and necessary information about the market event(s), and the rest of the story, which contains further information and analysis about the event or related past events. Structured news feeds are full-story feeds that contain formatted structures such as tables and lists. Often, such news stories are periodical summary news such as “India Stocks Preview” and “Commodities Watch.”

In this data set, around 56.7% of the news feeds were headline alerts, and 37% were full-story feeds, of which 7.6% were summary news stories; 5.5% of the news feeds were formatted news stories (Figure 1).

Figure 1 Categories of news from real-time news source Bloomberg.

Since the aim of this study was to analyse the impact of new event information entering markets in real time, summary news stories, which include formatted news feeds, were not considered in the analysis. After filtering out the summary news feeds and formatted news feeds, 88,821 news feeds were found relevant, i.e., around 88% of the news stories were relevant. Of these relevant news stories, around 64% were headline alerts. That is, headline alerts, which arrive in markets before their corresponding full-body news updates, accounted for more than half of the news data. Headlines from 88,821 news instances were used for the sentiment analysis of the futures market.

The news data is represented in vector form using a vector space model. To extract and count important features from the text, the following pre-processing steps were used. Sentence splitting and tokenisation were used to identify unique unigrams in the news headlines. To normalise semantic inflections, unigrams were lemmatised and converted to lower case. After removing stop words, there were 19,753 unique features in the resulting feature space. After removing features that were numeric or alphanumeric and features with document frequency less than two, the number of features left for analysis was 5545. This was significantly less than the original feature space. However, a feature vector of 5545 dimensions with a training size of the order of 60,000–70,000 was still challenging for the classifier.

Table 1 Sample of top features ranked according to TF-IDF, chi-square, and info-gain metrics.

Top TF-IDF-based features   Top chi-square-based features   Top info-gain-based features
India                       satyam                          satyam
Rupee                       maharashtra                     gail
bln                         gail                            maharashtra
mln                         ton                             reduce
rupee                       vedanta                         jsw
indian                      reduce                          vedanta
bank                        jsw                             cipla
share                       average                         oilseed
price                       oilseed                         vijaya
ton                         april                           reduce

We empirically tested two supervised feature selection methods, chi-square and info-gain, for reducing the feature space, thereby improving the classifier’s performance. The top 10 features according to TF-IDF, chi-square, and info-gain are shown in Table 1. Since the chi-square and info-gain metrics are supervised, the features ranked with respect to these two metrics hold more classification value compared



to those ranked with the TF-IDF based metric. The info-gain metric, with its top 90 percentile words, gave better performance. Hence, the results from the info-gain configuration are reported in this study. Of the 5545 features, 387 features were selected for sentiment classification.

The trade direction, i.e., the net buying pressure, for the real-time bid-ask quotes was calculated using Lee and Ready’s (1991) algorithm. There were 79,771 data points in the resulting 1-minute-frequency NBP data of one year. A significant trade point is defined as a time instance that lies above a z-score of two. Thus, the absolute NBP threshold for the market trends was found to be around 20 NBP points. With this criterion, around 2350 data points were found to have “interesting” market sentiment. The 1-minute NBP plot of a sample time period is shown in Figure 2, along with the daily aggregated NBP for better visualisation of trading trends. Some significant NBP data points along with the significant news stories released at that time are presented in Table 2. Significant NBP events marked with increased buyer-initiated or seller-initiated trades are followed by a buyer’s market trend or a seller’s market trend for a few days, implying significant arbitrage possibilities.

Figure 2 Net buying pressure plot for a sample of time period.
Note: Significant market events marked. Refer to Table 2 for event details.

Table 2 News released around some of the “significant” market trends.

Event number (Figure 2)   News timestamp   News headline
1                         7 Jan, 2009      Emerging currencies to drop, Morgan Stanley says
2                         7 Jan, 2009      SRSR holdings stake in Satyam drops to 3.6%, spokesperson says
3                         27 Jan, 2009     Asian stocks climb as U.S. indicators boost exporters, banks
4                         26 Feb, 2009     Gold falls for third day on speculation, economy will recover
5                         26 Feb, 2009     Ranbaxy lab alleged to fake test results in drug applications

Once the significant trade points were identified, they were aligned with the news data according to the alignment window sizes [1 mt, 5 mt, 10 mt, 15 mt, 20 mt, 30 mt]. The news stories in each alignment window were labelled according to the aligned NBP trend (positive, negative, or neutral). The classifier classified positive news instances and negative news instances. Sentiment analysis was conducted across six different training data sets, each created assuming a varying alignment lag [1 mt, 5 mt, 10 mt, 15 mt, 20 mt, 30 mt], with (10+1) fold cross-validation over the first 11 months for each training data set. We tested two classifiers: MATLAB naive Bayes algorithm and SVMLight. A multinomial naive Bayesian classifier was used as it is best suited for VSM-based text classification. A linear kernel soft margin SVM was used. The SVM’s error penalty (the trade-off between training error and margin) was set to N/||D||, where N is the size of the training data D. The SVMLight classifier was found to perform better than the naive Bayesian classifier on the validation data set and the test data set. Only the SVMLight results are reported in this paper.
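The event definition from the Market trend identification section can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' Matlab code: the function names and the toy NBP series are hypothetical, and the first difference of NBP stands in for the NBP log return, because log returns are undefined for non-positive NBP values and the text does not say how those were handled. The threshold is left as a parameter, since the event definition uses a z-score of 1 while the data analysis mentions a z-score of two.

```python
import math
from statistics import mean, pstdev
from typing import List


def zscores(values: List[float]) -> List[float]:
    """Standardise a series using its population mean and standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def detect_events(nbp: List[float], threshold: float = 1.0) -> List[float]:
    """Return e_i per minute: +1.0, -1.0, or NaN.

    An event needs both the NBP level and its change to be extreme in the
    same direction. The change is the first difference of NBP (an assumed
    stand-in for the NBP log return).
    """
    changes = [b - a for a, b in zip(nbp, nbp[1:])]
    z_level = zscores(nbp)[1:]      # drop minute 0, which has no change
    z_change = zscores(changes)
    events = [math.nan]             # minute 0 cannot be an event
    for zl, zc in zip(z_level, z_change):
        if zl > threshold and zc > threshold:
            events.append(1.0)
        elif zl < -threshold and zc < -threshold:
            events.append(-1.0)
        else:
            events.append(math.nan)
    return events


# Hypothetical one-minute NBP values with one buying and one selling spike.
nbp_series = [0, 1, -1, 0, 30, 1, 0, -30, 0, 1]
detected = detect_events(nbp_series)
event_minutes = [i for i, e in enumerate(detected) if not math.isnan(e)]
```

Requiring the change as well as the level to be extreme is what keeps the minute after a spike from being flagged: its level may still be unusual, but its change points the other way.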

Table 3 Micro-averaged precision, recall, and F-metric of news-based sentiment analysis model on (10+1) fold cross-validation data.

Alignment lag   Precision   Recall     F-metric
1               0.321912    0.313279   0.317537
5               0.312701    0.32759    0.319972
10              0.297627    0.319205   0.308039
15              0.281293    0.327945   0.302833
20              0.263084    0.332933   0.293916

Table 4 Micro-averaged precision, recall, and F-metric of news-based sentiment analysis model on test data.

Alignment lag   Precision   Recall     F-metric
1               0.188772    0.300082   0.231754
5               0.262076    0.327389   0.291114
10              0.259737    0.308916   0.2822
15              0.268283    0.294497   0.280779
20              0.282828    0.266139   0.27423

The classifier’s output and the training data quality were tested with micro-averaged weighted precision, recall, and F-metric that calibrated the efficacy of our two-class prediction model. For trading floor decisions, a good classifier should have lower false positive and false negative rates. Hence, the different classifier results were compared with the DET curves.

Table 3 and Table 4 show the micro-averaged weighted precision, recall, and F-metric of the sentiment classifier on the (10+1) fold cross-validation data and on the test data, respectively. The precision of the classifier decreased as the alignment window size increased. The precision of the classifier is an indication of the noise present in the alignment process. As the alignment window size decreases, we run the risk of missing out on important events altogether, which is evident in the decreased recall at the short lags. The F-metric provides the classifier’s overall accuracy in analysing market sentiments. An alignment lag of 5 minutes was found to be optimal. The same findings were corroborated by the DET curves, as shown in Figure 3. Further, including the dictionary-based sentiment features of whole news text made no significant improvement on the predictability of the VSM features of news headlines (Figure 4).

Yadav et al. (2013) compared the precision of automatically labelled news instances and manually labelled news

Figure 3 Detection error trade-off plot of the news-based sentiment analysis model comparing different alignment lags.
164 R. Yadav et al.

Figure 4 Detection error trade-off plot of the news-based sentiment analysis model with vector space model (VSM) features vs. dic-
tionary-based sentiment features of headlines.

instances in the futures market, and found a window of 5 presented a curious case that highlights the inherent com-
minutes to be an optimal alignment lag too. This further val- plexity of labelling news data. In the gold standard data, a
idates the results observed in this study. A set of significant sufficient number of news instances are manually labelled
NBP trade instances and the corresponding [1 mt, 5 mt, 10 by at least two experts and refined such that the mutual
mt, 15 mt, 20 mt, 30 mt] window-aligned news stories were agreement is above a given threshold. For instance, in Das
given to the annotator. The annotator had to decide whether and Chen’s (2007) work, two experts annotated message
a news story in each alignment window was plausibly respon- board data with the ambiguity coefficient (the labelling mis-
sible for the corresponding trade direction. The results of match percentage) of 27.54%, i.e. the experts themselves
this comparison concurred with the sentiment analysis find- disagreed about the nature of impact of market news 27.54%
ings. In the context of the Indian futures markets, the times. This highlights the inherent complexity in under-
impact of news stories or market events is reflected in the standing the impact of news on the financial markets by
investors’ bid-ask behaviour within a time window of 5 human experts. The inter-rater agreement scores obtained
minutes (Table 5). using manually labelled data can be treated as the upper
A major limitation in labelling financial news automati- bound on the accuracy expected from an automatically
cally is the noise that is inherent in financial markets. The labelled data set. This explains the low precision and recall
gold standard (i.e., the manually labelled training data) obtained in sentiment analysis studies.
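As a rough consistency check on the metrics used in this section, the Python sketch below computes micro-averaged precision, recall, and F-metric from pooled per-class counts, and recomputes the F-metric column of Tables 3 and 4 from the reported precision and recall. It assumes the F-metric is the balanced harmonic mean 2PR/(P+R), which the paper does not state explicitly. The function names and the per-class counts in `EXAMPLE_COUNTS` are hypothetical; only the table rows are copied from Tables 3 and 4.

```python
def f_measure(precision, recall):
    """Balanced F-metric: harmonic mean of precision and recall (assumed form)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def micro_average(counts):
    """Pool (tp, fp, fn) counts over all classes, then compute P, R, and F."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f_measure(precision, recall)


def max_f_deviation(rows):
    """Largest gap between the recomputed and the reported F-metric."""
    return max(abs(f_measure(p, r) - f) for _, p, r, f in rows)


# (alignment lag in minutes, precision, recall, reported F-metric)
TABLE_3 = [
    (1, 0.321912, 0.313279, 0.317537),
    (5, 0.312701, 0.32759, 0.319972),
    (10, 0.297627, 0.319205, 0.308039),
    (15, 0.281293, 0.327945, 0.302833),
    (20, 0.263084, 0.332933, 0.293916),
]
TABLE_4 = [
    (1, 0.188772, 0.300082, 0.231754),
    (5, 0.262076, 0.327389, 0.291114),
    (10, 0.259737, 0.308916, 0.2822),
    (15, 0.268283, 0.294497, 0.280779),
    (20, 0.282828, 0.266139, 0.27423),
]

# Hypothetical per-class (tp, fp, fn) counts, for illustration only.
EXAMPLE_COUNTS = {"positive": (30, 60, 70), "negative": (20, 50, 40)}

if __name__ == "__main__":
    print(micro_average(EXAMPLE_COUNTS))
    print(max_f_deviation(TABLE_3), max_f_deviation(TABLE_4))
```

Under this assumption the recomputed values agree with the reported F-metric column to about five decimal places, which suggests that the tables report the balanced F-metric.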

Table 5 Precision in news trend-based manual labelling.

Alignment lag (minutes)   News-based precision
1                         0.46078431
5                         0.51660517
10                        0.49862763
15                        0.48341837
20                        0.48396362

Conclusion

Financial markets behaviour is an outcome of the prevailing sentiment shaped by the dynamics of the arriving news. Often, markets take time to decipher the sentiment; therefore, there is a lag between the arrival of a news story and the actual trading. An investor who identifies the sentiment early would significantly profit from the anticipated direction. Therefore, understanding the alignment leads, lags, or the window of opportunity is highly crucial. Prior studies have confirmed the importance and length of the alignment lag in the context of stock markets (Gidofalvi, 2001; Robertson et al., 2007). This study analysed the optimal choice of an alignment lag that is appropriate for the index futures market in India. We observed that the sentiment classifier performed best with an alignment lag of 5 minutes in the context of the Indian futures market. This finding was further validated by the manually labelled data. Our results are consistent with the findings reported in prior studies, and indicate that the index futures buying behaviour leads the spot market sentiment. This is consistent with other studies that propound that the derivatives market leads the spot market (Kumar and Jaiswal, 2013; Vipul, 2009).

High-frequency, real-time news plays an important role in the highly competitive financial trading industry. This study captures the impact of real-time news on the futures markets right at the first interface between the market and investors, i.e., at the bid-ask stage. The findings reported here have significant implications for futures trading, especially for high-frequency algorithmic trading. With nearly efficient markets, where market information arrives with minimum delays, the timely analysis of market information is imperative for capturing an arbitrage opportunity. This study establishes that the arbitrage window of opportunity in the Indian futures markets is as short as 5 minutes, which underlines the importance of high-frequency analysis of news on the markets.

Financial markets are highly evolving institutions that have featured random walks, information efficiency, and several behavioural anomalies (Byrne and Brooks, 2008). News happens to be one of the most significant yet underutilised sources of market information; however, it is challenging to model news data. With such a stimulating research problem, we explored a new research direction where the basic unit of analysis is a news headline. There were constraints inherent with the news-based sentiment analysis approach. The foremost limitation faced in this work (and by any news-based sentiment analysis in general) is the chaotic market behaviour. The key assumption of news-based sentiment analysis is that the news moves the markets. While news does move the markets, the markets may move without news too. The second limitation was the extraction of event-related information from the news stories using the vector space model, which did not account for the word order and other semantic roles of news text.

Acknowledgements

This paper is a revised and extended version of the following conference paper: Yadav R., Kumar A., & Kumar A. V. (2013). Supervised sentiment analysis: Engineering a robust automatic training dataset. Proceedings of the International Conference on Business Analytics and Intelligence. Indian Institute of Management Bangalore.

References

Antweiler, W., & Frank, M.Z. (2004). Is all that talk just noise? The information content of internet stock message boards. The Journal of Finance 59 (3), 1259–1294.
Baker, M., & Wurgler, J. (2007). Investor sentiment in the stock market. Journal of Economic Perspectives 21 (2), 129–152.
Bozic, C., & Seese, D. (2011). Neural networks for sentiment detection in financial text. Proceedings of the 14th International Business Research Conference. Retrieved from https://pdfs.semanticscholar.org/dd7f/4fa6137df5d5ec08efe97150996a548af5e7.pdf.
Bunningen, A.H. (2004). Augmented trading - From news articles to stock price predictions using syntactic analysis. (Master’s thesis). University of Twente, Enschede (Netherlands).
Byrne, A., & Brooks, M. (2008). Behavioural finance: Theories and evidence. The Research Foundation of CFA Institute Literature Review 1–26.
Chan, W.S. (2003). Stock price reaction to news and no-news: Drift and reversal after headlines. Journal of Financial Economics 70 (2), 223–260.
Cheng, S.-H. (2010). Forecasting the change of intraday stock price by using text mining news of stock. In: Proceedings of the Ninth International Conference on Machine Learning and Cybernetics. IEEE, Qingdao, pp. 2605–2609.
Das, S.R., & Chen, M.Y. (2007). Yahoo! for Amazon: Sentiment extraction from small talk on the web. Management Science 53 (9), 1375–1388.
Davis, A. K., Piger, J. M., & Sedor, L. M. (2006). Beyond the numbers: An analysis of optimistic and pessimistic language in earnings press releases. Working Paper 2006-005A. Working Paper Series, Research Division Federal Reserve Bank of St. Louis.
Drury, B., Torgo, L., & Almeida, J. (2012). Classifying news stories with a constrained learning strategy to estimate the direction of a market index. International Journal of Computer Science and Applications 9 (1), 1–22.
Economic Times. (2009). Sensex creates history; two upper circuits in one day. https://economictimes.indiatimes.com/sensex-creates-history-two-upper-circuits-in-one-day/articleshow/4545975.cms.
Fung, G.P., Yu, J.X., & Lam, W. (2003). Stock prediction: Integrating text mining approach using real-time news. IEEE International Conference on Computational Intelligence for Financial Engineering. IEEE, Hong Kong, pp. 395–402.
Garcia, D. (2013). Sentiment during recessions. The Journal of Finance 68 (3), 1267–1300.
Gidofalvi, G. (2001). Using news articles to predict stock price movements. Department of Computer Science and Engineering, University of California, San Diego. Retrieved from https://people.kth.se/~gyozo/docs/financial-prediction.pdf.
Joachims, T. (1998). Text categorization with support vector machines: Learning with many relevant features. European Conference on Machine Learning. Springer, Berlin Heidelberg, pp. 137–142.
Klein, D., & Manning, C.D. (2003). Accurate unlexicalized parsing. In: Proceedings of the 41st Meeting of the Association for Computational Linguistics, pp. 423–430.
Kohara, K., Ishikawa, T., Fukuhara, Y., & Nakamura, Y. (1997). Stock price prediction using prior knowledge and neural networks. Intelligent Systems in Accounting, Finance and Management 6 (1), 11–22.
Kumar, A., & Jaiswal, S. (2013). The information content of alternate implied volatility models: Case of Indian markets. Journal of Emerging Market Finance 12 (2), 293–321.
Lavrenko, V., Schmill, M., Lawrie, D., & Ogilvie, P. (2000). Mining of concurrent text and time series. In: Proceedings 6th ACM SIGKDD Int. Conference on Knowledge Discovery and Data Mining, Boston, pp. 37–44.
Lee, C.M.C., & Ready, M.J. (1991). Inferring trade direction from intraday data. The Journal of Finance 46 (2), 733–746.
Leinweber, D., & Sisk, J. (2011). Event-driven trading and the “new news”. The Journal of Portfolio Management 38 (1), 110–124.
Li, X., Deng, X., Wang, F., & Dong, K. (2010). Empirical analysis: News impact on stock prices based on news density. IEEE International Conference on Data Mining Workshops, pp. 585–592.
Loughran, T., & McDonald, B. (2011). When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks. The Journal of Finance 66 (1), 35–65.

Mittermayer, M. A., & Knolmayer, G. F. (2006). Text mining systems for market response to news: A survey. Working Paper No 184. Bern, Switzerland: Institute of Information Systems, University of Bern.
Nassirtoussi, A.K., Aghabozorgi, S., Wah, T.Y., & Ngo, D.C. (2014). Text mining for market prediction: A systematic review. Expert Systems with Applications 41 (16), 7653–7670.
Nikfarjam, A., Muthaiyah, S., & Emadzadeh, E. (2010). Text mining approaches for stock market prediction. In: Proceedings of the 2nd International Conference on Computer and Automation Engineering (ICCAE 2010). IEEE, pp. 256–260. Vol. 4.
NSE Guide. (2009). Stock Market: Satyam scandal rattles stocks. http://nseguide.com/stock-views/stock-market-satyam-scandal-rattles-stocks/.
Robertson, C., Geva, S., & Wolff, R.C. (2007). Can the content of public news be used to forecast abnormal stock market behaviour? Seventh IEEE International Conference on Data Mining, pp. 637–642.
Schumaker, R., & Chen, H. (2008). Evaluating a news-aware quantitative trader: The effect of momentum and contrarian stock selection strategies. Journal of the American Society for Information Science and Technology 59 (2), 247–255.
Takahashi, S., Takahashi, M., Takahashi, H., & Tsuda, K. (2007). Analysis of the relation between stock price returns and headline news using text categorization. 11th International Conference, KES 2007, XVII Italian Workshop on Neural Networks Proceedings, Part II. Vietri sul Mare, Italy. Springer, pp. 1339–1345.
Tetlock, P.C. (2007). Giving content to investor sentiment: The role of media in the stock market. The Journal of Finance 62 (3), 1139–1168.
Tetlock, P.C. (2011). All the news that's fit to reprint: Do investors react to stale information? Review of Financial Studies 24 (5), 1481–1512.
Uhl, M., Pedersen, M., & Malitius, O. (2015). What’s in the news? Using news sentiment momentum for tactical asset allocation. The Journal of Portfolio Management 41 (2), 100–112.
Van Dijk, T.A. (1988). Structures of News. In: Van Dijk, T.A. (Ed.), News as discourse. Erlbaum, Hillside, NJ, pp. 17–94. Ch 2.
Vipul (2009). Mispricing, volume, volatility and open interest. Journal of Emerging Market Finance 7 (3), 263–292.
Wuthrich, B., Cho, V., Leung, S., Permunetilleke, D., Sankaran, K., Zhang, J., & Lam, W. (1998). Daily stock market forecast from textual web data. 1998 IEEE International Conference on Systems, Man, and Cybernetics. IEEE, pp. 2720–2725. Vol. 3.
Yadav, R., Kumar, A., & Kumar, A.V. (2013). Supervised sentiment analysis: Engineering a robust automatic training dataset. In: Proceedings of International Conference on Business Analytics and Intelligence. Indian Institute of Management Bangalore.
Yoo, P.D., Kim, M.H., & Jan, T. (2005). Machine learning techniques and use of event information for stock market prediction: A survey and evaluation. CIMCA '05 Proceedings of the International Conference on Computational Intelligence for Modelling, Control and Automation and International Conference on Intelligent Agents, Web Technologies and Internet Commerce (CIMCA-IAWTIC'06). IEEE Computer Society, Washington, DC, pp. 835–841. Vol. 2.
Zhang, H. (2004). The optimality of naive Bayes. American Association for Artificial Intelligence 1 (2), 3. Retrieved from https://www.aaai.org/Papers/FLAIRS/2004/Flairs04-097.pdf.
