Agaian and Kolm (2017) - Financial Sentiment Analysis Using Machine Learning Techniques
Email address
[email protected] (S. Agaian), [email protected] (P. Kolm)
Citation
Sarkis Agaian, Petter Kolm. Financial Sentiment Analysis Using Machine Learning Techniques.
International Journal of Investment Management and Financial Innovations.
Vol. 3, No. 1, 2017, pp. 1-9.
Keywords
Financial Sentiment, Sentiment Analysis, Text Categorization, Text Classification
Received: March 25, 2017
Accepted: May 10, 2017
Published: August 23, 2017
Abstract
The rise of web content has presented a great opportunity to extract indicators of investor
moods directly from news and social media. Gauging this sentiment or general
prevailing attitude of investors may simplify the analysis of large, unstructured textual
datasets and help anticipate price developments in the market. There are several
challenges in developing a scalable and effective framework for financial sentiment
analysis, including: identifying useful information content, representing unstructured text
in a structured format under a scalable framework, and quantifying this structured
sentiment data. To address these questions, a corpus of positive and negative financial
news is introduced. Various supervised machine learning algorithms are applied to gauge
article sentiment and empirically evaluate the performance of the proposed framework
on the introduced media content.
1. Introduction
Online news, blogs, and social networks have become popular communication
platforms to log thoughts and opinions about everything from world events to daily
chatter. These opinion-rich resources attract attention from financial investors who seek to
understand the opinions of both businesses and individual users [1]. Market sentiment is
the general prevailing attitude of investors toward anticipated price development in a
market. This attitude is the accumulation of a variety of fundamental and technical
factors, including price history, economic reports, seasonal factors, and national and
world events. As more and more opinions are made available on websites (such as Twitter,
Reddit, Facebook, Bloomberg Finance, Google Finance, and Yahoo Finance), it is
becoming increasingly difficult to analyze this large volume of media content. For instance, the
popular micro-blogging site, Twitter, has over 200 million active users, who post more
than 400 million tweets a day [1].
Currently there is major interest in both industrial and academic research to use
sentiment to analyze, classify, make predictions and gain insights into various aspects of
daily life. A survey of sentiment analysis [2] has been cited over 5000 times in Google
Scholar. Significant progress has been made in sentiment tracking techniques that extract
indicators of public mood directly from social media content such as blog content [2] [3]
[4] [5] [6] [7] [8] [9]. These works have laid the groundwork to address several
challenges in developing a scalable and effective system for dynamic web sentiment
analysis. These challenges include identifying useful information content, representing
unstructured text in a structured format under a scalable framework to determine sentiment,
and extracting relationships between market trends and this quantified sentiment.
To address the challenges above, one needs to develop a framework to automatically
classify financial news as positive or negative. Most previous research on
sentiment-based classification has been focused on non-financial content, such as movie
reviews [7], travel and automobile reviews [10], and Amazon product reviews [11].
In the finance community, few papers [12] [13] have been published exploring sentiment in
financial news. These works, however, use simple lexical algorithms to evaluate
sentiment, and focus primarily on addressing whether the returns of a firm on a given day
are connected to the news that was published about the firm on that day. This approach
is unlike the strong connection between classification and text in a movie review [14].
This work focuses on developing the framework to perform more sophisticated sentiment
analysis and learning on financial text. The performance of a sentiment analysis
relies significantly on the quality of the training and testing data. Unfortunately, commonly
used text collections such as Reuters-21578, Amazon Product Review Data, and
Cornell’s movie review dataset cannot be used as a benchmark as they lack either sentiment
or financial focus. As there is no publicly offered dataset with positive or
negative financial news, a database of positive and negative financial texts is generated.
The main remaining contributions of this work are to formally define the problem
of using supervised sentiment analysis of financial texts to describe market trends and to
develop a supervised machine learning approach to gauge financial sentiment. The goal of
this work is to provide a prototype that can be leveraged to represent unstructured large
financial texts and that is efficient, accurate and scalable.
The remainder of this paper is organized as follows. Section 2 reviews existing literature
related to this paper and formally defines the problem of study. Section 3 proposes an
automated supervised sentiment analysis framework. Section 4 introduces the generated
financial database of positive and negative financial news. Sections 5 and 6 present the
simulation and cross-validation results, respectively. Section 7 concludes and discusses
directions for future research.
2. Background, Related Work, and Challenges
This section presents an overview of sentiment analysis and machine learning literature
that is related to the current work and the problem statement.
2.1. Sentiment Analysis
Sentiment analysis, also called opinion mining, is the field of study that analyzes people’s
opinions, sentiments, evaluations, appraisals, attitudes, and emotions towards
entities such as products, services, organizations, individuals, issues, events, topics, and
their attributes [2]. This problem of automatic text classification and categorization has
spread to almost every possible domain and has grown to be one of the most active research
areas in machine learning and natural language processing [2]. It has a very high commercial
potential in fields like finance, where individuals seek to analyze large volumes of textual
information about businesses and their customers; studies indicate, for instance, that 80% of
a company’s information is contained in text documents such as emails, memos, and
reports [1].
Sentiment analysis dates back to the 1990s [15] [16] [17]. Fama [18] previously demonstrated
that emotions have an effect on rational thinking and social behavior.
Hatzivassiloglou and McKeown [19] develop an algorithm for predicting semantic orientation.
They classify positive, negative and neutral expressions in texts by using a small set
of manually annotated seed words. Their algorithm performs well, but it is designed for
isolated adjectives, rather than phrases containing adjectives or adverbs. Hatzivassiloglou
and Wiebe [20] show the effects of adjective orientation and gradability on sentence
subjectivity. Turney and Littman [21] present an unsupervised approach for classifying
positive and negative terms. For additional works, readers are referred to
[22] [23] [24] [25] [26] [27] [28] [29] [30] [49] [50] [51] [52] [53] [54], as well as the
extensive reviews [6] [31].
Sentiment analysis started being adopted in finance with the introduction of works such as
[13] [32] [33], which use sentiment analysis of weblog and news data to predict stock
price moves. Nofsinger [34] demonstrates that the stock market itself can be considered as
a measure of social mood. Gilbert and Karahalios [35] found that increases in
expressions of anxiety, worry and fear in weblogs predict downward pressure on the
S&P 500 index. Bordino et al. [36] show that trading volumes of stocks traded in
NASDAQ-100 are correlated with their query volumes (i.e., the number of users’ requests
submitted to search engines on the Internet). Thelwall et al. [37] analyze events in Twitter
and show that popular events are associated with increases in average negative sentiment
strength. Nofer [38] and Bollen et al. [39] have found that changes in a specific public mood
dimension (i.e., calmness) can predict changes in stock price. Ruiz et al. [40] use
time-constrained graphs to study the problem of correlating Twitter micro-blogging activity
with changes in stock prices and trading volumes. Smailović et al. [41] use the volume and
sentiment polarity of Apple financial tweets to identify important events and future
movements of Apple stock prices.
Sentiment analysis can be divided into two key classes: supervised and unsupervised [7].
A conventional way to perform unsupervised sentiment analysis is the lexicon-based
method [3] [8] [29]. This is the primary method employed in the financial work listed above,
primarily due to its simplicity in algorithm and implementation. Lexicon-based
methods employ a sentiment lexicon to determine the overall sentiment polarity of a document.
Since they disregard context and semantic structure, they are less
accurate than other unsupervised and supervised methods; they have also become increasingly
difficult to apply due to the distinct language of social media, where short, unstructured
texts with expressions such as “it’s coooool” and “good 9t:)”
are commonplace [5] [31] [42]. Thus, it is difficult to define a
universally optimal sentiment lexicon to cover words from different domains [42].
Most research on non-lexical sentiment-based classification has not been focused on
financial texts. Research has centered on: movie reviews [2] [7]; automobile and
travel destination reviews [10]; and product reviews from Amazon [11]. Reviews have been
used to generate datasets as reviewers often summarize their overall sentiment with a
rating indicator, such as number of stars, thereby eliminating the need to hand label
data [7].
Public, classified datasets have not been introduced for financial texts. Open research
issues in sentiment analysis include [43]:
a) A need for better modeling of compositional sentiment. At the sentence level, this means
more accurate calculation of the overall sentence sentiment from the
sentiment-bearing words, the sentiment shifters, and the sentence structure.
b) A need to design and implement a dataset with positive or negative financial news.
c) A need to de-noise noisy texts (those with spelling/grammatical mistakes,
missing/problematic punctuation and slang).
This work addresses these issues by introducing a new dataset and framework to assemble
financial text of single names and gauge their sentiment. Traditional machine learning
methods are first overviewed.
2.2. Machine Learning Methods
There are many classification systems that rely on machine learning methods, including:
k-Nearest Neighbors (simple and powerful, but expensive at test time, high variance,
non-linear); Naive Bayes (simple and very efficient, as its running time is linearly
proportional to the time needed to read in all the data); Support-Vector Machines
(relatively new, more powerful); vector space classification using centroids and
hyperplanes that split them (simple, linear discriminant classifier); and AdaBoost (based on
creating a highly accurate prediction rule using a weighted linear combination of other
classifiers). Many commercial systems use a mixture of methods. Naive Bayes and
SVMs are currently among the best performers for a number of classification tasks
including text data [44] [45] [46] [47]. This work uses the methods listed below.
2.3. Classification Based on PMI-IR Algorithm [10]
Point-wise mutual information (PMI) is a semantic word similarity measure between two
words $Word_1$ and $Word_2$, defined as
$$\mathrm{PMI}(Word_1, Word_2) = \log \frac{\Pr(Word_1, Word_2)}{\Pr(Word_1)\,\Pr(Word_2)} \qquad (1)$$
where $\Pr(Word_1, Word_2)$ is the probability that $Word_1$ and $Word_2$ occur together,
and $\Pr(Word_i)$, $i = 1, 2$, is estimated from the number of times that $Word_i$ appears
in the corpus. PMI, in other words, compares the probability of observing $Word_1$ and
$Word_2$ together with the probability of observing them independently.
$\mathrm{PMI}(Word_1, Word_2)$ is the amount of information that is acquired about the
presence of one of the words when the other is observed. It is equal to
zero if the two words $Word_1$ and $Word_2$ are statistically independent of each other, i.e.
$$\Pr(Word_1, Word_2) = \Pr(Word_1)\,\Pr(Word_2) \qquad (2)$$
Moreover, it is positive if they are positively correlated and negative if they are
negatively correlated. For example, the words $Word_1$ and $Word_2$ could be:
Table 1. Potential POS Pairs.
     Word1       Word2
1.   Adjective   Noun
2.   Adverb      Adjective
3.   Adjective   Adjective
4.   Noun        Adjective
5.   Adverb      Verb
Presented below is the unsupervised algorithm applied to movie reviews as introduced by
Turney [10]. Turney’s PMI-IR algorithm uses the words excellent and poor as seed words.
These seed words can be looked at as proxies for the category labels of “positive” or
“negative.”
Table 2. PMI-IR Algorithm.
Input: Text-review
Step 1: Identify phrases that contain adjectives or adverbs by using a part-of-speech tagger.
  Define a distance measure d(t1, t2), here PMI(Word1, Word2), between terms t1 (adjectives)
  and t2 (adverbs). Extract two consecutive words: one is an adjective or adverb, the other
  provides the context.
Step 2: Estimate the semantic orientation of each phrase based on its association with
  database positive and seven negative words by using
  $$\mathrm{SO}(phrase) = \mathrm{PMI}(phrase, \text{“positive”}) - \mathrm{PMI}(phrase, \text{“negative”}) \qquad (3)$$
  Note: Semantic orientation is positive when the phrase is more strongly associated with
  “excellent” and negative when the phrase is more strongly associated with “poor”.
Step 3: Calculate the average semantic orientation (SO) of the phrases.
Step 4: Classify the review as recommended if the average SO is positive, not recommended
  otherwise. If hits(phrase NEAR “excellent”) and hits(phrase NEAR “poor”) ≤ 4, then
  eliminate the phrase.
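The PMI definition in equation (1) and the scoring steps of Table 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: probabilities are estimated as raw counts divided by a corpus size, the logarithm is taken in base 2, a small constant `eps` is added to co-occurrence counts to avoid taking the logarithm of zero, and all counts in `PHRASES` are hypothetical numbers chosen for illustration.

```python
import math

def pmi(joint, count_a, count_b, total):
    """Equation (1), with each probability estimated as count / total.
    Log base 2 is an assumption; the text writes only 'log'."""
    p_joint = joint / total
    p_a = count_a / total
    p_b = count_b / total
    return math.log2(p_joint / (p_a * p_b))

def semantic_orientation(near_pos, near_neg, n_phrase, n_pos, n_neg, total, eps=0.01):
    """Equation (3): PMI with the positive seed minus PMI with the negative seed.
    eps is an assumed smoothing constant that keeps the logarithm defined."""
    return (pmi(near_pos + eps, n_phrase, n_pos, total)
            - pmi(near_neg + eps, n_phrase, n_neg, total))

def classify_review(phrases, n_pos, n_neg, total, min_hits=4):
    """Steps 3 and 4 of Table 2: average the SO of the phrases and classify.
    Each phrase is (hits near positive seed, hits near negative seed, phrase count)."""
    scores = []
    for near_pos, near_neg, n_phrase in phrases:
        # Eliminate phrases whose hit counts are both at or below the threshold.
        if near_pos <= min_hits and near_neg <= min_hits:
            continue
        scores.append(
            semantic_orientation(near_pos, near_neg, n_phrase, n_pos, n_neg, total)
        )
    if not scores:
        return None  # no phrase carried enough evidence
    average_so = sum(scores) / len(scores)
    return "recommended" if average_so > 0 else "not recommended"

# Hypothetical counts for three extracted phrases.
PHRASES = [(40, 5, 200), (3, 2, 50), (6, 30, 150)]
label = classify_review(PHRASES, n_pos=1000, n_neg=1000, total=100000)
print(label)
```

The second phrase is dropped by the elimination rule because both of its hit counts are at most 4, so only the first and third phrases contribute to the average.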
Bayesian methods provide the basis for probabilistic learning methods that use knowledge
about the prior probabilities of hypotheses and about the probability of
observing data given the hypothesis. The Naive Bayes classifier
is a Bayesian classifier for vector data (i.e. data with several attributes) that assumes
that attributes are independent given the class. The Bayesian classifier that uses the
Naive Bayes assumption and computes the MAP hypothesis is called the
Naive Bayes classifier. It uses Bayes’ Rule
$$P(h \mid d) = \frac{P(d \mid h)\,P(h)}{P(d)} \qquad (4)$$
where
$d$: data
$h$: hypothesis
$P(h)$: prior belief (probability of hypothesis $h$ before seeing any data)
$P(d \mid h)$: likelihood (probability of the data if the hypothesis $h$ is true)
$P(d) = \sum_h P(d \mid h)\,P(h)$: data evidence (marginal probability of the data)
$P(h \mid d)$: posterior (probability of the hypothesis $h$ after having seen the data $d$)
The key approach to this type of text categorization is to assign a given text $d$ to a
class in $H = \{h(1), h(2), \ldots, h(K)\}$:
The parameter $b/\|w\|$ defines the offset of the hyperplane from the origin along the
normal vector $w$.
The distance between the two margin hyperplanes ($w \cdot x - b = 1$ and
$w \cdot x - b = -1$) is:
$$\text{Margin} = \left| \frac{1}{\|w\|} - \left( -\frac{1}{\|w\|} \right) \right| = \frac{2}{\|w\|} \qquad (5)$$
To maximize this margin (that is, to minimize $\|w\|$) while preventing data points from
falling into the margin, the following constraints are used:
$$w \cdot x_i - b \le -1 \text{ for } x_i \text{ of the first class (corresponding, say, to positive news)}$$
$$w \cdot x_i - b \ge 1 \text{ for } x_i \text{ of the second class (corresponding to negative news)} \qquad (6)$$
With the class label $h_i = -1$ for the first class and $h_i = +1$ for the second, this can
be rewritten as:
$$h_i (w \cdot x_i - b) \ge 1 \text{ for all } i = 1, 2, \ldots, n \qquad (7)$$
Consequently, the optimization problem can be formulated as:
$$\text{Minimize } \|w\| \text{ subject to } h_i (w \cdot x_i - b) \ge 1, \text{ for any } i = 1, 2, \ldots, n \qquad (8)$$
It has been shown that the solution of the optimization problem can be expressed as a linear
combination of the training vectors:
$$w = \sum_{i=1}^{n} \alpha_i h_i x_i \qquad (9)$$
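Equations (5), (7) and (9) can be checked numerically on a toy problem. The sketch below is illustrative only: the two training points, their labels, the offset `b` and the coefficients `alphas` are hypothetical values picked by hand so that the constraints hold with equality, and no optimizer is run.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def margin(w):
    """Equation (5): distance 2 / ||w|| between the two margin hyperplanes."""
    return 2.0 / math.sqrt(dot(w, w))

def satisfies_constraints(w, b, xs, hs, tol=1e-9):
    """Equation (7): h_i (w . x_i - b) >= 1 for every training point."""
    return all(h * (dot(w, x) - b) >= 1.0 - tol for x, h in zip(xs, hs))

def weights_from_duals(alphas, hs, xs):
    """Equation (9): w as a linear combination of the training vectors."""
    dim = len(xs[0])
    return [sum(a * h * x[j] for a, h, x in zip(alphas, hs, xs)) for j in range(dim)]

# Hypothetical two-point training set: first class (h = -1), second class (h = +1).
xs = [(0.0, 0.0), (2.0, 2.0)]
hs = [-1, 1]
alphas = [0.25, 0.25]  # assumed coefficients, chosen by hand for this example
b = 1.0                # assumed offset

w = weights_from_duals(alphas, hs, xs)
print(w, margin(w), satisfies_constraints(w, b, xs, hs))
```

In this example the margin equals the distance between the two points, which is what one would expect when both points lie exactly on the margin hyperplanes.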
In the preprocessing step, stop-words (such as a, the, about, above, after, again, all,
alone, along, already, of, that, etc.) are filtered out as they do not carry information.
Also note that in the feature selection and extraction step, point-wise
mutual information (PMI) may be used as a feature.
4. Financial Sentiment Corpus
As noted above, there is no publicly offered dataset with positive or negative financial
news. This section presents an implementation of a dataset with positive and negative
financial news sourced from Seeking Alpha, where articles were selected in which authors
provide disclosure of either being or intending to be long or short a stock. Several factors
are considered in using Seeking Alpha to implement the dataset: historical and future data
perspectives, the diversity of the financial news user community, data integrity, and
presence of sentiment. The constructed database has 501 documents from January 2011 to
January 2014 covering 125 companies in the S&P 500. A breakdown of the companies is
provided below:
Six students, working in pairs, were asked to manually categorize the dataset into two sets
of categories: positive and negative, on both an article and sentence level. Articles and
sentences below a minimum consensus threshold were filtered out. The breakdown of this
categorization is summarized in the table below:
Table 4. Manual Categorization Breakdown.
Level      Positive   Negative   Total
Article    251        250        501
Sentence   2743       2346       5089
An illustration of a sample article in XML format is provided below. The rating is defined
as the median rating assigned by each of the three groups on a range of strong sell
to strong buy [-3,3].
<?xml version="1.0" encoding="UTF-8"?>
<response><result start="0" numFound="501" name="response"><doc>
<str name="twitter_title">Bully For BlackRock</str>
<str name="keywords">NYSE:BLK</str>
<str name="url">http://seekingalpha.com/article/1345881-bully-for-blackrock</str>
<arr name="ratings_man"><int>0</int><int>3</int><int>0</int><int>2</int><int>2</int><int>0</int><int>0</int><int>0</int><int>0</int><int>0</int><int>0</int><int>1</int><int>0</int><int>2</int><int>0</int><int>2</int><int>2</int><int>2</int><int>2</int><int>0</int><int>0</int><int>0</int><int>0</int></arr>
<int name="sentiment_man">3</int>
<arr name="sentences">
<str>Apr. 16, 2013 6:04 PM ET | About: BLK by: Bret Jensen I have owned and written about BlackRock ( BLK ) since October.</str>
<str>It is a core holding in my income portfolio, as it has a solid yield and has raised its payouts tremendously over the years -- even through the financial crisis.</str>
<str>The shares have gone from $187 to $257 in that time.</str>
<str>The company continues to show solid growth as its businesses rise along with the equity and credit markets.</str>
<str>Also, it has consistently beat earnings estimates.</str>
<str>The recently reported quarter was no exception.</str>
<str>Here are the quarterly earnings highlights: EPS came in at $3.65 a share, seven cents above consensus estimates.</str>
<str>Sales came in slightly above consensus led by iShares revenues, which were up more than 20% year over year.</str>
<str>AUM increased 7% year over year to $3.94 trillion.</str>
<str>Equity funds saw net inflows of over $33 billion.</str>
<str>Adjusted operating margins increased 140bps to 40%.</str>
<str>BlackRock is one of the largest investment managers in the world.</str>
<str>The firm provides its myriad services to institutional, intermediary, and individual investors.</str>
<str>Here are four reasons why BLK still has upside from $257 a share: Consensus earnings estimates for both FY 2013 and FY 2014 had consistently and significantly gone up before this earnings report.</str>
<str>FY 2014's projections are $1 a share above where they were 90 days ago.</str>
<str>I would look for further upward revisions after these quarterly results.</str>
<str>BlackRock is well positioned for the migration from mutual funds to ETFs.</str>
<str>It also should do well as cash starts to come back into the markets as the Fed continues to encourage investors to move into riskier assets.</str>
<str>Finally, it does not have a huge gold ETF like State Street (STT), which is seeing major outflows.</str>
<str>The company has now beat or met quarterly earnings estimates for 13 straight quarters (12 beats, one meet).</str>
<str>BlackRock is expected to grow revenues at a 10% CAGR over the next two years and is selling for just over 14x 2014's projected earnings.</str>
<str>BLK yields 2.6% and has quadrupled dividend payouts over the last six years or so.</str>
<str>The stock has a reasonable five-year projected PEG (1.25) for a dividend payer.</str>
</arr>
<arr name="disclosure_sa"><str>I am long BLK.</str></arr>
<arr name="author_sa"><str>Bret Jensen</str></arr>
</doc></result></response>
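A record in this XML layout could be read with Python's standard library as sketched below. The element and attribute names (`twitter_title`, `keywords`, `ratings_man`, `sentiment_man`, `sentences`, `disclosure_sa`) are taken from the sample article; the embedded `SAMPLE` string is an abbreviated three-sentence version of it, and pairing each entry of `ratings_man` with the sentence at the same position is an assumption based on the two arrays having equal length in the sample. The helper name `parse_article` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Abbreviated version of the sample article (first three sentences only).
SAMPLE = (
    '<response><result start="0" numFound="501" name="response"><doc>'
    '<str name="twitter_title">Bully For BlackRock</str>'
    '<str name="keywords">NYSE:BLK</str>'
    '<arr name="ratings_man"><int>0</int><int>3</int><int>0</int></arr>'
    '<int name="sentiment_man">3</int>'
    '<arr name="sentences">'
    '<str>Apr. 16, 2013 6:04 PM ET | About: BLK by: Bret Jensen I have owned '
    'and written about BlackRock ( BLK ) since October.</str>'
    '<str>It is a core holding in my income portfolio, as it has a solid yield '
    'and has raised its payouts tremendously over the years -- even through '
    'the financial crisis.</str>'
    '<str>The shares have gone from $187 to $257 in that time.</str>'
    '</arr>'
    '<arr name="disclosure_sa"><str>I am long BLK.</str></arr>'
    '</doc></result></response>'
)

def parse_article(doc):
    """Turn one <doc> element into a plain dictionary."""
    def values(name, cast=str):
        # Collect the children of the <arr> element with the given name.
        node = doc.find(f"arr[@name='{name}']")
        return [] if node is None else [cast(child.text) for child in node]

    sentences = values("sentences")
    ratings = values("ratings_man", int)
    return {
        "title": doc.findtext("str[@name='twitter_title']"),
        "ticker": doc.findtext("str[@name='keywords']"),
        "article_rating": int(doc.findtext("int[@name='sentiment_man']")),
        # Assumed alignment: the i-th rating belongs to the i-th sentence.
        "sentences": list(zip(sentences, ratings)),
        "disclosure": values("disclosure_sa"),
    }

root = ET.fromstring(SAMPLE)
records = [parse_article(doc) for doc in root.iter("doc")]
for sentence, rating in records[0]["sentences"]:
    print(rating, sentence[:50])
```

Keeping each sentence next to its rating makes it straightforward to build either the article-level or the sentence-level training set summarized in Table 4.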
References
[6] B. Liu, Handbook of Natural Language Processing, Boca Raton: CRC Press, Taylor and Francis Group, 2010.
[7] B. Pang, L. Lee and S. Vaithyanathan, "Thumbs up?: sentiment classification using machine learning techniques," in Proceedings of ACL, 2002.
[8] J. Wiebe, T. Wilson and C. Cardie, "Annotating expressions of opinions and emotions in language," Language Resources and Evaluation, vol. 39, no. 165, pp. 165-210, 2005.
[9] T. Wilson, J. Wiebe and P. Hoffmann, "Recognizing contextual polarity in phrase-level sentiment analysis," in Proceedings of HLT and EMNLP, 2005.
[10] P. Turney, "Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews," in Proceedings of the Association for Computational Linguistics, 2002.
[11] K. Dave, "Mining the Peanut Gallery: Opinion Extraction and Semantic Classification of Product Reviews," in WWW2003, 2004.
[12] P. Tetlock, M. Saar-Tsechansky and S. Macskassy, "More Than Words: Quantifying Language to Measure Firms' Fundamentals," Journal of Finance, vol. 63, no. 3, pp. 1437-1467, 2008.
[13] P. Tetlock, "Giving Content to Investor Sentiment: The Role of Media in the Stock Market," Journal of Finance, vol. 62, no. 3, pp. 1139-1168, 2007.
[14] P. Azar, "Sentiment Analysis in Financial News," Harvard College (Thesis), Cambridge, Massachusetts, 2009.
[15] S. Argamon-Engelson, M. Koppel and G. Avneri, "Style-based Text Categorization: What Newspaper Am I Reading?," AAAI, 1998.
[16] B. Kessler, G. Nunberg and H. Schütze, "Automatic Detection of Text Genre," in ACL, 1997.
[17] E. Spertus, "Smokey: Automatic recognition of hostile messages," in Proceedings of Innovative Applications of Artificial Intelligence, 1997.
[18] E. Fama, "Random Walks in Stock Market Prices," Financial Analysts Journal, vol. 21, no. 5, pp. 55-59, 1965.
[19] V. Hatzivassiloglou and K. McKeown, "Predicting the
[24] S. Durbin, D. Warner, J. Richter and Z. Gedeon, "Information Self-Service with a Knowledge Base That Learns," AI Magazine, vol. 23, no. 4, pp. 41-50, 2002.
[25] M. Efron, "Cultural orientations: Classifying subjective documents by cocitation analysis," in Proceedings of the AAAI Fall Symposium Series on Style and Meaning in Language, Art, Music, 2004.
[26] D. Inkpen, O. Feiguina and G. Hirst, "Generating more-positive and more-negative text," in Computing Attitude and Affect in Text: Theory and Applications, Dordrecht, The Netherlands, Springer, 2005, pp. 187-196.
[27] M. Gamon, "Sentiment classification on customer feedback data: noisy data, large feature vectors, and the role of linguistic analysis," in Proceedings of the 20th international conference on Computational Linguistics, 2004.
[28] J. Wiebe and E. Riloff, "Creating Subjective and Objective Sentence Classifiers from Unannotated Texts," in Computational Linguistics and Intelligent Text Processing, 2005.
[29] T. Wilson, J. Wiebe and P. Hoffmann, "Recognizing contextual polarity in phrase-level sentiment analysis," Computational Linguistics, vol. 35, no. 3, pp. 399-433, 2009.
[30] Y. Dang, Z. Yulei and H. Chen, "A lexicon enhanced method for sentiment classification: An experiment on online product reviews," IEEE Intelligent Systems, vol. 25, no. 4, pp. 46-53, 2010.
[31] B. Liu, Sentiment Analysis and Opinion Mining, Claypool Publishers, 2012.
[32] P. Tetlock, M. Saar-Tsechansky and S. Macskassy, "More than words: Quantifying Language to Measure Firms' Fundamentals," Journal of Finance, vol. 63, no. 3, pp. 1437-1467, 2008.
[33] G. Mishne, "Predicting Movie Sales from Blogger Sentiment," in Computational Approaches to Analysing Weblogs, 2006.
[34] J. Nofsinger, "Social Mood and Financial Economics," Journal of Behavioral Finance, vol. 6, no. 3, pp. 144-160, 2005.
[35] E. Gilbert and K. Karahalios, "Widespread Worry and the Stock Market," in Proceedings of the International, 2010.
[36] I. Bordino, S. Battiston, G. Caldarelli, M. Cristelli, A. Ukkonen and I. Weber, "Web Search Queries Can Predict Stock Market Volumes," PLoS ONE, vol. 7, no. 7, p. e40014, 2012.
[37] M. Thelwall, K. Buckley, G. Paltoglou, D. Cai and A. Kappas, "Sentiment strength detection in short informal text," Journal of the American Society for Information Science and Technology, vol. 61, no. 12, pp. 2544-2558, 2010.
[38] M. Nofer, Using Twitter to Predict the Stock Market: Where is the Mood Effect?, New York: Springer, 2015.
[39] J. Bollen, H. Mao and X. Zeng, "Twitter mood predicts the stock market," Journal of Computational Science, vol. 2, no. 1, pp. 1-8, 2011.
[40] E. J. Ruiz, V. Hristidis, C. Castillo, A. Gionis and A. Jaimes, Correlating financial time series with micro-blogging activity, ACM Press, 2012.
[41] J. Smailović, M. Grčar and M. Žnidaršič, "Sentiment analysis on tweets in a financial domain," in International Postgraduate School Students Conference, 2012.
[42] Y. Lu, M. Castellanos, U. Dayal and C. Zhai, "Automatic construction of a context-aware sentiment lexicon: an optimization," in Proceedings of WWW, 2011.
[43] R. Feldman, "Techniques and Applications for Sentiment Analysis," Communications of the ACM, vol. 56, no. 4, pp. 82-89, 2013.
[44] P. Domingos and M. Pazzani, "On the optimality of the simple Bayesian classifier under zero-one loss," Machine Learning, vol. 29, pp. 103-130, 1997.
[45] T. Joachims, "Text categorization with support vector machines: Learning with many relevant features," in Proceedings of the Tenth European Conference on Machine Learning, Berlin, 1998.
[46] T. Joachims, "Estimating the generalization performance of a SVM efficiently," LS VIII-Report, Universität Dortmund, Germany, 1999.
[47] B. Schölkopf and A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, MIT Press, 2002.
[48] V. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag, 1995.
[49] M. Nardo, M. Petracco and M. Naltsidis, "Walking Down Wall Street with a Tablet: A Survey of Stock Market Predictions Using the Web," Journal of Economic Surveys, vol. 30, no. 2, pp. 356-369, 2016.
[50] B. Agarwal and N. Mittal, "Machine Learning Approach for Sentiment Analysis," in Prominent Feature Extraction for Sentiment Analysis, Springer, 2015, pp. 21-45.
[51] M.-Y. Day and C.-C. Lee, "Deep learning for financial sentiment analysis on finance news providers," in IEEE/ACM International Conference, 2016.
[52] S. Das and A. Das, "Fusion with sentiment scores for market research," in Information Fusion International Conference, 2016.
[53] A. Akansu, S. Kulkarni and D. Malioutov, Financial Signal Processing and Machine Learning, John Wiley & Sons, 2016.
[54] D. D. Wu and D. L. Olson, "Financial Risk Forecast Using Machine Learning and Sentiment Analysis," in Enterprise Risk Management in Finance, Palgrave Macmillan, 2015, pp. 32-48.