NSTM: Real-Time Query-Driven News Overview Composition at Bloomberg
Joshua Bambrick¹, Minjie Xu¹, Andy Almonte¹, Igor Malioutov¹,
Guim Perarnau¹, Vittorio Selo¹, Iat Chong Chan²,*

¹Bloomberg, London, United Kingdom
²OakNorth, London, United Kingdom

¹{jbambrick7,mxu161,aalmonte2,imalioutov,gperarnau,vselo}@bloomberg.net
²[email protected]
Abstract

insurmountable challenge. For example, a reader searching on Bloomberg's system for news about the U.K. would find 10,000 articles on a typical day. Apple Inc., the world's most journalistically covered company, garners around 1,800 news articles a day.

We realized that a new kind of summarization engine was needed, one that would condense large volumes of news into short, easy to absorb points. The system would filter out noise and duplicates to identify and summarize key news about companies, countries or markets. When given a user query, Bloomberg's solution, Key News Themes (or NSTM), leverages state-of-the-art semantic clustering techniques and novel summarization methods to produce comprehensive, yet concise, digests to dramatically simplify the news consumption process. NSTM is available to hundreds of thousands of readers around the world and serves thousands of requests daily with sub-second latency. At ACL 2020, we will present a demo of NSTM.

1 Introduction

With a traditional system, they would search for news on the company and wade through many stories (307 in this case¹), often with duplicate information or unhelpful headlines, to slowly build up a full picture of what the key events were.

By contrast, using NSTM (Key News Themes), this same analyst can search for 'Amazon.com', over a given time horizon, and promptly receive a concise and comprehensive overview of the news, as shown in Fig. 1. We tackle the challenges involved with consuming vast quantities of news by leveraging modern techniques to semantically cluster stories, as well as innovative summarization methods to extract succinct, informational summaries for each cluster. A handful of key stories are then selected from each cluster. We define a (story cluster, summary, key stories) triple as one theme and an ordered list of themes as an overview.

NSTM works at web scale but responds to arbitrary user queries with sub-second latency. It is deployed to hundreds of thousands of users around the globe and serves thousands of requests per day.
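To make the theme and overview definitions above concrete, the following is a minimal, purely illustrative sketch of the data model; the class and field names are our own assumptions, not NSTM's internal schema.

```python
from dataclasses import dataclass
from typing import List

# Illustrative sketch of the data model described above; names are our own,
# not NSTM's production schema.

@dataclass
class Story:
    story_id: str
    headline: str
    body: str

@dataclass
class Theme:
    cluster: List[Story]      # the full story cluster
    summary: str              # concise (<= 50 character) summary of the cluster
    key_stories: List[Story]  # a few representative stories from the cluster

# An overview is an ordered (ranked) list of themes for a given query.
Overview = List[Theme]
```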
Figure 1: A query-based UI for NSTM showing two themes, with annotations highlighting the key stories, feedback buttons, source and publication date. The un-cropped screenshot is in Appendix C.
(up to 50 characters, or roughly 6 tokens) summary for each cluster. It needs to be short enough to be understandable to humans with a single glance, but also rich enough to retain critical details from a minimal 'who-does-what' stub, so the most popular noun phrase or entity alone will not suffice. Such conciseness also helps when screen space is limited (for context-driven applications or mobile devices).

From each cluster, NSTM must surface a few key stories to provide a sample of its contents. The clusters themselves should also be ranked to highlight the most important few in limited screen space.

Finally, the system must be fast. It may only take up to a few seconds for the slowest queries.

Main technical challenges: 1) There is no public dataset corresponding to this overview composition problem with all the requirements set above, so we were required to either define new (sub-)tasks and collect new annotations, or select techniques by intuition, implement them, and iterate on feedback; 2) Generating summaries which are simultaneously accurate, informational, fluent, and highly concise necessitates careful and innovative choices of summarization techniques; 3) Supporting arbitrary user searches in real-time places significant performance requirements on the system whilst also setting a high bar for its robustness.

3 Related Work

A comparable system is Google News' 'Full Coverage' feature², which groups stories from different sources, akin to our clustering approach. However, it doesn't offer summarization and its clustered view is unavailable for arbitrary search queries.

SUMMA (Liepins et al., 2017) is another comparable system which integrates a variety of NLP components and provides support for numerous media and languages, to simultaneously monitor several media broadcasts. SUMMA applies the online clustering algorithm by Aggarwal and Yu (2006) and the extractive summarization algorithm by Almeida and Martins (2013). In contrast to NSTM, SUMMA focuses on scenarios with continuous multimedia and multilingual data streams and produces much longer summaries.

² https://www.blog.google/products/news/new-google-news-ai-meets-human-intelligence/

4 Approach

4.1 Architecture

The functionality of NSTM can be formulated as: given a search query, generate a ranked list (overview) of the key themes, or (news cluster, summary, key stories) triples, that concisely represent the most important matching news events.

Fig. 2 depicts the system's architecture. The story ingestion service processes millions of published news stories each day, stores them in a search index, and applies online clustering to them. When a search query is submitted via a user interface (① in the diagram), the overview composition service retrieves matching stories and their associated online cluster IDs from the search index (②). The system then further clusters the retrieved online clusters into the final clusters, each corresponding to one theme (③). For each such cluster, the system extracts a concise summary and a handful of key stories to reflect the cluster's contents (④). This creates a set of themes, which NSTM ranks to create the final overview. Lastly, the system caches the overview for a limited time to support future reuse (⑤) before returning it to the UI (⑥).

Figure 2: The architecture of NSTM, comprising the story ingestion service (which indexes and clusters the real-time stream of news stories into the search index), the overview composition service, the cache, and the user interface. The digits indicate the order of execution whenever a new request is made.

4.2 News Search

The first step in the NSTM pipeline is to retrieve relevant news stories (① in Fig. 2), for which we leverage a customized in-house news search engine based on Apache Solr.³

³ http://lucene.apache.org/solr/

This supports searches based on keywords, metadata (such as news source
and time of ingestion), and tags generated during ingestion (such as topics, regions, securities, and people). For example, TOPIC:ECOM AND NOT COMPANY:AMZN⁴ will retrieve all news about 'E-commerce' but exclude Amazon.com.

⁴ This is Bloomberg's internal news search query syntax, which maps closely to the final query submitted to Solr.

NSTM uses Solr's facet functionality to surface the largest k online clusters (detailed in Sec. 4.3.2) in the search results, before returning n stories from each. This tiered approach offers better coverage and scalability than direct story retrieval.

4.3 Clustering

4.3.1 News Embedding and Similarity

At the core of any clustering system is a similarity metric. In NSTM, we define the similarity between two articles as the cosine similarity between their embeddings as computed by NVDM (Miao et al., 2016), i.e., τ(d₁, d₂) = 0.5(cos(z₁, z₂) + 1), where z ∈ R^n denotes the NVDM embedding. Our choice is motivated by two observations: 1) The generative model of NVDM is based on bag-of-words (BoW) and P(w|z) = σ(W^⊤ z), where σ is the softmax function, W ∈ R^{n×V} is the word embedding matrix in the decoder and V is the size of the vocabulary. This resembles the latent topic structure popularized by LDA (Blei et al., 2003), which has proven effective in capturing textual semantics. Additionally, the use of cosine similarities is naturally motivated by the fact that the generative model is directly defined by the dot-product between the story embedding (z) and a shared vocabulary embedding (W). 2) NVDM's Variational Autoencoder (VAE) (Kingma and Welling, 2014; Rezende et al., 2014) framework makes the inference procedure much simpler than LDA and it also supports decoder customizations. For example, it allows us to easily integrate the idea of introducing a learnable common background word distribution into the generative model (Arora et al., 2017). We trained the model on an internal corpus of 1.85M news articles, using a vocabulary of size about 200k and a latent dimension n of 128.

4.3.2 Clustering Stages

We divide clustering into two stages in the pipeline, 1) online incremental clustering at story ingestion time, and 2) hierarchical agglomerative clustering (HAC) at query time (③ in Fig. 2). The former is used to produce query-agnostic online clusters at a relatively low cost to handle the daily influx of millions of news stories. These clusters reduce the computational cost at query time. However, due to its online nature, over-fragmentation, among other quality issues, occurs in the resulting clusters. This necessitates further refinement at query time when an offline HAC step is performed on top of the retrieved online clusters. A similar, but more complicated, design was adopted in Vadrevu et al. (2011) for clustering real-time news search results.

At both stages, we compute the cluster embedding z_c ∈ R^n as the mean of all the story embeddings therein, and evaluate similarities between clusters (individual stories are taken as singleton clusters) using the metric τ defined in Sec. 4.3.1.

For online clustering, we apply an in-house implementation which uses a distributed pool of workers to reduce latency and increase throughput. It merges each incoming story with the closest cluster if the similarity is within a parameterized threshold and otherwise creates a new singleton cluster.

For HAC, we apply fastcluster⁵ (Müllner, 2013) to construct the dendrogram. We use complete linkage to encourage more congruent clusters and then form flat clusters by cutting the dendrogram at the same (height) threshold. To further reduce fragmentation where similar clusters are left un-clustered, we apply HAC twice recursively.

⁵ https://www.jstatsoft.org/article/view/v053i09

To find a reasonable similarity threshold, we manually annotated just over 1k pairs of news articles. Each annotator indicated whether they would expect to see the articles grouped together or not in an overview. We then selected the threshold which achieved the highest F1 score on this binary classification task, which was 0.86.

4.4 Summary Extraction

Clustering search results (Vadrevu et al., 2011) is a meaningful step towards creating a useful overview. With NSTM, we push this one step further by additionally generating a concise, yet still human-readable, summary for each cluster (④ in Fig. 2).

Due to the unique style of the summary explained in Sec. 2, the scarcity of training data makes it hard to train an end-to-end seq2seq (Sutskever et al., 2014) model, as is typical for abstractive summarization. Also, this technique would only offer limited control over the output. Hence, we opt for an extractive method, leveraging OpenIE (Banko et al., 2007) and a BERT-based (Devlin et al., 2019) sentence compressor (both illustrated in Fig. 3) to surface a pool of sub-sentence-level candidate summaries from the headline and the body, which are then scored by a ranker.

4.4.2 BERT-based Sentence Compression

In addition to the rule-based OpenIE system, we apply a Transfer Learning-based solution, using a novel in-house dataset specific to our sub-task. In particular, we model candidate summary extraction as a 'sentence compression' task (Filippova et al., 2015), where each story is split into sentences and tokens are classified as keep or delete to make each sentence shorter, while retaining the key message.

We oversaw the manual annotation of a dataset which maps sentences to compressed equivalents that correspond to summaries. When presented with a news story, annotators selected one sentence and deleted words to create a high quality summary. This rendered 10k annotations which we randomly partitioned into train (80%) and test (20%) sets.

The task is formulated as sequence tagging, whereby each sub-token (① in Fig. 3), defined using the BERT vocabulary, is classified as keep or delete (②). We implement this using a feedforward layer on top of a Bloomberg-internal pre-trained neural network, akin to the uncased English BERT-Base model, applying an adapted implementation.

To create a compression, we stitch sub-tokens labelled keep together (③). Lastly, we use postprocessing rules to improve formatting (④), such as titlecasing and fixing partial-entity deletion (where only some sub-tokens of a token/entity are deleted).
favorably than c′ for a given common article a, and the model s_θ(a, c) was then trained to match such preferences using pairwise margin loss, i.e., max(0, 1 − s_θ(a, c) + s_θ(a, c′)).

We considered a few models, including a parameter-free baseline which scores candidate-article pairs as the dot-product of their NVDM (Sec. 4.3.1) embeddings, i.e., s = z_a^⊤ z_c. We also considered this model's bilinear extension s = z_a^⊤ W z_c, where W is the learnable weight matrix. Lastly, we tried neural network models, such as DecAtt (Parikh et al., 2016). We evaluated these models on a held-out test set with metrics such as pairwise ranking accuracy and NDCG. We opted to productionize the baseline model, since it was the simplest and performed on par with the others.⁷

⁷ E.g., with NDCG@5, the (untrained) NVDM dot-product yields 0.61, while the bilinear model and DecAtt yield 0.64.

Because NVDM uses a bag-of-words model, this ranker ignores syntax entirely. We believe that its empirical success owes to both the well-formedness of the majority of the candidates and the averaging effect that amplifies the 'signal-noise ratio' when the scores are averaged over the cluster.

Empirically, this approach tends to surface 'informational' summaries, in contrast to headlines which are often 'sensational'. We posit that this is because high-ranked summaries must also be representative of story bodies, not just headlines.

4.4.4 Combining Summary Candidates

OpenIE and sentence compression offer distinct ways to extract candidates, and we experimented with each as the sole source of summary candidates in our pipeline. On the basis of ROUGE scores (Lin and Hovy, 2003; Lin, 2004) (details in Appendix B), the latter provides superior results.

However, in a production system which informs business decisions, we must consider factors which aren't readily captured by metrics which compare generated and 'gold' outputs. For example, changing a single word can reverse the meaning of a summary, with only a small change in such scores. Hence, we consider a range of pros and cons.

The sentence compression method is supervised and is trained to produce summaries which can take advantage of news-specific grammatical styles. However, the OpenIE system is much faster and offers greater interpretability and controllability.

Since the neural and symbolic systems provide different advantages, we apply both. This renders a diverse pool of candidate summaries from which the ranker's task is to select the best. At the pooling stage we also impose a length constraint of 50 characters and exclude any longer candidates.

4.5 Key Story Selection

As a sample from the full story cluster, NSTM selects an ordered list of key stories which are deemed to be representative. We select these using a heuristic based on intuition and client feedback.

Our approach is to re-cluster all stories in the cluster using HAC (see Sec. 4.3.2), to create a parameterized number of sub-clusters. For each sub-cluster, we select the story that has maximum average similarity τ (as per Sec. 4.3.1) to the other sub-cluster stories. This strategy is intended to select stories which represent each cluster's diversity.

We sort the key stories by sub-cluster size and time of ingestion, in that order of precedence.
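To make this heuristic concrete, the following is a minimal sketch of the sub-cluster-and-pick-representative step. It assumes pre-computed, L2-normalised story embeddings and the similarity from Sec. 4.3.1, and it uses SciPy's standard HAC routines rather than NSTM's internal implementation; the function name and parameters are illustrative only.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def select_key_stories(embeddings: np.ndarray, max_key_stories: int = 3) -> list[int]:
    """Illustrative sketch: pick one representative story per sub-cluster.

    `embeddings` holds one L2-normalised embedding per story in the cluster.
    We re-cluster with complete-linkage HAC, then, within each sub-cluster,
    choose the story with maximum average similarity to the other members.
    """
    n = len(embeddings)
    if n <= max_key_stories:
        return list(range(n))

    # Similarity as in Sec. 4.3.1: tau = 0.5 * (cos + 1), converted to a distance.
    sims = 0.5 * (embeddings @ embeddings.T + 1.0)
    dists = 1.0 - sims[np.triu_indices(n, k=1)]  # condensed distance matrix

    dendrogram = linkage(dists, method="complete")
    labels = fcluster(dendrogram, t=max_key_stories, criterion="maxclust")

    picks = []
    for label in np.unique(labels):
        members = np.where(labels == label)[0]
        # Average similarity of each member to the others in its sub-cluster
        # (subtracting the self-similarity of 1.0 on the diagonal).
        avg_sim = (sims[np.ix_(members, members)].sum(axis=1) - 1.0) / max(len(members) - 1, 1)
        picks.append((len(members), int(members[np.argmax(avg_sim)])))

    # Order representatives by sub-cluster size, largest first.
    return [idx for _, idx in sorted(picks, reverse=True)]
```

This sketch only sorts by sub-cluster size; the production heuristic additionally breaks ties by time of ingestion, as described above.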
4.6 Theme Ranking

We have described how (story cluster, summary, key stories) triples, or themes, are created. However, some themes are considered to be more important than others since they are more useful to readers. It is tricky to define this concept concretely but we apply proxy metrics in order to estimate an importance score for each theme. We rank themes by this score and, in order to save screen space, return only the top few ('key') themes as an overview.

The main factor considered in the importance score is the size of the story cluster – the larger the cluster, the larger the score. This heuristic corresponds to the observation that more important themes tend to be reported on more frequently. Additionally, we consider the entropy of the news sources in the cluster, which corresponds to the observation that more important themes are reported on by a larger number of publishers and reduces the impact of a source publishing duplicate stories.

4.7 Caching

Since many user requests are the same or use similar data, caching is useful to minimize response times. When NSTM receives a request, it checks whether there is a corresponding overview in the cache, and immediately returns it if so. 99.6% of requests hit the cache and 99% of requests are handled within 215ms.⁸ In the event of a cache miss, NSTM responds in a median time of 723ms.⁹

We apply two mechanisms to ensure cache freshness. Firstly, we preemptively invoke NSTM using requests that are likely to be queried by users (e.g., most read topics) and re-compose them from scratch at fixed intervals (e.g., every 30 min). Once computed, they are cached. The second mechanism is user-driven: every time a user requests an overview which is not cached, it will be created and added to the cache. The system will subsequently preemptively invoke NSTM using this request for a fixed period of time (e.g., 24 hours).

⁸ Computed for all requests over a 90-day period.
⁹ Computed for the top 50 searches over a 7-day period.

5 Demonstration

NSTM was deployed to our clients in 2019. Using the UI depicted in Fig. 1, users can find overviews for customized queries to help support their work. From this screen, the user can enter a search query using any combination of Boolean logic with tag- or keyword-based terms. They may also alter the period that the overview is calculated over (this UI offers 1 hour, 8 hour, 1 day, and 2 day options).

This interface also allows users to provide feedback via the 'thumb' icons or plain-text comments. Of several hundred per-overview feedback submissions, over three quarters have been positive.

Tables 1 and 2 show example theme summaries generated for the queries 'Facebook' and 'U.K.'. Note that the summaries are quite different from what has previously been studied by the NLP community (in terms of brevity and grammatical style) and that they accurately represent distinct events.

   Summary                                            Size
1  Facebook to Settle Recognition Privacy Lawsuit       90
2  Facebook Warns Revenue Growth Slowing                79
3  Facebook Stock Drops 7% Despite Earnings Beat        70
4  Facebook to Remove Coronavirus Misinformation        49
5  Mark Zuckerberg to Launch WhatsApp Payments          19

Table 1: Ranked theme summaries and cluster sizes for 'Facebook' (1,176 matching stories) from 31 Jan. 2020.

   Summary                                            Size
1  Britain to Leave the EU                             459
2  Bank of England Would Keep Interest Rate Unchanged  141
3  Sturgeon Demands Scottish Independence Vote          71
4  Pompeo in UK for Trade Talks                         45
5  Boris Johnson Hails 'Beginning' on Brexit Day        63

Table 2: Ranked theme summaries and cluster sizes for 'U.K.' (13,858 matching stories) from 31 Jan. 2020.

In addition to user-driven settings, NSTM can be used to supplement context-driven applications. One example, demonstrated in Appendix D, uses themes provided by NSTM to help explain why companies or topics are 'trending'.

6 Conclusion

We presented NSTM, a novel and production-ready system that composes concise and human-readable news overviews given arbitrary user search queries. NSTM is the first of its kind; it is query-driven, it offers unique news overviews which leverage clustering and succinct summarization, and it has been released to hundreds of thousands of users. We also demonstrated effective adoption of modern NLP techniques and advances in the design and implementation of the system, which we believe will be of interest to the community.

There are many open questions which we intend to research, such as whether autoregressivity in neural sentence compression can be exploited and how to compose themes over longer time periods.
References

Charu C. Aggarwal and Philip S. Yu. 2006. A framework for clustering massive text and categorical data streams. In Proceedings of the 2006 SIAM International Conference on Data Mining, pages 479–483. SIAM.

Miguel Almeida and André Martins. 2013. Fast and robust compressive summarization with dual decomposition and multi-task learning. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 196–206, Sofia, Bulgaria. Association for Computational Linguistics.

Sanjeev Arora, Yingyu Liang, and Tengyu Ma. 2017. A simple but tough-to-beat baseline for sentence embeddings. In Proceedings of the 5th International Conference on Learning Representations, ICLR'17. OpenReview.net.

Michele Banko, Michael J. Cafarella, Stephen Soderland, Matt Broadhead, and Oren Etzioni. 2007. Open information extraction from the web. In Proceedings of the 20th International Joint Conference on Artificial Intelligence, IJCAI'07, pages 2670–2676, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.

David M. Blei, Andrew Y. Ng, and Michael I. Jordan. 2003. Latent Dirichlet allocation. J. Mach. Learn. Res., 3:993–1022.

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota. Association for Computational Linguistics.

Katja Filippova, Enrique Alfonseca, Carlos A. Colmenares, Lukasz Kaiser, and Oriol Vinyals. 2015. Sentence compression by deletion with LSTMs. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 360–368, Lisbon, Portugal. Association for Computational Linguistics.

Diederik P. Kingma and Max Welling. 2014. Auto-encoding variational Bayes. In 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14-16, 2014, Conference Track Proceedings.

Eliyahu Kiperwasser and Yoav Goldberg. 2016. Simple and accurate dependency parsing using bidirectional LSTM feature representations. Transactions of the Association for Computational Linguistics, 4:313–327.

Renars Liepins, Ulrich Germann, Guntis Barzdins, Alexandra Birch, Steve Renals, Susanne Weber, Peggy van der Kreeft, Hervé Bourlard, João Prieto, Ondřej Klejch, Peter Bell, Alexandros Lazaridis, Alfonso Mendes, Sebastian Riedel, Mariana S. C. Almeida, Pedro Balage, Shay B. Cohen, Tomasz Dwojak, Philip N. Garner, Andreas Giefer, Marcin Junczys-Dowmunt, Hina Imran, David Nogueira, Ahmed Ali, Sebastião Miranda, Andrei Popescu-Belis, Lesly Miculicich Werlen, Nikos Papasarantopoulos, Abiola Obamuyide, Clive Jones, Fahim Dalvi, Andreas Vlachos, Yang Wang, Sibo Tong, Rico Sennrich, Nikolaos Pappas, Shashi Narayan, Marco Damonte, Nadir Durrani, Sameer Khurana, Ahmed Abdelali, Hassan Sajjad, Stephan Vogel, David Sheppey, Chris Hernon, and Jeff Mitchell. 2017. The SUMMA platform prototype. In Proceedings of the Software Demonstrations of the 15th Conference of the European Chapter of the Association for Computational Linguistics, pages 116–119, Valencia, Spain. Association for Computational Linguistics.

Chin-Yew Lin. 2004. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out, pages 74–81, Barcelona, Spain. Association for Computational Linguistics.

Chin-Yew Lin and Eduard Hovy. 2003. Automatic evaluation of summaries using n-gram co-occurrence statistics. In Proceedings of the 2003 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, pages 150–157.

Yishu Miao, Lei Yu, and Phil Blunsom. 2016. Neural variational inference for text processing. In Proceedings of the 33rd International Conference on Machine Learning - Volume 48, ICML'16, pages 1727–1736. JMLR.org.

Daniel Müllner. 2013. fastcluster: Fast hierarchical, agglomerative clustering routines for R and Python. Journal of Statistical Software, Articles, 53(9):1–18.

Joakim Nivre, Marie-Catherine De Marneffe, Filip Ginter, Yoav Goldberg, Jan Hajic, Christopher D. Manning, Ryan McDonald, Slav Petrov, Sampo Pyysalo, Natalia Silveira, et al. 2016. Universal Dependencies v1: A multilingual treebank collection. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 1659–1666.

Ankur Parikh, Oscar Täckström, Dipanjan Das, and Jakob Uszkoreit. 2016. A decomposable attention model for natural language inference. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 2249–2255, Austin, Texas. Association for Computational Linguistics.

Danilo Jimenez Rezende, Shakir Mohamed, and Daan Wierstra. 2014. Stochastic backpropagation and approximate inference in deep generative models. In Proceedings of the 31st International Conference on Machine Learning, ICML 2014, Beijing, China, 21-26 June 2014, pages 1278–1286.

Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, December 8-13 2014, Montreal, Quebec, Canada, pages 3104–3112.

Srinivas Vadrevu, Choon Hui Teo, Suju Rajan, Kunal Punera, Byron Dom, Alexander J. Smola, Yi Chang, and Zhaohui Zheng. 2011. Scalable clustering of news search results. In Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, WSDM'11, pages 675–684, New York, NY, USA. ACM.

Aaron Steven White, Drew Reisinger, Keisuke Sakaguchi, Tim Vieira, Sheng Zhang, Rachel Rudinger, Kyle Rawlins, and Benjamin Van Durme. 2016. Universal Decompositional Semantics on Universal Dependencies. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 1713–1723, Austin, Texas. Association for Computational Linguistics.
A Acknowledgements

This has been a multi-year project, involving contributions from many people at different stages. In particular, we thank Miles Osborne, Marco Ponza, Amanda Stent, Mohamed Yahya, Christoph Teichmann, Prabhanjan Kambadur, Umut Topkara, Ted Merz, Sam Brody, and Adrian Benton for reviewing and commenting on the manuscript; we further thank Adela Quinones, Shaun Waters, Mark Dimont, Ted Merz and other colleagues from the News Product group for helping to shape the vision of the system; we also thank José Abarca and his team for developing the user interface; we thank Hady Elsahar for helping to improve summary ranking during his internship; finally, we thank all colleagues (especially those in the Global Data department) who helped to produce high quality in-house annotations and all others who contributed valuable thoughts and time to this work.
B End-To-End Evaluation

We evaluate the end-to-end NSTM system when using the OpenIE (Sec. 4.4.1) and the BERT-based sentence compression (Sec. 4.4.2) algorithms as the sole source of candidate summaries. We also conducted one experiment where both were used to create a shared pool of candidates (as per Sec. 4.4.4).

We test the system end-to-end using the manually-annotated Single Document Summarization (SDS) test set described in Sec. 4.4.2. To implement SDS, our experimental setup assumes that only one story was returned by a search request (as per Sec. 4.2). We evaluate the output from each system with ROUGE (Lin and Hovy, 2003; Lin, 2004)¹⁰. The results are presented in Table 3.

¹⁰ https://github.com/google/seq2seq/blob/master/seq2seq/metrics/rouge.py
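As a rough illustration of this evaluation loop (not the exact script used above, which relies on the seq2seq ROUGE implementation referenced in the footnote), the following sketch scores system outputs against the gold SDS compressions using the open-source rouge-score package; the file name and field names are assumptions.

```python
import json
from rouge_score import rouge_scorer

# Illustrative sketch of the scoring loop. We assume a JSONL file in which
# each record holds a gold SDS compression and a system output; the field
# names ("gold", "system") are our own, not NSTM's internal format.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

totals = {"rouge1": 0.0, "rouge2": 0.0, "rougeL": 0.0}
count = 0
with open("sds_test_predictions.jsonl") as f:
    for line in f:
        record = json.loads(line)
        # score(target, prediction): gold compression first, system output second.
        scores = scorer.score(record["gold"], record["system"])
        for name in totals:
            totals[name] += scores[name].fmeasure
        count += 1

for name, total in totals.items():
    print(f"{name} F1: {total / count:.3f}")
```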
C Screenshots of a Query-Driven User Interface
Figure 4: Screenshot (taken on 29 January 2020) of a query-driven interface for NSTM showing the overview for
the company ‘Amazon.com’.
Figure 5: Screenshot (taken on 29 January 2020) of a query-driven interface for NSTM showing the overview for
the topic ‘Electric Vehicles’.
Figure 6: Screenshot (taken on 29 January 2020) of a query-driven interface for NSTM showing the overview for
the region ‘Canada’.
Figure 7: Screenshot (taken on 29 January 2020) of a query-driven interface for NSTM showing the overview for
a complex query, including a keyword.
D Screenshots of a Context-Driven User Interface
Figure 8: Screenshot (taken on 29 January 2020) of a context-driven application of NSTM. In the ‘Security’ column
are the companies that have seen the largest increase in news readership over the last day. Each entry in the ‘News
Summary’ column is the summary of the top theme provided by NSTM for the adjacent company.
Figure 9: Screenshot (taken on 29 January 2020) of a context-driven application of NSTM. In the ‘News Topic’
column are the topics that have seen the largest volume of news readership over the past 8 hours. Each entry in the
‘News Summary’ column is the summary of the top theme provided by NSTM for the adjacent topic.