A Survey of Automatic Text Summarization Progress
ABSTRACT With the evolution of the Internet and multimedia technology, the amount of text data has increased exponentially. This text volume is a precious source of information and knowledge that needs to be summarized efficiently. Text summarization is the task of reducing a source text to a compact variant while preserving its knowledge and actual meaning. Here we thoroughly investigate automatic text summarization (ATS) and summarize the widely recognized ATS architectures. This paper outlines extractive and abstractive text summarization technologies and provides a deep taxonomy of the ATS domain, spanning classical ATS algorithms to modern deep learning ATS architectures. The workflow and significance of each modern text summarization approach are reviewed, together with its limitations and potential remedies, covering feature extraction approaches, datasets, performance measurement techniques, and the challenges of the ATS domain. In addition, this paper concisely presents the past, present, and future research directions in the ATS domain.
INDEX TERMS Automatic Text Summarization, Feature Extraction, Summarization Methods, Performance Measurement Metrics, Challenges.
topic. Therefore, this study aims to assist academics and professionals in developing an idea of the evolution of ATS, its research progress, and future research directions in this topic. In addition, the apparent obstacles and limitations for future research in this field are also discussed in this paper.

Text summarization was invented by H. P. Luhn [1] in the 1950s and was used in the first commercial computer. Neural word embedding [2], Bag of Words (BoW) [3], and word2vec [4], together with modern deep learning approaches such as recurrent neural networks (RNN) [5] and long short-term memory (LSTM) [6], have brought significant progress to the ATS domain. The evolution of ATS from the 1950s to the present is reviewed in this study.

Text summarization processes from the 1970s to the early 2000s are considered traditional methods. Traditional text summarization processes require a deeper knowledge of the document to find the essential keywords. ATS has also become an appealing domain for its influential assistance in the study and expansion of automation [7]. The improvement behind modern ATS is achieved by following a standard structure: text summarization becomes more accurate and fluent when the text is trimmed and interpreted with a properly designed process.

As we investigate ATS in depth in this comprehensive survey, the study required a collection of scholarly research published between 1998 and 2021. We followed a systematic literature review (SLR) approach to complete the review. Kitchenham proposed this SLR approach [8], [9], which consists of three phases: planning, conducting, and reporting the review. The SLR approach tries to answer all possible questions that could arise while progressing in this research field. The goal of this research was to examine the findings of several essential research disciplines. The necessary materials for this research were assembled using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) workflow: in the identification stage, 498 papers were collected from various sources; in the scanning stage, duplicate and similar papers were removed; and in the eligibility testing stage, inappropriate and low-grade papers were filtered out. The PRISMA workflow for this survey is shown in Figure 1.

The overall contributions of this paper are given as follows:
• This article performs a systematic review of automatic text summarization, including the fundamental theories and evolutions.
• The survey includes the investigation of the existing datasets, feature extraction, and text summarization approaches.
• The study reviews the different methods, datasets, feature extraction, and summarization approaches; moreover, it explains the constraints and limitations of such methods.
• Subsequently, the study ends by distinguishing the current difficulties and challenges of ATS architectures, along with future research directions.

The remainder of this paper is organized as follows: the literature review of existing ATS surveys is given in Section II, the motivations and applications of ATS are described in Section III, the basic structure is provided in Section IV, the most commonly used datasets in ATS are described in Section V, the widely used pre-processing techniques are addressed in Section VI, the strategies for extracting features are described in Section VII, the main ATS approaches are discussed in Section VIII, and the algorithms are described in Section IX. The ATS approaches are reviewed in Section X, and the ATS performance measurement methods are discussed in Section XI. The ATS challenges with potential research objectives are addressed in Section XII. Finally, Section XIII concludes the paper.

II. LITERATURE REVIEW OF EXISTING ATS SURVEYS
We have investigated the existing surveys of the ATS domain, and a few of them are presented here to establish the significance of this paper. Most surveys covered the former methods and research on ATS; however, recent trends, applicability, effects, limitations, and challenges of ATS techniques were not present. Table 1 summarizes and compares the existing surveys on ATS.

Mishra et al. [12] reviewed studies from 2000 to 2013 and found methods such as hybrid statistical and ML approaches. The researchers did not include cognitive aspects or evaluations of the impact of ATS. Allahyari et al. [15] investigated different processes such as topic representation, frequency-driven, graph-based, and machine learning methods for ATS. This research only includes the frequently used strategies. El-Kassas et al. [17] described graph-based, fuzzy logic-based, concept-oriented, ML approaches, etc., with
their advantages or disadvantages. This research did not include abstractive or hybrid techniques. Saranyamol et al. [11] offered a thorough survey for analysts by introducing various aspects of ATS such as structure, strategies, datasets, evaluation metrics, etc. Gambhir et al. [16] attempted to analyze a hybrid approach combining two text summarization methods; this study missed many contemporary techniques. The research of Gholamrezazadeh et al. [10] represents a comprehensive and comparative study of extractive ATS methods of the last decade, and several multilingual approaches are also discussed. Andhale et al. [13] provided a taxonomy of text summarization methods and a variety of techniques; although the authors covered some time-consuming ATS processes, recent and more efficient methods such as machine learning were missed. Abualigah et al. [18] conducted research on how to handle multiple documents and massive web data for text summarization; the paper contains a comparative table with recent studies but without details. Bharti et al. [14] presented a survey of research papers based on automated keyword extraction methods and techniques; it covers ideas about the multiple databases that are used for document summarization.

III. MOTIVATION AND APPLICATION OF ATS
This study aims to provide an overview of current research in NLP and, more precisely, in ATS to accelerate knowledge about it. In addition, it supports the creation of new tools, methods, datasets, and resources that meet the needs of the research and industrial sectors. The advancement of NLP made automatic text summarization usable for regular text document summaries and sentiment analysis. Moreover, ATS promotes a versatile approach to research in various fields such as machine learning, natural language processing, cognitive science, and psychology. Drawing on multiple sources of information, ATS research discusses cutting-edge work and future directions in this exciting area. These collective findings are the motivation behind this research. An essential part of research on ATS is the application, which is presented in the following section.

Recently, ATS has been extensively employed in applications based on information retrieval, information extraction, question answering, text mining, and analytics. ATS also improves search engine capabilities through various applications, including news summarization, email summarization, and domain-specific summarization. The main applications of the ATS domain are presented below:
1 Books or Novel Summarization: ATS is used mainly to summarize long documents such as books, literature, or novels, as short documents are unsuitable for summarization. It is not easy to find context in short texts, whereas long documents are better summary material [19].
2 Social Posts or Tweet Summarization: Every day, millions of messages and posts are generated on social networking sites such as Facebook, Twitter, etc. Useful summaries of this valuable source of information can be achieved using ATS [20].
3 Sentiment Analysis (SA): The analysis of people's views, feelings, and judgments regarding events and situations is known as sentiment analysis. SA classifies emotions, and mostly opinions from product reviews, as "Positive" or "Negative", for example using fuzzy logic. ATS is quite helpful for market analysts in summarizing the feelings or thoughts of hundreds of people [21].
4 News Summarization: ATS helps summarize news from many websites, such as CNN and other prominent news portals. ATS extracts the primary emphasis point of the story in a newspaper, which is sometimes used as the story's headline [22].
5 Email Summarization: Email communications are unstructured and not usually syntactically well-formed domains for summarization. ATS usually extracts noun phrases and generates a summary of email messages using linguistic methods and machine learning algorithms [23].
6 Legal Documents Summarization: ATS discovers relevant prior instances based on legal questions and rhetorical
TABLE 2. Popular datasets used in the text summarization domain.

| Dataset Name | Number of Documents / Sentences | Language | Used in | URL |
| New Taiwan Weekly | 2738 sentences | English | [41]–[44] | http://www.newtaiwan.com.tw |
| Annotated English Gigaword | 9,876,086 documents | English | [16], [45]–[50] | https://catalog.ldc.upenn.edu/LDC2011T07 |
| DUC 2001 | 600 sentences | English | [51]–[56] | https://www-nlpir.nist.gov/projects/duc/data.html |
| DUC 2002 | 600 sentences | English | [57]–[62] | https://www-nlpir.nist.gov/projects/duc/data.html |
| DUC 2003 | 600 sentences | English | [51], [55], [58], [63]–[67] | https://www-nlpir.nist.gov/projects/duc/data.html |
| DUC 2004 | 1000 sentences | Arabic, English | [68]–[74] | https://www-nlpir.nist.gov/projects/duc/data.html |
| DUC 2005 | 1600 sentences | English | [75]–[78] | https://www-nlpir.nist.gov/projects/duc/data.html |
| DUC 2006 | 1250 sentences | English | [77], [79]–[83] | https://www-nlpir.nist.gov/projects/duc/data.html |
| DUC 2007 [84] | 250 sentences | English | [85]–[89] | https://www-nlpir.nist.gov/projects/duc/data.html |
| EASC | 153 documents | Arabic | [6], [90]–[92] | https://www.lancaster.ac.uk/staff/elhaj/corpora.htm |
| TeMario corpus | 1000 sentences | Brazilian Portuguese, Spanish | [93]–[97] | https://www.linguateca.pt/Repositorio/TeMario/ |
| Enron email dataset | 619,446 messages | English | [32], [33], [98]–[101] | https://github.com/deepmind/rc-data/ |
| CNN/Daily Mail | One million news | English | [102]–[107] | https://github.com/abisee/cnn-dailymail |
| Wikihow [108] | 200k | English | [109] | https://wikiHow.com |
| SUMMAC dataset [110] | 800,000 electronic documents | English | [15], [111]–[116] | http://www-nlpir.nist.gov/related_projects/tipster_summac/cmplg-xml.tar.gz |
| SummBank | 400 sentences | Chinese, English | [17], [117]–[119] | https://catalog.ldc.upenn.edu/LDC2003T16 |
| Opinosis [120] | 575 sentences | English | [86], [121], [122] | http://kavita-ganesan.com/opinosis-opinion-dataset/ |
| TAC 2008 | 960 sentences | English | [123]–[128] | https://tac.nist.gov/data/index.html |
| TAC 2009 | 880 sentences | Multi-media | [129]–[133] | https://tac.nist.gov/data/index.html |
| TAC 2010 | 920 sentences | English | [131], [134]–[138] | https://tac.nist.gov/data/index.html |
| TAC 2011 | 880 sentences | English | [92], [131], [135], [139]–[142] | https://tac.nist.gov/data/index.html |
| 2013 TREC | 1 billion | English | [143]–[147] | https://trec.nist.gov/data.html |
| 2014 TREC | 1 billion | English | [116], [146], [148], [149] | https://trec.nist.gov/data.html |
| 2015 TREC | 38 GB | English | [150]–[152] | https://trec.nist.gov/data.html |
| LCSTS [153] | 2,400,591 | Chinese | [154]–[158] | http://icrc.hitsz.edu.cn/Article/show/139.html |
| Blog Summarization Dataset | 350 sentences | English | [15], [159]–[162] | http://cosmicvariance.com, http://blogs.msdn.com/ie/ |
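For illustration, several of the corpora in Table 2 are also redistributed in machine-readable form and can be loaded with a few lines of code. The sketch below assumes the third-party Hugging Face datasets package and its mirror of the CNN/Daily Mail corpus listed above; the package, the configuration name, and the field names are assumptions about that distribution, not part of the original release.

```python
# Minimal sketch: loading article/summary pairs for ATS experiments.
# Assumes `pip install datasets`; the "cnn_dailymail" identifier, the "3.0.0"
# configuration, and the field names follow the Hugging Face hub mirror of the
# corpus, which is an assumption rather than part of the original release.
from datasets import load_dataset

data = load_dataset("cnn_dailymail", "3.0.0", split="validation")

example = data[0]
article = example["article"]        # source document
reference = example["highlights"]   # human-written reference summary

print(article[:300])
print("---")
print(reference)
```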
tion of text pieces, articles, parts of newspapers, etc. These datasets are used in their respective sectors; for example, newspaper datasets are used only for newspaper summarization purposes, as they share the same pattern. In addition, these datasets can be classified based on their sources. Text summarization datasets have four primary source types: newspapers, articles or blogs, reviews, and emails or messages. Newspaper-source datasets are the TeMario corpus, CNN News, and the Daily Mail dataset; article- or blog-type datasets are EASC, LCSTS, and Wikihow; review-source datasets are New Taiwan Weekly and Opinosis; email- or message-source datasets are SKE and the Enron dataset. The most popular datasets in ATS are presented in Table 2.

VI. PRE-PROCESSING TECHNIQUES IN ATS
Several pre-processing steps are performed to clean noisy and unfiltered text. Erroneous messages and chats, including slang or trash phrases, are known as "noisy" and "unfiltered" text. The approaches mentioned below are some of the most often utilized pre-processing procedures (an illustrative code sketch follows the list):
1) Parts Of Speech (POS) Tagging: The technique of grouping or organizing text words according to speech categories such as nouns, verbs, adverbs, adjectives, etc., is known as POS tagging [163].
2) Stop Word Filtering: Based on the context, stop words are screened out either before or after textual analysis. "A", "an", and "by" are examples of stop words that can be identified and eliminated from plain text [164].
3) Stemming: Stemming reduces inflections and derivative forms of a set of words to their primary or root forms. By using linguistic strategies such as affixation, text stemming transforms words to account for different word forms [164].
4) Named Entity Recognition (NER): Words in the input text are recognized as names of entities (i.e., person names, location names, company names, etc.) [165].
5) Tokenization: Tokenization is a text pre-processing technique that divides text flows into tokens, which can be words, phrases, symbols, or other meaningful pieces. The goal of this technique is to examine the words in a document [166], [167].
6) Capitalization: Diverse capitalization across documents can be problematic and therefore requires converting every letter in a document into lowercase. All text and document words are then merged into a single feature space using this method [168].
7) Slang and Abbreviation: Slang and abbreviations are two different types of text anomalies that are addressed in the pre-processing stage. For example, "SVM" is an abbreviation of support vector machine [169]; an acronym is a shortened form of a word or phrase made up mainly of the first letters of its terms.
8) Noise Removal: Most textual data contain many additional characters, such as punctuation and special characters. While important punctuation and special characters are required for human interpretation of documents, they can cause problems for classification algorithms [170].
9) Spelling Correction: Spelling correction is an optional step in the pre-processing process. Typos are common in texts and documents, particularly in online media text datasets (e.g., Twitter) [171].
10) Lemmatization: The process of changing a word's suffix to a new one, or eliminating a word's suffix, to obtain the basic word form (lemma) is known as lemmatization. Its main application area is natural language processing [172], [173].
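As a rough illustration of how several of these steps are typically chained in practice, the sketch below tokenizes a sentence, removes punctuation and stop words, and applies POS tagging, stemming, and lemmatization. It assumes the NLTK toolkit and its standard English resources; neither the library nor this exact pipeline is prescribed by the surveyed papers.

```python
# Minimal pre-processing sketch assuming NLTK (pip install nltk) and its
# standard English resources: nltk.download("punkt"), nltk.download("stopwords"),
# nltk.download("wordnet"), nltk.download("averaged_perceptron_tagger").
import string

import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer

def preprocess(text):
    tokens = nltk.word_tokenize(text.lower())                    # tokenization + case folding
    tokens = [t for t in tokens if t not in string.punctuation]  # simple noise removal
    stop = set(stopwords.words("english"))
    tokens = [t for t in tokens if t not in stop]                # stop-word filtering
    tagged = nltk.pos_tag(tokens)                                # POS tagging
    stemmer, lemmatizer = PorterStemmer(), WordNetLemmatizer()
    stems = [stemmer.stem(t) for t in tokens]                    # stemming
    lemmas = [lemmatizer.lemmatize(t) for t in tokens]           # lemmatization
    return tokens, tagged, stems, lemmas

print(preprocess("The studies were summarized automatically by the systems."))
```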
VII. FEATURE EXTRACTION IN ATS
Feature extraction is a technique for discovering topic sentences and essential data traits or attributes from the source documents. ATS follows two phases to locate the important sentences in the text: the extraction of features and the text representation approach. This section describes the most often used extraction features and text representation approaches for selecting sentences for text summarization.

A. FEATURES
Collecting the essential features is the first phase of the feature extraction process. It is necessary to represent the sentences as vectors, or to score them, in order to find the vital sentences in a document. Some features are used as attributes to characterize the text for this task. The most prevalent features for calculating the score of a sentence, indicating the degree to which it belongs to a summary, are given below, followed by a brief illustrative scoring sketch:
1) Term Frequency (TF): The TF metric is used to determine the importance of terms in a single document [174]. As one of the most fundamental properties of ATS, it is commonly employed to represent a word's weight.
2) Term Frequency-Inverse Sentence Frequency (TF-ISF): According to the text summarization literature, this is among the most relevant feature extraction approaches; it measures the term frequency-inverse sentence frequency amongst the sentences in all documents [175]. The weights, which seem to be reasonable indications of meaningful sentences, are generated using this method, and calculating them is quick and straightforward.
3) Position Feature: It is usually considered that the beginning and last sentences provide more information about the document, so such sentences have a better chance of being included in the summary. The feature's binary or regressive score value can lie anywhere in [0, 1] [176].
4) Length Feature: A sentence's length can indicate whether it is summary-worthy, although it may be wrong to assume that a sentence is worthy of mention based on its length alone. Compared to the size of other sentences in the source material, very long and comparatively short sentences are usually not included in the summary [177].
5) Sentence–Sentence Similarity: The resemblance of a query sentence to other sentences in the text may be helpful for summarization. This feature extraction process can be performed in various ways [178].
6) Title Feature (Tif): Sentences containing terms from the headline may suggest the document's theme and are more likely to be included in the summary [179].
7) Phrasal Information (PI): The proportion of phrases is always helpful in summarizing. A collection of phrases P includes adjective phrases (ADJP), noun phrases (NP), prepositions (PPM), and verbal phrases (VP) [180].
8) Title Similarity (TS): A sentence receives a good grade if it has the most terms in common with the title. Title similarity can be determined from the number of words in a sentence that appear in the title and the total number of words [116].
9) Sentence Position (SP): This feature determines where a sentence appears in the text. The importance of a sentence is decided by where it appears in the text, for example whether it is among the opening five sentences of a paragraph [177].
10) Thematic Word (TW): This feature is associated with domain-specific phrases that frequently appear in a text and are most likely relevant to the document's topic. The score is calculated by comparing the number of thematic words in the sentence to the maximum number of thematic terms in a sentence [179].
11) Numerical Data (ND): A statement incorporating numerical data is generally crucial and is likely to be found in the summary of a document. The score is calculated by dividing the amount of numerical data in a sentence by the length of the sentence [116].
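As a rough, purely illustrative sketch of how a few of the above features might be computed and combined into a single sentence score, consider the following; the normalizations and the equal weighting are assumptions made for demonstration and do not correspond to any one surveyed method.

```python
# Illustrative sentence scoring with a few surface features: term frequency,
# sentence position, title similarity, sentence length, and numerical data.
# The normalizations and equal weights are assumptions for demonstration only.
from collections import Counter

def sentence_scores(sentences, title):
    all_words = [w.lower() for s in sentences for w in s.split()]
    tf = Counter(all_words)
    max_tf = max(tf.values(), default=1)
    title_words = set(title.lower().split())
    max_len = max(len(s.split()) for s in sentences)
    scores = []
    for i, s in enumerate(sentences):
        toks = [w.lower() for w in s.split()]
        n = max(len(toks), 1)
        f_tf = sum(tf[w] for w in toks) / (n * max_tf)                      # term frequency
        f_pos = 1.0 - i / max(len(sentences) - 1, 1)                        # position in document
        f_title = len(title_words & set(toks)) / max(len(title_words), 1)  # title similarity
        f_len = len(toks) / max_len                                         # length feature
        f_num = sum(w.strip(".,").isdigit() for w in toks) / n              # numerical data
        scores.append((f_tf + f_pos + f_title + f_len + f_num) / 5.0)
    return scores

sents = ["Automatic text summarization was surveyed in 2021.",
         "It condenses long documents into short summaries.",
         "The weather was pleasant that day."]
print(sentence_scores(sents, "Automatic text summarization"))
```

Sentences with the highest combined scores would then be candidates for an extractive summary.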
B. TEXT REPRESENTATION
Text representation models are used to represent the input documents in a more suitable form. In NLP, text representation approaches translate words into numbers so that computers can comprehend and decode patterns within a language. Generally, these approaches develop a connection between the chosen phrase and the context words from the document. Some popular text representation methods, such as bag-of-words, n-gram, and word embedding, are discussed below:
1) N-gram: N-gram is an ideal approach for multi-language operations because it does not require any linguistic
preparation. An n-gram is a collection of words or characters with N components. This model is simple to create, and the text can be represented by a vector that is usually of a reasonable size. Unigrams, bigrams, trigrams, quad-grams, and other n-grams comprise a set of text N-grams [181], [182], [183], [184], [185], [63]. The n-gram model has some limitations: in principle, the greater the N, the better the model, but larger models require a great deal of processing and heavy computing power and memory. N-grams are also a sparse representation of language, as the model is based on the likelihood of terms co-occurring; all words that are not present in the training corpus are given a probability of zero.
2) Bag of Words (BoW): The most primitive sort of numerical text representation is the bag-of-words model [3]. A phrase, or even a term itself, can be expressed as a bag-of-words vector [65]; in a text document, it is a shortened and simplified rendition of the substance of a sentence. Computer vision, NLP, Bayesian spam filters, document categorization, and information retrieval using machine learning are all areas where the BoW technique is used. [186], [101], [187], and [188] are papers in which BoW feature extraction approaches are used. Some issues are related to BoW: if new phrases include new words, the vocabulary expands, as does the length of the vectors, and the vectors contain a significant number of elements.
3) Term Frequency-Inverse Document Frequency (TF-IDF): Term frequency (TF) measures how frequently a term appears in a text, whereas inverse document frequency (IDF) measures how important the word is. The IDF value is needed because merely computing the TF is not sufficient to comprehend the significance of words. The inverse document frequency was developed by K. Sparck Jones [189] as a strategy to reduce the impact of implicitly popular terms in the corpus. The combination of TF and IDF is called term frequency-inverse document frequency (TF-IDF). However, TF-IDF has several drawbacks: it directly calculates the resemblance of texts in the word-count space, which might be slow for large vocabularies, and it presumes that the counts of different terms give independent evidence of similarity. [190], [191], [192], and [175] are examples where TF-IDF is proposed as the feature extraction approach.
4) Word Embedding: Word embedding is a type of feature learning in which each word or phrase in a lexicon is mapped to an N-dimensional vector of real values. Various word embedding algorithms have been proposed to convert n-grams into comprehensible inputs for machine learning systems. This study focuses on Word2Vec, GloVe, and FastText, three of the most widely used deep learning methods for word embedding [2], [193].
• Word2Vec: Word2Vec [4] is a technique for creating embeddings. Skip-gram and continuous bag of words (CBOW) are two approaches (both utilizing neural networks) to obtain it. The CBOW technique uses each word's context as an input and attempts to predict the word that corresponds to it. Skip-gram aims to optimize the classification of a word based on another word in the same phrase rather than predicting the current word based on its context [2]. Several articles focused on Word2Vec, and examples can be seen in [50], [101], [194], [195], [196], [197].
• Global Vectors for Word Representation (GloVe): GloVe [198] is another robust word-embedding approach that has been utilized for text categorization. This method is comparable to the Word2Vec process: each word is represented by a high-dimensional vector and trained using the surrounding words over a large corpus. [103], [199], [200], [201] are articles in which the GloVe word embedding approach was used.
• FastText: Several alternative word-embedding representations disregard the morphology of words [202]. By proposing a new word embedding approach called FastText, the Facebook AI Research team introduced a unique solution to tackle this issue. [203], [204], [205] are papers in which the FastText word embedding was used.
This section covered the main feature extraction methods. The approaches implemented in ATS over the years are detailed in the following section.
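To make the difference between the bag-of-words and TF-IDF representations described above concrete, the sketch below builds both over the same toy corpus; it assumes the scikit-learn library, and the example sentences are invented for illustration.

```python
# Bag-of-words counts vs. TF-IDF weights over the same toy corpus,
# assuming scikit-learn (pip install scikit-learn).
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

corpus = [
    "automatic text summarization reduces long text",
    "extractive summarization selects important sentences",
    "abstractive summarization generates new sentences",
]

bow = CountVectorizer()
X_bow = bow.fit_transform(corpus)        # raw term counts (bag of words)

tfidf = TfidfVectorizer()
X_tfidf = tfidf.fit_transform(corpus)    # counts reweighted by inverse document frequency

print(bow.get_feature_names_out())
print(X_bow.toarray())
print(X_tfidf.toarray().round(2))
```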
VIII. AUTOMATIC TEXT SUMMARIZATION APPROACHES
Generally, ATS is a complex and time-consuming operation that often lacks good results because computers lack a proper understanding of human language. Researchers have tried to extract better performance and standard classifications for summary texts. Text summarization approaches vary based on the number of input documents (single or multiple), the objective (generic, domain-specific, or query-based), and the type of output. The following sections cover the output-based analysis, which is divided into two classes, extractive and abstractive.

A. EXTRACTIVE TEXT SUMMARIZATION
The extractive text summarization method aims to identify words and sentences in a text and use them effectively to create a summary [22]. This requires the selection of sentences from the original document based on their importance. These important sentences are then used to replicate the essential elements of the text word for word, resulting in a subset of the original document's phrases. The foundation consists of three independent tasks [152]:
1) Splitting the source document into sentences and then creating an intermediate representation of the text that highlights the task. Intermediate representation has two main types: indicator representation and topic representation [15].
2) Assigning a score to each sentence to specify its importance after the representation is created. Topic representation scores on
document [152]. These retrieved concepts are calculated, and the sentences are scored based on importance. Nevertheless, this method and fuzzy logic have the same limitations, but fuzzy logic stands out as it handles ambiguous situations better. Ramanathan et al. [216] proposed a method that employs a sentence-concept bipartite graph structure to generate summaries derived from Wikipedia sentences. More examples of concept-based summarization that retrieve textual concepts from an external knowledge base can be found in [217], [218].

3) Latent Semantic Analysis (LSA) Method
Latent semantic analysis (LSA) is an algebraic-statistical method for extracting the hidden semantic structures of sentences and phrases [219]. LSA is an unsupervised learning technique that derives information and similar words from the input documents. The significance of this method is that no outside training or template is necessary to find similar words that appear in separate sentences [220]. However, the LSA method has some limitations, such as not analyzing word order, syntactic relations, or morphology. In addition, it relies solely on the information contained in the input document rather than on outside knowledge. Finally, limitations such as performance deterioration on inhomogeneous datasets count against this method. Nevertheless, it works quite well for the semantic summarization of texts. [221], [222], [42], [54] are other examples of the LSA method applied to text summarization tasks.

B. SUPERVISED LEARNING METHODS
Supervised learning methods are sentence-level classifiers that learn to distinguish between summary and non-summary sentences [152]. From a collection of documents and human-generated summaries, they can learn the characteristics of the sentences included in the summary. However, they have significant disadvantages: the reference summaries must be created manually, and more labeled training samples are required for classification.

1) Machine Learning (ML) Method
The machine learning method is used to classify sentences into summary or non-summary classes using training data. These methods are applied when multiple document copies require extractive summaries. Notably, in this case, each document's sentences are represented as vectors. Primarily, machine learning algorithms are implemented on a set of trainable documents [174]. A collection of training documents is fed as input in the training phase and classified based on the weight of each sentence. In most cases, a simple regression model works better than classifiers but still requires an extensively labeled dataset. This data forms the only layer in a classical machine learning algorithm, whereas a neural network model has multiple layers; that is why neural network models are becoming more usable and preferred for ATS. In addition, the ML methods use some other standard pre-processing and information retrieval algorithms, such as stop-word removal, case folding, and stemming. The stemming algorithm and the concept of sentences represented as vectors were proposed by Porter [223] and [224], respectively. Trainable machine learning algorithms, such as C4.5 or naive Bayes [225], are mostly used, where these algorithms are learned on a training set and tested on a separate test set. Several studies have focused on machine learning-based summarization tasks, and examples can be seen in [174], [226], [227], [12].

2) Neural Network (NN) Based Method
A neural network approach [228] uses a three-layered feed-forward network that learns the features of sentences during model training. The feature matching phase is significant, and the relationships between the characteristics are identified in several steps. Removing infrequent features and combining frequent elements, followed by sentence ranking, are the steps used to identify meaningful sentences. The neural network-based method can also be trained according to a human's style or requirements, as the network learns from its training data. With multiple layers and an increasing number of hidden layers, NN algorithms perform better than ML algorithms as an advanced version of ML. In a framework developed in [229], the RankNet technique also uses neural networks to classify the relevant sentences in the text automatically; it incorporates a two-layer neural network with backpropagation, which is trained using the RankNet algorithm. An early TextSum system architecture including text preparation, keyword extraction, and summary creation was proposed in [230]; the system pre-processes the source document using two methods, stop-word removal and stemming. [158], [59], [231], [232], and [176] are other studies in which a neural network was established for summarizing source documents.

3) Conditional Random Fields (CRF) Method
Conditional random fields are statistical modeling techniques based on machine learning that provide structured prediction [233]. CRF uses non-negative matrix factorization (NMF) approaches to extract the correct features; the proper elements are then used to define the introductory sentences of the document. CRF's main benefit is classifying suitable characteristics by offering a more precise representation of sentences and sections. A vital problem of this technique is that it is domain-specific, which necessitates an external domain-specific framework for the training phase. This methodology cannot be generically employed on any text without first creating a domain framework, which is time-consuming; therefore, ML and NN-based algorithms are still a better option. Some studies have been conducted on conditional random field-based methods, and examples can be seen in [234], [235], [236], and [237].

Other methods have been described in various studies of the extractive approach to text summarization. These methods include optimization-based, statistical, topic-based, and sentence centrality or clustering-based methods. Researchers utilized a genetic algorithm to
calculate the optimal weights in the optimization method, although it suffers from high computational time and cost, and the number of iterations required must be defined. The topic-based method concentrates on the topics in the input text. The limitation of the topic-based approach is that a sentence will not appear in the summary if its score is not the highest, which affects the quality of the generated summary [238]. Sentence centrality or clustering-based methods handle repeated sentences and are suitable for multi-document summarization, grouping different sentences on the same topic [239]. However, they require prior specification of the number of clusters, and redundancy removal techniques are required for similar sentences [240].

In an abstractive approach, the summary may include new language that is not seen in the main text, which leads to paraphrasing. Language generation and compression strategies are required to generate abstractive summaries. To generate better abstractive summaries, abstractive text summarization is further divided into two categories, structured and semantic. A brief discussion of structured-based methods, semantic-based methods, and their subclasses is given below.

C. STRUCTURED BASED METHODS
In an abstractive summary, the source document is summarized with newly constructed sentences. In the structure-based method, phrases from the source documents are interpreted in a specified structure without losing their meaning. Structure-based approaches mainly rely on preset forms and spatial reasoning schemas, such as template, tree-based, ontology-based, and rule-based structures.

1) Tree-Based Method
The tree-based method recognizes sentences that share common knowledge and facts and then mixes them to provide an abstractive summary. This tree-like structure is called tree linearization [241] and is derived from many dependency trees; dependency trees are a representation of the source text of a document. The tree-based model helps process multiple documents and identify the common information using a syntactic tree. These methods also produce less redundant summaries, but they cannot detect the relationships between sentences without considering the context, so they may overlook significant phrases in the text. Another issue with this method is its continuous focus on syntax rather than semantics. Even with these issues, this model stands out among structure-based methods because of its fluency in summarization.

2) Template-Based Method
In the template-based method, the topic or content is extracted into candidate phrases and speakers by finding similarities with a template space [242]. The template-based method is used when a document requires a predefined guideline or a human-made template for the summary. This method constructs informative and coherent summaries, as the various phrases and speakers of the content are selected based on the choice. Oya et al. [243] proposed a system that requires a human-made summary template, using a fusion algorithm over multiple sentences.

In another study, Zhang et al. [244] proposed a speech act-based strategy to summarize Twitter topics. The majority of existing Twitter summarization algorithms are based on template-based summarization methods. This approach provides abstract summaries that are appropriate for the many, brief, and chaotic characteristics of tweets. The issue with this method is that the templates for summarization are always pre-defined, which does not give much variety in the summaries; therefore, it cannot produce summaries as fluent as those of the tree-based approach.

3) Rule-Based Method
The rule-based approach finds facts and reviews of essential concepts in source documents through questioning. The interrogation questions can be "What is the topic?", "What is the time frame of the story or topic?", etc., and answering these questions attempts to generate an abstractive summary. Gupta et al. [245] proposed a rule-based method to extract relevant lines from a text paragraph in the Hindi language; some handcrafted rules in the Hindi language are employed, such as "What are the person names in the table?", "What are the locations mentioned in the table?", "What are the special symbols contained in the table?", etc. In addition, Laskar et al. [246] suggested a method using the BERTSUM model [247], which uses a transformer-based architecture for abstractive summarization. Rule-based methods are used when input documents need to be represented as classes and lists of aspects, as in query-based methods. This method requires preparing the rules, which is a time-consuming process, and manually written rules make this method less efficient than the other methods mentioned earlier in this subsection.

4) Ontology-Based Method
Ontology is a knowledge-based approach that acts as a formal naming and definition of the entity types of a specific domain [186]. A knowledge base is applied in this method to improve the outcome of summarization. Ontology-based methods perform well when a document has a knowledge structure or is repeatedly constructed around the same topic; therefore, this method focuses on specific domain-related documents and constructs coherent summaries. Similar to the rule-based method, this method is also time-consuming. Okumura et al. [248] proposed a WordNet ontology in their research work. In other work, Mohan et al. [249] proposed some methods for evaluating ontologies, such as OntoMetric, OntoClean, and EvaLexon. Preparing a suitable ontology is a very time-consuming process and cannot be generalized to other domains.

D. SEMANTIC BASED METHODS
Semantic-based methods feed the linguistic representation of a document's text into a natural language generation (NLG) system, with a significant focus on noun and verb phrase
identification [147]. These methods are effective at producing less redundant and grammatically correct sentences. A disadvantage of these methods is that they sometimes ignore critical information or data even when the output is grammatically correct.

Graph-based ranking algorithms determine the relevance of a vertex in a graph based on global information iteratively extracted from the entire graph. When it comes to text summarization, specific graph-based techniques are applied.
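To make this idea concrete, the sketch below ranks sentences with a damped, PageRank-style power iteration over a cosine-similarity graph built from TF-IDF sentence vectors. It is a minimal, TextRank-flavoured illustration assuming NumPy and scikit-learn, not the exact algorithm of any single surveyed paper.

```python
# Minimal graph-based sentence ranking (TextRank-style), assuming NumPy and
# scikit-learn. Edge weights are cosine similarities between TF-IDF sentence
# vectors; scores come from a damped PageRank-style power iteration.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rank_sentences(sentences, damping=0.85, iterations=50):
    sim = cosine_similarity(TfidfVectorizer().fit_transform(sentences))
    np.fill_diagonal(sim, 0.0)                       # ignore self-similarity
    row_sums = sim.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0
    transition = sim / row_sums                      # row-normalized edge weights
    n = len(sentences)
    scores = np.full(n, 1.0 / n)
    for _ in range(iterations):                      # damped power iteration
        scores = (1 - damping) / n + damping * transition.T @ scores
    return scores

sentences = [
    "Graph methods score sentences by their connections to other sentences.",
    "Highly connected sentences are assumed to be the most central ones.",
    "The weather was pleasant yesterday.",
]
print(rank_sentences(sentences))
```

Selecting the highest-scoring sentences, subject to a length budget and redundancy control, then yields an extractive summary.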
As a weight is applied to the matching edge that links two vertices, it may be beneficial to express and include the "strength" of the relationship between two vertices in the model. When calculating the score associated with a vertex in the graph, the edge weights are considered. It is worth noting that integrating vertex weights may be performed using a similar method. [263], [264], [65] are examples of articles in which the weighted graph algorithm is used.
8) Graph-Based Attention Mechanism: The relationship with all other phrases determines the significance score in the graph model. Traditional attention and graph ranking algorithms are combined in this mechanism to compute the rank scores of the original sentences, resulting in varying significance ratings of the actual phrases while decoding various states [283]. Some articles propose a graph-based attention mechanism for the text summarization task [194], [197].

2) Deep Learning Algorithm
Deep learning models help information-driven ATS become more efficient, accessible, and user-friendly. These models are highly promising for ATS because they attempt to imitate human brain functions. Deep neural networks are commonly employed for NLP problems because their design fits well with the complicated structure of language; for example, each layer can handle a particular job before passing the output to the next. A few commonly known deep learning models [284] for ATS are described below, followed by a small illustrative sketch:
1) RNN Encoder-Decoder: The sequence-to-sequence paradigm is used in the RNN encoder-decoder architecture. The sequence-to-sequence model converts the input sequence of the neural network into a corresponding series of letters, words, or sentences. Machine translation and text summarization are two example NLP applications [285]. The challenge with the RNN seq2seq model is that it requires an extensive dataset, and the training process is time-consuming; this is why the deep learning methods mentioned later often perform better. Other papers that propose the RNN encoder-decoder for the text summarization task include [50], [286], [287], and [155].
2) Long Short-Term Memory (LSTM): The repeating unit of the LSTM architecture comprises input/read, memory/update, forget, and output gates [6], [288]. The chain structure is very similar to that of an RNN. The input gate is a randomly initialized vector, and the input of the current step is the output of the previous step in future stages. The forget gate is a single-layer neural network with a sigmoid activation function; the sigmoid function's result determines whether the prior state's information should be ignored or remembered. The memory gate controls the influence of recognized information on new information. The output gate controls the quantity of new information transmitted to the next LSTM unit. The LSTM shows promise in producing concise abstractive summaries. [5], [289], [290] used LSTM-based methods for summarization.
3) Gated Recurrent Unit (GRU): The GRU is a simplified LSTM with two gates, a reset gate and an update gate, and no explicit memory. When all the reset gate elements approach zero, the previous hidden state information is discarded and only the input vector influences the candidate hidden state. The update gate serves as a forget gate in this situation. The LSTM contains a memory unit that offers more control, but the GRU consistently requires less calculation time. Furthermore, the LSTM makes it easier to modify the parameters, whereas the GRU takes less time to train [5]. [153], [291], [155] are studies in which the writers focused on the GRU-based method for summarization tasks.
4) Restricted Boltzmann Machine (RBM): An RBM is a neural network with random probability distributions. A visible layer of visible neurons (input nodes) and a hidden layer of hidden neurons constitute the network. Every hidden node is connected to every input node in a bidirectional manner, and every hidden node is connected to the bias node. The input nodes are not linked to each other in the visible layer, and the hidden nodes are not connected to each other at the hidden level [292]. The network is known as a restricted Boltzmann machine because of these limited connections. [293], [294], [295], [296], [297], [298] are studies that focused on the RBM method for text summarization.
5) Naive Bayesian Classification: The naive Bayesian classification method is used to extract the essential keywords from the text [29]. The Bayes technique is a machine learning approach for estimating differentiating keyword characteristics in a text and retrieving the keywords from the input using this data. Using this naive Bayesian idea together with scores and timestamps improves the accuracy of summarization. [299], [226], [174] focused on the naive Bayesian classification method for text summarization.
6) Query Based: In query-based text summarization, the score of the sentences in a given document is based on the frequency counts of words or phrases from the query [116]. Sentences containing query phrases receive higher ratings than sentences containing single query terms. The sentences with the highest scores and their structural contexts are then extracted for the output summary. [79], [300], [301] focused on query-based methods for text summarization.
7) Generic Summarization: Generic summaries aim at summarizing the document's significant points [302]. A number of excellent generic summarization papers examine documents' key points using generic methods for text summarization; rather than repeating the same information, we provide the references here: [222], [303], [221], and [191].
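As a small, concrete instance of the classification view taken by several of the models above (here the naive Bayesian one), the sketch below trains a toy summary/non-summary sentence classifier; the labelled examples, the TF-IDF features, and the scikit-learn pipeline are assumptions made purely for illustration.

```python
# Toy supervised extractive classifier (summary vs. non-summary sentences)
# using a naive Bayes model, assuming scikit-learn. The labelled sentences
# below are invented for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_sentences = [
    "The study proposes a new summarization model.",        # summary-worthy
    "Results improve ROUGE scores on benchmark datasets.",  # summary-worthy
    "The authors thank the anonymous reviewers.",           # not summary-worthy
    "Section 2 describes related work in detail.",          # not summary-worthy
]
labels = [1, 1, 0, 0]  # 1 = include in the summary, 0 = exclude

model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(train_sentences, labels)

test = ["The proposed model outperforms the baselines."]
print(model.predict(test), model.predict_proba(test))
```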
TABLE 3. Summary of a set of papers on unsupervised learning methods.

| Task | Method | Dataset | Feature Extract | Accuracy | Limitation & Future Work |
| Text summarization by sentence selection [212] | Fuzzy logic | DUC 2002 | Tif, SL, TW, SP, etc. | F-measure: 0.498, recall: 0.457, and precision: 0.471 | Fuzzy logic did not produce promising results; authors would like to combine other learning techniques with fuzzy logic in their future work |
| Sentence similarity measure [101] | AE, VAE, ELM-AE | SKE for English and Arabic | TF-IDF, Word2vec, Sentence2vec | ROUGE-1: 0.6043 and ROUGE-2: 0.5771 | Arabic abstractive summarization was not possible; authors would like to use autoencoders or attention encoders in their future work |
| Short representation from a long document [316] | Latent Semantic Analysis (LSA) | Corpus of Contemporary Arabic (CCA) | Euclidean Distance, Cosine Similarity, Jaccard Coefficient, etc. | Entropy: 0.1275 (approx.), Purity: 0.385 (approx.) | NA |
| Query-based single-document summarization [99] | Restricted Boltzmann Machine (RBM) | SKE, BC3, TAC | Key-phrase-oriented, subject-oriented summarization | Precision: 0.1694, Recall: 0.2115, and F-score of AE: 0.1816 | Did not cover multi-document summarization in this system; authors would like to extend to generic summarization by clustering |
based sentence centrality scoring, suggesting a framework that comprises three distinct approaches for calculating centrality in similarity graphs. Yeh et al. [42] provided two new modified methods for ATS, a corpus-based approach (MCBA) and an LSA-based TRM approach (LSA + TRM) [328], using a text relationship map to extract semantically significant structures from a document.

Alami et al. [101] used neural network-based techniques for ATS with the Sentence2Vec feature extraction approach, which produced the best outcomes. Gong et al. [222] proposed two text summarization approaches for creating a general text summary by minimizing redundancy. Shen et al. [235] solved a sequence labeling problem by employing the effective sequence-labeling algorithm CRF. Froud et al. [316] suggested enhancing summarization using the latent semantic analysis model and Arabic document clustering measures with stemming. Mihalcea et al. [264] studied and evaluated a variety of graph-based ranking algorithms that enable automatic unsupervised sentence extraction from the perspective of ATS. Yousefi et al. [99] introduced an unsupervised deep neural network using the SKE and BC3 email datasets that employs global and local vocabularies to represent words as the AE input.

B. SUMMARY OF PAPERS REGARDING SUPERVISED LEARNING METHODS
In this section, we discuss the papers that cover the supervised learning methods of ATS.

Xu et al. [327] introduced a neural network architecture for extractive summarization, consisting of a sentence extraction model and a compression classifier. According to the results of Liu et al. [37], constructing English Wikipedia articles can be addressed as multi-document extractive summarization of the original documents with a decoder-only sequence transduction architecture. Xu et al. [322] introduced DISCOBERT, which employs discourse units as the lowest selection basis to eliminate summarization redundancy and utilizes two types of discourse graphs. Alguliyev et al. [324]
TABLE 4. Summary of a set of papers on supervised learning methods.

| Task | Method | Dataset | Feature Extract | Accuracy | Limitation & Future Work |
| Two-stage extractive and abstractive framework [37] | Highlighting the relevant information | WikiSum (self-generated) | TF-IDF, TextRank, SumBasic, T-D, T-DMACA, etc. | ROUGE-L: 38.8 | Required performance measurement improvement; authors would like to conduct further research to improve performance |
suggested an unsupervised approach with sentence clustering using the open datasets DUC 2001 and 2002. Aliguliyev et al. [323] also described a strategy for sentence clustering using a discrete differential evolution technique; in addition, the NGD-based dissimilarity measure outperformed the Euclidean distance.

Ferreira et al. [325] evaluated 15 sentence scoring methods on three distinct datasets (news, blog, and article settings) to improve the acquired sentence extraction findings. Neto et al. [174] investigated a framework using an ML method based on statistics-oriented techniques, in which the Naive Bayes method and the C4.5 decision tree method were the best classification methods. Fang et al. [326] investigated a graph-based ranking model using redundancy removal strategies to enhance the effectiveness of the summarization process. Kaikhah et al. [228] described artificial neural networks that generate summaries of news stories of different lengths using feature fusion to summarize highly ranked sentences. Ledeneva et al. [331] provided a statistical approach for single-document extractive ATS that generates a text summary by extracting selected sentences from the source.

C. SUMMARY OF PAPERS REGARDING STRUCTURED TEXT SUMMARIZATION
Structured-based methods are a vital part of abstractive text summarization approaches. The following section examines studies that discuss structured learning methods.

Li et al. [155] designed a methodology for ATS based on a seq2seq encoder-decoder architecture with a deep recurrent generative decoder (DRGN). Liu et al. [330] proposed an adversarial technique for ATS that trains a generative model and a discriminative model simultaneously. Hennig et al. [186] described how sentences can be mapped to nodes with several linguistic features that are generated to test the efficiency of an SVM classifier. Genest et al. [256] presented a methodology for information extraction and natural language generation. Kikuchi et al. [329] developed an
TABLE 5. A set of papers on structured-based learning methods are summarized in this table.

Task: Condensed version by selecting informative sentences [186]
Method: SVM classifier
Dataset: DUC 2002
Feature Extraction: Stop words, stemming, and cosine similarity
Accuracy: ROUGE-1 F1: 0.5716, ROUGE-2 F2: 0.3143
Limitation & Future Work: Did not cover an extensive ontology for further effectiveness; the authors would like to explore the approach in a non-hierarchical ontology.

Task: Aim to develop a semantic model [5]
Method: LSTM-CNN model
Dataset: CNN dataset, DailyMail
Feature Extraction: MOSP
Accuracy: ROUGE-1: 34.9, ROUGE-2: 17.8
Limitation & Future Work: NA

Task: Performing a variational inference and generation with NN [155]
Method: Seq2Seq with DRGD
Dataset: DUC 2004, Gigaword, LCSTS
Feature Extraction: DRGD outperforms state-of-the-art approaches
Accuracy: ROUGE-1: 36.71, ROUGE-2: 24.00 and ROUGE-L: 34.10
Limitation & Future Work: NA

Task: An adversarial technique for abstractive text summarization [330]
Method: Generative Model, Discriminative Model
Dataset: CNN/Daily Mail
Feature Extraction: Pre-trained generator and discriminator of human-generated positive examples
Accuracy: ROUGE-1: 39.92, ROUGE-2: 17.65 and ROUGE-L: 36.71
Limitation & Future Work: NA
Kikuchi et al. [329] developed an approach for summarizing a single text that represents both sentence and word relationships in a hierarchical tree, with the ROUGE score compared to EDU selection. Song et al. [5] constructed a novel ATSDL system based on an LSTM-CNN that addresses numerous challenges in text summarization. Oya et al. [243] demonstrated an ATS for meeting discussions based on modifying a word graph algorithm to build frameworks from human-generated summaries.

D. SUMMARY OF PAPERS REGARDING SEMANTIC TEXT SUMMARIZATION
Based on a comprehensive review of structured learning methods, the following section focuses only on semantic learning approaches for ATS.
Wang et al. [154] introduced a joint attention and biased probability generation approach using three datasets (DUC 2004, Gigaword, and LCSTS), where a ConvS2S architecture enhanced with topic embeddings and SCST provided the best results. Chen et al. [332] presented a unique sentence-level policy gradient strategy between two neural networks hierarchically while preserving language proficiency, using the CNN/Daily Mail dataset. Kryściński et al. [46] developed a method for validating abstractive neural models to perform factual consistency testing at the document-sentence level. Zhu et al. [206] proposed a multimodal objective function to utilize the loss through summary generation; ROUGE and order ranking are used to produce the multimodal reference for both automatic and human performance measures. Khan et al. [254] proposed the Sem-Graph-Both-Rel method, which was compared to other summarization techniques based on three pyramid evaluation metrics. A strategy for creating an abstractive summary for a single document is described in [252], which uses a rich semantic graph reduction methodology that can reduce the actual document to 50% of its size. Genest et al. [255] offered an optimistic abstractive summarization approach, which aimed at achieving an accurate objective by managing the content and structure of the summary using the TAC 2010 dataset.
TABLE 6. A set of papers on semantic-based learning methods are summarized in this table.
Task: Incorporated topic information [154]
Method: Convolutional sequence-to-sequence
Dataset: DUC 2004, Gigaword, LCSTS
Feature Extraction: Topic embeddings, joint attention, biased probability generation
Accuracy: ROUGE-1 F, ROUGE-2 F and ROUGE-L F of 36.92, 18.29 and 34.58
Limitation & Future Work: Only provides results based on sentence summarization; the authors would like to work further on long paragraphs or multi-document summarization.
XI. PERFORMANCE OF AUTOMATIC TEXT SUMMARIZATION
The evaluation of text summarization is difficult. It is complex for machines to identify the key phrases or content that are important and add value to a summary. The placement of key phrases can change the meaning of the summary depending on the purpose and context, and it is challenging to locate this relevant information. As a result, automatic evaluation measures are necessary for reliable and effective evaluation. After reviewing previously researched papers covering text summarization topics, several methods for summarization measurement were identified. The evaluation measurement metrics of the ATS domain are discussed below.

A. EXTRINSIC EVALUATION
Extrinsic evaluation determines the quality of an ATS-generated summary by how it influences other activities such as text categorization, information retrieval, and question answering. In the summarization process, a summary is considered good if it aids these other activities. There are numerous approaches to extrinsic evaluation: relevance assessment determines whether the text is relevant to the topic, and reading comprehension determines whether it can answer multiple-choice assessments or not.

B. INTRINSIC EVALUATION
Intrinsic evaluation determines the quality of the summary based on the comparability between machine-generated and human-generated summaries. A good summary is judged on two significant factors: quality and information. Human experts may be required to evaluate machine-generated summaries using several quality measures, such as readability, non-redundancy, structure, and coherence; other quality metrics include referential clarity, conciseness and focus, and content coverage.
Some valuable measures for intrinsically evaluating summaries are precision, recall, and F-measure.
Researchers must anticipate comparability between human-generated and automatically generated summaries. Even with the evaluation metrics mentioned here, it is possible that two equally good summaries produce different evaluation outcomes. The following are the most frequently used evaluation metrics in the research:

1) Precision Metric: The precision metric evaluates what percentage of the sentences chosen by the system were also chosen by humans. As the formula shows, precision is calculated by dividing the number of sentences shared by the two summaries by the number of sentences in the system summary [152], [333], [334].

Precision = |S_ref ∩ S_cand| / |S_cand|    (1)

2) Recall Metric: The recall metric determines how many of the sentences selected by humans are recognized by the system. It is calculated by dividing the number of sentences shared by the reference and system summaries by the number of sentences in the reference summary [152]. The studies [226], [335], [99], [179] used recall metrics for the evaluation measurement task.

Recall = |S_ref ∩ S_cand| / |S_ref|    (2)

3) F-Measure Metric: The F-measure metric combines the recall and precision metrics; it is the harmonic mean of precision and recall [152]. [336], [42], [212], and [337] focused on the F-measure metric for the evaluation task. (A minimal implementation sketch of Eqs. (1)-(3) and ROUGE-1 is given after this list.)

F-Measure = 2 (Precision) (Recall) / (Precision + Recall)    (3)

4) ROUGE Metric: Recall-Oriented Understudy for Gisting Evaluation (ROUGE) is a family of metrics for evaluating ATS and machine translation. It compares an automatically generated summary or translation to a set of predetermined summaries, such as human-generated summaries. ROUGE consists of five measures: ROUGE-N, ROUGE-L, ROUGE-W, ROUGE-S, and ROUGE-SU. Examples where the ROUGE metrics were used can be found in [146], [338], [339], and [340].
• ROUGE-N (e.g., R1): ROUGE-N is an n-gram recall measure of an ATS summary against a human-generated or pre-defined reference summary; ROUGE-1 (R1) uses uni-grams. It is an n-gram recall algorithm that compares the system and reference summaries [16], [341].
• ROUGE-L (R-L): ROUGE-L is based on the longest common sub-sequence (LCS) between the human-generated and automatically generated summaries. It evaluates the ratio of the size of the LCS of the two summaries to the size of the reference summary [16], [341].
• ROUGE-W: ROUGE-W determines the weighted longest common sub-sequence, which is an enhancement of the LCS [16].
• ROUGE-S: ROUGE-S (skip-bi-gram co-occurrence statistics) measures the percentage of skip bi-grams shared between the system and reference summaries. The skip bi-grams can be any word pair in the sentence sequence with arbitrary gaps [16], [341].
• ROUGE-SU: ROUGE-SU extends ROUGE-S by employing skip-bi-grams and uni-grams as measuring units, a weighted average of ROUGE-S and ROUGE-N. These metrics allow bi-grams to be made up of non-adjacent words with a maximum of n words between them [16].

5) Pyramid Method: The pyramid technique is used because there is no single best comparison summary among the human-created model summaries. The fundamental aim is to generate a global standard summary by comparing human-generated comparison summaries based on summary content units (SCUs). A good summary has more SCUs from higher pyramid levels than from lower levels, whereas a poor summary has more SCUs from lower tiers than from higher tiers [16].

6) Relative Utility: This measurement assigns a score between 0 and 10 to each sentence in the input document based on relevance. The highest-scored sentences are thought to be more appropriate for the summary [342].

7) Basic Elements: A basic element is a modifier or an argument together with its connection to the head. The goal of this strategy is to match distinct but comparable expressions more easily [16].

8) Text Grammars: This strategy aids in the evaluation of text summaries. It focuses on identifying the structure of acceptable text in a formalized setting [16].

9) Factoid Score: The factoid score evaluates computer-generated summaries in terms of factoids, which are atomic units of information. Different pre-defined summaries are utilized, and the shared knowledge among them is evaluated [343].

10) Cohesion and Coherence: Cohesion attempts to account for relationships between text elements; the four significant forms of cohesion are reference, ellipsis, conjunction, and lexical cohesion [344]. Coherence refers to the text's overall unity, which is accomplished by efficiently grouping and logically arranging ideas; it is expressed in terms of text-to-text relationships, such as elaboration, cause, and explanation. Mani et al. [345] addressed cohesion and coherence in their text summarization task.

11) BLEU: The Bilingual Evaluation Understudy (BLEU) metric assesses the output quality of machine translation systems with respect to reference translations [346]. Counting the number of n-gram matches located independently between the system and the reference translations is the main task of this metric [347]. The BLEU score is computed as the geometric mean of the modified n-gram precisions, multiplied by a brevity penalty that penalizes candidates shorter than the references.
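To make Eqs. (1)-(3) and the ROUGE-1 recall described above concrete, the following minimal Python sketch (our illustration, not code from any of the surveyed systems) computes sentence-level precision, recall, and F-measure under exact sentence matching, plus a simple ROUGE-1 recall; the toy summaries and function names are assumptions for demonstration only.

```python
from collections import Counter


def sentence_prf(reference_sents, candidate_sents):
    """Sentence-level precision, recall, and F-measure (Eqs. (1)-(3)).

    Both arguments are lists of sentence strings; a sentence counts as
    shared only if it matches a reference sentence exactly.
    """
    shared = len(set(reference_sents) & set(candidate_sents))
    precision = shared / len(candidate_sents) if candidate_sents else 0.0
    recall = shared / len(reference_sents) if reference_sents else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    return precision, recall, f_measure


def rouge_n_recall(reference, candidate, n=1):
    """ROUGE-N recall: clipped n-gram matches / n-grams in the reference."""
    def ngram_counts(text):
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref_counts = ngram_counts(reference)
    matched = sum((ref_counts & ngram_counts(candidate)).values())
    total = sum(ref_counts.values())
    return matched / total if total else 0.0


if __name__ == "__main__":
    reference = ["the cat sat on the mat", "it was a sunny day"]
    candidate = ["the cat sat on the mat", "the dog barked loudly"]
    p, r, f = sentence_prf(reference, candidate)
    print(f"Precision={p:.2f}  Recall={r:.2f}  F-measure={f:.2f}")
    print(f"ROUGE-1 recall={rouge_n_recall(' '.join(reference), ' '.join(candidate)):.2f}")
```

In practice, researchers usually rely on established tooling (for example, the rouge-score package or the original ROUGE toolkit) rather than hand-rolled implementations, since tokenization, stemming, and multi-reference handling all affect the reported scores.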
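Similarly, BLEU is rarely computed by hand; a common choice is NLTK's sentence-level implementation, shown in the hedged sketch below with illustrative token sequences (the smoothing choice here is an assumption for short texts, not a recommendation from this survey).

```python
# Minimal sentence-level BLEU with NLTK (assumes nltk is installed).
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = "the cat sat on the mat".split()    # reference tokens
hypothesis = "the cat is on the mat".split()    # system-output tokens

# Smoothing avoids zero scores when a higher-order n-gram has no match,
# which is common for short texts such as headlines or single sentences.
smooth = SmoothingFunction().method1
score = sentence_bleu([reference], hypothesis,
                      weights=(0.25, 0.25, 0.25, 0.25),
                      smoothing_function=smooth)
print(f"BLEU = {score:.3f}")
```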
TABLE 7. Limitations of the algorithms used in the ATS domain.

Extractive / Unsupervised
- Fuzzy logic: It requires a redundancy removal technique in the post-processing phase to improve the summarization quality.
- Concept-based: It needs to utilize similarity measures for reducing redundancy, which can affect the quality of the summary.
- Latent-Semantic: The LSA-generated summary requires a large amount of time.

Extractive / Supervised
- Machine Learning: It needs a large set of data for training and for improving the sentence selection to make a good summary.
- Neural Network: It is quite slow in both the training phase and the application phase. It also requires human intervention for the training data.
- Conditional Random Fields: In CRF, linguistic features are not considered. It also requires an external domain-specific corpus.

Abstractive / Structure-based
- Tree-based: It ignores the context and significant phrases in the text, eventually failing to detect the relation between sentences. Another drawback is that it continuously focuses on syntax, not semantics.
- Template-based: The templates are pre-defined in this method, which creates a lack of diversity in the summaries.
- Rule-based: Preparing the rules is a time-consuming process. Another challenge is that the rules need to be written manually.
- Ontology-based: Preparing a suitable ontology is a very time-consuming process, and the ontology cannot be generalized to other domains.

Abstractive / Semantic-based
- Multi-modal semantic: An automatic evaluation of the framework is required, as it is currently evaluated manually by humans.
- Information item: It is difficult to create meaningful and grammatical sentences from the text; the linguistic quality of the summaries is also very low due to incorrect parses.
- Semantic graph: This method is limited to single-document abstractive summarization.

- Deep learning: It needs human effort to build large training data sets manually.
- Graph-based: It does not consider the importance of words and does not handle the dangling anaphora problem.
which pronoun complements which word.
7) Retaining the quality of the text: An ATS should ensure the quality of the summarized text. From the user's perspective, the most desired quality of an ATS is that it understands the source text while summarizing. Various machine learning techniques can be used to retain the quality of the summarized text.
8) Word sense ambiguity: Ambiguity in words makes a difference while summarizing sentences. This ambiguity may appear due to acronyms or abbreviations with more than one possible expansion, multiple usages of the same word in different contexts, etc. The acronym then has to match the topic or meet the intended sense, depending on the subject, for better understanding. This problem is the opposite of the anaphora problem and is called the cataphora problem. It can be addressed using a word-sense disambiguation algorithm (a brief sketch is given at the end of this section).
9) Meaningful, intuitive, and robust: Summarized sentences must be influential or make sense to the users, and the representation must be strong in any area the system faces.
10) Predefined template: Recently, natural language processing has made an incredible amount of progress in ATS, but these methods cannot generate new sentences on their own. Therefore, the template-based algorithm was introduced, where a specific template needs to be predefined for a particular summarization task.
11) Attaining a higher level of abstraction: In a text summarization task, an open research topic is the achievement of higher-level abstraction. Therefore, there are plenty of opportunities for researchers and linguists to find the answer to this problem.
In addition to the above-mentioned general challenges, we also present a few limitations of the current algorithms used in the ATS domain. Table 7 presents the limitations that need to be solved to achieve better text summarization results.
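As a rough illustration of the disambiguation step mentioned in challenge 8 above, the following sketch applies the classic Lesk algorithm from NLTK to pick a WordNet sense for an ambiguous word; the example sentence and target word are illustrative assumptions, and real ATS pipelines may use more sophisticated, embedding-based disambiguation instead.

```python
# Minimal word-sense disambiguation with NLTK's Lesk implementation.
# Assumes nltk is installed and the WordNet corpus has been downloaded,
# e.g. via:  import nltk; nltk.download("wordnet")
from nltk.wsd import lesk

sentence = "He deposited the cheque at the bank before noon."
context = sentence.lower().split()

# Lesk picks the WordNet synset whose gloss overlaps most with the context.
sense = lesk(context, "bank", pos="n")
print(sense, "-", sense.definition() if sense else "no sense found")
```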
XIII. CONCLUSION
Text summarization is an old topic, but this field continues to gain the interest of researchers. Nonetheless, the performance of text summarization is average in general, and the summaries created are not always ideal. As a result, researchers are attempting to improve existing text summarization methods. In addition, developing novel summarization approaches that produce higher-quality, human-standard, and robust summaries is a priority. Therefore, ATS should be made more intelligent by combining it with other integrated systems to perform better. Automatic text summarization is an eminent domain of research that is extensively implemented and integrated into diverse applications to summarize and reduce text volume. In this paper, we present a systematic survey of the vast ATS domain in various phases: the fundamental theories with previous research backgrounds, dataset inspections, feature extraction architectures, influential text summarization algorithms, performance measurement metrics, and challenges of current architectures. This paper also presents the current limitations and challenges of ATS methods and algorithms, which should encourage researchers to try to solve these limitations and to overcome new challenges in the ATS domain.

REFERENCES
[1] H. P. Luhn, “The automatic creation of literature abstracts,” IBM Journal of Research and Development, vol. 2, no. 2, pp. 159–165, 1958.
[2] T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient estimation of word representations in vector space,” arXiv preprint arXiv:1301.3781, 2013.
[3] Z. S. Harris, “Distributional structure,” Word, vol. 10, no. 2-3, pp. 146–162, 1954.
[4] K. W. Church, “Word2vec,” Natural Language Engineering, vol. 23, no. 1, pp. 155–162, 2017.
[5] S. Song, H. Huang, and T. Ruan, “Abstractive text summarization using lstm-cnn based deep learning,” Multimedia Tools and Applications, vol. 78, no. 1, pp. 857–875, 2019.
[6] Q. A. Al-Radaideh and D. Q. Bataineh, “A hybrid approach for arabic text summarization using domain knowledge and genetic algorithms,” Cognitive Computation, vol. 10, no. 4, pp. 651–669, 2018.
[7] K. Ježek and J. Steinberger, “Automatic text summarization (the state of the art 2007 and new challenges),” in Proceedings of Znalosti. Citeseer, 2008, pp. 1–12.
[8] B. Kitchenham and S. Charters, “Guidelines for performing systematic literature reviews in software engineering,” 2007.
[9] B. Kitchenham, “Procedures for performing systematic reviews,” Keele, UK, Keele University, vol. 33, no. 2004, pp. 1–26, 2004.
[10] S. Gholamrezazadeh, M. A. Salehi, and B. Gholamzadeh, “A comprehensive survey on text summarization systems,” in 2009 2nd International Conference on Computer Science and its Applications. IEEE, 2009, pp. 1–6.
[11] C. Saranyamol and L. Sindhu, “A survey on automatic text summarization,” International Journal of Computer Science and Information Technologies, vol. 5, no. 6, pp. 7889–7893, 2014.
[12] R. Mishra, J. Bian, M. Fiszman, C. R. Weir, S. Jonnalagadda, J. Mostafa, and G. Del Fiol, “Text summarization in the biomedical domain: a systematic review of recent research,” Journal of Biomedical Informatics, vol. 52, pp. 457–467, 2014.
[13] N. Andhale and L. Bewoor, “An overview of text summarization techniques,” in 2016 International Conference on Computing Communication Control and Automation (ICCUBEA). IEEE, 2016, pp. 1–7.
[14] S. K. Bharti and K. S. Babu, “Automatic keyword extraction for text summarization: A survey,” arXiv preprint arXiv:1704.03242, 2017.
[15] M. Allahyari, S. Pouriyeh, M. Assefi, S. Safaei, E. D. Trippe, J. B. Gutierrez, and K. Kochut, “Text summarization techniques: a brief survey,” arXiv preprint arXiv:1707.02268, 2017.
[16] M. Gambhir and V. Gupta, “Recent automatic text summarization techniques: a survey,” Artificial Intelligence Review, vol. 47, no. 1, pp. 1–66, 2017.
[17] W. S. El-Kassas, C. R. Salama, A. A. Rafea, and H. K. Mohamed, “Automatic text summarization: A comprehensive survey,” Expert Systems with Applications, p. 113679, 2020.
[18] L. Abualigah, M. Q. Bashabsheh, H. Alabool, and M. Shehab, “Text summarization: a brief review,” Recent Advances in NLP: The Case of Arabic Language, pp. 1–15, 2020.
[19] R. Mihalcea and H. Ceylan, “Explorations in automatic book summarization,” in Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), 2007, pp. 380–389.
[20] N. V. Kumar and M. J. Reddy, “Factual instance tweet summarization and opinion analysis of sport competition,” in Soft Computing and Signal Processing. Springer, 2019, pp. 153–162.
[21] P. Gupta, R. Tiwari, and N. Robert, “Sentiment analysis and text summarization of online reviews: A survey,” in 2016 International Conference on Communication and Signal Processing (ICCSP). IEEE, 2016, pp. 0241–0245.
[22] V. Gupta and G. S. Lehal, “A survey of text summarization extractive techniques,” Journal of Emerging Technologies in Web Intelligence, vol. 2, no. 3, pp. 258–268, 2010.
[23] S. Muresan, E. Tzoukermann, and J. L. Klavans, “Combining linguistic and machine learning techniques for email summarization,” in Proceedings of the ACL 2001 Workshop on Computational Natural Language Learning (ConLL), 2001.
[24] S. D. Kavila, V. Puli, G. P. Raju, and R. Bandaru, “An automatic legal document summarization and search using hybrid system,” in Proceedings of the International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA). Springer, 2013, pp. 229–236.
[25] H. D. Menéndez, L. Plaza, and D. Camacho, “Combining graph connectivity and genetic clustering to improve biomedical summarization,” in 2014 IEEE Congress on Evolutionary Computation (CEC). IEEE, 2014, pp. 2740–2747.
[26] N. A. Ramu, M. S. Bandarupalli, M. S. S. Nekkanti, and G. Ramesh, “Summarization of research publications using automatic extraction,” in International Conference on Intelligent Data Communication Technologies and Internet of Things. Springer, 2019, pp. 1–10.
[27] J.-M. Torres-Moreno, “Single-document summarization,” Automatic Text Summarization, pp. 53–108, 2014.
[28] C. Ma, W. E. Zhang, M. Guo, H. Wang, and Q. Z. Sheng, “Multi-document summarization via deep learning techniques: A survey,” arXiv preprint arXiv:2011.04843, 2020.
[29] N. Ramanujam and M. Kaliappan, “An automatic multidocument text summarization approach based on naive bayesian classifier using timestamp strategy,” The Scientific World Journal, vol. 2016, 2016.
[30] A. R. Fabbri, I. Li, T. She, S. Li, and D. R. Radev, “Multi-news: a large-scale multi-document summarization dataset and abstractive hierarchical model,” 2019.
[31] M. Yasunaga, J. Kasai, R. Zhang, A. R. Fabbri, I. Li, D. Friedman, and D. R. Radev, “Scisummnet: A large annotated corpus and content-impact models for scientific paper summarization with citation networks,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 01, 2019, pp. 7386–7393.
[32] G. Carenini, R. T. Ng, and X. Zhou, “Summarizing email conversations with clue words,” in Proceedings of the 16th International Conference on World Wide Web, 2007, pp. 91–100.
[33] D. M. Zajic, B. J. Dorr, and J. Lin, “Single-document and multi-document summarization techniques for email threads using sentence compression,” Information Processing & Management, vol. 44, no. 4, pp. 1600–1610, 2008.
[34] S. Gerani, Y. Mehdad, G. Carenini, R. Ng, and B. Nejat, “Abstractive summarization of product reviews using discourse structure,” in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014, pp. 1602–1613.
[35] W. Luo, F. Liu, Z. Liu, and D. Litman, “Automatic summarization of student course feedback,” arXiv preprint arXiv:1805.10395, 2018.
[36] W. Luo and D. Litman, “Summarizing student responses to reflection prompts,” in Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015, pp. 1955–1960.
[37] P. J. Liu, M. Saleh, E. Pot, B. Goodrich, R. Sepassi, L. Kaiser, and N. Shazeer, “Generating wikipedia by summarizing long sequences,” arXiv preprint arXiv:1801.10198, 2018.
[38] S. Afantenos, V. Karkaletsis, and P. Stamatopoulos, “Summarization [62] M. S. Binwahlan, N. Salim, and L. Suanmali, “Swarm based features
from medical documents: a survey,” Artificial Intelligence in Medicine, selection for text summarization,” IJCSNS International Journal of Com-
vol. 33, no. 2, p. 157–177, Feb 2005. [Online]. Available: http: puter Science and Network Security, vol. 9, no. 1, pp. 175–179, 2009.
//dx.doi.org/10.1016/j.artmed.2004.07.017 [63] P.-y. Zhang and C.-h. Li, “Automatic text summarization based on
[39] M. Alghamdi, C. Treude, and M. Wagner, “Human-like summaries sentences clustering and extraction,” in 2009 2nd IEEE international
from heterogeneous and time-windowed software development arte- conference on computer science and information technology. IEEE,
facts,” 2020. 2009, pp. 167–170.
[40] Wikipedia contributors, “Multi-document summarization — Wikipedia, [64] D. R. Radev, T. Allison, S. Blair-Goldensohn, J. Blitzer, A. Celebi,
the free encyclopedia,” 2020, [Online; accessed 8-October- S. Dimitrov, E. Drabek, A. Hakim, W. Lam, D. Liu et al., “Mead-a
2021]. [Online]. Available: https://en.wikipedia.org/w/index.php?title= platform for multidocument multilingual text summarization,” 2004.
Multi-document_summarization&oldid=986613170 [65] G. Erkan and D. R. Radev, “Lexrank: Graph-based lexical centrality
[41] H. Chen, “Digital library development in the asia pacific,” in Interna- as salience in text summarization,” Journal of artificial intelligence
tional Conference on Asian Digital Libraries. Springer, 2005, pp. 509– research, vol. 22, pp. 457–479, 2004.
524. [66] E. H. Hovy, C.-Y. Lin, L. Zhou, and J. Fukumoto, “Automated summa-
[42] J.-Y. Yeh, H.-R. Ke, W.-P. Yang, and I.-H. Meng, “Text summarization rization evaluation with basic elements.” in LREC, vol. 6. Citeseer, 2006,
using a trainable summarizer and latent semantic analysis,” Information pp. 604–611.
processing & management, vol. 41, no. 1, pp. 75–95, 2005. [67] H. Takamura and M. Okumura, “Text summarization model based on
[43] A. Moreno and T. Redondo, “Text analytics: the convergence of big data maximum coverage problem and its variant,” in Proceedings of the 12th
and artificial intelligence,” IJIMAI, vol. 3, no. 6, pp. 57–64, 2016. Conference of the European Chapter of the ACL (EACL 2009), 2009, pp.
[44] F.-r. Lin and C.-H. Liang, “Storyline-based summarization for news topic 781–789.
retrospection,” Decision Support Systems, vol. 45, no. 3, pp. 473–490, [68] G. Erkan and D. Radev, “Lexpagerank: Prestige in multi-document text
2008. summarization,” in Proceedings of the 2004 Conference on Empirical
[45] Y. Du, Q. Li, L. Wang, and Y. He, “Biomedical-domain pre-trained lan- Methods in Natural Language Processing, 2004, pp. 365–371.
guage model for extractive summarization,” Knowledge-Based Systems, [69] A. Nenkova and L. Vanderwende, “The impact of frequency on summa-
vol. 199, p. 105964, 2020. rization,” Microsoft Research, Redmond, Washington, Tech. Rep. MSR-
[46] W. Kryściński, B. McCann, C. Xiong, and R. Socher, “Evaluating the TR-2005, vol. 101, 2005.
factual consistency of abstractive text summarization,” arXiv preprint [70] X. Wan and J. Yang, “Improved affinity graph based multi-document
arXiv:1910.12840, 2019. summarization,” in Proceedings of the human language technology con-
[47] Q. Zhou, N. Yang, F. Wei, and M. Zhou, “Selective encoding for abstrac- ference of the NAACL, Companion volume: Short papers, 2006, pp. 181–
tive sentence summarization,” arXiv preprint arXiv:1704.07073, 2017. 184.
[48] J. Lin, X. Sun, S. Ma, and Q. Su, “Global encoding for abstractive [71] A. Dasgupta, R. Kumar, and S. Ravi, “Summarization through submod-
summarization,” arXiv preprint arXiv:1805.03989, 2018. ularity and dispersion,” in Proceedings of the 51st Annual Meeting of
[49] X. Shen, Y. Zhao, H. Su, and D. Klakow, “Improving latent alignment in the Association for Computational Linguistics (Volume 1: Long Papers),
text summarization by generalizing the pointer generator,” in Proceedings 2013, pp. 1014–1022.
of the 2019 Conference on Empirical Methods in Natural Language Pro- [72] J. M. Conroy, J. D. Schlesinger, J. Goldstein, and D. P. O’leary, “Left-
cessing and the 9th International Joint Conference on Natural Language brain/right-brain multi-document summarization,” in Proceedings of the
Processing (EMNLP-IJCNLP), 2019, pp. 3753–3764. Document Understanding Conference (DUC 2004), 2004.
[50] R. Nallapati, B. Zhou, C. Gulcehre, B. Xiang et al., “Abstractive text [73] Y. Chali and M. Kolla, “Summarization techniques at duc 2004,” in
summarization using sequence-to-sequence rnns and beyond,” arXiv Proceedings of the document understanding conference. National
preprint arXiv:1602.06023, 2016. Institute of Standards in Technology (NIST), 2004, pp. 105–111.
[51] D. Harman and P. Over, “The effects of human variation in duc sum- [74] K. C. Litkowski, “Summarization experiments in duc 2004,” in Proceed-
marization evaluation,” in Text Summarization Branches Out, 2004, pp. ings of the 2004 Document Understanding Conference, 2004.
10–17. [75] H. T. Dang, “Overview of duc 2005,” in Proceedings of the document
[52] A. Abdi, S. M. Shamsuddin, S. Hasan, and J. Piran, “Automatic understanding conference, vol. 2005, 2005, pp. 1–12.
sentiment-oriented summarization of multi-documents using soft com- [76] G. Melli, Y. Wang, Y. Liu, M. M. Kashani, Z. Shi, B. Gu, A. Sarkar,
puting,” Soft Computing, vol. 23, no. 20, pp. 10 551–10 568, 2019. and F. Popowich, “Description of squash, the sfu question answering
[53] C.-Y. Lin and E. Hovy, “From single to multi-document summarization,” summary handler for the duc-2005 summarization task,” safety, vol. 1,
in Proceedings of the 40th annual meeting of the association for compu- p. 14345754, 2005.
tational linguistics, 2002, pp. 457–464. [77] S. Li, Y. Ouyang, W. Wang, and B. Sun, “Multi-document summarization
[54] I. Mashechkin, M. Petrovskiy, D. Popov, and D. V. Tsarev, “Automatic using support vector regression,” in Proceedings of DUC. Citeseer,
text summarization using latent semantic analysis,” Programming and 2007.
Computer Software, vol. 37, no. 6, pp. 299–305, 2011. [78] E. Hovy, C.-Y. Lin, and L. Zhou, “Evaluating duc 2005 using basic
[55] A. Nenkova, “Automatic text summarization of newswire: Lessons elements,” in Proceedings of DUC, vol. 2005. Citeseer, 2005.
learned from the document understanding conference,” 2005. [79] A. A. Mohamed and S. Rajasekaran, “Improving query-based summa-
[56] C.-Y. Lin and E. Hovy, “Automated multi-document summarization in rization using document graphs,” in 2006 IEEE international symposium
neats,” in Proceedings of the Human Language Technology Conference on signal processing and information technology. IEEE, 2006, pp. 408–
(HLT2002). San Diego, CA, USA, 2002, pp. 23–27. 410.
[57] R. Ferreira, L. de Souza Cabral, F. Freitas, R. D. Lins, G. de França Silva, [80] Y. Seki, K. Eguchi, N. Kando, and M. Aono, “Opinion-focused summa-
S. J. Simske, and L. Favaro, “A multi-document summarization system rization and its analysis at duc 2006,” in Proceedings of the Document
based on statistics and linguistic treatment,” Expert Systems with Appli- understanding conference (Duc), 2006, pp. 122–130.
cations, vol. 41, no. 13, pp. 5780–5787, 2014. [81] A. Haghighi and L. Vanderwende, “Exploring content models for multi-
[58] E. Lloret, O. Ferrández, R. Munoz, and M. Palomar, “A text summariza- document summarization,” in Proceedings of Human Language Tech-
tion approach under the influence of textual entailment.” in NLPCS, 2008, nologies: The 2009 Annual Conference of the North American Chapter
pp. 22–31. of the Association for Computational Linguistics, 2009, pp. 362–370.
[59] A. Sinha, A. Yadav, and A. Gahlot, “Extractive text summarization using [82] F. Lacatusu, A. Hickl, K. Roberts, Y. Shi, J. Bensley, B. Rink, P. Wang,
neural networks,” arXiv preprint arXiv:1802.10137, 2018. and L. Taylor, “Lcc’s gistexter at duc 2006: Multi-strategy multi-
[60] A. Patel, T. Siddiqui, and U. Tiwary, “A language independent approach document summarization,” in Proceedings of DUC’06, 2006.
to multilingual text summarization,” Large scale semantic access to [83] M. R. Amini and N. Usunier, “A contextual query expansion approach by
content (text, image, video, and sound), pp. 123–132, 2007. term clustering for robust text summarization,” in Proceedings of DUC,
[61] H. Asgari, B. Masoumi, and O. S. Sheijani, “Automatic text summariza- vol. 200, no. 7. Citeseer, 2007.
tion based on multi-agent particle swarm optimization,” in 2014 Iranian [84] P. Over, H. Dang, and D. Harman, “Duc in context,” Information Pro-
Conference on Intelligent Systems (ICIS). IEEE, 2014, pp. 1–5. cessing & Management, vol. 43, no. 6, pp. 1506–1520, 2007.
[85] M. A. Tayal, M. M. Raghuwanshi, and L. G. Malik, “Atssc: Develop- for Computational Linguistics: Human Language Technologies, Volume
ment of an approach based on soft computing for text summarization,” 2 (Short Papers), 2018, pp. 55–60.
Computer Speech & Language, vol. 41, pp. 214–235, 2017. [106] R. Paulus, C. Xiong, and R. Socher, “A deep reinforced model for
[86] R. Verma, P. Chen, and W. Lu, “A semantic free-text summarization abstractive summarization,” arXiv preprint arXiv:1705.04304, 2017.
system using ontology knowledge,” in Proc. of Document Understanding [107] Q. Zhou, N. Yang, F. Wei, S. Huang, M. Zhou, and T. Zhao, “Neural doc-
Conference, 2007. ument summarization by jointly learning to score and select sentences,”
[87] K. Toutanova, C. Brockett, M. Gamon, J. Jagarlamudi, H. Suzuki, and arXiv preprint arXiv:1807.02305, 2018.
L. Vanderwende, “The pythy summarization system: Microsoft research [108] M. Koupaee and W. Y. Wang, “Wikihow: A large scale text summariza-
at duc 2007,” in Proc. of DUC, vol. 2007, 2007. tion dataset,” arXiv preprint arXiv:1810.09305, 2018.
[88] M. Afsharizadeh, H. Ebrahimpour-Komleh, and A. Bagheri, “Query- [109] J. Zhang, Y. Zhao, M. Saleh, and P. Liu, “Pegasus: Pre-training with
oriented text summarization using sentence extraction technique,” in extracted gap-sentences for abstractive summarization,” in International
2018 4th International Conference on Web Research (ICWR). IEEE, Conference on Machine Learning. PMLR, 2020, pp. 11 328–11 339.
2018, pp. 128–132. [110] I. Mani, G. Klein, D. House, L. Hirschman, T. Firmin, and B. Sundheim,
[89] N. Madnani, D. Zajic, B. Dorr, N. F. Ayan, and J. Lin, “Multiple “Summac: a text summarization evaluation,” Natural Language Engi-
alternative sentence compressions for automatic text summarization,” in neering, vol. 8, no. 1, pp. 43–68, 2002.
Proceedings of the 2007 Document Understanding Conference (DUC- [111] I. Mani, D. House, G. Klein, L. Hirschman, T. Firmin, and B. M.
2007) at NLT/NAACL, 2007, p. 24. Sundheim, “The tipster summac text summarization evaluation,” in Ninth
[90] F. T. AL-Khawaldeh and V. W. Samawi, “Lexical cohesion and entail- Conference of the European Chapter of the Association for Computa-
ment based segmentation for arabic text summarization (lceas).” World tional Linguistics, 1999, pp. 77–85.
of Computer Science & Information Technology Journal, vol. 5, no. 3, [112] S. Azzam, K. Humphreys, and R. Gaizauskas, “Using coreference chains
2015. for text summarization,” in Coreference and Its Applications, 1999.
[91] R. Z. Al-Abdallah and A. T. Al-Taani, “Arabic single-document text [113] T. Fukusima and M. Okumura, “Text summarization challenge text
summarization using particle swarm optimization algorithm,” Procedia summarization evaluation in japan,” in North American Association
Computer Science, vol. 117, pp. 30–37, 2017. for Computational Linguistics (NAACL2001), Workshop on Automatic
[92] H. Oufaida, O. Nouali, and P. Blache, “Minimum redundancy and Summarization, 2001, pp. 51–59.
maximum relevance for single and multi-document arabic text summa- [114] E. Hovy and D. Marcu, “Automated text summarization,” The Oxford
rization,” Journal of King Saud University-Computer and Information Handbook of computational linguistics, vol. 583598, 2005.
Sciences, vol. 26, no. 4, pp. 450–461, 2014.
[115] T. Nomoto and Y. Matsumoto, “A new approach to unsupervised text
[93] L. d. S. Cabral, R. D. Lins, R. F. Mello, F. Freitas, B. Ávila, S. Simske, summarization,” in Proceedings of the 24th annual international ACM
and M. Riss, “A platform for language independent summarization,” in SIGIR conference on Research and development in information retrieval,
Proceedings of the 2014 ACM symposium on Document engineering, 2001, pp. 26–34.
2014, pp. 203–206.
[116] O. Tas and F. Kiyani, “A survey automatic text summarization,” Pres-
[94] R. Mihalcea, “Language independent extractive summarization,” in ACL,
sAcademia Procedia, vol. 5, no. 1, pp. 205–213, 2007.
vol. 5, 2005, pp. 49–52.
[117] H. Saggion and T. Poibeau, “Automatic text summarization: Past, present
[95] D. R. Amancio, M. G. Nunes, O. N. Oliveira Jr, and L. d. F. Costa,
and future,” in Multi-source, multilingual information extraction and
“Extractive summarization using complex networks and syntactic depen-
summarization. Springer, 2013, pp. 3–21.
dency,” Physica A: Statistical Mechanics and its Applications, vol. 391,
[118] H. Zhang and J. Wang, “Semantic wordrank: Generating finer single-
no. 4, pp. 1855–1864, 2012.
document summarizations,” in International Conference on Intelligent
[96] T. A. S. Pardo, L. Antiqueira, M. d. G. V. Nunes, O. N. Oliveira,
Data Engineering and Automated Learning. Springer, 2018, pp. 398–
and L. D. F. Costa, “Using complex networks for language processing:
409.
The case of summary evaluation,” in 2006 International Conference on
Communications, Circuits and Systems, vol. 4. IEEE, 2006, pp. 2678– [119] L. Shao, H. Zhang, and J. Wang, “Robust single-document summariza-
2682. tions and a semantic measurement of quality,” in International Joint Con-
[97] L. H. M. Rino, T. A. S. Pardo, C. N. Silla, C. A. A. Kaestner, and ference on Knowledge Discovery, Knowledge Engineering, and Knowl-
M. Pombo, “A comparison of automatic summarizers of texts in brazilian edge Management. Springer, 2017, pp. 118–138.
portuguese,” in Brazilian Symposium on Artificial Intelligence. Springer, [120] K. Ganesan, C. Zhai, and J. Han, “Opinosis: A graph based approach to
2004, pp. 235–244. abstractive summarization of highly redundant opinions,” 2010.
[98] S. Liu, M. X. Zhou, S. Pan, Y. Song, W. Qian, W. Cai, and X. Lian, [121] M. Kågebäck, O. Mogren, N. Tahmasebi, and D. Dubhashi, “Extractive
“Tiara: Interactive, topic-based visual text summarization and analysis,” summarization using continuous vector space models,” in Proceedings
ACM Transactions on Intelligent Systems and Technology (TIST), vol. 3, of the 2nd Workshop on Continuous Vector Space Models and their
no. 2, pp. 1–28, 2012. Compositionality (CVSC), 2014, pp. 31–39.
[99] M. Yousefi-Azar and L. Hamey, “Text summarization using unsupervised [122] A. Padmakumar and A. Saran, “Unsupervised text summarization us-
deep learning,” Expert Systems with Applications, vol. 68, pp. 93–105, ing sentence embeddings,” Dept. Comput. Sci., Univ. Texas, Austin,
2017. USA, Tech. Rep.[Online]. Available: https://www. cs. utexas. edu/˜
[100] J. Ulrich, G. Murray, and G. Carenini, “A publicly available annotated aish/ut/NLPProject. pdf, 2016.
corpus for supervised email summarization,” in Proc. of aaai email-2008 [123] D. Gillick, B. Favre, and D. Hakkani-Tür, “The icsi summarization
workshop, chicago, usa, 2008. system at tac 2008.” in Tac, 2008.
[101] N. Alami, M. Meknassi, and N. En-nahnahi, “Enhancing unsupervised [124] D. Galanis and P. Malakasiotis, “Aueb at tac 2008.” in TAC, 2008.
neural networks based text summarization with word embedding and [125] A. Bawakid and M. Oussalah, “A semantic summarization system: Uni-
ensemble learning,” Expert systems with applications, vol. 123, pp. 195– versity of birmingham at tac 2008.” in TAC. Citeseer, 2008.
211, 2019. [126] M. Bhandari, P. Gour, A. Ashfaq, P. Liu, and G. Neubig, “Re-evaluating
[102] M. Yang, X. Wang, Y. Lu, J. Lv, Y. Shen, and C. Li, “Plausibility- evaluation in text summarization,” arXiv preprint arXiv:2010.07100,
promoting generative adversarial network for abstractive text summa- 2020.
rization with multi-task constraint,” Information Sciences, vol. 521, pp. [127] D. Yogatama, F. Liu, and N. A. Smith, “Extractive summarization by
46–61, 2020. maximizing semantic volume,” in Proceedings of the 2015 Conference
[103] H. Zhang, J. Xu, and J. Wang, “Pretraining-based natural language on Empirical Methods in Natural Language Processing, 2015, pp. 1961–
generation for text summarization,” arXiv preprint arXiv:1902.09243, 1966.
2019. [128] M. Peyrard, “Studying summarization evaluation metrics in the appro-
[104] U. Khandelwal, K. Clark, D. Jurafsky, and L. Kaiser, “Sample efficient priate scoring range,” in Proceedings of the 57th Annual Meeting of the
text summarization using a single pre-trained transformer,” arXiv preprint Association for Computational Linguistics, 2019, pp. 5093–5100.
arXiv:1905.08836, 2019. [129] L. Marujo, W. Ling, R. Ribeiro, A. Gershman, J. Carbonell, D. M.
[105] C. Li, W. Xu, S. Li, and S. Gao, “Guiding generation for abstractive text de Matos, and J. P. Neto, “Exploring events and distributed represen-
summarization based on key information guide network,” in Proceedings tations of text in multi-document summarization,” Knowledge-Based
of the 2018 Conference of the North American Chapter of the Association Systems, vol. 94, pp. 33–42, 2016.
[130] V. Varma, P. Pingali, R. Katragadda, S. Krishna, S. Ganesh, K. Sarv- [155] P. Li, W. Lam, L. Bing, and Z. Wang, “Deep recurrent generative decoder
abhotla, H. Garapati, H. Gopisetty, V. B. Reddy, K. Reddy et al., “Iiit for abstractive text summarization,” arXiv preprint arXiv:1708.00625,
hyderabad at tac 2009.” in TAC, 2009. 2017.
[131] J.-P. Ng and V. Abrecht, “Better summarization evaluation with word [156] Z. Li, Z. Peng, S. Tang, C. Zhang, and H. Ma, “Text summarization
embeddings for rouge,” arXiv preprint arXiv:1508.06034, 2015. method based on double attention pointer network,” IEEE Access, vol. 8,
[132] D. Gillick, B. Favre, D. Hakkani-Tür, B. Bohnet, Y. Liu, and S. Xie, “The pp. 11 279–11 288, 2020.
icsi/utd summarization system at tac 2009.” in Tac. Citeseer, 2009. [157] Z. Liang, J. Du, and C. Li, “Abstractive social media text summarization
[133] K. Hong, M. Marcus, and A. Nenkova, “System combination for multi- using selective reinforced seq2seq attention model,” Neurocomputing,
document summarization,” in Proceedings of the 2015 conference on vol. 410, pp. 432–440, 2020.
empirical methods in natural language processing, 2015, pp. 107–117. [158] S. Ma and X. Sun, “A semantic relevance based neural net-
[134] H. Ji, R. Grishman, H. T. Dang, K. Griffitt, and J. Ellis, “Overview of work for text summarization and text simplification,” arXiv preprint
the tac 2010 knowledge base population track,” in Third text analysis arXiv:1710.02318, 2017.
conference (TAC 2010), vol. 3, no. 2, 2010, pp. 3–3. [159] A. Joshi, E. Fidalgo, E. Alegre, and L. Fernández-Robles, “Summcoder:
[135] J.-Y. Delort and E. Alfonseca, “Dualsum: a topic-model based approach An unsupervised framework for extractive text summarization based on
for update summarization,” in Proceedings of the 13th Conference of deep auto-encoders,” Expert Systems with Applications, vol. 129, pp.
the European Chapter of the Association for Computational Linguistics, 200–215, 2019.
2012, pp. 214–223.
[160] L.-W. Ku, Y.-T. Liang, H.-H. Chen et al., “Opinion extraction, summa-
[136] F. Jin, M. Huang, and X. Zhu, “The thu summarization systems at tac
rization and tracking in news and blog corpora.” in AAAI spring sym-
2010.” in TAC. Citeseer, 2010.
posium: Computational approaches to analyzing weblogs, vol. 100107,
[137] A. Kennedy, A. Kazantseva, D. Inkpen, and S. Szpakowicz, “Getting
2006, pp. 1–167.
emotional about news summarization,” in Canadian conference on ar-
[161] M. Hu, A. Sun, and E.-P. Lim, “Comments-oriented blog summarization
tificial intelligence. Springer, 2012, pp. 121–132.
by sentence extraction,” in Proceedings of the sixteenth ACM conference
[138] W. M. Darling and F. Song, “Pathsum: A summarization framework
on Conference on information and knowledge management, 2007, pp.
based on hierarchical topics,” on Automatic Text Summarization 2011,
901–904.
p. 5, 2011.
[139] T. Makino, H. Takamura, and M. Okumura, “Balanced coverage of [162] G. Kim, S. Moon, and L. Sigal, “Joint photo stream and blog post
aspects for text summarization,” in Proceedings of the 21st ACM inter- summarization and exploration,” in Proceedings of the IEEE Conference
national conference on Information and knowledge management, 2012, on Computer Vision and Pattern Recognition, 2015, pp. 3081–3089.
pp. 1742–1746. [163] S. Petrov, D. Das, and R. McDonald, “A universal part-of-speech tagset,”
[140] G. Giannakopoulos, M. El-Haj, B. Favre, M. Litvak, J. Steinberger, and arXiv preprint arXiv:1104.2086, 2011.
V. Varma, “Tac 2011 multiling pilot overview,” 2011. [164] J. R. Méndez, E. L. Iglesias, F. Fdez-Riverola, F. Díaz, and J. M.
[141] G. Algaphari, F. M. Ba-Alwi, and A. Moharram, “Text summarization us- Corchado, “Tokenising, stemming and stopword removal on anti-spam
ing centrality concept,” International Journal of Computer Applications, filtering domain,” in Conference of the Spanish Association for Artificial
vol. 79, no. 1, 2013. Intelligence. Springer, 2005, pp. 449–458.
[142] K. Owczarzak, J. Conroy, H. T. Dang, and A. Nenkova, “An assess- [165] A. Mansouri, L. S. Affendey, and A. Mamat, “Named entity recognition
ment of the accuracy of automatic evaluation in summarization,” in approaches,” International Journal of Computer Science and Network
Proceedings of workshop on evaluation metrics and system comparison Security, vol. 8, no. 2, pp. 339–344, 2008.
for automatic summarization, 2012, pp. 1–9. [166] G. Gupta and S. Malhotra, “Text document tokenization for word fre-
[143] A. Dean-Hall, C. L. Clarke, J. Kamps, P. Thomas, N. Simone, and quency count using rapid miner (taking resume as an example),” Int. J.
E. Voorhees, “Overview of the trec 2013 contextual suggestion track,” Comput. Appl, vol. 975, p. 8887, 2015.
WATERLOO UNIV (ONTARIO), Tech. Rep., 2013. [167] T. Verma, R. Renu, and D. Gaur, “Tokenization and filtering process
[144] F. Dernoncourt, M. Ghassemi, and W. Chang, “A repository of corpora for in rapidminer,” International Journal of Applied Information Systems,
summarization,” in Proceedings of the Eleventh International Conference vol. 7, no. 2, pp. 16–18, 2014.
on Language Resources and Evaluation (LREC 2018), 2018. [168] K. Darwish, “Named entity recognition using cross-lingual resources:
[145] R. McCreadie, C. Macdonald, and I. Ounis, “Incremental update summa- Arabic as an example,” in Proceedings of the 51st Annual Meeting of
rization: Adaptive sentence selection based on prevalence and novelty,” the Association for Computational Linguistics (Volume 1: Long Papers),
in Proceedings of the 23rd ACM international conference on conference 2013, pp. 1558–1567.
on information and knowledge management, 2014, pp. 301–310. [169] D. L. Whitney and B. W. Evans, “Abbreviations for names of rock-
[146] A. Kanapala, S. Pal, and R. Pamula, “Text summarization from legal forming minerals,” American mineralogist, vol. 95, no. 1, pp. 185–187,
documents: a survey,” Artificial Intelligence Review, vol. 51, no. 3, pp. 2010.
371–402, 2019. [170] B. Pahwa, S. Taruna, and N. Kasliwal, “Sentiment analysis-strategy for
[147] N. Moratanch and S. Chitrakala, “A survey on abstractive text sum- text pre-processing,” Int. J. Comput. Appl, vol. 180, pp. 15–18, 2018.
marization,” in 2016 International Conference on Circuit, power and
[171] J. Schaback and F. Li, “Multi-level feature extraction for spelling correc-
computing technologies (ICCPCT). IEEE, 2016, pp. 1–7.
tion,” in IJCAI-2007 Workshop on Analytics for Noisy Unstructured Text
[148] Y. Zhao, F. Yao, H. Sun, and Z. Yang, “Bjut at trec 2014 temporal
Data. Citeseer, 2007, pp. 79–86.
summarization track,” BEIJING UNIVERSTIY OF TECHNOLOGY
[172] J. Plisson, N. Lavrac, D. Mladenic et al., “A rule based approach to word
(CHINA), Tech. Rep., 2014.
lemmatization,” in Proceedings of IS, vol. 3, 2004, pp. 83–86.
[149] H. Li, Z. Yang, Y. Lai, L. Duan, and K. Fan, “Bjut at trec 2014
contextual suggestion track: Hybrid recommendation based on open-web [173] T. Korenius, J. Laurikkala, K. Järvelin, and M. Juhola, “Stemming and
information,” BEIJING UNIVERSTIY OF TECHNOLOGY (CHINA), lemmatization in the clustering of finnish text documents,” in Proceed-
Tech. Rep., 2014. ings of the thirteenth ACM international conference on Information and
[150] M. Aliannejadi, S. A. Bahrainian, A. Giachanou, and F. Crestani, “Uni- knowledge management, 2004, pp. 625–633.
versity of lugano at trec 2015: Contextual suggestion and temporal [174] J. L. Neto, A. A. Freitas, and C. A. Kaestner, “Automatic text summa-
summarization tracks.” in TREC, 2015. rization using a machine learning approach,” in Brazilian symposium on
[151] E. Agichtein, D. Carmel, D. Pelleg, Y. Pinter, and D. Harman, “Overview artificial intelligence. Springer, 2002, pp. 205–215.
of the trec 2015 liveqa track.” in TREC, 2015. [175] H. Christian, M. P. Agus, and D. Suhartono, “Single document automatic
[152] N. Moratanch and S. Chitrakala, “A survey on extractive text summariza- text summarization using term frequency-inverse document frequency
tion,” in 2017 international conference on computer, communication and (tf-idf),” ComTech: Computer, Mathematics and Engineering Applica-
signal processing (ICCCSP). IEEE, 2017, pp. 1–6. tions, vol. 7, no. 4, pp. 285–294, 2016.
[153] B. Hu, Q. Chen, and F. Zhu, “Lcsts: A large scale chinese short text [176] M. A. Fattah and F. Ren, “Ga, mr, ffnn, pnn and gmm based models for
summarization dataset,” arXiv preprint arXiv:1506.05865, 2015. automatic text summarization,” Computer Speech & Language, vol. 23,
[154] L. Wang, J. Yao, Y. Tao, L. Zhong, W. Liu, and Q. Du, “A reinforced no. 1, pp. 126–144, 2009.
topic-aware convolutional sequence-to-sequence model for abstractive [177] ——, “Automatic text summarization,” World Academy of Science, Engi-
text summarization,” arXiv preprint arXiv:1805.03616, 2018. neering and Technology, vol. 37, no. 2, p. 192, 2008.
[178] B. Mutlu, E. A. Sezer, and M. A. Akcayol, “Multi-document extractive [200] S. Villata et al., “Using argument mining for legal text summarization,”
text summarization: A comparative assessment on features,” Knowledge- in Legal Knowledge and Information Systems: JURIX 2020: The Thirty-
Based Systems, vol. 183, p. 104848, 2019. third Annual Conference, Brno, Czech Republic, December 9-11, 2020,
[179] S. Babar and P. D. Patil, “Improving performance of text summarization,” vol. 334. IOS Press, 2020, p. 184.
M. F. MRIDHA (Senior Member, IEEE) is currently working as an associate professor in the Department of Computer Science and Engineering of the Bangladesh University of Business and Technology. He also worked as a CSE department faculty member at the University of Asia Pacific and as a graduate coordinator from 2012 to 2019. He received his Ph.D. in AI/ML from Jahangirnagar University in 2017. He joined the Department of Computer Science and Engineering, Stamford University Bangladesh, as a lecturer in June 2007, was promoted to senior lecturer in the same department in October 2010 and to assistant professor in October 2011, and then joined UAP as an assistant professor in May 2012. His research experience, in both academia and industry, has resulted in over 80 journal and conference publications. For more than ten years, he has supervised the thesis work of master's and undergraduate students. His research interests include artificial intelligence (AI), machine learning, deep learning, natural language processing (NLP), and big data analysis. He has served as a program committee member of several international conferences and workshops and as an associate editor of several journals.

SUJOY CHANDRA DAS is a Computer Science student at the Bangladesh University of Business and Technology. He is determined, communicative, and sincere in his work. He is currently working as an assistant researcher in the Advanced Machine Learning lab. He has good communication and presentation skills, has experience with front-end development, TensorFlow, Keras, Matplotlib, etc., and is interested in deep learning research. He is currently researching advanced driver assistance systems and automatic text summarization.

AKLIMA AKTER LIMA is a Computer Science student at the Bangladesh University of Business and Technology. She is well organized and determined in her work. She is currently working as an assistant researcher in the Advanced Machine Learning lab. She has experience with TensorFlow, Keras, Matplotlib, etc., and is interested in machine learning and deep learning research. She is currently researching advanced driver assistance systems, the stock exchange, and automatic text summarization.

MAHMUD HASAN is currently a Ph.D. candidate at the Department of Computer Science, Western University, Canada. He completed his bachelor's degree in computer science at Chittagong University of Engineering & Technology in 2011. He started his professional career as a lecturer at Stamford University Bangladesh and later joined the Bangladesh University of Textiles as a lecturer. He received his M.Sc. in Computer Science from Western University in 2014. He has worked as a research software developer at the Robarts Research Institute, as a staff software developer in the R&D unit of IBM Watson Health Imaging, and as a principal investigator at Acceo Tender Retail. He is also working as a research analyst at the Department of Medical Biophysics, Western University. Across these roles, he has worked closely on digital and medical image analysis, understanding, segmentation, registration, compression, classification, denoising, and computer vision. His primary focus is biomedical image analysis using deep learning, and he is also interested in other application areas of machine learning, such as NLP. He has a proven track record of publications in quality journals and talks presented at many conferences.