
Review of recent techniques for extractive text summarization

Article in Journal of Theoretical and Applied Information Technology · December 2018


All content following this page was uploaded by Ahmed El-Refaiy on 26 February 2019.


Journal of Theoretical and Applied Information Technology
15th December 2018. Vol.96. No 23
© 2005 – ongoing JATIT & LLS

ISSN: 1992-8645 www.jatit.org E-ISSN: 1817-3195

REVIEW OF RECENT TECHNIQUES FOR EXTRACTIVE TEXT SUMMARIZATION

¹AHMED ELREFAIY, ²AHMED RAFAT ABAS, ³IBRAHIM ELHENAWY
¹Teaching Assistant, Department of Computer Science, Faculty of Computers and Informatics, Zagazig University, 44519, Egypt
²Lecturer, Department of Computer Science, Faculty of Computers and Informatics, Zagazig University, 44519, Egypt
³Professor, Department of Computer Science, Faculty of Computers and Informatics, Zagazig University, 44519, Egypt
E-mail: [email protected]

ABSTRACT

The volume of information available on the internet keeps growing, and with it the information overload and redundancy contained in each document. Extracting the important information in summarized form would help many users, so well-prepared summaries are needed. Consequently, new approaches for automatically summarizing text are proposed continuously. “Automatic Text Summarization” is the process of creating a shorter version of the original text (one or more documents) that conveys the information present in the documents. In general, text summaries can be categorized into two types: extractive and abstractive. Abstractive methods are very complicated because they must handle natural language at a large scale. Therefore, research communities are focusing on extractive summaries, attempting to achieve more consistent, non-redundant and meaningful summaries. This review provides a detailed survey of extractive text summarization techniques. Specifically, it focuses on unsupervised techniques, presents recent efforts and advances on them, and lists their strengths and weaknesses in comparative tables. In addition, this review highlights efforts made in summary evaluation techniques and finally deduces some possible future trends.
Keywords: Extractive Text Summarization - Summarization Review - Artificial Intelligence - Information Retrieval - Natural Language Processing

1. INTRODUCTION

Text summarization is the process of creating a shorter version of one or more documents that conveys the information in these documents. It produces a summary that reduces repetition while retaining a large part of the information in the original text. A summary is therefore a tool that helps a user efficiently find useful information in a vast amount of information [1]. Text summarization started in the late fifties [2], and there has been considerable improvement in this area since; a large number of techniques have been proposed [3], [4].

However, automatic text summarization is still a challenging task, made more complex by issues such as the degree of redundancy and the compression ratio, which arise more when summarizing multiple documents than a single document [5]. Furthermore, in recent years research has sought to overcome the lack of coherence in summaries, resulting in common approaches that identify relevant content and integrate it into new pieces of information [6], [7].

Another important aspect of text summarization relates to its evaluation. Several methods have been presented to automatically evaluate summaries so that the results correlate well with human evaluation. However, this is also a major challenge because it
is not clear, even to human beings, what kind of information the summary should contain [8].

The two basic types of text summarization are abstractive and extractive [1]. An extractive summary extracts the important and meaningful sentences from the original text and places them into the summary without any changes. An abstractive summary does not rely on concatenating sentences; instead, it analyzes the original text semantically to understand it and builds a more coherent and meaningful summary. The sentences in the summary may not be present in the original text. An abstractive summary is more generalized, but it is difficult to compute.

Many researchers have presented comprehensive surveys about text summarization. Some of them still focus on improving extractive text summarization, while others move toward abstractive summarization. Previous analyses of extractive text summarization presented elaborate studies of well-known approaches, recently discussed types or evaluation techniques in order to gain knowledge about the key issues of text summarization. In this paper, extractive text summarization techniques are classified into new categories including supervised, semi-supervised and unsupervised. We focus on unsupervised techniques, present state-of-the-art efforts and advances on them, and list their strengths and weaknesses in comparative tables. In addition, we highlight efforts made in the evaluation task and finally deduce some possible future trends. To our knowledge, we are the first to present such a study for the unsupervised field in extractive text summarization.

In recent years, progress has been made in text summarization in various aspects, leading to the appearance of different subtypes under the two basic types. Based on the summarization purpose, type of details or style of output, the summarization can be Indicative, Informative or Critical [9], [10]. An Indicative summary presents the main idea of the entire document and gives the user a quick view of the original text, so it may not contain all the important factual content. An Informative summary expresses the important information of the original text concisely to the user. In a Critical summary the document is criticized; for example, in the case of a scientific paper, it can express an opinion [10]. The most feasible type to automate is the Indicative summary and the least feasible is the Critical one. Like a Critical summary, which can express an opinion about a document, scientific paper, etc., a Sentiment-based summary generates summaries in the manner of opinion mining, according to people's feelings towards the subject, product or entities. Many researchers have presented comprehensive surveys about this type of summarization [29], [30].

Based on the content type of the original text, summarization may be considered Generic or Query-based [11], [12], [13], [9], [14]. In a Generic summary, the extracted information is not user-specific and does not rely on the document subject. In Query-based summarization, the generated summary is based on the user query, so it presents the user's view. Query-based summaries can also be called topic-focused or user-focused summaries.

Based on the limitation of the input text, summarization systems can be Genre-specific, Domain-dependent or Domain-independent [9]. Genre-specific systems accept only specific input types, such as stories, newspaper articles, etc. Domain-dependent systems deal with text whose subject is defined in a fixed domain. Domain-independent systems can accept any type of text, as they do not rely on the domain.

Based on the number of input documents, where the system input can be one or more documents [9], summarization can be divided into Single-Document or Multi-Document Summarization [7], [15]. In Single-Document Summarization, the summary is built from one document only, whereas in Multi-Document Summarization it is built from more than one document, all of them on the same topic. Multi-document summarization may suffer from issues such as redundancy, sentence ordering, temporal dimension and co-reference, which make this task more difficult than single-document summarization [5]. The most prominent issue, which appears more often in the multi-document task, is redundancy. There are some attempts to tackle this problem, such as selecting the sentences at the beginning of the paragraph and then measuring the similarity of the following sentence with the sentences already chosen; this sentence is retained only if it contains new related content [16]. The Maximal Marginal Relevance approach was introduced in 1998 [17]. Other methods have been suggested by researchers trying to achieve the best results in multi-document summarization [18], [13], [19], [20], [21], [22].

Based on the language of the text that the system can accept, summarization can be a Mono-Lingual System, a Multi-Lingual System or a Cross-Lingual
System. A Mono-Lingual System deals with documents in a specific language, and the produced summary is in that language. In a Multi-Lingual System, the source documents are in more than one language and the generated summaries are in these different languages. In a Cross-Lingual System, the input document is in a specific language and the output is in a different language.

Based on the level of linguistic space, summarization approaches can be either Shallow or Deeper [23]. Shallow approaches are limited to a syntactic representation and try to extract the prominent parts of the text. Deeper approaches work on a semantic representation and basically depend on linguistic processing during extraction.

Every year the number of Web pages increases significantly, and search engines return a list of web pages as the result of a single search query. Users usually have to go through multiple pages to find out which documents are relevant and which are not, and they often abandon the search after the first attempt. Therefore, it is important to generate summaries that pick up the important information in web pages. Such summaries are web-based summaries. WebInEssence is a search engine that can generate summaries from clusters of related documents [24]. Because of the e-mail overload problem, which arises when e-mails keep arriving in the inbox and reading or archiving them consumes a great deal of time, there is a need to summarize e-mail conversations. This type of summarization is called E-mail-based summarization.

Summarization can also be Personalized, which generates a summary of the information related to the user's interests. Therefore, the summarization system needs to keep track of the user profile to be able to determine the relevant information that the user is interested in. The user profile can be determined by a statistical mapping method from personal characteristics such as gender together with some other features [25]. Other methods using this type of summarization have been suggested by researchers [26], [27]. An Update-based summary is generated by acquiring the latest updates related to the topic, taking into consideration that users already have fundamental knowledge of the subject [28]. Survey summaries are another kind, which present a long overview of a specific subject or entity, trying to gather the most significant facts about an entity, person, place, etc. Survey summaries include these types: Wikipedia articles, survey summaries and biographical summaries [31].

2. EXTRACTIVE TEXT SUMMARIZATION BACKGROUND

Extractive text summarization is done by picking the most important sentences from the original text to form the final summary. Extractive techniques generally generate summaries through three phases: a preprocessing step, a processing step and a generation step:
1) Preprocessing step: the dimensionality of the representation space of the original text is reduced to produce a new structured representation. It usually includes:
a. Stop-word elimination: common words without semantics that do not carry information relevant to the task (for example, "the", "a", "an", "in") are eliminated.
b. Stemming: obtain the stem of each word by reducing the word to its base form.
c. Part-of-speech tagging: the process of identifying and classifying the words of the text according to the part-of-speech category they belong to (nouns, verbs, adverbs, adjectives).

Another technique used here is case folding, in which all characters are converted to the same letter case, either lower case or upper case [23]. However, this technique is not appropriate for documents in domains where, for example, the appearance of an upper-case word in a sentence increases its importance [32]. Finally in this phase, the sentences are analyzed and transformed into features to be ready for the next stage. The sentences are analyzed on the basis of statistical, linguistic or hybrid features: statistical features do not take word meanings into consideration, whereas linguistic features go deeper to capture semantic meaning. Each sentence in the document is represented in terms of these features so that we can determine whether it is important enough to be included in the summary. Table 1 below shows the common features of extractive text summarization, and Table 2 below compares its statistical and linguistic features.

2) Processing step: an algorithm uses the features generated in the preprocessing step to convert the text structure into the summary structure. In this step, the sentences are scored.
3) Generation step: the sentences are ranked. Then, the most important sentences are picked from
Table 1: Extractive Text Summarization Common Features

| Feature | Description | Comments |
|---|---|---|
| Sentence Position | Important sentences tend to appear at specific positions, such as the first or last position. | E.g., value = 1 for the first or last position. Otherwise, an equation can be used for the remaining positions so that they take values between 0 and 1 [36], [37]. |
| Title Similarity | The sentence is considered important if it is similar to the document title. This similarity can be calculated with the cosine similarity measure. | Cannot be used with documents without a title. |
| Similarity to Keywords | The similarity between each sentence and a set of keywords is computed with the cosine similarity measure. | Can be used with query-based summarization. |
| Sentence Length | Sentences with a specific length are considered important. | Generally, very short and very long sentences get small values, as they are not suitable for the summary [38]. |
| Term Frequency | Terms that occur repeatedly increase the score of their sentences. It reflects how important the word is for the document. | The term in the term frequency feature can take several forms, such as a unique term or word, a bigram or a trigram [37]. It can be calculated as the number of occurrences of the term. The most used term frequency measures are TF-IDF [39], [38] and TF-ISF [25], [37]. TF-IDF or TF-ISF means that the terms in a unit (e.g. document or sentence) are important only if they do not appear frequently in the whole collection of units. |
| Cue Method | Words that have a positive or negative effect on sentence weight. | Such as: in conclusion, in summary [38]. |
| Proper Noun | Sentences that contain proper nouns are considered important. | Such as: names of persons, organizations or places [37], [38]. |
| Sentence to Sentence Similarity | The similarity between each sentence and all other sentences is calculated, added up and then normalized [37]. | This feature employs the concept of text coherence [25]. |
| Sentence to Centroid Similarity | The centroid sentence is calculated first. Then the similarity between each sentence and the centroid sentence is calculated [37]. | This feature employs the concept of text coherence [25]. For example, the centroid sentence is calculated on the basis of the TF-ISF feature: the sentence with the highest TF-ISF value is considered the centroid [37]. |
| Numerical Data | The appearance of such data in a sentence can reflect important statistics and can increase its chance of being selected for the summary. | [32], [37] |
| Presence of Special Characters or Words | Some of them give the sentences a lower probability of being selected, such as the presence of brackets. Others give the sentences a higher probability, such as the presence of commas, inverted commas, acronyms and upper-case words. | [32], [40] |
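A few of the statistical features in Table 1 can be sketched in Python as follows. This is only an illustrative sketch: the tokenizer, the linear decay used for interior sentence positions and the averaging in `tf_isf` are assumptions made for the example, not formulas taken from the cited works.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic tokens only (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def position_score(index, n_sentences):
    """1.0 for the first or last sentence; interior sentences decay
    linearly toward the middle (hypothetical formula, values in (0, 1))."""
    if n_sentences <= 1 or index in (0, n_sentences - 1):
        return 1.0
    distance_to_edge = min(index, n_sentences - 1 - index)
    return 1.0 - distance_to_edge / (n_sentences / 2)


def cosine(tokens_a, tokens_b):
    """Cosine similarity between two bags of words."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def tf_isf(sentences_tokens):
    """Mean of tf * log(N / sf) per sentence, where sf is the number of
    sentences that contain the term (inverse sentence frequency)."""
    n = len(sentences_tokens)
    sf = Counter(t for toks in sentences_tokens for t in set(toks))
    scores = []
    for toks in sentences_tokens:
        tf = Counter(toks)
        total = sum(tf[t] * math.log(n / sf[t]) for t in tf)
        scores.append(total / len(toks) if toks else 0.0)
    return scores


title = "Extractive text summarization"
sentences = [
    "Extractive summarization selects sentences from the text.",
    "The weather was pleasant yesterday.",
    "Selected sentences form the summary.",
]
tokens = [tokenize(s) for s in sentences]
title_sim = [cosine(t, tokenize(title)) for t in tokens]
pos = [position_score(i, len(sentences)) for i in range(len(sentences))]
isf = tf_isf(tokens)
```

In this toy input, the first sentence shares words with the title and therefore gets a non-zero title similarity, while the second sentence shares none. A word such as "the", which occurs in every sentence, contributes nothing to the TF-ISF score because log(N / sf) is zero for it.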

Table 2: Comparison between Extractive Text Summarization Statistical and Linguistic Features

| Type | Statistical features | Linguistic features |
|---|---|---|
| Description | Do not take word meanings into consideration; instead, they try to analyze and extract sentences using statistical features only. | Go deeper to capture the semantic connections between words and use linguistic knowledge. They identify term relationships through part-of-speech tagging, grammar analysis and other techniques. |
| Examples | Term frequency, sentence length and position, cue method, title method, etc. | Lexical chain, WordNet, transition relationship, anaphoric relationship, etc. |
| Advantages | Efficient to compute. | Based on semantic meaning. Generate better summary results. |
| Disadvantages | Lack semantic meaning. | Computation takes more time than for statistical features. More difficult to compute than statistical features. |
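The main disadvantage of statistical features listed in Table 2, the lack of semantic meaning, can be illustrated with a small Python sketch. It applies case folding and stop-word elimination and then compares sentences with a bag-of-words cosine similarity. The stop-word list and the example sentences are assumptions chosen only for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not a standard resource).
STOP_WORDS = {"the", "a", "an", "in", "is"}


def content_words(sentence):
    """Case-fold, tokenize and drop stop words."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def bow_cosine(s1, s2):
    """Cosine similarity on raw word counts; no word meanings involved."""
    a, b = Counter(content_words(s1)), Counter(content_words(s2))
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Same meaning, different words: a purely statistical measure sees no overlap.
synonyms = bow_cosine("The car is fast", "The automobile is quick")
# Same words in a different order: the measure sees them as identical.
same_words = bow_cosine("The car is fast", "A fast car")
```

Here `synonyms` is zero although both sentences say the same thing, which is the kind of gap that linguistic features such as lexical chains or WordNet relations are meant to close, at a higher computational cost.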

the ranked structure to generate the final required summary.
The last two stages, the processing and generation steps, can also be described approximately as three main components: sentence scoring, selection and paraphrasing (reformulation).
At sentence scoring, each sentence is assigned a score that indicates its significance. After that, the most important sentences are extracted. Sentence scoring can be done via several approaches: supervised, semi-supervised or unsupervised (cf. Sect. 4). At sentence selection, the summarization system has to determine the best collection of significant sentences to form the final summary, taking into consideration the most prominent factors: redundancy and cohesion. The traditional method for sentence selection is to pick the top-ranked sentences directly, but redundancy elimination is the key issue, especially for multi-document summarization. More than one approach is used for sentence selection. For instance, Maximal Marginal Relevance (MMR) is the most popular approach for this task [17]; it uses a linear combination of independently measured relevance and novelty. Other approaches are based on the Kullback–Leibler (KL) divergence, in which sentences are selected so as to decrease the KL divergence between the word probability distribution of the candidate summary and the word probability distribution of the input [38], [39]. Because minimizing the KL divergence exactly is mathematically intractable, it is optimized via greedy selection.

At sentence paraphrasing (reformulation), the sentences selected to form the summary are modified or reformulated in order to enhance the summary, provide more cohesion and clarity and also eliminate redundant or unnecessary information, for example through reformulation and sentence fusion [6].

The main phases of the summarization process can also be viewed as the following three main subtasks: topic identification, interpretation and finally summary generation [40].

3. TEXT SUMMARIZATION EVALUATION

Performance measurement (evaluation) of automatic summaries is a challenging task. Because manual evaluation is difficult and time consuming, many techniques have been developed to automate the evaluation task. Evaluation can be performed in two ways:
1. Extrinsic evaluation: the summary is evaluated based on how much it helps other tasks. It includes several methods, such as:
a. Relevance assessment: it evaluates the relevance of a topic in the summary or the original text.
b. Reading comprehension: it measures the ability to correctly answer multiple-choice questions after reading the summary.

2. Intrinsic evaluation: it depends on human judgment, as it evaluates the summary based on the coverage between the system summary and the human-written summary. The summary can be evaluated for quality or for informativeness.
a. Informativeness evaluation: it is computed by comparing the system summary with a human-written summary, or by comparing the system summary with the original text to check that the summary contains similar content to the original text. It includes: ROUGE [41], [42], Relative Utility [43], Factoid Score [44], Pyramid Method [45], etc.
b. Quality evaluation: it is based on linguistics, so human experts evaluate summaries manually based on five linguistic questions: non-redundancy, focus, grammaticality, referential clarity, and structure and coherence. Since none of these questions can be properly modeled automatically, manual evaluation is irreplaceable.

Recall-Oriented Understudy for Gisting Evaluation (ROUGE) [41], [42] is the standard method to evaluate summarization automatically. It is based on the comparison of n-grams between the system summary (to be evaluated) and reference summaries (human-written summaries). ROUGE metrics come in several forms, including: ROUGE-N (n-grams), ROUGE-S (skip bigrams), ROUGE-L (longest common subsequence), ROUGE-W (weighted longest common subsequence) and ROUGE-SU (skip bigrams and unigrams). The most commonly used one is ROUGE-N, in which n-gram based metrics are computed as recall, precision and F-measure scores as follows:

$$\text{Recall} = \frac{\text{number of overlapping } n\text{-grams}}{\text{number of } n\text{-grams in the reference summaries}} \qquad (1)$$

$$\text{Precision} = \frac{\text{number of overlapping } n\text{-grams}}{\text{number of } n\text{-grams in the system summary}} \qquad (2)$$

$$\text{F-measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (3)$$
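The recall, precision and F-measure variants of ROUGE-N described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the official ROUGE toolkit: it handles a single reference summary, tokenizes on whitespace after lowercasing, and counts an overlapping n-gram at most as many times as it occurs in both texts. The function names `ngrams` and `rouge_n` are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(system, reference, n=1):
    """Return (recall, precision, f_measure) for one system/reference pair."""
    sys_counts = ngrams(system.lower().split(), n)
    ref_counts = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((sys_counts & ref_counts).values())
    recall = overlap / max(sum(ref_counts.values()), 1)
    precision = overlap / max(sum(sys_counts.values()), 1)
    if precision + recall == 0:
        return recall, precision, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return recall, precision, f_measure


system_summary = "the cat sat on the mat"
reference_summary = "the cat was on the mat"
r1, p1, f1 = rouge_n(system_summary, reference_summary, n=1)
r2, p2, f2 = rouge_n(system_summary, reference_summary, n=2)
```

For this pair, five of the six unigrams overlap, so ROUGE-1 recall and precision are both 5/6. Three of the five bigrams overlap ("the cat", "on the", "the mat"), so ROUGE-2 recall is 0.6.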

4. EXTRACTIVE TEXT SUMMARIZATION TECHNIQUES

From the late fifties until now, several extractive text summarization techniques have been proposed. Based on their nature, they can be classified into five approaches: Statistical, Graph, Machine-Learning, Fuzzy-Logic and Latent Semantics, and additionally into topic and discourse approaches, which derive from or are based on one or more of the previous approaches. These approaches can also be categorized by learning type into supervised, semi-supervised and unsupervised approaches. Figure 1 below shows the extractive text summarization techniques categorized by learning type.

4.1 Statistical Approaches
Earlier approaches mostly depend on statistical methods, mainly frequency and centrality, which are also the earliest unsupervised approaches. The assumption is that the most significant information will contain the most frequent words. Luhn [2] generated summaries based on term frequency to detect the importance of a sentence in the document. Many techniques based on the term frequency feature combine other statistical features with it. For example, in [46], single-document summarization is based on the combination of the word-frequency feature (WF), Textual Entailment (TE) and the Code Quantity Principle (CQP). There are many established features that can be used with statistical approaches, such as sentence position, positive and negative terms, title similarity, sentence centrality, term frequency, etc., which can be used to score the sentences and then pick the highest-scored ones to generate the final summary. Other features that can detect word or term importance are: TF*IDF (Term Frequency-Inverse Document Frequency) [47]; information gain [48], which is used to detect the relevance of terms or sentences; mutual information, which is used to measure the dependency or information shared between two terms; and residual IDF (residual inverse document frequency), in which term frequency is calculated based on a Poisson distribution. In [49], summarization is based on several features including similarity to the centroid sentence: the centroid sentence is identified based on TF-IDF, the similarity of each sentence to the centroid is calculated with cosine similarity, and then the feature values of each sentence are added together to get the sentence scores. A detailed review of techniques based on statistical approaches is given in [50], [51].

4.2 Graph Approaches
In statistical approaches, central sentences, which have maximum similarity to the others, are supposed to contain the central ideas of the text. This assumption forms the foundation of the graph-based approach. Graph-based ranking is based on the PageRank algorithm [52], in which text units (words or sentences) are represented by nodes in a weighted graph, with edge weights determined by the similarities between nodes. Both TextRank and LexRank are graph-based approaches. In TextRank [53], the importance scores of nodes are determined using voting-based weighting, while LexRank [54] uses cosine-similarity-based weighting. TextRank was introduced as the first graph-based algorithm, in which a vertex gains more significance if it is connected to a higher number of vertices, as each vertex casts a vote for the vertices connected to it. Mihalcea introduced TextRank for sentence extraction and keyword extraction in the single-document task, while LexRank is for the multi-document task. A graph is formed with all sentences as nodes, and two nodes are connected if their similarity is greater than a threshold. After the graph is built, a random walk is performed to detect highly central sentences. In 2007 [55], an approach that relies on an affinity graph was introduced for generic and topic-focused multi-document summarization. Summarization is done by picking the sentences with the highest information richness and novelty, calculated from similarities that differentiate intra-document and inter-document links. After that, a greedy algorithm is used to penalize redundancy.

In the last few years, several studies based on graph approaches have been proposed and have presented good results in summarization. For instance, GRAPHSUM [56] was developed in 2013 as a novel, general-purpose graph-based summarizer in which association rules are used to discover correlations between terms. Recent graph-based approaches rely on lexical association for determining the document topic. Murali in 2016 [57] proposed a technique based on lexical association with the help of graph-based ranking algorithms to
assign relative weights to the retrieved keywords, which are then used in sentence scoring. Ravinuthala in 2016 [58] assumed that topics are formed by identified words and that the central idea, called the theme, is then formed through the topics. So, the technique depends on lexical association relationships to extract the words that form document themes. TextRank and LexRank are fully unsupervised algorithms, as they do not rely on a training set but rather depend on the entire text.

4.3 Machine-Learning Approaches
A variety of techniques based on machine-learning approaches have been proposed, which can be classified into supervised, semi-supervised or unsupervised approaches. Supervised approaches need training datasets (labeled data), represented as a set of documents with their human summaries, so that important features of the sentences can be learned and detected easily. Supervised learning techniques include Regression, Multilayer Neural Network, Decision Tree, Support Vector Machine, Genetic Algorithm and Naïve Bayesian Classifier. Semi-supervised approaches depend on labeled and unlabeled data to produce a suitable classifier; for instance, Support Vector Machine (SVM) and Naïve Bayes Classifier are used as semi-supervised learning techniques [59]. On the other hand, unsupervised approaches generate summaries without needing training data. Hidden Markov Model, Clustering and Deep learning techniques (RBM, Autoencoder, Convolutional network, RNN) are instances of unsupervised learning techniques.

The earlier machine-learning techniques used are the binary classifier, the Bayesian method [60] and the Hidden Markov Model. In the binary classifier using Bayes’ rule [61], the probability of including the sentence in the summary is calculated for each sentence given some features. In the Hidden Markov Model [62], the algorithm estimates the likelihood of each sentence being included in the summary. Also, in 2002 [63] a summarization algorithm was proposed based on the Logistic Regression Model (LRM) and the Hidden Markov Model (HMM), using a joint distribution over the feature collection rather than the assumption of feature independence made in Naive Bayesian techniques. Because it avoids this assumption, the HMM has an advantage over the Naive Bayesian algorithm.

In 2005 [64], RankNet was discussed, a gradient descent method using a neural network to learn the ranking function that is used in sentence scoring. Based on RankNet, NetSum was developed in 2007 [65]: a two-layer neural network trained by RankNet (here implemented in a more enhanced algorithm called LambdaRank) to score sentences and then pick the highest-scored ones. The LambdaRank framework [66] is a flexible enhanced ranking algorithm that works with non-smooth target cost functions, providing a training speed-up and more accuracy. The Support Vector Regression (SVR) algorithm is used in [67], where the model is trained to score text sentences based on some features (such as sentence position, named entities, semantic features, and word and phrase features). Support Vector Machine (SVM) was used in [68] for query-based summarization to identify the relevant sentences to be inserted in the final summary. Also, in 2009 [69], a structural SVM was used to summarize a single document, taking into consideration diversity, coverage and balance issues. A trainable summarizer was proposed in 2009 [15], focused on some features including sentence position, sentence centrality, positive and negative words, bushy path of the node (sentence), etc. The following models are trained on these features: GA, Mathematical Regression, Feed Forward NN, Probabilistic NN and Gaussian Mixture Model. Fattah and Ren also discuss the effect of each feature and show that the sentence bushy path feature is the most significant one, and that the results of the Gaussian Mixture Model outperform those of the other models. In 2014 [70], a multi-document summarization technique was based on a hybrid model of Maximum Entropy, Naïve Bayes and SVM, which are trained on some features to score sentences and then form the final summary. Another algorithm to summarize single mono-lingual documents is based on a Memetic Algorithm (MA) [71], in which genetic operators are used with the help of a local search strategy; it is called MA-SingleDocSum, and this technique outperformed state-of-the-art methods. Another summarization technique belonging to the supervised approaches is the Conditional Random Field (CRF), a popular probabilistic model in machine learning used for structured prediction. In [72], CRF is used to treat summarization as a sequence labelling problem in order to detect the correct features, including the interactions between sentences.

On the other hand, great efforts have been made in unsupervised machine-learning approaches in recent years, and they are updated continuously. Starting with HMM, as we mentioned before [62], where HMM detects the probability that
each sentence should be included in the summary. Based on some statistical features, including sentence position, number of terms, baseline term probability and document term probability, it calculates the posterior probability that each sentence is picked for the summary. The algorithm handles the limitations of the naïve Bayes classifier through some dependency assumptions, including sentence positional dependency, dependency among all features, and dependency between sentences, where the probability of selecting a sentence for the summary depends on the status of the previous one (whether or not it was included in the summary); this is called Markovity.

In [73], Fung and Ngai proposed a new unsupervised multi-document summarization technique that can be used to generate summaries by picking the prominent sentences, or to detect topics. The proposed method combines a vector space clustering model, using modified K-means to iteratively classify articles and segmental K-means decoding to classify paragraphs and sentences and tag data into sentence-class pairs, with a probabilistic model based on the Hidden Markov Model to improve sentence cohesion and clustering. It is then easy to extract the prominent sentences from each theme (class) for the final summary.

In recent years, a leap has occurred in unsupervised machine learning approaches, especially in clustering and deep learning techniques. A query-based document summarizer based on the OpenNLP tool and a clustering technique is presented in [74]. The summarizer obtains paragraphs from the document and builds a document graph, where nodes represent paragraphs and edges represent syntactic relationships between nodes, which are calculated by semantic parsing. After that, the K-means clustering algorithm is applied to group coherent sentences together based on their degree of association with the keywords in the user’s query. Finally, the top five nodes are picked to form the final summary.

In [32], another clustering-based technique is discussed for query-based multi-document summarization, in which the documents are clustered using cosine similarity; then the sentences within each document cluster are clustered, and the best sentences are picked from each sentence cluster. This paper introduced user query strengthening, where the most frequently repeated words in the documents are picked and added to the query. Furthermore, the cosine similarity between each sentence and the query is calculated to accurately select the best sentences for the summary.

However, researchers find it more difficult to cluster sentences than to cluster documents. The Louvain clustering algorithm was introduced with the help of a dependency graph for single-document summarization [75]. The algorithm builds a dependency graph for the sentences and applies the Louvain algorithm for word clustering, so that words within each cluster are scored based on the dependency relations. Furthermore, word scores are strengthened by several means, including increasing a word's score by one if it is mentioned in the context of another keyword (a related keyword), and adding the term frequency score of each word to its score. After that, each sentence score is calculated as the sum of its word scores, and the top-scoring sentences are selected to form the summary.

Another single-document summarization approach based on agglomerative clustering is proposed in [76]. After the document is preprocessed, it is represented by Vector Space Modeling and the weights are assigned using the TF-ISF measure. After that, agglomerative nested clustering (a hierarchical approach) is applied to cluster sentences based on cosine similarity, and then the sentences within each cluster are scored based on their similarity to the other sentences in the cluster plus their similarity to the title. Finally, the top two ranked sentences are picked from each sentence cluster for the final summary. (The disadvantage here is a lack of coherence.)

Moreover, Deep Learning techniques represented by Boltzmann machines [77], [78], [34], Auto-Encoder [79], Convolutional Neural Network [80], [81], [82] and Recurrent Neural Network [83] have recently been proposed in the summarization field. The first paper to use a Deep Learning technique is [77], in which a Deep Boltzmann machine is utilized for query-oriented multi-document summarization. This algorithm tries to predict concept importance via Query Oriented Deep Extraction (QODE), a three-stage Deep Belief Network (DBN): concept extraction, reconstruction validation, and summary generation. In the first stage, the DBN is used to filter out unimportant words and discover others through the DBN layers. Then, a fine-tuning process (for reconstructing the distribution of the data) is applied to get important
sentences. Finally, Dynamic Programming (DP) is used to maximize summary importance subject to a summary length of 250 words.

In [78], a Restricted Boltzmann machine (RBM) with two hidden layers is used, where each sentence is represented by four features (title similarity, sentence position, term weight and concept feature), so the RBM input is the sentence feature vectors. The RBM aims to refine the sentences by obtaining an optimal feature vector set; sentences are then scored by calculating the intersection between each one and the user query, ranked, and the top sentences are selected for the summary. Building on this algorithm, another technique for single-document summarization was proposed [34], in which the feature vector is extended to eleven values, including sentence position, TF-ISF, sentence-to-sentence and centroid similarity, named entities, etc.

In [79], a Deep Auto-Encoder (AE) technique is used for extractive query-based single-document summarization: based on a local term frequency feature, the AE tries to detect and learn the features, and sentences are then ranked using the cosine measure against subjects or key phrases. Unlike other deep learning techniques, which may suffer from sparse input representations, this technique reduces the problem in two ways: first, by developing a local word representation (a bag-of-words (BOW) representation) consisting of the input representations of each sentence in the document; and second, by adding a random noise value to the word representation weight. The same paper also uses another Deep Auto-Encoder technique based on an ensemble approach, called Ensemble Noisy Auto-Encoder (ENAE), in which the model is run multiple times on the same input, each time with different random noise added to the input representation. This leads to different extractive summaries; the rankings from these runs are aggregated, and the sentences that occur most frequently are taken to form the final summary.

In [80], a Convolutional Neural Network (CNN) is applied to multi-document summarization to project sentences into a distributed representation, and cosine similarity is then applied to model sentence redundancy. After that, a sentence selection method called diversified selection is formulated as an optimization problem to pick high-quality sentences by minimizing their prestige and diversity cost. The PriorSum model is proposed in [81] to determine the chance of a sentence being selected for a summary without considering its context. An enhanced CNN is applied to learn the overall set of document-independent features from variable-length phrases. The enhanced CNN applies two max-over-time pooling operations, the first to detect the most prominent features and the second to capture the best representative features. After that, the generated document-independent features are combined with document-dependent features such as position, term frequency and cluster frequency, and a regression model [67] is then used to rank the sentences. A query-focused multi-document summarization model based on CNN is discussed in [82], where the model uses weighted-sum pooling over sentence embeddings to represent the document cluster by learning the query relevance of each sentence (from attention over sentence representations based on the query). After that, sentences are ranked by the similarity of their representation to that of the document cluster.

In [83], a Recurrent Neural Network (RNN) based on the Gated Recurrent Unit (GRU) is proposed to handle single-document extractive summarization as a sequence classification task in which a binary decision is computed for each sentence (taking into consideration the previous decisions made) to determine whether or not it should be selected.

4.4 Latent Semantic Analysis Approaches

Latent Semantic Analysis (LSA) is considered a fully unsupervised method for learning and representing the contextual usage meaning of words by statistical computations; it is therefore able to avoid the problem of synonymy by using the semantic content of words. LSA is composed of three main steps: input matrix creation, singular value decomposition (SVD) and sentence selection. In input matrix creation, the input document is represented by a matrix in which columns correspond to sentences, rows correspond to words, and cells represent the importance of words in sentences. The function that calculates the cell values is called a weight function, which can be the Normal, GFIDF, IDF or Entropy weight function [84]. Singular value decomposition models the relationship between words and sentences by decomposing the input matrix into three other matrices (the first and third matrices represent vectors of extracted values for the original rows and original columns respectively
and the second matrix represents scaling values and the third matrix represents original columns as vector of extracted values). In sentence selection, important sentences are selected from the SVD results; different algorithms are used here, such as those of Gong and Liu [13], Steinberger and Jezek [85], Murray, Renals and Carletta [86] and Ozsoy [87]. A comprehensive survey of these algorithms is presented in [88]. Table 3 below shows these methods in a comparative manner.

A Latent Semantic Analysis-based Text Relationship Map (LSA + TRM) is proposed for automatic summarization in [89], in which LSA is used to obtain the text’s semantic matrix and build a relationship map based on the semantic representation of sentences. After that, a global bushy path is used to select the important sentences to generate the final summary. A multi-document summarization technique based on the Optimal Combinatorial Covering Algorithm (OCCAMS) was proposed in [90] and is reported to outperform all human generated summaries (CLASSY11). OCCAMS uses the LSA algorithm to learn the term distribution of the documents and then uses optimization methods (greedy methods for Budgeted Maximal Coverage and a dynamic programming Fully Polynomial Time Approximation Scheme) to maximize the combined weight of covered terms and minimize redundancy.

Table 3: LSA sentence selection algorithms

LSA algorithm                                Selection Criteria

Main steps: 1. Input matrix creation. 2. SVD calculations. 3. Sentence selection.
Gong and Liu's Method [13]                   Based on matrix V^T.
Steinberger and Jezek's Method [85]          Based on matrix V^T and length of sentence vector.
Murray, Renals and Carletta's Method [86]    Based on matrix V^T and Σ matrices.

Main steps: 1. Input matrix creation. 2. Preprocessing. 3. SVD calculations. 4. Sentence selection.
Ozsoy's (Cross Method) [87]                  Based on matrix V^T, average value of each sentence and length of each sentence.
Ozsoy's (Topic Method) [87]                  Based on matrix V^T, creation of concept x concept matrix, strength value of each concept and discovering the main and sub concepts.

4.5 Fuzzy-Logic Approaches

Some of the features used in the previous summarization approaches, such as main concepts, occurrence of anaphors and proper nouns, have binary values (zeros and ones), which are sometimes not exact. To solve this problem, these binary features can be redefined as fuzzy quantities taking values ranging from zero to one [91]. Fuzzy logic is able to model common-sense reasoning in addition to dealing with uncertainty in an unsupervised manner. On the other hand, classification is another task that has been addressed using fuzzy logic to summarize text. For instance, in [92], a fuzzy-rough set aided method is proposed to extract key sentences, in which the sentences receive a relevance ranking based on fuzzy relevance clustering. The relevance of each sentence is represented by a vector of these features: sentence position, length, TF-ISF and semantic pattern. These vectors are then clustered by the fuzzy c-means algorithm (FCM) and a relevance score is computed for each sentence. Finally, sentences with a relevance score larger than 0.5 are picked as candidate sentences, and the highest-scored sentence from each cluster is selected to form the final summary. This method tackles the problem that “sentences of similar semantic meaning but written in synonyms are treated differently” by depending on senses rather than raw words.

In [93], a single-document summarization approach is discussed based on nine features, including sentence centrality, position, length, number of proper nouns, etc., using a combination of fuzzy rules and sets to pick sentences based on their features. On the other hand, some studies suppose that integrating fuzzy logic with other approaches will give better results, such as the previously mentioned approach that integrates fuzzy sets with rough sets [92]. Another integrated approach was proposed in [94], which combined fuzzy logic with swarm intelligence, where feature weights are obtained from the swarm algorithm to adjust the feature scores, which are then used as inputs for the fuzzy

Figure 1: Taxonomy of Extractive Text Summarization Techniques Categorized by Learning Type

Table 4: Advantages and Disadvantages of Extractive Text Summarization Approaches

Techniques                Advantages                                        Disadvantages

Statistical based         1. Simple and fast processing.                    No linguistic knowledge processing
approaches                2. Requires less processor and memory capacity.   or semantic relation mapping [98].
                          3. Unsupervised approaches, no need for
                             training datasets.

Graph based approaches    1. Can generate query-specific or                 Accuracy will rely on the selected
                             topic-specific summaries.                      affinity function.
                          2. Unsupervised approaches, no need for
                             training datasets.

Machine-learning based    1. Simple.                                        1. Requires statistical data.
approaches                2. Easy to test performance of high number of     2. Needs a huge training corpus for
                             features.                                         supervised and semi-supervised
                                                                               techniques.

Latent Semantic Analysis  1. Provide semantic relations.                    Difficult to handle polysemy.
approaches                2. Present important information with least
                             noise.

Fuzzy logic based         1. Knowledge-driven reasoning; can give better    1. Human experts are needed to
approaches                   results if integrated with a data-driven          define the fuzzy rules.
                             technique.                                     2. Overhead in designing the
                          2. Fuzzy logic can give a compression ratio as       membership function.
                             low as 20%.
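
To make the fuzzy-logic idea of Section 4.5 concrete, the following is a minimal Python sketch, not the implementation of any cited system. It replaces two crisp sentence features with fuzzy membership degrees between zero and one, combines them with a single illustrative IF-THEN rule, and keeps the sentences whose relevance exceeds 0.5, the candidate threshold reported for [92]. The function names, the linear and triangular membership shapes, the ideal length of 20 words and the use of the minimum as fuzzy AND are all assumptions made for illustration; [92] itself obtains relevance from fuzzy c-means clustering rather than from a hand-written rule.

```python
def triangular(x, a, b, c):
    """Triangular membership function: 0 at a and c, rising to 1 at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def position_membership(index, n_sentences):
    """Degree to which a sentence is 'early' (assumed shape):
    1.0 for the first sentence, falling linearly to 0.0 for the last."""
    if n_sentences <= 1:
        return 1.0
    return 1.0 - index / (n_sentences - 1)


def length_membership(n_words, ideal=20, spread=15):
    """Degree to which a sentence has a 'good' length.
    The ideal and spread values are hypothetical, chosen for illustration."""
    return triangular(n_words, ideal - spread, ideal, ideal + spread)


def fuzzy_scores(sentences):
    """Score each sentence with one illustrative rule:
    IF position is early AND length is good THEN the sentence is relevant.
    Fuzzy AND is taken as the minimum of the two membership degrees."""
    n = len(sentences)
    scores = []
    for i, sentence in enumerate(sentences):
        early = position_membership(i, n)
        good_length = length_membership(len(sentence.split()))
        scores.append(min(early, good_length))
    return scores


def select_candidates(sentences, threshold=0.5):
    """Keep sentences whose fuzzy relevance exceeds the threshold,
    preserving document order."""
    scores = fuzzy_scores(sentences)
    return [s for s, score in zip(sentences, scores) if score > threshold]
```

For a three-sentence document in which only the first sentence is both early and close to the assumed ideal length, select_candidates returns that sentence alone. In practice the membership functions and rules would have to be designed by experts or learned from data, which is the overhead listed for fuzzy logic approaches in Table 4.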

inference system to obtain the final scores. In [95], a fuzzy logic approach is integrated with latent semantic analysis (to stay aware of text semantics) for single-document summarization, where each approach generates a summary and the two summaries are then intersected to find the final one. Similarly, another technique is proposed in [96] in which fuzzy logic, bushy path and WordNet synonyms are used; each algorithm gives a different summary, and the intersection of these summaries forms the final summary.

In [97], the Adaptive Neuro-Fuzzy Inference System (ANFIS), a fuzzy inference system implemented on the framework of neural networks, is used to summarize single documents. A vector of nine features for each sentence, including title similarity, sentence position and similarity, numerical data, proper nouns, etc., is input to nine neurons in the ANFIS model. After that, each input is converted to a fuzzy value using a membership function, which is then used to compute the firing strength of the corresponding rule. The ANFIS model contains premise and consequent parameters for the IF and THEN parts, which are adjusted during training based on a combination of least-squares estimation and back-propagation gradient descent. The ANFIS model is trained to classify sentences as summary or non-summary sentences. This model tackles the problem of needing human experts to build fuzzy rules by using the subtractive clustering method to generate rules automatically.

5. COMPARING UNSUPERVISED EXTRACTIVE TEXT SUMMARIZATION TECHNIQUES

While there are many approaches to extractive text summarization, each approach still suffers from some limitations.

Statistical approaches have simple and fast processing without the need for training datasets, but they generate summaries with no linguistic or semantic knowledge. Graph approaches can generate query-specific or topic-specific summaries with good information coverage, but the accuracy depends on the affinity function used. Machine-learning approaches can represent document features in an appropriate manner, test the performance of a high number of features, and provide a solution to the sentence scoring problem, but they require statistical data and a huge training dataset to generate high-accuracy summaries. Latent semantic approaches are the best at providing semantic relations and generating good knowledge coverage with the least noise, but they still suffer from the polysemy problem. Fuzzy logic approaches are a good alternative for improving sentence scoring and enhancing summarization when integrated with other techniques, but human experts are needed to define the fuzzy rules.

Therefore, to handle the limitations of a given approach, it can be integrated with another helper technique to improve the accuracy of the summary. For instance, the fuzzy c-means clustering technique in [92] reduces redundancy and gives good information coverage. Integrating fuzzy logic with LSA [95] combines fuzzy logic, which handles the sentence scoring problem, with LSA, which generates semantic summaries. Greedy algorithms or dynamic programming techniques can also be integrated to handle the sentence selection task and achieve high coverage and low redundancy [55], [56], [90].

Table 4 above shows the advantages and disadvantages of the five approaches discussed previously.

The recent unsupervised techniques discussed above (under the five approaches in Sect. 4) are compared in tabular form with additional details. Table 5 below shows this comparison of unsupervised extractive text summarization techniques.

In supervised text summarization approaches, human-labeled class-sentence pairs are needed to complete the training and testing operations; however, hand-labeling a large collection of documents with theme classes is a very tedious and time-consuming task. In addition, there is a great deal of disagreement between humans on the manual labeling (annotation) of document themes and topics. How many themes or topics should be present? Where does each topic begin and end? Therefore, it would be better to learn and decode the hidden theme or topic of the text using an unsupervised training method without manually labeled (annotated) data, and this is the first reason why we focus on unsupervised approaches.

The second reason is that in supervised training approaches, given any corpus dataset,
it is possible to learn corpus rules and features by training and testing; however, such approaches become corpus-based approaches, which cannot guarantee that the generated summaries are helpful, due to their lack of coherence and cohesion and their inability to work with datasets from different fields. So, it is desirable to develop unsupervised algorithms that learn and decode the features of the current document rather than training on the features of the corpus it belongs to.

6. CONCLUSION AND FUTURE WORKS

This paper provides an elaborate study of different extractive text summarization techniques, focusing especially on recent efforts and advances in unsupervised approaches. Moreover, we present a brief discussion of text summarization types and the evaluation task. While many researchers focus on improving extractive summarization through supervised approaches that learn dataset features, other researchers work toward improvements based on unsupervised approaches. Unsupervised approaches aim to discover hidden document features or learn a semantic representation of the document without the need to train a model over datasets. Furthermore, given the limited availability of summary labels, unsupervised approaches can be used to build these labels automatically. So, there is room to improve unsupervised techniques in extractive summarization to discover new document features. On the other hand, evaluation remains a challenging task and needs further work: given the variety of summarization types, the best evaluation method for each type must be found. Besides, because building manual summaries is a tedious task and two human experts usually build different summaries, evaluation methods need to be automated; however, we still do not know whether evaluation can be automated sufficiently.

REFERENCES:
[1] N. Munot and S. S. Govilkar, “Comparative Study of Text Summarization Methods,” Int. J. Comput. Appl., vol. 102, no. 12, pp. 975–8887, 2014.
[2] H. P. Luhn, “The automatic creation of literature abstracts,” IBM J. Res. Dev., vol. 2, no. 2, pp. 159–165, 1958.
[3] K. S. Jones, “Automatic summarising: The state of the art,” Inf. Process. Manag., vol. 43, no. 6, pp. 1449–1481, 2007.
[4] A. Nenkova and K. McKeown, “Automatic summarization,” Found. Trends® Inf. Retr., vol. 5, no. 2–3, pp. 103–233, 2011.
[5] J. Goldstein, V. Mittal, J. Carbonell, and M. Kantrowitz, “Multi-document summarization by sentence extraction,” in Proceedings of the 2000 NAACL-ANLP Workshop on Automatic summarization, 2000, pp. 40–48.
[6] R. Barzilay and K. R. McKeown, “Sentence fusion for multidocument news summarization,” Comput. Linguist., vol. 31, no. 3, pp. 297–328, 2005.
[7] D. M. Zajic, B. J. Dorr, and J. Lin, “Single-document and multi-document summarization techniques for email threads using sentence compression,” Inf. Process. Manag., vol. 44, no. 4, pp. 1600–1610, 2008.
[8] A. Nenkova, “Summarization evaluation for text and speech: issues and approaches,” in Ninth International Conference on Spoken Language Processing, 2006.
[9] S. Gholamrezazadeh, M. A. Salehi, and B. Gholamzadeh, “A comprehensive survey on text summarization systems,” in Computer Science and its Applications, 2009. CSA’09. 2nd International Conference on, 2009, pp. 1–6.
[10] C. T. Shubhangi, “An approach to single document text summarization and simplification,” IOSR J. Comput. Eng., vol. 16, no. 3, pp. 42–49, 2014.
[11] H. D. Kim, K. Ganesan, P. Sondhi, and C. Zhai, “Comprehensive review of opinion summarization,” 2011.
[12] B. Liu, “Sentiment analysis and opinion mining,” Synth. Lect. Hum. Lang. Technol., vol. 5, no. 1, pp. 1–167, 2012.
[13] Y. Gong and X. Liu, “Generic text summarization using relevance measure and latent semantic analysis,” in Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, 2001, pp. 19–25.
[14] D. M. Dunlavy, D. P. O’Leary, J. M. Conroy, and J. D. Schlesinger, “QCS: A system for querying, clustering and summarizing documents,” Inf. Process. Manag., vol. 43, no. 6, pp. 1588–1605, 2007.
[15] X. Wan, “Using only cross-document relationships for both generic and topic-focused multi-document summarizations,” Inf. Retr. Boston., vol. 11, no. 1, pp. 25–49, 2008.
[16] Y. Ouyang, W. Li, S. Li, and Q. Lu, “Applying regression models to query-focused multi-document summarization,” Inf. Process.
Manag., vol. 47, no. 2, pp. 227–237, 2011.
[17] M. A. Fattah and F. Ren, “GA, MR, FFNN, PNN and GMM based models for automatic text summarization,” Comput. Speech Lang., vol. 23, no. 1, pp. 126–144, 2009.
[18] K. Sarkar, “Syntactic trimming of extracted sentences for improving extractive multi-document summarization,” J. Comput, vol. 2, no. 7, pp. 177–184, 2010.
[19] J. Carbonell and J. Goldstein, “The use of MMR, diversity-based reranking for reordering documents and producing summaries,” in Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, 1998, pp. 335–336.
[20] Y. Tao, S. Zhou, W. Lam, and J. Guan, “Towards more effective text summarization based on textual association networks,” in Semantics, Knowledge and Grid, 2008. SKG’08. Fourth International Conference on, 2008, pp. 235–240.
[21] D. Wang, S. Zhu, T. Li, Y. Chi, and Y. Gong, “Integrating document clustering and multidocument summarization,” ACM Trans. Knowl. Discov. from Data, vol. 5, no. 3, p. 14, 2011.
[22] C. Wang, L. Long, and L. Li, “HowNet based evaluation for Chinese text summarization,” in Natural Language Processing and Knowledge Engineering, 2008. NLP-KE’08. International Conference on, 2008, pp. 1–6.
[23] D. Wang, T. Li, S. Zhu, and C. Ding, “Multi-document summarization via sentence-level semantic analysis and symmetric matrix factorization,” in Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, 2008, pp. 307–314.
[24] D. Wang, S. Zhu, T. Li, and Y. Gong, “Multi-document summarization using sentence-based topic models,” in Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, 2009, pp. 297–300.
[25] J. L. Neto, A. A. Freitas, and C. A. A. Kaestner, “Automatic text summarization using a machine learning approach,” in Brazilian Symposium on Artificial Intelligence, 2002, pp. 205–215.
[26] D. R. Radev, W. Fan, and Z. Zhang, “Webinessence: A personalized web-based multi-document summarization and recommendation system,” Ann Arbor, vol. 1001, p. 48103, 2001.
[27] L. Agnihotri, J. R. Kender, N. Dimitrova, and J. Zimmerman, “User study for generating personalized summary profiles,” in Multimedia and Expo, 2005. ICME 2005. IEEE International Conference on, 2005, pp. 1094–1097.
[28] A. Díaz and P. Gervás, “User-model based personalized summarization,” Inf. Process. Manag., vol. 43, no. 6, pp. 1715–1734, 2007.
[29] C. Kumar, P. Pingali, and V. Varma, “Generating personalized summaries using publicly available web documents,” in Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT’08. IEEE/WIC/ACM International Conference on, 2008, vol. 3, pp. 103–106.
[30] R. Witte, R. Krestel, and S. Bergler, “Generating update summaries for DUC 2007,” in Proceedings of the Document Understanding Conference, 2007, pp. 1–5.
[31] L. Zhou, M. Ticrea, and E. Hovy, “Multi-document biography summarization,” arXiv Prepr. cs/0501078, 2005.
[32] A. R. Deshpande and L. Lobo, “Text summarization using clustering technique,” Int. J. Eng. Trends Technol., vol. 4, no. 8, 2013.
[33] L. Vanderwende, H. Suzuki, C. Brockett, and A. Nenkova, “Beyond SumBasic: Task-focused summarization with sentence simplification and lexical expansion,” Inf. Process. Manag., vol. 43, no. 6, pp. 1606–1618, 2007.
[34] A. Haghighi and L. Vanderwende, “Exploring content models for multi-document summarization,” in Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 2009, pp. 362–370.
[35] D. R. Radev, E. Hovy, and K. McKeown, “Introduction to the special issue on summarization,” Comput. Linguist., vol. 28, no. 4, pp. 399–408, 2002.
[36] I. H. Witten, G. W. Paynter, E. Frank, C. Gutwin, and C. G. Nevill-Manning, “KEA: Practical automatic keyphrase extraction,” in Proceedings of the fourth ACM conference on Digital libraries, 1999, pp. 254–255.
[37] S. P. Singh, A. Kumar, A. Mangal, and S. Singhal, “Bilingual automatic text summarization using unsupervised deep learning,” in Electrical, Electronics, and Optimization Techniques (ICEEOT), International Conference on, 2016, pp. 1195–1200.
[38] L. H. Reeve, H. Han, S. V Nagori, J. C. Yang,
T. A. Schwimmer, and A. D. Brooks, “Concept frequency distribution in biomedical text summarization,” in Proceedings of the 15th ACM international conference on Information and knowledge management, 2006, pp. 604–611.
[39] G. Salton and C. Buckley, “Term-weighting approaches in automatic text retrieval,” Inf. Process. Manag., vol. 24, no. 5, pp. 513–523, 1988.
[40] S. R. ANIL KUMAR, JYOTIYADAV, “Automatic Text Summarization Using Regression Model (GA),” Int. J. Innov. Res. Comput. Commun. Eng., vol. 3, no. 5, pp. 4253–4260, 2015.
[41] C.-Y. Lin and E. Hovy, “Automatic evaluation of summaries using n-gram co-occurrence statistics,” in Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology-Volume 1, 2003, pp. 71–78.
[42] C.-Y. Lin, “Rouge: A package for automatic evaluation of summaries,” Text Summ. Branches Out, 2004.
[43] D. R. Radev and D. Tam, “Summarization evaluation using relative utility,” in Proceedings of the twelfth international conference on Information and knowledge management, 2003, pp. 508–511.
[44] S. Teufel and H. Van Halteren, “Evaluating information content by factoid analysis: human annotation and stability,” in Proceedings of the 2004 conference on empirical methods in natural language processing, 2004.
[45] A. Nenkova and R. Passonneau, “Evaluating content selection in summarization: The pyramid method,” in Proceedings of the human language technology conference of the north american chapter of the association for computational linguistics: Hlt-naacl 2004, 2004.
linguistics-Volume 1, 2002, pp. 1–7.
[49] D. R. Radev, H. Jing, M. Styś, and D. Tam, “Centroid-based summarization of multiple documents,” Inf. Process. Manag., vol. 40, no. 6, pp. 919–938, 2004.
[50] C. Orasan, V. Pekar, and L. Hasler, “A Comparison of Summarisation Methods Based on Term Specificity Estimation.,” in LREC, 2004.
[51] C. Orăsan, “Comparative evaluation of term-weighting methods for automatic summarization,” J. Quant. Linguist., vol. 16, no. 1, pp. 67–95, 2009.
[52] S. Brin and L. Page, “The anatomy of a large-scale hypertextual web search engine,” Comput. networks ISDN Syst., vol. 30, no. 1–7, pp. 107–117, 1998.
[53] R. Mihalcea and P. Tarau, “Textrank: Bringing order into text,” in Proceedings of the 2004 conference on empirical methods in natural language processing, 2004.
[54] G. Erkan and D. R. Radev, “Lexrank: Graph-based lexical centrality as salience in text summarization,” J. Artif. Intell. Res., vol. 22, pp. 457–479, 2004.
[55] X. Wan and J. Xiao, “Towards a unified approach based on affinity graph to various multi-document summarizations,” in International Conference on Theory and Practice of Digital Libraries, 2007, pp. 297–308.
[56] E. Baralis, L. Cagliero, N. Mahoto, and A. Fiori, “GRAPHSUM: Discovering correlations among multiple terms for graph-based summarization,” Inf. Sci. (Ny)., vol. 249, pp. 96–109, 2013.
[57] R. V. V. M. Krishna and C. S. Reddy, “Extractive Text Summarization Using Lexical Association and Graph Based Text Analysis,” in Computational Intelligence in Data Mining—Volume 1, Springer, 2016, pp. 261–272.
[46] E. Lloret and M. Palomar, “A gradual [58] V. V. M. K. Ravinuthala and S. R. Chinnam,
combination of features for building automatic “A Keyword Extraction Approach for Single
summarisation systems,” in International Document Extractive Summarization Based on
Conference on Text, Speech and Dialogue, Topic Centrality.”
2009, pp. 16–23. [59] K.-F. Wong, M. Wu, and W. Li, “Extractive
[47] V. McCargar, “Statistical approaches to summarization using supervised and semi-
automatic text summarization,” Bull. Assoc. supervised learning,” in Proceedings of the
Inf. Sci. Technol., vol. 30, no. 4, pp. 21–25, 22nd International Conference on
2004. Computational Linguistics-Volume 1, 2008, pp.
[48] T. Mori, “Information gain ratio as term 985–992.
weight: the case of summarization of ir [60] C. Aone, M. E. Okurowski, and J. Gorlinsky,
results,” in Proceedings of the 19th “Trainable, scalable summarization using
international conference on Computational robust NLP and machine learning,” in


Proceedings of the 17th international conference on Computational linguistics-Volume 1, 1998, pp. 62–66.
[61] J. Kupiec, J. Pedersen, and F. Chen, “A trainable document summarizer,” in Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval, 1995, pp. 68–73.
[62] J. M. Conroy and D. P. O’leary, “Text summarization via hidden markov models,” in Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, 2001, pp. 406–407.
[63] J. D. Schlesinger et al., “Understanding machine performance in the context of human performance for multi-document summarization,” 2002.
[64] C. Burges et al., “Learning to rank using gradient descent,” in Proceedings of the 22nd international conference on Machine learning, 2005, pp. 89–96.
[65] K. Svore, L. Vanderwende, and C. Burges, “Enhancing single-document summarization by combining RankNet and third-party sources,” in Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL), 2007.
[66] C. J. Burges, R. Ragno, and Q. V Le, “Learning to rank with nonsmooth cost functions,” in Advances in neural information processing systems, 2007, pp. 193–200.
[67] S. Li, Y. Ouyang, W. Wang, and B. Sun, “Multi-document summarization using support vector regression,” in Proceedings of DUC, 2007.
[68] M. Fuentes, E. Alfonseca, and H. Rodríguez, “Support vector machines for query-focused summarization trained and evaluated on pyramid data,” in Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions, 2007, pp. 57–60.
[69] L. Li, K. Zhou, G.-R. Xue, H. Zha, and Y. Yu, “Enhancing diversity, coverage and balance for summarization through structure learning,” in Proceedings of the 18th international conference on World wide web, 2009, pp. 71–80.
[70] M. A. Fattah, “A hybrid machine learning model for multi-document summarization,” Appl. Intell., vol. 40, no. 4, pp. 592–600, 2014.
[71] M. Mendoza, S. Bonilla, C. Noguera, C. Cobos, and E. León, “Extractive single-document summarization based on genetic operators and guided local search,” Expert Syst. Appl., vol. 41, no. 9, pp. 4158–4169, 2014.
[72] D. Shen, J.-T. Sun, H. Li, Q. Yang, and Z. Chen, “Document Summarization Using Conditional Random Fields,” in IJCAI, 2007, vol. 7, pp. 2862–2867.
[73] P. Fung, G. Ngai, and C.-S. Cheung, “Combining optimal clustering and hidden Markov models for extractive summarization,” in Proceedings of the ACL 2003 workshop on Multilingual summarization and question answering-Volume 12, 2003, pp. 21–28.
[74] H. J. Jain, M. S. Bewoor, and S. H. Patil, “Context Sensitive Text Summarization Using K Means Clustering Algorithm,” Int. J. Soft Comput. Eng., vol. 2, no. 2, 2012.
[75] A. El-Kilany and I. Saleh, “Unsupervised document summarization using clusters of dependency graph nodes,” in Intelligent Systems Design and Applications (ISDA), 2012 12th International Conference on, 2012, pp. 557–561.
[76] A. Sharaff, H. Shrawgi, P. Arora, and A. Verma, “Document Summarization by Agglomerative nested clustering approach,” in Advances in Electronics, Communication and Computer Technology (ICAECCT), 2016 IEEE International Conference on, 2016, pp. 187–191.
[77] S. Zhong, Y. Liu, B. Li, and J. Long, “Query-oriented unsupervised multi-document summarization via deep learning model,” Expert Syst. Appl., vol. 42, no. 21, pp. 8146–8155, 2015.
[78] G. PadmaPriya and K. Duraiswamy, “An approach for text summarization using deep learning algorithm,” J. Comput. Sci., vol. 10, no. 1, pp. 1–9, 2014.
[79] M. Yousefi-Azar and L. Hamey, “Text summarization using unsupervised deep learning,” Expert Syst. Appl., vol. 68, pp. 93–105, 2017.
[80] W. Yin and Y. Pei, “Optimizing Sentence Modeling and Selection for Document Summarization,” in IJCAI, 2015, pp. 1383–1389.
[81] Z. Cao, F. Wei, S. Li, W. Li, M. Zhou, and W. Houfeng, “Learning summary prior representation for extractive summarization,” in Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics


and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), 2015, vol. 2, pp. 829–833.
[82] Z. Cao, W. Li, S. Li, F. Wei, and Y. Li, “Attsum: Joint learning of focusing and summarization with neural attention,” arXiv Prepr. arXiv:1604.00125, 2016.
[83] R. Nallapati, F. Zhai, and B. Zhou, “SummaRuNNer: A Recurrent Neural Network Based Sequence Model for Extractive Summarization of Documents,” in AAAI, 2017, pp. 3075–3081.
[84] S. T. Dumais, “Improving the retrieval of information from external sources,” Behav. Res. Methods, Instruments, Comput., vol. 23, no. 2, pp. 229–236, 1991.
[85] J. Steinberger and K. Jezek, “Using latent semantic analysis in text summarization and summary evaluation,” Proc. ISIM, vol. 4, pp. 93–100, 2004.
[86] G. Murray, S. Renals, and J. Carletta, “Extractive summarization of meeting recordings,” 2005.
[87] M. G. Ozsoy, I. Cicekli, and F. N. Alpaslan, “Text summarization of turkish texts using latent semantic analysis,” in Proceedings of the 23rd international conference on computational linguistics, 2010, pp. 869–876.
[88] R. M. Badry, A. S. Eldin, and D. S. Elzanfally, “Text Summarization within the Latent Semantic Analysis Framework: Comparative Study,” Int. J. Comput. Appl., vol. 81, no. 11, 2013.
[89] J.-Y. Yeh, H.-R. Ke, W.-P. Yang, and I.-H. Meng, “Text summarization using a trainable summarizer and latent semantic analysis,” Inf. Process. Manag., vol. 41, no. 1, pp. 75–95, 2005.
[90] S. T. Davis, J. M. Conroy, and J. D. Schlesinger, “OCCAMS--An Optimal Combinatorial Covering Algorithm for Multi-document Summarization,” in Data Mining Workshops (ICDMW), 2012 IEEE 12th International Conference on, 2012, pp. 454–463.
[91] H. Khosravi, E. Eslami, F. Kyoomarsi, and P. K. Dehkordy, “Optimizing text summarization based on fuzzy logic,” in Computer and Information Science, Springer, 2008, pp. 121–130.
[92] H.-H. Huang, Y.-H. Kuo, and H.-C. Yang, “Fuzzy-rough set aided sentence extraction summarization,” in Innovative Computing, Information and Control, 2006. ICICIC’06. First International Conference on, 2006, vol. 1, pp. 450–453.
[93] L. Suanmali, M. S. Binwahlan, and N. Salim, “Sentence features fusion for text summarization using fuzzy logic,” in Hybrid Intelligent Systems, 2009. HIS’09. Ninth International Conference on, 2009, vol. 1, pp. 142–146.
[94] M. S. Binwahlan, N. Salim, and L. Suanmali, “Fuzzy Swarm Based Text Summarization 1,” 2009.
[95] S. A. Babar and P. D. Patil, “Improving performance of text summarization,” Procedia Comput. Sci., vol. 46, pp. 354–363, 2015.
[96] J. Yadav and Y. K. Meena, “Use of fuzzy logic and wordnet for improving performance of extractive automatic text summarization,” in Advances in Computing, Communications and Informatics (ICACCI), 2016 International Conference on, 2016, pp. 2071–2077.
[97] Y. J. Kumar, F. J. Kang, O. S. Goh, and A. Khan, “Text summarization based on classification using ANFIS,” in Asian Conference on Intelligent Information and Database Systems, 2017, pp. 405–417.
[98] Y. Ko and J. Seo, “An effective sentence-extraction technique using contextual information and statistical approaches for text summarization,” Pattern Recognit. Lett., vol. 29, no. 9, pp. 1366–1371, 2008.


Table 5: Comparison of Unsupervised Extractive Text Summarization Techniques

Year: 2004
Used algorithm: Graph-based ranking algorithm (TextRank) [53]
Dataset: DUC 2002
Evaluation: ROUGE-1 = 0.4229
Comments: Input document: single document. Adv.: Adaptable to any language or domain.

Year: 2004
Used algorithm: Graph-based ranking algorithm (LexRank) [54]
Dataset: DUC 2003, DUC 2004
Evaluation: On DUC 2003, ROUGE-1 = 0.3646. On DUC 2004, ROUGE-1 = 0.3966. On 17% noisy DUC 2003, ROUGE-1 = 0.3621. On 17% noisy DUC 2004, ROUGE-1 = 0.3905.
Comments: Input document: multiple documents. Adv.: Obtains good information coverage in the generated summary. Prevents unnaturally high idf scores from increasing the score of a sentence that is unrelated to the topic (works well with noisy data).

Year: 2005
Used algorithm: LSA+TRM [89]
Dataset: 100 political articles from New Taiwan Weekly
Evaluation: Recall = Precision = F-measure = 0.4442
Comments: Input document: single document. Adv.: The generated summary is composed of semantically related sentences. The approach is language independent. Dis-Adv.: Computing the SVD takes a long time. It is difficult to find the best dimension reduction. The summary lacks coherence.

Year: 2006
Used algorithm: Fuzzy-Rough set (Fuzzy c-means clustering) [92]
Dataset: 8 PDF articles from Journal of Artificial Intelligence Research (JAIR)
Evaluation: F-measure = 0.4620391
Comments: Input document: single document. Adv.: Gives good information coverage and reduces redundancy.

Year: 2007
Used algorithm: Graph ranking algorithm (Affinity Graph) + Greedy algorithm (for high information richness and novelty) [55]
Dataset: DUC 2002, DUC 2003, DUC 2004, DUC 2005
Evaluation: On DUC 2002, ROUGE-1 = 0.38111, ROUGE-2 = 0.08163, ROUGE-W = 0.12292. On DUC 2004, ROUGE-1 = 0.39926, ROUGE-2 = 0.08793, ROUGE-W = 0.12228. On DUC 2003, ROUGE-1 = 0.36187, ROUGE-2 = 0.07114, ROUGE-W = 0.11464. On DUC 2005, ROUGE-1 = 0.38354, ROUGE-2 = 0.07069, ROUGE-W = 0.10080.
Comments: Input document: multiple documents. Adv.: Generates generic and topic-focused summaries. Handles the redundancy issue. Dis-Adv.: Words are treated as independent of each other, so semantic relations may be missed.

Year: 2009
Used algorithm: WF+TE+CQP features on DUC 2002, and WF+CQP features on fairy tales [46]
Dataset: DUC 2002, 5 articles from the fairy tales domain
Evaluation: On DUC 2002, F-measure of: ROUGE-1 = 0.45611, ROUGE-2 = 0.20252, ROUGE-SU4 = 0.22200, ROUGE-L = 0.41382. On fairy tales, F-measure of: ROUGE-1 = 0.41797, ROUGE-2 = 0.10267, ROUGE-SU4 = 0.15898, ROUGE-L = 0.33742.
Comments: Input document: single document. Adv.: Can summarize documents that have no title. Does not require much processing power. Handles the redundancy problem.


Year: 2009
Used algorithm: Fuzzy-Logic [93]
Dataset: DUC 2002
Evaluation: ROUGE-1: Precision = 0.47589, Recall = 0.46660, F-measure = 0.47019
Comments: Input document: single document. Adv.: Deals with features that take binary values or only low and high values, balancing the feature values so that their weights in the computation are balanced.

Year: 2012
Used algorithm: Dependency graphs + Louvain clustering algorithm (keywords level) [75]
Dataset: DUC 2001, DUC 2002, British Columbia conversation corpus (BC3), Concisus corpus of event summaries
Evaluation: Recall score (%): On DUC 2001, ROUGE-1 = 45.7, ROUGE-L = 40.6, ROUGE-SU1 = 26.2. On DUC 2002, ROUGE-1 = 48.8, ROUGE-L = 44, ROUGE-SU1 = 29.4. On BC3, ROUGE-1 = 79.8, ROUGE-L = 79.4, ROUGE-SU1 = 71.8. On Concisus, ROUGE-1 = 47.7, ROUGE-L = 39.1, ROUGE-SU1 = 30.6.
Comments: Input document: single document. Adv.: Can summarize documents that have no title. Can summarize multiple genres of documents and is language independent.

Year: 2012
Used algorithm: LSA + Optimization methods (Greedy method + Dynamic programming) [90]
Dataset: DUC 2005, DUC 2006, DUC 2007, TAC 2008, TAC 2009, TAC 2010, TAC 2011
Evaluation: On DUC 2005, ROUGE-2 = 0.081, ROUGE-SU4 = 0.134. On DUC 2006, ROUGE-2 = 0.102, ROUGE-SU4 = 0.152. On DUC 2007, ROUGE-2 = 0.128, ROUGE-SU4 = 0.175. On TAC 2008, ROUGE-2 = 0.103, ROUGE-SU4 = 0.136. On TAC 2009, ROUGE-2 = 0.110, ROUGE-SU4 = 0.142. On TAC 2010, ROUGE-2 = 0.108, ROUGE-SU4 = 0.135. On TAC 2011, ROUGE-2 = 0.131, ROUGE-SU4 = 0.162.
Comments: Input document: multiple documents. Adv.: Uses a greedy method and a dynamic programming algorithm to handle the term-weight computation task and the sentence extraction task separately, which achieves high coverage with low redundancy. The model is language independent.

Year: 2013
Used algorithm: Graph ranking algorithm + Association rule mining + Greedy algorithm (for maximum coverage and relevance) [56]
Dataset: DUC 2004, 5 real-life news documents
Evaluation: On DUC 2004, ROUGE-2: Recall = 0.093, Precision = 0.099, F-measure = 0.097. ROUGE-SU4: Recall = 0.015, Precision = 0.021, F-measure = 0.019.
Comments: Input document: multiple documents. Adv.: Can discover correlations between terms through association rules. A flexible and portable approach.


Year: 2014
Used algorithm: Deep learning (Restricted Boltzmann Machine) [78]
Dataset: Documents from the networking and software engineering domains
Evaluation: On the networking domain, Recall = 0.429, Precision = 0.6, F-measure = 0.490. On the software engineering domain, Recall = 0.342, Precision = 0.83, F-measure = 0.469.
Comments: Input document: multiple documents. Dis-Adv.: Sensitive to the dataset.

Year: 2015
Used algorithm: Fuzzy-Logic + LSA [95]
Dataset: 10 different datasets
Evaluation: Average results (%): Recall = 44.36375, Precision = 90.77572, F-measure = 67.56974
Comments: Input document: single document. Adv.: Handles the sentence scoring problem with fuzzy logic and generates semantically coherent summaries based on LSA.

Year: 2015
Used algorithm: Deep learning (DBN) + Dynamic programming [77]
Dataset: DUC 2005, DUC 2006, DUC 2007
Evaluation: On DUC 2005, ROUGE-1 = 0.3751, ROUGE-2 = 0.0775, ROUGE-SU4 = 0.1341. On DUC 2006, ROUGE-1 = 0.4015, ROUGE-2 = 0.0928, ROUGE-SU4 = 0.1479. On DUC 2007, ROUGE-1 = 0.4295, ROUGE-2 = 0.1163, ROUGE-SU4 = 0.1685.
Comments: Input document: multiple documents. Adv.: First algorithm to summarize query-oriented multi-documents by deep learning. Significant concepts are pushed out layer by layer efficiently. Well suited to feature extraction.

Year: 2015
Used algorithm: Deep learning (CNN language model) + Cosine similarity + Optimization method (DivSelect with the help of the PageRank algorithm) [80]
Dataset: DUC 2002, DUC 2004
Evaluation: On DUC 2002, ROUGE-1 = 0.51013, ROUGE-2 = 0.26972, ROUGE-SU4 = 0.29431. On DUC 2004, ROUGE-1 = 0.40907, ROUGE-2 = 0.10723, ROUGE-SU4 = 0.14969.
Comments: Input document: multiple documents. Adv.: Powerful sentence representation based on a neural network language model. Handles the redundancy issue. Provides DivSelect as a diversified selection method. Keeps the diversity and prestige of the chosen sentences balanced.

Year: 2015
Used algorithm: Deep learning (CNN) + Regression model + Greedy algorithm [81]
Dataset: DUC 2001, DUC 2002, DUC 2004
Evaluation: Results (%): On DUC 2001, ROUGE-1 = 35.98, ROUGE-2 = 7.89. On DUC 2002, ROUGE-1 = 36.63, ROUGE-2 = 8.97. On DUC 2004, ROUGE-1 = 38.91, ROUGE-2 = 10.07.
Comments: Input document: multiple documents. Adv.: Picks up the document-independent features that reflect the document. The model is able to exploit the potential semantic representation aspects hidden in the text. Handles the redundancy issue.

Year: 2016
Used algorithm: Graph ranking algorithm (word order relationship for connecting vertices) + Lexical association [57]
Dataset: DUC 2002
Evaluation: ROUGE-1 Recall = 0.48645, ROUGE-2 Recall = 0.39927
Comments: Input document: single document. Adv.: Finds keywords that represent the text topic based on lexical association. Extracts keywords in little time. Good coherence in the final summary.

Year: 2016
Used algorithm: Deep learning (CNN) + Greedy algorithm [82]
Dataset: DUC 2005, DUC 2006, DUC 2007
Evaluation: Results (%): On DUC 2005, ROUGE-1 = 37.01, ROUGE-2 = 6.99. On DUC 2006, ROUGE-1 = 40.90, ROUGE-2 = 9.40. On DUC 2007, ROUGE-1 = 43.92, ROUGE-2 = 11.55.
Comments: Input document: multiple documents. Adv.: Used to summarize query-focused multi-documents. Handles query relevance and sentence saliency jointly. Applies a neural attention method to simulate how humans read a document while keeping a query in mind.

Year: 2017
Used algorithm: Deep learning (Deep Auto-encoder) [79]
Dataset: Summarization and Keyword Extraction from Emails (SKE), BC3 from British Columbia University
Evaluation: On subject-oriented summarization with SKE and a summary length of 5 sentences, the ROUGE-2 Recall of LTF-ENAE (Gaussian) = 0.5031. On key-phrase-oriented summarization with SKE and a summary length of 5 sentences, the ROUGE-2 Recall of LTF-AE = 0.5657. On subject-oriented summarization with BC3 and a summary length of 4 sentences, the ROUGE-2 Recall of LTF-AE = 0.1084.
Comments: Input document: single document. Adv.: Able to generate a concept vector representation for the original sentences. Generates highly informative and semantic summaries. Handles the sparse representation problem with local term frequency (LTF) and extra random noise. Dis-Adv.: Computational cost of training and the need to tune the training hyper-parameters.

Year: 2017
Used algorithm: Deep learning (GRU-RNN) + Greedy algorithm [83]
Dataset: CNN/Daily Mail corpus, DUC 2002
Evaluation: On Daily Mail, Recall with a summary length of 75 bytes: ROUGE-1 = 26.2, ROUGE-2 = 10.8, ROUGE-L = 14.4; with a summary length of 275 bytes: ROUGE-1 = 42.0, ROUGE-2 = 16.9, ROUGE-L = 34.1. On DUC 2002, Recall with a summary length of 75 words: ROUGE-1 = 46.6, ROUGE-2 = 23.1, ROUGE-L = 43.03.
Comments: Input document: single document. Adv.: Its predictions can be visualized and interpreted. Allows the extractive model to be trained using extractive labels (obtained in an unsupervised way that converts abstractive summaries to extractive labels), and using human (abstractive) summaries without the need for labeled data.

Year: 2017
Used algorithm: Adaptive Neuro-Fuzzy Inference System (ANFIS) (fuzzy logic based on a neural network) [97]
Dataset: DUC 2002
Evaluation: Precision = 0.7128, Recall = 0.6982, F-measure = 0.7054
Comments: Input document: single document. Adv.: Removes the need for human experts to build fuzzy rules by using the subtractive clustering method to generate rules automatically.

Year: 2017
Used algorithm: Graph ranking algorithm (Topic Association Graph) + Lexical association [58]
Dataset: DUC 2002
Evaluation: For ROUGE-1: Precision = 0.51430, Recall = 0.61643, F-measure = 0.56050. For ROUGE-2: Precision = 0.40323, Recall = 0.48410, F-measure = 0.43977.
Comments: Input document: single document. Adv.: Presents a new technique for connecting the vertices in a way that increases the incoming edges of topic-central words (Topic Association Graph). Enables the use of degree centrality measures for calculating vertex strength.
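
Most of the evaluation figures in Table 5 are ROUGE-N precision, recall, or F-measure values, which are derived from n-gram overlap between a generated summary and a reference summary. The following is a minimal sketch of that computation, not the official ROUGE package [42]: it assumes lower-cased whitespace tokenization, a single reference summary, and no stemming or stop-word removal, and the function names `ngrams` and `rouge_n` are illustrative only.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (precision, recall, f_measure) for ROUGE-N.

    Assumption: plain whitespace tokenization and a single reference;
    the official toolkit offers stemming and multiple references.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-measure: harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


if __name__ == "__main__":
    system = "the cat sat on the mat"
    gold = "the cat lay on the mat"
    print(rouge_n(system, gold, n=1))
    print(rouge_n(system, gold, n=2))
```

Scores computed this way are only comparable across systems when the tokenization, summary length limit, and reference set are the same, which is why the table lists the dataset and length setting alongside each value.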


Types of Extractive Methods

Uploaded by

Types of Extractive Methods

Uploaded by

1. INTRODUCTION

However, the generation of automatic text

Table 1: Extractive Text Summarization Common Features

Features Description Comments

Type Statistical features Linguistic features

3. TEXT SUMMARIZATION EVALUATION

centroid based on cosine similarity and then

Figure 1: Taxonomy of Extractive Text Summarization Techniques Categorized by Learning Type

Table 4: Advantages and Disadvantages of Extractive Text Summarization Approaches

Techniques Advantages Disadvantages

Fuzzy logic based approaches 1. Knowledge-driven reasoning based, can take

Table 5: Comparison of Unsupervised Extractive Text Summarization Techniques

Year Used Algorithm Dataset Evaluation Comments

Dis-Adv. Takes a long time to compute the SVD.

On DUC 2005, ROUGE-1 =

And WF+CQP On fairy tales, F-measure of:

Precision = 0.47589, Adv. Solve binary values of features or

On BC3, ROUGE-1 = 79.8,

On Concisus, ROUGE-1 = 47.7,

On TAC 2009, ROUGE-2 = 0.110,

On TAC 2010, ROUGE-2 = 0.108,

On TAC 2011, ROUGE-2 = 0.131,

Recall = 0.342, Precision = 0.83, F-

On DUC 2007, ROUGE-1 = 0.4295,

ROUGE-2 = 0.1163, ROUGE-SU4 =

On DUC 2004, ROUGE-1 = 38.91,

On DUC 2006, ROUGE-1 = 40.90, simulate human nature while reading a

On DUC 2007, ROUGE-1 = 43.92,

2017 Deep learning Summarization On Subject-oriented Input document. Single document.

On DUC 2002, The Recall value
