Types of Extractive Methods
3 authors, including:
Ahmed El-Refaiy
Zagazig University
All content following this page was uploaded by Ahmed El-Refaiy on 26 February 2019.
ABSTRACT
The volume of information available on the internet has grown beyond what users can absorb, and with it the amount of overload and redundancy contained in each document. Extracting the important information in a summarized form would help many users, so well-prepared summaries are necessary. Consequently, research papers continually propose new approaches to summarizing text automatically. “Automatic Text Summarization” is the process of creating a shorter version of the original text (one or more documents) that conveys the information present in those documents. In general, text summaries can be categorized into two types: extractive and abstractive. Abstractive methods are very complicated because they must handle natural language at large scale. Therefore, research communities are focusing on extractive summaries, attempting to achieve more consistent, non-redundant and meaningful summaries. This review provides a detailed survey of extractive text summarization techniques. Specifically, it focuses on unsupervised techniques, presenting recent efforts and advances and listing their strengths and weaknesses in comparative tables. In addition, this review highlights efforts made in the evaluation of summaries and finally deduces some possible future trends.
Keywords: Extractive Text Summarization - Summarization Review - Artificial Intelligence - Information
Retrieval - Natural Language Processing
Journal of Theoretical and Applied Information Technology
15th December 2018. Vol.96. No 23
© 2005 – ongoing JATIT & LLS
is not clear, even to human beings, what kind of information the summary should contain [8].

The two basic types of text summarization are abstractive and extractive [1]. An extractive summary extracts the important and meaningful sentences from the original text and places them into the summary without any changes. An abstractive summary does not rely on concatenating sentences; instead, it analyzes the original text semantically to understand it and builds a more coherent, meaningful summary of its conclusions. The sentences in the summary may not be present in the original text. An abstractive summary is more generalized, but it is difficult to compute.

Many researchers have presented comprehensive surveys of text summarization. Some still focus on improving extractive text summarization, while others have moved toward abstractive summarization. Previous analyses of extractive text summarization presented detailed studies of well-known approaches, recently discussed types or evaluation techniques, in order to gain knowledge about the key issues of text summarization. In this paper, extractive text summarization techniques are classified into new categories: supervised, semi-supervised and unsupervised. We focus on unsupervised techniques, presenting state-of-the-art efforts and advances and listing their strengths and weaknesses in comparative tables. In addition, we highlight efforts made in the evaluation task and finally deduce some possible future trends. To our knowledge, we are the first to present such a study for the unsupervised field of extractive text summarization.

In recent years, progress has been made in various aspects of text summarization, leading to the appearance of different subtypes under the two basic types. Based on the purpose of the summarization, the type of detail or the style of output, a summary can be Indicative, Informative or Critical [9], [10]. An Indicative summary presents the main idea of the entire document and gives the user a quick view of the original text, so it may not contain all the important factual content. An Informative summary conveys the important information of the original text to the user in concise form. In a Critical summary the document is criticized. For example, in the case of a scientific paper, it can express an opinion [10]. The most feasible type to automate is the Indicative summary and the least feasible is the Critical one.

Like a Critical summary, which can express an opinion on a document, scientific paper etc., a Sentiment-based summary generates summaries in the manner of opinion mining, according to a person's feelings towards the subject, product or entities. Many researchers have presented comprehensive surveys of this type of summarization [29], [30].

Based on the content type of the original text, summarization may be considered Generic or Query-based [11], [12], [13], [9], and [14]. In a Generic summary, the extracted information is not user-specific and does not rely on the document subject. In Query-based summarization, the generated summary is based on the user query, so it reflects the user's view. Query-based summaries are also called Topic-focused or User-focused summaries.

Based on the limitation of the input text, summarization systems can be Genre Specific, Domain Dependent or Domain Independent [9]. Genre Specific systems accept only specific types of input, such as stories, newspaper articles etc. Domain Dependent systems deal with text whose subject lies in a fixed domain. Domain Independent systems can accept any type of text, as they do not rely on the domain.

Based on the number of input documents, where the system input can be one or more documents [9], summarization can be divided into Single-Document or Multi-Document Summarization [7], [15]. In Single-Document Summarization, the summary is built from one document only, whereas in Multi-Document Summarization the summary is built from more than one document, all on the same topic. Multi-document summarization may suffer from issues such as redundancy, sentence ordering, the temporal dimension and co-reference, which make this task more difficult than single-document summarization [5]. The most prominent issue, which appears more often in the multi-document task, is redundancy. There have been attempts to tackle this problem, such as selecting the sentences at the beginning of a paragraph and then measuring the similarity of each following sentence with the sentences already chosen; the sentence is retained only if it contains new related content [16]. The Maximal Marginal Relevance approach was introduced in 1998 [17]. Other methods have been suggested by researchers trying to achieve the best results in multi-document summarization [18], [13], [19], [20], [21], and [22].

Based on the language of the text that the system can accept, summarization can be a Mono-Lingual System, a Multi-Lingual System or a Cross-Lingual
System. A Mono-Lingual System deals with documents in a specific language, and the produced summary is in that language. In a Multi-Lingual System, the source documents are in more than one language and the generated summaries are in these different languages. In a Cross-Lingual system, the input document is in a specific language and the output is in a language different from the input language.

Based on the level of linguistic space, summarization approaches can be either Shallow or Deeper [23]. Shallow approaches are limited to syntactic representation and try to extract the prominent parts of the text. Deeper approaches work on semantic representation and depend on linguistic processing during extraction.

Every year the number of Web pages increases significantly, and search engines return a list of web pages as the result of a single search query. Users usually need to go through multiple pages to find out which documents are relevant and which are not, and they often abandon the search after the first attempt. Therefore, it is important to generate summaries and pick out the important information in web pages. Such summaries are Web-based summaries. WebInEssence is a search engine that can generate summaries from clusters of related documents [24]. Because of the e-mail overload problem, which occurs when e-mails keep arriving in the inbox and reading or archiving them consumes a great deal of time, there is a need to summarize e-mail conversations. This type of summarization is called E-mail based summarization.

Summarization can also be Personalized, generating a summary of information related to the user's interests. The summarization system therefore needs to keep track of the user profile in order to determine the relevant information that the user is interested in. The user profile can be determined by a statistical mapping method from personal characteristics such as gender, together with some other features [25]. Other methods using this type of summarization have been suggested by researchers [26], [27]. An Update-based summary generates summaries by acquiring the latest updates related to the topic, on the assumption that users already have fundamental knowledge of the subject [28]. Survey summaries are another kind; they present a long overview of a specific subject or entity, trying to gather the most significant facts about any entity, person, place etc. Survey summaries include these types: Wikipedia articles, Survey summary and biographical summary [31].

2. EXTRACTIVE TEXT SUMMARIZATION BACKGROUND

Extractive text summarization is done by picking the most important sentences from the original text in a way that forms the final summary. Extractive techniques generally generate summaries through three phases, or are essentially based on them. These phases are the preprocessing step, the processing step and the generation step:

1) Preprocessing step: the dimensionality of the representation space of the original text is reduced to produce a new structured representation. It usually includes:
a. Stop-word elimination: Common words that carry little meaning and contribute no information relevant to the task (for example, "the", "a", "an", "in") are eliminated.
b. Stemming: Obtain the stem of each word by reducing the word to its base form.
c. Part-of-speech tagging: The process of identifying and classifying the words of the text according to the part-of-speech category they belong to (nouns, verbs, adverbs, adjectives).

Another technique used here is case folding, in which all characters are converted to the same letter case, either lower case or upper case [23]. However, this technique is not advisable for documents in domains where, for example, the appearance of an upper-case word in a sentence increases its importance [32]. Finally in this phase, the sentences are analyzed and transformed into features to be ready for the next stage. The sentences are analyzed on the basis of statistical, linguistic or hybrid features: statistical features do not take word meanings into consideration, whereas linguistic features go deeper to capture semantic meaning. Each sentence in the document is expressed in terms of these features so that we can determine whether it is important enough to include in the summary. Table 1 below shows common extractive text summarization features, and Table 2 below compares statistical and linguistic features for extractive text summarization.

2) Processing step: It uses an algorithm, with the help of the features generated in the preprocessing step, to convert the text structure into the summary structure. In this step, the sentences are scored.

3) Generation step: sentences are ranked. Then, the most important sentences are picked from
Table 2: Comparison between Extractive Text Summarization Statistical and Linguistic Features
the ranked structure to generate the final required summary.

The last two stages, the processing and generation steps, can also be described approximately as three main components: sentence scoring, selection and paraphrasing (reformulation).

In sentence scoring, each sentence is assigned a score that indicates its significance. After that, the most important sentences are extracted. Sentence scoring can be done via several approaches: supervised, semi-supervised or unsupervised (cf. Sect. 4). In sentence selection, the summarization system has to choose the best collection of significant sentences to form the final summary, taking into consideration the most prominent factors: redundancy and cohesion. The traditional method for sentence selection is to pick the top-ranked sentences directly, but redundancy elimination is the key issue, especially for multi-document summarization. More than one approach is used for sentence selection. For instance, Maximal Marginal Relevance (MMR) is the most popular approach for this task [17]; it forms a linear combination of relevance and novelty, which are measured independently. Other approaches are based on the Kullback–Leibler (KL) divergence: sentences are selected so as to decrease the KL divergence between the word probability distribution of the candidate summary and the word probability distribution of the input [38], [39]. Because minimizing the KL divergence exactly is mathematically intractable, it is optimized via greedy selection.

In sentence paraphrasing (reformulation), the sentences selected to form the summary are modified or reformulated in order to enhance the summary, provide more cohesion and clarity, and eliminate redundant or unnecessary information, for example through reformulation and sentence fusion [6].

The main phases of the summarization process can also be viewed as the following three main subtasks: topic identification, interpretation and finally summary generation [40].

consuming, a lot of techniques have been developed to automate the evaluation task. Evaluation can be performed in two ways:

1. Extrinsic evaluation: the summary is evaluated based on how much it helps other tasks. It includes several methods, such as:
a. Relevance assessment: it evaluates the relevance of a topic in the summary or the original text.
b. Reading comprehension: it measures the ability to answer correctly multiple-choice questions posed after reading the summary.

2. Intrinsic evaluation: it depends on human judgment, as it evaluates the summary based on the coverage between this summary (the system summary) and the human-written summary. The evaluation of the summary can address quality or informativeness.
a. Informativeness evaluation: it is computed by comparing the system summary with a human-written summary, or by comparing the system summary with the original text to check that the summary contains content similar to the original text. It includes: ROUGE [41], [42], Relative utility [43], Factoid Score [44], Pyramid Method [45], etc.
b. Quality evaluation: it is based on linguistics, so human experts evaluate summaries manually on five linguistic questions: non-redundancy, focus, grammaticality, referential clarity, and structure and coherence. Because none of these questions can be properly modeled automatically, manual evaluation is irreplaceable.

Recall-Oriented Understudy for Gisting Evaluation (ROUGE) [41], [42] is the standard method for evaluating summarization automatically. It is based on the comparison of n-grams between the system summary (to be evaluated) and reference summaries (human-written summaries). ROUGE metrics come in more than one form, including: ROUGE-N (n-grams), ROUGE-S (skip bigrams), ROUGE-L (longest common subsequence), ROUGE-W (weighted longest common subsequence), and ROUGE-SU (skip bigrams and unigrams). The most commonly used one is ROUGE-N, in which n-gram based metrics are computed with recall-, precision- and F-measure-oriented scores as follows:
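A minimal sketch of the ROUGE-N computation, assuming the usual clipped n-gram overlap and a single reference summary: recall divides the number of overlapping n-grams by the number of n-grams in the reference summary, precision divides the same overlap by the number of n-grams in the system summary, and the F-measure is their harmonic mean. The function names `ngrams` and `rouge_n` and the lowercase whitespace tokenization are illustrative assumptions, not part of the ROUGE toolkit described in [41], [42].

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(system_summary, reference_summary, n=1):
    """Return (recall, precision, f_measure) for one system/reference pair.

    Assumption: lowercase whitespace tokenization, no stemming or
    stop-word removal.
    """
    sys_counts = ngrams(system_summary.lower().split(), n)
    ref_counts = ngrams(reference_summary.lower().split(), n)

    # Clipped overlap: an n-gram counts at most as often as it occurs in both texts.
    overlap = sum((sys_counts & ref_counts).values())

    ref_total = sum(ref_counts.values())
    sys_total = sum(sys_counts.values())
    recall = overlap / ref_total if ref_total else 0.0
    precision = overlap / sys_total if sys_total else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return recall, precision, f_measure


if __name__ == "__main__":
    r, p, f = rouge_n("the cat sat on the mat", "the cat lay on the mat", n=2)
    print(f"ROUGE-2 recall={r:.2f} precision={p:.2f} f={f:.2f}")
```

When several reference summaries are available, the overlap and the reference n-gram counts are typically accumulated over all references before dividing; the sketch above handles only the single-reference case.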
assign relative weights to the retrieved keywords, which are then used in sentence scoring. Ravinuthala in 2016 [58] assumed that topics are formed by identified words and that the central idea, called the theme, is formed through the topics. The technique therefore depends on lexical association relationships to extract the words that form the document themes. TextRank and LexRank are fully unsupervised algorithms, as they do not rely on a training set but rather depend on the entire text.

4.3 Machine-Learning Approaches

A variety of techniques based on machine-learning approaches have been proposed, which can be classified as supervised, semi-supervised or unsupervised. Supervised approaches need training datasets (labeled data), represented as a set of documents with their human summaries, so that the important features of the sentences can easily be learned and detected. Supervised learning techniques include Regression, Multilayer Neural Network, Decision Tree, Support Vector Machine, Genetic Algorithm and Naïve Bayesian Classifier. Semi-supervised approaches depend on labeled and unlabeled data to produce a suitable classifier; for instance, Support Vector Machine (SVM) and Naïve Bayes Classifier are used as semi-supervised learning techniques [59]. Unsupervised approaches, on the other hand, generate summaries without needing training data. Hidden Markov Model, Clustering and Deep Learning techniques (RBM, Autoencoder, Convolutional network, RNN) are instances of unsupervised learning techniques.

The earliest machine-learning techniques used are the binary classifier, the Bayesian method [60] and the Hidden Markov Model. In the binary classifier using Bayes' rule [61], the probability of including a sentence in the summary is calculated for each sentence given some features. In the Hidden Markov Model [62], the algorithm estimates the likelihood of each sentence being included in the summary. Also in 2002 [63], a summarization algorithm was proposed based on a Logistic Regression Model (LRM) and a Hidden Markov Model (HMM), using a joint distribution over the feature collection rather than the feature independence assumed in Naive Bayesian techniques. Because of this, the HMM has an advantage over the Naive Bayesian algorithm.

In 2005 [64], RankNet was discussed, a gradient descent method using a neural network to learn the ranking function that is used in sentence scoring. Based on RankNet, NetSum was developed in 2007 [65]: a two-layer neural network trained by RankNet (in fact, RankNet was implemented here in an enhanced algorithm called LambdaRank) to score sentences and then pick the highest-scoring ones. The LambdaRank framework [66] is a flexible, enhanced ranking algorithm that works with non-smooth target cost functions, providing faster training and higher accuracy. A Support Vector Regression (SVR) algorithm is used in [67], in which the model is trained to score text sentences based on some features (such as sentence position, named entities, semantic features, and word and phrase features). A Support Vector Machine (SVM) was used in [68] for query-based summarization to reveal the relevant sentences to be inserted in the final summary. Also in 2009 [69], a structural SVM was used to summarize a single document, taking into consideration diversity, coverage and balance. A trainable summarizer was proposed in 2009 [15], focused on features including sentence position, sentence centrality, positive and negative words, bushy path of the node (sentence), etc. The following models were trained on these features: GA, Mathematical Regression, Feed Forward NN, Probabilistic NN and Gaussian Mixture Model. Fattah and Ren also discussed the effect of each feature and showed that the sentence bushy path feature is the most significant one, and that the Gaussian Mixture Model outperforms the other models. In 2014 [70], a multi-document summarization technique was based on a hybrid model of Maximum Entropy, Naïve Bayes and SVM, which are trained on some features to score sentences and then form the final summary. Another algorithm for summarizing single mono-lingual documents, based on a Memetic Algorithm (MA), is [71], in which genetic operators are used with the help of a local search strategy; it is called MA-SingleDocSum, and this technique outperformed state-of-the-art methods. Another summarization technique belonging to the supervised approaches is the Conditional Random Field (CRF), a popular probabilistic machine-learning model used for structured prediction. In [72], summarization is treated as a sequence labelling problem, with a CRF used to detect the correct features, including the interactions between sentences.

On the other hand, great efforts have been made in unsupervised machine-learning approaches in recent years, and they are continuously updated. Starting with the HMM mentioned earlier [62], where the HMM detects the probability that
each sentence should be included in the summary. Based on some statistical features, including sentence position, number of terms, baseline term probability and document term probability, it calculates the posterior probability that each sentence is picked for the summary. The algorithm handles the limitations of the naïve Bayes classifier through some dependency assumptions, including sentence positional dependency, dependency among all features, and dependency between each two sentences, where the probability of selecting one sentence for the summary depends on the status of the previous one (whether or not it was included in the summary), called Markovity.

In [73], Fung and Ngai proposed a new unsupervised-training multi-document summarization technique that can be used to generate summaries by picking the prominent sentences or to detect topics. The proposed method combines a vector space clustering model, via modified K-means for iteratively classifying articles and segmental K-means decoding for classifying paragraphs and sentences and tagging data into sentence-class pairs, with a probabilistic model via a Hidden Markov Model for improving sentence cohesion and clustering. It is then easy to extract the prominent sentences from each theme (class) for the final summary.

In recent years, a leap has occurred in unsupervised machine-learning approaches, especially in clustering and deep learning techniques. A query-based document summarizer based on the OpenNLP tool and a clustering technique is presented in [74]. The summarizer obtains paragraphs from the document and builds a document graph, where nodes represent paragraphs and edges represent syntactic relationships between nodes, which are calculated by semantic parsing. After that, the K-means clustering algorithm is applied to group coherent sentences with each other based on their degree of association with the keywords in the user's query. Finally, the top five nodes are picked to form the final summary.

In [32], another clustering-based technique is discussed for summarizing query-based multi-documents, in which the documents are clustered using cosine similarity; then the sentences within each document cluster are clustered, and the best sentences are picked from each sentence cluster. This paper introduced user query strengthening, where the most frequently repeated words in the documents are picked and added to the query. Furthermore, the cosine similarity between each sentence and the query is calculated to accurately select the best sentences for the summary.

However, researchers find sentences more difficult to cluster than documents. The Louvain clustering algorithm was introduced, with the help of a dependency graph, for single-document summarization [75]. The algorithm builds a dependency graph for the sentences and applies the Louvain algorithm for word clustering, so that the words within each cluster are scored based on the dependency relations. Furthermore, word scores are strengthened by several approaches, including increasing a word's score by one if it is mentioned in the context of another (related) keyword, and adding the term frequency score of each word to its score. After that, each sentence score is calculated as the sum of its word scores, and the top-scoring sentences are selected to form the summary.

Another single-document summarization approach, based on Agglomerative clustering, is proposed in [76]. After the document is preprocessed, it is represented by Vector Space Modeling and the weights are assigned using the TF-ISF measure. After that, Agglomerative nested clustering (a hierarchical approach) is applied to cluster the sentences based on cosine similarity measures, and the sentences within each cluster are scored based on their similarity with the other sentences in the cluster plus their similarity with the title. Finally, the top two ranked sentences from each sentence cluster are picked for the final summary. (Disadvantage here: lack of coherence.)

Moreover, Deep Learning techniques, represented by Boltzmann machines [77], [78], [34], Auto-Encoder [79], Convolutional Neural Network [80], [81], [82] and Recurrent Neural Network [83], have recently been proposed in the summarization field. The first paper to use a Deep Learning technique is [77], in which a Deep Boltzmann machine is utilized for query-oriented multi-document summarization. This algorithm tries to predict concept importance via Query Oriented Deep Extraction (QODE), a three-stage Deep Belief Network (DBN): concept extraction, reconstruction validation, and summary generation. In the first stage, the DBN is used to filter out unimportant words and discover others through the DBN layers. Then, a fine-tuning process (for reconstructing the distribution of the data) is applied to get the important
sentences. Finally, Dynamic Programming (DP) is used to maximize summary importance while keeping the summary length at 250 words.

In [78], a Restricted Boltzmann machine (RBM) with two hidden layers is used, where each sentence is represented by four features (title similarity, sentence position, term weight and concept feature), so the RBM input is the sentence feature vector. The RBM aims to refine the sentences by obtaining an optimal feature vector set; sentences are then scored by calculating the intersection between each one and the user query, ranked, and the top sentences are selected for the summary. Building on this algorithm, another technique for single-document summarization was proposed [34], where the features were extended to an eleven-feature vector including sentence position, TF-ISF, sentence-to-sentence and centroid similarity, named entity, etc.

In [79], a Deep Auto-Encoder (AE) technique is used for extractive query-based single-document summarization. Based on a local term frequency feature, the AE tries to detect and learn the features and then ranks sentences using the cosine measure with subjects or key phrases. Unlike other deep learning techniques, which may suffer from sparse input representation, this technique proposes two ways to reduce the problem: first, developing a local word representation (a bag-of-words (BOW) representation) consisting of the input representations of each sentence in the document, and second, adding a random noise value to the word representation weight. The same paper also uses another Deep Auto-Encoder technique based on an ensemble approach, called Ensemble Noisy Auto-Encoder (ENAE), in which the model runs multiple times on the same input, each time with different random noise added to the input representation. This leads to different extractive summaries; the rankings of these different experiments are aggregated, and the sentences that occur most frequently are taken to form the final summary.

In [80], a Convolutional neural network (CNN) is applied to multi-document summarization to model and project sentences into a distributed representation, and a cosine similarity measurement is then applied to represent and model sentence redundancy. After that, a sentence selection method called diversified selection is used as an optimization problem to pick the high-quality sentences by minimizing their prestige and diversity cost. The PriorSum model is proposed in [81] to determine the chance of a sentence being selected for a summary without considering its context. An enhanced CNN is applied to learn the overall set of document-independent features from variable-length phrases. The enhanced CNN applies two max-over-time pooling operations, the first to detect the most prominent features and the second to capture the best representative features. After that, the generated independent features are combined with document-dependent features such as position, term frequency and cluster frequency, and then used with the regression model [67] for ranking sentences. A query-focused multi-document summarization model based on CNN is discussed in [82], where the model uses weighted-sum pooling over sentence embeddings to represent the document cluster by learning the query relevance of each sentence (from attention over sentence representations based on the query). After that, sentences are ranked using the similarity of their representation to the document cluster.

In [83], a Recurrent Neural Network (RNN) based on the Gated Recurrent Unit neural network (GRU) is proposed to handle single-document extractive summarization as a sequence classification task, in which a binary decision is computed for each sentence (taking into consideration the previous decisions made) to determine whether it should be selected or not.

4.4 Latent Semantic Analysis Approaches

Latent Semantic Analysis (LSA) is considered a fully unsupervised method for learning and representing the contextual usage meaning of words by statistical computations; it can therefore avoid the problem of synonymy by using the semantic content of words. LSA is composed of three main steps: input matrix creation, singular value decomposition (SVD) and sentence selection. In input matrix creation, the input document is represented by a matrix in which columns are mapped to sentences, rows are mapped to words and cells represent the importance of words in sentences. The function that calculates the cell values is called a weight function, which can be a Normal, GFIDF, IDF or Entropy weight function [84]. Singular value decomposition models the relationship between words and sentences, as it decomposes the input matrix into three other matrices (the first and third matrices hold vectors of extracted values for the original rows and original columns, respectively,
and the second matrix represents scaling values and the third matrix represents original columns as vectors of extracted values). In sentence selection, important sentences are selected from the SVD results; different algorithms are used here, such as those of Gong and Liu [13], Steinberger and Jezek [85], Murray, Renals and Carletta [86] and Ozsoy [87]. A comprehensive survey of these algorithms is presented in [88]. Table 3 below shows these methods in a comparative manner.

A Latent Semantic Analysis-based text relationship map (LSA + TRM) is proposed for automatic summarization in [89], in which LSA is used to obtain the text’s semantic matrix and to build a relationship map based on the sentences’ semantic representation. After that, a global bushy path is used to select important sentences to generate the final summary. A multi-document summarization technique based on the Optimal Combinatorial Covering Algorithm (OCCAMS) was proposed in [90] and is reported to outperform all human-generated summaries (CLASSY11). OCCAMS uses the LSA algorithm to learn the distribution of terms over the documents and then uses optimization methods (greedy methods for Budgeted Maximal Coverage and a dynamic programming Fully Polynomial Time Approximation Scheme) to maximize the combined weight of covered terms and minimize redundancy.

Table 3: LSA sentence selection algorithms

LSA algorithm | Main steps | Selection criteria
Gong and Liu's Method [13] | 1. Input matrix creation. 2. SVD calculations. 3. Sentence selection. | Based on matrix V^T.
Steinberger and Jezek's Method [85] | 1. Input matrix creation. 2. SVD calculations. 3. Sentence selection. | Based on matrix V^T and length of sentence vector.
Murray, Renals and Carletta's Method [86] | 1. Input matrix creation. 2. SVD calculations. 3. Sentence selection. | Based on the V^T and Σ matrices.
Ozsoy's (Cross Method) [87] | 1. Input matrix creation. 2. Preprocessing. 3. SVD calculations. 4. Sentence selection. | Based on matrix V^T, average value of each sentence and length of each sentence.
Ozsoy's (Topic Method) [87] | 1. Input matrix creation. 2. Preprocessing. 3. SVD calculations. 4. Sentence selection. | Based on matrix V^T, creation of a concept x concept matrix, strength value of each concept and discovering the main and sub concepts.

4.5 Fuzzy-Logic Approaches

Some of the features used in the previous summarization approaches, such as main concepts, occurrence of anaphors and proper nouns, have binary values (zeros and ones), which are sometimes not exact. To solve this problem, these binary features can be redefined as fuzzy quantities that take values ranging from zero to one [91]. Fuzzy logic is able to model common-sense reasoning in addition to dealing with uncertainty in an unsupervised manner. On the other hand, classification is another task in which fuzzy logic has been used to summarize text. For instance, in [92], a fuzzy-rough set aided method is proposed to extract key sentences, in which sentences are ranked by relevance based on fuzzy relevance clustering. The relevance of each sentence is represented by a vector of these features: sentence position, length, TF-ISF and semantic pattern. These vectors are then clustered by the fuzzy c-means algorithm (FCM) and a relevance score is computed for each sentence. Finally, sentences with a relevance score larger than 0.5 are picked as candidate sentences, and the highest-scored sentence from each cluster is selected to form the final summary. This method tackles the problem that “sentences of similar semantic meaning but written in synonyms are treated differently” by depending on senses rather than raw words.

In [93], a single-document summarization approach is discussed based on nine features, including sentence centrality, position, length, number of proper nouns, etc., using a combination of fuzzy rules and sets to pick sentences based on their features. On the other hand, some studies suppose that integrating fuzzy logic with other approaches will give better results, such as the previously mentioned approach which integrates fuzzy sets with rough sets [92]. Another integration approach was proposed in [94], which incorporated fuzzy logic with swarm intelligence, where feature weights are obtained from the swarm algorithm to adjust the feature scores, which are then used as inputs for the fuzzy
inference system to gather the final scores. In [95], a fuzzy logic approach is integrated with latent semantic analysis (to stay aware of text semantics) for single-document summarization, where each approach generates a summary and the two summaries are then intersected to find the final one. Like the previous technique, another one is proposed in [96], where fuzzy logic, bushy path and WordNet synonyms are used; each algorithm gives a different summary, and the intersection of these summaries forms the final summary.

In [97], an Adaptive Neuro-Fuzzy Inference System (ANFIS), a fuzzy inference system implemented on the framework of neural networks (NN), is used to summarize single documents. A vector of nine features for each sentence, including title similarity, sentence position and similarity, numerical data, proper nouns, etc., is input to nine neurons in the ANFIS model. After that, each input is converted to a fuzzy value using a membership function, which is then used to compute the firing strength of the corresponding rule. The ANFIS model contains premise and consequent parameters for the IF and THEN parts of the rules, which are adjusted during training based on a combination of least-squares estimation and back-propagation gradient descent. The ANFIS model is trained to classify sentences as summary or non-summary sentences. This model tackles the problem of needing human experts to build fuzzy rules by using the subtractive clustering method to generate rules automatically.

5. COMPARING UNSUPERVISED EXTRACTIVE TEXT SUMMARIZATION TECHNIQUES

While there are many approaches for extractive text summarization, each approach still suffers from some limitations.

Statistical approaches offer simple and fast processing without the need for training datasets, but they generate summaries with no linguistic or semantic knowledge. Graph approaches can generate query-specific or topic-specific summaries with good information coverage, but the accuracy depends on the affinity function used. Machine-learning approaches can represent document features in an appropriate manner, test the performance of a high number of features and provide a solution for the sentence scoring problem, but it is recommended to use statistical data and a huge training dataset to generate high-accuracy summaries. Latent semantic approaches are the best at providing semantic relations and generating good knowledge coverage with the least noise, but they still suffer from the polysemy problem. Fuzzy logic approaches are a good alternative for improving sentence scoring and enhancing summarization when integrated with other techniques, but human experts are needed to define the fuzzy rules.

Therefore, to handle the limitations of a given approach, it can be integrated with another helper technique to improve the accuracy of the summary. For instance, the fuzzy c-means clustering technique used in [92] reduces redundancy and gives good information coverage. In the integration of fuzzy logic with LSA [95], fuzzy logic handles the sentence scoring problem and LSA generates semantic summaries. Greedy algorithms or dynamic programming techniques can also be integrated to handle the sentence selection task and achieve high coverage and low redundancy [55], [56], [90].

Table 4 above shows the advantages and disadvantages of the five approaches discussed previously.

The recent unsupervised techniques that have been discussed above (under the five approaches in Sect. 4) are compared in tabular form with additional details about them. Table 5 below shows such a comparison of these unsupervised extractive text summarization techniques.

In supervised training approaches to text summarization, human-labeled class-sentence pairs are needed to complete the training and testing operations; however, hand-labeling a large collection of documents with theme classes is a very tedious and time-consuming task. In addition, there is a great deal of disagreement between humans on the manual labeling (annotation) of document themes and topics. How many themes or topics should be present? Where does each topic begin and end? Therefore, it would be better to learn and decode the hidden theme or topic of a text using an unsupervised training method without manually labeled (annotated) data, and this is the first reason why we focus on unsupervised approaches.

The second reason is that in supervised training approaches, given any corpus dataset,
it is possible to learn corpus rules and features by training and testing; however, such approaches become corpus-based approaches which cannot guarantee that the generated summaries are helpful, due to their lack of coherence and cohesion and their inability to work with datasets from different fields. So, it is desirable to develop unsupervised algorithms that learn and decode the features of the current document rather than training on the features of the corpus it belongs to.

6. CONCLUSION AND FUTURE WORKS

This paper provides an elaborative study of different extractive text summarization techniques, focusing especially on recent efforts and advances in unsupervised approaches. Moreover, we present a quick discussion of text summarization types and the evaluation task. While many researchers focus on improving extractive summarization with supervised approaches that learn dataset features, other researchers work toward improvement based on unsupervised approaches. Unsupervised approaches aim to discover hidden document features or learn a semantic representation of the document without the need to train a model over datasets. Furthermore, because labeled summaries are often unavailable, unsupervised approaches can be used to build these labels automatically. So, there is room to improve unsupervised techniques in extractive summarization to discover new features for documents. On the other hand, evaluation still represents a challenging task and needs more work: owing to the variety of summarization types, it is necessary to find the best evaluation method for each type. Besides, since building manual summaries is a tedious task and two human experts usually build different summaries, there is a need to automate evaluation; however, we still do not know whether evaluation can be automated sufficiently.

REFERENCES:
[1] N. Munot and S. S. Govilkar, “Comparative Study of Text Summarization Methods,” Int. J. Comput. Appl., vol. 102, no. 12, pp. 975–8887, 2014.
[2] H. P. Luhn, “The automatic creation of literature abstracts,” IBM J. Res. Dev., vol. 2, no. 2, pp. 159–165, 1958.
[3] K. S. Jones, “Automatic summarising: The state of the art,” Inf. Process. Manag., vol. 43, no. 6, pp. 1449–1481, 2007.
[4] A. Nenkova and K. McKeown, “Automatic summarization,” Found. Trends® Inf. Retr., vol. 5, no. 2–3, pp. 103–233, 2011.
[5] J. Goldstein, V. Mittal, J. Carbonell, and M. Kantrowitz, “Multi-document summarization by sentence extraction,” in Proceedings of the 2000 NAACL-ANLP Workshop on Automatic summarization, 2000, pp. 40–48.
[6] R. Barzilay and K. R. McKeown, “Sentence fusion for multidocument news summarization,” Comput. Linguist., vol. 31, no. 3, pp. 297–328, 2005.
[7] D. M. Zajic, B. J. Dorr, and J. Lin, “Single-document and multi-document summarization techniques for email threads using sentence compression,” Inf. Process. Manag., vol. 44, no. 4, pp. 1600–1610, 2008.
[8] A. Nenkova, “Summarization evaluation for text and speech: issues and approaches,” in Ninth International Conference on Spoken Language Processing, 2006.
[9] S. Gholamrezazadeh, M. A. Salehi, and B. Gholamzadeh, “A comprehensive survey on text summarization systems,” in Computer Science and its Applications, 2009. CSA’09. 2nd International Conference on, 2009, pp. 1–6.
[10] C. T. Shubhangi, “An approach to single document text summarization and simplification,” IOSR J. Comput. Eng., vol. 16, no. 3, pp. 42–49, 2014.
[11] H. D. Kim, K. Ganesan, P. Sondhi, and C. Zhai, “Comprehensive review of opinion summarization,” 2011.
[12] B. Liu, “Sentiment analysis and opinion mining,” Synth. Lect. Hum. Lang. Technol., vol. 5, no. 1, pp. 1–167, 2012.
[13] Y. Gong and X. Liu, “Generic text summarization using relevance measure and latent semantic analysis,” in Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, 2001, pp. 19–25.
[14] D. M. Dunlavy, D. P. O’Leary, J. M. Conroy, and J. D. Schlesinger, “QCS: A system for querying, clustering and summarizing documents,” Inf. Process. Manag., vol. 43, no. 6, pp. 1588–1605, 2007.
[15] X. Wan, “Using only cross-document relationships for both generic and topic-focused multi-document summarizations,” Inf. Retr. Boston., vol. 11, no. 1, pp. 25–49, 2008.
[16] Y. Ouyang, W. Li, S. Li, and Q. Lu, “Applying regression models to query-focused multi-document summarization,” Inf. Process. Manag., vol. 47, no. 2, pp. 227–237, 2011.
[17] M. A. Fattah and F. Ren, “GA, MR, FFNN, PNN and GMM based models for automatic text summarization,” Comput. Speech Lang., vol. 23, no. 1, pp. 126–144, 2009.
[18] K. Sarkar, “Syntactic trimming of extracted sentences for improving extractive multi-document summarization,” J. Comput, vol. 2, no. 7, pp. 177–184, 2010.
[19] J. Carbonell and J. Goldstein, “The use of MMR, diversity-based reranking for reordering documents and producing summaries,” in Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, 1998, pp. 335–336.
[20] Y. Tao, S. Zhou, W. Lam, and J. Guan, “Towards more effective text summarization based on textual association networks,” in Semantics, Knowledge and Grid, 2008. SKG’08. Fourth International Conference on, 2008, pp. 235–240.
[21] D. Wang, S. Zhu, T. Li, Y. Chi, and Y. Gong, “Integrating document clustering and multidocument summarization,” ACM Trans. Knowl. Discov. from Data, vol. 5, no. 3, p. 14, 2011.
[22] C. Wang, L. Long, and L. Li, “HowNet based evaluation for Chinese text summarization,” in Natural Language Processing and Knowledge Engineering, 2008. NLP-KE’08. International Conference on, 2008, pp. 1–6.
[23] D. Wang, T. Li, S. Zhu, and C. Ding, “Multi-document summarization via sentence-level semantic analysis and symmetric matrix factorization,” in Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, 2008, pp. 307–314.
[24] D. Wang, S. Zhu, T. Li, and Y. Gong, “Multi-document summarization using sentence-based topic models,” in Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, 2009, pp. 297–300.
[25] J. L. Neto, A. A. Freitas, and C. A. A. Kaestner, “Automatic text summarization using a machine learning approach,” in Brazilian Symposium on Artificial Intelligence, 2002, pp. 205–215.
[26] D. R. Radev, W. Fan, and Z. Zhang, “Webinessence: A personalized web-based multi-document summarization and recommendation system,” Ann Arbor, vol. 1001, p. 48103, 2001.
[27] L. Agnihotri, J. R. Kender, N. Dimitrova, and J. Zimmerman, “User study for generating personalized summary profiles,” in Multimedia and Expo, 2005. ICME 2005. IEEE International Conference on, 2005, pp. 1094–1097.
[28] A. Díaz and P. Gervás, “User-model based personalized summarization,” Inf. Process. Manag., vol. 43, no. 6, pp. 1715–1734, 2007.
[29] C. Kumar, P. Pingali, and V. Varma, “Generating personalized summaries using publicly available web documents,” in Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT’08. IEEE/WIC/ACM International Conference on, 2008, vol. 3, pp. 103–106.
[30] R. Witte, R. Krestel, and S. Bergler, “Generating update summaries for DUC 2007,” in Proceedings of the Document Understanding Conference, 2007, pp. 1–5.
[31] L. Zhou, M. Ticrea, and E. Hovy, “Multi-document biography summarization,” arXiv Prepr. cs/0501078, 2005.
[32] A. R. Deshpande and L. Lobo, “Text summarization using clustering technique,” Int. J. Eng. Trends Technol., vol. 4, no. 8, 2013.
[33] L. Vanderwende, H. Suzuki, C. Brockett, and A. Nenkova, “Beyond SumBasic: Task-focused summarization with sentence simplification and lexical expansion,” Inf. Process. Manag., vol. 43, no. 6, pp. 1606–1618, 2007.
[34] A. Haghighi and L. Vanderwende, “Exploring content models for multi-document summarization,” in Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 2009, pp. 362–370.
[35] D. R. Radev, E. Hovy, and K. McKeown, “Introduction to the special issue on summarization,” Comput. Linguist., vol. 28, no. 4, pp. 399–408, 2002.
[36] I. H. Witten, G. W. Paynter, E. Frank, C. Gutwin, and C. G. Nevill-Manning, “KEA: Practical automatic keyphrase extraction,” in Proceedings of the fourth ACM conference on Digital libraries, 1999, pp. 254–255.
[37] S. P. Singh, A. Kumar, A. Mangal, and S. Singhal, “Bilingual automatic text summarization using unsupervised deep learning,” in Electrical, Electronics, and Optimization Techniques (ICEEOT), International Conference on, 2016, pp. 1195–1200.
[38] L. H. Reeve, H. Han, S. V Nagori, J. C. Yang,
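Most results in the comparison that follows are reported as ROUGE-N recall, precision or F-measure. As a reminder of what these numbers measure, the following is a minimal Python sketch of ROUGE-N computed from clipped n-gram overlap between a candidate summary and a single reference summary. The function name rouge_n and the whitespace tokenizer are illustrative assumptions; the official ROUGE toolkit adds options such as stemming, stop-word removal and multiple references, so scores from this sketch are not directly comparable with the published values.

```python
from collections import Counter


def rouge_n(candidate: str, reference: str, n: int = 1) -> dict:
    """Return ROUGE-N recall, precision and F-measure for one reference."""

    def ngram_counts(text: str) -> Counter:
        # Assumption: lowercased whitespace tokens, no stemming or stop-word removal.
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand = ngram_counts(candidate)
    ref = ngram_counts(reference)

    # Clipped overlap: each n-gram counts at most as often as it occurs in both texts.
    overlap = sum((cand & ref).values())

    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return {"recall": recall, "precision": precision, "f_measure": f_measure}


# Hypothetical toy example: five of the six reference unigrams are matched.
scores = rouge_n("the cat sat on the mat", "the cat lay on the mat", n=1)
```

Recall divides the overlap by the number of n-grams in the reference, precision divides it by the number in the candidate, and the F-measure is their harmonic mean, which is why some rows report all three values and others only recall.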
2004 | Graph-based ranking algorithm (TextRank) [53] | DUC 2002 | ROUGE-1 = 0.4229 | Input document: Single document. Adv.: Adaptability with any language or domain.
2004 | Graph-based ranking algorithm (LexRank) [54] | DUC 2003, DUC 2004 | On DUC 2003, ROUGE-1 = 0.3646. On DUC 2004, ROUGE-1 = 0.3966. On 17% noisy DUC 2003, ROUGE-1 = 0.3621. On 17% noisy DUC 2004, ROUGE-1 = 0.3905 | Input document: Multi documents. Adv.: Obtain good information coverage in generated summary. Prevents unnaturally high idf scores from increasing the score of a sentence that is unrelated to the topic (work well with noisy data).
2005 | LSA+TRM [89] | 100 political articles from New Taiwan Weekly | Recall = Precision = F-measure = 0.4442 | Input document: Single document. Adv.: Generated summary composed of semantically related sentences. Approach is language independent.
2006 | Fuzzy-Rough set (Fuzzy c-mean clustering) [92] | 8 pdf articles from Journal of Artificial Intelligence Research (JAIR) | F-Measure = 0.4620391 | Input document: Single document. Adv.: Give good information coverage and reduce redundancy.
2007 | Graph ranking algorithm (Affinity Graph) + Greedy algorithm (for high information richness & novelty). [55] | DUC 2002, DUC 2003, DUC 2004, DUC 2005 | On DUC 2002, ROUGE-1 = 0.38111, ROUGE-2 = 0.08163, ROUGE-W = 0.12292. On DUC 2004, ROUGE-1 = 0.39926, ROUGE-2 = 0.08793, ROUGE-W = 0.12228. On DUC 2003, ROUGE-1 = 0.36187, ROUGE-2 = 0.07114, ROUGE-W = 0.11464 | Input document: Multi documents. Adv.: Generate generic and Topic-focused summaries. Handle redundancy issue. Dis-Adv.: Words are independent with each other; so, it may contain a shortage in semantic relations.
2009 | WF+TE+CQP features in DUC 2002. | DUC 2002, 5 articles from fairy tales domain | On DUC 2002, F-measure of: ROUGE-1 = 0.45611, ROUGE-2 = 0.20252, ROUGE-SU4 = 0.22200, ROUGE-L = 0.41382 | Input document: Single document. Adv.: Can summarize documents that have no title. Doesn’t require much processor. Handle redundancy problem.
2009 | Fuzzy-Logic [93] | DUC 2002 | ROUGE-1: | Input document: Single document.
2012 | Dependency graphs + Louvain clustering algorithm (keywords level) [75] | DUC 2001, DUC 2002, British Columbia conversation corpus (BC3), Concisus corpus of event summaries | Recall score (%): On DUC 2001, ROUGE-1 = 45.7, ROUGE-L = 40.6, ROUGE-SU1 = 26.2. On DUC 2002, ROUGE-1 = 48.8, ROUGE-L = 44, ROUGE-SU1 = 29.4 | Input document: Single document. Adv.: Can summarize documents that have no title. Can summarize multiple genres of documents and is language independent.
2012 | LSA + Optimization methods (Greedy method + Dynamic programming) [90] | DUC 2005, DUC 2006, DUC 2007, TAC 2008, TAC 2009, TAC 2010, TAC 2011 | On DUC 2005, ROUGE-2 = 0.081, ROUGE-SU4 = 0.134. On DUC 2006, ROUGE-2 = 0.102, ROUGE-SU4 = 0.152. On DUC 2007, ROUGE-2 = 0.128, ROUGE-SU4 = 0.175. On TAC 2008, ROUGE-2 = 0.103, ROUGE-SU4 = 0.136, ROUGE-SU4 = 0.142, ROUGE-SU4 = 0.135, ROUGE-SU4 = 0.162 | Input document: Multi documents. Adv.: Uses a greedy method and a dynamic programming algorithm to handle the term weight computation task and the sentence extraction task separately, which achieves high coverage with low redundancy. The model is language-independent.
2013 | Graph ranking algorithm + Association rule mining + Greedy algorithm (for maximum coverage & relevance). [56] | DUC 2004, 5 real life documents in news. | On DUC 2004, ROUGE-2: Recall = 0.093, Precision = 0.099, F-measure = 0.097. ROUGE-SU4: Recall = 0.015, Precision = 0.021, F-measure = 0.019 | Input document: Multi documents. Adv.: Can discover correlations between terms by association rules. A flexible and portable approach.
2014 | Deep learning (Restricted Boltzmann Machine) [78] | Documents from networking and software engineering domains | On networking domain, Recall = 0.429, Precision = 0.6, F-measure = 0.490. On software engineering domain, | Input document: Multi documents. Dis-Adv.: Sensitivity to datasets.
2015 | Fuzzy-Logic + LSA [95] | 10 different datasets | Average results (%): Recall = 44.36375, Precision = 90.77572, F-measure = 67.56974 | Input document: Single document. Adv.: Handles the sentence scoring problem by fuzzy logic and generates semantic summaries based on LSA.
2015 | Deep learning (DBN) + Dynamic programming [77] | DUC 2005, DUC 2006, DUC 2007 | On DUC 2005, ROUGE-1 = 0.3751, ROUGE-2 = 0.0775, ROUGE-SU4 = 0.1341. On DUC 2006, ROUGE-1 = 0.4015, ROUGE-2 = 0.0928, ROUGE-SU4 = 0.1479 | Input document: Multi documents. Adv.: First algorithm to summarize query-oriented multi-documents by deep learning. Significant concepts are pushed out layer by layer efficiently. Perfect model for feature extraction.
2015 | Deep learning (CNN Language model) + Cosine similarity + Optimization method (DivSelect with help of PageRank algorithm) [80] | DUC 2002, DUC 2004 | On DUC 2002, ROUGE-1 = 0.51013, ROUGE-2 = 0.26972, ROUGE-SU4 = 0.29431. On DUC 2004, ROUGE-1 = 0.40907, ROUGE-2 = 0.10723, ROUGE-SU4 = 0.14969 | Input document: Multi documents. Adv.: Powerful model for sentence representation based on a neural network language model. Handles redundancy issue. Provides DivSelect as a diversified selection method. Keeps the diversity and prestige of chosen sentences balanced.
2015 | Deep learning (CNN) + Regression model + Greedy algorithm [81] | DUC 2001, DUC 2002, DUC 2004 | The results are (%): On DUC 2001, ROUGE-1 = 35.98, ROUGE-2 = 7.89. On DUC 2002, ROUGE-1 = 36.63, ROUGE-2 = 8.97, ROUGE-2 = 10.07, | Input document: Multi documents. Adv.: Picks up the independent features of the document which reflect it. The model is able to exploit all potential semantic representation aspects hidden in the text. Handles redundancy issue.
2016 | Graph ranking algorithm (word order relationship for connecting vertices) + Lexical association [57] | DUC 2002 | ROUGE-1 Recall = 0.48645, ROUGE-2 Recall = 0.39927, | Input document: Single document. Adv.: Finds keywords that represent the text topic based on lexical association. Spends little time extracting keywords. Good coherence in the final summary.
2016 | Deep learning (CNN) + Greedy algorithm [82] | DUC 2005, DUC 2006, DUC 2007 | The results are (%): On DUC 2005, ROUGE-1 = 37.01, ROUGE-2 = 6.99, ROUGE-2 = 11.55, | Input document: Multi documents. Adv.: Used to summarize query-focused multi documents. Handles query relevance and sentence saliency jointly. Applied neural attention method
2017 | Deep learning (GRU-RNN) + Greedy algorithm [83] | CNN/Daily Mail corpus, DUC 2002 | On Daily Mail, the Recall value with 75 bytes of summary length: ROUGE-1 = 26.2, ROUGE-2 = 10.8, ROUGE-L = 14.4. With 275 bytes of summary length: ROUGE-1 = 42.0, ROUGE-2 = 16.9, ROUGE-L = 34.1 | Input document: Single document. Adv.: Interpretability of visualization for its predictions. Allows the extractive model to be trained using extractive labels (via an unsupervised way which converts abstractive summaries to extractive labels), and using human (abstractive) summaries without the need for labeled data.
2017 | Adaptive Neuro-Fuzzy Inference System (ANFIS) (Fuzzy-logic based on neural network) [97] | DUC 2002 | Precision = 0.7128, Recall = 0.6982, F-measure = 0.7054 | Input document: Single document. Adv.: Tackles the problem of needing human experts to build fuzzy rules by using the subtractive clustering method to generate rules automatically.
2017 | Graph ranking algorithm (Topic Association Graph) + Lexical association [58] | DUC 2002 | For ROUGE-1: Precision = 0.51430, Recall = 0.61643, F-measure = 0.56050. For ROUGE-2: Precision = 0.40323, Recall = 0.48410, F-measure = 0.43977 | Input document: Single document. Adv.: Presents a new technique for connecting the vertices in a way that increases the incoming edges for topic-central words (Topic Association Graph). Enables the use of degree centrality measures for calculating vertex strength.
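Several techniques in the comparison above build on latent semantic analysis ([89], [90], [95]). The three LSA steps described earlier (input matrix creation, SVD and sentence selection) can be illustrated with the minimal Python sketch below. It is not the implementation of any cited paper. It assumes NumPy is available, uses raw term frequency as the weight function, and selects, for each leading row of V^T, the sentence with the largest entry, in the spirit of Gong and Liu's method [13]. Taking the absolute value of the entries is an added assumption, since the sign of a singular vector is arbitrary. The name lsa_summarize and the toy sentences are hypothetical.

```python
import numpy as np


def lsa_summarize(sentences, num_sentences=2):
    """Pick sentences by LSA: term-sentence matrix, SVD, selection from V^T."""
    # Step 1: input matrix creation (rows = words, columns = sentences).
    # Assumption: cell weight is the raw term frequency.
    tokenized = [s.lower().split() for s in sentences]
    vocab = sorted({w for tokens in tokenized for w in tokens})
    row_of = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(sentences)))
    for j, tokens in enumerate(tokenized):
        for w in tokens:
            A[row_of[w], j] += 1.0

    # Step 2: singular value decomposition, A = U * diag(sigma) * Vt.
    # Rows of Vt are concepts (ordered by singular value), columns are sentences.
    U, sigma, Vt = np.linalg.svd(A, full_matrices=False)

    # Step 3: sentence selection. For each concept in turn, take the
    # not-yet-chosen sentence with the largest absolute entry.
    chosen = []
    for concept in Vt:
        for j in np.argsort(-np.abs(concept)):
            if int(j) not in chosen:
                chosen.append(int(j))
                break
        if len(chosen) == num_sentences:
            break

    # Return the selected sentences in their original document order.
    return [sentences[j] for j in sorted(chosen)]


# Hypothetical toy document.
sentences = [
    "latent semantic analysis builds a word by sentence matrix",
    "singular value decomposition factors the matrix into three matrices",
    "fuzzy logic assigns membership values between zero and one",
    "the matrix rows are words and the matrix columns are sentences",
]
summary = lsa_summarize(sentences, num_sentences=2)
```

The other selection methods listed in Table 3 differ only in step 3, for example by also using the singular values in Σ or the length of each sentence vector.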