
IJSAR, 6(5), 2019; 20-30

International Journal of Sciences & Applied Research

www.ijsar.in

An Extractive Approach for English Text Summarization


Kanchan D. Patil¹, Sandip A. Patil², Yogesh S. Deshmukh¹*
¹Department of Information Technology, Sanjivani College of Engineering, Kopargaon, India.
²Department of Computer Engineering, Sanjivani College of Engineering, Kopargaon, India.
*Corresponding author: Yogesh S. Deshmukh, Department of Information Technology, Sanjivani College of Engineering, Kopargaon, India.
_____________________________________________________________________________________________
Abstract
Natural language processing (NLP) is a vast area of computer science and artificial intelligence concerned with the interactions between computers and human languages. "Natural language" means a language that humans use for daily communication. Developing NLP applications is challenging because computers traditionally require humans to "speak" to them in a programming language that is precise, unambiguous and highly structured. Text summarization is one of the research areas of NLP; it produces a meaningful, short description of large text documents using different NLP tools and techniques. Nowadays, as we deal with huge amounts of digital data, automatic text summarization techniques have become a necessity. Text summarization is classified into two major categories: extractive text summarization and abstractive text summarization. This paper focuses on different extractive text summarization techniques used for Indian languages.
Keywords: NLP; Text Summarization; Extractive Text Summarization; Abstractive Text Summarization

Introduction
Nowadays we are dealing with a large amount of digital data on the Internet. If you search for some information on a search engine, for example 'Text', the web returns a large set of documents containing your search word 'Text'. The information may or may not be relevant to your search, and the contents may be duplicated. It is difficult for a human to read all the documents completely [1, 2, 3], so we require automatic text summarization. According to Radev et al. [4], a summary is defined as "a text that is produced from one or more texts, that conveys important information in the original text(s), and that is no longer than half of the original text(s) and usually, significantly less than that". Automatic summarization can be defined as a process of shortening a text document with software in order to create a summary with the major points of the original document. These automatic summarization tools and techniques help humans read and understand a document in a short time. Automatic text summarization is used in various applications such as search engines, articles, newspapers, research abstracts etc. [5]. Research in the area of text summarization started in the 1950s, but to date no system is available that summarizes text like a human. Some are focusing on
Abstractive Text Summarization and some on Extractive Summarization. Extractive methods work by selecting a subset of the existing words, phrases, or sentences in the original text to form the summary. In contrast, in Abstractive Text Summarization the system understands the contents of the document and then creates a summary in its own words [3]. As this technique tries to produce a generalized, human-like summary, it needs advanced natural language processing techniques. An abstractive text summarization method generates sentences from a semantic representation and then uses natural language generation techniques to create a summary that is closer to what a human might generate. Such summaries may contain word sequences that are not present in the original (Steinberger & Ježek, 2008). Abstractive summarization consists of understanding the original text and re-telling it in fewer words. It uses linguistic approaches such as lexical chains, WordNet, graph theory, and clustering to understand the original text and generate the summary. On the other hand, extractive text summarization works by selecting a subset of existing words, phrases or sentences from the original text to form the summary. It is mainly concerned with what the summary content should be, and usually relies on the extraction of sentences (Das & Martins, 2007). This type of summarization uses statistical approaches such as the title method, location method, Term Frequency-Inverse Document Frequency (TF-IDF) method, and word method for selecting important sentences or keywords from the document (Munot & Govilkar, 2014) [16].

Text summarization techniques

Abstractive Text Summarization
In Abstractive Text Summarization, systems generate new phrases, possibly rephrasing or using words that were not in the original text. Naturally, abstractive approaches are harder. For a perfect abstractive summary, the model has to first truly understand the document and then try to express that understanding briefly, possibly using new words and phrases. This is considerably harder than the extractive approach, as it requires complex capabilities such as generalization, paraphrasing and incorporating real-world knowledge. The basic subject-verb-object form of the sentences is considered for the abstractive summarization method. Some steps of this method are given below:
Step I: Preprocessing: Each sentence is processed to create a semantic graph. The actions performed in this step are sentence segmentation, stop word removal and stemming.

Figure 1. Generic Abstractive Text Summarization (stages: Preprocessing — sentence segmentation, stopword removal, stemming, sentence removal; Content Selection — sentence scoring using various sets of rules to extract interesting content and salient information; Summary Generation — using NL generation of sentences, predefined templates etc.)

Demerits:
The biggest challenge for abstractive summarization is the representation problem. Systems' capabilities are constrained by the richness of their representations and their ability to generate such structures; systems cannot summarize what their representations cannot capture [9, 12].

Extractive Text Summarization
The basic process flow of Extractive Text Summarization is given in Figure 2.

Figure 2. Generic Extractive Text Summarization (stages: Preprocessing — sentence segmentation, stopword removal, stemming, sentence removal; Scoring of Sentences — sentence scoring using various word-level, sentence-level and graph-level features; Summary Generation — selection of sentences as per their ranking, in the same order as in the original text document)

Step I: Preprocessing
In this step, sentences are segmented using some appropriate method. Generally,
symbols like '.', '?', and '!' are used to mark the sentence end. Stop words like 'a', 'an', 'at' are removed as they do not convey information relevant to the actual topic of summarization. After that, stemming is performed. Stemming is the process of reducing derived words to their word stem, base or root form. For example, an English stemmer should reduce the words 'singing' and 'sang' to the root word 'sing'. After that, sentences containing information unnecessary for the summary, such as 'diagrams' and 'tables', are removed.
Step II: Scoring of Sentences
After preprocessing, the sentences in the original document are scored using different word-level, sentence-level and graph-level features.
Step III: Summary Generation
In this step, the scored sentences are selected in the same order as they appear in the original document.

Demerits:
1. Extracted sentences usually tend to be longer than average. Due to this, parts of segments that are not essential for the summary also get included, consuming space.
2. Important or relevant information is usually spread across sentences, and extractive summaries cannot capture this (unless the summary is long enough to hold all those sentences).
3. Conflicting information may not be presented accurately [9, 12, 13].

Extractive text summarization methods

Feature Priority Based Sentence Filtering Method [6]
In this method, features such as Term Frequency-Inverse Sentence Frequency (TF-ISF), named entity presence and proper noun presence are used to select the sentences. Inverse sentence frequency is a variation of inverse document frequency (IDF). ISF suggests that if a term is less frequent in the whole document then it is more important for the sentence. The second feature is named entity presence. Stanford NER, an implementation of a Named Entity Recognizer, is used to identify all the named entities present in the sentences. Named Entity Recognition (NER) labels sequences of words in a text which are the names of things, such as person and company names, or gene and protein names. The sentences are scored based on the named entities present in them. Lastly, a POS tagger is used to identify the proper noun feature. A Part-Of-Speech Tagger (POS Tagger) is a piece of software that reads text in some language and assigns a part of speech to each word, such as noun, verb, adjective, etc. Sentences which contain proper nouns are the most important and convey maximum information. In this method, the researchers started the summary with the first sentence of the original document and ended it with the last sentence of the document. These sentences improve readability, and they are also important sentences according to the sentence location feature. The scores of the intermediate sentences are calculated using the following Feature Priority Filtering Algorithm. Although it is very difficult to find an efficient extractive summary using different feature combinations, this method works effectively as it takes advantage of the sentence location feature.

Algorithm: Feature Priority Filtering Algorithm
1. Compute the TF-ISF score for each term.
2. Calculate the TF-ISF score of each sentence on the basis of the terms present in the sentence.
3. Select the top 50% of sentences on the basis of TF-ISF score.
4. Apply the named entity recognizer on the selected sentences.
5. Select the top 50% of sentences on the basis of named entity presence.
6. Apply POS tagging on the selected sentences.
7. Put the sentences in a list L in decreasing order of their score using proper nouns.
8. To generate a summary of n sentences, select the first sentence of the document and add it to the summary, then select the n-2 top sentences (other than the first and last) from L and add them to the summary, then add the last sentence to the summary.

Graph Based Approach [17]
In every document, the nouns of the text play the most vital role in helping us understand the meaning of the text on the basis of the context it was written in. One approach constructs a graph of all the nouns of the text to determine how closely related the nouns are to each other, which ultimately helps in weighing the sentences. The sentences are scored based on how significant the nouns present in the sentence are to the entire document. The high-scoring sentences are considered the most important sentences in the text and are chosen for the summary. All graph-based methods mainly include the tasks of pre-processing, building graph models, applying ranking algorithms and finally generating summaries. In the preprocessing phase, the text is divided into sentences, and the sentences are further decomposed into words. Each word is assigned the most appropriate part-of-speech tag depending on the form of the word and the tags of its neighboring words. After that, a graph is built with nouns as vertices, where the weights of the edges connecting them represent the relevance between the nouns. Then any one of several techniques, such as a weighted graph model using a hybrid approach, a ranking algorithm, or a shortest path algorithm, is used to score the sentences, and the sentences with the highest scores are used to generate the summary. This method of text summarization works well with news articles, Wikipedia searches, technical documents etc.

Term Frequency-Inverse Document Frequency (TF-IDF) Approach [15,16]
Term Frequency-Inverse Document Frequency is a numerical statistic which shows how important a word is to a document in a collection. To generate a summary, non-stop words that occur more frequently are considered as query words. Then the term frequency and the inverse document frequency are calculated for each non-stop word. The number of times a term occurs in a document is called its term frequency (TF). TF says that words occurring more frequently better reflect the content of the document than words occurring less frequently. It can be calculated as:

TF(t) = (Number of times term t appears in a document) / (Total number of terms in the document). (1)

The inverse document frequency (IDF) is a measure of how much information the word provides, that is, whether the term is common or rare across all documents. IDF suggests that if a term occurs in fewer documents of the collection then it is more informative for the document. It can be calculated as:

IDF(t) = log(Total number of documents in the collection / Number of documents with term t in it). (2)

Thematic words are obtained by combining the two frequencies, referred to as the TF(t) * IDF(t) measure. Once the TF(t) * IDF(t) score has been computed for each word, the next step is to calculate the number of such thematic words per sentence. With this value, the sentences in the input text are ranked, and the highest-scored sentences are picked to be part of the summary. Redundancy of information is extremely high in this method.
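As a concrete illustration, the TF-IDF scoring just described can be sketched as follows. This is a minimal sketch, not code from the surveyed papers; the tokenizer, the tiny stop-word list and the choice of taking the `top_k` highest-scoring words as "thematic" are illustrative assumptions.

```python
import math
import re

STOP_WORDS = {"the", "a", "an", "of", "in", "is", "and", "to"}

def tokens(text):
    """Lowercase word tokenizer with a tiny illustrative stop-word list."""
    return [w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOP_WORDS]

def tfidf_rank_sentences(document, collection, top_k=3):
    """Rank the sentences of `document` by how many thematic words they
    contain, where thematic words score highest on TF(t) * IDF(t)."""
    words = tokens(document)

    # Eq. (1): TF(t) = times t appears in document / total terms in document
    tf = {t: words.count(t) / len(words) for t in set(words)}

    # Eq. (2): IDF(t) = log(total documents / documents containing t)
    idf = {t: math.log(len(collection) /
                       sum(1 for d in collection if t in tokens(d)))
           for t in tf}

    # Thematic words: the top_k words by TF(t) * IDF(t)
    thematic = sorted(tf, key=lambda t: tf[t] * idf[t], reverse=True)[:top_k]

    # Score each sentence by its count of thematic words, highest first
    sentences = [s.strip() for s in re.split(r"[.?!]", document) if s.strip()]
    return sorted(sentences,
                  key=lambda s: sum(t in tokens(s) for t in thematic),
                  reverse=True)
```

With the document included in its own collection, the denominator in equation (2) is never zero, and frequent document-specific words float to the top while words common to every document are discounted.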
Text Summarization using Fuzzy Logic [20,21,22]
Fuzzy logic system design usually involves selecting fuzzy rules and membership functions. The selection of fuzzy rules and membership functions directly affects the performance of the fuzzy logic system. The fuzzy logic system consists of four components: fuzzifier, inference engine, defuzzifier, and the fuzzy knowledge base. In the fuzzifier, crisp inputs are translated into linguistic values using a membership function applied to the input linguistic variables. After fuzzification, the inference engine refers to the rule base containing fuzzy IF-THEN rules to derive the linguistic values. In the last step, the output linguistic variables from the inference are converted to final crisp values by the defuzzifier, using a membership function that represents the final sentence score. In order to implement text summarization based on fuzzy logic, features such as sentence length, term weight, sentence position, sentence-to-sentence similarity, title words etc. are first used as input to the fuzzifier. Triangular membership functions and fuzzy logic are then used to summarize the document.
The input membership function for each feature is divided into five fuzzy sets composed of unimportant values (very low (VL) and low (L)), a median value (M), and important values (high (H) and very high (VH)).
In the inference engine, the most important part of the procedure is the definition of the fuzzy IF-THEN rules. Important sentences are extracted by these rules according to the feature criteria. A sample IF-THEN rule is shown below:

IF (NoWordInTitle is VH) and (SentenceLength is H) and (TermFreq is VH) and (SentencePosition is H) and (SentenceSimilarity is VH) and (NoProperNoun is H) and (NoThematicWord is VH) and (NumericalData is H) THEN (Sentence is important).

Likewise, the last step in the fuzzy logic system is defuzzification. The output membership function, which is divided into three membership functions (Unimportant, Average, and Important), is used to convert the fuzzy results from the inference engine into a crisp output for the final score of each sentence. In the fuzzy logic method, each sentence of the document is represented by a sentence score. Then all document sentences are ranked in descending order according to their scores. A set of the highest-scoring sentences is extracted as the document summary based on the compression rate. It has been shown that the extraction of 20 percent of the sentences from the source document can be as informative as the full text of the document. Finally, the summary sentences are arranged in the original order.

Figure 3. Fuzzy Inference Engine

Clustering Techniques [6,10]
Different approaches to clustering data can be described with the help of the hierarchy shown in Figure 4 (other taxonometric representations of clustering methodology are possible; this one is based on the discussion in Jain and Dubes [1988]). At the top level, there is a distinction between hierarchical and partitional approaches (hierarchical methods produce a nested series of partitions, while partitional methods produce only one). The taxonomy shown in Figure 4 must be supplemented by a discussion of cross-cutting issues that may (in principle) affect all of the different approaches regardless of their placement in the taxonomy.

Agglomerative vs. divisive: This aspect relates to algorithmic structure and operation. An agglomerative approach begins with each pattern in a distinct (singleton) cluster, and successively merges clusters together until a stopping criterion is satisfied. A divisive method begins with all patterns in a single cluster and performs splitting until a stopping criterion is met.

Monothetic vs. polythetic: This aspect relates to the sequential or simultaneous use of features in the clustering process. Most algorithms are polythetic; that is, all features enter into the computation of distances between patterns, and decisions are based on those distances. A simple monothetic algorithm reported in Anderberg [1973] considers features sequentially to divide the given collection of patterns.

Figure 4. Clustering Techniques

Hard vs. fuzzy: A hard clustering algorithm allocates each pattern to a single cluster during its operation and in its output. A fuzzy clustering method assigns degrees of membership in several clusters to each input pattern. A fuzzy clustering can be converted to a hard clustering by assigning each pattern to the cluster with the largest measure of membership.

Deterministic vs. stochastic: This issue is most relevant to partitional approaches designed to optimize a squared-error function. This optimization can be accomplished using traditional techniques or through a random search of the state space consisting of all possible labellings.

Incremental vs. non-incremental: This issue arises when the pattern set to be clustered is large, and constraints on execution time or memory space affect the architecture of the algorithm. The early history of clustering methodology does not contain many examples of clustering algorithms designed to work with large data sets, but the advent of data mining has fostered the development of clustering algorithms that minimize the number of scans through the pattern set, reduce the number of patterns examined during execution, or reduce the size of data structures used in the algorithm's operations.

Related works in Indian languages
Extractive summarization research in Indian languages is not as mature as for other languages like English, German, and Spanish. This is mainly due to the diversity of the Indian languages and the lack of resources such as raw data and various NLP tools. This section describes extractive summarization works in Indian languages such as Malayalam, Hindi, and Punjabi.

Malayalam Text Summarization [3]
Krishnaprasad P, Sooryanarayanan A and Ajeesh Ramanujan use an extractive approach to summarize text in the Malayalam language. They generate the summary from the given document by recombining extracted important sentences from the text. In order to identify the important sentences in the text, they follow the content word method: content words are extracted from the frequency distribution of the words excluding stop words. The proposed system comprises two components, a text analyzing component and a summary generation component. The text analyzing component is used to
identify the features associated with the sentences, and based upon the features it assigns a score to each sentence. The main tasks involved are sentence marking, feature extraction and sentence ranking. The summary generation component uses the sentence scores to generate the summary, and it involves two main tasks: sentence selection and summary generation.

Figure 5. The Extractive Summarization System for Malayalam

Figure 5 shows the architecture of the proposed summarization system. First, a text normalization process is performed, which involves splitting the text into sentences and further splitting the sentences into words. The normalized text is then used for the feature extraction process, which extracts the features associated with the sentences and the features associated with words. The features used are the frequency of a word and the number of characters in the word etc. The next process is the scoring of sentences: the system calculates the score of each sentence based on word frequency and the average number of characters per sentence. In the sentence ranking task, the system ranks each and every sentence in the given text based on the frequency of words in the text and the average number of characters in each word. After sentence ranking, the next task is sentence selection. In this phase, the top N scored sentences could be used directly to generate the summary, but this harms coherence. So, after selecting the sentences, the sentences are recombined in the chronological order of the original input text to obtain a readable summary. The proposed system for Malayalam provides a fast method to generate the summary. For each news article, four summaries were generated based on condensation rates of 10, 15, 20 and 25 percent, and the generated summaries were evaluated against reference summaries using the standard ROUGE metric. The performance of the given system may be improved by adding a stemming process, improving the sentence splitting criteria and adding more features.

Hindi Text Summarization
Nikita Desai and Prachi Shah in their paper proposed automatic text summarization using a supervised machine learning technique for the Hindi language. They represent each sentence in the document by a set of various features, namely: sentence paragraph position, sentence overall position, numeric data, presence of inverted commas, sentence length and keywords in the sentence. The sentences are classified into one of four classes, namely: most important, important, less important and not important. The classes in turn have ranks from 4 to 1 respectively, with "4" indicating the most important sentences and "1" the least relevant. Next, a supervised machine learning tool, SVMrank, is used to train the summarizer to extract important sentences based on the feature vector. The sentences are ordered according to the ranking of classes. Then, based on the required compression ratio, sentences are included in the final summary.
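The class-based selection just described — rank sentences by class 4 down to 1, cut off at the compression ratio, then restore document order — can be sketched as follows. This is an illustrative sketch, not the authors' implementation; the trained SVMrank classifier is assumed to have already produced the per-sentence class ranks.

```python
def select_by_rank(sentences, ranks, compression_ratio):
    """Pick the highest-ranked sentences (class 4 first, then 3, then 2, ...)
    until the summary reaches compression_ratio of the original sentence
    count, then emit them in their original document order."""
    limit = max(1, int(len(sentences) * compression_ratio))
    # Sort sentence indices by class rank, best (4) first; the stable sort
    # keeps earlier sentences first within the same class.
    by_rank = sorted(range(len(sentences)), key=lambda i: -ranks[i])
    chosen = sorted(by_rank[:limit])  # restore document order
    return [sentences[i] for i in chosen]
```

For example, with ranks [1, 4, 2, 3] over four sentences and a 50% compression ratio, the two sentences with ranks 4 and 3 are kept, in their original order.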
The experiments were performed on news articles of different categories such as Bollywood, politics and sports. The performance of the technique was compared with human-generated summaries. The average results of the experiments indicate 72% accuracy at a 50% compression ratio and 60% accuracy at a 25% compression ratio.
The proposed technique is grouped into 3 major blocks: pre-processing, processing and extraction. The general outline of the methodology used is as follows.

Input: A text file (original, Og).
Output: A summarized text (S) of the original (Og), as per the compression ratio.
1. Read the input text file Og.
2. Pre-process the file Og. // Preprocessing step
   2.1 Segment the text file into sentences.
   2.2 Tokenize each sentence into words.
   2.3 Remove stop-words.
3. // Processing step
   3.1 Extract the following features from the Og file: sentence paragraph position (f1), sentence overall position (f2), numerical data in sentence (f3), presence of inverted commas (f4), sentence length (f5), keywords in sentence (f6).
   3.2 Apply the SVM model to rank sentences in the range from 4 to 1, with "4" indicating a most important sentence and "1" indicating a not important sentence.
4. Generate the summary. // Extraction step
   While (lines in summary file (S) do not exceed the maximum limit given by the compression ratio) Do
   4.1 Extract all lines from Og with rank 4.
   4.2 Extract all lines from Og with rank 3.
   4.3 Extract all lines from Og with rank 2.
5. Display the summary file (S).

In this system, the performance of each of the subtasks directly affects the ability to generate high-quality summaries. It is also noteworthy that summarization becomes more difficult as more compression is needed. More features like named entity recognition, cue words, context information, world knowledge etc. can be added to improve the technique. It would also be interesting to explore other suitable machine learning classifiers besides SVM for the task.

Punjabi Text Summarization
In their paper, the authors Vishal Gupta and Gurpreet Singh Lehal proposed an Automatic Punjabi Text Extractive Summarization System comprising two main phases: 1) pre-processing and 2) processing. Pre-processing produces a structured representation of the original Punjabi text. The preprocessing phase includes Punjabi word boundary identification, Punjabi sentence boundary identification, Punjabi stop word elimination, a Punjabi language stemmer for nouns and proper names, applying input restrictions, and elimination of duplicate sentences. In the processing phase, sentence features are calculated and the final score of each sentence is determined using a feature-weight equation. Top-ranked sentences in proper order are selected for the final summary. This demo paper concentrates on the Automatic Punjabi Text Extractive Summarization System.

PROPOSED SYSTEM
In the proposed system, we developed a system which generates the summary of a document by the extractive approach. A single document is given as input and then goes through four phases. The first phase includes sentence segmentation, tokenization, stemming and stop word removal; for stemming, Porter's algorithm is used. The document then goes through word and sentence scoring, which uses features like the title feature, upper-case words, sentence length and word frequency. After this, sentence ranking is done on the basis of the frequencies of words and sentences. Then the summary is generated.

The detailed system phases are as follows.
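Before the phase-by-phase description, the whole flow can be sketched end-to-end. This is a toy illustration under assumed choices of tokenizer, threshold and a crude suffix-stripping shortcut in place of Porter's stemmer; it is not the paper's code.

```python
import re
from collections import Counter

def summarize(document, threshold=1.0):
    """Toy four-phase extractive summarizer sketch:
    (1) segment, tokenize, stem, remove stop words,
    (2) compute word frequencies over the document,
    (3) score and rank sentences by those frequencies,
    (4) keep sentences whose score clears a threshold, in original order."""
    stop_words = {"a", "an", "the", "is", "of", "and", "to", "in"}

    # Phase 1: sentence segmentation, tokenization, stemming, stop words
    sentences = [s.strip() for s in re.split(r"[.?!]", document) if s.strip()]
    def terms(sentence):
        words = re.findall(r"[a-z]+", sentence.lower())
        # crude 'ed'/'ing' stripping stands in for Porter's stemmer
        return [re.sub(r"(ing|ed)$", "", w)
                for w in words if w not in stop_words]

    # Phase 2: word frequencies over the whole document
    freq = Counter(t for s in sentences for t in terms(s))

    # Phase 3: score each sentence by the average frequency of its terms
    def score(sentence):
        ts = terms(sentence)
        return sum(freq[t] for t in ts) / len(ts) if ts else 0.0

    # Phase 4: threshold selection, preserving original sentence order
    return [s for s in sentences if score(s) > threshold]
```

Sentences whose words recur across the document score above the threshold and survive; one-off sentences fall below it and are dropped.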
Phase 1:
In this phase, the proposed system accepts a single text document, and the phase contains the following steps:
1) Sentence segmentation: From the input text, each individual document D is segmented separately as D = S1, S2, ..., Sn, where n is the number of sentences in the document.
2) Tokenization: The terms of each sentence are tokenized as T = t1, t2, ..., tm, where m is the number of terms.
3) Stemming: Stemming performs the elimination of 'ed' and 'ing' suffixes from the given word. For that purpose, Porter's algorithm is used.
4) Stop word removal: Commonly used words in the English language such as 'a', 'an', and 'the', which have little significance with respect to the document, are removed.

Figure 6. Proposed Architecture

Phase 2:
In this phase, frequencies are allocated to each word. The frequency depends on how many times that particular term occurs in the document. For that purpose a standard formula is used as follows:
1. Formula to calculate word frequency:
2. Formula to calculate sentence frequency:
The following features are used for determining the sentence weights:
1) Title word feature: Sentences containing words that appear in the title are indicative of the theme of the document. These sentences have a greater chance of being included in the summary.
2) Term frequency method: This method is based on the frequency of a term. Words having higher frequency are taken into the summary.
3) Sentence length feature: Very long and very short sentences are usually not included in the summary.
4) Upper-case word feature: Upper-case words (with certain obvious exceptions) are treated as thematic words as well. Sentences containing acronyms or proper names are included.

Phase 3:
In this phase, ranking is done depending on the frequency score of each sentence; that is, the sentences are assigned scores and are then ranked in descending order. This becomes the input to the next process.

Phase 4:
To select the sentences obtained in Phase 3 for summary generation, we set a threshold value. Depending upon this threshold value, high-scoring sentences are used to generate the summary.

Conclusion
The main aim of this research work is to combine both approaches of query-dependent summarization and clustering of documents. The proposed work will be mainly focused on summarization of text files (i.e. .txt). The proposed work will be limited to clustering of text files; standard files related to topics popular amongst researchers will be used.
Standard performance evaluation metrics will be used to validate performance.

Acknowledgment
We would like to thank Mr. Amit Kolhe, Managing Trustee of Sanjivani College of Engineering, Kopargaon, India, and the Principal of Sanjivani College of Engineering, Kopargaon, India, for providing the resources needed to carry out the proposed work.

References
1. Elena Lloret, "Text Summarization: An Overview".
2. Mehdi Allahyari, Seyedamin Pouriyeh, Mehdi Assefi, Saeid Safaei, Elizabeth D. Trippe, Juan B. Gutierrez, Krys Kochut, "Text Summarization Techniques: A Brief Survey", arXiv, USA, July 2017.
3. Krishnaprasad P, Sooryanarayanan A, Ajeesh Ramanujan, "Malayalam Text Summarization: An Extractive Approach", IEEE International Conference on Next Generation Intelligent Systems (ICNGIS), 2016.
4. Dragomir R. Radev, Eduard Hovy, Kathleen McKeown, "Introduction to the Special Issue on Summarization", Computational Linguistics, 28, 4 (2002), 399-408.
5. Sunitha C., A. Jaya, Amal Ganesh, "A Study on Abstractive Text Summarization Techniques in Indian Languages", Elsevier, Fourth International Conference on Recent Trends in Computer Science and Engineering, 2016.
6. Yogesh Kumar Meena, Dinesh Gopalani, "Feature Priority Based Sentence Filtering Method for Extractive Automatic Text Summarization", Elsevier, International Conference on Intelligent Computing, Communication & Convergence (ICCC), 2015.
7. Sabina Yeasmin, Priyanka Basak Tumpa, Adiba Mahjabin Nitu, Md. Palash Uddin, Emran Ali, Masud Ibn Afjal, "Study of Abstractive Text Summarization Techniques", American Journal of Engineering Research, 2017, Volume 6, Issue 8, pp. 253-260.
8. Nikita Desai, Prachi Shah, "Automatic Text Summarization Using Supervised Machine Learning Technique For Hindi Language", International Journal of Research in Engineering and Technology, 2016, Vol. 5, Issue 6.
9. Vishal Gupta, Gurpreet Singh Lehal, "A Survey of Text Summarization Extractive Techniques", Journal of Emerging Technologies in Web Intelligence, August 2010, Vol. 2, No. 3.
10. Sheetal Shimpikar, Sharvari Govilkar, "A Survey of Text Summarization Techniques for Indian Regional Languages", International Journal of Computer Applications, Volume 165, No. 11, May 2017.
11. Yogesh Kumar Meena, Dinesh Gopalani, "Domain Independent Framework for Automatic Text Summarization", Elsevier, International Conference on Intelligent Computing, Communication & Convergence (ICCC), 2015.
12. Jimmy Lin, "Summarization", Encyclopedia of Database Systems, Heidelberg, Germany: Springer-Verlag, 2009.
13. Jackie CK Cheung, "Comparing Abstractive and Extractive Summarization of Evaluative Text: Controversiality and Content Selection", B.Sc. (Hons.) Thesis, Department of Computer Science, Faculty of Science, University of British Columbia, 2008.
14. Soumye Singhal, Arnab Bhattacharya, "Abstractive Text Summarization".
15. Rene Arnulfo Garcia-Hernandez, Yulia Ledeneva, "Word Sequence Models for Single Text Summarization", IEEE, 44-48, 2009.
16. Hans Christian, Mikhael Pramodana Agus, Derwin Suhartono, "Single Document Automatic Text Summarization Using
Term Frequency-Inverse Document Frequency (tf-idf)", ComTech, Vol. 7, No. 4, December 2016, pp. 285-294.
17. Akash Ajampura Natesh, Somaiah Thimmaiah Balekuttira, Annapurna P Patil, "Graph Based Approach for Automatic Text Summarization", International Journal of Advanced Research in Computer and Communication Engineering, Vol. 5, Special Issue 2, October 2016.
18. Khushboo S. Thakkar, R. V. Dharaskar, M. B. Chandak, "Graph-Based Algorithms for Text Summarization", IEEE, Third International Conference on Emerging Trends in Engineering and Technology, 2010.
19. Rasim Alguliev, Ramiz Aliguliyev, "Evolutionary Algorithm for Extractive Text Summarization", Scientific Research, Intelligent Information Management, November 2009, 1, 128-138.
20. S. A. Babar, S. A. Thora, "Improving Text Summarization using Fuzzy Logic & Latent Semantic Analysis", International Journal of Innovative Research in Advanced Engineering (IJIRAE), Volume 1, Issue 4, May 2014.
21. L. Suanmali, N. Salim, M. S. Binwahlan, "Fuzzy Logic Based Method for Improving Text Summarization", International Journal of Computer Science and Information Security, 2009, Vol. 2, No. 1, pp. 4-10.
22. S. A. Babar, Pallavi D. Patil, "Improving Performance of Text Summarization", Elsevier, International Conference on Information and Communication Technologies (ICICT), 2014.