
2024 5th International Conference for Emerging Technology (INCET)

Karnataka, India. May 24-26, 2024

Automation of Text Summarization using Hugging Face NLP

Asmitha M[1], Aashritha Danda[1], Hemanth Bysani[1], Rimjhim Padam Singh[1,*], Sneha Kanchan[2]
[1] Department of Computer Science and Engineering, Amrita School of Computing, Bengaluru, Amrita Vishwa Vidyapeetham, India
[2] Department of Internet Engineering and Computer Science, Universiti Tunku Abdul Rahman, Malaysia
[email protected], [email protected], [email protected], [email protected], [email protected]

Abstract—Within the expansive domain of "Natural Language Processing" (NLP), the task of "text summarization" emerges as a foundational element, playing a pivotal role in distilling relevant information from extensive textual corpora. In the digital age, the importance of efficient summarization becomes increasingly critical, given the overwhelming volume of textual information. This comprehensive study delves into the intricacies of both extractive and abstractive summarization techniques, placing a specific focus on transformer-based models like BERT and GPT. These models, celebrated for their remarkable capabilities in context comprehension and coherent summarization, are rigorously evaluated alongside established methods like TF-IDF, TextRank, Sumy, fine-tuned Transformers, Model-T5, LSTM, greedy search, and beam search. The practical implications of text summarization extend across diverse fields, encompassing news stories, academic papers, and social media content, underscoring its broad utility in various domains. This study not only incorporates cutting-edge models but also explores a gamut of evaluation methods to discern the quality of summarization. By intertwining theory and application, this research positions itself at the forefront of evolving summarization approaches, shedding light on the transformative impact on information consumption patterns. The dynamic landscape of summarization methods underscores the need for continuous research and innovation, as technological advancements continue to reshape how individuals access and comprehend information.

Index Terms—text summarization, Extractive Summarization, Abstractive Summarization, News Summarization

I. INTRODUCTION

In the widely evolving landscape of Natural Language Processing (NLP), the task of text summarization emerges as a cornerstone, playing a pivotal role in distilling relevant information from extensive textual corpora. The profound impact of NLP technologies on information retrieval, content curation, and user experience underscores the importance of advancing text summarization techniques. This comprehensive study embarks on a nuanced exploration of various state-of-the-art summarization models, with a specific emphasis on their performance within the intricacies of the CNN/Daily Mail dataset. This dataset, renowned for its diversity and complexity, serves as an ideal testing ground for evaluating the robustness and adaptability of text summarization models under real-world conditions.

Delving into the heart of the matter, the study meticulously dissects the methodologies employed by different summarization models, unraveling the intricacies of context comprehension, theme extraction, and abstraction. The significance of these models lies not only in their ability to generate concise summaries but also in their aptitude for grasping the nuanced layers of meaning embedded within textual data. As information proliferates across digital platforms, the demand for sophisticated summarization tools becomes increasingly imperative. The study aims to contribute comprehensive insights into the evolving landscape of text summarization, delineating the varied approaches adopted by different models to address the challenges posed by the CNN/Daily Mail dataset.

Among the array of models under scrutiny, the Hugging Face model "ml6team/mbart-large-cc25-cnn-dailymail-nl-finetune" takes center stage. Trained on this dataset, the model exhibits promising results in terms of accuracy and efficiency. The study meticulously unravels the intricacies of the model's training process, shedding light on the fine-tuning mechanisms that optimize its performance for the specific characteristics of the dataset. Model evaluation and visualization techniques are employed to provide a granular understanding of the Hugging Face model's output, presenting a comprehensive view of its strengths and potential avenues for refinement.

In the pursuit of comprehensive analysis, the study extends its scrutiny to various summarization models, creating a comprehensive benchmark for performance evaluation. The objective is not only to identify the superior model but also to discern the unique strengths and limitations inherent in each approach. As the study unfolds, the Hugging Face model stands out, showcasing superior accuracy and adaptability within the challenging landscape of the dataset. However, this analysis is not merely a proclamation of success; it is a recognition of the iterative nature of model development. The study underscores the importance of continuous refinement and adaptation to meet the evolving intricacies of textual data.

In conclusion, this study provides a holistic understanding of text summarization models, their training processes, and their performance on the dataset. As the Hugging Face model emerges as a frontrunner, the study contributes valuable insights to the ongoing discourse in NLP research. It illuminates the path forward, emphasizing the need for robust, adaptive models that can navigate the complexities of real-world textual data. The implications of this study extend beyond the confines of academic research, shaping the trajectory of NLP applications in diverse domains.
II. RELATED WORKS

For text summarization, Suleiman and his team [1] review many models related to machine learning and deep learning. They survey various deep learning models, including the "sequence-to-sequence encoder-decoder architecture" for RNN and CNN Seq2Seq models, and address the methodologies, datasets, evaluation measures, and challenges associated with each approach. Additionally, the authors discuss the limitations of existing evaluation measures, such as the use of ROUGE, and propose the need for new evaluation metrics that consider the context of words. The survey also highlights the challenges in generating high-quality datasets, particularly in languages like Arabic, and emphasizes the importance of addressing these challenges for the advancement of abstractive text summarization techniques.

Xu et al. [2] present a novel multitask learning framework for abstractive text summarization, incorporating a key information guide network to enhance the summarization process. The study introduces a multi-view attention guide network to automatically extract key information from the input text and utilize it to guide the generation of human-compliant summaries. The model is evaluated using the ROUGE metric, achieving notable results with 17.70 ROUGE-2 and 36.57 ROUGE-L scores.

The results obtained from the study by Fabbri [3] include the re-evaluation of 14 automatic evaluation metrics, the benchmarking of 23 recent summarization models, and the assembly of the largest collection of summaries generated by models trained on the dataset. The study also provides a toolkit for evaluating the different summarization models across a broad range of automatic metrics and shares the most diverse collection of human judgments of model-generated summaries on the dataset. Additionally, the study highlights the shortcomings of extractive and abstractive models, with extractive models scoring lower on coherence and relevance, and abstractive models showing an improving trend over time. The limitations of reference summaries in terms of consistency and relevance are also revealed.

The paper by Kahla et al. [4] presents a "cross-lingual fine-tuning" approach for abstractive text summarization in the Arabic language. The authors compiled a reliable Arabic abstractive news summary corpus consisting of 658 articles and their corresponding summaries, and used the Essex Arabic Summaries Corpus (EASC) for evaluation. They fine-tuned a multilingual pre-trained BERT model on their Arabic news summary corpus using a sequence-to-sequence model with an attention mechanism, and experimented with other models, including a model based on AraBERT and a model based on T5. The authors used the ROUGE-N metric for automatic evaluation and conducted manual evaluation using a ranking system and quality scores for adequacy and fluency. Their best-performing model achieved a ROUGE-1 F1 score of 0.29 and a ROUGE-2 F1 score of 0.12 on the EASC dataset. The authors note that there is still room for improvement in the quality of the summaries produced by their models.

In the study by Minakshi Tomer et al. [5], a novel approach for extractive summarization of multiple documents based on the firefly algorithm is presented. The research utilizes the DUC-2002, DUC-2003, DUC-2004, and TAC-11 datasets for evaluation. The methodology involves preprocessing, weight assignment using Shark Smell Optimization, scoring by a fuzzy system, and the generation of the end summary. The system-generated summaries are evaluated using the ROUGE toolkit, demonstrating improved performance compared to traditional methods. The results indicate that the proposed algorithm yields better summarization accuracy, as supported by the ROUGE score.

M. F. Mridha et al. [6] provide a comprehensive literature survey on automatic text summarization. The survey covers various datasets used in automatic text summarization, including the TeMario Corpus, CNN News, the Daily Mail dataset, EASC, LCSTS, Wikihow, New Taiwan Weekly, Opinosis, SKE, and the Enron dataset. It discusses a wide range of methodologies, including extractive and abstractive summarization techniques. The paper also addresses supervised and unsupervised learning methods, semantic-based methods, and hybrid methods. Evaluation techniques such as ROUGE, BLEU, and METEOR are explored, along with the need for human evaluation. While the paper does not present specific results or accuracy scores, it highlights open problems and challenges in automatic text summarization, including the need for better evaluation techniques, achieving a higher level of abstraction, and the lack of meaningful, intuitive, and robust summarization results.

The literature survey conducted by A. P. Widyassari et al. [7] provides a systematic review of automatic text summarization research from 2008 to 2019. The study employs the Systematic Literature Review (SLR) technique to identify, evaluate, and interpret research results in the field of text summarization. The review encompasses 85 journal and conference publications, focusing on topics/trends, datasets, preprocessing, features, techniques, methods, evaluations, and challenges in text summarization. The survey categorizes research into extractive and abstractive summarization approaches, highlighting the shift towards abstractive summarization and real-time summarization. Evaluation metrics such as copy rate and accuracy are discussed, with a focus on the challenges and opportunities in text summarization research.

The paper by El-Kassas et al. [9] provides a comprehensive survey of Automatic Text Summarization (ATS) techniques and methodologies. The authors, Wafaa S. El-Kassas, Cherif R. Salama, Ahmed A. Rafea, and Hoda K. Mohamed, review the different approaches to ATS, including extractive, abstractive, and hybrid methods. They also discuss the various building blocks and techniques used in ATS systems, such as text summarization operations, statistical and linguistic features, and text summarization evaluation. The authors provide examples of real-world applications of ATS, including book, story/novel, and email summarization. The paper also covers the challenges involved in evaluating the quality of generated summaries, and the different evaluation methods used. Overall, the paper provides a comprehensive overview of the current state of ATS research and its potential applications.
III. DATASET

The CNN/DailyMail Dataset is a comprehensive English-language collection comprising over 300,000 unique news articles from CNN and the Daily Mail. Initially designed for machine reading and comprehension, versions 2.0.0 and 3.0.0 transformed the dataset to support abstractive and extractive summarization. With a focus on model evaluation through ROUGE scores, the dataset consists of train (287,113), validation (13,368), and test (11,490) splits. Each instance includes an article, highlights, and a unique identifier. The mean token counts for articles and highlights are 781 and 56, respectively. The dataset, spanning April 2007 to April 2015, aims to facilitate the development of models adept at summarizing extensive text into concise sentences. Notably, concerns about biases, gender bias measurements, and potential limitations in article structure and co-reference errors are discussed, highlighting the dataset's nuances for future research.
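As a concrete illustration, the dataset described above can be pulled directly from the Hugging Face hub with the datasets library; the following minimal sketch assumes only the publicly documented split names and fields (article, highlights, id), and the variable names are illustrative:

# Sketch: load the CNN/DailyMail corpus (version 3.0.0) and inspect one instance.
from datasets import load_dataset

dataset = load_dataset("cnn_dailymail", "3.0.0")

# Expected split sizes: train 287,113 / validation 13,368 / test 11,490.
for split in ("train", "validation", "test"):
    print(split, len(dataset[split]))

sample = dataset["train"][0]
print(sample["article"][:300])   # full news article text
print(sample["highlights"])      # reference summary (the "highlights")
print(sample["id"])              # unique identifier of the instance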
IV. METHODOLOGY

Text summarization is the process of distilling the most important information from a piece of text to create a concise and coherent summary while retaining the main ideas and key points. The goal of text summarization is to provide users with a shorter version of the original text that captures its essence and allows them to quickly grasp the main ideas without having to read the entire document.

The proposed methodology involves training and fine-tuning Pegasus, RNN, Seq2Seq, and Hugging Face models on text summarization datasets. Subsequently, the effectiveness of Beam Search and Greedy Search decoding algorithms will be evaluated for generating summaries. Comparative analysis will be conducted to determine the performance and efficiency of each model and decoding algorithm combination in producing accurate and concise text summaries.

A. Data Preprocessing

The data collected from CNN/Daily Mail has been preprocessed by the previous authors; this includes the removal of stop words, cleaning of unnecessary punctuation, and removal of non-concise statements.

B. Sentence Length Analysis

A critical analysis of sentence lengths revealed insightful statistics. The decision to cap the maximum sentence length at 50 stemmed from a nuanced understanding of the data distribution. The mean sentence length was 16.54, with a standard deviation of 10.57, guiding the choice of the maximum length. Figure 1 describes the graphical analysis of sentence lengths. Table I gives the information about the sentence length analysis.

Figure 1: IQR analysis of sentence lengths

Table I: Sentence length analysis

Value   label      length
count   1000       1000
mean    4.73600    16.54400
std     0.78161    10.567106
min     1.00000    3.000000
0.25    5.00000    9.000000
0.5     5.00000    14.000000
0.75    5.00000    20.000000
max     5.00000    64.000000
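Statistics of the kind shown in Table I can be reproduced with a short pandas routine along the following lines; this is an illustrative sketch, and the interpretation of the 'label' and 'length' columns (highlight sentence count and average words per highlight sentence) is an assumption rather than a detail stated in the paper:

# Sketch: descriptive statistics over a 1,000-example sample, in the spirit of Table I.
import pandas as pd
from datasets import load_dataset

sample = load_dataset("cnn_dailymail", "3.0.0", split="train[:1000]")

rows = []
for ex in sample:
    sentences = [s for s in ex["highlights"].split(".") if s.strip()]
    rows.append({
        "label": len(sentences),  # number of highlight sentences (assumed meaning)
        "length": len(ex["highlights"].split()) // max(len(sentences), 1),  # avg words per sentence
    })

stats = pd.DataFrame(rows).describe()
print(stats)   # count, mean, std, min, 25%, 50%, 75%, max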

C. Data Generation

The data generation process, facilitated by a custom data generator ('CustomDataGenerator'), efficiently created training, validation, and test data batches. Notably, the training data involved 85,000 points, with validation on 10,000 and testing on 5,000, showcasing a diverse yet manageable dataset. When using the Hugging Face models, the dataset was fetched on demand from the Hugging Face API using library calls, which reduced the need to download the entire dataset to perform analysis. Data generation for the dataset is given in Figure 2.

Figure 2: Data generation
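The 'CustomDataGenerator' implementation is not listed in the paper; a minimal hypothetical sketch of such a batching generator, built on keras.utils.Sequence, could look as follows:

# Hypothetical sketch of a batching generator in the spirit of 'CustomDataGenerator'.
import numpy as np
from tensorflow import keras

class CustomDataGenerator(keras.utils.Sequence):
    def __init__(self, articles, summaries, vectorize_fn, batch_size=32):
        self.articles = articles          # list of article strings
        self.summaries = summaries        # list of reference summaries
        self.vectorize_fn = vectorize_fn  # callable: list[str] -> array of token ids
        self.batch_size = batch_size

    def __len__(self):
        # number of batches per epoch
        return int(np.ceil(len(self.articles) / self.batch_size))

    def __getitem__(self, idx):
        lo = idx * self.batch_size
        hi = lo + self.batch_size
        x = self.vectorize_fn(self.articles[lo:hi])
        y = self.vectorize_fn(self.summaries[lo:hi])
        return x, y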

D. Modelling approaches

1) Recurrent Neural Network model: The language model's architecture, centered around a bidirectional LSTM with an embedding layer, was strategically designed [11]. The model comprised 146,575 parameters, weighing 572.56 KB, striking a balance between complexity and computational efficiency [12-13]. Training the model involved orchestrating callbacks, including ModelCheckpoint, EarlyStopping, and a LearningRateScheduler. The model underwent 2 epochs, with a learning rate of 0.001. Notably, the acknowledgment of overfitting and the need for further experimentation underscored a thoughtful and adaptive training approach.
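A hedged sketch of the training setup described above (bidirectional LSTM with an embedding layer, the three named callbacks, 2 epochs, learning rate 0.001) is given below; the layer sizes and schedule are illustrative assumptions, not values taken from the paper:

# Sketch: bidirectional-LSTM language model with the callbacks named above.
from tensorflow import keras
from tensorflow.keras import layers

vocab_size, embed_dim = 10000, 64   # assumed sizes, not the paper's exact values

model = keras.Sequential([
    layers.Embedding(vocab_size, embed_dim),
    layers.Bidirectional(layers.LSTM(64)),
    layers.Dense(vocab_size, activation="softmax"),
])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
              loss="sparse_categorical_crossentropy")

callbacks = [
    keras.callbacks.ModelCheckpoint("rnn_summarizer.keras", save_best_only=True),
    keras.callbacks.EarlyStopping(patience=1, restore_best_weights=True),
    keras.callbacks.LearningRateScheduler(lambda epoch, lr: lr * 0.5 if epoch > 0 else lr),
]

# train_gen / val_gen would be CustomDataGenerator instances (see Section IV-C).
# model.fit(train_gen, validation_data=val_gen, epochs=2, callbacks=callbacks)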
2) Beam and Greedy Search: Beam search takes into account a predetermined number of the most likely candidates, known as the beam width, and chooses the sequence with the highest joint probability, as opposed to the greedy search algorithm, which chooses the highest-probability token at each time step [15-16]. Until the end-of-sequence token is generated, the algorithm keeps producing candidates and updating the beam; the candidate with the highest joint probability is then chosen as the output sequence.

In a greedy search, the decoder chooses the token with the highest likelihood as the next token in the output sequence at each decoding step [14]. This operation is repeated until an end-of-sequence token is issued, indicating that the output sequence is finished.
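With a Hugging Face sequence-to-sequence model, the two decoding strategies differ only in the arguments passed to generate(); the following sketch reuses the mBART checkpoint discussed later purely for illustration, and the generation lengths and beam width are placeholders:

# Sketch: greedy decoding vs. beam search with transformers' generate().
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

name = "ml6team/mbart-large-cc25-cnn-dailymail-nl-finetune"  # any seq2seq checkpoint works
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

article_text = "Text of a news article to be summarized ..."
inputs = tokenizer(article_text, truncation=True, max_length=1024, return_tensors="pt")

# Greedy search: pick the highest-probability token at every step.
greedy_ids = model.generate(**inputs, max_new_tokens=128, num_beams=1, do_sample=False)

# Beam search: keep the `num_beams` most likely partial sequences and return
# the candidate with the highest joint probability.
beam_ids = model.generate(**inputs, max_new_tokens=128, num_beams=4, early_stopping=True)

print(tokenizer.decode(greedy_ids[0], skip_special_tokens=True))
print(tokenizer.decode(beam_ids[0], skip_special_tokens=True))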
3) Pegasus Model: The Pegasus model emerges as a groundbreaking innovation in natural language generation, renowned for its exceptional capabilities in producing coherent and contextually relevant long-form text summaries. Leveraging advanced pre-training objectives that prioritize document-level understanding, Pegasus excels in distilling crucial information from documents, articles, or web pages into succinct and comprehensible summaries. Its finely tuned attention mechanisms ensure the effective capture and synthesis of salient details, striking a balance between conciseness and informativeness. What sets Pegasus apart is its impressive transfer learning prowess, effortlessly adapting to diverse domains and tasks with minimal fine-tuning required. Furthermore, its scalability enables efficient processing of large datasets, empowering researchers and practitioners to extract insights from extensive textual data efficiently. The amalgamation of these features positions Pegasus at the forefront of natural language processing, promising transformative advancements in information synthesis and knowledge extraction. The transformers library is employed to configure the Pegasus model for conditional generation. Both the Pegasus tokenizer and model are loaded seamlessly, showcasing a reliance on pre-trained models for text summarization tasks [17]. This strategic use of state-of-the-art transformer models aligns with best practices in natural language processing [18]. The code proceeds to load and explore data from the CNN/DailyMail dataset, providing insights into the structure of the dataset. Comprising articles and corresponding highlights, the dataset forms the foundation for subsequent text summarization tasks [19]. Text summarization is achieved through a loop that iterates over a subset of articles. Summaries are generated using a function named 'text summarization', and the resulting summaries are stored in a dictionary for further analysis. This section exemplifies the practical application of the Pegasus model for real-world summarization tasks.
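A sketch of this Pegasus pipeline with the transformers library is shown below; the checkpoint name 'google/pegasus-cnn_dailymail', the generation settings, and the function name are assumptions, since the paper does not state them:

# Sketch: Pegasus for conditional generation over a subset of CNN/DailyMail articles.
from datasets import load_dataset
from transformers import PegasusForConditionalGeneration, PegasusTokenizer

checkpoint = "google/pegasus-cnn_dailymail"   # assumed checkpoint
tokenizer = PegasusTokenizer.from_pretrained(checkpoint)
model = PegasusForConditionalGeneration.from_pretrained(checkpoint)

def text_summarization(text):
    # Tokenize one article and generate a summary with beam search.
    batch = tokenizer(text, truncation=True, padding="longest", return_tensors="pt")
    ids = model.generate(**batch, num_beams=4, max_new_tokens=128)
    return tokenizer.decode(ids[0], skip_special_tokens=True)

articles = load_dataset("cnn_dailymail", "3.0.0", split="test[:5]")
summaries = {ex["id"]: text_summarization(ex["article"]) for ex in articles}
print(summaries)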
4) Seq2Seq: The Seq2Seq model undergoes training using a bidirectional LSTM architecture with an embedding layer, considering 85,000 data points. Tokenization is applied with a vocabulary size determined by words occurring at least 50 times [20]. The two-epoch training employs essential callbacks, including ModelCheckpoint and EarlyStopping. Model parameters total 146,575, and the learning rate is set at 0.001. A visual representation of the training history aids in understanding the model's learning dynamics.
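As an illustration of the tokenization step with a frequency cut-off of 50 occurrences, the following sketch uses the Keras Tokenizer; the choice of tokenizer and the toy input lists are assumptions rather than the paper's actual code:

# Sketch: build a vocabulary restricted to words occurring at least 50 times,
# then convert texts to integer sequences. The toy lists stand in for the
# 85,000 training pairs used in the paper.
from collections import Counter
from tensorflow.keras.preprocessing.text import Tokenizer

train_articles = ["first example article ...", "second example article ..."]
train_summaries = ["first summary ...", "second summary ..."]
texts = train_articles + train_summaries

counts = Counter(word for t in texts for word in t.lower().split())
keep = {w for w, c in counts.items() if c >= 50}   # frequency threshold of 50

# Keeping the len(keep) most frequent words matches the >= 50 cut-off,
# because Tokenizer ranks words by frequency.
tokenizer = Tokenizer(num_words=len(keep) + 1, oov_token="<unk>")
tokenizer.fit_on_texts(texts)

article_sequences = tokenizer.texts_to_sequences(train_articles)
summary_sequences = tokenizer.texts_to_sequences(train_summaries)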
5) Huggingface model: The Hugging Face model, based on the "ml6team/mbart-large-cc25-cnn-dailymail-nl-finetune" architecture, is trained on a large-scale dataset of diverse news articles. The training process involves fine-tuning the pre-existing model on the specific summarization task, considering a maximum sequence length of 1024 tokens [21]. The training corpus includes a mix of news articles, allowing the model to grasp the nuances of diverse writing styles and content structures. Key hyperparameters, such as learning rate, batch size, and training epochs, are optimized for effective convergence. The Hugging Face model, renowned for its innovation in natural language processing (NLP), boasts impressive technical specifications. With a foundation in transformer architecture, it leverages attention mechanisms to process input sequences efficiently. The model's multi-layered structure enables it to capture intricate linguistic patterns and nuances, facilitating tasks such as text generation, translation, and sentiment analysis with remarkable accuracy. Additionally, its parameter-efficient design allows for faster inference without compromising performance, making it a preferred choice for various NLP applications. The Hugging Face model's versatility, speed, and state-of-the-art capabilities continue to redefine the landscape of language understanding and generation in the realm of artificial intelligence.
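A hedged sketch of loading the "ml6team/mbart-large-cc25-cnn-dailymail-nl-finetune" checkpoint and fine-tuning it with a 1024-token input limit is given below; the Seq2SeqTrainer hyperparameters shown are placeholders, not the settings used in the paper:

# Sketch: tokenize with a 1024-token limit and fine-tune the mBART checkpoint.
from datasets import load_dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

checkpoint = "ml6team/mbart-large-cc25-cnn-dailymail-nl-finetune"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

raw = load_dataset("cnn_dailymail", "3.0.0")

def preprocess(batch):
    # Articles truncated to 1024 tokens, highlights used as generation targets.
    model_inputs = tokenizer(batch["article"], max_length=1024, truncation=True)
    labels = tokenizer(text_target=batch["highlights"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = raw.map(preprocess, batched=True, remove_columns=raw["train"].column_names)

args = Seq2SeqTrainingArguments(output_dir="mbart-cnndm", learning_rate=2e-5,
                                per_device_train_batch_size=2, num_train_epochs=1,
                                predict_with_generate=True)   # placeholder hyperparameters
trainer = Seq2SeqTrainer(model=model, args=args,
                         train_dataset=tokenized["train"],
                         eval_dataset=tokenized["validation"],
                         data_collator=DataCollatorForSeq2Seq(tokenizer, model=model))
# trainer.train()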

E. MODEL EVALUATION

1) RNN model and Pegasus model: Post-training evaluation showcased the RNN model's journey, with loss values of 2.5572 and 2.5004 for the two epochs. The visual representation of the model's loss over epochs provided a clear narrative of the learning process, guiding further refinement strategies. In conclusion, the human-centric approach to data exploration, preprocessing, and model development incorporated essential numerical considerations. The dataset's initial size, vocabulary dimensions, and key statistics on sentence lengths and model parameters provided a quantitative foundation for effective language modeling. Refer to Figure 3 and Figure 4.

Rouge scores are calculated to quantitatively check the quality of the summaries that are generated. The metrics include precision, recall, and F-score for Rouge-1, Rouge-2, and Rouge-l. The results are presented in tabulated form, offering a comprehensive overview of the summarization performance. Specifically, the calculated Rouge-1 precision, recall, and F-score are approximately 0.335041, 0.339785, and 0.335659, respectively. For Rouge-2, the corresponding values are approximately 0.156672, 0.174546, and 0.16399. Lastly, Rouge-l scores are approximately 0.319168, 0.323091, and 0.319436. The Rouge scores are further visualized using a bar chart, providing a clear comparative analysis of precision, recall, and F-score across Rouge-1, Rouge-2, and Rouge-l metrics.
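The precision, recall, and F-score triples reported in this section can be computed with the rouge_score package, as in the following minimal sketch (the choice of package is an assumption, since the paper does not name the exact toolkit):

# Sketch: ROUGE-1 / ROUGE-2 / ROUGE-L precision, recall and F-score for one pair.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

reference = "the cat sat on the mat and watched the birds outside"
generated = "a cat sat on the mat watching birds"

scores = scorer.score(reference, generated)
for name, s in scores.items():
    print(f"{name}: precision={s.precision:.4f} recall={s.recall:.4f} f1={s.fmeasure:.4f}")

# Averaging these per-example triples over the evaluation set gives tables like Table II.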
Figure 3: Comparative plot of precision, recall, and F-score metrics for the RNN model and Pegasus model

Figure 4: Loss function graph for the RNN model and Pegasus model

This graphical representation enhances the interpretability of the summarization quality and highlights potential areas for improvement. In conclusion, the code integrates foundational library usage, dependency management, model configuration, and data processing. It culminates in a robust evaluation of text summarization quality, supported by precise numerical metrics and visualizations for comprehensive analysis.
2) Beam and Greedy Search: Every sequence in the beam, which is a collection of partially decoded sequences, is represented by a node in the search tree over which the algorithm operates. By extending the beam nodes and calculating their conditional probabilities, the decoder produces a set of potential candidates at each time step. Only the candidates with the highest conditional probability are kept in the beam, and the number of candidates to consider at each time step is limited by the beam width.

Figure 5: Output of beam and greedy search
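The node-expansion procedure described above can be written compactly in a model-agnostic way; in the following illustrative sketch, next_token_log_probs stands in for one decoding step of any of the models discussed, and the toy step function exists only to make the example runnable:

# Sketch: generic beam search over a step function that returns (token, log-prob)
# pairs for the next token given a prefix.
import math

def beam_search(next_token_log_probs, bos, eos, beam_width=4, max_len=30):
    beams = [([bos], 0.0)]                      # (token sequence, joint log-probability)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq[-1] == eos:                  # finished sequences are kept as-is
                candidates.append((seq, score))
                continue
            for tok, logp in next_token_log_probs(seq):   # expand this beam node
                candidates.append((seq + [tok], score + logp))
        # keep only the `beam_width` highest-scoring partial sequences
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(seq[-1] == eos for seq, _ in beams):
            break
    return max(beams, key=lambda c: c[1])       # highest joint probability wins

# Toy step function: prefers token 7, then ends the sequence (token 1 = eos).
def toy_step(seq):
    if len(seq) >= 4:
        return [(1, math.log(0.9)), (7, math.log(0.1))]
    return [(7, math.log(0.6)), (8, math.log(0.3)), (1, math.log(0.1))]

print(beam_search(toy_step, bos=0, eos=1))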
3) Seq2Seq: Post-training, the Seq2Seq model is rigorously evaluated using Rouge metrics, resulting in precision, recall, and F-score values. Rouge-1 precision, recall, and F-score are approximately 0.2049, 0.3461, and 0.2494. For Rouge-2, the values are about 0.0770, 0.1600, and 0.1035, and Rouge-l scores are approximately 0.1912, 0.3179, and 0.2312. A bar chart visually represents these metrics, enhancing accessibility for a comprehensive analysis. The Seq2Seq model, driven by a bidirectional LSTM architecture, showcases robust training and evaluation. Careful consideration of hyperparameters, integration of essential callbacks, and use of Rouge metrics collectively contribute to an effective text summarization model. The model's comprehensive evaluation, both numerically and visually, attests to its ability to generate coherent and contextually relevant text summaries.

Figure 6: Training and validation loss for the Seq2Seq model

Figure 7: Comparative plot of precision, recall, and F-score metrics for the Seq2Seq model

4) Huggingface model: Post-training, the model undergoes rigorous evaluation using established metrics such as Rouge-1, Rouge-2, and Rouge-l. The evaluation results in precision, recall, and F-score values. For instance, Rouge-1 precision is approximately 0.6579, Rouge-2 precision is around 0.4324, and Rouge-l precision is about 0.6316. Visualization of the evaluation metrics is presented through informative bar charts, aiding in a comprehensive understanding of the model's performance. The Hugging Face model, after meticulous training and evaluation, demonstrates proficiency in abstractive summarization tasks. The fine-tuned architecture, enriched with knowledge from the vast dataset, achieves commendable Rouge scores, reflecting its ability to generate concise and contextually relevant summaries. The model's success in handling diverse news articles attests to its adaptability and effectiveness in real-world summarization scenarios.
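Bar charts of this kind can be produced along the following lines; the values are the precision figures quoted in the paragraph above, and the plotting choices are illustrative:

# Sketch: bar chart of the Hugging Face model's ROUGE precision values.
import matplotlib.pyplot as plt

metrics = ["Rouge-1", "Rouge-2", "Rouge-l"]
precision = [0.6579, 0.4324, 0.6316]   # values quoted above

plt.bar(metrics, precision)
plt.ylabel("precision")
plt.ylim(0, 1)
plt.title("Hugging Face model: ROUGE precision")
plt.show()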

Figure 8: Comparative plot of precision, recall, and F-score metrics for the Hugging Face model

Table II: Rouge scores for all the models

            Metric       Seq2Seq   RNN/Pegasus   Huggingface
Precision   Rouge-1      0.2049    0.3350        0.3931
            Rouge-2      0.0770    0.1566        0.1621
            Rouge-l      0.1912    0.3191        0.2771
            Rouge-Lsum   -         -             0.3299
Recall      Rouge-1      0.3461    0.3397        0.4019
            Rouge-2      0.1600    0.1745        0.1757
            Rouge-l      0.3179    0.3230        0.2928
            Rouge-Lsum   -         -             0.3431
F-score     Rouge-1      0.2494    0.3356        0.3961
            Rouge-2      0.1035    0.1639        0.1720
            Rouge-l      0.2312    0.3194        0.2816
            Rouge-Lsum   -         -             0.3392

We can observe that, compared to the Seq2Seq model, the RNN model showed a good score in terms of Rouge-l and Rouge-Lsum. One disadvantage of Seq2Seq models is their tendency to struggle with handling long input sequences due to vanishing gradients, potentially leading to information loss. On the other hand, an advantage of recurrent neural networks (RNNs) lies in their inherent ability to capture sequential dependencies effectively, making them well-suited for tasks requiring temporal modeling such as time series prediction. An inherent limitation of RNNs, however, is their susceptibility to the vanishing gradient problem, hindering long-term dependency modeling in sequential data. The pre-trained Hugging Face model has a very good score compared to the other models. A further advantage of the Hugging Face model lies in its versatility and ease of use, offering a wide range of pre-trained language models and streamlined interfaces for rapid deployment in natural language processing tasks.

V. CONCLUSION

After evaluating various summarization models, it can be concluded that the Hugging Face model, specifically "ml6team/mbart-large-cc25-cnn-dailymail-nl-finetune", stands out as the most accurate; this observation is based on the results presented in Table II. The evaluation was based on metrics such as ROUGE scores, where this model consistently demonstrated superior performance, achieving the highest precision, recall, and F-score among the compared models. The accuracy of the Hugging Face model can be attributed to its effective fine-tuning on the CNN/Daily Mail dataset, ensuring a better understanding and generation of concise summaries. To further enhance summarization models, future improvements can be made in terms of handling longer sequences, as indicated by challenges faced during model evaluation with certain articles. Additionally, incorporating more diverse datasets and exploring advanced pre-training strategies may contribute to creating even more robust and effective summarization models; extractive and abstractive summarization along with transformer models like T5 and GPT can also be explored. In comparison to other models, the Hugging Face model showcased a notable edge in producing high-quality and coherent summaries, making it a preferred choice for applications demanding accurate and informative content condensation.

REFERENCES

[1] Suleiman, Dima, and Arafat Awajan. "Deep learning based abstractive text summarization: approaches, datasets, evaluation measures, and challenges." Mathematical Problems in Engineering 2020 (2020): 1-29.
[2] Xu, Weiran, Chenliang Li, Minghao Lee, and Chi Zhang. "Multi-task learning for abstractive text summarization with key information guide network." EURASIP Journal on Advances in Signal Processing 2020, no. 1 (2020): 1-11.
[3] Fabbri, Alexander R., Wojciech Kryściński, Bryan McCann, Caiming Xiong, Richard Socher, and Dragomir Radev. "SummEval: Re-evaluating summarization evaluation." Transactions of the Association for Computational Linguistics 9 (2021): 391-409.
[4] Kahla, Mram, Zijian Győző Yang, and Attila Novák. "Cross-lingual fine-tuning for abstractive Arabic text summarization." In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pp. 655-663. 2021.
[5] Tomer, Minakshi, Manoj Kumar, Adeel Hashmi, Bharti Sharma, and Uma Tomer. "Enhancing metaheuristic based extractive text summarization with fuzzy logic." Neural Computing and Applications 35, no. 13 (2023): 9711-9723.
[6] Mridha, Muhammad F., Aklima Akter Lima, Kamruddin Nur, Sujoy Chandra Das, Mahmud Hasan, and Muhammad Mohsin Kabir. "A survey of automatic text summarization: Progress, process and challenges." IEEE Access 9 (2021): 156043-156070.
[7] Widyassari, Adhika Pramita, Supriadi Rustad, Guruh Fajar Shidik, Edi Noersasongko, Abdul Syukur, and Affandy Affandy. "Review of automatic text summarization techniques and methods." Journal of King Saud University-Computer and Information Sciences 34, no. 4 (2022): 1029-1046.
[8] Ye, Xia, Zengying Yue, and Ruiheng Liu. "MBA: A multimodal bilinear attention model with residual connection for abstractive multimodal summarization." In Journal of Physics: Conference Series, vol. 1856, no. 1, p. 012070. IOP Publishing, 2021.
[9] El-Kassas, Wafaa S., Cherif R. Salama, Ahmed A. Rafea, and Hoda K. Mohamed. "Automatic text summarization: A comprehensive survey." Expert Systems with Applications 165 (2021): 113679.
[10] Tsai, Chih-Fong, Kuanchin Chen, Ya-Han Hu, and Wei-Kai Chen. "Improving text summarization of online hotel reviews with review helpfulness and sentiment." Tourism Management 80 (2020): 104122.
[11] Zhang, Jian, Xu Wang, Hongyu Zhang, Hailong Sun, and Xudong Liu. "Retrieval-based neural source code summarization." In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, pp. 1385-1397. 2020.
[12] Lewis, Mike, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov, and Luke Zettlemoyer. "BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension." arXiv preprint arXiv:1910.13461 (2019).

[13] Kryściński, Wojciech, Bryan McCann, Caiming Xiong, and Richard
Socher. ”Evaluating the factual consistency of abstractive text summa-
rization.” arXiv preprint arXiv:1910.12840 (2019).
[14] Xu, Jiacheng, Zhe Gan, Yu Cheng, and Jingjing Liu. ”Discourse-aware
neural extractive text summarization.” arXiv preprint arXiv:1910.14142
(2019).
[15] Mohamed, Muhidin, and Mourad Oussalah. ”SRL-ESA-TextSum: A text
summarization approach based on semantic role labeling and explicit
semantic analysis.” Information Processing and Management 56, no. 4
(2019): 1356-1372.
[16] A. M and S. M. Rajgopal, ”Exploring Unique Techniques to Preserve
Confidentiality and Authentication,” 2024 2nd International Confer-
ence on Intelligent Data Communication Technologies and Internet of
Things (IDCIoT), Bengaluru, India, 2024, pp. 440-447, doi: 10.1109/ID-
CIoT59759.2024.10467248.
[17] Nair, A.R., Singh, R.P., Gupta, D. and Kumar, P., 2024. Evaluating the
Impact of Text Data Augmentation on Text Classification Tasks using
DistilBERT. Procedia Computer Science, 235, pp.102-111.
[18] Paul, Pretty, and Rimjhim Padam Singh. ”Sentiment Rating Predic-
tion using Neural Collaborative Filtering.” In 2022 IEEE 7th Interna-
tional Conference on Recent Advances and Innovations in Engineering
(ICRAIE), vol. 7, pp. 148-153. IEEE, 2022.
[19] Kavitha C. R., Rajarajan S. J., R. Jothilakshmi, Kamal Alaskar, Mohammad Ishrat, and V. Chithra. "Study of Natural Language Processing for Sentiment Analysis." 2023 3rd International Conference on Pervasive Computing
and Social Networking (ICPCSN), 19-20 June 2023.
[20] Vidya Kumari K. R., Kavitha C. R., Data Mining for the Social
Awareness of the Social Networks, 3rd International Conference on
Computational System and Information Technology Sustainable Solu-
tions (CSITSS 2018), December 2018, pp: 7-17.
[21] Mridula A., Kavitha C. R., Opinion Mining and Sentiment Study of
Tweets Polarity Using Machine Learning, Proceedings of 2nd Inter-
national Conference on Inventive Communication and Computational
Technologies (ICICCT 2018), April 2018, pp: 621-626.
[22] Venkataramani, Eknath and Gupta, Deepa. (2010). English-Hindi
Automatic Word Alignment with Scarce Resources. 253-256.
10.1109/IALP.2010.5

