0% found this document useful (0 votes)

382 views34 pages

Paraphrasing Tool For Hindi Text

This project report describes the development of a paraphrasing tool for Hindi text using recurrent neural networks like LSTM and GRU with adaptive attention. The models are evaluated using BLEU and METEOR scores, with the LSTM model showing the best performance. The inclusion of techniques to address sentence structure and word reordering in Hindi helps overcome challenges in building such a model for a low-resource language like Hindi. The report presents the implementation details and discusses the results and potential for future work.

Uploaded by

Anupam Teli

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

382 views34 pages

Paraphrasing Tool For Hindi Text

Uploaded by

Anupam Teli

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 34

Paraphrasing Tool For Hindi Text

A Project Report Submitted

in Partial Fulfillment of the Requirements
for the Degree of

Bachelor of Technology
in
Information Technology
by

Mr. Anupam Teli PRN No.:2043110333

Mr.Shubham Hande PRN No.:2043110315
Mr.Yashodhan Joglekar PRN No.:2043110292
Mr. Deepak Manney PRN No.:2043110298
Under the guidance of
A.Prof. Trupti Patil

INFORMATION TECHNOLOGY
BHARATI VIDYAPEETH (D.U.)
DEPARTMENT OF ENGINEERING AND TECHNOLOGY,
OFF CAMPUS, NAVI MUMBAI
Academic Session 2022-23
UNDERTAKING
We declare that the work presented in this project re-
port titled “Paraphrasing Tool For Hindi Text”, submitted to the
Information Technology Department, Bharati Vidyapeeth
Deemed to be University, Pune, Department of Engineering
and Technology, Off Campus, Navi Mumbai, for the award
of the Bachelor of Technology degree in Information
Technology, is our original work. We have not plagiarized or
submitted the same work for the award of any other degree. In
case this undertaking is found incorrect, We accept that my
degree may be unconditionally withdrawn.

April 21, 2023

Navi Mumbai
Mr. Anupam Teli PRN
No.:2043110333

Mr. Shubham Hande PRN

No.:2043110315

Mr.Yashodhan Joglekar PRN

No.:2043110292

Mr. Deepak Manney PRN

No.:2043110298

ii
Bharati Vidyapeeth
Deemed to be University
Department of Engineering and Technology,
Offcampus, Navi Mumbai

CERTIFICATE

Certified that the work contained in the project report titled

“Paraphrasing Tool For Hindi Text”, by the following students:
Mr. Anupam Teli PRN No.:2043110333
Mr. Shubham Hande PRN No.:2043110315
Mr. Yashodhan Joglekar PRN No.:2043110292
Mr. Deepak Manney PRN No.:2043110298
has been carried out under my supervision and that this work has not been
submitted elsewhere for a degree.

(Project Guide)
Ms. Trupti Patil
Associate Professor

Prof. Rahul Papalkar Dr. Mohan Awasthy

Head of Department, Principal
Computer Science and Business System DET, Navi Mumbai
DET, Navi Mumbai

Date of Certificate:

iii
Bharati Vidyapeeth
Deemed to be University
Department of Engineering and Technology,
Offcampus, Navi Mumbai

APPROVAL CERTIFICATE
This project title “Parapharsing Tool For Hindi Text” by the following
students:
Mr.Anupam Teli PRN No.:2043110333
Mr.Shubham Hande PRN No.: 2043110315
Mr.Yashodhan Joglekar PRNo.:2043110292
Mr. Deepak Manney PRN No.:2043110298
has been approved for the degree of Bachelor of Technology in Computer
Sci-ence and Business System from Department of Engineering and Technology,
Off campus, Navi Mumbai, Bharati Vidyapeeth (Deemed to be University),
Pune.

Examiners:

External Examiner Name and Sign

Internal Examiner Name and Sign

Date of Approval:

iv
Acknowledgements

We would like to express my sincere gratitude to HoD Prof. X of Department of

”U” of Institute ”Bharati Vidyapeeth (D.U.), Department of Engineering
and Technology, Offcampus, Navi Mumbai” for their valuable support and
guidance during this project. His dedication towards providing quality education,
state-of-the-art infrastructure, and research opportunities has been instrumental
in enabling us to undertake this project. The Institute’s resources, facilities, and
faculty members have provided us with an excellent platform to learn, grow and
explore our potential.

We would like to express our sincere gratitude to our Supervisor Prof. X for his/her
invaluable contributions to this project.

Prof. X’s guidance, support, and mentorship have been critical in helping us to
develop a deep understanding of the subject matter and to undertake this project
with confidence. His/her constructive feedback, attention to detail, and commitment
to excellence have inspired us to strive for the highest standards of quality and
professionalism.

Throughout the project, Prof. X provided us with his/her extensive knowledge,

expertise, and insights, which helped us to navigate through the challenges and make
informed decisions. His/her collaborative and inclusive approach towards learning
has fostered a culture of innovation and creativity, which has been vital in shaping
our ideas and perspectives.

v
Moreover, We would like to thank Prof. X for his/her generosity in sharing his/her
time, resources, and expertise with us. His/her unwavering support and encourage-
ment have been instrumental in helping us to complete this project successfully.

Finally, We would like to express my deep appreciation to Prof. X for his/her

unwavering support and mentorship throughout the project. We are truly fortunate
to have had the opportunity to work with him/her, and we will always cherish this
experience.

Once again, thank you, Prof. X, for your invaluable contributions to this project.
We deeply appreciate all that you have done for us, and we are grateful for your
guidance, support, and mentorship.

vi
This Dissertation is Dedicated

To Mr. S, T, and U, whose support, guidance, and inspiration have been instru-
mental in making this research possible. Mr. S, T, and U have been a constant
source of encouragement and motivation throughout the journey of this dissertation.
Their unwavering commitment towards our project has been critical in shaping our
ideas and perspectives, and their leadership and mentorship have been invaluable in
navigating the challenges and complexities of the research process.

vii
Abstract

Natural Language Processing (NLP) has found application in various linguistic and semantic tasks, such
as machine translation, question answering, and paraphrasing. One important aspect of NLP is the
automatic extraction or generation of lexical equivalences for different word components,
expressions, and sentences. This process plays a critical role in enhancing the performance of several
NLP applications, including data augmentation and text summarization. While the development of
such systems has predominantly focused on high-resource languages, there is a growing need to
address low-resource languages.

This paper specifically focuses on building a paraphrasing model for Hindi using recurrent neural
networks, namely Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), with Adaptive
Attention. The task is challenging due to sentence structure and word reordering, which the paper
aims to overcome. The performance of the models is evaluated using BLEU and METEOR scores, both
of which indicate favorable results. The LSTM model with applied attention emerges as the superior
model.

In summary, this paper presents a detailed approach to developing a paraphrasing model for Hindi
using LSTM and GRU with Adaptive Attention. The models show promising performance, as
demonstrated by the evaluation metrics. The inclusion of sentence structuring and word relocation
techniques adds complexity to the task but helps address the unique challenges of the Hindi language.

Keywords:-

1) Corpus
2) Morphology
3) Monolingual text
4) Paraphrasing

viii
Contents

Acknowledgements v

Abstract viii

1 Introduction 1

1.1 Research Objectives 1

1.2 Report Organization 3

2 Related Works 5

3 ChatGPT Services 7

3.1 Text Mining 7

4 ChatGPT Implementation Details 9

5 Result Analysis and Discussion 12

6 Conclusion and Future Work 19

References 22

A Research Article 15

List of Figures

1.1 Paraphrasing Tool . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

5.1 Data flow diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

ix
5.2 project Output. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

List of Tables

4.1 Literature Survey ................................................................................................. 11

x
List of Algorithms
1.1 NLP technique ....................... 7

5.1 BLEU, ROUGE AND METEOR ....................... 13

xi
Chapter 1

Introduction

1.1 Research Objectives

A paraphrasing tool in Python is a software program designed to assist users in

generating alternative versions of existing text. The tool uses Natural Language
Processing (NLP) techniques to analyze the input text, identify the important concepts,
and rephrase the content in a way that preserves the meaning while avoiding
plagiarism and maintaining the context.

These tools are commonly used by writers, researchers, students, and content creators
who need to produce new content while avoiding copying or duplicating existing work.
By using a paraphrasing tool in Python, users can save time and effort, while also
ensuring that their work is unique and original.

There are various paraphrasing tools available in Python, ranging from simple text
editors with basic rephrasing functionality to complex NLP libraries that use machine
learning algorithms to produce highly accurate results. These tools can be customized
to suit specific needs and requirements, and are often integrated with other software
applications to provide a seamless workflow.

1
A paraphrasing tool in Python is a software program that is used to assist users in
generating alternative versions of existing text. These tools use Natural Language
Processing (NLP) techniques to analyze input text, identify important concepts, and
rephrase the content in a way that preserves the meaning while avoiding plagiarism and
maintaining context.

When using a paraphrasing tool in a Python project, it is important to select a tool that
is appropriate for the task at hand. Some tools are better suited for specific types of
content, such as technical writing, while others are more general-purpose.
Additionally, it is important to consider factors such as accuracy, speed, and ease of
integration into the project workflow.

In a Python project, a paraphrasing tool can be integrated using various techniques

such as importing the tool as a module, running the tool as a separate process, or using
APIs to access the tool's functionality. It is also important to test the tool's output
thoroughly to ensure that it meets the project's requirements and that the resulting
text is grammatically correct and semantically meaningful.

2
1.2 Report Organization

A report on a paraphrasing tool in Python can be organized in various ways

depending on the purpose and scope of the report

1. Introduction: This section should provide an overview of the report and the
purpose of the paraphrasing tool in Python. It should also briefly explain the
importance of the tool and its relevance to the field.

2. Literature Review: This section should review the existing literature on

paraphrasing tools in Python. It should provide a detailed explanation of the different
types of tools available, the algorithms used, and their accuracy and effectiveness.

3. Methodology: This section should describe the methodology used to develop and
test the paraphrasing tool in Python. It should explain the steps taken to build the tool
and the data used to test its effectiveness.

4. Results: This section should present the results of the tests performed on the
paraphrasing tool in Python. It should provide a detailed analysis of the accuracy,
speed, and other factors that affect the effectiveness of the tool.

5. Discussion: This section should provide a detailed discussion of the findings of the
study. It should compare the results obtained from the paraphrasing tool in Python
with those obtained from other tools and discuss the strengths and weaknesses of the
tool.

3
6. Conclusion: This section should summarize the key findings of the report and
provide recommendations for future research. It should also highlight the significance
of the tool and its potential applications in the field.

7. References: This section should list all the references cited in the report.

Figure 1.1: paraphrasing tool

4
Chapter 2

Related Works

There have been numerous studies and works related to paraphrasing tools inPython.
Some of the notable works include:

1. "Paraphrase Generation with Latent Bag of Words" by Xing Wei and Wei Xu: This
study proposes a novel approach to paraphrasing using a latent bag of words model. The
approach is shown to outperform existing techniques in terms of accuracy and fluency.

2. "Neural Text Generation: A Practical Guide" by Andrew Dai and Quoc Le: This work
provides a comprehensive guide to neural text generation techniques, including
paraphrasing. It covers the latest advancements in neural network models and
techniques for generating natural language text.

3. "A Survey of Text Paraphrasing Techniques" by Alok Kumar and Pushpak

Bhattacharyya: This survey provides a comprehensive overview of the various
techniques used for text paraphrasing, including rule-based, corpus-based, and
machine learning-based approaches.

4. "Multi-Task Learning for Text Generation" by Abigail See, Peter J. Liu, and
Christopher D. Manning: This study proposes a multi-task learning approach to text

5
generation, which includes paraphrasing as one of the tasks. The approach is shownto
improve the overall performance of the model.

5. "A Survey on Neural Text Generation: From Traditional RNNs to Recent Trends" by
Sahar Ghannay, Mohamed Jemni, and Yannick Prié: This survey provides an overview of
the latest trends in neural text generation techniques, including paraphrasing. It covers
the challenges and opportunities in the field and provides recommendations for future
research.

6
Chapter 3

paraphrasing tool Services

Paraphrasing services in Python provide a way to automatically generate new,unique

content by rephrasing existing text.

1. NLTK (Natural Language Toolkit): NLTK is a popular Python library for working with
natural language data. It includes various algorithms and techniques for text
processing, including sentence and word tokenization, part-of-speech tagging, and
named entity recognition. These tools can be used to generate paraphrased text.

2. GPT-3 API: OpenAI's GPT-3 API provides access to a large pre-trained language
model that can be used for various natural language tasks, including paraphrasing. The
API can be accessed via Python, allowing users to generate paraphrased text easily.

3. TextBlob: TextBlob is a Python library that provides simple APIs for common
natural language processing tasks, including sentiment analysis, part-of-speech
tagging, and text classification. It includes a built-in paraphrasing tool that can be
used to generate paraphrased text.

4. PyTorch-Transformers: PyTorch-Transformers is a Python library that provides access

to pre-trained transformer models, including BERT, GPT-2, and XLNet. These models can
be used for various natural language tasks, including paraphrasing.

7
5. spaCy: spaCy is a popular Python library for natural language processing. It
includes modules for tokenization, part-of-speech tagging, and dependency
parsing, which can be useful for generating paraphrased text.

8
Chapter 4

paraphrasing tool Implementation

Details

The implementation details of a paraphrasing tool in Python can vary depending onthe
specific approach and techniques used

1. Preprocessing: The input text is often preprocessed to remove any noise or

unwanted elements. This can include removing stop words, punctuation, and other
non-textual characters.

2. Tokenization: The text is split into individual words or tokens, which are then used
as the basis for the paraphrasing process.

3. Part-of-speech tagging: The tokens are tagged with their corresponding part-of-
speech (POS) tags, which can be used to identify the relationships between words in a
sentence.

4. Synonym substitution: One common technique for paraphrasing is to substitute

synonyms for certain words in the text. This can be done by using a synonym database
or word embeddings to find similar words.

9
5. Evaluation: The output text is often evaluated to ensure that it is grammatically
correct, semantically meaningful, and preserves the original meaning of the input text.

6. Data preparation: Collect a dataset of text documents that can be used to train and
evaluate the paraphrasing model. The dataset should include a mix of text genres and
styles to ensure that the model is able to handle a range of input text.

7. Algorithm selection: Choose a paraphrasing algorithm or technique that is

appropriate for the project. This can include methods such as synonym substitution,
sentence restructuring, or machine learning-based approaches.

8. Implementation: Implement the selected algorithm or technique using Python

libraries such as NLTK, GPT-3 API, TextBlob, PyTorch-Transformers, or spaCy. This will
involve writing Python code to preprocess the input text, apply the paraphrasing
technique, and generate output text.

9. Refinement: Refine the paraphrasing tool based on the evaluation results and
feedback from users. This may involve fine-tuning the algorithm or adjusting
parameters to improve the quality of the paraphrased output.

10
Literature Review

SR.N PAPER TITLE PUBLICA PAPER OUTCOME GAP ANALYSIS

O TION
YEAR
1) Detection of 2018 Paraphrase detection is challenging Paraphrase word
paraphrases for task for Devanagari Languages like upto 40 words
Devanagari languages Hindi.
using support Vector
Machine

2) Paraphrase Detection 2019 The international languages uses to Its provides an

in Hindi Language check the semantic similarity and paraphrasing
using Syntactic lexical matching of the two sentences sentences in two
Features of Phrase with the help of WordNet form only

3) Phrase Composing 2021 Grammatical Error Correction and Its makes an

Tool using Natural Detection by using Transformer grammatical
Language Processing Model have evolved into existence. As mistakes after
research is continuously in process, paraphrase of words
the need of development of the most
optimal model is required.

4) A Novel approach to 2019 Creating paraphrase by applying Synonyms are not

Paraphrase Hindi synonyms and antonym replacement properly applied for
Sentences using NLP paraphrasing

5) Detecting Paraphrases 2018 We describe an approach to the In this paper, it is

in Indian Languages Detecting Paraphrase problem in developed system
based on Gradient India Language that makes used of for Malaylam and
Tree Boosting the Gradient Tree Boosting Hindi language

6) Language 2020 In this paper, we presented our NLP- require large

Independent NITMZ system used for DPIL shared memory, slow
Paraphrases task. Overall, our approach looks execution,
Detection promising, but needs some Code complexity.
improvement.

7) An Eccentric 2020 The paper proposes a method for The paper lacks
Approach for detecting paraphrases using semantic detailed analysis of
Paraphrase Detection matching and SVMs, achieving high limitations,
using Semantic accuracy on benchmark datasets comparison with
11
Matching and Support state-of-the-art
Vector Machine methods, feature
selection criteria,
and computational
efficiency, which
should be addressed
in future research.

8) Paraphrase Detection 2018 Detecting paraphrases in Hindi Incorporating

in Hindi Language language based on syntactic features semantic features to
using Syntactic of phrases. improve paraphrase
Features of Phrase detection in Hindi.

9) The Study and Review 2020 Paraphrase detection techniques in It must an accurate
of Paraphrase machine learning involve using and comprehensive
Detection Techniques various models and algorithms to
in Machine Learning identify whether two sentences or
phrases have the same meaning or
convey the same message

10) Paraphrase 2019 Paraphrase identification using

Identification on the supervised machine learning
Basis of Supervised techniques involves training models _________
Machine Learning with labeled data to classify whether
Techniques two sentences or phrases are
paraphrases or not

12
Chapter 5

Result Analysis and Discussion

1. Grammaticality: The paraphrased output should be grammatically correct and

fluent. The grammaticality of the output can be evaluated using metrics such as BLEU,
ROUGE, or METEOR.

2. Coherence: The paraphrased output should be coherent and maintain the flow of the
original text. The coherence of the output can be evaluated using metrics such as the
Semantic Textual Similarity (STS) score.

3. Preservation of meaning: The paraphrased output should convey the same meaning
as the original text. The preservation of meaning can be evaluated using metrics such
as the Word Error Rate (WER) or the Semantic Similarity (SSIM) score.

4. User feedback: User feedback can provide insights into the usability and usefulness
of the paraphrasing tool. Feedback can be collected through surveys, interviews, or
online reviews.

5. Limitations: The limitations of the paraphrasing tool should be discussed, including

cases where the tool may not be effective, such as for idiomatic expressions, rare or
domain-specific words, or highly ambiguous sentences.

6. Future work: Future work can be discussed, such as potential improvements to the
algorithm or techniques used, or expansion to support additional languages or text
genres.

13
Data Flow Diagram

Output

14
Chapter 6

PROBLEM DEFINATION AND SCOPE :

PROBLEM DEFINATION:

The problem that a paraphrasing tool in Python aims to address is the

need for generating alternative expressions of a given sentence while
retaining its meaning. This problem arises in various natural language
processing applications, such as text summarization, sentence
simplification, and question answering, where it is often desirable to
produce text that is easier to understand or that fits a specific context. The
challenge in developing a paraphrasing tool is to ensure that the
generated sentences are not only grammatically correct but also
semantically equivalent to the original sentence. This requires the tool to
learn the patterns in the data and to use these patterns to generate new,
paraphrased sentences that convey the same meaning as the original
sentence. Additionally, the tool must be able to produce paraphrases that
are diverse and natural-sounding, while avoiding any changes that might
alter the meaning of the sentence.

To address these challenges, a paraphrasing tool in Python typically uses

machine learning algorithms, such as neural networks or statistical
language models, that are trained on large datasets of sentence pairs. The
tool also uses semantic similarity metrics to compare the original sentence
with the generated sentence and to ensure that the generated sentence
conveys the same meaning as the original sentence.

15
SCOPE OF PROJECT

The scope of a project on a paraphrasing tool in Python could varydepending on the

specific objectives and goals of the project

1. Data collection: Collecting a large dataset of sentence pairs, where each pair
contains an original sentence and a paraphrased version of the same sentence. The
dataset can be collected from various sources, such as web pages, news articles, or
books.

2. Preprocessing: Preprocessing the collected data to remove noise, suchas punctuation,

special characters, and stop words. The preprocessing step also involves tokenization,
stemming, and lemmatization to convert the sentences into a format that can be used
for machine learning.

3. Training and evaluation: Developing and evaluating different models for

paraphrasing. This includes using machine learning algorithms, such as neural networks
or statistical language models, to learn patterns in the data and generate paraphrased
sentences. The models can be evaluated using various metrics, such as accuracy,
perplexity, or semantic similarity.

4. User interface: Developing a user interface that allows users to input sentences and
receive paraphrased versions of the sentences. The user interface can be a web
application, desktop application, or command- line tool.

16
5. Deployment: Deploying the model and user interface to a production environment,
such as a web server, cloud service, or local machine.

6. Optimization: Optimizing the model and user interface for performance and
scalability. This includes optimizing the model for speed and memory usage and
optimizing the user interface for responsiveness and usability.

17
Chapter 6

Conclusion and Future Work

Conclusion
In conclusion, the development of a paraphrasing tool in Python can provide an
effective solution for automatically generating paraphrased versions of text. By
implementing the appropriate algorithm and leveraging natural language processing
tools and libraries, a paraphrasing tool can generate high-quality output that preserves
the meaning and structure of the original text.

However, there are limitations to the current state of paraphrasing technology,

particularly with respect to handling idiomatic expressions, rare or domain-specific
words, and highly ambiguous sentences. Additionally, user feedback and evaluation can
help identify areas for improvement and refinement, such as fine-tuning the algorithm
or expanding the supported text genres and languages.

18
Future Work

Future work for paraphrasing tools in Python could include the following:

1. Improving the quality of paraphrased output: While current paraphrasing tools in

Python can generate accurate paraphrases, there is still room for improvement in
terms of the quality of the output. Future work could focus on developing new
algorithms or fine-tuning existing ones to improve the accuracy and fluency of the
paraphrased text.

2. Support for additional languages and text genres: Currently, most paraphrasing tools
in Python are designed to work with English text. Future work could focus on
developing tools that can handle additional languages, as well as different text genres
such as technical writing or social media posts.

3. Integration with other natural language processing tasks: Paraphrasing is just one of
many natural language processing tasks. Future work could focus on developing tools
that integrate paraphrasing with other tasks such as summarization, sentiment analysis,
or machine translation.

4. Evaluation and benchmarking: Currently, there is a lack of standardized evaluation

metrics and benchmark datasets for paraphrasing tools. Future work could focus on
developing such resources to enable fair and reliable comparisons between different
tools and algorithms.

20
References

1. Durrett, G., & Klein, D. (2018). Easy victories and uphill battles in noun phrase
paraphrasing. Proceedings of the 2013 Conference on Empirical Methods in
Natural Language Processing, 226-237.

2. Mallinson, J., & Lapata, M. (2019). Paraphrasing revisited with neural machine
translation. Proceedings of the 2018 Conference on Empirical Methods in Natural
Language Processing, 1978-1983.

3. Li, J., & Jurafsky, D. (2017). Neural net models for paraphrase identification,
semantic textual similarity, and their evaluation on the SICK dataset. Proceedings of
the 2016 Conference of the North American Chapter of the Association for
Computational Linguistics: Human Language Technologies, 123-133.

4. Wubben, S., & Saggion, H. (2019). A survey on the application of recurrent

neural networks to statistical language modeling. ACM Transactions on Speech and
Language Processing, 9(4), 1-13.

5. Zhao, R., & Lan, X. (2019). Paraphrasing for style. Proceedings of the 57th Annual
Meeting of the Association for Computational Linguistics, 4659-4669.

22
6. Gupta, N., & Jain, V. (2020). A survey on paraphrase generation. Artificial
Intelligence Review, 53(6), 4475-4517.

7. Gensim: a library for topic modeling, document similarity, and text processingthat
includes a module for paraphrasing text.

8.TextBlob: a library that provides simple API for common natural language
processing tasks such as sentiment analysis, part-of-speech tagging, and more.

9.spaCy: a library for advanced natural language processing in Python that includes
a module for paraphrasing text.

10. Transformers: a state-of-the-art library for natural language processing that

includes pre-trained models for paraphrasing, text generation, and more.

11. Iyyer, M., Boyd-Graber, J., Claudino, L., Socher, R., & Daume´ III, H. (2014). A
neural network for factoid question answering over paragraphs. Proceedings of the
2014 Conference on Empirical Methods in Natural Language Processing, 633-644.

12. Chen, J., & Sun, M. (2017). A survey of paraphrasing techniques and
applications. Journal of Artificial Intelligence Research, 60, 423-479.

22
13. Dong, L., Wei, F., Zhou, M., & Xu, K. (2019). Simplifying sentences with
sequence-to-sequence models. Transactions of the Association for Computational
Linguistics, 7, 85-96.

14. Chen, Y., Wang, J., Zhao, W., & Yan, X. (2019). Controllable paraphrase
generation with a syntactic exchanger. Proceedings of the 2019 Conference on
Empirical Methods in Natural Language Processing and the 9th International Joint
Conference on Natural Language Processing, 3458-3463.

15. Wang, S., Chen, Y., & Guo, Y. (2021). Towards better paraphrase generation by
using discourse relations. Proceedings of the AAAI Conference on Artificial
Intelligence, 35(3), 2562-2569.

16. Zarei, N., & Hashemi, H. (2020). A survey on data augmentation techniques for
natural language processing tasks. SN Computer Science, 1-27.

17. Zhang, X., & Lapata, M. (2017). Sentence simplification with deep
reinforcement learning. Proceedings of the 2017 Conference on Empirical Methods
in Natural Language Processing, 595-605.

18. Xu, W., Wu, X., Zhou, Y., & Xu, J. (2020). Improving sentence simplification with
dynamic quantization and progressive decoding. Proceedings of the AAAI
Conference on Artificial Intelligence, 34(05), 8626-8633.

22
19. Gehrmann, S., Dernoncourt, F., Li, Y., & Carlson, D. (2018). Bottom-up
abstractive summarization. Proceedings of the 2018 Conference on Empirical
Methods in Natural Language Processing, 4098-4109.

20. Liu, J., & Lapata, M. (2018). Learning to generate structured summaries from
long documents. Proceedings of the 2018 Conference of the North American
Chapter of the Association for Computational Linguistics: Human Language
Technologies, 555-564.

21. Zhou, Y., Xu, W., & Xu, J. (2021). Controllable text simplification through back-
translation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 29,
1081-1091.

22. Leuski, A., & Traum, D. (2010). Paraphrase generation for spoken dialogue
systems. Proceedings of the 11th Annual Meeting of the Special Interest Group on
Discourse and Dialogue, 88-97.

Tutorialspoint Python
100% (3)
Tutorialspoint Python
444 pages
Text Summarization Using NLP: Bachelor of Technology Computer Science and Engineering
No ratings yet
Text Summarization Using NLP: Bachelor of Technology Computer Science and Engineering
44 pages
Yug's Blackbook
No ratings yet
Yug's Blackbook
73 pages
Ilovepdf Merged
No ratings yet
Ilovepdf Merged
83 pages
RT 4
No ratings yet
RT 4
77 pages
Report Sentiment Analysis Using NLP and Deep Learning
No ratings yet
Report Sentiment Analysis Using NLP and Deep Learning
65 pages
Pdfquery
No ratings yet
Pdfquery
68 pages
Kirankumar Major
No ratings yet
Kirankumar Major
66 pages
517 Modified
No ratings yet
517 Modified
52 pages
Sample Project Final Document
No ratings yet
Sample Project Final Document
68 pages
Wa0023.
No ratings yet
Wa0023.
53 pages
Manish Capstone
No ratings yet
Manish Capstone
45 pages
Final Blackbook AutoRecovered
No ratings yet
Final Blackbook AutoRecovered
71 pages
Shadan Minor Project
No ratings yet
Shadan Minor Project
58 pages
Final Report - 12
No ratings yet
Final Report - 12
60 pages
Text To Speech Speech To Text Using Translations (Mini Project)
No ratings yet
Text To Speech Speech To Text Using Translations (Mini Project)
46 pages
Voice Based System Assistant Using NLP and Deep Learning
No ratings yet
Voice Based System Assistant Using NLP and Deep Learning
63 pages
Project Report
No ratings yet
Project Report
56 pages
Rest 459
No ratings yet
Rest 459
60 pages
Major Project
No ratings yet
Major Project
39 pages
FSI Document Bhavya PDF
No ratings yet
FSI Document Bhavya PDF
80 pages
Textual Reasoning - 1 - Merged
No ratings yet
Textual Reasoning - 1 - Merged
36 pages
Mainprojectsample Documentation.
No ratings yet
Mainprojectsample Documentation.
51 pages
Mini - Project - Report1final
No ratings yet
Mini - Project - Report1final
46 pages
Building and AI Chatbot Using LLM
No ratings yet
Building and AI Chatbot Using LLM
69 pages
2010 Finkel
No ratings yet
2010 Finkel
169 pages
Cha Marathi
No ratings yet
Cha Marathi
33 pages
RTP 22wj8a6734 DS1
No ratings yet
RTP 22wj8a6734 DS1
30 pages
Towards Developing Tools For Indian Lang
No ratings yet
Towards Developing Tools For Indian Lang
59 pages
Mini Project Report Final
No ratings yet
Mini Project Report Final
25 pages
Project Report (TRANSLATOR) EDIT
No ratings yet
Project Report (TRANSLATOR) EDIT
39 pages
Sample - Project Report Format 22
No ratings yet
Sample - Project Report Format 22
37 pages
Minor Project Fiii - Merged
No ratings yet
Minor Project Fiii - Merged
62 pages
Mini - Project - Final and Final
No ratings yet
Mini - Project - Final and Final
25 pages
Project Book
No ratings yet
Project Book
56 pages
Deep Learning
No ratings yet
Deep Learning
34 pages
Mini Project 3
No ratings yet
Mini Project 3
33 pages
NLP Interpreter Project Report
No ratings yet
NLP Interpreter Project Report
20 pages
Koushik Final Project
No ratings yet
Koushik Final Project
37 pages
NLP Report
No ratings yet
NLP Report
20 pages
Paraphrasing Hindi Text
No ratings yet
Paraphrasing Hindi Text
22 pages
CSS Report1edit
No ratings yet
CSS Report1edit
18 pages
Mini Project Report 3.00000000
No ratings yet
Mini Project Report 3.00000000
21 pages
Final Year Report Submitted
No ratings yet
Final Year Report Submitted
61 pages
Natural Language Processing (NLP)
No ratings yet
Natural Language Processing (NLP)
31 pages
NLP Report Final
No ratings yet
NLP Report Final
12 pages
NLP - PBL - Project Report - Draft.02
No ratings yet
NLP - PBL - Project Report - Draft.02
32 pages
Mini Project Report
No ratings yet
Mini Project Report
32 pages
Untitled
100% (3)
Untitled
512 pages
Final Project Report: AI Based Smart Teaching Assistant For Personalized Exam Preparation
No ratings yet
Final Project Report: AI Based Smart Teaching Assistant For Personalized Exam Preparation
16 pages
Assignment On Paraprasing in NLP
No ratings yet
Assignment On Paraprasing in NLP
12 pages
Tanya 1 Resume 2
No ratings yet
Tanya 1 Resume 2
1 page
Abstractive Text Summarization Using Deep Learning
No ratings yet
Abstractive Text Summarization Using Deep Learning
43 pages
NLP Manual 1-8
No ratings yet
NLP Manual 1-8
15 pages
Sentence Autocomplete Using Transfer Learning
No ratings yet
Sentence Autocomplete Using Transfer Learning
4 pages
XII Monthly Test CS
No ratings yet
XII Monthly Test CS
2 pages
Voice Based System Assistant Using NLP and Deep Learning-1
No ratings yet
Voice Based System Assistant Using NLP and Deep Learning-1
82 pages
Natural Language Processing (2) Finalll
No ratings yet
Natural Language Processing (2) Finalll
20 pages
NLP11
No ratings yet
NLP11
4 pages
AI Based Learning Abstract
No ratings yet
AI Based Learning Abstract
2 pages
Pratham
No ratings yet
Pratham
6 pages
Natural Language Processing
No ratings yet
Natural Language Processing
8 pages
BE Computer Fourth Semester Course Structure & Syllabus
No ratings yet
BE Computer Fourth Semester Course Structure & Syllabus
79 pages
5th Revised A Level in IT
No ratings yet
5th Revised A Level in IT
306 pages
Krish Naik AI Engineer Roadmap 2024
No ratings yet
Krish Naik AI Engineer Roadmap 2024
8 pages
Black Book For AI Code Companion Final
No ratings yet
Black Book For AI Code Companion Final
32 pages
Introduction To Python For Science & Engineering: David J. Pine
No ratings yet
Introduction To Python For Science & Engineering: David J. Pine
18 pages
TRX Documentation20130403 PDF
No ratings yet
TRX Documentation20130403 PDF
49 pages
E Accts Easyweb Chettinad Chettinadadmin Homework 011018 9HCSC011018 PDF
No ratings yet
E Accts Easyweb Chettinad Chettinadadmin Homework 011018 9HCSC011018 PDF
12 pages
PP Unit-5 Notes
No ratings yet
PP Unit-5 Notes
15 pages
Example
No ratings yet
Example
8 pages
What Is Python
No ratings yet
What Is Python
40 pages
Developing A Library Management System
No ratings yet
Developing A Library Management System
9 pages
Banking Program
No ratings yet
Banking Program
18 pages
Summer Internship Presentation: Prepared by
No ratings yet
Summer Internship Presentation: Prepared by
11 pages
PYTHON Reviewer
No ratings yet
PYTHON Reviewer
8 pages
Conda Cheatsheet PDF
No ratings yet
Conda Cheatsheet PDF
2 pages
Resume Reviewer Report
No ratings yet
Resume Reviewer Report
12 pages
Software Engineering
No ratings yet
Software Engineering
21 pages
Setting Up A Simple OCR Server: by Real Python 37 Comments
No ratings yet
Setting Up A Simple OCR Server: by Real Python 37 Comments
8 pages
The Various Ways of Programming and Embedding Firmware Into An ARM Cortex-M3 Microcontroller Based Hardware
No ratings yet
The Various Ways of Programming and Embedding Firmware Into An ARM Cortex-M3 Microcontroller Based Hardware
6 pages
Xi Computer Science QP Cae - Feb - Mar 2024
No ratings yet
Xi Computer Science QP Cae - Feb - Mar 2024
5 pages
12th Cbse Computer Science DPP 4 Q
No ratings yet
12th Cbse Computer Science DPP 4 Q
4 pages
Croma Campus - Cloud Computing Training Curriculum
No ratings yet
Croma Campus - Cloud Computing Training Curriculum
6 pages
10 Interesting Python Programs With Code
No ratings yet
10 Interesting Python Programs With Code
10 pages
Road Map For Learning Django
No ratings yet
Road Map For Learning Django
2 pages
Vicky Kumar Resume
No ratings yet
Vicky Kumar Resume
1 page
FSD Objective Final PDF
No ratings yet
FSD Objective Final PDF
1 page

Paraphrasing Tool For Hindi Text

Uploaded by

Paraphrasing Tool For Hindi Text

Uploaded by

Paraphrasing Tool For Hindi Text

A Project Report Submitted

Mr. Anupam Teli PRN No.:2043110333

April 21, 2023

Mr. Shubham Hande PRN

Mr.Yashodhan Joglekar PRN

Mr. Deepak Manney PRN

Certified that the work contained in the project report titled

Prof. Rahul Papalkar Dr. Mohan Awasthy

External Examiner Name and Sign

Internal Examiner Name and Sign

We would like to express my sincere gratitude to HoD Prof. X of Department of

Throughout the project, Prof. X provided us with his/her extensive knowledge,

Finally, We would like to express my deep appreciation to Prof. X for his/her

1.1 Research Objectives 1

1.2 Report Organization 3

3.1 Text Mining 7

5 Result Analysis and Discussion 12

6 Conclusion and Future Work 19

1.1 Paraphrasing Tool . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

5.1 Data flow diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

4.1 Literature Survey ................................................................................................. 11

5.1 BLEU, ROUGE AND METEOR ....................... 13

1.1 Research Objectives

A paraphrasing tool in Python is a software program designed to assist users in

In a Python project, a paraphrasing tool can be integrated using various techniques

A report on a paraphrasing tool in Python can be organized in various ways

2. Literature Review: This section should review the existing literature on

Figure 1.1: paraphrasing tool

3. "A Survey of Text Paraphrasing Techniques" by Alok Kumar and Pushpak

paraphrasing tool Services

Paraphrasing services in Python provide a way to automatically generate new,unique

4. PyTorch-Transformers: PyTorch-Transformers is a Python library that provides access

paraphrasing tool Implementation

1. Preprocessing: The input text is often preprocessed to remove any noise or

4. Synonym substitution: One common technique for paraphrasing is to substitute

7. Algorithm selection: Choose a paraphrasing algorithm or technique that is

8. Implementation: Implement the selected algorithm or technique using Python

SR.N PAPER TITLE PUBLICA PAPER OUTCOME GAP ANALYSIS

2) Paraphrase Detection 2019 The international languages uses to Its provides an

3) Phrase Composing 2021 Grammatical Error Correction and Its makes an

4) A Novel approach to 2019 Creating paraphrase by applying Synonyms are not

5) Detecting Paraphrases 2018 We describe an approach to the In this paper, it is

6) Language 2020 In this paper, we presented our NLP- require large

8) Paraphrase Detection 2018 Detecting paraphrases in Hindi Incorporating

10) Paraphrase 2019 Paraphrase identification using

Result Analysis and Discussion

1. Grammaticality: The paraphrased output should be grammatically correct and

5. Limitations: The limitations of the paraphrasing tool should be discussed, including

PROBLEM DEFINATION AND SCOPE :

The problem that a paraphrasing tool in Python aims to address is the

To address these challenges, a paraphrasing tool in Python typically uses

The scope of a project on a paraphrasing tool in Python could varydepending on the

2. Preprocessing: Preprocessing the collected data to remove noise, suchas punctuation,

3. Training and evaluation: Developing and evaluating different models for

Conclusion and Future Work

However, there are limitations to the current state of paraphrasing technology,

1. Improving the quality of paraphrased output: While current paraphrasing tools in

4. Evaluation and benchmarking: Currently, there is a lack of standardized evaluation

4. Wubben, S., & Saggion, H. (2019). A survey on the application of recurrent

10. Transformers: a state-of-the-art library for natural language processing that

You might also like