
Abstract

The document discusses the development of SESRUA, a new method for code retrieval that utilizes natural language queries to find relevant code snippets from existing repositories. It employs a Long-Short Term Memory (LSTM) model for tag prediction and analyzes code content to generate descriptions for functions. The approach has been evaluated using a benchmark dataset from Kaggle, achieving an average accuracy of 68% and recall of 78% in proposing source code.

Uploaded by

Hamza

Abstract

Code retrieval seeks the snippets of code most relevant to a user request (i.e., a
natural-language description). Many code-search methods support natural-language queries,
but it is difficult to weigh the pros and cons of each method and to choose the best one
for a given use. This is because:

• The implementations of these methods and the datasets used to evaluate them are usually
not publicly available.
• Some methods use different training datasets or auxiliary data sources, so their
effectiveness cannot be compared fairly and may degrade in real-world use.

We therefore collected a benchmark dataset from
https://www.kaggle.com/stackoverflow/stackoverflow. This dataset is taken from a
competition held on Kaggle in 2019. It includes 500,000 queries posted on Stack Overflow,
of which we use 100,000 for training to retrieve relevant results. In our approach we employ
a neural network model, Long Short-Term Memory (LSTM), for tag prediction. We have developed
a new method for making source code recommendations, called SESRUA (Smart Search Engine for
Source Code Retrieval), to enable programmers to find a relevant implementation for sample
code based on a software requirement specification. SESRUA assists programmers in searching
existing code repositories using natural-language queries. Our proposed approach summarizes
Python code and its 500+ tags into sentences and paraphrases, which are matched against user
queries. SESRUA extracts and analyses the content of code, such as variables, functions,
docstrings, and comments, to build a code description for each function, which is then linked
to the corresponding function. For evaluation, we built a web-based tool that lets users type
in a text search query and get the top ten most relevant results. Our proposed approach
achieves an average accuracy of 68% and an average recall of 78% in proposing source code.
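
The extraction step described above (identifiers, docstrings, and comments combined into a per-function description, then matched against a query) can be illustrated with a minimal sketch using only the Python standard library. This is not the SESRUA implementation: the names `split_identifier`, `describe_functions`, `rank`, and `SAMPLE` are hypothetical, and plain word overlap is used here as a simple stand-in for the LSTM tag prediction and paraphrase matching mentioned in the abstract.

```python
import ast
import io
import re
import tokenize


def split_identifier(name):
    """Split snake_case and camelCase identifiers into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [part.lower() for part in parts]


def describe_functions(source):
    """Return {function_name: description_text} for every function in source."""
    tree = ast.parse(source)
    # The parser drops comments, so recover them (keyed by line) with tokenize.
    comments = {}
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            comments[tok.start[0]] = tok.string.lstrip("# ").strip()

    descriptions = {}
    for node in ast.walk(tree):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        words = split_identifier(node.name)
        # Argument and variable names that appear inside the function.
        for child in ast.walk(node):
            if isinstance(child, ast.arg):
                words += split_identifier(child.arg)
            elif isinstance(child, ast.Name):
                words += split_identifier(child.id)
        doc = ast.get_docstring(node) or ""
        body_comments = [text for line, text in comments.items()
                         if node.lineno <= line <= node.end_lineno]
        descriptions[node.name] = " ".join([*words, doc, *body_comments]).lower()
    return descriptions


def rank(query, descriptions, top_k=10):
    """Rank functions by word overlap with the query (assumed stand-in scorer)."""
    query_words = set(re.findall(r"[a-z0-9]+", query.lower()))
    scored = []
    for name, text in descriptions.items():
        overlap = len(query_words & set(re.findall(r"[a-z0-9]+", text)))
        scored.append((overlap, name))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for overlap, name in scored[:top_k] if overlap > 0]


# Hypothetical sample input, not taken from the benchmark dataset.
SAMPLE = '''
def read_csv_file(path):
    """Load rows from a CSV file."""
    # open the file and parse each line
    with open(path) as handle:
        return [line.split(",") for line in handle]

def sendEmail(recipient, body):
    """Send a plain text email message."""
    return None
'''

descriptions = describe_functions(SAMPLE)
print(rank("load csv file", descriptions))
```

The default `top_k=10` mirrors the top-ten result list returned by the web-based tool; a learned model would replace the overlap score in `rank` while the extraction step could stay the same.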

KEYWORDS: Code Retrieval, LSTM, Tag Prediction, Semantics, Stack Overflow, Search
Engine, Reuse.
