A Novel Approach for Code Smells Detection Based on Deep Learning

Tao Lin1, Xue Fu2, Fu Chen3, Luqun Li4

1 Amazon, Seattle WA 98109, USA, [email protected]
2 Shanghai Normal University, China
3 Central University of Finance and Economics, China, [email protected]
4 Shanghai Normal University, China, [email protected]

Abstract. Compared to software bugs, code smells are more significant in
software engineering research. It is not easy to detect code smells through
traditional methods. In this work, we propose a novel code smells detection
approach based on deep learning. Experiments show that our approach achieves
high F2 scores.

Keywords: Code Smells, Deep Learning, Convolutional Neural Network.

1 Introduction

Modern agile software development generates code smells at an increasing rate,
because code changes are far more frequent: at large software companies and in
dominant open-source communities they occur on a daily basis.
Although many testing approaches exist for detecting code smells, these methods
have shortcomings. With such frequent changes, it is increasingly likely that
changes introduce code smell overhead. Code smells, like software bugs, are a
serious problem in modern software.
Code smells are now being studied by many practitioners. Yet software developers
are often unaware of what a code smell is, although they are aware of software
bugs, thanks to integrated development environments that provide instant
suggestions and notifications when there are bugs.
The questions here are how to detect code smells effectively and what the
motivation for detecting them is. There are two main categories of deep learning
networks: recurrent neural networks and convolutional networks.

Corresponding Authors: Tao Lin (Amazon, USA, [email protected]), Luqun Li (Shanghai Normal
University, China, [email protected]), Fu Chen (Central University of Finance and
Economics, China, [email protected]), Xue Fu (Shanghai Normal University, China)

Convolutional networks have already demonstrated their usefulness by leveraging
hierarchical features. In this paper, we use fully convolutional networks for code
smells detection based on semantic features.
We define the type of neural network we use and explain how it is used to detect
code smells. An advantage of a convolutional network is its ability to identify and
use local correspondences.
Convolutional networks are increasingly significant in recognition and machine
learning. They have improved image recognition, for example by exploiting local
correspondence. In software engineering, the same kind of information can be used
for code smells detection.
To our knowledge, this is the first work to train a convolutional network for code
smells recognition. The convolutional network substantially improves inference.
This work is an extension of the authors' previous work [8].
Unlike previous works that need additional information for code smells detection,
this work does not use any existing information. One of the major challenges in
code smells detection is to find the relationship between code semantics and code
location. There is a tradeoff between identifying the correct semantics and
identifying the correct location of a smell.
Although deep networks have produced several success stories in image
recognition [1], it is hard to transfer these approaches to software engineering,
which is more deterministic. Fully convolutional networks have been used for
one-layered computation and have the potential to be deployed in multi-layered
environments.

2 Code Smells Detection Based on Convolutional Networks

We can represent the convolutional network as a multi-dimensional array of size
h * w * d, where h and w are the spatial dimensions and d is the channel dimension.
The first layer is our source code input.
The second layer is the network for sequence modeling. For example, the inputs are
x0, x1, x2, x3, x4, …, xn, and the outputs are y0, y1, y2, y3, y4, …, yn. The second
layer will be y'0, y'1, y'2, y'3, y'4, …, y'n.
The outputs are reshaped to a one-dimensional array of size D * 1024. This output
array forms the dilation blocks. For the encoder task, we should process noise.
Each layer in the encoder is processed by normalization and linear analysis.
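
The sequence-modeling layer and the dilation blocks described above are not specified in detail. As a minimal sketch, assuming that "dilation blocks" means stacked dilated causal 1-D convolutions (as in temporal convolutional networks), the following pure-Python code maps an input sequence x0..xn to an output sequence y0..yn of the same length. The function names, the kernel, and the dilation schedule (1, 2, 4) are illustrative assumptions, not the authors' configuration.

```python
from typing import List, Sequence


def dilated_conv1d(x: Sequence[float], kernel: Sequence[float],
                   dilation: int = 1) -> List[float]:
    """Causal 1-D convolution: y[t] = sum_k kernel[k] * x[t - k * dilation].

    Positions before the start of the sequence are treated as zero, so the
    output has the same length as the input.
    """
    if dilation < 1:
        raise ValueError("dilation must be >= 1")
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # implicit zero padding on the left
                acc += weight * x[idx]
        y.append(acc)
    return y


def dilation_block(x: Sequence[float], kernel: Sequence[float],
                   dilations: Sequence[int] = (1, 2, 4)) -> List[float]:
    """Apply dilated convolutions in sequence (hypothetical schedule)."""
    out = list(x)
    for d in dilations:
        out = dilated_conv1d(out, kernel, d)
    return out


# A unit impulse shows how far one input position can influence the output.
impulse = [1.0] + [0.0] * 7
response = dilation_block(impulse, kernel=[1.0, 1.0])
print(response)
```

With a kernel of width 2 and dilations 1, 2, and 4, the impulse spreads over eight output positions, which illustrates why stacking dilated layers widens the context each output can see without adding many parameters.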

3 High-level design

With recent developments in software engineering, it is easy to find software bugs
using several methods, from compile time to run time. Our work is based on
state-of-the-art deep learning methods for detection and recognition tasks.

First, we transform the software source code into an XML file so that it can be
processed by deep learning models [2]. Two steps follow: code segment proposal and
classification. The code segment proposal uses heuristic search to generate the
inputs for the next step. These segments are then processed by a CNN classifier.
We avoid R-CNN, mainly because it uses the selective search algorithm, which is
time-consuming.
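
The paper does not name the tool or the programming language used for the XML transformation, nor the heuristic used for segment proposal. The sketch below is therefore only an illustration under stated assumptions: it parses Python source with the standard `ast` module, serializes the syntax tree with `xml.etree.ElementTree`, and uses a deliberately simple heuristic (every class or function definition becomes a candidate segment). The names `ast_to_xml` and `propose_segments` are hypothetical.

```python
import ast
import xml.etree.ElementTree as ET
from typing import List, Tuple


def ast_to_xml(node: ast.AST) -> ET.Element:
    """Convert a Python AST node into an XML element, recursively."""
    elem = ET.Element(type(node).__name__)
    start = getattr(node, "lineno", None)
    end = getattr(node, "end_lineno", None)  # available in Python 3.8+
    if start is not None:
        elem.set("line", str(start))
    if end is not None:
        elem.set("end_line", str(end))
    name = getattr(node, "name", None)
    if isinstance(name, str):
        elem.set("name", name)
    for child in ast.iter_child_nodes(node):
        elem.append(ast_to_xml(child))
    return elem


def propose_segments(root: ET.Element,
                     min_lines: int = 1) -> List[Tuple[str, int, int]]:
    """Assumed heuristic: each class or function definition is a candidate."""
    candidates = ("FunctionDef", "AsyncFunctionDef", "ClassDef")
    segments = []
    for elem in root.iter():  # document (pre-order) traversal
        if elem.tag in candidates:
            start = int(elem.get("line"))
            end = int(elem.get("end_line"))
            if end - start + 1 >= min_lines:
                segments.append((elem.get("name"), start, end))
    return segments


SOURCE = '''
class Account:
    def deposit(self, amount):
        self.balance += amount

def helper():
    return 1
'''

xml_root = ast_to_xml(ast.parse(SOURCE))
segments = propose_segments(xml_root)
print(ET.tostring(xml_root, encoding="unicode")[:80])
print(segments)
```

Each proposed segment (name, first line, last line) would then be encoded and passed to the classifier. A real proposal step would likely rank or filter candidates rather than accept every definition.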
We use the following equation for segments:
$$\mathrm{Seg} = \left( \max_{r} \frac{p^{2}}{D} + \min_{r} q \right) \cdot \frac{I}{D}$$

Here Seg is the set of segments, r is the rule of limits, p is the process
variable, q is the next graph input, I is the interception, and D is the next
destination.

3.1 Experimental results

We use an open-source database published in the authors' previous work [11].

The experimental results are shown in Table 1.

Table 1. Experimental results

Code smell                  Precision  Recall  F-Score  Kappa
Long Method                 0.528      0.674   0.754    0.635
Lazy Class                  0.624      0.678   0.613    0.632
Speculative Generality      0.712      0.734   0.689    0.643
Refused Bequest             0.698      0.701   0.711    0.677
Duplicated code             0.543      0.568   0.594    0.585
Contrived complexity        0.783      0.792   0.810    0.802
Shotgun surgery             0.597      0.596   0.501    0.601
Uncontrolled side effects   0.801      0.799   0.805    0.810

Table 1 shows that this work achieves high F2 scores, especially for the
categories of uncontrolled side effects (0.805) and contrived complexity (0.810).
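
The paper does not state how the metrics in Table 1 were computed. For readers unfamiliar with them, the sketch below uses the standard textbook definitions of precision, recall, and the F-beta score (beta = 2 gives the F2 score, which weights recall more heavily than precision). It also assumes the Kappa column refers to Cohen's kappa. The confusion counts are made-up illustrative values, not data from the paper.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Binary confusion counts for one code smell category."""
    tp: int  # smelly segments correctly flagged
    fp: int  # clean segments wrongly flagged
    fn: int  # smelly segments missed
    tn: int  # clean segments correctly passed


def precision(c: Confusion) -> float:
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def recall(c: Confusion) -> float:
    actual = c.tp + c.fn
    return c.tp / actual if actual else 0.0


def f_beta(p: float, r: float, beta: float = 2.0) -> float:
    """F-beta = (1 + beta^2) * p * r / (beta^2 * p + r)."""
    b2 = beta * beta
    denom = b2 * p + r
    return (1 + b2) * p * r / denom if denom else 0.0


def cohen_kappa(c: Confusion) -> float:
    """Agreement between prediction and ground truth, corrected for chance."""
    n = c.tp + c.fp + c.fn + c.tn
    observed = (c.tp + c.tn) / n
    expected = ((c.tp + c.fp) * (c.tp + c.fn)
                + (c.fn + c.tn) * (c.fp + c.tn)) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0


# Illustrative counts only (hypothetical, not taken from the experiments).
example = Confusion(tp=40, fp=10, fn=10, tn=40)
p, r = precision(example), recall(example)
print(p, r, f_beta(p, r), cohen_kappa(example))
```

Because beta = 2 weights recall more heavily, a detector with perfect recall and precision 0.5 scores higher on F2 than one with perfect precision and recall 0.5. This suits code smells detection when a missed smell is considered more costly than a false alarm.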

4 Conclusion

In this work, we studied code smells detection based on deep learning. Our
solution uses a convolutional neural network to train a model that detects several
common code smells in software engineering. The solution achieves a satisfactory
F2 score, with an average above 0.75.

Acknowledgments

Part of this work is from the author's PhD study [1], before the author joined
Amazon. Professor Fu Chen of Central University of Finance and Economics
provided many constructive suggestions and perspectives for this work during the
author's PhD study. Professor Fu Chen and this work were supported in part by the
National Science Foundation of China under No. 61672104.

References

[1] T. Lin, “A Data Triage Retrieval System for Cyber Security Operations Center,”
Pennsylvania State Univ. Thesis, 2018.
[2] T. Lin, “A Container-Destructor-Explorer Paradigm to Code Smells
Detection,” J. Chinese Comput. Syst., vol. 37, no. 3, 2016.
[3] T. Lin and X. Fu, “Flame Detection Based on SIFT Algorithm and One Class
Classifier with Undetermined Environment,” Comput. Sci., vol. 42, no. 6, 2015.
[4] T. Lin, C. Zhong, J. Yen, and P. Liu, “Retrieval of Relevant Historical Data
Triage Operations in Security Operation Centers,” in From Database to Cyber
Security, Springer, Cham, 2018, pp. 227–243.
[5] T. Lin, “A Novel Image Matching Algorithm Based on Graph Theory,”
Comput. Appl. Softw., vol. 33, no. 12, 2016.
[6] T. Lin, “Graphic User Interface Testing Based on Petri Net,” Appl. Res.
Comput., vol. 33, no. 3, 2016.
[7] T. Lin, “A Novel Direct Small World Network Model,” J. Shanghai Norm.
Univ., vol. 45, no. 5, 2016.
[8] T. Lin, J. Gao, X. Fu, and Y. Lin, “A Novel Bug Report Extraction Approach,”
in International Conference on Algorithms and Architectures for Parallel
Processing, 2015, pp. 771–780.
[9] C. Zhong, T. Lin, P. Liu, J. Yen, and K. Chen, “A cyber security data triage
operation retrieval system,” Comput. Secur., vol. 76, pp. 12–31, 2018.
[10] T. Lin, “Deep Learning for IoT,” 39th IEEE International Performance
Computing and Communications Conference, 2020.
[11] T. Lin, “Security Operations Center Retrieval,” 2021.
https://github.com/ltaocs/SecurityOperationsCenterRetrieval.