Dynamic Filtering and Prioritization of Static Code Analysis Alerts

This document discusses a proposed approach to dynamically filter and prioritize static code analysis alerts as developers review them. It constructs a Prolog knowledge base to capture code data flow and reported alerts/properties. As developers review alerts and identify actual faults, the knowledge base is updated, providing information to eliminate or prioritize remaining alerts based on shared root causes. An example is presented to illustrate how tools could automate this approach.

2021 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW)

978-1-6654-2603-9/21/$31.00 ©2021 IEEE | DOI: 10.1109/ISSREW53611.2021.00086

Ulaş Yüksel, Vestel Electronics, Manisa, Turkey
Hasan Sözer, Ozyegin University, Istanbul, Turkey

Abstract: We propose an approach for filtering and prioritizing static code analysis alerts while these alerts are being reviewed by the developer. We construct a Prolog knowledge base that captures the data flow information in the source code as well as the reported alerts, their properties and associations with the data flow. The knowledge base is updated as the developer reviews the listed alerts and decides whether they point at an actual fault or not. These updates provide useful information since some of the alerts of the same type can be related in terms of their root cause. Hence, the dynamically updated knowledge base can be queried to eliminate or prioritize the remaining alerts in the review list. We present a motivating example to illustrate the approach and its automation by integrating a set of tools.

Index Terms: program analysis, static code analysis, processing alarms/warnings/alerts, Prolog, code reviews

I. INTRODUCTION

Static code analysis tools [1] analyze source code without executing it. They pinpoint potential software faults that might lead to failures at runtime. Their output constitutes a list of alerts [2] (also called alarms [3] and warnings [4]), each of which describes a potential fault together with a number of features such as the corresponding line of code, type and severity of the fault. As strong points, the analysis is fully automated and scalable. As a drawback, developers are usually exposed to a large number of alerts, some of which are false positives [2], [5], [6], although some others can be associated with critical faults [5]. Empirical studies report false positive rates that range between 30% and 100% [7], and the density of alerts is typically 2 alerts per KLOC (thousand lines of code) on average [6]. As a result, around 3,000 alerts are generated for a system with 1,500 KLOC. Each of these alerts should be manually inspected by developers to focus on those that are true positives, i.e., alerts that are actionable [2]. This inspection process consumes considerable time and effort: 250 man hours might be needed to inspect 3,000 alerts, assuming the inspection time per alert is 5 minutes on average [2], [8].

To address this problem, numerous approaches have been proposed for providing developers a reduced and/or prioritized list of alerts [2], [3] by employing a variety of techniques including testing [9], runtime verification [10] and machine learning [11]. We argue that the list of alerts can still be reduced or better (re)prioritized while they are being reviewed by the developer. Some alerts are related to each other in terms of their root causes [12]–[14]. Hence, the outcome of the review process regarding an alert constitutes useful information for the validity and priority of other alerts. Our goal is to record and exploit this information. We propose a novel and complementary approach for filtering and prioritizing static code analysis alerts while these alerts are being reviewed by the developer.

Fig. 1. The overall approach and the toolset.

The overall approach is depicted in Figure 1, which relies on a knowledge base managed by a Prolog engine. We construct a Prolog knowledge base that captures the data flow information in the source code as well as the listed alerts, their properties and root causes. The knowledge base is updated as the developer reviews the source code according to the listed alerts one by one and decides whether they point at an actual fault or not. This decision is added to the knowledge base, which is then queried after each such update to re-prioritize the remaining alerts. In the next section, we present a motivating example and a set of related tools for automating the approach.

The most recent and the most related work to ours is that of Zhang et al. [13], where an interactive approach is proposed to learn from the developer feedback for prioritizing alerts. The developer answers a set of questions and the corresponding answers are recorded in the form of Datalog

(a subset of Prolog) facts together with the rules regarding alerts. Our goal is to provide a less intrusive approach where the alert review process remains the same for the developer. The only additional effort will include the labeling of alerts (as actionable or not [2]) as they are already being reviewed. Developers will not be exposed to any kind of formalism.

II. MOTIVATING EXAMPLE

A sample code snippet in Python is listed below. Here, the values of both NUM2 and NUM3 are obtained as the return value of the function get_number in lines 2 and 3, respectively.

    1  NUM1 = 5
    2  NUM2 = get_number()
    3  NUM3 = get_number()
    4  result = NUM1 / NUM2
    5  result = NUM1 / NUM3
    6  print(result)

The static code analysis tool Semgrep¹ allows the specification of patterns for reporting alerts. For instance, the pattern shown below leads to two division by zero alerts for lines 4 and 5 in the code snippet listed above.

    $ZERO = $FUNC(...)
    ...
    $X / $ZERO

An alert that is reported based on this pattern is a true positive (TP), i.e., points at an actual fault, if the function get_number returns 0. This rule can be expressed in Pytholog² only once per alert type in a generic way as shown in line 4 of the code snippet below. The rule specifies that an alert A is TP if it is of type division by zero and if the corresponding data source is V and if V evaluates to 0. There are two other rules that specify when an alert is actionable. First, the corresponding alert can be directly labelled as TP by the developer (Line 5 below). Second, the corresponding alert can have the same type and source as another alert that is labelled by the developer as TP (Lines 6-8 below). These rules can be added to a knowledge base together with the information regarding the listed alerts as shown below.

    1   import pytholog as pl
    2   scatkb = pl.KnowledgeBase("scat")
    3   scatkb([
    4     "tp(A):-type(A,divby0),src(A,V),zero(V)",
    5     "actionable(A):-tp(A)",
    6     "actionable(Z):-tp(A),"
    7     "type(A,divby0),src(A,V),"
    8     "type(Z,divby0),src(Z,V)",
    9     "type(a1,divby0)","src(a1,lib1)",
    10    "type(a2,divby0)","src(a2,lib1)"])

Note that rules are agnostic to the analyzed system and some of the facts regarding alerts, e.g., type(a1,divby0), are provided in the alert description. Data sources can also be automatically obtained with a data flow analysis. For instance, there exists a TypeScript library³ for Python source code analysis, which can detect the source of both NUM2 and NUM3 as the function get_number in the first code snippet. The knowledge base can be saved for later use and it can be extended with new facts dynamically. For instance, the following fact can be added later on as a developer decision.

    tp(a1)

Then, the knowledge base can be queried for filtering alerts. For instance, the following query in Pytholog retrieves those alerts that are actionable.

    scatkb.query(pl.Expr("actionable(X)"))

As a result of this query and the current status of the knowledge base, a list of alerts is displayed as follows.

    [{'X': 'a1'}, {'X': 'a2'}]

The developer does not have to manually add facts to the knowledge base. The review process is not disrupted as the developer makes a decision for each alert in a given order. These decisions can be gradually added to the knowledge base as facts. The status of the remaining alerts can automatically be derived due to the logical links among them that are established based on the shared data sources.

¹ https://semgrep.dev/
² https://pypi.org/project/pytholog/
³ https://github.com/microsoft/python-program-analysis

REFERENCES

[1] A. Gosain and G. Sharma, Static Analysis: A Survey of Techniques and Tools. New Delhi: Springer India, 2015, pp. 581–591.
[2] S. Heckman and L. Williams, “A systematic literature review of actionable alert identification techniques for automated static code analysis,” Information and Software Technology, vol. 53, no. 4, pp. 363–387, 2011.
[3] T. Muske and A. Serebrenik, “Survey of approaches for handling static analysis alarms,” in Proceedings of the 16th IEEE International Working Conference on Source Code Analysis and Manipulation, Raleigh, NC, USA, 2016, pp. 157–166.
[4] M. Li, Y. Chen, L. Wang, and G. Xu, “Dynamically validating static memory leak warnings,” in Proceedings of the 2013 International Symposium on Software Testing and Analysis, 2013, pp. 112–122.
[5] P. Anderson, “Measuring the value of static-analysis tool deployments,” IEEE Security and Privacy, vol. 10, no. 3, pp. 40–47, 2012.
[6] U. Yuksel and H. Sozer, “Automated classification of static code analysis alerts: A case study,” in Proceedings of the 29th IEEE Conference on Software Maintenance, Eindhoven, Netherlands, 2013, pp. 532–535.
[7] T. Kremenek and D. Engler, “Z-ranking: using statistical analysis to counter the impact of static analysis approximations,” in Proceedings of the 10th international conference on Static analysis, San Diego, CA, USA, 2003, pp. 295–315.
[8] “Effective management of static analysis vulnerabilities and defects,” White Paper, Coverity Inc., 2009.
[9] A. K. Joshy, X. Chen, B. Steenhoek, and W. Le, “Validating static warnings via testing code fragments,” in Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis, 2021, pp. 540–552.
[10] H. Sozer, “Integrated static code analysis and runtime verification,” Software Practice and Experience, vol. 45, no. 10, pp. 1359–1373, 2015.
[11] U. Yuksel, H. Sozer, and M. Sensoy, “Trust-based fusion of classifiers for static code analysis,” in Information Fusion (FUSION), 2014 17th International Conference on, Salamanca, Spain, 2014, pp. 1–6.
[12] T. Muske and U. Khedker, “Cause points analysis for effective handling of alarms,” in Proceedings of the 27th IEEE International Symposium on Software Reliability Engineering, 2016, pp. 173–184.
[13] X. Zhang, R. Grigore, X. Si, and M. Naik, “Effective interactive resolution of static analysis alarms,” Proceedings of the ACM on Programming Languages, vol. 1, no. 57, pp. 1–30, 2017.
[14] T. Muske, R. Talluri, and A. Serebrenik, “Repositioning of static analysis alarms,” in Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis, 2018, pp. 187–197.
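
The flow of the motivating example in Section II (detect the shared data source, record a developer decision, derive the status of the remaining alerts) can also be illustrated with a small self-contained sketch. The code below is only an illustration under stated assumptions and is not the authors' toolset: it uses Python's standard ast module as a stand-in for Semgrep and the data flow library, it uses the name of the called function (get_number) as the data source instead of the lib1 fact, it only handles straight-line top-level code, and it applies the shared-source rule to any alert type rather than divby0 alone. The helper names find_divby0_alerts, actionable and prioritize are hypothetical.

```python
import ast

SOURCE = """\
NUM1 = 5
NUM2 = get_number()
NUM3 = get_number()
result = NUM1 / NUM2
result = NUM1 / NUM3
print(result)
"""


def find_divby0_alerts(source):
    """Report divisions whose denominator was assigned from a function call.

    Returns {alert_id: {"line", "type", "src"}}; "src" is the called
    function's name (an assumption standing in for the data source fact).
    """
    sources = {}  # variable name -> function that produced its value
    alerts = {}
    for stmt in ast.parse(source).body:
        # Look for `X / VAR` where VAR currently has a known call as source.
        for node in ast.walk(stmt):
            if (isinstance(node, ast.BinOp) and isinstance(node.op, ast.Div)
                    and isinstance(node.right, ast.Name)
                    and node.right.id in sources):
                alert_id = f"a{len(alerts) + 1}"
                alerts[alert_id] = {"line": node.lineno, "type": "divby0",
                                    "src": sources[node.right.id]}
        # Track `VAR = func(...)`; any other assignment clears the source.
        if isinstance(stmt, ast.Assign):
            call = stmt.value
            is_call = isinstance(call, ast.Call) and isinstance(call.func, ast.Name)
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    if is_call:
                        sources[target.id] = call.func.id
                    else:
                        sources.pop(target.id, None)
    return alerts


def actionable(alerts, true_positives):
    """Alerts labelled TP, plus alerts sharing type and source with a TP alert."""
    roots = {(alerts[a]["type"], alerts[a]["src"]) for a in true_positives}
    return sorted(a for a, info in alerts.items()
                  if (info["type"], info["src"]) in roots)


def prioritize(alerts, true_positives):
    """Order the not-yet-labelled alerts, derived-actionable ones first."""
    hot = set(actionable(alerts, true_positives))
    remaining = [a for a in alerts if a not in true_positives]
    return sorted(remaining, key=lambda a: (a not in hot, alerts[a]["line"]))


alerts = find_divby0_alerts(SOURCE)
print(actionable(alerts, {"a1"}))  # ['a1', 'a2']
print(prioritize(alerts, {"a1"}))  # ['a2']
```

In this sketch, labelling a1 as a true positive is the only developer input. Alert a2 becomes actionable because it shares its type and source with a1, which mirrors the query result reported in the example.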
