Learning Software Requirements Syntax: An Unsupervised Approach to Recognize Templates
Knowledge-Based Systems
journal homepage: www.elsevier.com/locate/knosys
Article history: Received 27 September 2021; Received in revised form 4 April 2022; Accepted 27 April 2022; Available online 4 May 2022.

Keywords: Requirements Engineering; Requirements templates recognition; Natural Language Processing (NLP); Syntax learning; Graph community detection

Abstract

Requirements are textual representations of the desired software capabilities. Many templates have been used to standardize the structure of requirement statements, such as Rupp, EARS, and User Stories. Templates provide a good solution to improve different Requirements Engineering (RE) tasks, since their well-defined syntax facilitates the text processing steps in RE automation research. However, many empirical studies have concluded that there is a gap between this RE research and its implementation in industrial and real-life projects. The success of RE automation approaches strongly depends on the consistency of the requirements with the syntax of the predefined templates. Such consistency cannot be guaranteed in real projects, especially in large development projects, or when one has little control over the requirements authoring environment.

In this paper, we propose an unsupervised approach to recognize templates from the requirements themselves by extracting their common syntactic structures. The resultant templates reflect the actual syntactic structure of requirements; hence the approach can recognize both standard and non-standard templates. Our approach uses techniques from Natural Language Processing and Graph Theory to handle this problem through three main stages: (1) we formulate the problem as a graph problem, where each requirement is represented as a vertex and each pair of requirements has a structural similarity; (2) we detect the main communities in the resultant graph by applying a hybrid technique combining limited dynamic programming and greedy algorithms; (3) finally, we reinterpret the detected communities as templates.

Our experiments show that the suggested approach can detect templates that follow well-known standards with a 0.90 F1-measure. Moreover, the approach can detect common syntactic features for non-standard templates in more than 73.5% of the cases. Our evaluation indicates that these results are robust regardless of the number and the length of the processed requirements.

© 2022 Elsevier B.V. All rights reserved.
https://doi.org/10.1016/j.knosys.2022.108933
R. Sonbol, G. Rebdawi and N. Ghneim Knowledge-Based Systems 248 (2022) 108933
However, many empirical studies concluded that there is a gap between RE task automation research and its implementation in industrial and real-life projects [18–20]. These automation approaches usually need the requirements to be represented based on specific templates [21]; hence their success strongly depends on the consistency of the requirements with the predefined templates [4,21]. Using a ‘‘template-based’’ approach to handle RE tasks could lead to lower precision when applied to new requirements written based on some variation of the predefined template, or on a completely different template. These situations are common in industrial projects [4,17–22]. Two main sources have been mentioned in the literature for this gap: (1) the template is insufficient to fully express the requirements of some industrial cases; (2) in some real-life projects, it is hard to control requirements authoring environments, especially in large development projects or when one has little control over the requirements authoring environment. The latter case is common when multiple organizations are involved in requirements writing, or when working on projects related to crossover services [4,18].

Focusing on the above problems, we set a research goal to develop an automated approach enabling template recognition from the requirements themselves. Our work is motivated by the perceived advantages and potential applications of recognizing templates:

(1) Each company has its specific jargon and standards [22]; thus, a tool which automatically recognizes the actually used (not the supposed) templates can recognize both standard and non-standard templates. We argue that RE tasks can achieve more real-life solutions when requirements representation and analysis start from the requirements themselves, not from standard or predefined templates.
(2) Understanding the syntax of requirements is an essential step to understand their semantics; recognizing the key syntactic components in requirements texts helps find more suitable semantic representations for requirements [23].
(3) Recognizing requirements syntax manually becomes a laborious task for large projects, since one project may contain a large number of requirements¹ [16,25].
(4) Detecting the syntax of requirements texts leads to building a supportive environment for requirements authoring; it can help in building more adaptive tools to detect syntactic inconsistency and to measure the quality of requirements.

In order to achieve our research goal, we developed an unsupervised approach to detect the syntactic structure of the requirements and to recognize their templates. The proposed solution starts by applying a set of Natural Language Processing (NLP) steps to process requirements and represent them as a graph. Each vertex in the graph represents one requirement, and each pair of vertexes has a structural similarity which reflects the common syntax of their related requirements. Then, the main communities in the resultant graph are detected by applying a hierarchical community detection algorithm. Each community consists of a set of requirements which share a common syntax. Finally, we extract templates based on the recognized communities.

Our work was guided by the following research questions:

RQ1: To what extent can we recognize templates automatically when requirements follow well-known standard templates?
RQ2: How effectively does our approach detect templates when handling requirements which do not follow one of the well-known templates, or do not follow any clear template?
RQ3: To what degree do the number of processed requirements and the length of requirement statements influence the experimental results?

The main contributions of this paper include:

(1) One of the first works which studies the syntax of templates and tries to recognize software requirements templates automatically.
(2) A methodology, based on NLP and graph theory, which can be applied to any set of requirements to detect common syntax.
(3) Public-access source code² to facilitate the replication of our study and to enable other researchers to build on our results.
(4) A data set of 82 lists of requirements labeled manually based on their templates. To our knowledge, this is the first large labeled data set in this domain.
(5) An experimental evaluation of the methodology to answer the above three research questions.

The rest of the paper is structured as follows: Section 2 provides a quick overview of Natural Language Processing (NLP), graph theory, and the community detection (CD) problem; these topics are used in the different stages of our approach. In Section 3, we provide an overview of the related works in the literature. Section 4 presents the suggested approach to extract templates automatically. Section 5 provides an initial evaluation of our work. Section 6 identifies the limitations and analyzes threats to validity. We conclude our paper in Section 7 by discussing our findings and future works.

2. Background

This section provides a brief overview of two key concepts used in our work: Natural Language Processing, which is used to analyze requirements texts, and community detection in a graph, which is used to group syntactically similar requirements to predict templates.

2.1. Natural language processing

Natural language processing is one of the main artificial intelligence disciplines. It aims to enable computer programs to ‘‘understand’’ and process natural language texts to achieve some specific goal [26,27]. NLP has been used in various domains to build applications like text classification, semantic relation detection, conceptual diagram extraction, semantic labeling, etc. In this section, we define the main NLP concepts which are related to our approach:

• Tokenization: the process of tokenizing or splitting a text into a list of tokens. Tokens can be words, numbers or punctuation marks. For example, the sentence ‘‘As an applicant, I want to submit Supporting Documentation’’ consists of 10 tokens: [‘‘As’’, ‘‘an’’, ‘‘applicant’’, ‘‘,’’, ‘‘I’’, ‘‘want’’, ‘‘to’’, ‘‘submit’’, ‘‘Supporting’’, ‘‘Documentation’’]

• Lemmatization: the process of finding the dictionary form, or the lemma, of each word. It is useful to group all inflected forms of a word in a single form (the lemma). For example, the lemma of ‘‘Supporting’’ and ‘‘Supported’’ is ‘‘Support’’.

¹ Although there are no previous works that define what number of requirements is considered large, some researchers [24] suppose that 50 requirements per project is considered a ‘‘large’’ number.
² https://zenodo.org/record/6525271
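The two steps above can be sketched in a few lines of Python. This is an illustrative simplification (a real pipeline would use a library such as NLTK or spaCy); in particular, the tiny suffix-stripping lemmatizer is an assumption for demonstration only, not the method used in the paper.

```python
import re

def tokenize(text):
    # Split a sentence into words, numbers, and punctuation marks.
    return re.findall(r"\w+|[^\w\s]", text)

def lemmatize(word):
    # Toy lemmatizer: strips a few common inflectional suffixes.
    # A real system would use a dictionary-based lemmatizer instead.
    for suffix in ("ing", "ed", "s"):
        if word.lower().endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

tokens = tokenize("As an applicant, I want to submit Supporting Documentation")
print(len(tokens))              # 10 tokens, as in the example above
print(lemmatize("Supporting"))  # Support
```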
Fig. 2. Overview of the proposed approach to recognize templates from requirements showing the key stages.
(2) Detecting main communities in the resultant graph by applying a hierarchical community detection algorithm. The output of this stage is a number of communities, each of which consists of a set of vertexes (i.e. requirements) which have a large degree of structural similarity.

(3) Reinterpreting the output of the community detection algorithm into a set of templates. We consider each community as a candidate template, and we define its structure using the community's related requirements.

The next subsections explain the detailed steps for each stage.

4.1. Formulating the problem as a graph problem

We convert the requirements list into a simple undirected graph, where each requirement r is represented as a vertex v in the graph, and each pair of requirements is represented as an edge e. We denote the resultant graph G(V, E), where V consists of all vertexes and E consists of all edges based on the previous definitions.

To formulate the task of template recognition as a community detection task, we define the following concepts:

4.1.1. Vertex structure

Let v ∈ V. The structure of vertex v, denoted by Γ(v), reflects the core syntactic structure of its related requirement r. Γ ‘‘blurs’’ the details of requirement statements to focus on their overall structure. It behaves like blurring techniques in image processing, which apply a set of image filters to hide the tiny details and focus on the overall ‘‘shape’’ of the image (Fig. 3).

Γ values can be retrieved by applying the following steps to requirements texts:

1. Text preprocessing:

(a) Tokenization: We split each requirement text into a list of words.
(b) POS-tagging: We use the Stanford POS-tagger [45] to tag each word with its suitable part-of-speech tag, which gives the syntactic role of the word (such as plural noun, singular noun, adverb, adjective, ...).
(c) Remove articles: We remove words that have the tag ‘‘DET’’, i.e. we remove all articles.
(d) Find top frequent words: Top frequent words, denoted by W, are the words that occur in more than a certain percentage N of the processed requirements. The threshold N is determined experimentally; in this work, we used N = 1/3, i.e. W contains the words that occur in at least one-third of the requirements. These words are expected to be part of templates' structures, like ‘‘as’’, ‘‘so’’, ‘‘that’’, ‘‘ability’’, ‘‘shall’’, etc.
(e) Noun phrase chunking: We extract noun phrases from each requirement using the OpenNLP toolkit [46]. A noun phrase is a noun plus all the words that surround and modify it, such as adjectives, relative clauses and prepositional phrases, for example ‘‘Car Alarm’’.

2. ‘‘Blurring’’ process:

For each vertex v, the final Γ(v) is retrieved by applying a set of blurring rules to v's related requirement. These rules hide the details that are related to the specific case described in the processed requirement, and focus on its overall syntactic structure. The blurring rules are as follows:

(a) All detected noun phrases are converted to empty slots. Applying this rule does not affect the general structure of the requirements, and therefore preserves the overall syntax of the desired templates. For example, all these sentences share the same overall structure: ‘‘Car alarm shall be inhibited’’; ‘‘Electrical and manual commands shall be inhibited’’.
(b) All verbs and nouns are converted to empty slots unless they are contained in the top frequent words W. These verbs and nouns are usually related to the specific functionality or constraint which is described in each requirement. We blur these details since they are not related to the desired syntax, such as ‘‘key’’ and ‘‘ignition’’ in Fig. 3.
(c) All remaining words are kept in their original form (as they appear in the requirement) without any blurring.
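The blurring rules can be sketched as follows. For simplicity, the sketch takes pre-tagged (token, POS) pairs with articles already removed, instead of running a tagger and a noun-phrase chunker; the Penn-Treebank-style tag names and the input data are our own illustrative assumptions. Because noun-phrase chunking is skipped, whole noun phrases are approximated by collapsing consecutive blurred words into one slot.

```python
def blur(tagged_tokens, top_frequent):
    """Apply the blurring rules to one requirement.

    tagged_tokens: list of (token, POS) pairs, articles already removed.
    top_frequent: the set W of words kept because they occur in at
    least one-third of the processed requirements.
    Nouns (NN*) and verbs (VB*) outside W become empty slots '[ ]';
    everything else is kept as-is (rule c).
    """
    out = []
    for token, pos in tagged_tokens:
        content_word = pos.startswith("NN") or pos.startswith("VB")
        if content_word and token.lower() not in top_frequent:
            # Rules (a)/(b): blur content words into empty slots,
            # collapsing consecutive slots (a rough stand-in for
            # blurring a whole noun phrase at once).
            if not out or out[-1] != "[ ]":
                out.append("[ ]")
        else:
            out.append(token)
    return " ".join(out)

W = {"shall", "be"}
tagged = [("Car", "NN"), ("alarm", "NN"), ("shall", "MD"),
          ("be", "VB"), ("inhibited", "VBN")]
print(blur(tagged, W))  # [ ] shall be [ ]
```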
Fig. 3. Example showing how Γ ‘‘blurs’’ the details of a requirement statement and focuses on its general syntactic structure.
Fig. 5. The two main steps to detect the main communities in the graph.
Γ(C^(1)) = Γ(α), where V(C^(1)) = {α}    (1)

Then, we generate larger communities and calculate their structures gradually in a bottom-up direction as follows:

Γ(C^(i+1)) = σ(Γ(C^(i)), α), where V(C^(i+1)) = V(C^(i)) ∪ {α}

We stop this process after k iterations, since the number of generated communities increases exponentially after each step. The number of possible communities at level k can be calculated as the binomial coefficient of the number of vertexes |V| and the level k, i.e. (|V| choose k): there are (|V| choose k) different ways to select k vertexes from |V| vertexes. Fig. 6 shows how the number of generated communities grows as the number of vertexes (i.e. the number of requirements) increases, for values of k ranging from 2 to 5. These charts clarify the need to set a limit on the dynamic programming step. For example, the 4th level includes more than 2 × 10⁵ communities when |V| = 50.

In our work, we set k = 3, and we considered the communities retrieved at that level as seed communities. These seed communities are passed to the next step, where a greedy algorithm is used to retrieve the final communities.

2. Greedy step: In this step, we apply a greedy technique to find the main communities based on the output of the dynamic programming step (see Fig. 5). The proposed algorithm is as follows:

1: Input: Seed Communities
2: Output: Final Recognized Communities
3: Q ← Seed Communities
4: output ← { }    ▷ This set will include the recognized templates
5: covered ← { }    ▷ List of covered requirements
6: while Q ≠ ∅ and V ⊄ covered do
7:     C ← MAX(Q)    ▷ Get the community that has the max internal similarity in Q
8:     Remove C from Q
9:     if C is a meaningful community then    ▷ Based on its definition in Section 4.1.3
10:        output.add(C)
11:        covered.add(V(C))
12:    else
13:        Find Cᵢ in Q which has the max structural similarity σ with C
14:        Remove Cᵢ from Q
15:        Merge Cᵢ with C and add the resultant community to Q
16:    end if
17: end while
18: return output

At the end of this procedure, we retrieve a set of communities, each of which has a set of related vertexes and a Γ(C) representing the common structure over these vertexes. Table 1 shows a sample output for this stage.
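The greedy step can be sketched as follows. This is our own simplification, not the paper's implementation: the structural similarity σ of two communities is approximated by a common token subsequence obtained from difflib, the internal similarity of a community is taken as |Γ(C)|, and the ‘‘meaningful’’ test uses the two thresholds reported in Section 6.2 (community size above 10% of all requirements, common structure longer than 3 tokens).

```python
from difflib import SequenceMatcher

def common_structure(a, b):
    # A common token subsequence via difflib's matching blocks;
    # a stand-in for the paper's structural-similarity function sigma.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    out = []
    for block in matcher.get_matching_blocks():
        out.extend(a[block.a:block.a + block.size])
    return out

def meaningful(community, total_requirements):
    # Thresholds taken from Section 6.2: size > 10% of all requirements
    # and a common structure longer than 3 tokens.
    gamma, members = community
    return len(members) > 0.1 * total_requirements and len(gamma) > 3

def greedy_merge(seeds, total_requirements):
    # seeds: list of (gamma_tokens, member_id_set) pairs from the DP step.
    queue = list(seeds)
    output = []
    covered = set()
    while queue and not covered.issuperset(range(total_requirements)):
        # Pop the community with the largest internal similarity |Gamma(C)|.
        queue.sort(key=lambda c: len(c[0]))
        c = queue.pop()
        if meaningful(c, total_requirements):
            output.append(c)
            covered |= c[1]
        elif queue:
            # Merge c with the most structurally similar remaining community.
            i = max(range(len(queue)),
                    key=lambda j: len(common_structure(c[0], queue[j][0])))
            other = queue.pop(i)
            queue.append((common_structure(c[0], other[0]), c[1] | other[1]))
    return output

seeds = [(["when", "[ ]", "is", "[ ]", ",", "I", "want", "to", "[ ]"], {0, 1}),
         (["when", "[ ]", "is", "[ ]", ",", "I", "want", "to", "[ ]"], {2, 3}),
         (["[ ]", "and"], {4})]
print(len(greedy_merge(seeds, 10)))  # the tiny third seed is dropped
```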
Table 3
General statistics about the final data set.

Number of sets: 82
Number of requirements: 8084
Avg. number of requirements per set: 99
Well-known templates used: Rupp, User Story, EARS, Use Case

Table 1
Sample output showing how the ‘‘Detecting main communities’’ stage divides a set of 35 requirement statements into 4 communities (empty slots are shown as [ ]).

Community   Γ(C)                                    Size(C)
C1          ‘‘if [ ] is [ ], then [ ] shall [ ]’’   10
C2          ‘‘while [ ] is [ ], [ ] shall [ ]’’     12
C3          ‘‘when [ ] is [ ], I want to [ ]’’       8
C4          ‘‘[ ] shall [ ]’’                        5
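A structure such as Γ(C3) in Table 1 can be matched against raw requirement text with a generated regular expression, once it has been padded with dummy slots as described in Section 4.3. The sketch below is our own illustration: the '[?]' slot marker, the helper name, and the example requirement are invented for demonstration.

```python
import re

def template_to_regex(template_tokens):
    """Build a regex from a Gamma with dummy slots; '[?]' marks a slot."""
    parts = []
    for tok in template_tokens:
        if tok == "[?]":
            parts.append(r"(.*?)")   # a slot may also match an empty chunk
        else:
            parts.append(re.escape(tok))
    # Literal tokens may be separated by whitespace in the requirement.
    return re.compile(r"\s*".join(parts) + r"$")

# Gamma(C3) with dummy slots A..G: "A when B is C , D I E want F to G"
gamma_c3 = ["[?]", "when", "[?]", "is", "[?]", ",", "[?]", "I",
            "[?]", "want", "[?]", "to", "[?]"]
rx = template_to_regex(gamma_c3)
m = rx.match("when the door is open , I want to lock it")
print(m.groups())  # ('', 'the door', 'open', '', '', '', 'lock it')
```

Slots that come back empty for every requirement (like slot A here) are the ‘‘fake’’ slots that Section 4.3 discards.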
4.3. Recognizing the syntax of possible templates

In this stage, we reinterpret the output of the CD algorithm into requirements templates. We consider each community as a candidate template, and Γ(C) is used to define the syntactic structure of each template. For each resultant community, the following steps are applied to find the representative template:

1. Insert dummy slots: The community structure Γ(C) is calculated based on the LCS; thus we cannot assume that all successive tokens in Γ(C) are also successive in the original requirements texts. For example, theoretically, we cannot guarantee that there are no chunks of text between ‘‘want’’ and ‘‘to’’ in all related requirements of C3 in the previous example (Table 1). For this reason, we add slots in all possible places to match any possible undetected chunk of text. Γ(C3) becomes:

Γ(C3) = ‘‘[A] when [B] is [C], [D] I [E] want [F] to [G]’’

This includes adding a slot between every two tokens (like slots ‘‘E’’ and ‘‘F’’) and at the beginning (slot ‘‘A’’).

2. Retrieve slot examples: We use a regular expression to slice each requirement based on the Γ of its community. Based on that, we can retrieve a set of examples for each slot. Table 2 shows sample matched chunks for the first three slots in Γ(C3). Note that slot A is a fake slot, since it is empty in all related requirements. These fake slots are ignored in the final representative template.

3. Analyze slot syntax: In this step, we can understand the syntax of each non-fake slot using its retrieved examples. Using the syntactic tags of these examples, we can decide whether a slot is verbal (like slot C in the last example) or nominal (like slot B).

Fig. 7 shows the result of applying the last 3 steps on the community C3.

5. Experiments and results

Our data sets were collected to satisfy the following criteria:

• Covering different levels of control over the requirements authoring environment, i.e. including both requirements with clear templates and requirements with no clear templates.
• Covering different templates, including both standard templates (such as user stories, Rupp) and non-standard ones (such as company-related templates, author-related templates).
• Covering different levels of conformance with the templates.
• Covering different set sizes in terms of the number of requirements in each set.
• Covering various domains (healthcare, e-commerce, ...).
• Covering both academic and industrial sources.

5.1.2. Data cleaning and standardization

The collected data sets have different formats: text files, PDFs, XMLs, and MySQL tables. We extracted the requirements texts from each of them, and prepared a text file for each set of requirements, where each line represents one requirement. Fig. 8 shows statistics about the resultant requirements sets. About 72% of these sets consist of more than 50 requirements, and more than 30% of the sets include more than 100 requirements (see Fig. 8(a)). In addition, the final sets include both short and long requirements in terms of the number of words per requirement; Fig. 8(b) shows the distribution of the sets in terms of the length of their requirements. About 75% of the data set comes from industrial sources; the rest comes from academic sources (Fig. 9). Table 3 provides general statistics about the final data set.

5.2. Data annotation

To evaluate our approach, we annotated each requirement with its matched template. We used 5 labels to annotate all sets: 4 of them for the well-known templates (User Story, Rupp, EARS, Use Case), and an additional label for the remaining cases (Others). The initial annotation stage was done by two annotators with business analysis experience. The inter-annotator agreement (Cohen's kappa) between the two annotators reaches 0.92, with a percentage agreement of 93.3%, which represents
Fig. 8. The distribution of different sets over the number of requirements and the average length of requirements.
Fig. 9. The distribution of different sets over sources.

Table 4
The distribution of data set requirements over templates.

Template     Number of requirements   Percentage
Rupp         3678                     45.5%
User Story   1855                     22.9%
EARS          235                      2.9%
Use Case       31                      0.4%
Others       2285                     28.3%

For each template Ti, the evaluation measures are defined as:

Precision(Ti) = (Number of requirements with T = Ti and T̂ matching Ti) / (Number of requirements with T̂ matching Ti)

Recall(Ti) = (Number of requirements with T = Ti and T̂ matching Ti) / (Number of requirements with T = Ti)

F1(Ti) = 2 · Precision(Ti) · Recall(Ti) / (Precision(Ti) + Recall(Ti))

These measures are widely used in similar problems. Precision gives an idea about the quality of the recognized templates, while recall gives an idea about the coverage of the results.
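The three measures can be computed per template label from the annotated pairs (T, T̂) as follows; the five toy pairs are invented for illustration only.

```python
def scores(pairs, label):
    """pairs: list of (annotated T, recognized T_hat) per requirement."""
    tp = sum(1 for t, t_hat in pairs if t == label and t_hat == label)
    fp = sum(1 for t, t_hat in pairs if t != label and t_hat == label)
    fn = sum(1 for t, t_hat in pairs if t == label and t_hat != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

pairs = [("Rupp", "Rupp"), ("Rupp", "Rupp"), ("Rupp", "Others"),
         ("User Story", "User Story"), ("Others", "Others")]
print(scores(pairs, "Rupp"))  # precision 1.0, recall 2/3, F1 0.8
```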
almost perfect agreement level [53]. Finally, a third annotator (the first author of this paper) resolved the conflicts (about 630 requirements) to produce the final data set.

Table 4 shows the number of requirements for each label. About 72% of the annotated requirements follow one of the well-known standard templates; the most used templates are Rupp and User Story. On the other hand, more than 28% of the data set requirements do not follow any well-known standard: about 37.5% of these follow different non-standard templates, while the remaining 62.5% have no clear standard. The final annotated data set is publicly available for research purposes.⁴

5.3. Evaluation methodology

For each set of requirements, we apply our approach to detect templates. The final output is a set of recognized templates, each of which is matched with a set of requirements. In our evaluation, we consider that a recognized template T̂ matches the template T based on these three cases:

5.4. Results

To answer our research questions, three experiments were conducted. We explain the details of each experiment below.

RQ1: To what extent can we recognize templates automatically when requirements follow well-known standard templates?

Table 6 provides detailed results for the recognized syntax when applying our approach to the prepared requirements sets. The results show that the automatically recognized templates match the manually annotated ones with a 0.90 F1-measure (0.92 precision and 0.89 recall). This percentage increases to more than 0.98 when requirements follow templates with more restrictions (such as User Story or Use Case), while it decreases to 0.89 when more flexible templates (like Rupp) are used.

The detailed values of precision and recall show that the approach recognizes well-known templates with perfect precision, i.e. whenever the approach recognizes a well-known template T̂, this template matches the manually annotated template T in all tested cases.

To check the stability of these results over all sets, we calculated the F1-measure values for each of the 82 sets separately.

⁴ https://github.com/...
Table 5
Examples of matched and not matched templates (slot positions are shown as bracketed letters).

T̂                                                         T            Evaluation
‘‘as [B], I want to know [G], so that I can [L].’’         User Story   Matched
‘‘[A] should be [C] to [D].’’                              Rupp         Matched
‘‘when [B] is [C], [F] I want to [G]’’                     EARS         Matched
‘‘[A] want to know [D], so that [J].’’                     User Story   Not matched
‘‘[A] and [B]’’                                            Others       Matched
‘‘Requirement definition: [D] system shall provide [G]
to [H]. requirement specification: [L] system will [N]
to [O]. origin: [S] priority: [U].’’                       Others       Matched
Table 6
Evaluation results.
Label Number of requirements TP FP FN Precision Recall F1-measure
Rupp 3678 2925 0 753 1.00 0.78 0.89
User Story 1855 1796 0 59 1.00 0.97 0.98
EARS 235 188 0 47 1.00 0.80 0.89
Use Case 31 30 0 1 1.00 0.97 0.98
Others 2285 2285 860 0 0.73 1.00 0.84
Weighted average 0.92 0.89 0.90
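As a sanity check, the weighted averages in the last row of Table 6 follow directly from the per-label rows, weighting each label's score by its number of requirements:

```python
# (label, number of requirements, precision, recall, F1) rows from Table 6
rows = [
    ("Rupp",       3678, 1.00, 0.78, 0.89),
    ("User Story", 1855, 1.00, 0.97, 0.98),
    ("EARS",        235, 1.00, 0.80, 0.89),
    ("Use Case",     31, 1.00, 0.97, 0.98),
    ("Others",     2285, 0.73, 1.00, 0.84),
]
total = sum(n for _, n, _, _, _ in rows)                    # 8084 requirements
w_precision = sum(n * p for _, n, p, _, _ in rows) / total
w_recall = sum(n * r for _, n, _, r, _ in rows) / total
w_f1 = sum(n * f for _, n, _, _, f in rows) / total
print(round(w_precision, 2), round(w_recall, 2), round(w_f1, 2))  # 0.92 0.89 0.9
```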
Table 7
Evaluating the non-standard cases.

Outcome                          Number of requirements   Percentage
Template is correctly detected   611                      73.5%
Template is partially detected   171                      20.1%
Template is not detected          49                       5.9%
Fig. 11. F1-measure values when dividing the data set based on the requirement length (# of words per requirement).
Fig. 12. F1-measure values when dividing the data set based on their set size.
new templates, (6) the ability to handle non-standard templates (organization-specific templates), and (7) the number of provided case studies.

Table 8 summarizes the main differences between the related works and our approach. As this comparison shows, our approach differs from other works in its focus: instead of handling the requirements based on pre-defined templates, we learn the templates from the requirements themselves. Also, using an unsupervised technique makes our approach more practical and efficient – compared to related works in the literature – when analyzing requirements without any previous knowledge about their authoring environments or the templates used in these requirements. In addition, we use 82 sets of requirements to validate the efficiency of our approach, which represents the largest set of case studies in the related literature.

6. Limitations and threats to validity

This section discusses the limitations and potential threats to validity of our methodology and experimental results.

6.1. Limitations

Our approach is applicable to any clean list of requirements: recognizing whether a statement is part of a requirement text or not is out of the scope of our work. Thus, in our experiments, we cleaned the used software requirements specification documents by selecting only the requirements sections.

6.2. Internal validity

A potential threat to internal validity arises from the fact that the evaluation data set was developed during this research. Several mitigation actions have been taken to control bias related to the data set:

1. We constructed our data set based on well-known requirements sets which have been used previously in various RE automation works [4,48–52].
2. Data annotation was carried out by two domain experts, and the differences were then rechecked. Our inter-annotator agreement analysis shows almost perfect agreement (more than 0.9), which provides confidence about the quality of the annotated data set.
3. A detailed guideline was made available to the annotators explaining the syntax of the used well-known templates.

Another potential threat to validity is the number of levels used in the dynamic programming step when detecting the main communities (Section 4.2). We apply this step on three levels and then pass the results (the seed communities) to the greedy step. The number of levels was chosen based on a theoretical analysis, since practical experiments cannot be done because of the complexity of the problem.

Moreover, another possible threat to internal validity arises from the two thresholds used to define meaningful communities: the community size, which should be more than 10% of the total number of requirements, and the internal similarity, which should be more than 3. However, we consider these thresholds acceptable since they only eliminate non-important communities, i.e. communities with a small number of requirements, or communities that do not have a significant common structure. Note that Γ(C) is used to construct templates, and the internal similarity |Γ(C)| equals the size of the template which represents C's requirements. Thus, the internal similarity threshold only eliminates templates with two tokens (one of them representing a slot), such as ‘‘[ ] and’’ or ‘‘admin [ ]’’.

6.3. External validity

We tested our approach on 82 sets of requirements covering different aspects of requirements: various syntactic structures,
Table 8
A comparative study with related works.

Covered (or supported) standard templates:
Arora et al. [4]: Rupp, EARS. Lucassen et al. [38]: User Story. Femmer et al. [41]: Rupp, User Story, Use Case. DODT [39]: Rupp. Our approach: Rupp, EARS, User Story, Use Case.

Needed effort to support new templates:
Arora et al. [4]: High; new patterns and rules need to be added for any new template. Lucassen et al. [38]: High; the suggested criteria are specific to user stories. Femmer et al. [41]: Low; it just applies a rapid checklist that is applicable for any template. DODT [39]: High; it requires a high-quality domain ontology to be developed first. Our approach: No additional work; the approach is dynamic and can learn new templates automatically.
different sources, different domains, different sizes, and different requirement lengths. Our experiments provide reasonable confidence that the quality of the results is preserved over these aspects. However, larger-scale empirical studies would help to improve the external validity of our approach.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
7. Conclusion

In this paper, we presented an automated approach to learn requirements syntax and recognize requirements templates. The proposed approach uses NLP techniques to extract the syntactic features of requirement statements, and uses a community detection algorithm to group them into coherent communities based on their syntactic similarity. These communities are then used to construct the final templates. Our experiments show that the suggested approach can detect well-known standard templates with a 0.90 F1-measure. This result depends on the template structure: it increases to about 0.98 for templates with strong restrictions (like User Story), and decreases to 0.89 for less restricted templates (like the Rupp template). The experimental results indicate that the F1-measure is approximately preserved regardless of the number and the length of the processed requirements. Moreover, the approach can detect common syntactic features for non-standard templates in more than 73.5% of the cases. For future work, we plan to investigate how we can use the retrieved templates to formulate a semantic representation of requirements. These syntactic and semantic representations may lead to more accurate techniques for different RE tasks.

CRediT authorship contribution statement

Riad Sonbol: Conceptualization, Methodology, Software, Data curation, Writing – original draft, Writing – review & editing. Ghaida Rebdawi: Supervision, Writing – review & editing. Nada Ghneim: Supervision, Writing – review & editing.

References

[1] ISO/IEC/IEEE, Systems and Software Engineering—Life Cycle Processes—Requirements Engineering, ISO, Switzerland, 2018.
[2] A. Chakraborty, M.K. Baowaly, A. Arefin, A.N. Bahar, The role of requirement engineering in software development life cycle, J. Emerg. Trends Comput. Inf. Sci. 3 (5) (2012) 723–729.
[3] J. Holtmann, J.-P. Steghöfer, M. Rath, D. Schmelter, Cutting through the jungle: Disambiguating model-based traceability terminology, in: 2020 IEEE 28th International Requirements Engineering Conference (RE), IEEE, 2020, pp. 8–19.
[4] C. Arora, M. Sabetzadeh, L. Briand, F. Zimmer, Automated checking of conformance to requirements templates using natural language processing, IEEE Trans. Softw. Eng. 41 (10) (2015) 944–968.
[5] F.S. Bäumer, M. Geierhos, Flexible ambiguity resolution and incompleteness detection in requirements descriptions via an indicator-based configuration of text analysis pipelines, 2018.
[6] F. Dalpiaz, I. Van der Schalk, G. Lucassen, Pinpointing ambiguity and incompleteness in requirements engineering via information visualization and NLP, in: International Working Conference on Requirements Engineering: Foundation for Software Quality, Springer, 2018, pp. 119–135.
[7] A. Aurum, C. Wohlin, Requirements engineering: setting the context, in: Engineering and Managing Software Requirements, Springer, 2005, pp. 1–15.
[8] T. Ambreen, N. Ikram, M. Usman, M. Niazi, Empirical research in requirements engineering: trends and opportunities, Requir. Eng. 23 (1) (2018) 63–95.
[9] X. Lian, M. Rahimi, J. Cleland-Huang, L. Zhang, R. Ferrai, M. Smith, Mining requirements knowledge from collections of domain documents, in: 2016 IEEE 24th International Requirements Engineering Conference (RE), IEEE, 2016, pp. 156–165.
[10] A. Ferrari, A. Esuli, An NLP approach for cross-domain ambiguity detection in requirements engineering, Autom. Softw. Eng. 26 (3) (2019) 559–598.
R. Sonbol, G. Rebdawi and N. Ghneim Knowledge-Based Systems 248 (2022) 108933
[11] C. Denger, D.M. Berry, E. Kamsties, Higher quality requirements specifications through natural language patterns, in: Proceedings 2003 Symposium on Security and Privacy, IEEE, 2003, pp. 80–90.
[12] B. DeVries, B.H. Cheng, Automatic detection of incomplete requirements via symbolic analysis, in: Proceedings of the ACM/IEEE 19th International Conference on Model Driven Engineering Languages and Systems, 2016, pp. 385–395.
[13] A. Umber, I.S. Bajwa, Minimizing ambiguity in natural language software requirements specification, in: 2011 Sixth International Conference on Digital Information Management, IEEE, 2011, pp. 102–107.
[14] J. Schumann, Generation of formal requirements from structured natural language, in: Requirements Engineering: Foundation for Software Quality: 26th International Working Conference, REFSQ 2020, Pisa, Italy, March 24–27, 2020, Proceedings, vol. 12045, Springer Nature, 2020, p. 19.
[15] A. Mavin, P. Wilkinson, A. Harwood, M. Novak, Easy approach to requirements syntax (EARS), in: 2009 17th IEEE International Requirements Engineering Conference, IEEE, 2009, pp. 317–322.
[16] K. Pohl, C. Rupp, Requirements Engineering Fundamentals, Rocky Nook Inc, 2011.
[17] Y. Wautelet, S. Heng, M. Kolp, I. Mirbel, Unifying and extending user story models, in: International Conference on Advanced Information Systems Engineering, Springer, 2014, pp. 211–225.
[18] Z. Liu, B. Li, J. Wang, R. Yang, Requirements engineering for crossover services: Issues, challenges and research directions, IET Softw. 15 (1) (2021) 107–125.
[19] A. Mavin, P. Wilkinson, S. Teufl, H. Femmer, J. Eckhardt, J. Mund, Does goal-oriented requirements engineering achieve its goal? in: 2017 IEEE 25th International Requirements Engineering Conference, RE, IEEE, 2017, pp. 174–183.
[20] U. Eklund, H.H. Olsson, N.J. Strøm, Industrial challenges of scaling agile in mass-produced embedded systems, in: International Conference on Agile Software Development, Springer, 2014, pp. 30–42.
[21] G. Fanmuy, A. Fraga, J. Llorens, Requirements verification in the industry, in: Complex Systems Design & Management, Springer, 2012, pp. 145–160.
[22] A. Ferrari, F. Dell'Orletta, A. Esuli, V. Gervasi, S. Gnesi, Natural language requirements processing: A 4D vision, IEEE Softw. 34 (6) (2017) 28–35.
[23] R. Sonbol, G. Rebdawi, N. Ghneim, Towards a semantic representation for functional software requirements, in: 2020 IEEE Seventh International Workshop on Artificial Intelligence for Requirements Engineering, AIRE, IEEE, 2020, pp. 1–8.
[24] S. Hatton, Early prioritisation of goals, in: International Conference on Conceptual Modeling, Springer, 2007, pp. 235–244.
[25] C. Arora, M. Sabetzadeh, L. Briand, F. Zimmer, R. Gnaga, RUBRIC: A flexible tool for automated checking of conformance to requirement boilerplates, in: Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering, 2013, pp. 599–602.
[26] D. Jurafsky, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, Prentice Hall, Upper Saddle River, N.J, 2000.
[27] V. Teller, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, MIT Press, Cambridge, MA, 2000.
[28] G.A. Pavlopoulos, M. Secrier, C.N. Moschopoulos, T.G. Soldatos, S. Kossida, J. Aerts, R. Schneider, P.G. Bagos, Using graph theory to analyze biological networks, BioData Min. 4 (1) (2011) 1–27.
[29] S. Fortunato, D. Hric, Community detection in networks: A user guide, Phys. Rep. 659 (2016) 1–44.
[30] M.E. Newman, Detecting community structure in networks, Eur. Phys. J. B 38 (2) (2004) 321–330.
[31] A. Clauset, C. Moore, M.E. Newman, Hierarchical structure and the prediction of missing links in networks, Nature 453 (7191) (2008) 98–101.
[32] N. Gulbahce, S. Lehmann, The art of community detection, BioEssays 30 (10) (2008) 934–938.
[33] A. Lancichinetti, S. Fortunato, Community detection algorithms: a comparative analysis, Phys. Rev. E 80 (5) (2009) 056117.
[34] E. Castrillo, E. León, J. Gómez, Fast heuristic algorithm for multi-scale hierarchical community detection, in: Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017, 2017, pp. 982–989.
[35] L. Zhao, W. Alhoshan, A. Ferrari, K.J. Letsholo, M.A. Ajagbe, E.-V. Chioasca, R.T. Batista-Navarro, Natural language processing for requirements engineering: A systematic mapping study, ACM Comput. Surv. 54 (3) (2021) 1–41.
[36] F. Nazir, W.H. Butt, M.W. Anwar, M.A.K. Khattak, The applications of natural language processing (NLP) for software requirement engineering - a systematic literature review, in: International Conference on Information Science and Applications, Springer, 2017, pp. 485–493.
[37] R. Sonbol, G. Rebdawi, N. Ghneim, The use of NLP-based text representation techniques to support requirement engineering tasks: A systematic mapping review, IEEE Access, under review (2022).
[38] G. Lucassen, F. Dalpiaz, J.M.E. Van Der Werf, S. Brinkkemper, Forging high-quality user stories: towards a discipline for agile requirements, in: 2015 IEEE 23rd International Requirements Engineering Conference, RE, IEEE, 2015, pp. 126–135.
[39] S. Farfeleder, T. Moser, A. Krall, T. Stålhane, H. Zojer, C. Panis, DODT: Increasing requirements formalism using domain ontologies for improved embedded systems development, in: 14th IEEE International Symposium on Design and Diagnostics of Electronic Circuits and Systems, IEEE, 2011, pp. 271–274.
[40] RQA: The Requirements Quality Analyzer Tool, https://www.reusecompany.com/rqa-quality-studio.
[41] H. Femmer, D.M. Fernández, S. Wagner, S. Eder, Rapid quality assurance with requirements smells, J. Syst. Softw. 123 (2017) 190–213.
[42] ISO, IEC, IEEE: ISO/IEC/IEEE 29148, systems and software engineering, life cycle processes, Requir. Eng. (2011).
[43] T. Stålhane, T. Wien, The DODT tool applied to sub-sea software, in: 2014 IEEE 22nd International Requirements Engineering Conference, RE, IEEE, 2014, pp. 420–427.
[44] M. Kamalrudin, N. Mustafa, S. Sidek, A template for writing security requirements, in: Asia Pacific Requirements Engineering Conference, Springer, 2017, pp. 73–86.
[45] P. Qi, Y. Zhang, Y. Zhang, J. Bolton, C.D. Manning, Stanza: A Python natural language processing toolkit for many human languages, 2020, arXiv preprint arXiv:2003.07082.
[46] J. Baldridge, The Apache OpenNLP project, 2005, URL: https://opennlp.apache.org.
[47] J.W. Hunt, T.G. Szymanski, A fast algorithm for computing longest common subsequences, Commun. ACM 20 (5) (1977) 350–353.
[48] A. Ferrari, G.O. Spagnolo, S. Gnesi, PURE: A dataset of public requirements documents, in: 2017 IEEE 25th International Requirements Engineering Conference, RE, IEEE, 2017, pp. 502–505.
[49] F. Dalpiaz, Requirements data sets (user stories), Mendeley, 2018, http://dx.doi.org/10.17632/7ZBK8ZSD8Y.1, URL https://data.mendeley.com/datasets/7zbk8zsd8y/1.
[50] E. Knauss, S. Houmb, K. Schneider, S. Islam, J. Jürjens, Supporting requirements engineers in recognising security issues, in: International Working Conference on Requirements Engineering: Foundation for Software Quality, Springer, 2011, pp. 4–18.
[51] G. Lucassen, M. Robeer, F. Dalpiaz, J.M.E. Van Der Werf, S. Brinkkemper, Extracting conceptual models from user stories with visual narrator, Requir. Eng. 22 (3) (2017) 339–358.
[52] J.H. Hayes, J. Payne, M. Leppelmeier, Toward improved artificial intelligence in requirements engineering: Metadata for tracing datasets, in: 2019 IEEE 27th International Requirements Engineering Conference Workshops, REW, IEEE, 2019, pp. 256–262.
[53] J. Pustejovsky, A. Stubbs, Natural Language Annotation for Machine Learning: A Guide to Corpus-Building for Applications, O'Reilly Media, Inc., 2012.