Non-Functional Requirements For Machine Learning: Challenges and New Directions
Abstract—Machine Learning (ML) provides approaches which use big data to enable algorithms to “learn”, producing outputs which would be difficult to obtain otherwise. Despite the advances allowed by ML, much recent attention has been paid to certain qualities of ML solutions, particularly fairness and transparency, but also qualities such as privacy, security, and testability. From a requirements engineering (RE) perspective, such qualities are also known as non-functional requirements (NFRs). In RE, the meaning of certain NFRs, how to refine those NFRs, and how to use NFRs for design and runtime decision making over traditional software is relatively well established and understood. However, in a context where the solution involves ML, much of our knowledge about NFRs no longer applies. First, the types of NFRs we are concerned with undergo a shift: NFRs like fairness and transparency become prominent, whereas other NFRs such as modularity may become less relevant. The meanings and interpretations of NFRs in an ML context (e.g., maintainability, interoperability, and usability) must be rethought, including how these qualities are decomposed into sub-qualities. Trade-offs between NFRs in an ML context must be re-examined. Beyond the changing landscape of NFRs, we can ask if our known approaches to understanding, formalizing, modeling, and reasoning over NFRs at design and runtime must also be adjusted, or can be applied as-is to this new area? Given these questions, this work outlines challenges and a proposed research agenda for the exploration of NFRs for ML-based solutions.

Index Terms—Non-Functional Requirements, NFRs, qualities, Machine Learning, Requirements Engineering

I. INTRODUCTION

Machine Learning (ML) describes a computational approach which uses large amounts of data to enable algorithms to “learn”, performing tasks which are difficult to achieve via standard software. This enables solving difficult problems such as recognizing images, diagnosing cancer, and estimating insurance [1]. Despite the advances allowed by ML, recently much attention has been paid to certain qualities of ML solutions, particularly fairness [2], but also transparency [3], security [4], privacy [5], and testability [6].

Much existing work has been devoted to understanding, decomposing, managing, formalizing and reasoning over qualities of typical non-ML software. Such qualities are often included as part of non-functional requirements (NFRs). These include well-studied NFRs such as performance, reliability, maintainability, and usability, but also security, privacy, and customer satisfaction [7], [8].

From the perspective of traditional software, the meaning of certain qualities, how to refine those qualities, and how to use such qualities for design and runtime reasoning is relatively well understood. However, when the software solution involves ML, some of our knowledge about NFRs may no longer apply. Fundamentally, the way in which we ‘design’, ‘run’, and ‘maintain’ ML-based solutions differs. The broad question of how SE methods and procedures can be adapted for ML-based solution development is already starting to be considered in venues such as the SEMLConf [9]. Here we focus particularly on methods for NFRs.

In particular, the nature of ML means that the meaning of many NFRs for ML solutions differs compared to regular software, and these NFRs are often not well understood (e.g., what is fairness? [10]). What does it mean for an ML-enabled system to be maintainable? Are NFRs such as compatibility and modularity still relevant? Some NFRs may have reduced importance for ML solutions compared to typical software. On the other hand, NFRs such as fairness [2] and transparency [3] have become critical from an ML perspective, whereas previous NFR work has not typically emphasized these dimensions. Further, as-yet-unexplored NFRs such as “retrainability” may also become relevant.

The complexity of NFRs has long been managed by refinement, e.g., security is typically refined to confidentiality, integrity, etc. Not only may the meaning of certain NFRs change in an ML context, but the refinements may also need to be rethought and updated. In typical NFR research, we are aware of common quality trade-offs, often called conflicting NFRs [7], e.g., security and performance. But recent work is only just beginning to explore quality trade-offs in the ML space [2]. Do known trade-offs still apply in the case of ML? Do new trade-offs exist?

In a traditional system, one can collect and implement many functional requirements (FRs). The overall function or purpose of an ML application is much more focused; e.g., recognize a face or diagnose a disease. Thus, there are far fewer FRs, and ML research has focused on the NFRs associated directly with those key FRs, e.g., accuracy of facial recognition, performance of diagnosis. Because an ML application has few FRs, one can argue that the effective satisfaction of NFRs becomes particularly critical. However, in practice, ML implementations will be integrated with more standard software as part of larger and more complex systems (e.g., in a self-driving car), which, as a whole, will have many complex FRs and NFRs.

In this paper, we consider whether traditional knowledge about NFRs and quality from a requirements perspective can apply to ML-based systems. We can view this knowledge
from two dimensions: 1) knowledge of NFRs, i.e., what are common and important NFRs, how are they interpreted, refined, measured, and how they can conflict, and 2) methods for NFRs, i.e., catalogues of NFRs [8], modeling methods like the NFR framework [7], methods to reason over NFRs, e.g., [11], [12], to use NFRs to monitor software, e.g., [13], and drive software adaptation and evolution, e.g., [14], [15], etc. In this work, we make the argument that the first dimension (1) must be at least partially re-thought in light of the rise of ML. Many of the ideas and techniques considering NFRs for traditional software (2) may still be valid, but it is possible that techniques may need revamping in light of this new paradigm, or that new, completely novel techniques are needed.

The next section outlines the state-of-the-art, followed by an illustrative and motivating ML example. Research challenges and an agenda are outlined, followed by a discussion and consideration of future work.

II. STATE-OF-THE-ART

A. NFRs in Requirements Engineering

Requirements Engineering (RE) research has long made the argument that eliciting and considering NFRs is critical for the success of systems [7]. Such systems could be technically sound, but fail due to issues in quality. Such an argument is particularly relevant for ML solutions, whose effectiveness lies mainly in the quality of the outcomes they provide.

What is an NFR? Simply, an NFR is any quality or attribute which is non-functional. This broad definition, defining something critical in terms of what it is not, is not ideal, as has been discussed by several authors, e.g., [16], [17]. Our purpose here is not to define NFRs in a satisfactory way, but to explore their application to ML. The concept of quality has had better luck in terms of a precise definition, being covered by several prominent ontologies, e.g., DOLCE [18]. More recent work in RE uses ideas from DOLCE to treat NFRs as qualities over an entity [19], usually a functional requirement, the system, or a system component, e.g., “send mail (entity) quickly (quality)” or “the system (entity) should be secure (quality)”.

Although qualities of ML solutions and NFRs for ML solutions are similar, technically one can think of NFRs as the requirements over the quality, e.g., the quality is usability of system X, while the NFR is “System X must be usable”, which ideally should be defined in a measurable way, e.g., “90% of test users would rate the system as an 8/10 in terms of usability”. Although there is a distinction, for simplicity, in this work we treat NFRs and qualities as synonyms.

NFR Catalogues. To facilitate a consideration of NFRs, catalogues of software qualities were created. For example, the ISO/IEC 25010 standard divides system/software product quality into eight categories, including performance efficiency, compatibility, usability, and security [8]. Each quality is further decomposed; e.g., compatibility is refined into co-existence and interoperability. Such catalogues provide iterative refinement of NFRs into sub-qualities, possibly sub-sub-qualities, sometimes down to measurable indicators, when possible.

Languages and Reasoning. Much work focuses on capturing NFRs in visual modeling languages, sometimes with an underlying metamodel and semantics, facilitating (semi-)automated qualitative and quantitative methods to support decision making, e.g., [7], [11], [12]. Usually, approaches allow users to use NFRs to select among possible alternative functional requirements, e.g., given FRs and NFRs, many of which are in conflict, which requirements should we implement?

Runtime, Adaptation, and Evolution. NFR approaches were extended to consider a requirements-based view of runtime system operation, where functional and quality requirements could be monitored at runtime, based on data from the running system, e.g., [13], [20]. Work in this area went further to consider requirements-based runtime adaptation, e.g., a certain quality aspect is not sufficiently satisfied at runtime, thus the system will evolve and adjust to try to gain better performance or quality, all while considering quality trade-offs [14], [15].

Linking Data to Quality. A related line of work uses an adaptation of common requirements notations to link business data to organizational goals, including qualities [21], allowing for continuous goal-based business intelligence. More recent work focuses on the design of data analytic systems for business, which may include ML algorithms [22], [23]. This work focuses on finding designs which fit domain-specific analytic questions, considering aspects of quality performance for various ML options. In this case, the authors adapt existing RE languages to consider data analytics at the syntax level (no formal semantics or metamodel), and they use existing analysis procedures without modification.

B. Qualities for Machine Learning

ML encompasses over a dozen algorithm types (e.g., Regression, Bayesian, Instance-based, Deep Learning, Neural Networks), with many more specific algorithms (e.g., Logistic Regression, Linear Regression, Naïve Bayes, Nearest Neighbor) [1]. Most work on ML topics provides examples and algorithm details, including performance results, but does not focus on a wide range of NFRs or quality aspects. We summarize a selection of current work considering NFRs for ML in the following.

Accuracy & Performance. Most ML work reports on algorithm accuracy (often precision and recall), i.e., how “correct” the output is compared to reality. Further work looks more broadly at algorithm performance (e.g., [24]), including comparisons of performance in specific contexts (e.g., [25]).

Fairness. Recent work has focused on technical solutions to make ML algorithms more fair, finding that the removal of sensitive features (e.g., race, gender) is not sufficient to ensure fair results, and considering the trade-off between fairness and other NFRs [2]. Work in this area has attempted to find mathematical or formal definitions of fairness, e.g., statistical parity, individual fairness, and has found that the accurate implementation of fairness depends more on how fairness is defined and measured than how it is implemented [26]. Empirical work has asked practitioners about their needs for fairness in ML, finding that in practice, engineers want to consider the side effects of fairness and see fairness in the context of the broader system [27].
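As a minimal sketch of how two of the qualities above could be made measurable, the following computes precision and recall (the accuracy measures mentioned under Accuracy & Performance) and a statistical parity difference (one of the formal fairness notions mentioned under Fairness). The function names, the binary 0/1 labels, and the two-group encoding of the sensitive attribute are assumptions made for illustration; they are not taken from the cited works, and other formalizations of fairness would give different numbers.

```python
from typing import Sequence, Tuple


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Precision and recall for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def statistical_parity_difference(y_pred: Sequence[int], group: Sequence[int]) -> float:
    """Positive-prediction rate of group 1 minus that of group 0.

    Assumes a hypothetical binary sensitive attribute encoded as 0/1.
    A value of 0.0 means both groups receive positive predictions at the same rate.
    """
    rates = []
    for g in (0, 1):
        preds = [p for p, a in zip(y_pred, group) if a == g]
        if not preds:
            raise ValueError(f"no samples for group {g}")
        rates.append(sum(preds) / len(preds))
    return rates[1] - rates[0]


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
    group = [0, 0, 0, 0, 1, 1, 1, 1]
    print(precision_recall(y_true, y_pred))
    print(statistical_parity_difference(y_pred, group))
```

On this toy data the classifier makes no false-positive errors, yet the two groups receive positive predictions at different rates, which hints at why accuracy-oriented and fairness-oriented NFRs may need to be considered together.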
[Figure: tree with OR-refinements across the levels Problem Type (Classification, refined into Binary or Multi-class; Regression; Novelty Detection; ...), Algorithm Characteristics (Supervised, Unsupervised, Semi-, Active, Reinforcement, ...), Algorithm Type (Regression, Bayesian, Instance-based, ...), Algorithm (Logistic, Linear, ..., Naive ..., Nearest neighbour, ...), and Assumptions and Optimizations.]

Transparency. Although the results of ML can have significant real-world impact, it is often not clear how these results are derived, causing issues in trust and transparency. Work has begun to look at better explaining ML results [3], [28] to try to mitigate this issue.

Security & Privacy. Efforts have been made to address privacy concerns when using big (often personal) data to facilitate ML. Work in [4] introduces protocols for preserving privacy in various ML approaches, and explicitly acknowledges the trade-off in terms of algorithm speed when revising techniques for privacy. Similarly, Bonawitz et al. introduce a method to