2016 3 1 2 Haenisch
By Till Haenisch
This paper presents a case study which shows that the code size and complexity
of a system that collects and interprets sensor data in an Internet of Things
scenario can be reduced using functional programming techniques. On the one
hand this is especially important for security reasons: such a system must run
for a long time without an effective way to distribute software patches. On the
other hand, in this kind of system the consequences of a malfunction (intended
or not) are much more critical than in standard computing situations, because
real-world buildings or industrial sites are affected. From a high-level
perspective, the data processing at the base station of such a sensor network
can be considered as a set of mathematical functions operating on a stream of
values. Each function creates a new stream of values, which might be processed
by another function. This means that the complete functionality can easily be
described and programmed in a functional language such as Elixir, Erlang or
Scala.
Introduction
Vol. 3, No. 1 Haenisch: A Case Study on Using Functional Programming...
efforts to develop methods for creating such systems, there are no practical
solutions yet. The Defense Advanced Research Projects Agency (DARPA) has
identified this option of addressing security problems and started the Cyber
Grand Challenge in 2013 to stimulate research on self-healing networks
(DARPA, 2013). In 2015 DARPA launched an initiative called Building
Resource Adaptive Software Systems (BRASS) to build "software systems and
data to remain robust and functional in excess of 100 years" (DARPA, 2015).
The only way to achieve this is to enable self-adaptive systems which adapt
themselves to changing environments. While this might lead to interesting
results in the future, this solution is not available for current systems.
Without usable techniques to automatically solve security problems, it is
desirable to keep the number of bugs close to zero. One way to lower the
number of bugs is to keep code size small and complexity low. Fewer lines of
code and lower coupling, especially as few side effects as possible, mean
fewer bugs. The question is how to achieve that.
In the simplest case, and only this case will be considered here, IoT means
that things talk to the internet. There are two common architectures for this
kind of system. The first and simplest is a sensor node that is directly
connected to the internet, typically by WLAN (Figure 1a). This requires a
WLAN interface and, in most cases, an operating system that provides the
necessary functionality. Typical hardware platforms for these kinds of
applications are Raspberry Pi, Intel Galileo or Carambola, running some kind
of Unix or Windows OS. These systems are flexible and powerful; however,
they require a continuous power supply, since their energy consumption of up
to 15 watts (Reese, 2015) cannot be supplied by batteries.
Athens Journal of Technology & Engineering March 2016
[Figure: diagram with elements labelled Sensor, Graphics and Database]
set contains only one value. Thus, the functionality of a processing node for an
IoT application can be considered as a set of mathematical functions operating
on a stream of values (Newton and Welsh, 2004). Each function creates a new
stream of values, which might be processed by another function (see Figure 2).
This means that the complete functionality can easily be described and
programmed in a functional language like Elixir, Erlang, Scala or Haskell.
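The view of processing as functions over streams can be illustrated with a
minimal sketch. It is written in Python rather than in one of the functional
languages named above, purely for illustration. The function names
(to_celsius, moving_average, above), the scaling of the raw values and the
sample readings are assumptions and are not taken from the case study.

```python
from collections import deque
from typing import Iterable, Iterator


def to_celsius(raw: Iterable[int]) -> Iterator[float]:
    """Map raw readings to degrees Celsius (assumed unit: tenths of a degree)."""
    for value in raw:
        yield value / 10.0


def moving_average(values: Iterable[float], window: int) -> Iterator[float]:
    """Create a new stream holding the mean of the last `window` values."""
    recent = deque(maxlen=window)
    for value in values:
        recent.append(value)
        yield sum(recent) / len(recent)


def above(values: Iterable[float], limit: float) -> Iterator[float]:
    """Create a new stream containing only the values above `limit`."""
    return (v for v in values if v > limit)


# Each function consumes one stream and produces another, so they compose.
raw_readings = [200, 220, 240, 260]  # hypothetical sample data
smoothed = list(moving_average(to_celsius(raw_readings), window=2))
alerts = list(above(smoothed, limit=22.0))
```

Because every stage only reads its input stream and yields a new one, stages
can be added, removed or reordered without touching the others.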
There is considerable debate about the advantages of using functional
programming languages or at least functional programming techniques. Many
languages adopt functional features to allow functional techniques to be used
in the preferred environment, see for example (Subramaniam, 2014). This debate
is not new (Gat, 2000). Gat's classic experiment showed that many properties
of programs, such as programmer productivity and performance, were better when
the programs were written in Lisp, a very old functional language, than when
they were written in Java, a then-modern imperative language.
According to (Wortmann and Flüchter, 2015), Internet of Things platforms
have to be open, simple and prospective, with functional programming being one
of the key features. Platforms like Erlang/OTP have these properties and
should therefore be considered candidates for these applications.
Functional programming languages (and their environments like
Erlang/OTP) are well suited to writing reliable, highly concurrent
applications that must cope with many concurrent processes and, especially,
with process failures (Armstrong, 2010). Writing such applications was the
reason for developing the Erlang ecosystem for telecommunication systems such
as phone exchanges.
The same reasons for using functional languages in these environments apply
in IoT scenarios: concurrent event sources (e.g. sensor modules), unreliable
communication with spurious errors caused by wireless data transmission, and
a system that has to work highly reliably despite any of these problems. For
a discussion of the relevance of these properties in IoT applications see
(Sivieri et al., 2012). Even if some sensors in a building or a factory
setting are not working correctly, the data flow and data transformation must
continue at least with the undisturbed data; the main control flow must not be
affected by errors in other parts of the system. Nobody would tolerate a
building where the lights cannot be turned on because a thermostat node has
crashed.
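The requirement that a failing sensor must not disturb the rest of the system
can be sketched as follows. This is a plain Python illustration of containing
failures per sensor; it is not Erlang/OTP supervision, and the names
(parse_reading, process_all) as well as the text packet format are
hypothetical.

```python
from typing import Dict, List, Tuple


def parse_reading(packet: str) -> float:
    """Parse one packet; raises ValueError on malformed data (assumed format)."""
    return float(packet)


def process_all(
    packets: Dict[str, List[str]],
) -> Tuple[Dict[str, List[float]], List[str]]:
    """Process each sensor independently so one faulty sensor cannot stop the others."""
    results: Dict[str, List[float]] = {}
    failed: List[str] = []
    for sensor, raw in packets.items():
        try:
            results[sensor] = [parse_reading(p) for p in raw]
        except ValueError:
            # The failure is recorded and contained; the loop carries on.
            failed.append(sensor)
    return results, failed


results, failed = process_all({
    "light": ["1.0", "0.0"],
    "thermostat": ["21.5", "garbage"],  # hypothetical corrupted packet
})
```

In this sketch the light data is still processed although the thermostat
delivers a corrupted packet, which is the behaviour demanded above.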
But this is not the most important point for choosing functional languages.
An even stronger advantage of functional languages is that the code for
transformations like the ones described above is much more concise than with
traditional imperative languages. Although there is no formal proof for this
assumption, there is a large number of anecdotal cases, for example from
(Ford, 2013) or the case study described in a later section of this paper. An
impressive case is John Carmack from id Software, who reimplemented
Wolfenstein 3D in Haskell and found, besides other promising benefits, that
the code size was reduced significantly (Carmack, 2013).
Short code without side effects (pure functional languages do not have side
effects) is easier to verify for correctness than imperative code. That means
it tends to contain fewer errors. While there is a significant but only small
correlation
between the programming language and the error rate, there is a clear
dependency between code size and error rate (Ray et al., 2014). Since
programs written in functional languages tend to be shorter than programs
written in imperative languages, they should contain fewer errors.
Fewer errors mean fewer security problems, which is the main point.
Internet of Things applications have a direct relation to the real world.
Security problems in this context do not merely mean damaged files on a disk,
which might be restored from a backup, but cause damage and/or monetary loss
in the real world.
Another advantage of functional programming techniques is that they
reduce side effects. This is the main idea of functional programming:
composing a program from "pure" functions. Security-wise this is a good idea,
especially in embedded scenarios. The Industrial Internet Reference
Architecture (IIC, 2015) recommends to "avoid introducing unknown or
undesired side effects".
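The difference between a function with side effects and a "pure" function can
be made concrete with a small sketch. The calibration example, the names
(calibrate_impure, calibrate_pure) and the values are invented for
illustration only.

```python
from typing import List

offsets: List[float] = []  # shared mutable state used by the impure variant


def calibrate_impure(value: float) -> float:
    """Result depends on, and changes, hidden global state (a side effect)."""
    offsets.append(0.5)
    return value + sum(offsets)


def calibrate_pure(value: float, offset: float) -> float:
    """Result depends only on the arguments; nothing outside is touched."""
    return value + offset


# The same call gives different answers for the impure variant ...
impure_results = [calibrate_impure(20.0), calibrate_impure(20.0)]
# ... but always the same answer for the pure one.
pure_results = [calibrate_pure(20.0, 0.5), calibrate_pure(20.0, 0.5)]
```

The pure variant can be tested with a single input and output pair, whereas
the impure variant can only be checked together with the history of all
earlier calls.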
To limit the possible damage caused by security problems in IoT applications,
it is either necessary to develop and deploy a widely accepted platform that
has few bugs and is constantly updated throughout the world, like for example
Apple's iOS, or we need as much diversity in these systems as we can get to
reduce the risk of a complete failure (Schneier, 2010). Software diversity is a
promising way to achieve anti-fragile systems (Hole, 2015). Lacking other
accepted technical solutions, that means individually developed software with
as few bugs as possible. And that means short, simple programs which are
easy to test and verify.
In the following sections a case study is presented which shows that the
code size and complexity of systems which collect and interpret sensor data
in an IoT scenario can be reduced using functional programming techniques.
Case Study
The case discussed in this paper is a low-cost, low-power sensor network to
save energy in paper machines (Hänisch, 2014). By using wireless sensors for
measuring temperature and humidity in the dryer section of a paper machine it
is possible to optimize energy consumption by adjusting heating and air flow.
Because the sensors need to be battery powered and must send the data almost
in real time for monitoring purposes, a low-power network technology is
needed, in this case ZigBee. The data is sent to a base station in packets
with no guaranteed delivery, resulting in at-most-once semantics. This adds
some complexity to the base station code, which consists mainly of error
handling and monitoring or logging functions.
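With at-most-once delivery, packets may be lost but are never repeated, so the
base station has to notice missing data itself. One possible way to do this is
sketched below in Python. It assumes that every packet carries an increasing
sequence number, which the description above does not state; the function name
find_gaps is hypothetical as well.

```python
from typing import Iterable, List, Optional, Tuple


def find_gaps(sequence_numbers: Iterable[int]) -> List[Tuple[int, int]]:
    """Return (first_missing, last_missing) ranges between received packets."""
    gaps: List[Tuple[int, int]] = []
    previous: Optional[int] = None
    for number in sequence_numbers:
        if previous is not None and number > previous + 1:
            gaps.append((previous + 1, number - 1))
        previous = number
    return gaps


received = [1, 2, 5, 6, 9]  # hypothetical sequence numbers seen by the base station
gaps = find_gaps(received)
```

The detected ranges could then be passed to the logging or monitoring
functions mentioned above.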
In this article, only the code running on the base station (see Figure 3) will
be considered. The code on the sensor nodes mainly handles communication
with the sensor hardware and has a very simple structure, since no data is
stored locally. In more complex cases this part could also be implemented using
Figure 6. Complexity Measures of the Ruby and Elixir Versions with the Same
Functionality
The original version (Ruby and C) showed a number of critical errors which
were hard to find because they occurred only rarely. This led to a second,
simplified Ruby version, which worked over a period of nearly two years with
only two non-critical bugs. The Elixir version has been running for only a few
months and has shown no bugs so far.
Discussion
The problems with a study with n=1 are well known, see for example
(Harrison, 2000). But experiments in software engineering are hard to do:
controlled experiments with n>1 give better results if and only if both samples
are drawn from the same basic population, and this basic population must be
representative of the real world. This is the problem with the controlled
experiment approach. Usually experiments are done with student volunteers,
but it would be difficult to find students who have the same level of
experience in both languages, Ruby and Elixir in this case. Typically someone
who knows those two languages has far more experience with Ruby than with
Elixir, since Elixir is newer. Programmers who know Elixir, Lisp, Scala or
Erlang will tend to have a more theoretical background than the typical
developer of embedded systems, but much less practical experience. So even an
experiment with a large number of participants would be of limited use for
real projects.
The size of the example described above is much smaller than that of typical
industrial projects. So the only firm conclusion that may be drawn from this
case is that further, larger experiments are needed. On the other hand, the
processing of data in Internet of Things scenarios might, and probably should
(Namiot and Sneps-Sneppe, 2014), be implemented as microservices with a
size comparable to the case described here.
Both of these points are valid, but controlled experiments with realistic
project sizes are very hard to do: the group of people who would volunteer to
work for a few years on a software project that is developed by a large number
of other teams concurrently, just to get some statistically valid data about
program complexity, is limited and certainly not representative of real-world
software engineers. So this problem is practically unsolvable and we will have
to stay with small n=1 case studies.
Using the cyclomatic complexity as a measure for the expected number of
errors in code is debatable, see for example (Abran et al., 2004). On the other
hand it is widely accepted and used in tools to measure complexity for exactly
this purpose. In conclusion, the correlation might not be absolutely proven,
but in real-world experience it works and it is plausible: the more paths
there are in the code, the harder it is to understand and test, and the harder
it is to understand and test, the more errors it contains.
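To make the measure concrete: for a single structured routine, the cyclomatic
complexity equals the number of decision points plus one. The following Python
sketch approximates it by counting branching constructs in Python source code.
It is a simplified illustration and not the tool used in this case study; real
tools differ in which constructs they count, and the function name
cyclomatic_complexity is chosen here for illustration.

```python
import ast

# Node types that open an additional path through the code.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe's number as 1 + the number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two short-circuit branches.
            complexity += len(node.values) - 1
    return complexity


straight_line = "def f(x):\n    return x + 1\n"
branching = (
    "def g(x):\n"
    "    if x > 0 and x < 10:\n"
    "        return 1\n"
    "    for i in range(x):\n"
    "        pass\n"
    "    return 0\n"
)
simple_score = cyclomatic_complexity(straight_line)
branchy_score = cyclomatic_complexity(branching)
```

The straight-line function has a single path, while the second function gains
one path each for the if, the and, and the for loop.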
Conclusions
References
Abran, A., Lopez, M., and Habra, N. 2004. An Analysis of the McCabe Cyclomatic
Complexity Number, Proceedings of the 14th International Workshop on Software
Measurement (IWSM) IWSM-Metrikon, 2004, Magdeburg, Germany:
Springer-Verlag, 391-405.
AWS IoT 2015. aws.amazon.com/iot.
Armstrong, J. 2010. Erlang, Communications of the ACM 53(9).
Bin Tang, C. 2015. Explore MQTT and the Internet of Things service on IBM
Bluemix. http://ibm.co/1LDiJFD.
Carmack, J. 2013. Keynote at QUAKECON 2013, http://bit.ly/1Ij5u2a.
DARPA 2013. DARPA Cyber Grand Challenge Competitor Portal.
https://cgc.darpa.mil.
DARPA 2015. DARPA Seeks to Create Software Systems That Could Last 100 Years.
http://bit.ly/1aLNazw.
Ford, N. 2013. Functional thinking: Why functional programming is on the rise.
http://ibm.co/1jsUymL.
Gat, E. 2000. Point of view: Lisp as an alternative to Java, Intelligence 11(4): 21-24.
Hänisch, T. 2014. Using a Sensor Network for Energy Optimization of Paper Machine
Dryer Sections, Athens Journal of Technology & Engineering 1(3).