Compressed Data-Stream Protocol An Energy-Efficient

The document presents a compressed data-stream protocol called CDP for energy-efficient data collection in wireless sensor networks. CDP uses a generalised predictive coding algorithm that can support both lossless and lossy data compression. It is designed to reduce packet overhead through a novel data stream concept. The performance of CDP is evaluated using simulations with real-world sensor data.

www.ietdl.org
Published in IET Communications
Received on 1st February 2011
Revised on 8th August 2011
doi: 10.1049/iet-com.2011.0118

Special Section: Green Technologies for Wireless Communications and Mobile Computing

ISSN 1751-8628

Compressed data-stream protocol: an energy-efficient compressed data-stream protocol for wireless sensor networks
N. Erratt Y. Liang
Department of Computer and Information Science, Indiana University, Purdue University, Indianapolis, IN 46202, USA
E-mail: [email protected]

Abstract: In this study, the authors present an energy-efficient data compression protocol for data collection in wireless sensor networks (WSNs). WSNs are fundamentally constrained by the motes' limited battery power and the network's bandwidth. The authors focus on data compression algorithms and protocol development to effectively support data compression for data gathering in WSNs. Their design of the compressed data-stream protocol (CDP) is generic in the sense that other lossless or lossy compression algorithms can be easily 'plugged' into the proposed protocol system without any changes to the rest of the CDP. This design is intended to support a variety of WSN applications in which users may prefer more specific compression algorithms, tailored to the characteristics of the sensing data in question, to the authors' general algorithm. CDP not only significantly reduces the energy consumption of data gathering in multi-hop WSNs, but also reduces sensor network traffic and thus helps to avoid congestion. The proposed CDP is implemented on the TinyOS platform using the nesC programming language. To evaluate their work, the authors conduct simulations via TOSSIM and PowerTOSSIM-z with real-world sensor data. The results demonstrate the significance of CDP.

IET Commun., 2011, Vol. 5, Iss. 18, pp. 2673–2683
© The Institution of Engineering and Technology 2011

1 Introduction

Wireless sensor networks (WSNs) are increasingly important for enabling continuous monitoring in many fields, including environmental sciences, water resources, ecosystems, structural health and health-care applications. In many such applications, a large amount of observation data in a monitoring sensornet needs to be transferred to data sink(s) for analysis (e.g. [1–5]). Consisting of a large number of tiny, battery-powered, autonomous sensor nodes (motes), sensornets are fundamentally constrained by the motes' limited energy and communication bandwidth. Energy-efficient technologies, such as energy-efficient communications, can not only fundamentally address sensornets' power limitations but also foster environmental sustainability and the economics of energy efficiency. In this paper, we present the design and implementation of CDP, a compressed data-stream protocol for energy- and bandwidth-efficient data collection in WSNs, which simultaneously addresses the challenges of both energy limitation and bandwidth constraint in sensornets.

A number of collection protocols have been proposed in the area of WSNs, including the collection tree protocol (CTP) [6, 7], Flush [8], Fetch [4], Wisden [2] and Fusion [9]. These protocols focused on reliable data transport in WSNs to address wireless link dynamics, and rate and congestion control, but none of them considered a data compression approach. On the other hand, most existing works on data compression algorithms for sensornets are focused on the algorithmic level and are only examined by numerical simulations (e.g. [10–12]). Notably, S-LZW [13] is a novel sensor version of the well-known dictionary-based lossless compression algorithm Lempel-Ziv-Welch (LZW) [14], and was implemented as a specific application for some targeted scenarios. The experimental results of [13] clearly demonstrate the advantages of the data compression approach for energy savings in a real-world sensornet testbed. Despite those works, there are still some concerns about the merit of the data compression approach in sensornets for energy conservation, in the sense that the packet overheads and additional computations for data compression might eliminate the gain achieved by data compression. Such concerns appear to stem, to a large degree, from the fact that no general transport protocol based on data compression has been developed for data gathering in WSNs. This motivates our work. We investigate if and how data compression can be effectively supported in general WSN transport protocols in order to be widely used for energy-efficient data collection in various application situations; we also explore the performance limits of the data compression approach built into such a general WSN transport protocol for data collection. To this end, our design of CDP is generic, and other lossless and/or lossy compression algorithms can be easily 'plugged' into our protocol system without any changes to the rest of the CDP. We envision that the development of a general transport protocol based on the data compression approach, such as CDP, is able to not only provide the first of its kind
compression-based transport protocol for easy and wide practical use for data collection in sensornets, but also offer a useful and handy research tool for people to further investigate and validate different compression algorithms and their effectiveness for diverse WSN applications, to advance the understanding of the benefits and limitations of the data compression approach in real-world WSNs.

The rest of the paper is organised as follows. In Section 2 we describe the system model of temporal compression for many-to-one data collection in WSNs, and then introduce our unified compression algorithm, referred to as generalised predictive coding (GPC), for both lossless and lossy compression on resource-constrained motes. Section 3 presents our CDP design and focuses on how to reduce the packet overhead by our novel concept of data streams. Section 4 describes the implementation of CDP in the nesC language and the TinyOS operating system. In Section 5, we present a detailed evaluation of the CDP based on the TOSSIM and PowerTOSSIM-z simulation environments using real-world sensor data streams. Finally, the conclusions and future work are given in Section 6.

2 System model and GPC

2.1 System model

WSNs can be modelled by graphs. A graph $G = (V, E)$ consists of a set of nodes $V$ and a set of edges $E \subseteq V^2$. Nodes in $V$ represent autonomous sensor nodes, and edges in $E$ correspond to wireless links among the nodes. Let $\mathrm{SINK} \subseteq V$ denote a small set of particular nodes, referred to as data sinks, where observations from individual sensor nodes in $V$ should be gathered. The sensor nodes are battery-operated whereas the sinks are assumed not to be power-limited. Transmitting and receiving are the most energy-consuming operations of a sensor node. For example, studies have shown that about 3000 instructions could be executed for the same energy cost as sending a bit for 100 m by radio [15] and, in general, receiving has an energy cost comparable to transmitting. Therefore it is appropriate and desirable to reduce the total energy usage at sensor nodes by carefully minimising the nodes' transmissions (and hence the corresponding receptions), at the cost of a slight increase in computation. This leads to the data compression-based approach.

In this paper, we consider temporal sensor data compression in the WSN data collection paradigm, in which a small number of data streams (the precise definition of a data stream is given later in Section 3) will be consecutively gathered from each individual sensor node. We first briefly describe our novel general data compression framework, referred to as GPC [16], upon which the CDP is developed. The GPC extends the previous work on the two-modal transmission approach for WSN energy-efficient communication [10, 12], and efficiently combines both lossless and lossy compression in the same framework. In contrast, existing WSN lossless and lossy compression algorithms follow different principles and thus none of those algorithms can be applied to both lossless and lossy compression. For example, recent lossy compression algorithms such as LTC [17] and PLAMLiS [18] are based on piecewise linear approximation, and would result in more compressed bits than raw data bits when applied to lossless compression. For a comprehensive survey of recent developments of practical WSN data compression algorithms, see [19].

2.2 GPC framework

The basic idea of the GPC is, for a given residue distribution model, to encode only those residues falling inside a relatively small range $[-R, R]$ ($R > 0$ is called the compression radius hereafter) by entropy coding (referred to as predictive compression mode) and to transmit the original raw samples un-coded otherwise (referred to as normal mode). Clearly, the normal transmission mode in the framework also provides a direct (re)synchronisation mechanism between the predictors at the sensor node and the sink. Thus the GPC can overcome two fundamental difficulties associated with traditional predictive coding approaches such as the recent LEC algorithm [11]: (i) no mechanism to (re)synchronise the predictors used at both transmitting and receiving sides, and (ii) potentially bad residue distribution shapes (i.e. 'long tails') in practice, which have an adverse impact on entropy coding performance.

Moreover, for lossy compression, our GPC essentially makes use of synchronised iterative multi-step prediction at both the sensor nodes and the data sink, in which the predicted output for a given time step is used as an input for computing the sensing signal series at the next time step, with all other predictor inputs being shifted back one time unit. This is in contrast with lossless compression, where single-step prediction is used at both the sensor nodes and the sink. As prediction errors propagate in this iterative multi-step prediction procedure, eventually a residue will become larger than the allowed error bound. At this point, the compression mode has to be switched to the normal mode in our GPC, and the original raw reading(s) will be transmitted to resynchronise the predictors at both the sensor node and the sink. The number of raw readings to be transmitted is equal to the input dimension of the predictor used. Thus, the embedded normal mode for transmitting raw samples in the GPC framework also directly supports the iterative multi-step prediction scheme to facilitate lossy compression.

In our unified GPC algorithmic framework, a compression error bound (denoted as $e$) is used as the control knob, and lossless compression corresponds to $e = 0$. Note also that the GPC is a general framework in which one has complete flexibility to choose an appropriate predictor and entropy encoder for the given task. The algorithmic procedure at the source nodes of the GPC is presented in Fig. 1. The corresponding algorithmic procedure at the sink(s) follows accordingly.

2.3 GPC realisation

In our development of CDP, which employs GPC for data compression, we adopt the simplest linear predictor to predict the next sample based on the last observed sample, that is, $\hat{x}_i = \mathrm{predictor}(x_{i-1}) = x_{i-1}$. Then the residue is the difference $r_i = x_i - x_{i-1}$. The choice of this simplest predictor is based on the following considerations. First, we found that, for sensor observations in many real-world applications such as environmental monitoring (e.g. [10]), the prediction performance of this simplest predictor is comparable with that of more sophisticated predictors, including higher-order linear models and non-linear models. Thus, the selection of the simplest predictor can greatly reduce the computation overhead of making the prediction at the motes. Second, the adoption of the simplest predictor WSN-wide improves the scalability of


Fig. 1 GPC framework
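As a rough illustration of the source-node procedure that Fig. 1 describes, the following Python sketch encodes residues with the group codes of Table 1 (K = 14) and falls back to normal mode whenever a residue lies outside [−R, R]. It assumes the last-sample predictor of Section 2.3 and lossless operation (e = 0). The function names, the bit-string output, the unsigned K-bit raw sample format and the choice to send the first sample in normal mode are assumptions made for illustration, not the authors' nesC implementation.

```python
K = 14                      # A/D resolution assumed in Table 1
R = 2 ** (K - 1) - 1        # compression radius: 8191 for K = 14

# Group codes S_i from Table 1, indexed by n_i (the number of index bits).
# Entry K is the flag for normal mode, so the table has K + 1 entries.
GROUP_CODES = ["00", "010", "011", "100", "101", "110"] + [
    "1" * (n - 3) + "0" for n in range(6, K + 1)
]


def encode_residue(r: int) -> str:
    """Return the codeword (group code followed by index) for |r| <= R."""
    if r == 0:
        return GROUP_CODES[0]
    n = abs(r).bit_length()                      # group number n_i
    # Positive residues use r itself; negative ones use 2^n - 1 - |r|.
    index = r if r > 0 else (2 ** n - 1) - abs(r)
    return GROUP_CODES[n] + format(index, f"0{n}b")


def gpc_encode(samples: list[int]) -> str:
    """Losslessly encode unsigned K-bit samples as a string of bits."""
    out = []
    prev = None
    for x in samples:
        if prev is None or abs(x - prev) > R:
            # Normal mode: flag code followed by the raw K-bit sample.
            out.append(GROUP_CODES[K] + format(x, f"0{K}b"))
        else:
            # Predictive compression mode with predictor x_hat = prev.
            out.append(encode_residue(x - prev))
        prev = x
    return "".join(out)


print(encode_residue(2), encode_residue(-2))     # 01110 01101
```

The two printed codewords agree with the worked example given for Table 1 (residues 2 and −2). A lossy variant would additionally compare the residue with the error bound e and feed predicted values back into the predictor, which this sketch does not attempt.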

WSN deployment, because the sink only needs to maintain one simple predictor for thousands of sensors in the sensornet. Otherwise, if individual sensors used their 'best' predictors, the sink would potentially have to maintain thousands of different predictors and would thus suffer from a scalability problem. Third, as our design of CDP is intended to be generic, so that it can be used as a tool in research as well as in applications, our initial selection of a predictor in the GPC only serves as a 'default' predictor, since our design and implementation of CDP allow the default predictor to be easily replaced with any other predictor that could be better for a given application. Furthermore, as described in Section 3, users can even easily replace the entire GPC, the default data compression framework in CDP, with another compression mechanism.

With the same assumption of the residual distribution model used in LEC [11], we adopted the entropy encoder employed in LEC [11] in the realisation of GPC's predictive compression mode. (We note that no specific residual distribution model or entropy encoder is tied to the GPC framework, and hence both can be easily replaced with other alternatives in the CDP.) The adopted encoder is a modified version of the Exponential-Golomb code of order 0 [20]. Basically, the alphabet of residues is divided into groups to reduce the alphabet size. Thus, any residue $r_i$ is represented in two parts: the code of the group it belongs to and its index in that group. Based on the residual model and entropy encoder adopted, the compression radius $R$ is simply selected as $R = 2^{K-1} - 1$, where $K$ is the resolution of the A/D converters used in WSN motes.

In the normal mode of the GPC, uncompressed raw samples are transmitted. The size of the coding table used in the GPC of our CDP implementation is just $K + 1$ entries, whereas S-LZW [13] uses significantly more memory space for its dictionary entries and mini-cache entries (e.g. MAX_DICT_ENTRIES being 512 and MINI-CACHE_ENTRIES being 32 [11, 13]). Table 1 gives the coding table when $K = 14$, where $S_i$ represents the residue group code for residue $r_i$ and $n_i$ indicates the number of bits of $r_i$'s index that follows $S_i$. For example, if $r_i = 2$ and $r_j = -2$, then their codewords (group code followed by index) will be 01110 and 01101, respectively. Note that the index for a negative $r_i$ is computed as $2^{n_i} - 1 - |r_i|$. When $S_i = 111111111110$, $S_i$ is no longer a group code in the compression mode of the GPC but the code flagging the normal mode of the GPC, which is followed by an original raw sample.

Table 1 Coding table (K = 14) in GPC of CDP

n_i   S_i            r_i
0     00             0
1     010            −1, +1
2     011            −3, −2, +2, +3
3     100            −7, ..., −4, +4, ..., +7
4     101            −15, ..., −8, +8, ..., +15
5     110            −31, ..., −16, +16, ..., +31
6     1110           −63, ..., −32, +32, ..., +63
7     11110          −127, ..., −64, +64, ..., +127
8     111110         −255, ..., −128, +128, ..., +255
9     1111110        −511, ..., −256, +256, ..., +511
10    11111110       −1023, ..., −512, +512, ..., +1023
11    111111110      −2047, ..., −1024, +1024, ..., +2047
12    1111111110     −4095, ..., −2048, +2048, ..., +4095
13    11111111110    −8191, ..., −4096, +4096, ..., +8191
14    111111111110   original raw sample

3 CDP protocol design

In designing CDP we attempt to achieve two specific goals. The first goal is to minimise the packet overhead, so that the protocol overhead does not negate the benefits of the

data compression. To this end, a novel concept of streams is developed, which is described in Section 3.1. The second goal is to provide a platform for researchers to develop and test new compression algorithms effectively. This is achieved by keeping our protocol design modular so that different compression algorithms can be plugged in easily, as described in Section 3.2. We note that the reliability of transport is not addressed by CDP, as CDP is intended to be a lightweight transport protocol. The reliability of transport in CDP depends on the reliability of the underlying network layer protocol.

3.1 Data streams

In order to minimise the packet overhead and memory use in CDP, we introduce a novel concept of streams. In the context of the CDP protocol, a data stream is defined as an aggregate flow of multiple individual sensor data flows from a single mote employing the same compression algorithm with corresponding parameters. Note that if any mote has sensors with K different sampling rates, K individual streams have to be created in CDP for this mote. Fig. 2 illustrates an example node with two streams. Stream 1 will contain all data flows from sensors 1, 2 and 3 whereas stream 2 will contain the data flow from sensor 4. The motivation for organising the sensor data flows of a WSN into the newly defined data streams in CDP is that, by aggregating multiple sensor flows on a single node, we can minimise the protocol overhead, reduce the number of packet buffers required at motes, and provide flexibility for supporting different compression operations (by either different compression algorithms or different parameters of the same algorithm). As a specific case, a data stream can be an aggregate flow consisting of multiple sensor data flows from a single mote with an identical sampling rate but without any compression at all.

Fig. 2 Example of a node with two streams

3.1.1 Overhead reduction: Protocol overhead is a major issue in designing a compression protocol. Owing to the small size of packets, it is vitally important that we do not negate the benefits of compression by introducing a high overhead to our packets. By requiring that information such as the compression algorithm and sensor information be sent only once, at stream setup, we remove much of the information that would otherwise be required in each data packet. After stream setup each data packet only requires the stream id as additional header information. Additionally, the requirement of a consistent sampling rate within each stream eliminates the need to identify each compressed sensor reading in the data segment of the packet. This reduces each packet's overhead by $m \times \log_2 n$ bits, where $m$ is the number of readings in that packet and $n$ is the number of sensors associated with that node.

Owing to the limited memory of motes, the buffer space required may be prohibitive. Commonly used WSN operating systems, such as TinyOS, do not support dynamic memory allocation, which makes the efficient use of memory quite challenging. Streams can reduce buffer overhead. To illustrate, let us consider an alternative solution in which each packet only contains data from a single sensor. In this case, the packet overhead would only require the algorithm id and the sensor id. Although this solution could eliminate much of the packet overhead, it would introduce a significant memory overhead. This is because each node would have to maintain a packet-sized buffer, because of the lack of dynamic memory, for each of its connected sensors. Taking the example node given in Fig. 2, Fig. 3 shows how packet buffers would work (i) without streams (Fig. 3a) and (ii) with streams (Fig. 3b). Clearly the solution with streams can significantly reduce this memory overhead, whereas the solution without streams could quickly undermine the practicality of a compression-based protocol by requiring excessive memory as the number of sensors connected to each mote increases. Assuming s sensors per stream, the amount of buffer space required by CDP is merely 1/s of the memory that would be required by the alternative solution of a single sensor flow per packet.

Owing to the lightweight design consideration of CDP, the sampling rate configuration, which is usually either supported by WSN configuration management or implemented by applications, is not specified in CDP. The only requirement is that the sampling rate for all sensors' data flows within a stream should be identical, either static or dynamic. This is a necessary design decision to provide the benefits of streams. Moreover, our concept of streams simplifies data collection in a complicated WSN where multiple compression algorithms (e.g. lossless compression and lossy compression) are used at the same time in addition to diverse sampling rates, because of the different physical variables and mote locations in a large-scale WSN deployment. When several sensor data flows of a mote are grouped into a data stream, data packets only need to carry the stream id instead of individual sensor flow identifiers.

3.1.2 Stream setup and control packet: To set up a data stream, one should specify a sampling rate shared by all data flows in the stream, a compression algorithm used with given parameters, and how to distinguish the individual sensor data flows collected in the stream. Since all sensor data flows in a stream use the same sampling rate, the relative order of data from individual sensors can be fixed for an easy identification of individual sensor data flows within a stream. In our design of CDP, we consider the individual motes in a sensornet to be autonomous. The stream setup specification for individual motes is achieved through control packet(s) exchanged between the motes and the sink by CDP during the stream creation process. Once a stream is set up via a control packet, data flows from the stream can be collected indefinitely via data packets.

Each stream setup packet begins with the 16 bit node ID, but this ID is retrieved from the lower layer packet (i.e. cross-layer information) to avoid additional overhead in CDP. Additionally, each stream is assigned a stream id to


Fig. 3 Buffers required for the example node


a One packet buffer for each sensor on the node
b One packet buffer for each stream on the node
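The two buffer arrangements compared in Fig. 3 can be put into numbers with a small Python sketch. It assumes the 29-byte default packet size quoted in Section 4.2 and the example node of Fig. 2 (sensors 1 to 3 in stream 1, sensor 4 in stream 2). The helper names, the number of readings per packet in the last call, and the rounding of log2 n up to whole bits are illustrative assumptions.

```python
import math

PACKET_BYTES = 29   # default packet size quoted in Section 4.2 (assumed here)


def buffer_bytes_without_streams(num_sensors: int) -> int:
    """One packet-sized buffer per sensor (Fig. 3a)."""
    return num_sensors * PACKET_BYTES


def buffer_bytes_with_streams(num_streams: int) -> int:
    """One packet-sized buffer per stream (Fig. 3b)."""
    return num_streams * PACKET_BYTES


def sensor_id_bits_saved(readings_per_packet: int, num_sensors: int) -> int:
    """Bits saved per packet by not tagging each reading with a sensor id.

    This is m x log2(n) from Section 3.1.1, with log2(n) rounded up to
    whole bits (an assumption of this sketch).
    """
    return readings_per_packet * math.ceil(math.log2(num_sensors))


# Example node of Fig. 2: stream id -> sensors carried by that stream.
streams = {1: [1, 2, 3], 2: [4]}
num_sensors = sum(len(sensors) for sensors in streams.values())

print(buffer_bytes_without_streams(num_sensors))   # 116
print(buffer_bytes_with_streams(len(streams)))     # 58
print(sensor_id_bits_saved(12, num_sensors))       # 24
```

With eight streams, the upper limit stated for a mote, the per-stream arrangement comes to 8 × 29 = 232 bytes, which matches the maximum buffer size given in Section 4.2.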

identify which stream a data packet belongs to, as each mote can support up to eight independent data streams in CDP. The stream id, the compression algorithm (selected among all the implemented algorithms) and the sensor list are specified in their corresponding fields in the CDP control packet. The sensor list for the created stream specifies the relative order of all sensors' data flows belonging to that stream by their identifiers within the mote. The control packet structure is illustrated in Fig. 4. The general packet structure is given in Fig. 4a, whereas the control packets for the two streams illustrated in Fig. 2 are shown in Fig. 4b.

Fig. 4 Illustration of control packet
a General control packet structure
b Illustrations of control packets corresponding to the stream setup of node 1 shown in Fig. 2
The dotted field of node ID is 'virtual' as it does not exist in the CDP control packet, to minimise the overhead. The node ID is actually obtained from the lower level protocol

3.1.3 Data packets: Data packets in CDP are very simple. They contain only the node id, the stream id and the compressed data. The node id is, again, retrieved from lower layer header information. The payload section consists of a cycle of one reading from each sensor, repeated until the packet is filled. As CTP does not guarantee delivery, it is important to ensure that a packet loss does not introduce any errors into subsequent packets received at the sink. We can simply use GPC normal mode for resynchronisation at the beginning of each packet to achieve this. Fig. 5 shows the general structure of a CDP data packet.

Fig. 5 Data packet format. Similar to the CDP control packet, the dotted field of node ID is 'virtual' as it does not exist in the CDP data packet, to minimise the overhead. The node ID is actually obtained from the lower level protocol

3.2 Modular design

In order for CDP to be useful as a tool for researchers to investigate and test compression algorithms, it is vital that we design CDP in such a way that the GPC may be easily replaced by another compression algorithm. This consideration led to our modularised design for CDP. The entire design of our CDP is broken down into three major components: network access, utility and compression. Fig. 6 shows the overall modular design of CDP.

3.2.1 Network access: This module provides the interface for initialising the lower level network, sending packets and receiving packets. This module maintains the separation of network services from the rest of CDP so that the underlying network protocols may be changed based on the


Fig. 6 Overall modular design of CDP
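One way to picture the pluggable compression component of Fig. 6 is a table that maps an algorithm id to a module implementing a common interface. The Python sketch below is only an analogy for the nesC wiring: the `Compressor` interface, the `ALGORITHMS` table, the two toy algorithms and the id values 0 and 1 are all hypothetical names introduced for illustration and are not part of CDP.

```python
from typing import Protocol


class Compressor(Protocol):
    """Hypothetical standardised interface for a compression algorithm."""

    def compress(self, samples: list[int]) -> bytes: ...


class NoCompression:
    """Sends each sample as two raw big-endian bytes (illustrative only)."""

    def compress(self, samples: list[int]) -> bytes:
        return b"".join(s.to_bytes(2, "big") for s in samples)


class DeltaCompression:
    """Toy stand-in for a predictive coder.

    The first sample is sent raw, then each difference is sent as one
    signed byte. It expects a non-empty list and raises OverflowError
    if a difference does not fit in one byte.
    """

    def compress(self, samples: list[int]) -> bytes:
        out = bytearray(samples[0].to_bytes(2, "big"))
        for prev, cur in zip(samples, samples[1:]):
            out += (cur - prev).to_bytes(1, "big", signed=True)
        return bytes(out)


# The compression module points each algorithm id at one implementation.
ALGORITHMS: dict[int, Compressor] = {0: NoCompression(), 1: DeltaCompression()}


def compress_stream(algorithm_id: int, samples: list[int]) -> bytes:
    """Dispatch a stream's samples to the algorithm chosen at stream setup."""
    return ALGORITHMS[algorithm_id].compress(samples)


print(len(compress_stream(0, [100, 102, 100])))   # 6
print(len(compress_stream(1, [100, 102, 100])))   # 4
```

Adding a further algorithm in this picture means writing one more class with a `compress` method and adding one entry to the table, which mirrors the way the text describes pointing an algorithm id value at a new module.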

requirements of the application. Although CDP is a collection-based protocol, it is possible, with some logic in the network access module, to implement CDP on top of any protocol that provides a path from each mote to the sink. If CDP is being used on top of a primitive network stack, additional logic may be added in this module to improve the performance. The network access module will, additionally, provide a platform for lower level protocol developers to easily test underneath CDP. This would allow lower level network protocols to be designed to cater to compressed data without actually implementing the compression itself.

3.2.2 Utility modules: Three separate modules are designed to support the underlying structure of CDP. The packet formation module is responsible for building and reading packet headers. This module allows for modifying header structures, packet types and the method for actually building the header. The module specifically defines methods for building configuration and data headers as well as reading received headers. The stream operations module is responsible for everything related to streams. It provides methods for building and sending streams as well as passing stream data to other modules that may require these data. The packet formation and stream operations modules define the basic operation of CDP. Modifying these modules will, obviously, change the fundamental way that CDP operates. The node module simply provides an easy-to-use interface to applications using CDP, by abstracting away design details.

3.2.3 Compression modules: This major component consists of the group of modules including compression, predictive coding, predictor and entropy coder (see Fig. 6). This group of modules allows new compression algorithms to be implemented and plugged into CDP with minimal modification of the corresponding module(s) in this group. Furthermore, the design of GPC allows the implementations of predictors and/or entropy coding tables to be changed with little or no code change to the module(s).

The compression module passes the sensor data off to the selected algorithm related to a stream. With a standardised interface for compression algorithms, this module allows a new compression algorithm to be merged with CDP by simply implementing the algorithm in a new module that conforms to the interface and modifying the compression module to point a specific algorithm id value to that module.

GPC consists of three modules: predictive coding, predictor and entropy coder. The predictive coding module implements the GPC framework and should not be modified if the default GPC is used. It supports defining a compression radius and error bound as well as choosing and executing sync operations and lossless/lossy compression. The predictor module supplies a module with a well-defined interface for predictors. The coding module does the same as the predictor module for entropy coding. This allows new predictors and/or entropy coding techniques to be implemented to match our interfaces and simply switched in and out for easy testing, and allows customised GPC algorithms to optimally fit the sensor data characteristics of a given task.

4 Protocol implementation

For our reference implementation, we adopted TinyOS 2.1 [21] as the underlying platform, because TinyOS is an open source operating system for WSNs developed in the nesC programming language [22] and is widely used both in the research community and in real-world WSN applications. As CDP is intended, once fully tested, to be useful for real-world WSN applications and the research community, the combination of TinyOS's wide use and open source nature makes it an ideal underlying platform for our CDP development. Also, the nesC programming language

paradigm provides for a nearly one-to-one mapping of our design modules
into TinyOS. Fig. 7 illustrates the TinyOS-based network stack environment
in which our CDP is implemented. As Fig. 7 shows, the NetworkAccess module
has three commands (Init, SendCompressed and PushSend) and two events
(sendDone and receive) that are wired to the CTP implementation in TinyOS.
Command Init initialises the underlying CTP protocol and gets everything
ready to send. Command SendCompressed queues compressed data to be sent;
the data are not actually sent until a full packet has accumulated.
Command PushSend forces an incomplete packet to be sent, for more
time-sensitive data. Event sendDone is signalled after a packet has been
sent successfully, and event receive is signalled at the root when a
compressed message has been received.

Fig. 7 CDP network stack

4.1 Collection tree protocol

In our CDP implementation, we adopted the CTP, a tree-based collection
protocol, as the underlying routing protocol [6, 7]. As an integral part of
the TinyOS environment, CTP provides a number of benefits over other
routing protocols. It is a collection protocol, which provides CDP with the
proper routing for data collection. It has been shown that CTP outperforms
similar collection-based protocols and is quite reliable. The two primary
benefits of CTP over other collection protocols come from datapath
validation and adaptive beaconing [6]. These two methods allow for 73%
fewer packets and a packet delivery rate greater than 90% [6].
Additionally, CTP reduces topology repair latency by 99.8% [6]. More
details about CTP can be found in [6, 7].

4.2 Packet buffers

CDP tries to maximise the amount of data in each packet to help minimise
the impact of packet header overhead. However, nesC’s avoidance of dynamic
memory allocation means that we must keep full-sized packet buffers. This
can be an issue because of the limited memory of sensor motes. CDP’s design
allows for minimal buffering in the following two ways.

By using streams, we must keep one packet-sized buffer per stream. With the
default packet size of 29 bytes on motes with 802.15.4 radios such as the
CC2420, the maximum buffer size for up to eight streams per node is 232
bytes. In the actual implementation, the maximum buffer size will be
slightly smaller, as we only keep the data segment in memory. However, many
applications use fewer streams per mote, so memory use can be reduced
further by setting a MAX_STREAM_NUM constant at compile time so that
unnecessary buffer space is not allocated. On the data sink side, we
maintain a configurable buffer of packets, which may be kept small in the
typical case where the data are forwarded to the gateway immediately after
reception.

To achieve maximum data segment size for compressed data, we create and
maintain two additional buffers on each mote. The first buffer, ‘pending’,
maintains s independent sections of ns × B bits of data, where s is the
number of streams, ns is the maximum number of sensors per stream and B is
the maximum size of one sample’s coding, rounded to the nearest byte. From
Table 1, B would be 12 bits (i.e. the size of the normal mode code of the
GPC, see Table 1) plus the size of an original raw sample. This buffer is
maintained to determine whether a whole set of readings will fit in the
remaining space in the packet. If not, the current packet will be sent and
a new packet will be started. The second buffer, ‘raw’, maintains the raw
data corresponding to the data in pending so that the original raw data may
be sent in normal mode if a new packet needs to be generated.

4.3 Decoding implementation

In our implementation of the entropy coding used in GPC, we tried to keep
the code as simple and easy to maintain as possible. While encoding is
straightforward, decoding seemed to require a complex block of code. We
decided to use an array-based finite state machine for decoding. To read
the Si from Table 1 and return the n, we begin in an
initial state and read bits to change state until we attain one of the
final states. This tells us how many bits the following reading will
occupy. The operation of the decoding algorithm is described in Fig. 8. An
example state table is given in Table 2. Note that decoding is performed at
the gateway, which is not energy-limited.

Fig. 8 Decoding procedure

Table 2 State table for the codes 00 and 01

       Current state
Bit    0     1     2     3
0      1     2     2     3
1      −1    3     2     3

5 Simulations and analyses

We have performed simulations to thoroughly evaluate CDP. Section 5.1
briefly describes the two simulators used in our experiments, TOSSIM [23]
and PowerTOSSIM-z [24]. Section 5.2 describes the simulation setup in
detail, including a randomly generated sensornet node location map and the
real-world sensor data sets used. We provide the simulation results and
analysis of CDP performance and energy consumption in Sections 5.3 and 5.4,
respectively.

5.1 TOSSIM and PowerTOSSIM-z

TOSSIM is a simulator for TinyOS-based WSNs [23]. TOSSIM works by replacing
a few key TinyOS modules during compilation, primarily the hardware-reliant
modules, with simulation code, allowing the TinyOS code to be compiled to
the simulator instruction set. The code is broken down into events,
discrete portions of code, and queued as discussed in [23]. This allows for
efficient simulation of large networks. Another advanced feature of TOSSIM
is its environmental noise modelling. TOSSIM allows its simulations to take
in an environmental noise trace and then attempts to accurately simulate
real-world noise using closest-fit pattern matching as discussed in [25]. A
commonly used real environmental noise trace with TOSSIM today is the Meyer
Heavy noise trace, which was taken at the Meyer library at Stanford during
heavy 802.11 activity [25]. However, TOSSIM is unable to accurately
simulate the per-instruction time and anything else that relies on it,
including energy usage. PowerTOSSIM-z [24], a port of PowerTOSSIM [26],
attempts to add accurate energy modelling into TOSSIM. PowerTOSSIM achieves
this goal by logging power state transition information during simulation.
CPU energy usage modelling is more complex and is discussed in depth in
[26]. Although PowerTOSSIM-z has some limitations [24], it performs without
needing to directly emulate the mote hardware, allowing for faster energy
modelling than is available in more traditional emulation environments.

5.2 Simulation setup

In our simulations we generated a sensornet topology using the topology
generator included in the TinyOS package. We generated a WSN with a total
of 33 nodes (i.e. one data sink and 32 motes) randomly distributed over a
100 m by 100 m area, as shown in Fig. 9. We used the first 5000 lines of
the Meyer Heavy noise trace for our simulations. By reducing the length of
the noise trace we vastly reduced the memory and time requirements of
running a simulation. Owing to the nature of noise modelling in TOSSIM and
routing in CTP, the network topology may vary greatly over time. Links have
various qualities; for example, node 6 has a link to node 0 with a gain of
−74.99 dB, whereas its link to node 2 has a gain of −16.69 dB. This layout
should give a realistic location map of motes in the simulated multi-hop
sensornet, with different numbers of hops to the data sink, for the
evaluation of CDP for data collection. Examination of the topology during
our simulations shows that the number of hops from motes to the sink ranges
from one to nine, discounting routing loops that may occur at times.

Fig. 9 Mote location map of the WSN in our simulation

Fig. 10 Comparison of retransmissions in the sensornet
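
As a rough illustration of the setup described in this subsection, the
following Python sketch reads link gains and shortens a noise trace before
they would be handed to a simulator. It is not taken from the simulation
scripts used in this work: the `gain <source> <destination> <dB>` line
layout, the one-integer-per-line noise trace format and the function names
parse_link_gains and first_noise_readings are all assumptions made for this
example. The two gains are the node 6 links quoted in the text, and the
noise readings are made-up values.

```python
from itertools import islice


def parse_link_gains(lines):
    """Return {(source, destination): gain_db} for every 'gain' record.

    Assumed record layout (hypothetical): 'gain <source> <destination> <dB>'.
    Any other line, including blank lines, is ignored.
    """
    gains = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 4 or fields[0] != "gain":
            continue
        source, destination = int(fields[1]), int(fields[2])
        gains[(source, destination)] = float(fields[3])
    return gains


def first_noise_readings(lines, limit=5000):
    """Return at most `limit` integer noise readings, skipping blank lines."""
    non_blank = (line.strip() for line in lines if line.strip())
    return [int(value) for value in islice(non_blank, limit)]


if __name__ == "__main__":
    topology = ["gain 6 0 -74.99", "gain 6 2 -16.69", "noise 6 -105.0 4.0"]
    print(parse_link_gains(topology))
    # Illustrative readings only; a real trace would be read from a file.
    print(first_noise_readings(["-98", "-97", "", "-99"], limit=2))
```

Capping the trace with a limit mirrors the choice made above to keep only
the first 5000 lines in order to bound memory use and run time.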
We adopted the publicly available real-world WSN Patrouille des Glaciers
(PDG) 2008 datasets provided by SensorScope [3] for the simulation of
lossless data collection. Each simulated sensor node replays temperature
and humidity data from a node in the original data collection. In our
simulations, each set of data is replicated on four nodes at various
distances from the sink node, as shown in Table 3. The first two columns of
Table 3 show the assignment of nodes in our simulations to station ids in
the original SensorScope data. Each node in the CDP simulations uses a
single stream, composed of both humidity and temperature data, with the
sampling rate set to one reading every 100 ms. To thoroughly and fairly
evaluate the performance of CDP, CTP alone is simulated as a performance
baseline. CTP alone was selected as the baseline because it is popular,
effective, well evaluated and integrated with the TinyOS environment. In
addition, CDP runs on top of CTP. In the CTP simulations each packet
contains four humidity and four temperature readings rather than a single
set. This allows CTP to behave more favourably against the aggregation
inherent in our CDP.

5.3 Performance evaluation

We first conducted TOSSIM simulations to study the general performance of
the CDP, including retransmissions and total protocol overhead (including
that of the lower layers). Owing to the randomness of environmental noise
in TOSSIM’s noise modelling, each simulation trial gives a somewhat
different result. To smooth out the random fluctuations of the simulation
results, all our simulation results are based on the average of five
individual simulation trials.

We first compute the empirical static compression ratio of CDP based on our
simulations. Since in the benchmark of using CTP alone the sensed raw data
are sent directly over CTP, we can use the total bytes of the original raw
data payload sent by CTP and the total bytes of CDP packets (including CDP
protocol overhead) sent by CTP for the same amount of sensed raw data to
compute the CDP compression ratio. The CDP compression ratio, defined as
1 − D′/D, where D′ is the size of the CDP packets and D is the size of the
corresponding raw data, is computed and listed in Table 3. Note that these
actual CDP compression ratios include the CDP protocol overhead, in
contrast to the algorithmic-level compression ratios [10–12], which were
obtained without including any protocol overhead. The fourth and fifth
columns of Table 3 show the total number of static packets when using CTP
alone and CDP for each sensor source, respectively. That is, sending the
same amount of data by CDP requires significantly fewer static packets than
by CTP alone.

Fig. 11 Illustration of retransmission dynamics of CDP against CTP alone

In addition to simply reducing total data payloads by the significant
static compression ratios presented above, CDP is found to reduce
retransmissions significantly. Fig. 10 shows the average total number of
retransmissions over five trials in CDP versus CTP alone. Fig. 11 illustrates

Table 3 Node assignment to SensorScope data and CDP compression ratios

Nodes           PDG 2008     # Hum/Temp      # CTP     # CDP     Compression
                station ID   reading pairs   packets   packets   ratio, %
1, 9, 17, 25    1            3665            916       671       35.23
2, 10, 18, 26   3            3041            760       538       37.56
3, 11, 19, 27   4            3041            760       541       37.12
4, 12, 20, 28   5            1403            351       267       32.29
5, 13, 21, 29   7            4535            1134      772       40.31
6, 14, 22, 30   10           3657            914       618       40.55
7, 15, 23, 31   15           4662            1166      818       38.23
8, 16, 24, 32   16           3072            768       555       36.09
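
The ratio 1 − D′/D used for the last column of Table 3 can be evaluated
with a few lines of Python. The sketch below is illustrative only: Table 3
does not list the byte counts D and D′, so the helper is applied here to
the packet counts in the fourth and fifth columns, which yields a
packet-count reduction rather than the byte-level compression ratio
reported in the last column. The names TABLE_3, compression_ratio and
packet_reduction_percent are chosen for this example.

```python
# (nodes, CTP packets, CDP packets, reported compression ratio in %),
# copied from Table 3.
TABLE_3 = [
    ("1, 9, 17, 25", 916, 671, 35.23),
    ("2, 10, 18, 26", 760, 538, 37.56),
    ("3, 11, 19, 27", 760, 541, 37.12),
    ("4, 12, 20, 28", 351, 267, 32.29),
    ("5, 13, 21, 29", 1134, 772, 40.31),
    ("6, 14, 22, 30", 914, 618, 40.55),
    ("7, 15, 23, 31", 1166, 818, 38.23),
    ("8, 16, 24, 32", 768, 555, 36.09),
]


def compression_ratio(compressed_size, raw_size):
    """Return 1 - D'/D, the fraction saved relative to the raw size D."""
    if raw_size <= 0:
        raise ValueError("raw_size must be positive")
    return 1.0 - compressed_size / raw_size


def packet_reduction_percent(ctp_packets, cdp_packets):
    """Apply the same formula to packet counts instead of byte counts."""
    return 100.0 * compression_ratio(cdp_packets, ctp_packets)


if __name__ == "__main__":
    for nodes, ctp, cdp, reported in TABLE_3:
        saved = packet_reduction_percent(ctp, cdp)
        print(f"{nodes:<14} packets saved {saved:5.2f}%  reported ratio {reported:5.2f}%")
```

For every row the packet-count reduction computed this way is lower than
the reported byte-level ratio, so the two quantities should not be read as
interchangeable.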


the average retransmission dynamics over five trials of CDP against CTP
alone in the sensornet. This difference is due to two factors. First, the
frame error rate can be significantly reduced by compression [10].
Additionally, since the compression rates among nodes vary, CDP adds an ad
hoc delay between different nodes’ transmissions that reduces the
likelihood of collisions.

Moreover, CDP reduces lower layer overhead by minimising the total number
of packets sent. CTP and the protocols below it, such as IEEE 802.15.4, use
a combined 21 bytes (i.e. eight bytes of header for CTP, 10 bytes of header
for IEEE 802.15.4 and one byte of active message type for TinyOS) of
overhead per CTP packet. On average, this reduced the total lower layer
overhead, accounting for both transmissions and retransmissions, from
1,528,340 bytes to 739,960 bytes over the five trials. That is, on average,
CDP cut the total lower layer overhead by more than half.

Overall, the average total size of data packet transmissions and
retransmissions sent by CDP is significantly less than that sent by CTP
alone, as shown in Fig. 12, in which the average total bytes contributed by
transmissions and retransmissions include all corresponding packet
overheads of IEEE 802.15.4, CTP and CDP. The dynamic CDP compression ratio,
which reflects the reduction in retransmissions and the corresponding lower
level overheads, has an overall average of 55.23% over five trials,
realising additional savings over the static compression ratios shown in
Table 3.

Fig. 12 Comparisons of total data transmitted in the sensornet (in bytes)

5.4 Energy evaluation

In addition to our TOSSIM simulations on CDP performance, we have conducted
further simulations with PowerTOSSIM-z to obtain energy usage statistics
for CDP against CTP alone. To gather truthful energy usage statistics, it
is necessary to make some small changes to our previous simulations
reported in Section 5.3. This is because idle listening at each node was
included in the previous simulations. It was found, however, that such idle
listening actually dominated the overall power consumption in the simulated
sensornet. To obtain truthful energy usage statistics for CDP against CTP
alone, we conducted three sets of simulations with PowerTOSSIM-z: (i) CTP
alone simulations, (ii) CDP simulations and (iii) idle listening
simulations. In the idle listening simulations, the radio transceiver on
each mote was turned on but all protocol activities above the physical
layer were disabled. The simulations of idle listening were run for the
same length of time as the CTP and CDP simulations to determine the energy
cost of idle listening. Again, to smooth out the random fluctuations of the
simulation results caused by environmental noise, we averaged the energy
usage results of five simulation trials of CTP, CDP and idle listening,
respectively. Then, the energy cost of idle listening is excluded to
determine the actual energy savings of data collection using CDP over CTP
alone.

The simulation results of average single-node radio energy consumption
obtained using PowerTOSSIM-z are shown in Fig. 13, where idle listening is
a large portion of radio energy consumption in the simulated sensornet. In
Fig. 14, by removing the idle listening energy consumption from both the
CTP alone and CDP simulations, we observe that data compression through CDP
reduces radio energy consumption by about 26.2% in real-world sensor data
situations, in comparison to CTP alone for data collection.

Fig. 14 Radio energy consumptions of CDP against CTP (no idle listening)

Next, we study the mote’s microcontroller energy usage for CDP processing.
While a sensor node’s power consumption is typically dominated by its
radio, we want to verify that CDP’s compression operations do not cause any
significant increase in the CPU’s power usage and thus would not negate the
benefits of radio energy savings. Our simulations show that the sensornet
operating CDP only consumed an average of 113.9 mJ for CPU energy usage per
mote. Even assuming zero CPU energy usage for CTP alone, the increase of
CPU energy usage per node for CDP data compression is negligible compared
to the average radio energy savings of 3618 mJ per node by CDP. Thus, the
total energy savings (considering both radio and CPU energy usages) of CDP
over CTP would be about 25.4%, close to the energy reduction ratio of 26.2%
where the CPU energy usage of CDP is not included.
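
The percentages reported in Sections 5.3 and 5.4 can be cross-checked with
simple arithmetic. The Python sketch below assumes that the 26.2% radio
saving and the 3618 mJ per-node saving refer to the same baseline, so that
the per-node radio energy of CTP alone (idle listening excluded) can be
inferred as 3618/0.262. That baseline is an inference made for this example
and is not a figure stated in the text, and the function names are likewise
illustrative.

```python
RADIO_SAVING_MJ = 3618.0        # average radio energy saved per node by CDP
RADIO_SAVING_FRACTION = 0.262   # reported radio energy reduction over CTP alone
CDP_CPU_COST_MJ = 113.9         # average CPU energy per mote when running CDP
OVERHEAD_CTP_BYTES = 1_528_340  # lower layer overhead with CTP alone (Section 5.3)
OVERHEAD_CDP_BYTES = 739_960    # lower layer overhead with CDP (Section 5.3)


def reduction_fraction(before, after):
    """Fraction by which `after` is smaller than `before`."""
    return 1.0 - after / before


def net_energy_saving_fraction(radio_saving, radio_fraction, cpu_cost):
    """Net saving once the CPU cost of compression is charged against CDP.

    Assumes CTP alone spends no CPU energy, the worst case considered in
    the text, and infers the CTP radio baseline from the reported saving.
    """
    baseline = radio_saving / radio_fraction  # inferred, not reported
    return (radio_saving - cpu_cost) / baseline


if __name__ == "__main__":
    overhead = reduction_fraction(OVERHEAD_CTP_BYTES, OVERHEAD_CDP_BYTES)
    net = net_energy_saving_fraction(
        RADIO_SAVING_MJ, RADIO_SAVING_FRACTION, CDP_CPU_COST_MJ
    )
    print(f"lower layer overhead reduction: {overhead:.1%}")
    print(f"net energy saving: {net:.1%}")
```

Under these assumptions the sketch gives about 51.6% for the lower layer
overhead reduction and about 25.4% for the net energy saving, in line with
the figures quoted above.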

Fig. 13 Comparisons of radio energy usage in the sensornet

6 Conclusions

In this paper, we have presented the design and implementation of CDP, an
energy-efficient data compression-based transport protocol for data
collection in WSNs. The key features of our CDP design and implementation
are: (i) exploiting our novel unified algorithmic framework, GPC, to
effectively provide both lossless and lossy compression; (ii) minimising
the protocol overhead while providing considerable flexibility for complex
network data gathering operations, where diverse sampling rates and both
lossless and lossy compression algorithms with different parameters are
simultaneously supported; (iii) demonstrating the merits of a data
compression-based general transport protocol for network energy efficiency
via simulations using real-world sensor data; and (iv) providing a research
platform for developing and testing new data compression algorithms for
networked data collection, in that new algorithms can easily replace the
current default one in CDP without affecting the rest of the CDP
implementation. To our knowledge, CDP is the first transport protocol of
its kind for energy-efficient WSN data collection. Our simulation
evaluation of CDP shows that, in addition to remarkable compression ratios,
CDP can significantly reduce retransmissions in a noisy WSN, which not only
reduces the data retransmitted but also reduces the total lower layer
packet overhead, altogether resulting in substantial savings in total
transmitted bytes. Moreover, our simulation shows that, with the proposed
CDP design and implementation, the energy usage of CDP protocol processing,
including data compression, is negligible for real-world sensor data
collection. Our future work includes further evaluating CDP in a real
watershed monitoring WSN testbed, which is currently in deployment. We also
plan to use the combination of the origin, collect_id and seqno from CTP to
uniquely identify packets [7], in order to attain a higher level of
reliability with little or no additional packet overhead.

7 Acknowledgment

This work is supported in part by the National Science Foundation under
grant CNS-0758372.

8 References

1 Kim, S., Pakzad, S., Culler, D., et al.: ‘Health monitoring of civil infrastructures using wireless sensor networks’. Proc. Sixth Int. Conf. Information Processing in Sensor Networks, Cambridge, MA, USA, April 2007, pp. 254–263
2 Paek, J., Chintalapudi, K., Cafferey, J., Govindan, R., Masri, S.: ‘A wireless sensor network for structural health monitoring: performance and experience’. Proc. Second IEEE Workshop Embedded Networked Sensors, May 2005, pp. 1–10
3 http://sensorscope.epfl.ch/index.php/Main_Page, accessed July 2011
4 Werner-Allen, G., Lorincz, K., Johnson, J., Lees, J., Welsh, M.: ‘Fidelity and yield in a volcano monitoring sensor network’. Proc. ACM Symp. Operating System Design and Implementation, 2006, pp. 381–396
5 Song, W.Z., Huang, R., Xu, M., Shirazi, B., LaHusen, R.: ‘Design and deployment of sensor network for real-time high-fidelity volcano monitoring’, IEEE Trans. Parallel Distrib. Syst., 2010, 21, (11), pp. 1658–1674
6 Gnawali, O., Fonseca, R., Jamieson, K., Moss, D., Levis, P.: ‘Collection tree protocol’. Proc. Seventh ACM Conf. on Embedded Network Sensor Systems, 2009, pp. 1–14
7 Fonseca, R., Gnawali, O., Jamieson, K., Kim, S., Levis, P., Woo, A.: ‘The collection tree protocol’, available at http://www.tinyos.net, February 2007
8 Kim, S., Fonseca, R., Dutta, P., et al.: ‘Flush: a reliable bulk transport protocol for multihop wireless networks’. Proc. Fifth Int. Conf. Embedded Networked Sensor Systems, Sydney, Australia, November 2007, pp. 351–365
9 Hull, B., Jamieson, K., Balakrishnan, H.: ‘Mitigating congestion in wireless sensor networks’. Proc. Second Int. Conf. Embedded Networked Sensor Systems, Baltimore, MD, USA, November 2004, pp. 134–137
10 Huang, F., Liang, Y.: ‘Towards energy optimization in environmental wireless sensor networks for lossless and reliable data gathering’. IEEE Int. Conf. Mobile Ad hoc and Sensor Systems (MASS), Pisa, Italy, October 2007, pp. 1–6
11 Marcelloni, F., Vecchio, M.: ‘An efficient lossless compression algorithm for tiny nodes of monitoring wireless sensor networks’, Comput. J., 2009, 52, (8), pp. 969–987
12 Liang, Y., Peng, W.: ‘Minimizing energy consumptions in wireless sensor networks via two-modal transmission’, ACM SIGCOMM Comput. Commun. Rev., 2010, 40, (1), pp. 13–18
13 Sadler, C.M., Martonosi, M.: ‘Data compression algorithms for energy-constrained devices in delay tolerant networks’. Proc. Fourth ACM Int. Conf. Embedded Networked Sensor Systems, 2006, pp. 265–278
14 Welch, T.A.: ‘A technique for high-performance data compression’, IEEE Comput., 1984, 17, (6), pp. 8–19
15 Pottie, J., Kaiser, W.J.: ‘Embedding the internet wireless integrated network sensors’, Commun. ACM, 2000, 43, (5), pp. 51–58
16 Liang, Y.: ‘Efficient temporal compression in wireless sensor networks’. 36th IEEE Conf. Local Computer Networks (LCN), 2011, pp. 470–478
17 Schoellhammer, T., Osterweil, E., Greenstein, B., Wimbrow, M., Estrin, D.: ‘Lightweight temporal compression of microclimate datasets’. Proc. 29th Annual IEEE Int. Conf. Local Computer Networks, 2004, pp. 516–524
18 Liu, C., We, K., Pei, J.: ‘An energy-efficient data collection framework for wireless sensor networks by exploring spatiotemporal correlation’, IEEE Trans. Parallel Distrib. Syst., 2007, 18, (7), pp. 1010–1023
19 Srisooksai, T., Keamarungsi, K., Lamsrichan, P., Araki, K.: ‘Practical data compression in wireless sensor networks: a survey’, J. Netw. Comput. Appl., 2012, 35, (1), pp. 37–59
20 Teuhola, J.: ‘A compression method for clustered bit-vectors’, Inf. Process. Lett., 1978, 7, pp. 308–311
21 The TinyOS 2.x Working Group: ‘TinyOS 2.0’. Proc. Third Int. Conf. Embedded Networked Sensor Systems, San Diego, CA, USA, November 2005
22 Gay, D., Levis, P., Behren, R., Welsh, M., Brewer, E., Culler, D.: ‘The nesC language: a holistic approach to networked embedded systems’. Proc. ACM SIGPLAN 2003 Conf. Programming Language Design and Implementation, 2003, pp. 1–11
23 Levis, P., Lee, N., Welsh, M., Culler, D.: ‘TOSSIM: accurate and scalable simulation of entire TinyOS applications’. Proc. First Int. Conf. Embedded Networked Sensor Systems, Los Angeles, CA, USA, November 2003, pp. 126–137
24 Perla, E., O Cathain, A., Carbajo, R.S., Huggard, M., Mc Goldrick, C.: ‘PowerTOSSIM z: realistic energy modelling for wireless sensor network environments’. Proc. Third ACM Workshop Performance Monitoring and Measurement of Heterogeneous Wireless and Wired Networks, 2008, pp. 35–42
25 Lee, H.J., Cerpa, A., Levis, P.: ‘Improving wireless simulation through noise modelling’. Proc. Sixth Int. Conf. Information Processing in Sensor Networks, Cambridge, MA, USA, April 2007, pp. 21–30
26 Shnayder, V., Hempstead, M., Chen, B., Allen, G.W., Welsh, M.: ‘Simulating the power consumption of large-scale sensor network applications’. Proc. Second Int. Conf. Embedded Networked Sensor Systems, Baltimore, MD, USA, November 2004, pp. 188–200
