An Energy-Efficient Compressed Data-Stream Protocol
Published in IET Communications
Received on 1st February 2011
Revised on 8th August 2011
doi: 10.1049/iet-com.2011.0118
ISSN 1751-8628
Abstract: In this study, the authors present an energy-efficient data compression protocol for data collection in wireless sensor networks (WSNs). WSNs are essentially constrained by motes' limited battery power and network bandwidth. The authors focus on data compression algorithms and protocol development to effectively support data compression for data gathering in WSNs. Their design of the compressed data-stream protocol (CDP) is generic in the sense that other lossless or lossy compression algorithms can be easily 'plugged' into the proposed protocol system without any changes to the rest of CDP. This design is intended to support various WSN applications in which users may prefer a more specific compression algorithm, tailored to the characteristics of the sensing data in question, over the general default algorithm. CDP not only significantly reduces the energy consumption of data gathering in multi-hop WSNs, but also reduces sensor network traffic and thus helps avoid congestion. The proposed CDP is implemented on the TinyOS platform using the nesC programming language. To evaluate their work, the authors conduct simulations via TOSSIM and PowerTOSSIM-z with real-world sensor data. The results demonstrate the significance of CDP.
WSN deployment, because the sink only needs to maintain one simplest predictor for thousands of sensors in the sensornet. Otherwise, if individual sensors used their 'best' predictors, the sink would potentially have to maintain thousands of different predictors and would thus suffer from a scalability issue. Third, as our design of CDP is intended to be generic, so that it can be used as a tool in research as well as in applications, our initial selection of a predictor in the GPC only serves as a 'default' predictor; our design and implementation of CDP allow the default predictor to be easily replaced with any other predictor that may be better for a given application. Furthermore, as described in Section 3, users can even easily replace the entire GPC, the default data compression framework in CDP, with another compression mechanism.

With the same assumption of the residual distribution model used in LEC [11], we adopted the entropy encoder employed in LEC [11] in the realisation of GPC's predictive compression mode. (We note that no specific residual distribution model or entropy encoder is tied to the GPC framework; either can easily be replaced with other alternatives in CDP.) The adopted encoder is a modified version of the Exponential-Golomb code of order 0 [20]. Basically, the alphabet of residues is divided into groups to reduce the alphabet size. Thus, any residue ri is represented in two parts: the code of the group it belongs to and its index within that group. Based on the residual model and entropy encoder adopted, the compression radius R is simply selected as 2^(K-1) - 1, where K is the resolution of the A/D converters used in WSN motes. In the normal mode of the GPC, uncompressed raw samples are transmitted. The size of the coding table used in the GPC of our CDP implementation is just K + 1 entries, whereas S-LZW [13] uses significantly more memory space for its dictionary entries and mini-cache entries (e.g. MAX_DICT_ENTRIES being 512 and MINI-CACHE_ENTRIES being 32 [11, 13]). Table 1 gives the coding table when K = 14, where Si represents the residue group code for residue ri and ni indicates the number of bits of ri's index that follows Si. For example, if ri = 2 and rj = -2, then their codes will be 01110 and 01101, respectively. Note that the index for a negative ri is computed as 2^ni - 1 - |ri|. When Si = 111111111110, Si is no longer a group code of the compression mode of the GPC but the code flagging the normal mode of the GPC, and it is followed by an original raw sample.

Table 1 Coding table (K = 14) in GPC of CDP

ni   Si             ri
0    00             0
1    010            -1, +1
2    011            -3, -2, +2, +3
3    100            -7, ..., -4, +4, ..., +7
4    101            -15, ..., -8, +8, ..., +15
5    110            -31, ..., -16, +16, ..., +31
6    1110           -63, ..., -32, +32, ..., +63
7    11110          -127, ..., -64, +64, ..., +127
8    111110         -255, ..., -128, +128, ..., +255
9    1111110        -511, ..., -256, +256, ..., +511
10   11111110       -1023, ..., -512, +512, ..., +1023
11   111111110      -2047, ..., -1024, +1024, ..., +2047
12   1111111110     -4095, ..., -2048, +2048, ..., +4095
13   11111111110    -8191, ..., -4096, +4096, ..., +8191
14   111111111110   original raw sample

3 CDP protocol design

In designing CDP we attempt to achieve two specific goals. The first goal is to minimise the packet overhead, so that the protocol overhead does not negate the benefits of the
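As an aside, the grouped entropy coding described in Section 2 (a group code Si followed by the ni-bit index of the residue within its group, with a negative residue indexed as 2^ni - 1 - |ri|) can be sketched as follows. This is an illustrative Python sketch of the table-driven encoder for K = 14, not the authors' nesC implementation; the names S and encode_residue are ours.

```python
# Illustrative sketch (not the authors' nesC code) of the grouped
# Exp-Golomb-style residue encoder of Table 1, assuming K = 14.

# Group codes Si from Table 1: the whole coding table is K + 1 entries.
S = ["00", "010", "011", "100", "101", "110", "1110", "11110",
     "111110", "1111110", "11111110", "111111110", "1111111110",
     "11111111110", "111111111110"]

def encode_residue(r):
    """Return the bit string for residue r, |r| <= 2**13 - 1 (K = 14)."""
    if r == 0:
        return S[0]                      # group 0 carries no index bits
    ni = abs(r).bit_length()             # group number: 2**(ni-1) <= |r| < 2**ni
    if ni >= len(S) - 1:                 # outside compression radius 2**(K-1) - 1
        raise ValueError("residue outside compression radius; use normal mode")
    # A positive residue is indexed by its own binary value; a negative
    # residue by 2**ni - 1 - |r| (Section 2).
    index = r if r > 0 else (1 << ni) - 1 - (-r)
    return S[ni] + format(index, "0%db" % ni)
```

For ri = 2 and rj = -2 this reproduces the codes 01110 and 01101 quoted in the text; S[14] (111111111110) would be emitted on its own to flag the normal mode, followed by the raw K-bit sample.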
identify which stream a data packet belongs to, as each mote can support up to eight independent data streams in CDP. Stream id, compression algorithm (selected among all the implemented algorithms) and sensor list are specified in their corresponding fields in the CDP control packet. The sensor list for the created stream specifies the relative order of all sensors' data flows belonging to that stream, by their identifiers within the mote. The control packet structure is illustrated in Fig. 4. The general packet structure is given in Fig. 4a, whereas the control packets for the two streams illustrated in Fig. 2 are shown in Fig. 4b.

3.1.3 Data packets: Data packets in CDP are very simple. They simply contain the node id, stream id and the compressed data. Node id is, again, retrieved from lower-layer header information. The payload section consists of a cycle of one reading from each sensor, repeated until the packet is filled. As CTP does not guarantee delivery, it is important to ensure that a packet loss does not introduce any errors into subsequent packets received at the sink. We can simply use the GPC normal mode for resynchronisation at the beginning of each packet to achieve this. Fig. 5 shows the general structure of a CDP data packet.

Fig. 5 Data packet format. Similar to the CDP control packet, the dotted field of node ID is 'virtual', as it does not exist in the CDP data packet, to minimise the overhead; node ID is actually obtained from the lower level protocol

3.2 Modular design

In order for CDP to be useful as a tool for researchers to investigate and test compression algorithms, it is vital that we design CDP in such a way that the GPC may be easily replaced by another compression algorithm. This consideration led to our modularised design for CDP. The entire design of our CDP is broken down into three major components: network access, utility and compression. Fig. 6 shows the overall modular design of CDP.

3.2.1 Network access: This module provides the interface for initialising the lower level network, sending packets and receiving packets. This module maintains the separation of lower network services from the rest of CDP, so that the underlying network protocols may be changed based on the requirements of the application. Although CDP is a collection-based protocol, it is possible, with some logic in the network access module, to implement CDP on top of any protocol that provides a path from each mote to the sink. If CDP is being used on top of a primitive network stack, additional logic may be added in this module to improve performance. The network access module additionally provides a platform for lower level protocol developers to easily test underneath CDP. This would allow lower level network protocols to be designed to cater to compressed data without actually implementing the compression itself.

3.2.2 Utility modules: Three separate modules are designed to support the underlying structure of CDP. The packet formation module is responsible for building and reading packet headers. This module allows for modifying header structures, packet types and the method for actually building the header. The module specifically defines methods for building configuration and data headers as well as reading received headers. The stream operations module is responsible for everything related to streams. It provides methods for building and sending streams as well as passing stream data to other modules that may require these data. These packet formation and stream operations modules define the basic operation of CDP. Modifying these modules will, obviously, change the fundamental way that CDP operates. The node module simply provides an easy-to-use interface to applications using CDP, by abstracting away design details.

3.2.3 Compression modules: This major component consists of the group of modules including compression, predictive coding, predictor and entropy coder (see Fig. 6). This group of modules allows new compression algorithms to be implemented and plugged into CDP with minimal modification of the corresponding module(s) in this group. Furthermore, the design of GPC allows for changing the implementations of predictors and/or entropy coding tables with little or no code change to the module(s). The compression module passes the sensor data off to the algorithm selected for a stream. With a standardised interface for compression algorithms, this module allows a new compression algorithm to be merged with CDP by simply implementing the algorithm in a new module that conforms to the interface and modifying the compression module to point a specific algorithm id value to that module.

GPC consists of three modules: predictive coding, predictor and entropy coder. The predictive coding module implements the GPC framework and should not be modified if the default GPC is used. It supports defining a compression radius and error bound, as well as choosing and executing sync operations and lossless/lossy compression. The predictor module supplies a module with a well-defined interface for predictors. The coding module does the same as the predictor module, for entropy coding. This allows new predictors and/or entropy coding techniques to be implemented to match our interfaces and simply switched in and out, for easy testing as well as for customised GPC algorithms that optimally fit the sensor data characteristics of given tasks.

4 Protocol implementation

For our reference implementation, we adopted TinyOS 2.1 [21] as the underlying platform, because TinyOS is an open source operating system for WSNs developed in the nesC programming language [22] and is widely used both in the research community and in real-world WSN applications. As CDP is intended, once fully tested, to be useful for real-world WSN applications and the research community, the combination of TinyOS' wide use and open source nature makes it an ideal underlying platform for our CDP development. Also, the nesC programming language
5 Simulations and analyses

We have performed simulations to thoroughly evaluate CDP. Section 5.1 briefly describes the two simulators used in our experiments, TOSSIM [23] and PowerTOSSIM-z [24]. Section 5.2 describes the simulation setup in detail,

In our simulations we generated a sensornet topology using the topology generator included in the TinyOS package. We generated a WSN with a total of 33 nodes (i.e. one data sink and 32 motes) randomly distributed over a 100 m by 100 m area, as shown in Fig. 9. We used the first 5000 lines of the Meyer Heavy noise trace for our simulations. By reducing the
Fig. 13 Comparisons of radio energy usage in the sensornet

6 Conclusions

In this paper, we have presented the design and implementation of CDP, an energy-efficient data