
2017 21st International Conference on Control Systems and Computer Science
DOI 10.1109/CSCS.2017.35

Use of Acoustic and Vibration Sensor Data to Detect Objects in Surveillance Wireless Sensor Networks

Selver Ezgi Küçükbay∗†, Mustafa Sert† and Adnan Yazıcı∗

∗Middle East Technical University, Department of Computer Engineering
†Başkent University, Department of Computer Engineering
Ankara, Turkey

Abstract—Nowadays, people use stealth sensors to detect intruders because of their low power consumption and wide coverage. It is very important to use lightweight sensors for detecting real-time events and taking actions accordingly. In this paper, we focus on the design and implementation of a wireless surveillance sensor network with acoustic and seismic vibration sensors to detect objects and/or events for area security in real time. To this end, we introduce a new environmental-sensing based system for event triggering and action. In our system, we first design appropriate hardware as part of a multimedia surveillance sensor node and use a proper classification technique to classify the acoustic and vibration data collected by the sensors in real time. According to the type of acoustic data, the proposed system triggers a camera event as an action for detecting an intruder (human or vehicle). We use the Mel Frequency Cepstral Coefficients (MFCC) feature extraction method for acoustic sounds and Support Vector Machines (SVM) as the classification method for both acoustic and vibration data. We have also run experiments to test the performance of our classification approach and show that the proposed approach is efficient enough to be used in real life.

Keywords—wireless sensor network, Raspberry Pi, acoustic sensor, vibration sensor, MFCC, SVM, classification.

I. INTRODUCTION

The Internet of Things (IoT) is a new trend in computer science research. According to Gartner¹, IoT was one of the popular concepts in 2015. For many people, IoT will be the most important development affecting our lives in the coming years. The concept can be used in many areas and applications such as smart environments, security, health, industrial control, and home automation, and there are already examples that focus on specific types of applications with IoT data [1]–[3]. Sensor technologies are one of the components of IoT that enable ubiquitous computing, a computing concept in which computation happens in the background and is hardly noticeable. The integrated sensor-internet framework will shape the smart environment. For smart environment applications, we can deploy sensors, collect data from the environment and, based on these real-world data, analyze and classify the types of sources that the sensors observe. Typical sources in an environment are humans, vehicles and animals. For these kinds of objects, seismic and acoustic sensors are widely used because they are stealthy, they do not need line of sight and they do not need high power. Especially when compared with cameras or radars, these sensors also offer panoramic coverage.

¹ http://www.gartner.com/newsroom/id/3412017

In the existing literature, there are many applications of acoustic and seismic sensor classification. In [4], the study is about human/animal and light-vehicle classification in signals from orthogonal detectors in a sensor node, using seismic, ultrasonic microphone, ultrasonic Doppler sonar and radar sensors. Human/horse signals are separated using cadence frequencies, and humans and light vehicles are separated using a rhythm algorithm. In [5], the author presents experiences and elementary evaluations made with acoustic and seismic signals of heavy military land vehicles, using different sensor types such as microphones and geophones. The vehicle types are trucks, medium and heavy road tractors, armored personnel carriers and main battle tanks. According to the evaluations, acoustic and seismic sensors are useful for detecting heavy military vehicles. In [6], the authors aim to accurately localize a seismic event using seismic sensors that capture ground vibrations generated by moving vehicles. They present a novel robust time-frequency approach in which a tri-axial geophone is used as the seismic sensor. Their algorithms are tested on a total of 44 events generated under different conditions, and the results are given in terms of positive predictive value and F-measure. In [7], the authors present a system that detects and classifies vehicles by measuring road-pavement vibration; they describe their data collection and classification stages and discuss their results. A CEF C3M01 vibration sensor is used, signals are processed with the Modular Audio Recognition Framework in Java, and several machine learning algorithms such as Naive Bayes, RBF networks, SVM and multiclass MLP are applied. The reported performance is between 94-100% for detection and 43-86% for classification. In [8], the authors propose a new footstep detection technique for data acquired with geophone sensors, where three sensors are collocated orthogonally within a single casing. Their algorithm is effective and reduces the computational complexity of both footstep detection and signal denoising. In [9], the authors focus on the hardware and software implementation of DSP algorithms for footstep detection using seismic sensors. They present their hardware design, the data acquisition process using a geophone and the algorithm used for human footstep detection, applying kurtosis and cadence calculations to determine whether a seismic signal is a human footstep or not. Their experiments show that geophones are a good solution for footstep detection due to their simplicity and sensitivity. In [10], the authors present a fusion framework using video and acoustic sensors for vehicle detection. They use acoustic data to estimate the target direction-of-arrival with Markov Chain Monte Carlo techniques, and they also propose a novel fusion approach for vehicle tracking, presenting experimental results for both real and synthetic data.

In this paper, we design and implement a wireless surveillance sensor network using acoustic and vibration sensors. We describe the hardware design, the real-time data collection and the classification process. We use the acoustic sensor to gather the sounds of objects and classify them as human, vehicle or animal, and we use the vibration sensor to collect the different vibration levels of objects and classify them as group of humans, human or vehicle. The contributions of this paper are as follows: we first collect real-time acoustic and vibration data; in the sensor node, we determine the type of the detected object as animal, human or vehicle; finally, according to the class, the system triggers the camera event as an action, activating or deactivating the camera in real time.

The paper is organized as follows. In Section 2, the materials and methods are described. Experimental evaluations and results are given in Section 3. Conclusions and future research are presented in Section 4.

II. MATERIALS AND METHODS

Fig. 1: General design of the system

Our general system is given in Fig. 1. The main steps of the system are hardware design, data collection, acoustic feature extraction, classification and event triggering. First, we construct a hardware design for both the acoustic and vibration sensors in order to gather sound and vibration data from the environment. We collect data with this hardware and a Raspberry Pi² in real time. For the acoustic sensor, we extract MFCC features and classify the sound events as animal, human or vehicle; if we determine the type of a sound (animal, human or vehicle), we activate or deactivate the camera to start or stop monitoring. For the vibration sensor, the values collected from the sensor are used directly as features, and based on these vibration values the sources are classified as human, vehicle or group of humans.

² https://www.raspberrypi.org

A. Hardware Design

Fig. 2: Acoustic and vibration sensors hardware design

The hardware system includes an acoustic sensor, a vibration sensor, analog-to-digital converters and a Raspberry Pi with an SPI interface. The hardware design of the acoustic and vibration sensors is depicted in Figure 2, and the algorithm for data collection in the sensor node is given in Algorithm 1 (Fig. 3).

Since the Raspberry Pi accepts only digital signals and the sensors we use produce analog output, we need to convert the analog signals to digital in order to obtain digital recordings.

Fig. 3: Algorithm for data collection in the sensor

Fig. 4: Pinout diagram of sensors

To this end, we design MCP3201- and MCP3008-based hardware to convert the analog signals. The MCP3008 and MCP3201 pinout diagrams are given in Figure 4. We can wire these converters to the Raspberry Pi using the required pins on the converters.

For the acoustic sensor, we used the MCP3201 analog-to-digital converter. The VDD (power) and VSS (digital ground) pins are used to power the MCP3201. For transferring data from the analog-to-digital converter to the Raspberry Pi, the Dout (data out), CLK (clock), Din (data in) and CS (chip select) pins are needed. Wiring the acoustic sensor to the MCP3201 and wiring the MCP3201 to the Raspberry Pi is done by following the technical datasheet of the converter [11]. The MCP3201 and Raspberry Pi wiring diagram is given in detail in Figure 5.

Fig. 5: MCP3201 wiring diagram with Raspberry Pi

For the vibration sensor, we used the MCP3008 analog-to-digital converter. The VDD (power) and DGND (digital ground) pins are used to power the MCP3008. The Dout (data out), CLK (clock), Din (data in) and CS (chip select) pins are used for sending data from the sensor to the MCP3008 and on to the Raspberry Pi. Wiring the vibration sensor to the MCP3008 and wiring the MCP3008 to the Raspberry Pi is done by following the technical datasheet of the converter [12]. The MCP3008 and Raspberry Pi wiring diagram is given in detail in Figure 6.

Fig. 6: MCP3008 wiring diagram with Raspberry Pi
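As an illustration of the data-collection algorithm (Fig. 3) on this hardware, the following Python sketch reads both converters over SPI using the spidev package. The bus/chip-select assignments, the sampling period and the helper names are our assumptions rather than details taken from the paper, and the bit unpacking should be verified against the MCP3201 and MCP3008 datasheets [11], [12].

```python
# Sketch of the sensor-node sampling loop (Fig. 3), assuming the Python
# spidev package on the Raspberry Pi and the wiring of Figures 5 and 6.
# CE0/CE1 assignments and the sampling period are assumptions.
import time
import spidev

spi_acoustic = spidev.SpiDev()
spi_acoustic.open(0, 0)            # SPI bus 0, CE0 -> MCP3201 (acoustic)
spi_vibration = spidev.SpiDev()
spi_vibration.open(0, 1)           # SPI bus 0, CE1 -> MCP3008 (vibration)
for spi in (spi_acoustic, spi_vibration):
    spi.max_speed_hz = 1_000_000

def read_mcp3201(spi):
    """Read one 12-bit sample from the MCP3201 (single channel)."""
    raw = spi.xfer2([0x00, 0x00])
    return ((raw[0] & 0x1F) << 7) | (raw[1] >> 1)

def read_mcp3008(spi, channel=0):
    """Read one 10-bit sample from the selected MCP3008 channel (0-7)."""
    raw = spi.xfer2([0x01, (0x08 + channel) << 4, 0x00])
    return ((raw[1] & 0x03) << 8) | raw[2]

if __name__ == "__main__":
    while True:                     # continuous data collection in the node
        acoustic_sample = read_mcp3201(spi_acoustic)
        vibration_sample = read_mcp3008(spi_vibration, channel=0)
        print(acoustic_sample, vibration_sample)
        time.sleep(0.001)
```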
B. Data Collection & Feature Extraction

We collect data from the real world using the acoustic and vibration sensors, gathered through the MCP3201 and MCP3008 with the hardware design described above. While collecting acoustic data from the acoustic sensor, we record various types of human activity: speech of a woman and a man separately, laughter, footsteps of a single person, and crowd sounds from different locations such as a restaurant or a public park. We also record different animal sounds such as bird and cow, and different vehicle sounds such as motorcycle and bus. After collecting the data, we extract features and classify the samples. For the acoustic sensor, the classes are determined as animal, human and vehicle. We use the Mel Frequency Cepstral Coefficients (MFCC) technique to extract features from the sound because of its success in speech recognition [13]. The MFCC feature is computed on the basis of the fast Fourier transform and is known as a short-term spectral feature; it is close to human auditory perception. In MFCC extraction, the audio signal is processed as frames rather than as a whole. There are many different methods for windowing the audio signal; in this study, we use overlapping Hamming windows of size 30 ms with 10 ms overlaps, and we use 13-coefficient MFCC vectors. The number of samples gathered from the acoustic sensor is given in Table I.

TABLE I: Acoustic dataset information

Class      Number of Samples
Animal     265
Human      240
Vehicle    219
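For concreteness, the MFCC front end described above can be sketched as follows. The paper does not name its MFCC implementation, so the python_speech_features package, mono WAV input and the per-clip averaging are assumptions; "30 ms windows with 10 ms overlaps" is read here as a 30 ms window with a 20 ms step (use winstep=0.01 if a 10 ms hop was intended).

```python
# A minimal sketch of the MFCC front end, using the python_speech_features
# package (the paper does not name its implementation).
import numpy as np
import scipy.io.wavfile as wav
from python_speech_features import mfcc

def extract_mfcc(wav_path, winlen=0.030, winstep=0.020, numcep=13):
    """Return a (num_frames, 13) MFCC matrix for one recording."""
    rate, signal = wav.read(wav_path)     # assumed mono recording
    nfft = 1
    while nfft < int(rate * winlen):      # FFT size >= window length
        nfft *= 2
    return mfcc(signal, samplerate=rate, winlen=winlen, winstep=winstep,
                numcep=numcep, nfft=nfft, winfunc=np.hamming)

def clip_descriptor(wav_path):
    """One fixed-length descriptor per clip (mean over frames); the paper
    does not state how frame-level MFCCs are pooled before the SVM."""
    return extract_mfcc(wav_path).mean(axis=0)
```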

For the vibration sensor, we placed the sensor on the loophole at the entry of a garage to capture the vibrations while collecting the data. We set up the node and waited to record the vibrations of the different classes, which are determined as group of humans, human and vehicle. We save the output values while different types of sources cross the garage entry. Due to the characteristics of the sensor, it gives a low output voltage when vibration is detected and a high output voltage when there is no vibration; the sensitivity can be adjusted with the potentiometer located on top of the sensor. The vibration sensor outputs an analog signal between 0 and 5 V, and the MCP3008 analog-to-digital converter is 10-bit, so the converted values lie between 0 and 2^10 = 1024: if there is vibration the digital output is 0, and if there is none it is 1024. In the test environment we constructed for this study, a vehicle moving at 20 km/h produces readings between 650 and 695, depending on the size of the vehicle. For comparison, a human running through the loophole and a group of humans walking on it both produce outputs between 670 and 680, while a single human walking produces outputs between 750 and 800. The number of samples gathered from the sensor is given in Table II.

TABLE II: Vibration dataset information

Class              Number of Samples
Group of Human     117
Human              406
Vehicle            463
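The paper uses the converted ADC readings themselves as the vibration features but does not state how the raw stream is segmented before classification. The sketch below therefore makes two assumptions: readings are grouped into fixed-length, non-overlapping windows (the window size is arbitrary here), and a helper converts raw codes to approximate volts using the 0-5 V range and 10-bit resolution given above.

```python
# Hypothetical segmentation of the raw vibration stream into SVM feature
# vectors; window_size is an assumption, not a value from the paper.
import numpy as np

VREF = 5.0          # sensor output range reported above: 0-5 V
ADC_LEVELS = 1024   # 10-bit MCP3008

def to_volts(raw_code):
    """Convert a raw 10-bit ADC code to an approximate voltage."""
    return raw_code * VREF / (ADC_LEVELS - 1)

def windowed_features(raw_readings, window_size=256):
    """Split a stream of raw ADC readings into fixed-length feature vectors."""
    raw = np.asarray(raw_readings, dtype=float)
    n_windows = len(raw) // window_size
    return raw[:n_windows * window_size].reshape(n_windows, window_size)
```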
C. Classifier Design

We choose Support Vector Machines (SVM) as the classification method for both the acoustic and vibration data and use the LIBSVM library for the implementation [14]. An SVM classifier uses the margin to classify given instances: it creates a hyperplane and separates the samples with this hyperplane. Since SVM is a binary classifier and we have three classes for each sensor, we need to extend it to the multiclass problem. In our model, we use the one-versus-all (OVA) technique: we build one SVM model for each class, and each SVM is trained to detect the features of its particular class. According to the resulting probabilities, the best result is picked as the class label for the given test sound. When validating the classifier, we use 5-fold cross-validation.
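The classifier configuration described above can be sketched as follows. scikit-learn's SVC is backed by LIBSVM [14]; the RBF kernel, the default hyperparameters and the macro-averaged F1 scoring are our assumptions, since the paper does not report them.

```python
# A sketch of the one-versus-all SVM with 5-fold cross-validation.
# X: one feature vector per sample (e.g. the MFCC clip descriptors or the
# vibration windows sketched above); y: class labels.
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold, cross_val_score

def build_ova_svm():
    # One SVM per class (one-versus-all); probability estimates let us pick
    # the class with the highest score, as described in the text.
    return OneVsRestClassifier(SVC(kernel="rbf", probability=True))

def evaluate(X, y):
    """5-fold cross-validated macro F1, mirroring the validation protocol."""
    clf = build_ova_svm()
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    return cross_val_score(clf, X, y, cv=cv, scoring="f1_macro")

# Example with random placeholder data (not the paper's dataset):
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 13))
    y = rng.integers(0, 3, size=200)
    print(evaluate(X, y))
```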
D. Source Detection and Event Triggering

After completing the hardware design of the acoustic sensors, data is collected and classified, and the gathered acoustic sound is determined to be animal, human or vehicle. According to the type of the sound source, we trigger an event. We choose a rule-based method for activating the camera: we want to monitor and record the environment while a human or a vehicle is detected by the system. Designed this way, the system does not capture and store unnecessary situations and saves energy. Our triggering rules are given in Table III.

TABLE III: Rules for event triggering

Rule 1    IF (class == human) OR (class == vehicle) THEN (camera = activate)
Rule 2    IF (class == animal) THEN (camera = deactivate)
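The rules in Table III map directly onto a small dispatch routine; the camera-control calls below are placeholders, since the paper does not describe the camera interface it uses.

```python
# A minimal sketch of the rule-based triggering in Table III.
def apply_trigger_rules(predicted_class, camera):
    """Activate the camera for human/vehicle, deactivate it for animal."""
    if predicted_class in ("human", "vehicle"):   # Rule 1
        camera.start_recording()
    elif predicted_class == "animal":             # Rule 2
        camera.stop_recording()

class ConsoleCamera:
    """Stand-in camera used only to make the sketch runnable."""
    def start_recording(self):
        print("camera activated")
    def stop_recording(self):
        print("camera deactivated")

apply_trigger_rules("vehicle", ConsoleCamera())   # -> camera activated
```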

III. EXPERIMENTAL RESULTS

The results are presented in terms of precision, recall and F-score. The obtained results can be expressed with a confusion matrix; for our multiclass classification problem, a sample confusion matrix for the human sound class is presented in Table IV. According to the confusion matrix, predictions are labeled as True Positive (TP), True Negative (TN), False Positive (FP) or False Negative (FN).

TABLE IV: Example confusion matrix (rows: actual class, columns: predicted class; TP/FP/TN/FN labels are with respect to the human class)

                  Human       Animal      Vehicle
Human             209 (TP)    27 (FN)     4 (FN)
Animal            29 (FP)     229 (TN)    7 (TN)
Vehicle           5 (FP)      0 (TN)      214 (TN)
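The precision, recall and F-score values reported in the following subsections can be derived from a confusion matrix such as Table IV; the short sketch below shows the computation, using the Table IV counts as input.

```python
# Per-class precision, recall and F-score from a confusion matrix.
# The matrix copies Table IV (rows = actual, columns = predicted,
# class order: human, animal, vehicle).
import numpy as np

confusion = np.array([[209, 27,   4],    # actual human
                      [ 29, 229,  7],    # actual animal
                      [  5,   0, 214]])  # actual vehicle

def per_class_metrics(cm):
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp             # predicted as the class but wrong
    fn = cm.sum(axis=1) - tp             # actually the class but missed
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score

p, r, f = per_class_metrics(confusion)
# For the human class this gives precision 209/243 (about 0.86) and
# recall 209/240 (about 0.87); the figures below report the full results.
print(p, r, f)
```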
A. Acoustic Data Experiments

We collect the data from the acoustic sensor, extract the features with the methods specified above and classify them with the chosen classification technique. The performance results are given in Figure 7. According to the results, the system classifies the acoustic sounds clearly: human, animal and vehicle can be distinguished using the acoustic sensor with the specified feature extraction and classification methods. After we detect the object, we can trigger events according to it. Since the results are accurate enough to determine the type of sound, this event-triggering mechanism can be used in real life.

Fig. 7: Performance Results for Acoustic Sensor

B. Vibration Data Experiments

We collect the data from the vibration sensor, use the ADC-converted values of the analog signal as features and classify them with the same classification technique. We construct a test environment for this part of the study; the test environment is shown in Figure 8, and the experimental results are given in Figure 9. According to the results, the vibration information is not sufficient for separating the group-of-humans and vehicle classes: a group of humans and a car moving at 10 km/h create almost the same vibration levels, which decreases the detection accuracy of these two classes.

Fig. 8: Test Environment

Fig. 9: Performance Result for Vibration Sensor

IV. CONCLUSION

In this study, we design and implement a wireless surveillance sensor network for intrusion detection using acoustic and vibration sensors. According to our empirical results, the acoustic sensor can be used for classifying human, vehicle and animal sounds; it gives good results and distinguishes the classes clearly. For the vibration sensor, we construct a test environment in which the sensor is placed on the loophole at the entry of a garage. In this environment, a group of humans and a car moving at 10 km/h produce nearly the same signal outputs, so the classifier can easily be confused when samples belonging to these classes arrive. In this case, the vibration sensor is not sufficient to distinguish a group of humans from a vehicle; a different kind of sensor, such as a geophone, could be chosen to detect these classes. One extension of this study would be to apply data fusion techniques to the sensor data.

ACKNOWLEDGMENT

This project is supported by TUBITAK (Project Number: 114R082).

REFERENCES

[1] L. D. Xu, W. He, and S. Li, "Internet of things in industries: A survey," IEEE Transactions on Industrial Informatics, vol. 10, no. 4, pp. 2233-2243, Nov 2014.
[2] G. Kortuem, F. Kawsar, V. Sundramoorthy, and D. Fitton, "Smart objects as building blocks for the internet of things," IEEE Internet Computing, vol. 14, no. 1, pp. 44-51, Jan 2010.
[3] M. Nitti, L. Atzori, and I. P. Cvijikj, "Friendship selection in the social internet of things: Challenges and possible strategies," IEEE Internet of Things Journal, vol. 2, no. 3, pp. 240-247, June 2015.
[4] A. Ekimov, "Modulation method in human/animal and light vehicle separation in detected signals," in Second Annual Human and Light Vehicle Detection Workshop, 2011.
[5] J. Altmann, "Acoustic and seismic signals of heavy military vehicles for co-operative verification," Journal of Sound and Vibration, vol. 273, no. 4-5, pp. 713-740, 2004. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0022460X03009209
[6] R. Ghosh, A. Akula, S. Kumar, and H. Sardana, "Time-frequency analysis based robust vehicle detection using seismic sensor," Journal of Sound and Vibration, vol. 346, pp. 424-434, 2015. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0022460X15001455
[7] M. Stocker, P. Silvonen, M. Rönkkö, and M. Kolehmainen, "Detection and classification of vehicles by measurement of road-pavement vibration and by means of supervised machine learning," Journal of Intelligent Transportation Systems, vol. 20, no. 2, pp. 125-137, 2016.
[8] V. V. Reddy, V. Divya, A. W. H. Khong, and B. P. Ng, "Footstep detection and denoising using a single triaxial geophone," in 2010 IEEE Asia Pacific Conference on Circuits and Systems, Dec 2010, pp. 1171-1174.
[9] P. Anghelescu, G. V. Iana, and I. Tramandan, "Human footstep detection using seismic sensors," in 2015 7th International Conference on Electronics, Computers and Artificial Intelligence (ECAI), June 2015, pp. AE-1-AE-2.
[10] R. Chellappa, G. Qian, and Q. Zheng, "Vehicle detection and tracking using acoustic and video sensors," in 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 3, May 2004, pp. iii-793-6.
[11] MCP3201 2.7V 12-Bit A/D Converter with SPI Serial Interface, Microchip Technology Inc., 2007.
[12] MCP3004/MCP3008 2.7V 4-Channel/8-Channel 10-Bit A/D Converter with SPI Serial Interface, Microchip Technology Inc., 2008.

[13] L. Chen, S. Gunduz, and M. T. Ozsu, "Mixed type audio classification with support vector machine," in 2006 IEEE International Conference on Multimedia and Expo, July 2006, pp. 781-784.
[14] C.-C. Chang and C.-J. Lin, "LIBSVM: A library for support vector machines," ACM Trans. Intell. Syst. Technol., vol. 2, no. 3, pp. 27:1-27:27, May 2011.
