Smart Computing and Communication
LNCS 11910
Smart Computing
and Communication
4th International Conference, SmartCom 2019
Birmingham, UK, October 11–13, 2019
Lecture Notes in Computer Science 11910
Founding Editors
Gerhard Goos
Karlsruhe Institute of Technology, Karlsruhe, Germany
Juris Hartmanis
Cornell University, Ithaca, NY, USA
Meikang Qiu
Columbia University
New York, NY, USA
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
This volume contains the papers presented at SmartCom 2019: the 4th International
Conference on Smart Computing and Communication held during October 11–13,
2019, in Birmingham, UK.
There were 286 submissions. Each submission was reviewed by at least 3, and on average 3.5,
Program Committee members. The committee decided to accept 40
Recent booming developments in Web-based technologies and mobile applications
have facilitated a dramatic growth in the implementation of new techniques, such as
cloud computing, big data, pervasive computing, Internet of Things, and social
cyber-physical systems. Enabling a smart life has become a popular research topic with
an urgent demand. Therefore, SmartCom 2019 focused on both smart computing and
communications fields and aimed to collect recent academic work to improve the
research and practical application in the field.
The scope of SmartCom 2019 was broad, from smart data to smart communications,
from smart cloud computing to smart security. The conference gathered all high-quality
research/industrial papers related to smart computing and communications and aimed at
proposing a reference guideline for further research. SmartCom 2019 was held at
Birmingham City University in the UK and its conference proceedings publisher is
SmartCom 2019 continued the series of successful academic get-togethers,
following SmartCom 2018 (Tokyo, Japan), SmartCom 2017 (Shenzhen, China), and
SmartCom 2016 (Shenzhen, China).
We would like to thank the conference sponsors: Springer LNCS, Birmingham City
University, Columbia University, Beijing Institute of Technology, The Alliance of
Emerging Engineering Education for Information Technologies, China Computer
Federation, North America Chinese Talents Association, and Longxiang High Tech
Group Inc.
General Chairs
Meikang Qiu Columbia University, USA
Mark Sharma Birmingham City University, UK
Program Chairs
Bhavani Thuraisingham The University of Texas at Dallas, USA
Zhongming Fei University of Kentucky, USA
Keke Gai Beijing Institute of Technology, China
Local Chairs
Yonghao Wang Birmingham City University, UK
Xiangyu Gao New York University, USA
Publicity Chairs
Yuanchao Shu Microsoft Research Asia, China
Han Qiu Télécom ParisTech, France
Zhenyu Guan Beihang University, China
Technical Committee
Jeremy Foss Birmingham City University, UK
Cham Athwal Birmingham City University, UK
Andrew Aftelak Birmingham City University, UK
Yue Hu Louisiana State University, USA
Aniello Castiglione University of Salerno, Italy
Maribel Fernandez King’s College, University of London, UK
Hao Hu Nanjing University, China
Oluwaseyi Oginni Birmingham City University, UK
Alan Dolhasz Birmingham City University, UK
Qianyun Zhang Beihang University, China
Dawei Li Beihang University, China
Bo Du ZF Friedrichshafen AG, Germany
Cefang Guo Imperial College London, UK
Thomas Austin San Jose State University, USA
Zhiyuan Tan Edinburgh Napier University, UK
A Smart Roll Wear Check Scheme for Ensuring the Rolling Quality of Steel Plates
Kui Zhang, Xiaohu Zhou, Heng He, Yonghao Wang, Weihao Wang, and Huajian Li
Do Top Social Apps Effect Voice Call? Evidence from Propensity Score Matching Methods
Hao Jiang, Min Lin, Bingqing Liu, Huifang Liu, Yuanyuan Zeng, He Nai, Xiaoli Zhang, Xianlong Zhao, Wen Du, and Haining Ye
A Space Dynamic Discovery Scheme for Crowd Flow of Urban City
Zhaojun Wang, Hao Jiang, Xiaoyue Zhao, Yuanyuan Zeng, Yi Zhang, and Wen Du
Analysis and Prediction of Commercial Big Data Based on WIFI Probe
Xiao Zeng, Hong Guo, and Zhe Liu
Machine Learning for Cancer Subtype Prediction with FSA Method
Yan Liu, Xu-Dong Wang, Meikang Qiu, and Hui Zhao
Abstract. With the rapid development of deep learning in the fields of text
abstraction and dialogue generation, researchers are now reconsidering the long-
standing story-generation task from the 1970s. Deep learning methods are grad-
ually being adopted to solve problems in traditional story generation, making story
generation a new research hotspot in the field of text generation. However, in the
field of story generation, the widely used seq2seq model is unable to provide
adequate long-distance text modeling. As a result, this model struggles to solve the
story-generation task, since the relation between long text should be considered,
and coherency and vividness are critical. Thus, recent years have seen numerous
proposals for better modeling methods. In this paper, we present the results of a
comprehensive study of story generation. We first introduce the relevant concepts
of story generation, its background, and the current state of research. We then
summarize and analyze the standard methods of story generation. Based on var-
ious divisions of user constraints, the story-generation methods are divided into
three categories: the theme-oriented model, the storyline-oriented model, and the
human-machine-interaction–oriented model. On this basis, we discuss the basic
ideas and main concerns of various methods and compare the strengths and
weaknesses of each method. Finally, we finish by analyzing and forecasting future
developments that could push story-generation research toward a new frontier.
1 Introduction
A story is a series of real or fictional events that can be manifested in various forms,
such as pictures, videos, or texts. Story generation refers to the search for story texts
that meet the user constraints in an infinite text space. Probabilistic models for story
generation are probability distributions that describe a stochastic process for generating
a story. Research into story generation can be applied in the fields of education,
entertainment, and transaction processing, such as providing writing suggestions for
writers or providing a reasonable description of events for a group of pictures. The
research can also be applied in the fields of virtual society and intelligent simulation,
such as for automated generation of storylines in games.
Traditional automated story generation can be traced back to Meehan et al., who
used a TALE-SPIN system to automatically generate stories in the 1970s [1]. Early
attempts in this field relied on intelligent planning, whose main idea was to transform
story generation into a classical planning problem [2] in which the beginning of the
story corresponds to the initial state of the plan, and the author’s goal corresponds to
the goal state of the plan. The actions taken by the main characters in the story
correspond to the actions in the plan. The Interactive Storytelling System, developed by
Cavazza at the IVE Research Lab of Teesside University in the UK, is the representative
work of this approach [3]. Pizzi and Charles used heuristic search planning to generate
stories and developed the EmoEmma system [4]. Lebowitz [5], Turner [6], Bringsjord
and Ferrucci [7], Perez and Sharples [8], and Riedl and Young [9] also used story-generation
methods based on intelligent planning. Early attempts in this field also
included case-based reasoning (Gervas et al. [10]) or generalizing knowledge from
existing stories to assemble new stories (Swanson and Gordon [11]).
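To make the planning formulation concrete, the following minimal sketch treats a story as a path from an initial state to the author's goal state, found here by breadth-first search. The toy facts, the action names, and the helper `plan_story` are hypothetical illustrations and are not taken from TALE-SPIN or any other cited system.

```python
from collections import deque

# Hypothetical toy domain: action -> (preconditions, facts added, facts removed).
ACTIONS = {
    "knight travels to cave": ({"knight at castle"}, {"knight at cave"}, {"knight at castle"}),
    "knight slays dragon": ({"knight at cave", "dragon alive"}, {"dragon dead"}, {"dragon alive"}),
    "knight returns home": ({"knight at cave", "dragon dead"}, {"knight at castle"}, {"knight at cave"}),
}

def plan_story(initial, goal):
    """Breadth-first search from the initial state to a state containing all goal facts."""
    start = frozenset(initial)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, events = queue.popleft()
        if goal <= state:                      # author's goal reached
            return events
        for name, (pre, add, delete) in ACTIONS.items():
            if pre <= state:                   # action is applicable in this state
                nxt = frozenset((state - delete) | add)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, events + [name]))
    return None                                # no story satisfies the goal

story = plan_story({"knight at castle", "dragon alive"},
                   {"dragon dead", "knight at castle"})
```

The returned list of actions plays the role of the story's event sequence; a real system would then render each event as text.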
Traditional story-generation methods are knowledge intensive and depend on an a
priori domain model defined for the fictional world, including executable roles, loca-
tions, and actions [12]. Complex system design and domain knowledge are required, so
traditional story-generation methods have some drawbacks, such as high labor costs
and limited domains. Deep learning can mitigate these shortcomings because it learns from
story corpora rather than from hand-built domain models, and its rapid development has led
to significant improvements in story-generation methods.
At present, the seq2seq model [13] is the basic model for story generation. It offers
substantial advantages for machine translation, automated summarization, and other
tasks, and the proposed attention model [14] has developed it further. However, dif-
ficulties remain in the application of story generation. First, producing coherent stories
remains a challenge. Martin [15], Yao [17], Xu [21], Fan [20], and others have pointed
out this problem and have used different models to solve it. Second is the problem of
repeated generation. In the field of dialogue, Li pointed out that the seq2seq model
tends to produce duplicate dialogue [16]. Accordingly, Yao et al. [17] argued that the
seq2seq model had the same problem in story generation, so they proceeded to opti-
mize it.
parts: (1) whether the constraints are static, that is, whether human-computer
interaction takes place during generation; if it does, the user constraints C(u)
change dynamically as a function of human participation in the story-generation
process; and (2) the strength of the user constraints, where the important measure is
whether the user constraints C(u) contain a complete storyline.
Theme-Oriented Models. When the user constraint C(u) is static; that is, no human-
computer interaction exists in the process of story-generation, and the constraint (e.g., a
theme word, a sentence, or some hints) is theme oriented and does not contain a
complete storyline, PD is reduced to a theme-oriented model.
Storyline-Oriented Models. When the user constraint C(u) is static and contains
complete story plots, such as a set of pictures that have been given about the devel-
opment of a storyline, an abstract description of the development of a storyline, or a
specific story that needs an ending, PD is reduced to a storyline-oriented model.
Human-Machine Interaction-Oriented Models. When the user constraint C(u) is
dynamic, i.e., the user constraint varies with human-computer interaction, PD is reduced to
a human-machine-interaction–oriented model.
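The three categories follow from two properties of the user constraint C(u), so the taxonomy can be sketched as a small decision rule. The class `UserConstraint` and the function `categorize` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass

@dataclass
class UserConstraint:
    is_dynamic: bool              # True if C(u) changes through human-computer interaction
    has_complete_storyline: bool  # True if C(u) already contains a complete storyline

def categorize(c: UserConstraint) -> str:
    """Map the two properties of C(u) to one of the three model categories."""
    if c.is_dynamic:
        return "human-machine-interaction-oriented"
    if c.has_complete_storyline:
        return "storyline-oriented"
    return "theme-oriented"

# Example: a single theme word with no interaction.
theme_word = categorize(UserConstraint(is_dynamic=False, has_complete_storyline=False))
```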
We divide the related work into theme-oriented models, storyline-oriented models,
and human-machine-interaction–oriented models from the perspective of user con-
straints. Next, we compare and analyze the advantages and disadvantages of these three
show that the hierarchical neural story-generation model achieves more coherent stories
than the seq2seq model through hierarchical structure and the fusion mechanism.
Skeleton-Based Model. The user constraint C(u) of the skeleton-based model [21] is
the beginning of the story (i.e., a sentence that expresses the theme of the story).
Sentences written by humans are closely linked, and the whole story is coherent and
fluent. This is because humans advance the storylines and reorganize them into fluent
sentences. The skeleton-based model is inspired by the fact that the connection between
sentences is mainly reflected by key phrases, such as predicate, subject, object, etc.
Other words (such as modifiers) are not only redundant for understanding semantic
dependencies but also make the dependencies sparse. Therefore, driven by the patterns of
human writing, the skeleton-based model takes the phrases that express the critical
meaning of sentences in stories as the skeleton and uses them to promote the coherence
of story generation. Unlike the traditional approach of
generating complete sentences at one time, the skeleton-based model first generates the
most critical phrases, which are called the “skeleton.” The skeleton is then extended into
complete and fluent sentences. In addition, the skeleton in the model is not defined
manually but learned through reinforcement learning, which can make the generated story
more coherent and consistent with the theme.
Planning-Based Model. The user constraint C(u) of the planning-based model [17] is
a word (i.e., a word that expresses the theme of the story). According to Wang et al.
[23] in the field of poetry creation and Mou et al. [24] in the field of dialogue,
keywords are used as the main points to guide the generation. Inspired by the planning
method applied in the field of dialogue [22] and narrative [9], the planning-based model
realizes that story generation can be decomposed into two steps: the generation of the
story plot and the generation of the story text based on the plot. Therefore, it proposes a
planning-based story-generation framework that combines plot planning and story text
generation to generate stories from titles. Obviously, this is similar to the skeleton-
based model.
Multi-level Model. The user constraint C(u) of the multilevel model [25] is a prompt.
Although existing language models can generate stories with good local coherence,
they have difficulty merging phrases into coherent plots, or even maintaining the
consistency of roles throughout the story. One reason for this failure is that the language
model generates the whole story at the word level, which makes it difficult for
the model to capture storyline interactions above the word level. To solve this problem,
Fan et al. decompose the problem into a series of easier sub-problems, from coarse grained
to fine grained. This decomposition gives three advantages: first, more abstract
representations can be generated to solve the problem of modeling long-term dependencies;
second, different models can solve different sub-problems in a more targeted way;
and third, data need not be labeled manually.
Upon analyzing these theme-oriented models, we find that they all use the idea of
hierarchical structure. Whether planning, keywords, or action sequence, they all model
the relationship of the storyline at a higher level than words and establish long-distance
dependencies so that they get better results than with the seq2seq model in terms of
story coherence and consistency.
A Survey of Deep Learning Applied to Story Generation 5
Comparative Analysis. Because of the inconsistency of datasets used for story gen-
eration and the lack of effective automated evaluation metrics, it is difficult to strictly
compare the advantages and disadvantages of each model. In this paper, we sort some
recent models and analyze from the perspective of whether the model focuses on
coherence and consistency, vividness, and common sense or logical reasoning, as
indicated in Table 1. The information given in Table 1 leads to the following
Type                                         Reference             Coherence and   Vividness   Common sense or
                                                                   consistency                 logical reasoning
Theme-oriented models                        Fan et al. [20]       √               -           -
                                             Xu et al. [21]        √               -           -
                                             Yao et al. [17]       √               -           -
                                             Fan et al. [25]       √               √           -
Storyline-oriented models                    Wang et al. [27]      √               √           -
                                             Huang et al. [29]     √               √           -
                                             Jain et al. [34]      √               √           -
                                             Guan et al. [18]      √               -           √
Human-machine-interaction–oriented models    Clark et al. [36]     √               √           -
                                             Goldfarb et al. [38]  √               √           -
(1) All models are concerned about coherence and consistency, because it is the basis
of story generation. Only when it relates to the theme and the story is coherent can
we explore the vividness and logical reasoning of the story on this basis.
(2) Only Guan et al. [18] explored common-sense judgment and logical
reasoning, in the context of story endings. Little research is
available on this aspect at present, which makes it a direction worthy of study.
(3) Theme-oriented models focus mostly on coherence and consistency. Our analysis
suggests that the theme-oriented model has difficulty producing a consistent and
coherent story with the theme because of the lack of prior story planning. On this
basis, adding vividness, common sense judgment, and logical reasoning will
increase the complexity of the model.
Generally speaking, the theme-oriented model uses the idea of the hierarchical structure
model to further guide the story generation by first generating the context or keywords
of the storyline, thus increasing the long-distance dependence and making the story
more coherent. Because the storyline-oriented model gives the context of the devel-
opment of storyline, the model can focus on the vividness or logical reasoning of the
story on the basis of ensuring coherence. The human-machine-interaction–oriented
model is more concerned about the actual application scenarios and usage, and
increasing interaction with people is its first consideration. Our analysis indicates that story
generation needs to unify multiple dimensions, including consistency, coherence,
vividness, and reasonability. Modeling the unification of
these dimensions remains a difficult problem.
The seq2seq model has made great progress in abstract text summarization, dialogue
generation, and other fields but is unable to provide sufficient long-distance text
modeling. Thus, it has difficulty solving problems for story generation, which requires
long, coherent, and vivid stories. Specifically, the following four problems are
(1) Consistency between story and theme. Seq2seq models that generate stories often
degenerate into language models. The story lacks a central idea and loses its
consistency with the theme.
(2) Coherence of story. The stories generated from the seq2seq model often have no
logical relationship, which means that the word order is chaotic and the reader
does not know what is being expressed.
(3) Diversity and vividness of story. The story from the seq2seq model is not suffi-
ciently novel because it tends to choose words with high frequency and high
probability from the word probability distribution, rather than words with larger
and more vivid meaning.
(4) Common sense or logical reasoning of storyline. Seq2seq models do not have
common sense judgment and logical reasoning.
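Problem (3) can be illustrated with a toy next-word distribution. The sketch below contrasts greedy decoding, which always returns the most probable word, with temperature sampling, which occasionally picks rarer words. The vocabulary and scores are invented for illustration, and temperature sampling is shown only as a common point of comparison, not as a remedy proposed in the surveyed work.

```python
import math
import random

VOCAB = ["said", "walked", "whispered", "shimmered"]  # hypothetical vocabulary
LOGITS = [3.0, 2.0, 0.5, 0.1]                          # hypothetical model scores

def softmax(logits, temperature=1.0):
    """Convert scores to probabilities; a higher temperature flattens the distribution."""
    m = max(logits)
    exps = [math.exp((x - m) / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_word(logits):
    """Always pick the single most probable word."""
    return VOCAB[max(range(len(logits)), key=lambda i: logits[i])]

def sampled_word(logits, temperature, rng):
    """Draw a word at random according to the tempered distribution."""
    return rng.choices(VOCAB, weights=softmax(logits, temperature), k=1)[0]

rng = random.Random(0)
greedy_words = [greedy_word(LOGITS) for _ in range(5)]          # the same word every time
sampled_words = [sampled_word(LOGITS, 1.5, rng) for _ in range(5)]
```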
8 C. Hou et al.
5 Conclusion
Research into automated story generation has been very active in recent years.
According to the user constraint C(u) in the model PD(u | C(u)), we divide models into
theme-oriented models, storyline-oriented models, and human-machine-interaction–
oriented models. Through comparative analysis, we obtain the focus of each model.
Essentially, the behavior of story generation would be better described as a planning
process rather than a process of sampling conditioned on observations. The theme-
oriented model uses the hierarchical structure to describe the planning process, which
allows it to generate coherent stories. The storyline-oriented model complements the
theme-oriented model and adds vividness. Finally, the human-machine-interaction–oriented
model focuses more on the interaction with people and on its application in real-world
scenarios. This paper summarizes and analyzes the latest methods to generate
stories and highlights some promising directions for future research in the hopes of
aiding researchers in related fields.
Acknowledgment. This work is supported by the National Key Research and Development
Program of China (No. 2017YFB1400805).
References
1. Meehan, J.R.: TALE-SPIN, an interactive program that writes stories. IJCAI 77, 91–98
2. Zhu, F., Cungen, C.: A survey of narrative generation approaches. J. Chin. Inf. Process. 27
(3), 33–40 (2013)
3. Cavazza, M., Charles, F., Mead, S.J.: Character-based interactive storytelling. IEEE Intell.
Syst. 17(4), 17–24 (2002)
4. Pizzi, D.: Emotional planning for character-based interactive storytelling. Teesside
University (2011)
5. Lebowitz, M.: Story-telling as planning and learning. Poetics 14(6), 483–502 (1985)
6. Turner, S.R.: MINSTREL: a computer model of creativity and storytelling, pp. 1505–1505
7. Bringsjord, S., Ferrucci, D.: Artificial Intelligence and Literary Creativity: Inside the Mind of
Brutus, a Storytelling Machine. Psychology Press, Abingdon (1999)
8. Pérez y Pérez, R., Sharples, M.: MEXICA: a computer model of a cognitive account of creative
writing. J. Exp. Theor. Artif. Intell. 13(2), 119–139 (2001)
9. Riedl, M.O., Young, R.M.: Narrative planning: balancing plot and character. J. Artif. Intell.
Res. 39(1), 217–268 (2010)
10. Gervas, P., et al.: Story plot generation based on CBR. Knowl. Based Syst. 18(4), 235–242
11. Swanson, R., Gordon, A.S.: Say anything: using textual case-based reasoning to enable
open-domain interactive storytelling. Ksii Trans. Internet Inf. Syst. 2(3), 16 (2012)
12. Li, B., et al.: Story generation with crowdsourced plot graphs. In: National Conference on
Artificial Intelligence, pp. 598–604 (2013)
13. Sutskever, I., Vinyals, O., Le, Q.V.: Sequence to sequence learning with neural networks. In:
Neural Information Processing Systems, pp. 3104–3112 (2014)
14. Luong, T., Pham, H., Manning, C.D.: Effective approaches to attention-based neural
machine translation. In: Empirical Methods in Natural Language Processing, pp. 1412–1421
15. Martin, L.J., et al.: Event representations for automated story generation with deep neural
nets. In: National Conference on Artificial Intelligence, pp. 868–875 (2018)
16. Li, J., et al.: Deep reinforcement learning for dialogue generation. In: Empirical Methods in
Natural Language Processing, pp. 1192–1202 (2016)
17. Yao, L., et al.: Plan-and-write: towards better automatic storytelling. In: National Conference
on Artificial Intelligence (2019)
18. Guan, J., Wang, Y., Huang, M.: Story ending generation with incremental encoding and
commonsense knowledge. In: National Conference on Artificial Intelligence (2019)
19. Tambwekar, P., et al.: Controllable neural story generation via reinforcement learning.
arXiv: Computation and Language (2018)
20. Fan, A., Lewis, M., Dauphin, Y.N.: Hierarchical neural story generation. In: Meeting of the
Association for Computational Linguistics, pp. 889–898 (2018)
21. Xu, J., et al.: A skeleton-based model for promoting coherence among sentences in narrative
story generation. In: Empirical Methods in Natural Language Processing, pp. 4306–4315
22. Nayak, N., et al.: To plan or not to plan? Discourse planning in slot-value informed sequence
to sequence models for language generation. In: Conference of the International Speech
Communication Association, pp. 3339–3343 (2017)
23. Wang, Z., et al.: Chinese poetry generation with planning based neural network. In:
International Conference on Computational Linguistics, pp. 1051–1060 (2016)
24. Mou, L., et al.: Sequence to backward and forward sequences: a content-introducing
approach to generative short-text conversation. In: International Conference on Computa-
tional Linguistics, pp. 3349–3358 (2016)
25. Fan, A., Lewis, M., Dauphin, Y.: Strategies for structuring story generation. arXiv:
Computation and Language (2019)
26. Huang, K., et al.: Visual storytelling. In: Proceedings of the 2016 Conference of the North
American Chapter of the Association for Computational Linguistics: Human Language
Technologies, pp. 1233–1239 (2016)
27. Wang, X., et al.: No metrics are perfect: adversarial reward learning for visual storytelling.
In: Meeting of the Association for Computational Linguistics, pp. 899–909 (2018)
28. Kim, T., et al.: GLAC Net: GLocal attention cascading networks for multi-image cued story
generation. arXiv: Computation and Language (2018)
29. Huang, Q., et al.: Hierarchically structured reinforcement learning for topically coherent
visual story generation. In: National Conference on Artificial Intelligence (2019)
30. Li, J., Luong, T., Jurafsky, D.: A hierarchical neural autoencoder for paragraphs and
documents. In: International Joint Conference on Natural Language Processing, pp. 1106–
1115 (2015)
31. Yarats, D., Lewis, M.: Hierarchical text generation and planning for strategic dialogue. In:
International Conference on Machine Learning, pp. 5587–5595 (2018)
32. Mostafazadeh, N., et al.: CaTeRS: causal and temporal relation scheme for semantic
annotation of event structures. In: North American Chapter of the Association for
Computational Linguistics, pp. 51–61 (2016)
33. Zhou, D., Guo, L., He, Y.: Neural storyline extraction model for storyline generation from
news articles. In: North American Chapter of the Association for Computational Linguistics,
pp. 1727–1736 (2018)
34. Jain, P., et al.: Story generation from sequence of independent short descriptions. arXiv:
Computation and Language (2017)
35. Zhao, Y., et al.: From plots to endings: a reinforced pointer generator for story ending
generation. In: International Conference Natural Language Processing, pp. 51–63 (2018)
36. Clark, E.A., et al.: Creative writing with a machine in the loop: case studies on slogans and
stories. In: Intelligent User Interfaces, pp. 329–340 (2018)
37. Peng, N., et al.: Towards controllable story generation. In: Proceedings of the First
Workshop on Storytelling (2018)
38. Goldfarb-Tarrant, S., Feng, H., Peng, N.: Plan, write, and revise: an interactive system for
open-domain story generation. arXiv: Computation and Language (2019)
A Smart Roll Wear Check
Scheme for Ensuring the Rolling
Quality of Steel Plates
Abstract. Roll surface wear morphology directly affects the surface quality of
steel plates and even the texture composition of plate and strip steel
products. Using image processing methods to judge the wear state of a roll is
low cost, easy to operate, and easy to build into an automatic smart data processing
system. In this paper, we propose a Smart Roll Wear Check (SRWC) scheme
for ensuring the rolling quality of steel plates. In the SRWC scheme, roll
surface images from different wear stages are analyzed, and seventeen
features are extracted from them. At the same time, fractal theory is
introduced to explore the relationship between fractal dimension and roll wear
degree. The results show that four characteristic parameters (roundness,
equivalent-area circle radius, second moment, and texture entropy) together with the
fractal dimension can be used as effective parameters to quantitatively judge
the roll wear state. Lastly, a back-propagation (BP) neural network model for
recognizing and judging roll wear is established. An experimental
test shows that the five parameters are effective as a quantitative evaluation of roll
wear morphology. By processing the image data, the
SRWC scheme can indicate in time whether the roll needs to be taken off the mill,
so as to avoid hidden safety hazards and ensure the rolling quality of the
steel plate.
This work was supported by the National Natural Science Foundation of China under Grants
No. 61602351 and No. 61802286.
1 Introduction
Rolls are an important part of rolling mills. A rolling mill uses the pressure
produced by a pair or group of rolls to deform the steel plastically [1,
2]. If damage to the roll surface is not found in time, it will affect the rolled size, shape
quality, and appearance of the steel sheet, reduce the corrosion resistance, wear
resistance, and fatigue limit of the product, and even endanger personal safety [3].
Therefore, effectively detecting roll wear is of great significance.
At present, in industry, even in some large steelworks, the wear morphology
of the roll surface is recognized essentially by hand. The roll must be taken
off the machine, which is bound to affect normal production. There
is also no unified standard for identifying morphological defects. Many
external factors affect a worker’s judgment of morphological wear defects,
so the identification results do not objectively reflect the wear of the roll surface. At
the same time, traditional methods make it difficult to store and
query historical data on roll wear morphology. Therefore, industrial sites urgently need a
smart system that can detect the wear degree of rolls
automatically, efficiently, and accurately.
Currently, there are two main methods for quantifying metal surface roughness
automatically. The first uses sensor technology, e.g., fiber-optic ranging technology or
light cutting technology [4, 5]. These technologies need special facilities, precision
equipment, and even dedicated software, so they are costly. The second is based
on image processing. Images can be captured with an ordinary
camera and then processed with image processing techniques. This approach can be used
widely in various industrial fields [6, 7].
Roughness measurements on a variety of steel surfaces and a textured magnetic
thin-film disk have shown that their topographies are multi-scale and random. This
spectral behavior implies that when the surface is repeatedly magnified, statistically
similar images of the surface keep appearing. Therefore fractal characterization of
surface topography is applied to the study of contact mechanics and wear processes [8].
It is therefore worthwhile to study the image characteristics of the oxide film morphology
on the roll surface, analyze the fractal law contained in it, extract the characteristic
parameters that reflect the wear degree of the roll, and establish a quantitative model to
detect the wear degree of the roll.
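As a hedged illustration of how a fractal dimension can be estimated from a binary image, the sketch below uses box counting: it counts the boxes of side s that contain foreground pixels and fits the slope of log N(s) against log(1/s). Box counting is assumed here as one common estimator; the text does not say which estimator the SRWC scheme uses, and the function names and test images are hypothetical.

```python
import math

def box_count(image, size):
    """Number of size x size boxes that contain at least one foreground (1) pixel."""
    rows, cols = len(image), len(image[0])
    count = 0
    for r in range(0, rows, size):
        for c in range(0, cols, size):
            if any(image[i][j]
                   for i in range(r, min(r + size, rows))
                   for j in range(c, min(c + size, cols))):
                count += 1
    return count

def box_counting_dimension(image, sizes=(1, 2, 4, 8)):
    """Least-squares slope of log N(s) against log(1/s).

    Assumes the image contains at least one foreground pixel.
    """
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(image, s)) for s in sizes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Sanity checks on shapes with known dimension: a filled square and a straight line.
filled = [[1] * 16 for _ in range(16)]
line = [[1 if r == 8 else 0 for _ in range(16)] for r in range(16)]
dim_filled = box_counting_dimension(filled)
dim_line = box_counting_dimension(line)
```

On these two synthetic images the estimate recovers the dimensions 2 and 1; a worn roll surface would be expected to fall between such reference values.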
It is of great significance to deepen the study of roll wear mechanism and surface
morphology by using image processing technology and fractal theory and to establish a
quantitative model to characterize the wear morphology of roll surface and improve the
automation level of the rolling process. In this paper, we propose a Smart Roll Wear
Check (SRWC) scheme for ensuring the rolling quality of steel plates. In the SRWC
scheme, six typical roll surface morphology images representing different wear stages
are selected. The contributions of this paper are as follows:
(1) The six images are preprocessed and converted into binary images, which can
express the wear morphology features.
A SRWC Scheme for Ensuring the Rolling Quality of Steel Plates 13
(2) Seventeen geometric features and texture features are extracted by using the image
processing method. After analysis, four feature parameters are selected to express
the wear degree of the roll.
(3) The fractal theory is introduced into the analysis of roll wear, and the relationship
between fractal dimension and wear degree is analyzed, which shows that fractal
dimension can be used as a parameter to characterize the wear degree of the roll.
(4) In order to verify the correctness of these conclusions, we design a roll wear
pattern recognition system based on BP neural network. Six groups of data with
five parameters are used as input to train the neural network, and then a set of data
representing a certain wear degree is used to verify the system. The correct wear
degree is obtained, which shows that the pattern recognition system is feasible.
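The recognition step in contribution (4) can be sketched as a small back-propagation network with five inputs and one output. Everything concrete below is an assumption made for illustration: the hidden-layer size, learning rate, number of epochs, and especially the six placeholder feature vectors, which are not measured data from the paper.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """Return hidden activations and the network output for one input vector."""
    xb = x + [1.0]                                   # append bias input
    h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_hidden]
    y = sigmoid(sum(w * v for w, v in zip(w_out, h + [1.0])))
    return h, y

def train_bp(samples, targets, hidden=4, lr=0.5, epochs=2000, seed=0):
    """Train a one-hidden-layer network with per-sample gradient descent."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, t in zip(samples, targets):
            h, y = forward(x, w_hidden, w_out)
            total += 0.5 * (y - t) ** 2
            # Backward pass: propagate the output error to the hidden layer.
            delta_out = (y - t) * y * (1.0 - y)
            delta_hid = [delta_out * w_out[j] * h[j] * (1.0 - h[j]) for j in range(hidden)]
            for j, v in enumerate(h + [1.0]):
                w_out[j] -= lr * delta_out * v
            for j in range(hidden):
                for i, v in enumerate(x + [1.0]):
                    w_hidden[j][i] -= lr * delta_hid[j] * v
        losses.append(total)
    return w_hidden, w_out, losses

# Placeholder inputs: six samples (one per wear stage), five normalized features each.
samples = [[stage / 6.0] * 5 for stage in range(1, 7)]
targets = [(stage - 1) / 5.0 for stage in range(1, 7)]   # wear degree scaled to [0, 1]
w_hidden, w_out, losses = train_bp(samples, targets)
_, predicted = forward(samples[3], w_hidden, w_out)      # query the 4th-stage sample
```

In practice the five inputs would be the four selected image features plus the fractal dimension, each normalized to a comparable range before training.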
Compared with human visual inspection, this pattern recognition system
recognizes the wear degree of the roll automatically, applies a
unified standard, and can process images offline to determine in time whether the roll needs
to be taken off the machine, so as to avoid hidden safety hazards and ensure the rolling
quality of the steel plate.
[Figure: six panels showing (a) 1st stage, (b) 2nd stage, (c) 3rd stage, (d) 4th stage, (e) 5th stage, (f) 6th stage]
$$ g(x, y) = [f(x, y)]^{a} \tag{1} $$
This equation can make the underexposed image darker in the black part and whiter
in the white part, thus improving the gray contrast of the image.
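A minimal sketch of the power-law transform in Eq. (1) follows. It assumes 8-bit gray levels that are normalized to [0, 1] before the exponent is applied and rescaled afterwards; the normalization and the function name `power_law_transform` are assumptions not stated in the text.

```python
def power_law_transform(image, a):
    """Apply g(x, y) = f(x, y) ** a to gray levels normalized to [0, 1]."""
    return [[(pixel / 255.0) ** a * 255.0 for pixel in row] for row in image]

row = [[0, 64, 255]]
darker = power_law_transform(row, 2.0)    # exponent > 1 lowers mid-gray levels
brighter = power_law_transform(row, 0.5)  # exponent < 1 raises mid-gray levels
```

Pure black and pure white are left unchanged by the transform; only the intermediate gray levels move, which is what alters the contrast.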
Image Smoothing. Image smoothing weakens or eliminates the high-frequency
components of the image and enhances the low-frequency components in order to
remove random noise. Median filtering is a nonlinear noise-removal method that
preserves image detail while eliminating noise and prevents the edges of the image
from blurring. The principle is to slide a window containing an odd number of points
over the image and replace the value of the center point of the window with the median
of the values in the window. The window shape and size of the median filter template
strongly influence the filtering effect. The common median filter window is a square
of size 3 × 3.
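The sliding-window principle described above can be sketched as follows. This is an illustrative implementation, not the authors' code; the border policy (using only the neighbors that fall inside the image) is an assumption chosen for brevity.

```python
def median_filter(image, size=3):
    """Replace each pixel with the median of its size x size neighborhood.

    `image` is a list of rows of gray levels. Border pixels use only the
    neighbors that lie inside the image (an assumed border policy).
    """
    if size % 2 == 0:
        raise ValueError("window size must be odd")
    half = size // 2
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = sorted(
                image[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            )
            # Middle element of the sorted window (upper median at borders).
            out[r][c] = window[len(window) // 2]
    return out


noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],  # one isolated bright noise pixel
    [10, 10, 10, 10],
]
smoothed = median_filter(noisy)
```

The isolated bright pixel is replaced by the value of its neighbors, while a uniform region is left unchanged.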
Image Sharpening. Smoothing inevitably loses some edge information and leaves the
image relatively blurred. To strengthen the edge information, the image should be
sharpened. Image sharpening eliminates or reduces the low-frequency components of
the image so as to enhance the edge contours. It drives the gray values of the
pixels away from the edges toward zero. The usual sharpening methods are the
gradient method and the Laplacian operator. For roll wear morphology, the gradient
method is preferred to the Laplacian operator.
The Roberts gradient operator is a commonly used gradient difference method, which
can be expressed by Eq. (2):
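As an illustration only, the widely used absolute-value form of the Roberts cross differences can be sketched as below. The exact form of the paper's Eq. (2) may differ, and the zero-filled last row and column are an assumption of this sketch.

```python
def roberts_gradient(image):
    """Approximate the gradient magnitude with the Roberts cross differences.

    Uses |f(x, y) - f(x+1, y+1)| + |f(x+1, y) - f(x, y+1)|. The last row and
    column are left at zero because they have no diagonal neighbors.
    """
    rows, cols = len(image), len(image[0])
    grad = [[0] * cols for _ in range(rows)]
    for r in range(rows - 1):
        for c in range(cols - 1):
            d1 = image[r][c] - image[r + 1][c + 1]
            d2 = image[r + 1][c] - image[r][c + 1]
            grad[r][c] = abs(d1) + abs(d2)
    return grad


# A vertical step edge between the second and third columns.
step_edge = [
    [0, 0, 100, 100],
    [0, 0, 100, 100],
    [0, 0, 100, 100],
]
edges = roberts_gradient(step_edge)
```

Flat regions map to zero and only the pixels next to the step receive a large response, which is the sharpening behavior described in the text.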
[Figure: roll surface images at six wear stages: (a) 1st stage, (b) 2nd stage, (c) 3rd stage, (d) 4th stage, (e) 5th stage, (f) 6th stage]
measurement and processing of images [14]. Good object features should be
distinguishable, reliable, independent, and few in number, and should span a
low-dimensional space that can stand in for the high-dimensional image sample space.
In roll wear morphology, the geometric features of the image describe the geometric
shape of the defect, and the texture features reflect the relationship between the
defect regions.
From the images shown in Fig. 2, thirteen geometric features are calculated:
boundary perimeter, defect area, rectangularity, extension length, roundness,
equivalent area circle radius, and the seven Hu invariant moments IM1–IM7. The results
are shown in Table 2. After the images shown in Fig. 1 are converted directly into
gray-scale images, four texture features are calculated: second moment, texture
entropy, contrast, and correlation. The results are shown in Tables 2 and 3.
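A few of the geometric features named above can be sketched for a binary image as follows. The formulas are common textbook definitions assumed here (perimeter as the count of exposed pixel sides, roundness as 4πA/P², equivalent radius as the radius of the circle with the same area); the paper's exact definitions are not given in this section.

```python
import math


def geometric_features(binary):
    """Compute simple shape features of the foreground (value 1) region."""
    rows, cols = len(binary), len(binary[0])
    area = 0
    perimeter = 0
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c]:
                continue
            area += 1
            # Count pixel sides that face background or the image border.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < rows and 0 <= cc < cols
                if not inside or not binary[rr][cc]:
                    perimeter += 1
    return {
        "area": area,
        "perimeter": perimeter,
        "equivalent_radius": math.sqrt(area / math.pi),
        "roundness": 4 * math.pi * area / perimeter ** 2 if perimeter else 0.0,
    }


# Hypothetical 2 x 2 defect inside a 4 x 4 binary image.
defect = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
features = geometric_features(defect)
```

For a pixel-square defect the roundness evaluates to π/4; more ragged defect boundaries increase the perimeter faster than the area and push the roundness lower.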
The variation of the normalized geometric and texture feature parameters across the
wear stages is shown in Fig. 3. As can be seen from Fig. 3(a), as rolling progresses,
the boundary perimeter, defect area, and equivalent area circle radius of the roll
wear defects increase gradually. The rate of increase is fast at first, then slow, and
then fast again, which indicates that as wear deepens, the defect range of the roll
surface expands and the wear rate is likewise fast, then slow, then gradually
accelerating. This is consistent with the three stages of wear dynamics
(self-organization stage, chaos stage, and system instability stage). As can be seen
from Fig. 3(b), roundness shows a decreasing trend, which indicates that the roll
morphology becomes more and more complex. The rectangularity measures how fully the
wear defect fills its bounding rectangle, and the extension length indicates the
compactness of the defect area along the axis; the curves show that these two
parameters do not change appreciably. Figure 3(c) shows the trend of the seven
invariant moments. Their changes are similar to one another but follow no specific
rule. From Fig. 3(d), it can be seen that as roll wear worsens, the image texture
entropy and contrast increase monotonically. The texture entropy is a measure of image
information: a larger entropy value means that the roll has more fine textures and a
more complex morphology. Greater contrast indicates that the texture is denser and
clearer. Figure 3(d) also shows that the second moment decreases gradually, which
means that the gray distribution of the image becomes increasingly uneven.
Correlation represents the similarity between rows and columns of the gray-level
co-occurrence matrix. Its value varies over a wide range, so it is not suitable as a
measure of roll wear morphology.
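The four texture features can be sketched from a normalized gray-level co-occurrence matrix as below. This is a minimal illustration: the horizontal offset (0, 1), the natural logarithm in the entropy, and the convention of returning 0 for the correlation of a constant image are all assumptions, and the image is assumed to have at least one valid pixel pair.

```python
import math
from collections import Counter


def glcm_features(image, offset=(0, 1)):
    """Texture features from a normalized gray-level co-occurrence matrix."""
    dr, dc = offset
    rows, cols = len(image), len(image[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                counts[(image[r][c], image[rr][cc])] += 1
    total = sum(counts.values())
    p = {pair: n / total for pair, n in counts.items()}

    second_moment = sum(v * v for v in p.values())
    entropy = -sum(v * math.log(v) for v in p.values())
    contrast = sum((i - j) ** 2 * v for (i, j), v in p.items())

    mu_i = sum(i * v for (i, _), v in p.items())
    mu_j = sum(j * v for (_, j), v in p.items())
    var_i = sum((i - mu_i) ** 2 * v for (i, _), v in p.items())
    var_j = sum((j - mu_j) ** 2 * v for (_, j), v in p.items())
    if var_i > 0 and var_j > 0:
        cov = sum((i - mu_i) * (j - mu_j) * v for (i, j), v in p.items())
        correlation = cov / math.sqrt(var_i * var_j)
    else:
        correlation = 0.0  # undefined for a constant image; 0 by convention here
    return {
        "second_moment": second_moment,
        "entropy": entropy,
        "contrast": contrast,
        "correlation": correlation,
    }


# Two illustrative images: a uniform patch and a checkerboard.
uniform = glcm_features([[3, 3, 3], [3, 3, 3]])
checker = glcm_features([[0, 1, 0], [1, 0, 1]])
```

The uniform patch gives the maximum second moment and zero entropy and contrast, while the checkerboard gives a lower second moment and higher entropy and contrast, matching the trends discussed for increasingly worn surfaces.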
[Fig. 3 panels: (a) boundary perimeter, defect area, equivalent radius; (b) rectangle, extension length, roundness; (c) seven invariant moments; (d) texture features]
From the above analysis, it can be seen that as roll wear morphology gradually
deteriorates, the geometric and texture features of roll wear images change
continuously. The boundary perimeter, area, equivalent area circle radius, and
roundness among the geometric features, and the contrast, texture entropy, and second
moment among the texture features, can be selected as quantitative indexes of the wear
state of a roll. In actual modeling, because some image feature parameters are
correlated with one another, the attributes should be reduced without losing image
information, and unrelated or unimportant attributes removed, in order to lower the
complexity of the model. Therefore, four image features, namely roundness, equivalent
area circle radius, second moment, and texture entropy, are selected to describe the
wear morphology of the roll quantitatively.
The fractal dimension reflects the irregularity of the surface profile and is a
similarity measurement parameter [16]. The larger the fractal dimension, the more
complex the surface morphology. As can be seen from Fig. 4, as roll wear increases,
the fractal dimension rises rapidly at first, then remains almost unchanged, and
finally increases gradually. At the beginning of roll wear, the roll surface is smooth
and the dimension is low. As wear proceeds, the contact surfaces of the roll and the
strip are not yet matched, so the roll surface becomes relatively complex and rough,
the micro surface area grows, and the fractal dimension increases rapidly. In the
stable wear stage, the wear rate of the roll surface is basically constant, the
friction and wear behavior is stable and orderly, the contact surfaces of the roll and
the strip adapt to each other, and the fractal dimension is basically unchanged. In
the rapid wear stage, the previously stable system breaks down, the roll wear rate
increases rapidly, the roll surface morphology deteriorates, and the fractal dimension
increases gradually. As a result, the fractal dimension can be used as a parameter to
measure roll wear morphology.
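A box-counting estimate of the fractal dimension of a binary image can be sketched as follows. The box sizes, the square-image assumption, and the least-squares fit are choices made for this illustration and are not taken from the paper.

```python
import math


def box_counting_dimension(binary, box_sizes=(1, 2, 4, 8)):
    """Estimate the box dimension of the foreground of a square binary image.

    For each box size s, count the boxes N(s) that contain at least one
    foreground pixel, then fit log N(s) against log(1/s) by least squares.
    The slope of the fitted line is the estimated dimension.
    """
    n = len(binary)
    xs, ys = [], []
    for s in box_sizes:
        occupied = 0
        for r0 in range(0, n, s):
            for c0 in range(0, n, s):
                if any(
                    binary[r][c]
                    for r in range(r0, min(r0 + s, n))
                    for c in range(c0, min(c0 + s, n))
                ):
                    occupied += 1
        if occupied:
            xs.append(math.log(1 / s))
            ys.append(math.log(occupied))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


size = 16
filled = [[1] * size for _ in range(size)]
diagonal = [[1 if r == c else 0 for c in range(size)] for r in range(size)]
dim_filled = box_counting_dimension(filled)
dim_line = box_counting_dimension(diagonal)
```

A completely filled image yields a dimension of 2 and a single diagonal line yields 1; a rough wear pattern would fall between these values, with a larger value indicating a more complex morphology.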
In order to test the effectiveness of the trained network, the characteristic param-
eters in Table 5 are extracted from a roller image of 3100t rolling tonnage and used as
the test sample input to the trained network. The output result is 3.8398, which is in
good agreement with the actual situation.
Based on the above analysis, it is shown that it is effective and feasible to select the
equivalent area circle radius, roundness, texture entropy, second moment, and fractal
dimension as the quantitative parameters to measure the wear degree of a roll.
5 Conclusion
In this paper, we propose an SRWC scheme. In the SRWC scheme, the six roll images at
different wear stages are first preprocessed, and seventeen feature parameters are
calculated, including geometric features and texture features. Based on the analysis
and comparison of the feature parameters, four image feature parameters, namely
roundness, equivalent area radius, second moment, and texture entropy, are selected
to reflect the characteristics of the roll wear morphology.
Secondly, the fractal dimension of wear morphology image is calculated by the box
dimension method, and the change law of fractal dimension of roll wear morphology is
analyzed and explained. The results show that the morphology of roll wear shows good
fractal characteristics, and the change of fractal dimension can be used to reflect the
change of surface morphology in the roll wear process. This work lays a foundation
for the establishment of the roll wear morphology detection model.
Lastly, a BP network model is proposed to identify and predict the roll wear
morphology. An experimental test shows that the five parameters are effective as a
quantitative evaluation of roll wear morphology. The SRWC scheme can indicate in time
whether the roll needs to be taken off the mill, so as to avoid potential safety
hazards and ensure the rolling quality of the steel plate.
However, whether a general preprocessing procedure can be applied to all kinds of
images containing noise, and whether the SRWC scheme will still be applicable to such
images, remain to be studied in later research.
1. Kong, X.W., Shi, J., Xu, J.Z., Wang, G.D.: Wear prediction of roller for hot mill during
service. J. Northeast. Univ. (Natl. Sci.) 23(8), 790–792 (2002)
2. Huang, Y.G.: Research of image feature and fractal on roll surface morphology in the wear
process (轧辊磨损形貌图像特征及分形研究). Master thesis, Wuhan University of Science
and Technology, Wuhan (2015)
3. Liu, X.L., Xu, C.G., Zhang, X.K.: Optimization technology of original surface roughness of
rolls in cold continuous rolling mill. J. Iron Steel Res. (11), 888–893 (2018)
4. Song, F.: For the design of a reflective sensor system for the detection of roller. Electron.
Test (5), 122–123 (2015)
5. Ge, Q.: Research on laser texturing roller surface roughness detection system using a
light-section method (基于光切法的激光毛化轧辊表面粗糙度检测系统的研究). Master thesis,
Huazhong University of Science and Technology, Wuhan (2011)
6. Chen, P.P., Su, L.H.: Detection of parallelism of hot rolling roll system based on embedded
image processing. In: Proceedings of the 11th Annual meeting of China Iron and Steel,
pp. 1–7. The Chinese Society for Metals, Beijing (2017)
7. Yang, G., Zhang, X.H., He, G.P., Huang, J.H.: Anomaly detection SVDD algorithm based
on non-subsampled contoured transform. Autom. Instrum. 6, 63–65 (2016)
8. Majumdar, A., Tian, C.L.: Fractal characterization and simulation of rough surfaces. Wear
136(2), 313–327 (2016)
9. Lan, Y., Li, Y.H., Zhang, S.S.: Research of oxide film control on high chrome work roll
surface in Maanshan Steel CSP. Chinese Metallurgy 21(1), 33–37 (2011)
10. Li, L., Huang, Y.G., Zhang, K., Lv, X.Y., Li, B., Wu, X.D.: Image feature and fractal on roll
surface morphology in the wear process. Iron Steel 50(4), 98–103 (2015)
11. Gonzalez, R.C., Woods, R.E.: Digital Image Processing, 4th edn. Pearson Prentice Hall,
Upper Saddle River (2017)
12. Hou, Q.L.: Investigation on the technology of tool wear detection based on machine vision
(基于机器视觉的刀具检测技术研究). Master thesis, Shandong University, Shandong
13. Lee, J.H., Kim, Y.S., Kim, S.R.: Real-time application of critical dimension measurement of
TFT-LCD pattern using a newly proposed 2D image-processing algorithm. Opt. Lasers Eng.
1, 558–569 (2008)
14. Zhai, J.H., Zhao, W.X., Wang, X.Z.: Research on the image feature extraction. J. Hebei
Univ. (Natl. Sci. Ed.) 29(1), 106–112 (2009)
15. Meng, H.D., Liu, L.: Study on steel slag grinding characteristics. Iron Steel 45(2), 28 (2010)
16. Ge, S.H., Zhu, H.: Fractal of Tribology (摩擦学的分形). Mechanical Industry Publishing
House, Beijing (2005)
17. Cui, Z.M., Li, Y.L., Ying Chen, Y.: Surface structure and fractal dimension calculation of
pore in low silicon sinter. Iron Steel 49(9), 10–14 (2014)
An Improved Prediction Model
for the Network Security Situation
Abstract. This research seeks to reduce the long training time of traditional
methods that use support vector machine (SVM) for cyber security situation
prediction. This paper proposes a cyber security situation prediction model
based on the MapReduce and SVM. The base classifier for this model uses an
SVM. In order to find the optimal parameters of the SVM, parameter opti-
mization is performed by the Cuckoo Search (CS). Considering the problem of
time cost when a data set is too large, we choose to use MapReduce to perform
distributed training on SVMs to improve training speed. Experimental results
show that the SVM network security situation prediction model using
MapReduce and CS has improved the accuracy and decreased the training time
cost compared to the traditional SVM prediction model.
1 Introduction
Cyber security situation prediction plays a vital role in the field of network security. It
can predict the network environment, improve the security of the network environment,
and prevent impending network security incidents [1, 7]. However, network data has
many attributes and is generated in massive amounts every day, which poses a huge
challenge to the use of algorithms [8]. Massive data increases the training time of a
machine learning algorithm and reduces its efficiency. The space-time cost of the
algorithm has a profound impact on the establishment of the network security
prediction model.
Many algorithms have been applied to the prediction of network security situation,
including artificial neural networks, clustering algorithms, association analysis, and
support vector machines [2, 3].
The support vector machine (SVM) classifies the data set using VC dimension theory
and the structural risk minimization principle from statistical learning [4].
SVMs perform well on high-dimensional data and small-sample data sets [5]. However,
for data sets with large sample sizes, SVM processing is slower than other machine
learning algorithms, which makes it poorly suited to predicting the network security
situation from huge amounts of data. A parallelization method is therefore proposed
to train the support vector machine. This paper uses MapReduce to parallelize the
SVMs. MapReduce passes each data fragment to the mapper function for parallel
processing, and then uses the reduce function to obtain the final result.
In order to improve the prediction accuracy of the SVM, the Cuckoo Search
(CS) algorithm is used to optimize the parameters [6, 11]. There are many other
parameter optimization algorithms for SVM, including the grid search algorithm [12,
13], particle swarm optimization algorithm [14], etc. Of the available algorithms, the
CS has global convergence, can find the global optimal solution of parameters, has
fewer control parameters, and has higher versatility and robustness.
In summary, this paper proposes a cyber space security situation prediction model
based on MapReduce and SVM (MR-SVM). The MapReduce method is used to
parallelize the SVMs, and the effectiveness and prediction accuracy of the method are
quantitatively analyzed by experimentation. Comparison with the results of a
traditional SVM network security situation prediction method verifies the feasibility
of the model.
The main content of this paper is divided into six sections. The first section
introduces the current methods of cyber space security situation prediction, and pro-
poses a network security situation prediction model based on MapReduce and SVM.
The second section establishes the prediction model and outlines the steps of estab-
lishing the prediction model proposed in this paper. The third section introduces
MapReduce distributed training and describes the parallel method used in the paper.
The fourth section introduces the SVM classification and the SVM classification
algorithm used in this paper. The fifth section describes the SVM parameter selection
process and algorithm. The sixth section is the analysis of the experimental results,
which describes the data set selected by the experiment and the experimental verifi-
cation of the MapReduce and SVM network security situation prediction models
proposed in this paper.
The network security situation prediction process based on MapReduce and SVM
algorithm is presented in Fig. 1.
The cyber security situation prediction process is mainly divided into the following
steps:
(1) Obtain a network security data set, select a training set and testing set;
(2) Upload the data set to HDFS, schedule it by MapReduce, and parallelize the SVMs;
(3) The data stored in HDFS is used as the data set of the SVM. The RBF kernel
function is selected in the SVM. Define the SVM parameter value interval and
step size, and apply the CS combined with the ten-fold cross-validation method
for parameter optimization;
(4) Use the parameters obtained in the third step to determine the SVM cyber security
situation prediction model, and test the model;
(5) Determine whether the prediction result satisfies the termination condition. If it is
true, obtain an optimized support vector machine prediction model; otherwise,
24 J. Hu et al.
return to the third step to continue optimizing the model. The termination condition
is that the model prediction accuracy reaches a predetermined threshold or the
number of cycles exceeds a preset maximum;
(6) The parallel SVM is reduced to obtain the cyber security situation prediction
model.
In supervised machine learning, data sets are usually divided into training sets and
testing sets. Training sets are used to optimize the model parameters and obtain a
high-precision network security situation prediction model, while testing sets are
used to check whether the prediction model generalizes. In this model, n training sets
are needed: n data sets are obtained by sampling from the massive data, and n support
vector machine models are trained on them. Finally, the support vector machine
prediction results are reduced, which yields the final cyber security situation
prediction model. When the training set is used to train the SVM classifier, the
parameters are optimized with the CS combined with ten-fold cross-validation to obtain
an SVM classifier with high classification accuracy.
The selection and parallelization of the base classifier is the core of network
security situation prediction. The MapReduce-SVM (MR-SVM) network security situation
prediction model proposed in this paper consists of two parts. The first part
uses MapReduce for data parallelization, and the second part uses SVM to perform
classification predictions. The core algorithms of the model and the reasons for
selecting them are detailed in the following.
MapReduce is a programming model for parallel computing of large data sets [9, 10].
Its two main functions are ‘Map’ and ‘Reduce’. The user specifies a ‘Map’ function
that maps a set of key-value pairs into a new set of key-value pairs, and a
concurrent ‘Reduce’ function that processes together all mapped key-value pairs
sharing the same key.
(1) Once a MapReduce program starts, MRAppMaster will start first. After the
MRAppMaster starts, the number of required maptask instances is calculated
according to the number and size of the network security data sets uploaded at this
time, and then the corresponding number of maptask processes is started.
(2) After the maptask process starts, data processing is performed according to the
given data slice range. The main flow is:
A. Use the specified input format to get the RecordReader to read a data set;
B. Pass the input data set to the user-defined map() method, perform SVM
training, and collect the KV pairs output by the map() method into the cache;
C. Sort the KV pairs in the cache by partition and key, and then spill them
to a disk file.
(3) After MRAppMaster detects that all maptask processes have finished, it starts the
reducetask process and tells it what range of data to process.
(4) After the reducetask process starts, it fetches the maptask output files from
the locations notified by MRAppMaster, then merges and sorts them locally. The KV
pairs that share a key are treated as a group, and the ‘Reduce’ method is called on
each group to combine the prediction results. The output KV pairs are collected, and
the user-specified output format is called to write the result data to external
storage.
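The map and reduce roles described in the steps above can be illustrated with a small single-process sketch. Several things here are assumptions and not the paper's implementation: a nearest-centroid classifier is swapped in for the SVM to keep the example dependency-free, the reduce step combines predictions by majority vote, the map calls run sequentially rather than on a Hadoop cluster, and all function names and data are hypothetical.

```python
from collections import Counter, defaultdict


def train_centroid_classifier(shard):
    """Stand-in for SVM training: store the mean feature vector per label."""
    sums, counts = {}, Counter()
    for features, label in shard:
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, value in enumerate(features):
            acc[k] += value
        counts[label] += 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(model, features):
    """Return the label whose centroid is closest to `features`."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))


def map_phase(shard, test_set):
    """Map: train on one data shard, emit (sample_id, predicted_label) pairs."""
    model = train_centroid_classifier(shard)
    return [(sample_id, predict(model, x)) for sample_id, x in test_set]


def reduce_phase(mapped_pairs):
    """Reduce: group pairs by key and keep the majority label per sample."""
    grouped = defaultdict(list)
    for key, label in mapped_pairs:
        grouped[key].append(label)
    return {key: Counter(labels).most_common(1)[0][0] for key, labels in grouped.items()}


# Toy data: label +1 (safe) clusters near (0, 0), label -1 (unsafe) near (5, 5).
shards = [
    [((0.0, 0.2), 1), ((5.1, 4.9), -1)],
    [((0.3, 0.1), 1), ((4.8, 5.2), -1)],
    [((0.1, 0.4), 1), ((5.3, 5.0), -1)],
]
test_set = [("a", (0.2, 0.2)), ("b", (5.0, 5.1))]

mapped = [pair for shard in shards for pair in map_phase(shard, test_set)]
result = reduce_phase(mapped)
```

Each shard plays the role of one maptask input split, and the sample identifier is the key on which the reduce step groups the per-shard predictions.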
4 SVM Classification
First, the workflow of the SVM for network security situation classification is
introduced, as shown in Fig. 2. $X_i$, $i \in [1, n)$, is an input sample point, i.e., a
piece of data in the training set, which is fed into the kernel function $K(x_i, x)$.
In this paper, the radial basis function is selected as the kernel $K(x_i, x)$, and
the result is finally passed through the decision function sgn().
Using the network security training set to train the SVM, the two optimal param-
eters of the SVM are obtained, i.e. the classification hyperplane is determined, and the
SVM model establishment process ends. The process of prediction is to input the test
set into the trained SVM classifier and make a decision through the decision function. If
the input data falls within the safe space determined by the optimal classification
hyperplane function, the output result of the SVM classifier is marked as +1, i.e. the
network connection corresponding to the data is determined to be safe. If the classi-
fication of the input data is in the unsafe space as determined by the hyperplane
function, it is marked as −1 in the output result of the SVM classifier, i.e. the network
connection corresponding to the data is determined to be unsafe.
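The decision step can be sketched as follows, using the standard radial basis kernel K(x_i, x) = exp(-g ||x_i - x||^2) and the sign of a weighted kernel sum. The support vectors, coefficients, bias, and kernel parameter below are hand-picked for illustration and are not learned from data or taken from the paper.

```python
import math


def rbf_kernel(x_i, x, g):
    """Radial basis kernel K(x_i, x) = exp(-g * ||x_i - x||^2)."""
    return math.exp(-g * sum((a - b) ** 2 for a, b in zip(x_i, x)))


def svm_decide(x, support_vectors, coefficients, bias, g):
    """Return +1 (safe) or -1 (unsafe) via sgn(sum_i c_i K(x_i, x) + b).

    Each coefficient c_i stands for alpha_i * y_i of a trained SVM; the
    values used below are hypothetical.
    """
    score = bias + sum(
        c * rbf_kernel(sv, x, g) for sv, c in zip(support_vectors, coefficients)
    )
    return 1 if score >= 0 else -1


support_vectors = [(0.0, 0.0), (4.0, 4.0)]
coefficients = [1.0, -1.0]  # hypothetical alpha_i * y_i values
label_near_safe = svm_decide((0.5, 0.0), support_vectors, coefficients, bias=0.0, g=0.5)
label_near_unsafe = svm_decide((4.0, 3.5), support_vectors, coefficients, bias=0.0, g=0.5)
```

A point close to the support vector with a positive coefficient is labeled +1, and a point close to the one with a negative coefficient is labeled -1.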
The sample points of the network security data set usually have outliers, as illus-
trated in Fig. 3. This situation is called approximate linear separability. If the original
steps to search are followed, a classification hyperplane that can separate the two types
of sample points cannot be found, and the problem cannot be solved.
In order to solve the above problem, it is necessary to introduce slack variables
$\xi_i$ ($\xi_i \ge 0$, $i = 1, 2, \ldots, n$) in the classifier. The slack variable
describes the outliers and is non-negative. A penalty factor is introduced to evaluate
this loss, indicating the degree of emphasis placed on outliers during training; the
outliers are also called loss points.
After introducing the penalty factor and the slack variable, the objective function is
as shown in Eq. (1).
$$\psi(w, \xi) = \frac{1}{2} w^{T} w + C \sum_{i=1}^{n} \xi_i \qquad (1)$$
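For the actual data, the training process of the SVM is to solve the optimization
problem of Eq. (2):
$$\min_{w, b, \xi} \; \frac{1}{2} \|w\|^{2} + C \sum_{i=1}^{n} \xi_i \quad \text{s.t.} \quad y_i \left( w^{T} \phi(x_i) + b \right) \ge 1 - \xi_i, \;\; \xi_i \ge 0, \;\; i = 1, 2, \ldots, n \qquad (2)$$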
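To make the role of the slack variables and the penalty factor C concrete, the objective can be evaluated for a fixed hyperplane as sketched below. An identity feature map and a hand-picked hyperplane are assumed for simplicity; this only evaluates the objective and does not solve the optimization problem.

```python
def soft_margin_objective(w, b, samples, C):
    """Evaluate 0.5 * ||w||^2 + C * sum(xi_i) for a linear decision function.

    The slack of each sample is xi_i = max(0, 1 - y_i * (w . x_i + b)), the
    smallest non-negative value that satisfies the margin constraint.
    An identity feature map is assumed here.
    """
    slacks = []
    for x, y in samples:
        margin = y * (sum(wk * xk for wk, xk in zip(w, x)) + b)
        slacks.append(max(0.0, 1.0 - margin))
    objective = 0.5 * sum(wk * wk for wk in w) + C * sum(slacks)
    return objective, slacks


# Two well-separated points and one outlier on the wrong side of the hyperplane.
samples = [((2.0, 0.0), 1), ((-2.0, 0.0), -1), ((0.5, 0.0), -1)]
objective, slacks = soft_margin_objective(w=(1.0, 0.0), b=0.0, samples=samples, C=10.0)
```

Only the outlier receives a non-zero slack, and a larger C makes that slack weigh more heavily in the objective, which is the "degree of emphasis on outliers" described in the text.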
Cuckoo Search (CS), combined with ten-fold cross-validation, is used to optimize the
SVM parameters, namely the penalty factor C and the kernel function parameter g.
Normally, large cuckoos remove one or more eggs from a host before they lay their
own eggs in the nest. In order not to be discovered by the host, the cuckoos ensure that
the number of new eggs in the nest is equal or similar to the number that was removed.
Once the cuckoo nestlings are hatched by the foster mother, the foster mother’s own
chicks are pushed out of the nest so that the foster nestlings are raised. This greatly
increases the probability that the nestlings survive. In order to simulate the habit of the
cuckoo, the CS algorithm assumes the following three ideal states:
(1) Each cuckoo produces only one egg at a time, and randomly selects a nest to store;
(2) During the nesting process, the best nest of eggs will be retained to the next
generation;
(3) The number of available nests is fixed, and the probability of finding foreign eggs
in the nest is P, $P \in [0, 1]$. If a foreign egg is found, the host bird builds a
new nest.
Under the assumptions of the above three ideal states, the update formula of the
position and path of the CS is given by Eq. (4):
$$x_i^{(t+1)} = x_i^{(t)} + \alpha \oplus L(\lambda), \quad i = 1, 2, \ldots, n \qquad (4)$$
In the equation, $x_i^{(t)}$ is the position of the i-th nest in generation t,
$\oplus$ is point-to-point multiplication, and $\alpha$ is a step control amount that
controls the search range of the step size; its value obeys a normal distribution.
Finally, $L(\lambda)$ is the Lévy random search path, whose random step size follows a
Lévy distribution.
The algorithm of the CS is described as follows.
The CS optimization range is 0.1–150, the population size is 20, the maximum
discovery probability P is 0.25, and the maximum number of iterations is 20. The SVM
classification accuracy is selected as the fitness function. The parameter values
giving the highest SVM accuracy are obtained, and the optimal classification
hyperplane function is calculated. Finally, the SVM classifier is trained by the
above steps. When normalization is involved, values are mapped to [0, 1].
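A basic cuckoo search with the settings quoted above (range 0.1 to 150, 20 nests, discovery probability 0.25, 20 iterations) can be sketched as follows. Several parts are assumptions of this sketch: Lévy steps are drawn with Mantegna's algorithm, the step control amount is a fixed constant rather than normally distributed, and a hypothetical quadratic function stands in for the cross-validated SVM accuracy used as the fitness in the paper.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step using Mantegna's algorithm."""
    sigma = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0, sigma)
    v = rng.gauss(0, 1)
    return u / abs(v) ** (1 / beta)


def cuckoo_search(fitness, dim, low, high, nests=20, pa=0.25,
                  iterations=20, alpha=0.01, seed=0):
    """Maximize `fitness` over [low, high]^dim with a basic cuckoo search."""
    rng = random.Random(seed)

    def random_nest():
        return [rng.uniform(low, high) for _ in range(dim)]

    population = [random_nest() for _ in range(nests)]
    scores = [fitness(nest) for nest in population]
    for _ in range(iterations):
        for i in range(nests):
            # Levy flight from nest i, clipped to the search range.
            candidate = [
                min(high, max(low, x + alpha * (high - low) * levy_step(rng)))
                for x in population[i]
            ]
            score = fitness(candidate)
            j = rng.randrange(nests)  # compare against a randomly chosen nest
            if score > scores[j]:
                population[j], scores[j] = candidate, score
        # A host discovers a foreign egg with probability pa and rebuilds the
        # nest at a random position; the best nest is always retained.
        best_index = scores.index(max(scores))
        for i in range(nests):
            if i != best_index and rng.random() < pa:
                population[i] = random_nest()
                scores[i] = fitness(population[i])
    best_index = scores.index(max(scores))
    return population[best_index], scores[best_index]


def toy_fitness(params):
    """Hypothetical stand-in for cross-validated accuracy; peaks at C=10, g=1."""
    c, g = params
    return -((c - 10.0) ** 2) - (g - 1.0) ** 2


initial_best, initial_score = cuckoo_search(
    toy_fitness, dim=2, low=0.1, high=150.0, iterations=0
)
best, best_score = cuckoo_search(toy_fitness, dim=2, low=0.1, high=150.0)
```

Because the best nest is never abandoned and a nest is only replaced by a better candidate, the best fitness cannot decrease from one generation to the next. In the paper's setting, `fitness` would train an SVM with the candidate (C, g) pair and return its ten-fold cross-validated accuracy.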
The experimental process of k-fold cross-validation is as follows.
The model is verified by ten-fold cross-validation. The training set is divided
into 10 subsets. Under the parameters determined by the CS algorithm, each subset is
used once as the test set while the remaining nine are used for training. The average
classification accuracy over the 10 trials is used as the fitness function for
evaluating that group of parameters.
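The fold rotation and averaging can be sketched as below. The contiguous fold split, the assumption that there are at least k samples, and the threshold classifier used as a stand-in for the SVM are all illustrative choices, not details from the paper.

```python
def k_fold_indices(n_samples, k=10):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validated_accuracy(samples, labels, train_fn, predict_fn, k=10):
    """Average accuracy over k folds; each fold serves once as the test set."""
    folds = k_fold_indices(len(samples), k)
    accuracies = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(samples)) if i not in held_out]
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, samples[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k


# Hypothetical stand-in classifier: threshold at the mean of the training samples.
def train_threshold(xs, ys):
    return sum(xs) / len(xs)


def predict_threshold(threshold, x):
    return 1 if x >= threshold else -1


samples = [float(v) for v in range(20)]
labels = [1 if v >= 10 else -1 for v in range(20)]
fitness = cross_validated_accuracy(samples, labels, train_threshold, predict_threshold)
```

The returned average is the quantity that would be handed to the CS algorithm as the fitness of one candidate parameter group.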
This section verifies the MR-SVM model. The KDD dataset comes from an intrusion
detection assessment project conducted by the US Defense Advanced Research Projects
Agency (DARPA) at the MIT Lincoln Laboratory. The experimental data set in this paper
was selected from the KDD dataset. Considering the characteristics of the MapReduce
and SVM algorithms, the experiment is divided into two parts: (1) Parallelized
support vector machine. The purpose of this part is to optimize the two key parameters
of the MR-SVM algorithm, train the SVMs and perform the reduction, and find the
optimal network security situation prediction model for the experimental data set.
(2) Feasibility verification. The purpose of this part is to verify the feasibility of
the model by comparing the prediction results of the MR-SVM model with those of the
traditional SVM model.
This paper uses four indicators to evaluate the performance of the model, namely
accuracy, precision, recall, and F-measure (the harmonic mean of precision and
recall) [16]. The recall is the ratio of the number of unsafe connections correctly
detected to the actual number of unsafe connections. The precision is the ratio of the
number of unsafe connections correctly detected to the total number of connections
flagged as unsafe. The F-measure strikes a compromise between recall and precision.
SVM parameter optimization is achieved with the CS combined with ten-fold
cross-validation.
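The four indicators can be computed from true and predicted labels as sketched below, treating the unsafe class (-1) as the positive class. The label vectors are made up for illustration and are not results from the paper.

```python
def classification_metrics(y_true, y_pred, positive=-1):
    """Accuracy, precision, recall, and F-measure with `positive` as the target class.

    The unsafe class (-1) is treated as positive, matching the +1/-1 labels
    used for safe and unsafe connections.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall) if precision + recall else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Illustrative labels: three unsafe and five safe connections.
y_true = [-1, -1, -1, 1, 1, 1, 1, 1]
y_pred = [-1, -1, 1, -1, 1, 1, 1, 1]
metrics = classification_metrics(y_true, y_pred)
```

In this example two of the three unsafe connections are detected and one safe connection is wrongly flagged, so precision, recall, and F-measure all equal 2/3 while accuracy is 0.75.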
Table 3 provides a comparison of the time efficiency between CS_SVM and MR-SVM.
[Figure: four bar-chart panels: Accuracy, Precision, Recall, and F-measure. Data labels surviving extraction: 96.16%, 95.01%, 90.48%, 88.41%, 86.54%, 80.60%, 61.29%, 58.06%, 48.39%; their assignment to panels and models is not recoverable.]
It is evident from the above figures that the SVM model using the CS algorithm is
superior to the traditional SVM model in terms of all four indicators. The experimental
results illustrate two points. First, when other conditions are the same, the
parallelized SVM is more efficient than the traditional SVM model. Second, the CS
algorithm is suitable for the SVM parameter optimization problem, as it improves the
four indicators.
The algorithm proposed in this paper can solve the binary classification problem for
large data sets. The MapReduce method reduces the training cost of the SVM by
parallelizing the SVMs. The speedup of the model is 6.69. The CS algorithm addresses
the parameter optimization of the SVM because it can search for the globally optimal
solution.
7 Conclusions
This paper proposes an SVM network security situation prediction model, MR-SVM.
The model effectively reduces the training time of the SVM and improves the
accuracy of network security situation prediction. Using the CS algorithm for
parameter optimization increases the prediction accuracy of the MR-SVM model. This
paper uses the KDD data set for a comparison experiment among the traditional SVM, the
SVM using the CS algorithm for parameter optimization, LTSA-SVM, and the MR-SVM. The
experiments show that the MR-SVM model can effectively solve the problem of rapidly
increasing SVM training cost as data volume grows.
Acknowledgements. This work has been supported by the National Key Research and
Development Program of China (Grant No. 2016YFB0800700) and the National Natural Science
Foundation of China (Grant No. 61772070).
1. Fan, Z., Xiao, Y., Nayak, A.: An improved network security situation assessment approach
in software defined networks. Peer-to-Peer Netw. Appl. 12, 295–309 (2017)
2. Hao, H., Hongqi, Z., Yuling, L.: Quantitative method for network security situation based on
attack prediction. Secur. Commun. Netw. 2017, 19 (2017)
3. Zhao, D., Liu, J.: Study on network security situation awareness based on particle swarm
optimization algorithm. Comput. Ind. Eng. 125, 764–775 (2018)
4. Ding, S., Cong, L., Hu, Q., Jia, H.: A multiway p-spectral clustering algorithm. Knowl.-
Based Syst. 164, 371–377 (2019)
5. Ding, S., Zhang, N., Zhang, X.: Twin support vector machine: theory, algorithm and
applications. Neural Comput. Appl. 28, 3119–3130 (2017)
6. Ding, S., Zhu, Z., Zhang, X.: An overview on semi-supervised support vector machine.
Neural Comput. Appl. 28, 1–10 (2017)
7. Guan, Z., Zhang, Y., Wu, L.: APPA: an anonymous and privacy preserving data aggregation
scheme for fog-enhanced IoT. J. Netw. Comput. Appl. 125, 82–92 (2019)
8. Li, Y., Hu, J., Wu, Z.: Research on QoS service composition based on coevolutionary
genetic algorithm. Soft. Comput. 22, 7865–7874 (2018)
9. Madani, Y., Erritali, M., Bengourram, J.: Sentiment analysis using semantic similarity and
Hadoop MapReduce. Knowl. Inf. Syst. 8, 413–436 (2018)
10. Bendre, M., Manthalkar, R.: Time series decomposition and predictive analytics using
MapReduce framework. Expert Syst. Appl. 116, 108–120 (2018)
11. Zhu, H., Qi, X., Chen, F.: Quantum-inspired cuckoo co-search algorithm for no-wait flow
shop scheduling. Appl. Intell. 49, 791–803 (2019)
12. Bhat, P.C., Prosper, H.B., Sezen, S.: Optimizing event selection with the random grid search.
Comput. Phys. Commun. 228, 245–257 (2018)
13. Kong, X., Sun, Y., Su, R.: Real-time eutrophication status evaluation of coastal waters using
support vector machine with grid search algorithm. Mar. Pollut. Bull. 119, 307–319 (2017)
14. Vijayashree, J., Sultana, H.P.: A machine learning framework for feature selection in heart
disease classification using improved particle swarm optimization with support vector
machine classifier (2018)
16. Fernandes, S.E.N., Papa, J.P.: Improving optimum-path forest learning using bag-of-
classifiers and confidence measures. Pattern Anal. Appl. 22(2), 703–716 (2019)
A Quantified Accuracy Measurement Based
Localization Algorithm for Autonomous
Underwater Vehicles
1 Introduction
An Autonomous Underwater Vehicle (AUV) is usually an indispensable tool for this sort of task,
where high risks like pollution and radioactivity make it unrealistic for human
The most essential advantage of an AUV is its autonomous navigation, which makes
it ideal for missions including seabed mapping and undersea resource locating [1].
However, these applications usually require an AUV to provide its current location.
Without localization capability, the applicability of an AUV will be significantly
The AUV localization system is quite different from a terrestrial localization system.
GPS signals cannot be used underwater because radio waves do not propagate well in
water [2, 3], so an AUV cannot be localized via GPS. Instead, AUVs and the nodes of a
UWSN (Underwater Wireless Sensor Network) depend on acoustic systems for
communication [4]. The acoustic bandwidth is narrow and the propagation delay is large
compared with radio signals [5]. Meanwhile, AUV localization is made more difficult
because the acoustic propagation speed is highly dynamic, varying with water depth,
temperature, current and salinity [6, 7].
In the past, collaborative localization among AUVs was not a research focus because
AUVs were expensive and only one or a few were used per application. The main AUV
localization technology is remote control via an acoustic localization system, such as
LBL (Long Baseline), SBL (Short Baseline), and USBL (Ultra Short Baseline) [8–10].
There are also other localization systems that use techniques similar to GPS
localization, such as buoy localization [11] on the surface or anchor localization [12]
on the seabed. These technologies have shortcomings, however. For example, an acoustic
remote-control system must be used within a certain distance, and the acoustic signals
are weakened greatly if there are obstacles between the controller and the AUV. As the
price of AUVs decreases, it becomes practical to use a larger number of AUVs for
specific applications, which motivates us to develop a collaborative localization
algorithm among multiple AUVs.
A coordinate that has just been calibrated by GPS has the highest accuracy index,
$a = 1$. The accuracy index of an AUV decreases as its underwater navigation time
increases, since the coordinate deviation accumulates and grows. The accuracy index is
designed to reflect the synthesized influence, where $t_n$ is the current time, $t_{lu}$
is the time of the last coordinate update of this AUV, $N$ is a positive integer modulus,
$t_{ui}$ is the defined coordinate updating time interval, $d_{(u,t)}$ is the distance
between the navigation coordinate and the updated coordinate after a coordinate update
(its initial value is 0), and $d_{(u,lu)}$ is the Euclidean distance between the current
and the previous coordinate update positions, which can be calculated from the two
updated coordinates. We set $d_{(u,lu)} = 1$ when $d_{(u,t)} = 0$ to avoid division by
zero (which occurs when the AUV has just been put into the water or wants to hold its
position), and $d_{(u,lu)}$ is set to its actual value when $d_{(u,t)} \neq 0$. The first
part of formula (1), in the first parentheses, lowers the accuracy index while the AUV
navigates underwater, which accounts for the accumulated deviation caused by navigation
equipment inaccuracy. The second part of formula (1), in the second parentheses, is the
synthesized influence on the accuracy index caused by the navigation equipment and the
current. The accuracy index drops faster if the ratio of the synthesized deviation
distance ($d_{(u,t)}$) to the distance between the current and the previous coordinate
update positions ($d_{(u,lu)}$) is larger. The second part of formula (1) is a posterior
value based on the coherent property of the current, since it is influenced by the
previous locus; in essence, it predicts the current synthesized deviation from the
previous deviation.
36 X. Yu et al.
updated by GPS. Otherwise the AUV tries to update its coordinate and accuracy index
via MTL (Modified Triangle Localization).
Suppose the coordinate of the AUV to be localized (node $A$) is $(x, y)$, the coordinates
of the three reference AUVs (nodes $A_1$, $A_2$, $A_3$) used for localizing node $A$ are
$(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$, and the corresponding accuracy indexes are $a$,
$a_1$, $a_2$, $a_3$, where $a_1 \ge a_2 \ge a_3 > a$. The distances between $A$ and $A_1$,
$A_2$, $A_3$ are $d_1$, $d_2$, $d_3$. The traditional triangle localization equations are:
$$
\begin{cases}
(x - x_1)^2 + (y - y_1)^2 = d_1^2 \\
(x - x_2)^2 + (y - y_2)^2 = d_2^2 \\
(x - x_3)^2 + (y - y_3)^2 = d_3^2
\end{cases}
\tag{2}
$$
The solution should be the only intersection point of the three circles. Eq. (2) is
modified as follows to increase the probability of obtaining a solution:
$$
\begin{cases}
(x - x_1)^2 + (y - y_1)^2 = d_1^2 \\
(x - x_2)^2 + (y - y_2)^2 = d_2^2 \\
d_3^2 k_3 \le (x - x_3)^2 + (y - y_3)^2 \le d_3^2 (2 - k_3)
\end{cases}
\tag{3}
$$
There may be a unique solution, no solution, or multiple solutions for Eq. (3).
1. If there is a unique solution, we use that solution to update the coordinate of A.
2. If there are multiple solutions, we use the solution given by
$\min\{(x - x_3)^2 + (y - y_3)^2 - d_3^2\}$ to update the coordinate of A. Then the
accuracy index of A is set to $a_3$.
3. If there is no solution for Eq. (3), node A stops updating its coordinate via MTL and
goes to the surface to get the GPS signal. Its accuracy index is set to 1 in this case.
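The selection rule around Eq. (3) can be illustrated with a short sketch. It is not the authors' implementation: the function name `mtl_localize`, the default `k3 = 0.9`, and the choice of the candidate with the smallest absolute residual with respect to the third circle are assumptions made for illustration. The first two circles are intersected exactly, and the candidates are then filtered by the relaxed condition on the third reference node.

```python
import math

def mtl_localize(refs, dists, k3=0.9):
    """Return an (x, y) estimate, or None when Eq. (3) has no solution.

    refs  : three (x, y) reference coordinates (A1, A2, A3)
    dists : measured distances d1, d2, d3 to those references
    k3    : relaxation factor, assumed here to lie in (0, 1]
    """
    (x1, y1), (x2, y2), (x3, y3) = refs
    d1, d2, d3 = dists
    base = math.hypot(x2 - x1, y2 - y1)
    # Circles 1 and 2 do not intersect if concentric, too far apart, or nested.
    if base == 0 or base > d1 + d2 or base < abs(d1 - d2):
        return None
    # Foot of the chord joining the two intersection points, measured from A1.
    a = (d1 ** 2 - d2 ** 2 + base ** 2) / (2 * base)
    h = math.sqrt(max(d1 ** 2 - a ** 2, 0.0))
    mx = x1 + a * (x2 - x1) / base
    my = y1 + a * (y2 - y1) / base
    candidates = {
        (mx + h * (y2 - y1) / base, my - h * (x2 - x1) / base),
        (mx - h * (y2 - y1) / base, my + h * (x2 - x1) / base),
    }

    def squared_range(p):
        return (p[0] - x3) ** 2 + (p[1] - y3) ** 2

    # Keep candidates inside the ring d3^2*k3 <= r^2 <= d3^2*(2 - k3).
    feasible = [p for p in candidates
                if d3 ** 2 * k3 <= squared_range(p) <= d3 ** 2 * (2 - k3)]
    if not feasible:
        return None
    # Assumption: ties are broken by the smallest absolute residual.
    return min(feasible, key=lambda p: abs(squared_range(p) - d3 ** 2))

estimate = mtl_localize([(0, 0), (4, 0), (0, 3)],
                        [math.sqrt(2), math.sqrt(10), math.sqrt(5)])
```

When the function returns `None`, the caller would fall back to the behaviour described in item 3, that is, surfacing for a GPS fix.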
B. The AUV's accuracy index is higher than the threshold, but at least three neighbors
in its neighbor list have accuracy indexes higher than its own. The AUV tries to update
its coordinate via MTL. If Eq. (3) has a solution, it updates its coordinate and accuracy
index; otherwise, it adjusts its accuracy index according to formula (1).
C. The AUV's accuracy index is higher than the threshold and fewer than three neighbors
in its neighbor list have accuracy indexes higher than its own. The AUV continues to
work and navigate underwater and adjusts its accuracy index according to formula (1).
4 Performance Evaluation
where $V_x$ is the speed along the X axis and $V_y$ is the speed along the Y axis. $k_1$,
$k_2$, $k_3$, $k$, $v$ are variables that are closely related to environmental factors
such as tides and bathymetry, and they change with the environment. $k_4$, $k_5$ are
random variables. In our simulations, we assume $k_1$ and $k_2$ to be random variables
following a normal distribution with mean $\pi$ and standard deviation $0.1\pi$. $k_3$
follows a normal distribution with mean $2\pi$ and standard deviation $0.2\pi$. $k$
follows a normal distribution with mean 3 and standard deviation 0.3. $v$ follows a
normal distribution with mean 1 and standard deviation 0.1. $k_4$ and $k_5$ follow a
normal distribution with mean 1 and standard deviation 0.1.
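As a rough illustration of how these parameters could be drawn in a simulation, the sketch below samples one parameter set from the stated normal distributions, with the means of k1 and k2 taken as π and that of k3 as 2π. The function name `sample_current_parameters`, the dictionary keys, and the seed are hypothetical; the velocity equations themselves are not reproduced here.

```python
import math
import random

def sample_current_parameters(rng):
    """Draw one set of current-model parameters (mean, standard deviation)."""
    return {
        "k1": rng.gauss(math.pi, 0.1 * math.pi),
        "k2": rng.gauss(math.pi, 0.1 * math.pi),
        "k3": rng.gauss(2 * math.pi, 0.2 * math.pi),
        "k": rng.gauss(3.0, 0.3),
        "v": rng.gauss(1.0, 0.1),
        "k4": rng.gauss(1.0, 0.1),
        "k5": rng.gauss(1.0, 0.1),
    }

rng = random.Random(2019)  # fixed seed (arbitrary) so that runs are repeatable
samples = [sample_current_parameters(rng) for _ in range(10000)]
mean_k3 = sum(s["k3"] for s in samples) / len(samples)
```

Passing an explicit `random.Random` instance keeps the draws reproducible and independent of any other use of the global random generator.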
Fig. 2. The influence of AUV density on average updating times via GPS
the AUVs prefer to update their coordinates and accuracy indexes via neighbors rather than
going up to the surface to get the GPS signal. The average number of GPS updates
therefore drops, and the AUVs can spend more time working underwater (Fig. 2).
B. The influence of the accuracy threshold on the average navigation deviation distance
and the updated deviation distance.
The number of AUVs in this simulation is 200. Both the average navigation deviation
distance and the updated deviation distance decrease as the accuracy threshold
increases, which means that the coordinate deviation can be controlled by the
accuracy threshold. Moreover, the updated deviation is always smaller than the
navigation deviation for all accuracy thresholds, which again verifies the
effectiveness of QuAMeL. As the accuracy threshold grows, the coordinate update
frequency also increases (Fig. 3).
Fig. 3. The influence of accuracy threshold on average navigation deviation distance and
updated deviation distance
5 Conclusion
controller and adaptive to a wide range of AUV numbers. The simulation results show
that the accuracy index reflects the trend of the coordinate accuracy and that
coordinates with a high accuracy index can be distributed among the AUV network.
Acknowledgement. This work has been supported by the National Natural Science Foundation
of China (No. U1636213, 61876019, 61772070).
1. Williams, S.B., Pizarro, O., Mahon, I., Johnson-Roberson, M.: Simultaneous localisation
and mapping and dense stereoscopic seafloor reconstruction using an AUV. In: Khatib, O.,
Kumar, V., Pappas, G.J. (eds.) Experimental Robotics, vol. 54, pp. 407–416. Springer,
Heidelberg (2009).
2. Liu, L., Zhou, S., Cui, J.-H.: Prospects and problems of wireless communication for
underwater sensor networks. Wirel. Commun. Mob. Comput. 8(8), 977–994 (2008)
3. Heidemann, J., Ye, W., Wills, J., Syed, A., Li, Y.: Research challenges and applications for
underwater sensor networking. In: Proceedings of the IEEE Wireless Communications and
Networking Conference, vol. 1, pp. 228–235 (2006)
4. Whitcomb, L., Yoerger, D.R., Singh, H., Howland, J.: Advances in underwater robot
vehicles for deep ocean exploration: navigation, control, and survey operations. In:
Hollerbach, J.M., Koditschek, D.E. (eds.) Robotics Research, pp. 439–448. Springer,
London (2000).
5. Stojanovic, M., Preisig, J.: Underwater acoustic communication channels: propagation
models and statistical characterization. IEEE Commun. Mag. Underwater Wirel. Commun.
47(1), 84–89 (2009)
6. Stojanovic, M.: On the relationship between capacity and distance in an underwater acoustic
communication channel. ACM SIGMOBILE Mob. Comput. Commun. Rev. 11(4), 34–43
7. Akyildiz, I.F., Pompili, D., Melodia, T.: Underwater acoustic sensor networks: research
challenges. Ad Hoc Netw. 3(3), 257–279 (2005)
8. Singh, H., et al.: An integrated approach to multiple AUV communications, navigation and
docking. In: OCEANS ‘96. MTS/IEEE. Prospects for the 21st Century. Conference
Proceedings, 23–26 September 1996, vol. 1, pp. 9–64 (1996)
9. Curcio, J., et al.: Experiments in moving baseline navigation using autonomous surface craft.
In: OCEANS, 2005. Proceedings of MTS/IEEE, vol. 1, pp. 730–735 (2005)
10. Matos, A., Cruz, N., Martins, A., Pereira, F.L.: Development and implementation of a low-
cost LBL navigation system for an AUV. In: OCEANS ‘99 MTS/IEEE. Riding the Crest into
the 21st Century, vol. 2, pp. 774–779 (1999)
11. Austin, T.C., Stokey, R.P., Sharp, K.M.: PARADIGM: a buoy-based system for AUV
navigation and tracking. In: OCEANS 2000 MTS/IEEE Conference and Exhibition, vol. 2,
pp. 935–938 (2000)
12. Chandrasekhar, V., Seah, W.K., Choo, Y.S., Ee, H.V.: Localization in underwater sensor
networks - survey and challenges. In: WUWNet 2006 Proceedings of the 1st ACM
International Workshop on Underwater Networks, pp. 33–40 (2006)
13. Chitre, M., Shahabudeen, S., Freitag, L., Stojanovic, M.: Recent advances in underwater
acoustic communications & networking. In: OCEANS 2008, 15–18 September 2008, vol.
2008-Suppl., pp. 1–10 (2008)
14. Beerens, S.P., Ridderinkhof, H., Zimmerman, J.: An analytical study of chaotic stirring in
tidal areas. Chaos Solitons Fractals 4, 1011–1029 (1994)
An Improved Assessment Method
for the Network Security Risk
1 Introduction
To solve the issue above, the network security risk assessment method based on I-
HMM uses the alarm quality and learning algorithm, which calculates the network
security risk value from two dimensions: host layer and network layer. The experiment
proves that the method can accurately show the security risk status of the network,
reflect the network risk trend in a timely and intuitive manner, and distinguish the
influence of different hosts on the network risk.
The weight vector W = (0.1365, 0.2385, 0.625). AF, AC, and AS are standardized
to obtain the same range of values [1, 4] to balance the effects of each attribute.
The calculation formula for QoA is:
In each acquisition cycle, the highest quality alarm is selected as the observation
vector of HMM. Through the value of alarm quality, the observation vector is mapped
to four levels as 1, 2, 3 and 4 by formula 2.
$$
V_t =
\begin{cases}
1, & 0 \le QoA < 1 \\
2, & 1 \le QoA < 2 \\
3, & 2 < QoA \le 3 \\
4, & 3 < QoA \le 4
\end{cases}
\tag{2}
$$
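The mapping of formula 2 and the per-cycle selection of the highest-quality alarm can be sketched as follows. This is only an illustration: the function names are hypothetical, and assigning the boundary value QoA = 2 to level 2 is an assumption, since the formula as printed does not cover that value.

```python
def observation_level(qoa):
    """Map an alarm-quality value in [0, 4] to an observation level 1..4."""
    if not 0 <= qoa <= 4:
        raise ValueError("QoA is expected to lie in [0, 4]")
    if qoa < 1:
        return 1
    if qoa <= 2:  # assumption: the boundary value 2 is assigned to level 2
        return 2
    if qoa <= 3:
        return 3
    return 4

def cycle_observation(alarm_qualities):
    """Use the highest-quality alarm of one acquisition cycle as the observation."""
    return observation_level(max(alarm_qualities))

# One observation per acquisition cycle; the QoA values are made up.
observations = [cycle_observation(cycle)
                for cycle in ([0.4, 0.9], [1.2, 2.7, 1.9], [3.6])]
```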
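$$\pi_i = \xi_1(i) \tag{3}$$
$$a_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i, j)}{\sum_{t=1}^{T-1} \xi_t(i)} \tag{4}$$
$$b_{jk} = \frac{\sum_{t=1,\, O_t = v_k}^{T} \xi_t(j)}{\sum_{t=1}^{T} \xi_t(j)} \tag{5}$$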
The network security risk is divided into direct risk (DR) and indirect risk (IR). It is
necessary to consider the network node association (NNC) between the nodes [11, 12].
DR is calculated by Eq. 6.
$$\mathrm{DR} = \sum_{i=1}^{N} q_t(i)\, w_i, \tag{6}$$
qt is the probability distribution of the network security state of the host at time t,
which is calculated by the algorithm in Table 1.
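A minimal sketch of Eq. 6 follows, reading it as the sum over security states of the state probability times a per-state risk weight. The weights 0, 25, 50 and 100 for the states G, R, B and C are the ones given in the experimental section; the function name and the example distribution are hypothetical.

```python
STATE_RISK_WEIGHTS = {"G": 0, "R": 25, "B": 50, "C": 100}

def direct_risk(state_probs, weights=STATE_RISK_WEIGHTS):
    """Direct risk of a host: expected risk weight under the state distribution."""
    if abs(sum(state_probs.values()) - 1.0) > 1e-9:
        raise ValueError("state probabilities must sum to 1")
    return sum(state_probs[state] * weights[state] for state in weights)

# Hypothetical state distribution of one host at time t.
dr = direct_risk({"G": 0.7, "R": 0.2, "B": 0.1, "C": 0.0})
```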
For the IR, find the nodes associated with host $h$, denoted as $h_1, h_2, \ldots, h_N$.
Determine the NNC relationship type $W_{h_k,h}$ of the host $h$.
$$W_{h_k,h} \in \{W_1, \ldots, W_7\}, \quad r(W_{h_k,h}) \in \{1.0, 0.7, 0.5, 0.3, 0.2, 0.1, 0.8\} \tag{7}$$
1. For each node, the risk value at a certain time $t$ is denoted as $R_{h_k}$,
$1 \le k \le N$, and the influence magnitude of each node in the NNC relationships on
the risk value of the host $h$ is denoted as $\Delta R_{h_k}$, $1 \le k \le N$.
$$\Delta R_{h_k} = r(W_{h_k,h})\, R_{h_k}, \quad 1 \le k \le N \tag{8}$$
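Equations 7 and 8 can be illustrated with a small lookup, sketched below. The assumption that the seven coefficients correspond to W1 through W7 in the order listed is ours, as are the function names and the example values; how the individual influences are combined into the indirect risk is not shown here.

```python
# Assumed pairing of relationship types with the coefficients listed in Eq. 7.
ASSOCIATION_COEFFICIENT = {
    "W1": 1.0, "W2": 0.7, "W3": 0.5, "W4": 0.3,
    "W5": 0.2, "W6": 0.1, "W7": 0.8,
}

def neighbour_influence(relation_type, neighbour_risk):
    """Influence of one associated node on host h (Eq. 8)."""
    return ASSOCIATION_COEFFICIENT[relation_type] * neighbour_risk

def neighbour_influences(neighbours):
    """neighbours: list of (relation_type, risk) pairs, one per associated node."""
    return [neighbour_influence(rel, risk) for rel, risk in neighbours]

influences = neighbour_influences([("W2", 40.0), ("W5", 10.0)])
```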
Finally, the risk of the entire network is calculated by Eq. 12.
$$R = \frac{\sum_{h=1}^{N} v_h R_h}{N} \tag{12}$$
46 J. Hu et al.
[Figure: network diagram; recoverable labels: Internal management, Internal user area, Host 1]
Use the vulnerability scanning tool Nessus to scan the nodes in the network [14],
and obtain the vulnerability information in the network.
Three Dimensions Indicators Obtained. This paper describes the state of network
security from three dimensions, which are the basic operation of the network, vul-
nerability and threat.
In the basic operation dimension indicator, the quantified value of the host asset is
calculated by Eq. 13.
$$\mathrm{AssetValue} = \log_2\left[\left(a \cdot 2^{C} + b \cdot 2^{I} + c \cdot 2^{A}\right) / 3\right] \tag{13}$$
To reduce the error of direct assignment, the formula uses weighted logarithmic
averaging to assign value to asset confidentiality (C), integrity (I), and availability
(A) by splitting each attribute into association and criticality. a, b, c are three constants
between 0 and 3, and a + b + c = 3.
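As a worked illustration, the sketch below reads Eq. 13 as AssetValue = log2((a·2^C + b·2^I + c·2^A) / 3). The function and parameter names are hypothetical; `w_c`, `w_i` and `w_a` stand for the constants a, b and c.

```python
import math

def asset_value(conf, integ, avail, w_c=1.0, w_i=1.0, w_a=1.0):
    """Weighted logarithmic average of confidentiality, integrity, availability."""
    weights = (w_c, w_i, w_a)
    if min(weights) < 0 or max(weights) > 3 or not math.isclose(sum(weights), 3.0):
        raise ValueError("weights must lie in [0, 3] and sum to 3")
    return math.log2((w_c * 2 ** conf + w_i * 2 ** integ + w_a * 2 ** avail) / 3)

equal_levels = asset_value(4, 4, 4)   # all three attributes at level 4
mixed_levels = asset_value(5, 3, 1)   # (32 + 8 + 2) / 3 = 14
```

With equal weights and equal attribute levels the value equals that common level, while the exponentiation lets the highest-rated attribute dominate a mixed assignment.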
ch ¼ c h eV Score
Through the method of asset assignment, the asset value of the web server is
AV = 4.1.
Using 5 min as an acquisition cycle, the index data related to the Web server in the
system are collected. The host state index is calculated by Eq. 14. The network state
index is calculated by Eq. 15. The host importance is calculated by Eq. 17, and the
observation value is obtained by the above method. The specific values are shown in
Table 3.
Model Parameter Training. Following the model parameter setting method proposed in
Reference 9, the state transition matrix T is set as follows.
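$$
T =
\begin{pmatrix}
a_{GG} & a_{GR} & a_{GB} & a_{GC} \\
a_{RG} & a_{RR} & a_{RB} & a_{RC} \\
a_{BG} & a_{BR} & a_{BB} & a_{BC} \\
a_{CG} & a_{CR} & a_{CB} & a_{CC}
\end{pmatrix}
=
\begin{pmatrix}
0.839 & 0.15 & 0.009 & 0.002 \\
0.005 & 0.972 & 0.02 & 0.003 \\
0.004 & 0.017 & 0.975 & 0.004 \\
0.004 & 0.017 & 0.125 & 0.854
\end{pmatrix}
$$
$$\pi = (\pi_G \;\; \pi_R \;\; \pi_B \;\; \pi_C) = (1 \;\; 0 \;\; 0 \;\; 0)$$
$$w = (w_G \;\; w_R \;\; w_B \;\; w_C) = (0 \;\; 25 \;\; 50 \;\; 100)$$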
The hmmlearn package for Python is used to estimate and optimize the model
parameters. The final model parameters are as follows (Tables 4, 5 and 6).
It can be seen from the figure that when the number of iterations reaches 35, the value of
$\ln P(O \mid \lambda)$ has converged and shows no obvious change. The model obtained at
this point is close to optimal, and the best correspondence is obtained between the
observation sequence of the model and the hidden network security states.
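The quantity monitored during training, ln P(O|λ), can be computed with the scaled forward algorithm. The sketch below is a plain-Python illustration and not the hmmlearn code used in the experiment: the transition matrix and initial distribution are the initial settings given above, while the emission matrix `B` and the observation sequence are hypothetical placeholders.

```python
import math

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: returns ln P(O | model).

    obs   : observation indices (0-based)
    start : initial state probabilities
    trans : trans[i][j] = P(state j at t+1 | state i at t)
    emit  : emit[i][k]  = P(observation k | state i)
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the sequence is impossible under this model
        alpha = [value / scale for value in alpha]  # rescale to avoid underflow
        log_prob += math.log(scale)
    return log_prob

T = [[0.839, 0.15, 0.009, 0.002],
     [0.005, 0.972, 0.02, 0.003],
     [0.004, 0.017, 0.975, 0.004],
     [0.004, 0.017, 0.125, 0.854]]
PI = [1.0, 0.0, 0.0, 0.0]
# Hypothetical emission matrix: rows are states G, R, B, C; columns are levels 1..4.
B = [[0.7, 0.2, 0.05, 0.05],
     [0.2, 0.5, 0.2, 0.1],
     [0.1, 0.2, 0.5, 0.2],
     [0.05, 0.05, 0.2, 0.7]]
sequence = [0, 0, 1, 2, 1, 3]  # made-up observation levels, 0-based
ll = log_likelihood(sequence, PI, T, B)
```

In a Baum-Welch loop this value would be recomputed after every re-estimation step, and training would stop once its change falls below a chosen tolerance.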
Algorithm Performance Comparison. We compare the change in the risk value of the Web
server calculated by the method proposed in Reference 9 and by the I-HMM method.
As can be seen from Fig. 4, the security risk values of the web server calculated by the
two methods follow the same overall trend. In the 8th, 12th, and 13th sampling
periods, it can be clearly seen that the difference between the risk values obtained by
the two methods is large, and the rate of change is also significantly different. It
indicates that after the introduction of the node correlation, the change of the risk value
of host 1 correlated with web server can significantly affect the risk status of the web
server, making the risk change of the web server more obvious. Between the 14th and
20th sampling periods, the security risk value obtained by using the method in this
paper is obviously lower than that of the comparison method. Corresponding to the
specific attack scenario, it can be found that the attack degree at this time is mostly 2.
The risk status of the first five sampling cycles should be similar, so the risk value
obtained by the I-HMM method of this paper is more in line with the actual situation.
The network security risk value of the experimental network changes as shown in Fig. 5.
When measuring the overall cybersecurity risk value, the overall trend of the two
methods is consistent, and the risk changes are basically the same. However, in
comparison, the I-HMM method in this paper is more obvious in the rate of change.
This is because the association of the nodes is introduced. The indirect risk and relative
importance of the nodes are considered.
5 Conclusion
We redefine alarm quality, optimize the acquisition of observation sequences, and use
the Baum-Welch algorithm to learn the model parameters to improve the acquisition of
parameters. Considering the NNC, we introduce the association of network nodes. The
results show that, compared with the previous method, the sensitivity of the I-HMM
method is higher and the dynamic change perception is stronger. The fluctuation of the
risk value is more obvious than the original method. And it can also distinguish the
influence degree of different hosts on the network risk.
1. Guang, K., Guangming, T., Xia, D.: A network security situation assessment method based
on attack intention perception. In: 2016 2nd IEEE International Conference on Computer
and Communications (ICCC). IEEE (2016)
2. Kun, W., Hui, Q., Haopu, Y.: Network security situation evaluation method based on attack
intention recognition. In: International Conference on Computer Science & Network
Technology. IEEE (2016)
3. Samy, G.N., Shanmugam, B., Maarop, N.: Information security risk assessment framework
for cloud computing environment using medical research design and method. Adv. Sci. Lett.
24(1), 739–743 (2018)
4. Li, S., Bi, F., Chen, W.: An improved information security risk assessments method for
cyber-physical-social computing and networking. IEEE Access 6, 10311–10319 (2018)
5. Li, X., Zhao, H.: Network security situation assessment based on HMM-MPGA. In:
International Conference on Information Management, pp. 57–63. IEEE (2016)
6. Hamid, T., Al-Jumeily, D., Hussain, A.: Cyber security risk evaluation research based on
entropy weight method. In: 2016 9th International Conference on Developments in eSystems
Engineering (DeSE). IEEE (2016)
7. Huang, K., Zhou, C., Tian, Y.C.: Application of Bayesian network to data-driven cyber-
security risk assessment in SCADA networks. In: 2017 27th International Telecommuni-
cation Networks and Applications Conference (ITNAC), pp. 1–6 (2017)
8. Liu, S., Liu, Y.: Network security risk assessment method based on HMM and attack graph
model. IEEE/ACIS International Conference on Software Engineering, Artificial Intelli-
gence, Networking and Parallel/Distributed Computing, pp. 517–522. IEEE (2016)
9. Xi, R.-R., Yun, X.-C., Zhang, Y.-Z.: An improved quantitative evaluation method for
network security. Chin. J. Comput. 38(4), 749–758 (2015)
10. Pietras, M., Klęsk, P.: FPGA implementation of logarithmic versions of Baum-Welch and
Viterbi algorithms for reduced precision hidden Markov models. Bull. Pol. Acad. Sci. Tech.
Sci. 65(6), 935–947 (2017)
11. Wang, Z., Lu, Y., Li, J.: Network security risk assessment based on node correlation.
J. Phys.: Conf. Ser. 1069(1), 012073 (2018)
12. Li, Y., Liu, S., Yu, Y.: Analysis of network vulnerability under joint node and link attacks.
Mater. Sci. Eng. Conf. Ser. 322(5), 052052 (2018)
13. Wangen, G.: Information security risk assessment: a method comparison. Computer 50(4),
52–61 (2017)
14. Doynikova, E., Kotenko, I.: CVSS-based probabilistic risk assessment for cyber situational
awareness and countermeasure selection. In: Euromicro International Conference on Parallel,
Distributed and Network-Based Processing, pp. 346–353. IEEE (2017)
15. Coffey, K., Smith, R., Maglaras, L.: Vulnerability analysis of network scanning on SCADA
systems. Secur. Commun. Netw. 2018(4), 1–21 (2018)
A High-Performance Storage System Based
with Dual RAID Engine
Abstract. With the advent of 5G, more and more applications use cloud
storage to store data, and data has become the cornerstone of the development of
a smart society. At the same time, these data are characterized by an uneven
generation rate, high write demand and low read demand. The dynamic change of
load during data storage places new requirements on the storage architecture. This
paper proposes a storage system that allocates strips in real time based on current
load changes. Based on the traditional RAID layout, a dual-engine based high-
performance storage system (DSH) is proposed. The system uses a software and
hardware co-processing architecture: strip allocation and address calculation are
implemented in software, while the verification algorithm is offloaded to hardware,
with data transferred to the FPGA through PCIE. Experimental analysis shows that
the DSH algorithm has a clear advantage in saving CPU computing resources and
disk energy consumption in a dynamic-load storage environment.
1 Introduction
With the advent of the artificial intelligence era [1], storage devices are becoming
larger and larger, and storage requirements are becoming complex and varied [2]. Data
storage in different scenarios has become a research hotspot, and the main goal of
storage systems is to preserve system performance and reduce energy consumption while
keeping storage secure [3].
For file archiving systems [4], Liu et al. [5] proposed an energy-saving disk array
system, S-RAID5 [6], which can greatly reduce the energy consumption of the disk array
while meeting performance requirements. However, the partially parallel data layout of
the S-RAID5 algorithm is static; it suits smooth workloads and adapts poorly to
strongly fluctuating loads [7] or sudden loads. In response to this problem, Sun et
al. [8] proposed the DPPDL algorithm, which allocates strips dynamically: when the load
is small, a small number of disks are opened, and when the load is large, multiple
disks are opened.
Researchers mostly use the RAID5 [9] source code architecture in the Linux kernel
to implement the prototype system, that is, calculate the stripe address, read the disk
data fill strips, XOR the stripe, fill the strip, submit the bio, and write to the disk. In the
architecture, the XOR calculation of the stripe consumes a large amount of CPU
computing module resources, which imposes a great burden on the entire monitoring
To address this problem, this paper proposes a high-performance storage system based on
a dual engine. The system adopts a software and hardware co-design for dynamic loads.
In the low-level layout, the system uses a hot and cold tree structure to manage the
disk space, and stripes data according to the currently open disks and the number of
times each disk has been used in the past. Because software and hardware suit different
application scenarios, the algorithm adopts a software and hardware co-processing
architecture: functions such as strip allocation and address calculation are
implemented in software, and the XOR check algorithm of the disk is transferred to the
FPGA [10] through PCIE [11]. This implementation greatly saves CPU computing resources
and disk power consumption.
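The XOR check that the design offloads is the standard RAID5 parity computation. The sketch below shows it in plain Python for illustration only; it says nothing about the FPGA implementation, and the function names and sample blocks are hypothetical.

```python
from functools import reduce

def xor_parity(blocks):
    """Byte-wise XOR of equally sized blocks, as used for RAID5 parity."""
    if not blocks or len({len(block) for block in blocks}) != 1:
        raise ValueError("need at least one block, all of the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))

def rebuild_block(surviving_blocks, parity):
    """Recover a single lost data block from the survivors and the parity."""
    return xor_parity(list(surviving_blocks) + [parity])

data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_parity(data)
recovered = rebuild_block([data[0], data[2]], parity)
```

Every byte of every data block in a stripe passes through this loop on each parity update, which is why the computation is a natural candidate for offloading.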
2 DSH Implementation
To find the most suitable strip, the strip is selected according to the following flow.
If the number of requested disks is larger than the number of disks that have been
opened, the open disks are insufficient and a new disk must be opened. In this case,
the second cold-disk priority method is used to calculate and split the strips. If the
hot and cold tree contains strips of exactly the required size and their number is
greater than 1, the first cold-disk priority method is used to calculate and select the
strip; otherwise the second cold-disk priority method is adopted. The specific flow
chart is shown in Fig. 2.
Algorithm 1 details:
Algorithm 1 First priority calculation method
Nth gradient priority number: a collection of disks with the same priority, which is
ranked nth from high to low.
average priority = 100 / number of requested disks
priority granularity = (the disk free address / all free addresses) * average priority
if (request_disks <= first gradient priority number)
    first gradient priority = average priority
    second gradient priority = average priority - second gradient priority granularity
else if (request_disks = first gradient priority + ... + Nth gradient priority)
    first gradient priority = average priority + first gradient priority granularity
    ...
    (N-1)th gradient priority = average priority + (N-1)th gradient priority granularity
    Nth gradient priority = (100 - (first gradient total priority + ... + Nth gradient total priority))
                            / Nth gradient priority number
    (N+1)th gradient priority = Nth gradient priority - (N-1)th gradient priority granularity
    (N+2)th gradient priority = (N+1)th gradient priority - (N+2)th gradient priority granularity
    ...
After calculating the priority of the cold disk, the strips that satisfy the situation are
traversed from the beginning. If the priority is 100%, the strip is selected directly;
otherwise the strip with the highest priority is selected after all calculations are
completed.
Update the hot and cold tree structure.
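The first two assignments of Algorithm 1 are simple arithmetic and can be sketched as follows. Only these two quantities are covered; the gradient bookkeeping is not reproduced, and the function names, argument names and example numbers are hypothetical.

```python
def average_priority(requested_disks):
    """Average priority = 100 / number of requested disks."""
    if requested_disks <= 0:
        raise ValueError("at least one disk must be requested")
    return 100 / requested_disks

def priority_granularity(disk_free, total_free, requested_disks):
    """Share of the free address space on one disk, scaled by the average priority."""
    if total_free <= 0 or not 0 <= disk_free <= total_free:
        raise ValueError("free-space figures are inconsistent")
    return (disk_free / total_free) * average_priority(requested_disks)

avg = average_priority(4)                  # a request spanning four disks
gran = priority_granularity(50, 200, 4)    # this disk holds a quarter of the free space
```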
The second method of calculating the priority is to split the required strips
(including the opened disks) from the maximum number of strips.
Algorithm 2 details:
We assume that there are 6 disks to form the DSH bottom layer layout, and the data
with a load of 3 is initially stored. The subsequent load changes are 3, 5, 4, 3, 2, 1. The
change of hot and cold trees is shown in Fig. 3.
With the disk allocation algorithm, we started thinking about the disk reclamation
process. When there is no strip of 5 disks in the hot and cold tree, the disk reclamation
process begins, and the oldest data is deleted in turn until a strip with 5 disks appears.
This deletion ensures that there is enough disk strips to meet the maximum load when
the maximum load comes in, and the cold disk allocation disk algorithm is preferred
beforehand to facilitate the creation of the largest disk strip as early as possible during
the reclamation operation. Assume that the recovery is performed once in the following
figure. The change of the hot and cold trees is shown in Fig. 4.
The traditional software RAID architecture first uses the MDADM tool to create a
soft RAID in the user space. The soft RAID creates a virtual hard disk MD in the kernel
space. The MD combines several disks that make up the RAID. Only one md hard disk
can be seen in the user space. The peripheral disk can be accessed indirectly by reading
and writing the md hard disk.
The DSH communication framework moves the step of generating the verification data
in the DSH software algorithm to the FPGA: the data is transmitted to the FPGA over
PCIE using the RIFFA architecture, and the result is transmitted back after the
calculation. This step saves a lot of CPU and GPU resources.
3 Experiment Analysis
This experiment mainly tests the energy consumption, transmission bandwidth (dif-
ferent random, sequential ratio), response time, CPU usage of different RAID prototype
Large-scale storage systems usually consist of hundreds or thousands of disks. For
ease of management, the entire storage system is generally divided into several sub-
storage systems that consist of multiple disks and a RAID structure. In order to test the
performance and energy saving effect of the DSH system, a DSH prototype system is
built under the MD (Multiple Device driver) module under the Linux 4.40 kernel.
The DSH prototype system is built from 5 disks, and the DSH in-band data block size is
64 KB. The experimental results are applicable to large-scale monitoring of storage
systems.
The system uses the commonly used NILFS (New Implementation of a Log-structured File
System) file system, which is based on a log format. It writes data sequentially,
always appending at the head of the log until the logical storage space is full, and
only then overwrites deleted data. This file system
is suitable for continuous storage systems such as video surveillance and archive
IOMeter is the most widely used tool for testing IO subsystems. We use IOMeter to
perform write performance tests on the DSH system under a load of 90% continuous
data and 10% random data. For comparison, S-RAID5 and DPPDL storage systems with the
same configuration were built and tested.
As shown in Fig. 6, when the request length is between 16 KB and 512 KB, the S-RAID5
transfer rate is higher than those of DPPDL and DSH. This is because S-RAID5 always
opens two disks, whereas DPPDL and DSH only need to open one disk when the request
length is small. When the request length is greater than 1024 KB, the DSH and DPPDL
algorithms open more disks because of the larger load, so their transfer rates exceed
that of S-RAID5, whose transfer rate remains unchanged because it always opens two
disks.
As shown in Fig. 7, since the DSH algorithm transfers the XOR computation, the most
CPU-intensive part, to the FPGA, its CPU usage is the lowest, essentially that of an
idle system. The S-RAID5 algorithm always turns on only two disks, and its address
conversion method is very simple. Its CPU usage is spent mainly on the read-and-write
verification algorithm, so its CPU utilization stays at about 10%. The DPPDL algorithm
needs to split strips in real time, and its address translation and read-rewrite parity
check algorithm also use CPU resources, so its CPU utilization increases with the load.
When the load is greater than 1024 KB, all disks are in the on state. Once the system
reaches full load, writes become full writes, and the CPU usage remains stable.
60 J. Liu et al.
The load in real life is dynamically changing, for which we simulate the load versus
time graph, as shown in Fig. 8.
As can be seen from the figure, the S-RAID5 algorithm has stable energy con-
sumption because it has a simple striping strategy and its number of open disks is fixed.
The DPPDL algorithm is relatively random when selecting strips, and does not
consider the feature of using open disks as much as possible. Therefore, in some cases
where the load is relatively balanced, sudden increase in power consumption may
The DSH algorithm reasonably considers the use of an already opened disk.
Compared to the DPPDL algorithm, its energy consumption changes steadily as the
load changes.
4 Future Work
1. Meng, X., Ci, X.: Big data management: concepts, techniques and challenges. J. Comput.
Res. Dev. 50(01), 146–169 (2013). (in Chinese)
2. Chen, P.M., Lee, E.K., Gibson, G.A., et al.: RAID: high-performance, reliable secondary
storage. ACM Comput. Surv. 26(2), 145–185 (1994)
3. Luo, S., Zhang, G., Wu, C., et al.: Boafft: distributed deduplication for big data storage in the
cloud. IEEE Trans. Cloud Comput. (2015)
4. Barroso, L.A., Hlzle, U.: The Datacenter as a Computer: An Introduction to the Design of
Warehouse-Scale Machines. Synthesis Lectures on Computer Architecture. Morgan &
Claypool Publishers, San Rafael (2009)
5. Macko, P., Ge, X., Kelley, J., et al.: SMORE: a cold data object store for SMR drives. In:
Proceedings of 34th Symposium on Mass Storage Systems and Technologies (MSST), vol.
35, no. 7, pp. 343–352 (2017)
6. Jie, W., Yu, H., Zuo, P., et al.: Improving restore performance in deduplication systems via a
cost-efficient rewriting scheme. IEEE Trans. Parallel Distrib. Syst. 19(7), 121–132 (2019)
7. Xiao, W., Ren, J., Yang, Q.: A case for continuous data protection at block level in disk
array storages. IEEE Trans. Parallel Distrib. Syst. 20(6), 898–911 (2009)
62 J. Liu et al.
8. Gurumurthi, S., Sivasubramaniam, A., Kandemir, M., et al.: DRPM: dynamic speed control
for power management in server class disks. In: Proceedings of the 30th Annual
International Symposium on Computer Architecture, pp. 169–179. IEEE, San Diego (2003)
9. Papathanasiou, A.E., Scott, M.L.: Energy efficient prefetching and caching. In: Proceedings
of the Annual Conference on USENIX Annual Technical Conference, pp. 24–37. ACM,
Boston (2004)
10. Carrera, E.V., Pinheiro, E., Bianchini, R.: Conserving disk energy in network servers. In:
Proceedings of International Conference on Supercomputing, pp. 86–97. CiteSeer (2003)
11. Zhu, Q., Chen, Z., Tan, L., et al.: Hibernator: helping disk arrays sleep through the winter.
ACM SIGOPS Oper. Syst. Rev. 39(5), 177–190 (2005)
An Universal Perturbation Generator
for Black-Box Attacks Against Object
1 Introduction
In the field of computer vision, deep learning has become the main technology
to solve problems such as image classification, object detection, and semantic
segmentation. With the continuous development of deep learning technology
and the continuous improvement of computing resources, people are gradually
applying deep learning to security fields, such as mobile phone face recognition
and ATM facial recognition.
However, recent studies have shown that deep learning models are highly
susceptible to small perturbations. Szegedy et al. [13] first revealed this
vulnerability in image classification: adding a carefully crafted perturbation
can cause an image classifier to misclassify input images with extremely high
confidence, and the same perturbation can fool multiple image classifiers.
Since Szegedy et al. [13] showed that image classifiers based on deep neural
networks are vulnerable to adversarial examples, the topic has attracted
considerable research interest, and corresponding attack and defense methods
have emerged. Goodfellow et al. [4] argued that the linear behavior of deep
learning models in high-dimensional space makes adversarial examples possible.
They then proposed the FGSM (Fast Gradient Sign Method) attack, which generates
an adversarial example by taking a single step along the sign of the gradient of
the loss function with respect to the input, so that the loss increases. The
perturbation is imperceptible to human eyes, and the original input images are
classified correctly by the image classifiers with a high degree of confidence.
The adversarial examples are misclassified by the image classifier, but the
human eye can still recognize them correctly. In addition, researchers have
proposed attack methods such as Box-constrained L-BFGS [13], BIM [5], ILCM [5],
JSMA [11], C&W [2], DeepFool [9], MI-FGSM [3], One Pixel Attack [12], and
Decision-Based Attack [1].
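To make the FGSM step concrete, the sketch below applies it to a tiny logistic-regression model in NumPy. It only illustrates the general technique from [4]. The model, its weights, the input and the step size epsilon are hypothetical values chosen for the example, and this is not the attack proposed in this paper.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def loss_and_input_gradient(w, b, x, y):
    """Cross-entropy loss of a logistic model and its gradient w.r.t. the input x."""
    p = sigmoid(np.dot(w, x) + b)
    loss = -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    grad_x = (p - y) * w  # chain rule: dL/dz = p - y, dz/dx = w
    return loss, grad_x


def fgsm(w, b, x, y, epsilon):
    """One FGSM step: move x by epsilon along the sign of the input gradient."""
    _, grad_x = loss_and_input_gradient(w, b, x, y)
    return x + epsilon * np.sign(grad_x)


# Hypothetical model parameters and input (assumptions for illustration).
w = np.array([1.5, -2.0, 0.5])
b = 0.1
x = np.array([0.2, -0.4, 0.3])
y = 1.0
epsilon = 0.1

x_adv = fgsm(w, b, x, y, epsilon)
clean_loss, _ = loss_and_input_gradient(w, b, x, y)
adv_loss, _ = loss_and_input_gradient(w, b, x_adv, y)
```

Each input component changes by at most epsilon, so the perturbation is bounded in the L∞ norm, while the loss on the true label grows.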
Attack methods can be categorized in several ways. According to the goal of
the attack, they can be divided into targeted and non-targeted attacks. In a
targeted attack, the attacker wants the image classifier to misclassify the
adversarial example into a specific incorrect class chosen by the attacker. In a
non-targeted attack, the attacker only wants the image classifier to misclassify
the adversarial example. According to the attacker’s knowledge of the model,
attacks can be divided into white-box and black-box attacks. In the white-box
setting, we know all the information about the image classifier, such as its
structure and parameters and the training set used to train the model. In the
black-box setting, we know neither the internal information of the image
classifier nor the dataset used during training; we can only craft the
adversarial example by querying the classifier and observing its inputs and
outputs. The gray-box setting lies somewhere in between: we can obtain the
training data and the structure of the classifier used in the training process,
but not its specific parameters. According to the type of perturbation, attacks
can be divided into universal perturbation attacks and image-dependent
perturbation attacks. A universal perturbation is fixed and does not change with
the input image, whereas an image-dependent perturbation varies with the input
image. At present, there are more than ten attack methods in the field.
White-box attack methods include Box-constrained L-BFGS [13], FGSM [4],
BIM [5], ILCM [5], C&W [2], DeepFool [9], JSMA [11], I-FGSM [7], MI-FGSM
[3], etc.; black-box attack methods include One Pixel Attack [12], Alternative
Model Attack [6,14], and Decision-Based Attack [1].
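The difference between targeted and non-targeted attacks comes down to the success criterion. The sketch below states both criteria in Python and computes a success rate over a batch. The function names and the sample labels are hypothetical, and the exact success-rate definition used in the experiments may differ.

```python
def is_successful(true_label, adv_prediction, target_label=None):
    """Success criterion for one adversarial example.

    Non-targeted (target_label is None): any misclassification counts.
    Targeted: the prediction must equal the attacker's chosen class.
    """
    if target_label is None:
        return adv_prediction != true_label
    return adv_prediction == target_label


def attack_success_rate(true_labels, adv_predictions, target_label=None):
    """Fraction of adversarial examples that satisfy the success criterion."""
    outcomes = [
        is_successful(t, p, target_label)
        for t, p in zip(true_labels, adv_predictions)
    ]
    return sum(outcomes) / len(outcomes)


# Hypothetical labels for four adversarial examples.
true_labels = ["cat", "dog", "car", "cat"]
adv_predictions = ["dog", "dog", "bus", "bus"]

non_targeted_rate = attack_success_rate(true_labels, adv_predictions)
targeted_rate = attack_success_rate(true_labels, adv_predictions, target_label="bus")
```

On the same predictions the targeted rate can never exceed the non-targeted rate when the target differs from every true label, which reflects that a targeted attack is the stricter goal.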
The main contributions of this work can be summarized as follows:
The rest of the paper is organized as follows. We introduce related work on
adversarial attacks, such as Universal Adversarial Perturbation and Fast Feature
Fool, in Sect. 2. We describe the threat model and introduce some notation in
Sect. 3. In Sect. 4, we propose our approaches to generating universal
perturbations and attacking object detectors. The setup and results of our
experiments are presented in Sect. 5. Finally, we draw the conclusions of our
work in Sect. 6.
2 Related Work
3 Threat Model
Deep learning technology has achieved great success in the field of computer
vision. In object detection, DNNs are used to detect objects in the input
images. In our work, we attempt to add a small perturbation, imperceptible to
humans, to the input images so that the DNN classifies the objects in them into
an incorrect class or hides their bounding boxes. Next, we introduce some
popular object detectors and define notation that will be used frequently in
later sections.
YOLOv3 is currently one of the most popular one-stage object detectors, and it
can detect objects in real time. YOLO stands for “you only look once”: object
localization and classification are completed in one step. YOLO returns the
locations and the categories of the bounding boxes at the output layer, which is
what makes it a one-stage detector. In this way, YOLO can reach a speed of 45
frames per second, fully meeting real-time requirements.
Before YOLOv3, Faster R-CNN was the state-of-the-art detector. It detects
objects in two stages: first, the region proposal network judges whether each
candidate box contains a target, and then the network classifies which class the
target belongs to. The entire network shares the feature information extracted
by the convolutional neural network, which saves computational cost and solves
the problem that the Fast R-CNN algorithm generates candidate boxes slowly.
We judge the performance of a perturbation mainly from two aspects. One is the
attack success rate, which we will introduce later. The other is the norm of the
perturbation added to the original image. We use L∞ in our experiments, where
the p-norm is defined as:
$$\|x\|_p = \Big( \sum_{i} |x_i|^p \Big)^{1/p}. \tag{1}$$
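As an illustration of Eq. (1), the sketch below evaluates the p-norm for finite p and for the limiting case p = ∞ (the largest absolute component), and shows one common way to enforce an L∞ budget by clipping. The helper names and the sample vectors are hypothetical, and the clipping step is an assumption rather than a procedure stated in the text.

```python
import numpy as np


def p_norm(x, p):
    """p-norm of Eq. (1); p = np.inf gives the largest absolute component."""
    magnitudes = np.abs(np.asarray(x, dtype=float)).ravel()
    if np.isinf(p):
        return float(magnitudes.max())
    return float(np.sum(magnitudes ** p) ** (1.0 / p))


def clip_to_linf(perturbation, bound):
    """Force every component into [-bound, bound] so the L-infinity norm <= bound."""
    return np.clip(perturbation, -bound, bound)


# Hypothetical perturbation vectors used only for illustration.
delta = np.array([3.0, -4.0, 12.0])
l1 = p_norm(delta, 1)
l2 = p_norm(delta, 2)
linf = p_norm(delta, np.inf)

clipped = clip_to_linf(np.array([-30.0, 5.0, 25.0]), bound=20.0)
```

With L∞ = 20, as in the experiments, no single pixel value is changed by more than 20, regardless of how many pixels are perturbed.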
4 Approaches
In this section, we describe how we refine previous work and, for the first
time, propose using a universal adversarial perturbation to attack object
detectors. The section has two parts, which respectively introduce the
generation process of the universal adversarial perturbation and how the
transferability of the perturbation is used to attack object detectors across
tasks.
5 Evaluation
5.1 Setup
L∞ = 20 | First 20 classes | First 40 classes | First 60 classes | First 80 classes | First 100 classes
Resize  | 15.89%           | 30.77%           | 28.85%           | 28.85%           | 30.77%
Pile-up | 15.89%           | 35.58%           | 34.62%           | 28.85%           | 40.38%
To obtain more powerful and robust universal perturbations, we used all of the
classes in the ImageNet dataset to train the perturbation generator. Table 2
shows the attack success rate, and Fig. 3 depicts the perturbation added to
different input images.
L∞ = 20                       | Resize | Pile-up
Attack success rate on YOLOv3 | 37.70% | 40.98%
70 Y. Zhao et al.
Fig. 2. We used the first 100 classes in the ImageNet dataset to train the perturbation
generator. The top row shows input images, the second row shows adversarial examples
with resized perturbation, and the bottom row shows adversarial examples with pile-up
perturbation.
Fig. 3. Attacks against the YOLOv3 model when L∞ = 20. The top row shows input images,
the second row shows adversarial examples with resized perturbation, and the bottom
row shows adversarial examples with pile-up perturbation.
L∞ = 20                             | Resize | Pile-up
Attack success rate on Faster R-CNN | 37.50% | 81.25%
Fig. 4. Attacks against the Faster R-CNN model when L∞ = 20. The top row shows input
images, the second row shows adversarial examples with resized perturbation, and the
bottom row shows adversarial examples with pile-up perturbation.
6 Conclusions
In this paper, we proposed a cross-task universal perturbation attack against
black-box object detectors. We trained a deep neural network to generate
universal perturbations on a classifier, and then used the generated
perturbations to attack black-box object detectors. We conducted experiments on
two representative object detectors: the proposal-based Faster R-CNN and the
regression-based YOLOv3. We demonstrated the efficiency and transferability of
the universal perturbation generated by our attack. We also demonstrated the
feasibility of cross-task attacks in the field of computer vision, contributing
to the security of deep neural networks.
1. Brendel, W., Rauber, J., Bethge, M.: Decision-based adversarial attacks: reliable
attacks against black-box machine learning models (2017)
2. Carlini, N., Wagner, D.: Towards evaluating the robustness of neural networks. In:
Security & Privacy (2017)
3. Dong, Y., Liao, F., Pang, T., Hu, X., Zhu, J.: Discovering adversarial examples
with momentum (2017)
4. Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial
examples. Comput. Sci. (2014)
5. Kurakin, A., Goodfellow, I., Bengio, S.: Adversarial examples in the physical world
6. Liu, Y., Chen, X., Chang, L., Song, D.: Delving into transferable adversarial exam-
ples and black-box attacks (2016)
7. Madry, A., Makelov, A., Schmidt, L., Tsipras, D., Vladu, A.: Towards deep learning
models resistant to adversarial attacks (2017)
8. Moosavi-Dezfooli, S.M., Fawzi, A., Fawzi, O., Frossard, P.: Universal adversarial
perturbations (2017)
9. Moosavi-Dezfooli, S.M., Fawzi, A., Frossard, P.: DeepFool: a simple and accurate
method to fool deep neural networks. In: Computer Vision & Pattern Recognition
10. Mopuri, K.R., Garg, U., Babu, R.V.: Fast feature fool: a data independent approach
to universal adversarial perturbations (2017)
11. Papernot, N., Mcdaniel, P., Jha, S., Fredrikson, M., Celik, Z.B., Swami, A.: The
limitations of deep learning in adversarial settings. In: IEEE European Symposium
on Security & Privacy (2016)
12. Su, J., Vargas, D.V., Kouichi, S.: One pixel attack for fooling deep neural networks.
IEEE Trans. Evol. Comput. (2017)
13. Szegedy, C., et al.: Intriguing properties of neural networks. Comput. Sci. (2013)
14. Tramèr, F., Papernot, N., Goodfellow, I., Boneh, D., McDaniel, P.: The space of
transferable adversarial examples (2017)
A Brief Survey on Cyber Security Attack
Entrances and Protection Strategies
of Intelligent Connected Vehicle
1 Introduction
drivers’ lives and property. The first attack on car information systems occurred in
2010 [2]. As the scope of application of the ICV continues to expand, cyber security
attacks also continue to increase. Information tampering, virus intrusion and other
techniques have been successfully applied by hackers in cyber-attacks on smart
cars [3], which has aroused great concern from all walks of life.
This paper focuses on automobile security from the perspective of the vehicle
ecosystem. We identified the attack entrances of the ICV by testing vehicles and
through simulation, investigation and research. We then summarize the key strategies
of vehicle cyber security protection, which can be used as a reference in the design and
manufacture of automobiles to improve the overall information security level of
In the digital age, global connectivity has increased, making people more
connected, and the automotive industry has undergone the same transformation. With
the rise of ICVs and mobile travel services, the traditional vehicle ecosystem has
been further expanded to include technologies, services, infrastructure providers and
smart cities, making vehicles part of an interconnected system. The vehicle ecosystem
can be broadly classified into four categories: terminal equipment, cloud platforms,
third-party services, and communication and network transmission, as shown in Fig. 1.
Fig. 1. The diagram of vehicle ecosystem. It contains terminal equipment, cloud platforms, third
party service, and communication and network transmission.
The terminal equipment layer serves as a third-party interface that can be used to
create services and applications. It consists of two parts, inside the car and outside
the car. The equipment inside the vehicle includes the operating system (OS),
semiconductor chips, T-Box, IVI, On-Board Diagnostics (OBD), mobile applications
(APPs), vehicle APPs, and so on. Within the terminal equipment of the vehicle
ecosystem, people interact most directly with the OS, mobile APPs and vehicle APPs.
In many use cases, other advanced technologies such as artificial intelligence and
blockchain are also used in this equipment. The external infrastructure includes road
test equipment, road conditions, sensor devices, traffic conditions, weather
conditions, and other factors that may affect the vehicle.
The core of the cloud platforms is the connectivity platform, which consists of the
vehicle and the third-party services. Cloud technology also provides a platform for
users and vehicles to share information. Data obtained and shared through the vehicle
include the precise position of the vehicle, vehicle health and climatic conditions.
These data sources are connected to the cloud platforms, which requires the support
of cellular networks such as 4G and, in the future, 5G communication networks. Users
and vehicles can communicate in real time, and users can obtain vehicle maintenance
information in a timely manner. Vehicle service reservations become smarter and more
convenient. In the future, cloud storage will enable information sharing between
vehicles and people, vehicles and vehicles, and vehicles and infrastructure, laying
the foundation for optimizing travel, improving efficiency, realizing smart life, and
building a more complete intelligent ecosystem.
As new technologies develop in the automotive industry, more and more new players
have emerged as third-party service providers, such as internet and software
companies, sensor manufacturers, and travel service providers. Data management
and maintenance can also be outsourced to third-party organizations. These new
players form a new automotive ecosystem together with traditional vehicle
manufacturers and component suppliers. They may represent business (e.g. legal,
communications, purchasing) and technical organizations (e.g. engineering, IT) within
original equipment manufacturers (OEMs), suppliers, and other automotive
stakeholders. Within the OEM, telematics service provider (TSP), Tier 1 suppliers, and
other automotive industry stakeholders, the roles involved in vehicle cyber security
include product cybersecurity managers, support staff, crisis managers, executives,
legal counsel, and product managers. It is therefore better to find and fix issues in
vehicle ecosystems with the help of a variety of internal and external stakeholders.
Communication in the vehicle ecosystem includes in-vehicle communication and
V2X communication, such as vehicle-to-cloud, vehicle-to-infrastructure, and
vehicle-to-human communication. In-vehicle communication mostly relies on the CAN
bus, LIN bus, FlexRay bus, and MOST bus. All data transactions in the vehicle pass
through the gateway: a vehicle gateway that can communicate through various protocols
is installed in the vehicle. V2X communication includes 4G, LTE-V, Wi-Fi, Bluetooth,
USB, OTA, dedicated short range communications (DSRC), on-board diagnostics (OBD),
etc. DSRC is an efficient wireless communication technology that enables the
identification of, and two-way communication with, fast-moving targets in a small
area (usually tens of meters), such as “vehicle-road” and “vehicle-vehicle” two-way
communication. It transmits image, voice and data information in real time to connect
vehicles and roads organically. OBD can monitor the working status of the engine
electronic control
76 Z. Wang et al.
system and other functional modules of the vehicle in real time during the running of
the vehicle.
The vehicle ecosystem is therefore very broad: it contains numerous stakeholders and
many ways of communicating. As a result, the risks and threats faced by the vehicle
are numerous and ubiquitous, and the information security of cars has a long way to
go.
Any device connected to the Internet may be vulnerable to hackers. While enjoying the
convenience of the network, we must also face the “dark side” of the network: the
information security threat, to which the automotive industry is not immune. To ensure
the cybersecurity of the vehicle, based on the car’s ecosystem, we simulated the attack
entrances for vehicle cybersecurity and came up with the following common attack
3.4 T-BOX
T-BOX is the communication gateway of intelligent connected vehicles: almost all
communication, such as 4G, Wi-Fi, OTA and vehicle remote communication, is completed
by the T-BOX. It therefore plays an important role in the intelligent connected
vehicle. The main threat to the T-BOX is the man-in-the-middle attack, in which the
attacker hijacks the T-BOX session and listens to the communication data through
pseudo base stations and DNS hijacking. For example, in an embedded system, the UART
debug interface at the T-BOX hardware layer can be used to enter U-Boot for a firmware
upgrade, as shown in Fig. 2.
Fig. 3. Attackers can use the HeartBleed vulnerability of the cloud platform to directly read
server data, including users’ cookies and even plaintext accounts and passwords.
The complex application scenarios and technologies of intelligent connected vehicles
expose them to more security risks, so comprehensive measures are needed to protect
them [5, 6]. Based on the attack entrances above, we put forward several relevant and
common protection strategies: OBD firewall, encrypted transmission, system
protection, firmware hardening, application hardening, and code obfuscation [7, 8].
creating a new conditional jump code block that jumps either to the real basic block
or to another code block containing garbage instructions. The original basic block
will also be cloned and populated with randomly selected junk instructions. Instruction
substitution does not change the original control flow of the function, but replaces
ordinary arithmetic and Boolean operations with more complex operations. When
several equivalent sequences of instructions are available, one is selected at random
as the replacement (Fig. 4).
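Instruction substitution can be illustrated at the source level. The Python sketch below lists a few well-known integer identities that compute the same result as a plain addition or XOR, and picks one at random. It is a conceptual sketch only: real obfuscators apply such rewrites to compiler intermediate code or machine instructions, and the names ADD_VARIANTS, XOR_VARIANTS and substitute are hypothetical.

```python
import random

# Equivalent ways to compute a + b on integers.
ADD_VARIANTS = [
    lambda a, b: a - (-b),
    lambda a, b: (a ^ b) + 2 * (a & b),
    lambda a, b: (a | b) + (a & b),
]

# Equivalent ways to compute a ^ b on integers.
XOR_VARIANTS = [
    lambda a, b: (a | b) - (a & b),
    lambda a, b: (a & ~b) | (~a & b),
]


def substitute(variants, rng):
    """Randomly select one of several equivalent instruction sequences."""
    return rng.choice(variants)


rng = random.Random(2019)  # fixed seed only to make the example repeatable
obfuscated_add = substitute(ADD_VARIANTS, rng)
obfuscated_xor = substitute(XOR_VARIANTS, rng)

total = obfuscated_add(41, 1)
mixed = obfuscated_xor(0b1100, 0b1010)
```

Whichever variant is chosen, the result is unchanged, so the control flow and behaviour of the program stay the same while the decompiled code becomes harder to read.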
Fig. 4. Identifier obfuscation example in code obfuscation. (a) shows the code before
obfuscation and (b) after obfuscation; the obfuscation increases the difficulty of analyzing the decompiled
5 Conclusion
In general, protection measures for ICV cyber security concern not only the hardware
and software of the vehicle, but also communication and the cloud. Only by applying
cybersecurity protection measures across the entire automobile ecosystem can we cope
with the various possible cyber security issues and ensure the cyber security of ICVs
as far as possible.
This paper outlines vehicle attack entrances and protection strategies, analyzes
the common attack methods and potential security threats for automobiles, and
summarizes the corresponding protection measures for each security threat.
1. Ashibani, Y., Mahmoud, Q.H.: Cyber physical systems security: analysis, challenges and
solutions. Comput. Secur. 68, 81–97 (2017)
2. Chen, L.W., Syue, K.Z., Tseng, Y.C.: A vehicular surveillance and sensing system for car
security and tracking applications. In: Proceedings of the 9th ACM/IEEE International
Conference on Information Processing in Sensor Networks (IPSN 2010), pp. 426–427 (2010)
3. Okul, Ş., Aydin, M.A., Keleş, F.: Security problems and attacks on smart cars. In: Boyaci, A.,
Ekti, A.R., Aydin, M.A., Yarkan, S. (eds.) International Telecommunications Conference.
LNEE, vol. 504, pp. 203–213. Springer, Singapore (2019).
4. Wolf, M., Weimerskirch, A., Wollinger, T.: State of the art: embedding security in vehicles.
EURASIP J. Embed. Syst. 16(1) (2007)
5. Lee, C.H., Kim, K.H.: Implementation of IoT system using block chain with authentication
and data protection. In: 2018 International Conference on Information Networking (ICOIN),
pp. 936–940. IEEE (2018)
6. Alfred, J.R., Sidorov, S., Tsang, M.C., et al.: In-vehicle networking. U.S. Patent Application
15/270,957, 22 March 2018
7. Wroblewski, G.: General method of program code obfuscation (2002)
8. Pizzolotto, D., Fellin, R., Ceccato, M.: OBLIVE: seamless code obfuscation for Java
programs and Android apps. In: 2019 IEEE 26th International Conference on Software
Analysis, Evolution and Reengineering (SANER), pp. 629–633. IEEE (2019)
Hierarchically Channel-Wise Attention
Model for Clean and Polluted Water
Images Classification
1 Introduction
Rapid economic development, rapid population growth and over-exploitation of
natural resources may result in serious pollution of water ecosystems, e.g.,
rivers, lakes and seas, if pollution is not promptly monitored, controlled and
abated. In the last decade, water pollution monitoring systems based on cloud
and big data systems [7,13] have been established, relying on manual water
sampling and laboratory analysis. However, such low-efficiency monitoring
strategies introduce an obvious time delay. Real-time monitoring of sudden and
large-scale pollution outbreaks has thus attracted a lot of interest from
researchers and governments.
© Springer Nature Switzerland AG 2019
M. Qiu (Ed.): SmartCom 2019, LNCS 11910, pp. 83–92, 2019.
84 Y. Wu et al.
Fig. 1. Architecture of the proposed water pollution monitoring system, in which the
proposed hierarchically channel-wise attention model classifies the captured input
water images into clean and polluted categories with 10 subcategory labels.
$$m_i = \frac{1}{W \times H} \sum_{j=1}^{W} \sum_{k=1}^{H} u_i(j, k), \tag{1}$$
where the functions sig(), Nor() and η() refer to the sigmoid, normalization and
ReLU functions respectively, W1 and W2 are learnable parameter matrices, and b1
and b2 are bias vectors. This structure is adopted for the module for two
reasons: firstly, the designed structure must be capable of learning a highly
nonlinear interaction between channels, and secondly, it must
allow multiple channels to be emphasised opposed to one-hot activation. The
We adopt the popular VGG-16 network and modify it for the classification of
clean and polluted water images. VGG-16 is pre-trained on the ImageNet dataset
and achieves strong results for visual category classification. Building on
VGG-16, we propose to fine-tune it together with the proposed hierarchically
channel-wise attention model for accurate, task-specific classification, using
training sets of real data collected from sensors. Figure 3 gives an overview of
the proposed method, in which we construct two levels of channel-wise attention
modules to describe channel-wise attention in both a local and a global sense.
We build the channel-wise attention hierarchically because low-level and
middle-level features can be representative in ambiguous category-level
classification problems. Taking the classification of water images as an
example, texture, one of the most important low-level features, is the decisive
feature for recognizing a water surface polluted by oil, as shown in [10].
However, the decay of gradients and the semantic abstraction of higher layers in
a neural network may cause the importance of low-level or middle-level features
to be ignored. Based on these considerations, we regard the proposed
hierarchically channel-wise attention model as a way to emphasize low-level and
middle-level features in ambiguous category-level classification problems.
As shown in Fig. 3, the local channel-wise attention weights are built as a
function of the lth CNN channel feature U_{k,l} output by the kth VGG block and
are then applied to it as
$$\tilde{U}_{k,l} = U_{k,l} \cdot \Phi(U_{k,l}) \tag{4}$$
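The channel-wise reweighting of Eq. (4) can be sketched in NumPy as follows. The sketch assumes a squeeze-and-excitation style module in the spirit of [1]: channel means as in Eq. (1), then two learnable layers with ReLU and sigmoid. The exact composition, including where the normalization Nor() is applied, is not recoverable from the text, so Nor() is omitted here. The shapes, the reduction ratio and the random weights are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def relu(z):
    return np.maximum(z, 0.0)


def channel_attention(U, W1, b1, W2, b2):
    """Reweight feature maps U of shape (channels, height, width).

    Returns the reweighted maps and the per-channel weights.
    """
    m = U.mean(axis=(1, 2))              # per-channel mean over all positions, Eq. (1)
    hidden = relu(W1 @ m + b1)           # nonlinear interaction between channels
    weights = sigmoid(W2 @ hidden + b2)  # one weight per channel, not one-hot
    return U * weights[:, None, None], weights


# Hypothetical sizes and randomly initialised parameters (illustration only).
rng = np.random.default_rng(0)
channels, height, width, reduction = 8, 4, 4, 2
U = rng.normal(size=(channels, height, width))
W1 = rng.normal(size=(channels // reduction, channels))
b1 = np.zeros(channels // reduction)
W2 = rng.normal(size=(channels, channels // reduction))
b2 = np.zeros(channels)

U_tilde, weights = channel_attention(U, W1, b1, W2, b2)
```

The sigmoid output lets several channels be emphasised at once, which matches the stated requirement that the module should not behave like a one-hot selector.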
3 Experiments
Since there is no benchmark dataset for clean and polluted water image
classification in the literature, we follow [11] and use the dataset they
collected. The dataset consists of 1000 water images, including images taken
from standard water videos [3] and from internet sources such as Google, Bing
and Baidu. The clean water image class is divided into four sub-classes, namely,
Fountain, Lakes, Oceans and Rivers. Similarly, the polluted water image class is
labeled with six sub-classes, namely, Algaes, Dead animals, Fungus, Industrial
pollution, Oils and Rubbish. In summary, the dataset considered in this paper is
complex, with high intra-class variation and low inter-class variation.
In the experiments, we use 750 images as the training set and the remaining
images as the testing set. We use 4-fold cross validation to evaluate
classification in terms of precision, recall and f-score. The proposed model is
trained within 150 epochs with a batch size of 64. The initial learning rates of
the logistic regression layer and of the other layers are set to 5e−3 and 5e−4,
respectively. To make convergence faster and more stable, we divide all learning
rates by 10 whenever the validation accuracy begins to decrease.
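The learning-rate rule described above can be sketched as a small scheduler. This is a minimal illustration and not the authors' training code: the class name, the group names "classifier" and "backbone", and the choice to compare each epoch only with the previous one are assumptions, and the accuracy values are made up.

```python
class DivideOnDropSchedule:
    """Divide every learning rate by `factor` when validation accuracy drops."""

    def __init__(self, learning_rates, factor=10.0):
        self.learning_rates = dict(learning_rates)
        self.factor = factor
        self.previous_accuracy = None

    def step(self, validation_accuracy):
        """Update the learning rates after one epoch and return them."""
        dropped = (
            self.previous_accuracy is not None
            and validation_accuracy < self.previous_accuracy
        )
        if dropped:
            self.learning_rates = {
                name: rate / self.factor
                for name, rate in self.learning_rates.items()
            }
        self.previous_accuracy = validation_accuracy
        return self.learning_rates


# Initial rates from the text: 5e-3 for the last layer, 5e-4 for the others.
schedule = DivideOnDropSchedule({"classifier": 5e-3, "backbone": 5e-4})

# Hypothetical validation accuracies over four epochs.
history = [dict(schedule.step(acc)) for acc in [0.60, 0.70, 0.68, 0.72]]
```

In this example the rates stay at their initial values for the first two epochs and are divided by 10 once, after the third epoch, when the accuracy falls from 0.70 to 0.68.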
Fig. 4. Confusion matrices on classification for 10 classes of clean and polluted water
images, where (a)–(d) show the results of the proposed method, the proposed method
without attention, [11] and [3], respectively.
four and six classes) and accuracy for total water images (with ten classes) in
Table 1. It can be seen that the average accuracy of the proposed method is the
best among the compared methods. The main reason for the poor results of [3] is
that its descriptors are not robust enough to classify clean and polluted water
images. In fact, [3] was originally designed to perform water detection in
videos and requires high contrast and clear object shapes to achieve high
accuracy. The presence of irregular objects in polluted images, such as rubbish,
oil and dead animals, thus leads to poor classification results. [11] takes
advantage of HSV and the Fourier spectrum to extract distinguishing features in
the frequency domain and thereby overcome the difficulties caused by irregular
objects, which results in much better performance than [3].
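The precision, recall and f-score measurements used in this comparison can be derived from a confusion matrix such as those in Fig. 4. The sketch below shows the standard per-class computation. It assumes that rows are true classes and columns are predicted classes, and the 2×2 matrix is made-up data, not a result from the paper.

```python
import numpy as np


def per_class_metrics(confusion):
    """Per-class precision, recall and F-score from a confusion matrix.

    Assumes confusion[i, j] counts samples of true class i predicted as class j.
    """
    confusion = np.asarray(confusion, dtype=float)
    true_positive = np.diag(confusion)
    predicted = confusion.sum(axis=0)  # column sums: how often each class was predicted
    actual = confusion.sum(axis=1)     # row sums: how many samples each class has
    zeros = np.zeros_like(true_positive)
    precision = np.divide(true_positive, predicted, out=zeros.copy(), where=predicted > 0)
    recall = np.divide(true_positive, actual, out=zeros.copy(), where=actual > 0)
    total = precision + recall
    f_score = np.divide(2 * precision * recall, total, out=zeros.copy(), where=total > 0)
    return precision, recall, f_score


# Hypothetical two-class confusion matrix (illustration only).
example = [[8, 2],
           [1, 9]]
precision, recall, f_score = per_class_metrics(example)
```

A class such as rubbish can therefore show a high recall while another class that is often confused with it, such as dead animals, shows a very low one, even when overall accuracy looks acceptable.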
The many complex patterns in clean and polluted water images are hard to
classify with a single manually designed feature. This is illustrated by
Fig. 4(c), where the rate at which dead animals are misclassified as rubbish is
as high as 0.89, while the rate at which rubbish is correctly classified is as
high as 0.87. The good performance on rubbish and the poor performance on dead
animals arise because the manually designed feature is highly sensitive and
effective for rubbish rather than for dead animals. Furthermore, the small
inter-class variations between rubbish and dead animals and high intra-class
variations of rubbish make the
Fig. 5. Samples of correctly classified and misclassified clean and polluted
water images obtained by the proposed method.
4 Conclusion
In this paper, we propose a novel hierarchically channel-wise attention model
incorporated into a CNN structure for the classification of clean and polluted
water images. Experimental results on a recent water image dataset, with several
comparative methods, demonstrate the effectiveness and robustness of the
proposed method for water image classification. Our future work includes
exploring the implementation of the proposed method in a water pollution
monitoring system to help promptly monitor, control and abate water pollution.
1. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. arXiv preprint
arXiv:1709.01507 (2017)
2. Khan, M., Wu, X., Xu, X., Dou, W.: Big data challenges and opportunities in
the hype of industry 4.0. In: Proceedings of IEEE International Conference on
Communications, pp. 1–6 (2017)
3. Mettes, P., Tan, R.T., Veltkamp, R.C.: Water detection through spatio-temporal
invariant descriptors. Comput. Vis. Image Underst. 154, 182–191 (2017)
4. Mnih, V., Heess, N., Graves, A., Kavukcuoglu, K.: Recurrent models of visual
attention. In: Proceedings of NIPS, pp. 2204–2212 (2014)
5. Prasad, M.G., Chakraborty, A., Chalasani, R., Chandran, S.: Quadcopter-
based stagnant water identification. In: Proceedings of Fifth National Confer-
ence on Computer Vision, Pattern Recognition, Image Processing and Graphics
(NCVPRIPG), pp. 1–4 (2015)
6. Qi, L., Chen, Y., Yuan, Y., Fu, S., Zhang, X., Xu, X.: A QoS-aware virtual machine
scheduling method for energy conservation in cloud-based cyber-physical systems.
World Wide Web, pp. 1–23 (2019)
7. Qi, L., et al.: Finding all you need: web APIs recommendation in web of things
through keywords search. IEEE Trans. Comput. Soc. Syst. 6(5), 1063–1072 (2019)
8. Qi, X., Li, C.G., Zhao, G., Hong, X., Pietikäinen, M.: Dynamic texture and scene
classification by transferring deep image features. Neurocomputing 171, 1230–1241
9. Rankin, A.L., Matthies, L.H., Bellutta, P.: Daytime water detection based on sky
reflections. In: Proceedings of ICRA, pp. 5329–5336 (2011)
10. Santana, P., Mendonça, R., Barata, J.: Water detection with segmentation guided
dynamic texture recognition. In: Proceedings of IEEE International Conference on
Robotics and Biomimetics (ROBIO), pp. 1836–1841 (2012)
11. Wu, X., Shivakumara, P., Zhu, L., Lu, T., Pal, U., Blumenstein, M.: Fourier trans-
form based features for clean and polluted water image classification. In: Proceed-
ings of International Conference on Pattern Recognition (2018)
12. Xu, X., Liu, Q., Zhang, X., Zhang, J., Qi, L., Dou, W.: A blockchain-powered
crowdsourcing method with privacy preservation in mobile environment. In: IEEE
Transactions on Computational Social Systems (2019)
13. Xu, X., Zhang, X., Gao, H., Xue, Y., Qi, L., Dou, W.: Become: blockchain-enabled
computation offloading for iot in mobile edge computing. In: IEEE Transactions
on Industrial Informatics (2019)
14. Zhao, W., Du, S.: Spectral-spatial feature extraction for hyperspectral image clas-
sification: a dimension reduction and deep learning approach. IEEE Trans. Geosci.
Rem. Sens. 54(8), 4544–4554 (2016)
Energy-Efficient Approximate Data
Collection and BP-Based Reconstruction
in UWSNs
1 Introduction
It is known that the propagation speed for an acoustic link is about 1,500
meters/sec, which is five orders of magnitude lower than that of a radio link.
Hence, underwater acoustic communication in UWSNs consumes far more energy
than terrestrial radio communication. Recharging or replacing the batteries of
underwater sensors is also difficult, and the data retransmissions built into
dedicated transfer protocols cost extra energy. To conserve the limited power,
approximate data collection is an efficient scheme for reducing the energy
consumed by sensing and communication. Approximate strategies transmit only
part of the data to represent the whole, so not every sensor needs to be
active and not every raw sensed value needs to be delivered.
Moreover, data collected at adjacent times or in the same spatial region are
known to be highly correlated. This correlation enables approximate data
collection through mathematical prediction models (e.g., least mean square [3],
Compressive Sensing (CS) [13]). However, these methods are efficient only for
simple scenarios or under certain preconditions. For instance, CS-based methods
assume that the measurement matrices are sparse [4], which may not be practical
in some WSN scenarios. With cloud and IoT systems now connected, more
statistical learning models, such as belief propagation (BP) [6], can be used to
exploit complicated correlations, with the data inference/prediction process
moved from the nodes to the cloud.
First, we model the selection of sensing sensors as the construction of a most
representative dominating set that takes both residual energy and node
correlation into account. Data transmission in UWSNs is prone to frequent data
loss caused by the harsh underwater environment (e.g., ambient noise, packet
failure), which may lower the quality of the collected data and cause prediction
or analysis to fail. To assure data quality, the sensor selection should ensure
that sufficient data are collected in the initial sensing phase. To address this
problem, we borrow the idea of fault-tolerant nodes and convert sensor selection
into a minimum m-dominating set problem, which is known to be NP-hard [5]. Using
the correlation information pre-learned at the center, we propose a heuristic
sensor selection algorithm in which the node with the largest weight is greedily
selected as a dominating node. Here the node weight is defined by modeling the
combined factor of energy cost and node correlation.
With the computing ability of cloud resources, data inference, evaluation
and improvement are performed at the end of each cycle. For missing data, we
use a well-known algorithm, BP [6], to perform inference on a graphical model.
BP is efficient and can integrate multiple correlations (e.g., temporal,
spatial, multivariate) to provide high inference accuracy. The contributions of
this paper are the following:
(1) We first model the sensor selection problem in UWSNs as a multi-objective
optimization problem, and propose a heuristic sensor selection algorithm to solve
the problem.
(2) By modeling temporal, spatial and data correlation, we use the Belief
Propagation (BP) algorithm to infer uncollected and missing data.
(3) Experiments with real-world data are conducted to validate the efficiency
of our approximate sensor selection and data reconstruction strategies.
2 Problem Formulation
Assume there is a set of sensors $S = \{s_1, s_2, \cdots, s_n\}$, and the i-th
sensor $s_i$ is anchored at the location coordinate $l_i = [l_i^x, l_i^y, l_i^z]$.
All data collected by the sensor network can be written as $D$, which
corresponds to an approximate dataset, $\hat{D}$. We then use a graph
$G = (V, E)$ to model the sensor network, where $V$ is the vertex set of sensor
nodes and $E$ is the set of edges.
Given a data collection task with $C$ cycles, let $x_i^c$ be a binary decision
variable: $x_i^c = 1$ if sensor $s_i$ is selected at cycle $c$, and $x_i^c = 0$
otherwise. Let $E_r(i)$ be the
Constraint (1a) means that the consumed energy cannot exceed its low
threshold. The value of $E_c(i)$ depends on the size of the dataset,
$\hat{D}_i$, collected by node $s_i$. In our scenario, we assume the
transmission path of a deployed node is fixed so that we can estimate $E_c(i)$
by data size. Let $p_i^{out}$ and $p_i^{in}$ be the energy consumption of sensor
$s_i$ on transmitting and receiving through acoustic channels, respectively.
Then:
$$E_c(i) = p_i^{out} + \sum_{k} \left(p_k^{in} + p_k^{out}\right) \qquad (2)$$
96 Y. Liu et al.
where $p_i^{out}$ or $p_i^{in}$ can be estimated using the Urick propagation
model [7], which includes two acoustic propagation mechanisms. One is
cylindrical spreading for shallow water (depth ≤ 100 m) and the other is
spherical spreading for deep water. In our paper, we consider the energy
consumption of the second mechanism.
Constraint (1b) requires that the overall error not exceed the predefined
error bound. In TWSNs or other IoT-based networks (e.g., crowdsensing), it is
feasible to construct a real-time feedback mechanism that ensures data quality
over high-speed communication [4]. In such a mechanism, the decision maker
selects more sensors to send collected data if the current estimated quality
does not satisfy the predefined requirement. However, the mechanism is not
practical for underwater networks, which are constrained by long propagation
delay. To ensure data quality, the selected sensors need to provide as much
collected data as possible in the initial collection cycle.
Drawing on the use of dominating sets in wireless networks, the problem can be
further converted into that of constructing a minimum dominating set based on
learned correlation information. Compared with communication in TWSNs, data
loss in underwater transmission is much more frequent, as mentioned in Sect. 1.
Inspired by the construction of fault-tolerant virtual backbones in WSNs, we
formulate sensor selection as a minimum m-dominating set problem. The selected
node subset has a certain degree of redundancy to alleviate the effect of data
loss caused by routing failure on the inference quality. The quality constraint
(1b) is converted to the expression of an m-dominating set:
$$x_i^c + \frac{1}{m} \sum_{j \in N(i)} x_j^c \geq 1, \quad \forall i \in n;\ \forall c \in C \qquad (3)$$
where Eq. (3) requires that each unselected sensor be represented by at least m
selected sensors. Before selection decisions are made, an off-line learning step
obtains the initial setting of m from historical collected data [8]. The value
of m also depends on the loss rate of the final collected data, for which we
consider the average loss rate.
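As a minimal sketch of what the m-dominating condition in Eq. (3) demands, the check below verifies that every unselected sensor has at least m selected neighbors. The five-node topology and the function name are hypothetical and serve only as an illustration.

```python
def satisfies_m_domination(neighbors, selected, m):
    """Return True if every unselected node has at least m selected neighbors."""
    for node, nbrs in neighbors.items():
        if node in selected:
            continue  # a selected node satisfies the constraint by itself
        covered = sum(1 for j in nbrs if j in selected)
        if covered < m:
            return False
    return True


# Hypothetical topology: adjacency lists of an undirected sensor graph.
neighbors = {
    1: [2, 3],
    2: [1, 3, 4],
    3: [1, 2, 5],
    4: [2, 5],
    5: [3, 4],
}

print(satisfies_m_domination(neighbors, {2, 3}, m=1))
print(satisfies_m_domination(neighbors, {2, 3}, m=2))
print(satisfies_m_domination(neighbors, {2, 3, 5}, m=2))
```

With m = 1 the set {2, 3} is enough, while m = 2 forces an additional dominator, which is the redundancy the formulation relies on to tolerate data loss.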
(1) Temporal correlation: The correlation between the observations at time t and
t − 1 can be defined as:
$$\psi_i^t(t, t-1) = \exp\left(-\frac{(d_i^t - d_i^{t-1})^2}{\sigma_i^2}\right) \qquad (4)$$
(2) Spatial correlation: Generally, the closer two sensors are, the more similar
their collected data. We compute the Euclidean distance between two nearby
sensors to model their spatial correlation. Similarly, the normalized function
with range [0,1] is:
$$\psi^{dis}(i, j) = \exp\left(-\frac{\|l_i - l_j\|^2}{\sigma_{ij}^2}\right) \qquad (5)$$
where $\|l_i - l_j\|^2 = (l_i^x - l_j^x)^2 + (l_i^y - l_j^y)^2 + (l_i^z - l_j^z)^2$.
The parameters $\sigma_i^2$ and $\sigma_{ij}$ can be learned from training data.
(3) Data correlation: In heterogeneous IoT networks, sensing data are generally
multivariate. For instance, underwater environmental parameters include
temperature, salinity and conductivity. Let $d_i$, $d_j$ be the data vectors of
sensors $s_i$ and $s_j$ in different dimensions; the cosine similarity between
the two vectors is:
$$\psi^{da}(i, j) = \frac{d_i \cdot d_j}{\|d_i\| \cdot \|d_j\|} \qquad (6)$$
The off-line phase utilizes historical data to learn the node correlation ψ(i, j).
Here we jointly consider the distance and data similarity with its neighbors:
where α ∈ [0, 1] is a tradeoff factor to adjust the weight between the spatial and
data correlations.
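The three correlation measures can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' implementation: the Gaussian forms divide by a single variance parameter, and the combination of spatial and data correlation is assumed to be a convex combination weighted by α, which is one plausible reading of the tradeoff factor. All function names are hypothetical.

```python
import math


def temporal_corr(d_t, d_prev, sigma_sq):
    """Gaussian similarity between two consecutive readings of one sensor."""
    return math.exp(-((d_t - d_prev) ** 2) / sigma_sq)


def spatial_corr(l_i, l_j, sigma_sq):
    """Gaussian similarity based on the squared Euclidean distance of two positions."""
    dist_sq = sum((a - b) ** 2 for a, b in zip(l_i, l_j))
    return math.exp(-dist_sq / sigma_sq)


def data_corr(d_i, d_j):
    """Cosine similarity of two multivariate data vectors (non-zero vectors assumed)."""
    dot = sum(a * b for a, b in zip(d_i, d_j))
    norm_i = math.sqrt(sum(a * a for a in d_i))
    norm_j = math.sqrt(sum(b * b for b in d_j))
    return dot / (norm_i * norm_j)


def node_corr(l_i, l_j, d_i, d_j, sigma_sq, alpha):
    """ASSUMPTION: psi(i, j) = alpha * spatial + (1 - alpha) * data correlation."""
    return alpha * spatial_corr(l_i, l_j, sigma_sq) + (1 - alpha) * data_corr(d_i, d_j)


# Two hypothetical sensors 100 m apart with (temperature, salinity) readings.
print(node_corr((0, 0, 50), (100, 0, 50), (12.1, 35.0), (12.4, 34.8),
                sigma_sq=250_000.0, alpha=0.5))
```

Setting α = 1 reduces the combined measure to the spatial term alone, and α = 0 to the cosine similarity.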
Modeling NI. The main idea is to model the impact of the node on other
neighbor nodes where we consider the interaction between nodes:
$$NI(i) = \sum_{j \in N(i)} \psi(i, j) \left( \frac{\psi(i, j)}{\sum_{k \in N(j)} \psi(k, j)} \right) \qquad (8)$$
To solve the optimization problem of Eq. (1), the two objectives, energy cost
and node correlations, are put together to represent the selection priority:
w(si ) = β · e− Er (i) + (1 − β) · N I(i) (9)
where β ∈ [0, 1] determines the proportion of the two objectives in the weight
function.
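The node impact NI(i) of Eq. (8) can be sketched as follows, reading the formula as a sum over the neighbors j of node i, where each term is ψ(i, j) scaled by the share of ψ(i, j) among all correlations that j has with its own neighbors. This reading, the function name and the toy graph are assumptions made for illustration; an undirected graph is assumed so that each denominator is positive.

```python
def node_impact(i, neighbors, psi):
    """NI(i): sum over neighbors j of psi(i, j) * psi(i, j) / sum_k psi(k, j)."""
    total = 0.0
    for j in neighbors[i]:
        # Total correlation that node j has with all of its own neighbors.
        denom = sum(psi(k, j) for k in neighbors[j])
        total += psi(i, j) * psi(i, j) / denom
    return total


# Hypothetical path graph a - b - c with unit correlations.
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
psi = lambda i, j: 1.0

print(node_impact("b", neighbors, psi))
print(node_impact("a", neighbors, psi))
```

In this toy graph the middle node has a larger impact than the end nodes, because it is the only neighbor of each end node.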
Although a distributed algorithm is flexible, it requires the sensors themselves
to pre-learn the prior knowledge, and each node must store historical data about
itself and its neighbors. Considering the limited storage and computing
resources, we let the center determine the sensor selection strategy. The
heuristic algorithm (CASS) greedily searches nodes based on their weights. In
the algorithm description, the status of a sensor node is one of three types:
selected, pending or unselected. The related parameters, including NI(i), Er(i)
and w(si), are updated. Based on the estimated energy cost and pre-learned node
correlations, the algorithm calculates the nodes' weights and sorts them in
descending order. In each round, the algorithm chooses the node with the largest
weight in the pending list as the dominator node and adds it to the selected
set. At that moment, each of its neighbors is dominated by at least one more
node, so the dominant value of each neighbor is reduced by one. If the dominant
value reaches 0 after the decrement, the node is dominated by m nodes and is
marked as unselected. The algorithm ends when the pending list is empty, at
which point all nodes are marked as selected or unselected.
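The greedy procedure described above can be sketched as follows. This is a simplified illustration, not the paper's algorithm listing: weights are passed in as fixed numbers (the per-round updates of NI(i), Er(i) and w(si) are omitted), ties are broken by node id, and the topology and weights are hypothetical.

```python
def greedy_select(neighbors, weights, m):
    """Greedy m-dominating set construction driven by node weights."""
    remaining = {node: m for node in neighbors}  # dominators each node still needs
    pending = set(neighbors)
    selected, unselected = set(), set()
    while pending:
        # Pick the pending node with the largest weight (smallest id on ties).
        best = min(pending, key=lambda n: (-weights[n], n))
        pending.remove(best)
        selected.add(best)
        for nbr in neighbors[best]:
            if nbr in pending:
                remaining[nbr] -= 1
                if remaining[nbr] == 0:
                    # nbr is now dominated by m selected nodes.
                    pending.remove(nbr)
                    unselected.add(nbr)
    return selected, unselected


# Hypothetical topology and weights.
neighbors = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2, 5], 4: [2, 5], 5: [3, 4]}
weights = {1: 0.2, 2: 0.9, 3: 0.8, 4: 0.4, 5: 0.5}

print(greedy_select(neighbors, weights, m=1))
print(greedy_select(neighbors, weights, m=2))
```

A node with fewer than m neighbors can never reach a dominant value of 0, so in this sketch it stays pending until it is selected itself.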
Fig. 1. The constructed MRF graph model.
Fig. 2. The illustration of iterative message update and belief computation.
where $m_{ki}$ represents the incoming messages from the neighbors of node i
except for node j. The message update is an iterative process in which the
messages from the previous iteration are the input for computing the current
messages. BP performs message updates until Eq. (11) converges after enough or a
pre-set number of iterations.
Figure 2 shows the belief propagation process where we illustrate the case of
message update between si and sk and belief computation of si .
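The iterative message update and belief computation can be illustrated with the textbook sum-product form of BP on a pairwise MRF with discrete states. This is a generic sketch and may differ from the paper's exact equations; the potentials, the state discretization and all names are assumptions. The pairwise potential is assumed symmetric, i.e. psi(i, j, a, b) equals psi(j, i, b, a).

```python
def belief_propagation(nodes, edges, phi, psi, states, iterations=10):
    """Synchronous sum-product BP; returns normalized beliefs per node."""
    neighbors = {n: [] for n in nodes}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    # msgs[(i, j)][x_j] is the message from node i to node j, initially uniform.
    msgs = {(i, j): {x: 1.0 for x in states}
            for i in nodes for j in neighbors[i]}
    for _ in range(iterations):
        new_msgs = {}
        for (i, j) in msgs:
            out = {}
            for x_j in states:
                total = 0.0
                for x_i in states:
                    # Product of messages into i from all neighbors except j.
                    incoming = 1.0
                    for k in neighbors[i]:
                        if k != j:
                            incoming *= msgs[(k, i)][x_i]
                    total += phi[i][x_i] * psi(i, j, x_i, x_j) * incoming
                out[x_j] = total
            norm = sum(out.values())
            new_msgs[(i, j)] = {x: v / norm for x, v in out.items()}
        msgs = new_msgs
    beliefs = {}
    for i in nodes:
        # Belief: local evidence times all incoming messages, then normalized.
        b = {x: phi[i][x] for x in states}
        for k in neighbors[i]:
            for x in states:
                b[x] *= msgs[(k, i)][x]
        norm = sum(b.values())
        beliefs[i] = {x: v / norm for x, v in b.items()}
    return beliefs


# Hypothetical chain 0 - 1 - 2: node 0 is observed (strong evidence for state 0),
# nodes 1 and 2 are missing readings with flat evidence.
nodes, edges, states = [0, 1, 2], [(0, 1), (1, 2)], [0, 1]
phi = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}, 2: {0: 0.5, 1: 0.5}}
psi = lambda i, j, a, b: 0.8 if a == b else 0.2

print(belief_propagation(nodes, edges, phi, psi, states))
```

On a tree-structured graph such as this chain, the beliefs equal the exact marginals once messages have travelled across the graph; on graphs with loops, BP is an approximation and convergence is not guaranteed.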
For the evidence function ψi (di , dˆi ), the correlation between the true reading
and observation can be defined based on Gaussian Kernel [9]:
(d i − dˆi )2
ψi (di , dˆi ) = exp − (14)
5 Evaluation
We now evaluate the performance of our proposed approximate data collection
strategy. In the underwater network simulation, sensors are randomly deployed
in a 3D grid space of 3.33 km × 3 km × 200 m, which is divided into
6 × 5 × 5 cells (latitude × longitude × depth). The data transmission rate is
set to 2000 bit/s. The initial energy is 100 J and the low energy threshold is
10 J. The powers for receiving and sending data are 0.03 W and 0.67 W,
respectively.
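To give a feel for these settings, the short calculation below converts a payload size into airtime and energy using energy = power × bits / rate. The 1000-bit packet size is a hypothetical value chosen for illustration; the rate and power figures are the ones stated in the setup.

```python
RATE_BPS = 2000   # data transmission rate (bit/s)
P_SEND_W = 0.67   # sending power (W)
P_RECV_W = 0.03   # receiving power (W)


def transfer_energy(bits, power_w, rate_bps=RATE_BPS):
    """Energy in joules spent on a transfer: power times airtime (bits / rate)."""
    return power_w * bits / rate_bps


packet_bits = 1000  # hypothetical packet size
print(transfer_energy(packet_bits, P_SEND_W))  # energy to send one packet
print(transfer_energy(packet_bits, P_RECV_W))  # energy to receive one packet
```

Sending dominates receiving by more than a factor of twenty under these powers, which is why limiting the number of transmitting sensors matters most for the energy budget.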
Fig. 3. The comparison of energy balance coefficients (energy balance coefficient versus m; series: DASS, MED, Random).
Fig. 4. Total energy consumption during 5 cycles (total energy consumption versus cycle; one series per value of m).
Fig. 5. Missing value inference error for temperature (MAE in °C versus m).
Fig. 6. Missing value inference error for salinity (MAE versus m).
Fig. 7. Missing value inference error for heat content (MAE in 10^8 J/m^2 versus m).
Fig. 8. MAE comparison with and without data correlation ψ^da.
6 Conclusion
In this paper, we propose an energy-efficient approximate data collection
strategy for UWSNs. To design an efficient data collection strategy, the
specific features of underwater scenarios, namely long propagation delay and
frequent packet failure, cannot be ignored. Taking these features into account,
we formulate the sensor selection problem as a minimum m-dominating set problem
and design a heuristic distributed algorithm that integrates the two factors of
energy cost and node correlation. Missing values are then inferred with the
belief propagation method. The simulation results on real-world ocean datasets
validate the effectiveness of our proposed approximate strategy.
1. Heidemann, J., Stojanovic, M., Zorzi, M.: Underwater sensor networks: applica-
tions, advances and challenges. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci.
370(1958), 158–175 (2012)
2. Garcia, M., Sendra, S., Atenas, M., Lloret, J.: Underwater wireless ad-hoc net-
works: a survey. In: Mobile Ad hoc Networks: Current Status and Future Trends,
pp. 379–411 (2011)
3. Wu, M., Tan, L., Xiong, N.: Data prediction, compression, and recovery in clustered
wireless sensor networks for environmental monitoring applications. Inf. Sci. 329,
800–818 (2016)
4. He, S., Shin, K.G.: Steering crowdsourced signal map construction via bayesian
compressive sensing. In: IEEE INFOCOM 2018-IEEE Conference on Computer
Communications, pp. 1016–1024. IEEE (2018)
5. Dai, F., Wu, J.: On constructing k-connected k-dominating set in wireless networks.
In: 2005 Proceedings of the 19th IEEE International Parallel and Distributed Pro-
cessing Symposium, p. 10. IEEE (2005)
6. Yedidia, J.S., Freeman, W.T., Weiss, Y.: Understanding belief propagation and its
generalizations. Exploring Artif. Intell. New Millennium 8, 236–239 (2003)
7. Etter, P.C.: Underwater Acoustic Modeling and Simulation. CRC Press, Boca
Raton (2018)
8. Bijarbooneh, F.H., Du, W., Ngai, E.C.-H., Fu, X., Liu, J.: Cloud-assisted data
fusion and sensor selection for Internet-of-things. IEEE Internet Things J. 3(3),
257–268 (2016)
9. Takeda, H., Farsiu, S., Milanfar, P., et al.: Kernel regression for image processing
and reconstruction. Ph.D. dissertation, Citeseer (2006)
10. An, Y., Li, C., Wang, G., Zhang, R., Wang, H.: User’s manual of global Argo
dataset index and query system (version 1.0), p. 11 (2012)
11. Hong, Z., Pan, X., Chen, P., Su, X., Wang, N., Lu, W.: A topology control with
energy balance in underwater wireless sensor networks for IoT-based application.
Sensors 18(7), 2306 (2018)
12. Pan, L., Li, J.: K-nearest neighbor based missing data estimation algorithm in
wireless sensor networks. Wirel. Sens. Netw. 2(02), 115 (2010)
13. Kong, L., Xia, M., Liu, X.-Y., Wu, M.-Y., Liu, X.: Data loss and reconstruction in
sensor networks. In: Proceedings IEEE INFOCOM, pp. 1654–1662. IEEE (2013)
14. Yen, H.-C., Wang, C.-C.: Cross-device Wi-Fi map fusion with gaussian processes.
IEEE Trans. Mob. Comput. 16(1), 44–57 (2016)
I2P Anonymous Communication Network
Measurement and Analysis
1 Introduction
Network security, including cloud security [1], Internet of Things security [2, 3]
and mobile security [4], has attracted more and more attention in recent years.
Anonymous communication over the Internet is a very active field. An anonymous
network communication system is designed to provide communication services in
which users cannot be identified by third-party entities. There are three aspects of
anonymous communication: sender anonymity, recipient anonymity, and relationship
anonymity. Anonymous communication is also a double-edged sword. Normal users
can protect their personal privacy through an anonymous communication system,
while malicious users can use the same system to carry out illegal activities.
For example, when a DDoS attack is launched through anonymous communication, it
is very difficult to trace the source.
TOR and I2P are well-known anonymous networks. TOR aims to let users access the
Internet anonymously and reach restricted or blocked services. I2P is quite
similar to the TOR Project [5]. I2P is designed to protect messages inside its
network, so access is granted to all participants. To gather facts about the
structure and size of the I2P network, this paper makes the following contributions:
1. We proposed a combined measurement method, which includes passive measure-
ment and active measurement. In this approach, passive measurement can collect
most nodes, while active measurement is its complement, and its input is the result
set of passive measurement. This combination greatly reduces the number of
missing nodes in a single measurement.
2. According to the measurement results, we analyzed the I2P network from six
aspects including key space distribution, subnet, country distribution, bandwidth
distribution, software version distribution, and FloodingFill node attributes.
3. We discussed the vulnerability of I2P network and proposed a security optimization
2 Related Work
Although I2P supports anonymous communication, researchers have still found
vulnerabilities in it. An I2P monitoring study [17], which set out to identify
published I2P applications and characterize the usage of the I2P network, showed
that web servers and file-sharing clients accounted for a large proportion.
Herrmann et al. [18] presented a de-anonymization attack based on peer
selection, in which the adversary launches a DoS attack to force the target to
choose peers controlled by the adversary. Recently, a USB side-channel attack on
Tor [19] was proposed, which is also effective against the I2P anonymous network.
In addition to the security issues discovered by these researchers, this paper
analyzed the remaining vulnerability of I2P from three aspects: brute-force
attack, collusion attack and heavy traffic attack. After that, we provided theoretical
Node measurement capture subsystem: there are two measurement methods, active
measurement and passive measurement. Each covers ordinary nodes (also called
router nodes) and FF (FloodingFill) nodes.
1. Active measurement captures I2P nodes by setting the value of
“dontIncludePeers”. The capture system sends a DLM (DatabaseLookupMessage) to
the known FF nodes in the local database and obtains nodes from the reply
packets. It then measures these acquired nodes.
2. Passive measurement captures I2P nodes by setting the value of
“router.floodfillParticipant”. If it is true, the local node is disguised as an
FF node; if false, it runs as an ordinary node. It collects information about
other ordinary nodes that communicate with the local node or exchange routing
information with it. Finally, the collected nodes
108 L. Liu et al.
are saved to the local Network Database for analysis of their node attributes.
The Network Database, also known as NetDB, is built on top of Kademlia and
stores two kinds of information: Routers and LeaseSets. To observe FloodFill
routers, we add a FloodFill router to the NetDB.
Node attribute analysis subsystem: it extracts and preprocesses the data acquired
from the nodes captured above, and analyzes the data from six aspects: key space
distribution, country distribution, distribution of IP addresses over /16
subnets, bandwidth distribution, distribution of I2P software versions and
FloodFill node attributes. Finally, the results are presented as an HTML web
page, and the daily average values of the data are converted into monthly
attribute values.
• Passive Measurement
Passive measurement can run the measurement node either as an ordinary node or as
an FF node to measure traffic. We deploy multiple I2P measurement nodes to
collect information at the same time. The collected information serves as the
initial node library for active measurement.
Running as an Ordinary I2P Node: When an I2P node communicates with a
measurement node, the two exchange routing information with each other. More
specifically, this subsystem can obtain routing information in two ways. First,
other nodes may want to build tunnels through the measurement node; second, when
the local route is updated, the measurement node sends tunnel-building requests
to other nodes. The operation converges when the local measurement data table
remains stable.
Running as a FF Node: Each node in the I2P network contains an ID (identity key)
and a rID (routing key). The ID is used to mark its own identity, which is fixed in its
life cycle. The rID is a route ID generated by the Hash function SHA256 (ID + date),
that is used to calculate the logical distance of each node in the Kad network. To
prevent the Sybil attack, all the I2P network nodes will recalculate the rID every 24 h
and publish their own information to the closest FF node. Thus, running a FF node can
be used as a way to collect I2P nodes. In addition, the communication between any two
nodes starts with the exchange of the router information. Therefore, the deployed FF
node can acquire any node information that interacts with it.
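The daily routing key and the XOR distance used to pick the closest FF node can be sketched as follows. This is an illustration of the description above, not I2P's actual code: the byte-level encoding of the date (an ASCII yyyyMMdd string appended to the ID) is an assumption, and the function names are hypothetical.

```python
import hashlib
from datetime import date


def routing_key(identity: bytes, day: date) -> bytes:
    """rID = SHA256(ID + date); the date is encoded as ASCII yyyyMMdd (assumption)."""
    return hashlib.sha256(identity + day.strftime("%Y%m%d").encode("ascii")).digest()


def xor_distance(a: bytes, b: bytes) -> int:
    """Kademlia-style logical distance: XOR of the two keys read as integers."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def closest_ff(target: bytes, ff_keys):
    """Return the FF routing key with the smallest XOR distance to the target."""
    return min(ff_keys, key=lambda k: xor_distance(target, k))


# Hypothetical identities; real I2P identities are not short ASCII strings.
today, tomorrow = date(2018, 12, 15), date(2018, 12, 16)
node = routing_key(b"ordinary-node", today)
ff_keys = [routing_key(name, today) for name in (b"ff-1", b"ff-2", b"ff-3")]

print(closest_ff(node, ff_keys).hex())
print(routing_key(b"ordinary-node", tomorrow) != node)
```

Because the key changes with the date, the FF node closest to a given router also changes from day to day, so a long-running FF measurement node receives publications from a rotating subset of the network.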
I2P networks often choose nodes with higher bandwidth as FF nodes, and it takes a
period of interaction for a newly started node to switch to an FF node. Therefore,
if we simply run an I2P node and wait for it to become an FF node, a higher
bandwidth and a certain amount of time are necessary. Fortunately, a
configuration statement, “router.floodfillParticipant = true”, can be added to
the I2P software configuration file router.config; it forces the I2P node to run
in FF mode to collect routing nodes in the I2P network.
• Active Measurement
Active measurement supplements passive measurement and takes the passive result
set as its input. There are two steps: querying FF nodes and querying router
nodes. The query process involves three packet types: DLM
(DatabaseLookupMessage) is the query request packet, while DSM
(DatabaseStoreMessage) and DSRM (DatabaseSearchReplyMessage) are query reply
packets. The query type depends on a field in the DLM called “dontIncludePeers”,
and which packet is returned depends on the match result at the FF node. If the
information matches, the reply is a DSM; otherwise a DSRM carries the three nodes
with the shortest XOR distance.
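The reply logic can be simulated with a toy model. The sketch below is not the I2P wire protocol: keys are small integers, the NetDB is a dictionary, and "dontIncludePeers" is treated as a list of keys to leave out of the reply, which is an assumption about how the crawler uses the field. All names are hypothetical.

```python
def handle_lookup(target, netdb, dont_include_peers=(), reply_size=3):
    """Toy FF node: return a DSM on a match, else a DSRM with the closest keys."""
    if target in netdb:
        return ("DSM", netdb[target])
    excluded = set(dont_include_peers)
    candidates = [k for k in netdb if k not in excluded]
    # Closest keys by XOR distance to the requested key.
    closest = sorted(candidates, key=lambda k: k ^ target)[:reply_size]
    return ("DSRM", closest)


def crawl(target, netdb):
    """Repeat the lookup, excluding known keys, until nothing new is returned."""
    seen = []
    while True:
        kind, payload = handle_lookup(target, netdb, dont_include_peers=seen)
        if kind == "DSM" or not payload:
            break
        seen.extend(payload)
    return seen


# Hypothetical NetDB with four entries keyed by 4-bit routing keys.
netdb = {0b0001: "A", 0b0010: "B", 0b0100: "C", 0b1000: "D"}

print(handle_lookup(0b0010, netdb))   # key is stored, so a DSM comes back
print(handle_lookup(0b0011, netdb))   # no match, so a DSRM with three keys
print(crawl(0b0011, netdb))           # repeated queries enumerate the database
```

Each round adds the returned keys to the exclusion list, so a key that is absent from the database lets the crawler walk through all entries in order of XOR distance.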
According to the data published on the I2P website, there are about 30,000
ordinary nodes, and FF peers make up about 6% of all the ordinary nodes.
According to the KAD routing algorithm, a peer prefers to store the peers nearest
to it in the ID space. For nearer ID zones, the crawler crawls every zone. For
farther ID zones, it only crawls the 2^x-th zones. We construct an XOR distance
prefix list. For FF peers, the list contains (5 + 32) prefixes, and for ordinary
routers the list contains (8 + 128) prefixes. Algorithm 1 describes the crawling
method for FF nodes, and Algorithm 2 the one for ordinary nodes.
Table 1 shows the results of the node analysis program on December 15, 2018. The
analysis program runs once every 30 min. The table indicates that the largest
number of routers known to the FF node on that day is 5583, that the number of
local FF nodes is 212, which accounts for 3.8%, and that the total number of
LeaseSets is 14,167. There were 356 NetDB entries at the time. In terms of time,
I2P user activity increased rapidly at 11:00, 15:00, 18:00 and 20:00, which
reveals the temporal regularity of I2P user activity in a coarse-grained way. A
more detailed analysis follows in the next section.
mechanism of I2P, an Internet-scale censorship system can disable the I2P
bootstrapping process by blocking access to the reseed servers. The I2P
developers have foreseen this situation and updated the software to allow manual
reseeding. In fact, the blocking is validated by measuring the reseed servers
provided in the I2P project.
Based on the experiments, we analyze the nodes from three aspects, described
respectively in the following sections: country distribution, bandwidth
distribution and FF node attributes.
Flag   Bandwidth
K      ≤ 12 KBps
L      > 12 KBps and ≤ 48 KBps
M      > 48 KBps and ≤ 64 KBps
N      > 64 KBps and ≤ 128 KBps
O      > 128 KBps and ≤ 256 KBps
P      > 256 KBps and ≤ 2000 KBps
X      > 2000 KBps
We classify the collected I2P nodes according to their bandwidth flags. The
classification results are shown in Table 2. Among them, the N-type is the most
dominant in the network, accounting for 37%, followed by the O-type, accounting
for 29%. Because O-type and P-type nodes have large bandwidth, these two types of
nodes are the most likely to become high-speed nodes, which are more likely to be
used to build client tunnels. Compared with the Tor anonymous network, I2P has
larger traffic and slower speed. Even as network bandwidth increases worldwide,
this explains why the N-type, rather than the P-type or X-type, accounts for the
largest proportion.
Bandwidth Distribution of I2P
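The mapping from shared bandwidth to flag can be expressed as a small lookup. The tier limits are the ones listed in the bandwidth flag table; whether each limit belongs to the lower or the upper class is an assumption here (upper limits are treated as inclusive), and the function name is hypothetical.

```python
# (upper limit in KBps, flag); upper limits are treated as inclusive (assumption).
BANDWIDTH_TIERS = [(12, "K"), (48, "L"), (64, "M"), (128, "N"), (256, "O"), (2000, "P")]


def bandwidth_flag(kbps: float) -> str:
    """Return the I2P bandwidth flag for a shared bandwidth given in KBps."""
    for upper, flag in BANDWIDTH_TIERS:
        if kbps <= upper:
            return flag
    return "X"  # anything above the last tier


for sample in (8, 100, 200, 1500, 5000):
    print(sample, bandwidth_flag(sample))
```

Counting the flags returned for all collected nodes gives the kind of distribution that the classification above reports.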
5 Conclusions
Anonymous networks provide secure, untracked P2P connections and can resist
censorship, which also makes them a hotbed for hackers; the I2P network is an
important one. To understand the I2P network and trace the source of attacks, it
is essential to measure the network. This paper designed a system that performs
anonymous node measurement and analyzes the measurement results. For the former,
we proposed a combined measurement method consisting of active measurement and
passive measurement. For the latter, data analysis is performed from three
aspects: country distribution, bandwidth distribution and FloodFill node
attributes.
Acknowledgement. This work was supported by National Key Research & Development Plan
of China under Grant 2016QY05X1000, and National Natural Science Foundation of China
under Grant No.61771166.
1. Gai, K., Qiu, L., Chen, M., Zhao, H., Qiu, M.: SA-EAST: security-aware efficient data
transmission for its in mobile heterogeneous cloud computing. ACM Trans. Embed.
Comput. Syst. 16(2), 1–22 (2017)
2. Gai, K., Qiu, M.: Blend arithmetic operations on tensor-based fully homomorphic encryption
over real numbers. IEEE Trans. Ind. Inf. 14(8), 3590–3598 (2017)
3. Gai, K., Qiu, M., Ming, Z., Zhao, H., Qiu, L.: Spoofing-jamming attack strategy using
optimal power distributions in wireless smart grid networks. IEEE Trans. Smart Grid 8(5),
2431–2439 (2017)
4. Gai, K., Choo, K.K.R., Qiu, M., Zhu, L.: Privacy-preserving content-oriented wireless
communication in internet-of-things. IEEE Internet Things J. 5(4), 3059–3067 (2018)
5. The onion router (tor) project, official website.
6. Zantout, B., Haraty, R.: I2P data communication system. In: Proceedings of ICN, Citeseer,
pp. 401–409 (2011)
7. Qiu, H., Noura, H., Qiu, M., Ming, Z., Memmi, G.: A user-centric data protection method for
cloud storage based on invertible DWT. IEEE Transactions on Cloud Computing (2019)
8. T. I. Project.: The invisible internet project.
9. I2p source code.
10. Conrad, B., Shirazi, F.: A survey on tor and I2P. In: Ninth International Conference on
Internet Monitoring and Protection (ICIMP), pp. 22–28 (2014)
11. Liu, Z., Liu, Y., Winter, P., Mittal, P., Hu, Y.C.: Torpolice: towards enforcing service-
defined access policies for anonymous communication in the tor network. In: 2017 IEEE
25th International Conference on Network Protocols (ICNP), pp. 1–10 (2017)
12. Liu, P., Wang, L., Tan, Q., Li, Q., Wang, X., Shi, J.: Empirical measurement and analysis of
I2P routers. J. Netw. 9(9), 2269–2279 (2014)
13. Shahbar, K., Zincir-Heywood, A.N.: Weighted factors for measuring anonymity services: a case
study on tor, jondonym, and I2P
14. Shahbar, K., Zincir-Heywood, A.N.: Effects of shared bandwidth on anonymity of the I2P
network users. In: 2017 IEEE Security and Privacy Workshops (SPW), pp. 235–240 (2017)
15. Ye, L., Yu, X., Zhao, J., Zhan, D., Du, X., Guizani, M.: Deciding your own anonymity: user-
oriented node selection in I2P. In: IEEE Access, vol. 6, pp. 71350–71359 (2018)
16. Hoang, N.P., Kintis, P., Antonakakis, M., Polychronakis, M.: An empirical study of the I2P
anonymity network and its censorship resistance. In: Proceedings of the Internet
Measurement Conference (IMC), pp. 379–392 (2018)
17. Timpanaro, J.P., Isabelle, C., Olivier, F.: Monitoring the I2P network. Ph.D. dissertation,
Inria (2011)
18. Herrmann, M., Grothoff, C.: Privacy-implications of performance-based peer selection by
onion-routers: a real-world case study using I2P. In: Fischer-Hübner, S., Hopper, N. (eds.)
PETS 2011. LNCS, vol. 6794, pp. 155–174. Springer, Heidelberg (2011).
19. Yang, Q., Gasti, P., Balagani, K., Li, Y., Gang, Z.: Usb side-channel attack on tor. Comput.
Netw. 141, 57–66 (2018)
20. Maxmind-geoip2.
Plot Digitizing over Big Data Using Beam
1 Introduction
We divide the above procedure into six step-by-step methods: Greyscale
and Binarization, Distill plot region, Exclude sub lines, Denoise, Abstract plots
and Derive plot data. Our previous research [7–9,14,15] has indicated the
convenience of working with big data. Therefore, we create ground truth
alignments based on the GLDAS big data set [1] and apply them to compare the
results of different methods.
2 Methods
As shown in Fig. 1, we apply the following methods step by step to obtain the
desired results. Each method has its input and output, where Image File
represents the specific file whose plot is to be digitized, Mi (i = gs, R, S, L,
D) represents the remainder matrix after each method is applied, Ut represents
the collection list that contains every plot's collection of pixel points, and
Data File represents the file format in which the data are stored.
As Eq. 2 shows, we define the value of each pixel point in the image as Irgb
when processing plot images [5]. In this way we can quantify the value of the
pixel point.
$$I_{rgb} = (F_R, F_G, F_B) \qquad (2)$$
where FR represents the red channel value of the specific point, FG represents
its green channel value, and FB represents its blue channel value. Subsequently,
we compute the corresponding gray-scale value Igs for Irgb using standard
formulas, according to whether the metadata specifies the sRGB or the linear
color space.
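The gray-scale step can be sketched as follows. The text does not state which formulas are used, so this example assumes a common choice: the Rec. 709 luminance weights applied to linear channel values, with sRGB input first linearized through the standard sRGB transfer function. The function names and the 0 to 255 input range are assumptions.

```python
def srgb_to_linear(c: float) -> float:
    """Invert the sRGB transfer function for one channel value in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def to_gray(rgb, space="srgb"):
    """Linear luminance in [0, 1] from an (R, G, B) tuple of 0-255 values."""
    r, g, b = (v / 255.0 for v in rgb)
    if space == "srgb":
        # Gamma-encoded input must be linearized before weighting.
        r, g, b = srgb_to_linear(r), srgb_to_linear(g), srgb_to_linear(b)
    # Rec. 709 luminance weights (assumed choice of coefficients).
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


print(to_gray((128, 128, 128), space="srgb"))
print(to_gray((128, 128, 128), space="linear"))
```

The same mid-gray pixel yields a noticeably lower luminance when interpreted as sRGB, which is why the color space in the metadata affects the later threshold.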
Plots usually appear as the darker components of the image, so the plot part
must be differentiated from the noise part in the gray-scale value matrix Mgs.
We set the gray level of Igs to m and use a fast threshold segmentation
method [13] to calculate the threshold t of each image.
$$F(x) = \begin{cases} 1 & \text{if } x < t \\ 0 & \text{otherwise} \end{cases} \qquad (3)$$
After the threshold t has been obtained, we binarize each gray-scale value Igs
in the matrix Mgs using the truncation function in Eq. 3. After the
binarization, we obtain the remainder matrix Mr, whose entries are all 0 or 1.
Thus, we have completed the binarization of the RGB image, which is
convenient for subsequent processing.
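A minimal sketch of thresholding and binarization follows. The fast threshold segmentation method of [13] is not described here, so Otsu's method (choosing the threshold that maximizes the between-class variance) is used as a stand-in; it is a swapped-in technique, not necessarily the one the authors use. The binarization itself follows Eq. 3: values below t become 1, all others 0.

```python
def otsu_threshold(gray, levels=256):
    """Threshold t maximizing between-class variance; class 0 holds values < t."""
    hist = [0] * levels
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels below the candidate threshold
    sum0 = 0  # sum of gray values below the candidate threshold
    for t in range(1, levels):
        w0 += hist[t - 1]
        sum0 += (t - 1) * hist[t - 1]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(gray, t):
    """Eq. 3: F(x) = 1 if x < t, else 0, so dark plot pixels become 1."""
    return [[1 if v < t else 0 for v in row] for row in gray]


# Hypothetical 2 x 3 gray image with integer levels in [0, 255].
gray = [[10, 12, 200], [11, 210, 205]]
t = otsu_threshold(gray)
print(t, binarize(gray, t))
```

Any threshold between the dark and the bright cluster yields the same binary matrix for this toy image.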
A standard plot image generally has two coordinate axes, the horizontal axis and
the vertical axis. This means that the image contains two straight lines, one
horizontal and one vertical, which correspond to two point collections in the
matrix Mr, one along the x direction and one along the y direction. Here we use
the Hough transform [12] to detect the lines, and we take the maximum and
minimum values from Ux and Uy as Eq. 4 shows:
$$x_{min} = \min(x_{x1}, x_{x2}, \ldots, x_{xn}), \quad x_{max} = \max(x_{x1}, x_{x2}, \ldots, x_{xn})$$
$$y_{min} = \min(y_{y1}, y_{y2}, \ldots, y_{yn}), \quad y_{max} = \max(y_{y1}, y_{y2}, \ldots, y_{yn}) \qquad (4)$$
where $x_{x1}$, $x_{x2}$ and $x_{xn}$ represent the unordered sequence of x-axis
coordinate values of all pixel points in Ux, and $y_{y1}$, $y_{y2}$ and $y_{yn}$
represent the unordered sequence of y-axis coordinate values of all pixel points
in Uy. We use xmin, xmax, ymin, ymax to separate the substantive plot region from
the image as Eq. 5 shows, for the purpose of distilling the plot region.
where crad(U1 ), crad(U2 ) and crad(UI−1 ) represent the lengths of the unordered
sequences of the collections Ui in the list Lu , so that the collection Um
represents the largest connected region in the image. Then we detect the
maximum and minimum values from Um , which satisfy the condition in Eq. 4.
Then we use xmin , xmax , ymin , ymax to extract the substantive region of the plot
from the image as in Eq. 12:
the matrix Mc . If the number Sblack is greater than the threshold γcrossover , we
presume the point P (x, y) to be an intersection point on the plot; otherwise we
set the value of the point P (x, y) in the matrix MS to zero.
After performing the steps described above, we obtain the remainder matrix ML
without sub-lines.
2.4 Denoise
We use the fuzzy similarity-based filter [10] to remove the remaining noise
points in the matrix ML . We examine each point P (x, y) in the matrix ML ,
starting with the point P (0, 0), and create the 8-connected neighborhood matrix
$$M_{8\text{-connected}} = \begin{bmatrix} P_1 & P_2 & P_3 \\ P_4 & P & P_5 \\ P_6 & P_7 & P_8 \end{bmatrix}.$$
Then we count the number Sblack of points Pi (xi , yi ) whose value equals one in
the matrix M8−connected . If Sblack is less than the threshold γsalt , we presume
the point P (x, y) to be a remaining noise point and set its value to zero.
Consequently, we obtain the remainder matrix MD .
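The neighbor-counting step described above can be sketched as follows. This is a minimal illustration, not the filter of [10]: the function name is hypothetical, the image border is assumed to be padded with zeros, and all counts are taken from the input matrix at once rather than point by point.

```python
import numpy as np

def remove_isolated_points(m_l, gamma_salt):
    """Zero every point that has fewer than gamma_salt foreground 8-neighbours."""
    m = np.asarray(m_l, dtype=np.uint8)
    padded = np.pad(m, 1)  # zero border so that edge points also have 8 neighbours
    h, w = m.shape
    s_black = np.zeros((h, w), dtype=np.int32)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue  # skip the centre point P itself
            s_black += padded[1 + dy : 1 + dy + h, 1 + dx : 1 + dx + w]
    m_d = m.copy()
    m_d[s_black < gamma_salt] = 0
    return m_d
```

With gamma_salt set to 1, only points that have no foreground neighbor at all are removed, so thin curves survive while single-pixel noise is cleared.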
After the above preparation, we detect the start point Pstarti (x, y) of each
plot Li , which satisfies the following condition, Eq. 14:
x → min (x1 , x2 , . . . , xn )
y → max (y1 , y2 , . . . , yn ) min (y1 , y2 , . . . , yn )
First, we select a value B as the beam width, which indicates the result of
selecting the maximum conditional probability Pmax B each time. Then we set
i + 1 → i, where i represents the total number of plots. Afterwards, we create a
point list Bi , set the variable 0 → i and select B connected areas as the
candidate result collection list U0 for plot Li , which start with the point
Pstarti ; this is equivalent to finding B clusters of Pstarti in the partial
region. If the size of list U0 is less than B, we infer that there are no more
residual plots and terminate the algorithm.
Second, if the value t ∗ w is less than the maximum of the x-axis in matrix MS ,
we split matrix MS with the threshold μ into the matrix Mft = MS (t ∗ w : t ∗ w+, :)
and create the collection list Bit . Otherwise, we assume that the number of
remaining points on plot Li in matrix Mft equals zero. Then we assign the points
Pstarti , Pi 1 , Pi 2 and Pendi from the first collection in Ut−1 as the points of
plot i, push them into collection Li and return to the first step.
Fourth, we compute the conditional probability of each collection Ck in the
collection list Bit given the candidate result collection list Ut−1 . The
procedure is as follows. Initially, we detect the point PS (xs , ys ) in
collection Ck , which satisfies the following condition, Eq. 16:
Pstarti (xstarti , ystarti ) and the end point Pendi (xendi , yendi ), which match the
following condition 18:
xstarti → max (x1 , x2 , . . . , xn )
xendi → min (x1 , x2 , . . . , xn )
where Lx−axis represents the length of x-axis in matrix MS , Xend represents the
value of the end point on the x-axis, Xstart represents the value of the beginning
point on the x-axis, Xscale represents the graduation value of x-axis, Ly−axis
represents the length of y-axis in matrix MS , Yend represents the value of the
end point on the y-axis, Ystart represents the value of the beginning point on the
y-axis and Yscale represents the graduation value of y-axis.
Third, export the data list Di as the prototype file.
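The general idea of tracing a curve strip by strip while keeping only the B best partial results can be sketched as below. This is a generic beam search under stated assumptions, not the authors' procedure: the function name, the representation of each strip as a list of candidate y coordinates, and the score (negative sum of vertical jumps, standing in for the conditional probability) are all hypothetical.

```python
def beam_search(columns, start_y, beam_width):
    """Trace one curve left to right, keeping the beam_width best partial paths.

    columns[t] lists the candidate y coordinates found in strip t.
    A path is scored by the negative sum of its vertical jumps (higher is better).
    """
    beams = [(0.0, [start_y])]
    for candidates in columns:
        if not candidates:
            break  # no remaining points for this plot
        expanded = [
            (score - abs(y - path[-1]), path + [y])
            for score, path in beams
            for y in candidates
        ]
        expanded.sort(key=lambda item: item[0], reverse=True)
        beams = expanded[:beam_width]  # prune to the beam width B
    return beams[0][1]
```

A larger beam width keeps more alternatives alive at crossings between curves, at the cost of more candidates to score per strip.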
3 Results
First, we download the GLDAS data set from EARTH DATA, whose dates range from
2000-01-01 to 2019-05-31. Then we generate the big data set F using Python with
the Matplotlib library, which use random dishes. The total number of data items
is 1,101,600+.
Second, we choose Gehre's method as algorithm A, Kai's method as algorithm B,
Bronstein's method as algorithm C, and the method we propose above as
algorithm D. Then we compare the average result in run time, number of curves
We extract 100,000 images from the big data set F to compare the running
time of the four algorithms on targets with different numbers of curves.
As Fig. 2 shows, when the number of curves is 1, the four algorithms have the
same running time. When the number of curves reaches five, the speed of
algorithm B decreases significantly. When the number of curves reaches ten,
the running time of algorithm C starts to diverge from that of algorithms A
and D. Algorithms A and D still have similar running times until the number of
curves reaches 20.
Then We compare the number of curves differentiated using the four algo-
Fig. 3. The average percent result of the curves each algorithm differentiated
4 Discussions
The simulation shows that the proposed algorithm performs at the state of the
art in plot digitization in the current research context, and that its time
complexity can meet the requirements of big data processing. If the algorithm
is applied in the mobile field, it can meet the needs of ordinary users for
real-time tasks. Subsequent work will continue to improve the processing speed
and recognition accuracy of the algorithm and extend it to more complex images.
1. Fang, H., Beaudoing, H.K., Teng, W.L., Vollmer, B.E., et al.: Global land data
assimilation system (GLDAS) products, services and application from NASA
hydrology data and information services center (HDISC) (2009)
2. Gehre, A., Bronstein, M., Kobbelt, L., Solomon, J.: Interactive curve constrained
functional maps. In: Computer Graphics Forum, vol. 37, pp. 1–12. Wiley Online
Library (2018)
3. Lee, K.W., Bo, P.: Feature curve extraction from point clouds via developable strip
intersection. J. Comput. Design Eng. 3(2), 102–111 (2016)
4. Lei, X., Ouyang, H.: Image segmentation algorithm based on improved fuzzy clus-
tering. Clust. Comput. 1–11 (2018)
5. Liu, C., Chen, X., Wu, Y.: Modified grey world method to detect and restore colour
cast images. IET Image Process. 13(7), 1090–1096 (2019)
6. Ow, P.S., Morton, T.E.: Filtered beam search in scheduling. Int. J. Prod. Res.
26(1), 35–62 (1988)
7. Qi, L., Chen, Y., Yuan, Y., Fu, S., Zhang, X., Xu, X.: A QoS-aware virtual machine
scheduling method for energy conservation in cloud-based cyber-physical systems.
World Wide Web 1–23 (2019)
8. Qi, L., et al.: Finding all you need: web APIs recommendation in web of things
through keywords search. IEEE Trans. Comput. Soc. Syst. 6(5), 1063–1072 (2019)
9. Qi, L., et al.: Structural balance theory-based e-commerce recommendation over
big rating data. IEEE Trans. Big Data 4(3), 301–312 (2016)
10. Torrente, M.L., Biasotti, S., Falcidieno, B.: Recognition of feature curves on 3D
shapes using an algebraic approach to hough transforms. Pattern Recogn. 73, 111–
130 (2018)
11. Tripta, F.A., Kumar, S.B.A., Saha, T.C.S.: Wavelet decomposition based channel
estimation and digital domain self-interference cancellation in in-band full-duplex
OFDM systems. In: 2019 URSI Asia-Pacific Radio Science Conference (AP-RASC),
pp. 1–4. IEEE (2019)
12. Wei, Q., Feng, D., Zheng, W.: Funnel transform for straight line detection. arXiv
preprint arXiv:1904.09409 (2019)
13. Xie, D.H., Lu, M., Xie, Y.F., Liu, D., Li, X.: A fast threshold segmentation method
for froth image base on the pixel distribution characteristic. PloS One 14(1),
e0210411 (2019)
14. Xu, X., Dou, W., Zhang, X., Chen, J.: EnReal: an energy-aware resource allocation
method for scientific workflow executions in cloud environment. IEEE Trans. Cloud
Comput. 4(2), 166–179 (2015)
15. Xu, X., Liu, Q., Zhang, X., Zhang, J., Qi, L., Dou, W.: A blockchain-powered
crowdsourcing method with privacy preservation in mobile environment. IEEE
Trans. Comput. Soc. Syst. 1–13 (2019)
Trust-Aware Resource Provisioning
for Meteorological Workflow in Cloud
1 Introduction
Cloud computing has evolved over a decade to support the Industrial Internet
of Things (IIoT) with credible, concurrent and universal access to high-end
computing capabilities, and it is able to satisfy the demands of IIoT
applications that need to analyze large-scale data [1,2]. As an important
branch of IIoT, the meteorological department holds large-scale data that
grows at a daily rate of 12 TB. Meanwhile, meteorological applications (e.g.,
prediction of natural disasters, synchronization of information, etc.) require
huge physical resources and sufficient computing power to store and analyze
the collected datasets [3,4]. Thus, cloud computing is fully leveraged to
accommodate the meteorological services and data [5].
© Springer Nature Switzerland AG 2019
M. Qiu (Ed.): SmartCom 2019, LNCS 11910, pp. 126–135, 2019.
– Use Simple Additive Weighting (SAW) and Multiple Criteria Decision Making
(MCDM) to find the best fault task offloading strategy from the solution set.
– A large number of experimental results obtained by multiple comparison
experiments with Benchmark, BFD and FFD demonstrate the effectiveness
of ODPM.
To recover these fault tasks wti (1 < i < m), the data generated by their
parent tasks needs to be acquired. Then, these fault tasks are restarted on
other computing nodes. However, it takes time for the offloading target
computing nodes to acquire the data from different computing nodes, and this
time cannot be ignored in the cloud. In this case, the recovery time
RTi (1 < i < m) of the fault task wti is calculated by
$$RT_i = rt_i^s + rt_i^m + rt_i^f, \quad (1)$$
where $rt_i^s$ denotes the startup time of task wti on other nodes in the cloud,
$rt_i^m$ denotes the time for downloading the mirror of task wti from the task
mirror nodes, and $rt_i^f$ represents the longest time to get data from parent
tasks.
Suppose that wti is to be restarted at $p_i^s$ (1 < i < l); then $p_i^s$ needs to
download the mirror of wti and the data required for the execution of wti from
different computing nodes. According to the VL2 network topology described
above, assume that the data required to restart wti includes the task mirror
and the data from its parent task, which are stored in $p_a^m$ (1 < a < k) and
$p_j^s$ respectively. The relationship between $p_i^s$ and $p_a^m$ is expressed
by ξ. There are three cases of ξ: (a) ξ = 0 means that $p_i^s$ and $p_a^m$ are in
the same TOR; (b) ξ = 1 means that they are not in the same TOR but in the same
aggregation switch; (c) ξ = 2 means that they are not in the same TOR switch or
aggregation switch, but in the same intermediate switch. Besides, the
relationship between $p_i^s$ and $p_j^s$ is also expressed by ξ, and the cases
of ξ are consistent with the discussion above.
Assume that the mirror size of the fault task ti is $d_i^m$. In this case, the
transmission time of the fault task mirror of ti is given by
$$rt_i^m = \begin{cases} 2 \cdot d_i^m / w_{st}, & \text{if } \xi = 0, \\ 2 \cdot (d_i^m / w_{st} + d_i^m / w_{ta}), & \text{if } \xi = 1, \\ 2 \cdot (d_i^m / w_{st} + d_i^m / w_{ta} + d_i^m / w_{au}), & \text{if } \xi = 2, \end{cases} \quad (2)$$
where wst is the bandwidth of two nodes in the same TOR, wta is the bandwidth
between the TOR switch and the aggregation switch, and wau is the bandwidth
between the aggregation switch and the intermediate switch.
After the task mirror transmission is completed, the task execution needs to
get data from its parent tasks. Suppose that tj is a parent task of ti and the
data size from tj is $d_j^f$. The transmission time $rt_{i,j}^f$ between ti and tj
is measured by
$$rt_{i,j}^f = \begin{cases} 2 \cdot d_j^f / w_{st}, & \text{if } \xi = 0, \\ 2 \cdot (d_j^f / w_{st} + d_j^f / w_{ta}), & \text{if } \xi = 1, \\ 2 \cdot (d_j^f / w_{st} + d_j^f / w_{ta} + d_j^f / w_{au}), & \text{if } \xi = 2. \end{cases} \quad (3)$$
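Eqs. 2 and 3 share the same piecewise form, so a single helper can evaluate both. The sketch below is illustrative; the function name and the numbers in the usage lines are hypothetical, and the data size and bandwidths are assumed to use consistent units.

```python
def transfer_time(size, xi, w_st, w_ta, w_au):
    """Transmission time of `size` units of data for locality case xi (Eqs. 2 and 3)."""
    if xi == 0:
        return 2 * size / w_st
    if xi == 1:
        return 2 * (size / w_st + size / w_ta)
    if xi == 2:
        return 2 * (size / w_st + size / w_ta + size / w_au)
    raise ValueError("xi must be 0, 1 or 2")

# Hypothetical values: 100 units of data, bandwidths 10, 20 and 50.
same_tor = transfer_time(100, 0, 10, 20, 50)
via_intermediate = transfer_time(100, 2, 10, 20, 50)
```

Each additional switch layer adds one more term, so the time grows monotonically with ξ for a fixed data size.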
A task in a workflow may have several parent tasks, so restarting the fault
task requires getting data from all of them. The data from the parent tasks is
fetched simultaneously over multiple paths. Therefore, when calculating the
data acquisition time, the longest transmission time is selected as the time
required for the recovery task, and the longest time required to transfer data
from parent tasks is measured by
note σ(ti ) as a flag to estimate whether the task ti is deployed on pb (1 < b < l),
which is measured by
$$\sigma(t_i) = \begin{cases} 1, & \text{if } t_i \text{ runs on } p_b, \\ 0, & \text{otherwise.} \end{cases} \quad (6)$$
minD , U. (9)
4.3 Iteration
In the iteration process of NSGA-II, the parent population Ho generates the
child population Qo through selection and mutation, and all individuals from
the parent population and the child population are then merged into Ro . The
algorithm first selects the appropriate individuals from Ro through
non-dominated sorting, then selects N individuals to form the new population
Ht+1 by crowding distance sorting. The iteration process of NSGA-II stops after
the result converges. The current population then forms a solution set, which
is called the Pareto optimal solution set.
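The notion of non-domination that underlies this selection can be illustrated with a small filter. This is only a sketch of the dominance test, not an implementation of NSGA-II: it assumes both objectives are minimized, and the function name and sample points are hypothetical.

```python
def pareto_front(points):
    """Return the points that no other point dominates (all objectives minimised)."""
    def dominates(a, b):
        # a dominates b if it is no worse everywhere and strictly better somewhere.
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (completion time, load balance) pairs for four strategies.
front = pareto_front([(1, 5), (2, 2), (3, 3), (5, 1)])
```

Here (3, 3) is dropped because (2, 2) is better in both objectives, while the other three points are mutually incomparable.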
strategy of the down task. Assume that $E_D^i$ and $U_\chi^i$ represent the values
of the two objective functions of $x_i$. In addition, ρ1 and ρ2 indicate the
weights of the workflow completion time and the load balancing indicator,
respectively. $E_D^{max}$, $E_D^{min}$, $U_\chi^{max}$ and $U_\chi^{min}$ indicate the
maximum and minimum values of the two objective functions, respectively. Thus,
the utility value θi of $x_i$ is calculated by
$$\theta_i = \rho_1 \cdot \frac{E_D^{max} - E_D^i}{E_D^{min} - E_D^i} + \rho_2 \cdot \frac{U_\chi^{max} - U_\chi^i}{U_\chi^{min} - U_\chi^i} \quad (12)$$
6 Related Work
1. Maenhaut, P.-J., Moens, H., Volckaert, B., Ongenae, V., De Turck, F.: Resource
allocation in the cloud: from simulation to experimental validation. In: 2017 IEEE
10th International Conference on Cloud Computing (CLOUD), pp. 701–704. IEEE
2. Xie, X., Yuan, T., Zhou, X., Cheng, X.: Research on trust model in container-based
cloud service. Comput. Mater. Continua 56(2), 273–283 (2018)
3. Botta, A., De Donato, W., Persico, V., Pescapé, A.: Integration of cloud computing
and internet of things: a survey. Future Gener. Comput. Syst. 56, 684–700 (2016)
4. Zhang, J., Xie, N., Zhang, X., Yue, K., Li, W., Kumar, D.: Machine learning based
resource allocation of cloud computing in auction. Comput. Mater. Continua 56(1),
123–135 (2018)
5. Wu, Q., Ishikawa, F., Zhu, Q., Xia, Y., Wen, J.: Deadline-constrained cost optimiza-
tion approaches for workflow scheduling in clouds. IEEE Trans. Parallel Distrib.
Syst. 28(12), 3401–3412 (2017)
6. Xu, X., Dou, W., Zhang, X., Chen, J.: EnReal: an energy-aware resource allocation
method for scientific workflow executions in cloud environment. IEEE Trans. Cloud
Comput. 4(2), 166–179 (2015)
7. Duan, R., Prodan, R., Li, X.: Multi-objective game theoretic scheduling of
bag-of-tasks workflows on hybrid clouds. IEEE Trans. Cloud Comput. 2(1), 29–42 (2014)
8. Qi, L., et al.: Structural balance theory-based e-commerce recommendation over
big rating data. IEEE Trans. Big Data 4(3), 301–312 (2016)
9. Qi, L., Chen, Y., Yuan, Y., Fu, S., Zhang, X., Xu, X.: A QoS-aware virtual machine
scheduling method for energy conservation in cloud-based cyber-physical systems.
World Wide Web 4(3), 1–23 (2019)
10. Li, Z., Ge, J., Hu, H., Song, W., Hu, H., Luo, B.: Cost and energy aware scheduling
algorithm for scientific workflows with deadline constraint in clouds. IEEE Trans.
Serv. Comput. 11(4), 713–726 (2015)
11. Chaisiri, S., Lee, B.-S., Niyato, D.: Optimization of resource provisioning cost in
cloud computing. IEEE Trans. Serv. Comput. 5(2), 164–177 (2011)
12. Greenberg, A., et al.: VL2: a scalable and flexible data center network. In: ACM
SIGCOMM Computer Communication Review, vol. 39, pp. 51–62. ACM (2009)
13. Rankothge, W., Le, F., Russo, A., Lobo, J.: Optimizing resource allocation for
virtualized network functions in a cloud center using genetic algorithms. IEEE
Trans. Netw. Serv. Manage. 14(2), 343–356 (2017)
14. Xia, Z., Wang, X., Sun, X., Wang, Q.: A secure and dynamic multi-keyword ranked
search scheme over encrypted cloud data. IEEE Trans. Parallel Distrib. Syst. 27(2),
340–352 (2015)
15. Xu, X., Liu, Q., Zhang, X., Zhang, J., Qi, L., Dou, W.: A blockchain-powered
crowdsourcing method with privacy preservation in mobile environment. IEEE
Trans. Comput. Soc. Syst. 340–352 (2019)
16. Sadooghi, I., et al.: Understanding the performance and potential of cloud comput-
ing for scientific applications. IEEE Trans. Cloud Comput. 5(2), 358–371 (2015)
17. Liu, J., Pacitti, E., Valduriez, P., Mattoso, M.: A survey of data-intensive scientific
workflow management. J. Grid Comput. 13(4), 457–493 (2015)
18. Asvija, B., Shamjith, K., Sridharan, R., Chattopadhyay, S.: Provisioning the MM5
meteorological model as grid scientific workflow. In: 2010 International Conference
on Intelligent Networking and Collaborative Systems, pp. 310–314. IEEE (2010)
19. Chen, X., Wei, M., Sun, J.: Workflow-based platform design and implementation
for numerical weather prediction models and meteorological data service. Atmos.
Clim. Sci. 7(03), 337 (2017)
20. Qi, L., et al.: Finding all you need: web APIs recommendation in web of things
through keywords search. IEEE Trans. Comput. Soc. Syst. 337–351 (2019)
21. Ostermann, S., Prodan, R., Schüller, F., Mayr, G.J.: Meteorological applications
utilizing grid and cloud computing. In: 2014 IEEE 3rd International Conference
on Cloud Networking (CloudNet), pp. 33–39. IEEE (2014)
Do Top Social Apps Effect Voice Call?
Evidence from Propensity Score
Matching Methods
Hao Jiang1 , Min Lin2 , Bingqing Liu1(B) , Huifang Liu2 , Yuanyuan Zeng1(B) ,
He Nai1 , Xiaoli Zhang1 , Xianlong Zhao3 , Wen Du4 , and Haining Ye2
1 School of Electronic Information, Wuhan University, Wuhan, China
2 China United Telecommunications Co., Ltd., Guangzhou, China
3 Beijing Smart Chip Microelectronics Co., Ltd., Beijing, China
4 Shanghai DS Communication Equipment Co., Ltd., Shanghai, China
Abstract. Mobile social APPs greatly enrich the way people communicate with
each other. It has been argued that the use of mobile social APPs may influence
users' mobile phone call behaviour, as more and more people are used to using
mobile social APPs for voice or video calls. Although mobile social APPs have
penetrated into every aspect of our daily lives, so far there is no convincing
research showing how they influence the use of traditional mobile phone calls.
We use the potential outcomes model to study the causal effects of the frequent
use of mobile social APPs on mobile phone calls. The propensity score matching
method is used for bias adjustment. Moreover, a sensitivity analysis is
conducted to test whether the results remain robust in the presence of hidden
biases. The results suggest statistically significant positive effects of
frequent use of WeChat on traditional mobile phone calls. For QQ, however, we
found that frequent use reduces mobile phone calls. The conclusion provides a
new theoretical feature for business package recommendation, namely, the
frequency of use of mobile social APPs. Users who use WeChat frequently are
better served by business packages with high call duration, and users who use
QQ frequently are better served by business packages with low call duration,
which further enriches the method of business package recommendation.
1 Introduction
The latest data released by QuestMobile showed that mobile social users in China
reached 1.104 billion by February 2019. As an emerging Internet technology and
communication channel, mobile social APPs have an ever-increasing impact on
people's work and life [19]. From "on the phone" to "on-line chat", social
APPs such as WeChat and QQ have enriched the way people communicate.
However, the rapid development of various online social APPs did not completely
replace traditional mobile phone calls. Compared with Internet telephony, the
signal is more stable when communicating using mobile phone calls, and it is
more efficient to make a phone call when the network is in poor condition. Both
social APPs and smart-phones are important communication methods for consumers,
so it is meaningful for business cooperation to explore the relationship between
social APPs and traditional voice.
Frequent use of mobile social APPs may have two effects on using phones for
voice calls. One view is that mobile social APPs can increase interaction between
people [15,17]. However, it is also argued that mobile social APPs make people’s
expressions more fragmented and symbolized [20], and more and more people
are unwilling to call or receive calls because it is mandatory and immediate.
This paper aims to assess the causal effects of frequent use of WeChat and
QQ on mobile phone calls respectively. The WeChat and QQ visit records and the
offline call data records provided by China Unicom are used for empirical
research. Based on the potential outcomes framework, we use the propensity
score matching method to control for the confounding that may occur when
evaluating the effect, in order to get an accurate estimate of the impact of
frequent use of WeChat and QQ on mobile phone calls.
The rest of this paper is organized as follows. The related work is discussed
in Sect. 2. Section 3 introduces the data and the model used for the analysis,
and Sect. 4 evaluates and analyses the experimental results. Finally, the paper
is concluded in Sect. 5.
2 Related Work
The mobile Internet has developed rapidly and gradually penetrated into all
aspects of people's lives and work. With the rapid development of 3G, 4G and
even 5G technology, the network is in our pockets. More and more people are
inseparable from mobile phones. At the same time, the birth of many mobile
social APPs has greatly enriched the way people communicate, and the connection
between people is becoming more and more convenient and close. Many scholars
have begun to explore the impact of mobile social APPs on people's real life [7].
Althoff et al. [1] studied the impact of social APPs on the online and offline
behavior of users. By analyzing the online and offline behavior of 6 million
users over 5 years, the paper concludes that social networks can significantly
increase users' online and offline activities. Based on household survey data
from Italy, Sabatini et al. [14] found that the use of social networking sites
(SNS) has a negative effect on social trust.
There is also a considerable body of literature on the impact of social APPs
on physical health and mental health [11,16]. In studies [9,10], the scholars
found a slight positive effect of social APPs on health behavior change. Burke
et al. [4] conducted a longitudinal study of 1,200 Facebook users and analyzed
a series of behaviors on Facebook (such as likes, private messages, comments,
etc.). Their conclusion is that a social APP alone does not make people feel
unhappy and lonely, but some of its functions and features can easily affect
users adversely.
Also, the relationship between online social network based on social APPs
and offline social network has been studied by some researchers [6]. Reich et
al. [2,12] found a moderate overlap between teenagers’ closest online and offline
friends, which indicated that the online context is used to strengthen offline
relationships. Using Facebook users as a research sample [5], the researchers
found that the size and scope of social networks on mobile social APPs are
similar to offline social networks.
These findings provide a better understanding of how mobile social APPs
affect user engagement behavior and provide guidance for improving the design
of the APPs. However, most studies of the impact of mobile social APPs on
people's behavior rely on tracking surveys or questionnaires. User behavior is
often affected by many variables. In some empirical studies, the authors did
not consider the existence of selection bias. Without controlling for selection
bias, conclusions obtained by calculating correlation coefficients are often
not convincing.
(Y 1 , Y 0 ) ⊥ Z (1)
a parameter. Eq. (1) is also called the strongly ignorable treatment assignment
assumption, which is a necessary assumption for reliable cause-and-effect
conclusions. In an actual study, it is impossible to find two people who are
exactly the same in all aspects. Other personal factors may be overlooked when
studying the impact of frequent use of mobile social APPs on mobile phone
calls. For example, users who frequently use social APPs may use mobile phone
calls more frequently than the control group because of differences in
personality or original social circles. In this case, we cannot clearly know
how the mobile social APP influences phone voice calls. In our analysis, we
control as far as possible the covariates that may affect the assignment and
the experimental results. Equation (2) shows the mathematical expression under
this condition.
(Y 1 , Y 0 ) ⊥ Z | X (2)
Eq. (2) relaxes the requirement of Eq. (1). The probability that a user is
assigned to the experimental group or the control group does not need to be
strictly equal; it only needs to be equal after conditioning on the covariates.
where μ0 and μ1 represent the mean of the covariate in the experimental group
and the control group, respectively, and $s_0^2$ and $s_1^2$ represent the
variance of the covariate in the experimental group and the control group. To
deal with multidimensional data, the propensity score is used as a
one-dimensional summary of all covariates, and we check whether the matched
covariates lack overlap.
users’ online and call records, the research in this paper belongs to observational
research. Observational research differs from experimental research in that
the assignment of treatment cannot be controlled. Also, it is impossible to
simultaneously observe the user's two different habits of mobile social APPs
Propensity score matching (PSM) means that the experimental group and the
control group are screened by a statistical method, so that the selected
research objects are comparable in the presence of confounding factors. The
PSM method approximates random assignment of the treatment variable. At the
same time, the method controls the confounding variables, so that matched users
in the experimental group and the control group have the same probability of
frequently using the mobile social APP. The two groups thus become comparable,
which simulates an experiment. Therefore, the PSM [3] method is used to select
users in the control group that are similar in all aspects to those in the
experimental group.
We divided the samples into treatment and control groups according to the
treatment assignment variable. Users in the experimental group have behaviors
that frequently use a certain mobile social APP. Conversely, users of the control
group do not use the mobile social APP very often. The propensity score is
the probability that the user is assigned to the experimental group given the
covariate, which is calculated as Eq. (4).
$$e(X) = \Pr(Z = 1 \mid X) = \frac{\exp(\beta X)}{1 + \exp(\beta X)} \quad (4)$$
After obtaining the estimated value from Eq. (4), we have the probability that
the user frequently uses a certain mobile social APP, that is, the PS
(propensity score). When multidimensional covariates exist, the
one-dimensional PS is used to find matching users between the experimental
group and the control group. To satisfy the ignorable treatment assignment
assumption, we first choose covariates that affect both the frequent use of
mobile social APPs and mobile phone calls. Then, the logistic regression model
is used to estimate the probability that the user will frequently use the
mobile social APP given the covariates. Machine learning methods, including a
neural network model and an XGBoost model, are also used to estimate the
propensity score, since they make fewer assumptions. The neural network model
can deal with high-dimensional data without explicitly modeling the
interactions or higher-order forms that may exist in the covariates.
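The logistic regression estimate of the propensity score can be sketched without any statistics package. This is a minimal illustration under stated assumptions: the function name, the plain gradient descent fit, the learning rate and the number of steps are all hypothetical choices and not the authors' implementation.

```python
import numpy as np

def fit_propensity(X, z, lr=0.1, steps=2000):
    """Estimate e(X) = Pr(Z = 1 | X) by logistic regression with gradient descent."""
    X = np.asarray(X, dtype=np.float64)
    z = np.asarray(z, dtype=np.float64)
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend an intercept column
    beta = np.zeros(Xb.shape[1])
    for _ in range(steps):
        e = 1.0 / (1.0 + np.exp(-Xb @ beta))       # same value as exp(bX) / (1 + exp(bX))
        beta -= lr * Xb.T @ (e - z) / len(z)       # gradient of the mean log-loss
    return 1.0 / (1.0 + np.exp(-Xb @ beta))
```

The returned vector holds one score per user; in practice the covariates would usually be standardized first so that a single learning rate suits all of them.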
Common matching methods include 1:K nearest neighbor matching and radius
matching. Nearest neighbor matching can be performed with or without
replacement of the control samples. In the following empirical analysis, we
select the best matching method based on the balance of the covariates.
After the matching, we analyze the causal effects of frequent use of mobile
social APPs on mobile phone calls by comparing the values of the dependent
variables in the experimental and control groups. This paper focuses on the
effect of frequent use of mobile social APPs on mobile phone calls: after
frequent use of the mobile social APPs, how does the use of mobile phone calls
change? For the samples in the experimental group, we are unable to observe the
behavior of users when they do not use mobile social APPs very often. So we use
the propensity score matching method, under the assumption of ignorable
treatment assignment, to select samples from the control group instead. The
formula that estimates the ATT (average treatment effect on the treated) of the
experimental group is shown in Eq. 5.
$$\begin{aligned} ATT &= E(Y^1 - Y^0 \mid Z = 1) \\ &= E(Y^1 \mid Z = 1) - E(Y^0 \mid Z = 1) \\ &= E(Y^1 \mid Z = 1, e(X)) - E(Y^0 \mid Z = 0, e(X)) \end{aligned} \quad (5)$$
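The last line of Eq. 5 can be illustrated with 1:1 nearest neighbor matching on the propensity score. The sketch below assumes matching with replacement and no caliper; the function name and the sample numbers are hypothetical and do not come from the paper's data.

```python
import numpy as np

def att_nearest_neighbor(ps, z, y):
    """ATT from 1:1 nearest-neighbour matching on the propensity score (with replacement)."""
    ps, z, y = (np.asarray(a, dtype=np.float64) for a in (ps, z, y))
    treated = np.flatnonzero(z == 1)
    control = np.flatnonzero(z == 0)
    # For each treated user, find the control user with the closest score.
    gaps = np.abs(ps[treated][:, None] - ps[control][None, :])
    matched = control[gaps.argmin(axis=1)]
    return float(np.mean(y[treated] - y[matched]))

# Hypothetical scores, treatment flags and call counts for five users.
att = att_nearest_neighbor(
    ps=[0.80, 0.60, 0.79, 0.61, 0.10],
    z=[1, 1, 0, 0, 0],
    y=[100, 50, 90, 45, 0],
)
```

In this toy data the two treated users are matched to the controls with scores 0.79 and 0.61, and the control with score 0.10 is never used.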
(a) Distribution of PS in the experimental group that frequently uses WeChat and the control group
(b) Distribution of PS in the experimental group that frequently uses QQ and the control group
As can be seen from the analysis of the user data for April 2018, the average
value of the experimental group on each dependent variable was significantly
larger than that of the control group. Specifically, using the LR model and
compared with the control group, the total number of calls increased by 60, the
number of outgoing calls increased by 31, the call duration increased by
72 min, the number of users called increased by 20, and the number of users
calling in increased by 19. This indicates that frequent use of WeChat has a
significant positive impact on the use of mobile phone calls.
Table 3 shows the causal effects of frequent use of QQ on mobile phone calls.
The average value of the experimental group on each dependent variable was
significantly less than that of the control group. Using the LR model and
compared with the control group, the total number of calls decreased by 9, the
number of outgoing calls decreased by 3, the call duration decreased by 7 min,
the number of users called decreased by 2, and the number of users calling in
decreased by 4. Unlike WeChat, frequent use of QQ therefore reduces how often
users make mobile phone calls.
Using the same method and steps, we analyze the data of mobile phone users for
May 2018. Tables 4 and 5 show the results, which differ little from those for
April. Combining the results of the two months of empirical analysis, we
conclude that frequent use of WeChat promotes the use of traditional mobile
phone calls, whereas frequent use of QQ has a negative effect on mobile phone
calls.
Table 6. Sensitivity analysis of the causal effects of frequent use of WeChat on mobile
phone calls
Γ          1         2         4         4.8       5.2
Y0  sig+   <0.0001   <0.0001   <0.0001   <0.0001   0.0694
    sig−   <0.0001   <0.0001   <0.0001   <0.0001   <0.0001
Y1  sig+   <0.0001   <0.0001   <0.0001   0.00681   >0.05
    sig−   <0.0001   <0.0001   <0.0001   <0.0001   <0.0001
Y2  sig+   <0.0001   <0.0001   <0.0001   0.0001    >0.05
    sig−   <0.0001   <0.0001   <0.0001   <0.0001   <0.0001
Y3  sig+   <0.0001   <0.0001   <0.0001   <0.0001   >0.05
    sig−   <0.0001   <0.0001   <0.0001   <0.0001   <0.0001
Y4  sig+   <0.0001   <0.0001   <0.0001   <0.0526   >0.05
    sig−   <0.0001   <0.0001   <0.0001   <0.0001   <0.0001
Table 7. Sensitivity analysis of the causal effects of frequent use of QQ on mobile
phone calls

Γ             1          2          2.6        2.8
Y0   sig +    <0.0001    <0.0001    <0.0001    <0.0001
     sig −    <0.0001    <0.0001    0.1841     >0.05
Y1   sig +    <0.0001    <0.0001    <0.0001    <0.0001
     sig −    <0.0001    <0.0001    <0.0001    0.0526
Y2   sig +    <0.0001    <0.0001    <0.0001    <0.0001
     sig −    <0.0001    <0.0001    <0.0001    0.3015
Y3   sig +    <0.0001    <0.0001    <0.0001    <0.0001
     sig −    <0.0001    <0.0001    <0.0001    0.0721
Y4   sig +    <0.0001    <0.0001    <0.0001    <0.0001
     sig −    <0.0001    <0.0001    0.0029     >0.05
has low sensitivity to the influence of hidden bias and the conclusion is reliable.
Similarly, Table 7 shows the sensitivity analysis results for the causal effect
of QQ on mobile phone calls.
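How significance bounds of this kind react to the hidden-bias parameter Γ can be illustrated with a small sketch. It assumes a matched-pairs sign test, which is an assumption made here for simplicity: the text does not state which test statistic lies behind the reported sig + and sig − values. Under Rosenbaum's model, the probability that the treated unit of a pair has the larger outcome lies between 1/(1+Γ) and Γ/(1+Γ), which bounds the one-sided p-value from below and above. The function names and example counts are hypothetical.

```python
from math import comb


def binomial_upper_tail(n, t, p):
    """P(X >= t) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1.0 - p) ** (n - j) for j in range(t, n + 1))


def rosenbaum_sign_test_bounds(n_pairs, n_positive, gamma):
    """Upper and lower bounds on the one-sided sign-test p-value.

    n_pairs:    number of matched pairs with a non-zero outcome difference
    n_positive: pairs in which the treated unit has the larger outcome
    gamma:      hidden-bias parameter (gamma = 1 means no hidden bias)
    """
    p_plus = gamma / (1.0 + gamma)   # least favorable assignment probability
    p_minus = 1.0 / (1.0 + gamma)    # most favorable assignment probability
    upper = binomial_upper_tail(n_pairs, n_positive, p_plus)
    lower = binomial_upper_tail(n_pairs, n_positive, p_minus)
    return upper, lower


# Hypothetical example: 8 of 10 pairs favor the treated unit.
for gamma in (1.0, 2.0, 4.0):
    upper, lower = rosenbaum_sign_test_bounds(10, 8, gamma)
    print(f"gamma={gamma}: upper={upper:.4f}, lower={lower:.4f}")
```

The larger the value of Γ at which the upper bound first exceeds the significance level, the less sensitive the conclusion is to hidden bias.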
1. Althoff, T., Jindal, P., Leskovec, J.: Online actions with offline impact: how online
social networks influence online and offline user behavior (2016)
2. Atzmueller, M.: Analyzing and grounding social interaction in online and offline
networks. In: Calders, T., Esposito, F., Hüllermeier, E., Meo, R. (eds.) ECML
PKDD 2014. LNCS (LNAI), vol. 8726, pp. 485–488. Springer, Heidelberg (2014). 41
3. Austin, P.C.: An introduction to propensity score methods for reducing the effects
of confounding in observational studies. Multivariate Behav. Res. 46(3), 399–424
4. Burke, M., Kraut, R.E.: The relationship between facebook use and well-being
depends on communication type and tie strength: Facebook and well-being. J.
Comput.-Mediat. Commun. 21(4), 265–281 (2016)
5. Dunbar, R.I.: Do online social media cut through the constraints that limit the
size of offline social networks? R. Soc. Open Sci. 3(1), 150292 (2016)
6. Dunbar, R.I.M., Arnaboldi, V., Conti, M., Passarella, A.: The structure of online
social networks mirrors those in the offline world. Soc. Netw. 43, 39–47 (2015)
7. Gwenn Schurgin, O., Kathleen, C.P.: The impact of social media on children, ado-
lescents, and families. Pediatrics 127(4), 800 (2011)
8. Hsu, J.Y., Small, D.S., Rosenbaum, P.R.: Effect modification and design sensitivity
in observational studies. J. Am. Stat. Assoc. 108(501), 135–148 (2013)
Do Top Social Apps Effect Voice Call? 149
9. Laranjo, L., et al.: The influence of social networking sites on health behavior
change: a systematic review and meta-analysis. J. Am. Med. Inform. Assoc. 22(1),
243–256 (2014)
10. Meng, J.: Your health buddies matter: preferential selection and social influence on
weight management in an online health social network. Health Commun. 31(12),
1 (2016)
11. Oh, H.J., Ozkaya, E., Larose, R.: How does online social networking enhance life
satisfaction? The relationships among online supportive interaction, affect, per-
ceived social support, sense of community, and life satisfaction. Comput. Hum.
Behav. 30(1), 69–78 (2014)
12. Reich, S.M., Subrahmanyam, K., Espinoza, G.: Friending, iming, and hanging out
face-to-face: Overlap in adolescents’ online and offline social networks. Dev. Psy-
chol. 48(2), 356 (2012)
13. Rosenbaum, P.R.: Design of Observational Studies (2010)
14. Sabatini, F., Sarracino, F.: Online social networks and trust. Soc. Indicat. Res. 2,
1–32 (2015)
15. Shi, J., Salmon, C.T.: Identifying opinion leaders to promote organ donation on
social media: network study. J. Med. Internet Res. 20(1), e7 (2018)
16. Song, H., et al.: Does facebook make you lonely?: A meta analysis. Comput. Hum.
Behav. 36(36), 446–452 (2014)
17. Utz, S.: The function of self-disclosure on social network sites: not only intimate,
but also positive and entertaining self-disclosures increase the feeling of connection.
Comput. Hum. Behav. 45, 1–10 (2015)
18. Westreich, D., Lessler, J., Funk, M.J.: Propensity score estimation: neural net-
works, support vector machines, decision trees (CART), and meta-classifiers as
alternatives to logistic regression. J. Clin. Epidemiol. 63(8), 826–833 (2010)
19. Yadav, M., Joshi, Y., Rahman, Z.: Mobile social media: the new hybrid element of
digital marketing communications. Proc. - Soc. Behav. Sci. 189, 335–343 (2015)
20. Yang, S., Wang, B., Lu, Y.: Exploring the dual outcomes of mobile social network-
ing service enjoyment: the roles of social self-efficacy and habit. Comput. Hum.
Behav. 64, 486–496 (2016)
Cognitive Hierarchy Based Coexistence
and Resource Allocation for URLLC
and eMBB
1 Introduction
In the past few years, there has been a consensus in the industry that the future
5G communication systems can serve different types of devices: enhanced mobile
broadband (eMBB), ultra-reliable low-latency communications (URLLC) and
massive machine-type communications (mMTC). eMBB requires a very
high data rate, URLLC devices must achieve ultra-low
latency with high reliability, and mMTC supports massive numbers of devices that
sporadically send small packets [1]. Topics such as edge computing and the Internet
of Things have been studied extensively [2–4] and may provide solutions for the
new 5G scenarios. Moreover, because of the limited available spectrum and the different
quality of service (QoS) requirements, the coexistence of URLLC and eMBB has
been a hot topic in both industry and academia. The variable 5G frame structure
gives researchers a feasible way to satisfy the URLLC latency requirement [5,6]:
slots are further divided into mini-slots and a puncturing mechanism is
applied. In [7], a novel resource allocation scheme based on the flexible
numerology of 5G was proposed. In addition, network slicing schemes for different services have
also been studied [8].
© Springer Nature Switzerland AG 2019
M. Qiu (Ed.): SmartCom 2019, LNCS 11910, pp. 150–160, 2019.
CH Based Coexistence and Resource Allocation for URLLC and eMBB 151
The rest of this paper is organized as follows. Section 2 describes our system
model. Section 3 introduces our proposed resource allocation
algorithm, and Sect. 4 evaluates the performance of our model. Finally, the
paper is concluded in Sect. 5.
[Figure: repetition interval shared by eMBB users and URLLC users]
$$b_m = B\tau_m q_m = B\tau_m \left[ \log\left(1 + \frac{|h_m|^2 P_m}{\sigma^2}\right) - \sqrt{\frac{V_m}{B\tau_m}}\, Q^{-1}(\epsilon_m) \log e \right], \tag{4}$$
where $b_m$ denotes the packet size. The error probability is then
$$\epsilon_m = Q\left( \frac{\sqrt{B\tau_m}\, \log\left(1 + \frac{|h_m|^2 P_m}{\sigma^2}\right)}{\log e} - \frac{b_m}{\sqrt{B\tau_m}\, \log e} \right). \tag{5}$$
Since it is obvious that Q(x) decreases with x, our objective to minimize the
error probability of each URLLC user can be converted to maximizing the
content in parentheses on the right side of (5).
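The dependence of the error probability in (5) on the time fraction can be checked numerically. The sketch below is illustrative only: it takes logarithms to base 2, evaluates Q through the complementary error function, and relies on the channel dispersion being approximately 1, as the form of (5) implies. The function names and the numbers in the example are hypothetical.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def urllc_error_probability(bandwidth_hz, tau_s, snr, packet_bits):
    """Error probability of a URLLC user following the form of (5).

    bandwidth_hz: bandwidth B
    tau_s:        time allocated to the user (tau_m)
    snr:          |h_m|^2 * P_m / sigma^2 (linear scale)
    packet_bits:  packet size b_m
    """
    channel_uses = bandwidth_hz * tau_s          # B * tau_m
    log_e = math.log2(math.e)                    # log e with base-2 logarithm
    argument = (
        math.sqrt(channel_uses) * math.log2(1.0 + snr) / log_e
        - packet_bits / (math.sqrt(channel_uses) * log_e)
    )
    return q_function(argument)


# Hypothetical numbers: a longer allocation lowers the error probability.
short = urllc_error_probability(1000.0, 0.1, 1.0, 100.0)
longer = urllc_error_probability(1000.0, 0.2, 1.0, 100.0)
```

Because the argument of Q grows with the allocated time, `longer` is smaller than `short`, which is the monotonic behavior that the conversion to problem (6) relies on.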
The optimization problem of each URLLC user is therefore
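$$\max_{\tau_m} \; \frac{\sqrt{B\tau_m}\, \log\left(1 + \frac{|h_m|^2 P_m}{\sigma^2}\right)}{\log e} - \frac{b_m}{\sqrt{B\tau_m}\, \log e}, \tag{6}$$
subject to
$$\tau_m \le L - \sum_{m' \in \mathcal{M},\, m' \ne m} \tau_{m'} - \sum_{n \in \mathcal{N}} l_n, \tag{7}$$
$$0 \le \tau_m \le L, \tag{8}$$
$$P_m \cdot \tau_m \le P_m^T, \tag{9}$$
$$\frac{\sqrt{B\tau_m}\, \log\left(1 + \frac{|h_m|^2 P_m}{\sigma^2}\right)}{\log e} - \frac{b_m}{\sqrt{B\tau_m}\, \log e} \ge Q^{-1}(\epsilon_m^T), \tag{10}$$
where $\epsilon_m^T$ is the maximum tolerable error rate and $P_m^T$ is
the energy threshold set for the repetition interval for the purpose of
energy saving.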
(2) eMBB Users Requirements
Different from URLLC users, eMBB users access spectrum through the
EDCA mechanism, which means that users must contend with each other.
154 K. Yang et al.
Since users may have different QoS requirements, ACs of different
priorities are helpful. We assume that these users belong to I ACs, which are
assigned different parameters, such as the contention window (CW) and
arbitration interframe space (AIFS), so that high-priority users can
access channels preferentially.
Thus the data rate of an eMBB user n belonging to AC i can be expressed as
$$r_n(i) = P_{s,i} \cdot B \log\left(1 + \frac{|h_n|^2 P_n}{\sigma^2}\right), \tag{11}$$
where $P_{s,i}$ denotes the successful transmission probability for each user of
AC i. The detailed derivation is given in [11].
In view of the strong demand for data rate, each eMBB user should solve
the optimization problem
$$\max_{l_n} \; r_n \cdot l_n, \tag{12}$$
subject to
$$l_n \le L - \sum_{n' \in \mathcal{N},\, n' \ne n} l_{n'} - \sum_{m \in \mathcal{M}} \tau_m, \tag{13}$$
$$0 \le l_n \le L, \tag{14}$$
$$P_n \cdot l_n \le P_n^T, \tag{15}$$
$$r_n \cdot l_n \ge R_n^T. \tag{16}$$
The constraints show that both the minimum data rate requirement and the
maximum energy limit should be satisfied.
The main objective of this paper is to find an optimal resource allocation solution
for our proposed coexistence mechanism, i.e., τm and ln of (6) and (12). Because
the system may contain a large number of heterogeneous users,
solving the problem directly is impractical: it would require
a great deal of computation. Thus, a distributed
resource allocation scheme is required. The optimal
time fraction of user m or user n depends on the time fractions of the
remaining users, which motivates us to adopt a game-theoretic method.
According to (6)–(16), we can define a noncooperative game $\langle \mathcal{I}, (S_i)_{i\in\mathcal{I}}, (U_i)_{i\in\mathcal{I}} \rangle$.
$(S_i)_{i\in\mathcal{I}}$ is the strategy space for the players in $\mathcal{I}$, i.e., URLLC and eMBB users, and
$(U_i)_{i\in\mathcal{I}}$ represents their utility functions. The utility function of each URLLC
user can be expressed as
$$U_m(\tau_m) = \frac{\sqrt{B\tau_m}\, \log\left(1 + \frac{|h_m|^2 P_m}{\sigma^2}\right)}{\log e} - \frac{b_m}{\sqrt{B\tau_m}\, \log e}. \tag{17}$$
Likewise, the utility function of each eMBB user is
$$U_n(l_n) = r_n \cdot l_n. \tag{18}$$
The utility functions in (17) and (18) are increasing
in τm and ln, respectively.
We denote the upper bounds of τm and ln as ubm and ubn. It is easy to see
that the generalized Nash equilibrium (GNE)
solution is unique and equals the
upper bounds of the users when $\sum_{m\in\mathcal{M}} ub_m + \sum_{n\in\mathcal{N}} ub_n \le L$, because users then have no
incentive to change their strategies. Otherwise, the GNE is not unique. However,
traditional game theory based on the Nash equilibrium, which holds that all players
are completely rational and always requires information exchange, is
unrealistic. In view of the heterogeneity of users in our system, that is, they
may have different requirements and computation capabilities, as well as the
confidentiality of user information, bounded rationality is a more suitable
assumption. We therefore consider a game based on cognitive hierarchy (CH) theory
[12,13], which builds on the concept of bounded rationality.
In a CH-based game, players are divided into different levels of rationality.
The original cognitive hierarchy theory assumes that a player at level k selects
its strategy based on the strategies of players belonging to lower levels, and
players at level 0 choose strategies randomly according to a uniform distribution.
However, this does not fit our model, since many users may have similar
computation capacity, i.e., be at the same level. We therefore extend the original model by
letting a player at level k regard the levels of the remaining users as 0 ∼ k.
Details are shown below:
(1) The number of users at each level is assumed to follow a Poisson
distribution f with mean value τ. In our system model, eMBB users
are assigned higher levels (above q) than URLLC users (0 ∼ q) because of
their greater computation capacities.
(2) Users of level k know exactly the proportions of users belonging to levels 0 ∼
k, i.e., f(0), ..., f(k). After normalization, the relative frequency
$g_k(h)$ of players at level h (0 ≤ h ≤ k) is $g_k(h) = f(h) / \sum_{i=0}^{k} f(i)$, and
$g_k(h) = 0$ for h > k.
(3) Level 0 users are assumed to choose strategies within [lbm, ubm] to avoid
wasting too many resources, where lbm denotes the lower bound of user m.
(4) The cognitive hierarchy equilibrium (CHE) is defined to be composed of
strategies $s_i^*$ if and only if $s_i^*(k) = \arg\max U_i(s_i^*)$.
We can note that the strategy of a level-k user depends on decisions of users
of level 0 ∼ k, i.e., the strategy space of each user is based on its beliefs gk about
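other users. Thus, for a level-k user, constraints (7) and (13) can be converted to
$$\sum_{h=0}^{k} g_k(h) \sum_{m\in\mathcal{U}} \tau_m^*(h) \le L, \quad k \le q \tag{19}$$
$$\sum_{h=0}^{q} g_k(h) \sum_{m\in\mathcal{U}} \tau_m^*(h) + \sum_{h=q+1}^{k} g_k(h) \sum_{n\in\mathcal{U}} l_n^*(h) \le L, \quad k > q \tag{20}$$
where $\mathcal{U} = \mathcal{M} \cup \mathcal{N}$, so $U = M + N$ is the total number of users. $\tau_m^*(h)$ and $l_n^*(h)$
represent the CHE strategies of URLLC and eMBB users at level h, respectively.
(19) can be further converted to
$$g_k(k)\, \tau_m^*(k) \le L - g_k(k) \sum_{m'\in\mathcal{U},\, m' \ne m} \tau_{m'}^*(k) - \sum_{h=0}^{k-1} g_k(h) \sum_{m\in\mathcal{U}} \tau_m^*(h), \quad k \le q \tag{21}$$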
However, when the behavior of peer players is considered, each solving step
resembles an equilibrium problem, resulting in large computational complexity.
Since players of the same type may have similar
demands and capabilities, we assume that level-k players believe that players
of the same level will choose the same action ($\tau_m^*(k) = \tau_{m'}^*(k)$). Then (21) can
be further expressed as
$$\tau_m^*(k) \le \frac{1}{g_k(k) \cdot U} \left[ L - \sum_{h=0}^{k-1} g_k(h) \sum_{m\in\mathcal{U}} \tau_m^*(h) \right], \quad k \le q \tag{22}$$
When solving the optimization problems of URLLC and eMBB users, each
user finds its CHE strategy based on its own beliefs instead of through an iterated process,
and has no incentive to change its action, which indicates the stability of the CHE. The
CHE solution of URLLC user m at level k can be expressed as
$$\tau_m^*(k) = \min\left\{ \frac{1}{g_k(k) \cdot U} \left[ L - \sum_{h=0}^{k-1} g_k(h) \sum_{m\in\mathcal{U}} \tau_m^*(h) \right],\; ub_m \right\}. \tag{23}$$
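The level-by-level computation behind (23) can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: it assumes that all users of a level share one upper bound and one action, that a level-k user represents level-0 behavior by the midpoint of [lb, ub] (the mean of the uniform choice), and that negative shares are clamped to zero in line with (8). All function and parameter names are hypothetical.

```python
import math


def poisson_pmf(h, mean):
    """Poisson probability f(h) with the given mean."""
    return math.exp(-mean) * mean**h / math.factorial(h)


def beliefs(k, mean):
    """Normalized beliefs g_k(h), h = 0..k, of a level-k user."""
    f = [poisson_pmf(h, mean) for h in range(k + 1)]
    total = sum(f)
    return [value / total for value in f]


def che_time_fractions(L, num_users, mean, lb0, ub):
    """CHE time fractions tau*(k) for URLLC levels k = 0..len(ub)-1.

    L:         length of the repetition interval
    num_users: total number of users U
    mean:      mean of the Poisson level distribution
    lb0:       lower bound for level-0 users
    ub:        list of upper bounds, one per level
    """
    # Assumption: level-0 behavior is summarized by its expected value.
    tau = [(lb0 + ub[0]) / 2.0]
    for k in range(1, len(ub)):
        g = beliefs(k, mean)
        # Time the level-k user believes lower levels will occupy.
        used = sum(g[h] * num_users * tau[h] for h in range(k))
        share = (L - used) / (g[k] * num_users)
        tau.append(min(max(share, 0.0), ub[k]))
    return tau


# Hypothetical setting: 10 users, interval of 100 time units, mean 1.2.
solution = che_time_fractions(100.0, 10, 1.2, 0.0, [10.0, 20.0])
```

Each level needs only the strategies of the levels below it, so the solution is obtained in a single pass rather than through iterated best responses.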
4 Simulation Results
In this section, simulation results are presented to evaluate the performance of our
proposed URLLC and eMBB coexistence scheme. We set URLLC users to levels 0
and 1 and eMBB users to levels 2 and 3 according to their computation capacity
and QoS requirements. Higher-level eMBB users are assumed to have
higher priority in the EDCA mechanism, which is achieved through the parameters
in [11]. We consider a bandwidth of 20 MHz and an interval of 100 ms. The
noise power is −174 dBm/Hz. The transmission powers of the four levels are set to 0.2,
0.5, 1, and 1.5 W, respectively. The mean value τ of the Poisson distribution is
1.2. Each user knows the parameters of each type of device before the time
Fig. 3. Successful transmission probability of each eMBB user in level 2 and level 3.
(a) CHE solution of level 1 user. (b) CHE solution of level 2 user.
almost exceeds the upper bound due to the continuous reduction of the successful
transmission probability, which means that user requirements will not be
satisfied when there are more users in the system. Fig. 5 shows that, compared
with GNE, the performance of CHE does not drop much, with only several users
failing to meet their requirements due to the normalization.
Thus we can conclude that CH theory performs well when the number of users
in the system is moderate, with significantly reduced computational complexity and
communication overhead, while too many users may degrade system
performance. For the purpose of guaranteeing the performance of both URLLC and
eMBB users, we should control the total number of users within an appropriate
5 Conclusion
In this paper, we have proposed a coexistence scheme for URLLC and eMBB
users. First, the access mechanisms defined in EDCA and HCCA are used
to grant access to URLLC and eMBB users, aiming to satisfy their different QoS
requirements. Second, a more realistic CH-based game is used to solve
the distributed resource allocation problem. The simulation results demonstrate
that the CH-based coexistence and distributed resource allocation scheme
achieves good performance with low computational complexity and communication
overhead when there are not too many users in the system.
1. 3GPP TSG RAN WG1 95, Technical report, November 2018
2. Gai, K., Qiu, M., Zhao, H., Tao, L., Zong, Z.: Dynamic energy-aware cloudlet-based
mobile cloud computing model for green computing. J. Netw. Comput. Appl. 59,
46–54 (2016)
3. Qiu, H., Noura, H., Qiu, M., Ming, Z., Memmi, G.: A user-centric data protection
method for cloud storage based on invertible DWT. IEEE Trans. Cloud Comput.
4. Gai, K., Qiu, M., Zhao, H.: Cost-aware multimedia data allocation for heteroge-
neous memory using genetic algorithm in cloud computing. IEEE Trans. Cloud
Comput. (2016)
5. Pedersen, K.I., Pocovi, G., Steiner, J., Khosravirad, S.R.: Punctured scheduling
for critical low latency data on a shared channel with mobile broadband. In: 2017
IEEE 86th Vehicular Technology Conference (VTC-Fall), pp. 1–6. IEEE (2017)
6. Anand, A., De Veciana, G., Shakkottai, S.: Joint scheduling of URLLC and eMBB
traffic in 5G wireless networks. In: IEEE INFOCOM 2018-IEEE Conference on
Computer Communications, pp. 1970–1978. IEEE (2018)
7. You, L., Liao, Q., Pappas, N., Yuan, D.: Resource optimization with flexible
numerology and frame structure for heterogeneous services. IEEE Commun. Lett.
22(12), 2579–2582 (2018)
8. Popovski, P., Trillingsgaard, K.F., Simeone, O., Durisi, G.: 5G wireless network
slicing for eMBB, URLLC, and mMTC: a communication-theoretic view. IEEE
Access 6, 55765–55779 (2018)
9. IEEE802.11e: 802.11e-2005 IEEE standard for information technology telecom-
munications and information exchange between systems local and metropolitan
area networks specific requirements part 11: wireless LAN medium access control
(MAC) and physical layer (PHY) specifications: Amendment 8: Medium access
control (MAC) Quality of Service enhancements (2005)
10. Yang, W., Durisi, G., Koch, T., Polyanskiy, Y.: Quasi-static multiple-antenna fad-
ing channels at finite blocklength. IEEE Trans. Inf. Theory 60(7), 4232–4265 (2014)
11. Chen, Q., Yu, G., Ding, Z.: Enhanced LAA for unlicensed LTE deployment based
on TXOP contention. IEEE Trans. Commun. 67(1), 417–429 (2018)
12. Camerer, C.F., Ho, T.H., Chong, J.K.: A cognitive hierarchy model of games. Q.
J. Econ. 119(3), 861–898 (2004)
13. Abuzainab, N., Saad, W., Hong, C.S., Poor, H.V.: Cognitive hierarchy theory for
distributed resource allocation in the internet of things. IEEE Trans. Wirel. Com-
mun. 16(12), 7687–7702 (2017)
Smart Custom Package Decision for Mobile
Internet Services
Abstract. With the rapid development of mobile Internet services, the
existing package decisions of telecom operators have become a bottleneck for
mobile Internet services because of their high price. Mobile Internet application
companies have begun cooperating with telecom operators to promote free-flow
packages. Based on the users' online data provided by a telecom operator, we
propose a smart custom package decision scheme based on user behavior analysis.
A two-sided market model is used to formulate the package decision.
Experimental results show that our scheme achieves good efficiency in terms of profit and
social welfare.
1 Introduction
Mobile Internet services have developed rapidly in recent years. According to research
statistics, the global total mobile business traffic will reach 292 EB in 2019, of which
intelligent traffic accounts for about 97%. About 70% of traffic will be used for
bandwidth-sensitive services such as video services. With the rapid development of
Mobile Edge Computation (MEC) and Cloud Computation, more and more traffic will
be used by mobile terminals [1–3]. As the mobile Internet market grows,
many Internet companies have launched their own APP services. The number of APPs
in Google Play and the APP Store reached 5.0 million in 2019. Under these
circumstances, the traditional traffic packages provided by telecom operators cannot meet the needs
of users and Internet companies.
Chinese telecom operators offer unlimited traffic packages for certain types of
APPs by cooperating with Internet companies. This kind of custom package is beneficial
to both mobile users and Internet companies by reducing the burden of the telecom
operator. The King Card issued by Tencent is a typical example.
We propose a smart custom package decision scheme for mobile Internet services
based on online user behavior patterns. A two-sided market model is used to formulate the
problem, in order to achieve maximum profit with good social welfare at the same time.
The rest of this paper is organized as follows. Section 2 reviews the main solutions
related to users' online behavior patterns. Section 3 formalizes the problem and presents
the result mathematically. Section 4 describes the custom package decision scheme.
Section 5 evaluates the performance of the method proposed in Sect. 4. Section 6
concludes the paper.
2 Related Work
User Persona technology is used to extract the user's Internet characteristics. The User
Persona was proposed by Alan Cooper; its basic function is to describe one user or a
class of users with some characteristics. Holden et al. describe users with photos and
user interests [4].
Researchers choose different description objects in different scenarios. Kumar,
Chikhaoui and others build User Portrait models for individual users and extract
features for tasks such as user behavior prediction or content recommendation [5, 6]. User
Portraits for a single user can be used for personalized customization. When User Portrait
technology is applied to groups, all users are divided into groups that are
described with multiple tags. LeRouge analyzed the characteristics of the elderly
population in China using a group User Portrait [7]. Rossi et al. determine whether an airport is
congested using real-time population data at the airport [8]. Group User Portrait
technology is mainly used for population characteristic analysis.
Timing characteristics are another important feature of user behavior. Many
effective methods have been proposed to depict users' Internet habits, such as
collaborative filtering, the factor model [9], and the Markov model. However, those methods
ignore other information such as time and location. Many context-based analysis
methods that use time information, spatial information, and social relationships have
been proposed [10, 11].
Users should be segmented when different plans are designed for different users. The
K-means algorithm can solve this problem. However, there is no solution to the pricing of
each type of user's package. Among the many economic theories, the two-sided market
can describe the relationship between operators, users, and Internet manufacturers.
Armstrong first proposed the two-sided market model and put forward its basic
form [12], which is mainly used to describe intermediary
markets. Armstrong, Wright, Rochet and others study the pricing strategy of the
platform in different ways [13, 14]. However, the two-sided market is still a
developing economic model.
Smart Custom Package Decision for Mobile Internet Services 163
3 Problem Formulation
First of all, the symbols used in the following equations are shown in Table 1.
A discrete sequence of user accessed content is used to describe the user’s Internet
behavior. The user’s online behavior pattern in Fig. 1 can be described by the sequence
“WeChat-PayPal-WeChat-QQ-Alipay”. It is assumed that the user will still access the
last accessed content before the user accesses the next content.
To build a user state transfer matrix, define a discrete transfer tendency as:
Formula (1) is used to calculate the content state transfer intensity matrix Q and to fit
the state transfer probability matrix P. The fitting method is:
$$p_{rs} = \begin{cases} 0, & r = s \\ \dfrac{q_{rs}}{q_{rr}}, & r \ne s \end{cases} \tag{2}$$
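A transfer probability matrix of this kind can be sketched directly from an access sequence. The example below is an illustration only: it does not implement Formula (1), which defines the transfer intensity, but simply row-normalizes the observed transition counts and sets the diagonal to zero as in (2). The function name is hypothetical.

```python
from collections import Counter


def transition_matrix(sequence):
    """Empirical state transfer probabilities from an access sequence.

    Self-transitions are ignored so that p[r][r] = 0, and each row is
    normalized by the number of observed departures from state r.
    """
    states = sorted(set(sequence))
    counts = Counter(
        (current, nxt)
        for current, nxt in zip(sequence, sequence[1:])
        if current != nxt
    )
    matrix = {}
    for r in states:
        row_total = sum(counts[(r, s)] for s in states)
        matrix[r] = {
            s: (counts[(r, s)] / row_total if row_total else 0.0)
            for s in states
        }
    return matrix


# The access sequence used as the example in the text.
access = ["WeChat", "PayPal", "WeChat", "QQ", "Alipay"]
P = transition_matrix(access)
```

For this sequence, WeChat is followed once by PayPal and once by QQ, so both entries of the WeChat row are 0.5; the Alipay row is all zeros because no departure from Alipay is observed.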
164 Z. Yang et al.
A factor model is used to analyze the user's Internet characteristics and design the
contents of the traffic plan. The model is generally expressed as:
$$R \approx P Q^{T} \tag{6}$$
$P \in \mathbb{R}^{M \times K}$ and $Q \in \mathbb{R}^{N \times K}$ are the factor matrices for the users and the accessed
contents, respectively. M is the number of users, N is the number of
accessed contents, and K is the number of factors.
Users' online behavior is found to be similar; this similarity is referred to as the user's
behavior pattern. It is assumed that each user has several independent Internet timing
characteristics. Each pattern affects the user's behavior with a different weight $\omega_P$.
The timing characteristics and the currently accessed content $A_t$ jointly determine the
upcoming access to the content $A_{t+1}$.
$$P(A_t, A_{t+1}) = \sum_{k} \omega(P_k)\, f(P_k, A_t, A_{t+1}) \tag{7}$$
When all users are taken into consideration, the formula is converted into tensor form,
as shown in Eqs. (7) and (8).
$$R = \chi \times A \times B \times C \tag{9}$$
$$W(P, U) = A \tag{10}$$
$$F(P, A_t, A_{t+1}) = \chi \times B \times C \tag{11}$$
Finally, according to the weight matrix, the online behavior pattern with the highest
weight is found for each user and regarded as the main online behavior pattern of
that user. The proportions of the different online behavior patterns are analyzed to find the top
K patterns with the largest proportions, which are the main patterns of all users.
The number of users n is related to the utility u through the function φ:
$$n_1 = \varphi_1(u_1) = \frac{1}{1 + e^{-\omega_1 u_1}} \tag{12}$$
$$n_2 = \varphi_2(u_2) = \frac{1}{1 + e^{-\omega_2 u_2}} \tag{13}$$
When social welfare is maximized, the overall welfare of users, Internet companies,
and the telecom operator is required to be the greatest.
$$v_1'(u_1) = \varphi_1(u_1) \tag{14}$$
$$v_2'(u_2) = \varphi_2(u_2) \tag{15}$$
$$u_1 = \alpha_1 \ln(1 + F n_2) - C F - f + \frac{\alpha_2 n_2 F}{1 + F n_1} \tag{18}$$
$$u_2 = \alpha_2 \ln(1 + F n_1) + \frac{\alpha_1 n_1 F}{1 + F n_2} \tag{19}$$
$$P_{s1} = C F + f - \frac{\alpha_2 n_2 F}{1 + F n_1} \tag{20}$$
$$P_{s2} = -\frac{\alpha_1 n_1 F}{1 + F n_2} \tag{21}$$
Formula (21) shows that, under the optimal conditions for social welfare, the
platform should subsidize Internet manufacturers. At the same time, it should charge the
user a price below the cost price $(C F + f)$.
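Platform profit optimization requires maximizing the profit of the telecom operator.
$$W = \pi(u_1, u_2) \tag{22}$$
$$u_1 = \alpha_1 \ln(1 + F n_2) - (C F + f) + \frac{\alpha_2 n_2 F}{1 + F n_1} - \frac{\varphi_1(u_1)}{\varphi_1'(u_1)} \tag{23}$$
$$u_2 = \alpha_2 \ln(1 + F n_1) + \frac{\alpha_1 n_1 F}{1 + F n_2} - \frac{\varphi_2(u_2)}{\varphi_2'(u_2)} \tag{24}$$
$$P_{s1} = C F + f - \frac{\alpha_2 n_2 F}{1 + F n_1} + \frac{\varphi_1(u_1)}{\varphi_1'(u_1)} = C F + f - \frac{\alpha_2 n_2 F}{1 + F n_1} + \frac{1}{(1 - n_1)\,\omega_1} \tag{25}$$
$$P_{s2} = -\frac{\alpha_1 n_1 F}{1 + F n_2} + \frac{\varphi_2(u_2)}{\varphi_2'(u_2)} = -\frac{\alpha_1 n_1 F}{1 + F n_2} + \frac{1}{(1 - n_2)\,\omega_2} \tag{26}$$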
When maximizing its own profit, the platform charges the user and the Internet
manufacturer an additional fee, which depends on the parameters ω.
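The additional fee can be made concrete with a small numerical sketch. It assumes the logistic participation function n = 1 / (1 + e^(−ωu)), for which φ(u)/φ'(u) = 1 / ((1 − n) ω), and it uses a reading of the welfare-optimal prices in which the user pays the cost C·F + f minus the external benefit, while the Internet manufacturer receives a subsidy. The signs of several terms are not legible in the extracted source, so this reading is an assumption. Function names and parameter values are hypothetical.

```python
import math


def participation(u, omega):
    """Logistic participation n = phi(u) = 1 / (1 + exp(-omega * u))."""
    return 1.0 / (1.0 + math.exp(-omega * u))


def markup(n, omega):
    """Additional fee phi(u) / phi'(u) = 1 / ((1 - n) * omega)."""
    return 1.0 / ((1.0 - n) * omega)


def welfare_prices(C, F, f, a1, a2, n1, n2):
    """Prices for users and Internet manufacturers under welfare maximization."""
    price_user = C * F + f - a2 * n2 * F / (1.0 + F * n1)
    price_company = -a1 * n1 * F / (1.0 + F * n2)
    return price_user, price_company


def profit_prices(C, F, f, a1, a2, n1, n2, omega1, omega2):
    """Prices under profit maximization: welfare prices plus the markup."""
    price_user, price_company = welfare_prices(C, F, f, a1, a2, n1, n2)
    return price_user + markup(n1, omega1), price_company + markup(n2, omega2)


# Hypothetical parameter values.
welfare = welfare_prices(1.0, 2.0, 0.5, 1.0, 1.0, 0.5, 0.5)
profit = profit_prices(1.0, 2.0, 0.5, 1.0, 1.0, 0.5, 0.5, 2.0, 2.0)
```

With these numbers the welfare-optimal user price lies below the cost C·F + f and the manufacturer price is negative (a subsidy), while the profit-maximizing prices exceed the welfare-optimal ones by exactly the markup on each side.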
Under the competition condition, this paper analyzes the pricing strategy under the
optimal condition of social welfare and the optimal condition of platform profit.
The calculation for the optimal condition of social welfare is shown below.
$$W = \pi(u_1, u_2) + v_1(u_1) + v_2(u_2) \tag{27}$$
$$u_{1m} = \alpha_1 \ln(1 + F_m n_2) - C F_m - f + \frac{\alpha_2 n_2 F_m}{1 + F_m n_{1m}} \tag{28}$$
$$u_{2m} = \alpha_2 \ln(1 + F_m n_{1m}) + \frac{\alpha_1 n_{1m} F_m}{1 + F_m n_{2m}} \tag{29}$$
$$P_{s1m} = C F_m + f - \frac{\alpha_2 n_2 F_m}{1 + F_m n_{1m}} \tag{30}$$
$$P_{s2m} = -\frac{\alpha_1 n_{1m} F_m}{1 + F_m n_{2m}} \tag{31}$$
$$W = \pi(u_1, u_2) \tag{32}$$
$$u_{2m} = \alpha_2 \ln(1 + F_m n_{1m}) + \frac{\alpha_1 n_{1m} F_m}{1 + F_m n_{2m}} - \frac{\varphi_2(u_2)}{\varphi_2'(u_2)} \tag{34}$$
$$P_{s1m} = C F_m + f - \frac{\alpha_2 n_2 F_m}{1 + F_m n_{1m}} + \frac{1}{\sum_{i=1}^{M} \omega_{1mm} + \omega_{1mj} n_j} \tag{35}$$
$$P_{s2m} = -\frac{\alpha_1 n_{1m} F_m}{1 + F_m n_{2m}} + \frac{1}{(1 - n_{2m})\,\omega_{2m}} \tag{36}$$
Formulas (31) and (36) show the results under the competition condition.
5 Performance Evaluations
The dataset is provided by China Unicom and includes a record of all information
about the users' online behavior within a natural month. Users are classified according
to behavior pattern weight, as shown in Table 2.
Figure 3(a) and (b) describe the fees charged to each user for different packages
under the condition of maximizing social welfare or the platform's profit.
Package I has the highest price and package II has the lowest.
The number of users is equivalent to the probability that users will participate in
this package. When there are many users (n1 > 0.8), the platform can charge the
users a higher fee.
Figure 4(a) and (b) describe how the platform charges Internet manufacturers
under the condition of maximizing social welfare or platform benefits.
It is found that the total cost of Internet companies in different packages is basically
the same. Formula (34) shows that the charges do not reduce the flow consumption. As
a result, the more Internet companies are involved in the package, the lower the fees for each
Internet company. When Internet companies are not willing to join the packages (n2 <
0.2), the platform needs to subsidize them. When Internet companies tend
to join the package (0.2 < n2 < 1), the platform should adopt a free strategy. However,
under the condition that the platform profit is optimized, the Internet companies should
be charged.
[Figure 3: (a) Analysis of user pricing scheme while maximizing social welfare. (b) Analysis of user pricing scheme while maximizing the profit of the platform.]
[Figure 4: (a) Analysis of Internet manufacturers pricing scheme while maximizing social welfare. (b) Analysis of Internet manufacturers pricing scheme while maximizing the profit of the platform.]
Figures 5 and 6 show the experimental results. Traffic does not determine the cost
of Internet manufacturers. Instead, the cost is determined by the willingness of Internet
companies to join packages and the number of Internet companies involved in the package.
[Figure 5: (a) Analysis of user pricing while maximizing social welfare. (b) Analysis of user pricing while maximizing platform profit.]
[Figure 6: (a) Analysis of Internet manufacturers pricing scheme while maximizing social welfare. (b) Analysis of Internet manufacturers pricing scheme while maximizing the profit of the platform.]
6 Conclusion
Based on the UDR data provided by mobile operators, this paper designs a free-flow
package mechanism based on users' online behavior. First, a factor model based on
tensor decomposition is proposed to analyze the users' online behavior. For two
different scenarios, this paper designs the utility functions of users, Internet
manufacturers, and mobile operators. These are analyzed under social welfare maximization
and platform profit maximization, respectively, and the package pricing formulas under the different
conditions are obtained. Finally, this paper makes a quantitative analysis of the model
using empirical data and completes the pricing design of the free-flow package.
Acknowledgement. This work was supported in part by the National Natural Science Foun-
dation of China under Grant 11571383, the Natural Science Foundation of Hubei Province of
China under Grant 2017CFB302, the Science and Technology Program of Guangzhou, China
under Grant 201804020053, and Shanghai Special Fund Project for Artificial Intelligence
Innovation and Development under Grant 2018-RGZN-01013.
1. Gai, K., Qiu, M., Hui, Z., Tao, L., Zong, Z.: Dynamic energy-aware cloudlet-based mobile
cloud computing model for green computing. J. Netw. Comput. Appl. 59(C), 46–54 (2016)
2. Gai, K., Qiu, M., Hui, Z.: Energy-aware task assignment for mobile cyber-enabled
applications in heterogeneous cloud computing. J. Parallel Distrib. Comput. 111, S0743731
517302319 (2017)
3. Gai, K., Xu, K., Lu, Z., Qiu, M., Zhu, L.: Fusion of cognitive wireless networks and edge
computing. IEEE Wirel. Commun. 26(3), 69–75 (2019)
4. Holden, R.J., Kulanthaivel, A., Purkayastha, S., Goggins, K., Kripalani, S.: Know thy
eHealth user: development of biopsychosocial personas from a study of older adults with
heart failure. Int. J. Med. Inform. 108, 158–167 (2017)
5. Kumar, H., Lee, S., Kim, H.G.: Exploiting social bookmarking services to build clustered
user interest profile for personalized search. Inf. Sci. (Ny) 281, 399–417 (2014)
6. Chikhaoui, B., Wang, S., Xiong, T., Pigot, H.: Pattern-based causal relationships discovery
from event sequences for modeling behavioral user profile in ubiquitous environments. Inf.
Sci. An Int. J. 285(C), 204–222 (2014)
7. LeRouge, C., Ma, J., Sneha, S., Tolle, K.: User profiles and personas in the design and
development of consumer health technologies. Int. J. Med. Inform. 82(11), E251–E268 (2013)
8. Rossi, R., Gastaldi, M., Orsini, F.: How to drive passenger airport experience: a decision
support system based on user profile. IET Intell. Transp. Syst. 12(4), 301–308 (2018)
9. Deerwester, S., Dumais, S.T., Furnas, G.W., Landauer, T.K., Harshman, R.: Indexing by
latent semantic analysis. J. Assoc. Inf. Sci. Technol. 41(6), 391–407 (2010)
10. Kawazu, H., Toriumi, F., Takano, M., Wada, K., Fukuda, I.: Analytical method of web user
behavior using Hidden Markov Model. In: IEEE International Conference on Big Data (2017)
11. Park, J., Lee, D.S., González, M.C.: The eigenmode analysis of human motion. J. Stat.
Mech: Theory Exp. 2010(11), P11021 (2016)
12. Armstrong, M.: Competition in two-sided markets. RAND J. Econ. 37(3), 668–691 (2006)
13. Armstrong, M., Wright, J.: Two-sided markets, competitive bottlenecks and exclusive
contracts. Econ. Theory 32(2), 353–380 (2007)
14. Rochet, J., Tirole, J.: Two-sided markets: a progress report. RAND J. Econ. 37(3), 645–667
A Space Dynamic Discovery Scheme for Crowd
Flow of Urban City
1 Introduction
Urban crowd flow analysis plays an important role in urban planning, management and
development. For example, it can reduce the blindness of commercial site selection based on
regional crowd flow. Combining regional crowd flow with crowd consumption attributes such as
purchasing power and consumption level provides multi-dimensional digital support
for locating a business circle, so that the overall layout can be chosen to
maximize overall benefits. Traffic planning based on regional flow data also makes
urban governance more forward-looking: the regional crowd flow attributes of
various time periods or holidays, including changes in crowd flow, population
density, and effective flow of population, can inform the planning of new roads, public
transportation, peak control measures, and large-scale transportation hubs in urban traffic.
Crowd flow analysis also makes regional governance more fine-grained: real-time dynamic
analysis of crowd flow through interaction data between mobile phones and base stations
supports the analysis of regional population characteristics and provides data support for
urban managers.
Most existing crowd traffic analysis methods consider only the crowd itself,
ignoring the spatial connection between locations, and therefore cannot perceive
the spatial flow of the crowd. This paper focuses on obtaining location data
and dynamically analyzing population flow data. We propose Crowd Flow
Analysis based on Graph Signal Processing to discover the characteristics of user
mobile behavior in cities.
© Springer Nature Switzerland AG 2019
M. Qiu (Ed.): SmartCom 2019, LNCS 11910, pp. 171–179, 2019.
172 Z. Wang et al.
2 Related Work
3 System Model
Here, arr(m, n) indicates whether the nodes are reachable: it is true if they are reachable
and false otherwise.
We use the idea of Crowd-Sensing to perceive user movement behavior and
determine base station dependencies according to the users' movement trajectories.
For the massive user mobility information in large-scale mobile data, the node
dependency discovery method used in this paper can restore the spatial connectivity of the
base stations more reliably. In terms of spatial structure, the spatial network map based
on the users' natural mobility replaces the graph topology construction method
based on traditional spatial distance. It comprehensively considers both human mobility
and the spatial distance between base stations in the real world, so that each base station
can be used as a spatial mobility checkpoint, which is a better observation entry point
for human mobile behavior. In terms of application, the base station
relationship discovery method used in this paper is derived from the data itself and
does not require any external auxiliary tools, which saves considerable resources compared
to methods based on Geographic Information Systems (GIS).
The specific base station dependency discovery algorithm is shown in Algorithm 1.
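Algorithm 1 itself is not reproduced in this text. As a rough illustration of the idea described above, the following minimal Python sketch links two base stations whenever a user's trajectory moves directly from one to the other. The linking criterion, the `min_transitions` parameter, and the path-based reading of arr(m, n) are assumptions made for illustration, not the authors' algorithm.

```python
from collections import defaultdict

def discover_dependencies(trajectories, min_transitions=1):
    """Build an undirected base-station graph from user trajectories.

    trajectories: dict mapping user_id -> list of (timestamp, station_id).
    Two stations are linked when at least `min_transitions` observed moves
    go directly between them (an assumed criterion, not from the paper).
    """
    counts = defaultdict(int)
    for records in trajectories.values():
        stations = [s for _, s in sorted(records)]  # order visits by time
        for a, b in zip(stations, stations[1:]):
            if a != b:  # ignore repeated records at the same station
                counts[frozenset((a, b))] += 1
    adjacency = defaultdict(set)
    for pair, c in counts.items():
        if c >= min_transitions:
            a, b = tuple(pair)
            adjacency[a].add(b)
            adjacency[b].add(a)
    return dict(adjacency)

def arr(adjacency, m, n):
    """Reachability predicate: True if n can be reached from m (assumed: via any path)."""
    seen, stack = {m}, [m]
    while stack:
        cur = stack.pop()
        if cur == n:
            return True
        for nxt in adjacency.get(cur, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False
```

Because the edges come only from observed movement, no map data or GIS lookup is needed, which is the property the text emphasizes.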
On the spatial network graph, the graph signal is defined as a set of finite samples,
each of which is defined on a node of the graph, expressed as $f: V \to \mathbb{R}^N$, where the
nth component of the signal represents the number of users at the nth node in V. In the
urban space network graph signal model, the definition of the number of users depends
on the duration of the user's Internet access. If a user has online behavior within the
time window $\Delta t$, the user is recorded as having an effective access to the node in $\Delta t$. A
user's effective access record exists at most once per time window, and the graph
signal is the sum of the valid access records recorded in that time window.
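The counting rule above can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout `(user_id, station_id, timestamp)` and the function name are hypothetical, and a user is counted at most once per node per window.

```python
def graph_signal(records, nodes, t0, dt, num_windows):
    """Count distinct active users per node and time window.

    records: iterable of (user_id, station_id, timestamp) access records
             (hypothetical layout).
    Returns signal[w][i] = number of users with at least one access at
    nodes[i] during window w, where window w covers [t0 + w*dt, t0 + (w+1)*dt).
    """
    index = {s: i for i, s in enumerate(nodes)}
    seen = set()  # (user, station, window) triples already counted
    signal = [[0] * len(nodes) for _ in range(num_windows)]
    for user, station, ts in records:
        w = int((ts - t0) // dt)
        if station not in index or not 0 <= w < num_windows:
            continue  # unknown node or record outside the observation period
        key = (user, station, w)
        if key not in seen:
            seen.add(key)
            signal[w][index[station]] += 1
    return signal
```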
Assume that the proposed spatial network graph topology is $G = (V_G, E_G)$, and the
time series structure of the signal is $H = (V_H, E_H)$. $V_H$ contains the time points divided by $\Delta t$,
and $E_H$ connects consecutive nodes of $V_H$ in turn. $s_i$ and $l_j$ are nodes in $V_G$ and $V_H$, respectively.
In the Cartesian product graph $G \times H$ obtained from G and H, if the node $s_i$ at position i in G is
adjacent to the node $s_k$ at position k, and the node $l_j$ at time j in H is adjacent to the node $l_l$ at time
l, then the nodes $(s_i, l_j)$ and $(s_k, l_l)$ in $G \times H$ share an edge. Assuming that the
topology of the spatial network graph G is time-invariant, the Cartesian product graph
$G \times H$ can be regarded as copies of the spatial network graph G stacked along the nodes
of the time series H. Such a network structure comprehensively considers the
spatiotemporal dependence of nodes. Since the time slice nodes in H are connected by
spatial nodes in G, the wavelet function can be applied to analyze the signal function
defined on the nodes in G, thereby detecting the temporal and spatial changes of user
movement. The Cartesian product graph construction process is shown in Fig. 1.
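As a small illustration of the "stacked copies of G" picture, the sketch below builds the adjacency matrix of a product graph with NumPy. It uses the standard Cartesian product construction (the Kronecker sum of the two adjacency matrices), in which a node is linked to its spatial neighbours in the same time slice and to itself in adjacent time slices. Reading the construction this way, and modelling H as a simple path over time slices, are assumptions of this sketch.

```python
import numpy as np

def path_graph_adjacency(num_slices):
    """Adjacency of the time-series graph H: consecutive time slices are linked."""
    A = np.zeros((num_slices, num_slices))
    idx = np.arange(num_slices - 1)
    A[idx, idx + 1] = 1
    A[idx + 1, idx] = 1
    return A

def cartesian_product_adjacency(A_G, A_H):
    """Adjacency of the Cartesian product of G and H (Kronecker sum).

    Node (l_j, s_i) is stored at index j * |V_G| + i, so the result is a
    block matrix with one copy of G per time slice on the diagonal.
    """
    n_g, n_h = A_G.shape[0], A_H.shape[0]
    same_time = np.kron(np.eye(n_h), A_G)     # spatial edges inside each slice
    same_station = np.kron(A_H, np.eye(n_g))  # temporal edges between slices
    return same_time + same_station
```

With a time-invariant G, only `A_H` grows as more time slices are observed, so the product adjacency can be rebuilt cheaply from the two small factors.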
To mine deep, fine-grained user movement features, we use the Spectral
Graph Wavelet Transform (SGWT) to obtain wavelet coefficients of different
scales on the nodes by wavelet decomposition of the graph signal model. The
coefficients represent the low- to high-frequency characteristics of the network. Specifically,
the low-frequency wavelet coefficients exhibit low-frequency sensitivity: when
the signals of the nodes connected to a node $v_i$ are close to the signal amplitude of $v_i$, the
low-frequency wavelet coefficients exhibit a large dimensionless value. Conversely,
when the wavelet filter filters the graph signal model, the high-frequency wavelet
coefficients exhibit a small value in this case.
In the graph signal model, $G \times H$ is used as the topology of the graph signal model,
and the node set V is the finite set of base stations in each period divided by $\Delta t$. Given
the symmetric adjacency matrix A with $\forall a_{mn} \in \mathbb{R}^+$, the real-valued signal $f(v_i)$
defined on the vertex $v_i$ of the graph is a column vector representing the number of
users of the node in a certain time period. The unnormalized graph Laplacian operator is
defined as shown in Eq. (2), where D is the diagonal degree matrix of A.
$$L = D - A \in \mathbb{R}^{N \times N} \tag{2}$$
At the same time, for the diagonal matrix $\Lambda = \mathrm{diag}([\lambda_0, \lambda_1, \ldots, \lambda_{N-1}])$, L can be
converted into the form of Eq. (4).
$$L = \chi \Lambda \chi^H \tag{4}$$
$$\hat{f}(l) = \langle f, \chi_l \rangle = \sum_{n=1}^{N} f(n)\, \chi_l^*(n), \quad l = 0, 1, \ldots, N-1 \tag{5}$$
f can be recovered by the graph signal reconstruction formula, as shown in Eq. (6).
$$f(n) = \sum_{l=0}^{N-1} \hat{f}(l)\, \chi_l(n), \quad n = 1, 2, \ldots, N \tag{6}$$
Using vector and matrix notation, Eqs. (5) and (6) can be converted into the
following form.
$$\hat{f} = [\hat{f}(0), \hat{f}(1), \ldots, \hat{f}(N-1)]^T = \chi^H f \tag{7}$$
$$f = [f(1), f(2), \ldots, f(N)]^T = \chi \hat{f} \tag{8}$$
The graph Fourier transform (GFT) follows the Parseval theorem, that is, for any signals f and h
defined on the graph, the relationship in Eq. (9) holds.
$$\langle f, h \rangle = \langle \hat{f}, \hat{h} \rangle \tag{9}$$
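The transform pair and the Parseval relation can be checked numerically. The following NumPy sketch builds the unnormalized Laplacian of a small graph, uses its eigenvectors as the Fourier basis, and applies the forward and inverse transforms in matrix form. Function names are illustrative only, and the 4-node cycle used in any test is a toy example rather than data from the paper.

```python
import numpy as np

def graph_fourier_basis(A):
    """Eigen-decomposition of the unnormalized Laplacian L = D - A.

    Returns (lam, chi): eigenvalues in ascending order and the matrix whose
    columns are the corresponding orthonormal eigenvectors.
    """
    D = np.diag(A.sum(axis=1))
    L = D - A
    lam, chi = np.linalg.eigh(L)  # eigh is valid because L is symmetric
    return lam, chi

def gft(chi, f):
    """Forward transform: f_hat = chi^H f."""
    return chi.conj().T @ f

def igft(chi, f_hat):
    """Inverse transform: f = chi f_hat."""
    return chi @ f_hat
```

Because the eigenvector matrix of a symmetric Laplacian is orthonormal, the inverse transform recovers f exactly (up to rounding) and inner products are preserved between the vertex and spectral domains.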
The spectral wavelet transform is an extension of the GFT. For the user-number signal
f on $G \times H$, the spectral wavelet transform defined by the kernel g is shown in Eq. (10).
$$W_f(s, n) = (T_g^s f)(n) = \sum_{l=0}^{N-1} g(s\lambda_l)\, \hat{f}(l)\, \chi_l(n), \quad n = 1, 2, \ldots, N \tag{10}$$
$T_g^s$ is the wavelet operator at scale s and satisfies $T_g^s = g(sL)$. The wavelet kernel
g is a continuous function defined on the positive real axis and satisfies Eq. (11):
$$g(0) = \lim_{x \to +\infty} g(x) = 0, \quad \int_{0}^{+\infty} \frac{g(x)}{x^2}\, dx = C_g \in \mathbb{R}^+ \tag{11}$$
Therefore, combined with Eq. (5), the spectral wavelet transform can be written as the inner
product with a single graph wavelet:
$$W_f(s, n) = (T_g^s f)(n) = \sum_{m=1}^{N} f(m)\, \psi_{s,n}^*(m) = \langle f, \psi_{s,n} \rangle, \quad n = 1, 2, \ldots, N \tag{12}$$
$$\psi_{s,n}(m) = \sum_{l=0}^{N-1} g(s\lambda_l)\, \chi_l(m)\, \chi_l^*(n), \quad m = 1, 2, \ldots, N \tag{13}$$
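The two forms of the transform, the spectral sum and the inner product with a wavelet localized at node n, can be compared numerically. The sketch below is a minimal NumPy illustration; the kernel $g(x) = x^2 e^{-x}$ is an assumed band-pass kernel chosen only because it vanishes at 0 and at infinity, and it is not the Mexican hat kernel used in the experiments.

```python
import numpy as np

def laplacian_eig(A):
    """Eigenvalues and orthonormal eigenvectors of L = D - A."""
    L = np.diag(A.sum(axis=1)) - A
    return np.linalg.eigh(L)

def kernel(x):
    """Assumed band-pass kernel g(x) = x^2 * exp(-x); g(0) = 0 and g -> 0 at infinity."""
    return x ** 2 * np.exp(-x)

def sgwt_spectral(lam, chi, f, s):
    """Wavelet coefficients at scale s for all nodes via the spectral sum."""
    f_hat = chi.conj().T @ f                # graph Fourier transform of f
    return chi @ (kernel(s * lam) * f_hat)  # filter each frequency, transform back

def wavelet(lam, chi, s, n):
    """Wavelet psi_{s,n} centred at node n, returned as a vector over nodes m."""
    return chi @ (kernel(s * lam) * chi[n].conj())

def sgwt_inner_product(lam, chi, f, s, n):
    """Single coefficient as the inner product <f, psi_{s,n}>."""
    return np.vdot(wavelet(lam, chi, s, n), f)  # vdot conjugates its first argument
```

Small scales s push the pass band toward large eigenvalues (high graph frequencies), so the same code yields high- or low-frequency coefficients depending on s.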
5 Performance Evaluations
This paper uses one day of data from a major operator in Beijing as the data source for
analysis and experimentation. The main information is shown in Table 1.
We used Mexican Hat Wavelets (MHW) as the bandpass filters for the graph signal, and
extract wavelet coefficients in the graph signal model by defining such a set of
filters. We set the number M of wavelet filters of different scales in the filter bank
$F = [F_1, F_2, \ldots, F_M]$ to 6; the filters in turn represent MHWs from high frequency to low
frequency. For the graph signal model topology considered in this paper, the frequency
domain wavelet kernel response in the finite support interval is shown in Fig. 2.
Considering the sparsity of the signal and to facilitate visualization of the
method, the high-frequency wavelet is selected as the basis of the wavelet
analysis. Under the high-frequency wavelet filter, the wavelet coefficients are sensitive
to sudden changes relative to adjacent nodes. Specifically, when the node signal is close to the
adjacent node signal values, the wavelet coefficient approaches 0; when the node signal is
larger than the adjacent node signals, the wavelet coefficient is larger. To examine
the relationship between the signal on the graph and the wavelet coefficients,
and the behavior of the wavelet coefficients, we randomly selected node #902 on
the graph and studied it. The wavelet coefficient is determined by the relationship
between the node signal and the signals of adjacent nodes, and the adjacency is
found through the temporal and spatial correlation of the nodes. Therefore, this paper constructs a
1-hop self-centered network, taking the randomly selected node #902 and a total of five 1-hop
nodes connected to it as the research target. The connection relationship is shown in
Fig. 3a. Fig. 3b left shows the signal changes of each node throughout the day, and
Fig. 3b right shows the variation of the high-frequency wavelet coefficients of the
#902 node throughout the day.
Fig. 3. Comparison of #902 node signal and high frequency wavelet coefficient. (b) Left: signal in
the self-centered network; right: high-frequency wavelet coefficient in the self-centered network.
The blue dotted line in Fig. 3b left shows the full-day signal amplitude of #902, and
the solid lines indicate the signal amplitudes of the spatially neighboring nodes.
Comparing the two panels, it can be seen that:
• The signal amplitude of #902 in e is significantly lower than that of other nodes in e.
Compared with Fig. 3b right #902 all day wavelet coefficients are mostly less than
• At 0:00–6:00, the signals of #902 and its spatially adjacent nodes change steadily, with
no significant fluctuation in the time dimension. The absolute value of the #902 high-
frequency wavelet coefficient is low and stable, and the low-frequency characteristic
is significant.
• At 8:00, the signal amplitude of the spatially adjacent nodes of #902 increased
significantly, and the corresponding #902 high-frequency wavelet coefficient decreased
significantly. At 14:00 and 16:00, #902 showed a relatively sharp change in the
time dimension, corresponding to a significant increase in the #902 high frequency
Using the idea of Crowd-Sensing, this paper uses users' movement between
base stations to determine the dependencies between base stations, and then constructs
the spatial network graph topology. We model the number of users at each base station as
the signal on the graph and use the theory of spectral wavelet decomposition to find the
spatial behavior characteristics of the graph signal model. Through the high-dimensional
behavioral characteristics, we find that wavelet coefficients of different
scales can finely represent the high-frequency and low-frequency features
of users moving in space. This provides a new understanding of urban population
movement characteristics from the perspective of spectral graph theory, and the obtained
results can also be applied to important aspects of urban supervision such as abnormal
behavior monitoring and urban infrastructure construction.
1. Yang, S., Kalpakis, K., Biem, A.: Spatio-temporal coupled bayesian robust principal
component analysis for road traffic event detection. In: 16th International IEEE Conference on
Intelligent Transportation Systems (ITSC 2013), pp. 392–398. IEEE (2013)
2. Senaratne, H., Mueller, M., Behrisch, M., et al.: Urban mobility analysis with mobile network
data: a visual analytics approach. IEEE Trans. Intell. Transp. Syst. 19(5), 1537–1546 (2018)
3. Rawassizadeh, R., Momeni, E., Dobbins, C., et al.: Scalable daily human behavioral pattern
mining from multivariate temporal data. IEEE Trans. Knowl. Data Eng. 28(11), 3098–3112
4. Von Landesberger, T., Brodkorb, F., Roskosch, P., et al.: MobilityGraphs: visual analysis of
mass mobility dynamics via spatio-temporal graphs and clustering. IEEE Trans. Vis. Comput.
Graph. 22(1), 11–20 (2016)
5. Mallat, S.G.: A theory for multiresolution signal decomposition: the wavelet representation.
IEEE Trans. Pattern Anal. Mach. Intell. 11(7), 674–693 (1989)
6. Shuman, D.I., Narang, S.K., Frossard, P., et al.: The emerging field of signal processing on
graphs: extending high-dimensional data analysis to networks and other irregular domains.
IEEE Signal Process. Mag. 30(3), 83–98 (2012)
7. Hammond, D.K., Vandergheynst, P., Gribonval, R.: Wavelets on graphs via spectral graph
theory. Appl. Comput. Harmon. Anal. 30(2), 129–150 (2011)
8. Narang, S.K., Ortega, A.: Perfect reconstruction two-channel wavelet filter banks for graph
structured data. IEEE Trans. Signal Process. 60(6), 2786–2799 (2012)
9. Shuman, D.I., Narang, S.K., Frossard, P., et al.: The emerging field of signal processing on
graphs: extending high-dimensional data analysis to networks and other irregular domains.
arXiv preprint arXiv:1211.0053 (2012)
Automated Classification of Attacker
Privileges Based on Deep Neural Network
Beijing Advanced Innovation Center for Big Data and Brain Computing,
Beihang University, Beijing 100191, China
1 Introduction
With the rapid digitalization of various industries,
today's computer networks face an increasing number of cyberattacks, and
providing safety and security for such systems is more urgent than ever [16]. To
evaluate potential threats, attack graphs are widely used for modelling
attack scenarios that exploit vulnerabilities in computer systems and networked
infrastructures. Risk assessments generated from probabilistic attack graphs
then assist further security decisions [2,21].
Supported by National Key R&D Program of China (2018YFB0803500), the 2018
joint Research Foundation of Ministry of Education, China Mobile (5–7) and State
Key Laboratory of Software Development Environment (SKLSDE-2018ZX).
© Springer Nature Switzerland AG 2019
M. Qiu (Ed.): SmartCom 2019, LNCS 11910, pp. 180–189, 2019.
Most of the proposed attack graph generation techniques are based on
Prerequisite/Postcondition Models. Prerequisites stand for the necessary conditions
for exploiting the vulnerabilities. Postconditions are the effects and capabilities
obtained by the attackers as a result of the vulnerability exploitations. Topological
Vulnerability Analysis (TVA) [8] is one of the well-known attack graph generation
tools; it utilizes a knowledge database of prerequisites and postconditions
that relate to vulnerability exploitation steps. However, the prerequisites and
postconditions need to be generated manually from the vulnerability descriptions. Network
Security Planning Architecture (NETSPA) [7] generates prerequisites and
postconditions via a logistic regression model trained on a manually labelled sample.
However, their privilege classification schemes seem limited, in that they do not
cover application-level privileges. The other attack graph generation methods
proposed in [12,17,19] all rely on manually determining the attacker privileges
corresponding to all vulnerabilities, which is time-consuming.
A direct solution is to extract the prerequisites and postconditions from
publicly accessible vulnerability databases automatically. However, the vulnerability
descriptions in the databases are usually not entirely machine-readable, which
impedes easy parsing of the vulnerabilities [9]. Fortunately, deep learning has
had a significant impact on the field of text processing [10,18]. Conneau et al. [4]
propose a very deep convolutional network that uses up to 29 convolutional
layers for natural language processing. Hassan et al. [6] describe a joint CNN
and RNN framework for sentence classification to overcome the locality of the
convolutional and pooling layers.
Therefore, in order to generate the prerequisites and postconditions of
vulnerabilities in an automated way, we propose an automatic attacker privilege
classification model, IG-DNN. In this model, we first use the information gain
(IG) algorithm to extract the feature words of the vulnerability descriptions,
and then construct a deep neural network (DNN) model. The
IG-DNN model was trained and tested using vulnerability data derived from the
National Vulnerability Database (NVD). The experimental results show that
the proposed model effectively improves the performance of attacker privilege
classification.
The remainder of this paper is arranged as follows: Sect. 2 introduces the
definitions of the relevant concepts and algorithms. Section 3 describes the implementation
details of our model. The experimental dataset and results are discussed in Sect. 4 with
comparative analysis. Section 5 outlines the conclusions.
2 Related Definition
The automatic classification model of attacker privilege (IG-DNN) is constructed
in this paper. The relevant definitions are as follows.
are depicted in descending order from the OS (Admin) level to the None one. OS/APP
level privileges indicate privilege requirements or gains for specific operating
systems/applications. Admin level privileges are more capable than privileges
at User levels. The None level implies the attacker does not need/gain any of
the four privileges listed at the operating system or application level.
Information gain measures how mixed up the features are [5]. In the text
classification domain, information gain is used to measure the importance of feature A
to class C. The expected value of the information gain is the mutual information
I(C, A) between classes C and feature A.
$$H(C \mid A) = -\sum_{i} p(C_i \mid A) \log_2 p(C_i \mid A) \tag{3}$$
Based on the feature selection method of the information gain criterion, the
information gain of each feature is calculated, and usually, a feature with larger
information gain value should be preferred to other features.
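As a worked illustration of this criterion, the sketch below computes the information gain of a binary feature "the word occurs in the description" as the class entropy minus the conditional entropy averaged over the two feature values. The token-set document representation and the function names are assumptions for illustration; the paper's Eqs. (1) and (2) are not reproduced in this text.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(docs, labels, word):
    """IG(C, A) = H(C) - H(C | A) for the binary feature A = 'word occurs in doc'.

    docs: list of token sets, one per vulnerability description (assumed layout).
    """
    present = [y for d, y in zip(docs, labels) if word in d]
    absent = [y for d, y in zip(docs, labels) if word not in d]
    n = len(labels)
    # Conditional entropy: entropy within each feature value, weighted by its frequency.
    conditional = sum(len(part) / n * entropy(part) for part in (present, absent) if part)
    return entropy(labels) - conditional
```

Ranking all candidate words by this score and keeping the top ones gives the feature-word list; a word that splits the classes perfectly attains the maximum gain H(C).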
Next, we max-pool the result of the convolutional layer into a long feature
vector and concatenate the vector of taxonomy-based features with it. Then
the combined vectors are fed to a deep densely-connected network that consists
of a dropout layer and three fully-connected layers. The dropout layer stacked
behind the first fully-connected layer is used to avoid overfitting, thus increasing
the generalizing power.
In addition, the forward propagation of the convolutional layer and the
fully-connected layers uses ReLU as the activation function. Cross-entropy is
selected as the loss function to measure the loss between the predicted output
and the actual output. Moreover, Adam [11], an extension of stochastic gradient
descent, is used as the optimization algorithm to minimize the loss.
Finally, the output layer uses softmax as the activation function to assign
probabilities to the different classes. As the output, the attacker
privilege categories are determined.
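The layer ordering described above can be sketched as a plain NumPy forward pass. This is a minimal sketch under stated assumptions: all layer sizes, the filter width, the five output classes, and the placement of softmax in a separate output layer after the three fully-connected layers are hypothetical choices, and training with Adam is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_params(emb_dim, tax_dim, num_filters=8, width=3, hidden=16, num_classes=5):
    """Random weights; every size here is hypothetical, not taken from the paper."""
    def dense(n_out, n_in):
        return rng.normal(0, 0.1, (n_out, n_in)), np.zeros(n_out)
    return {
        "conv": (rng.normal(0, 0.1, (num_filters, width, emb_dim)), np.zeros(num_filters)),
        "fc1": dense(hidden, num_filters + tax_dim),
        "fc2": dense(hidden, hidden),
        "fc3": dense(hidden, hidden),
        "out": dense(num_classes, hidden),
    }

def relu(x):
    return np.maximum(x, 0.0)

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

def forward(embedded, taxonomy, params, drop_rate=0.0):
    """embedded: (num_words, emb_dim) word vectors; taxonomy: (tax_dim,) features."""
    W_conv, b_conv = params["conv"]
    width = W_conv.shape[1]
    windows = np.stack([embedded[i:i + width] for i in range(len(embedded) - width + 1)])
    conv = relu(np.einsum("twd,fwd->tf", windows, W_conv) + b_conv)
    pooled = conv.max(axis=0)                 # max-pool over word positions
    x = np.concatenate([pooled, taxonomy])    # append taxonomy-based features
    h = relu(params["fc1"][0] @ x + params["fc1"][1])
    if drop_rate > 0:                         # inverted dropout, used during training only
        mask = rng.random(h.shape) >= drop_rate
        h = h * mask / (1.0 - drop_rate)
    h = relu(params["fc2"][0] @ h + params["fc2"][1])
    h = relu(params["fc3"][0] @ h + params["fc3"][1])
    return softmax(params["out"][0] @ h + params["out"][1])

def cross_entropy(probs, true_class):
    """Loss for one example given the index of its true class."""
    return -np.log(probs[true_class] + 1e-12)
```

At inference time `drop_rate` stays at 0, so the predicted privilege class is simply the index of the largest softmax probability.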
4 Performance Evaluation
In this section, we evaluate the performance of the proposed model for attacker
privilege classification and compare our results with the other models.
4.1 Dataset
To verify the proposed model, we collect vulnerability information
from the NVD as experimental data. By the end of 2018, the NVD hosted more than
109,000 vulnerability entries, recorded in a series of JSON files. We extract
the required vulnerability information from the JSON files using Python and
label the vulnerability entries manually by carefully analyzing their description
texts and other base fields. Finally, we obtain two datasets: a prerequisite-labelled
dataset that contains 45,958 vulnerabilities, and a postcondition-labelled dataset
that consists of 88,730 vulnerabilities. The distribution of privilege classes in the
experimental datasets is depicted in Table 2.
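The extraction step might look like the following sketch. It assumes the legacy NVD JSON feed layout, in which each entry sits under `CVE_Items` and its text under `cve.description.description_data`; the function name and the choice to keep only English descriptions are illustrative assumptions, and the authors' actual script is not shown in the paper.

```python
import json

def extract_descriptions(feed):
    """Return (CVE id, description text) pairs from a parsed NVD JSON feed.

    Assumes the legacy feed layout:
    feed["CVE_Items"][k]["cve"]["description"]["description_data"][j]["value"].
    Entries without an id or an English description are skipped.
    """
    rows = []
    for item in feed.get("CVE_Items", []):
        cve = item.get("cve", {})
        cve_id = cve.get("CVE_data_meta", {}).get("ID")
        texts = [
            d.get("value", "")
            for d in cve.get("description", {}).get("description_data", [])
            if d.get("lang") == "en"
        ]
        if cve_id and texts:
            rows.append((cve_id, " ".join(texts)))
    return rows

def load_feed(path):
    """Read one feed file from disk (the path is supplied by the caller)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

The resulting pairs are what would then be labelled manually with prerequisite and postcondition privilege classes.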
Fig. 3. Comparison for privilege prerequisites.
Fig. 4. Comparison for privilege postconditions.
5 Conclusion
In this work, we applied a deep neural network to generate attacker
privileges as prerequisites and postconditions from the vulnerabilities in the NVD.
The proposed IG-DNN model achieved an F-measure of 99.53% and 98.90% for
privilege prerequisites and postconditions, respectively. Compared to
traditional machine learning algorithms, the IG-DNN model performs better
in precision, recall and F-measure. This promising result demonstrates the
effectiveness of IG-DNN in attacker privilege classification. Moreover, it indicates
that deep learning can be used successfully for privilege determination in order
to generate attack graphs in an automated way.
188 H. Liu and B. Li
1. Aksu, M.U., Bicakci, K., Dilek, M.H., Ozbayoglu, A.M., et al.: Automated genera-
tion of attack graphs using NVD. In: Proceedings of the Eighth ACM Conference
on Data and Application Security and Privacy, pp. 135–142. ACM (2018)
2. Aksu, M.U., Dilek, M.H., Tatlı, E.İ., Bicakci, K., Dirik, H.I., Demirezen, M.U.,
Aykır, T.: A quantitative CVSS-based cyber security risk assessment methodology
for it systems. In: 2017 International Carnahan Conference on Security Technology
(ICCST), pp. 1–8. IEEE (2017)
3. Cheikes, B.A., Cheikes, B.A., Kent, K.A., Waltermire, D.: Common platform enu-
meration: naming specification version 2.3. US Department of Commerce, National
Institute of Standards and Technology (2011)
4. Conneau, A., Schwenk, H., Barrault, L., Lecun, Y.: Very deep convolutional net-
works for text classification. arXiv preprint arXiv:1606.01781 (2016)
5. Gray, R.M.: Entropy and Information Theory. Springer, Heidelberg (2011).
6. Hassan, A., Mahmood, A.: Convolutional recurrent deep learning model for sen-
tence classification. IEEE Access 6, 13949–13957 (2018)
7. Ingols, K., Lippmann, R., Piwowarski, K.: Practical attack graph generation for
network defense. In: 2006 22nd Annual Computer Security Applications Conference
(ACSAC 2006), pp. 121–130. IEEE (2006)
8. Jajodia, S., Noel, S., Oberry, B.: Topological analysis of network attack vulnera-
bility. In: Kumar, V., Srivastava, J., Lazarevic, A. (eds.) Managing Cyber Threats,
pp. 247–266. Springer, Heidelberg (2005).
9. Kaynar, K.: A taxonomy for attack graph generation and usage in network security.
J. Inf. Secur. Appl. 29, 27–56 (2016)
10. Kim, Y.: Convolutional neural networks for sentence classification. arXiv preprint
arXiv:1408.5882 (2014)
11. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint
arXiv:1412.6980 (2014)
12. Lippmann, R.P., Ingols, K.W., Piwowarski, K.J.: Generating a multiple-
prerequisite attack graph, 17 May 2016. US Patent 9,344,444
13. Loper, E., Bird, S.: NLTK: the natural language toolkit. arXiv preprint cs/0205028
14. Loria, S., Keen, P., Honnibal, M., Yankovsky, R., Karesh, D., Dempsey, E.,
et al.: Textblob: simplified text processing. Simplified Text Processing, Secondary
TextBlob (2014)
15. Mell, P., Scarfone, K., Romanosky, S.: A complete guide to the common vulnerabil-
ity scoring system version 2.0. In: Published by FIRST-Forum of Incident Response
and Security Teams, vol. 1, p. 23 (2007)
16. Qiu, H., Kapusta, K., Lu, Z., Qiu, M., Memmi, G.: All-or-nothing data protection
for ubiquitous communication: Challenges and perspectives. Information Sciences
17. Salahi, A., Ansarinia, M.: Predicting network attacks using ontology-driven infer-
ence. arXiv preprint arXiv:1304.0913 (2013)
18. Silver, D., et al.: Mastering the game of go with deep neural networks and tree
search. Nature 529(7587), 484 (2016)
19. Singhal, A., Ou, X.: Security risk analysis of enterprise networks using probabilistic
attack graphs. In: Network Security Metrics, pp. 53–73. Springer, Cham (2017)
20. Team C: Common vulnerability scoring system V3. 0: specification document. (2015)
21. Wang, H., Chen, Z., Zhao, J., Di, X., Liu, D.: A vulnerability assessment method
in industrial internet of things based on attack graph and maximum flow. IEEE
Access 6, 8599–8609 (2018)
NnD: Shallow Neural Network
Based Collision Decoding
in IoT Communications
1 Introduction
IoT has been widely used in a variety of application scenarios, e.g., intelligent
warehouses [4], environment monitoring [5] and smart buildings [6]. With the
prosperity of IoT [10,11], concurrent transmissions become frequent, resulting
in severe collisions and low throughput.
In this paper, we focus on a typical wireless technology, ZigBee [1], which
is a low-power wireless protocol designed for IoT applications and based on the IEEE
802.15.4 standard. In ZigBee-based networks adopting a star, tree, or mesh topology,
concurrent transmissions appear frequently, in which a receiver (RX)
© Springer Nature Switzerland AG 2019
M. Qiu (Ed.): SmartCom 2019, LNCS 11910, pp. 190–199, 2019.
[Figure: overlapped chips of packets A and B partitioned into segments; segment labels AB: 11, 00, 10, 01]
2 Background
ZigBee is a low-power wireless protocol based on the IEEE 802.15.4 standard. The
IEEE 802.15.4 PHY layer supports three ISM bands: 868.3 MHz in Europe, 902–928 MHz
in America, and 2.4 GHz worldwide. ZigBee in the 2.4 GHz ISM band
is the most widely used, and its associated bandwidth is 2 MHz. A standard PHY
Protocol Data Unit (PPDU) of the ZigBee physical layer consists of three parts:
the Synchronization Header (SHR), the PHY Header (PHR) containing frame length
information, and the PHY Service Data Unit (PSDU), a variable-length payload.
The 32-bit preamble is a part of the SHR and consists of 32 zeros.
The PPDU in a TX goes through several processes before being sent to an RX,
and the RX decodes data by the inverse procedure. We concentrate on the
(de)modulation related to our design. First, binary data from the PPDU is divided
into 4-bit symbols. After that, each 4-bit symbol is spread
to a 32-bit chip sequence, which is known as Direct Sequence Spread Spectrum
(DSSS). Next, the chip sequence is modulated into I and Q phases according to Offset
Quadrature Phase Shift Keying (OQPSK). Thus, the amplitude of chips from the
same packet is uniform. Then pulse shaping is performed based on the chips' values,
and the chips in one packet are shaped as half-sine waveforms of the same amplitude.
[Fig. 2 diagram labels: Training Set; Initializing; Fine Tuning; Decode A; Decode B; Decoded packets]
training set from history packets and the current collided packet; (ii) a training
strategy for initializing and fine-tuning the NN; (iii) decoding, i.e., predicting the
values of unknown overlapped chips. The working diagram of nnD is shown in Fig. 2.
Training Set Collection. First, we collect an abundant training set by making full
use of known chips. We observe that the overlapping states of chips from different
packets are limited, because the time offset between packets from different TXs is
fixed. Time offsets between packets can be detected and calculated by
correlation with the preamble's chip sequence [8]. After the time offset is obtained, the
original overlapped packets can be partitioned into basic segments as shown in
Figs. 1 and 5. As a result, each segment is the overlap of single chips
from different packets, and the samples of each single chip come from the same chip
cycle in a packet. Thus, each segment can be labeled with a definite bit sequence
according to the values of the chips that belong to this segment, as shown in Figs. 1 and 5.
For a collision of 2 or 3 packets there are 4 or 8 kinds of labels,
respectively. Generally, there are $2^n$ kinds of labels for an n-packet collision.
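The segment partition for a two-packet collision can be sketched as follows. This is a minimal illustration under stated assumptions: packet B starts a known number of samples after packet A, both use the same number of samples per chip, and the label is the chip value of A followed by that of B. The function name and data layout are hypothetical.

```python
def label_segments(chips_a, chips_b, offset, spc=32):
    """Partition the overlap of two packets into labelled segments.

    chips_a, chips_b: lists of chip values (0 or 1) for packets A and B.
    offset: number of samples by which B starts after A (assumed >= 0).
    spc: samples per chip.
    Returns a list of (start_sample, end_sample, label), where the label is
    the chip of A followed by the chip of B, e.g. "10".
    """
    end_a = len(chips_a) * spc
    end_b = offset + len(chips_b) * spc
    lo, hi = offset, min(end_a, end_b)  # region where both packets are present
    if lo >= hi:
        return []
    # A segment boundary occurs wherever either packet moves to its next chip.
    bounds = {lo, hi}
    bounds |= {t for t in range(0, end_a + 1, spc) if lo < t < hi}
    bounds |= {t for t in range(offset, end_b + 1, spc) if lo < t < hi}
    bounds = sorted(bounds)
    segments = []
    for start, end in zip(bounds, bounds[1:]):
        a = chips_a[start // spc]
        b = chips_b[(start - offset) // spc]
        segments.append((start, end, f"{a}{b}"))
    return segments
```

With n packets the same boundary-merging idea applies and the label grows to n characters, which gives the $2^n$ possible labels mentioned above.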
Chips with the same label are nearly identical, since chips of the same value
from one packet have the same waveform and amplitude. Considering a specific
chip from a single packet, the multiple samples forming a half-sine pulse can
be expressed as:
$$X[i] = c \cdot \alpha \cdot \sin(\pi f t) \tag{1}$$
where X[i] is the value of the i-th sample, c is the sign indicator (positive for chip
‘1’ and negative for chip ‘0’), and α is the amplitude, which depends on TX power
and distance. Besides, recently received history packets are also available and
effective when the transmission power of the TXs is fixed and the channel is considered
to be relatively stable in the short term [3]. To collect an adequate training set, for
one thing, we take the collision-free chips from both the current packet and history
packets into consideration. We manually overlay these chips under various time
194 Z. Wang et al.
where X and Y are collections of N training samples and labels, and $y_i$ and $\hat{y}_i$ are the
ground truth and predicted result, which are 1 × n vectors as shown in Fig. 3.
The performance of the neural network is measured by bit error ratio (BER) and
packet reception ratio (PRR) rather than conventional classification accuracy.
To maintain high classification precision and low computing cost, we
introduce an adaptive NN structure based on the number of collided packets and the
signal to noise ratio (SNR). In two- and three-packet collision scenarios, we train neural
networks with two and three hidden layers, respectively, each with 32 neurons. The NN
architecture is shown in Fig. 3. The input vector length equals the number of
samples per chip (SPC) after zero-padding. The output depends on the
number of collided chips.
To keep the training set sufficient while avoiding
overfitting, a sliding window strategy is employed for updating the training set. We retain
known labeled data from 64 history packets and update the training set when the RX
receives a new collided packet. Whenever a packet arrives, the neural network
is retrained using current and history data. As a result, high classification
accuracy can be achieved. Besides, because the neural network architecture is very simple
and the number of weights and biases is limited, the time and computing
overhead of the training and decoding procedures is tolerable. In general, the design
of nnD is practical and is able to decode collided packets accurately.
Packet Decoding. Last, the trained NN is used to predict the labels of unknown
overlapped chips. We maintain an NN for each received packet in a collision. The
decoding results of multiple packets can be cross-validated for higher accuracy.
[Figure: available collision-free and overlay chips in preamble; segment labels ABC: 111, 000, 100, 011, 110, 001, 010, 101]
4 Simulation
To evaluate the performance of nnD, we conduct extensive trace-driven simulations
of two- and three-packet collision decoding.
We first collect a large number of collided packets in two- and three-packet collisions. Our
data collection platform is based on several USRP N210 devices with SBX
daughterboards, as shown in Fig. 7. Our trace-driven simulation is based on the Matlab
platform. We implement the state-of-the-art mZig [8] for comparison, since
conventional ZigBee cannot decode packets in a collision. The objective of our
simulation is to validate the capability of nnD from different perspectives. Since nnD
is a PHY layer design, the bit error rate (BER), chip error rate (CER) and packet
reception ratio (PRR) of the PHY layer are used to evaluate the performance of
nnD, mZig and traditional ZigBee under different signal-to-noise ratio (SNR)
levels. We set a BER threshold below which a packet is considered correctly
decoded; the threshold is $10^{-3}$ in our simulation.
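The two packet-level metrics can be computed as in the following minimal sketch. It assumes that sent and decoded payloads are available as equal-length bit sequences and that a packet counts as received when its BER is strictly below the threshold; the function names are illustrative.

```python
def bit_error_rate(sent, received):
    """Fraction of bit positions where the decoded bit differs from the sent bit."""
    if len(sent) != len(received):
        raise ValueError("bit sequences must have the same length")
    if not sent:
        raise ValueError("bit sequences must not be empty")
    errors = sum(s != r for s, r in zip(sent, received))
    return errors / len(sent)

def packet_reception_ratio(packets, threshold=1e-3):
    """Share of packets whose BER is below the threshold.

    packets: list of (sent_bits, received_bits) pairs.
    """
    ok = sum(bit_error_rate(s, r) < threshold for s, r in packets)
    return ok / len(packets)
```

For example, a 2000-bit packet with one wrong bit has a BER of 5 × 10^-4 and still counts as received, whereas four wrong bits (2 × 10^-3) do not.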
In our simulation, we set the sampling rate at the RX to 32 Msamples/s, i.e.,
32 sampling points per chip cycle. We collect known labeled segments of the
maximum length as the training set. The labeled segments are normalized to a
length of 32 by zero-padding.
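A minimal sketch of the length normalization is given below. The helper name `pad_segment` is hypothetical, and appending the zeros at the end of the segment is an assumption, since the text does not say where the zeros are placed.

```python
def pad_segment(segment, length=32):
    """Right-pad a labeled segment with zeros up to `length` samples."""
    if len(segment) > length:
        raise ValueError("segment is longer than the target length")
    return list(segment) + [0.0] * (length - len(segment))
```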
In Fig. 8(a), (c) and (e), we show the CER, BER and PRR of two-packet
collisions under different SNRs. nnD achieves a low CER and BER, i.e., less than
$10^{-2}$, even under low SNR. The PRR of nnD remains relatively stable across
SNRs at more than 90%, while mZig is more susceptible to noise.
We also extend nnD to three-packet collision decoding. In three-packet collision
scenarios, we implement an NN with three layers and the same number of neurons
as in the two-packet case. As shown in Fig. 8, nnD also achieves a lower BER and
a higher PRR than mZig under different SNRs, which demonstrates that
nnD has good scalability and superior performance.
We repeat the experiments many times in each case, e.g., two hundred
repetitions under each SNR. The simulation results show that nnD significantly
outperforms existing methods in terms of BER, PRR and the number of
concurrent transmissions.
Fig. 8. CER, BER and PRR of two- (left) and three- (right) packet collision under
different SNR. Each panel plots the metric against the signal-to-noise ratio (dB) for
nnD, mZig and ZigBee.
5 Related Works
6 Conclusion
We present a new design based on a shallow neural network which can resolve
multi-packet collisions directly. The original multi-packet decomposition problem
is translated into a classification procedure, and collided packets can be decoded
by a trained network. We conduct extensive simulations to evaluate the
performance of nnD. The results demonstrate that nnD significantly outperforms
existing methods in terms of bit error rate and the number of concurrent
transmissions.
Acknowledgements. This work was supported in part by the National Key R&D
Program of China 2018YFB1004703, NSFC grant 61972253, 61672349, 61672353.
1. Alliance, Z.: Introduction to Zigbee (2018).
2. Gollakota, S., Katabi, D.: Zigzag decoding: combating hidden terminals in wireless
networks. In: ACM SIGCOMM (2008)
3. Halperin, D., Hu, W., Sheth, A., Wetherall, D.: Predictable 802.11 packet delivery
from wireless channel measurements. In: ACM SIGCOMM (2010)
4. Jabbar, S., Khan, M., Silva, B.N., Han, K.: A REST-based industrial web of things’
framework for smart warehousing. J. Supercomput. 74, 4419–4433 (2018)
5. Justino, C., Duarte, A., Rocha-Santos, T.: Recent progress in biosensors for envi-
ronmental monitoring: a review. Sensors 17, 2918 (2017)
6. Kelly, S.D.T., Suryadevara, N.K., Mukhopadhyay, S.C.: Towards the implementa-
tion of IoT for environmental condition monitoring in homes. IEEE Sens. J. 13,
3846–3853 (2013)
7. Kleinrock, L., Tobagi, F.: Packet switching in radio channels: part i - carrier sense
multiple-access modes and their throughput-delay characteristics. IEEE Trans.
Commun. 23, 1400–1416 (1975)
8. Kong, L., Liu, X.: mZig: enabling multi-packet reception in ZigBee. In: ACM
MOBICOM (2015)
9. Laufer, R., Kleinrock, L.: The capacity of wireless CSMA/CA networks.
IEEE/ACM Trans. Netw. 24, 1518–1532 (2016)
10. Liu, Y., Yang, C., Jiang, L., Xie, S., Zhang, Y.: Intelligent edge computing for
IoT-based energy management in smart cities. IEEE Netw. 33, 111–117 (2019)
11. Ronen, E., Shamir, A., Weingarten, A.O., Flynn, C.O.: IoT goes nuclear: creating
a ZigBee chain reaction. In: IEEE S&P (2017)
12. Sobrinho, J.L., de Haan, R., Brazio, J.M.: Why RTS-CTS is not your ideal wireless
LAN multiple access protocol. In: IEEE WCNC (2005)
13. Tobagi, F., Kleinrock, L.: Packet switching in radio channels: Part II - the hidden
terminal problem in carrier sense multiple-access and the busy-tone solution. IEEE
Trans. Commun. 23, 1417–1433 (1975)
14. Ziouva, E., Antonakopoulos, T.: CSMA/CA performance under high traffic condi-
tions: throughput and delay analysis. Computer Communications (2002)
Subordinate Relationship Discovery Method
Based on Directed Link Prediction
He Nai¹, Min Lin², Hao Jiang¹(✉), Huifang Liu², and Haining Ye²
¹ School of Electronic Information, Wuhan University, Wuhan, China
² China Unicom Group Co., Ltd., Guangdong Branch, Guangzhou, China
1 Introduction
A subordinate relationship is the relationship between a superior and a subordinate. It
exists within an enterprise, where a subordinate and the immediate supervisor share
certain interests or benefits. A good leader-member relationship in an enterprise can
enhance the organizational loyalty of lower-level employees, thereby improving the
organizational commitment of subordinates and reducing their turnover intention.
In mobile data mining, using the call records and online logs of employees to
build complex networks and discover leader-member relationships is of great
significance for identifying the managers and the management model within an enterprise.
Among relationship discovery methods for mobile data, community discovery has
always been a common approach. Community discovery methods include algorithms based
on network topology and algorithms based on semantic clustering. Algorithms based on
network topology divide the community structure only according to the external link
structure, and ignore the users' online behavior and attribute characteristics.
Algorithms based on semantic clustering combine the strength of users' relationships
with attribute information to discover semantic communities. Their advantage is that the
community mining results are more accurate and more cohesive, and more suitable for
discovering overlapping community structures.
However, methods based on community discovery only judge whether
users are in the same community by the similarity of their attributes and online
behavior. They can only judge whether users have a certain association, and cannot find
the level of the users. In contrast, directed link prediction on the network yields
relationships between nodes with a direction, and this direction can express the
subordinate relationship.
Link prediction methods are mainly divided into two categories: content-based and
graph-based. In both categories, a higher metric value means that a vertex pair is more
likely to be connected, and vice versa. Content-based measures employ the attributes of
vertices and links. Graph-based measures use the topological features of
complex networks to perform link prediction. Graph-based link prediction can be
divided into three categories: neighbor-based, path-based and pattern-based.
Neighbor-based link prediction methods compute a score from the
common neighbors of a vertex pair: the more the two vertices have in common, the
more likely they are to be connected. The method proposed by Adamic
and Adar [1], the Salton Index [2] and the Hub Promoted Index [3] belong to the
neighbor-based measures. Path-based measures consider the transition probabilities of
paths through the edges between vertices and of random walks from a vertex to its
neighbors. Some studies of pattern-based measures are described in [2, 3]. Due to the
difficulty of acquiring content-based information, most link prediction methods focus
on graph-based measures.
Traditional graph-based link prediction measures have been systematically analyzed
in unsupervised link prediction models [4]. In unsupervised link prediction models, the
scores of all unlinked vertex pairs are calculated from predetermined link prediction
metrics. After that, the vertex pairs with the highest scores are predicted to be connected.
These methods have the inherent drawbacks of unsupervised learning. To
overcome these drawbacks, supervised learning models were proposed to predict links.
Most content-based and graph-based link prediction methods are supervised. Since
supervised learning uses a large amount of data to train the model and is more
in line with real-life situations, supervised link prediction is
more accurate than unsupervised link prediction. In link prediction,
the most commonly used supervised learning models are classification,
probabilistic, matrix factorization and graph kernel based models [5]. Hasan et al. [3]
proposed a classification-based link prediction method that uses content-based and
graph-based features, such as the keyword matching count of vertices, the shortest
distance between vertex pairs, and the clustering index of vertices. It uses very few
computing resources, applies classification to link prediction efficiently, and achieves
good results. Backstrom and Leskovec use a transition probability on vertices to
develop a supervised link prediction algorithm based on a random walk model. The
proposed algorithm learns edge strengths by effectively combining structural
information with vertex and edge attributes. The key idea of this study is that a random
walker is more likely to visit the vertices to which links will be formed. Dai et al. [10]
proposed a nonnegative matrix factorization based link prediction algorithm to predict
links in multi-relational networks, using similarity and influence among distinct
types of links. Brouard et al. [7] formulated the link prediction problem as an output
kernel learning problem and presented a semi-supervised link prediction approach
based on output kernel regression.
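The unsupervised procedure described above (score every unlinked vertex pair, then predict the top-scoring pairs) can be sketched in a few lines. This is an illustration only: the function name `rank_unlinked_pairs`, the adjacency-set representation and the choice of the common-neighbor count as the predetermined metric are assumptions, not details taken from the cited works.

```python
from itertools import combinations


def rank_unlinked_pairs(adj, top_k=3):
    """Rank unlinked vertex pairs of an undirected graph by common neighbors.

    adj maps each vertex to the set of its neighbors.
    Returns up to top_k tuples (u, v, score), highest score first.
    """
    scored = []
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked, nothing to predict
        scored.append((len(adj[u] & adj[v]), u, v))
    # Sort by descending score; break ties by vertex names for determinism.
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    return [(u, v, score) for score, u, v in scored[:top_k]]
```

Any other similarity metric could replace the common-neighbor count in the scoring line. A supervised model would instead use such scores as input features of a classifier.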
Many of the link prediction methods described in the literature are applied to
undirected networks. Neighbor-based link prediction measures do not use the direction
of edges in directed social networks. Path-based measures cannot effectively cover local
topological structures when the target vertex cannot be reached from the source
vertex in a directed network. A small number of studies described in the literature have
considered the direction of links for effective link prediction. For instance, Schall [4]
presented an unsupervised link prediction method using directional graph patterns. He
introduced a pattern-based measure named Triadic Closeness (TC). TC is based on the
ratio of the number of closed triads that match the pattern of a given vertex pair
to the number of possible closed triads. Lichtenwalter and Chawla [2] defined a new
concept named vertex collocation profile (VCP) for link prediction and analysis. A VCP
is a vector of all possible subgraphs containing the vertex pair to be predicted,
with n vertices in total over r edges. VCP can represent rich graph-based structures
such as directed and multi-relational edges, as well as additional information such as
the weight and temporal information of edges. Behfar et al. [5] analyzed link formation
mechanisms on inlinks and outlinks. They put forward the idea that inlinks and outlinks
have distinct link formation mechanisms because they have different degree
distributions in directed networks. They concluded that inlink formation follows a
power-law distribution, while outlink formation does not follow a heavy-tailed degree
distribution. Shang et al. [5] introduced a new directional link prediction measure to
reveal the different roles of one-directional and bi-directional links in the link
formation mechanism. They concluded that vertex pairs linked by a bi-directional edge
are more likely to be linked to common neighbors than vertex pairs linked by a
one-directional edge. Aghabozorgi and Khayyambashi [2] proposed a new link prediction
measure based on triad network patterns with distinct collocations of directed edges.
The proposed measure was employed in classification models to predict links in social
networks. Zhao et al. [8] developed a new link prediction method for ranking potential
links by using the topological structure of the network and edge covariate information
for both directed and undirected networks. It assumes that if the endpoints of a vertex
pair are similar, they are more likely to be linked in directed networks. Ding and Li [2]
proposed a probabilistic method for reconstructing the topological structure of a
directed weighted network to predict trade actions in world trade networks. Wang et al.
[8] extended the study of Zhou et al. [7] to predict link direction in directed
networks. They added an extra ground vertex to the original network to take as much
advantage of the topological structure as possible. Guo et al. [13] developed a new link
prediction method based on the ranking of vertices to predict link directions in
directed networks. They assumed that links are mostly established from lower-ranked
vertices to higher-ranked vertices, and proposed a recursive subgraph-based ranking
algorithm that combines local and global structures to rank vertices.
In this paper, we extend neighbor-based measures into pattern-based measures based on
directed edge structures to improve the accuracy of link prediction in directed
networks. This work generalizes our previous work [9] by extending all traditional
undirected neighbor-based similarity measures into directional pattern-based measures.
The proposed method also takes into account the weight and temporal information of
edges. The works described in [12] suggest that weighted and temporal link prediction
measures can improve the accuracy of link prediction. So, we extend traditional
neighbor-based measures and the
2 Related Work
The traditional approach to link prediction is based on neighbor nodes. A number
of neighbor-based link prediction measures are described in the literature. The
Adamic-Adar Coefficient (AA) was first proposed to determine whether two given web
pages are strongly related or similar. This measure was later adopted to predict links
in social networks. In terms of link prediction, common neighbors that have fewer
neighbors of their own are weighted more heavily by the AA measure, so vertex pairs
whose common neighbors have fewer relations have a higher probability of being
connected. Common Neighbors (CN) is also one of the
basic link prediction measures because of the simplicity of its calculation. Vertex pairs
sharing more common neighbors are more likely to be linked with respect to the CN
measure. Newman's study shows that there is a correlation between the number of
common neighbors and future collaborations among scientists in collaboration networks.
Jaccard's Coefficient (JC) assumes that two vertices are more likely to be linked
when they share more common neighbors in proportion to their total number of
neighbors. The Sørensen Index was proposed to establish groups of equal amplitude in
plant sociology based on the similarity of species. It is also used to calculate
similarities of vertices in complex networks. It is determined by the common neighbors
of a vertex pair relative to the sum of their individual degrees. The Hub Promoted
Index (HP) assigns higher scores to vertex pairs adjacent to hub vertices. Hub vertices
direct vertices with low degree to central vertices with high degree. The Hub Depressed
Index (HD) is similar to the HP measure, but it is determined by the higher degree: any
vertex which has a high degree is penalized by this measure. The Leicht–Holme–Newman
Index (LHN) gives a higher score to vertex pairs having more common neighbors in
proportion to their expected number of neighbors. The Resource Allocation Index (RA) is
similar to the AA measure. However, it produces a lower score than the AA measure for
vertex pairs whose common neighbors have a high vertex degree. The Salton Index (SA),
proposed by Salton and McGill, is based on cosine similarity, which is the most widely
used similarity measurement.
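For reference, the neighbor-based measures listed above can be written compactly in code. The sketch below uses the standard textbook definitions of these indices for an undirected, unweighted graph, which are an assumption here because this section describes them only in words. The function name `neighbor_scores` and the adjacency-set input are hypothetical.

```python
import math


def neighbor_scores(adj, x, y):
    """Neighbor-based similarity scores for the vertex pair (x, y).

    adj maps each vertex to the set of its neighbors (undirected graph).
    """
    gx, gy = adj[x], adj[y]
    common = gx & gy
    kx, ky = len(gx), len(gy)
    cn = len(common)
    return {
        "CN": cn,                                              # common neighbors
        "JC": cn / len(gx | gy) if gx | gy else 0.0,           # Jaccard
        "SA": cn / math.sqrt(kx * ky) if kx and ky else 0.0,   # Salton (cosine)
        "SI": 2 * cn / (kx + ky) if kx + ky else 0.0,          # Sorensen
        "HP": cn / min(kx, ky) if kx and ky else 0.0,          # hub promoted
        "HD": cn / max(kx, ky) if max(kx, ky) else 0.0,        # hub depressed
        "LHN": cn / (kx * ky) if kx and ky else 0.0,           # Leicht-Holme-Newman
        # A common neighbor is adjacent to both x and y, so its degree is >= 2
        # and log(degree) is strictly positive.
        "AA": sum(1.0 / math.log(len(adj[z])) for z in common),
        "RA": sum(1.0 / len(adj[z]) for z in common),
    }
```

Because $1/k \le 1/\log k$ for every degree $k \ge 2$, the RA score never exceeds the AA score, and the gap grows with the degree of the common neighbors, which matches the comparison made in the text.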
In this section, we introduce the traditional methods of link prediction, build a
general model of link prediction, and then propose our weighted directional link
prediction method. Link prediction methods based on topological similarity have been
studied in depth for a long time. Among them, methods based on local, quasi-local and
global prediction indices are the most typical. Due to their high accuracy and low
complexity, quasi-local index-based link prediction methods are the most widely used.
In the next subsection, we present our directional weighted link prediction method
based on this approach.
We first establish a general model of link prediction. The link prediction methods in
this paper are implemented on a directed graph $G(V, E)$, where $V$ and $E$ denote the
sets of vertices and edges, respectively. $\Gamma(v_x)$ denotes the set of neighbors of a
vertex $v_x \in V$, i.e., the vertices connected to $v_x$ by in or out edges.
$w(v_x, v_z)$ denotes the weight of the edge from $v_x$ to $v_z$. $k_{v_x}$ denotes the
degree of vertex $v_x$, i.e., the number of neighbors of $v_x$.
The CN measure is one of the most widely adopted metrics in link prediction,
mainly for its simplicity. It is also intuitive, because a high number
of common neighbors is expected to make future contact between two nodes easier. To
take edge weights into account, the weighted CN (WCN) measure is defined as:
$$WCN(v_x, v_y) = \frac{\sum_{v_z \in \Gamma(v_x) \cap \Gamma(v_y)} \left[ w(v_x, v_z) + w(v_y, v_z) \right]}{\left| \Gamma(v_x) \cap \Gamma(v_y) \right|} \qquad (2)$$
The JC measure is well explored in data mining. It takes higher values for pairs
of nodes that share a higher proportion of common neighbors relative to the total
number of neighbors they have. For unweighted networks, the JC measure is defined as:
$$JC(v_x, v_y) = \frac{\left| \Gamma(v_x) \cap \Gamma(v_y) \right|}{\left| \Gamma(v_x) \cup \Gamma(v_y) \right|} \qquad (3)$$
To take edge weights into account, the JC coefficient can be expressed as:
$$WJC(v_x, v_y) = \frac{\sum_{v_z \in \Gamma(v_x) \cap \Gamma(v_y)} \left[ w(v_x, v_z) + w(v_y, v_z) \right]}{\sum_{v'_z \in \Gamma(v_x)} w(v_x, v'_z) + \sum_{v''_z \in \Gamma(v_y)} w(v_y, v''_z)} \qquad (4)$$
The PA measure assumes that the probability that a new link is created from a node
$v_x$ is proportional to the node degree $k_{v_x}$ (that is, nodes that currently have a
high number of relationships tend to create more links in the future). It has also
been proposed that the probability of a future link between a pair of nodes can be
expressed by the product of their numbers of collaborators. For unweighted networks,
the PA measure is given by:
$$AA(v_x, v_y) = \sum_{v_z \in \Gamma(v_x) \cap \Gamma(v_y)} \frac{1}{\log \left| \Gamma(v_z) \right|} \qquad (7)$$
Adamic and Adar formulated this metric, which is related to Jaccard's coefficient. It
assigns higher importance to common neighbors that have fewer neighbors. Hence, it
measures the strength of the relationship between a common neighbor and the evaluated
pair of nodes. The AA measure is extended to weighted networks as:
$$WAA(v_x, v_y) = \sum_{v_z \in \Gamma(v_x) \cap \Gamma(v_y)} \frac{w(v_x, v_z) + w(v_y, v_z)}{\log \left( 1 + \sum_{v_c \in \Gamma(v_z)} w(v_z, v_c) \right)} \qquad (8)$$
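The weighted measures can be evaluated numerically with a short sketch. The code is illustrative and rests on several assumptions: the graph is given as a nested dictionary `w[u][v]` with symmetric weights, the function names are hypothetical, and WCN is normalized by the number of common neighbors, which is one reading of Eq. (2); dropping that division gives the plain weighted sum. WPA is omitted because its formula is not available in this section.

```python
import math


def strength(w, v):
    """Sum of the weights of all edges incident to v."""
    return sum(w[v].values())


def weighted_scores(w, x, y):
    """WCN, WJC and WAA for the vertex pair (x, y).

    w[u][v] is the weight of the edge between u and v (assumed symmetric).
    """
    common = set(w[x]) & set(w[y])
    # Per common neighbor z: w(x, z) + w(y, z), the shared numerator term.
    pair_sum = {z: w[x][z] + w[y][z] for z in common}
    total = sum(pair_sum.values())

    wcn = total / len(common) if common else 0.0
    denom = strength(w, x) + strength(w, y)
    wjc = total / denom if denom else 0.0
    waa = sum(s / math.log(1.0 + strength(w, z)) for z, s in pair_sum.items())
    return {"WCN": wcn, "WJC": wjc, "WAA": waa}
```

For a pair with common neighbors `a` and `b`, where the edge weights to `a` are 1 and 3 and those to `b` are 2 and 4, the shared numerator is $(1+3)+(2+4)=10$, so this reading of WCN gives $10/2 = 5$.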
In an OTP$(v_x, v_y, v_z)$, vertices $v_x$ and $v_y$ are not connected, and vertex $v_z$
is a common neighbor of $v_x$ and $v_y$. We use the OTP$(v_x, v_y, v_z)$ substructure to
predict whether there is an edge between vertices $v_x$ and $v_y$.
As can be seen from the above model, each vertex $v_z \in \Gamma(v_x) \cap \Gamma(v_y)$
forms an OTP with vertices $v_x$ and $v_y$. In undirected
networks there is only one OTP type, shown in Fig. 1, and we only need to judge whether
a vertex pair $v_x$ and $v_y$ is connected. In directed networks, however, different
types of OTP are possible. Directional OTP structures have also been used in
other works on directed networks. In this study, we use directed OTP structures in
supervised learning algorithms to improve the accuracy of link prediction in weighted,
directed networks. The next part presents all OTP types with distinct
directional edges for two vertices, together with our directional link prediction measure.
Fig. 2. (a), (b) and (c) are several forms of OTP$(v_x, v_y, v_z)$ with directional edges; (d) is the only
possible form of OTP$(v_x, v_y, v_z)$ with undirected edges.
Existing link prediction measures do not distinguish between the distinct OTPs of a
given pair of vertices $(v_x, v_y)$ in directed networks. For example, they
use only the one OTP$(v_x, v_y, v_z)$ shown in Fig. 2(d) and compute the same score for
the different OTP$(v_x, v_y, v_z)$ in Fig. 2(a–c). However, attention flows differently in
Fig. 2(a–c). The vertex $v_z$ has attracted the attention of vertices $v_x$ and $v_y$ in
Fig. 2(a), whereas the reverse holds in Fig. 2(c). In Fig. 2(b), the vertex $v_x$ has
directly attracted the attention of the vertex $v_z$ and indirectly interested the vertex
$v_y$. These situations have different effects on link formation. Traditional undirected
neighbor-based link prediction measures might be adapted to directed networks by taking
into account only out edges or only in edges. However, this approach has the
disadvantage that it cannot use both in and out edges of the same local structure.
Another possible approach, adopted for the undirected link prediction scores in Table 1,
is to use both in and out edges in the calculation. Nevertheless, such measures cannot
effectively capture the link formation mechanisms in directional OTPs. To address this
issue, we extend link prediction measures to be pattern-based by taking advantage of
edge directions. All of the possible OTP types in directed networks are shown in Fig. 3.
The general definition of the proposed extended measure, named directed link prediction
(DLP), is expressed as follows:
$$DLP(v_x, v_y) = \{ score_{OTP_1}, score_{OTP_2}, score_{OTP_3}, \ldots, score_{OTP_9} \} \qquad (9)$$
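A pattern-based score vector of this kind can be sketched as follows. The sketch rests on assumptions that are not confirmed by the text: each of the two edges of an OTP (between $v_x$ and $v_z$, and between $v_y$ and $v_z$) is taken to be outgoing, incoming or bidirectional, which gives $3 \times 3 = 9$ pattern types; the ordering of the nine entries is arbitrary and need not match Fig. 3; and each entry is a plain count of common neighbors rather than a weighted score. All names are hypothetical.

```python
OUT, IN, BOTH = 0, 1, 2  # direction of the edge between an endpoint and z


def direction(edges, u, z):
    """Direction of the edge between u and z as seen from u, or None."""
    uz, zu = (u, z) in edges, (z, u) in edges
    if uz and zu:
        return BOTH
    if uz:
        return OUT
    if zu:
        return IN
    return None


def dlp_vector(edges, x, y):
    """Count the common neighbors of (x, y) per directed pattern type.

    edges is a set of directed (source, target) pairs.
    Entry 3 * dx + dy counts neighbors z whose edge to x has direction dx
    and whose edge to y has direction dy.
    """
    vertices = {v for edge in edges for v in edge}
    scores = [0] * 9
    for z in vertices - {x, y}:
        dx, dy = direction(edges, x, z), direction(edges, y, z)
        if dx is None or dy is None:
            continue  # z is not a common neighbor
        scores[3 * dx + dy] += 1
    return scores
```

The entries always sum to the undirected common-neighbor count, so the vector refines the undirected score instead of replacing it. The same decomposition could be applied to any of the weighted measures by accumulating the per-neighbor term instead of 1.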
prediction. The strength of relationships between vertices is taken into account by using
weight information. The importance of weight information is demonstrated in the sample
network shown in Fig. 4. For example, while the weighted CN scores of the vertex pairs
$(v_x, v_{y1})$ and $(v_x, v_{y2})$ are 6 and 9, respectively, their unweighted score is 2
for both. The weighted CN scores show that vertex $v_{y2}$ pays stronger attention to
$v_x$ than $v_{y1}$ does.
Recent relationships between vertices can be more decisive for edge formation. This
is also illustrated by the sample network in Fig. 4. For instance, while the non-temporal
CN score of the vertex pairs $(v_x, v_{y1})$ and $(v_x, v_{y2})$ is 2, their temporal CN
scores are 5.193 and 6.804; $v_{y2}$ has more recent relationships close to $v_x$ than
$v_{y1}$, so it is rewarded with a higher score.
Table 1. Undirected link prediction (ULP) and directed link prediction (DLP) scores for
vertex pairs (vx, vy1) and (vx, vy2) in the sample network in Fig. 4.
Indicator  ULP (vx, vy1)  DLP (vx, vy1)                        ULP (vx, vy2)  DLP (vx, vy2)
WCN        7.38           (0, 0, 0, 0, 0, 0, 0, 3.8, 3.58)     16.81          (0, 0, 0, 8.77, 8.04, 0, 0, 0, 0)
WJC        0.46           (0, 0, 0, 0, 0, 0, 0, 0.24, 0.22)    0.63           (0, 0, 0, 0.33, 0.3, 0, 0, 0, 0)
WPA        2.06           (0, 0, 0, 0, 0, 0, 0, 1.06, 1)       1.36           (0, 0, 0, 0.71, 0.65, 0, 0, 0, 0)
WAA        4.08           (0, 0, 0, 0, 0, 0, 0, 2.08, 2)       7              (0, 0, 0, 3.6, 3.4, 0, 0, 0, 0)
4 Experiments
We use China Unicom’s mobile call records for a city in southern China. The data were
recorded over one month, from August 1, 2018 to August 31, 2018. We use the records from
August 1, 2018 to August 15, 2018 as the training set and the records from August 16,
2018 to August 31, 2018 as the test set to build the weighted directed graphs.
Based on app usage behavior, we built a profile for each user and selected the users
identified as office workers. We then use these users’ call records to build a weighted
directed graph. In this graph, the direction of an edge runs from the caller to the
callee in the call records. The weight of an edge is the total duration of the calls
between the two users in half a month. Because people in an enterprise usually use
other social software to communicate, a phone call is usually reserved for
emergencies and reflects a strong relationship, so the call network in the
enterprise can also reflect the hierarchical relationships of users in the enterprise.
In an emergency, the boss calls subordinates to complete the corresponding
tasks, whereas subordinates have fewer emergencies than the boss and therefore call
the boss less often. With this property, we can judge
the level of users in the enterprise by the directionality of the calls. We use directed
link prediction to determine the hidden relationships of the users in the graph, and
find the subordinate relationships of all users in the graph.
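The graph construction described above can be sketched as follows. The record layout `(caller, callee, duration)` and the function name are assumptions made for illustration, as is the choice to accumulate durations separately for each call direction; the real call records certainly contain more fields.

```python
from collections import defaultdict


def build_call_graph(records):
    """Build a weighted directed graph from call records.

    records is an iterable of (caller, callee, duration_seconds).
    Returns a dict mapping each directed edge (caller, callee) to the
    total call duration in that direction.
    """
    graph = defaultdict(float)
    for caller, callee, duration in records:
        if caller == callee:
            continue  # ignore malformed self-calls
        graph[(caller, callee)] += duration
    return dict(graph)
```

One such graph would be built from the records of the first half of the month and a second one from the records of the second half.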
Because the hierarchical relationships of users in an enterprise are private
information, we cannot obtain them as ground truth, so we can only
judge the effectiveness of our method by measuring the stability of this
relationship. Through link prediction, we observe whether the relationship is consistent
between the first half and the second half of the month, that is, whether the
predicted direction of the edge between a common vertex pair is the same in both graphs.
We use SVM and K-NN for supervised learning, and regard directed link prediction
as a multi-class classification problem over edge types. There are four types of edge
relationships: no edge, vertex $v_x$ points to vertex $v_y$, vertex $v_y$ points to
vertex $v_x$, and vertices $v_x$ and $v_y$ point to each other. If an edge has only one
direction, the user being pointed to is the subordinate; if an edge has both directions,
the two users are considered to be at the same level. The level judgment results
for the users in the graph are shown in Table 2. The accuracy rate in the table indicates
the probability that the user relationship is stable between the first half and the
second half of the month. The results show that the judgment
of the user relationship is stable, with an accuracy rate exceeding 75%.
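The four edge classes and the stability check can be sketched as below. This is a simplified illustration: it compares two given edge sets directly, whereas in the paper the edge directions come from the SVM and K-NN predictions. The function names and class labels are hypothetical.

```python
def edge_class(edges, u, v):
    """Classify the relationship between u and v into one of four types."""
    uv, vu = (u, v) in edges, (v, u) in edges
    if uv and vu:
        return "same_level"
    if uv:
        return "v_subordinate"  # u points to v, so v is the subordinate
    if vu:
        return "u_subordinate"  # v points to u, so u is the subordinate
    return "no_edge"


def stability_rate(first_half, second_half):
    """Share of common vertex pairs whose edge class is the same in both halves.

    Both arguments are sets of directed (source, target) edges.
    """
    def connected_pairs(edges):
        return {tuple(sorted(e)) for e in edges if e[0] != e[1]}

    common = connected_pairs(first_half) & connected_pairs(second_half)
    if not common:
        return 0.0
    stable = sum(
        1 for u, v in common
        if edge_class(first_half, u, v) == edge_class(second_half, u, v)
    )
    return stable / len(common)
```

Only pairs that are connected in both halves of the month enter the rate, which mirrors the restriction to common vertex pairs in the text.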
5 Result
This paper proposes a directed link prediction method to obtain the potential directional
relationships in a network and to judge the relationship between users in the network
through directed connections. We use the direction of the edges in the call graph to
determine the superior-subordinate relationship between enterprise users. If
a user is pointed to by an edge, that user is at the lower level of the two users. If
the users point to each other, the two users are at the same level. Because the
subordinate relationship cannot be obtained directly, we use the relationship recurrence
rate to verify the effectiveness of the method. The experiment shows that the proposed
directed link prediction method can discover the relationships of users, and that these
relationships are stable.
1. Dafre, S.F., de Rijke, M.: Discovering missing links in Wikipedia. In: Proceedings of the 3rd
International Workshop on Link Discovery, pp. 90–97 (2005)
2. De Sá, H.R., Prudêncio, R.B.C.: Supervised link prediction in weighted networks. In: The
2011 International Joint Conference on Neural Networks, IJCNN, pp. 2281–2288 (2011)
3. Huang, Z.: Link prediction based on graph topology: the predictive value of generalized
clustering coefficient, Available SSRN 1634014 (2010)
4. Li, X., Chen, H.: Recommendation as link prediction: a graph kernel-based machine learning
approach. In: Proceedings of the 9th ACM/IEEE-CS Joint Conference on Digital Libraries,
pp. 213–216 (2009)
5. Lü, L., Zhou, T.: Role of weak ties in link prediction of complex networks. In: Proceedings
of the 1st ACM International Workshop on Complex Networks Meet Information &
Knowledge Management, pp. 55–58 (2009)
6. Murata, T., Moriyasu, S.: Link prediction of social networks based on weighted proximity
measures. In: IEEE/WIC/ACM International Conference on Web Intelligence, pp. 85–88
7. Rossetti, G., Guidotti, R., Pennacchioli, D.: Interaction prediction in dynamic networks
exploiting community discovery. In: Proceedings of the 2015 IEEE/ACM International
Conference on Advances in Social Networks Analysis, pp. 553–558 (2015)
8. Xiang, R., Neville, J., Rogati, M.: Modeling relationship strength in online social networks.
In: Proceedings of the 19th International Conference on World Wide Web, pp. 981–990
9. Zhu, J., Hong, J., Hughes, J.G.: Using Markov models for web site link prediction. In:
Proceedings of the Thirteenth ACM Conference on Hypertext and Hypermedia, pp. 169–170
10. Wind, D.K., Morup, M.: Link prediction in weighted networks. In: 2012 IEEE International
Workshop on Machine Learning for Signal Processing, pp. 1–6 (2012)
11. Brouard, C., d’Alché-Buc, F., Szafranski, M.: Semi-supervised penalized output kernel
regression for link prediction. In: Proceedings of the 28th International Conference on
Machine Learning (ICML 2011) (2011)
12. Brzozowski, M.-J., Romero, D.M.: Who Should I Follow? Recommending People in
Directed Social Networks (2011)
13. Al Hasan, M., Chaoji, V., Salem, S., Zaki, M.: Link prediction using supervised learning. In:
SDM 2006: Workshop on Link Analysis, Counter-Terrorism and Security (2006)
An Elephant Flow Detection Method
Based on Machine Learning
1 Introduction
Accurate measurement of network traffic is the key to the implementation of
a range of network applications, including traffic engineering, abnormal flow
detection, security analysis and so on [1,2,4–7]. A large number of network
management decisions, such as blocking abnormal network traffic and load
balancing, require real-time acquisition and analysis of network
traffic status. Elephant flow detection [3] is an important branch of network
traffic detection. An elephant flow usually refers to a flow whose traffic per unit
time exceeds a certain threshold (such as 1 MB/s). Because elephant flows account
for a high proportion of network traffic, controlling elephant flows can effectively
accomplish most network traffic control tasks, such as congestion avoidance
and traffic load balancing [8–12].
To detect the traffic in a network, the simplest
method is to count all the flows in the network. However, because the
traffic in current networks is huge and the arrival rate is high, this method has
high storage requirements and is not suitable for real-time statistics on network
flows. An improvement is the sampling method [13], which suffers from
low precision. In addition, there are sketch methods, such as
Count-Min [14], which require additional hardware support to be implemented
in a software-defined network. This paper focuses on elephant flow detection
based on the iterative method, which has the advantages of low consumption
of detection resources and easy deployment in software-defined networks.
© Springer Nature Switzerland AG 2019
M. Qiu (Ed.): SmartCom 2019, LNCS 11910, pp. 212–220, 2019.
Although there is much related research on iterative detection methods [3,10,
15], they still face problems in detection accuracy and detection speed. The reason
is that the flow characteristics of elephant flows are not considered during
detection, so elephant flows cannot be detected accurately. In this paper, the
characteristics of elephant flows are learned by machine learning to improve the
detection speed and accuracy. The main ideas of our method are as follows.
First, we use the iterative method to detect elephant flows and collect the
parameters generated during the detection process (such as the traffic of an address
range), and then process and label these parameters through off-line
data processing to determine whether they belong to an elephant
flow. Then we use machine learning to learn the relationship between these
parameters and the types of their corresponding flows, so that we can distinguish
whether a flow is an elephant flow. Finally, these flows are handled accordingly in
the iterative method to increase the accuracy and speed of flow detection.
The main contributions of this paper are summarized as follows:
– We formulate a problem for network traffic monitoring to detect elephant
– We propose an improved algorithm for iterative elephant flow detection in
networks, based on machine learning, in order to realize network monitoring.
– We conduct extensive simulations to verify the effectiveness of the proposed
algorithm, and the results show the proposed algorithm is effective compared
with the previous methods.
The remainder of this paper is organized as follows. We review the iterative
elephant flow detection method in Sect. 2. The elephant flow detection algorithm
is proposed in Sect. 3. In Sect. 4, we conduct simulations to determine the per-
formance of our proposed algorithm. We conclude this paper in Sect. 5.
Algorithm 1
Input: Address range A, Elephant flow threshold β, Number of divisions n
Output: Elephant flow set H
1: put A into Set W
2: Assign a flow entry to the address in W
3: After a detection cycle, update the traffic corresponding to the address in W
4: ∀w ∈ W
5: if the flow of w > β then
6: remove w from W, divide w into n segments and put them into W, put w into H
7: else
8: remove w from W
9: report H and return to step 2
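The loop in Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation. The encoding of an address range as a half-open integer interval, the helper names `split`, `detect_cycle` and `measure`, and the choice to keep monitoring a range that has shrunk to a single address are all assumptions made for illustration.

```python
from typing import Callable, List, Set, Tuple

Range = Tuple[int, int]  # half-open address range [lo, hi); an assumed encoding


def split(r: Range, n: int) -> List[Range]:
    """Divide a range into at most n contiguous, non-empty segments."""
    lo, hi = r
    size = hi - lo
    n = max(1, min(n, size))
    bounds = [lo + (size * i) // n for i in range(n + 1)]
    return [(bounds[i], bounds[i + 1]) for i in range(n)]


def detect_cycle(W: Set[Range], measure: Callable[[Range], float],
                 beta: float, n: int) -> Tuple[Set[Range], Set[Range]]:
    """One detection cycle (steps 3 to 9): returns (next W, reported set H)."""
    next_w: Set[Range] = set()
    H: Set[Range] = set()
    for w in W:
        if measure(w) > beta:
            H.add(w)
            if w[1] - w[0] > 1:
                next_w.update(split(w, n))  # refine the address range
            else:
                next_w.add(w)  # assumption: keep watching a single address
        # otherwise w is dropped, as in step 8
    return next_w, H


# Hypothetical traffic: one elephant flow at address 37, in arbitrary units.
rates = {37: 2.0}


def measure(r: Range) -> float:
    """Total traffic observed inside the range during one cycle."""
    return sum(v for addr, v in rates.items() if r[0] <= addr < r[1])


W: Set[Range] = {(0, 64)}
for _ in range(7):
    W, H = detect_cycle(W, measure, beta=1.0, n=2)
print(H)
```

With constant traffic the range is halved every cycle until it isolates the single address. If the measured traffic dips below β in any cycle, the range is dropped and the search has to start again.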
The specific problem description can be found in [6]. In a real network
environment, traffic fluctuates constantly, and even an elephant flow is in a
state of constant fluctuation. When an elephant flow has been detected down to
a certain address granularity, Algorithm 1 may lose it because the traffic of
the elephant flow fluctuates below the threshold. Fig. 1 shows an example of
detection with the iterative method. The vertical axis in the figure shows the
number of address bits of the detected flow (or the depth in the address prefix
tree [6]). The larger the number, the higher the accuracy of the detection. For
example, when we detect the address (source address and destination address)
of a flow, the exact address of the flow is 64 bits (the depth is 64). When we
only know that the source address of an elephant flow lies within a given
interval, the number of address bits we get for this flow is 24 (the detected
depth is 24). The horizontal axis represents the number of iteration cycles. As
we can see from Fig. 1, because of traffic fluctuations the iterative method
obtains only a coarse address range for an elephant flow, and the detection
accuracy is poor. Moreover, after losing an elephant flow, the iterative method
has to detect the flow again, so detecting an elephant flow takes a long time
or fails altogether.
In [10], researchers propose a method called DM to improve the detection
accuracy, but our experiments show that it still does not improve accuracy
effectively. The main reason is that the method uses a comprehensive history of
a flow, so that a flow is released only after a period of observation, which
gives more reliable detection results. But this also causes a large number of
flow entries to be allocated in a short period of time, so all available table
entries are consumed quickly. After the table entries are exhausted, the method
tends to release a large number of detection entries. At the same time, the
entries that are detected together with the elephant flows are also released,
so the detection accuracy is not high.
By analysing the shortcomings of the DM method, we find that the key to
improving the accuracy and speed of detection is to correctly distinguish
Algorithm 2
Input: Address range A, Elephant flow threshold β, Probability a, Ratio of additional
impulse δ
Output: Elephant flow set H
1: put A into Set W
2: Assign a flow entry to the address in W
3: After a detection cycle, update the traffic corresponding to the address in W
4: ∀w ∈ W, apply the trained random forest model to the tuple (w.f, w.momentum,
w.static, w.pForce) to predict whether the stream w corresponding to the tuple is
an elephant flow
5: if the Probability of w > a then
6: w.momentum = w.f
7: ∀w ∈ W, merge small streams using the method of [15]
8: Calculate the number of available entries, set to n, and count the number of flows
with flow greater than β in W, set to m
9: ∀w ∈ W
10: if w.f ≥ β then
11: remove w from W, divide w into n/m segments and put them into W, put w
into H. For each segment w′, set w′.static = 0, w′.momentum = w.f
12: else if w.pForce ≥ β then
13: w.static = w.static + 1
14: else if w.pForce ≤ β then
15: w.static = 0
16: set w.momentum = δ · w.momentum + w.f
17: report H and return to step 2
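Step 16 of Algorithm 2 keeps a decayed running sum of a flow's per-cycle traffic. The short sketch below shows how this value behaves for a fluctuating flow. The traffic values are invented for illustration, and `update_momentum` is a hypothetical helper name.

```python
def update_momentum(momentum: float, f: float, delta: float) -> float:
    """Step 16 of Algorithm 2: decayed running sum of per-cycle traffic."""
    return delta * momentum + f


# Hypothetical per-cycle traffic of one fluctuating flow (arbitrary units).
traffic = [1.0, 0.5, 1.5, 0.6, 1.4]

momentum = 0.0
history = []
for f in traffic:
    momentum = update_momentum(momentum, f, delta=0.5)
    history.append(momentum)

print(history)
```

For a constant traffic value f and 0 ≤ δ < 1, the recurrence converges to f / (1 − δ), so with δ = 0.5 the momentum settles at twice the per-cycle traffic. A single low cycle therefore lowers the momentum less than it lowers w.f.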
In each iteration of the detection process, we can obtain a tuple T for each flow
w, where T = (w.f, w.momentum, w.static, w.pForce). In the DM method, only
the pForce value is used to decide whether to abandon the detection of a flow.
The pForce value takes into account the historical traffic of a flow and the
number of currently available flow entries. However, as mentioned earlier, the
DM method cannot correctly determine whether a flow is a small flow. The other
methods in the literature are simpler: they use only the current traffic of a
flow, i.e. w.f, to determine whether a flow is an elephant flow, which leads to
more serious detection accuracy problems. In this paper, we propose to use
machine learning to determine whether a flow w is an elephant flow by learning
from its corresponding tuple T. We can then apply suitable operations to
different flows to control their detection process (as shown in Algorithm 2).
To collect the relationship between the tuple T and flow types, we need to make
some changes to Algorithm 2. Because we do not yet have a trained random forest
model, we remove step 4 of Algorithm 2 and then execute the algorithm. Each
detected flow corresponds to a tuple T in each iteration, but we do not know
whether this flow is an elephant flow. For each flow, we therefore record its
traffic in the next t iterations and calculate the mean value of the traffic, so
that we know whether the flow is an elephant flow and can construct the
training data. Our training data has five fields, namely (f, momentum, static,
pForce, isElephantFlow), where f represents the traffic of a flow, momentum
represents the impulse of the flow, static is the number of consecutive
iterations in which the flow is smaller than the elephant flow threshold, and
pForce measures the size of the flow comprehensively. isElephantFlow indicates
whether the flow is an elephant flow: its value is 1 if the flow is an elephant
flow and 0 otherwise.
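The labelling step can be sketched as follows. This is an illustrative reading, not the authors' code. It assumes that a flow is labelled as an elephant flow when the mean of its next t traffic samples is at least the threshold β, and the function name `build_training_rows` and the sample values are hypothetical.

```python
from statistics import mean
from typing import List, Sequence, Tuple

FlowTuple = Tuple[float, float, int, float]  # (f, momentum, static, pForce)


def build_training_rows(tuples: Sequence[FlowTuple], traffic: Sequence[float],
                        t: int, beta: float) -> List[tuple]:
    """Label the tuple of iteration i by the mean traffic of the next t iterations."""
    rows = []
    for i, tup in enumerate(tuples):
        future = traffic[i + 1:i + 1 + t]
        if len(future) < t:
            break  # not enough future samples to label this iteration
        is_elephant = 1 if mean(future) >= beta else 0  # assumed labelling rule
        rows.append((*tup, is_elephant))
    return rows


# Hypothetical per-iteration traffic of one flow; the other fields are dummies.
traffic = [1.2, 0.4, 1.3, 1.1, 1.2]
tuples = [(f, 0.0, 0, 0.0) for f in traffic]
rows = build_training_rows(tuples, traffic, t=2, beta=1.0)
print(rows)
```

Note that the second row has a current traffic of 0.4, below the threshold, yet is labelled 1 because the following samples are high. This is the kind of case that a rule based on w.f alone would misjudge.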
After the training data is acquired, it is preprocessed first. Since the random
forest has few requirements on data, we only remove some extreme values from
the training data to reduce their impact on the training results. The extreme
values we remove are mainly generated at the beginning of the iteration. At
that time, the address interval of a detection entry is usually coarse, so the
detected traffic is large and the momentum value of most flows is very large,
although most of these flows are actually small flows. In actual detection,
this situation rarely occurs, so we do not need to consider these data.
To avoid over-fitting in actual training, we set the ratio of the minimum number
of samples in a random forest leaf node to the size of the training data to
0.001, and then use the grid search method to find the most suitable
parameters. We show the results of the training in the next section.
4 Experiments
To verify the performance of the proposed algorithm, we use Java to implement a
simulation platform that simulates the iterative detection method. In the
simulation platform, real network traffic can be injected and elephant flows
can be detected. The traffic we inject into the simulation network comes from
CAIDA’s packet capture data [16]. Its average rate is 2 Gbps. In addition, we
insert a certain number of elephant flows into the simulation network. The
average traffic of each elephant flow is 1 MB/s, with random fluctuations of up
to 50% above and below its mean value. An example of the elephant flow we
inject into the network is shown in Fig. 2, where the red
line represents the mean value of the flow. In this paper, the ratio of the
detected depth of an elephant flow to its true depth is used to measure the
accuracy of the detection. Equation 1 gives the calculation of this score, where
H represents the elephant flow set, w.depth represents the detected depth of
elephant flow w, and w.depth' represents the true depth of the elephant flow.
$$\mathit{score} = \frac{\sum_{w \in H} w.\mathit{depth}}{\sum_{w \in H} w.\mathit{depth}'} \quad (1)$$
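As a worked illustration of the score, the sketch below computes the ratio of detected depth to true depth over a set of flows. The aggregate form (sum of detected depths divided by sum of true depths) is an assumption, since the printed equation is incomplete, and the depth values are invented.

```python
from typing import List, Tuple


def detection_score(flows: List[Tuple[int, int]]) -> float:
    """flows holds (detected_depth, true_depth) pairs for the flows in H."""
    detected = sum(d for d, _ in flows)
    true = sum(t for _, t in flows)
    return detected / true if true else 0.0


# Hypothetical results: one flow fully resolved to 64 bits, two only partly.
flows = [(64, 64), (24, 64), (40, 64)]
score = detection_score(flows)
print(round(score, 3))
```

A score of 1 means every inserted elephant flow was resolved to its full address depth.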
We use the random forest classification algorithm in the scikit-learn library
to train the model. We select 80% of the data as the training
218 K. Lou et al.
Fig. 2. An example of an elephant flow inserted into a network (Color figure online)
set, and the remaining data as the test set. The candidate parameters used
when searching for the most suitable parameters are shown in Table 1. The
number of folds for cross validation is set to 5. After grid search, the optimal
parameters are n_estimators = 10 and max_features = 3. With these parameters,
the root mean square error of the classifier is 0.112 on the training set and
0.113 on the test set, so the classifier performs almost identically on the
training set and the test set.
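The training procedure described here can be sketched with scikit-learn as below. The data is synthetic and stands in for the (f, momentum, static, pForce) tuples. The candidate grid is hypothetical, since Table 1 is not reproduced here. Computing the root mean square error from predicted probabilities is also an assumption about how the reported error was obtained.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)
# Synthetic stand-in for the four tuple fields; this is not the paper's data.
X = rng.random((500, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0.8).astype(int)

# 80% training set, 20% test set, as described in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Hypothetical candidate values; the paper's Table 1 lists the real ones.
param_grid = {"n_estimators": [5, 10, 20], "max_features": [1, 2, 3]}

search = GridSearchCV(
    # A float min_samples_leaf is read as a fraction of the training samples.
    RandomForestClassifier(min_samples_leaf=0.001, random_state=0),
    param_grid,
    cv=5,  # 5-fold cross validation
)
search.fit(X_train, y_train)

proba = search.predict_proba(X_test)[:, 1]
rmse = float(np.sqrt(np.mean((proba - y_test) ** 2)))
print(search.best_params_, round(rmse, 3))
```

The predicted probability from `predict_proba` is the quantity that step 5 of Algorithm 2 would compare against the parameter a.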
After the random forest classifier is trained, in order to directly use this
model for prediction, we use Python to build a simple server, where we can run
the prediction model. The simulation platform contacts the server through the
network to send it Tuple T and obtain prediction results. The parameters a and
δ of Algorithm 2 are set to 0.9 and 0.5, respectively.
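A prediction server of this kind could look like the following sketch, built only on the Python standard library. The paper does not specify the protocol, so the use of HTTP with a JSON body, the field names, the port, and the `stub_model` placeholder (which stands in for the trained random forest) are all assumptions.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def stub_model(f: float, momentum: float, static: int, p_force: float) -> float:
    """Placeholder for the trained classifier; returns a fake probability."""
    return 1.0 if momentum >= 1.0 else 0.0


def handle_payload(body: bytes, model=stub_model) -> bytes:
    """Decode a JSON tuple T, run the model, and encode the prediction."""
    t = json.loads(body)
    prob = model(t["f"], t["momentum"], t["static"], t["pForce"])
    return json.dumps({"probability": prob}).encode("utf-8")


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        reply = handle_payload(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)


def serve(port: int = 8000) -> None:
    """Run the server until interrupted (hypothetical port)."""
    HTTPServer(("127.0.0.1", port), PredictHandler).serve_forever()


# The request logic can be exercised without opening a socket.
reply = handle_payload(b'{"f": 0.4, "momentum": 1.8, "static": 0, "pForce": 1.1}')
print(reply.decode("utf-8"))
```

Keeping the model behind a small network interface lets the Java simulation platform use a Python model without embedding it, at the cost of one round trip per prediction.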
We conduct two sets of experiments to verify the performance of the proposed
algorithm, and in both sets we compare against the DM method. In the first set
of experiments, we set up different numbers of flow entries and insert 100
elephant flows into the simulation network to test the detection accuracy
(score) of our algorithm and the DM method. In Fig. 3, OUR500 indicates the
average precision of our algorithm when the total number of available entries
is 500, DM500 indicates the average precision of the DM method with the same
number of entries, and so on. For convenience of calculation, we compute the
average accuracy over the inserted elephant flows only. The horizontal axis in
Fig. 3 represents the detection period, and the vertical axis represents the
detection accuracy. In the second set of experiments, we set the total number
of available flow entries
Fig. 3. Detection results when the detection resources are different and the number of elephant flows is the same
Fig. 4. Detection results when the detection resources are the same and the number of elephant flows is different
to 1000, and insert different numbers of elephant flows to test the detection
accuracy of the two algorithms. In Fig. 4, OUR100 indicates the average
precision of our algorithm when 100 elephant flows are inserted, and DM100
represents the average precision of the DM method in the same setting. The
meanings of the vertical and horizontal axes are the same as in Fig. 3. In both
figures, our proposed algorithm is superior to the DM method in detection speed
and detection accuracy. In other words, our algorithm obtains higher precision
in a shorter time than the DM method.
By observing Figs. 3 and 4, we find that the DM method cannot detect elephant
flows stably. We believe the reasons are as follows. Because small flows are
given a large impulse, they are difficult to release, which exhausts the
available entries in a relatively short time. After the available entries are
exhausted, the value of pull becomes large. At this point, the DM method tends
to release, within a single detection iteration, a large number of flows whose
traffic is below the elephant flow threshold. As a result, some elephant flows
that have fluctuated to a trough are also released. Hence these elephant flows
cannot be detected effectively, and the detection accuracy of the DM method is
poor. We also tested the DM+ER method experimentally. The results are similar
to those of the DM method: elephant flows cannot be detected effectively. Hence
we do not show these results in this paper.
5 Conclusion
The elephant flow detection method is an important service in a software-defined
network. Based on the random forest method, we propose an improved algorithm
for iterative elephant flow detection in this paper. By implementing a
simulation platform and conducting experiments, we verify that the proposed
algorithm is effective compared with previous methods and that it improves the
accuracy and speed of detection. In future work, we will try to extract more
parameters from the iterative detection process, so that the accuracy and speed
of the detection algorithm can be further improved.
1. Akyildiz, I.F., Lee, A., Wang, P., et al.: A roadmap for traffic engineering in SDN-
OpenFlow networks. Comput. Netw. 71, 1–30 (2014)
2. Thottan, M., Ji, C.: Anomaly detection in IP networks. IEEE Trans. Signal Pro-
cess. 51(8), 2191–2204 (2003)
3. Jose, L., Yu, M., Rexford, J.: Online measurement of large traffic aggregates on
commodity switches. In: Hot Topics in Management of Internet, Cloud, and Enter-
prise Networks and Services. USENIX Association (2011)
4. Qiu, H., Noura, H., Qiu, M., et al.: A user-centric data protection method for cloud
storage based on invertible DWT. IEEE Trans. Cloud Comput. 1 (2019)
5. Gai, K., Qiu, M., Zhao, H., et al.: Dynamic energy-aware cloudlet-based mobile
cloud computing model for green computing. J. Netw. Comput. Appl. 59(C), 46–54
6. Gai, K., Qiu, M., Zhao, H.: Energy-aware task assignment for mobile cyber-
enabled applications in heterogeneous cloud computing. J. Parallel Distrib. Com-
put. S0743731517302319 (2017)
7. Gai, K., Xu, K., Lu, Z., et al.: Fusion of cognitive wireless networks and edge
computing. IEEE Wirel. Commun. 26(3), 69–75 (2019)
8. Alizadeh, M., Edsall, T., Dharmapurikar, S., et al.: CONGA: distributed
congestion-aware load balancing for datacenters. ACM SIGCOMM Comput. Com-
mun. Rev. 44(4), 503–514 (2014)
9. Alizadeh, M., Yang, S., Sharif, M., et al.: pFabric: minimal near-optimal datacenter
transport. ACM SIGCOMM Comput. Commun. Rev. 43(4), 435–446 (2013)
10. Benson, T., Anand, A., Akella, A., et al.: MicroTE: fine grained traffic engineering
for data centers. In: Proceedings of the Seventh COnference on Emerging Network-
ing EXperiments and Technologies, p. 8. ACM (2011)
11. Garcı́a-Teodoro, P., Dı́az-Verdejo, J., Maciá-Fernández, G., et al.: Anomaly-based
network intrusion detection: techniques, systems and challenges. Comput. Secur.
28(1–2), 18–28 (2009)
12. Kabbani, A., Alizadeh, M., Yasuda, M., et al.: AF-QCN: approximate fairness
with quantized congestion notification for multi-tenanted data centers. In: 2010
18th IEEE Symposium on High Performance Interconnects, pp. 58–65, 2191–2204.
IEEE (2010)
13. Duffield, N.: Sampling for passive internet measurement: a review. Stat. Sci. 19(3),
472–498, 2191–2204 (2004)
14. Cormode, G., Muthukrishnan, S.: An improved data stream summary: the count-
min sketch and its applications. J. Algorithms 55(1), 58–75 (2005)
15. Moshref, M., Yu, M., Govindan, R., et al.: DREAM: dynamic resource allocation for
software-defined measurement. ACM SIGCOMM Comput. Commun. Rev. 44(4),
419–430 (2015)
16. The caida ucsd anonymized internet traces 2015[EB/OL], 06 April 2019. http:// 2015 dataset.xml
AI Enhanced Automatic Response
System for Resisting Network Threats
Song Xia1 , Meikang Qiu2,3(B) , Meiqin Liu4 , Ming Zhong2 , and Hui Zhao5
College of Electronic and Information, Wuhan University, Wuhan, Hubei, China
[email protected]
College of Computer Science, Shenzhen University, Shenzhen, Guangdong, China
Department of Computer Science,
Harrisburg University of Science and Technology, Harrisburg, USA
[email protected]
College of Electrical Engineering, Zhejiang University, Zhejiang, China
[email protected]
School of Software, Henan University, Kaifeng, China
[email protected]
1 Introduction
With terabits of information stored in the network and much of this information
being confidential, protecting it has become very important. Cyber-security is
the practice of protecting computer systems, networks, and information from
disruption [1]. The data leakage caused by network threats every year brings
tens of thousands of losses to society. It is therefore important to have a
system that monitors user activity in order to detect malicious activities.
Nowadays, two ubiquitous network security products protect networks from those
threats: the firewall and the Intrusion Detection System (IDS) [2].
c Springer Nature Switzerland AG 2019
M. Qiu (Ed.): SmartCom 2019, LNCS 11910, pp. 221–230, 2019.
222 S. Xia et al.
Many scholars, such as Gai et al. [17–19] and Qiu et al. [20], have done
valuable work in the cyber-security field. However, with the diversification of
cyberattacks, traditional defense products cannot fully guarantee the security
of the network. Alfayyadh et al. evaluated personal firewall systems and found
that personal firewalls were poorly used, which led to security vulnerabilities
[3]. Intrusion detection systems are similar to burglar alarms: they look for
suspicious activity and alert the system and network administrators [4], but a
traditional IDS might miss some unknown threats [12].
Threat detection systems based on machine learning, such as SVM [5] and KNN
[6,7], and on deep learning, such as CNN [8] and RNN [9], have achieved high
detection accuracy, but most current systems are only suitable for a specific
type of threat and lack a complete defense mechanism. Since many network users
lack basic knowledge about defending against cyber threats, they cannot choose
a system for detecting a specific threat, and it is difficult for them to find
a corresponding solution after detection is completed [3]. To make the network
threat response mechanism more efficient and practical, we propose an automatic
system that combines machine learning with deep learning for threat detection,
identification and mitigation. It is a new system that automatically completes
the entire process of threat response. The main advantages of our system are as
follows:
(1) It can not only detect but also identify and mitigate the threat automatically.
(2) It is capable of handling many types of threats.
(3) It has a relatively low false positive rate.
The rest of the paper is organized as follows: Sect. 2 describes the related work
about the network threat response system; Sect. 3 introduces the framework of
our system; Sect. 4 presents the experiment and analysis of the results; Sect. 5
concludes our work and gives a prospect.
2 Related Work
The Intrusion Detection System (IDS), first proposed by Denning in 1987, is
widely used in threat detection [10]. Intrusion detection approaches have been
studied mainly from two views: anomaly detection and misuse detection [11].
Mena [12] proposed an intrusion detection system based on misuse detection. The
advantage of a misuse detection system is its ability to detect all known
threats with a high accuracy rate. García-Teodoro et al. [13] reviewed intrusion
detection systems based on anomaly detection. The advantage of such a system is
its potential to detect previously unseen threats. However, the misuse methods
are not capable of detecting new, unfamiliar intrusions, and the anomaly
methods have a relatively high false positive rate. Recently, artificial
intelligence has developed rapidly and achieved good results in many fields
[14,15]. Hodo et al. [16] established a neural-network-based threat detection
system to protect Internet of Things networks and achieved relatively high
accuracy. However, in their experiment, the system worked only for DOS threats.
Vinayakumar et al. [8] analyzed the effectiveness of convolutional neural
networks for intrusion detection by modeling network traffic events as time
series of TCP/IP packets. The results showed that a CNN could reach good
performance in intrusion detection. Ross [5] used multiple machine learning
methods for SQL injection detection, including decision tree, rule-based,
support vector machine and random forest algorithms.
The methods above show that machine learning and deep learning can be more
efficient in network threat response than the IDS and the firewall. However,
the systems mentioned above have two deficiencies: they respond to only a
certain type of threat, and they lack a complete handling process for threats.
In this paper, we combine the merits of machine learning and deep learning and
build a hierarchical threat response system that overcomes these shortcomings.
3 System Model
As shown in Fig. 1, the overall framework of our proposed system consists of four
phases: data preprocessing, threat detection, threat identification and threat
mitigation. We collect a large number of network connection samples with
different types of threats from the internet. Those samples are preprocessed by
vectorization and normalization.
The detection module judges whether the input samples contain malicious
information. If not, the following modules ignore those samples. Otherwise, the
malicious samples are pushed into the identification module to get their
labels, which indicate their major categories. In the last step, the mitigation
module determines the specific type of the threat and gives a solution. The
details of those modules are elaborated below.
$$z = w_0 x_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + \dots + w_n x_n = w^T x \quad (1)$$
$$\sigma(z) = \frac{1}{1 + e^{-z}} \quad (2)$$
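Equations 1 and 2 together form the prediction step of logistic regression. The sketch below evaluates them for a single sample. The weights, the feature values, the convention that x0 = 1 carries the bias, and the 0.5 decision threshold are assumptions chosen for illustration.

```python
import math
from typing import Sequence, Tuple


def sigmoid(z: float) -> float:
    """Equation 2: squash a real value into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_threat(w: Sequence[float], x: Sequence[float],
                   threshold: float = 0.5) -> Tuple[float, bool]:
    """Equation 1 followed by Equation 2; returns (probability, is_threat)."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    p = sigmoid(z)
    return p, p >= threshold


# Hypothetical trained weights and one preprocessed sample (x0 = 1 is the bias term).
w = [-1.0, 2.0, 0.5]
x = [1.0, 1.5, 0.0]
p, is_threat = predict_threat(w, x)
print(round(p, 4), is_threat)
```

Here z = −1 + 3 + 0 = 2, so the sample is flagged with a probability of about 0.88.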
The function of the pooling layer is to reduce the dimension of the data and
avoid overfitting. We choose the maximum pooling method, which selects the
maximum value in each region of the feature map. The last layer is the fully
connected layer, which produces the classification result. Figure 3 shows the
calculation process of the fully connected layer. The output of the layer is
calculated from the expression in Eq. 4. W[1] and W[2] are the optimal
coefficients obtained from the training process and σ represents the sigmoid
function. b[1] and b[2] are the offsets of the function. Given the input x, the
result of classification is a[2].
We compare the output with the expected value. When the error is greater than
our expected value, the error is propagated back through the network to update
the parameters in each layer. When the error is equal to or less than our
expected value, the training ends.
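The pooling and fully connected steps can be sketched in plain Python as follows. Equation 4 is not reproduced in this text, so the two-layer form a[1] = σ(W[1]x + b[1]), a[2] = σ(W[2]a[1] + b[2]) is an assumption based on the symbols named above, and all weights and feature values are invented.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def max_pool_1d(values: Sequence[float], size: int) -> List[float]:
    """Keep the maximum of each non-overlapping window of the feature map."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


def dense(W: Sequence[Sequence[float]], b: Sequence[float],
          x: Sequence[float]) -> List[float]:
    """One fully connected layer with a sigmoid activation."""
    return [sigmoid(sum(wij * xj for wij, xj in zip(row, x)) + bi)
            for row, bi in zip(W, b)]


# Hypothetical feature map and weights, for illustration only.
feature_map = [0.1, 0.7, 0.3, 0.2, 0.9, 0.4]
pooled = max_pool_1d(feature_map, size=2)

W1, b1 = [[1.0, 0.0, -1.0], [0.0, 1.0, 0.0]], [0.0, 0.0]
W2, b2 = [[1.0, -1.0]], [0.0]

a1 = dense(W1, b1, pooled)  # assumed hidden activation a[1]
a2 = dense(W2, b2, a1)      # classification output a[2]
print(pooled, a2)
```

Pooling halves the length of the feature map before the dense layers, which is the dimension reduction described above.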
In the mitigation module, we use two machine learning methods, the KNN
algorithm and a decision tree, to find a solution for threat mitigation. The
input of the k-nearest neighbor algorithm is the feature vectors of different
samples, corresponding to points in the feature space; the output is the type
of each sample. The type of a new sample is predicted by majority vote over the
types of its k nearest neighbors. The value of k is usually no more than 20. We
choose the Euclidean distance to measure the distance between two samples.
Equation 5 gives the distance between two samples, where x_i are the feature
values of the input sample and x'_i are the feature values of the training
sample.
$$d = \sqrt{(x_1 - x'_1)^2 + (x_2 - x'_2)^2 + (x_3 - x'_3)^2 + \dots + (x_n - x'_n)^2} \quad (5)$$
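The majority-vote rule with Euclidean distance can be illustrated with a few lines of Python. The two-dimensional feature vectors and the way they are paired with labels are invented for illustration; only the label names come from the threat types discussed in this paper.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (feature vector, threat type)


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Equation 5: straight-line distance between two feature vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def knn_predict(train: List[Sample], query: Sequence[float], k: int) -> str:
    """Majority vote over the k training samples closest to the query."""
    nearest = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training samples.
train = [
    ((0.0, 0.0), "smurf"), ((0.0, 1.0), "smurf"), ((1.0, 0.0), "smurf"),
    ((5.0, 5.0), "neptune"), ((5.0, 6.0), "neptune"),
]
predicted = knn_predict(train, (0.5, 0.5), k=3)
print(predicted)
```

Because the features are compared by raw distance, the normalization applied in the preprocessing phase matters: a feature with a large numeric range would otherwise dominate the vote.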
The output of the KNN is used as the input of the decision tree to find a
solution. The decision tree is split so as to minimize the information entropy
of the dataset. The information entropy is calculated as shown in Eq. 6, where
p_i is the probability of each type of threat appearing in the sample space.
The workflow of the system is shown in Algorithm 1.
$$H(U) = E[-\log p_i] = -\sum_i p_i \log p_i \quad (6)$$
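Equation 6 can be evaluated directly from the label counts of a set of samples. In the sketch below the logarithm base is taken to be 2, which is an assumption (the text does not state the base), and the label lists are invented.

```python
import math
from collections import Counter
from typing import Sequence


def entropy(labels: Sequence[str]) -> float:
    """Equation 6 with p_i estimated as the relative frequency of each label."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical node contents before and after a candidate split.
mixed = ["smurf", "smurf", "neptune", "neptune"]
pure = ["smurf", "smurf", "smurf", "smurf"]
print(entropy(mixed), entropy(pure))
```

A split that separates the mixed node into two pure nodes lowers the entropy from 1 bit to 0, which is the kind of split the tree-building rule prefers.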
The dataset used in the system is KDD99, which is the benchmark in the network
threat detection realm. The dataset contains 39 types of threat, including 22
types in the training samples and 17 types in the test samples. Table 1 gives
the number of each category of threat (10% of this dataset).
In our experiment, we take only 10% of this dataset to verify our threat
response model. 22 types of threat are extracted from the dataset and handled
by our system.
We test the efficiency of each module sequentially on this dataset. The features
are divided into two types: real-time features and statistical features. The
results show that the rate at which our system successfully responds to threats
is over 97%. The details of the results are elaborated below.
Figure 4 shows the results of logistic regression for threat detection. We
randomly select 8000 samples from the training dataset and 2000 samples from
the testing dataset. Because there are two types of features, we tried two
detection methods: one uses only the real-time features of the data, and the
other combines the statistical features with the real-time features. After 10
iterations, the accuracy of the detection module using real-time features is
97.3%, and the accuracy of the detection module based on real-time and
statistical features is 98.9%. The results show that threat detection with
statistical features has a better accuracy rate, but the statistical module
generally has a delay. Therefore, in practical applications, the two modules
can be used in different scenarios.
After the system raises a threat alert, the CNN-based threat identification
module tries to classify the current threat. We used 50,000 traffic records
with different tags, including DOS, U2R, R2L and PROBING, to train the neural
network, and used 10,000 unknown traffic records to perform the
Fig. 4. The result of the threat detection: (a) the accuracy of the statistical detection module; (b) the accuracy of the real-time detection module.
Fig. 5. The result of the identification module: (a) the accuracy rate; (b) the value of the loss function.
identification test. Figure 5 shows that the accuracy rate of the test is 97.33%
with a loss of 0.098. The results show that our model can classify those four
categories of threats precisely.
In the mitigation module, we subdivide the four major categories of threat into
more specific types with the KNN algorithm and then use a decision tree to make
a quick response to each threat type. We extracted 50,000 samples from the DOS
category for training and 20,000 samples for testing. We subdivided DOS into
eight more specific categories: Apache2, Teardrop, Land, Mail-bomb, Neptune,
Pod, Processtable, and Smurf. The accuracy of assigning the right type to each
sample is 99.82%. We then represent those eight types of threat by a
three-dimensional vector and use the decision tree to find a solution. The
structure of the tree is shown in Fig. 6.
5 Conclusion
1. Thakur, K., Qiu, M., Gai, K., Ali, M.: An investigation on cyber security threats
and security models. In: 2015 IEEE 2nd International Conference on Cyber Secu-
rity and Cloud Computing, pp. 307–311. IEEE, New York, November 2015
2. Tidwell, K., Saurabh, K., Dash, D., Njemanze, H.S., Kothari, P.S.: Threat detection
in a network security system. U.S. Patent 7,260,844. Washington, DC, August 2007
3. Alfayyadh, B., Ponting, J., Alzomai, M., Jøsang, A.: Vulnerabilities in personal
firewalls caused by poor security usability. In: 2010 IEEE International Conference
on Information Theory and Information Security, pp. 682–688, Beijing, January
4. Rietta, F.: Application layer intrusion detection for SQL injection. In: ACM-SE
44 Proceedings of the 44th Annual Southeast Regional Conference, pp. 531–536,
Florida, March 2016
5. Ross, K.: SQL injection detection using machine learning techniques and multiple
data sources. Master’s Projects. 650.
6. Li, W., Yi, P., Wu, Y., Pan, L., Li, J.: A new intrusion detection system based on
KNN classification algorithm in wireless sensor network. J. Electr. Comput. Eng.
2014(240217), 1–9 (2014)
7. Punithavathani, D.S., Sujatha, K., Jain, J.M.: Surveillance of anomaly and misuse
in critical networks to counter insider threats using computational intelligence.
Clust. Comput. 18(1), 435–451 (2015)
8. Vinayakumar, R., Soman, K., Poornachandran, P.: Applying convolutional neural
network for network intrusion detection. In: IEEE International Conference on
Advances in Computing, Communications and Informatics (ICACCI), p. 2017.
Udupi, September 2017
9. Hamed, H., Ali, D., Raouf, K., Kim-Kwang, R.: A deep Recurrent Neural Network
based approach for Internet of Things malware threat hunting. Future Gener. Com-
put. Syst. 85, 88–96 (2018)
10. Denning, D.E.: An intrusion-detection model. IEEE Trans. Softw. Eng. SE-13(2),
222–232 (1987)
11. Liao, H., Lin, C., Lin, Y., Tung, K.: Intrusion detection system: a comprehensive
review. J. Netw. Comput. Appl. 36(1), 16–24 (2013)
12. Mena, J.: Investigative Data Mining for Security and Criminal Detection. Butter-
worth Heinemann (2003)
13. Teodoro, P.G., Verdejo, J.D., Fernández, G.M., Vázquez, E.: Anomaly-based net-
work intrusion detection: techniques, systems and challenges. Comput. Secur.
28(1–2), 18–28 (2009)
14. Ma, Z., Xue, J., Leijon, A., Tan, Z., Yang, Z., Guo, J.: Decorrelation of neutral
vector variables: theory and applications. IEEE Trans. Neural Netw. Learn. Syst.
29(1), 129–143 (2016)
15. Ma, Z., Lai, Y., Kleijn, W.B., Wang, L.K., Guo, J.: Variational Bayesian learning
for Dirichlet process mixture of inverted Dirichlet distributions in non-Gaussian
image feature modeling. IEEE Trans. Neural Netw. Learn. Syst. 30(2), 449–463
16. Hodo, E., et al.: Threat analysis of IoT networks using artificial neural network
intrusion detection system. In: 2016 International Symposium on Networks. Com-
puters and Communications (ISNCC), pp. 1–6, Yasmine, May 2016
17. Gai, K., Qiu, M., Zhao, H., Tao, L., Zong, Z.: Dynamic energy-aware cloudlet-based
mobile cloud computing model for green computing. J. Netw. Comput. Appl. 59,
46–54 (2016)
18. Gai, K., Qiu, M., Zhao, H.: Energy-aware task assignment for mobile cyber-enabled
applications in heterogeneous cloud computing. J. Parallel Distrib. Comput. 111,
126–135 (2018)
19. Gai, K., Xu, K., Lu, Z., Qiu, M., Zhu, L.: Fusion of cognitive wireless networks
and edge computing. IEEE Wirel. Commun. 26(3), 69–75 (2019)
20. Qiu, H., Noura, H., Qiu, M., Ming, Z., Memmi, G.: A user-centric data protection
method for cloud storage based on invertible DWT. IEEE Trans. Cloud Comput.
A Cross-Plane Cooperative DDoS Detection
and Defense Mechanism in Software-Defined
Abstract. Distributed Denial of Service (DDoS) attacks have for years been one of
the biggest threats in network security and a persistent problem for researchers
and large enterprises. In SDN, traditional DDoS attack detection mechanisms are
mostly based on intermediate plug-ins or on the SDN controller, and most of them
suffer from large southbound communication overhead, detection delay, or a lack of
network-wide monitoring information. In this paper, we propose a cross-plane
cooperative DDoS defense system (CPCS) under the architecture of SDN, which
filters abnormal traffic through coarse-grained detection on the data plane and
fine-grained detection on the control plane. On the data plane, a preliminary
screening is performed to reduce the detection range of the control plane; on the
control plane, the K-means clustering algorithm is used to perform fine-grained
analysis of traffic. In addition, an anti-false-positive module is added. The
proposed method captures the key characteristics of DDoS attack traffic by polling
the counters of OpenFlow switches, which leverages the computational power of
OpenFlow switches that is currently not fully utilized. We conducted experiments
in a campus network center including OpenFlow switches and RYU controllers. The
results show that the framework and traffic monitoring algorithms proposed in
this paper can greatly improve detection efficiency and accuracy, and reduce
detection delay and southbound communication overhead.
1 Introduction
attack traffic by calculating the entropy value on the control plane [4]. The existing
methods have some shortcomings: (1) Most of them consider only the detection
process and ignore the defense strategy; that is, they do not deal with the DDoS
attack traffic once it has been detected. A few methods directly DROP the suspicious
packets, which may also discard normal traffic. (2) If all the work is done by the
controller, the load on the southbound interface is very high, and continuous polling
increases the overhead of the detection system.
In this paper, we extract the features of DDoS attack traffic that are suitable for
data plane computing, offload some lightweight DDoS detection mechanisms to the data
plane, and design a cross-plane collaborative system with verification and defense
functions. The system includes not only DDoS detection but also defense strategies
and a long-term blacklist and whitelist mechanism to ensure a high detection rate
and a low false-positive rate. First, a coarse-grained screening is performed on the
data plane. If the data plane generates an alarm, a fine-grained test based on
machine learning is performed on the control plane to determine whether the
traffic is DDoS attack traffic. This method greatly reduces system overhead and
southbound communication overhead.
When the verification module determines that a flow is DDoS attack traffic, the
controller adds a flow entry that places the source IP address of the flow on the
blacklist. After that, any packet carrying this source IP address is dropped directly
when it passes through the switch, which greatly reduces the system overhead
caused by duplicate inspections. Similarly, we have established a whitelist
mechanism: we add a flow entry for the source IP of a packet that the verification
and anti-false-positive modules determine to be normal traffic. Packets from this
source IP are forwarded directly without verification.
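The blacklist and whitelist behaviour described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the class `SwitchFlowTable` and its method names are hypothetical, the flow table is modelled as two in-memory sets keyed by source IP, and the rule that a whitelist verdict never overrides an existing blacklist entry is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class SwitchFlowTable:
    """Toy stand-in for a switch flow table keyed by source IP (hypothetical)."""
    blacklist: set = field(default_factory=set)
    whitelist: set = field(default_factory=set)

    def mark_attack(self, src_ip: str) -> None:
        # Controller verdict "attack": install a drop rule, remove any allow rule.
        self.whitelist.discard(src_ip)
        self.blacklist.add(src_ip)

    def mark_normal(self, src_ip: str) -> None:
        # Controller verdict "normal": install an allow rule.
        # Assumption: an existing drop rule is not overridden.
        if src_ip not in self.blacklist:
            self.whitelist.add(src_ip)

    def handle_packet(self, src_ip: str) -> str:
        # Blacklisted sources are dropped, whitelisted ones skip verification,
        # everything else still goes through the detection pipeline.
        if src_ip in self.blacklist:
            return "drop"
        if src_ip in self.whitelist:
            return "forward"
        return "inspect"
```

With this structure, repeated packets from an already classified source never trigger a second inspection, which is the overhead saving the text refers to.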
The main contributions of this paper can be summarized as follows:
• We designed a lightweight algorithm for detecting DDoS attacks on the data plane,
using the switch CPU for coarse-grained detection and sending alarm information to
the control plane.
• We propose a DDoS attack detection framework for cross-plane collaboration,
performing two levels of detection on the data plane and the control plane. If the
data plane coarse-grained detection generates an alarm, a K-means-based
fine-grained detection is performed on the control plane.
• We conducted a verification experiment at the campus network center. The
experimental results show that our method has a low southbound communication
overhead and detection delay, which greatly improves the detection efficiency and
accuracy.
The rest of this paper is organized as follows. Section 2 covers the related work.
Section 3 presents an overview of our framework. Experimental verification is
presented in Sect. 4. This paper is concluded in Sect. 5.
2 Related Work
Nowadays, the main defense against DDoS attacks is real-time network monitoring.
When a DDoS attack occurs, an attack-traffic cleaning device is started to shield
the DDoS attack source, thereby protecting the network and achieving the purpose
of security defense [5]. This process is mainly implemented with the following
methods: (1) confirming the legality of the source IP address in a network
forwarding device such as a switch or a router, and then establishing a
blacklist/whitelist; (2) monitoring network traffic in real time for changes in
network flow density, based on statistical methods; (3) establishing, in the routing
and forwarding devices, an inbound port mapping table that relates the source
addresses to the forwarding devices.
Most DDoS attack detection under SDN is performed on the control plane [6].
A classic method is to calculate the entropy value at the control plane to determine
the degree of dispersion of the source IPs in the traffic. This method detects
abnormal traffic by exploiting the high diversity of forged source IPs. The accuracy
of the detection system depends on the entropy threshold. However, the threshold is
obtained by tuning a parameter, so the method is one-sided.
Another method is to use the scalable computing power of the control plane to extract
the key features of DDoS attack traffic from packet headers and to use machine
learning algorithms for detection [3, 7, 8]. These control-plane machine learning
methods have a lower false-positive rate, but they incur higher controller resource
consumption and southbound communication load.
Most current DDoS attack detection mechanisms contain only a detection part and
have no specific defense strategy. Each time attack traffic reaches the controller,
it causes high southbound communication load and control plane system overhead.
There are also other problems, such as long detection delay, low detection accuracy,
a high false-positive rate, and weak detection capability for new DDoS attacks.
Based on the characteristics of components such as the OpenFlow switch counters,
we assign some lightweight DDoS detection tasks to the data plane, where they
cooperate with the fine-grained detection method of the control plane [9]. This
greatly improves the efficiency of the detection system and the flexibility of the
scheme. The cross-plane mechanism reduces the system overhead and the communication
load of the southbound interface, and it also improves detection speed and accuracy.
the controller [10], which greatly increases the communication overhead of the
southbound interface and reduces the delay of detection.
Our system architecture is shown in Fig. 1. We assign a pre-detection function to the
data plane to perform coarse-grained detection. In fact, most OpenFlow switches, and
hybrid switches that support the OpenFlow protocol, contain one or more CPUs with
rich computing resources that are usually far from fully utilized [11]. We use the
switch CPU to perform lightweight traffic detection operations and thus a
coarse-grained attack detection on the data plane. In addition, since the data plane
has already performed a coarse-grained screening, the control plane detection method
should have higher accuracy and detection efficiency. We choose the K-means algorithm
to perform finer-grained DDoS detection; it is fast and accurate.
Our solution also adds blacklist and whitelist mechanisms and incorporates an
anti-false-positive module after control plane detection. Because detection is
event-triggered rather than based on polling, our method greatly reduces the
communication load of the southbound interface and the system overhead. Next, we
discuss in detail the main modules of the system and how the coarse-grained and
fine-grained detections are performed.
As discussed in Sect. 1, DDoS attack traffic differs from normal traffic in the
following ways: (1) Large-scale traffic arrives in a short time. The numbers of
packets and bytes arriving per unit time under a DDoS attack are much higher than
for normal traffic. (2) During the attack, there is usually a large difference between
the rates of flows entering and leaving the victim server. As shown in Fig. 3,
DDoS attacks are often launched with IP forging, i.e., fake IP addresses are used to
send packets. So, when a DDoS attack occurs, the single-flow growth rate increases
rapidly while the proportion of pair-flows is small. (3) When a DDoS flood attack
occurs, the flows from different source IP addresses are short-lived, so the average
flow duration is also an important feature. (4) In addition to forging IP addresses,
the attacker can perform a scanning attack by randomly generating ports, so the
growth rate of distinct ports in DDoS attack traffic is much higher than in
normal traffic [13].
236 Y. Cao et al.
Fortunately, these four characteristics can be measured directly by polling the
counter values of OpenFlow switches. They can be represented by the following six
indicators:
• Average packets per second.
• Average bytes per second.
• Percentage of pair-flow.
• Growth rate of single-flow.
• Average durations per flow.
• Growth rate of different port.
Among the six indicators above, the 1st and 2nd come from the volume characteristics
of the traffic, the 3rd and 4th come from the asymmetry of DDoS attack traffic, and
the 5th and 6th reflect, respectively, the short-lived multi-flow nature and the
forged ports of DDoS attack traffic.
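The six indicators can be computed from per-flow counter records collected over one polling interval. The sketch below is illustrative only: the record type `FlowRecord`, the function `six_indicators`, and the exact formulas (a pair-flow is a flow whose reverse direction is also present, and growth rates are counts divided by the interval length) are assumptions, since the text names the indicators without defining them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    """One flow entry as read from switch counters (hypothetical layout)."""
    src_ip: str
    dst_ip: str
    src_port: int
    packets: int
    n_bytes: int
    duration: float  # seconds


def six_indicators(flows, interval):
    """Compute the six data-plane indicators over one polling interval."""
    n = len(flows)
    if n == 0 or interval <= 0:
        raise ValueError("need at least one flow and a positive interval")
    endpoints = {(f.src_ip, f.dst_ip) for f in flows}
    # Assumed definition: a flow is "paired" if the reverse direction exists.
    paired = sum(1 for f in flows if (f.dst_ip, f.src_ip) in endpoints)
    distinct_ports = {f.src_port for f in flows}
    return {
        "packets_per_second": sum(f.packets for f in flows) / interval,
        "bytes_per_second": sum(f.n_bytes for f in flows) / interval,
        "pair_flow_percentage": paired / n,
        "single_flow_growth": (n - paired) / interval,
        "avg_flow_duration": sum(f.duration for f in flows) / n,
        "port_growth": len(distinct_ports) / interval,
    }
```

Under a flood with forged sources, one would expect the first, second, fourth and sixth values to rise and the third and fifth to fall, which matches the qualitative description in the text.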
Lightweight Detection Algorithm of Data Plane. In this section, we present a
lightweight algorithm that captures changes in the six indicators above. The
algorithm uses the previous n values of the six indicators for a piece of traffic
to estimate their future values. If the values of the indicators in the next cycle
fall within the predicted range, the current flow is normal. Otherwise, the
deviation between the predicted value and the observed value represents a change in
the behavior of the network. If all six indicators are outside the acceptable range,
we identify the traffic as abnormal traffic caused by a DDoS attack.
Conversely, if one or more of the six indicators fall within the acceptable range,
there is currently no DDoS attack in the specific flow. Once abnormal traffic is
detected by the monitoring thread in the flow monitoring module, the data plane sends
an alarm message to notify the controller to perform fine-grained DDoS attack
detection.
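The coarse-grained check can be sketched as below. The text does not specify how the predicted range is derived from the previous n values, so this sketch assumes a simple band of mean ± k standard deviations over a sliding window; the names `IndicatorPredictor`, `is_abnormal`, and the parameters `n` and `k` are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


class IndicatorPredictor:
    """Sliding-window range predictor for one indicator (assumed estimator)."""

    def __init__(self, n=10, k=3.0):
        self.history = deque(maxlen=n)  # the previous n observed values
        self.k = k                      # width of the acceptable band

    def update(self, value):
        self.history.append(value)

    def in_range(self, value):
        values = list(self.history)
        if len(values) < 2:
            return True  # not enough history to call anything abnormal
        mu, sigma = mean(values), pstdev(values)
        # Small epsilon so a perfectly constant history tolerates rounding.
        return abs(value - mu) <= self.k * sigma + 1e-9


def is_abnormal(predictors, observed):
    """Raise an alarm only if every indicator is outside its predicted range."""
    return all(not predictors[name].in_range(value)
               for name, value in observed.items())
```

Requiring all indicators to deviate keeps the data-plane check conservative: a single noisy indicator does not trigger an alarm to the controller.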
Feature Extraction. We can extract various features for the clustering algorithm
from the headers of packets sent from the data plane. The most prominent feature of
DDoS attack traffic is that it sends out a large number of packets with forged source
IP addresses and ports, so the quintuple (five-tuple) contains most of the DDoS
feature information. We therefore extract the quintuple of each packet. Because
entropy is a good representation of uncertainty, and DDoS attacks forge source IPs
and ports at random, we calculate the entropy of each of these fields separately as
an input feature. The definition of entropy is:
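$$H(X) = -\sum_{i=1}^{n} \frac{n_i}{S} \log \frac{n_i}{S}, \quad S = \sum_{i=1}^{n} n_i \qquad (1)$$
$$H(\mathit{size}) = -\sum_{i=1}^{i_{\max}} \frac{n_i}{S} \log \frac{n_i}{S}, \quad S = \sum_{i=1}^{i_{\max}} n_i \qquad (2)$$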
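As an illustration of the entropy features, the sketch below computes the Shannon entropy of the empirical distribution of one header field (for example source IP or source port) over a window of packets. The helper name `entropy` and the use of a base-2 logarithm are assumptions; the text does not state the base.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

A window in which every packet carries a different forged source IP yields the maximum entropy for its size, whereas traffic dominated by a few legitimate sources yields a low value, e.g. `entropy(["a", "b", "c", "d"])` versus `entropy(["a", "a", "a", "a"])`.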
• Calculate the Euclidean distance from X to the center C_i of every cluster and
record the result as d_i.
• Select the minimum distance d_t = min{d_i} and assign the sample X to the
corresponding cluster.
• Compare d_t with the cluster radius r_t; if d_t < r_t, the sample X is judged to
be normal data.
When the traffic is normal, we save the source addresses of these flows and update
the flow table to refresh the whitelist on the data plane. Otherwise, the sample is
judged to be abnormal data, and we regard the traffic as DDoS attack traffic. Its
source addresses are also saved. After the anti-false-positive module has checked
them and found no misjudged traffic, the data plane blacklist is updated.
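The three decision steps can be sketched as a single function. The cluster centers and radii are assumed to come from a K-means model trained beforehand on normal traffic; the function name `classify` and the tuple-based feature vectors are illustrative choices, not the authors' code.

```python
import math


def classify(sample, centers, radii):
    """Return ("normal" | "abnormal", index of the nearest cluster)."""
    # Step 1: Euclidean distance d_i from the sample to every center C_i.
    distances = [math.dist(sample, center) for center in centers]
    # Step 2: nearest cluster t with d_t = min{d_i}.
    t = min(range(len(centers)), key=distances.__getitem__)
    # Step 3: inside the cluster radius r_t means normal data.
    label = "normal" if distances[t] < radii[t] else "abnormal"
    return label, t
```

Because the model is trained on normal traffic only, a sample that lies outside the radius of even its nearest cluster is treated as abnormal.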
Anti-false Positive Module. After coarse filtering on the data plane and fine-grained
detection on the control plane, the detection accuracy of the system is greatly
improved and the total system overhead is reduced. However, our fine-grained
inspection works on the aggregate traffic of a period, so it is difficult to pick out
normal traffic that is mixed into DDoS attack traffic: the result only indicates
whether the traffic within a period contains DDoS attacks. Therefore, an
anti-false-positive module is added after the clustering-based detection. In this
way, normal flows mixed into the DDoS attack traffic can be distinguished and added
to the whitelist. It is worth mentioning that our anti-false-positive module is based
on probability statistics: the larger the traffic, the better the effect. The
principle is described in detail below.
As shown in Fig. 5, a DDoS attack forges source IPs, and these forged source IPs are
randomly generated. Therefore, flows with the same source IP are very unlikely to
appear twice within a short period of time; the probability is small enough to be
neglected. When the traffic contains normal traffic, however, it is normal for flows
with the same source IP to occur twice or more in a short time. In this case, we
consider the flows of this source IP to be normal traffic rather than DDoS attack
traffic, and we save the source IP and add it to the whitelist to prevent
misjudgment. We use this property of DDoS attack traffic to build the
anti-false-positive module. Note that "the same IP flow" here refers to two flows
separated by an interval, rather than to consecutive packets.
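The repeated-source rule can be sketched as follows. The class name `AntiFalsePositive` and the two thresholds are hypothetical: `min_gap` separates two distinct flows from consecutive packets of one flow, and `window` bounds what counts as "a short period of time". The text gives no concrete values for either.

```python
class AntiFalsePositive:
    """Whitelist a source IP once it reappears in a second, separate flow."""

    def __init__(self, min_gap=1.0, window=60.0):
        self.min_gap = min_gap    # seconds; smaller gaps count as the same flow
        self.window = window      # seconds; larger gaps are not "a short time"
        self.last_seen = {}       # src_ip -> start time of the latest flow
        self.whitelist = set()

    def observe_flow(self, src_ip, start_time):
        """Record a flow start and return True if the source is whitelisted."""
        previous = self.last_seen.get(src_ip)
        if previous is not None:
            gap = start_time - previous
            # Two flows from the same source, separated by an interval:
            # randomly forged addresses are very unlikely to do this.
            if self.min_gap <= gap <= self.window:
                self.whitelist.add(src_ip)
        self.last_seen[src_ip] = start_time
        return src_ip in self.whitelist
```

Sources that end up in `whitelist` would then be excluded from the blacklist update that follows the clustering verdict.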
4 Experiments
4.1 Experimental Environment and Settings
The network test bed of this experiment consists of three OpenFlow switches in the
campus network center. As shown in Fig. 6, the client and the attack host are
connected to switch S1, and the server is connected to S3. The three switches are
connected to the controller, and the collection module is connected to the source
switch S1. The compute server is equipped with an Intel E5-2620 v3 CPU, 64 GB of RAM
and a GeForce GTX Titan X. The operating system is Ubuntu 16.04, and we chose
ONOS 1.13 as the SDN controller.
The experimental data was collected before the experiment and divided into normal
traffic and DDoS attack traffic. The DDoS attack traffic was generated with the
attack tool Hping3. Our training set includes only normal traffic and no DDoS attack
traffic; the purpose of the training phase is to determine the threshold of the
cluster radius.
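One way to determine the cluster radius from a normal-only training set is sketched below. It assumes that the cluster centers have already been found and that the radius of a cluster is the largest distance from any normal training sample assigned to it; the text does not state the exact rule, and `fit_cluster_radii` is a hypothetical helper.

```python
import math


def fit_cluster_radii(samples, centers):
    """Radius of each cluster = largest distance of its normal samples."""
    radii = [0.0] * len(centers)
    for sample in samples:
        distances = [math.dist(sample, center) for center in centers]
        t = distances.index(min(distances))   # nearest cluster
        radii[t] = max(radii[t], distances[t])
    return radii
```

In practice a small margin might be added to each radius so that the outermost training samples themselves still fall strictly inside their cluster.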
Fig. 7. Changes of data plane feature indicators before and after DDoS attack.
5 Conclusion
1. Wang, R., Jia, Z., Ju, L.: An entropy-based distributed DDoS detection mechanism in
software-defined network. In: 2015 IEEE Trustcom/BigDataSE/ISPA, vol. 1, pp. 310–317.
IEEE (2015)
2. Dou, W., Chen, Q., Chen, J.: A confidence-based filtering method for DDoS attack defense
in cloud environment. Future Gener. Comput. Syst. 29(7), 1838–1850 (2013)
3. Ye, J., Cheng, X., Zhu, J., et al.: A DDoS attack detection method based on SVM in software
defined network. Secur. Commun. Netw. 2018 (2018)
4. Qin, X., Xu, T., Wang, C.: DDoS attack detection using flow entropy and clustering
technique. In: 2015 11th International Conference on Computational Intelligence and
Security (CIS), pp. 412–415. IEEE (2015)
5. Gil, T.M., Poletto, M.: MULTOPS: a data-structure for bandwidth attack detection. In:
USENIX Security Symposium, pp. 23–38 (2001)
6. Mousavi, S.M., St-Hilaire, M.: Early detection of DDoS attacks against SDN controllers. In:
2015 International Conference on Computing, Networking and Communications (ICNC).
IEEE, pp. 77–81 (2015)
7. Barki, L., Shidling, A., Meti, N., et al.: Detection of distributed denial of service attacks in
software defined networks. In: 2016 International Conference on Advances in Computing,
Communications and Informatics (ICACCI). IEEE, pp. 2576–2581 (2016)
8. Niyaz, Q., Sun, W., Javaid, A.Y.: A deep learning based DDoS detection system in software-
defined networking (SDN). arXiv preprint arXiv:1611.07400 (2016)
9. Braga, R., de Souza Mota, E., Passito, A.: Lightweight DDoS flooding attack detection using
NOX/OpenFlow. In: LCN, vol. 10, pp. 408–415 (2010)
10. D’Cruze, H., Wang, P., Sbeit, R.O., Ray, A.: A software-defined networking (SDN) approach
to mitigating DDoS attacks. In: Latifi, S. (ed.) Information Technology - New Generations.
AISC, vol. 558, pp. 141–145. Springer, Cham (2018).
11. Yang, X., Han, B., Sun, Z., et al.: SDN-based DDoS attack detection with cross-plane
collaboration and lightweight flow monitoring. In: GLOBECOM 2017-2017 IEEE Global
Communications Conference. IEEE, pp. 1–6 (2017)
12. Zhao, T., Li, T., Han, B., et al.: Design and implementation of software defined hardware
counters for SDN. Comput. Netw. 102, 129–144 (2016)
13. Yan, Q., Yu, F.R., Gong, Q., et al.: Software-defined networking (SDN) and distributed
denial of service (DDoS) attacks in cloud computing environments: a survey, some research
issues, and challenges. IEEE Commun. Surv. Tutor. 18(1), 602–622 (2015)
14. Qin, X., Xu, T., Wang, C.: DDoS attack detection using flow entropy and clustering
technique. In: 2015 11th International Conference on Computational Intelligence and
Security (CIS). IEEE, pp. 412–415 (2015)
15. Qiu, H., Noura, H., Qiu, M., et al.: A user-centric data protection method for cloud storage
based on invertible DWT. IEEE Trans. Cloud Comput. (2019)
16. Qiu, H., Memmi, G.: Fast selective encryption method for bitmaps based on GPU
acceleration. In: 2014 IEEE International Symposium on Multimedia, pp. 155–158. IEEE
17. Qiu, H., Kapusta, K., Lu, Z., et al.: All-or-nothing data protection for ubiquitous
communication: challenges and perspectives. Inf. Sci. (2019)
A Hardware Trojan Detection Method Design
Based on TensorFlow
Abstract. A hardware Trojan is an extra circuit inserted into a chip design that can
cause malicious functional changes, reduced reliability, or disclosure of secret
information. The design of a hardware Trojan circuit is concealed: it is triggered
only under rare conditions and stays in a waiting state for most of its life cycle.
It is also quite small compared with the host design and has little influence on
circuit parameters. Therefore, hardware Trojans are difficult to detect. In this
paper, fast and accurate detection is built on TensorFlow, Google's open-source
machine learning framework. The hardware Trojan circuits are the standard circuits
provided by Trust-Hub and are realized through FPGA programming. ISE is used for
compilation and simulation to obtain the characteristic values of the circuits.
Finally, a hardware Trojan detection platform based on machine learning is
established by training on the simulation data with TensorFlow. The experimental
results verify the correctness of the design and provide a simple hardware Trojan
detection method for ICs.
1 Introduction
Trojan circuit is composed of a trigger and a payload that activate and execute the
intended objectives [7]. According to the Trojan design, when the trigger signal
reaches a rare expected value, the hardware Trojan activation signal is asserted and
the payload is enabled, which affects the stability of the system and leads to
loss of system function or information leakage [8].
Since hardware Trojan activation is rare, a hardware Trojan cannot be detected by
exhaustive testing [9]. To this end, this paper designs a hardware Trojan detection
system based on TensorFlow. A feature matrix is formed by acquiring extrinsic
features of the hardware circuit, and a machine learning model is designed and
trained on it so that it can correctly detect hardware Trojans.
The United States has held an information security month every year since 2007. Many
American universities, such as Yale University, New York University and Carnegie
Mellon University, have carried out in-depth research on hardware Trojans. Great
progress has been achieved in hardware Trojan design and detection technology.
The research method has evolved from the initial software simulation to FPGA
experiments. Detection based on bypass (side-channel) signals has developed rapidly,
and its effectiveness has improved continuously, so that chips carrying various types
of hardware Trojans can be detected. The structural diagram is shown in Fig. 1 [10].
The hardware Trojan circuit is composed of trigger logic and function logic. When the
system runs under certain conditions, the trigger signal of the hardware Trojan
circuit is asserted, and the system then operates according to the Trojan design,
which affects the system function. The hardware Trojan circuit can also sense the
temperature and the electromagnetic environment surrounding the system; it is then
activated at a certain system temperature.
In the design, a feature matrix is formed from the circuit characteristics of the
system. By comparing it with the golden template without a Trojan, a Trojan feature
matrix is generated. A machine learning model is established and trained to obtain a
correct hardware Trojan detection model. The structure diagram is shown in Fig. 2.
The machine learning model learns from the existing Trojan matrix information and
adjusts its parameters so that it outputs more appropriate values. The logic formed
by the learned parameters and the arithmetic formula is the final model obtained by
the system. Afterwards, the trained model is saved. A hardware Trojan is detected by
extracting the feature matrix of a specific circuit and feeding it into the machine
learning model.
246 W. Wu et al.
In hardware Trojan circuit detection, the IP core is examined through its hardware
description language (HDL) description. The key is to build a comprehensive hardware
Trojan database. The hardware Trojan characteristics are extracted from the hardware
circuits in the Trust-Hub benchmark [11, 12]. First, the hardware circuit is realized
in a hardware description language. The characteristic matrix of the hardware circuit
is then obtained through synthesis, place and route, and system simulation for
extraction and analysis. The characteristic matrix of the hardware Trojan is obtained
by comparison with the golden template.
Figure 3 shows the FPGA RTL schematic diagrams of RS232-T100 in the Trust-
Hub benchmark. As shown in Fig. 3, the RS232-T100 system is composed of three
modules, the RS232 module, the Trojan_Trigger module and the Trojan module. When
the system reaches a specific state (i.e., State is a specific value), the Trojan module is
triggered, which influences the normal operation of RS232.
Figure 4 shows the output of the simulated hardware circuit using the Altera
Cyclone V 5CEFA9F31C7. After the hardware Trojan is inserted into the system, the
logic utilization, the register number and the I/O power consumption of the system
change noticeably, but there is no obvious change in static power consumption.
This paper uses the Xilinx Xq7vx330t chip to obtain the register number, Look-Up
Table (LUT) number, Flip-Flop (FF) number and power consumption of the system by
simulation in the ISE software. The power consumption information is collected
through XPower analysis [13]. Table 1 lists the output feature values of 12 circuits,
RS232 and RS232-T1000 to RS232-T2000, from the Trust-Hub benchmark platform as
simulated by the Xilinx ISE software, including registers, LUTs, Slices and FFs. A
hardware Trojan implements its function circuit by occupying LUTs and Flip-Flops of
the system. Without a hardware Trojan, there are 52 LUTs and 64 Flip-Flops. In
circuits with an inserted hardware Trojan, the two values increase significantly and
are strongly correlated. The RS232 hardware Trojan simulation output values are shown
in Fig. 5.
From the above analysis, it can be concluded that the register number, flip-flop
number and power consumption of a circuit infected with a hardware Trojan change
significantly, while the static power consumption and the I/O number change only
slightly. Therefore, the register number, flip-flop number and power consumption
are taken as the input information of the system, and the LUT number as the output
information. A two-dimensional matrix is established and compared with the golden
template, thus forming the final Trojan information matrix.
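The comparison with the golden template can be sketched as an element-wise difference. Only the golden values of 52 LUTs and 64 Flip-Flops come from the text; the feature order, the helper names, and any measured values passed in are hypothetical, and the real matrix would also carry the register and power columns of Table 1.

```python
# Golden (Trojan-free) RS232 reference values quoted in the text.
GOLDEN = {"lut": 52, "flip_flop": 64}
FEATURES = ("lut", "flip_flop")  # assumed column order


def trojan_feature_row(measured, golden=GOLDEN):
    """Difference between one measured circuit and the golden template."""
    return [measured[name] - golden[name] for name in FEATURES]


def trojan_feature_matrix(circuits, golden=GOLDEN):
    """Stack the per-circuit differences into a two-dimensional matrix."""
    return [trojan_feature_row(measured, golden) for measured in circuits]
```

A Trojan-free circuit maps to a row of zeros, so any non-zero row directly shows how much extra logic the inserted circuit occupies.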
The network model of a single neuron is shown in Fig. 7, and its calculation
formula is Eq. 1.
$$z = \sum_{i=1}^{n} w_i x_i + b \qquad (1)$$
where z is the output result, x is the input, w is the weight, and b is the offset
(bias) value. The values of w and b are continually adjusted during model learning
until they reach suitable values. The logic formed by these values and the formula is
the neural network model [16–22].
Commonly used activation functions are Sigmoid, Tanh, ReLU and Softplus. Their
mathematical forms are as shown in Eqs. 2–5.
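For reference, the single-neuron computation and the four activation functions can be written in a few lines. The definitions below are the standard textbook forms, given here as an illustration; the function names are chosen for this sketch, and the naive `softplus` can overflow for very large inputs.

```python
import math


def neuron(x, w, b):
    """Weighted sum of the inputs plus the offset: z = sum(w_i * x_i) + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    return math.tanh(z)


def relu(z):
    return max(0.0, z)


def softplus(z):
    # Smooth approximation of ReLU: log(1 + e^z).
    return math.log1p(math.exp(z))
```

For example, `relu(neuron([1.0, 2.0], [0.5, -1.0], 0.25))` clips the negative pre-activation to zero, which is the behaviour the system relies on when ReLU is selected as its activation function.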
The ReLU function is taken as the activation function of the system, and the output
of the intermediate layer is a 1 × 2 matrix. Its calculation formula is Eq. 2. There
are two nodes in the hidden layer, and the model is trained with a learning rate of
0.0001. The design steps are as follows:
Firstly, the learning parameters are defined. The weight w and the offset value b
are defined in the form of dictionaries, in which h1 represents the hidden layer and
h2 represents the final output layer. The input of the forward structure of the model
is x, which is multiplied by w of the first layer and added to b. The result is
transformed by the ReLU activation function to produce y_pred_layer_1.
y_pred_layer_1 is then fed into the second layer, whose ReLU function generates the
final output y_pred.
Secondly, the back propagation structure of the model uses the mean squared
difference reduce_mean() to calculate the loss function, which is optimized by
Thirdly, the simulated dataset is generated manually: the simulated data are taken
from the dataset in Table 1 and fed into the input x of the model to obtain the
final output y. The system iterates 50,000 times. The model is trained and saved.
Finally, the data set x to be predicted is fed into the model to generate the final
result. The accuracy of the system is obtained by comparing the predicted values with
the true values.
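The forward structure and the loss described in the design steps can be illustrated without TensorFlow. The sketch below is a plain-Python stand-in, not the authors' code: it mirrors the dictionary layout with keys h1 and h2, a two-node ReLU hidden layer and a single ReLU output, and it computes the mean squared error that the training step would minimize. The three-feature input and all weight values used with it are hypothetical.

```python
def relu(z):
    return max(0.0, z)


def dense_relu(x, weights, biases):
    """One fully connected layer; weights[j] holds the weights of node j."""
    return [relu(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def forward(x, params):
    """x -> hidden layer h1 (two nodes) -> output layer h2 (one node)."""
    y_pred_layer_1 = dense_relu(x, params["w"]["h1"], params["b"]["h1"])
    return dense_relu(y_pred_layer_1, params["w"]["h2"], params["b"]["h2"])[0]


def mse(predictions, targets):
    """Mean squared difference between predictions and true values."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)
```

In the TensorFlow version the same structure would be built from tensors, with the optimizer adjusting the entries of `params` over the training iterations.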
5 Experimental Verification
In the hardware Trojan detection platform design, the data set is first simulated and
acquired. The FPGA experimental platform is connected to the PC through the JTAG
interface, and the output interface is connected to the oscilloscope. After the
system is powered on, the Verilog files from the Trust-Hub benchmark platform are
loaded into ISE and compiled for synthesis and place and route. The synthesis and
place-and-route results and the data from XPower are read, and the results are shown
in Tables 1 and 2.
Through the test platform simulation design, the prediction results are obtained, as
shown in Table 3. The accuracy rate is above 93%, which indicates good robustness.
By comparison, the accuracy rate of the SVM method is approximately 83% [9]. The
prediction accuracy of the experiment platform is presented in Fig. 8. The
system has a high accuracy and predicts hardware Trojans well.
6 Conclusions
This paper proposes a hardware Trojan detection method based on TensorFlow. The
method uses machine learning to protect integrated circuits and chips against
hardware Trojan attacks, and it can improve the hardware Trojan detection success
rate. However, further studies are still needed. (1) This paper concentrates on the
RS232 series of circuits; other circuit series have not been tested. (2) The RS232
series of circuits has been tested, but the results have not been broken down and
discussed by Trojan class. As hardware Trojans and machine learning are widely
investigated by researchers and industry participants, the application of this
technique will improve and the problems at this stage will be solved. The proposed
detection technique can be applied to chip inspection, for example in chip design and
testing.
This research was supported by Key Construction Discipline under project number
6081700310; Key Laboratory and Scientific Research Platform under project number
6161700206 and 6161700202; Fujian Provincial Education Department Science and
Technology Project under project number JAT170435; CERNET Innovation Project
1. Lei, Z., Mengxi, Y., Chaoen, X., Youheng, D.: Hardware trojan detection based on
optimized SVM algorithm. Appl. Electron. Tech. 44(11), 17–20 (2018)
2. Liang, W., et al.: A security situation prediction algorithm based on HMM in mobile
network. WCMC 2018(4), 1–11 (2018)
3. Liang, W., et al.: Efficient data packet transmission algorithm for IPV6 mobile vehicle
network based on fast switching model with time difference. FGCS 100, 132–143 (2019)
4. Liang, W., Long, J., Weng, T.H., Chen, X., Li, K.C., Zomaya, A.Y.: TBRS: a trust based
recommendation scheme for vehicular CPS. FGCS 92, 383–398 (2019)
5. Cakir, B., Malik, S.: Hardware Trojan detection for gate-level ICs using signal correlation
based clustering. In: Proceedings Design, Automation and Test in Europe (DATE), p. 47147
6. Xue, M., Bian, R., Liu, W., et al.: Defeating untrustworthy testing parties: a novel hybrid
clustering ensemble based golden models-free hardware Trojan detection method. IEEE
Access 7, 5124–5140 (2018)
7. Dong, C., He, G., Liu, X., et al.: A multi-layer hardware trojan protection framework for IoT
chips. IEEE Access 7, 23628–23639 (2019)
8. Hasegawa, K., Yanagisawa, M., Togawa, N.: Hardware Trojans classification for gate-level
netlists using multi-layer neural networks. In: 2017 IEEE 23rd International Symposium on
On-Line Testing and Robust System Design (IOLTS). IEEE, pp. 227–232 (2017)
9. Hasegawa, K., Yanagisawa, M.: A hardware-Trojan classification method
using machine learning at gate-level netlists based on Trojan features. IEEE (2017)
10. Lin, N., Lei, S., Kun, H., Shaoqing, L.: Hardware Trojan detection of IP soft cores based on
feature matching. Comput. Eng. 43(03), 176–180 (2017)
11. Salmani, H., Tehranipoor, M., Karri, R.: On design vulnerability analysis and trust
benchmarks development. In: 2013 IEEE 31st International Conference on Computer Design
(ICCD). IEEE, pp. 471–474 (2013)
12. Shakya, B., He, T., Salmani, H., Forte, D., Bhunia, S., Tehranipoor, M.: Benchmarking of
hardware Trojans and maliciously affected circuits. J. Hardw. Syst. Secur. (HaSS) 1, 85–102
13. FIFO Generator v13.2 LogiCORE IP Product Guide. Xilinx (2017).
14. Chang Zhanguo, P., Baoming, L.X., Shuai, W., Shuo, Y.: Automatic detection of myocardial
infarction via machine learning. Comput. Syst. Appl. 04, 218–224 (2019)
15. Fangyu, R., Yang, X., Siyuan, Z., Renyuan, H.: Research on remote sensing image feature
recognition based on TensorFlow. Sci. Technol. Innov. Her. 15(11), 53–54 (2018)
16. Huang Rui, L., Yilin, X.W.: Handwritten digital recognition and application based on
TensorFlow deep learning. Appl. Electron. Tech. 44(10), 6–10 (2018)
17. Gavai, N.R., Jakhade, Y.A., Tribhuvan, S.A., et al.: MobileNets for flower classification
using TensorFlow. In: 2017 International Conference on Big Data, IoT and Data Science
(BID), pp. 154–158. IEEE (2017)
18. Saxena, A.: Convolutional neural networks: an illustration in TensorFlow. XRDS:
Crossroads ACM Mag. Stud. 22(4), 56–58 (2016)
19. Thakur, K., Qiu, M., Gai, K., et al.: An investigation on cyber security threats and security
models. In: 2015 IEEE 2nd International Conference on Cyber Security and Cloud
Computing. IEEE, pp. 307–311 (2015)
20. Gai, K., Qiu, M., Elnagdy, S.A.: Security-aware information classifications using supervised
learning for cloud-based cyber risk management in financial big data. In: 2016 IEEE 2nd
International Conference on Big Data Security on Cloud (BigDataSecurity), IEEE
International Conference on High Performance and Smart Computing (HPSC), and IEEE
International Conference on Intelligent Data and Security (IDS), pp. 197–202. IEEE (2016)
21. Qiu, H., Qiu, M., Lu, Z., et al.: An efficient key distribution system for data fusion in V2X
heterogeneous networks. Inf. Fusion 50, 212–220 (2019)
22. Qiu, H., Kapusta, K., Lu, Z., et al.: All-Or-Nothing data protection for ubiquitous
communication: challenges and perspectives. Inf. Sci. (2019)
Resolving the Loop in High-Level
SDN Program for Multi-table
Pipeline Compilation
1 Introduction
Fig. 1. The three flow tables for the simple routing program with the structure of each
table a → b representing the matches of table a being written to table b.
This simple routing program routes packets based on policies specified by the
source IP address of each packet. As each statement can be viewed as a flow table,
the program can be converted to a pipeline with three flow tables as shown in
Fig. 1 (i.e., the proactive program compilation to datapath). Note that the
return statement does not require conversion. However, to satisfy the hardware
constraints of OpenFlow switches (e.g., a limited number of flow tables), some tables
must be merged in the pipeline. For example, if the number of tables is limited
to 2, then sTable and rTable can be merged into a single table whose matches
are dstIP and policy. We refer to this process as the pipeline design (i.e., deciding
which tables should be merged). Specifically, we call the initial pipeline
the software pipeline and the new pipeline the hardware pipeline, as it satisfies the
hardware constraints.
Given the many pipeline design strategies with differing objectives and
constraints, in this paper we focus on minimizing the total number of
flow rules under a limited number of flow tables. For example,
in the simple routing software pipeline with 2 flow tables allowed, if
pTable and sTable have 1000 flow rules each and rTable has 900 flow rules,
then merging sTable and rTable (for a total of 1000 + 1000 * 900 = 901,000
flow rules) is preferable to merging pTable and sTable
(for a total of 1000 * 1000 + 900 = 1,000,900 flow rules). We further show
that the first design is optimal. Note that in this paper, we do not consider
table merging algorithms or the corresponding flow rule compression, as they have
been well studied [9,11].
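The rule-count comparison above can be checked with a short sketch. It assumes, as the example does, that merging tables multiplies their rule counts while separate tables in the pipeline add up; the helper name total_rules is hypothetical and not part of the paper.

```python
from math import prod

def total_rules(groups):
    """Total flow rules when each group of tables is merged into one table.

    Assumption (taken from the example in the text): a merged table holds the
    product of the rule counts of its members; separate tables add up.
    """
    return sum(prod(group) for group in groups)

p_table, s_table, r_table = 1000, 1000, 900

# Design 1: keep pTable, merge sTable and rTable.
merge_s_r = total_rules([[p_table], [s_table, r_table]])
# Design 2: merge pTable and sTable, keep rTable.
merge_p_s = total_rules([[p_table, s_table], [r_table]])

print(merge_s_r, merge_p_s)  # 901000 1000900
```

Under this cost model, the design that merges sTable and rTable needs fewer rules.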
existing related works [4,15] cannot support loops in their languages. The simple
loop program demonstrates the utility of loops:
L1: def onPacket(pkt):
L2:   p = None; u = Max; ports = allPorts(pkt.dstIP)
L3:   for tp, tu in ports:
L4:     if u > tu:
L5:       u = tu
L6:       p = tp
The simple loop program first initializes a port (p) and its utilization (u), and
selects a set of potential ports (line 2). It then iterates over all the potential ports
(line 3 to line 6) and compares their utilization to select the least utilized
port. Although this could be achieved without a loop by defining libraries for port
selection, such libraries would constrain the selection policies.
Programmers should have enough flexibility to specify their own policies in a
high-level language (i.e., line 4 to line 6).
As existing SDN datapaths (e.g., [1]) do not support loops, a transformation
is required to realize loops in the hardware pipeline. There are two potential
methods for the transformation:
The first method is the black-box approach that considers the whole loop
as a single statement. The loop in the simple loop program can be viewed as a
single flow table where the matches include p, u, and dstIP. The issues for this
approach are: (1) It cannot leverage the multi-table pipeline structure in the
datapath; (2) The table merging in the pipeline design used to compute the flow
table rules is difficult to implement due to loop dependencies.
The second method is to leverage the loop unrolling technique from the compiler
domain. Unrolling unfolds the loop by enumerating the values of the loop index (i.e., the
value of i). For example, the loop in the simple loop program may be transformed
as follows (we first convert for tp, tu in ports to for i in range(len(ports))):
L1: def onPacket(pkt):
L2:   ...
L3:   i = 0
L4:   ... #the loop body
L5:   i = 1
L6:   ... #the loop body
L7:   ... #i from 2 to len(ports) - 1
The benefit of the unrolling approach is that the pipeline design can be
applied to the transformed program. Its limitations are: (1) The unrolling
approach can only handle static loop conditions (e.g., an explicit number of
iterations of i), which is limiting as the loop condition can be dynamic (e.g., while
pkt.vlan is None); (2) Even though the loop can be transformed into sequential
statements, the long sequence of statements may degrade the pipeline design
In this paper, we aim to resolve the loops in SDN programs by proposing
efficient, repeated software pipeline (RSP) transformations which support the
dynamic loop condition.
to break the loop. An RSP consists of multiple copies of the software pipeline connected
in sequence. The formal definition of the RSP is as follows:
Repeated Software Pipeline (RSP): Given n pipelines (denoted as pl_1, pl_2,
..., pl_n) that have the same layout and flow rules, we denote by pl^n the n-RSP
constructed by sequentially connecting pl_1, pl_2, ..., pl_n. When a packet pkt passes
through pl_i without any terminal action, such as DROP or PORT(i), pkt
arrives at pl_{i+1} for 1 ≤ i < n. An example of an RSP and its SSA form is shown
in Fig. 2.
RSP of PL:        (v_a, v_b) -> v_a | v_c -> v_b | (v_a, v_b) -> v_a | v_c -> v_b
SSA of RSP of PL: (v_a1, v_b1) -> v_a2 | v_c -> v_b2 | (v_a2, v_b2) -> v_a3 | v_c -> v_b3
We observe that, for a given loop and its corresponding RSP, if a packet pkt
iterates the loop n times and then produces an output, pkt would
have the same output as if it passed through the n-RSP. The intuition is as
follows (considering a loop that iterates over i = 0, 1, ..., n): although each software
pipeline in the repeated pipeline is identical to its peers, each pipeline
increments i by one. Therefore, a packet may match different rules in different
pipelines, and by passing through the n repeated pipelines it
obtains the same output as n iterations of the loop.
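The observation can be illustrated with a small simulation. This is only a sketch under stated assumptions: the packet state is modeled as a tuple (i, p, u), one pipeline copy is modeled as a plain function that applies the loop body once, and the names loop_select, pipeline_copy and rsp_select are hypothetical.

```python
import math

def loop_select(ports):
    """Reference behaviour: the simple loop program (least utilized port)."""
    p, u = None, math.inf
    for tp, tu in ports:
        if u > tu:
            u, p = tu, tp
    return p

def pipeline_copy(state, ports):
    """One copy of the software pipeline: apply the loop body once.

    Every copy is identical; only the packet state (i, p, u) differs.
    Returns (state, done), where done models a terminal action.
    """
    i, p, u = state
    if i >= len(ports):          # loop condition fails: terminal action
        return state, True
    tp, tu = ports[i]
    if u > tu:
        u, p = tu, tp
    return (i + 1, p, u), False  # each copy increments i by one

def rsp_select(ports, n):
    """Pass the packet state through n sequentially connected copies."""
    state = (0, None, math.inf)
    for _ in range(n):
        state, done = pipeline_copy(state, ports)
        if done:
            break
    return state[1]

ports = [("port1", 0.7), ("port2", 0.2), ("port3", 0.5)]
print(loop_select(ports) == rsp_select(ports, len(ports)))
```

With fewer copies than iterations, the sketch only reflects the ports examined so far, which is why the number of copies matters in the analysis that follows.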
Having introduced the RSP, we apply it to pipeline design. In the next
section, we give an analysis of the pipeline design based on RSP to show that it
can be done efficiently and supports dynamic loop conditions (i.e., effectiveness
of RSP).
4 Analysis
In this section, we first give an analysis of the optimal pipeline design. We then
give an analysis of the RSP to demonstrate its efficiency and efficacy.
We model the pipeline design as a partition problem based on the dataflow graph
of the pipeline.
Dataflow Graph (DFG): Given a pipeline pl, we denote by DFG(pl) the
dataflow graph of pl, which is a directed acyclic graph (V, E) in which a vertex
v ∈ V specifies a variable in pl. If there is a table $t_i$ in pl with
$t_i.inputV = \{v^{i}_{i1}, v^{i}_{i2}, \ldots, v^{i}_{i|t_i.inputV|}\}$ and
$t_i.outputV = \{v^{o}_{i1}, v^{o}_{i2}, \ldots, v^{o}_{i|t_i.outputV|}\}$, there are
$|t_i.inputV| * |t_i.outputV|$ directed edges, $v^{i}_{i1} \to v^{o}_{i1}$, $v^{i}_{i2} \to v^{o}_{i1}$, ...,
$v^{i}_{i|t_i.inputV|} \to v^{o}_{i|t_i.outputV|}$, in E. The DFG of the SSA formed pipeline in Fig. 2 is shown in
Fig. 3.
Fig. 3. The DFG of a pipeline and the source vertices selection. (Color figure online)
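As a minimal sketch of the DFG definition, the edge set can be built by connecting every input variable of a table to every output variable of the same table. The representation of a table as a pair of tuples and the name build_dfg are assumptions made for illustration; the tables are the SSA form listed for Fig. 2.

```python
def build_dfg(tables):
    """Return the edge set of the dataflow graph for a list of tables.

    Each table is a pair (input_vars, output_vars); every input variable
    gets a directed edge to every output variable of the same table.
    """
    edges = set()
    for input_vars, output_vars in tables:
        for v_in in input_vars:
            for v_out in output_vars:
                edges.add((v_in, v_out))
    return edges

# SSA form of the repeated pipeline listed for Fig. 2.
ssa_tables = [
    (("v_a1", "v_b1"), ("v_a2",)),
    (("v_c",), ("v_b2",)),
    (("v_a2", "v_b2"), ("v_a3",)),
    (("v_c",), ("v_b3",)),
]
dfg_edges = build_dfg(ssa_tables)
print(sorted(dfg_edges))
```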
Source Vertices Selection: Select a subset of the source vertices in the DFG into
a set S; if a vertex v in the DFG has all of its parent vertices in S, then
v can be added to S. One selection can involve multiple iterations of this adding
process. After the selection, remove S (i.e., S is partitioned) from the original
DFG; S can then be viewed as a flow table t such that a vertex without any
parent in S is added to t.inputV and a vertex without any child in S is added
to t.outputV. An example of source vertices selection is shown in Fig. 3, where
va2 can be added to S1 but va3 cannot (even after adding va2 to S1) as some of
its parents are not in S1. By adding vb2 and vb3 to S2, the current selection
yields three tables (as shown by the red dotted line in Fig. 3): (va1, vb1) → va2,
vc → (vb2, vb3), and (va2, vb2) → va3.
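The selection process can be sketched as a closure computation over the DFG edges. The function names select and as_table are hypothetical, and the edge list is an assumed encoding of the SSA pipeline of Fig. 2; the sketch only mirrors the rules stated in the definition.

```python
def select(edges, seeds):
    """Grow S from chosen source vertices: a vertex joins S once all of its
    parents are already in S (repeated until nothing more can be added)."""
    parents = {}
    for src, dst in edges:
        parents.setdefault(dst, set()).add(src)
    selected = set(seeds)
    changed = True
    while changed:
        changed = False
        for vertex, vertex_parents in parents.items():
            if vertex not in selected and vertex_parents <= selected:
                selected.add(vertex)
                changed = True
    return selected

def as_table(edges, selected):
    """View S as a flow table: vertices without a parent in S are inputs,
    vertices without a child in S are outputs."""
    inner = [(s, d) for s, d in edges if s in selected and d in selected]
    has_parent = {d for _, d in inner}
    has_child = {s for s, _ in inner}
    return selected - has_parent, selected - has_child

edges = [("v_a1", "v_a2"), ("v_b1", "v_a2"), ("v_c", "v_b2"),
         ("v_a2", "v_a3"), ("v_b2", "v_a3"), ("v_c", "v_b3")]

s1 = select(edges, {"v_a1", "v_b1"})   # v_a2 joins, v_a3 does not
s2 = select(edges, {"v_c"})            # v_b2 and v_b3 join
print(as_table(edges, s1), as_table(edges, s2))
```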
We model the pipeline design as several steps of selections on the DFG
of the pipeline. If the pipeline design is limited to k tables, then there will be at
most k − 1 iterations of the selection process.
Greedy Property of the Optimal Pipeline Design: A pipeline design is
optimal if the generated pipeline has the minimum number of flow
rules. In this case, each step of the selection should be “greedy”: if v
can be added to S, it is added. We call this the greedy selection
property. For example, in Fig. 3, to achieve the optimal pipeline, vb3 should
be added to S2; otherwise there is another table: vc → vb3. Given the page
limit, we omit an exhaustive proof but provide the intuition: if
there are two flow tables t1 and t2, and t2.inputV ⊆ (t1.inputV ∪ t1.outputV),
then merging t2 into t1 will not add extra flow rules. This means t2 should be
merged into t1 in the optimal pipeline design to reduce the number of processing
steps a given packet encounters, improving the per-packet processing latency.
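The subset condition behind this intuition is easy to state in code. The sketch below only checks the condition t2.inputV ⊆ (t1.inputV ∪ t1.outputV); it does not compute merged rules, and the function name merge_adds_no_rules and the tuple encoding of tables are assumptions.

```python
def merge_adds_no_rules(t1, t2):
    """Check the condition under which merging t2 into t1 is free:
    every input variable of t2 is an input or output variable of t1."""
    inputs1, outputs1 = t1
    inputs2, _ = t2
    return set(inputs2) <= set(inputs1) | set(outputs1)

# v_c -> v_b2 and v_c -> v_b3 share their only input, so merging is free.
free_merge = merge_adds_no_rules((("v_c",), ("v_b2",)), (("v_c",), ("v_b3",)))

# (v_a2, v_b2) -> v_a3 needs v_b2, which (v_a1, v_b1) -> v_a2 does not provide.
costly_merge = merge_adds_no_rules((("v_a1", "v_b1"), ("v_a2",)),
                                   (("v_a2", "v_b2"), ("v_a3",)))
print(free_merge, costly_merge)  # True False
```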
260 X. Wang et al.
Based on the greedy property of the optimal pipeline design, we now
demonstrate (by analysis) the efficiency and effectiveness of applying pipeline design
to the RSP. We first state the goal of the optimal pipeline design.
Goal of the Optimal Pipeline Design: To compute the pipeline with the
minimum number of flow rules for the RSP pl^n under the hardware constraint that
at most k tables are available, where k < n. As the pipeline depth influences
the latency of packets passing through the pipeline, if two pipelines have the
same number of flow rules, the pipeline with the smaller number of tables is
preferred.
Complexity: As the complexity of the pipeline design depends on the number of
vertices in the DFG, the pipeline design of pl^n is a complex process when n is
very large. However, we show that the complexity depends on k, not n.
We first outline three concepts: single-output pipeline, end-to-end selection,
and full-output table.
Single-Output Pipeline: A pipeline pl for which the output variables of pl
only appear in the last table of pl. We denote such a pipeline as a single-output
pipeline (SO-PL).
End-to-End (E2E) Selection: A source vertices selection (on a proper
subset of all source vertices) in the DFG is an E2E selection if at least one sink
vertex v of the DFG is selected (i.e., v becomes an output variable of the
flow table generated by the selection). An example of E2E
selection is shown in Fig. 3. S1 is not E2E as va3 also depends on vb2. However,
S2 is E2E as vb3 only depends on vc.
From the definition of E2E selection, we find that there is no E2E selection
in an SO-PL: for any selection that has at least one sink vertex v selected, the
ancestors of v must include all the source vertices (not a proper subset).
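The E2E test can be sketched directly from the definition: start from a proper subset of the source vertices, grow the selection, and check whether a sink vertex was reached. The names closure and is_e2e and the edge encodings are hypothetical; the first edge list encodes the SSA pipeline of Fig. 2 and the second a single-table pipeline used as an SO-PL example.

```python
def closure(edges, seeds):
    """Add every vertex whose parents are all selected, until a fixpoint."""
    parents = {}
    for src, dst in edges:
        parents.setdefault(dst, set()).add(src)
    selected = set(seeds)
    changed = True
    while changed:
        changed = False
        for vertex, vertex_parents in parents.items():
            if vertex not in selected and vertex_parents <= selected:
                selected.add(vertex)
                changed = True
    return selected

def is_e2e(edges, seeds):
    """True if seeds is a proper subset of the sources and the resulting
    selection contains at least one sink vertex."""
    vertices = {v for edge in edges for v in edge}
    sources = vertices - {dst for _, dst in edges}
    sinks = vertices - {src for src, _ in edges}
    seeds = set(seeds)
    if not seeds < sources:      # must be a proper subset of the sources
        return False
    return bool(closure(edges, seeds) & sinks)

fig2_edges = [("v_a1", "v_a2"), ("v_b1", "v_a2"), ("v_c", "v_b2"),
              ("v_a2", "v_a3"), ("v_b2", "v_a3"), ("v_c", "v_b3")]
so_pl_edges = [("x", "z"), ("y", "z")]   # single table (x, y) -> z

print(is_e2e(fig2_edges, {"v_a1", "v_b1"}), is_e2e(fig2_edges, {"v_c"}))
print(is_e2e(so_pl_edges, {"x"}))
```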
Full-Output Table: Given pl^n, if the output variables of a table tx include all
the input variables of a pipeline pl_i, then tx is a full-output table of pl^n. By
the greedy property, all the remaining tables should be merged into tx in the
optimal pipeline design. If there are multiple pipelines whose input variables are
included in the output variables of tx, we define the minimum index of these
pipelines as MinIndexPL(tx). An example of a full-output table in the SSA
pipeline can be found in Fig. 2: a table (denoted by ty) that matches
va2, vb2, vc, which are the input variables of the second pipeline. If the second
pipeline has been merged into ty, then the third pipeline (not shown in the
figure) can also be merged into ty. In this case, MinIndexPL(ty) is still 2, which is
the index of the second pipeline.
We provide the following proposition on the existence of the full-output table
in the pipeline design.
Fig. 4. The proof of the existence proposition: under the assumption, three pipelines
(pl_1, pl_2, pl_3) can only be merged into at least four tables (t_1, t_2, t_3, t_4). Each block is a
DFG of a pipeline.
The existence proposition says that if we want to merge pl^n into k tables
where k < n, and if there is no E2E selection in the DFG of pl^n, then there must
exist at least one full-output table in the merged tables. Multiple full-output
tables may exist, as the optimality of the pipeline design is not involved in the
existence proposition.
Based on the existence of the full-output table proposition, we now give
the position proposition of the full-output table where we consider the optimal
pipeline design.
Proposition 2. Position: If there is no E2E selection in DFG(pl_i) ∀i ∈
{1, 2, ..., n}, then in the optimal pipeline design of pl^n into at most k tables (k < n),
there must exist exactly one full-output table tx among the merged tables,
and MinIndexPL(tx) ≤ k.
Summary: If each software pipeline in pl^n is an SO-PL (i.e., has no E2E selection),
then the optimal pipeline design of pl^n into at most k tables (k < n)
only needs to consider the first k software pipelines instead of all n
pipelines (by the position proposition, the result of the pipeline design with the first k
pipelines is the same as with all n pipelines). This demonstrates the efficiency,
as the complexity of the pipeline design does not scale with n. Further,
for the effectiveness, even when the number of iterations cannot be specified
prior to execution, we only need to consider the first k pipelines, which
yield the optimal pipeline design for merging into at most k tables.
5 Evaluation
In this section, we evaluate the proposed RSP design from two aspects: the
execution time of pipeline design and the number of flow rules of the generated
pipeline. All evaluations are run on a 3.5 GHz Intel i7 processor with 16 GB of
RAM running Mac OSX 10.13.
Fig. 5. The execution time of two approaches by changing the number of iterations
(n) and the number of vertices (m).
when k < n (around 10 times when n = 50 and m = 10). Note that when
n = 10, both RSP and the unrolling approach consider 10 software pipelines
in the pipeline design, therefore, the execution time of both approaches is the
same. And when n = 100, the execution time of unrolling is too large compared
with the RSP approach.
Table 1. The number of flow rules of the single table approach and the RSP approach.
Approach   n = 10, m = 10   n = 50, m = 10
Single     3898434          3898434
RSP        61856            224098
6 Related Work
High-Level SDN Programming Model: For the reactive high-level SDN pro-
gramming models [8,14,16], packets are forwarded to the controller to compute
actions. Maple [16] can support arbitrary processing on the packets including the
loop. However, it processes the packets in the controller which increases latency.
For the proactive programming models [4,15], though they can be compiled to
the datapath by leveraging the multi-table pipeline, they do not support loops
in their models.
SDN Control Platform: There are several SDN control platforms and
architectures [2,3,6,7,10] that support loops for the upper layer applications by
providing APIs to access packets in the controller. However, they are low-level
programming models (based on the OpenFlow [13] protocol) and also increase latency in packet
processing.
Programming Datapath: P4 [5] provides the capability to reconfigure the
datapath pipeline with the P4 language. However,
P4 does not support loops in its programming model.
7 Conclusions
In this paper, we have proposed RSP to resolve loops in SDN programs for the
pipeline design. Specifically, for a class of n iteration loops to be deployed to a
multi-table pipeline with a limited number of tables (i.e., the maximum number
of tables is k), the pipeline design only needs to handle the first k iterations of
the loop. It makes RSP very useful when there are dynamic loop conditions or
the number of iterations of the loop is very large.
Acknowledgement. The authors would like to thank the anonymous reviewers for
their insightful comments. This work is partly supported by National Key R&D Project
2018YFB2100800; NSFC 61972253, 61672349.
1. Broadcom: of-dpa.
2. Floodlight openflow controller.
3. The opendaylight project.
4. Arashloo, M.T., Koral, Y., Greenberg, M., Rexford, J., Walker, D.: SNAP: state-
ful network-wide abstractions for packet processing. In: Proceedings of the 2016
Conference on ACM SIGCOMM 2016 Conference, pp. 29–43. ACM (2016)
5. Bosshart, P., et al.: P4: programming protocol-independent packet processors. SIG-
COMM Comput. Commun. Rev. 44(3), 87–95 (2014)
6. Chen, L., Qiu, M., Dai, W., Jiang, N.: Supporting high-quality video streaming
with SDN-based CDNs. J. Supercomput. 73(8), 3547–3561 (2017)
7. Erickson, D.: The beacon openflow controller. In: Proceedings of the Second ACM
SIGCOMM Workshop on Hot Topics in Software Defined Networking, pp. 13–18.
ACM (2013)
8. Foster, N., et al.: Frenetic: a network programming language. ACM Sigplan Not.
46(9), 279–291 (2011)
9. Ge, J., Chen, Z., Wu, Y., E, Y.: H-SOFT: a heuristic storage space optimisation
algorithm for flow table of openflow. Concurr. Comput. Pract. Exp. 27(13), 3497–
3509 (2015)
10. Gude, N., et al.: NOX: towards an operating system for networks. ACM SIGCOMM
Comput. Commun. Rev. 38(3), 105–110 (2008)
11. Gupta, P., McKeown, N.: Algorithms for packet classification. IEEE Netw. 15(2),
24–32 (2001)
12. Han, Q., Meikang, Q., Zhihui, L., Memmi, G.: An efficient key distribution system
for data fusion in V2X heterogeneous networks. Inf. Fusion 50, 212–220 (2019)
13. McKeown, N., et al.: OpenFlow: enabling innovation in campus networks. ACM
SIGCOMM Comput. Commun. Rev. 38(2), 69–74 (2008)
14. Monsanto, C., Reich, J., Foster, N., Rexford, J., Walker, D.: Composing software
defined networks. In: 10th USENIX Symposium on Networked Systems Design and
Implementation (NSDI 2013), pp. 1–13 (2013)
15. Sivaraman, A., et al.: Packet transactions: high-level programming for line-rate
switches. In: Proceedings of the 2016 ACM SIGCOMM Conference, pp. 15–28.
ACM (2016)
16. Voellmy, A., Wang, J., Yang, Y.R., Ford, B., Hudak, P.: Maple: simplifying SDN
programming using algorithmic policies. In: ACM SIGCOMM Computer Commu-
nication Review, vol. 43, pp. 87–98. ACM (2013)
17. Wang, X., Wang, C., Zhang, J., Zhou, M., Jiang, C.: Improved rule installation for
real-time query service in software-defined internet of vehicles. IEEE Trans. Intell.
Transp. Syst. 18(2), 225–235 (2016)
DC Coefficients Recovery
from AC Coefficients in the JPEG
Compression Scenario
1 Introduction
Over the last few decades, big data generation and transmission have greatly
evolved and expanded at a remarkably fast pace [2,9]. The most significant
algorithms for multimedia transmission are compression algorithms which could
compress large, raw multimedia data while preserving the content quality. For
instance, lossy and lossless compression algorithms have allowed for the efficient
and reliable transmission of multimedia data in images and/or videos [3].
© Springer Nature Switzerland AG 2019
M. Qiu (Ed.): SmartCom 2019, LNCS 11910, pp. 266–276, 2019.
our research motivation. Section 4 evaluates our recovery results based on visual
and statistical analysis. We conclude in Sect. 5.
2 Research Background
2.1 Practical Definition of DC Coefficients
Although the DCT has several types [4], the most popular DCT algorithm is
a two-dimensional symmetric variation of the transform that operates on 8 × 8
blocks (DCT 8×8) and its inverse (iDCT 8×8). This DCT 8×8 is used in JPEG
compression routines [13] and has become an important standard in image and
video compression.
According to the definition of the DCT transformation [4], the DC coefficient
is, up to a constant scaling factor, the average value of the input elements. Thus,
the DC coefficients of the DCT
transform of image blocks represent the mean values of the pixel values in the
corresponding image blocks. We illustrate this definition with an example of one
8 × 8 block shown in Fig. 1. In Fig. 1(a), we plot the pixel values on the z-axis
for the original 8 × 8 block. Then, we increase the DC coefficient in the DCT result
and apply the iDCT to get the pixel values with only the DC coefficient increased,
shown in Fig. 1(b). The pixel value distribution is the same in Fig. 1(a) and (b),
except that every pixel value is increased by the same amount.
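This relationship can be checked numerically. The sketch below uses a direct, unoptimized orthonormal 2-D DCT-II written in pure Python; the normalization, the toy block contents and the name dct2 are assumptions for illustration, not the paper's implementation. Under this normalization the DC coefficient equals 8 times the block mean, and adding a constant to every pixel changes only the DC coefficient.

```python
import math

N = 8

def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct O(N^4) evaluation)."""
    def alpha(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    return [[alpha(u) * alpha(v) * sum(
                 block[x][y]
                 * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                 * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                 for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

# Toy block and the same block with every pixel increased by 10.
block = [[(3 * x + 5 * y) % 17 for y in range(N)] for x in range(N)]
shifted = [[pixel + 10 for pixel in row] for row in block]

coeffs, coeffs_shifted = dct2(block), dct2(shifted)
mean = sum(map(sum, block)) / (N * N)

# DC coefficient versus N times the block mean.
dc_error = abs(coeffs[0][0] - N * mean)
# Change of the DC coefficient caused by the constant offset (10 * N expected).
dc_change = coeffs_shifted[0][0] - coeffs[0][0]
# Largest change among all AC coefficients (expected to be numerically zero).
max_ac_change = max(abs(coeffs[u][v] - coeffs_shifted[u][v])
                    for u in range(N) for v in range(N) if (u, v) != (0, 0))
print(dc_error < 1e-9, round(dc_change, 6), max_ac_change < 1e-9)
```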
Thus, for one JPEG image, if all DC coefficients in every 8 × 8 block are
replaced with zero, the relative differences of the pixel values within each block
are preserved without any loss. In other words, when all the DC coefficients of a JPEG
image are zero, every 8 × 8 block still
maintains its internal pixel value distribution; what is lost is the difference between the
average values of the different blocks. Based on this practical explanation of the DC coefficients,
the motivation of this research is to transmit only the AC coefficients of each
block at the sender’s end and to recover the corresponding DC coefficients at
the receiver’s end to achieve a higher compression ratio for JPEG images.
coefficients are missing for JPEG images, all AC coefficients remain, which
means that for each 8 × 8 block, the relative pixel value distribution is accurately
preserved. If we use a method that operates in the spatial domain or try to
optimize the image content with the LP approach, the recovered image may be
smoother, but most of the AC coefficients are changed. Such methods therefore not only
fail to recover the accurate DC coefficients but also introduce errors into
the AC coefficients. Thus, we use the first approach, which focuses on
recovering the image content by accurately recovering the DC coefficients without
changing the AC coefficients.
We first show with an example in Fig. 2 that the basic observed theory in [12] does
not always fit the practical scenario. These two 8 × 8 blocks are chosen
from an image, and the two adjacent vectors of boundary pixels are very
different. In this case, the method in [8], which always tries to find the DC value
that achieves the minimum Mean Square Error (MSE) between two adjacent blocks, will
result in wrong predictions for the DC value, since in reality the MSE
is very large.
Fig. 2. An example of the real case of pixel value distribution that does not fit the
zero-mean Laplacian distributed variance in adjacent two blocks.
In Fig. 2, if we know that the DC value of the block B(i,j) is 80, then based on the
min MSE method in [8], the predicted DC value of the block B(i+1,j) is 82. The
difference of the two adjacent vectors in Fig. 2 is then [7, −13, −15, 3, 9, −1, 2, 19], while
the real DC value of the block B(i+1,j) is 49 and the real difference of the adjacent 8
pixels is [103, 20, 80, 62, 48, 68, 57, 73]. Therefore, the problem to be solved is
how to accurately estimate the DC coefficients in cases such as that shown in Fig. 2, which
do not fit the observed statistical theory in [12].
Error (MSE) [8] is used to test the variance between two adjacent blocks in
each of the three directions, and the smallest MSE is picked as the estimation
result. However, as pointed out in Sect. 3.1, the MSE of the adjacent
pixels between two adjacent blocks is not always small. Therefore, we improve
this method by further exploring the pixel relationships in two steps, as shown
in Fig. 4.
Fig. 3. Three directions to calculate the MSE of adjacent pixels of adjacent blocks [8].
The MSEs of the adjacent pixels of adjacent blocks for the three directions
shown in Fig. 3 are calculated as in Eqs. (1)–(3):
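$$MSE_a(B_i, B_{i+1}) = \frac{1}{8} \sum_{j=1}^{8} \left(c_j^{i} - c_j^{i+1}\right)^2 \qquad (1)$$
$$MSE_b(B_i, B_{i+1}) = \frac{1}{7} \sum_{j=1}^{7} \left(c_{j+1}^{i} - c_j^{i+1}\right)^2 \qquad (2)$$
$$MSE_c(B_i, B_{i+1}) = \frac{1}{7} \sum_{j=1}^{7} \left(c_j^{i} - c_{j+1}^{i+1}\right)^2 \qquad (3)$$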
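A minimal sketch of the three directional MSEs follows. It assumes that the two arguments are the 8-pixel boundary vectors of the two adjacent blocks (0-indexed here), and that the three directions correspond to aligned, shifted-down and shifted-up pixel pairs; the function names and the toy vectors are hypothetical.

```python
def mse_a(left, right):
    """Aligned pairs: pixel j of one block against pixel j of the other."""
    return sum((a - b) ** 2 for a, b in zip(left, right)) / len(left)

def mse_b(left, right):
    """Diagonal pairs: pixel j+1 of the first block against pixel j of the second."""
    n = len(left) - 1
    return sum((left[j + 1] - right[j]) ** 2 for j in range(n)) / n

def mse_c(left, right):
    """Diagonal pairs: pixel j of the first block against pixel j+1 of the second."""
    n = len(left) - 1
    return sum((left[j] - right[j + 1]) ** 2 for j in range(n)) / n

# Toy boundary vectors: a ramp and the same ramp shifted by one grey level.
left_edge = [10, 11, 12, 13, 14, 15, 16, 17]
right_edge = [11, 12, 13, 14, 15, 16, 17, 18]

print(mse_a(left_edge, right_edge),
      mse_b(left_edge, right_edge),
      mse_c(left_edge, right_edge))  # 1.0 0.0 4.0
```

For this diagonal ramp the second direction gives the smallest value, which is why a method of this kind compares all three directions.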
The method used in [8] estimates the DC coefficient of one block from its
left adjacent block, like a scan from left to right. The scan is also performed from top to
bottom, from right to left, and from bottom to top, and then the DC
coefficients of the blocks are calculated as the average value of the four scans.
However, due to error propagation, averaging cannot
overcome the propagated errors of the DCs; the results are shown in Fig. 5(c), (f), (j).
We improve this MSE-based calculation with two steps. The first is to
calculate the MSEs of the adjacent pixels in the three directions [8] for both the vertical
and horizontal adjacent blocks, as shown in Fig. 4(a). The purpose is not to pick
the better of the two estimations obtained from the vertical and horizontal
adjacent blocks but to calculate the average value of these two predicted DC
coefficients. Based on the observed results, some blocks are not smooth with respect
to their left blocks but are smooth in pixel distribution compared with their
upper blocks. This step helps compensate for the error caused by intensive pixel
value changes in the spatial domain.
272 H. Qiu et al.
Fig. 4. An example of the proposed method: (a) calculate the average of the adjacent
pixels’ MSEs with the vertical and horizontal adjacent blocks; (b) calculate the pixels’
MSEs most similar with the MSE of the pixels in the last two columns/rows of the
adjacent blocks.
Then, the second step is to calculate the most similar MSE value consid-
ering the MSE values of the adjacent blocks’ last two columns/rows as shown
in Fig. 4(b). In other words, we do not calculate the DC value that can make the
two adjacent blocks have the smoothest pixel distribution at its boundary. The
predicted DC value we are looking for is the DC value that can make the pixel
distribution of the boundary the most similar with the pixel distribution of the
last two columns/rows of pixels in the adjacent blocks.
According to the JPEG standard [6], if the Q50 quantization table is
deployed, the DC values in all 8 × 8 blocks are integers within the range
DC ∈ [−64, +64]. For the purposes of experimentation, we assume that the
DC coefficients of the four corner 8 × 8 blocks are preserved and all other DC
coefficients are stored as zeros to enhance the JPEG compression. At the receiver’s
end, the DC recovery method is deployed to predict the DC values from
the only known DC coefficients, those of the four 8 × 8 blocks at the four corners of the
image.
In other words, for Method 2, we consider the pixel distributions based
not only on the relationship between adjacent blocks but also on the
last two columns/rows of pixels in the adjacent blocks. Since
all AC values are preserved, the relative relationship of the pixel distribution
inside one block still exists. This method is based on the observation in [8] that
the pixel distribution across the two boundary columns/rows of two adjacent
blocks is similar to the pixel distribution of the last two columns/rows of pixels
in the adjacent blocks. Mathematically, the calculation for one
block B(i,j) is given in Eq. (4), which finds the DC value whose boundary pixel
distribution is most similar to that of the last two columns/rows of the adjacent block.
$$\arg\min \; MSE_a\left(B_{(i,j)}, B_{(i-1,j)}\right) - MSE_a\left(B_{(i-1,j)}[1:8, 8],\; B_{(i-1,j)}[1:8, 7]\right) \qquad (4)$$
After the DC prediction based on Eq. (4), which is in the vertical direction, the
DC prediction is also carried out horizontally, i.e., between block
B(i,j) and block B(i,j−1). Then, the two DC values are averaged
to obtain the final predicted DC value.
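The search described by Eq. (4) can be sketched as an exhaustive scan over the candidate DC values. Several points here are assumptions rather than the paper's implementation: "most similar" is interpreted as minimizing the absolute difference between the two MSEs, a candidate DC value is assumed to shift every pixel of the block by a constant (shift_per_dc, a hypothetical parameter set to 2.0 for a DC quantization step of 16 on an 8 × 8 block), and the toy pixel vectors are invented for illustration.

```python
def mse(u, v):
    """Mean square error between two equally long pixel vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)

def predict_dc(neighbor_last, neighbor_second_last, block_edge_zero_dc,
               dc_candidates=range(-64, 65), shift_per_dc=2.0):
    """Pick the DC value whose boundary MSE is closest to the MSE between
    the last two columns/rows of the already recovered neighbor block.

    block_edge_zero_dc holds the boundary pixels of the current block as
    reconstructed from its AC coefficients only (DC set to zero).
    """
    target = mse(neighbor_last, neighbor_second_last)

    def cost(dc):
        shifted_edge = [p + shift_per_dc * dc for p in block_edge_zero_dc]
        return abs(mse(shifted_edge, neighbor_last) - target)

    return min(dc_candidates, key=cost)

# Toy data: the neighbor rises by 2 grey levels between its last two columns.
neighbor_second_last = [100.0] * 8
neighbor_last = [102.0] * 8
block_edge_zero_dc = [0.0] * 8

dc_from_neighbor = predict_dc(neighbor_last, neighbor_second_last,
                              block_edge_zero_dc)
print(dc_from_neighbor)
```

Because the criterion compares squared errors, a step up and a step down of the same size are equally good matches; in this toy case both candidates differ from the neighbor's last column by 2 grey levels. The final prediction in the text averages the values obtained from the vertical and horizontal neighbors.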
As a result, this method does not change any of the AC values but only
predicts the DC values based on the pixel distributions. It works not only at the
boundary of the two adjacent blocks but also on the pixel distributions of the
last two columns/rows inside the adjacent blocks.
In this section, we evaluate the recovery results with the visual results and
the statistical analysis. The statistical methods are also deployed to prove the
effectiveness of the proposed methods.
Here, we take three images as examples to show the effectiveness of the recovery
method. For the evaluation, we keep only four DC coefficients of the four 8 ×
8 blocks at the four corners of the image. The recovery method is performed
by recovering all DC coefficients from the four corners respectively and then
calculating the average value as the predicted DC values. The visual results of
recovered images and the comparison are listed in Fig. 5.
Fig. 5. The visual results: (a) original JPEG image; (b) visual results of the JPEG
image when all DC coefficients are zeros; (c) Recovery result in [8]; (d) Recovery result
based on the Method 2 in this paper.
We also compare the details of the recovered images by the two methods
described in this paper. We pick two areas and the visual results are enlarged
to show the difference of the block effects. The details of the comparison of
Methods 1 and 2 are shown in Fig. 6. The observation is that the block effects
are slightly reduced in these areas. Since the JPEG format is already losing the
image details and block effects exist in the JPEG images, the results in Fig. 6(b)
are very similar to the JPEG image.
We use the Peak Signal to Noise Ratio (PSNR) and Structural SIMilarity (SSIM)
used in [9] to evaluate the recovered images against the original images.
Compared with the previous work, the PSNR and SSIM results
have clearly improved. Note that the PSNR is used to
measure the noise relative to the ground truth image and can also be used
to measure the similarity between two images. However, the recovered images
in Fig. 6 do not contain noise according to the traditional definition [15], since
noise is defined as errors mainly in the high frequency band. The recovered
elements are only the DC coefficients rather than the pixel values, which means all
the AC coefficients are exactly the same as in the JPEG compression results.
Thus, the block effects we observed in Fig. 6 are not caused by
traditionally defined “noise”.
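For reference, the PSNR of an 8-bit image can be computed as 10 · log10(255² / MSE). The sketch below uses this standard definition on nested lists of pixel values; the function name psnr and the toy block are assumptions. It also illustrates the point above: a wrongly recovered DC coefficient shifts a whole block by a constant, which lowers the PSNR although no high-frequency noise is present.

```python
import math

def psnr(reference, test, peak=255.0):
    """Peak Signal to Noise Ratio in dB between two equally sized images."""
    squared_errors = [(r - t) ** 2
                      for row_r, row_t in zip(reference, test)
                      for r, t in zip(row_r, row_t)]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf          # identical images
    return 10 * math.log10(peak ** 2 / mse)

# Toy 8 x 8 block and a copy whose pixels are all off by 5 grey levels,
# as happens when only the DC coefficient of the block is mispredicted.
original = [[(7 * x + 3 * y) % 200 for y in range(8)] for x in range(8)]
dc_error_block = [[pixel + 5 for pixel in row] for row in original]

print(round(psnr(original, dc_error_block), 2))
```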
We still use the PSNR to evaluate the results and the image in Fig. 5(a)
is used as an example. Compared with the original image, we calculate the
PSNR for the image without DC coefficients, the JPEG image, the recovered
image with Method 1, and the recovered image with Method 2, respectively. The
results are listed in Table 1 and Method 2 shows slight improvement compared
with Method 1. For the SSIM results also listed in Table 1, the improvement of
Method 2 compared with Method 1 is also shown. In fact, we also tested several
images and there were always slight improvements using Method 2 on the image
recovery compared to Method 1.
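The PSNR comparison described above can be illustrated with a minimal sketch. It assumes 8-bit grayscale images stored as lists of rows (peak value 255); the function name `psnr` and the toy 2×2 data are hypothetical and are not taken from the paper's experiments.

```python
import math

def psnr(original, recovered, max_value=255.0):
    """Return the PSNR in dB between two equally sized grayscale images."""
    if len(original) != len(recovered) or any(
        len(a) != len(b) for a, b in zip(original, recovered)
    ):
        raise ValueError("images must have the same dimensions")
    pixel_count = sum(len(row) for row in original)
    # Mean squared error over all pixels.
    mse = sum(
        (a - b) ** 2
        for row_a, row_b in zip(original, recovered)
        for a, b in zip(row_a, row_b)
    ) / pixel_count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Toy example: every pixel is off by the same amount, which is the kind of
# error a wrongly recovered DC coefficient causes within one block.
original = [[100, 110], [120, 130]]
recovered = [[105, 115], [125, 135]]
value = psnr(original, recovered)
```

A uniform offset produces no high-frequency error, yet it still lowers the PSNR, which is why PSNR remains usable here even though the error is not "noise" in the traditional sense.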
5 Conclusion
In this paper, we redesigned the DC coefficient recovery algorithm for JPEG
images to obtain an additional 40–60% compression ratio. Compared with existing
solutions, our method achieves the highest accuracy when recovering DC
coefficients at the receiver’s end. Based on only the four DC coefficients of
the corner blocks of the image, the remaining DC coefficients can be accurately
recovered, and the edge effects of the previous work are removed. The method
can therefore enhance JPEG compression by transmitting only the four corner DC
coefficients, which greatly improves the JPEG compression ratio.
References
1. Ahmed, N., Natarajan, T., Rao, K.R.: Discrete cosine transform. IEEE Trans.
Comput. 100(1), 90–93 (1974)
2. David, R., John, G., John, R.: Data age 2025: the digitization of the world, from
edge to core. In: IDC White Paper. Seagate (2018)
3. Fisher, Y.: Fractal Image Compression: Theory and Application. Springer, Berlin
4. Kresch, R., Merhav, N.: Fast DCT domain filtering using the DCT and the DST.
IEEE Trans. Image Process. 8(6), 821–833 (1999)
5. Li, S., Karrenbauer, A., Saupe, D., Kuo, C.C.J.: Recovering missing coefficients in
DCT-transformed images. In: 2011 18th IEEE International Conference on Image
Processing (ICIP), pp. 1537–1540. IEEE (2011)
6. Pennebaker, W.B., Mitchell, J.L.: JPEG: Still Image Data Compression Standard.
Springer, Berlin (1992)
7. Qiu, H., Enfrin, N., Memmi, G.: A case study for practical issues of DCT based
bitmap selective encryption methods. In: IEEE International Conference on Security
of Smart Cities, Industrial Control System and Communications. IEEE (2018)
8. Qiu, H., Memmi, G., Chen, X., Xiong, J.: DC coefficient recovery for JPEG images
in ubiquitous communication systems. Future Gener. Comput. Syst. 96, 23–31
9. Qiu, H., Noura, H., Qiu, M., Ming, Z., Memmi, G.: A user-centric data protection
method for cloud storage based on invertible DWT. IEEE Trans. Cloud Comput.
10. Schwarz, H., Marpe, D., Wiegand, T.: Overview of the scalable video coding extension
of the H.264/AVC standard. IEEE Trans. Circuits Syst. Video Technol. 17(9),
1103–1120 (2007)
11. Thuraisingham, B.: Data Mining: Technologies, Techniques, Tools, and Trends.
CRC Press, Boca Raton (2014)
12. Uehara, T., Safavi-Naini, R., Ogunbona, P.: Recovering DC coefficients in block-
based DCT. IEEE Trans. Image Process. 15(11), 3592–3596 (2006)
13. Wallace, G.K.: The JPEG still picture compression standard. IEEE Trans. Consum.
Electron. 38(1), 18–34 (1992)
14. Zeng, J., Au, O.C., Dai, W., Kong, Y., Jia, L., Zhu, W.: A tutorial on image/video
coding standards. In: 2013 Asia-Pacific Signal and Information Processing Association
Annual Summit and Conference (APSIPA), pp. 1–7. IEEE (2013)
15. Zhang, J., Cox, I.J., Doerr, G.: Steganalysis for LSB matching in images with
high-frequency noise. In: 2007 IEEE 9th Workshop on Multimedia Signal Processing,
pp. 385–388. IEEE (2007)
A Performance Evaluation Method
of Coal-Fired Boiler Based on Neural Network
1 Introduction
As demand for electric energy increases, NOx emissions from coal-fired power
generation have become one of the largest sources of atmospheric pollution in
China. At present, the average coal consumption of thermal power in China is
321 g/(kW·h), and 90% of the SO2 and 67% of the NOx in the atmosphere come from
coal combustion. The air pollution caused by thermal power generation is severe.
Energy conservation and emission reduction in China’s coal industry are therefore
imperative, as they bear on public health and on the stability and harmony of
society [1]. In recent years, there has been much research at home and abroad on
energy saving and emission reduction for coal-fired boilers, and many
achievements have been made.
In terms of model construction, Zhao [2] used data collected from power station
boilers to carry out repeated simulation studies of emission processes; Chen [3]
adopted a fuzzy neural network modeling method and optimized the model parameters
with an algorithm, achieving a better modeling effect; Safa and Spentzos [4, 5]
used artificial neural network modeling combined with a genetic algorithm and
fluid dynamics calculation to optimize the model parameters, and established a
satisfactory boiler combustion system model that was confirmed by experimental
tests.
In terms of model optimization, Xie [6] established a model of the boiler
combustion process using neural network modeling and used a genetic algorithm to
optimize boiler thermal efficiency, achieving good results. Wang [7] established
a model of boiler combustion characteristics through a genetic algorithm and a
neural network, and optimized the oxygen content and the boiler combustion
process. Gu [8] used a fuzzy correlation mining algorithm to optimize the model
parameters and obtained a satisfactory model of boiler combustion characteristics.
The above studies have helped improve the performance of coal-fired boilers.
However, most of them do not give the causal relationship between the factors
affecting coal combustion and the combustion effect, and no specific control
scheme is given after the boiler performance evaluation. Because coal-fired
boiler emissions are multi-factor, complex, random, and ambiguous, precise
evaluation and control methods still need to be explored. To better evaluate and
control boiler combustion performance, this paper proposes a boiler performance
parameter analysis method, establishes a BP neural network analysis model that
relates boiler operating parameters to boiler efficiency and emissions, and
verifies the feasibility and correctness of the method.
In formula (1), “n” represents the number of nodes in the input layer; “l”
represents the number of nodes in the hidden layer; “m” represents the number of
nodes in the output layer; “a” represents a constant between 0 and 10. The number
of neurons in the hidden layer is not fixed; the range of hidden layer nodes in
the network analysis model is first determined by the formula. Let f(x) be the
sigmoid function:
$$f(x) = \frac{1}{1 + e^{-x}} \qquad (2)$$
280 Y. Chen et al.
$$\Delta w_{ij} = \eta \, \ldots \qquad (3)$$
$$\Delta w_{ij} = \eta \, \ldots \qquad (4)$$
According to formulas (3) and (4), the change in the weights of each layer can be
obtained; the weights of the entire network are then updated, and the process is
repeated until the requirement is met.
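The iterative weight update can be sketched as follows. Because the right-hand sides of formulas (3) and (4) are not recoverable here, this sketch assumes the standard delta rule for a squared-error loss with sigmoid activations and learning rate `eta`; bias terms are omitted for brevity, and all names and sample values are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_step(w_hidden, w_out, x, target, eta=0.5):
    """Apply one gradient-descent update in place; return the error before it."""
    # Forward pass: input -> hidden -> output.
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_out]
    # Output-layer deltas: (t - o) * f'(net), with f' = o * (1 - o).
    delta_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, output)]
    # Hidden-layer deltas, propagated back through the old output weights.
    delta_hidden = [
        h * (1.0 - h) * sum(delta_out[k] * w_out[k][j] for k in range(len(w_out)))
        for j, h in enumerate(hidden)
    ]
    # Weight changes: eta * delta * input to that weight.
    for k, row in enumerate(w_out):
        for j in range(len(row)):
            row[j] += eta * delta_out[k] * hidden[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] += eta * delta_hidden[j] * x[i]
    return 0.5 * sum((t - o) ** 2 for t, o in zip(target, output))

random.seed(0)
w_hidden = [[random.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(4)]
w_out = [[random.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(2)]
# Repeat the update on one hypothetical sample and record the error each time.
errors = [train_step(w_hidden, w_out, [0.2, 0.7, 0.5], [0.9, 0.1]) for _ in range(200)]
```

In practice the loop would stop once the error falls below a chosen tolerance rather than after a fixed number of iterations.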
4 Instance Verification
A 410T four-corner pulverized coal boiler of a power plant was selected as the
research object. Based on the above evaluation theory and analysis, the model uses
a 3-layer BP network. Since there are 21 input nodes, the number of neurons in the
middle (hidden) layer is set to 11; the resulting BP network structure is shown in
Fig. 2.
Fig. 2. BP network structure: 21 input variables (figure labels include coal
quality characteristic, primary fan, secondary fan, furnace chamber, and exhaust
gas quantities) and 2 output variables (boiler efficiency and NOx emission
concentration).
120 sets of working condition data were selected as training samples, and 100 sets
of running data were selected as test samples. Part of the combustion test
operating data of the coal-fired boiler is shown in Table 1.
Table 1. Partial combustion data of a 410T coal-fired boiler in a thermal power plant

| Working condition number | Calorific value | Ash content (%) | Sulfur content (%) | Primary fan current, left (A) | Primary fan current, right (A) | Secondary fan current, left (A) | Secondary fan current, right (A) | Draft fan current, left (A) | Draft fan current, right (A) | Boiler efficiency (%) | Emission load (mg/m3) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 5456 | 20.92 | 0.74 | 26.1 | 24.9 | 14.1 | 14.2 | 91.4 | 89 | 93.1066 | 44 |
| 2 | 5456 | 20.92 | 0.74 | 26.1 | 24.8 | 14.1 | 14.3 | 92.4 | 89 | 91.8803 | 47 |
| 3 | 5456 | 20.92 | 0.74 | 26.1 | 24.6 | 14.1 | 14.3 | 91.4 | 90 | 91.9308 | 46 |
| 4 | 5456 | 20.92 | 0.74 | 26 | 24.7 | 14.1 | 14.6 | 87 | | 91.0616 | 54 |
| 101 | 5627 | 13.39 | 0.42 | 24.5 | 24.7 | 15.5 | 15.8 | 95 | 94 | 95.0809 | 47 |
| 102 | 5627 | 13.39 | 0.42 | 14.9 | 15.3 | 14.9 | 15.3 | 95 | 93 | 95.7002 | 46 |
| 103 | 5627 | 13.39 | 0.42 | 14.9 | 15.4 | 14.9 | 15.4 | 95 | 93 | 98.6446 | 47 |
| 104 | 5627 | 13.39 | 0.42 | 14.9 | 15.1 | 14.6 | 15.2 | 95 | 94 | 95.8366 | 48 |
| 237 | 5253 | 13.39 | 0.48 | 22.4 | 25.1 | 12.5 | 12.5 | 89 | 86 | 94.254 | 55 |
| 238 | 5253 | 13.39 | 0.48 | 22.5 | 25.1 | 12.4 | 12.4 | 88 | 84 | 98.254 | 46 |
| 239 | 5253 | 13.39 | 0.48 | 22.5 | 25.1 | 12.2 | 12.5 | 86 | 82 | 99.2087 | 50 |
| 240 | 5253 | 13.39 | 0.48 | 22.5 | 25.1 | 12.2 | 12.4 | 87 | 84 | 99.466 | 52 |
After the data are normalized, the accuracy and feasibility of the AdaBoost-BP
algorithm for predicting boiler combustion efficiency and emissions are tested
using the selected 100 sets of operational data (case numbers 121–220). The actual
values are shown in Fig. 3.
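The normalization step is not specified further in the text. A minimal sketch, assuming column-wise min-max scaling to [0, 1] with the scaling range taken from the training samples, is given below; the function names are hypothetical, and the sample rows use the boiler efficiency and emission columns of working conditions 1 to 3 in Table 1.

```python
def fit_min_max(rows):
    """Return per-column minima and maxima of the training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def apply_min_max(rows, lows, highs):
    """Scale each column to [0, 1]; constant columns map to 0.0."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

# (boiler efficiency in %, emission in mg/m3) for working conditions 1-3.
train = [[93.1066, 44.0], [91.8803, 47.0], [91.9308, 46.0]]
lows, highs = fit_min_max(train)
scaled = apply_min_max(train, lows, highs)
```

Test samples would be scaled with the same `lows` and `highs`, so that no information from the test set leaks into the training stage.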
As can be seen from Fig. 3, the actual values of boiler combustion efficiency and
emissions are generally consistent with the predicted values. Therefore, the established
BP neural network model has a strong generalization ability, which can accurately
reflect the true thermal efficiency and emissions of the boiler under working conditions,
and lays a foundation for the subsequent adjustment of boiler operating parameters.
The network model is used to evaluate the performance of the current combustion
boiler operation data, as shown in Fig. 4.
The black ellipse indicates the approximate distribution of the results. The
combustion efficiency of the boiler is mostly below 90%, and the emission of
nitrogen oxides is mostly above 50. The red arrow indicates the position of the
national standard. The current overall performance of the coal-fired boiler is
therefore relatively poor, and there is room for improvement.
According to principal component analysis theory, the indices whose cumulative
proportion reaches 85% are selected as the core indices; that is, the first nine
indicators are used as the core indicators. These indicators are: exhaust gas
temperature (left), water supply, secondary fan current (left), furnace oxygen,
steam, primary fan current (left), exhaust temperature (right), primary fan
current (right), furnace outlet
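The 85% cumulative-proportion rule can be sketched as below. The eigenvalues in the example are hypothetical and are not the values obtained for the boiler data; the function name is also an assumption.

```python
def components_for_threshold(eigenvalues, threshold=0.85):
    """Return (k, proportion): the smallest number of leading components
    whose cumulative share of the total variance reaches the threshold."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    ordered = sorted(eigenvalues, reverse=True)
    for k, value in enumerate(ordered, start=1):
        cumulative += value / total
        if cumulative >= threshold:
            return k, cumulative
    return len(ordered), cumulative

# Hypothetical eigenvalues of a correlation matrix with five indicators.
k, proportion = components_for_threshold([5.0, 3.0, 1.0, 0.6, 0.4])
```

With these illustrative values the first two components explain 80% of the variance, so a third is needed to pass the 85% threshold.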
We randomly selected 20 sets of data from those with better test results, that is,
working conditions with higher coal-burning efficiency and NOx emissions that meet
the standard. The adjusted parameters are shown in Table 4.
6 Conclusion
In this research, the combustion process of coal-fired boilers was studied. A
boiler efficiency and combustion emission analysis model based on the AdaBoost-BP
algorithm was established. Principal component analysis was introduced to analyze
the combustion factors, and the internal relationships and regularities between
coal combustion efficiency, NOx emissions, and the various influencing factors
were studied. Applying the evaluation and control process to the combustion
emissions of a 410T four-corner pulverized coal boiler in a power plant verified
the effectiveness and feasibility of the method and provided guidance for
enterprise management personnel. The control method also provides a theoretical
reference for energy saving and emission reduction of coal-fired boilers.
Acknowledgements. This work is supported by the Social Science Planning Project in Fujian
Province Project (FJ2016C133), the Scientific Research Foundation for Young and Middle-aged
Teachers of Fujian Province (JZ160163) and the Fujian Province Education Science “13th Five-
Year Plan” Project (FJJKCG16-289).
References
1. Ministry of Environmental Protection: GB 13223-2011. Emission Standards for Air
Pollutants in Thermal Power Plants. China Environmental Science Press, pp. 1–3 (2011)
2. Zhao, J., Pan, Q., Liu, T.: Analysis of application of intelligent control in boiler combustion
system of thermal power station. Electron. Manuf. 368(1), 142–148 (2019)
3. Chen, X., Gao, L., Zhou, J., Gao, H., Wang, L., et al.: Boiler combustion control model of
large-scale coal-fired power plant with asymmetric artificial neural networks. In: 2017 IEEE
2nd Information Technology, Networking, Electronic and Automation Control Conference
(ITNEC). IEEE (2017)
4. Safa, M., Samarasinghe, S., Nejat, M.: Prediction of wheat production using artificial neural
networks and investigating indirect factors affecting it: case study in Canterbury Province,
New Zealand. J. Agric. Sci. Technol. 17(4), 791–803 (2018)
5. Spentzos, A., Barakos, G.N., Badcock, K.J., Richards, B.E., et al.: Computational fluid
dynamics study of three-dimensional dynamic stall of various planform shapes. J. Aircr. 44
(4), 1118–1128 (2017)
6. Xie, C., Liu, J., Zhang, X., Xie, W., Sun, J., Chang, K., Kuo, J., et al.: Co-combustion
thermal conversion characteristics of textile dyeing sludge and pomelo peel using TGA and
artificial neural networks. Appl. Energy 212, 786–795 (2018)
7. Wang, C., Liu, Y., Zheng, S., Jiang, A., et al.: Optimizing combustion of coal fired boilers
for reducing NOx emission using Gaussian process. Energy 153, 149–158 (2018)
8. Gu, Y., Zhao, W., Wu, Z.: Combustion optimization of power plant boilers using optimal
MVs decision model. Proc. CSEE 32(2), 39–44 (2012)
9. Yu, T., Liao, L., Liu, R.: Prediction of NOx emissions from coal-fired boilers based on
support vector machines and BP neural networks. Nat. Environ. Pollut. Technol. 16(4),
1043–1049 (2017)
10. Shi, Y., Han, L., Lian, X.: Neural Network Design Methods and Case Analysis, 2nd edn.
Beijing University of Posts and Telecommunications Press, Beijing (2009)
Analysis and Prediction of Commercial Big
Data Based on WIFI Probe
1 Introduction
In some countries, WIFI probes are widely used, for example for company sign-in,
meeting sign-up, and other occasions where attendance needs to be counted; large
shopping malls generally have passenger flow detection and statistics systems.
WIFI probe technology can count the number of consumers in a store by detecting
the wireless network cards of their mobile phones, and can also track trajectories
by the MAC addresses of the phones. Because of their low cost, WIFI probes have
become a popular application. In the context of booming big data, the WIFI probe
sensor collects the passenger flow information of the store; combined with machine
learning, it can provide basic passenger flow detection and passenger flow
forecasting services for the commercial field, helping merchants to understand
passenger flow accurately, arrange employee hours, manage warehouse storage, and
schedule holiday promotions.
Combining big data, WIFI probes, and analysis and prediction algorithms can bring
low-cost customer analysis and forecasting to merchants and make passenger flow
analysis more accurate, thus helping to improve the overall operation of the store.
At the same time, through the data feedback and predictions of the system, the
merchant can understand the operating status of the store in a timely manner,
adjust the operation strategy in time, and obtain strong operational support,
promoting the merchant’s rapid economic development [1].
2 Related Technology
which is used to process web requests and view forwarding. Spring is the core
framework: it acts as a container, manages all dependencies, and provides the IoC
(inversion of control) mechanism, which manages the life cycle of all objects and
integrates Mybatis according to the configuration. Mybatis acts as a persistence
engine for data objects, which makes it convenient to map data entity classes to
data operations.
$$J_{train}(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta\left(x^{(i)}\right) - y^{(i)} \right)^2 \qquad (1)$$
3 System Design
3.1 The Architecture of the System
The system adopts modular development. The servers are divided by responsibility
into three kinds: the server cluster preprocesses and analyzes data, the data cache
server caches data and stores it persistently, and the web server hosts the
visualization platform (Fig. 1).
Fig. 1. System architecture (figure labels: data collection; data cache with HBase
and Redis; Hadoop + Spark server cluster; data storage with MySQL; visual server
with Node business logic and real-time data demonstration)
The left side of the figure is the data source, the WIFI probe, which is
responsible for collecting passenger flow data. The data is uploaded to the data
cache server over the wireless network, preprocessed, and then selectively stored
on the HBase server. The data is then passed to the server cluster through Kafka
for calculation, and the calculation result is returned to the cache server for
real-time display. The architecture of the data persistence server is shown in
Fig. 2.
Fig. 2. Data persistence server architecture (figure labels: Kafka, Hive)
After the data is received by the controller of the SSM framework, it is
preprocessed: the JSON data is deserialized and then packaged into an object. Once
the preprocessing is complete, the data can be persisted. At the same time, the
data is sent to the server cluster through the data distribution framework Kafka
for subsequent analysis, prediction, and other operations. The architecture of the
visualization platform is shown in Fig. 3.
290 X. Zeng et al.
Fig. 3. Visualization platform architecture (figure labels: database, offline data,
factual data)
As the figure shows, the front end is implemented with the React framework and
Node.js. The front-end route mapping is fully controlled by React-Router, and the
front-end operation is controlled through the NPM management tool. The back end is
built with the Spring MVC + Spring + Mybatis framework, which routes requests
through the controller and processes them. Following enterprise-style development,
the part of the back end that interacts with the database is divided into a DAO
(Data Access Object) layer and a Service layer. The Service layer adopts abstract
development: the defined interfaces are all abstract, and the specific behavior is
provided by an implementation class, which greatly reduces the degree of coupling
of the
state to the sleep state. The stop command can be issued from the background
management system that comes with the probe after logging in. Alternatively, a
computer can send a command to a single-chip computer connected to the probe; after
receiving the command, the single-chip computer stops powering the probe, which
completes the control of the working state of the probe.
Data Persistence Module
The function of this module: the data sent by the WIFI probe is received by the
data persistence server; the data is preprocessed, deserialized, encapsulated into
objects, and deduplicated; the data is cached and persisted; and the data is sent
to the processing cluster.
The algorithm flow: after the data is received by the data persistence server, it
is preprocessed. Deserializing the JSON data and encapsulating it into objects
allows the data to be processed with object-oriented operations. The data is then
stored in HBase and analyzed further, and the analyzed data is stored in Redis.
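The preprocessing flow (deserialize, encapsulate into objects, remove duplicates) can be sketched as follows. The paper's system runs on a Java SSM stack; this Python sketch only illustrates the logic. The record fields `mac`, `rssi`, and `timestamp` and the payload layout are assumptions, since the probe's actual message format is not given.

```python
import json
from dataclasses import dataclass

@dataclass(frozen=True)
class ProbeRecord:
    """One detection reported by a probe (hypothetical field set)."""
    mac: str
    rssi: int
    timestamp: int

def preprocess(payload):
    """Deserialize a JSON array of detections and drop exact duplicates."""
    seen = set()
    records = []
    for item in json.loads(payload):
        record = ProbeRecord(
            mac=item["mac"].lower(),
            rssi=int(item["rssi"]),
            timestamp=int(item["timestamp"]),
        )
        if record not in seen:  # frozen dataclasses are hashable
            seen.add(record)
            records.append(record)
    return records

payload = json.dumps([
    {"mac": "AA:BB:CC:00:00:01", "rssi": -42, "timestamp": 1570000000},
    {"mac": "aa:bb:cc:00:00:01", "rssi": -42, "timestamp": 1570000000},
    {"mac": "AA:BB:CC:00:00:02", "rssi": -71, "timestamp": 1570000003},
])
records = preprocess(payload)
```

The resulting objects would then be written to the cache and the persistent store and forwarded to the processing cluster.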
Data Analysis and Prediction Module
The function of this module: the basic data are calculated from the incoming data
according to the analysis algorithms; the analysis results are sent to the
persistence server to update Redis; online or offline tasks call the prediction
algorithm, which adopts a regularized polynomial regression model with the learning
curve as an auxiliary tool: the candidate models are fitted on the training set,
cross-validation is performed, the best model is selected, and the prediction is
made with the selected hypothesis.
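A minimal sketch of this model-selection loop is shown below. It assumes a single input feature, L2 (ridge) regularization solved through the normal equations, a simple hold-out validation set in place of full cross-validation, and a squared-error cost with the common 1/(2m) factor. The data and all names are hypothetical.

```python
def poly_features(x, degree):
    return [x ** p for p in range(degree + 1)]

def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_ridge(xs, ys, degree, lam):
    """Fit polynomial coefficients; the intercept is not regularized."""
    rows = [poly_features(x, degree) for x in xs]
    n = degree + 1
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    for i in range(1, n):
        a[i][i] += lam
    b = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(n)]
    return solve(a, b)

def cost(theta, xs, ys):
    """Unregularized squared-error cost, used for J_train and J_cv alike."""
    degree = len(theta) - 1
    return sum(
        (sum(t * f for t, f in zip(theta, poly_features(x, degree))) - y) ** 2
        for x, y in zip(xs, ys)
    ) / (2 * len(xs))

def select_model(train, val, degrees, lambdas):
    """Return (J_cv, degree, lambda, theta) with the lowest validation cost."""
    best = None
    for degree in degrees:
        for lam in lambdas:
            theta = fit_ridge(train[0], train[1], degree, lam)
            j_cv = cost(theta, val[0], val[1])
            if best is None or j_cv < best[0]:
                best = (j_cv, degree, lam, theta)
    return best

def truth(x):
    return 2.0 + 3.0 * x - 0.5 * x * x  # hypothetical noise-free flow curve

train_x = [float(i) for i in range(8)]
val_x = [i + 0.5 for i in range(7)]
train = (train_x, [truth(x) for x in train_x])
val = (val_x, [truth(x) for x in val_x])
best = select_model(train, val, degrees=[1, 2, 3], lambdas=[0.0, 1.0])
```

Plotting the training and validation costs against the training-set size for the chosen degree and regularization strength gives the learning curve mentioned in the text.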
The six basic data calculation methods are as follows:
Regional overall passenger flow
The overall regional passenger flow refers to the number of customers passing
through the area per unit time. It is therefore calculated as the number of
detected mobile phone wireless network card addresses in the corresponding field
of the original data;
Incoming amount
Assume that the overall passenger flow is N, measured in persons. Assume that the
probe is placed 5 m away from the store door, so that the signal strength of a
customer who is actually in the store should be greater than −50 dBm. The records
in N that satisfy this requirement are counted as the in-store amount; the others
are counted only as passenger flow. Denote the in-store amount by $N_{enter}$.
Arrival rate
The store entry rate refers to the proportion of customers entering the store to the total
passenger flow. The calculation formula is
$$R_{enter} = \frac{N_{enter}}{N} \qquad (3)$$
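The three quantities defined so far (overall passenger flow, in-store amount, and entry rate as the ratio of the two) can be computed with a short sketch. It assumes each detection is a `(mac, rssi)` pair, that one distinct MAC address counts as one customer, and that a customer's strongest reading decides whether they entered; these choices and the sample data are illustrative assumptions.

```python
def entry_statistics(detections, threshold_dbm=-50):
    """Return (N, N_enter, R_enter) for a list of (mac, rssi) detections."""
    strongest = {}
    for mac, rssi in detections:
        # Keep the strongest signal seen for each device.
        strongest[mac] = max(rssi, strongest.get(mac, rssi))
    n_total = len(strongest)
    n_enter = sum(1 for rssi in strongest.values() if rssi > threshold_dbm)
    rate = n_enter / n_total if n_total else 0.0
    return n_total, n_enter, rate

detections = [
    ("aa", -40), ("bb", -70), ("aa", -60),
    ("cc", -45), ("dd", -50),
]
n_total, n_enter, rate = entry_statistics(detections)
```

Here device `dd` sits exactly at −50 dBm and is not counted as in-store, because the text requires the signal strength to be greater than the threshold.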
4 System Implementation
4.1 System Operation Interface
With the WebSocket protocol, the data collected by the WIFI probe is uploaded to
the persistence server and then displayed in real time through the visualization
platform server. The following figure is an example of the in-store amount and deep
visit rate of a store (Fig. 5).
Figure 6 shows the forecast traffic for each time period of the coming day (blue
indicates the passenger flow and red indicates the in-store amount).
Fig. 6. The forecasting traffic for each time period in the coming day
5 Conclusion
This paper discusses commercial big data analysis and prediction based on WIFI
probes. The designed system combines software and hardware and is built on
distributed clusters, in which different servers perform their own duties,
improving operational efficiency and providing greater computing power and storage
capacity. Several different databases are combined: data with a large query volume
and a small modification volume is stored in a relational database, while
frequently modified and unformatted data is stored in a non-relational database,
which improves the working efficiency of the database system. For data prediction,
a machine learning algorithm based on regularized polynomial regression is adopted.
To address the problem of the selected model having high bias or high variance, the
learning curve is introduced to better understand the state of the current model.
The data visualization platform adopts the WebSocket protocol, which enables the
server to push data in real time to the front page of the visualization platform.
In addition, the visualization platform uses a variety of charts to show the data
more clearly. The whole system can use data to help mall managers better understand
the current situation of the mall and make well-founded decisions to achieve better
operational results.
References
1. Guozhi, Y.: Research on application of passenger flow detection based on Wi-Fi probe.
Comput. Prod. Circ. 157–158 (2018)
2. Chang, X.: Design and Implementation of WiFi-Based Human Traffic Monitoring System,
pp. 3–4. Beijing University of Posts and Telecommunications, Beijing (2017)
3. Mengmeng, L.: Research on Cluster Analysis Based on Hadoop, pp. 6–8 (2016)
4. Xuejun, J., Feng, W., Haixin, H.: Spark big data computing platform. Electron. World
549(15), 82–84 (2018)
5. Wei, W., Rui, H., Yuxiang, J.: Distributed big data machine learning algorithm based on
spark. Comput. Mod. 279(11), 119–126 (2018)
6. Yun, Z., Liu, C.: HBase based storage system for the Internet of Things. In: 2016 4th
International Conference on Machinery, Materials and Computing Technology, pp. 3–5
7. Soekarno, I., Hadihardaja, I.K., et al.: A study of hold-out and k-fold cross validation for
accuracy of groundwater modeling in tidal lowland reclamation using extreme learning
machine. In: 2014 IEEE 2nd International Conference on Technology, Informatics,
Management, Engineering & Environment (TIME-E), pp. 228–233. IEEE (2014)
8. Yajie, Z., Shengjun, L., Han, Y., Xiang, J., Robert, J.C., Li, H.: Robotic anatomical
segmentectomy: an analysis of the learning curve. Ann. Thorac. Surg. 107(5), 106–107 (2019)
Design and Optimization of Camera HAL
Layer Based on Android
Abstract. In the field of mobile communication, with the rapid popularization of
the Internet and its continuous expansion into the home, the trend toward
integration of consumer electronics, computers, and communication (3C) is becoming
increasingly obvious, and embedded systems have naturally become a research
hotspot. The embedded operating system Android has been widely used in mobile
devices such as smart phones and smart watches. Our design optimizes and improves
the Camera HAL on the Android platform. The work covers Android system development
and the design of the Camera subsystem: the Camera subsystem is described in terms
of structure, function, and data flow, and is then structurally optimized. Using
the TI AM335x as the hardware platform and based on the Android system, we designed
and implemented the Android Camera hardware abstraction layer and the Camera
subsystem. Experimental tests at the end of development show that the system
achieves the expected functions and works well.
1 Introduction
Technology is developing very rapidly today, and smart phone technology in
particular is emerging quickly. Operating systems, including iOS and Android, are
attracting more and more attention from the market, and manufacturers around the
world are beginning to enter the market. Owing to the excellent interface and
convenient environment of the Linux kernel-based operating system, more and more
new players have invested in Android development in recent years. In this article,
we propose a method to optimize and improve the Camera HAL based on the Android
platform.
This paper first introduces the environment and background of the system design,
explains the structure of the paper, and introduces the Android system and the key
technologies used [1]. It then presents the overall design of the Camera subsystem,
mainly the improvement of the HAL architecture. Next, the specific process of
Camera HAL development is described from the aspects of the client, the server, and
the HAL layer. Finally, the entire design process is summarized.
© Springer Nature Switzerland AG 2019
M. Qiu (Ed.): SmartCom 2019, LNCS 11910, pp. 296–303, 2019.
Android is the platform for this development. The core of the system is Linux, on
which the various services and the UI of the mobile operating system are built.
Because Android has a wide range of applications, Google set up the OHA (Open
Handset Alliance) to provide continuous updates. Its infrastructure is a software
stack architecture, with the Linux kernel layer, developed in C, as the lowest
layer [2]. The bottom layer provides a stable system interface to the upper layers.
The middle layer is the system runtime layer, which uses C++-based function
libraries and virtual machines. Further up is the application layer. The basis for
developing Android applications is the application framework layer.
The HAL was proposed by the Android platform to protect the intellectual property
of manufacturers. It allows vendors to avoid the GPL obligations of the Linux
kernel, so that they do not have to publish all of their code [3]. Hardware control
methods can be implemented in the Android HAL, with the Linux driver used only to
complete simple data exchange or to map the hardware register space directly into
user space. Android is based on the Apache license, which means that a hardware
vendor only needs to provide binary code or library files [4]. Android is therefore
extremely open as an operating system, but it is not open source. Because Android
does not comply with the GPL, the contradiction between the GPL and IC design
vendors still exists.
Android layers use JNI and HAL middleware technologies [5]. As mentioned before,
the system is highly open and is provided free of license fees. These advantages
have made Android the world’s most used mobile operating system. Google has
cooperated extensively with many parties, established a standardized and shared
mobile communication platform, and built an open ecosystem to provide better
services to users around the world.
2 Related Work
2.1 Android System Structure
Android divides the system architecture into four levels. The following describes the
four layers in order from top to bottom.
Application layer (Application). The application layer contains a collection of
applications [6]. The email, phone, QQ, WeChat, map, and alarm clock applications
that we usually use all belong to it, and programs written in the Java language by
third-party developers and downloaded from the app store also run at this level.
Application Framework layer (Application Framework). This layer is the basic
framework of Android application development. It contains many interfaces and class
objects that are often used in upper-layer application development. Designers can
use the API framework provided by Google, which offers considerable convenience in
designing the program architecture and logical structure and reduces tedious and
repetitive work.
System runtime library (Libraries). To implement the functions of Android, many
components rely on C/C++ libraries, which are included in the system runtime layer.
Linux kernel layer (Linux Kernel). Android is developed on the basis of the Linux
kernel and has many software stacks [8]. The Linux kernel is between the hardware
device and the software stack, and is still the core of the system itself.
3 Proposed Model
Development Kit), the native-language development kit. With this tool, users can
implement certain features in C and C++.
3.3 Summary
In this section, we focus on the key technologies used in this development process
[15]. The development is based on the Android platform and uses some official free
development kits. The most critical parts are the Android hardware abstraction
layer and the C/S communication architecture built on Binder-based communication.
The architecture itself is not complicated. The key to implementing it is to
distinguish the communication structure layer from the business logic layer of the
4 Experiments
development interface interfaces through the toolkit. After the 8.0 version update, the
operation efficiency is greatly improved.
System versions before 5.0 are compiled with the JDK, and later versions are
compiled with OpenJDK 7 or higher. The installation may differ between system
versions; we use version 6.0 as an example. The Advanced Packaging Tool, generally
referred to as APT, is a powerful package management mechanism. We use APT for
installation, which is simple and convenient and does not require manual
configuration of paths: the tool configures the environment variables at
installation time [17]. The APT mechanism lowers the threshold for user operations,
whether installing, upgrading, or uninstalling, which greatly reduces the amount of
user
We first need to get the Android source code. First, create a repository with repo
init, which initializes the Android source code repository, and then download the
latest code.
5 Conclusions
The goal of this design is to carry out Android source-level development on an
Ubuntu system, realized and verified by modifying the Camera HAL.
The core of the design is the hardware abstraction layer, which is realized by
modifying the Android platform.
The emphasis of the development is as follows:
(1) Improve and optimize the design of the Android native HAL and put forward our
own HAL model.
(2) Write the corresponding Camera HAL code for the design of HAL.
(3) Compile and debug the program.
Specific tasks can be divided into:
1. Research the existing Android HAL design and embedded device HAL design,
arrange the main process of development and design.
2. Theory learning - learning Android system-level development tools and methods,
familiar with the use of ubuntu.
302 Y. Gan et al.
3. In the Android development environment that has been built, carry out the corresponding
coding work based on the designed HAL model.
4. Test the Camera module in the application layer, and improve the design of the HAL layer
through testing.
5. Finally, after comprehensive adjustment, compile the report and paper
according to the results of the implementation.
There are four objectives: first, to accurately analyze the key technologies needed in
the design of the Android HAL; second, to develop and code Android at the source level
on an Ubuntu system; third, to test the Camera HAL module through application software
at the application level; finally, to produce the analysis report and paper after
obtaining the test results.
After three months of effort, we have basically realized the preview and photography
functions of the Camera subsystem, and improved the hardware abstraction layer of
Android. The driver uses the general V4L2 framework, which is universal and improves
the running efficiency of the Camera subsystem. In the HAL design, the
photo threads and preview threads have been implemented, and an optimization scheme is
proposed for the HAL: a Core Service is added to the lower layer of the HAL to improve
program efficiency and to integrate software and hardware. In the later
stage, it is necessary to improve the functions and further optimize the structure of
the HAL in order to find a balance between efficiency and versatility.
There is currently no standard evaluation of efficiency and performance. In the process of
further optimizing the framework, hot spot function analysis, timestamps, process
analysis, and other methods should be introduced to test the performance of the system,
including photo delay, focusing delay, and so on. A good test system and targeted
optimization can make the designed Camera subsystem more robust and efficient.
1. Bradley, A.P.: The use of the area under the roc curve in the evaluation of machine learning
algorithms. Pattern Recogn. 30(7), 1145–1159 (1997)
2. Arora, A., Peddoju, S.K.: Minimizing network traffic features for android mobile malware
detection. In: Proceedings of the 18th International Conference on Distributed Computing
and Networking, ICDCN 2017, pp. 32:1–32:10. ACM (2017)
3. Cimpanu, C.: CopyCat Adware Infects Zygote Android Core Process. BleepingComputer
4. Grace, M., Zhou, Y., Zhang, Q., Zou, S., Jiang, X.: Riskranker. In: Proceedings of the 10th
International Conference on Mobile Systems, Applications, and Services, MobiSys 2012,
pp. 281–294 (2012)
5. Kapratwar, A., Troia, F.D., Stamp, M.: Static and dynamic analysis of android malware. In:
Proceedings of the 3rd International Conference on Information Systems Security and
Privacy, pp. 653–662 (2017)
6. Liu, B., Nath, S., Govindan, R., Liu, J.: DECAF: detecting and characterizing ad fraud in
mobile apps. In: Proceedings of the 11th USENIX Conference on Networked Systems
Design and Implementation, NSDI 2014, pp. 57–70. USENIX Association (2014)
7. Moonsamy, V., Rong, J., Liu, S.: Mining permission patterns for contrasting clean and
malicious Android applications. Future Gener. Comput. Syst. 36, 122–132 (2014)
Design and Optimization of Camera HAL Layer Based on Android 303
8. Sharma, D.: Android malware detection using decision trees and network traffic. Int.
J. Comput. Sci. Inf. Technol. 7(4), 1970–1974 (2016)
9. Stamp, M.: Introduction to Machine Learning with Applications in Information Security.
Chapman and Hall/CRC, Boca Raton (2017)
10. Yan, L.-K., Yin, H.: DroidScope: seamlessly reconstructing the OS and Dalvik semantic
views for dynamic android malware analysis. In: USENIX Security Symposium, pp. 569–
584 (2012)
11. Zhang, L., Guan, Y.: Detecting click fraud in pay-per-click streams of online advertising
networks. In: The 28th International Conference on Distributed Computing Systems, ICDCS
2008, pp. 77–84. IEEE (2008)
12. Aafer, Y., Du, W., Yin, H.: DroidAPIMiner: mining API-level features for robust malware
detection in android. In: Proceedings of International Conference on Security and Privacy in
Communication Systems, pp. 86–103 (2013)
13. Arp, D., Spreitzenbarth, M., Hubner, M., Gascon, H., Rieck, K.: DREBIN: effective and
explainable detection of android malware in your pocket. In: 21st Annual Network and
Distributed System Security Symposium, NDSS 2014, vol. 14, pp. 23–26. The Internet
Society (2014)
14. Crussell, J., Stevens, R., Chen, H.: MAdFraud: investigating ad fraud in Android
applications. In: Proceedings of the 12th Annual International Conference on Mobile
Systems, Applications, and Services, pp. 123–134. ACM (2014)
15. Daswani, N., Stoppelman, M.: The anatomy of Clickbot.A. In: Proceedings of the First
Conference on Hot Topics in Understanding Botnets, HotBots 2007, pp. 11–31. USENIX
Association, Berkeley (2007)
16. Enck, W., et al.: TaintDroid: an information-flow tracking system for realtime privacy
monitoring on smartphones. ACM Trans. Comput. Syst. 32(2), 5 (2014)
17. Miller, B., Pearce, P., Grier, C., Kreibich, C., Paxson, V.: What’s clicking what? Techniques
and innovations of today’s clickbots. In: Holz, T., Bos, H. (eds.) DIMVA 2011. LNCS, vol.
6739, pp. 164–183. Springer, Heidelberg (2011).
18. Qiu, H., Noura, H., Qiu, M.: A user-centric data protection method for cloud storage based
on invertible DWT. IEEE Trans. Cloud Comput. (Early Access), 1 (2017)
19. Qiu, H., Memmi, G.: Fast selective encryption method for bitmaps based on GPU
acceleration. In: IEEE International Symposium on Multimedia, pp. 155–158. IEEE (2014)
20. Qiu, H., Memmi, G., Kapusta, K., Lu, Z., Qiu, M.: All-or-nothing data protection for
ubiquitous communication: challenges and perspectives. Inf. Sci. 502, 434–445 (2019)
Music Rhythm Customized Mobile Application
Based on Information Extraction
1 Introduction
Rhythm tracking mainly uses information extraction technology. This technology
was first used in surveying and mapping to extract useful information for users from
remote sensing images [1]. Possibly the earliest application of this technology to the
field of music was the feature extraction for phrase-based music information retrieval
published by Japanese scholars in 1999.
In this paper, the audio data is converted into an audio stream. Exploiting the
variable framing of the audio data stream, the rhythm extraction algorithm transforms
the audio data into data frames, and the audio is clipped using
the selected variable-speed time stretching algorithm. Templates are then used to
process the audio frames, so that users can create and edit their own audio, save it in the sandbox, or share it with
© Springer Nature Switzerland AG 2019
M. Qiu (Ed.): SmartCom 2019, LNCS 11910, pp. 304–309, 2019.
your friends. Developers only need to add their own customized templates; users only
need to import the audio they want to modify, adjust the passages to be
modified, and select the corresponding speed and template to get their own music. In
the end, this paper designs a rhythm customization application centered on information
extraction, which realizes a simple and clear audio editing method.
Rhythm tracking technology is essentially a technique that converts the music data
imported by the user into an array of rhythms and edits the audio array accordingly [2].
The music rhythm customization technology based on information extraction is used
for mobile application development. Applying the rhythm tracking algorithm and the time
stretching algorithm of the FFmpeg audio and video framework to the audio achieves
customized music with the following features: (1) it is easy to operate, removes many unrelated
operations, and requires no professional knowledge from the user;
(2) rich rhythm templates are available; (3) time stretching is free: the software
supports trimming by start and end time and speed editing of imported audio.
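The trimming and speed editing described above can be illustrated with a minimal sketch. It is not the application's implementation: the function names, the representation of audio as a plain list of samples, and the use of linear-interpolation resampling are all assumptions made for illustration. In particular, naive resampling also shifts the pitch, whereas a true time stretching algorithm changes the duration while preserving pitch.

```python
def trim(samples, sample_rate, start_s, end_s):
    """Keep only the samples between start_s and end_s (both in seconds)."""
    return samples[int(start_s * sample_rate):int(end_s * sample_rate)]


def change_speed(samples, speed):
    """Resample by linear interpolation so playback is `speed` times faster.

    Hypothetical helper: this naive approach also changes the pitch, unlike
    a real time stretching algorithm.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    if not samples:
        return []
    n_out = max(1, int(len(samples) / speed))
    out = []
    for i in range(n_out):
        pos = i * speed                      # fractional read position
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out
```

For example, `change_speed(trim(samples, 44100, 10.0, 20.0), 2.0)` would keep ten seconds of audio and halve its duration.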
2 Related Work
3 Proposed Model
3.1 Rhythm Tracking Algorithm
The main rhythm tracking algorithm
consists of three steps: intermediate input representation, general state, and contextual
state. The intermediate input representation, also known as the rhythm detection function,
converts the input audio signal into an array of audio frames, serving as an intermediate
signal between the input audio rhythm and the output audio rhythm. The tracking
formula is shown in formula (1):
$$\Gamma(m) = \sum_{k} \left| S_k(m) - \hat{S}_k(m) \right|^2 \tag{1}$$
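As a rough illustration of formula (1), the sketch below computes one detection value per frame as the summed squared magnitude of the difference between the spectrum of the frame and a predicted spectrum. It assumes, for simplicity, that the prediction for frame m is just the spectrum of frame m-1; the paper does not state the predictor, and the function names, frame size, and hop size are illustrative only.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of one frame (O(N^2), sketch only)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def detection_function(signal, frame_size, hop):
    """Return sum_k |S_k(m) - S_hat_k(m)|^2 for every frame m.

    Assumption: S_hat_k(m) is taken to be the previous frame's spectrum.
    """
    spectra = [
        dft(signal[i:i + frame_size])
        for i in range(0, len(signal) - frame_size + 1, hop)
    ]
    values = [0.0]  # the first frame has no prediction to compare against
    for prev, cur in zip(spectra, spectra[1:]):
        values.append(sum(abs(c - p) ** 2 for c, p in zip(cur, prev)))
    return values
```

A sudden change in the signal, such as the start of a note, produces a large value in the frame where it occurs, which is what makes the function usable as an intermediate signal for rhythm tracking.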
The main task of the general state is to detect the rhythm periodicity and the rhythm
alignment without any prior knowledge of the audio input [4]. The main task of the
context-dependent state is to extract context-dependent information (including rhythm, time
signature, and past rhythm locations) from the audio input and merge it into the relevant parameters [5]. In
order to emphasize significant rhythm and discard insignificant
rhythm, the adaptive moving average of each frame is calculated as shown in
formula (2):
The rhythm track is cut into frames, and features such as beat alignment, context, and
period are queried. The general state runs in a memoryless form, extracting the
period and alignment of repeating rhythms. The disadvantage of this method is that it ignores
context, so the final results lack continuity, which differs greatly from the strong consistency we
hear in music [6]. Using the general state alone can generate many unnecessary
errors. The contextual state is therefore introduced to address them and
to solve the problem of continuous audio output.
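The adaptive moving average step mentioned before formula (2) can be sketched as follows. Since formula (2) is not reproduced in the text, this is only one plausible reading: a centered moving average is subtracted from each detection value and negative results are set to zero (half-wave rectification), so that only values standing out from their neighborhood survive. The function name and the `window` parameter are hypothetical.

```python
def emphasize_peaks(values, window):
    """Subtract a centered moving average and keep only the positive part."""
    half = window // 2
    out = []
    for m in range(len(values)):
        lo = max(0, m - half)
        hi = min(len(values), m + half + 1)
        local_mean = sum(values[lo:hi]) / (hi - lo)
        # Half-wave rectification: insignificant values become zero.
        out.append(max(0.0, values[m] - local_mean))
    return out
```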
4 Experiment
4.1 Audio Production
First of all, since this app is a pure iOS audio app, the first thing to
consider is how to import local audio [11]. Apple provides an audio
selection API for developers in its own framework. After importing the relevant framework,
MPMediaItem can be used to represent the audio files on iOS.
Second, since this article describes an audio editing application, the next step is to
process the audio just obtained, including the intermediate input representation,
general state processing, and context-dependent state processing mentioned above in the rhythm
tracking theory. This part is mainly implemented with the rhythm tracking algorithm.
Then the audio file is edited in time and speed according to the user's input; this
part is mainly implemented with the time stretching technique.
Finally, the edited audio is saved locally or shared. To keep the audio data uploaded by users
secure, the application does not use a cloud database; instead, it uses the application sandbox to save
the audio data for the user. Users can see all their audio
data in the history view, and view or delete it [10]. If users want to export or share their results, they can
upload audio files from the sandbox to Dropbox, and then share or export them further. The
workflow of the whole application is shown in Fig. 1:
Fig. 1. Workflow of the whole application.
As can be seen from the diagram, the audio file operations form a mainly
one-way process. Users can go back one step at any time, but
cannot take the next step until the current step is finished. The
application interface is therefore organized as cards: the background uses the album cover (the URL of the
picture in the album image attribute of MPMediaItem), and each function is placed
on its own card [12]. The cards are then paged with a ScrollView according to the steps,
and swiping is disabled on a card whose step is unfinished.
308 Y. Li et al.
5 Conclusion
1. Yu, Z.: The positioning, reasons and significance of the relationship between rites and music
in early confucianism. Film Rev. Introduction 18, 106–109 (2010)
2. Ravi, N.D., Bhalke, D.G.: Musical instrument information retrieval using neural network
3. Wang, Y.: Concept and practice of data journalism in the context of big data. Mod. Media 23
(6), 16–17 (2015)
4. Fu, H., Chen, C., Xiang, Y., et al.: Research and implementation of key technologies for
distributed big data acquisition. Guangdong Commun. Technol. 35(10), 7–10 (2015)
5. Davies, M.E.P., Plumbley, M.D.: Context-dependent beat tracking of musical audio. IEEE
Trans. Audio Speech Lang. Process. 15(3), 1009–1020 (2007)
6. Chu, W., Champagne, B.F.: Further studies of a FFT-based auditory spectrum with
application in audio classification. In: International Conference on Signal Processing, pp. 1–
3 (2008)
7. Degara, N., Rua, E.A., Pena, A., et al.: Reliability-informed beat tracking of musical signals.
IEEE Trans. Audio Speech Lang. Process. 20(1), 290–301 (2012)
8. Mohapatra, B.N., Mohapatra, R.K.: FFT and sparse FFT techniques and applications. In:
Fourteenth International Conference on Wireless & Optical Communications Networks,
pp. 1–5. IEEE (2017)
9. Zhan, Y., Yuan, X.: Audio post-processing detection and identification based on audio
features. In: International Conference on Wavelet Analysis & Pattern Recognition, pp. 3–7.
IEEE (2017)
10. Greamo, C., Ghosh, A.: Sandboxing and virtualization: modern tools for combating
malware. IEEE Secur. Priv. 9(2), 79–82 (2011)
11. Pang, B.: Development trend of interface design from flat style. Decoration 4, 127–128
12. Roig, C., Tardón, L.J., Barbancho, I., et al.: Automatic melody composition based on a
probabilistic model of music style and harmonic rules. Knowl.-Based Syst. 71, 419–434
13. Marchand, S.: Fourier-based methods for the spectral analysis of musical sounds. In: Signal
Processing Conference, pp. 1–5. IEEE (2014)
14. Li, P., Zou, Z.: Loop of the scroll view class (UIScrollView) in apple iOS and algorithm of
dynamic image loading. Comput. Telecom 10, 54–55 (2011)
15. Liu, C., Zhou, B., Guo, S.: Load optimization of large quantity data based on UITableView
in iOS. J. Hangzhou Univ. Electron. Sci. Technol. 4, 46–49 (2013)
16. Ma, Z.: Digital rights management: model, technology and application. China Commun. 14
(6), 156–167 (2017)
17. Kim, B., Pardo, B.: Speeding learning of personalized audio equalization. In: International
Conference on Machine Learning & Applications, 3–6 (2015)
18. Rong, F.: Audio classification method based on machine learning. In: International
Conference on Intelligent Transportation, pp. 3–5. IEEE Computer Society (2016)
19. Liu, L., Bian, J., Zhang, L., et al.: Implementation of audio and video synchronization based
on FFMPEG decoding. Comput. Eng. Des. 34(6), 2087–2092 (2013)
20. Akram, F., Garcia, M.A., Puig, D.: Active contours driven by local and global fitted image
models for image segmentation robust to intensity in homogeneity. PLoS ONE 12(4), 1–32
Computational Challenges and Opportunities
in Financial Services
1 Summary
Financial services is one of the fastest growing areas of applied scientific computing
[1]. Computationally intensive problems solved by financial services not only require
an extensive array of traditional x86 processors but have also been shown to benefit from other
exotic and proprietary hardware such as GPUs [2, 3], FPGAs [4, 5] and even custom
ASICs [6]. In addition to the heterogeneity of the infrastructure, different methods of
computation have also been applied in financial services, such as machine learning, AI [7], and
various simulation methods [4, 5, 8–10], along with many different mathematical
techniques, from statistics to numerical methods [11] to Stochastic Differential Equations
[12]. This paper aims to provide a short survey of the top computationally
intensive workloads in financial services, and the methods and techniques used to solve
them. First, the top business questions that lead to computationally difficult problems
will be outlined. This is followed by some methods used to solve these problems, although
most use some form of heuristic method due to the difficulty entailed. We will then
delve into the runtime aspect of these problems and what capabilities the cloud, mainly
Azure [13], provides vis-à-vis infrastructure to aid in solving said
The majority of the complex problems in finance stem from one of the following three
areas: (a) pricing, valuing and hedging of securities, (b) risk management [14] and
(c) portfolio optimization [15].
Pricing of securities involves complex mathematical methods, most of which can
only be solved using numerical methods or heuristically using methods like Monte
Carlo simulations [3, 16–18]. The complexity arises from the fact that stock prices
follow Brownian Motion (BM) in that they move in an irregular, non-differentiable,
and non-smooth fashion [19]. To model stock prices, and as a result, the rest of the
financial markets, the non-differentiability aspect of Brownian Motion requires the use
of Stochastic Differential Equations (SDEs) [12]. In practice, SDE-based methods discretize
the evolution of stock prices into very small steps, model or simulate each step, and collect the result set.
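The discretize-and-simulate idea can be sketched with an Euler–Maruyama scheme. The example assumes, purely for illustration, that the price follows geometric Brownian motion, dS = mu S dt + sigma S dW; the text does not commit to a specific SDE, and the function name and parameters are hypothetical.

```python
import math
import random


def simulate_gbm_path(s0, mu, sigma, horizon, steps, rng=None):
    """Simulate one price path with the Euler-Maruyama discretization.

    Assumed model: dS = mu * S * dt + sigma * S * dW.
    """
    if rng is None:
        rng = random.Random()
    dt = horizon / steps
    path = [s0]
    for _ in range(steps):
        z = rng.gauss(0.0, 1.0)  # standard normal increment
        s = path[-1]
        # One small step: deterministic drift plus random diffusion.
        path.append(s + mu * s * dt + sigma * s * math.sqrt(dt) * z)
    return path
```

Calling `simulate_gbm_path(100.0, 0.05, 0.2, 1.0, 252, random.Random(7))` would produce one simulated year of daily prices; repeating the call many times yields the result set that is then aggregated.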
The primary goal of risk management is to estimate the loss with a certain level of
confidence ($11.55–$11.57 million with a confidence level of 99%, for example). The
potential loss translates to a capital requirement from a regulatory perspective, and to
a trading decision from a portfolio optimization perspective [14]. Risk management
encompasses a number of different risk calculations, some of which are enforced by
regulation [20–23]. The majority of risk calculations require vast computational power
due to the complexity of the problem [9, 10, 24–29].
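Estimating a loss at a given confidence level can be illustrated with an empirical quantile over a sample of losses, for instance losses produced by simulated scenarios. This is a minimal sketch of one common way to compute such a figure, not a method prescribed by the text; the function name and the sample of losses are hypothetical.

```python
import math


def value_at_risk(losses, confidence=0.99):
    """Return the smallest sampled loss that is not exceeded in at least
    `confidence` of the scenarios (an empirical quantile)."""
    if not losses:
        raise ValueError("losses must not be empty")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    ordered = sorted(losses)
    # The small epsilon guards against floating-point round-up in the product.
    rank = math.ceil(confidence * len(ordered) - 1e-9)
    return ordered[max(rank, 1) - 1]
```

The cost lies less in this final step than in generating the loss sample itself, since each scenario may require repricing the entire portfolio.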
Portfolio optimization is affected by the preferences set by the investor, the constraints
set by the investor, governmental bodies, regulations, taxes, and many other
decisions made to optimize the portfolio based on the aforementioned preferences and
constraints [1]. Taxes and tax implications in particular have been shown to have a great
impact on portfolio optimization [30].
The aforementioned complexities are caused by mathematical challenges. Solving
these problems using the set of tools and techniques at our disposal has received a great
deal of attention over the years, and that research is still ongoing. Some of the details
behind the mathematical problems will be explored further in the next section.
The ability to efficiently use available technologies such as cloud, multi-core
processors, FPGAs, GPUs, and purpose-built hardware in order to adequately and
cost-effectively solve complex mathematical problems will be covered in a later section of
this paper.
Derivative securities have payoff functions that are dependent on other securities,
known as the underlying [31]. The primary use of derivatives is to hedge one's own risk or
transfer the risk to others [14]. European options, American
options, Asian options, Mortgage Backed Securities (MBS), credit derivatives, credit
default swaps, and collateralized debt obligations [11, 16, 32–35] are examples of
derivative securities whose pricing and valuation present a computational challenge in finance. They all
312 A. Sedighi and D. Jacobson
depend on the value, price and performance of one or more underlying securities such
as bonds or equities.
The Monte Carlo (MC) method is used to simulate and determine options pricing. MC
depends heavily on random numbers, or on pseudo-random numbers for pseudo-Monte
Carlo [4] simulations. Generating truly random numbers at scale represents a computational
challenge in financial services [36, 37]. The computational complexity of
standard MC is O(n^3) [38], with n representing the number of random numbers generated for
the simulation. Reducing the computational complexity of MC to anything lower
represents another challenge in financial services [39].
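A minimal sketch of Monte Carlo option pricing is given below for the simplest case, a European call. It assumes the underlying follows geometric Brownian motion under the risk-neutral measure and samples the terminal price directly; these modeling choices, the function name, and all parameter values are illustrative assumptions rather than anything specified in the text.

```python
import math
import random


def mc_european_call(s0, strike, rate, sigma, maturity, n_paths, seed=0):
    """Estimate a European call price by averaging discounted payoffs.

    Assumed model: terminal price
    S_T = s0 * exp((rate - sigma^2 / 2) * T + sigma * sqrt(T) * Z), Z ~ N(0, 1).
    """
    rng = random.Random(seed)  # pseudo-random numbers drive the simulation
    drift = (rate - 0.5 * sigma ** 2) * maturity
    vol = sigma * math.sqrt(maturity)
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        terminal = s0 * math.exp(drift + vol * z)
        total += max(terminal - strike, 0.0)  # call payoff at maturity
    return math.exp(-rate * maturity) * total / n_paths
```

The statistical error of such an estimate shrinks only with the square root of the number of paths, which is why large numbers of random draws, and therefore large amounts of compute, are needed for accurate prices.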
Pricing of exotic options [40] represents another computational challenge in finance
[39]. Exotic options can be used in the trading of different asset classes, such as commodities
(coffee, corn, pork belly, and crude oil), equities, bonds, and foreign exchange.
All the aforementioned components need to be discretized individually. As different
asset classes are added to the mix, discretization becomes a challenge and prone to
error [41]. As a result, the combinations of different choices of asset classes must be
explored, which itself introduces another computational challenge in financial services.
6 Data Challenges
Although the focus has been on computational challenges in finance, challenges in data
are tied to the why and the how of the challenges presented.
Financial services has always been a data-driven industry, but the conventional
methods by which financial data has been managed have become a challenge for
financial services. The telemetry generated by a financial transaction (seller information,
buyer information, price, P&L, and positions), along with industry-related telemetry
such as interest rates, commodity prices, equity prices, bond rates, and foreign exchange rates,
has added to the level of interest coming from financial services. About 70% of US
equity trades are machine-driven using High Frequency Trading (HFT) techniques.
Due to advancements in technology, there is an exponential increase in the data generated
by recent HFT. This has translated to Ultrahigh Frequency Data (UHFD), and storing,
accessing, and making sense of this data in a reasonable timeframe is a data
challenge facing financial services [45]. The interest is shifting to UHFD as UHFD-based
volatility models [46] have been shown to have improved statistical efficiency and to be
very useful in the evaluation of daily volatility forecasts [47, 48]. With UHFD, financial
services are now able to achieve accurate intraday volatility measures using machine
learning techniques and statistical models [45, 49, 50].
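One simple intraday volatility measure that high-frequency data makes possible is realized volatility: the square root of the sum of squared log returns over the day. The sketch below shows this standard estimator on a list of intraday prices; it is offered as an illustration of the idea, not as the specific models of [45–50], and the function name is hypothetical.

```python
import math


def realized_volatility(prices):
    """Square root of the sum of squared log returns between
    consecutive intraday prices."""
    if len(prices) < 2:
        raise ValueError("need at least two prices")
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    realized_variance = sum(r * r for r in log_returns)
    return math.sqrt(realized_variance)
```

The finer the sampling, the more terms enter the sum, which is one reason the storage and access cost of UHFD grows together with the accuracy of the measure.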
However, computational workloads now face additional challenges brought
about by the increase in data volume.
7 Solutioning Options
Properties of the cloud [53–55] have been shown to benefit the large-scale needs of financial
services. General Purpose GPU (GPGPU) computation can aid in solving problems
with high parallelization needs. Field Programmable Gate Array (FPGA) based
computing can be customized to a specific problem, and as such can
dramatically increase performance.
The overall goal of any option presented is to allow intra-day risk calculation to
occur. Intra-day risk calculation is hindered by a strict time requirement. If a calculation
must measure risk across the entire portfolio before the decision on the next
transaction can take place, then the time allotted to that calculation is bound by the
speed of business. In other words, there is no limit on the need to speed up these
calculations. This requirement sets the basis for many of the large-scale deployments of
compute in financial services.
7.1 Cloud
Properties of the cloud such as metered usage and dynamic provisioning are useful in
most cases, especially where a quick response is needed that would
require the dynamic buildout of a new environment [56]. The dynamic provisioning of
resources can reduce computation time as it allows virtually unlimited scalability. The
ability of the cloud to rapidly scale up to thousands of VMs [57, 58] can reduce the need
to spend time creating complex models and allow for rapid testing. This can potentially
reduce the overall runtime of the calculations and allow intra-day risk models [59]
to run. Fast VMs coupled with the availability of high-speed networking
technology like InfiniBand [57, 58] can reduce interprocess communication delays,
which in turn can assist with the task of random number distribution. To tackle the
storage bottleneck introduced by UHFD on the cloud, a cloud-based high-speed storage
environment would be desirable [60].
Machine learning and optimization techniques are often used for portfolio optimization
[15, 61, 62]. A new approach that can potentially replace traditional Monte
Carlo simulations with neural networks has been shown to reduce calculation times
exponentially [63].
In addition to the traditional VM-based machine learning deployments, machine
learning and optimization models are now finding their way onto both cloud-based
FPGAs [62] and GPUs [63, 64].
7.2 GPU
GPU-based infrastructure has been used to speed up risk and pricing calculations in
financial services [2, 3, 70]. The many-core design of GPUs makes it possible to run
potentially thousands of tasks in parallel, thus reducing the overall runtime. Some
cloud-based GPUs are capable of ~3 teraflops of double-precision operations across ~5000
cores [71].
As mentioned before, one of the main challenges of financial services is options
pricing, and more specifically, exotic options pricing. GPUs have been used
to do both very efficiently [72–74]. GPUs have also been used to develop a neural-network-based
approach to replace Monte Carlo simulations in pricing derivatives [63]. GPUs
have the capability to provide raw computational power, with little change to the
method by which these models are solved. FPGAs, however, have the capability to
introduce a new paradigm in solving computational challenges in financial services.
7.3 FPGA
FPGAs allow for an alternate method of solving computationally challenging problems.
Random number generation is inherent in FPGAs and thus easier to compute, as
demonstrated in [66]. Quasi-random numbers have also been tested and proven to
function at great speeds on FPGAs [4]. Putting an entire Monte Carlo simulation on an
FPGA has also been demonstrated in [5, 75] with a good degree of success.
Azure has demonstrated the performance gain achievable with its FPGA-based
SmartNIC [76] network cards. The authors demonstrate <15 ms VM-VM TCP
latencies and 40+ Gbps throughput. The reconfigurability of the FPGA-based network
cards proved to be a valuable feature, as changes and bug fixes could simply be deployed in
real time. FPGAs (and GPUs) are used as CPU offloads, in that tasks are offloaded to
the FPGA in order to reduce the load on the CPU cores. This does require code changes,
as the process needs to be FPGA-aware to utilize its capabilities. Even though the
development costs are higher, cloud providers are still providing FPGA computing
as a service as part of their service portfolio.
8 Conclusion
1. Haugh, M.B., Lo, A.W.: Computational challenges in portfolio management. Comput. Sci.
Eng. 3(3), 54 (2001)
2. Grauer-Gray, S., et al.: Accelerating financial applications on the GPU. In: Proceedings of
the 6th Workshop on General Purpose Processor Using Graphics Processing Units. ACM
3. Solomon, S., Thulasiram, R.K., Thulasiraman, P.: Option pricing on the GPU. In: 2010
IEEE 12th International Conference on High Performance Computing and Communications
(HPCC). IEEE (2010)
4. Banks, S., Beadling, P., Ferencz, A.: FPGA implementation of pseudo random number
generators for Monte Carlo methods in quantitative finance. In: 2008 International
Conference on Reconfigurable Computing and FPGAs. IEEE (2008)
5. Woods, N.A., VanCourt, T.: FPGA acceleration of quasi-Monte Carlo in finance. In: 2008
International Conference on Field Programmable Logic and Applications. IEEE (2008)
6. Pottathuparambil, R., et al.: Low-latency FPGA based financial data feed handler. In: 2011
IEEE 19th Annual International Symposium on Field-Programmable Custom Computing
Machines. IEEE (2011)
7. Krollner, B., Vanstone, B.J., Finnie, G.R.: Financial time series forecasting with machine
learning techniques: a survey. In: ESANN (2010)
8. Eckhardt, R.: Stan Ulam, John von Neumann, and the Monte Carlo method. Los Alamos Sci.
15(131–136), 30 (1987)
9. Glasserman, P., Heidelberger, P., Shahabuddin, P.: Efficient Monte Carlo methods for value-
at-risk (2010)
10. Tezuka, S., et al.: Monte Carlo grid for financial risk management. Future Gener. Comput.
Syst. 21(5), 811–821 (2005)
11. Gobet, E.: Advanced Monte Carlo methods for barrier and related exotic options. In:
Handbook of Numerical Analysis, pp. 497–528. Elsevier (2009)
12. Kloeden, P.E., Platen, E.: Numerical Solution of Stochastic Differential Equations, vol. 23.
Springer, Berlin (2013).
13. Microsoft Azure (2018). Accessed 12 Feb 2018
14. Staum, J.: Monte Carlo computation in finance. In: L’Ecuyer, P., Owen, A. (eds.) Monte
Carlo and Quasi-Monte Carlo Methods. Springer, Berlin (2009)
15. Joseph, T.: Computational financing techniques and fundamental challenges in portfolio
optimization. IOSR J. Hum. Soc. Sci. 9(6), 51–58 (2013)
16. Black, F., Scholes, M.: The pricing of options and corporate liabilities. J. Polit. Econ. 81(3),
637–654 (1973)
17. Korn, R., Korn, E.: Option Pricing and Portfolio Optimization: Modern Methods of
Financial Mathematics, vol. 31. American Mathematical Society (2001)
18. Korn, R., Müller, S.: Binomial Trees in Option Pricing—History, Practical Applications and
Recent Developments. In: Devroye, L., Karasözen, B., Kohler, M., Korn, R. (eds.) Recent
Developments in Applied Probability and Statistics, pp. 59–77. Springer, Berlin (2010).
19. Karatzas, I., Shreve, S.E.: Brownian Motion and Stochastic Calculus. Springer, New York
20. Committee, B.: Basel III: a global regulatory framework for more resilient banks and
banking systems. Basel Committee on Banking Supervision, Basel (2010)
21. Gleeson, S.: International Regulation of Banking: Basel II: Capital and Risk Requirements.
OUP Catalogue (2010)
22. Hakenes, H., Schnabel, I.: Bank size and risk-taking under Basel II. J. Bank. Finance 35(6),
1436–1449 (2011)
23. Tarullo, D.K.: Banking on Basel: The Future of International Financial Regulation. Peterson
Institute (2008)
24. Alexander, C.: Volatility and correlation: measurement, models and applications. Risk
Manag. Anal. 1, 125–171 (1998)
25. Brummelhuis, R., et al.: Principal component value at risk. Math. Finance 12(1), 23–43
26. Chong, J., Keutzer, K., Dixon, M.F.: Acceleration of Market Value-at-Risk Estimation.
Available at SSRN 1576402 (2009)
27. Giot, P.: Market risk models for intraday data. Eur. J. Finance 11(4), 309–324 (2005)
28. Jorion, P.: Value at Risk. McGraw-Hill, New York (1997)
29. McNeil, A.J., Frey, R., Embrechts, P.: Quantitative Risk Management: Concepts,
Techniques and Tools. Princeton University Press (2015)
30. Bertsimas, D., Lo, A.W.: Optimal control of execution costs. J. Financ. Mark. 1(1), 1–50
31. Bodie, Z., et al.: Investments. McGraw-Hill Education (2015)
32. Heston, S.L.: A closed-form solution for options with stochastic volatility with applications
to bond and currency options. Rev. Financ. Stud. 6(2), 327–343 (1993)
33. Giesecke, K.: An overview of credit derivatives. Available at SSRN 1307880 (2009)
34. Giesecke, K.: Portfolio credit risk: top-down versus bottom-up approaches. Front. Quant.
Finance, 251 (2009)
35. Fabozzi, F.J.: The Handbook of Mortgage-Backed Securities. Oxford University Press
36. Gentle, J.E.: Random Number Generation and Monte Carlo Methods. Springer, Berlin
37. Niederreiter, H.: Quasi-Monte Carlo methods and pseudo-random numbers. Bull. Am. Math.
Soc. 84(6), 957–1041 (1978)
38. Anderson, D.F., Higham, D.J., Sun, Y.: Computational complexity analysis for Monte Carlo
approximations of classically scaled population processes. Multiscale Model. Simul. 16(3),
1206–1226 (2018)
39. Desmettre, S., Korn, R.: 10 computational challenges in finance. In: De Schryver, C. (ed.)
FPGA Based Accelerators for Financial Applications, pp. 1–31. Springer, Cham (2015).
40. Zhang, P.G.: Exotic options: a guide to second generation options. World Scientific (1998)
41. Dempster, M., Hutton, J.: Fast numerical valuation of American, exotic and complex
options. Appl. Math. Finance 4(1), 1–20 (1997)
42. Pan, S.-Q.: A survey of financial risk measurement. In: 6th International Conference on
Management Science and Management Innovation (MSMI 2019). Atlantis Press (2019)
43. Sedighi, A., Deng, Y., Zhang, P.: Fairness of task scheduling in high performance
computing environments. Scalable Comput.: Pract. Exp. 15(3), 273–285 (2014)
44. Sedighi, A., Smith, M.: Fair Scheduling in High Performance Computing Environments.
Springer, Cham (2019).
45. Seth, T., Chaudhary, V.: Big Data in Finance (2015)
46. Barndorff-Nielsen, O.E., Shephard, N.: Power and bipower variation with stochastic
volatility and jumps. J. Financ. Econom. 2(1), 1–37 (2004)
47. Bollerslev, T., Wright, J.H.: High-frequency data, frequency domain inference, and volatility
forecasting. Rev. Econ. Stat. 83(4), 596–602 (2001)
48. Grammig, J., Wellner, M.: Modeling the interdependence of volatility and inter-transaction
duration processes. J. Econom. 106(2), 369–400 (2002)
49. Comte, F., Renault, E.: Long memory in continuous-time stochastic volatility models. Math.
Finance 8(4), 291–323 (1998)
50. McAleer, M., Medeiros, M.C.: Realized volatility: a review. Econom. Rev. 27(1–3), 10–45
51. High performance compute VM sizes. Virtual Machine Documentation (2019). Accessed 22 July 2019
52. What are field-programmable gate arrays (FPGA) (2019).
azure/machine-learning/service/concept-accelerate-with-fpgas. Accessed 22 July 2019
53. Armbrust, M., et al.: Above the Clouds: A Berkeley View of Cloud Computing (2009)
54. Avram, M.-G.: Advantages and challenges of adopting cloud computing from an enterprise
perspective. Procedia Technol. 12, 529–534 (2014)
55. Armbrust, M., et al.: A view of cloud computing. Commun. ACM 53(4), 50–58 (2010)
56. Smith, D.M.: Cloud computing primer for 2016. Gartner Inc., Stamford (2016)
Computational Challenges and Opportunities in Financial Services 319
57. Azure HC-series Virtual Machines cross 20,000 cores for HPC workloads (2019). https://
hpc-workloads/. Accessed 22 July 2019
58. Working with large virtual machine scale sets (2019).
azure/virtual-machine-scale-sets/virtual-machine-scale-sets-placement-groups. Accessed 22
July 2019
59. Enabling the financial services risk lifecycle with Azure and R. (2019). Accessed 22 July 2019
60. Cray in Azure (2019).
computing/cray/. Accessed 22 July 2019
61. What is axiomaBlue? (2019). Accessed 22
July 2019
62. Deploy a model as a web service on an FPGA with Azure Machine Learning service (2019).
service. Accessed 22 July 2019
63. Ferguson, R., Green, A.D.: Deeply learning derivatives. Available at SSRN 3244821 (2018)
64. Deploy a deep learning model for inference with GPU (2019).
en-us/azure/machine-learning/service/how-to-deploy-inferencing-gpus. Accessed 22 July
65. Kerrigan, B., Chen, Y.: A study of entropy sources in cloud computers: random number
generation on cloud hosts. In: Kotenko, I., Skormin, V. (eds.) MMM-ACNS 2012. LNCS,
vol. 7531, pp. 286–298. Springer, Heidelberg (2012).
66. Yap, A.Y.: Information Systems for Global Financial Markets: Emerging Developments and
Effects: Emerging Developments and Effects. IGI Global (2011)
67. Tian, X., Benkrid, K.: High-performance quasi-monte carlo financial simulation: FPGA vs.
GPP vs. GPU. ACM Trans. Reconfigurable Technol. Syst. (TRETS) 3(4), 26 (2010)
68. Singla, N., et al.: Financial Monte Carlo simulation on architecturally diverse systems. In:
2008 Workshop on High Performance Computational Finance. IEEE (2008)
69. Kim, H., et al.: Online risk analytics on the cloud. In: Proceedings of the 2009 9th
IEEE/ACM International Symposium on Cluster Computing and the Grid. IEEE Computer
Society (2009)
70. Qiu, M., et al.: Data transfer minimization for financial derivative pricing using Monte Carlo
simulation with GPU in 5G. Int. J. Commun Syst 29(16), 2364–2374 (2016)
71. Azure N-Series VMs and NVIDIA GPUs in the Cloud (2016).
n-series-vms-and-nvidia-gpus-in-the-cloud/. Accessed 31 July 2019
72. Bernemann, A., Schreyer, R., Spanderen, K.: Accelerating exotic option pricing and model
calibration using GPUs. Available at SSRN 1753596 (2011)
73. Gaikwad, A., Toke, I.M.: GPU based sparse grid technique for solving multidimensional
options pricing PDEs. In: Proceedings of the 2nd Workshop on High Performance
Computational Finance. ACM (2009)
74. Abbas-Turki, L.A., Lapeyre, B.: American options pricing on multi-core graphic cards. In:
2009 International Conference on Business Intelligence and Financial Engineering. IEEE
75. De Schryver, C. (ed.): FPGA Based Accelerators for Financial Applications. Springer, Cham
76. Firestone, D., et al.: Azure accelerated networking: SmartNICs in the public cloud. In: 15th
USENIX Symposium on Networked Systems Design and Implementation (NSDI 2018)
A Shamir Threshold Model Based
Recoverable IP Watermarking Scheme
Abstract. To address the low capacity and low robustness of existing IP
watermarking techniques, this paper proposes a novel recoverable IP watermarking
algorithm based on the Shamir threshold model. The method uses the $(t, n)$
threshold secret sharing scheme with $t$ as the recovery factor. A mapping
relationship is constructed between n sub-keys and the watermark information S,
the n sub-keys of each watermark are cross-inserted into the respective watermark
information S, and finally the embedded watermark information S is reconstructed.
Experimental results show that this method greatly improves the robustness of the
watermark while expanding the watermark information capacity. Compared with other
watermarking algorithms, this method has the advantages of large watermark
embedding capacity and high watermark recovery ability.
1 Introduction
With the rapid development of semiconductor technology and increasing integration
density, system-on-chip (SoC) technology has made it possible to integrate more
functions on a single chip and has gradually become the mainstream of IC design.
IP reuse technology plays a key role in solving design hierarchy, reducing product
cost, shortening the design cycle and reducing market risk in SoC design, but
unauthorized IP reuse is becoming increasingly common. Therefore, how to
effectively protect the intellectual property rights of IP cores has attracted the
attention of researchers [1, 2].
The usual IP watermarking technology mainly embeds specific watermark identi-
fication information into any abstraction level in the IP design process according to the
FPGA design flow. When it is necessary to obtain evidence, the copyright owner can
detect the watermark information in the core product to prove its attribution, so as to
achieve the purpose of copyright protection [3–5].
In order to prevent illegal users from pirating or forging IP cores at the behavioral
level, Sengupta et al. [6] proposed a multivariate signature coding watermark
generation technique applied during high-level synthesis (HLS), which can improve
IP core reusability and security. Although the watermark generated by this technique
has the advantages of lower embedding cost, stronger author identification and lower
hardware overhead, the technique needs to add more constraints in the watermark
embedding process to represent the stored variables that enforce the interval graph.
Its scheme of key storage and key transmission through insecure channels also poses
certain risks for the security protection of IP cores. To this end, Abtioğlu et al. [7]
proposed an IP core forgery protection method for FPGA platforms, in which a physical
unclonable function (PUF) and the circuit in the device are used to generate a key.
This solves the security problem of key transmission, but the feasibility of the method
depends heavily on the reconfigurability of the FPGA. Although this method can achieve
high security and high randomness, the PUF it uses is designed for some special
scenarios, which makes it difficult to resist replay attacks and raises security issues.
In order to solve this problem, Zhang et al. [8] proposed a PUF-FSM binding method
to protect the configuration bitstream, in which a locking scheme combining a
reconfigurable physical unclonable function (PUF) and a finite state machine (FSM) is
used to defeat replay attacks. This method can effectively resist replay attacks and
has the advantages of strong reconfigurability and low overhead. However, its design
and performance overhead is unpredictable and usually high. Cui et al. [9] proposed an
ultra-low overhead watermarking scheme to protect IP cores. The scheme mainly enhances
the flexibility of the local connection style by optimizing the scan design, and
introduces a virtual scan unit by partial rewiring to achieve ultra-low overhead and
easy detection, but its real-time performance and robustness are not high. In order to
improve the robustness and real-time performance of the system, Liang et al. [10]
proposed a method based on the Hausdorff distance, which is used for identity
verification of IC chips in an IoT environment. LUT resources are used as a set of
reconfigurable nodes in an FPGA. The method first embeds the copyright information into
selected unused LUT resources found by a depth-first search algorithm, then uses the
Hausdorff distance matching function to reorder the random positions, and finally maps
the positions to meet the specific constraints of the optimal watermark position. This
team also proposed a method of hiding information in the core to prove original
ownership [11]. That work introduces four IP core watermarking techniques: FPGA-based,
FSM-based, DFT-based and self-recovery dual IP core watermarking. It mainly solves two
problems: how to hide the information in the core circuit and how to authenticate the
ownership of the core. The experimental results show that the method has the advantages
of low power consumption and strong resistance to attacks on the watermark.
Sengupta et al. [12] also proposed a novel multivariate signature coding method. The
method is mainly applied to the process of embedding dynamic watermark information
into the IP core, and optimizes the embedding cost of the watermark by the particle
322 W. Xiao et al.
swarm optimization driven design based on the area delay constraint, thereby reducing
the embedding cost, the running time and the storage hardware resources of the
watermark system. However, its embedding capacity is limited and its real-time
performance is not high.
The above research results show that much work on IP watermarking has been
reported, but some problems remain in previous methods, such as limited
capacity and low watermark security. To this end, this paper proposes a recoverable
dual IP watermarking algorithm based on the Shamir threshold scheme. Compared with
existing IP core watermarks, it is designed to improve the security and integrity of
the watermark. The recovery mechanism reduces the difficulty of design while still
ensuring security. At the same time, the method has the advantages of high
security, large watermark capacity and recovery ability.
The core watermark is a branch of watermarking technology that addresses illegal
theft in IP reuse. The recoverable IP watermark is embedded in the IP core to
identify the copyright of the IP core developer, so that the developer can prove
ownership more effectively and copyright infringement can be judged more decisively.
The watermark in this paper is designed as follows. Let the first copyright information
be $Cr_1$ and the second copyright information be $Cr_2$. The encryption function is $E$
and the encryption key is $k_e$. The watermark correlation function is denoted by $F$.
The encrypted watermark information is $C_1, C_2$, and the watermark information to be
embedded is denoted by $S_1, S_2$. Let the embedding control key be $k_c$, the watermark
embedding function be $I$, the watermark extraction function be $A$, the recovery
function be $R$, and the watermarked IP carrier be $T$. During the preparation phase the
watermark is transformed as $S_1 \leftarrow F(C_1, C_2) \rightarrow S_2$, where
$C_1 = E(Cr_1, k_e)$ and $C_2 = E(Cr_2, k_e)$: the copyright information is first
encrypted to obtain the encrypted watermarks $C_1, C_2$, and then the association
function is used to establish an association mechanism between the watermarks so that
each watermark can be restored from the other. After the watermark is prepared, the
associated watermark information $S_1, S_2$ is embedded in the reusable core $T$ under
the control of the control key $k_c$, and the IP core product $T_s$ containing the
associated watermarks is obtained, expressed mathematically as
$T_s \leftarrow I(S_1, S_2, k_c)$. When a copyright dispute occurs, the copyright owner
can extract the watermark information from the reusable IP core:
$Cr_1' \leftarrow E^{-1}(A(k_c), k_e) \rightarrow Cr_2'$. If $Cr_1' = Cr_1$ or
$Cr_2' = Cr_2$, the copyright of the IP core can be proven. If $Cr_1' \neq Cr_1$ and
$Cr_2' \neq Cr_2$, the copyright cannot be proven. However, the watermarks are mutually
recoverable, because the correlation function $F$ uses the Shamir threshold scheme.
When $C_1$ in $T_s$ is destroyed, it can be recovered from $S_2$, i.e.
$C_1 = E^{-1}(R(S_2), k_e)$, whereas $C_2$ is restored from $S_1$, i.e.
$C_2 = E^{-1}(R(S_1), k_e)$. Therefore, the watermark can protect the interests of
copyright owners to a greater extent.
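The recovery property described above rests on the Shamir $(t, n)$ threshold scheme: a secret is hidden in the constant term of a random polynomial of degree $t - 1$, and any $t$ of the $n$ evaluation points recover it by Lagrange interpolation. The following is a minimal sketch of that generic scheme only, not of the paper's correlation function $F$; the names `split_secret` and `recover_secret` and the choice of prime modulus are illustrative assumptions.

```python
import random

# Assumed prime modulus (the Mersenne prime 2**61 - 1); it must exceed the
# secret and the number of shares. This choice is illustrative only.
PRIME = 2**61 - 1


def split_secret(secret, t, n, rng=None):
    """Split `secret` into n shares such that any t of them recover it."""
    if not 1 <= t <= n < PRIME:
        raise ValueError("need 1 <= t <= n < PRIME")
    if not 0 <= secret < PRIME:
        raise ValueError("secret must lie in [0, PRIME)")
    rng = rng or random.SystemRandom()
    # Random polynomial of degree t - 1 whose constant term is the secret.
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * (-xj)) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

With `t = 3` and `n = 5`, for instance, any three of the five shares passed to `recover_secret` return the original value, which is the sense in which $t$ acts as the recovery factor.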
[Figure: generation of the recoverable watermark. Each original watermark, C1 and C2, is key-split into an associated information flow, P1 and P2; the two flows support mutual recovery and yield the watermark information to be embedded, S1 and S2.]
2. Encryption. The initial key $k_h$ is selected to initialize the chaotic system. The
chaotic system generates the chaotic key sequence $K = \{k_i \mid i = 0, 1, 2, \ldots, k\}$, and
the original watermark stream $M$ is encrypted by the function $E$. Finally, the encrypted
watermark information stream $C$ is obtained, that is, $C = E(M, K)$;
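[Figure: chaotic encryption of the watermark. The initial key kh is the input to the chaotic system, which outputs the key stream k1, k2, ..., kn; the original watermark information m = m1, m2, ..., mn is combined with the key stream to give the encrypted watermark information c = c1, c2, ..., cn.]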
3. Association. After the second step, two encrypted watermark information streams
$C_1, C_2$ are obtained. They are associated by the correlation function $F$,
$S_1 \leftarrow F(C_1, C_2) \rightarrow S_2$, to obtain the embedded watermark
information $S_1, S_2$.
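The encryption step above can be illustrated with a small sketch. The text does not say which chaotic system is used, so the logistic map, the parameter `r = 3.99`, the burn-in length and the XOR combination below are all assumptions made purely for illustration; the function names are hypothetical, and a keystream of this kind is a teaching device rather than a vetted cipher.

```python
def logistic_keystream(kh, length, r=3.99, burn_in=100):
    """Generate `length` key bytes from the logistic map x -> r * x * (1 - x).

    `kh` plays the role of the initial key and must lie strictly in (0, 1).
    The map, r, and burn_in are illustrative assumptions, not the paper's.
    """
    if not 0.0 < kh < 1.0:
        raise ValueError("initial key kh must lie strictly between 0 and 1")
    x = kh
    for _ in range(burn_in):  # discard the transient iterations
        x = r * x * (1.0 - x)
    stream = []
    for _ in range(length):
        x = r * x * (1.0 - x)
        stream.append(int(x * 256) % 256)  # quantise the state to one byte
    return bytes(stream)


def chaotic_xor(data, kh):
    """Encrypt or decrypt `data` by XOR with the chaotic key stream."""
    keystream = logistic_keystream(kh, len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))
```

Because XOR is its own inverse, calling `chaotic_xor` a second time with the same `kh` restores the original watermark bytes, which mirrors the roles of $E$ and $E^{-1}$ in the text.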
The key of watermark generation is to associate the watermark. The correlation
function F used in the association uses the principle of the Shamir secret sharing
scheme. Since the watermark is recoverable, the implementation of the correlation
function is described in Fig. 1. The robustness of the recoverable watermark is higher
than that of other watermarks. At the same time, it has the advantages of stronger
recovery ability and stronger anti-attack capability (Fig. 2).
its distance L. If the LUT is in T, its position is recorded into the embedded position
Pos sequence.
3. Watermark embedding. The watermark information is embedded in the corre-
sponding LUTs in Pos sequence (Fig. 3).
[Figure: watermark extraction. The watermark information stream s' = s'1, s'2, ..., s'n is read from the LUT positions (LUT11 ... LUTnn) listed in index table T, de-correlated as C' = F^{-1}(S') to give the information c' = c'1, c'2, ..., c'n, and decrypted to the original watermark information m' = m'1, m'2, ..., m'n.]
When the extracted watermark information has been tampered with, the recovery
mechanism between the watermarks is used to restore the damaged watermark
information. The recovery steps are as follows:
4. Extract the associated information. The associated information flow P is extracted
from the S containing the associated redundant information flow.
The IP cores used in the experiments were all from the website [13], and
we used our own designed core watermarking system to test with Xilinx ISE design
tools, ModelSim simulation tools and Synplify synthesis tools.
The analysis results show that the logic gates of different IP cores are not the same,
but the number of Slices used differs considerably from the number of logic gates. From
Table 1, we can clearly see that the number of resources occupied by the core design
increases with the amount of embedded watermark information. This is due to the fact
that once the recovery factor t is selected in the algorithm, the information needed to
recover the corrupted watermark has been determined. According to the test results in
the table, although the recoverable watermark requires more resources than the
watermark in [10], the embedded watermark information capacity is twice the watermark
capacity in [10], and it has better robustness.
It can be seen from the experiment in Table 2 that, relative to the other two
methods, our method increases the CLB resource occupancy rate somewhat after
watermark embedding, but the other two methods cannot match the recovery ability
of the proposed method after an attack.
5 Conclusion
encryption improve the robustness of the previous IP core watermark, and expand the
embedded watermark information capacity in the process of watermark information
embedding. The experimental results show that this solution does not cause large hardware
overhead and achieves good watermark recovery performance, so it has good application
prospects. In the next step, we will consider using more attacks to test the security of this
algorithm, and further study the effective embedding mechanism of recoverable core
watermarks in different abstraction layers, thus further optimizing our algorithm.
Acknowledgements. This research was funded by the Fujian Provincial Natural Science
Foundation of China (Grant 2018J01570) and the CERNET Innovation Project (Grant
1. Liang, W., Xie, S., Li, X., Long, J., Xie, Y., Li, K.-C.: A novel lightweight PUF-based RFID
mutual authentication protocol. In: Hung, J.C., Yen, N.Y., Hui, L. (eds.) FC 2017. LNEE,
vol. 464, pp. 345–355. Springer, Singapore (2018).
2. Liang, W., Long, J., Cui, A., et al.: A new robust dual intellectual property watermarking
algorithm based on field programmable gate array. J. Comput. Theor. Nanosci. 12(10),
3959–3962 (2015)
3. Han, Q., Noura, H., Qiu, M., et al.: A user-centric data protection method for cloud storage
based on invertible DWT. IEEE Trans. Cloud Comput., 1 (2019)
4. Anirban, S., Dipanjan, R.: Antipiracy-aware IP chipset design for CE devices: a robust
watermarking approach hardware matters. IEEE Consum. Electron. Mag. 6(2), 118–124
5. Han, Q., Noura, H., et al.: All-or-nothing data protection for ubiquitous communication:
challenges and perspectives. Inf. Sci. 502, 434–445 (2019)
6. Anirban, S., Saumya, B., Saraju, M.P.: Embedding low cost optimal watermark during high
level synthesis for reusable IP core protection. In: IEEE International Symposium on Circuits
& Systems, pp. 974–977. IEEE (2016)
7. Abtioglu, E., Yeniceri, R., Govem, B., et al.: Partially reconfigurable IP protection system
with ring oscillator based physically unclonable functions, pp. 58–65. IEEE (2017)
8. Zhang, J., Lin, Y., Qu, G.: Reconfigurable binding against FPGA replay attacks. ACM
Trans. Des. Autom. Electron. Syst. 20(2), 1–20 (2015)
9. Cui, A., Qu, G., Zhang, Y.: Ultra-low overhead dynamic watermarking on scan design for
hard IP protection. IEEE Trans. Inf. Forensics Secur. 10(11), 2298–2313 (2017)
10. Liang, W., Huang, W., Chen, W., et al.: Hausdorff distance model-based identity
authentication for IP circuits in service-centric internet-of-things environment. Sensors 19
(3), 487 (2019)
11. Liang, W., Long, J., Zhang, D., Li, X., Huang, Y.: Study on IP protection techniques for
integrated circuit in IOT environment. In: Di Martino, B., Li, K.-C., Yang, L.T., Esposito, A.
(eds.) Internet of Everything. IT, pp. 193–216. Springer, Singapore (2018).
12. Sengupta, A., Bhadauria, S.: Exploring low cost optimal watermark for reusable IP cores
during high level synthesis. IEEE Access 4, 2198–2215 (2016)
13. OpenCores Web Site. Accessed 25 Jan 2019
Applications of Machine Learning Tools
in Genomics: A Review
1 Introduction
are interpreted by cellular machinery, which may then call for the synthesis of a
molecule, or the activation or deactivation of another gene.
Elements of computer science are then incorporated to analyze these sequences
with machine learning tools. Machine learning simply refers to the process by which a
machine develops an understanding of a dataset, about which it can then draw
inferences. At the most fundamental level, these inferences can refer to identifying
common patterns across entries in a data set, from which researchers can then
develop their own conclusions. In the realm of genomics, this principle equates to a
machine that takes a standard “labeled” DNA sequence as input, operates on a new
DNA sequence, compares its contents to the standard, and finally labels it
accordingly. For example, the machine may develop an understanding that the promoter
of a certain gene contains high quantities of the triplet GCT; when it next encounters
a GCT sequence in the analyte DNA, it can then label that sequence as the previously
identified or “known” promoter [1].
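The GCT illustration above can be made concrete with a toy sketch. It counts overlapping triplets in a sequence and applies a fixed frequency threshold; the threshold value, the label strings and the function names are hypothetical, and a real tool would learn such a rule from labeled data rather than hard-code it.

```python
from collections import Counter


def triplet_counts(seq):
    """Count every overlapping triplet (3-mer) in a DNA sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + 3] for i in range(len(seq) - 2))


def label_by_triplet(seq, triplet="GCT", threshold=0.2):
    """Label a sequence by the relative frequency of one triplet.

    The threshold of 0.2 is an arbitrary illustrative value.
    """
    counts = triplet_counts(seq)
    total = sum(counts.values())
    frequency = counts[triplet] / total if total else 0.0
    return "promoter-like" if frequency >= threshold else "other"
```

For the sequence `GCTGCT`, the overlapping triplets are GCT, CTG, TGC and GCT, so GCT accounts for half of them and the sequence is labeled `promoter-like` under this assumed threshold.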
However, these machine learning tools may manifest in a variety of forms, from
support vector machines to neural networks, to clustering algorithms, all of which have
been the frequent subjects of inquiry for their applications in predictive genomics.
Thus, there exists no dominant strategy and as such inquiry is divided among innu-
merable proprietary procedures, impeding comparison and progress in a developing
field. It is the intent of this review to summarize the newest and most influential
findings regarding these learning tools and the effectiveness of their implementation,
in order to determine which is the most efficient for machine learning based DNA
sequence analysis in terms of applicability, accessibility, and accuracy. The methods
and results of each study will be summarized along with a brief evaluation
corresponding to these parameters, before concluding with an overall determination
of the optimal strategy in accordance with this analysis.
Several universal steps are shared by all machine learning techniques when analyzing
DNA sequences [1, 18, 19]. Fundamentally, all methods will be predicated on
determining the significance of a given nitrogenous base (G, C, A, T) sequence given
their respective bonding patterns. Differences then emerge in the presence or absence of
a standard for comparison and the algorithms employed thereafter to formulate and
evaluate associations between these sequences. Machine learning tools can thus be
grouped into two general categories: supervised and unsupervised. Supervised
approaches are provided with a standard for comparison, essentially definitions for
certain sequences that determine their interpretation of DNA sequences, whereas
unsupervised approaches are provided with no such definitions and develop their own
via association.
In the case of the former, analysis begins with the previously selected DNA training
sequence, applicable to the inquiry conducted and with appropriate labels for areas of
interest, being transformed into a multi factored vector value. Next, this vector is
332 J. L. Fracasso and M. L. Ali
functionalized by the selected machine learning tool within the four parameters of the
nitrogenous bases, before finally being cross validated. The latter operates in loosely
the same manner, but because it lacks a training sequence, it relies on statistical
analysis instead, developing definitions independently over multiple iterations [1].
Thus the general procedure for both techniques can be defined as information
gathering, transformation, analytical cross checking, and
Pattern recognition is the principal mode of analysis for DNA sequences, providing
different information depending on the perspective within which it is developed [18,
19, 20]. Busia et al. [2] propose a convolutional neural network for the alignment of
16S RNA, a DNA analogue, with a standard sequence in order to determine its species
of origin. Results were then compared to probabilistic Bayesian standards. Ultimately
this method was found to be more accurate than widely accepted standards such as
BLAST and BWA, within 2% of perfect memorization.
Yuen et al. [3] contend that conventional methods for distinguishing nucleotides in
DNA sequencing are too nebulous and error prone, and rectify this by using empirical
electronic density of states (DOS) data. Features are extracted from the data via
principal component analysis, yielding vectorized principal components. These
principal components are then used with the membership formula from the fuzzy c-means
clustering algorithm to determine probabilistically which nitrogenous base is present.
Further verification is then obtained by applying a hidden Markov model and the
Viterbi algorithm to distinguish nitrogenous bases from noise.
This technique classified unlabeled DOS data with 91% accuracy.
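The fuzzy c-means membership formula mentioned above assigns a point a degree of membership in every cluster, $u_j = 1 / \sum_k (d_j / d_k)^{2/(m-1)}$, where $d_j$ is the distance from the point to center $j$ and $m > 1$ is the fuzzifier. A minimal sketch of that standard formula follows; treating each cluster center as one nitrogenous base in principal component space, and the function name itself, are assumptions for illustration rather than details taken from the study.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Return the fuzzy c-means membership of `point` in each cluster.

    point   -- coordinates of one sample (e.g. its principal components)
    centers -- list of cluster centers with the same dimensionality
    m       -- fuzzifier, must be greater than 1
    """
    if m <= 1.0:
        raise ValueError("fuzzifier m must be greater than 1")
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on one or more centers belongs fully to them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((dj / dk) ** power for dk in dists) for dj in dists]
```

The memberships always sum to one, so they can be read as the probabilistic base call described in the text: the base whose center receives the largest membership is the most plausible one.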
Methylation sites in DNA can greatly influence the expression of the gene wherein
they occur. To predict the location of these methylation sites, Bhasin et al. [4] proposed
the use of a support vector machine using vectorized human DNA fragments as input to
generate predictions regarding human methylation sites. Methylated and unmethylated
CpG dinucleotide sequences were obtained from the MethDB database. These sequences
were broken and aligned into fragments of uniform length and were used to train the
support vector machine. Five-fold cross validation was then used to evaluate the
performance of this machine learning tool in completing this task, determining that this
method was more effective than artificial neural networks, Bayesian
statistics and decision trees.
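Five-fold cross validation, as used above, partitions the data into five folds, trains on four and tests on the remaining one, and averages the five scores. The sketch below shows only that generic procedure; the `fit` and `predict` callables are hypothetical placeholders for whichever classifier is being evaluated, and in practice the samples would normally be shuffled or stratified before splitting.

```python
def k_fold_indices(n_samples, k=5):
    """Return k (train_indices, test_indices) pairs covering all samples."""
    if not 2 <= k <= n_samples:
        raise ValueError("need 2 <= k <= n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


def cross_validate(samples, labels, fit, predict, k=5):
    """Return the mean test accuracy over k folds.

    fit(train_samples, train_labels) -> model
    predict(model, sample) -> predicted label
    """
    accuracies = []
    for train, test in k_fold_indices(len(samples), k):
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, samples[i]) == labels[i] for i in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / k
```

Every sample is used for testing exactly once, which is why the averaged score is a fairer basis for comparing classifiers than a single train/test split.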
A knowledge based neural network draws from standardized data to develop
inference rules much like the other forms of machine learning previously mentioned.
Noordewier et al. [5] applied this tool to predict and identify the bordering sites
between introns and exons where splicing occurs, and whether that splicing site is an
acceptor, donor, or neither. As such the neural network was provided with two
information sets, one which labeled the center of a certain DNA fragment, and another
which determines which of the three aforementioned categories the central piece
corresponded to. After being provided with 3,190 examples, the network was able to
generate weight based inference rules to address this splice junction problem. This
knowledge based neural network approach was seen to outperform other neural network
approaches trained on similar numbers of examples.
Down and Hubbard [6] presented a proprietary relevance vector machine, Eponine, a
machine learning tool which determines the most applicable of a given set of basis
formulas in a pruning process of comparing them to the assay data. However, to
compensate for the size of DNA sequences relative
to other data types, the algorithm was modified slightly and provided with randomly
selected weight matrix data. The fundamental sampling transformations utilized were
then to adjust the center of a distribution, adjust its width, adjust its weight in a DNA
matrix, randomly construct a new DNA probability distribution and add it randomly to
one end of the existing DNA matrix, and finally remove another column from the
matrix. Promoters were most often identified as containing repeating TATA motifs at
their center flanked by CG rich sites. Specificity was ultimately found to be greater than
70% for the determination of mammalian promoters.
Bzhalava et al. [7] have utilized next generation sequencing with the Illumina
platform to generate metagenomic data that would be parsed to determine if they
contained viral or other microorganism genomes. Analyte data underwent rigorous
editing through a number of systems to ensure relevancy before finally being analyzed
using two algorithms, BLASTn and HMMER3. BLASTn operated on an NCBI
nucleotide database to align and close gaps between sequences, before finally
taxonomically classifying them. Unknown sequences were then analyzed with HMMER3,
which is based on hidden Markov models (HMMs), to detect viral DNA sequences. Data
was then aggregated into a series of 1,000 random forests to determine the usefulness
of coding portions of each sequence in identifying viral codons. Finally, these deter-
minations were validated by an artificial neural network that created by-weight con-
nections between identified sequences. Comparative accuracy was indeterminate [7].
With an indeterminate comparative accuracy, however, it is likely that this method
should be developed further before consideration as a standardized procedure.
The evolution of a genome at a fundamental level relies on recombination events
such as those in meiosis. Liu et al. [8] proposed a pseudo nucleotide
composition of vectorized dinucleotide sequences to prevent data loss in the trans-
formation from sequence to vector, constraining information according to the param-
eters of neighboring nucleic acid residues, rank or tier of the sequence, and the weight
factor of the sequence. Sequences were analyzed using a modified support vector
machine implementing the RBF kernel and the regularization parameter and kernel
width parameter to determine and predict sites of recombination. Optimizations were
performed by 5-fold cross validation, which demonstrated an efficacy 3.5–17.7%
greater than other predictors.
In Schietgat et al. [9], a proprietary system of
random forests known as TE-Learner is proposed to identify transposable elements
within a given order, classifying them at the superfamily level. The algorithm first
selects sequences from the well documented Drosophila melanogaster and Arabidopsis
thaliana species based on parameters provided by the characteristics of the given order.
Following this, the sequences are annotated and these annotated sequences are used as
input for the aforementioned random forest algorithm which then determines if the
sequence belongs to the specified order and accordingly assigns its superfamily. The
random forest used operates with a first-order logic format to determine the probability
that a given annotated sequence corresponds to the designated superfamily. It was
determined that this technique yielded both better performance and runtime than
comparable prediction techniques.
Zien et al. [10] propose three modified support vector machine kernels for the
determination of translation start sites in unlabeled DNA sequences. The machine was
trained utilizing sequences obtained from GenBank as refined through the removal of
introns and other purifications in a previous experiment. The first of the three kernel
algorithms used was a sparse bit encoding scheme which gave each nucleotide a unique
bit identity that would then be correlatively parsed by the SVM. The following kernel
then allowed for similar functions, though with accommodations for local comparisons.
The last of the three kernels utilized was a linear function that transformed the data into
coordinate based information. Collectively these kernels were found to improve
recognition by 26%.
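To make the sparse bit encoding idea concrete, the following is a minimal sketch rather than the kernel of [10]. It assumes a four-letter alphabet in the order A, C, G, T, and the function names are hypothetical. The dot product of two encodings counts the positions at which two equal-length sequences carry the same base, which is the kind of comparison a linear kernel over such an encoding performs.

```python
# Hypothetical sparse bit (one-hot) encoding; the alphabet and its order are assumptions.
ALPHABET = "ACGT"


def one_hot(sequence):
    """Return a flat list of 0/1 bits, four bits per nucleotide."""
    bits = []
    for base in sequence.upper():
        if base not in ALPHABET:
            raise ValueError(f"unexpected symbol: {base!r}")
        bits.extend(1 if base == letter else 0 for letter in ALPHABET)
    return bits


def shared_positions(seq_a, seq_b):
    """Dot product of two encodings: the number of positions with the same base."""
    return sum(a * b for a, b in zip(one_hot(seq_a), one_hot(seq_b)))
```

For example, `shared_positions("ATGC", "ATGG")` compares two four-base fragments position by position.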
Liu et al. [11] posit that established methods of vector based analysis for DNA
sequencing neglect important order reliant information. Sequence order retaining
analysis has been effectively demonstrated for amino acid arrangements, but no such
tool existed for DNA sequences, thus prompting further inquiry. Based on the genome
analysis web server for pseudo k-tuple nucleotide composition, the proposed Python
package presents 15 built-in physicochemical features, as well as user generated
features, to contextually analyze input sequences. In most instances, the machine
utilizes the feature vectors of k-mers. Final outputs were compared in the testing
phase to known values, and acceptable ranges of accuracy were determined to have been
met.
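As an illustration of what a k-mer feature vector looks like, the sketch below computes plain normalized k-mer frequencies. It is not the repDNA API and it omits the physicochemical terms of the pseudo k-tuple composition. The function name and the default k = 2 are assumptions made for the example.

```python
from itertools import product


def kmer_vector(sequence, k=2):
    """Normalized k-mer frequency vector, in lexicographic k-mer order (AA, AC, ...)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(sequence) - k + 1):
        word = sequence[i:i + k]
        if word in counts:  # windows containing other symbols are skipped
            counts[word] += 1
    total = sum(counts.values())
    return [counts[w] / total if total else 0.0 for w in kmers]
```

A sequence of any length is thus mapped to a fixed-length vector (16 entries for k = 2), which is what allows sequences of different lengths to be fed to the same classifier.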
Similar to Bhasin et al. [4], the principal concern of Angermueller et al. [12] is the
prediction of methylation sites for the transcriptional changes they confer. This inquiry
is distinguished then by the machine learning tool implemented and the input data
utilized for training that tool. Input sequences of mouse embryonic stem cells as well as
human and mouse cells were profiled utilizing genome-wide bisulfite sequencing and
reduced representation bisulfite sequencing, respectively. In place of an annotated
training set, the proposed
predictor, DeepCpG, instead both analyzes and refines models from unaltered sequence
information by utilizing a deep learning neural network which operates within local
sequence windows and neighboring sites of methylation. DeepCpG is capable of
multiple assays, and features a CpG module, a DNA module, and a Joint module. The
CpG module relies specifically on a gated recurrent network using vectors representing
local methylation sites and their relative distances. The DNA module utilizes a con-
volutional neural network with positional weighted vector matrices as inputs in order to
determine sequence motifs. The Joint module then concatenates the hidden vectors of
the DNA and CpG modules and their interactions. The predictions of DeepCpG were
found to be more accurate than established alternatives such as RF Zhang with every
analyte and implemented module, with the greatest difference being 3.75% [12].
Applications of Machine Learning Tools in Genomics: A Review 335
Vidaki et al. [13] are also concerned with a deterministic approach for the methylation
status of DNA, and similarly leverage a unique methodology for creating and refining
this approach, with the intention of evaluating methylation patterns for forensic science
and determining the age of a blood donor or suspect. Input data consisted initially of CpG
methylation profiles organized by general age groups which were then averaged and
normalized within those groups. Researchers selected 45 age associated CpG sites from
a 353 marker pool as given by previous inquiry, using these sites as inputs for the
training and subsequent testing of multiple neural network architectures. A total of one
hundred million generalized regression neural network architectures were tested by a
network design tool with 50–70% of the potential 1156 cases (blood samples) being
equivalently designated as verification and blind tests. The correlation and output error
rates from the training, verification, and blind test datasets were then used to select the
fifty most suitable architectures from each stage. The best performing architectures
from this stage were then compared with fixed blind test cases but with unique training
and verification datasets as per the previous stage in a method of advanced random
sampling. The ten best of these architectures were then selected and subjected to
another stage of testing with fixed training, verification and blind test data subsets, with
the ultimate model being selected from this group. This final model was trained on a
60:20:20 training, verification, and blind test data set ratio, with predicted ages varying
from true ages by an average absolute error of 3.8 ± 3.3 years, for a 2.93% increase in
accuracy over other linear regression based models [13].
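The 60:20:20 partition and the average absolute error used to score age predictions can be sketched as follows. This is an illustration under assumptions, not the procedure of [13]: the shuffling, the fixed seed, and both function names are hypothetical.

```python
import random


def split_60_20_20(samples, seed=0):
    """Shuffle and split into training, verification, and blind test subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 6 // 10
    n_verify = len(shuffled) * 2 // 10
    train = shuffled[:n_train]
    verify = shuffled[n_train:n_train + n_verify]
    blind = shuffled[n_train + n_verify:]  # remainder goes to the blind test set
    return train, verify, blind


def mean_absolute_error(true_ages, predicted_ages):
    """Average of |true - predicted| over all samples, in the unit of the inputs."""
    errors = [abs(t - p) for t, p in zip(true_ages, predicted_ages)]
    return sum(errors) / len(errors)
```

The blind test subset is never used for training or architecture selection, so the error measured on it estimates performance on unseen donors.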
Guo et al. [14] were concerned with the development of a broadly applicable tool
for the identification of nucleosome positioning within chromosomes in order to more
efficiently predict and understand their regulatory effects and general interaction with
major cellular processes. Complete genomic sequences annotated with the position of
nucleosomes for humans, C. elegans, and D. melanogaster were collected and scored
based on the propensity of DNA fragments within those genomes to form nucleosomes.
The sequences were then transformed by the pseudo k-tuple nucleotide composition
formula to correlate the sequences into by-weight tiers. A support vector machine with
the radial basis function as the kernel was then used to analyze the transformed data.
As determined by jackknife cross validation, predictions were found to be 79%–86.9%
accurate across the assayed species [14].
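Jackknife cross validation holds out each sample in turn, trains on the rest, and checks the prediction for the held-out sample. The sketch below shows the procedure only. A nearest-centroid classifier stands in for the support vector machine of [14], and all names are hypothetical.

```python
def nearest_centroid_predict(train, query):
    """train: list of (feature_vector, label). Returns the label of the nearest class mean."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1

    def squared_distance(label):
        centroid = [s / counts[label] for s in sums[label]]
        return sum((q - c) ** 2 for q, c in zip(query, centroid))

    return min(sums, key=squared_distance)


def jackknife_accuracy(samples, predict=nearest_centroid_predict):
    """Leave each sample out once, train on the rest, and score the held-out prediction."""
    hits = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        hits += predict(rest, features) == label
    return hits / len(samples)
```

Because every sample is tested exactly once and the result does not depend on a random partition, the jackknife yields a single reproducible accuracy for a given dataset.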
As opposed to predicting the location of known functional sequence fragments, the
inquiry of Quang and Xie [15] was instead occupied with determining the function of a
given sequence. The authors highlight the fact that over 98% of the human genome
is non-coding and thus functionally poorly understood. As such, it was determined that
the ideal machine learning tool to implement would be a proprietary combination of a
convolutional neural network and a bi-directional long short-term memory network,
referred to as DanQ (utilizing the same functional SNP priority framework as
DeepSEA [16]). Input data were obtained from DeepSEA and separated into
non-overlapping 200 bp bins; specific samples then consisted of 1000 bp segments
centered on a 200 bp bin that aligned with a transcription factor binding peak as
determined by chromatin immunoprecipitation. The data are then matricized, with each
column corresponding to a nitrogenous base. In addition, distinct training,
verification, and blind test datasets were obtained from DeepSEA, including reverse
complements. Evaluation strategies also resembled those of DeepSEA, with predicted
probabilities being calculated as the average of reverse and forward probabilities. The
RMSprop algorithm was then used to train two models, one of which utilized the JASPAR
motif database, which were then compared to a logistic regression model as a benchmark.
It was then found that DanQ held a 50% improvement rate over other models in
determining DNA sequence function.
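Two of the data handling steps described above, cutting 1000 bp windows centered on 200 bp bins and averaging forward and reverse-complement predictions, can be sketched in a few lines. This is an illustration, not code from DanQ or DeepSEA. The `predict` argument is a hypothetical stand-in for a trained model that maps a sequence to a probability.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Complement every base, then reverse the strand."""
    return sequence.translate(COMPLEMENT)[::-1]


def strand_averaged_probability(sequence, predict):
    """Average the model output for a sequence and its reverse complement."""
    forward = predict(sequence)
    reverse = predict(reverse_complement(sequence))
    return (forward + reverse) / 2.0


def windows_around_bins(sequence, bin_size=200, window=1000):
    """Yield (bin_start, window_sequence) for each bin whose centered window fits."""
    flank = (window - bin_size) // 2
    for start in range(0, len(sequence) - bin_size + 1, bin_size):
        lo, hi = start - flank, start + bin_size + flank
        if lo >= 0 and hi <= len(sequence):
            yield start, sequence[lo:hi]
```

Averaging over both strands makes the prediction independent of which strand of the double helix happened to be sequenced.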
Lee, Karchin and Beer [17] recognized the difficulties in accuracy and timeliness
presented by analyzing complete mammalian genomes and thus proposed a machine
learning tool to more efficiently provide predictions for these large datasets. Input data
were analyzed by chromatin immunoprecipitation and divided into positive and negative
examples to formulate a training set for a support vector machine. Variably sized
k-mers were used as sequence features for the machine, and the weights of each sequence
fragment were optimized to maximally distinguish between positive and negative
classes. Using five-fold cross validation, it was ultimately determined that the proposed
machine was capable of accurate prediction regardless of tissue type, k-mer length, or
kernel used with regard to the detection of EP300, and it was even found to be trainable
to detect novel enhancers, including those of humans, though it was in these cases more
tissue dependent [17]. The accuracy appears comparable to other techniques, with the
only issues being the number of transformations and the amount of training necessary
for the accurate prediction of enhancers.
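Five-fold cross validation, as used in several of the reviewed studies, partitions the samples into five folds and tests on each fold once while training on the other four. The index generator below is a minimal sketch with a hypothetical name. It uses contiguous folds, so in practice the samples would be shuffled first.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for k contiguous, near-equal folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size
```

The reported accuracy is then typically the mean of the five per-fold accuracies.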
Fig. 1. A graphical representation of the generalized steps for the creation and validation of a
machine learning tool for bioinformatics, with repeatable steps indicated by hashed arrows.
4 Discussion
Overcoming the tedium of manual DNA sequence analysis is made simplest by the
implementation of machine learning tools that can stochastically interpret large
datasets. These tools, with the recognition of standardized data or the mechanistic
determination of their own standards, can be made to label specific sections of DNA and
create associations with those sections, the general process of which is displayed in
Fig. 1. The intention of this article was then to demonstrate which specific machine
learning tool is the most effective for the broadest range of tasks in sequence
interpretation with a holistic review of primary literature, the general results of which
are displayed in Table 1.
Table 1. Aggregation of methods and assays for reviewed articles in machine learning as
applied to genomics.
Reference                    Assay                          Machine learning tools
Busia et al. [2]             Species of origin              Neural network (NN)
Yuen et al. [3]              Density of states              Fuzzy c-means clustering, hidden Markov model (HMM)
Bhasin et al. [4]            CpG methylation sites          Support vector machine (SVM)
Noordewier et al. [5]        Splicing sites                 Neural network
Down & Hubbard [6]           Promoter sites                 Relevance vector machine
Bzhalava et al. [7]          Viral genome                   HMM, random forests, NN
Liu et al. [8]               Sites of recombination         Support vector machine
Schietgat et al. [9]         Transposable elements          Random forest
Zien et al. [10]             Translation start sites        Support vector machine
Liu et al. [11]              DNA attributes                 Pseudo k-tuple
Angermueller et al. [12]     CpG methylation sites          Neural network
Vidaki et al. [13]           CpG methylation sites          Neural network
Guo et al. [14]              Sites of nucleosome formation  Pseudo k-tuple, SVM
Quang and Xie [15]           Functional noncoding DNA       Neural network
Zhou and Troyanskaya [16]    Functional noncoding DNA       Neural network
Lee, Karchin and Beer [17]   Enhancer sites                 Support vector machine
Ultimately it was found that while each study suggested a greater efficacy than
existing alternatives, neural networks and support vector machines were the most
commonly implemented techniques, as shown in Table 2. This demonstrates the interest in
neural networks that exists in the broader bioinformatics community: they hold the
largest share of the reviewed tools, compared with the far smaller shares of most
competing tools, with a difference of ten percentage points between neural networks and
the second most common tool, support vector machines. These tools were universally
modified to best analyze DNA sequences, with specific kernels for labeling larger data
sets or accepting different information and parameters (such as the aforementioned
nucleotide bases, length of a sequence, or AT/GC content of an area), and made to
produce labels and predictions for specific active sites in DNA molecules. This modality
is suggestive of a dominant strategy in DNA sequence analysis that can inform the
development of future machine learning tools, possibly through combining support vector
machines and neural networks with verification from Bayesian algorithms.
Table 2. Frequency distribution of each machine learning tool used in reviewed articles.
Machine learning tools       Frequency
Neural network               35%
Support vector machine       25%
Relevance vector machine     5%
Random forest                10%
Pseudo k-tuple               10%
Hidden Markov model          10%
Fuzzy c-means clustering     5%
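The percentages in Table 2 can be reproduced by tallying the tool entries of Table 1, as the sketch below shows. The short labels are our own normalization of the table entries. Note that the shares are taken over tool mentions (20 in total across the 16 articles), since some articles use more than one tool.

```python
from collections import Counter

# Tool lists transcribed from Table 1, keyed by reference number.
TABLE_1 = {
    2: ["NN"],
    3: ["Fuzzy c-means", "HMM"],
    4: ["SVM"],
    5: ["NN"],
    6: ["RVM"],
    7: ["HMM", "Random forest", "NN"],
    8: ["SVM"],
    9: ["Random forest"],
    10: ["SVM"],
    11: ["Pseudo k-tuple"],
    12: ["NN"],
    13: ["NN"],
    14: ["Pseudo k-tuple", "SVM"],
    15: ["NN"],
    16: ["NN"],
    17: ["SVM"],
}


def tool_frequencies(table):
    """Percentage share of each tool among all tool mentions in the table."""
    counts = Counter(tool for tools in table.values() for tool in tools)
    total = sum(counts.values())
    return {tool: 100 * n / total for tool, n in counts.items()}


FREQUENCIES = tool_frequencies(TABLE_1)
```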
The medical relevance of these models cannot be overstated, with the great majority
utilizing standard embryonic mouse or human genomic datasets, common assays across all
biological fields. It is likely that with further investment and development, these
machine learning methodologies could facilitate unprecedented advancements in
bioinformatics and beyond, just as contemporary strategies have within recent years.
Yet for the sake of learners and applications beyond medical sequencing, it is essential
for the ideal standard approach to offer universal applicability with accessible
datasets and comprehensible frameworks, which may then be developed further for
specialized inquiry. Thus, as demonstrated by the examined approaches, an ideal
standardized approach should offer a well-founded and simply written codebase with
modular capabilities that can accept a variety of unaltered sequence data. These data
should then undergo rigorous transformation, likely by bin separation, weight mapping
and vectorization, before utilization in a support vector machine or multi-layered
neural network framework. In either instance, frameworks should be generationally
compared to produce a single ideal system, thus creating the most accurate, applicable
and accessible tool possible.
5 Conclusion
1. Libbrecht, M.W., Noble, W.S.: Machine learning applications in genetics and genomics.
Nat. Rev. Genet. 16, 321 (2015)
2. Busia, A., et al.: A deep learning approach to pattern recognition for short DNA sequences.
BioRxiv 353474 (2018)
3. Yuen, H., et al.: DNA Sequencing via Quantum Mechanics and Machine Learning. CoRR,
abs/1012.0900 (2010)
4. Bhasin, M., Zhang, H., Reinherz, E.L., Reche, P.A.: Prediction of methylated CpGs in DNA
sequences using a support vector machine. FEBS Lett. 579(20), 4302–4308 (2005)
5. Noordewier, M.O., Towell, G.G., Shavlik, J.W.: Training knowledge-based neural networks
to recognize genes in DNA sequences. In: Advances in Neural Information Processing
Systems, pp. 530–536 (1991)
6. Down, T.A., Hubbard, T.J.P.: Computational detection and location of transcription start
sites in mammalian genomic DNA. Genome Res. 12(3), 458–461 (2002)
7. Bzhalava, Z., Tampuu, A., Bała, P., Vicente, R., Dillner, J.: Machine Learning for detection
of viral sequences in human metagenomic datasets. BMC Bioinf. 19(1), 336 (2018)
8. Liu, B., Wang, S., Long, R., Chou, K.-C.: iRSpot-EL: identify recombination spots with an
ensemble learning approach. Bioinformatics 33(1), 35–41 (2016)
9. Schietgat, L., et al.: A machine learning based framework to identify and classify long
terminal repeat retrotransposons. PLoS Comput. Biol. 14(4), e1006097–e1006097 (2018)
10. Zien, A., Rätsch, G., Mika, S., Schölkopf, B., Lengauer, T., Müller, K.-R.: Engineering
support vector machine kernels that recognize translation initiation sites. Bioinformatics 16
(9), 799–807 (2000)
11. Liu, B., Liu, F., Fang, L., Wang, X., Chou, K.C.: repDNA: a Python package to generate
various modes of feature vectors for DNA sequences by incorporating user-defined
physicochemical properties and sequence-order effects. Bioinformatics 31(8), 1307–1309
(2015)
12. Angermueller, C., Lee, H.J., Reik, W., Stegle, O.: DeepCpG: accurate prediction of single-
cell DNA methylation states using deep learning. Genome Biol. 18(1), 67 (2017)
13. Vidaki, A., Ballard, D., Aliferi, A., Miller, T.H., Barron, L.P., Court, D.S.: DNA
methylation-based forensic age prediction using artificial neural networks and next
generation sequencing. Forensic Sci. Int.: Genet. 28, 225–236 (2017)
14. Guo, S.H., et al.: iNuc-PseKNC: a sequence-based predictor for predicting nucleosome
positioning in genomes with pseudo k-tuple nucleotide composition. Bioinformatics 30(11),
1522–1529 (2014)
15. Quang, D., Xie, X.: DanQ: a hybrid convolutional and recurrent deep neural network for
quantifying the function of DNA sequences. Nucleic Acids Res. 44(11), e107–e107 (2016)
16. Zhou, J., Troyanskaya, O.G.: Predicting effects of noncoding variants with deep learning–
based sequence model. Nat. Methods 12(10), 931 (2015)
17. Lee, D., Karchin, R., Beer, M.A.: Discriminative prediction of mammalian enhancers from
DNA sequence. Genome Res. 21(12), 2167–2180 (2011)
18. Ali, M.L., Monaco, J.V., Tappert, C.C., Qiu, M.: Keystroke biometric systems for user
authentication. J. Signal Process. Syst. 86(2–3), 175–190 (2017)
19. Gai, K., Qiu, M.: Reinforcement learning-based content-centric services in mobile sensing.
IEEE Netw. 32(4), 34–39 (2018)
Hierarchical Graph Neural Networks
for Personalized Recommendations
with User-Session Context
1 Introduction
With a dramatic increase in the Web information environment, information overload has
received wide social attention. In most online services, recommender modules have
become essential components that help users relieve the pressure of information
overload and obtain interesting information from huge amounts of data. Recommender
systems can also be used to increase personal comfort by supplying items or products in
an order informed by historical activities or by synergetic relationships among items or
users. For example, recommender systems could automatically list items of interest, or
suggest new discounts that are relevant to each other. Many recommender systems
work on the assumption that user identification, item label and past activities are
2 Related Work
various abovementioned tasks. In the work of [5], RNNs are first applied to session-
based recommendations, which provide recommendations for new sessions by learning
a one-hot representation from the clicked item-IDs in the current session. After that,
several valuable extensions of RNN-based methods were proposed, such as proper data
augmentation techniques [6], the fusion of sequential patterns and co-occurrence
signals [7], and attention mechanisms [8].
Graph Neural Networks. Neural networks have been exploited for capturing
representations of graph-structured data, such as social networks and knowledge bases.
In the work of [13, 14], GNNs are proposed to operate on directed graphs by using a form
of RNNs. GNNs have been applied to various tasks, such as script event prediction [15],
situation recognition [16], and image classification [17]. Recently, SR-GNNs [9] have
been employed efficiently for session-based recommendations. Different from [9], we
consider both session-based recommenders and session-aware recommenders by
applying a Hierarchical GNNs model.
In this section, we present the idea of our proposed Hierarchical GNN model,
which consists of two layers of GNNs for personalized recommendation with user-
context. Our model is based on the SR-GNN model presented in [9], which aims to predict
which item a user will click next based on current sequential session data. Different
from SR-GNN, our work takes the evolution of the user interests over time into
account, as shown in Fig. 1. We describe our model in two parts, i.e. the architecture of
the hierarchical GNNs and the learning process of the hyperparameters.
3.1 Architecture
As shown in Fig. 1, our HGNN model uses a user-level GNN (U-GNN) to model the
user activity across sessions, and a session-level GNN (S-GNN) to model the global
preference of common items from all sessions. Different from SR-GNN, at each time
step, S-GNN and U-GNN make a joint decision when computing recommendations.
Compared with SR-GNN, it is easier for our HGNN model to learn the user’s preference
in the selection of item sequences. In [9], the authors claimed that sessions could be
represented directly by the nodes involved in a session, and they then focused on
developing a method to learn long-term preference and current interests on items from
sessions. In their setting, no information on user activities across sessions is
available. In contrast, we take this factor into account, as user-based activities can
reveal a lot of information. For example, it is not hard to associate animations for
toddlers with children, rather than a video of hot-pot making. However, baby A may
prefer Spider-Man to Superman, while baby B prefers the opposite. Therefore, if videos
of Spider-Man and Superman are both available, the recommendation ranking of the
videos should be different for the two babies.
In session-based recommendation, $V = \{v_1, v_2, \ldots, v_m\}$ stands for the set
consisting of all unique items involved in all the sessions. The clicked sequence $S_u$ is ordered by
344 X. Shen et al.
timestamps from a session of the user U. In order to predict the next click, top-K values
will be the basis of the candidate selection for recommendation. As in [9], we embed
every item into a unified embedding space and the latent vector of each item is learned
via GNNs. Based on the node vectors, each session can be represented by two
embedding vectors, which are composed of node vectors used in U-GNNs and S-GNNs.
3.2 Learning
In our S-GNNs, we take the same learning method for item embedding, and the update
functions for U-GNNs are obtained in a similar way. What differs is how the long-term
preference and current interests of the input session are obtained after feeding all
session graphs into the GNNs. In the workflow of our model, U-GNNs and S-GNNs each
generate two independent embedding vectors, one local and one global. The hybrid
embedding vector is then computed by taking a linear transformation over the
concatenation of the four embedding vectors, i.e. the local and global embedding
vectors obtained from U-GNNs and S-GNNs.
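The hybrid embedding step can be illustrated with a small sketch. It assumes that all four embedding vectors have the same dimension d and that the linear transformation is a d × 4d weight matrix with an optional bias. The function name and argument names are hypothetical, and in the actual model the weights would be learned rather than supplied by hand.

```python
def hybrid_embedding(u_local, u_global, s_local, s_global, weight, bias=None):
    """Apply a linear map to the concatenation of four d-dimensional vectors.

    weight is a list of d rows with 4*d entries each (an assumed shape).
    """
    concat = list(u_local) + list(u_global) + list(s_local) + list(s_global)
    output = []
    for r, row in enumerate(weight):
        if len(row) != len(concat):
            raise ValueError("each weight row must have 4*d entries")
        value = sum(w * x for w, x in zip(row, concat))
        output.append(value + (bias[r] if bias is not None else 0.0))
    return output
```

With d = 2, for instance, the weight matrix has two rows of eight entries, and the output is again a two-dimensional session representation that can be scored against item embeddings.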
In our framework, when the user is unknown our model degenerates to the SR-
GNN model, as an anonymous user makes U-GNNs the same as S-GNNs. We will
test this case for our model. We applied the Back-Propagation Through Time (BPTT)
algorithm to train our model.
4 Experiments
In this section we describe our experimental setup and provide an in-depth discussion
of the achieved results.
4.1 Datasets
As in [9], we evaluate the proposed method on two representative public
datasets, Yoochoose [18] and Diginetica [19]. The Yoochoose and Diginetica
datasets are obtained from the RecSys Challenge 2015 and CIKM Cup 2016
respectively. The first dataset consists of a six-month stream of user clicks on an
e-commerce website, and for the second one only transactional data is used.
As shown in Table 1, the most recent fraction (1/128) of the Yoochoose training
sequences is employed. In order to compare fairly with other methods, we drop
sessions of length 1 and items that occur fewer than 5 times. A data-partition policy
then generates training and test sets from the input sequence: the sessions of
subsequent days and months (i.e. 30 days) are held out as test sets for Yoochoose and
Diginetica respectively. Furthermore, to evaluate the performance of our approach, we
filter out all anonymous users on the Diginetica dataset.
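The filtering rule described above (keep items with at least 5 occurrences and sessions longer than 1) can be sketched as follows. The order of the two steps and the function name are assumptions made for illustration. Sessions are represented simply as lists of item IDs.

```python
from collections import Counter


def filter_sessions(sessions, min_item_count=5, min_session_length=2):
    """Drop rare items, then drop sessions that become shorter than the minimum length."""
    item_counts = Counter(item for session in sessions for item in session)
    kept = []
    for session in sessions:
        pruned = [item for item in session if item_counts[item] >= min_item_count]
        if len(pruned) >= min_session_length:
            kept.append(pruned)
    return kept
```

Removing rare items can shorten a session below the length threshold, which is why the length check is applied after pruning.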
Rank), as the average of reciprocal ranks of correctly recommended items, takes the
order of ranking into account. The number 20 means that the reciprocal rank is set to 0
once the rank exceeds 20. The larger the MRR value, the higher the ranking of correct
recommendations in the top list.
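Both metrics can be computed from ranked recommendation lists as in the sketch below. The MRR@K part follows the description above. The P@K part assumes the definition commonly used for session-based recommendation, namely the proportion of test cases whose correct item appears among the top K, which is an assumption because the definition is not preserved in the text here. The function name is hypothetical.

```python
def precision_and_mrr_at_k(ranked_lists, targets, k=20):
    """Return (P@k, MRR@k) over paired ranked lists and ground-truth items."""
    hits = 0
    reciprocal_sum = 0.0
    for ranked, target in zip(ranked_lists, targets):
        top_k = ranked[:k]
        if target in top_k:
            hits += 1
            # Ranks are 1-based; items ranked beyond k contribute 0.
            reciprocal_sum += 1.0 / (top_k.index(target) + 1)
    n = len(targets)
    return hits / n, reciprocal_sum / n
```

A correct item at rank 1 contributes 1 to the reciprocal sum and one at rank 2 contributes 0.5, so MRR rewards placing the correct item near the top while P@K only asks whether it appears at all.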
To evaluate the performance of our proposed method, five representative baselines
are included in the comparison: POP, User-KNN, GRU4REC [5], NARM
[20] and SR-GNN [9]. POP and S-POP recommend the top-N frequent items in the
training set and in the current session respectively. GRU4REC employs RNNs to
provide session-based recommendations by modeling the session sequences. NARM
also applies RNNs to make recommendations; however, it takes an attention mechanism
into account for capturing the user’s sequential behavior and main purpose. SR-GNN
exploits GNNs to model session sequences seamlessly, with the ability to compute
complex transitions of items.
Following the previous method [9], the dimensionality of latent vectors $d = 100$ is
set for both datasets. We initialize all parameters with a Gaussian distribution
with a mean of 0 and a standard deviation of 0.1. In order to optimize these parameters,
a mini-batch Adam optimizer is applied with an initial learning rate of 0.001. We
set the L2 penalty and the batch size to $10^{-5}$ and 100 respectively.
From the results listed in Table 2, it is clear that our proposed approach
outperforms the other five methods in terms of P@20 and MRR@20 over three
datasets. In the architecture of our proposed model, we not only consider the impact of
items of common sessions, but also take the user’s preference into account. All these
session sequences are integrated seamlessly as graph-structured data. The results also
show that neural network-based approaches outperform the traditional methods by
adopting deep learning mechanisms. Furthermore, SR-GNN stands out from GRU4REC and
NARM because it captures more complex and implicit connections between user
clicks. Going further, our proposed model takes user session information into
consideration as well as global session information.
In this paper, we proposed a model based on Hierarchical GNNs to address the problem
of personalized recommendations with user-session context, which exploits two-layer
GNNs by involving both session-based recommenders and session-aware
recommenders. This method is able to capture temporal information by analyzing the user
activity over sessions and the present session status, which means that our proposed
method is able to acquire the knowledge hidden in the long-term dynamics of
user sessions. Our proposed Hierarchical GNNs model outperforms state-of-the-art
session-based methods for personalized recommendation in experiments on
In future work, we plan to refine user representations by exploiting features of items
and users.
Acknowledgement. The authors gratefully acknowledge the anonymous reviewers for their
helpful suggestions.
1. Salakhutdinov, R., Mnih, A., Hinton, G.: Restricted boltzmann machines for collaborative
filtering. In: Proceedings of the 24th International Conference on Machine Learning, ICML
2. Koren, Y.: Factorization meets the neighborhood: a multifaceted collaborative filtering
model. In: Proceedings of the 14th ACM International Conference on Knowledge Discovery
and Data Mining (2008)
3. Koren, Y., Bell, R.M., Volinsky, C.: Matrix factorization techniques for recommender
systems. IEEE Comput. 42(8), 30–37 (2009)
4. Koenigstein, N., Koren, Y.: Towards scalable and accurate item-oriented recommendations.
In: Proceedings of the 7th ACM Conference on Recommender Systems (2013)
5. Hidasi, B., Karatzoglou, A., Baltrunas, L., et al.: Session-based recommendations with
recurrent neural networks. In: ICLR (2016)
6. Tan, Y.K., Xu, X., Liu, Y.: Improved recurrent neural networks for session-based
recommendations. In: Proceedings of the 1st Workshop on Deep Learning for Recommender
Systems (2016)
7. Jannach, D., Ludewig, M.: When recurrent neural networks meet the neighborhood for
session-based recommendation. In: Proceedings of the 11th ACM Conference on
Recommender Systems (2017)
8. Li, J., Ren, P., Chen, Z., et al.: Neural attentive session-based recommendation. In:
Proceedings of the 2017 ACM Conference on Information and Knowledge Management
9. Wu, S., Tang, Y., Zhu, Y., et al.: Session-based recommendation with graph neural
networks. In: Proceedings of AAAI Conference on Artificial Intelligence (2019)
10. Lipton, Z.C., Berkowitz, J., Elkan, C.: A critical review of recurrent neural networks for
sequence learning. CoRR, 1506.00019 (2015)
11. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780
(1997)
12. Cho, K., van Merrienboer, B., Bahdanau, D., et al.: On the properties of neural machine
translation: encoder-decoder approaches. In: Proceedings of 8th Workshop on Syntax,
Semantics and Structure in Statistical Translation (2014)
13. Gori, M., Monfardini, G., Scarselli, F.: A new model for learning in graph domains. In:
IJCNN, vol. 2, pp. 729–734 (2005)
14. Scarselli, F., Gori, M., Tsoi, A.C., et al.: The graph neural network model. TNN 20(1), 61–
80 (2009)
15. Li, Z., Ding, X., Liu, T.: Constructing narrative event evolutionary graph for script event
prediction. In: Proceedings of International Joint Conferences on Artificial Intelligence
Organization (2018)
16. Li, R., Tapaswi, M., Liao, R., et al.: Situation recognition with graph neural networks. In:
Proceedings of IEEE International Conference on Computer Vision (2017)
17. Marino, K., Salakhutdinov, R., Gupta, A.: The more you know: using knowledge graphs for
image classification. In: Proceedings of IEEE Conference on Computer Vision and Pattern
Recognition (2017)
18. Yoochoose.
19. Diginetica.
20. Li, J., Ren, P., Chen, Z., et al.: Neural attentive session-based recommendation. In: CIKM
Coattention-Based Recurrent Neural Networks
for Sentiment Analysis of Chinese Texts
1 Introduction
In many online customer services, where users are monitored to see whether they enjoy a
gratifying chat, the sentiment behind users’ chat texts can be obtained to reflect
their service satisfaction. The goal of sentiment analysis, or opinion mining, is to
reveal people’s opinions, sentiments, emotions, appraisals, and attitudes towards
entities such as products, services, individuals, issues, topics and their
attributes [1]. Sentiment analysis is not only related to computer science, but also
involves management science and social sciences such as marketing, communications,
and finance, due to its influence over both business and society. The reason
is that attitudes or opinions play dominant roles in our activities and behaviors, and our
beliefs and perceptions of reality, and the choices we make, can be implicitly affected
by others’ worldviews and values. Large amounts of review texts which reflect our
positive or negative feelings are posted about different aspects of the products and
services we receive.
The task of sentiment analysis is commonly framed as a text classification problem
at different levels. There are mainly three levels of sentiment analysis [2], i.e.
sentence-level [3], document-level [4], and aspect-level [5]. The first two levels
aim at revealing the overall sentiment orientation towards the topic of the review or
other entities, whereas the aspect-level task is highly relevant to a characteristic or
property of a specific entity. This paper focuses on sentence-level sentiment analysis,
which is a common task in the field of customer services. In our task, there are three
categories according to the sentiment polarity, i.e. positive, neutral and negative.
There are numerous techniques proposed to tackle various tasks of sentiment
analysis, including supervised and unsupervised approaches [6]. While supervised
methods have been applied to various tasks of sentiment analysis using various
supervised machine learning mechanisms and feature combinations [7, 8], unsupervised
methods generally take advantage of sentiment lexicons, grammatical analysis,
and syntactic patterns [9]. Deep learning techniques have emerged as a powerful
toolkit, which has already found increasingly wide use with state-of-the-art
results in many application domains such as computer vision, speech recognition and
natural language processing. In particular, there is evidence that deep
neural networks, such as CNN (Convolutional Neural Network) [10, 11], LSTM (Long Short
Term Memory) [12–14], and attention networks [15, 16], are effective solutions for
sentiment analysis through automatic feature extraction. Different from most
research on sentiment analysis, in which the subject text is in English, our study is on
Chinese texts. Chinese is quite different from English in grammatical structure,
meaning, expression and idioms. While English sentences tend to be longer with a need
to be specific, Chinese tends to use simple and short sentences to express rich and
vivid information. Generally, Chinese texts are more complex and rely on context.
Recently, RNNs (Recurrent Neural Networks) [17, 18] have been shown to
capture relationships between words, and to obtain long/short-term
dependencies through their gating mechanism. In our task, a Chinese sentence may contain
several targets with individual modifiers. When judging the overall sentiment of a
sentence, targets and modifiers that do not correspond to each other can introduce
noise. However, several works exploit attention mechanisms to effectively learn context
features together with the target for sentiment analysis [19–22].
In this paper, we propose a Coattention-based RNN model for analyzing the sentiment
polarities of Chinese short texts. Bidirectional RNNs are introduced to
learn representations of context and target. A traditional attention mechanism that
relies on average pooling is ill-suited to capturing high-quality context features for
the related targets. A coattention mechanism addresses this problem by computing the
attention representation of the context from the target representation. We evaluate
our approach on two public datasets, and the results show that the
proposed architecture achieves superior performance on sentiment analysis
of Chinese texts.
2 Model
In this section, we describe the Coattention-based RNN model for sentiment
analysis of Chinese texts in detail. The framework has four core components:
word embedding, GRU networks, the coattention encoder, and the sentiment
classifier, as shown in Fig. 1.
embedding matrix is pre-trained. Mapped from this word embedding matrix, two
embedding matrices are obtained for encoding the abovementioned word sets,
$[e_c^1; e_c^2; \ldots; e_c^m]$ and $[e_n^1; e_n^2; \ldots; e_n^n]$.
$$s = v_s C_a + b_s \qquad (2.3)$$
3 Experiments
In this section we describe our experimental setup and provide an in-depth discussion
of the achieved results.
3.1 Datasets
We evaluate our model on two public datasets, collected from online
hotel reviews and Sina Weibo. The hotel review corpus, named ChnSentiHtl,
has 7,767 reviews in total, consisting of 5,323 positive reviews and 2,444 negative
reviews. The Sina Weibo corpus, named SinaWeiboSenti, has 119,988
reviews, including 59,994 positive reviews and 59,994 negative reviews. The
partition of the datasets into training and test sets for the sentiment prediction
tasks is shown in Table 1.
$$Acc = \frac{Tr}{Tr + Fs} \qquad (3.1)$$
To avoid over-fitting and to balance the numbers of positive and negative samples,
random playback cross validation is applied. We run cross validation five times,
and the training set and test set never intersect. More details are listed in
Table 2. The results show that our proposed method outperforms the other methods
on both the full dataset and the polarity-specific datasets.
This paper proposed a Coattention-based RNN model for sentiment analysis
of Chinese text, which takes word embeddings as input. Word embeddings matter for
this task because they capture semantic and syntactic information during
training. Bidirectional RNNs and a coattention mechanism are used to learn
high-quality representations of context and target. In experiments on two public
datasets, the proposed Coattention-based RNN model significantly outperforms
state-of-the-art methods.
In future work, we plan to extend the structure to capture
negation modifiers and to learn unknown sentiment words and phrases
by exploring knowledge mapping and collaborative learning.
Acknowledgement. The authors gratefully acknowledge the anonymous reviewers for their
helpful suggestions.
1. Pang, B., Lee, L.: Opinion mining and sentiment analysis. Found. Trends Inf. Retrieval 2(1–
2), 1–135 (2008)
2. Yang, C., Zhang, H., Jiang, B., et al.: Aspect-based sentiment analysis with alternating
coattention networks. Inf. Process. Manag. 56(2019), 463–478 (2019)
3. Long, J., Yu, M., Zhou, M., et al.: Target-dependent twitter sentiment classification. In:
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics:
Human Language Technologies (2011)
4. Balahur, A., Steinberger, R., Kabadjov, M.: Sentiment analysis in the news. Infrared Phys.
Technol. 65, 94–102 (2014)
5. Ma, Y., Peng, H., Cambria, E.: Targeted aspect-based sentiment analysis via embedding
commonsense knowledge into an attentive LSTM. In: Proceedings of the 32nd AAAI
Conference on Artificial Intelligence (2018)
6. Zhang, L., Wang, S., Liu, B.: Deep learning for sentiment analysis: a survey. Wiley
Interdisc. Rev.: Data Min. Knowl. Discov. 8(4), e1253 (2018)
7. Zhang, Z., Lan, M.: ECNU: extracting effective features from multiple sequential sentences
for target-dependent sentiment analysis in reviews. In: Proceedings of the 9th International
Workshop on Semantic Evaluation (2015)
8. Wagner, J., Arora, P., Cortes, S., et al.: DCU: aspect-based polarity classification for
SemEval task 4. In: Proceedings of the 8th International Workshop on Semantic Evaluation
9. Liu, B.: Sentiment Analysis: Mining Opinions, Sentiments, and Emotions. The Cambridge
University Press (2015)
10. Zhang, X., Zhao, J., LeCun, Y.: Character-level convolutional networks for text
classification. In: Advances in Neural Information Processing Systems (2015)
11. Kalchbrenner, N., Grefenstette, E., Blunsom, P.: A convolutional neural network for
modelling sentences. In: Proceedings of the 52nd Annual Meeting of the Association for
Computational Linguistics (2014)
12. Qian, Q., Huang, M., Lei, J., et al.: Linguistically regularized LSTMs for sentiment
classification. In: Proceedings of the 55th Annual Meeting of the Association for
Computational Linguistics (2017)
13. Ruder, S., Ghaffari, P., Breslin, J.G.: A hierarchical model of reviews for aspect-based
sentiment analysis. In: Proceedings of the 2016 Conference on Empirical Methods in Natural
Language Processing (2016)
14. Zhou, P., Qi, Z., Zheng, S., et al.: Text classification improved by integrating bidirectional
LSTM with two-dimensional max pooling. In: Proceedings of the 26th International
Conference on Computational Linguistics (2016)
15. Lin, Z., Feng, M., dos Santos, C.N., et al.: A structured self-attentive sentence embedding. In:
Proceedings of International Conference on Learning Representations (2017)
16. Yang, Z., Yang, D., Dyer, C., et al.: Hierarchical attention networks for document
classification. In: Proceedings of the 2016 Conference of the North American Chapter of the
Association for Computational Linguistics: Human Language Technologies (2016)
17. Dieng, A.B., Wang, C., Gao, J., et al.: TopicRNN: a recurrent neural network with long-
range semantic dependency. In: Proceedings of International Conference on Learning
Representations (2017)
18. Lipton, Z.C., Berkowitz, J., Elkan, C.: A critical review of recurrent neural networks for
sequence learning. Comput. Sci. (2015)
19. Tay, Y., Tuan, L.A., Hui, S.C.: Learning to attend via word-aspect associative fusion for
aspect-based sentiment analysis. In: Proceedings of the Thirty-Second AAAI Conference on
Artificial Intelligence (2018)
20. Chen, P., Sun, Z., Bing, L., et al.: Recurrent attention network on memory for aspect
sentiment analysis. In: Proceedings of the 2017 Conference on Empirical Methods in Natural
Language Processing (2017)
21. Ma, D., Li, S., Zhang, X., et al.: Interactive attention networks for aspect-level sentiment
classification. In: Proceedings of the Twenty-Sixth International Joint Conference on
Artificial Intelligence (2017)
22. Wang, Y., Huang, M., Zhao, L., et al.: Attention-based LSTM for aspect-level sentiment
classification. In: Proceedings of the 2016 Conference on Empirical Methods in Natural
Language Processing (2016)
Design and Implementation of Small-scale
Sensor Network Based on Raspberry Pi
1 Introduction
Therefore, this paper designs and implements a small-scale sensor network
that uses the Raspberry Pi as the central control board. The Raspberry Pi collects
sensor information, processes it, and sends it to the server in JSON format; the
instructions sent by the server are used to operate the hardware devices in the
sensing system. The design provides stable data transmission, supports access by
multiple types of sensors, and automatically reconnects after a disconnection, which
solves the problem that a device node goes to sleep after the network is accidentally
disconnected. The sensor network has high stability and good synchronization, and
has application prospects and value.
2 Related Technology
2.2 Websocket
WebSocket is a full-duplex communication protocol introduced alongside HTML5. The
protocol runs over a single TCP connection [3], which makes the exchange of
data between the client and the server easier, and it allows the server to actively
push data to the client. With the WebSocket API, the browser and the server
only need to complete one handshake, after which the two maintain a persistent
connection and perform bidirectional data transmission. Through this fast data
channel, data transfer between the server and the client is more efficient [4]. The
WebSocket handshake is based on HTTP 1.1, and the protocol can save server
resources and bandwidth and communicate in a stable and real-time manner (Fig. 2).
Fig. 2. Schematic diagram of the connection process between the server and the client
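To make the single handshake more concrete, the following minimal Python sketch computes the Sec-WebSocket-Accept value that a server returns for a client's Sec-WebSocket-Key, as defined in RFC 6455. It is an illustration only and is not taken from the system described in this paper; the function name websocket_accept is an assumption.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Return the Sec-WebSocket-Accept header value for a client key.

    The server appends the fixed GUID to the client's key, hashes the
    result with SHA-1, and base64-encodes the digest.
    """
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

Once the client has verified this value, both sides keep the same TCP connection open and exchange frames in either direction without further HTTP requests.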
In the sensor network part of this study, the function design is as follows: the
sensor node acquires the data, which is then read by the central control board,
encapsulated in the embedded JSON format, and uploaded to the data server by the
central control board over the wireless network. After receiving instruction
information from the user, the central control board parses the received JSON
packet, reads the operation state of the sensor node, and applies the corresponding
settings.
The sensor network hardware system architecture.
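The upload and instruction paths of this function design can be sketched as follows. This is a minimal Python sketch, not the authors' implementation: the field names (device_id, data, time, target, state) and both function names are assumptions chosen for illustration.

```python
import json
import time


def build_packet(device_id, readings, timestamp=None):
    """Encapsulate sensor readings as a JSON string for upload.

    All field names here are hypothetical; the paper does not list them.
    """
    packet = {
        "device_id": device_id,
        "data": readings,
        "time": int(time.time()) if timestamp is None else timestamp,
    }
    return json.dumps(packet, sort_keys=True)


def parse_instruction(raw):
    """Parse a JSON instruction from the server and return (target, state)."""
    message = json.loads(raw)
    return message["target"], message["state"]
```

A reading from the DHT11 sensor could then be uploaded as build_packet("pi-01", {"temperature": 25.0, "humidity": 40.0}), where the device ID is again a made-up value.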
The development environment of the small-scale sensor network is based on Raspbian,
a Debian-based Linux operating system for the Raspberry Pi. Because of the
performance limitations of the Raspberry Pi, part of the code is built on a PC
configured with a cross-compilation environment. After compilation, the
executable file is run on the Raspberry Pi.
(2) Power on the Raspberry Pi and connect peripherals such as the display via USB.
(3) Connect the DHT11 temperature and humidity sensor with DuPont wires:
connect the DHT11 VCC pin to Raspberry Pi pin 1 and GND to Raspberry Pi pin 6
to supply the DHT11 sensor with 3.3 V, and connect the data pin to
Raspberry Pi pin 7. The above Raspberry Pi pin connections are all based on the
WiringPi pin position. The connection status is shown in Fig. 7.
(4) Run the program. The server verifies the device ID and, on success, sends a
heartbeat packet to the Raspberry Pi. After receiving it, the Raspberry Pi
prints the successful connection information on the external display and
continues to print heartbeat information to show whether the
current connection status is normal.
(5) The Raspberry Pi sensor network system maintains a stable, long-lived connection
with the server. The server continuously sends heartbeat packets to the
Raspberry Pi, which receives them and uses them to detect the connection
status. The heartbeat information is printed on the external display, allowing
the user to check whether the system is running normally [6].
(6) After receiving a message sent by the sensor network, the server stores the data
in the MySQL database. All databases of the system on the server are shown in
Fig. 8.
(7) After selecting the t_data data table, you can view the information collected from
all sensing devices, including the device ID number, sensor data in JSON format, and
data acquisition time. The specific data is shown in Fig. 9.
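The heartbeat check in steps (4) and (5), together with the automatic reconnection mentioned in the introduction, can be sketched as below. This is a minimal Python sketch under stated assumptions, not the authors' code: the class name HeartbeatMonitor, the 10-second timeout, and the injectable clock are all hypothetical.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat and reports whether the link looks alive."""

    def __init__(self, timeout_s=10.0, clock=time.monotonic):
        self.timeout_s = timeout_s  # assumed value, not taken from the paper
        self.clock = clock
        self.last_seen = clock()

    def on_heartbeat(self):
        """Record the arrival time of a heartbeat packet."""
        self.last_seen = self.clock()

    def is_alive(self):
        """True if a heartbeat arrived within the timeout window."""
        return (self.clock() - self.last_seen) <= self.timeout_s


def check_connection(monitor, reconnect):
    """Call reconnect() if the heartbeat timed out; return True if it was called."""
    if monitor.is_alive():
        return False
    reconnect()
    monitor.on_heartbeat()  # restart the timeout window after reconnecting
    return True
```

Calling check_connection periodically from the main loop keeps the node from idling after an accidental network drop; the reconnect callback would reopen the connection to the server.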
1. Luo, S.: The development of the Internet of Things industry and the trend of technological
innovation. Inf. Commun. Technol. Cont. Unit 12(04), 4–8 (2018)
2. Gao, F., Wen, H., Zhao, L., et al.: Design and optimization of a cross-layer routing protocol
for multi-hop wireless sensor networks. In: International Conference on Sensor Network
Security Technology & Privacy Communication System, pp. 16–87. IEEE (2013)
3. Hou, L., Zhao, S., Li, X., et al.: Design and implementation of application programming
interface for Internet of Things cloud. Int. J. Netw. Manag. 57–58 (2016)
4. Khan, I., Belqasmi, F., Glitho, R., et al.: Wireless sensor network virtualization: a survey.
IEEE Commun. Surv. Tutorials 18(1), 553–576 (2017)
5. Kamoshida, Y.: Simplifying install-time auto-tuning for cross-compilation environments by
program execution forwarding. In: IEEE International Conference on Parallel & Distributed
Systems, pp. 15–97. IEEE (2013)
6. Park, S., Kang, J., Park, J., et al.: One-bodied humidity and temperature sensor having advanced
linearity at low and high relative humidity range. Sens. Actuators B (Chem.) 76(1–3), 322–326
Research on Template-Based Factual
Automatic Question Answering Technology
1 Introduction
For major search engine developers, the greatest challenge is how to
quickly and effectively extract the information people genuinely seek from the Internet.
As a new generation of retrieval methods, the question answering system has become a
hot research topic. Compared with traditional search engines, its advantage is that it
takes natural language questions as input and returns the exact answer directly
to the user. In the retrieval process of a fact-based question answering system, the
main research emphasis is the semantic analysis of questions. A feasible
research direction is to match the question entered by the user against question
templates in the knowledge base and deduce the query intention of the question from
the template information, so that the corresponding fact query statement can be
generated and accurate answers can be obtained directly from the knowledge base.
2 Relevant Research
In terms of template generation, the vast majority of question answering template
libraries for automatic question answering systems are constructed manually:
the question answering corpus is processed and question answering templates or rules
are sorted out by hand. For example, Fader [2] and Lopez et al. [3] summarized the main
forms of questions in English and performed entity alignment with natural language
expressions to generate question templates. Using three manually constructed forms,
Bast and Haussmann [4] generated query templates without relying on natural language
expressions, instantiating the query templates only from the question corpus.
Manual template construction requires an extensive amount of manpower and
time, and the resulting question answering templates are very limited. Therefore, some
scholars have proposed automatic template construction methods, in which
question answering templates are learned from a question answering corpus, such as
the template generation method based on expression and relationship matching
proposed by Abujabal [5].
Essentially, a generated question template, whether produced manually or
automatically, is still a question sentence: it is only abstracted from the sentence,
with some specific elements replaced by their types [6]. Matching a question with a
template is therefore still a sentence-to-sentence similarity calculation [7].
Sentence similarity is mainly reflected in the degree of overlap between the words that
constitute the sentences. Current research mainly follows two directions:
statistics-based methods and methods based on word or sentence vector distance. The
current mainstream direction for research on sentence similarity is word similarity
[8], and several algorithms that compute the global similarity of sentences from word
similarity have been proposed. For example, Cranias et al. [9] first proposed the use
of multilevel dynamic programming to calculate sentence similarity; Ding [10]
proposed a method based on Latent Semantic Indexing to determine whether different
sentences express the same semantics; Carbonell and Goldstein [11] of Carnegie Mellon
University proposed using Maximal Marginal Relevance to calculate sentence
similarity; and Nirenburg [12] proposed a similarity calculation method based on
string matching, in which the similarity of the whole sentence is obtained from the
similarity between words. Other algorithms obtain overall similarity by comparing
sentences at the character level, such as the well-known string edit distance and the
MCWPA algorithm.
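As an illustration of character-level comparison, the string edit distance mentioned above can be computed with the standard dynamic programming recurrence. The sketch below is a generic Python version, not code from this paper; the normalization used in edit_similarity (one minus distance divided by the longer length) is an assumption, since the text does not say how distance is turned into a similarity score.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of insertions, deletions,
    and substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Map the distance to [0, 1]; this normalization is an assumed choice."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest
```

Because the functions operate on Python strings, they apply to Chinese questions character by character without any word segmentation.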
The template generation method based on a relational dictionary proposed in this paper
is divided into two steps: constructing the relational dictionary from a text corpus,
and generating question answering templates from a question answering corpus.
where T(s) represents the set of items generated by sentence s, E(s) represents the
set of entities in sentence s, and I(s) represents the set of important words in
sentence s, excluding entities and auxiliary words.
Due to the limitations of the corpus, item sets are not screened by a threshold;
instead, the number of occurrences of an item set is used as its weight to indicate the
probability of occurrence of the item set. The counted item sets are then integrated:
if (e1, e2, {t1}), (e1, e2, {t2}), and (e1, e2, {t1, t2}) all occur, and the three
occur a similar number of times, then {t1, t2} is considered a fixed combination of
words, and only the item (e1, e2, {t1, t2}) is retained.
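The counting and integration step can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation: items are represented as (e1, e2, frozenset_of_words) tuples, and the relative tolerance of 0.2 used to decide that counts are "basically similar" is a hypothetical value, since the paper gives no threshold.

```python
from collections import Counter


def merge_item_sets(items, tolerance=0.2):
    """Count item sets and keep only (e1, e2, {t1, t2}) when it and its
    single-word parts occur about equally often.

    tolerance is an assumed relative threshold for "basically similar".
    """
    counts = Counter(items)
    merged = dict(counts)
    for (e1, e2, words), n_pair in counts.items():
        if len(words) != 2:
            continue
        parts = [(e1, e2, frozenset([w])) for w in words]
        if not all(p in counts for p in parts):
            continue
        group = [n_pair] + [counts[p] for p in parts]
        if max(group) - min(group) <= tolerance * max(group):
            for p in parts:
                merged.pop(p, None)  # drop the single-word items
    return merged
```

The returned dictionary maps each retained item to its occurrence count, which serves as the weight described above.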
Additional information in the sentence is deleted while
the above three types of words are retained, so as to simplify the
question's syntactic dependency tree and turn it into a template.
Template matching is the core of the question answering system based on semantic
analysis. In this paper, the neural network-based deep learning method is used to
calculate sentence similarity.
tree is not fixed due to uncertainty in sentences. The output of the neural network is a
distribution representing the probability that the input question belongs to each type of
The basic structure of the tree-based convolutional neural network is as follows
(Fig. 1):
The network has four layers: in addition to the final fully connected layer and the
softmax layer, there are a special convolution layer and a pooling layer. A “continuous
binary tree” is introduced to deal with nodes that have more than two children in the
syntactic tree.
In the process of convolution, if the depth of a sub-tree is less than the number of
layers of the window, the missing layers are padded with zero vectors. After the
convolution is completed, a tree of feature vectors with the structure of the original
syntactic tree is obtained.
In tree-based convolutional neural networks, the input data is the syntactic tree
obtained from the syntactic dependency analysis of questions. Different sentences lead
to different tree structures; hence, the number of nodes in each window also differs.
Without further measures, each node in the window would require its own weight
matrix $W_{conv,i}$ as a parameter. A “continuous binary tree” is introduced to solve
this problem: only $W_{conv}^{t}$, $W_{conv}^{l}$, and $W_{conv}^{r}$ are used as the
parameters of the model, and, based on the location information of node $i$, a linear
combination of $W_{conv}^{t}$, $W_{conv}^{l}$, and $W_{conv}^{r}$ is used to express
$W_{conv,i}$. The implementation of the “continuous binary tree” is described in Sect. 4.2.3.
so the training time and computational resources are also relatively large. Based on the
advantages and disadvantages of the above two methods, a compromise approach has
emerged: a certain number of parameters can be used, and Wi can be obtained with the
linear combination of these parameters.
In this paper, three basic parameters $W_l$, $W_r$, and $W_t$ are used. The parameter
$W_i$ of node $i$ is expressed as a linear combination of $W_l$, $W_r$, and $W_t$:
$$W_i = a_i^t W_t + a_i^l W_l + a_i^r W_r \qquad (3)$$
where $a_i^t$, $a_i^l$, and $a_i^r$ are calculated according to the relative position of node $i$:
$$a_i^t = \frac{d_i - 1}{d - 1} \qquad (4)$$
$d_i$ represents the depth of node $i$ in the current sliding window, and $d$ represents the
depth of the sliding window.
$$a_i^r = (1 - a_i^t)\,\frac{p_i - 1}{n - 1} \qquad (5)$$
$p_i$ represents the position of node $i$ in the current layer, and $n$ represents the total
number of sub-nodes in the current layer.
$$a_i^l = (1 - a_i^t)(1 - a_i^r) \qquad (6)$$
The continuous binary tree method can process arbitrary tree structures with
a small, fixed number of parameters. It also adapts well to windows
of any size and location.
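The coefficient computation of the continuous binary tree can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code: it uses a_t = (d_i - 1)/(d - 1), a_r = (1 - a_t)(p_i - 1)/(n - 1), and a_l = (1 - a_t)(1 - a_r), with weight matrices stored as nested lists. The fallback value 0.5 when a denominator would be zero (a window of depth 1, or a layer with a single node) is an assumption that the text does not specify.

```python
def continuous_binary_tree_coeffs(d_i, d, p_i, n):
    """Return (a_t, a_l, a_r) for node i.

    d_i: depth of node i in the sliding window; d: depth of the window;
    p_i: position of node i in its layer; n: number of nodes in that layer.
    """
    # Assumption: use 0.5 when the denominator would be zero.
    a_t = (d_i - 1) / (d - 1) if d > 1 else 0.5
    frac = (p_i - 1) / (n - 1) if n > 1 else 0.5
    a_r = (1 - a_t) * frac
    a_l = (1 - a_t) * (1 - a_r)
    return a_t, a_l, a_r


def combine(W_t, W_l, W_r, coeffs):
    """Compute W_i = a_t*W_t + a_l*W_l + a_r*W_r for nested-list matrices."""
    a_t, a_l, a_r = coeffs
    return [[a_t * t + a_l * l + a_r * r for t, l, r in zip(row_t, row_l, row_r)]
            for row_t, row_l, row_r in zip(W_t, W_l, W_r)]
```

However many children a node has, only the three matrices W_t, W_l, and W_r are learned; each node's effective weight is derived from its position.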
5 Experiment
Each test sample contains two pieces of information: a question and the corresponding
template number. Some of the samples are as follows (Table 1):
A training set and a test set are needed for the tree-based convolutional neural network.
The samples were randomly divided into five groups {P1, P2, P3, P4, P5}, from which
five data sets were constructed: in data set i, group Pi was used as the test sample
and the other groups were used as training samples. The results obtained on the five
data sets are as follows (Table 2):
From the sample data, 200 samples were randomly selected as a test set. Several
algorithms were tested, and the results obtained are as follows (Table 3):
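The five-group partition used for the tree-based convolutional neural network corresponds to ordinary five-fold cross validation. A minimal Python sketch is given below; the function name, the fixed random seed, and the round-robin assignment of shuffled samples to groups are assumptions for illustration, since the paper only states that the grouping is random.

```python
import random


def five_fold_splits(samples, k=5, seed=0):
    """Yield (train, test) pairs; fold i uses group P_i as the test set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    # Round-robin assignment gives k groups of nearly equal size.
    groups = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = groups[i]
        train = [s for j, g in enumerate(groups) if j != i for s in g]
        yield train, test
```

Every sample appears in exactly one test group, so the five test sets together cover the whole sample set without overlap.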
6 Conclusion
is further considered, and the tree convolution neural network is used to extract the
features of the dependency syntactic relationship. A neural network model and its
training method for template prediction are designed.
Acknowledgement. Supported by the 2018 project “Research and Application of Key
Technologies on Online Monitoring, Efficient Operation and Maintenance and Intelligent
Evaluation of Health Condition of Electrical Equipment” of State Power Grid Corporation
(No. PDB17201800280), and the Major State Research Development Program of China
(No. 2016QY04W0804).
1. Chen, D., Fisch, A., Weston, J., Bordes, A.: Reading Wikipedia to Answer Open-Domain
Questions. arXiv preprint arXiv:1704.00051
2. Fader, A., Zettlemoyer, L., Etzioni, O.: Paraphrase-driven learning for open question
answering. In: Meeting of the Association for Computational Linguistics, pp. 1608–1618
3. Lopez, V., Tommasi, P., Kotoulas, S., Wu, J.: QuerioDALI: question answering over
dynamic and linked knowledge graphs. In: Groth, P., et al. (eds.) ISWC 2016. LNCS, vol.
9982, pp. 363–382. Springer, Cham (2016).
4. Bast, H., Haussmann, E.: More accurate question answering on freebase. In: ACM
International on Conference on Information and Knowledge Management, pp. 1431–1440.
ACM (2015)
5. Abujabal, A., Yahya, M., Riedewald, M., et al.: Automated template generation for question
answering over knowledge graphs. In: International World Wide Web Conferences Steering
Committee, pp. 1191–1200 (2017)
6. Berant, J., Chou, A., Frostig, R., et al.: Semantic parsing on freebase from question-answer
Pairs. In: Proceedings of EMNLP (2013)
7. Bordes, A., Chopra, S., Weston, J.: Question answering with subgraph embeddings.
Comput. Sci. (2014)
8. Joshi, M., Sawant, U., Chakrabarti, S.: Knowledge graph and corpus driven segmentation
and answer inference for telegraphic entity-seeking queries. In: Conference on Empirical
Methods in Natural Language Processing, pp. 1104–1114 (2014)
9. Cranias, L., Papageorgiou, H., Piperidis, S.: A matching technique in example-based
machine translation. In: Conference on Computational Linguistics, pp. 100–104. Association
for Computational Linguistics (1994)
10. Ding, C.H.Q.: A similarity-based probability model for latent semantic indexing. In:
International ACM SIGIR Conference on Research and Development in Information
Retrieval, pp. 58–65. ACM (1999)
11. Carbonell, J., Goldstein, J.: The use of MMR, diversity-based reranking for reordering
documents and producing summaries. In: International ACM SIGIR Conference on
Research and Development in Information Retrieval, pp. 335–336 (1998)
12. Nirenburg, S., Domashnev, C., Grannes, D.J.: Two approaches to matching in example-
based machine translation. In: TMI, pp. 47–57 (1993)
A Novel Scheme for Recruitment Text
Categorization Based on KNN Algorithm
1 Introduction
With the development of the Internet, a large amount of recruitment information
is posted on third-party recruitment websites every day [1]. Knowledge
discovery through artificial intelligence techniques has become mainstream in
web-based applications [2,3]. Job seekers can obtain relevant recruitment
information by logging in to third-party recruitment websites. However, the
recruitment information on these websites is complicated and confusing,
which makes it difficult for job seekers to find jobs that suit them. If
recruitment information in different fields were classified in detail
and could be quickly indexed and searched, search efficiency would improve
significantly [4,5].
Through many surveys, we found that most of the recruitment information
is displayed in text format, so we use the text classification algorithm to classify
© Springer Nature Switzerland AG 2019
M. Qiu (Ed.): SmartCom 2019, LNCS 11910, pp. 376–386, 2019.
the collected text. Text classification is an active research field whose purpose
is to determine the class of a text quickly and accurately with the help
of classification algorithms. At present, the KNN algorithm [6], SVM [7], neural
networks [8], and the Naïve Bayes algorithm are all applied to text classification.
By improving the original KNN text classification algorithm, the
low efficiency of the KNN algorithm in classifying recruitment information
can be effectively addressed, and job seekers can work more efficiently.
The main work of this paper includes the following three aspects:
(1) The principle of the KNN algorithm is described, and the reason for its low
classification efficiency is analyzed. The RS-KNN algorithm, which has better
classification efficiency, is proposed as an improvement of the KNN algorithm.
(2) The basic conditions for the classification experiments are established
through data preprocessing, Chinese word segmentation, feature extraction, etc.
(3) The experimental results are obtained by comparing the KNN classification
algorithm before and after the improvement, and the experimental results are
2 Related Work
Yuan [9] proposed a modified, center-based KNN algorithm. A semantic
relationship between features is introduced into the original KNN algorithm:
the sample set is first clustered according to the semantic relationship, and
a central document is then generated, which reduces the number of texts that
must be searched and increases the classification speed. Yang et al. [10] proposed
an improved M operator and a symbol-based improved KNN algorithm to generate
a strategy that effectively reduces the sample data set and the computational
complexity of the KNN algorithm, thereby improving the efficiency of
the algorithm. Gu [11] proposed an improved particle swarm optimization KNN
classification algorithm. The algorithm uses the random search ability of particle
swarm optimization to perform a global random search on the training document
set. In the search process, the particle swarm skips over a large number of text
vectors, eliminating the effects of individual particles.
At the same time, an interference factor is added to avoid local convergence
of the algorithm, so that the K nearest neighbors of the test sample are found
quickly. The algorithm proposed by Bojanowski [12] calculates the distribution of
the various texts in the vector space during training and classifies the
text vectors according to their positions in the sample space to narrow the
search scope. The K-nearest neighbor search then uses an improved similarity
calculation method that handles high-dimensional and large sample sets of text
vectors more accurately, which alleviates the inefficiency of KNN classification.
In particular, the traditional KNN algorithm treats the contribution of every
feature as equal, which means that words with little discriminative power
can also affect the classification result. Yang et al. [13] proposed a
feature-weighted KNN text classification algorithm to solve this problem. It
accounts for the different contributions of features to the classification by
assigning them different weights and raising the weights of important features,
which helps to reduce the impact of such words on the classification results
and improves the classification accuracy of the algorithm.
3 Concept
3.1 Web Crawler
The web crawler starts with an initial URL set, extracts all the links on each page,
adds them to the URL set, and repeats this loop until a termination condition is
met, extracting page content according to content rules along the way. The core
principle of web crawlers is as follows: given a uniform resource locator (URL), the
Hypertext Transfer Protocol (HTTP) is used to simulate the way a browser requests
access to the web server, the necessary request information is encapsulated to
obtain the permission of the web server, and the original page is returned and its
data analyzed [14]. With a web crawler, we can quickly obtain the recruitment
information on the major recruitment websites and use it as the data set
of our classification algorithm.
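The crawl loop described above can be sketched with the Python standard library. This is a simplified illustration, not the crawler used in this paper: the fetch function is passed in by the caller (for example, a thin wrapper around an HTTP client), the max_pages limit stands in for the termination condition, and content extraction rules are omitted. All names are assumptions.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, fetch, max_pages=100):
    """Breadth-first crawl; fetch(url) must return the page HTML as a string."""
    queue = deque(seed_urls)
    seen = set(seed_urls)
    pages = {}
    while queue and len(pages) < max_pages:  # termination condition
        url = queue.popleft()
        html = fetch(url)
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

Injecting fetch keeps the loop independent of any particular HTTP library and makes it easy to add request headers, delays, or error handling where the target site requires them.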
In terms of word segmentation, common methods include string-matching-based
methods, rule-based methods, and statistics-based methods, but matching is slow
and words that are not in the dictionary cannot be matched. The principle of the
statistics-based method is that the frequency with which contextually adjacent
characters appear together indicates the probability that they form a word. From
the joint frequency of characters appearing next to each other in the corpus,
their mutual occurrence information is calculated. The mutual occurrence
information reflects how closely the Chinese characters are connected. When it
exceeds a certain threshold, the character group can be judged to form a word.
The advantage of this approach is that it is not limited by the domain of the
text to be processed and does not need a specialized dictionary. Based on
probability theory, statistical segmentation abstracts the occurrence of Chinese
character combination strings in Chinese context as a random process, whose
parameters can be obtained through large-scale corpus training. Popular
After the preprocessed text data is obtained, feature selection is required. Good
feature selection can improve the performance of the model and help us understand
the characteristics and underlying structure of the data, which plays an important
role in further improving the model and algorithm. Feature selection has two
functions: one is to reduce the number of features and the dimensionality, so that
the model generalizes better; the other is to improve the understanding of the
relationship between features and their values. Currently, the
commonly used methods for feature selection are CHI statistics [17], information
entropy, mutual information (MI) [18], stop word removal, information gain
(IG), and the TF-IDF algorithm.
The core idea of the TF-IDF algorithm is that if a certain word appears many
times in a certain category of text documents, it can distinguish that category
very well. Conversely, if a word appears many times in all categories, it cannot
distinguish between them well. The algorithm uses statistics to assess the
importance of a word in a document. It integrates the term frequency factor and
the inverse document frequency, and takes the product of term frequency (TF) and
inverse document frequency (IDF) as the weight of a feature word.
Term frequency weighting, also known as feature frequency, refers to the fre-
quency of a term in a text document. In different categories of documents, the
frequency of feature items will vary. Therefore, it is also possible to take the fre-
quency information of feature items as the reference index of text classification.
If the number of occurrences of a word in each category is similar, the ability of
the word to distinguish between classes is weaker. For a specific term $t_i$,
the term frequency is calculated as in Eq. (1),
$$tf_{i,j} = \frac{n_{i,j}}{\sum_{k=1}^{m} n_{k,j}} \tag{1}$$
where $n_{i,j}$ is the number of occurrences of the term $t_i$ in document
$d_j$, and the denominator is the sum of the frequencies of all $m$ terms in
document $d_j$. Inverse document frequency (IDF) is a measure of the importance
of a word across all text documents. It is calculated as shown in Eq. (2) below,
$$idf_i = \log \frac{N}{|\{j : t_i \in d_j\}|} \tag{2}$$
380 W. Qin et al.
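where $N$ is the total number of documents in the data set and
$|\{j : t_i \in d_j\}|$ is the number of documents that contain the term $t_i$.
Once we have the values of $tf_{i,j}$ and $idf_i$, we can evaluate TF-IDF. The
calculation formula is shown in Eq. (3) below:
$$tfidf_{i,j} = tf_{i,j} \times idf_i \tag{3}$$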
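The TF-IDF weighting described above (term frequency multiplied by inverse document frequency) can be illustrated with a minimal Python sketch. The function name `tf_idf` and the toy documents are assumptions made for illustration; they are not taken from the paper, which does not give its implementation.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document (hypothetical helper)."""
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        total = sum(counts.values())  # sum of all term counts in the document
        weights.append({
            term: (n / total) * math.log(n_docs / doc_freq[term])
            for term, n in counts.items()
        })
    return weights

# Toy, already segmented job texts (assumed data).
docs = [["java", "developer", "java"], ["sales", "manager"], ["java", "sales"]]
w = tf_idf(docs)
```

A term that occurs in every document receives weight zero, which matches the observation that such a word cannot distinguish between categories.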
is classified, the default weight of each dimension is equal, which will affect the
classification accuracy.
4 Algorithm
To address the high time cost of the KNN classification algorithm, we improve
the original KNN algorithm in combination with the characteristics of
recruitment text, so that it can be applied to real recruitment text
classification with better efficiency. One of the shortcomings of the KNN
algorithm is its large classification time cost: when faced with a large number
of recruitment texts, its efficiency is relatively low. This paper starts by
reducing the number of weight calculations for professional terms such as real
estate, finance, development, and sales.
In general, the improved KNN classification algorithm proceeds as follows.
First, the initial data are obtained and preprocessed. Next, weighted terms are
selected. Then the distance between each point in the data set of known
categories and the current point is calculated, with the weight of the same key
calculated only once. The distances are sorted in increasing order, and the K
points closest to the current point are selected. Finally, the frequency of each
category among these K points is determined, and the most frequent category is
returned as the predicted category of the current point. The process is shown in
Fig. 1:
After the above analysis, the specific process of the improved KNN algorithm
is as follows:
(1) Data preprocessing. Use the crawler to get the job text from the internet
and preprocess the data.
(2) Select technical terms for weighting. After word segmentation and removal
of stop words, the professional terms in the recruitment text can be easily
(3) Calculate the distance. The cosine formula is used for distance calculation.
If a word with the same weight is encountered, it is treated by default as
the first word with that weight; in particular, words with the same weight
are calculated only once.
(4) Sort by increasing distance. After calculating the distance between the text
to be classified and all samples, all distances are arranged in order from
smallest to largest.
(5) Select the first K sample points with the smallest distance. Once sorted,
the K closest sample points are selected and analyzed to see which
categories they belong to.
(6) Identify the category. Determine which category most of the K nearest
sample points belong to and assign the text to this category.
The RS-KNN algorithm proposed in this paper is implemented as follows:
First, the weighted training set Train, the category Labels, the K value, and
the unknown data set Test are input. Then we traverse the unknown data set Test
to calculate the distance between each text of unknown category and the known
samples. If a value in the unknown data set is the same as the corresponding
value in the known data set, it is skipped and the next word is calculated. This
is repeated for every unknown sample. The distances are then sorted, and the
first K sample points with the smallest distance are selected to determine the
category.
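As a rough illustration of steps (3) to (6), the following Python sketch shows a plain KNN classifier with cosine distance over sparse term-weight vectors. It is a baseline under stated assumptions, not the authors' RS-KNN implementation: the names `cosine_distance` and `knn_predict` and the toy data are hypothetical, and the skipping of repeated weights described above is not reproduced.

```python
import math
from collections import Counter

def cosine_distance(a, b):
    """1 - cosine similarity of two sparse {term: weight} vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty vector as maximally distant
    return 1.0 - dot / (norm_a * norm_b)

def knn_predict(train, labels, sample, k):
    """Return the most frequent label among the k nearest training vectors."""
    # Steps (3)-(4): compute all distances and sort them in increasing order.
    dists = sorted(
        (cosine_distance(sample, vec), label) for vec, label in zip(train, labels)
    )
    # Steps (5)-(6): vote among the first k samples.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy weighted training set (assumed data).
train = [
    {"java": 1.0, "developer": 0.5},
    {"java": 0.8, "backend": 0.6},
    {"sales": 1.0, "manager": 0.4},
]
labels = ["IT", "IT", "Sales"]
prediction = knn_predict(train, labels, {"java": 1.0}, k=2)
```

Every query is compared against all training vectors, which is the time cost that the improved algorithm aims to reduce.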
As the K value increases, the time consumed by the algorithms is compared in
Fig. 10 below:
6 Conclusion
To address the existing problems of the original KNN algorithm in recruitment
text classification, this paper proposes the RS-KNN algorithm, which can reduce
the influence of unfamiliar professional terms on the text classification
results. Experimental results show that the improved KNN algorithm has better
classification efficiency.
1. Gai, K., Qiu, M.: Reinforcement learning-based content-centric services in mobile
sensing. IEEE Netw. 32(4), 34–39 (2018)
2. Gai, K., Xu, K., Lu, Z., Qiu, M., Zhu, L.: Fusion of cognitive wireless networks
and edge computing. IEEE Wirel. Commun. 26(3), 69–75 (2019)
3. Gai, K., Qiu, M., Zhao, H.: Energy-aware task assignment for mobile cyber-enabled
applications in heterogeneous cloud computing. J. Parallel Distrib. Comput. 111,
126–135 (2018)
4. Yin, H., Gai, K., Wang, Z.: A classification algorithm based on ensemble feature
selections for imbalanced-class dataset. In: 2016 IEEE 2nd International Confer-
ence on Big Data Security on Cloud, pp. 245–249. IEEE (2016)
5. Yin, H., Gai, K.: An empirical study on preprocessing high-dimensional class-
imbalanced data for classification. In: 2015 IEEE 17th International Conference on
High Performance Computing and Communications, 2015 IEEE 7th International
Symposium on Cyberspace Safety and Security, and 2015 IEEE 12th International
Conference on Embedded Software and Systems, pp. 1314–1319. IEEE (2015)
6. Wang, Y., Chaib-draa, B.: KNN-based Kalman filter: an efficient and non-
stationary method for gaussian process regression. Knowl.-Based Syst. 114, 148–
155 (2016)
7. Tong, S., Koller, D.: Support vector machine active learning with applications to
text classification. J. Mach. Learn. Res. 2(Nov), 45–66 (2001)
8. Lai, S., Xu, L., Liu, K., Zhao, J.: Recurrent convolutional neural networks for text
classification. In: Twenty-Ninth AAAI Conference on Artificial Intelligence (2015)
9. Yuan, X., Sun, M., Chen, Z., Gao, J., Li, P.: Semantic clustering-based deep hyper-
graph model for online reviews semantic classification in cyber-physical-social sys-
tems. IEEE Access 6, 17942–17951 (2018)
10. Yang, K., Cai, Y., Cai, Z., Xie, H., Wong, T., Chan, W.: Top k representative:
a method to select representative samples based on k nearest neighbors. Int. J.
Mach. Learn. Cybern. 10, 1–11 (2017)
11. Gu, S., Cheng, R., Jin, Y.: Feature selection for high-dimensional classification
using a competitive swarm optimizer. Soft Comput. 22(3), 811–822 (2018)
12. Bojanowski, P., Grave, E., Joulin, A., Mikolov, T.: Enriching word vectors with
subword information. Trans. Assoc. Comput. Linguist. 5, 135–146 (2017)
13. Yang, H., Cui, H., Tang, H.: A text classification algorithm based on feature weight-
ing. In: AIP Conference Proceedings, vol. 1864, p. 020026. AIP Publishing (2017)
14. Heydon, A., Najork, M.: Mercator: a scalable, extensible web crawler. World Wide
Web 2(4), 219–229 (1999)
15. Goetz, B.: The lucene search engine: powerful, flexible, and free. JavaWorld (2000).
16. Carpenter, B.: Lingpipe for 99.99% recall of gene mentions. In: Proceedings of the
2nd BioCreative Challenge Evaluation Workshop, vol. 23, pp. 307–309. BioCreative
17. Fienberg, S.: The use of chi-squared statistics for categorical data problems. J.
Roy. Stat. Soc.: Ser. B(Methodol.) 41(1), 54–64 (1979)
18. Bennasar, M., Hicks, Y., Setchi, R.: Feature selection using joint mutual informa-
tion maximisation. Expert Syst. Appl. 42(22), 8520–8532 (2015)
19. Wang, X., et al.: Research and implementation of a multi-label learning algorithm
for Chinese text classification. In: 2017 3rd International Conference on Big Data
Computing and Communications (BIGCOM), pp. 68–76. IEEE (2017)
20. Ma, Y., Li, Y., Wu, X., Zhang, X.: Chinese text classification review. In: 2018 9th
International Conference on Information Technology in Medicine and Education
(ITME), pp. 737–739. IEEE (2018)
21. Zhao, Y., Qian, Y., Li, C.: Improved KNN text classification algorithm with
MapReduce implementation. In: 2017 4th International Conference on Systems
and Informatics (ICSAI), pp. 1417–1422. IEEE (2017)
Machine Learning for Cancer Subtype
Prediction with FSA Method
1 Introduction
The main contributions of this paper are twofold: 1. We generate two prediction
models using SVM and Random Forest algorithms along with a feature extraction
approach to predict the subtype of 222 lung cell lines, including 124 non-small cell
lung cancer cell lines, 39 small cell lung cancer cell lines and 59 normal lung cell lines,
based on their RNA expression data; 2. We compare the prediction accuracy of the
models based on these two algorithms.
The results of this research provide support for the claim that SVM, combined with
FSA, is a highly effective classification algorithm for cancer subtype classification.
Beyond lung cancer, FSA may be transferable to other diseases, which requires
further research to confirm.
The structure of this paper is as follows. We provide the rationale and background
information about machine learning applications in disease classification in Sect. 2.
Following the description of the model, an example of data processing and modelling
is given in Sect. 3 to demonstrate the implementation of FSA in practice.
The experimental results and analysis are presented in Sect. 4. The conclusions are
given in Sect. 5.
2 Related Work
Throughout the history of medicine, disease classification has always been one of the
most important factors determining how patients are treated. Human
cancers are a large family of diseases and are classified by the location in the body
where the cancer first developed, such as breast cancer, lung cancer and liver cancer.
However, each primary cancer may have multiple subtypes [6]. Discovering cancer
subtypes is helpful for guiding clinical treatment [7]. In the last decade, with the
development of the Human Genome Project, cancer biologists started to study how
molecular subtypes of cancer may be useful in planning treatment and developing new
therapies. As the cost of DNA sequencing continues to decrease, performing
DNA/RNA sequencing of patients' tumor samples has become possible. However, since
tumor samples contain thousands of genes, deciding which genes can be used to classify
different types of cancer is a hard problem and almost impossible without the help of
computers.
Fortunately, with the development of machine learning and big data analytics, it is
possible to employ these tools to discover cancer subtypes. There are two kinds of
studies of cancer subtype classification. The first kind predicts the cancer subtype
based on current knowledge [8]. In this kind of study, supervised learning is
used to make predictions. Each sample is labeled with its class, and prediction models
are generated using the features, which are the gene expression data [2]. SVM
is one of the methods used in this kind of study. Compared to other methods, SVM
is very powerful at recognizing subtle patterns in complex datasets [9]. This kind of
study may be applied in clinical services, where it may help pathologists make decisions.
The other kind of study classifies cancer subtypes based only on their gene expression
data, without using previous knowledge [10]. K-means and cluster models are commonly
used in these studies [11]. The idea behind k-means is to divide N samples (cancer
patients or cancer cell lines) into K clusters such that each sample belongs to the
cluster with the nearest mean [11, 12]. This kind of study may lead to the discovery of
new cancer subtypes, which is especially important for basic medical research.
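To make the k-means idea concrete, the sketch below alternates between assigning each sample to the cluster with the nearest mean and recomputing the means (Lloyd's algorithm [12]). The function name `kmeans`, the fixed number of iterations, the explicit initial centroids and the toy two-dimensional points are all illustrative assumptions; real gene expression data would have thousands of dimensions.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on tuples of equal length (hypothetical helper)."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign the sample to the cluster with the nearest mean
            # (squared Euclidean distance).
            nearest = min(
                range(len(centroids)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            clusters[nearest].append(p)
        # Recompute each mean; keep the old centroid if a cluster is empty.
        centroids = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else old
            for cl, old in zip(clusters, centroids)
        ]
    return centroids, clusters

# Toy samples with two features each (assumed data).
points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.8)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (6.0, 6.0)])
```

The returned clusters are those of the last assignment step; once the algorithm has converged they are consistent with the returned means.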
Although we can technically use machine learning methods to classify cancer
subtypes based on gene expression data, in practice it requires a lot of effort before we
can generate prediction models. One challenge is that each tumor sample or cancer cell
line has thousands of genes (features). Deciding which features to use is extremely
difficult. Feature selection is one core concept in machine learning, which may
390 Y. Liu et al.
3.1 Dataset
The dataset was created by Dr. John Minna at UT Southwestern Medical Center at
Dallas, and contains 222 observations, including 59 normal lung cell lines, 39 small
cell lung cancer cell lines and 124 non-small cell lung cancer cell lines, and 18,696
features [13–15]. The gene expression was analyzed by gene microarray. The data was
downloaded from (accession no. GSE32036).
$$S_i = \mu_v / \mu_i \tag{1}$$
The bead-type intensities of all arrays in the experiment are normalized by Si.
Each probe is indexed by p and the number of probes ranges from 1 to n. Each
array is indexed by i and the number of arrays ranges from 1 to m.
$$Z = (x - \mu)/\sigma \tag{3}$$
Gene expression data with l genes and n samples can be represented by the matrix
X. To eliminate the batch effect of multiple gene expression profiling experiments, we
use the Z-score to normalize the original data matrix by applying function (3). Cancer
classification was formulated as a supervised learning problem by defining the
cluster center. Then, a dissimilarity U between a sample gene and a cluster centroid
could be defined using function (4).
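The Z-score normalization of function (3) can be sketched in Python as follows. The paper ran its experiments in R and does not state whether the mean and standard deviation are taken per gene or over the whole matrix, so normalizing each row (gene) with the population standard deviation is an assumption, as is the function name `z_score_rows`.

```python
import statistics

def z_score_rows(matrix):
    """Normalize each row (assumed: one gene) to zero mean and unit standard deviation."""
    normalized = []
    for row in matrix:
        mu = statistics.mean(row)
        sigma = statistics.pstdev(row)  # population standard deviation (assumption)
        # A constant row has sigma = 0; map it to zeros instead of dividing by zero.
        normalized.append([(x - mu) / sigma if sigma else 0.0 for x in row])
    return normalized

# Toy expression matrix: 2 genes x 3 samples (assumed data).
z = z_score_rows([[1.0, 2.0, 3.0], [5.0, 5.0, 5.0]])
```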
The filter-based feature selection approach ranks all genes independently by
weighting each feature according to a particular method (here we use an appropriate
nonlinear kernel-based clustering method), and then selects genes based on their
weights W.
Random forest consists of a large number of individual decision trees. Each tree
in the random forest outputs a class prediction, and the class with the most votes
becomes the model's prediction. Decision trees are very sensitive to the data they are
trained on. Random forest takes advantage of this by allowing each individual tree to
randomly sample from the dataset with replacement, resulting in different trees, which
is known as bagging. We use this algorithm because one of its advantages is that it can
handle thousands of input variables without variable deletion. Our dataset has 18,696
variables. When we use the SVM model, we perform FSA, which is not necessary in this
We use GSE32036 (described in the Dataset section) to develop and examine the
proposed methods. There are three known subtypes: normal lung, non-small cell lung
cancer, and small cell lung cancer. Given the characteristics of this dataset, SVM and
Random Forest models are well suited to this small-sample case with a large
number of features. Before we generate the prediction models, we need to pre-process
the dataset. All experiments are run using R 3.6.1.
was optimized to 500. In terms of supervised learning algorithms, all observations were
pre-labeled to train the classifier as well as to evaluate the precision of the
4.2 Results
The cross-validation method is used here to train and test the algorithms. Before
training the two models, the dataset is randomly divided into a training set (60%) and a
test set (40%). We repeat the experiments 10 times. Each time, we perform the prediction
with the two algorithms using the same training set and test set. The results are shown
in Table 1, which lists the actual subtype and the subtype predicted by the two models.
In Table 1, we randomly divide the data into a training set and a test set 10 times.
Each time, we generate two models using the SVM and Random Forest algorithms with
FSA and then evaluate both models using the test data. The table records the actual
classification and the classification predicted by the two proposed models.
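The repeated 60%/40% random split can be sketched as below. The paper ran its experiments in R, so this Python version is only an illustration: `repeated_holdout`, the `fit` callback (which takes training data and returns a prediction function) and the trivial `fit_majority` classifier are hypothetical names, and SVM or Random Forest training would be plugged in through `fit`.

```python
import random
from collections import Counter

def repeated_holdout(samples, labels, fit, n_repeats=10, train_fraction=0.6, seed=0):
    """Return the test accuracy of each random train/test split."""
    rng = random.Random(seed)
    n_train = int(len(samples) * train_fraction)
    accuracies = []
    for _ in range(n_repeats):
        order = list(range(len(samples)))
        rng.shuffle(order)
        train_idx, test_idx = order[:n_train], order[n_train:]
        # Train on the 60% split; `fit` must return a callable predictor.
        predict = fit([samples[i] for i in train_idx], [labels[i] for i in train_idx])
        correct = sum(predict(samples[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return accuracies

def fit_majority(train_x, train_y):
    """Placeholder classifier (assumption): always predicts the majority class."""
    majority = Counter(train_y).most_common(1)[0][0]
    return lambda x: majority

accuracies = repeated_holdout(list(range(10)), ["a"] * 10, fit_majority)
```

Passing the same `seed` when evaluating two different `fit` functions gives both models identical training and test sets in every repetition, as in the experiment described above.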
4.3 Analysis
To compare the precision of the two algorithms, we computed the accuracy
rates. Figure 4 shows that in the SVM model, the accuracy rates of the normal lung cell
group, the NSCLC group and the SCLC group are 100%, greater than 90%, and greater
than 85%, respectively. In comparison, in the Random Forest model, these rates are
greater than 90%, greater than 90%, and greater than 70%, respectively. The Random
Forest model is evidently not as stable as the SVM model. Therefore, in terms of
accuracy, the SVM model is better than the Random Forest model.
In addition to accuracy, we also measure the running time of the two models.
We record the system time before and after the experiments and compute the running
time of the 10 experiments. Figure 5 shows that the running time of the SVM model
is less than 10 s, while the running time of the Random Forest model is greater than 40 s.
5 Conclusion
In summary, we generated two prediction models, SVM and Random Forest, to classify
lung cell lines into three subtypes: normal lung, non-small cell lung cancer and small
cell lung cancer. Both models led to a good prediction of the three cell types. The SVM
model gave more accurate results and required a shorter running time than the Random
Forest model.
1. Samuel, A.: Some studies in machine learning using the game of checkers. ii—recent
progress. IBM J. Res. Dev. 11, 601–617 (1967)
2. Kourou, K., Exarchos, T., Exarchos, K., Karamouzis, M., Fotiadis, D.: Machine learning
applications in cancer prognosis and prediction. Comput. Struct. Biotechnol. J. 13, 8–17
3. Zemouri, R., Zerhouni, N., Racoceanu, D.: Deep learning in the biomedical applications:
recent and future status. Appl. Sci. 9, 1526 (2019)
4. Gulshan, V., et al.: Development and validation of a deep learning algorithm for detection of
diabetic retinopathy in retinal fundus photographs. JAMA 316, 2402–2410 (2016)
5. Inamura, K.: Lung cancer: understanding its molecular pathology and the 2015 WHO
classification. Front Oncol. 7, 193 (2017)
6. “what-is-cancer”.
7. Jiang, L., Xiao, Y., Ding, Y., Tang, J., Guo, F.: Discovering cancer subtypes via an accurate
fusion strategy on multiple profile data. Front. Genet. 10, 20 (2019)
8. Wu, M., et al.: Prediction of molecular subtypes of breast cancer using BI-RADS features
based on a “white box” machine learning approach in a multi-modal imaging setting. Eur.
J. Radiol. 114, 175–184 (2019)
9. Aruna, S., Rajagopalan, S.: A novel SVM based CSSFFS feature selection algorithm for
detecting breast cancer. Int. J. Comput. Appl. 31(8), 14–20 (2011)
10. de Souto, M., Costa, I., de Araujo, D., Ludermir, T., Schliep, A.: Clustering cancer gene
expression data: a comparative study. BMC Bioinf. 9, 497 (2008)
11. Kakushadze, Z., Yu, W.: *K-means and cluster models for cancer signatures. Biomol.
Detect. Quantification 13, 7–31 (2017)
12. Lloyd, S.: Least squares quantization in PCM. IEEE Trans. Inf. Theor. 28, 129–137 (1982)
13. Wang, X., et al.: Subtype-specific secretomic characterization of pulmonary neuroendocrine
tumor cells. Nat. Commun. 10, 3201 (2019)
14. Borromeo, M., et al.: ASCL1 and NEUROD1 reveal heterogeneity in pulmonary
neuroendocrine tumors and regulate distinct genetic programs. Cell Rep. 16, 1259–1272
15. Augustyn, A., et al.: ASCL1 is a lineage oncogene providing therapeutic targets for high-
grade neuroendocrine lung cancers. Proc. Natl. Acad. Sci. U.S.A. 111, 14788–14793 (2014)
16. Liu, S., et al.: Feature selection of gene expression data for cancer classification using double
RBF-kernels. BMC Bioinformatics 19, 396 (2018)
17. Chen, H., Zhang, Y., Gutman, I.: A kernel-based clustering method for gene selection with
gene expression data. J. Biomed. Inf. 62, 12–20 (2016)
Autonomous Vehicle Communication in V2X
Network with LoRa Protocol
Abstract. Weak short-range wireless signals and security issues degrade
communication in Vehicle-to-Vehicle or Vehicle-to-Infrastructure (V2X)
networks. In this study, we propose a system based on the Long Range (LoRa)
protocol and the Long Range Wide-Area Network (LoRaWAN) to reduce
communication latency and minimize the data size in V2X networks. The
experiment shows that the proposed system can enhance the overall performance
and reduce the latency in V2X networks. Moreover, the security of data
transmission is increased.
1 Introduction
of IoT in the automotive industry will grow 37 times from 2014 to 2020. Moreover, it is
expected that by 2021, additional IoT devices for vehicles in the US connected car
market will reach $18.1B. IHS Markit believes that the installation of IoT devices and
sensors will rise from $27B in 2017 to $73B in 2025.
In Cooperative Intelligent Transport Systems (C-ITS), which are based on V2X, a
safe, connected cloud-based service is emphasized. Independent vehicles
from different manufacturers are therefore allowed to share real-time hazard
information via the cloud service. C-ITS is highly likely to reduce autonomous vehicle
issues and road accidents [1]. However, the usage of V2X communication is still limited
by various restrictions, such as the security issue of broadcasting harmful messages [2].
Consider a scenario: if a car ahead on the highway has an emergency, it sends
a signal via an internet-connected device to the C-ITS cloud [1]. Then, the information
is forwarded to the corresponding cloud service of the vehicles behind that car.
However, internet coverage is never 100% guaranteed in reality, which means that
the information may fail to reach other vehicles.
Researchers are now focusing on different ways to increase the security in this field
[2]. We propose a method that simplifies the connection and location prediction to
enhance the efficiency and stability of communication.
Tesla, a good example of a new-generation IoT vehicle, maintains a persistent
3G cellular connection to the internet and includes geolocation information.
Meanwhile, the attitude and acceleration of the car are detected by sensors. The sensors
detect and collect all data generated by all devices, such as energy consumption,
wheel position, brake and emergency braking position, climate system, seat position,
rearview mirror, door handle, etc. Unlike before, all data related to the
car will be shared with other systems, infrastructure and vehicles. Therefore, this
highly private information [3] should be handled well and processed in an efficient
way.
Therefore, the demand for Internet access will rapidly increase. Manufacturers
are mainly adding Long Term Evolution (LTE) to provide vehicle Internet
connection. However, the latency and bandwidth of LTE are still two big challenges [3].
Furthermore, vehicles are high-speed moving objects and may drive across
different geographical areas.
While others discuss how to improve the performance of LTE, we propose a
novel approach that uses the LoRa protocol and LoRaWAN to replace LTE in
the communication system. Based on the LoRa protocol and LoRaWAN, we redefine
the way vehicles and IoT devices communicate in V2X. To provide better
traffic prediction, machine learning has been applied in route prediction algorithms [4].
Meanwhile, we suggest a simple way to predict and monitor the route. Based on the
location of the vehicle, the proposed system can predict the next closest gateways. The
quick decision and stable connection will increase the overall performance in V2X
Despite the various benefits that V2X communication brings to autonomous vehicles,
the connection method is still one of the bottlenecks for improving the overall
performance and safety of self-driving. In this paper, we use the LoRa protocol and
LoRaWAN in the V2X communication network instead of LTE, and we provide
another way to predict the driving direction of the vehicle and filter gateways, which are closer
400 Y. Cheung et al.
2 Background
With the current development of autonomous vehicles, sensors, IoT and monitoring
devices generate large amounts of data. These data are exchanged and processed
throughout the transportation system. When a large amount of data, including location,
vehicle status and other related conditions, is used to assist self-driving, the latency of
data processing and the delay of exchange increase. If secure data transmission or
early prediction of location can be achieved by changing the communication method,
autonomous vehicles can focus on other tasks. In this paper, we implement
LoRaWAN in V2X with the LoRa protocol. Most applications of the LoRa protocol
and LoRaWAN are in IoT, but new-generation vehicles can be treated as
a product that combines IoT components.
2.1 LoRa
LoRa uses a proprietary spread spectrum modulation that is similar to, and a derivative
of, Chirp Spread Spectrum (CSS) modulation. This allows LoRa to trade off data rate for
sensitivity within a fixed channel bandwidth by selecting the amount of spread used,
a selectable radio parameter that ranges from 7 to 12. This spreading factor determines
both the data rate and the sensitivity of the radio.
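The trade-off between spreading factor and data rate can be illustrated numerically with the commonly cited LoRa bit-rate relation $R_b = SF \cdot \frac{BW}{2^{SF}} \cdot CR$. This relation, the 125 kHz bandwidth and the 4/5 coding rate are not stated in this paper; they are assumptions used only for this sketch, and `lora_bit_rate` is a hypothetical helper name.

```python
def lora_bit_rate(spreading_factor, bandwidth_hz=125_000, coding_rate=4 / 5):
    """Nominal LoRa bit rate in bit/s for a given spreading factor (sketch)."""
    if not 7 <= spreading_factor <= 12:
        raise ValueError("spreading factor must be between 7 and 12")
    # Each symbol carries SF bits and lasts 2**SF / BW seconds;
    # the coding rate accounts for forward error correction overhead.
    return spreading_factor * (bandwidth_hz / 2 ** spreading_factor) * coding_rate

# Nominal bit rate for every selectable spreading factor.
rates = {sf: lora_bit_rate(sf) for sf in range(7, 13)}
```

Under these assumptions the rate falls monotonically as the spreading factor grows, which is the price paid for higher receiver sensitivity.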
In addition, LoRa uses Forward Error Correction coding to improve resilience
against interference. LoRa's long range is characterized by extremely high wireless link
budgets, around 155 dB to 170 dB. Data at the gateway are transmitted using Frequency
Shift Keying (FSK). Communication is fully bidirectional.
2.2 LoRaWAN
LoRaWAN defines the communication protocol and system architecture for the
network, while the LoRa physical layer enables the long-range communication link.
LoRaWAN is also responsible for managing the communication frequencies,
data rate, and power for all devices. Devices in the network are asynchronous
and transmit when they have data available to send. Data transmitted by an end-node
device is received by multiple gateways, which forward the data packets to a
centralized network server. The network server filters duplicate packets, performs
security checks, and manages the network. Data is then forwarded to the application
server. The technology shows high reliability under moderate load and is bidirectional.
LoRaWAN is a very useful technology in geolocation applications. Thanks to its
long-range feature, it can reach up to 15 km in rural areas. However, in urban areas,
where reception is more sensitive to the environment, it can only cover 5 km.
3 System Design
There are four components in the system: the end-node, the gateways, the server and
the application. The end-node consists of connected sensors in the vehicle and sends
data using the LoRaWAN protocol, and the gateways are stations that receive and
transmit data within range. The server, for which The Things Network (TTN) is used in
the experiment, processes the packets forwarded from the gateways. Next, TTN routes
messages to the application, which calculates the position.
3.1 End-node
The end-node is responsible for sending the data acquired from the GPS receiver
through the LoRa module. The GPS module is used to determine the geolocation, and its
coordinates are transmitted over LoRaWAN as the payload of a packet. The end-node
sends uplink messages to the network server.
3.2 Gateway
The function of the gateway is routing the data received from the end-node to the
server. In order to estimate and predict the location of self-driving vehicles, the cal-
culation of receiving time of the packet from each gateway is needed to apply the
The protocol between the gateway and the server is implemented in a binary called the
“packet forwarder” that runs inside the gateway. There is no authentication of the
gateway or the server, and acknowledgements are only used for network quality
assessment, not to correct lost packets. This protocol only allows certain types of
packets to be exchanged between the gateway and the server.
3.4 Application
The application mainly consists of a database and a program. The data obtained from
TTN will be parsed and stored. Then it publishes device activations and messages.
4 Formatting Message
Data transmitted from the end-node to the gateway is simplified. Rather than sending
all detailed information, we mainly send the important information as numeric codes
that represent actions and situations. On the vehicle side, the action is much more
important than other information (Fig. 2).
For testing, we defined an 8-digit number string format to represent a message.
Since actions and statuses can be categorized, vehicles can read the corresponding code
and act immediately.
We explain the format by the following example,
03 15 012 4
The first two digits indicate the action category: 03 means that the vehicle has to stop.
The next two digits give the value for that action: 15 means that the vehicle
needs to stop after 15 miles, so the autonomous vehicle has a longer buffer to slow down
and adjust its speed. The following three digits give the reason: 012 is the
traffic light. Finally, the last digit qualifies the reason: in the example, 4 means that
the traffic light is currently red.
Therefore, autonomous vehicles can react and send control signals by reading just the
first four digits, while the edge computer, server, and applications can process
other operations using the last four digits.
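A minimal Python sketch of how such a message could be split into its four fields is given below. The field layout (2 + 2 + 3 + 1 digits) follows the example above, while the names `V2XMessage` and `parse_message` and the field names are hypothetical; the paper does not publish its parser or its code tables.

```python
from typing import NamedTuple

class V2XMessage(NamedTuple):
    action: str        # first two digits: action category, e.g. "03" = stop
    value: int         # next two digits: value for that action
    reason: str        # next three digits: reason code, e.g. "012" = traffic light
    reason_state: str  # last digit: qualifier of the reason, e.g. "4" = red

def parse_message(raw):
    """Split an 8-digit message string (spaces allowed) into its fields."""
    digits = raw.replace(" ", "")
    if len(digits) != 8 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected exactly 8 digits")
    return V2XMessage(digits[0:2], int(digits[2:4]), digits[4:7], digits[7])

message = parse_message("03 15 012 4")
```

Because the action and its value occupy the first four digits, a vehicle can act on `message.action` and `message.value` without interpreting the reason fields.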
5 Security
LoRaWAN applies security in two layers, TTN and application. The TTN layer verifies
the authenticity of the end-node, while the application layer ensures that operators of
the TTN layer cannot access the application data.
There are two unique 128-bit session keys, one for the TTN layer (NwkSKey) and
another for the application layer (AppSKey). The data is transmitted under the LoRa
protocol by radio wave. Although the radio waves themselves cannot be encrypted, the
biggest advantage is that the CSS signal is low in power consumption and is not easily
scanned and intercepted by other devices.
6 Defining Problem
Assume that the coordinates of the vehicle are $(x_0, y_0, z_0)$ at time t = 0 and
$(x_1, y_1, z_1)$ at t = 1. Atan2 is applied to calculate the driving direction of the
vehicle, and the directions are defined as follows:
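#include <cmath>
#include <string>

// enumerated counterclockwise, starting from east = 0:
enum compass {
    E = 0, NE = 1,
    N = 2, NW = 3,
    W = 4, SW = 5,
    S = 6, SE = 7
};
const std::string headings[8] = { "E", "NE", "N", "NW", "W", "SW",
                                  "S", "SE" };
// actual conversion code: (dx, dy) = (x1 - x0, y1 - y0) is the movement vector
int octant_of(double dx, double dy) {
    const double PI = std::acos(-1.0);
    double angle = std::atan2(dy, dx);  // heading in radians, in [-PI, PI]
    // scale to eighths of a turn, shift to a positive range, wrap to 0..7
    return static_cast<int>(std::round(8 * angle / (2 * PI) + 8)) % 8;
}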
The great-circle distance between two points identified by GPS coordinates is
determined using the Haversine formula. This is the shortest distance
over the surface of the earth, giving an as-the-crow-flies distance between the two
points.
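$\Theta$ is the central angle between any two points on a sphere, where d is the
distance and r is the radius of the earth:
$$\Theta = \frac{d}{r} \tag{1}$$
The haversine of the central angle is
$$\operatorname{hav}(\Theta) = \operatorname{hav}(\varphi_2 - \varphi_1) + \cos(\varphi_1)\cos(\varphi_2)\operatorname{hav}(\lambda_2 - \lambda_1) \tag{2}$$
where $\varphi_1$ and $\varphi_2$ are the latitudes of point 1 and point 2, respectively, and
$\lambda_1$ and $\lambda_2$ are the corresponding longitudes. Finally, the haversine of an
angle $\theta$ is given by (Fig. 4):
$$\operatorname{hav}(\theta) = \sin^2\left(\frac{\theta}{2}\right) = \frac{1 - \cos(\theta)}{2} \tag{3}$$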
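The following is a minimal sketch of the great-circle distance computed with the Haversine formula. The function names hav and haversineKm are hypothetical, the inputs are assumed to be in degrees, and the mean Earth radius of 6371 km is an assumed value, since the paper does not state which radius is used.

```cpp
#include <cassert>
#include <cmath>

// Mean Earth radius in kilometres (assumed; not given in the paper).
constexpr double kEarthRadiusKm = 6371.0;

// Haversine of an angle: hav(theta) = sin^2(theta / 2).
double hav(double theta) {
    const double s = std::sin(theta / 2.0);
    return s * s;
}

// Great-circle distance in km between two points given in degrees.
double haversineKm(double lat1Deg, double lon1Deg,
                   double lat2Deg, double lon2Deg) {
    assert(std::fabs(lat1Deg) <= 90.0 && std::fabs(lat2Deg) <= 90.0);
    const double degToRad = std::acos(-1.0) / 180.0;
    const double phi1 = lat1Deg * degToRad;
    const double phi2 = lat2Deg * degToRad;
    const double dPhi = phi2 - phi1;
    const double dLambda = (lon2Deg - lon1Deg) * degToRad;
    // Haversine of the central angle between the two points.
    const double h = hav(dPhi) + std::cos(phi1) * std::cos(phi2) * hav(dLambda);
    // Invert the haversine to get the central angle; clamp against rounding.
    const double centralAngle = 2.0 * std::asin(std::sqrt(std::fmin(1.0, h)));
    // Distance is radius times central angle.
    return kEarthRadiusKm * centralAngle;
}
```

Two points on the equator that are 90 degrees of longitude apart are a quarter of the circumference apart, which gives a quick sanity check of the function.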
After obtaining the direction of the vehicle, we narrow down the number of nearby
gateways. The closer gateways can then be ready to transmit packets back to the
vehicle, by calculating the displacement (d) between the vehicle and the gateway:
$$d = \sqrt{(x_v - x_g)^2 + (y_v - y_g)^2 + (z_v - z_g)^2} \tag{4}$$
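A minimal sketch of this selection step is given below. It assumes that the vehicle and gateway positions are already expressed in a common Cartesian frame and that the closest gateway is simply the one with the smallest displacement; the type Point3 and the functions displacement and nearestGateway are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// A position in a shared Cartesian frame (units are up to the caller).
struct Point3 {
    double x, y, z;
};

// Straight-line displacement between the vehicle v and a gateway g.
double displacement(const Point3& v, const Point3& g) {
    const double dx = v.x - g.x;
    const double dy = v.y - g.y;
    const double dz = v.z - g.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Index of the gateway with the smallest displacement from the vehicle.
std::size_t nearestGateway(const Point3& vehicle,
                           const std::vector<Point3>& gateways) {
    assert(!gateways.empty());
    std::size_t best = 0;
    double bestDist = displacement(vehicle, gateways[0]);
    for (std::size_t i = 1; i < gateways.size(); ++i) {
        const double d = displacement(vehicle, gateways[i]);
        if (d < bestDist) {
            bestDist = d;
            best = i;
        }
    }
    return best;
}
```

In practice the candidate list passed to nearestGateway would first be filtered by the heading of the vehicle, so that only gateways ahead of it are considered.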
7 Evaluation
Our evaluation design is simplified to one vehicle with four gateways. In the
experiments, we mainly use the LoRa protocol and LoRaWAN in a V2X
communication system to measure the overall connection performance, and then
verify whether the latency of data transmission can be reduced.
To evaluate the transmission performance, a gateway was installed inside an office.
A car carrying the LoRa device left the starting point (the office) at time t = 0 min and
returned to the starting point at t = 26 min. The route of the car is shown in Fig. 6.
We sent five packets every minute and then calculated the received rate by:
$$\mathrm{Rate} = \frac{\mathit{PacketReceived}}{\mathit{PacketSent}} \times 100\% \tag{5}$$
where PacketReceived is the data stored in the TTN database and PacketSent is the data
sent from the end-node and vehicle.
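As a small worked example of the received rate, the sketch below computes the percentage for one measurement interval. The function name receivedRatePercent is hypothetical, and packet counts are assumed to be collected per minute.

```cpp
#include <cassert>

// Percentage of sent packets that were stored in the TTN database.
double receivedRatePercent(int packetsReceived, int packetsSent) {
    assert(packetsSent > 0);  // the rate is undefined when nothing was sent
    return 100.0 * packetsReceived / packetsSent;
}
```

With five packets sent in a minute, losing one packet gives a rate of 80%, and receiving all five gives 100%.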
The results are shown in Fig. 7. From t = 6 to t = 15, there were tall
buildings between the car and the gateway, so the rate dropped. Next, between t = 12
and t = 17, the car stopped for 3 min and then drove back. The rate dropped to
80% due to buildings and a hill.
It was observed that, even though the environment affects the receiving rate, the
stability of LoRa is still good enough for use in a V2X communication network.
Moreover, this implies that vehicles will not lose connection in the V2X network and
do not need to search for a network to connect to.
The latency includes the transmission time spent on processing, queueing, and
propagation, plus a waiting time caused by the regulatory duty cycle. The total
transmission time is calculated as $T_{total} = T_{tx} + T_w$, where $T_{tx}$ is the time
for processing and propagation and $T_w$ is the waiting time. The waiting time is
calculated by:
$$T_w = \frac{P_{busy,all}}{\sum_{i=1}^{c} \mu_i + \lambda} \tag{6}$$
where $P_{busy,all}$ is the Erlang-C probability that all servers are busy.
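To illustrate the Erlang-C term, the sketch below computes the probability that all servers are busy under a simplifying assumption that is not made in the paper: all c servers have the same service rate, so the offered load is a single number a = λ/μ. The function names erlangC and totalLatency are hypothetical, and the waiting time is passed in as an input rather than derived here.

```cpp
#include <cassert>

// Erlang-C probability that an arriving request finds all c servers busy.
// offeredLoad is a = lambda / mu in Erlangs; the queue is stable only if a < c.
// Assumes identical servers (a simplification of the per-server rates mu_i).
double erlangC(int c, double offeredLoad) {
    assert(c >= 1 && offeredLoad >= 0.0 && offeredLoad < c);
    double term = 1.0;  // a^k / k!, starting at k = 0
    double sum = 0.0;   // accumulates a^k / k! for k = 0 .. c - 1
    for (int k = 0; k < c; ++k) {
        sum += term;
        term *= offeredLoad / (k + 1);
    }
    // After the loop, term holds a^c / c!.
    const double last = term * c / (c - offeredLoad);
    return last / (sum + last);
}

// Total latency as the sum of transmission time and waiting time.
double totalLatency(double tTx, double tW) {
    return tTx + tW;
}
```

For a single server the Erlang-C probability reduces to the utilisation a, and for two servers with a = 1 it equals 1/3, which matches the standard tabulated values.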
408 Y. Cheung et al.
The latency results are shown in Fig. 8. We recorded the transmitting and receiving
times in milliseconds, rounded to one decimal place. As the car drove away from the
gateway, the latency increased; it decreased again as the car drove back toward the
gateway. The average latency is 16.7 ms, which is acceptable for data transmission.
8 Discussion
In this paper, we mainly utilized the LoRa protocol and LoRaWAN in a V2X
communication system to enhance the overall connection performance and, at the same
time, reduce the latency of data transmission. In the current V2X communication network,
LTE has its own limitations and connection issues. By redefining the message format,
we can reduce the packet size. Because the format is a trade-off between transmission
speed and content, the data and information that can be carried are limited.
Apart from the potential performance advantages of using LoRa in the whole V2X
communication system, the cost decreases because fewer gateways are needed to cover a
wide area. The power consumption of LoRa devices is low, so vehicles and gateways
save more energy than with LTE. The transmission parameters of the LoRa
protocol are set by the Adaptive Data Rate (ADR) scheme: the Spreading Factor,
Bandwidth, and Transmission Power control the uplink of LoRa.
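To make the effect of these parameters concrete, the sketch below evaluates the standard LoRa relations for symbol duration, T_sym = 2^SF / BW, and nominal bit rate, R_b = SF × CR / T_sym. The coding rate CR = 4/(4 + n) is not discussed in this paper and is included here only as an assumed extra parameter; the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Symbol duration in seconds for spreading factor sf and bandwidth in Hz.
double symbolTimeSec(int sf, double bandwidthHz) {
    assert(sf >= 6 && sf <= 12 && bandwidthHz > 0.0);
    return std::ldexp(1.0, sf) / bandwidthHz;  // 2^sf / BW
}

// Nominal bit rate in bit/s; codingRateIndex n in 1..4 means a code rate of 4/(4+n).
double bitRateBps(int sf, double bandwidthHz, int codingRateIndex) {
    assert(codingRateIndex >= 1 && codingRateIndex <= 4);
    const double codeRate = 4.0 / (4.0 + codingRateIndex);
    return sf * codeRate / symbolTimeSec(sf, bandwidthHz);
}
```

At a bandwidth of 125 kHz, raising the spreading factor from 7 to 12 makes each symbol 32 times longer, which is the trade-off between range and airtime that ADR has to balance.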
9 Conclusion
In conclusion, we presented the LoRa protocol and LoRaWAN and implemented these
technologies in a V2X communication system to provide a much more reliable
connection between autonomous vehicles and infrastructure. The aim of this work is
to enhance the performance of data transmission. In addition, the claims of avoiding
connection loss and reducing latency are verified by our evaluations.
However, we mainly focus on one autonomous vehicle in this paper. In real
cases, there are many vehicles in the transportation system. Therefore, it is important
to handle load balancing and decide how to weight the gateways. The future
Autonomous Vehicle Communication in V2X Network with LoRa Protocol 409
1. Qiu, H., Noura, H., Qiu, M., Ming, Z., Memmi, G.: A user-centric data protection method
for cloud storage based on invertible DWT. IEEE Trans. Cloud Comput. (2019)
2. Haidar, F., Kaiser, A., Lonc, B.: On the performance evaluation of vehicular PKI protocol
for V2X communications security. In: Proceeding of IEEE 86th Vehicular Technology
Conference (VTC-Fall) (2017)
3. Petit, J., Shladover, S.: Potential cyber attacks on automated vehicles. IEEE Trans. Intell.
Transp. Syst. 16(2), 546–556 (2015)
4. Lv, Y., Duan, Y., Kang, W., Li, Z., Wangi, F.: Traffic flow prediction with big data: a deep
learning approach. IEEE Trans. Intell. Transp. Syst. 16(2), 865–873 (2015)
5. Qiu, H., Qiu, M., Lu, Z., Memmi, G.: An efficient key distribution system for data fusion in
V2X heterogeneous networks. Inf. Fusion 50, 212–220 (2019)
6. Kontzer, T.: Driving Change: Volvo’s “Drive Me” Project to Make Self-Driving Cars
Synonymous with Safety (2016).
7. Wang, X., Mao, S., Gong, M.: An overview of 3GPP cellular vehicle-to-everything
standards. GetMobile: Mob. Comput. Commun. 21(3), 19–25 (2017)
8. Abboud, K., Omar, H., Zhuang, W.: Interworking of DSRC and cellular network
technologies for V2X communications: a survey. IEEE Trans. Veh. Technol. 65(12),
9457–9470 (2016)
9. Atallah, R., Khabbaz, M., Assi, C.: Vehicular networking: a survey on spectrum access
technologies and persisting challenges. Veh. Commun. 2(3), 125–149 (2015)
10. MacHardy, Z., Khan, A., Obana, K., Iwashina, S.: V2X access technologies: regulation,
research, and remaining challenges. IEEE Commun. Surv. Tutorials 20(3), 1858–1877
11. Feng, Y., Hu, B., Hao, H., Gao, Y., Li, Z.: Design of distributed cyber-physical systems for
connected and automated vehicles with implementing methodologies. IEEE Trans. Ind. Inf.
14(9), 4200–4211 (2018)
12. Li, L., Ota, K., Dong, M.: Humanlike driving: empirical decision-making system for
autonomous vehicles. IEEE Trans. Veh. Technol. 67(8), 6814–6823 (2018)
13. Zhao, Z., Chen, W., Wu, X., Peter, C.Y., Chen, P.C., Liu, J.: LSTM network: a deep learning
approach for short-term traffic forecast. IET Intell. Transp. Syst. 11(2), 68–75 (2017)
14. Lukosevicius, M., Jaeger, H.: Reservoir computing approaches to recurrent neural network
training. Comput. Sci. Rev. 3(3), 127–149 (2009)
15. Guo, L., et al.: A secure mechanism for big data collection in large scale internet of vehicle.
IEEE Internet of Things J. 4(2), 601–610 (2017)
16. Pacheco, J., Hariri, S.: IoT security framework for smart cyber infrastructures. In: IEEE
International Workshops on Foundations and Applications of Self* Systems, pp. 242–247.
IEEE, September 2016
410 Y. Cheung et al.
17. Lloret, J., Tomas, J., Canovas, A., Parra, L.: An integrated IoT architecture for smart
metering. IEEE Commun. Mag. 54(12), 50–57 (2016)
18. Sornin, N., Luis, M., Eirich, T., Kramp, T., Hersent, O.: LoRaWAN Specification, pp. 1–82,
January 2015
19. “LoRaWAN Network Server Demonstration: Gateway to Server Interface Definition,”
Semtech, Application note, pp. 1–19, July 2015
20. Augustin, A., Yi, J., Clausen, T., Townsley, W.: A study of LoRa: long range & low power
networks for the Internet of Things. Sensors 16(9), 1466 (2016)
Author Index