0% found this document useful (0 votes)

30 views36 pages

V1 0-Mdpi

Uploaded by

mengxianxin

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

30 views36 pages

V1 0-Mdpi

Uploaded by

mengxianxin

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

You are on page 1/ 36

1

1 Article

2 Research on anonymous network application classification and

3 fingerprint attack technology
4 Xiaoyang Wang1, Xiangyang Xu2 and Firstname Lastname 2,*

5 1
School of Information and Computer Science, Beijing Jiaotong University; [email protected]
6 2
Rail Transit College, Suzhou University; [email protected]
7 * Correspondence: [email protected]

8 Abstract: With the development of the Internet, people pay more and more attention to the
9 privacy protection when browsing the web. More users tend to use anonymous communication
10 tools, such as The Second-Generation Onion Router (Tor), which is currently the most used
11 anonymous communication system. It can protect user privacy however some criminals use this
12 feature to carry out illegal activities. In this article, research on anonymous network application
13 classification and fingerprint attack technology. On the one hand, it provides technical and
14 theoretical support for network supervisors to purify network environment; On the other hand,
15 the vulnerability of Tor system can be found to improve the Tor system and provide anonymity
16 for legitimate users. This article mainly studies the identification and application classification of
17 Tor traffic, and further carries out fingerprint attack to identify the specific page visited by the
18 client. The main contributions of this paper are: (1) In view of the low accuracy of Tor anonymous
19 traffic identification and traffic classification, this paper proposes the XL-Stacking model based on
20 integrated learning. K-Nearest-Neighbor (KNN), XGBoost and Random-Forest algorithm are
21 selected for the first layer of the stacking model. Logistic-Regression algorithm is used in the
22 second layer, which can achieve higher classification accuracy under smaller feature dimensions.
23 The algorithm can quickly identify whether the user's traffic is dark web traffic, and the accuracy
24 rate can reach 99.7% on the data set collected by itself. Further classification of dark web traffic,
25 Citation: To be added by editorial you can quickly locate the traffic categories, which are divided into the following eight categories:
26 staff during production. video, web browsing, chat, file transfer, mail, P2P, audio, VOIP. On the publicly available UNB-
27 CIC dataset, the accuracy rate is 90.3% and the recall rate is 87.4%, which is better than the
Academic Editor: Firstname Last-
28 classification performance of similar work. (2) In view of the large amount of data required for
name
29 website fingerprint attacks. This paper proposes a spatiotemporal BiGRU-ResNet fingerprint
30 Received: date attack model, which makes full use of the Tor website fingerprint sequence, including time, space
31 Revised: date and website information triples, and integrates it into Spatiotemporal multi-modality improves
Accepted: date
32 the efficiency and accuracy of model recognition. In the closed world scene, an accuracy rate of
Published: date
33 98.46% was achieved. When the number of instances on each monitoring page was 100, the
34 accuracy rate reached 87.51%, proving that the model can achieve better performance even with a

Copyright: © 2023 by the authors.

Submitted for possible open access
publication under the terms and
conditions of the Creative Commons
Attribution (CC BY) license
(https://creativecommons.org/license
s/by/4.0/).

3 Remote Sens. 2023, 15, x. https://doi.org/10.3390/xxxxx www.mdpi.com/journal/remotesensing

4 Remote Sens. 2023, 15, x FOR PEER REVIEW 2 of 36
5

35 small training sample. The effect is to reduce the cost for regulators to supervise the Tor
36 anonymous network.

37 Keywords: Anonymous network; Tor; Traffic identification; Website fingerprint attack; Space-time
38 multimodal
39

40 1. Introduction
41 The original intention of the emergence of anonymous networks is to protect user
42 privacy. According to the survey, most users can make reasonable use of anonymous
43 networks, such as anonymous voting, etc. However, some anonymous websites have
44 illegal web pages and use the anonymity of the dark web to conduct some illegal and
45 criminal activities, seriously damaging the green and pure Internet environment. What's
46 more, they endanger the safety of people's lives and property, such as using anonymity
47 to conduct online blackmail, spreading Internet viruses and conducting cyberattacks,
48 and conducting illegal transactions and selling drugs, guns and other illegal activities on
49 the dark web. For example, in 2013, the FBI seized Silk Road, the largest darknet website
50 at the time. This website could only be accessed through the Tor anonymous network. It
51 had a large number of black transactions, and its monthly turnover as high as $1.2
52 million.
53 It can be seen that the monitoring of the dark network is imminent, and
54 strengthening the supervision of the dark network is very necessary to protect the
55 security of user information. At the same time, due to the anonymity of the dark
56 network, the supervision of the supervision department is also subject to greater
57 difficulties and resistance. Fingerprint attack technology can identify the traffic in the
58 dark net network. On the one hand, it can help network administrators or judicial
59 personnel to take corresponding measures, which is conducive to reducing the damage
60 of such cyber-criminal activities to the security of cyberspace and maintaining the
61 security of the Internet. On the other hand, through the attack and prevention of
62 fingerprint identification, the vulnerability of the system can be found, and the Tor
63 system can be improved and perfected. Provide secure anonymity for legitimate users.
64 Therefore, based on the existing research, this paper will further explore the attack and
65 defense technology of Tor anonymous network, in order to realize that anonymous
66 network technology will not be used by criminals and purify the network environment.
67 And through corresponding attack technologies, we can discover the vulnerabilities of
68 the Tor network and improve the Tor network system.
69 Tor uses fixed-size packets of 512 bytes for transmission, called Cells. Each contains
70 a header and a payload, and there are two types of packets. They are control package
71 (Control) and relay package (Relay) respectively. Control packets are used to parse and

6
7 Remote Sens. 2023, 15, x FOR PEER REVIEW 3 of 36
8

72 execute commands related to filling, building, extending, and disconnecting links. Relay
73 packets are used to transmit end-to-end data flows. It includes the identity of the data
74 stream (StreamID), end-to-end integrity check (Digest), length of forwarded data (Len),
75 and forwarding command (CMD). The structures of these two types are shown in Figure
76 1 and Figure 2. They carry end-to-end communication messages between clients and
77 servers. The CircID and Command fields are not encrypted and are used on each OR,
78 and the rest are encrypted.
79 Figure1 Structure of control cell [1]

80 Figure2 Structure of data cell [1]

81 Tor anonymous network traffic identification is to identify whether the user is

82 visiting an ordinary website or a Tor onion routing website. Application classification

83 attacks are based on the characteristics of anonymous traffic, classifying different web
84 application types such as email, file transfer, chat, and extracting traffic characteristics of
85 different application traffic on anonymous networks. Use machine learning methods to
86 establish an application classification model, and use the application classification model
87 to associate users with communication targets, and ultimately identify which type of
88 website the user has visited.
89 In 2017, the Canadian Network Institute proposed to detect and characterize Tor
90 traffic based on time analysis [2]. The feature set is only time-based statistics such as
91 forward packet interval (IAT), backward packet interval, stream duration, etc. KNN,
92 C4.5, Random Forest and other algorithm models can be used to accurately analyze the
93 types of web pages visited by users within 15 seconds, such as chat, audio/video stream,
94 mail, file transfer, etc., but the accuracy of the model needs to be further improved.
95 In 2019, He Y et al. proposed an Obfs4 traffic detection scheme based on two-stage
96 filtering [3]. The high precision and real-time recognition of Obfs4 traffic is realized by
97 using coarse-grained fast filtering and fine-grained accurate recognition [4]. In the
98 coarse-grained filtering stage, randomness detection algorithm is used to detect the
99 randomness of the handshake packet payload in the communication, and the timing
100 characteristics of the packet are used to remove other interference flow. In the fine-
101 grained recognition stage, the accuracy of Obfs4 identification using SVM (support
102 vector machine) algorithm is more than 99%. However, the randomness detection of this

9
10 Remote Sens. 2023, 15, x FOR PEER REVIEW 4 of 36
11

103 method does not perform well in the case of other encrypted traffic interference, and
104 other packets before and after TCP need to be captured, and the space efficiency is poor.
105 Machine learning has high efficiency and accuracy in identifying and classifying
106 Tor traffic. Therefore, this paper selects a machine learning model to identify and
107 classify Tor traffic [5]. In the existing studies, most feature selection of machine learning
108 algorithms lacks feature vectors for Tor traffic. This paper tries to combine Tor routing
109 protocols to find targeted feature vectors, so as to improve the traffic analysis effect and
110 model training efficiency from the source.
111 Based on the fingerprint information in the traffic link, the network fingerprint
112 attack compares the unknown traffic monitored with the known traffic in the fingerprint
113 database to identify the specific websites visited by users. A large number of known
114 website traffic is collected in advance, formed into a fingerprint database, and the
115 fingerprint attack model is trained offline. In the online traffic identification stage, the
116 specific websites visited by users are identified.
117 In recent years, fingerprint attacks based on deep learning neural networks (DNNS)
118 have been researched and developed, and have surpassed the effectiveness of machine
119 learning-based research, using deep learning neural networks for automatic feature
120 extraction, and these attacks are more effective than the original attacks compared to
121 traditional attacks [6].
122 Rimmer et al. proposed the first DNN-based attack - AWF [7] in 2017. Pioneering
123 use of stacked denoising autoencoders (SDAE), long short-term memory (LSTM), and
124 convolutional neural networks (CNN) to automatically select features. Rimmer et al.
125 built one of the largest data sets of network fingerprints, containing more than 3 million
126 network traces. In a closed world of 100 sites [8], the success rate was over 96%.
127 However, the neural network they designed is too simple, and the network layer is
128 shallow, which cannot better extract network features [9].
129 In 2018, Sirinam et al. proposed a deep fingerprint recognition (DF) model [10],
130 which uses CNN model, adopts complex architecture design, and achieves good results
131 on their own dataset. DF attacks on their own large data set, without defense against the
132 closed world accuracy of more than 98%, better than all previous attacks.
133 In recent years, the fingerprint attack based on deep learning neural network
134 (DNN) has been researched and developed [11]. The attack using deep learning neural
135 network to extract fingerprints automatically is more effective than the traditional attack.
136 However, most deep learning algorithms do not make full use of the
137 spatiotemporal information of data packets, and deep learning algorithms need to
138 collect a large number of data samples, which consumes more training time and training
139 costs, and has low iteration efficiency [10]. From a practical point of view, due to the
140 privacy of user traffic, it is impossible to collect large amounts at will, and the network
141 status of the Tor system is constantly changing, and plug-ins are advancing with the
142 times. A good attack model not only needs high accuracy but also needs to be able to

12
13 Remote Sens. 2023, 15, x FOR PEER REVIEW 5 of 36
14

143 adapt to changes in the Tor network system as quickly as possible and reduce the cost
144 and time of model training. The new model proposed in this article solves these
145 problems.
146 This paper proposes the XL-Stacking model based on the stacking model to identify
147 Tor anonymous traffic and classify applications. After verification through experiments
148 and tests, the first layer of the stacked model uses KNN, XGBoost, and random forest
149 algorithms. The second layer uses the Logistic algorithm, which can achieve higher
150 classification accuracy under smaller feature dimensions. Experiments show that the XL-
151 Stacking model can quickly identify whether a user's traffic is darknet traffic. If the
152 algorithm detects that the user's traffic is darknet traffic, it will further classify the
153 darknet traffic and quickly locate the monitored traffic category. Specifically, the
154 following eight categories can be identified: video, web browsing, chat, file transfer,
155 email, P2P, audio, and VOIP.
156 For fingerprint attacks, it aims to identify which specific monitored website a user
157 visits. This article proposes the BiGRU-ResNet model based on spatiotemporal features.
158 According to the characteristics of Tor traffic packets, make full use of the Tor website
159 fingerprint sequence. Including time, space and website information triples, integrating
160 spatiotemporal multi-modality to improve the efficiency and accuracy of model
161 identification. BiGRU-ResNet uses bidirectional GRU to extract temporal features and
162 ResNet to extract spatial features of website fingerprints. On a smaller sample space, the
163 experimental results of our attack show that it performs better than the state-of-the-art
164 attacks, and the time overhead is acceptable [12].
165 This article is organized as follows. Part one, introduction. First, the research
166 background and significance of this article are explained, and then the Tor anonymous
167 communication system is introduced. An overview of Tor anonymous network traffic
168 identification and application classification, as well as the current research status of
169 website fingerprint attacks at home and abroad. The research content and innovation
170 points of this article are summarized. The second part introduces Tor anonymous
171 network traffic identification and application classification model and fingerprint
172 attacks. Proposed XL-Stacking integrated learning model and BiGRU-ResNet model
173 based on spatiotemporal features. Compare with existing research. In the third part, the
174 efficiency and accuracy of the XL-Stacking model and BiGRU-ResNet model are verified
175 through experimental data. Section 4 discusses the results in a broader context and
176 points out the shortcomings of this article and future research directions. Finally,
177 conclusions are drawn in Chapter Five.

178 2. Materials and Methods

179 The identification of Tor anonymous traffic is often described as a website binary
180 classification problem [13]. The main purpose is to identify whether the user is visiting
181 an ordinary web page or a Tor anonymous website. The application classification of Tor

15
16 Remote Sens. 2023, 15, x FOR PEER REVIEW 6 of 36
17

182 anonymous traffic is usually described as a multi-classification problem, which can

183 identify which category of website a user visits. Specific categories include web
184 browsing, chat, video, audio, email, VoIP, P2P and file transfer [14]. Collect a large
185 number of traffic tracking instances from each web page, extract features and train
186 machine learning algorithms. When users browse the web, the traffic is passively
187 collected, passed to the classification model, which identifies Tor traffic and performs
188 application classification.
189 The purpose of the Tor website fingerprint attack is to identify which website the
190 anonymous Tor network traffic specifically visits. The website fingerprint attack
191 technology is an effective method to infer which web page the user browses in the
192 anonymous network [15]. Due to the low-latency nature of the Tor network, Tor does
193 not provide adequate protection against traffic analysis attacks. Previous research
194 mainly focused on the selection of traffic characteristics and the improvement of
195 traditional classification methods. This article studies Tor traffic attacks from the
196 perspective of practicality and Tor protocol.

197 2.1. Tor anonymous network traffic identification and application classification model
198 This chapter proposes the XL-Stacking model for Tor anonymous traffic
199 identification and classification. The overall architecture is shown in Figure 3, which
200 includes five modules: traffic collection, feature acquisition, data preprocessing, model
201 training, identification and classification.

202

203 Figure 3 Tor anonymous traffic identification and classification model architecture

204 The traffic collection module is used to collect raw traffic samples. The feature
205 processing module is used to extract features from the collected traffic. After in-depth

18
19 Remote Sens. 2023, 15, x FOR PEER REVIEW 7 of 36
20

206 analysis of the Tor and Obfs4 protocols, the combined traffic characteristics of
207 handshake packet length characteristics, information entropy characteristics, and time
208 interval characteristics were mainly selected. The data processing module is used to
209 calculate and preprocess correlation features at the granularity of data streams. The
210 model training module learns traffic fingerprint features and trains the parameters of the
211 final classifier. The final trained model is used to predict labels for anonymous
212 communication traffic. That is, identify whether the user traffic is Tor anonymous traffic
213 and classify the Tor anonymous traffic application type. Different XL-Stacking model
214 classifiers are trained for two different scenarios of traffic identification and
215 classification. In these scenarios, the captured unknown label traffic is extracted,
216 preprocessed, and then fed into the model and the classification results are obtained. In
217 the traffic identification scenario, determine whether the input traffic is Tor anonymous
218 network traffic; in the application classification scenario, determine the application type
219 corresponding to the input Tor anonymous traffic.
220 2.1.1. Feature design of Tor anonymous traffic Experimental data set
221 The purpose of extracting features is to effectively distinguish whether it is Tor
222 traffic based on the traffic loaded by web pages, and to further classify the types of Tor
223 applications, so as to effectively identify the categories of websites accessed by
224 anonymous network traffic and strengthen the management of the network
225 environment by regulatory authorities. The quality of features directly affects the quality
226 of results.
227 The Obfs4 plug-in further obfuscates the packet size, reorganizes and randomly fills
228 the packets based on the obfuscated traffic quintuple. Based on the characteristics of
229 random filling, the traffic characteristics related to information entropy can be analyzed.
230 Although Obfs4 hides the surface characteristics of anonymous traffic, it does not have
231 functions such as reordering, random packet insertion, and delay. Therefore, time
232 correlation features such as packet interval time of traffic when accessing different
233 services can be extracted for traffic fingerprint analysis. Liang Di et al. proposed the
234 handshake packet length feature and information entropy feature, and Arash et al.
235 proposed the time interval feature. This article combines the traffic characteristics of
236 these three dimensions and inputs them into the classifier model proposed in this
237 chapter. The specific traffic characteristics are shown in Table 1.

238 Table 1. Traffic characteristics.

Type Feature
Total length of data stream, C2S data packet length, S2C data packet length;
Handshake packet length charac-
Mean, minimum, maximum, total length, quartiles, median, and variance of
teristics
the overall length.

21
22 Remote Sens. 2023, 15, x FOR PEER REVIEW 8 of 36
23

Information entropy characteristics Overall and all-directional packet information entropy.

Arrival time intervals for upstream parties
(average, minimum, maximum, standard value);
The amount of time a stream is active before becoming idle
Time interval characteristics
(average, minimum, maximum, standard);
The amount of time the stream is idle before becoming active
(average, minimum, maximum, standard).
239
240 2.1.2. Choice of base learner
241 The data set was randomly divided into training set and test set according to 8:2.
242 To compare the performance of different classifiers, the main models compared are
243 Gaussian Naive Bayes (GaussianNB), KNN, XGBoost, random forest algorithm, fully
244 connected neural network (Dense), and SVM. To tune the hyperparameters of each
245 model, the GridSearch method in the Scikit-Learn library is mainly used for parameter
246 tuning. Conduct an exhaustive search on the parameter list to find the optimal
247 parameter combination. However, the exhaustive search requires high computing
248 resources and takes a long time. Therefore, we first set the initial parameters in advance
249 and manually adjust the parameters based on experience and observation. After
250 reaching a certain range, use the GridSearch method to adjust parameters.
251 The final experimental results are shown in Table 2. There should be two
252 preliminary analysis reasons for the poor performance of Gaussian Naive Bayes and
253 fully connected neural networks. One is that the time features in the data set are not
254 independent of each other. The other reason is that the samples in the data set are
255 limited and are not suitable for deep neural networks. Finally, selects three models with
256 better results as the first-layer base learner, namely: KNN, XGBoost, and random forest
257 algorithm. The second layer selects a simpler Logistic model as the meta-learner to
258 reduce model complexity.

259 Table 2. Accuracy rate (Acc) under different models.

Models GaussianNB KNN XGBoost Random Forest Dense SVM

ACC 25% 83% 85% 84% 57% 71%
260
261 2.1.3. XL-Stacking model
262 Stacking integrated learning first trains multiple different base learners, and then
263 uses the output of each previously trained model as input to train a meta-learner to
264 obtain a final output. It is required that the learning effect of the base learner should be

24
25 Remote Sens. 2023, 15, x FOR PEER REVIEW 9 of 36
26

265 good and the principles should be as different as possible, and the meta-learner should
266 be simple.
267 The learning effect of Stacking integrated learning does not come from the stacking
268 of multi-layer models, but from the learning capabilities of different learners for
269 different features. Multi-layer aggregation will face more complex over-fitting problems
270 and has limited benefits. Generally, two layers are enough. Therefore, the Stacking
271 model in this article also chooses two layers.
272 The XL-Stacking integrated learning model in this article mainly consists of the
273 following two layers. The first-layer base learner uses XGBoost, random forest, and
274 KNN, as shown in Figure 4. The second-layer meta-learner selects a simpler Logistic
275 model to reduce the complexity of the model, as shown in Figure 5.
276 The XL-Stacking model training steps are as follows:
277 (1) First train each base learner model using five-fold cross-validation on the
278 original training set [16]. Four out of five copies are selected as training data and the
279 remaining one is used as test data [17].
280 (2) After the data is trained, predict the test data to obtain the corresponding
281 prediction result ResultA, and predict the original test set to obtain the corresponding
282 prediction result ResultB. Each model is trained five times, the ResultA obtained five
283 times is combined into one column, and the ResultB results obtained five times are
284 averaged [18].
285 (3) The primary models XGBoost, RF and KNN obtain 3 groups of ResultA and 3
286 groups of ResultB through step (1), from which new training sets and test sets can be
287 formed.
288 (4) Input the new training set into the logistic regression model training, predict the
289 newly generated test group, and obtain the final output result.
290

27
28 Remote Sens. 2023, 15, x FOR PEER REVIEW 10 of 36
29

291

292 Figure 4. XL-Stacking first layer model.

293

294

295 Figure 5. XL-Stacking second layer model

296

297 2.2. Tor network fingerprint attack model based on spatiotemporal characteristics
298 Machine learning methods applied to fingerprint attacks have no obvious effect
299 because the traffic differences between the same type of websites are small. Feature set
300 design and construction will affect the accuracy of model classification results more than
301 using different classifiers. Although some research has proposed designing Tor
302 anonymous traffic feature selection methods with different classification granularities
303 and gaining expert experience in analyzing operating mechanisms such as the Tor

30
31 Remote Sens. 2023, 15, x FOR PEER REVIEW 11 of 36
32

304 network and its plug-ins. However, the selection and determination of features are also
305 based on different assumptions, and there is no way to apply them to real network
306 environments. And as anonymous network access technology is repeatedly upgraded,
307 the existing selected features may no longer be effective.
308 Deep neural networks can more effectively extract fingerprint features of websites,
309 but require a large amount of data and the time cost of model training is very high.
310 Supervisors must collect data sets in advance, this identification model is a static model,
311 and time will affect the accuracy of the classifier [10]. Fingerprint models require large
312 amounts of data, and the accuracy of the model will decrease significantly over time.
313 And since website tracking changes rapidly, attackers must frequently update the
314 tracking database to match user traffic, which weakens the attack in practice. As deep
315 learning continues to develop, another problem is that the network is getting deeper and
316 deeper. Due to the vanishing gradient problem, neural networks with deeper layers are
317 more difficult to train. If you simply increase the number of layers of the neural network,
318 it will not have a big effect. Because backpropagation will pass the gradient to the
319 previous layer, repeated multiplication will make the gradient infinitesimal. As a result,
320 as the number of neural network layers increases, its performance tends to be saturated
321 or even begins to decline. Therefore, a new attack model needs to be proposed to achieve
322 better results in small samples and without gradient disappearance.
323 In order to effectively solve the problem of fingerprint models that require a large
324 amount of data, the accuracy of the model will significantly decrease over time. This
325 paper proposes a deep neural network model based on temporal and spatial features,
326 the BiGRU-ResNet model, which can effectively solve this problem. BiGRU-ResNet uses
327 a two-layer GRU to extract temporal features and an 18-layer ResNet to extract spatial
328 features of website fingerprints. On smaller sample spaces, our attack proves to perform
329 better than state-of-the-art attacks with acceptable time overhead, significantly reducing
330 the amount of training data required to perform website fingerprinting attacks. This
331 shortens the time required for data collection and solves the problem of data instability.
332 The model framework proposed in this chapter is shown in Figure 6.
333

33
34 Remote Sens. 2023, 15, x FOR PEER REVIEW 12 of 36
35

334 Figure 6. BiGRU-ResNet Architecture

335
336 The website fingerprint is sent to the Bi-GRU network to extract temporal features, while it is sent to the 18-layer
337 ResNet network to obtain the spatial features of the website fingerprint. We merge the feature vectors extracted by the
338 GRU network and ResNet network into one vector and send it to the Softmax classifier for classification. See
339 Algorithm 1 for the overall description of the pseudocode. The sequences extracted by Tor cell are used as data set F,
340 which is the built fingerprint database of monitoring websites. Among them, fk is the unknown fingerprint to be
341 tested, which is used as the input vector. After the detection and full connection of the BiGRU-ResNet model, the
342 probability value p is obtained for comparison with λ. If p≥λ, add to the candidates set. Otherwise, it is a fingerprint
343 of a non-supervised page. If fingerprint data exists in candidates and belongs to the previously constructed regulatory
344 page fingerprint, the regulatory page fingerprint is returned. Otherwise, it is classified to the most similar regulatory
345 page.
346
347
348
349 Algorithm 1: Fingerprint attack algorithm based on BiGRU-ResNet
Input： Data set F extracted through Tor cell
Output： matching fingerprint id
Begin：
Function：
candidates←∅
For fk ∈ F do
<Xt-1, Xt-2, Xt > = feature Vector (fu, fk)
P←PBiGRU-ResNet ( fu.id = fk.id <Xt-1, Xt-2, Xt>)
If p ≥ λ
then candidates=candidates ∪< fk, p >
else return unmonitored
end if
end for
If |candidates|>0 and sameIds (candidates)
then return candidates[0].id
else return similarId ( )
end if
end function

36
37 Remote Sens. 2023, 15, x FOR PEER REVIEW 13 of 36
38

350 2.2.1. Spatial dimension information extraction layer

351 We use the residual module RestNet to extract spatial features. An 18-layer ResNet
352 network is designed to learn the spatial characteristics of website fingerprints. As shown
353 in Figure 7, the website fingerprint is a feature vector with a dimension of 5000. ResNet
354 is a deep learning model in the field of image recognition. The model input is usually a
355 two-dimensional feature matrix. Therefore, we need to design a method to convert the
356 fingerprint vector into a two-dimensional feature matrix. We copy the website
357 fingerprint vector into 10 copies and then combine them into a vector with a length of
358 50,000. The vector is then resized into a 224*224 two-dimensional vector matrix, and the
359 matrix is generated to meet the ResNet network input requirements.
360 In Figure 7, the markers in each convolutional layer indicate the kernel size. Wiring
361 represents a skip connection that adds the block's input to its output. The dashed skip
362 connection indicates that the input is down sampled before adding so that its
363 dimensions match the output of the block, resulting in a final output vector dimension of
364 512.
365 The model has 4 independent sequences, each consisting of 2 residual modules. Each
366 block contains 2 convolutional layers. Each convolutional layer is connected to the BN
367 batch processing module and the ReLU nonlinear function, as shown on the left side of
368 Figure 8. Skipping these two convolution operations and adding the input directly to the
369 final ReLU function requires that the output and input of the 2 convolutional layers have
370 the same dimension so that they can be added [19]. If you want to change the number of
371 channels, you need to add an additional 1×1 convolution layer to convert the input into
372 the required shape and then add it, as shown on the right side of Figure 8.

39
40 Remote Sens. 2023, 15, x FOR PEER REVIEW 14 of 36
41

373 Figure 7. ResNet-18 Architecture

374

375 Figure 8. Residual module

376

377 The "skip" connection between ResNet input and output facilitates the optimization of
378 large networks. By simply having deep blocks copy previous blocks, deeper networks
379 can be extrapolated from shallower ones, simplifying larger network optimizations. This
380 facilitates higher-level feature extraction and improves expressiveness. ResNet-18 is both
381 general enough to accept any sequence of similar inputs and powerful enough to
382 perform well on these inputs.
383 2.2.2. Time dimension information extraction layer
384 We use a two-layer GRU network to extract the temporal features of website
385 fingerprints. Because the traffic at time t depends not only on the previous traffic status,
386 but also on the future traffic status. So, we choose the bidirectional GRU network. It can
387 not only extract the previous moments but also extract the impact of future moments on
388 the traffic status, and further extract the website fingerprint timing characteristics.
389 There is an input at each moment, and the hidden layer has two nodes, one for
390 forward calculation and the other for reverse calculation. The output layer is determined
391 by these two values [20]. The calculation of the output layer requires obtaining the input
392 Xt of the current time step t. The forward hidden state and the reverse hidden state at
393 time t-1. And the hidden state Ht at time t is obtained by weighted summation of the
394 hidden states in both directions. And finally get the output Yt. Whh, Whq are the weight

42
43 Remote Sens. 2023, 15, x FOR PEER REVIEW 15 of 36
44

395 coefficients, bh, bq are the deviations corresponding to the hidden state at a certain
396 moment, see formula (1):

(1)

397 ⃗
H t =σ ( X t W xh + ⃗
H t−1 W hh +bh )

398 The input of the BiGRU module in this article uses 1D input. Experiments have
399 proven [10] that 1D input is much faster than 2D input, even though the total number of
400 data in both input dimensions is the same. Analysis suggests that this difference is due
401 to tensor operations, which must process higher-dimensional data. So, it trains faster
402 and provides better classification performance. The GRU input dimension is also the
403 vector dimension of the website fingerprint, which is 5000, and the output is determined
404 by GRU in both directions simultaneously. The output dimension in one direction is 256,
405 so the output dimension of the bidirectional GRU is also 512.
406 2.2.3. Spatiotemporal information fusion layer
407 Features learned from directional inputs are significantly different from those
408 learned from temporal inputs, making it difficult for shared models to find a set of
409 shared weights. In order to effectively combine temporal features and spatial direction
410 features, we take the arithmetic mean of its Softmax output after training each of the
411 above models separately. We perform a vector merging operation on the spatial feature
412 vector output by the ResNet network with dimension 512 and the temporal feature
413 vector output by the BiGRU network with dimension 512. Finally, a website fingerprint
414 feature vector with a dimension of 1024 is generated, and then input to the Dorpout
415 layer and Softmax layer for fingerprint prediction. The Softmax layer will predict the
416 probability that the input website fingerprint belongs to different monitored websites,
417 and the monitored website with the largest probability value is the prediction result [13].
418 In order to further reduce the overfitting of the model, this article uses the Dropout
419 algorithm. Even complex neural networks can prevent overfitting and improve the
420 efficiency of neural networks.
421 In the training phase of the model, the calculation formula of the neural network is
422 as follows:
(2)
423 Set the probability p and enter the neural network calculation of the Dropout
424 algorithm, see formula (3):

45
46 Remote Sens. 2023, 15, x FOR PEER REVIEW 16 of 36
47

(3)

425 Among them, Bernoulli is the Bernoulli random distribution function, which is
426 used to randomly generate a 0-1 probability vector with probability p. Among them, wi
427 and ri represent the weight parameters of the network, and bi is the bias. Finally, the
428 discarded neurons are recovered and replaced, and the above process is repeated.
429 Reduce interactions between hidden nodes by randomly ignoring them. During
430 propagation, neurons with probability p stop working, so the learning process does not
431 become too dependent on local features. Overfitting mostly occurs in fully connected
432 layers and is less of a problem in convolutional layers. Through experiments, it is found
433 that setting the probability value p of Dropout to 0.5 during model training can improve
434 the robustness of the model and prevent overfitting.
435 The normalization mentioned in the previous section helps the model learn and
436 generalize to new data. So not only do you need to normalize the data before entering
437 the model, but you should also consider normalizing it after each transformation of the
438 network. Batch Normalization (BN) is a type of layer proposed by Loffe and Szegedy in
439 2015 [21]. The BatchNormalization layer receives an axis parameter, which specifies
440 which feature axis should be normalized. The main advantage of batch normalization is
441 that it helps gradient propagation and improves classification performance, learning
442 faster while maintaining or even improving accuracy.
443 Therefore, we combine BN and Dropout to achieve improvements in both
444 performance and generalization. However, adding a BN layer requires additional
445 training time, which increases the training time by approximately twice per echo
446 compared to the model without BN applied. However, we believe that BN is worth
447 applying because the additional training time is compensated by the fast-learning rate
448 and ultimately enables higher test accuracy. In the BiGRU-ResNet model, we apply BN
449 after convolutional and fully connected layers [22].

450 2.3. Experimental data set

451 2.3.1. Traffic classification data set
452 In order to better compare with similar work, the ISCX Tor data set released by the
453 Canadian Institute of Cyber Security (UNB-CIC) was selected for experimental
454 evaluation and verification. This dataset contains 8 traffic types (browsing, chat, audio
455 streaming, video streaming, mail, VOIP, P2P and file transfer) from representative
456 applications such as Facebook, Skype, Spotify, Mail, etc.
457 2.3.2. Traffic fingerprint attack experimental data set
458 Fingerprint datasets are usually divided into two types: one is a closed-world
459 dataset and the other is an open-world data set. The open world dataset contains a large

48
49 Remote Sens. 2023, 15, x FOR PEER REVIEW 17 of 36
50

460 number of unmonitored web fingerprint instances and therefore requires a large amount
461 of data. For the sake of user privacy and comparison with similar work, this article
462 selects the data collected by Rimmer et al. in 2018. This is currently the largest data set
463 for website fingerprinting attacks, and many studies have been conducted on this data
464 set.
465 1. Closed world: Rimmer et al. visited the homepages of 1200 most popular websites
466 based on Alexa data. Start b3 filtering the list of popular websites and removing
467 duplicate entries. Data for these 1,200 websites were collected in four iterations,
468 each containing 300 websites. Network traces were collected for four iterations over
469 approximately 14 days starting in January 2017. After collecting data on 3.6 million
470 page views, invalid entries were filtered out. These entries are caused by timeouts,
471 browser or Selenium driver crashes. Additionally, Rimmer et al. filtered out web
472 pages that displayed verification codes on every visit. Finally, the dataset is
473 balanced, fixing the same number of traces for each site, ensuring an even
474 distribution of instances across sites.After this filtering process, the Rimmer et al.
475 closed world dataset consists of 900 websites, each with 2500 valid web traces. Call
476 this data set CW900. Similarly, for datasets consisting of subsets of the websites,
477 corresponding notations are used: the datasets for the top 100, 200, and 500
478 websites are called CW100, CW200, and CW500, respectively [7].
479 2. 2) Open World: Since the open world data is only used for testing purposes and not
480 for training the model, only one instance is collected for each page in the open
481 world. Collected 400,000 web traffic on Alexa website. In addition, an additional
482 2,000 test traces (400,000 total) were collected for each site of the monitored closed
483 CW200. Finally, 800,000 pieces of traffic data were evaluated in the open world, half
484 from the closed world and half from the open world. Rimmer et al.'s dataset
485 represents a 4x increase compared to the largest dataset in previous work [7]. The
486 complete data set composition is shown in Table 3:
487
488 Table 3. Dataset Composition
Data Website number Instances per website The length of trace
CW100 100 2500 5000
CW200 200 2500 5000

CW300 300 2500 5000

CW500 500 2500 5000

CW900 900 2500 5000

51
52 Remote Sens. 2023, 15, x FOR PEER REVIEW 18 of 36
53

monitored:200 monitored:2000
CW200_400000 5000
unmonitored:400000 unmonitored:1

489

490 2.4. Experimental design

491 2.4.1. Traffic application classification experimental design
492 This experiment was run on Jupyter Notebook configured with Tensorflow
493 environment. The specific configuration parameters are shown in Table 4.
494
495 Table 4. The experimental configuration
Lab environment Configuration instructions
operating system Win X64

memory 8G
harddisk 256GB
CPU 4-core GPU
processor AMD A8-700
python version 3.6

496
497 We designed two experiments.
498  The first experiment is the identification of Tor traffic. Validated on the public ISCX
499 Tor dataset. Finally, the experimental results are compared with the results of
500 Canadian research institutes.
501  The second experiment is to further classify Tor traffic by application. Use the ISCX
502 Tor data published by UNB-CIC to experiment, and conduct comparative analysis
503 of the experimental results.
504 2.4.2. Traffic application classification experimental design
505 The experiment uses the Pytorch deep learning framework to implement the model
506 we designed. Deep learning experiments were conducted on the cloud platform with
507 Nvidia 2080Ti GPU and 12GB CPU. The specific experimental configuration is shown in
508 Table 5.
509
510 Table 5. The experimental configuration

54
55 Remote Sens. 2023, 15, x FOR PEER REVIEW 19 of 36
56

Lab environment Configuration instructions

operating system Ubuntu 18.04.5 LTS
memory 18G
harddisk 120GB
CPU 4-core GPU
processor Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz
python version RTX 2080 Ti
operating system 3.6

511
512
513 We designed three experiments.
514  Fingerprint attack experiment in a closed world
515  Open world fingerprint attack experiment
516  Ablation experiment

517 2.5. Evaluation index

518 For multi-classification scenarios, in addition to accuracy, precision, recall, and F1
519 value [23]. The overall Macro-average precision rate and recall rate are also needed, see
520 formula (4), where Plablei and Rlablei represent the precision rate and recall rate of traffic
521 type i respectively.

(4)

522
523 TP is the number of correctly classified regulated sites. TN is the number of
524 correctly classified non-regulated websites. FN is the number of regulated sites that are
525 incorrectly classified as unregulated sites. FP represents the number of non-regulated
526 websites that are incorrectly classified as regulated websites [10].
527 The closed world fingerprint attack is a multi-classification task, which is evaluated
528 using accuracy and recall. The specific definitions are shown in equations (5) and (6):

57
58 Remote Sens. 2023, 15, x FOR PEER REVIEW 20 of 36
59

(5)

(6)

529
530 Open-world fingerprinting attack is a two-class problem. The performance of the
531 model is not only reflected in correctly identifying monitored web pages, but also in
532 minimizing the misidentification of non-monitored pages as monitored pages.
533 Therefore, the True Positive Rate (TPR) and False Positive Rate (FPR) are used to
534 evaluate the performance of the model [24]. TPR is the proportion of monitored pages
535 that are correctly classified as any monitored page. FPR is the ratio of unmonitored
536 traffic that is misclassified as a monitored site and is a measure of an attacker's false
537 identification. See equations (7) and (8):

(7)

(8)

538 3. Results

539 3.1. Experimental results of Tor anonymous traffic identification

540 The main purpose of the experiment is to identify network traffic and whether it is
541 Tor traffic. According to the introduction in the previous section, three models with
542 better results were finally selected as base learners, namely: KNN, XGBoost, and random
543 forest algorithm. Choose a simpler Logistic model as the meta-learner to reduce the
544 complexity of the model.
545 Verified on the public ISCX Tor data set, the accuracy rate reaches 99.1% and the
546 recall rate reaches 99.3%. According to the research results released by the Canadian
547 Institute of Cyber Security (UNB-CIC) [Error: Reference source not found], the two
548 better models are C4.5 and KNN. Therefore, the experimental results of this article are
549 compared with it. Since UNB-CIC only provides classification accuracy and recall under
550 weighted average, this section only compares these two indicators. In previous research,
551 the SVM classifier performed excellently in binary classification scenarios. And the SVM
552 model is selected as the classifier in the literature [25]. Therefore, the XL-Stacking model
553 proposed in this article is also compared with the SVM model.
554 The experimental results are shown in Figure 9, which are compared in two
555 dimensions: accuracy and recall. Experimental results show that the XL-Stacking model
556 proposed in this article can well identify Tor traffic from Internet traffic. The accuracy
557 rate is the highest among the four models, reaching 99.1% on the public data set, and the

60
61 Remote Sens. 2023, 15, x FOR PEER REVIEW 21 of 36
62

558 recall rate is also the highest at 99.3%. A higher recall rate means a lower false negative
559 rate in network supervision. The experimental results strongly prove that although the
560 Tor protocol and Obfs4 plug-in confuse the surface characteristics of the traffic.
561 However, they are still different from normal network traffic in terms of combined
105
562 features such as handshake packet length features, information entropy features, and
563 time interval
100
features.
Precision/Recall （ % ）

564
95

75
Acc Recall

SVM C4.5 KNN XL-Stacking

565

566
567 Figure 9. Comparison of Tor anonymous traffic identification experimental results
568

569 3.1. Experimental results of Tor application classification

570 Using the XL-Stacking stacking model proposed in this article, five-fold cross-
571 validation is used to train the XL-Stacking stacking model on the ISCX Tor data set.
572 After the model is trained, the confusion matrix is output on the two test sets. As shown
573574
in Figure10(a), the normalized confusion matrix is shown in Figure 10(b).

63
64 Remote Sens. 2023, 15, x FOR PEER REVIEW 22 of 36
65

575 (a) (b)

576 Figure 10. (a) ISCX Confusion matrix; (2) ISCX Normalized Confusion matrix
577
578 Let’s analyze the confusion matrix. Because the ISCX sample is unbalanced. The
579 number of samples per category used for training is distributed between 245 and 2325.
580 The number of samples per category used for testing ranges from 49-465. There is a large
581 difference in the number of samples of different application types, which is directly
582 reflected in the large difference in the color of the confusion matrix. The application
583 types with lighter colors in the ISCX confusion matrix are because there are fewer
584 samples, which does not mean that the model classification accuracy is not high.
585 Looking further at the confusion matrix, we can see that Browsing traffic
586 classification has the most errors. Browsing traffic is incorrectly marked as Audio or
587 Video traffic. Further analysis shows that Browsing, Audio, and Video traffic all have the
588 characteristics of small upstream traffic and large downstream traffic. They are most
589 easily confused with each other and therefore will be misclassified. Browsing traffic is
590 incorrectly marked as Chat traffic because all chat applications used are network-based
591 or use https as the communication protocol. Therefore, the browsing category has
592 become the application type with the most classification errors.
593 Experiments show that the XL-Stacking proposed in this article also has good
594 experimental results in multi-classification scenarios for classifying Tor anonymous
595 traffic application types. Can accurately identify different application types
596 corresponding to Tor traffic. As shown in Figure 11, the XL-Stacking model has an
597 accuracy of 90.3% and a recall rate of 87.4% on the imbalanced ISCX Tor data set. The
598 1.000of UNB-CIC are using the random forest classifier, and the experimental
best results
599 0.900
results are 84.2% and 84.0% respectively.
Precision/Recall

600 0.800
0.700
0.600
0.500
0.400
0.300
0.200
0.100
0.000
Acc Recall

XL-Stacking RF C4.5 KNN

601

602 Figure 11. Tor traffic application classification

603

66
67 Remote Sens. 2023, 15, x FOR PEER REVIEW 23 of 36
68

604 Finally, on the ISCX Tor dataset, the test dataset is used to evaluate the precision
605 and recall of different application categories. And compared with the RF, KNN, and
606 C4.5 models used by UNB-CIC. The results are shown in Figures 12 and 13, which show
607 the precision andForest
Random recall valuesC4.5
for each of KNN
the eight application categories. It can be seen
XL-Stacking
608 1.2 that the XL-Stacking model proposed in this article has the best classification results
609
1

0.8

0.6

0.4

0.2

0
VOIP AUDIO B RO W S CHA T MA I L FILE- P2P VIDEO
ING T RA NS
FER
610

611 Figure 12. Tor Characterization Precision

612
613
Random Forest C4.5 KNN XL-Stacking
1.2

0.8

0.6

0.4

0.2

0
VOIP AUDIO B RO W S CHA T MA I L FILE- P2P VIDEO
ING T RA NS
FER

614 Figure 13. Tor Characterization Recall

615

616 3.3. Fingerprint attack closed world experimental results

69
70 Remote Sens. 2023, 15, x FOR PEER REVIEW 24 of 36
71

617 In a closed world, network regulators can determine the specific websites visited by
618 users through fingerprint attacks and monitor the network environment. We compare
619 the BiGRU-ResNet proposed in this article with the SDAE, CNN, and LSTM models
620 proposed by Rimmer et al. on the same data set. First, on the CW100 data set, for the
621 monitored websites, the impact of different number of instances on the accuracy of
622 fingerprint attacks is shown in Table 6 and Figure 14.
623 When the number of instances of the monitoring page is 100, the accuracy rate
624 reaches 87.51%. It is proved that this model can achieve better results even when the
625 training sample is small. It reduces the cost of attacks to a certain extent and significantly
626 reduces the amount of training data required to perform website fingerprint attacks.
627 This shortens the time required for data collection and reduces the likelihood of data
628 instability issues. As the number of training examples increases, the accuracy of website
629 identification also increases. When 2500 instances are involved in training and testing,
630 the accuracy of each model reaches its maximum. Among them, the BiGRU-ResNet
631 model proposed in this article has higher accuracy than other models in different
632 number of instances. When the number of instances of each monitoring page is 2500, the
633 accuracy reaches a maximum of 98.46%. It is much better than the 96.26% accuracy of the
634 best CNN model proposed by Rimmer et al.
635
636 Table 6. Accuracy of different number of instances

BiGRU-RestNet LSTM CNN SDAE Traces

87.51 40.6 81.25 85 100
90.73 57.3 86.63 87.3 200
94.12 79.54 91.43 91.34 500
95.40 91.63 94.72 92.64 1000
96.84 91.93 95.95 94.49 1500
97.67 93.98 96.14 95.17 2000
98.46 94.02 96.26 95.46 2500

637

72
73 Remote Sens. 2023, 15, x FOR PEER REVIEW 25 of 36
74

SDAE CNN LSTM BiGRU-RestNet

100
90
Accuracy （ % ）

80
70
60
50
40
100 200 500 1000 1500 2000 2500
638 Figure 14. The accuracy of each model in CW100 with different number of instances

639
640 Collecting more monitored traffic for each website helps improve the classification
641 accuracy of these classifiers [26]. Because more sufficient data can help the model train
642 and learn the characteristics of the sequence more comprehensively. But this type of
643 model strongly depends on the amount of data [27]. In contrast, our model can still
644 perform well even with a limited number of instances on each website. Our model is
645 robust compared to state-of-the-art fingerprint attacks.
646 Then compare the accuracy of each model when the number of instances of each
647 page in the closed world data sets CW100, CW200, CW300, CW500 and CW900 is 2500.
648 The results are shown in Table 7 and Figure 15. It can be observed that as the type and
649 number of monitoring pages increase, each model shows a small range of decline. But no
650 matter which data set it is on, the BiGRU-ResNet model has a higher page recognition
651 accuracy than other models.
652
653 Table 7. The accuracy of each model on different datasets

GRU-RestNet LSTM CNN SDAE Dataset

98.46 94.02 96.66 95.46 CW100
98.15 93.10 96.52 95.76 CW200
97.72 90.80 92.31 95.04 CW500
97.23 88.04 91.79 94.25 CW900

75
76 Remote Sens. 2023, 15, x FOR PEER REVIEW 26 of 36
77

SDAE CNN LSTM BiGRU-RestNet

100
98
96
94
Accuracy(%)

92
90
88
86
84
82
CW100 CW200 CW500 CW900
654

655 Figure 15 The accuracy of each model on different datasets

656

657 3.4. Fingerprint attack open world experimental results

658 In the open world, the goal of identification is whether Tor anonymous traffic is the
659 traffic of a monitored website, which is a two-category problem. We evaluate the
660 performance of each model in the open world by True Positive Rate (TPR) and False
661 Positive Rate (FPR). That is, if the input traffic is a monitored website traffic, the
662 maximum output probability belongs to any monitored website, and is greater than a
663 threshold, we consider this to be a True Positives.
664 To study the impact of the number of unmonitored web pages on model
665 performance in an open world [28]. This paper experimentally compares the TPR and
666 FPR of BiGRU-ResNet under different numbers of unmonitored web pages. The
667 experimental results are shown in Figure16.
668

78
79 Remote Sens. 2023, 15, x FOR PEER REVIEW 27 of 36
80

100 20
95
90 16
85
TPR （ % ）

80 12

FPR （ % ）
75
70 8
65
60 4
55
50 0
0 10000 20000 30000 40000

TPR FPR
669 Figure 16. BiGRU-ResNet results on different numbers of unmonitored pages

670

671 It can be seen that as the number of unmonitored pages for fingerprint attacks
672 increases, the FPR decreases. And when the number of unmonitored pages is at most
673 40,000 (the ratio of monitored and unmonitored pages is 1:1), FPR reaches the minimum
674 value. TPR will decrease slightly as FPR decreases. Generally speaking, the smaller the
675 proportion of monitored pages in the training set, the less is known about the monitored
676 category and the lower the TPR. And more unmonitored pages will bias the attack
677 model to correctly distinguish monitored from unmonitored, thereby reducing FPR.
678 In order to prove the efficiency of the BiGRU-ResNet model proposed in this article
679 in open world fingerprint attacks. Compare it with the SDAE, CNN, and LSTM models
680 mentioned in the previous section on the same CW200_40000 data set, as shown in
681 Figure 17. The results show that the BiGRU-ResNet model performs best on TPR and
682 FPR. In the 40,000 unmonitored training set, TPR and FPR are 86.26% and 3.21%
683 respectively. This shows that the model proposed in this chapter can still achieve the
684 best results in open world fingerprint recognition
685 Figure 17. TPR and FPR for each model
686

687 3.5. Fingerprint attack ablation experiment

688 In order to verify the effectiveness and necessity of each part of the model proposed
689 in this article, an ablation experiment is designed in this part [29]. It mainly includes the
690 following three parts:

81
82 Remote Sens. 2023, 15, x FOR PEER REVIEW 28 of 36
83

100
90
80
TPR/FPR （ % ）

70
60
50
40
30
20
10
0
SDAE CNN LSTM BiGRU-RestNet

TPR FPR

691 1) Remove the spatial dimension information extraction layer. That is, the model
692 only removes the ResNet-18 component that extracts spatial information. The remaining
693 time dimension information extraction component BiGRU. Its output is directly input to
694 the fusion layer.
695 2) Remove the time dimension information extraction layer. That is, only the BiGRU
696 component that extracts time information is removed. The spatial dimension
697 information extraction component ResNet-18 remains. Its output is directly input to the
698 fusion layer.
699 3) Remove the spatiotemporal information fusion layer. Pass the output vector of
700 the spatial dimension information extraction layer and the time dimension information
701 extraction layer. The splicing operation is not input to the Dropout and Softmax layers,
702 but the traffic prediction is performed directly. Then the maximum value of the output
703 probability of the ResNet-18 component and the BiGRU component is taken as the
704 experimental result of the ablation experiment.
705 The results of the ablation experiment are shown in Table 8[30]. In the closed world
706 CW100 and open world CW200_400000 test sets, the results of the model without the time
707 dimension information extraction layer are similar on the two data sets. It shows that the
708 final prediction effect of fingerprint attack depends more on the spatial information in
709 the traffic. It can be seen that ResNet-18 has certain advantages in extracting spatial
710 information in traffic. In contrast, the time dimension extraction layer has a much
711 smaller impact on the prediction results, but it is also an essential part.
712 Table8. Ablation experimental results

Model CW100 Test set accuracy CW200_400000 TPR of test set

BiGRU-ResNet 98.46 86.26
-ResNet18 95.30 82.61

84
85 Remote Sens. 2023, 15, x FOR PEER REVIEW 29 of 36
86

-BiGRU 98.12 85.87

-dropout 97.71 84.45
713
714 It can be seen that each module in the model proposed in this article has a
715 corresponding contribution to the final fingerprint attack. Any component of the BiGRU-
716 ResNet model is indispensable for the extraction of traffic information. According to the
717 experimental results, it can be concluded that it is meaningful to use a reasonable fusion
718 method to fuse multi-mechanism information in the model proposed in this article.

719 4. Discussion

720 4.1. Traffic identification and classification

721 This article combines the characteristics of Tor routing and Obfs4 protocols. In-
722 depth analysis incorporates packet-based length features, information entropy features,
723 and time-based features. Perform discrete normalization on the scraped data set.
724 Improve traffic analysis results and model training efficiency from the source of data
725 and feature selection.
726 The XL-Stacking model based on the stacking model is proposed to identify and
727 classify Tor anonymous traffic. After experimental testing, the first layer of the stacked
728 model uses KNN, XGBoost, and random forest algorithms, and the second layer uses the
729 Logistic algorithm. This enables it to achieve higher classification accuracy with smaller
730 feature dimensions. The algorithm can quickly identify whether the user's traffic is
731 darknet traffic.
732 The traffic identification and classification model XL-Stacking proposed in this
733 article, the Tor traffic input to the model all known and commonly used applications.
734 For traffic identification and classification of some niche and uncommon applications,
735 samples need to be further enhanced to improve generalization capabilities.

736 4.2. Fingerprint attack

737 Considering that the ResNet and GRU models have achieved significant success in
738 image classification and speech recognition respectively, and are highly efficient.
739 Inspired by this, we try to combine models from the computer vision field and natural
740 language processing with website fingerprint classification models to solve the problem
741 of website fingerprint classification. Inspired by the fusion of spatio-temporal features
742 by Xi Rongkang and others, a deep neural network model based on temporal features
743 and spatial features, namely the BiGRU-ResNet model, was proposed. It can effectively
744 solve this problem. BiGRU-ResNet uses bidirectional GRU to extract temporal features
745 and ResNet to extract spatial features of website fingerprints. On a smaller sample space,
746 our attack reduces the amount of training data required to perform a website

87
88 Remote Sens. 2023, 15, x FOR PEER REVIEW 30 of 36
89

747 fingerprinting attack. The time required for data collection is shortened and the problem
748 of data instability over time is reduced. Compared with previous state-of-the-art WF
749 attacks, deep learning WF attacks can still achieve high accuracy even with a small
750 amount of training data.
751 This article proposes a BiGRU-ResNet model based on space and time. According
752 to the characteristics of Tor traffic packets, make full use of the Tor website fingerprint
753 sequence, including time, space and website information triples. Integrate multi-
754 modality to improve the efficiency and accuracy of model recognition. Dramatically
755 reduce the amount of training data required to perform website fingerprinting attacks.
756 Reduced time required for data collection. Reduces the possibility of data stability
757 issues.
758 This paper sees the bright future of applying deep learning to anonymous network
759 fingerprinting attacks. The model can be further improved in the future to improve
760 classification accuracy. Apply it to more complex scenarios. For example, there are
761 fingerprint recognition scenarios where users visit multiple websites at the same time;
762 scenarios where users download files in the background while listening to music, add
763 noise traffic, and pay attention to improving the computational efficiency of the model.

764 5. Conclusions
765 The XL-Staking model based on ensemble learning proposed in this chapter can not
766 only identify Tor anonymous traffic, but also classify Tor traffic. In terms of the effect of
767 Tor traffic identification, the model proposed in this article is better than the traditional
768 SVM and the results of C4.5 and KNN used by the Canadian Network Research
769 Institute. The precision and recall rates are higher, reaching 99.1% and 99.3%
770 respectively. In the Tor application traffic classification experiment, it also reached an
771 accuracy of 90.3% on the ISCX public data set. Compared with UNB-CIC's RF, KNN,
772 and C4.5 multi-classification models, the results are better.
773 Experimental results show that the model proposed in this chapter can realize the
774 identification and application classification of Tor anonymous traffic. Provide certain
775 technical support for the supervision of Tor traffic. At the same time, it lays the
776 foundation for further website fingerprinting.
777
778 Experimental results show that the BiGRU-ResNet model proposed in this article
779 achieves high accuracy and effect in closed world scenes and open world scenes. At
780 acceptable time cost and training cost, the attack performance is better than similar
781 work. In the closed world scene, the accuracy rate was 98.46%, and in the open world
782 scene, the true positive rate was 86.26%.
783 This article achieves a balance between training accuracy and training efficiency.
784 When the number of instances of each monitoring page is 100, the accuracy reaches
785 87.51%. It can also achieve better results when the training sample is smaller. Allows

90
91 Remote Sens. 2023, 15, x FOR PEER REVIEW 31 of 36
92

786 attackers to use fewer resources and less time to collect data. It reduces the cost of
787 attacks to a certain extent, allowing weak attackers with fewer data collection resources
788 to successfully conduct WF attacks. In terms of supervision of the network environment,
789 it improves the cost and efficiency of supervisors' supervision of the Tor anonymous
790 network and lays the foundation for subsequent research.
791
792 Author Contributions:
793 Data Availability Statement: Not applicable.
794 Funding:
795 Conflicts of Interest: The authors declare no conflict of interest.
796
797 Appendix A

798 Base learner and meta-learner parameter settings used in the XL-Stacking model
799 The main parameters of the XGBoost model are: the threshold gamma parameter of
800 the node splitting loss function. The larger the value set, the more conservative the
801 model will be. The maximum depth of the tree is max_deep. Reasonable setting can
802 prevent overfitting. Booster is used to specify the type of weak learner. The default value
803 is 'gbtree', which means using a tree-based model for calculation. This article chooses the
804 default value. This article first sets the parameters according to the default initial values,
805 and uses grid_search to tune the max_deep and gamma parameters. First make rough
806 adjustments and then fine-tune, and finally determine that max_deep is set to 7 and the
807 gamma parameter is set to 0. Adjust the regularization parameters alpha and lambda in
808 the range [
809 0,1,2,3,4,5]. The best results are alpha =2 and lambda=1.
810 The method of adjusting the parameters of the random forest algorithm is similar.
811 First set the parameters to initial parameters. max_features represent the maximum
812 number of features that a random forest allows a single decision tree to use [31].
813 Increasing max_features generally improve the performance of the model. Because at
814 each node, we have more options to consider. We set max_features to None and simply
815 select all features. Every tree can take advantage of them. In this case, there are no
816 restrictions on every tree. n_estimators represent the number of subtrees to build. More
817 subtrees can give the model better performance, but at the same time make the code
818 slower. We set the starting value to 1, the ending value to 100, increase by 5 each time,
819 and finally select n_estimators to 30. If this parameter is further increased, the
820 improvement effect of the model will not be significant. min_sample_leaf represents the
821 minimum sample leaf size. Smaller leaves make it easier for the model to capture noise
822 in the training data [32]. The starting value is 1 and the ending value is 100, increasing

93
94 Remote Sens. 2023, 15, x FOR PEER REVIEW 32 of 36
95

823 by 5 each time. As the min_sample_leaf parameter increases, the decision tree submodel
824 gradually transforms from a complex structure to a simple structure, and the classifier f1
825 score also gradually decreases. In order to maintain the classification performance of the
826 classifier while shortening the model training time, select the value of min_sample_leaf
827 as 10. In the same way, the value of min_samples_split is determined to be 20.
828 Parameter adjustment of KNN algorithm. In this section we choose Euclidean
829 distance to measure the distance between two samples. First set the parameters to initial
830 parameters. n_neighbors indicate the number of selected neighbors. We also use the grid
831 search method and grid_search is tuned to 9.
832 The logistic regression model is implemented using the LogisticRegression function
833 of the Sklearn library. The parameter penalty is the penalty term. In this article, we
834 choose L1 regularization.

835 Appendix B

836 Adjustment of BiGRU-ResNet model hyperparameters

837 Cross-validation is used on smaller datasets, but on larger datasets this is
838 computationally infeasible. Instead, we follow the best practices of the deep learning
839 community and those of Sirinam et al., and split the data into training, validation, and
840 test sets. The proportions of random draws are 90%, 5%, and 5% respectively. There is
841 no overlap between the data used for training, validation, and testing. Adam acts as an
842 optimizer for the model to improve computational efficiency and accelerate
843 convergence. Use the cross-entropy loss function, that is, the Loss function is Cross
844 entropy, and set the number of model training rounds to 30 rounds. Systematically
845 examine each hyperparameter, scanning a range of values for that hyperparameter while
846 holding all other hyperparameters constant. Once you find a value that performs well,
847 fix it and then test other hyperparameters. We measure performance using accuracy on
848 the test set [33]. Other parameters of the BiGRU-ResNet model are shown in Table B1
849 Table B1. Parameters Table

layer parameter

Input_dim = 10*5000
RestNet Output_dim = 1*512
The activation function is Relu

Input_dim = 1*5000
Bi-GRU
Output_dim = 1*512

96
97 Remote Sens. 2023, 15, x FOR PEER REVIEW 33 of 36
98

dropout P=0.5

Input_dim = 1*1024
The activation function is softmax
Fully connected Optimizer：Adam
Batchsize：64
Epochs：30

850
851 Compared with using a fixed number of learning rates, research has found that
852 determining learning rate changes based on validation set performance is more effective.
853 The two parameters we adjust here are the initial learning rate and the factor used for
854 learning rate decay. From our experiments, the default learning rate that is too small will
855 lead to more local minima, while increasing the learning rate does not improve the
856 accuracy of the model. In some cases, increasing the patience value of a training session
857 can slightly improve accuracy, but also significantly increase the average number of
858 training runs. Through experimental comparison, we finally started training with a
859 learning rate of 0.001. This is the default value of the Adam optimizer. The network is
860 allowed to train for 5 epochs without improving validation accuracy (called the patience
861 value) before reducing the learning rate by a factor. The lowest learning rate is 0.00001,
862 and training is stopped after 10 epochs without improving validation accuracy. 30
863 epochs are used in all experimental settings. We save the best performing model in the
864 validation set, which can be reloaded to perform final classification on the test set.

865 References
866 1. Basyoni, L.; Fetais, N.; Erbad, A.; Mohamed, A.; Guizani, M. Traffic analysis attacks on Tor: a survey. In Pro -
867 ceedings of the 2020 IEEE International Conference on Informatics, IoT, and Enabling Technologies (ICIoT),
868 2020; pp. 183-188.
869 2. Lashkari, A.H.; Gil, G.D.; Mamun, M.S.I.; Ghorbani, A.A. Characterization of tor traffic using time based fea-
870 tures. In Proceedings of the International Conference on Information Systems Security and Privacy, 2017; pp.
871 253-262.
872 3. Cao, Z.; Li, Z.; Zhang, J.; Fu, H. A Homogeneous Stacking Ensemble Learning Model for Fault Diagnosis of
873 Rotating Machinery With Small Samples. IEEE Sensors Journal 2022, 22, 8944-8959, doi:10.1109/
874 JSEN.2022.3163760.

99
100 Remote Sens. 2023, 15, x FOR PEER REVIEW 34 of 36
101

875 4. He, Y.; Hu, L.; Gao, R. Detection of tor traffic hiding under obfs4 protocol based on two-level filtering. In Pro -
876 ceedings of the 2019 2nd International Conference on Data Intelligence and Security (ICDIS), 2019; pp. 195-
877 200.
878 5. Lingyu, J.; Yang, L.; Bailing, W.; Hongri, L.; Guodong, X. A hierarchical classification approach for tor anony-
879 mous traffic. In Proceedings of the 2017 IEEE 9th International Conference on Communication Software and
880 Networks (ICCSN), 6-8 May 2017, 2017; pp. 239-243.
881 6. Pandey, L. Lip Reading as an Active Mode of Interaction with Computer Systems; University of California, Merced:
882 2022.
883 7. Rimmer, V.; Preuveneers, D.; Juarez, M.; Van Goethem, T.; Joosen, W. Automated website fingerprinting
884 through deep learning. arXiv preprint arXiv:1708.06376 2017.
885 8. Bhat, S.; Lu, D.; Kwon, A.; Devadas, S. Var-CNN: A data-efficient website fingerprinting attack based on deep
886 learning. arXiv preprint arXiv:1802.10215 2018.
887 9. He, X.; Wang, J.; He, Y.; Shi, Y. A Deep Learning Approach for Website Fingerprinting Attack. In Proceedings
888 of the 2018 IEEE 4th International Conference on Computer and Communications (ICCC), 7-10 Dec. 2018,
889 2018; pp. 1419-1423.
890 10. Sirinam, P.; Imani, M.; Juarez, M.; Wright, M. Deep fingerprinting: Undermining website fingerprinting de-
891 fenses with deep learning. In Proceedings of the Proceedings of the 2018 ACM SIGSAC Conference on Com-
892 puter and Communications Security, 2018; pp. 1928-1943.
893 11. Lu, Y.; Cai, M.; Zhao, C.; Zhao, W. Tor Anonymous Traffic Identification Based on Parallelizing Dilated Con-
894 volutional Network. Applied Sciences 2023, 13, 3243.
895 12. Wang, M.; Li, Y.; Wang, X.; Liu, T.; Shi, J.; Chen, M. 2ch-TCN: A Website Fingerprinting Attack over Tor Us-
896 ing 2-channel Temporal Convolutional Networks. In Proceedings of the 2020 IEEE Symposium on Computers
897 and Communications (ISCC), 7-10 July 2020, 2020; pp. 1-7.
898 13. Wu, S.; Wang, Y. Attention-based Encoder-Decoder Recurrent Neural Networks for HTTP Payload Anomaly
899 Detection. In Proceedings of the 2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications,
900 Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking
901 (ISPA/BDCloud/SocialCom/SustainCom), 30 Sept.-3 Oct. 2021, 2021; pp. 1452-1459.
902 14. Chen, H.Y.; Lin, T.N. The Challenge of Only One Flow Problem for Traffic Classification in Identity Obfusca -
903 tion Environments. IEEE Access 2021, 9, 84110-84121, doi:10.1109/ACCESS.2021.3087528.
904 15. Attarian, R.; Abdi, L.; Hashemi, S. AdaWFPA: Adaptive online website fingerprinting attack for tor anony-
905 mous network: A stream-wise paradigm. Computer Communications 2019, 148, 74-85.

102
103 Remote Sens. 2023, 15, x FOR PEER REVIEW 35 of 36
104

906 16. Xie, X.; Zhang, X.; Fu, J.; Jiang, D.; Yu, C.; Jin, M. Location recommendation of digital signage based on multi-
907 source information fusion. Sustainability 2018, 10, 2357.
908 17. Džeroski, S.; Ženko, B. Is combining classifiers with stacking better than selecting the best one? Machine learn-
909 ing 2004, 54, 255-273.
910 18. Xian, S.; Li, T.; Cheng, Y. A novel fuzzy time series forecasting model based on the hybrid wolf pack algo-
911 rithm and ordered weighted averaging aggregation operator. International Journal of Fuzzy Systems 2020, 22,
912 1832-1850.
913 19. Wang, S.; Wang, X.; Guo, X. Advanced Face Mask Detection Model Using Hybrid Dilation Convolution Based
914 Method. Journal of Software Engineering and Applications 2023, 16, 1-19.
915 20. Luo, Z.; Zhu, J.; Li, Z.; Liu, S. Research the Method of Joint Segmentation and POS Tagging for Tibetan Using
916 BiGRU-CRF. In Proceedings of the Proceedings of the 2020 3rd International Conference on Algorithms, Com-
917 puting and Artificial Intelligence, 2020; pp. 1-6.
918 21. Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate
919 shift. In Proceedings of the International conference on machine learning, 2015; pp. 448-456.
920 22. Xu, J.; Wang, J.; Qi, Q.; Sun, H.; He, B. DEEP NEURAL NETWORKS FOR APPLICATION AWARENESS IN
921 SDN-BASED NETWORK. In Proceedings of the 2018 IEEE 28th International Workshop on Machine Learning
922 for Signal Processing (MLSP), 17-20 Sept. 2018, 2018; pp. 1-6.
923 23. Zhu, S.; Xu, X.; Gao, H.; Xiao, F. CMTSNN: A Deep Learning Model for Multiclassification of Abnormal and
924 Encrypted Traffic of Internet of Things. IEEE Internet of Things Journal 2023, 10, 11773-11791, doi:10.1109/
925 JIOT.2023.3244544.
926 24. Baek, I.; Kim, S.B. 3-Dimensional convolutional neural networks for predicting StarCraft Ⅱ results and extract -
927 ing key game situations. Plos one 2022, 17, e0264550.
928 25. Zhioua, S. Tor traffic analysis using hidden markov models. Security and Communication Networks 2013, 6,
929 1075-1086.
930 26. Panchenko, A.; Mitseva, A.; Henze, M.; Lanze, F.; Wehrle, K.; Engel, T. Analysis of fingerprinting techniques
931 for Tor hidden services. In Proceedings of the Proceedings of the 2017 on Workshop on Privacy in the Elec-
932 tronic Society, 2017; pp. 165-175.
933 27. Wang, Y.; Xu, H.; Guo, Z.; Qin, Z.; Ren, K. snWF: Website Fingerprinting Attack by Ensembling the Snapshot
934 of Deep Learning. IEEE Transactions on Information Forensics and Security 2022, 17, 1214-1226, doi:10.1109/
935 TIFS.2022.3158086.
936 28. Sirinam, P. Website fingerprinting using deep learning; Rochester Institute of Technology: 2019.

105
106 Remote Sens. 2023, 15, x FOR PEER REVIEW 36 of 36
107

937 29. Wang, T.; Huang, Z.; Wu, J.; Cai, Y.; Li, Z. Semi-Supervised Medical Image Segmentation with Co-Distribu -
938 tion Alignment. Bioengineering 2023, 10, 869.
939 30. Zhang, X.; Hu, D.; Li, S.; Luo, Y.; Li, J.; Zhang, C. Aircraft Detection from Low SCNR SAR Imagery Using Co -
940 herent Scattering Enhancement and Fused Attention Pyramid. Remote Sensing 2023, 15, 4480.
941 31. Zhang, Y.; Deng, Q.; Liang, W.; Zou, X. An efficient feature selection strategy based on multiple support vec-
942 tor machine technology with gene expression data. BioMed research international 2018, 2018.
943 32. Zhang, Z. Data Sets Modeling and Frequency Prediction via Machine Learning and Neural Network. In Pro-
944 ceedings of the 2021 IEEE International Conference on Emergency Science and Information Technology (ICE-
945 SIT), 22-24 Nov. 2021, 2021; pp. 855-863.
946 33. Zhou, B.; Yin, Y.; Wang, M.; Zhang, R.; Zhang, Y.; Guo, W. Identification of Strong Motion Record Baseline
947 Drift Based on Bayesian Optimized Transformer Network. 2023.

948

108

Defense in Depth
From Everand
Defense in Depth
Qasim
No ratings yet
Bsides
No ratings yet
Bsides
83 pages
The Current State of Anonymous Filesharing
100% (2)
The Current State of Anonymous Filesharing
65 pages
Cybersecurity Key Topics: A Field Guide
From Everand
Cybersecurity Key Topics: A Field Guide
Dr. Betina Tagle
No ratings yet
Tor Uing
No ratings yet
Tor Uing
54 pages
Quantum Networks: The Future of Computer Communication
From Everand
Quantum Networks: The Future of Computer Communication
Manoj RC
No ratings yet
Enhancing Censorship Resistance in The Tor Anonymity
100% (1)
Enhancing Censorship Resistance in The Tor Anonymity
86 pages
De-Anonymisation Attacks On Tor - A Survey
No ratings yet
De-Anonymisation Attacks On Tor - A Survey
28 pages
MSC Thesis
No ratings yet
MSC Thesis
102 pages
TOR Anonymity Network
No ratings yet
TOR Anonymity Network
32 pages
Gani 3
No ratings yet
Gani 3
25 pages
Digital Forensics and Cybercrime Explained
From Everand
Digital Forensics and Cybercrime Explained
Kanti Shukla
No ratings yet
Detection of Anonymising Proxies Using ML Anon - July 2021 Final
No ratings yet
Detection of Anonymising Proxies Using ML Anon - July 2021 Final
25 pages
From "Onion Not Found" To Guard Discovery
No ratings yet
From "Onion Not Found" To Guard Discovery
22 pages
Darknet Traffic Analysis A Systematic Literature Review
No ratings yet
Darknet Traffic Analysis A Systematic Literature Review
30 pages
TDSC DAENet Making Strong Anonymity Scale in A Fully Decentralized Network
No ratings yet
TDSC DAENet Making Strong Anonymity Scale in A Fully Decentralized Network
18 pages
Shen 等 - Subverting Website Fingerprinting Defenses With
No ratings yet
Shen 等 - Subverting Website Fingerprinting Defenses With
18 pages
Sec15 Paper Sun
No ratings yet
Sec15 Paper Sun
17 pages
Sendner 等 - 2024 - MirageFlow A New Bandwidth Inflation Attack on To
No ratings yet
Sendner 等 - 2024 - MirageFlow A New Bandwidth Inflation Attack on To
16 pages
Darknet Traffic Classification and Adversarial Att
No ratings yet
Darknet Traffic Classification and Adversarial Att
31 pages
Shedding Light On The Dark Corners of The Internet: A Survey of Tor Research
No ratings yet
Shedding Light On The Dark Corners of The Internet: A Survey of Tor Research
34 pages
Rimmer2018 DLWF
No ratings yet
Rimmer2018 DLWF
15 pages
An Anonymity Vulnerability in Tor
No ratings yet
An Anonymity Vulnerability in Tor
14 pages
A Model For Detecting Tor Encrypted Traffic Using Supervised Machine Learning
No ratings yet
A Model For Detecting Tor Encrypted Traffic Using Supervised Machine Learning
15 pages
FedFingerprinting A Federated Learning Approach To Website Fingerprinting Attacks in Tor Networks
No ratings yet
FedFingerprinting A Federated Learning Approach To Website Fingerprinting Attacks in Tor Networks
14 pages
Botnet Attack Detection in the Internet of Things Using Selected Learning Algorithms: A Research Study on Securing IoT Against Cyber Threats Using Machine Learning
From Everand
Botnet Attack Detection in the Internet of Things Using Selected Learning Algorithms: A Research Study on Securing IoT Against Cyber Threats Using Machine Learning
Bolakale Aremu
5/5 (1)
Understanding The Tor Network
No ratings yet
Understanding The Tor Network
14 pages
Caronte ccs15
No ratings yet
Caronte ccs15
12 pages
Cryptography Basics for New Coders: A Practical Guide with Examples
From Everand
Cryptography Basics for New Coders: A Practical Guide with Examples
William E. Clark
No ratings yet
18 Feb
No ratings yet
18 Feb
11 pages
Wpes11 Panchenko
No ratings yet
Wpes11 Panchenko
11 pages
Connecting The Dots in The Sky Website
No ratings yet
Connecting The Dots in The Sky Website
10 pages
IPSJCSS2018044 高橋元春
No ratings yet
IPSJCSS2018044 高橋元春
9 pages
TOR Anonymous Traffic Fingerprint Extraction and R
No ratings yet
TOR Anonymous Traffic Fingerprint Extraction and R
8 pages
Darkweb Python Hidden Services
No ratings yet
Darkweb Python Hidden Services
61 pages
Towards Efficient Traffic-Analysis Resistant Anonymity Networks
No ratings yet
Towards Efficient Traffic-Analysis Resistant Anonymity Networks
12 pages
Practical Attacks Against The I2p Network
No ratings yet
Practical Attacks Against The I2p Network
20 pages
Identifying The True IP of I2P Service Hosts
No ratings yet
Identifying The True IP of I2P Service Hosts
23 pages
Anonymity On Quicksand - Using BGP To Compromise Tor (Laurent Vanbever, Oscar Li, Et. Al.) (2014)
No ratings yet
Anonymity On Quicksand - Using BGP To Compromise Tor (Laurent Vanbever, Oscar Li, Et. Al.) (2014)
7 pages
Dark Digital Histories
From Everand
Dark Digital Histories
Shah Rukh
No ratings yet
Privacy: Anonymous Routing, Mix Nets, and User Tracking
No ratings yet
Privacy: Anonymous Routing, Mix Nets, and User Tracking
35 pages
Iron Fish Protocols and Privacy in Modern Cryptocurrency: The Complete Guide for Developers and Engineers
From Everand
Iron Fish Protocols and Privacy in Modern Cryptocurrency: The Complete Guide for Developers and Engineers
William Smith
No ratings yet
Wireshark Protocol Analysis and Network Investigation: Definitive Reference for Developers and Engineers
From Everand
Wireshark Protocol Analysis and Network Investigation: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Developing on Tron Blockchain: Definitive Reference for Developers and Engineers
From Everand
Developing on Tron Blockchain: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Design and Construction of 20 K VA Automatic Voltage Stabilizer Control System
No ratings yet
Design and Construction of 20 K VA Automatic Voltage Stabilizer Control System
11 pages
Blocking of Ips in A Heterogeneous Network
No ratings yet
Blocking of Ips in A Heterogeneous Network
9 pages
A First Look at References From The Dark To Surface Web World
No ratings yet
A First Look at References From The Dark To Surface Web World
15 pages
Tcpdump in Depth: Definitive Reference for Developers and Engineers
From Everand
Tcpdump in Depth: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Leaving Timing Channel Fingerprints in Hidden Service Log Files
No ratings yet
Leaving Timing Channel Fingerprints in Hidden Service Log Files
10 pages
Circuit Finger
No ratings yet
Circuit Finger
16 pages
Dark Web Content Analysis and Visualization
No ratings yet
Dark Web Content Analysis and Visualization
7 pages
Learning Web3 Development
From Everand
Learning Web3 Development
IT Campus Academy
No ratings yet
A3 Colour Multifunction Printers: It'S in The Details
No ratings yet
A3 Colour Multifunction Printers: It'S in The Details
12 pages
Network Engineer's Bible: Mastering 100 Protocols For Communication, Management, And Security
From Everand
Network Engineer's Bible: Mastering 100 Protocols For Communication, Management, And Security
Rob Botwright
No ratings yet
Penetration Testing Fundamentals -1: Penetration Testing Study Guide To Breaking Into Systems
From Everand
Penetration Testing Fundamentals -1: Penetration Testing Study Guide To Breaking Into Systems
Devi Prasad
No ratings yet
6lpsohàrzfkduwripdoixqfwlrq : 4. Troubleshooting
No ratings yet
6lpsohàrzfkduwripdoixqfwlrq : 4. Troubleshooting
57 pages
B.a.programme Office Management & Secretarial Practice (O M S P) Computer Applications Sem-III (5357)
0% (1)
B.a.programme Office Management & Secretarial Practice (O M S P) Computer Applications Sem-III (5357)
4 pages
Software-Testing-Life-Cycle PPT 2 in Unit1
No ratings yet
Software-Testing-Life-Cycle PPT 2 in Unit1
13 pages
Manual Agent Manager
100% (1)
Manual Agent Manager
335 pages
Ijret - Modeling and Prevention of Cell Counting Based Attacks On Tor
No ratings yet
Ijret - Modeling and Prevention of Cell Counting Based Attacks On Tor
4 pages
Cryptojacking: The Hidden Threat of Digital Mining in the World of Monero Cryptocurrency
From Everand
Cryptojacking: The Hidden Threat of Digital Mining in the World of Monero Cryptocurrency
Fouad Sabry
No ratings yet
Project Poster
No ratings yet
Project Poster
1 page
Tokenization Understand what it is and how it works
From Everand
Tokenization Understand what it is and how it works
Alex Carvalho
No ratings yet
Top PL SQL Interview Questions
No ratings yet
Top PL SQL Interview Questions
3 pages
SOFTWARE
No ratings yet
SOFTWARE
70 pages
Cryptography: The Art of Secret Keeping in the Digital Age
From Everand
Cryptography: The Art of Secret Keeping in the Digital Age
John Smith
No ratings yet
Rust by Example
No ratings yet
Rust by Example
255 pages
Firewalls: The Engineer's Guide in the Age of Cyber Threats
From Everand
Firewalls: The Engineer's Guide in the Age of Cyber Threats
Rob Botwright
No ratings yet
Min and Max Mode 8259
No ratings yet
Min and Max Mode 8259
11 pages
Computer Architecture Test 1
100% (1)
Computer Architecture Test 1
6 pages
RG-EG300GH-E Series Datasheet-20240524
No ratings yet
RG-EG300GH-E Series Datasheet-20240524
12 pages
Public Vs Private Vs Hybrid Cloud Differences Explained
No ratings yet
Public Vs Private Vs Hybrid Cloud Differences Explained
8 pages
Wardialing: Unveiling Cyber Tactics and Electronic Warfare
From Everand
Wardialing: Unveiling Cyber Tactics and Electronic Warfare
Fouad Sabry
No ratings yet
Cellular Technology
No ratings yet
Cellular Technology
21 pages
Enhanced Cybersecurity Using Random Number Generator Based On Perovskite LED Offers Safer Exchange of Digital Information
No ratings yet
Enhanced Cybersecurity Using Random Number Generator Based On Perovskite LED Offers Safer Exchange of Digital Information
3 pages
Requerimientos Rheocompass
No ratings yet
Requerimientos Rheocompass
3 pages
Logjjaja
No ratings yet
Logjjaja
8 pages
@vtucode - in 18CS56 Model Set 1 Paper
No ratings yet
@vtucode - in 18CS56 Model Set 1 Paper
3 pages
Log
No ratings yet
Log
50 pages
Network Security and Cryptography: Course Code: 15Cs1105 Pre-Requisites: Computer Networks
No ratings yet
Network Security and Cryptography: Course Code: 15Cs1105 Pre-Requisites: Computer Networks
3 pages
5 - Blockchain Protocols-1
No ratings yet
5 - Blockchain Protocols-1
13 pages
Midex2023 Solutions
No ratings yet
Midex2023 Solutions
8 pages
CN-Unit - 4 Notes - AR20 - REC
No ratings yet
CN-Unit - 4 Notes - AR20 - REC
19 pages
Creation of Usual Abap Class
No ratings yet
Creation of Usual Abap Class
9 pages
Source: Md. Salim Reza Jony Manager-HR at System Engineering LTD Follow. (N.D.) - Google Glass: A Futuristic Fashion
No ratings yet
Source: Md. Salim Reza Jony Manager-HR at System Engineering LTD Follow. (N.D.) - Google Glass: A Futuristic Fashion
2 pages
Lab-2 3
No ratings yet
Lab-2 3
3 pages
Play Claw 7
No ratings yet
Play Claw 7
11 pages
GR 10 Ict Pra Exam Eng
No ratings yet
GR 10 Ict Pra Exam Eng
4 pages
PANASONIC Dell - PowerEdge - R730xd
No ratings yet
PANASONIC Dell - PowerEdge - R730xd
1 page
Quickly Test RS-232 Signals On DB9 Ports
No ratings yet
Quickly Test RS-232 Signals On DB9 Ports
2 pages

V1 0-Mdpi

Uploaded by

V1 0-Mdpi

Uploaded by

1

2 Research on anonymous network application classification and

Copyright: © 2023 by the authors.

3 Remote Sens. 2023, 15, x. https://doi.org/10.3390/xxxxx www.mdpi.com/journal/remotesensing

80 Figure2 Structure of data cell [1]

81 Tor anonymous network traffic identification is to identify whether the user is

178 2. Materials and Methods

182 anonymous traffic is usually described as a multi-classification problem, which can

238 Table 1. Traffic characteristics.

Information entropy characteristics Overall and all-directional packet information entropy.

259 Table 2. Accuracy rate (Acc) under different models.

Models GaussianNB KNN XGBoost Random Forest Dense SVM

292 Figure 4. XL-Stacking first layer model.

295 Figure 5. XL-Stacking second layer model

334 Figure 6. BiGRU-ResNet Architecture

350 2.2.1. Spatial dimension information extraction layer

373 Figure 7. ResNet-18 Architecture

375 Figure 8. Residual module

450 2.3. Experimental data set

CW300 300 2500 5000

CW500 500 2500 5000

CW900 900 2500 5000

490 2.4. Experimental design

Lab environment Configuration instructions

517 2.5. Evaluation index

539 3.1. Experimental results of Tor anonymous traffic identification

SVM C4.5 KNN XL-Stacking

569 3.1. Experimental results of Tor application classification

575 (a) (b)

XL-Stacking RF C4.5 KNN

602 Figure 11. Tor traffic application classification

611 Figure 12. Tor Characterization Precision

614 Figure 13. Tor Characterization Recall

616 3.3. Fingerprint attack closed world experimental results

BiGRU-RestNet LSTM CNN SDAE Traces

SDAE CNN LSTM BiGRU-RestNet

GRU-RestNet LSTM CNN SDAE Dataset

SDAE CNN LSTM BiGRU-RestNet

655 Figure 15 The accuracy of each model on different datasets

657 3.4. Fingerprint attack open world experimental results

687 3.5. Fingerprint attack ablation experiment

Model CW100 Test set accuracy CW200_400000 TPR of test set

-BiGRU 98.12 85.87

720 4.1. Traffic identification and classification

736 4.2. Fingerprint attack

836 Adjustment of BiGRU-ResNet model hyperparameters

You might also like