Practical Trajectory Anonymization Method Using La
Paper
Global positioning system (GPS) data are commonly used for location-based services such as traffic flow prediction. However, such data contain considerable sensitive information and thus must be anonymized before being published. In this study, we investigate trajectory anonymization. Previous methods have limitations in that they cannot be applied to road networks with different sparseness and cannot preserve the trajectory information. Thus, we propose a DNN-based method that can anonymize trajectories for road networks with different sparseness while preserving the trajectory information. Specifically, the trajectories are projected to the latent space using a pre-trained encoder-decoder model, and the latent variables are generalized. Furthermore, to reduce the information loss, we propose segment-aware trajectory modeling and study the effectiveness of assuming a normal distribution in the latent space. Experimental results using real GPS data show the effectiveness of the proposed method, improving the data reservation rate by approximately 3% and reducing the reconstruction error by approximately 31%. © 2024 The Author(s). IEEJ Transactions on Electrical and Electronic Engineering published by Institute of Electrical Engineers of Japan and Wiley Periodicals LLC.
This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License, which permits use and distribution in any medium,
provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.
Table I. Notation definitions

  loc_i, ts_i                    Location and timestamp at the i-th node
  traj, traj , traj_seg          Implicit, -sampling, and segment-aware trajectories
  f_emb, f_enc, f_dec, f_rec     The embedding, encoding, decoding, and reconstruction functions
  z                              The latent variable
  y                              The output feature of the proposed DNN model
  L_CE, L_MSE, L_var             The cross-entropy, MSE, and variational losses

where f_emb and f_enc denote the embedding function and the Transformer encoder, respectively. The latent variable z follows the normal distribution with mean μ and variance σ. Then, the reconstructed trajectory traj is calculated from z such that:

    traj = f_rec(f_dec(z))    (2)

where f_dec and f_rec are the Transformer decoder and reconstruction function, respectively.

4.3. Segment-aware trajectory modeling   The embedding (f_emb) and reconstruction (f_rec) functions are designed specifically for the proposed segment-aware trajectory modeling. We provide the formulation as follows.

4.3.1. Embedding function   The input to the embedding function is a segment-aware trajectory traj_seg. The location and timestamp sequences are processed separately to enable segment-aware modeling. Following BERT [25], the location sequence ({loc_0, loc_1, ..., loc_n}) is formatted by attaching the start-of-sentence and end-of-sentence tokens and is then fed into the embedding layer. The timestamp sequence ({ts_0, ts_1, ..., ts_n}) is fed into the sine-cosine encoding followed by a fully-connected (FC) layer to align the feature dimension. The timestamp sequence is padded to align its length with the location sequence. Finally, the location and timestamp features are concatenated with the positional encoding.
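To make the embedding step concrete, the following is a minimal PyTorch sketch of how such an embedding function could look; the class and argument names, the half-and-half feature split, and the token handling are our own assumptions rather than the authors' implementation.

```python
import math
import torch
import torch.nn as nn

def sincos(x: torch.Tensor, dim: int) -> torch.Tensor:
    """Sine-cosine encoding of scalar values (timestamps or positions) into `dim` features."""
    freqs = torch.exp(-math.log(10000.0) * torch.arange(0, dim, 2, dtype=torch.float32) / dim)
    angles = x.unsqueeze(-1) * freqs                       # (..., dim/2)
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)

class SegmentAwareEmbedding(nn.Module):
    """Sketch of f_emb: location IDs and timestamps are embedded separately and combined."""
    def __init__(self, n_locations: int, d_model: int = 512, max_len: int = 512):
        super().__init__()
        # +3 reserves IDs for start-of-sentence, end-of-sentence, and padding tokens (our assumption)
        self.loc_emb = nn.Embedding(n_locations + 3, d_model // 2)
        self.ts_fc = nn.Linear(d_model // 2, d_model // 2)  # aligns the timestamp feature dimension
        self.register_buffer("pos", sincos(torch.arange(max_len, dtype=torch.float32), d_model))

    def forward(self, loc_ids: torch.Tensor, ts: torch.Tensor) -> torch.Tensor:
        # loc_ids: (batch, seq+2) with SOS/EOS already attached; ts: (batch, seq) timestamps
        ts = nn.functional.pad(ts, (0, loc_ids.size(1) - ts.size(1)))   # pad to the location length
        loc_feat = self.loc_emb(loc_ids)                                # (batch, seq+2, d/2)
        ts_feat = self.ts_fc(sincos(ts, self.ts_fc.in_features))        # (batch, seq+2, d/2)
        feat = torch.cat([loc_feat, ts_feat], dim=-1)                   # (batch, seq+2, d)
        return feat + self.pos[: feat.size(1)]                          # add positional encoding
```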
4.3.2. Reconstruction function   The output feature y = f_dec(z) is converted to the reconstructed trajectory traj. Different from the -sampling trajectory, a multitask head, which is an FC layer that predicts the sequences of the reconstructed nodes, is employed to compute the output features for location (y_loc) and timestamps (y_ts) separately. Here, y_loc is the probability of the location IDs. The details of decoding from y_ts are provided below.
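A minimal sketch of such a multitask head is shown below, assuming one FC head that outputs location-ID probabilities and one that outputs a per-node speed; the names and shapes are ours, not the paper's.

```python
import torch
import torch.nn as nn

class MultitaskHead(nn.Module):
    """Sketch of the reconstruction head: maps decoder features y to
    location-ID probabilities (y_loc) and per-node speeds (y_ts)."""
    def __init__(self, d_model: int, n_locations: int):
        super().__init__()
        self.loc_head = nn.Linear(d_model, n_locations)  # logits over location IDs
        self.ts_head = nn.Linear(d_model, 1)             # one speed value per node

    def forward(self, y: torch.Tensor):
        # y: (batch, seq, d_model) output features of the Transformer decoder
        y_loc = self.loc_head(y).softmax(dim=-1)   # probability of each location ID
        y_ts = self.ts_head(y).squeeze(-1)         # speed, used later to recover timestamps
        return y_loc, y_ts
```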
4.4. Trajectory reconstruction training   The reconstruction error can be calculated from the cross-entropy (CE) loss for node prediction (4) and the mean squared error (MSE) for speed prediction (5), such that:

    L_CE = −(1/n) Σ_{i=0}^{n} Σ_{j=0}^{n_nodes} p_j log y_{loc,j}    (4)

    L_MSE = (1/n) Σ_{i=0}^{n} (v_i − y_{ts,i})^2    (5)

where p is the probability of the trajectory passing each node and v_i is the speed passing between nodes loc_i and loc_{i+1}.
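As an illustration, losses (4) and (5) could be computed as follows (PyTorch; the epsilon and the tensor shapes are our assumptions):

```python
import torch
import torch.nn.functional as F

def reconstruction_losses(y_loc, y_ts, p, v):
    """Sketch of losses (4) and (5).
    y_loc: (n, n_nodes) predicted location-ID probabilities,
    p:     (n, n_nodes) target node probabilities,
    y_ts:  (n,) predicted speeds, v: (n,) target speeds between consecutive nodes."""
    eps = 1e-9                                                 # numerical safety (our addition)
    l_ce = -(p * torch.log(y_loc + eps)).sum(dim=-1).mean()    # Eq. (4): CE averaged over nodes
    l_mse = F.mse_loss(y_ts, v)                                # Eq. (5): MSE on speeds
    return l_ce, l_mse
```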
4.4.1. Assuming a normal distribution in the latent space   When the model is trained with only the reconstruction error (the weighted sum of (4) and (5), which is the same setting as training an AE), the anonymized trajectories are often unrealistic. This is caused by the sparsity in the latent space of the AE; since a one-to-one mapping is performed by the AE, some subspace of the latent space is left empty. When the generalized latent variables are mapped to such empty space, the decoded trajectory may be broken because the decoder is not trained for such latent variables. The VAE [27] addresses this issue by introducing a normal distribution into the latent space to make it continuous [28]. Although more complex latent space modeling (e.g., Gaussian mixture models or normalizing flows) can be used, the improvement is considered to be marginal [29]. Thus, we train the VAE such that z is sampled from the normal distribution with mean μ and variance σ as follows:
    L_var = E_q[ln p(traj|z)] − KL[q(z|traj) ‖ p(z)]    (6)

    L_KL = KL[q(z|traj) ‖ p(z)] = −(1/2) Σ_{d=1}^{D} (1 + log σ_d^2 − σ_d^2 − μ_d^2)    (7)
where p and q are the prior and posterior distributions, respectively. The first term of (6) is the reconstruction error. The reparameterization trick is used to allow backpropagation during training. The total training loss L_total can be formulated as:

    L_total = L_CE + λ_MSE L_MSE + λ_KL L_KL    (8)

where λ_MSE and λ_KL are balancing weights. The proposed anonymization method is summarized in Algorithm 1.
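For illustration, the reparameterization trick, the closed-form KL term of (7), and the total loss (8) could be implemented as below, assuming the encoder outputs log σ² (a common parameterization that the paper does not state explicitly):

```python
import torch

def reparameterize(mu: torch.Tensor, log_var: torch.Tensor) -> torch.Tensor:
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)."""
    return mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)

def kl_loss(mu: torch.Tensor, log_var: torch.Tensor) -> torch.Tensor:
    """Closed-form KL[q(z|traj) || N(0, I)] summed over the D latent dimensions, Eq. (7)."""
    return -0.5 * torch.sum(1 + log_var - log_var.exp() - mu.pow(2), dim=-1).mean()

def total_loss(l_ce, l_mse, l_kl, lambda_mse: float, lambda_kl: float):
    """Weighted sum of the three losses, Eq. (8)."""
    return l_ce + lambda_mse * l_mse + lambda_kl * l_kl
```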
The dataset was collected from January to August 2017 and January to November 2019 for Saitama and Tsurumi, respectively. For Saitama, the road network is relatively dense; the target area is busy, having both residential and commercial areas. For Tsurumi, the road network is relatively sparse and the target area is residential. The raw data were organized into a set of daily GPS points, with each user assigned a unique ID for each day. The datasets contain the timestamp, latitude, longitude, accuracy, departure/arrival, and direction data for each GPS point. Only the timestamps, latitudes, and longitudes were used in the experiment. Because this study assumes a moving trajectory, we excluded trajectories in which over 25% of the nodes exhibited an instantaneous speed of zero. The minimum and maximum numbers of nodes were 3 and 200, respectively. The timestamps for each node were resampled to every 30 s. The trajectories were map-matched using the OpenStreetMap [30] API and split into sequences of 1 h.

The Transformer architecture was designed based on BERT-small [31,32], which uses an embedding layer with 512 dimensions and four Transformer encoder/decoder layers (with four-head self-attention layers and an FC layer with 512 dimensions). The model was trained for 100 epochs using an NVIDIA RTX8000 GPU with 48 GB of memory. The Adam optimizer [33] was used for training.

The proposed method was compared to the generalization-based method [5], which directly anonymizes the trajectories (denoted as 'direct'), and the segment-based method [9] (denoted as 'segment'). For all methods, the trajectories were greedily grouped to minimize the dynamic time warping (DTW, detailed in Section 5.2) within a group of k trajectories. For the direct method, the nodes on two trajectories were greedily matched to minimize the DTW between them. Then, the center of each bounding box was used as an anonymized node. For the segment method, first, the trajectories were split into segments. Then, the segments with more than k overlaps were preserved to be published.
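As an illustration of this grouping step, a simplified sketch of DTW and the greedy grouping into groups of k trajectories is given below; the exact grouping strategy and distance used in the paper may differ.

```python
import numpy as np

def dtw(a: np.ndarray, b: np.ndarray) -> float:
    """Plain dynamic time warping between two trajectories of (lat, lon) points."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

def greedy_group(trajs: list, k: int) -> list:
    """Greedily form groups of k trajectories with small pairwise DTW (simplified sketch)."""
    remaining = list(range(len(trajs)))
    groups = []
    while len(remaining) >= k:
        seed = remaining.pop(0)
        # pick the k-1 remaining trajectories nearest to the seed
        nearest = sorted(remaining, key=lambda j: dtw(trajs[seed], trajs[j]))[: k - 1]
        for j in nearest:
            remaining.remove(j)
        groups.append([seed] + nearest)
    # any leftover trajectories (fewer than k) would be suppressed to meet k-anonymity
    return groups
```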
Fig. 4. Original (black) and anonymized (blue) trajectories for road networks with different sparsity when k = 4. The anonymized trajectories generated by the direct method [5] pass unrealistic segments, especially for the sparse road network in Tsurumi (bottom). The segment method [9] splits the trajectories into segments, and the trajectory information is lost. The proposed DNN-based method generates realistic trajectories for locations with both dense and sparse road networks.
The proposed method generates realistic trajectories for road networks with different sparsity and preserves the trajectory information. When the road network was sparse (e.g., Tsurumi), the proposed method preserved more trajectory information than the direct and segment methods. For instance, the proposed method generated a realistic trajectory that passes through broader streets for both the Saitama and Tsurumi datasets and preserved the segment information of the original map. Furthermore, the length and direction of the trajectory were similar to those of the original trajectories, preserving trajectory information well.
5.4. Evaluation on segment reservation rate   Segment information loss was evaluated using the segment reservation rate, as presented in Fig. 5. The proposed method outperformed the direct method when k ≥ 16 and k ≥ 8 for the Saitama and Tsurumi datasets, respectively. As presented in the qualitative analysis (Section 5.3), the direct method tends to lose segment information because it greedily generates an anonymized trajectory that only reduces the spatial and temporal distances. The segment method preserved segments better when k was smaller; however, when k was larger, the information loss became larger. The proposed method preserved segment information well, even for large k values. The proposed method outperformed the direct and segment methods by 3.06% and 1.66% on average, respectively, for Saitama (Fig. 5(a)).

The proposed method also performed well for the sparse trajectory data of Tsurumi (Fig. 5(b)). While the target area was both commercial and residential in Saitama, it was residential in Tsurumi and the recorded data were sparser. The direct and segment methods lost more segment information than in the dense area (i.e., Saitama). Notably, the segment reservation rate dropped more for the segment method when k was large, because when the amount of recorded data is small, most segments need to be removed to meet the privacy criteria. The proposed method outperformed both methods for most k values (15.6% and 8.57% better than the direct and segment methods, respectively).

Fig. 5. Segment reservation rate. The proposed method presents a larger segment reservation rate for larger k, and thus better preserves the trajectory distribution than the direct and segment-based methods. ↑ denotes that the performance is better when the value is larger. (a) Saitama and (b) Tsurumi.
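The segment reservation rate itself is defined in Section 5.2 (not reproduced here); assuming it measures the fraction of original road segments that survive in the published data, it could be computed roughly as follows.

```python
def segment_reservation_rate(original_segments, anonymized_segments) -> float:
    """Fraction of original road segments that survive anonymization.
    Each argument is an iterable of hashable segment IDs, e.g. (node_a, node_b) tuples.
    This is an illustrative definition; the paper's exact metric is given in Section 5.2."""
    original = set(original_segments)
    if not original:
        return 0.0
    kept = original & set(anonymized_segments)
    return len(kept) / len(original)
```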
5.5. Evaluation on trajectory information loss   Trajectory information loss was evaluated using the reconstruction error for the location and timestamp and is presented in Table II. The reconstruction error for the segment method cannot be calculated and is not shown in Table II, because it splits a trajectory into segments and the trajectory information is not kept. The location reconstruction error of the proposed method was smaller than that of the direct method when k is greater than 32 (30.5% smaller on average). When k was large, the direct method often generated a spatially distant anonymized trajectory. This is because the direct method is not robust when an outlier is present, and greedily generates an anonymized trajectory at the center of the grouped trajectories. In contrast, the proposed method generated an anonymized trajectory based on the distribution of the original trajectories and is more robust to outliers. While the proposed method's timestamp reconstruction error is mostly larger than that of the direct method, it is kept constant as k gets larger. Although the proposed method can generate reasonable timestamps, the reconstruction loss tends to be large. The analysis and discussion are provided in Sections 5.6 and 6. The speed reconstruction error is smaller for the proposed method than for the direct method for all k. This is because the direct method often generates unrealistic segments in which nodes are spatially far apart. On the other hand, the proposed method has the advantage of generating realistic segments. We observed the same characteristics for both the Saitama and Tsurumi datasets.

Table II. Trajectory reconstruction errors for location, timestamp, and speed for the Saitama dataset

                         Reconstruction error ↓
  k     Method      Location (km)   Timestamp (s)   Speed (m/s)
  16    Direct      10.96           164.98          9.82
        Proposed    11.76           219.66          6.42
  32    Direct      19.21           197.33          10.09
        Proposed    13.41           226.16          6.54
  64    Direct      36.67           223.97          9.62
        Proposed    16.78           225.16          6.51
  128   Direct      66.75           256.95          8.61
        Proposed    19.96           226.54          6.59

Note: The proposed method's reconstruction error is kept small as k becomes larger. ↓ denotes that the performance is better when the value is smaller. The bold value presents the better method between the direct and proposed methods.
5.6. Effectiveness of the proposed segment-aware trajectories   From Table III, the proposed segment-aware method presents a better segment reservation rate than the -sampling method by about 30.0%. The -sampling method failed to preserve the segment information because it generates anonymized trajectories from maps without considering the segment information. Thus, the proposed segment-aware method is more suitable for preserving map information. Furthermore, the location reconstruction error was smaller (by 4.53%). However, the -sampling method presented a smaller timestamp reconstruction error (24.3% better). Timestamp reconstruction is more difficult for our segment-aware method because the speed of each segment needs to be accurately predicted. Because the evaluation dataset included trajectories with different moving speeds, the reconstruction error was large when the prediction was inaccurate. We leave this issue for future work and discuss the advantages and disadvantages of the -sampling and segment-aware methods in Section 6.

Next, different timestamp decoding methods were applied to confirm the effectiveness of using speed. Notably, the naïve method that directly uses time (denoted as 'segment-aware (time)') suffered in reconstructing the timestamp, resulting in a loss increase approximately 60 times greater than that of the speed-decoding method (denoted as 'segment-aware (speed)'). We observed that 'segment-aware (time)' was not robust to decoding errors, and when anonymized via the latent space, the reconstruction error tended to be large. By using speed, 'segment-aware (speed)' was more robust to errors and successfully reduced the reconstruction error compared to time decoding. This robustness also improves the segment reservation rate. We observed the same characteristics for both the Saitama and Tsurumi datasets.
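To illustrate the speed-based decoding compared here, a sketch that accumulates per-segment travel times into timestamps is given below; the haversine distance and the clamping of very small speeds are our assumptions.

```python
import math

def haversine_m(p, q) -> float:
    """Great-circle distance in meters between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = math.sin((lat2 - lat1) / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000.0 * math.asin(math.sqrt(a))

def decode_timestamps(nodes, speeds, t0: float) -> list:
    """Recover timestamps from predicted per-segment speeds (m/s):
    ts[i+1] = ts[i] + distance(node_i, node_{i+1}) / speed_i."""
    ts = [t0]
    for (p, q), v in zip(zip(nodes, nodes[1:]), speeds):
        dt = haversine_m(p, q) / max(v, 0.1)   # clamp tiny speeds to avoid huge time gaps
        ts.append(ts[-1] + dt)
    return ts
```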
5.7. Effectiveness of assuming the normal distribution in the latent space   From Table III, by using the variational loss L_var, the location reconstruction error was successfully reduced by 30.8%. The effectiveness of using the variational loss in the latent space is shown in Fig. 6. The latent variables of the clustered trajectories for k = 32 were projected to the two-dimensional space using t-SNE [35]. Eight clusters were randomly selected for visualization. When the variational loss was not used (Fig. 6(a)), each cluster was sparsely scattered. It can be observed that as the balancing weight λ_KL became larger, the latent space became more continuous (Fig. 6(b)–(d)). When λ_KL was larger, although the segment reservation rate slightly dropped, the location reconstruction loss decreased. The segment reservation rate and location information loss struck a good balance when λ_KL = 0.1.
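The visualization described above can be reproduced along the following lines with scikit-learn's t-SNE; the array names and plotting details are assumptions.

```python
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

def plot_latent_space(z: np.ndarray, cluster_ids: np.ndarray, out_path: str = "latent_tsne.png"):
    """Project latent variables z (n_samples, D) to 2-D with t-SNE and color by cluster."""
    emb = TSNE(n_components=2, init="pca", random_state=0).fit_transform(z)
    for c in np.unique(cluster_ids):
        pts = emb[cluster_ids == c]
        plt.scatter(pts[:, 0], pts[:, 1], s=5, label=f"cluster {c}")
    plt.xlabel("Dim1")
    plt.ylabel("Dim2")
    plt.legend(markerscale=2, fontsize=6)
    plt.savefig(out_path, dpi=200)
```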
Table III. The effectiveness of the trajectory modeling, timestamp decoding, and variational loss in the proposed method
Fig. 6. The latent space of trained models for different balancing weights (λKL ) of the variational loss (Lvar ). When λKL = 0.1, the location
reconstruction error (LRE) is the smallest while the segment reservation rate (SRR) does not drop significantly. (a) λKL = 0 (without
Lvar ), SRR = 0.78, LRE = 12.25 km, (b) λKL = 0.01, SRR = 0.76, LRE = 7.99 km, (c) λKL = 0.1, SRR = 0.76, LRE = 7.65 km, and (d)
λKL = 1, SRR = 0.73, LRE = 8.49 km
6. Discussion

Although the proposed method successfully reduces the spatial information loss, it incurs a larger temporal information loss than prior methods because of the difficulty in predicting the timestamps. Each timestamp sampling method has advantages and disadvantages. For instance, while the -sampling method can accurately model timestamps, it loses the segment information. Although our segment-aware method preserves the location information better, the reconstruction error for the timestamps is larger. Thus, the trajectory modeling method should be selected based on whether the application is spatially or temporally sensitive.

As discussed in previous studies [5], k-anonymization does not guarantee the protection of sensitive data. Although this is beyond the scope of this study, the proposed method can be extended to l-diversity by comparing the trajectories with similar latent variables. Another future direction is to study acceptable information-loss criteria on the data user's side. There is always a trade-off between information loss and privacy performance, and studying the acceptable balance is an important research topic.
7. Conclusion

We proposed a practical trajectory anonymization method that can be used for any road network sparseness and preserves trajectory information. We introduced a DNN-based method in which the trajectories are projected onto the latent space and the latent variables are generalized for anonymization, and presented its effectiveness in anonymizing trajectory data with different road network sparsity. To better preserve the segment information, we proposed segment-aware trajectory modeling. Furthermore, to reduce the reconstruction error, the effectiveness of assuming the normal distribution in the latent space was studied. The proposed method was evaluated using real GPS data, and its practicality over prior methods was presented. Thus, our method builds a foundation for applying DNNs to trajectory anonymization.

Acknowledgments

This work was supported by the JST CREST Grant Number JPMJCR19K1. The authors also express their gratitude to the JST CSTI SIP (the 3rd period of SIP, 'Smart Energy Management System'), the MAFF Commissioned project study (Grant Number JPJ009819), the MOE Demonstration Project 'FY2022 Technology Development and Demonstration Project for Regional Symbiosis and Cross-Sectoral Carbon Neutrality (Second Round)', and the commissioned research (No. JPJ012368C08001) by the National Institute of Information and Communications Technology (NICT), Japan.
References

(1) Al-Hussaeni K, Fung BC, Iqbal F, Dagher GG, Park EG. SafePath: Differentially-private publishing of passenger trajectories in transportation systems. Computer Networks 2018; 143:126–139.
(2) Zang H, Bolot J. Anonymization of location data does not work: A large-scale measurement study. In Proceedings of the 17th Annual International Conference on Mobile Computing and Networking. Las Vegas, Nevada: ACM; 2011; 145–156.
(3) De Montjoye Y-A, Hidalgo CA, Verleysen M, Blondel VD. Unique in the crowd: The privacy bounds of human mobility. Scientific Reports 2013; 3(1):1–5.
(4) Sweeney L. k-anonymity: A model for protecting privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 2002; 10(5):557–570.
(5) Nergiz ME, Atzori M, Saygin Y. Towards trajectory anonymization: A generalization-based approach. In Proceedings of the SIGSPATIAL ACM GIS 2008 International Workshop on Security and Privacy in GIS and LBS. Irvine, CA: ACM; 2008; 52–61.
(6) Gramaglia M, Fiore M, Furno A, Stanica R. GLOVE: Towards privacy-preserving publishing of record-level-truthful mobile phone trajectories. ACM/IMS Transactions on Data Science (TDS) 2021; 2(31–36):1–36.
(7) Domingo-Ferrer J, Martínez S, Sánchez D. Decentralized k-anonymization of trajectories via privacy-preserving tit-for-tat. Computer Communications 2022; 190:57–68.
(8) Mahdavifar S, Deldar F, Mahdikhani H. Personalized privacy-preserving publication of trajectory data by generalization and distortion of moving points. Journal of Network and Systems Management 2022; 30(1):10.
(9) Hashimoto M, Morishima R, Nishi H. Low-information-loss anonymization of trajectory data considering map information. In 2020 IEEE 29th International Symposium on Industrial Electronics (ISIE). Delft: IEEE; 2020; 499–504.
(10) Nakamura T, Sakuma Y, Nishi H. Face-image anonymization as an application of multidimensional data k-anonymizer. International Journal of Networking and Computing 2021; 11(1):102–119.
(11) Le M-H, Khan MSN, Tsaloli G, Carlsson N, Buchegger S. AnonFaces: Anonymizing faces adjusted to constraints on efficacy and security. In Proceedings of the 19th Workshop on Privacy in the Electronic Society. Virtual: ACM; 2020; 87–100.
(12) Karras T, Laine S, Aittala M, Hellsten J, Lehtinen J, Aila T. Analyzing and improving the image quality of StyleGAN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, Washington: IEEE; 2020; 8110–8119.
(13) Sakuma Y, Tran TP, Iwai T, Nishikawa A, Nishi H. Trajectory anonymization through Laplace noise addition in latent space. In 2021 Ninth International Symposium on Computing and Networking (CANDAR). Matsue, Japan: IEEE; 2021; 65–73.
(14) Kim JW, Jang B. Privacy-preserving generation and publication of synthetic trajectory microdata: A comprehensive survey. Journal of Network and Computer Applications 2024; 230:103951.
(15) Si J, Yang J, Xiang Y, Wang H, Li L, Zhang R, Tu B, Chen X. TrajBERT: BERT-based trajectory recovery with spatial-temporal refinement for implicit sparse trajectories. IEEE Transactions on Mobile Computing 2023; 23:4849–4860.
(16) Xia T, Qi Y, Feng J, Xu F, Sun F, Guo D, Li Y. AttnMove: History enhanced trajectory recovery via attentional network. Proceedings of the AAAI Conference on Artificial Intelligence 2021; 35:4494–4502.
(17) Abul O, Bonchi F, Nanni M. Anonymization of moving objects databases by clustering and perturbation. Information Systems 2010; 35(8):884–910.
(18) Chen S, Fu A, Shen J, Yu S, Wang H, Sun H. RNN-DP: A new differential privacy scheme based on recurrent neural network for dynamic trajectory privacy protection. Journal of Network and Computer Applications 2020; 168:102736.
(19) Hochreiter S, Schmidhuber J. Long short-term memory. Neural Computation 1997; 9(8):1735–1780.
(20) Mao W, Xu C, Zhu Q, Chen S, Wang Y. Leapfrog diffusion model for stochastic trajectory prediction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE; 2023; 5517–5526.
(21) Li R, Li C, Ren D, Chen G, Yuan Y, Wang G. BCDiff: Bidirectional consistent diffusion for instantaneous trajectory prediction. Advances in Neural Information Processing Systems 2024; 36:14400–14413.
(22) Ren H, Ruan S, Li Y, Bao J, Meng C, Li R, Zheng Y. MTrajRec: Map-constrained trajectory recovery via seq2seq multi-task learning. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. New York: ACM; 2021; 1410–1419.
(23) Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I. Attention is all you need. Advances in Neural Information Processing Systems 2017; 30.
(24) Agoop. https://agoop.co.jp [Accessed August 11, 2024]
(25) Devlin J, Chang M-W, Lee K, Toutanova K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805. 2018.
(26) Fan A, Lewis M, Dauphin Y. Hierarchical neural story generation. arXiv preprint arXiv:1805.04833. 2018.
(27) Kingma DP, Welling M. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114. 2013.
(28) Davidson TR, Falorsi L, De Cao N, Kipf T, Tomczak JM. Hyperspherical variational auto-encoders. CoRR. 2018.
(29) Tomczak J, Welling M. VAE with a VampPrior. International Conference on Artificial Intelligence and Statistics 2018:1214–1223.
(30) OpenStreetMap. https://www.openstreetmap.org/ [Accessed August 11, 2024]
(31) Bhargava P, Drozd A, Rogers A. Generalization in NLI: Ways (not) to go beyond simple heuristics. arXiv preprint arXiv:2110.01518. 2021.
(32) Turc I, Chang M-W, Lee K, Toutanova K. Well-read students learn better: The impact of student initialization on knowledge distillation. arXiv preprint arXiv:1908.08962. 2019.
(33) Kingma DP, Ba J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980. 2014.
(34) Velichko V, Zagoruyko N. Automatic recognition of 200 words. International Journal of Man-Machine Studies 1970; 2(3):223–234.
(35) Van der Maaten L, Hinton G. Visualizing data using t-SNE. Journal of Machine Learning Research 2008; 9(11):2579–2605.
Yuiko Sakuma (Non-Member) received her B.E., M.E., and Ph.D. degrees from Keio University, Japan, in 2018, 2020, and 2024, respectively. Her research interests are energy management systems and deep learning.

Hiroaki Nishi (Member) received his B.E., M.E., and Ph.D. degrees from Keio University, Japan, in 1994, 1996, and 1999, respectively. The main theme of his current research is building a total network system, including the development of hardware and software architecture. He places great importance on considering the requirements of the future highly-networked information society. He has expertise in researching next-generation IP router architecture, Data Anonymization Infrastructure, and the Smart City/Smart Community.