Mobility Episode Detection from CDR's Data using Switching Kalman Filter

Oleg Batrashev, Amnir Hadachi, Artjom Lind, Eero Vainikko
Institute of Computer Science, University of Tartu, Estonia
ABSTRACT

The detection of stay, jump, and moving episodes using only cellular data is a big challenge due to the nature of the data. In this article, we propose a method to automatically detect the movement episodes (stay, jump, and moving) from sparsely sampled spatio-temporal data, in our case Call Detail Records (CDRs), using a switching Kalman filter with a new integrated movement model and a cellular coverage optimization approach. The algorithm is capable of estimating the movement episodes and classifying the trajectory sequences associated with a stay, a jump, or a moving action. The result of this approach can be beneficial for applications using cellular data related to traffic management, mobility profiling, and semantic enrichment.

Categories and Subject Descriptors
G.3 [Probability and Statistics]: Markov processes, probabilistic algorithms, time series analysis

General Terms
Measurement

Keywords
Switching Kalman filter, Call Detail Records

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
MobiGIS'15 November 03-06, 2015, Bellevue, WA, USA
Copyright 2015 ACM ISBN 978-1-4503-3977-3/15/11 ...$15.00.
DOI: http://dx.doi.org/10.1145/2834126.2834139

1. INTRODUCTION

1.1 Overview

The use of spatio-temporal data, such as data from cellular phones, has increased rapidly in recent years, and so has the amount of data collected about the devices and their users. This information can be used to understand human mobility and behavior. From this perspective, many models have been created to represent real people's movement [16, 8, 5, 7]. Some of the methods focus on the device's physical location, for example cell tower or coverage area lookup, triangulation of cell towers or WiFi access points, and the Global Positioning System, with considerable variance in accuracy levels. Other attempts use the Call Detail Records (CDRs) data collected from cellular networks and provided by the mobile operators. CDRs data contains data events. These events carry information about the subscriber's ID (anonymized), timestamp, cell ID, azimuth, etc. The use of this kind of data by researchers has increased widely, and we can distinguish two areas of research. The first one is based on quantitative methods to model human mobility characteristics. For example, in [15] the authors proposed a statistically self-consistent microscopic model for individual human mobility. This model indicated that people have a tendency to visit some locations frequently (home and work).

Another approach for modeling human mobility uses probability distributions drawn from empirical CDRs data, as presented in [14]. The results show that the model is capable of illustrating real human movement behaviors. In the same category, the authors in [6] proposed a novel approach to estimate transportation mode based on coarse-grained call detail records. This method makes it possible to estimate the means of transportation when the start and end destinations are known.

The second area concerns using data mining techniques to learn frequent patterns and association rules of human behavior from large and complex datasets. Most of the existing mobility models focus on extracting mobility patterns [13] and investigating the mobility patterns and their characteristics [2]. The authors in [3] used CDRs data and showed that there is a strong correlation between the origins of people attending an event and the type of event. Besides, the authors of [4] presented a clustering and regression algorithm to identify locations where people live.

However, all these methods were based on the use of CDRs, and this kind of data presents challenges due to its sparseness and its large scale [9]. Thus, it is difficult to reach a strong conclusion on the movement behaviors. For this reason, we propose a method to help overcome these issues.

In this paper we put forth an algorithm to automatically
detect the movement episodes, Stay-Jump¹-and-Moving, in the CDRs data using a switching Kalman filter with an integrated movement model. Before executing the detection process, we need to extend the observed cellular network coverage areas that appear in the trajectories contained in the CDRs data. This is done in order to make the network coverage area as close to reality as possible.

¹ A jump is a change in location with no logic of smooth continuous movement.

1.2 Problem statement

The main problem that we are trying to solve in this article is to detect the mobility episodes "Stay-Jump-and-Moving" using the CDRs data. The fact that the CDRs data are sparsely sampled in time and space [12] makes it very challenging to recognize the movement episodes in the trajectories of the users. Moreover, the trajectories are a set of cell coverage area locations; therefore, it is hard to distinguish whether the user is moving or not when he/she is located in one of the cells. Hence, our approach is to model the moving and stay actions by using coverage optimization and displacement features, with the support of the Switching Kalman filter (SKF) to estimate the episodes.

2. SYSTEM DESIGN AND ARCHITECTURE

The system design and architecture is divided into two main sections, as illustrated in Figure 1. The first section is related to the preprocessing of the data and is composed of three steps. The first step is the extraction of the trajectories from the CDRs data; its output is given as an input to the system core and also to the coupling process step. Next, the coupling process associates the cell data with the corresponding GPS data of the cellular phones with regard to the event occurrences. Then, this information is used in the coverage optimization algorithm to enhance the coverage areas of the cells (network coverage areas: "cellplan"). After this process, the optimized "cellplan" is used by the system core. In the system core, we apply the Switching Kalman filter algorithm (SKF) for the mobility episode detection, together with its mobility model component.

Figure 1: System Architecture

3. CELL COVERAGE OPTIMIZATION

The Kalman filter (KF) requires observations to be given with a multivariate Gaussian distribution $N(\mu, R)$, where $\mu$ is the observation vector and the matrix $R$ defines the error (noise) of the measurement. It is possible to approximate cell coverage with an ellipse and define $\mu$ and $R$ through the ellipse. This means that in case of an event in a cell, we "observe" the user in the ellipse center, and the observation error is defined by the cell coverage area. This is not realistic, because the Gaussian distribution associates higher probability with the center of the cell coverage area, but it may be good enough for our purposes.

The ellipse for a cell may be computed from the cell polygon by any kind of approximation. Unfortunately, we discovered cases where the coverage areas or the polygons of two cells were distant but the trajectory data clearly showed hopping between the cells. Thus, the cells were covering more of the area than the polygons suggested. We decided to extend the ellipses after the initial approximation, which posed the question: in which direction and to what extent?

We tried to use the antenna azimuth and the signal coverage polygon, provided by the operator, using several different techniques. The final problem was that these techniques rely on lots of cell data, and the cell plan from the operator is updated too often to get enough data for the new cells. The most straightforward way is to use GPS locations of the smartphones coupled with the cell events to improve the coverage.

3.1 Coverage extension with GPS

Extending coverage areas in such a way that they cover all their associated GPS locations of the cellular phones is done by mathematical optimization, namely by minimizing a penalty function $f(e)$. The constructed function penalizes large distances $d(x, r, y)$ between the cell coverage circle $(x, r)$ and the GPS coordinates $y$ at the time of the cell events. It also penalizes large radius extensions $e_i$, i.e. the antenna should not cover more of the area than is prescribed by the GPS.

Let $x_i^o$ and $r_i^o$ be the initial circle centers and radii, and let $x_i'$ be the 'pin' points: the locations of cell towers that, together with the initial coverage circles, provide the primary direction of area extension. The current center and the radius of cell $i$ are computed by

\[ x_i = x_i^o + \frac{e_i}{r_i^o}\,(x_i^o - x_i'), \qquad r_i = r_i^o + e_i. \]

Let $C$ be the set of all pairs $(j, y)$ of a cell index and its event GPS coordinates. The penalty function is:

\[ f(e) = \sum_i e_i^2 + w \sum_{(j,y) \in C} \left[\min\left(0, d(x_j, r_j, y)\right)\right]^2, \tag{1} \]

where $d(x, r, y) = r - |x - y|$ and $w = 10$ is the weight of the non-coverage penalty. We apply the L-BFGS-B algorithm [10, 1] implemented in scipy to minimize the penalty function.

4. MOBILITY EPISODES DETECTION

Detecting mobility episodes using CDRs is a difficult task because of the hops, even during movement. Therefore, inspecting several events in sequence with consideration of the coverage areas is a must. In this section, we describe the use of the Kalman filter (KF) to solve this problem. Besides, in order to take the user's mobility behaviors into consideration, we apply the Switching Kalman Filter, which introduces a discrete state variable to KF.

4.1 Kalman filter

Kalman filter is a dynamic linear system defined as follows:
\[ x_t = F x_{t-1} + q_t \tag{2} \]
\[ y_t = H x_t + r_t, \tag{3} \]

with hidden state $x_t$, observable evidence $y_t$, transition matrix $F$, observation matrix $H$, and random Gaussian noises for transition $q_t \sim N(0, Q_t)$ and for observation $r_t \sim N(0, R_t)$. The Kalman filter represents the belief states of $x_t$ using continuous random variables $X_t$ with Gaussian probability distributions $x_t \sim P(X_t = x) = N(x; \mu_t, \Sigma_t)$. Inference algorithms can be used to calculate the probability distribution of $X_t$ from the evidence up to date $y_{1:t}$ (filtering) or from all the evidence $y_{1:T}$ (smoothing):

\[ P(X_t \mid y_{1:t}) = N(\mu_{t|t}, \Sigma_{t|t}) \tag{4} \]
\[ P(X_t \mid y_{1:T}) = N(\mu_{t|T}, \Sigma_{t|T}) \tag{5} \]

where $T$ is the total number of time steps and the symbol after the vertical bar (i.e. $t$ or $T$, here and further) signifies how much evidence is used to infer the value. The cornerstone property of KF is that if the prior $x_0$ has a Gaussian distribution, then the inferred distribution of any variable is also Gaussian.

The Kalman filter is good for processes with linear dependencies within state variables, e.g. the movement of a ballistic rocket described by coordinates, velocity, and gravity forces. The applicability of Kalman filters to the events in a mobile network is not obvious, because a mobile user usually makes untracked turns between the arrival of two events, and the observations are not points with Gaussian noise but coverage areas of antennas. Nevertheless, the approach still makes sense: for example, if a user is moving from home to work, then he often preserves the general direction and the speed of the movement.

4.2 Switching Kalman filter

The Kalman filter only allows for a single transition matrix at each time step $t$, whereas there are usually several types of behavior to choose from. In fact, we are interested in the most probable behavior given the evidence. This is done by adding discrete random variables to the Kalman Filter, which results in the switching Kalman Filter [11].

The discrete random variable $S_t$ defines the model that is used for the transition to step $t$. SKF calculates the probability of each model at time $t$, again given the evidence up to date or all the evidence,

\[ M_{t|t}(i) = P(S_t = i \mid y_{1:t}) \tag{6} \]
\[ M_{t|T}(i) = P(S_t = i \mid y_{1:T}) \tag{7} \]

as well as the probability distribution of the hidden state variable associated with each model:

\[ P(X_t \mid S_t = i, y_{1:\tau}) = N(\mu^i_{t|\tau}, \Sigma^i_{t|\tau}), \qquad \tau \in \{t, T\} \]

The consolidated belief state of the hidden variable $X_t$ at time $t$ is represented as the mixture of Gaussians of all models scaled by the probabilities of the models:

\[ P(X_t \mid y_{1:\tau}) = \sum_i M_{t|\tau}(i) \cdot P(X_t \mid S_t = i, y_{1:\tau}) \tag{8} \]

Finally, it is necessary to define the model transition probability matrix $Z(i, j) = P(S_t = j \mid S_{t-1} = i)$. We define the transition probability from $S_{t-1} = i$ to $S_t = j$ with a higher chance to stay in the same model ($N_m$ is the number of models):

\[ Z(i, j) = \begin{cases} 0.8 & \text{if } i = j \\ \dfrac{0.2}{N_m - 1} & \text{otherwise} \end{cases} \]

The problem with SKF is that the number of Gaussians is multiplied by the number of models at each step, which gives rise to exponential growth of the belief state. There are several solutions listed in [11], and we use the collapsing approach with the GPB2 algorithm² described in that paper. The results for filtering and smoothing are computed using predefined models of behavior, i.e. no learning is implemented.

² Generalized Pseudo Bayesian algorithm of order 2.

4.3 Model general structure

Taking location $\bar{x}_t$ and velocity $\dot{x}_t$ at time $t$ as hidden variables $x_t = (\bar{x}_t, \dot{x}_t)^\top$ and specifying that the coordinates and velocities of a moving user satisfy the following equations:

\[ \bar{x}_t = \bar{x}_{t-1} + \dot{x}_{t-1} \Delta + \bar{q}_t \tag{9} \]
\[ \dot{x}_t = \dot{x}_{t-1} + \dot{q}_t \tag{10} \]

where $\Delta$ is the time difference from the previous event, results in the Bayes Net shown in Figure 2. In general, for each model we need to define the transition matrix $F$ and the noise variance matrix $Q$. For example, the transition matrix and the noise for a user moving on a 2D plane (Move model) are:

\[ F^M = \begin{pmatrix} 1 & 0 & \Delta & 0 \\ 0 & 1 & 0 & \Delta \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} \bar{q} \\ \dot{q} \end{pmatrix} \sim N(0, Q^M) \tag{11} \]

and for a user staying in one location the transition matrix is just the identity, $F^S = I$.

Figure 2: Bayes Net for Switching Kalman Filter with nodes $S$, $\dot{X}$, $\bar{X}$, and $Y$ at steps $t-1$ and $t$: discrete variables are rectangular, continuous variables are ellipses, and filled are observable variables.

The observation model ($H$ and $R$) is the same for all transition models but may be different in each time step. An observation should simply reflect the location of the antenna that the user is connected to, and the observation error is defined by the coverage area of the antenna. In our test examples we set the same model for all "antennas":

\[ H = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}, \qquad R = \begin{pmatrix} 1.2^2 & 0 \\ 0 & 1.2^2 \end{pmatrix}. \tag{12} \]

Figure 3 shows observation points as gray circles, with the error $r \sim N(0, R)$ drawn as a circle of radius $2\sigma$, twice the standard deviation of the Gaussian. The initial position is shown with a red cross.
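
To make Equations (2), (3), (11), and (12) concrete, the following Python sketch builds the Move-model transition matrix and the observation model, then runs one standard Kalman predict/update cycle. It is a minimal illustration under stated assumptions, not the authors' implementation: the names `move_transition`, `kf_step`, and `Q_example`, as well as the initial belief and the noise values, are hypothetical and chosen only for demonstration.

```python
import numpy as np

def move_transition(dt):
    """Transition matrix F^M of Eq. (11) for the state (x, y, vx, vy)."""
    F = np.eye(4)
    F[0, 2] = dt  # x advances by vx * dt
    F[1, 3] = dt  # y advances by vy * dt
    return F

# Observation model of Eq. (12): only the location is observed.
H = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
R = np.diag([1.2 ** 2, 1.2 ** 2])

def kf_step(mu, Sigma, y, F, Q, H=H, R=R):
    """One Kalman predict/update cycle; returns the filtered mean and covariance."""
    # Predict with x_t = F x_{t-1} + q_t, q_t ~ N(0, Q)
    mu_pred = F @ mu
    Sigma_pred = F @ Sigma @ F.T + Q
    # Update with y_t = H x_t + r_t, r_t ~ N(0, R)
    innovation = y - H @ mu_pred
    S = H @ Sigma_pred @ H.T + R
    K = Sigma_pred @ H.T @ np.linalg.inv(S)
    mu_new = mu_pred + K @ innovation
    Sigma_new = (np.eye(len(mu)) - K @ H) @ Sigma_pred
    return mu_new, Sigma_new

# Hypothetical example values (assumptions, not taken from the paper's experiments).
mu0 = np.array([0.0, 0.0, 1.0, 0.0])       # at the origin, moving along x
Sigma0 = np.eye(4)
Q_example = np.diag([0.02 ** 2, 0.02 ** 2, 1.0, 1.0])
y1 = np.array([2.0, 0.0])                   # observed antenna location
mu1, Sigma1 = kf_step(mu0, Sigma0, y1, move_transition(2.0), Q_example)
print(mu1)
```

Because the observation in this example coincides with the predicted location, the update leaves the predicted mean unchanged and only shrinks the covariance.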

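
The model transition matrix $Z(i, j)$ of Section 4.2 and the mixture of Equation (8) can also be sketched in a few lines of Python. This is an illustrative sketch only: the function names `transition_matrix` and `mixture_moments` are hypothetical, and collapsing the mixture to a single Gaussian by moment matching is an assumption about how a collapsing step is commonly done, since the text above does not spell it out.

```python
import numpy as np

def transition_matrix(n_models, p_same=0.8):
    """Z[i, j] = P(S_t = j | S_{t-1} = i): p_same on the diagonal,
    the remaining probability shared equally among the other models."""
    off_diagonal = (1.0 - p_same) / (n_models - 1)
    Z = np.full((n_models, n_models), off_diagonal)
    np.fill_diagonal(Z, p_same)
    return Z

def mixture_moments(weights, means, covs):
    """Mean and covariance of the Gaussian mixture in Eq. (8) (moment matching)."""
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    mean = np.einsum("i,ij->j", weights, means)
    diffs = means - mean
    # Weighted within-model covariance plus the spread of the model means.
    cov = (np.einsum("i,ijk->jk", weights, covs)
           + np.einsum("i,ij,ik->jk", weights, diffs, diffs))
    return mean, cov

Z = transition_matrix(3)
mean, cov = mixture_moments([0.5, 0.5],
                            [[0.0, 0.0], [2.0, 0.0]],
                            [np.eye(2), np.eye(2)])
print(Z)
print(mean, cov)
```

With three models, each row of `Z` holds 0.8 on the diagonal and 0.1 elsewhere, matching the definition above.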
As a result, given the evidence up to date $y_{1:t}$, the algorithm computes the probabilities $P(S_t = k \mid y_{1:t})$ of each model $k$ at time $t$, and the probability distribution of coordinate and velocity $P(X_t \mid S_t = k, y_{1:t})$. The same is done using all the evidence $y_{1:T}$, which gives smoothed results that are more accurate and are therefore used in actual testing and validation.

4.4 Predefined models

In this section, we describe the models and show how the Switching Kalman Filter behaves. Consider the two models:

• Stay model: the transition matrix is the identity $F^S = I$, and the transition noise for location $a_S = 0.02$ is very small but not for velocity $a_M = 1.0$, therefore $Q^S = \mathrm{diag}(a_S^2, a_S^2, a_M^2, a_M^2)$. Although velocity is not used in the model, we need to allow it to appear anyway, so that the Move model (see further) would work.

• Jump model: the transition matrix is also the identity $F^J = I$, but the transition noise is larger for location, $Q^J = \mathrm{diag}(a_J^2, a_J^2, a_M^2, a_M^2)$ where $a_J = 2.0$.

The probabilities of each model and the predicted locations are shown in Figure 3. Case (a) puts two groups of observations slightly distant from each other, and case (b) mixes them together. In the first case the groups are split and the smoothing algorithm assigns more probability to the Jump model for point 4, meaning there is a shift in user location.

Figure 3: Observations, predicted trajectory and probabilities for Stay and Jump models. Panels: (a) distant groups of towers, (b) overlapping areas. The lower plots show the model probability (M) per step for Stay (R=0.02) and Jump (R=2.00), filtered and smoothed.

The working property behind the Jump model is that the tails of the Gaussian density function are longer for larger variance. Having variances $\sigma_1^2$ and $\sigma_2^2$, the intersection is defined by the following equation:

\[ x^2 = \frac{\sigma_1^2 \sigma_2^2}{\sigma_2^2 - \sigma_1^2} \, 2 \ln \frac{\sigma_2}{\sigma_1} \tag{13} \]

which for the Stay and Jump models gives $x = \pm 0.0607$. See the comparison of $N(0, Q^S)$ and $N(0, Q^J)$ in Figure 4.

Figure 4: Noise probability distributions of Stay and Jump models

The third model takes event time and user velocity into account, where an observation consists of the coverage area and the time $\tau$ of the observation:

• Move model: with $F^M = F$ (from Equation 11) and $Q^M = \mathrm{diag}(a_S^2, a_S^2, a_M^2, a_M^2)$ where $a_M = 20$.

An example is shown in Figure 5, where the algorithm correctly identifies movement starting at time step 4. Notice that the filtering algorithm (dashed lines) is not confident about the 4th step, because it does not consider future evidence. Observations number 7 and 8 are swapped in space, but SKF attributes this to the observation error.

Figure 5: Coordinates and probabilities for Stay and Move models where the interval between observations 7 and 8 is comparable to other intervals.

If we set the time difference between events 7 and 8 much larger (e.g. 10 times), then the algorithm identifies a stop at point 8. This effect is shown in Figure 6, where all 3 models are combined into one simulation. As seen from the figure, the Jump model may be preferred over the Move model at the start of a movement or in the case of random jumps, because the Move model tries to preserve the inertia.

One problem here is that the Move model should be as good as the Stay model, because it uses the same noise for location and low velocities are not penalized. The solution is to multiply the likelihood $L$ of the model in the filtering part of [11] by the square of the normalized velocity if it is slower than the "normal" velocity.

Another issue is that the algorithm may prefer many small jumps at the beginning and/or the end of one big jump. The Jump model is more about a one-time change of location,
therefore we decreased the probability to maintain the jump state: $P(S_t = \text{jump} \mid S_{t-1} = \text{jump}) = 0.1$.

Figure 6: Coordinates and probabilities for Stay, Jump and Move models where the interval between observations 7 and 8 is ten times longer.

The overall rationale behind the models is the following:

• If a user is moving by bus or car and has frequent events in the cell network (e.g. a GPRS connection), it is possible to track him using the Move model.

• Otherwise, if a user has sparse events on the cells around one location, then the Stay model is inferred by SKF; if the location of the user has changed, SKF should infer the new location using the Jump model.

4.5 Smoothing fix

There is a simplification in the original SKF algorithm [11] that approximates the probability computation of the current model during the smoothing part:

\[ U_t^{j|k} = P(S_t = j \mid S_{t+1} = k, y_{1:T}) \approx P(S_t = j \mid S_{t+1} = k, y_{1:t}). \]

This simplification may not be too bad, provided that the future evidence $y_{t+1:T}$ does not contain much more information about the current state $S_t$ beyond that contained in the next state $S_{t+1}$.

There are cases when event $t+1$ follows within several seconds after event $t$ and is on a cell tower that is 10 km away. This means that the two events happen between the towers, where the coverage of the two cells overlaps, and event $t$ must be considered as the move to that location. If the next event location $y_{t+1}$ is not taken into account, the current state is left as $S_t = \text{stay}$ and the algorithm pulls the stay location 5 km ahead (see Figure 7). The "weakening" of the simplification by taking the following evidence $y_{t+1}$ into account solves the problem:

\[ P(S_t = j \mid S_{t+1} = k, y_{1:T}) \approx P(S_t = j \mid S_{t+1} = k, y_{1:t+1}) \]

and the formula for computing $U_t^{j|k}$ in the algorithm becomes³

\[ U_t^{j|k} = \frac{M_{t|t}(j)\, Z(j, k)\, L_{t+1}^{j(k)}}{\left(\sum_{j} M_{t|t}(j)\, L_{t+1}^{j(k)}\right) \left(\sum_{j'} M_{t|t}(j')\, Z(j', k)\right)}, \]

which must be followed by the normalization $\bar{U}_t^{j|k} = U_t^{j|k} / \sum_j U_t^{j|k}$.

³ Refer to [11] for notation and meaning.

Figure 7: Short time between events 3 and 4 pulls first locations out of the coverage areas

There are still cases that can benefit from taking even more future evidence into account, but we leave them for future work.

5. EXPERIMENTATION AND RESULTS

5.1 Data gathering

The data gathering process has two main steps. The first one is the collection of reference data; in our case this is GPS data from the smartphones used by the users. This GPS data is used as reference data in order to get an idea about the performance of our algorithm. More details about this process are given in the next section (Testing strategy). The second step is getting the CDR data from the mobile operator for the same time period in which we were collecting GPS data. This way we are able to evaluate the estimations of our algorithm when it uses this CDRs data as test data. Then, the output of the algorithm is compared with our reference data (GPS data).

5.2 Testing strategy

In this section, we explain our approach to evaluating the proposed algorithm. The evaluation is based on GPS data collected in the field using mobile phones as reference data. Meanwhile, we used the CDRs collected by the mobile operator for the same mobile phones utilized in the GPS collection campaign as an input for our system. After our algorithm gives its estimation of the movement episodes, we use the timestamp from the CDRs as a means to find the corresponding GPS data. Then, the GPS data should be labelled (Stay, Jump, and Move). In order to execute this labeling task on the GPS data, we used the following logic:

1. First we check the distance between each successive part of the GPS data, and if we find that the distance is greater than or equal to a specified threshold, we label the position as a 'Jump'. Otherwise it is a 'Stay'.

2. The second step is to run a check on the GPS data labeled as a 'Jump' in the previous step, because they could be potential 'Move'-s. Therefore, we collect the speed measurements for those GPS locations around
the concerned GPS data and check the average speed. If it is above a specific threshold, then the label should be changed to a 'Move'; otherwise it stays as a 'Jump'.

Finally, the GPS data is labelled with the movement episodes and we can compare it to the movement episode detection output of the algorithm. Moreover, we tested the system both with and without the coverage optimization, in order to get an idea about the effect of the coverage optimization on the system and its performance.

5.3 Results and discussion

The data used for testing contains 317 CDR records from different users. In this CDRs data we have 145 Stay locations, 97 Move locations, and 75 Jump locations based on the GPS data used as reference data. The first test was to run the system without the coverage optimization; the results are given in the confusion matrix in Table 1.

Table 1: Confusion Matrix "System Test Without Coverage Optimization"
(rows: estimated SKF movement episodes; columns: actual GPS movement episodes)

Estimated \ Actual    Stay    Move    Jump
Stay                   109      15      52
Move                     8      52      12
Jump                    28      30      12

The results in Table 1 show that our system is capable of detecting Stay and Move episodes better than Jump episodes. When the cells' coverage areas are too small, the system introduces jumps between hopping cells, although the user may be staying or moving forward (28 and 30 jumps, respectively). After adding the coverage optimization to the system, the problem is alleviated (see Table 2).

Table 2: Confusion Matrix "System Test With Coverage Optimization"
(rows: estimated SKF movement episodes; columns: actual GPS movement episodes)

Estimated \ Actual    Stay    Move    Jump
Stay                   134       8      61
Move                     6      84       5
Jump                     5       5       9

Another issue is the very high number of Jump episodes (52, and 61 after optimization) identified by GPS but not by SKF. These are mostly small shifts in the location of a user (a couple of hundred meters) that are not recognized by the Kalman Filter because of the sparsity of the CDRs data and the cell network. We nevertheless see that the system assigns a fairly high probability to the Jump model (up to 42%) for these location shifts, so there is a chance that they can be recognized.

Moreover, Figure 8 gives a general view of the system performance, where we can see clearly that the Stay model and the Move model were detectable by the system with a good detection rate. In general, the algorithm gives very interesting results regarding the detection of Move and Stay movement episodes (Figure 9). However, there is more work to be done to calibrate and enhance the Jump model in order to make the detection better.

Figure 8: System Correct Detection Performance

Figure 9: A user visits national park about 50km away from the city.
(a) Triangles are GPS locations: red - fast movement, green and blue - walk and stops. Initial polygons and extended coverage areas of the cells active during the walk (excluding the drive, i.e. red triangles). Small brown circles are the mobile network towers.
(b) Circles are SKF predictions: red - Move model, blue - Stay model. Dots connect predicted locations with the actual GPS location at the time of event.

6. CONCLUSION

In this article, we proposed an automatic algorithm to detect movement episodes (Stay-Jump-and-Move) in the CDRs data. The approach is based on the switching Kalman filter with three embedded movement models and a network coverage optimization technique. The system has proven to be good at detecting Stay and Move episodes, with an accuracy of 92% for the Stay model, 86% for the Move model, and 12% for the Jump model. Overall, the results are encouraging; nevertheless, there is more work to be done on the Jump model in order to make the detection better. Finally, this work is a promising step towards applications in transportation management using Call Detail Records (CDRs) or smartphones' data.

Acknowledgments

The authors gratefully acknowledge the contribution of The Software Technology and Applications Competence Centre (STACC) through the Large-scale Mobile Positioning Data Mining (Demograft) project and all the partners in the Archimedes project "The Real-time Location-based Big Data Algorithms" for their help in providing the data. This research was supported by the European Regional Development Fund through the Estonian Center of Excellence in Computer Science (EXCS).

7. REFERENCES

[1] C. Zhu, R. H. Byrd, and J. Nocedal. L-BFGS-B: Algorithm 778: L-BFGS-B, Fortran routines for large scale bound constrained optimization. ACM Transactions on Mathematical Software, 23(4):550–560, 1997.
[2] Rodriguez-Carrion A.; Das S.K.; Campo C.; Garcia-Rubio C. Impact of location history collection schemes on observed human mobility features. IEEE International Conference on Pervasive Computing and Communications Workshops (PERCOM Workshops), pages 254–259, 2014.
[3] Calabrese F; Pereira F C; Di Lorenzo G; et al. The geography of taste: analyzing cell-phone mobility and social events. Pervasive Computing. Springer Berlin Heidelberg, pages 22–37, 2010.
[4] Isaacman S; Becker R; Cáceres R; et al. Identifying important places in people's lives from cellular network data. Pervasive Computing. Springer Berlin Heidelberg, pages 133–151, 2011.
[5] Jie Tian; Yongyao Jiang; Yuqi Chen; Wenjun Li; et al. Automated human mobility mode detection based on GPS tracking data. 22nd International Conference on Geoinformatics (GeoInformatics), pages 1–6, 2014.
[6] Wang H; Calabrese F; Di Lorenzo G; et al. Transportation mode inference from anonymized and aggregated mobile phone call detail records. Intelligent Transportation Systems (ITSC), pages 318–323, 2010.
[7] Xingqin Lin; Fleming P.J.; Andrews J.G. Fundamentals of mobility in cellular networks: Modeling and analysis. IEEE Global Communications Conference (GLOBECOM), pages 5433–5438, 2012.
[8] Kanasugi H.; Sekimoto Y.; Kurokawa M.; Watanabe T.; et al. Spatiotemporal route estimation consistent with human mobility using cellular network data. IEEE International Conference on Pervasive Computing and Communications Workshops (PERCOM Workshops), pages 267–272, 2013.
[9] Ficek M.; Kencl L. Inter-call mobility model: A spatio-temporal refinement of call data records using a Gaussian mixture model. Proceedings IEEE INFOCOM, pages 469–477, 2012.
[10] J.L. Morales and J. Nocedal. L-BFGS-B: Remark on algorithm 778: L-BFGS-B, Fortran routines for large scale bound constrained optimization. ACM Transactions on Mathematical Software, 38(1), 2011.
[11] Kevin P. Murphy. Switching Kalman filters. Technical report, 1998.
[12] Dongdong Su; Feng Qi. An approach for ensuring the reliability of call detail records collection in billing system. International Conference on Research Challenges in Computer Science (ICRCCS), pages 100–103, 2009.
[13] Yadav K.; Kumar A.; Bharati A.; Naik V. Characterizing mobility patterns of people in developing countries using their mobile phone data. Sixth International Conference on Communication Systems and Networks (COMSNETS), pages 1–8, 2014.
[14] S. Isaacman; R. A. Becker; R. Caceres; M. Martonosi; J. Rowland; A. Varshavsky; and W. Willinger. Human mobility modeling at metropolitan scales. MobiSys, pages 239–252, 2012.
[15] Song C.; Koren T.; Wang P.; Barabási A. L. Modeling the scaling properties of human mobility. Nature Physics, 6(10):818–823, 2010.
[16] Ying Zhang. User mobility from the view of cellular data networks. Proceedings IEEE INFOCOM, pages 1348–1356, 2014.
