Cell Oscillation Resolution DECRE Method
DECRE method
Problem definition
2 06/07/2020
Cell oscillation resolution methods
• Mean/median filter
• Exponential smoothing (1st order LPF)
• LPF (2nd and higher order filters) (recursive and non-recursive)
• Kalman filter
• Particle filter
• Heuristic-based methods
• Clustering
• Hidden Markov chain-based methods
• Semantic place-based methods
• Combined methods
DECRE (Detect, Expand, Check, REmove) method
Definition 1: Same-cell sequences
Same-cell sequence illustration
Definition 2: Stable period
• If the duration of a same-cell sequence is
long enough (e.g., longer than a threshold L
such as 10 minutes), the sequence is defined
as a stable period.
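A minimal sketch of how stable periods could be extracted. It assumes that a same-cell sequence is a run of consecutive logs reporting the same cell, and that each log is a hypothetical (timestamp_in_seconds, cell_id) tuple sorted by time; the function names are illustrative only.

```python
from itertools import groupby


def same_cell_sequences(logs):
    """Split time-ordered (timestamp_s, cell_id) logs into runs of one cell."""
    return [list(run) for _, run in groupby(logs, key=lambda log: log[1])]


def stable_periods(logs, min_duration_s=600):
    """Keep the same-cell runs that last longer than the threshold L.

    min_duration_s plays the role of L (600 s = the 10-minute example).
    """
    return [
        run
        for run in same_cell_sequences(logs)
        if run[-1][0] - run[0][0] > min_duration_s
    ]


logs = [(0, "A"), (300, "A"), (700, "A"), (720, "B"), (760, "A")]
stable = stable_periods(logs)
```

Here only the first run of cell A spans more than 10 minutes, so it is the only stable period returned.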
Definition 3: Moving at impossible speed
Moving at impossible speed illustration
Moving at impossible speed (cont.)
Definition 4: Suspicious sequences
The following criterion is used to identify such sequences:
within a short period of time (e.g., a <= 1 minute),
there are at least a few logs (e.g., b >= 3)
from at least a few cellular towers (e.g., c >= 2).
A continuous sequence of logs satisfying the above
condition is called a suspicious sequence with
parameters a, b, and c.
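The criterion can be sketched as a predicate plus a simple scan. The greedy, non-overlapping scan below is an assumption made for illustration, not necessarily how the method enumerates sequences; logs are hypothetical (timestamp_in_seconds, cell_id) tuples sorted by time.

```python
def is_suspicious(seq, a_min=1.0, b=3, c=2):
    """True if seq spans at most a minutes, has >= b logs and >= c towers."""
    duration_min = (seq[-1][0] - seq[0][0]) / 60.0
    towers = {cell for _, cell in seq}
    return duration_min <= a_min and len(seq) >= b and len(towers) >= c


def find_suspicious(logs, a_min=1.0, b=3, c=2):
    """Greedy scan: return (start, end) index pairs of suspicious windows."""
    found, i = [], 0
    while i < len(logs):
        j = i
        # Grow the window while the next log is still within a minutes of logs[i].
        while j + 1 < len(logs) and logs[j + 1][0] - logs[i][0] <= a_min * 60:
            j += 1
        if is_suspicious(logs[i:j + 1], a_min, b, c):
            found.append((i, j))
            i = j + 1
        else:
            i += 1
    return found


logs = [(0, "C1"), (20, "C2"), (40, "C1"), (600, "C3"), (1200, "C3")]
suspicious = find_suspicious(logs)
```

With the default parameters, the first three logs (40 seconds, two towers) form the only suspicious window.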
Heuristics
Based on the concepts of stable period, moving at
impossible speed, and suspicious sequences, four
heuristics are introduced to detect log sequences
containing oscillation logs.
Heuristic 1
If two consecutive stable periods have the same cell and the
time gap between them is short enough (e.g., shorter than a
threshold L1T = 2 minutes), the logs between the two stable
periods are very likely due to oscillation. Let SP_i and
SP_{i+1} be two consecutive stable periods. The condition in
this heuristic can be expressed as
(SP_i.cid == SP_{i+1}.cid)
AND
(TimeDiff(SP_i.last, SP_{i+1}.first) < L1T)
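A minimal sketch of this condition, assuming a stable period is summarized by its cell id and the timestamps (in seconds) of its first and last logs. The StablePeriod class and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StablePeriod:
    cid: str      # cell id shared by every log in the period
    first: float  # timestamp of the first log, in seconds
    last: float   # timestamp of the last log, in seconds


def heuristic1(sp_prev, sp_next, l1t_s=120.0):
    """Same cell on both sides and a gap shorter than L1T (2 minutes here)."""
    return sp_prev.cid == sp_next.cid and (sp_next.first - sp_prev.last) < l1t_s


home_morning = StablePeriod("C1", first=0, last=900)
home_later = StablePeriod("C1", first=960, last=2000)
flagged = heuristic1(home_morning, home_later)
```

When the function returns True, the logs that fall between the two stable periods are the oscillation candidates.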
Illustration of stable period based heuristic 1
Heuristic 2
If shortly after a stable period there is a log whose
cell is far away from the stable period’s cell, that log
is very likely due to oscillation. Let R_j be the log
immediately after stable period SP_i, and let L2T and
L2D be thresholds for time and distance. The
condition in this heuristic can be expressed as
(TimeDiff(SP_i.last, R_j) < L2T)
AND
(Distance(SP_i.last, R_j) > L2D)
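A minimal sketch of this condition, assuming each log carries a timestamp in seconds and the (latitude, longitude) of its cellular tower, and that Distance is the great-circle distance. The thresholds are passed in explicitly because no values are given for them here.

```python
import math


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def heuristic2(sp_last, r_j, l2t_s, l2d_km):
    """sp_last and r_j are (timestamp_s, (lat, lon)) logs.

    True if r_j arrives less than L2T after the stable period's last log
    and its tower is farther than L2D from that log's tower.
    """
    time_diff = r_j[0] - sp_last[0]
    return time_diff < l2t_s and haversine_km(sp_last[1], r_j[1]) > l2d_km


sp_last = (0, (0.0, 0.0))
far_log = (30, (0.0, 0.1))     # roughly 11 km away, 30 s later
near_log = (30, (0.0, 0.001))  # roughly 0.1 km away
flag_far = heuristic2(sp_last, far_log, l2t_s=60, l2d_km=5)
flag_near = heuristic2(sp_last, near_log, l2t_s=60, l2d_km=5)
```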
Illustration of stable period based heuristic 2
Heuristic 3
The heuristic for capturing oscillation logs that exhibit
“moving at impossible speed” can be expressed as follows,
where V is a threshold for speed and L3 is a threshold
for distance:
(Speed(R_i, R_{i+1}) * Speed(R_{i+1}, R_{i+2}) > V * V) AND
(Distance(R_i, R_{i+1}) > L3) AND
(Distance(R_{i+1}, R_{i+2}) > L3) AND
(Distance(R_i, R_{i+2}) < L3 / 2)
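A minimal sketch of this test on three consecutive logs. It assumes planar tower coordinates in kilometres, timestamps in seconds, and speed defined as distance divided by elapsed time; the threshold values in the usage lines are arbitrary illustrations.

```python
import math


def distance(r1, r2):
    """Euclidean distance in km between the towers of two logs."""
    return math.dist(r1[1], r2[1])


def speed(r1, r2):
    """Apparent speed in km/h between two (timestamp_s, (x_km, y_km)) logs."""
    hours = (r2[0] - r1[0]) / 3600.0
    d = distance(r1, r2)
    if hours <= 0:
        return math.inf if d > 0 else 0.0
    return d / hours


def heuristic3(r0, r1, r2, v_kmh, l3_km):
    """Far jump out and far jump back, ending close to the starting tower."""
    return (speed(r0, r1) * speed(r1, r2) > v_kmh * v_kmh
            and distance(r0, r1) > l3_km
            and distance(r1, r2) > l3_km
            and distance(r0, r2) < l3_km / 2)


r0 = (0, (0.0, 0.0))
r1 = (10, (5.0, 0.0))   # 5 km away after 10 s
r2 = (20, (0.5, 0.0))   # back near the start 10 s later
oscillation = heuristic3(r0, r1, r2, v_kmh=200, l3_km=2)
moving_on = heuristic3(r0, r1, (20, (10.0, 0.0)), v_kmh=200, l3_km=2)
```

The last condition is what separates a jump-and-return pattern from a device that keeps travelling in one direction.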
Heuristic 4
• Most oscillations happen within a short period of time and are not
adjacent to any stable period. They exhibit movement at impossible
speed but do not satisfy heuristic 3 because the distance between
the cellular towers in consecutive logs is not long enough. Such
oscillations match the definition of a suspicious sequence and typically
happen among cellular towers that are close to each other.
• Note that not all suspicious sequences identified by this heuristic
contain oscillation logs. Such sequences can occur legitimately
when the mobile device is moving fast (e.g., driving on a highway).
The “Expand” and “Check” steps are used to find the suspicious
sequences that do contain oscillation logs. Only logs that are
confidently identified as oscillations are removed.
Expand
Given a suspicious sequence detected using a time window of a
minute(s), the observed sequence is expanded by looking at most
a minute(s) before the suspicious sequence and at most a
minute(s) after it. The look-back (or look-after) process stops
when it encounters a log whose cellular tower does not appear
in the suspicious sequence.
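A minimal sketch of the Expand step, assuming logs are (timestamp_in_seconds, cell_id) tuples sorted by time and the suspicious sequence is given as an inclusive index range. The function name and the index-based interface are illustrative choices.

```python
def expand(logs, start, end, a_min=1.0):
    """Grow logs[start..end] by up to a minutes on each side.

    Expansion in a direction stops at the first log whose tower does not
    appear in the original suspicious sequence.
    """
    towers = {cell for _, cell in logs[start:end + 1]}
    window_s = a_min * 60
    t_first, t_last = logs[start][0], logs[end][0]

    lo = start
    while (lo > 0
           and t_first - logs[lo - 1][0] <= window_s
           and logs[lo - 1][1] in towers):
        lo -= 1

    hi = end
    while (hi + 1 < len(logs)
           and logs[hi + 1][0] - t_last <= window_s
           and logs[hi + 1][1] in towers):
        hi += 1

    return lo, hi


logs = [(0, "C3"), (50, "C1"), (100, "C1"), (120, "C2"),
        (140, "C1"), (170, "C2"), (190, "C4")]
expanded = expand(logs, start=2, end=4)
```

In this example the range grows by one log on each side; the log from C4 stops the look-after even though it is still inside the time window.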
Check
• An important characteristic (or evidence) of oscillation is that a cycle
of cellular towers is often observed in a short period of time.
• Here a cycle is defined as a continuous sequence of logs whose first
and last logs have the same cellular tower, with at least one log from
another cellular tower between them. For example, the sequence of
logs C1C2C1 exhibits a cycle from C1 to C2 and then back to C1.
In contrast, C1C1C2 does not contain a cycle.
• For each suspicious sequence identified by heuristic 4 and expanded
by the “Expand” process, the “Check” step tests whether the
sequence contains a cycle. If it does, the sequence is confirmed to
contain oscillation. Otherwise, the suspicious sequence is attributed
to fast movement and its logs are not removed.
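One way to sketch the cycle test: collapse runs of identical consecutive towers, then check whether any tower still appears more than once. A repeat after collapsing means another tower was visited in between. The function name is hypothetical.

```python
from itertools import groupby


def has_cycle(cells):
    """True if some tower reappears after a log from a different tower."""
    collapsed = [cell for cell, _ in groupby(cells)]
    return len(set(collapsed)) < len(collapsed)


cycle_found = has_cycle(["C1", "C2", "C1"])
no_cycle = has_cycle(["C1", "C1", "C2"])
```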
Remove
• For the oscillations detected by heuristics 1, 2, and 3, it is clear which
logs are oscillation logs, and they should be removed.
• However, for the suspicious sequences identified by heuristic 4 and
confirmed by the Expand and Check steps, we must still decide
which logs in the sequence are oscillation logs and which
cellular tower should represent the location of the mobile
device for this sequence.
• A score-based algorithm is introduced to select the cellular tower that
approximates the location of the mobile device. Each cellular tower
in the suspicious sequence gets a score based on its frequency in
the sequence and its average distance to the other cells that appear
in the sequence. A cellular tower that appears frequently in the
sequence and is close to the other cells is favored.
Remove oscillation logs algorithm
Heuristic 4 example
The figure shows an example
where a suspicious sequence
C9C10C9C11C9C11 is identified
and it contains cycles (i.e.,
C9C10C9, C9C11C9, and
C11C9C11).
Heuristic 4 example (cont.)
The algorithm counts the number of times each cell appears and
gets Fc9=3, Fc10=1, Fc11=2.
Then it calculates the average distance from each cell to the other
cells. Since distance(C9,C10)=0.9, distance(C9,C11)=0.9, and
distance(C10,C11)=0.1, we get Dc9=(0.9+0.9)/2=0.9,
Dc10=(0.9+0.1)/2=0.5, and Dc11=(0.9+0.1)/2=0.5.
Then the score for each cell is calculated as Scorec9=3/0.9≈3.33,
Scorec10=1/0.5=2, and Scorec11=2/0.5=4. As a result, cell C11 is
selected as the mobile device location for this oscillation sequence.
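The worked example can be reproduced with a short sketch that scores each cell as its frequency divided by its average distance to the other cells in the sequence, as in the numbers above. The function name and the dictionary of pairwise distances are hypothetical, and the sketch assumes at least two distinct cells with non-zero distances between them.

```python
from collections import Counter


def select_cell(sequence, dist):
    """Return (best_cell, scores) for a confirmed oscillation sequence.

    dist maps frozenset({cell_a, cell_b}) to the distance between the cells.
    """
    freq = Counter(sequence)
    cells = list(freq)
    scores = {}
    for cell in cells:
        others = [other for other in cells if other != cell]
        avg_dist = sum(dist[frozenset((cell, o))] for o in others) / len(others)
        scores[cell] = freq[cell] / avg_dist
    return max(scores, key=scores.get), scores


sequence = ["C9", "C10", "C9", "C11", "C9", "C11"]
dist = {
    frozenset(("C9", "C10")): 0.9,
    frozenset(("C9", "C11")): 0.9,
    frozenset(("C10", "C11")): 0.1,
}
best, scores = select_cell(sequence, dist)
```

Although C9 is the most frequent cell, its larger average distance to the other cells lowers its score, so C11 is selected.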
Modularity
• The DECRE algorithm is designed in a modular manner
so that each step of the algorithm can be replaced
with a new algorithm in the future.
• For example, the heuristics in the “Detect” step can be
replaced with more sophisticated detection criteria.
• As another example, if the radii of the cells are
known, a new Remove algorithm that takes them into
account can be implemented and plugged into the
DECRE algorithm.
Performance metrics
• Define the average distance between the locations in the CDRs and the
corresponding GPS locations as the measure of how well the records approximate
the mobile device’s real mobility trace. Formally, if N is the number of
records, the performance metric is (Σ_{i=1..N} Distance(cdr_i, gps_i)) / N.
• Let cdr_original, cdr_cleansed, and cdr_removed denote the records before
oscillation resolution, the records after it, and the removed records, respectively.
• Thus cdr_cleansed = cdr_original - cdr_removed.
• We compute the performance metric for cdr_original, cdr_cleansed, and cdr_removed.
• We can conclude that the method is effective if
– cdr_cleansed is closer to the GPS data than cdr_original is; and
– cdr_removed is much farther away from the GPS data than both cdr_original and
cdr_cleansed are.
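A minimal sketch of the metric and the effectiveness check, assuming each record has already been paired with its GPS fix and that locations are planar (x, y) coordinates in kilometres. The function names are hypothetical, and the check uses plain comparisons, so it does not capture how much farther "much farther" should be.

```python
import math


def mean_error(cdr_points, gps_points):
    """Average distance between paired CDR and GPS locations."""
    if not cdr_points or len(cdr_points) != len(gps_points):
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(math.dist(c, g) for c, g in zip(cdr_points, gps_points))
    return total / len(cdr_points)


def is_effective(err_original, err_cleansed, err_removed):
    """Cleansed records are closer to GPS; removed records are farther."""
    return (err_cleansed < err_original
            and err_removed > max(err_original, err_cleansed))


error = mean_error([(0.0, 0.0), (3.0, 4.0)], [(0.0, 0.0), (0.0, 0.0)])
effective = is_effective(err_original=1.2, err_cleansed=0.8, err_removed=3.5)
```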