Cell Oscillation Resolution DECRE Method

The DECRE method detects and removes oscillation logs from cellular location data using the following steps: 1. Four heuristics are used to detect oscillation sequences, including sequences between stable periods, logs far from stable periods, impossible travel speeds, and suspicious short sequences switching towers. 2. Some detected sequences are expanded to include nearby logs to provide more context. 3. Expanded sequences are checked for cycles of switching between towers, indicating oscillation. 4. For confirmed oscillation sequences, a scoring algorithm selects a single tower that best approximates the device's location, and the other logs are removed.

Cell Oscillation Resolution.

DECRE method
Problem definition

2 06/07/2020
Cell oscillation resolution methods
• Mean/median filter
• Exponential smoothing (1st order LPF)
• LPF (2nd and higher order filters) (recursive and non-recursive)
• Kalman filter
• Particle filter
• Heuristic-based methods
• Clustering
• Hidden Markov chains based methods
• Semantic places based methods
• Combined methods

based on

DECRE (Detect, Expand, Check, REmove) method

• Sequences of logs that contain oscillations are found in the Detect
step. Four heuristics are introduced to detect such sequences.
• Some of the detected sequences are quite short and do not contain
enough information for making an informed decision. Therefore an
Expand step is introduced, in which the logs before and after the
suspicious sequence are examined until certain conditions are satisfied.
• The Check step then tests whether the expanded sequence contains logs
that switch quickly between cellular towers, which is a strong
indication of oscillation.
• Finally, the Remove step selects a cellular tower to represent the
mobile device’s location for the detected sequence and removes the
oscillation logs.

Definition 1: Same-cell sequences

• Given a sequence of logs R1 to Rn of a mobile device
ordered by datetime, the same-cell sequences are
the continuous sequences of logs where the cellular
tower is the same, i.e., R1.cid = R2.cid = … = Rn.cid.
• The duration of a same-cell sequence is the time
from the first log to the last log of the same-cell
sequence, i.e., TimeDiff(R1, Rn).
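Definition 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the (timestamp, cid) record layout and the function names are assumptions made for the example.

```python
from datetime import datetime
from itertools import groupby

# Assumed record layout: each log is a (timestamp, cid) pair.
def same_cell_sequences(logs):
    """Split time-ordered logs into maximal runs that share one cell id."""
    ordered = sorted(logs, key=lambda log: log[0])
    return [list(run) for _, run in groupby(ordered, key=lambda log: log[1])]

def duration(sequence):
    """TimeDiff(R1, Rn): time from the first log to the last log of a run."""
    return sequence[-1][0] - sequence[0][0]

logs = [
    (datetime(2020, 7, 6, 9, 0), "C1"),
    (datetime(2020, 7, 6, 9, 5), "C1"),
    (datetime(2020, 7, 6, 9, 6), "C2"),
    (datetime(2020, 7, 6, 9, 7), "C1"),
]
runs = same_cell_sequences(logs)  # three runs: C1 (two logs), C2, C1
```

Note that the two C1 runs stay separate because a log from C2 lies between them.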

Same-cell sequence illustration

Definition 2: Stable period
• If the duration of a same-cell sequence is
long enough (e.g., longer than a threshold L
such as 10 minutes), the same-cell
sequence is defined as a stable period.
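A possible way to extract stable periods is sketched below. It assumes logs are (minute, cid) pairs already ordered by time and returns each stable period as a (cid, first, last) tuple; these names and layouts are illustrative only.

```python
from itertools import groupby

# Assumed layout: logs are time-ordered (minute, cid) pairs.
def stable_periods(logs, min_duration=10):
    """Return same-cell runs whose duration exceeds the threshold L (minutes)."""
    periods = []
    for cid, run in groupby(logs, key=lambda log: log[1]):
        run = list(run)
        first, last = run[0][0], run[-1][0]
        if last - first > min_duration:
            periods.append((cid, first, last))
    return periods

logs = [(0, "C1"), (6, "C1"), (12, "C1"), (13, "C2"), (14, "C1"), (30, "C1")]
periods = stable_periods(logs)
```

In this sample the single C2 log has zero duration, so only the two C1 runs qualify as stable periods.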

Definition 3: Moving at impossible speed

• Impossible movements (or “long jumps”) are
observed when the spatial distance between
consecutive logs is too large for the mobile
device to cover in the time between
the logs.
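The test behind Definition 3 could look like the sketch below. It assumes each log carries the tower position as (latitude, longitude) in degrees and uses the haversine formula for distance; the 200 km/h limit is an illustrative value, not one given in the slides.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def is_impossible_move(log_a, log_b, max_speed_kmh=200.0):
    """True when the speed implied by two logs exceeds max_speed_kmh.

    Assumed layout: each log is (time_in_seconds, (lat, lon)).
    """
    hours = abs(log_b[0] - log_a[0]) / 3600.0
    dist = haversine_km(log_a[1], log_b[1])
    if hours == 0:
        return dist > 0  # any displacement in zero time is impossible
    return dist / hours > max_speed_kmh
```

For example, two logs one minute apart at towers roughly 11 km apart imply a speed above 600 km/h and would be flagged.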

Moving at impossible speed illustration

Moving at impossible speed (cont.)

Such oscillation typically happens in a
sequence of three logs.
The second log (477) suddenly jumps
to a cellular tower that is far away
from the cellular tower in the first log
(254), and the third log (1609) jumps
back to a cellular tower that is the same as (or
close to) the one in the first log.

Definition 4: Suspicious sequences
The following criterion is used to identify such sequences:
within a short period of time (e.g., a <= 1 minute),
there are at least a few logs (e.g., b >= 3)
from at least a few cellular towers (e.g., c >= 2).
A continuous sequence of logs satisfying the above
condition is called a suspicious sequence identified
using parameters a, b, and c.
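One way to find suspicious sequences is a sliding window over the time-ordered logs, sketched below. The (minute, cid) layout, the function name, and the merging of overlapping windows are assumptions of this example rather than details from the slides.

```python
def suspicious_windows(logs, a=1.0, b=3, c=2):
    """Return merged (start, end) index ranges of suspicious sequences.

    logs: time-ordered (minute, cid) pairs. A window qualifies when it spans
    at most `a` minutes, holds at least `b` logs and at least `c` towers.
    """
    ranges = []
    for i in range(len(logs)):
        j = i
        # Extend the window while the next log is within `a` minutes of log i.
        while j + 1 < len(logs) and logs[j + 1][0] - logs[i][0] <= a:
            j += 1
        window = logs[i:j + 1]
        if len(window) >= b and len({cid for _, cid in window}) >= c:
            if ranges and i <= ranges[-1][1]:
                # Overlaps the previous range: merge the two.
                ranges[-1] = (ranges[-1][0], max(ranges[-1][1], j))
            else:
                ranges.append((i, j))
    return ranges

logs = [(0.0, "C1"), (5.0, "C9"), (5.3, "C10"), (5.6, "C9"),
        (5.9, "C11"), (20.0, "C2")]
found = suspicious_windows(logs)
```

Here the four logs between minute 5.0 and 5.9 form a single suspicious sequence, while the isolated logs at minutes 0 and 20 are left alone.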
Heuristics
Based on the concepts of stable period, moving at
impossible speed, and suspicious sequences, four
heuristics are introduced to detect log sequences
containing oscillation logs.

Heuristic 1
If two consecutive stable periods have the same cell and the
time difference between them is short enough (e.g.,
shorter than a threshold L1T = 2 minutes), the logs
between the two stable periods are very likely due to
oscillation. Let SPi and SPi+1 be two consecutive stable
periods; the condition in this heuristic can be expressed
as
(SPi.cid == SPi+1.cid)
AND
(TimeDiff(SPi.last, SPi+1.first) < L1T)
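The condition of heuristic 1 translates almost directly into code. The sketch below assumes stable periods are given as time-ordered (cid, first_minute, last_minute) tuples, a layout chosen only for illustration; it returns the time gaps whose logs would be treated as oscillation.

```python
def heuristic1(stable_periods, l1t=2.0):
    """Return (gap_start, gap_end) times flagged by heuristic 1.

    stable_periods: time-ordered (cid, first_minute, last_minute) tuples.
    l1t: the threshold L1T in minutes.
    """
    flagged = []
    for sp, nxt in zip(stable_periods, stable_periods[1:]):
        same_cell = sp[0] == nxt[0]        # SPi.cid == SPi+1.cid
        short_gap = nxt[1] - sp[2] < l1t   # TimeDiff(SPi.last, SPi+1.first) < L1T
        if same_cell and short_gap:
            flagged.append((sp[2], nxt[1]))
    return flagged

gaps = heuristic1([("C1", 0, 12), ("C1", 13, 30), ("C2", 31, 50)])
```

Only the gap between the two C1 periods is flagged; the gap before the C2 period is short but the cells differ.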
Illustration of stable period based heuristic 1

Heuristic 2
If shortly after a stable period there is a log whose
cell is far away from the stable period’s cell, that log
is very likely due to oscillation. Let Rj be the log
immediately after stable period SPi, and let L2T and L2D
be two thresholds for time and distance; the
condition in this heuristic can be expressed as
(TimeDiff(SPi.last, Rj) < L2T)
AND
(Distance(SPi.last, Rj) > L2D)
Illustration of stable period based heuristic 2

The figure shows an example of applying this
heuristic. The time difference between stable
period SP3 and the log of C8 is very short (e.g.,
1 minute), but the distance between C7 and
C8 is very large (e.g., 5 km). According to the
heuristic, we are confident that the log of C8 is
due to oscillation.
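Heuristic 2 can be sketched as a single predicate. The slides do not give values for L2T and L2D, so the defaults below (2 minutes, 3 km) are assumptions, as are the planar kilometre coordinates used in place of real tower positions.

```python
import math

def heuristic2(sp_last, next_log, l2t=2.0, l2d=3.0):
    """True when next_log follows the stable period quickly yet lies far away.

    Assumed layout: both arguments are (minute, (x_km, y_km)) pairs.
    l2t and l2d stand for the thresholds L2T and L2D.
    """
    time_diff = next_log[0] - sp_last[0]
    distance = math.dist(sp_last[1], next_log[1])
    return time_diff < l2t and distance > l2d

# Mirrors the illustrated case: 1 minute later, 5 km away.
flagged = heuristic2((100, (0.0, 0.0)), (101, (5.0, 0.0)))
```

With these assumed thresholds the illustrated case is flagged, while a log 1 km away or 10 minutes later is not.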

Heuristic 3
The heuristic for capturing oscillation logs that exhibit “moving
at impossible speed” can be expressed as follows,
where V is a threshold for speed and L3 is a threshold
for distance:
(Speed(Ri, Ri+1) * Speed(Ri+1, Ri+2) > V * V) AND
(Distance(Ri, Ri+1) > L3) AND
(Distance(Ri+1, Ri+2) > L3) AND
(Distance(Ri, Ri+2) < L3 / 2)
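The four conditions of heuristic 3 can be checked on any three consecutive logs, as in the sketch below. The values V = 200 km/h and L3 = 5 km, the planar coordinates, and the treatment of zero time difference as infinite speed are assumptions made for this example.

```python
import math

def heuristic3(r0, r1, r2, v=200.0, l3=5.0):
    """Check the jump-away-and-back pattern on three consecutive logs.

    Assumed layout: each log is (hour, (x_km, y_km)); v is in km/h, l3 in km.
    """
    def dist(p, q):
        return math.dist(p[1], q[1])

    def speed(p, q):
        dt = q[0] - p[0]
        return math.inf if dt <= 0 else dist(p, q) / dt

    return (speed(r0, r1) * speed(r1, r2) > v * v
            and dist(r0, r1) > l3
            and dist(r1, r2) > l3
            and dist(r0, r2) < l3 / 2)

# Jump 10 km away and back within two minutes.
jump = heuristic3((0.0, (0.0, 0.0)), (1 / 60, (10.0, 0.0)), (2 / 60, (0.5, 0.0)))
```

The last condition is what separates an oscillation from genuine fast travel: a device that keeps moving away ends far from its starting tower and is not flagged.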

Heuristic 4
• Most oscillations happen in a short period of time and are
not adjacent to any stable period. They exhibit movement at impossible
speed but do not satisfy heuristic 3 because the distance between
cellular towers in consecutive logs is not long enough. Such
oscillations match the definition of a suspicious sequence and typically
happen among cellular towers that are close to each other.
• Note that not all suspicious sequences identified by this heuristic
contain oscillation logs. Such sequences are possible
and can even be reasonable when the mobile device is moving fast
(e.g., driving on a highway). The “Expand” and “Check” steps are used to
find suspicious sequences that do contain oscillation logs. Only logs
that are confidently identified as oscillations are removed.

Expand
Given a suspicious sequence detected using a time window of a
minute(s), the observed sequence is expanded by looking at most
a minute(s) before the suspicious sequence and at most a
minute(s) after it. The look-back (or look-ahead)
process stops when it encounters a log whose cellular
tower did not appear in the suspicious sequence.
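A minimal sketch of the Expand step follows. It assumes time-ordered (minute, cid) logs and a suspicious sequence given as an inclusive index range; both the layout and the function name are illustrative.

```python
def expand(logs, start, end, a=1.0):
    """Grow the index range [start, end] by up to `a` minutes on each side.

    logs: time-ordered (minute, cid) pairs. Expansion in a direction stops at
    the first log whose tower does not occur in the suspicious sequence.
    """
    towers = {cid for _, cid in logs[start:end + 1]}
    t_first, t_last = logs[start][0], logs[end][0]
    lo, hi = start, end
    # Look back.
    while (lo > 0 and t_first - logs[lo - 1][0] <= a
           and logs[lo - 1][1] in towers):
        lo -= 1
    # Look ahead.
    while (hi + 1 < len(logs) and logs[hi + 1][0] - t_last <= a
           and logs[hi + 1][1] in towers):
        hi += 1
    return lo, hi

logs = [(4.0, "C1"), (4.5, "C9"), (5.0, "C9"), (5.3, "C10"),
        (5.6, "C9"), (6.0, "C11"), (6.2, "C9")]
expanded = expand(logs, 2, 4)
```

In this sample the range grows backwards by one C9 log, stops at C1, and does not grow forwards because C11 was not part of the suspicious sequence.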

Check
• An important characteristic (or evidence) of oscillation is that a cycle of
cellular towers is often observed in a short period of time.
• Here a cycle is defined as a continuous sequence of logs whose first
log and last log have the same cellular tower, with at least one
log from another cellular tower between them. For example, the
log sequence C1C2C1 exhibits a cycle from C1 to C2 and then
back to C1. In contrast, C1C1C2 does not contain a cycle.
• For each suspicious sequence identified by heuristic 4 and expanded
by the “Expand” process, the “Check” step tests whether the
sequence contains a cycle. If it does, oscillation in the sequence is
confirmed. Otherwise, the suspicious sequence is attributed to
fast movement and is not removed.
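The cycle test can be sketched compactly: after collapsing consecutive repeats of the same tower, a cycle exists exactly when some tower occurs more than once. The function name and the list-of-cell-ids input are assumptions for illustration.

```python
from itertools import groupby

def has_cycle(cids):
    """True when a tower reappears after at least one log from another tower."""
    collapsed = [cid for cid, _ in groupby(cids)]  # C1C1C2 -> C1C2
    return len(collapsed) != len(set(collapsed))

print(has_cycle(["C1", "C2", "C1"]))  # True
print(has_cycle(["C1", "C1", "C2"]))  # False
```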

Remove
• For the oscillations detected by heuristics 1, 2, and 3, it is clear which
logs are oscillation logs, and they should be removed.
• However, for the suspicious sequences identified by heuristic 4 and
further confirmed with the Expand and Check steps, it must be
decided which logs in the sequence are oscillation logs and which
cellular tower should be used to represent the location of the mobile
device for this sequence.
• A score-based algorithm is introduced to select the cellular tower that
approximates the location of the mobile device. Each cellular tower
contained in the suspicious sequence gets a score based on its
frequency in the sequence and its average distance to the other cells
appearing in the sequence. A cellular tower that appears frequently
in the sequence and is close to the other cells is favored.

Remove oscillation logs algorithm

Heuristic 4 example
The figure shows an example
where a suspicious sequence
C9C10C9C11C9C11 is identified
and it contains cycles (i.e.,
C9C10C9, C9C11C9, and
C11C9C11).

The figure shows the relative locations
of the cells involved in this sequence.
The distance between C9 and C10 is 0.9
km. The distance between C9 and C11 is
also 0.9 km. The distance between C10
and C11 is 0.1 km. They are in a busy
commercial area where the density of
cellular towers is high.

Heuristic 4 example (cont.)
The algorithm counts the number of times each cell appears and
gets Fc9=3, Fc10=1, Fc11=2.
Then it calculates the average distance from each cell to the other
cells. Since distance(C9,C10)=0.9, distance(C9,C11)=0.9, and
distance(C10,C11)=0.1, we get Dc9=(0.9+0.9)/2=0.9, Dc10=(0.9+0.1)/2=0.5,
Dc11=(0.9+0.1)/2=0.5.
Then the score for each cell is calculated as Scorec9=3/0.9≈3.33,
Scorec10=1/0.5=2, and Scorec11=2/0.5=4. As a result, cell C11, which has
the highest score, is selected
as the mobile device location for this oscillation sequence.
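The worked example can be reproduced with a short script. The score used here, frequency divided by average distance to the other cells, is read off the example's arithmetic; the function name and the frozenset-keyed distance table are assumptions of this sketch.

```python
from collections import Counter

def select_tower(sequence, distance_km):
    """Score each cell as frequency / average distance to the other cells.

    sequence: cell ids of a confirmed oscillation sequence (two or more cells).
    distance_km: dict mapping frozenset({cell_a, cell_b}) to a distance in km.
    Returns the best-scoring cell and the full score table.
    """
    freq = Counter(sequence)
    scores = {}
    for cell in freq:
        others = [c for c in freq if c != cell]
        avg = sum(distance_km[frozenset((cell, o))] for o in others) / len(others)
        scores[cell] = freq[cell] / avg
    return max(scores, key=scores.get), scores

distances = {
    frozenset(("C9", "C10")): 0.9,
    frozenset(("C9", "C11")): 0.9,
    frozenset(("C10", "C11")): 0.1,
}
best, scores = select_tower(["C9", "C10", "C9", "C11", "C9", "C11"], distances)
print(best)  # C11
```

C9 appears most often, but its larger average distance to the other cells lowers its score below that of C11.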

Modularity
• The DECRE algorithm is designed in a modular manner
so that each step of the algorithm can be replaced
with a new algorithm in the future.
• For example, the heuristics in the “Detect” step can be
replaced with more sophisticated detection criteria.
• As another example, if the radii of the cells are
known, a new Remove algorithm can be implemented
that takes them into account and then plugged into
the DECRE algorithm.

Performance metrics
• Let’s define the average distance between locations in CDRs and the
corresponding GPS locations as the measure of how well the records approximate
the mobile device’s real mobility trace. Formally, if N is the number of
records and CDRi and GPSi are the CDR location and the GPS location of
record i, the performance metric is (Distance(CDR1, GPS1) + … + Distance(CDRN, GPSN)) / N.
• Let’s use cdr_original, cdr_cleansed and cdr_removed to denote the records before
oscillation resolution, the records after it, and the removed records respectively.
• Thus we have cdr_cleansed = cdr_original - cdr_removed.
• We compute the performance metric for cdr_original, cdr_cleansed, and cdr_removed.
• We can conclude that our methods are effective if
– cdr_cleansed is closer to the GPS data than cdr_original is; and
– cdr_removed is much farther away from the GPS data than both cdr_original and
cdr_cleansed are.
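The metric and the effectiveness check can be sketched as below. Positions are assumed to be planar (x_km, y_km) pairs already matched one-to-one between CDR and GPS data; the function names are illustrative, and "much farther" is reduced to a plain comparison because the slides give no margin.

```python
import math

def mean_error_km(cdr_points, gps_points):
    """Average distance between paired CDR and GPS positions (planar km)."""
    if not cdr_points or len(cdr_points) != len(gps_points):
        raise ValueError("need two non-empty, equally long lists")
    total = sum(math.dist(c, g) for c, g in zip(cdr_points, gps_points))
    return total / len(cdr_points)

def is_effective(err_original, err_cleansed, err_removed):
    """Cleansed records beat the originals; removed records are worse than both."""
    return (err_cleansed < err_original
            and err_removed > max(err_original, err_cleansed))

err = mean_error_km([(0.0, 0.0), (3.0, 4.0)], [(0.0, 0.0), (0.0, 0.0)])
print(err)  # 2.5
```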
