
Data Analytics For Forecasting Cell Congestion On LTE Networks

This document proposes using data analytics techniques like forecasting algorithms on measurements from LTE network probes to predict cell congestion. The probes collect data daily at low cost compared to traditional drive tests. The predictions can be used by Self-Organizing Network strategies to preemptively shift coverage and capacity to congested areas before users experience dropped calls or slow speeds. Experimental results show the approach can accurately forecast network behavior, allowing timely optimization by SON strategies.

Data Analytics for Forecasting Cell Congestion on LTE Networks

Pedro Torres, Paulo Marques, Hugo Marques, Rogério Dionísio
Instituto Politécnico de Castelo Branco
Castelo Branco, Portugal

Tiago Alves, Luis Pereira, Jorge Ribeiro
ALLBESMART LDA
Castelo Branco, Portugal

Abstract—This paper presents a methodology for forecasting the average downlink throughput for an LTE cell by using real measurement data collected by multiple LTE probes. The approach uses data analytics techniques, namely forecasting algorithms, to anticipate cell congestion events which can then be used by Self-Organizing Network (SON) strategies for triggering network re-configurations, such as shifting coverage and capacity to areas where they are most needed, before subscribers have been impacted by dropped calls or reduced data speeds. The presented implementation results show the prediction of network behaviour is possible with a high level of accuracy, effectively allowing SON strategies to be enforced in time.

Keywords—LTE; SON; Machine Learning; Forecasting

I. INTRODUCTION
The overall traffic generated by mobile networks increased by 75% in 2015 and the pace of growth continues to accelerate. To address the capacity needed to transport the increasing mobile traffic, telecom operators have been heavily investing in 4G networks, based on the Long Term Evolution (LTE) standard and its evolution, LTE-A. This increase in capacity, taking place especially in urban centres, has been achieved thanks to mobile network densification, which implies large investments from the mobile operators to acquire new base stations and other network equipment. Furthermore, as the 4G system will need to coexist with existing networks (2G/3G/Wi-Fi), the result will be a dense and complex heterogeneous network topology (HetNet). This situation poses new challenges in the management of the radio access network, which has operating and maintenance cost implications.

To address the requirements of today's connectivity demands, LTE radio access networks have many features. These, however, increase the complexity of network planning and maintenance, cell optimisation operations and network troubleshooting. Mobile Network Operators (MNOs) typically rely on independent benchmarking companies to perform drive tests, to check coverage and capacity of their networks in order to identify problems and improve network performance in specific geographical areas. These drive tests are typically conducted using a vehicle, on a predetermined test route, with a test engineer operating advanced on-board radio equipment to collect network key performance indicators (KPIs).

The high effort of executing these drive tests results in high costs and therefore in a low frequency of execution. Typically, this kind of measurement is executed no more than 2 or 3 times per year, while changes in the network and the radio environment occur on a much more frequent basis. In comparison to the existing market solutions, the described approach has some clear advantages, as depicted in Table 1.

The methodology proposed in this paper uses historic measurements that have been collected by a set of LTE probes as the input for forecasting future network behaviour. The measurements were performed daily, automatically and at an affordable cost. The LTE probes have been developed in the scope of the European H2020 Research Project MONROE - Measuring Mobile Broadband Networks in Europe [1][2][3]. These LTE probes have been deployed in existing transportation fleets; however, they can also be used in taxis, buses, private cars and trains, without need for dedicated field personnel, effectively reducing the cost of operation by up to 70%.

By using forecasting, this paper also exploits the concept of Self-Organizing Network (SON) strategies. SON has the potential to minimize the lifecycle cost of running a mobile network by eliminating manual configuration of equipment and troubleshooting during operation, which can significantly reduce service cost. MNOs are keen to capitalize on SON to minimize rollout delays and operational expenditures associated with their ongoing LTE deployments [4]. The SON ecosystem is increasingly witnessing convergence with other technological trends such as Machine Learning and Big Data Analytics. Learning and prediction of network behaviour are key enablers towards the implementation of the SON paradigm. SON can use machine learning predictive functionality to adapt mobile networks to demand in a controlled manner (e.g., shifting coverage and capacity to areas with the most need before subscribers have been impacted by dropped calls or reduced data speeds). Therefore, this paper
Table 1 – Proposed solution vs existing solutions for MNO benchmarking

Key market solution: InfoVista/TEMS, previously known as ASCOM (www.tems.com)
Summary of the main characteristics: Reference solution in mobile network benchmarking (voice+data). It is based on a mobile application that runs on smartphones (depends on user permissions to use GPS). It provides the main benchmarking KPIs.
Our proposed solution added value:
- Improved location – due to a dedicated GPS, our solution can improve the location precision by up to 50% when compared to triangulation performed by MNOs.
- Personalized KPIs – capability of measuring a wider diversity of network parameters, which could also be of interest to MNOs.

Key market solution: Keysight Technologies, previously known as Anite/NEMO (www.keysight.com)
Summary of the main characteristics: Reference solution in mobile network benchmarking (voice+data). It uses advanced specialized measurement equipment that needs a professional technician to operate it. It also enables importing measurements performed by smartphones through a dedicated application.
Our proposed solution added value:
- Lower cost of implementation – the proposed network benchmark probes are autonomous (also remotely accessible) and do not need dedicated/specialized field personnel or a driver (in case of a mobile probe).

Key market solution: CellMining (cellmining.com)
Summary of the main characteristics: Solution that does not use real measurement equipment. The approach is based on the concept of 'virtual probes', which in fact are algorithms that analyze and correlate information based on call-detail records (CDRs).
Our proposed solution added value:
- Improved location – {same as above}.
- Wider and deeper scope – wider diversity of network parameters to be monitored and deeper insight on the information obtained, when compared to what is included in CDRs.

Key market solution: MedUX (www.caseonit.com)
Summary of the main characteristics: Recent player, still improving their solution (fixed and mobile probes) for mobile network benchmarking (data only). It also uses network information obtained from crowdsourcing sites.
Our proposed solution added value:
- Voice benchmark – capability to measure voice KPIs (narrowband, wideband/VoLTE and super-wideband).
- Personalized KPIs – {same as above}.

also addresses machine learning algorithms to predict the average throughput per user for a specific cell and time of day, leveraging prior measurements from the LTE probes.

In the literature, there are some research studies that address a similar topic. In [5], load balancing algorithms are studied and compared to solve localized congestion problems. Methods based on the reinforcement Q-learning algorithm are used to forecast the load status of every node and are combined with related self-organizing network concepts. In [6], the authors propose a reactive load balancing algorithm that is also based on the Q-learning algorithm. In [7], the autoregressive integrated moving average (ARIMA) model and the exponential smoothing model are used to predict the throughput in a single cell and in a whole region of an LTE network.

Figure 1: MONROE LTE probe software technology

The remainder of the paper is organized as follows: Section II introduces the MONROE setup; Section III presents the forecast models, whereas the case study and experimental results are presented in Section IV. Finally, the conclusions are given in Section V.
II. MONROE SETUP
The MONROE LTE probe software technology is based on Docker containers [8]. Docker containers provide an isolated environment to run applications, by wrapping software code in a complete filesystem that contains everything needed to run. This guarantees the software will run the same, regardless of its environment. Docker containers share the underlying resources of the Docker host but they only include what they need to run their applications. By using Docker containers, multiple experiments can be scheduled to run at the LTE probe. This concept is illustrated in Figure 1. To reflect the specific needs of the required measurements, the original MONROE LTE probe hardware was updated in order to also support the benchmarking of voice calls (see Figure 2 and Figure 3). This upgraded probe is described simply as an LTE probe henceforward. The hardware setup for this probe is depicted in Figure 2.

Figure 2: The LTE probe

The APU1d4 is the probe's system board. It interfaces with 4 LTE wireless network cards (1 Sierra Wireless MC7304 LTE mPCIe and 3 USB SIMCom SIM7100E-EVM), enabling the simultaneous benchmarking of 4 different LTE MNOs. In fact, for the time being, the Sierra Wireless interface is restricted to probe management operations. The Raspberry Pi is used to send AT control commands to the SIMCom cards for the establishment and termination of voice calls, the playout of audio files and the collection of some of the network KPIs. These functionalities can be accessed by the Experiment Container through a web service running on the Raspberry Pi. This concept is depicted in Figure 3.
Figure 3: The LTE probe high-level architecture

Table 2: Key information retrieved from the LTE probe

General Information
- Unique experiment identifier (Guid);
- Experiment status (Defined, Aborted, Stopped, Finished, …);
- Experiment start and stop time;
- Node identification (NodeId).

Data connection properties
- Integrated Circuit Card Identifier (ICCID);
- International Mobile Station Equipment Identity (IMEI);
- International Mobile Subscriber Identity (IMSI);
- Mobile Country Code (MCC);
- Mobile Network Code (MNC);
- Radio access technology used (mode) by a specific modem;
- Received Signal Strength Indicator (RSSI) for a specific modem;
- Reference Signal Received Power (RSRP) for a specific modem;
- Reference Signal Received Quality (RSRQ) for a specific modem;
- Frequency Band (band) used by a particular modem;
- Local Area Code for the connected cell (LAC) for a specific modem;
- Operator name for a specific modem;
- Node interface(s) name(s), used by the experiment;
- Cell Identification (CID) for a specific modem;
- State (DeviceState) reported to the network by a particular modem;
- Connection submode (DeviceSubmode) for a specific modem;
- IP address used by a particular radio interface;
- GPS location.

Data specific measurements
- Packet loss (UDP only);
- Interarrival jitter (UDP only);
- Round-trip-time (RTT) to a particular destination;
- Download and upload throughput.

Voice KPIs
- Call connection establishment delay;
- Call connection establishment error ratio;
- Call connection loss rate (related to call retainability);
- Call transfer delay of user data frame;
- GSM network registration delay.

With this setup, the LTE probe provides the information depicted in Table 2. The data connection properties are obtained by subscribing to metadata feeds provided by the MONROE base container, whilst the data specific measurements are obtained through dedicated Experiment Containers. Experiments are submitted through a scheduler that allows the user to specify where the Experiment Container is located, on which probe the experiment will be executed and which interfaces (each corresponding to a different MNO) are to be used. Experiments can also be scheduled for the first available slot or for a specific date. In any case, recurring options are available. Once the experiment has started, the data are collected by the LTE probe and stored on the probe's local storage until the experiment is concluded; only then are the data uploaded to the MONROE repository (a Cassandra database). Alternatively, during the experiment, data can also be sent in real time to external servers for immediate processing; however, this approach consumes more traffic quota. Considering Cassandra is not a relational database, our approach performs daily synchronizations between the MONROE repository and its own MySQL database. This is where data is fetched for various purposes, including learning the network behaviour for specific areas and predicting eventual capacity issues in an automated fashion, as explained in the next section. This lifecycle of an experiment is depicted in Figure 4.

Figure 4: The lifecycle of an experiment

Figure 5: allbesmart LTE Dashboard showing the RSSI measured by a MONROE LTE probe deployed in a bus in Madrid
The conversion of the raw data into a meaningful and easy to understand form is performed by the allbesmart LTE Benchmark Tool, which also acts as a dashboard. It is capable of overlaying data related to RSSI, CID, upload and download throughput, amongst others, with the LTE probe's geolocated position. It also provides animations that show how the different network KPIs behave throughout a day, or during specific time periods. Figure 5 depicts the dashboard layout reflecting the RSSI measurements taken by a mobile LTE probe in Madrid, Spain, in a specific time period, whilst Figure 6 provides the details for the download throughput variation for a particular fixed LTE probe.

Figure 6: Sample of the average download speed variation as measured by a fixed MONROE LTE probe in a time period of 3 days.

III. FORECAST MODELS
Machine learning is a kind of artificial intelligence that provides systems with the ability to learn without being explicitly programmed. Machine learning focuses on developing computer programs that can change when exposed to new data. The algorithms use data to detect patterns and adjust program actions accordingly. Typically, these algorithms can be categorized as being supervised, unsupervised or reinforcement learning. Supervised algorithms can apply what has been learned to new data. Unsupervised algorithms can draw inferences from datasets. In reinforcement learning the algorithm learns a policy of how to act given an observation of the world. Every action has some impact on the environment, and the environment provides feedback that guides the learning algorithm.

In our proposed approach, the Machine Learning Engine uses the measurements collected by the LTE probes and stored in a local relational database to learn the network event patterns and forecast its future behaviour. The predicted KPI values are then forwarded to a Self-Optimization Process that uses these values to take timely preventive actions. In this analysis, the network is seen as a dynamic system, and to forecast its behaviour we need to use past output measurements. In other words, given observations $y(t) = \{y(1), \ldots, y(n)\}$ of the output of the network, forecasting is the prediction of the outputs $y(n+1), \ldots, y(n+h)$ until a future time horizon $h$.

In order to forecast, a model that fits past measured data from the network first needs to be identified. This can be a linear time series model, a state-space model, or a nonlinear model. In the literature, there are numerous time-series and regression forecasting methods [5][6] that can be used. This paper focuses on two well-known algorithms, the naïve persistence model and a derivation of the Autoregressive Integrated Moving Average model (ARIMA) [13].

A. Method 1
The first method is based on a linear time series model that includes a regression component, a variant of the ARIMA model named the ARIMAX model (Autoregressive Integrated Moving Average with Explanatory Variable [9]). This model also considers a seasonality effect to estimate the new model coefficients for forecasting the future outputs.

A linear time series model for response process $y_t$ and innovations $\varepsilon_t$ is a stochastic process that has the form of equation 1.

$$y_t = c + \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q} \tag{1}$$

The model expresses the conditional mean of $y_t$ as a function of both past observations, $y_{t-1}, \ldots, y_{t-p}$, and past innovations, $\varepsilon_{t-1}, \ldots, \varepsilon_{t-q}$. Here $p$ is a positive integer that indicates the degree of the nonseasonal autoregressive polynomial and $q$ is a positive integer that indicates the degree of the nonseasonal moving average polynomial.

A maximum likelihood function is used to estimate the parameters of the ARIMAX model given the observed univariate time series $y_t$. Given its history, the innovations are conditionally independent. Let $H_t$ denote the history of the process available at time $t$, where $t = 1, \ldots, T$. The likelihood function of the innovations is obtained from equation 2,

$$f(\varepsilon_1, \ldots, \varepsilon_T \mid H_{T-1}) = \prod_{t=1}^{T} f(\varepsilon_t \mid H_{t-1}) \tag{2}$$

where $f$ is the standard Gaussian or t probability density function.

B. Method 2
This method uses the Naïve model [6], which is a forecasting method that uses the last observation (time step (t)) to predict the expected outcome at the next time step (t+1). The naïve approach can be used with a stable series, with seasonal variations, or with trend. With a stable series, the last data point becomes the forecast for the next period. As an example, if the throughput in the last hour was 25 Mbps, the forecast for this hour is 25 Mbps.

This method is used as a supervised machine learning algorithm to identify trends and seasonality and to forecast the estimated future values. The historical univariate data (throughput as a function of time) is transformed into a supervised learning problem with inputs and outputs, such that the throughput observed at instance (t) is the output predicted for instance (t+1). Typically, for stable time series data,

$$\hat{y}_t = y_{t-1} \tag{3}$$

considering the seasonal variations,

$$\hat{y}_t = y_{t-n} \tag{4}$$

where $n$ is the number of periods in one seasonal cycle.

For data with trends,

$$\hat{y}_t = y_{t-1} + (y_{t-1} - y_{t-n}) \tag{5}$$

if the trend between $y_{t-n}$ and $y_{t-1}$ is constant.
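The three naïve variants of equations 3 to 5 can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used in this work: the function names are hypothetical, and the recursive reuse of earlier forecasts to reach a horizon longer than one step is an assumption, since the equations only define the one-step case.

```python
from typing import List, Sequence


def naive_stable(history: Sequence[float], horizon: int) -> List[float]:
    """Equation (3): the last observation is the forecast for every future step."""
    if not history:
        raise ValueError("history must not be empty")
    return [history[-1]] * horizon


def naive_seasonal(history: Sequence[float], horizon: int, n: int) -> List[float]:
    """Equation (4): each forecast copies the value seen n periods earlier."""
    if n < 1 or len(history) < n:
        raise ValueError("need at least one full cycle of n observations")
    extended = list(history)
    for _ in range(horizon):
        # Assumption: forecasts are appended so that later steps can reuse them.
        extended.append(extended[-n])
    return extended[len(history):]


def naive_trend(history: Sequence[float], horizon: int, n: int) -> List[float]:
    """Equation (5): last value plus the change observed since n periods earlier."""
    if n < 1 or len(history) < n:
        raise ValueError("need at least n observations")
    extended = list(history)
    for _ in range(horizon):
        last = extended[-1]
        extended.append(last + (last - extended[-n]))
    return extended[len(history):]


if __name__ == "__main__":
    # The example from the text: 25 Mbps in the last hour gives 25 Mbps next hour.
    print(naive_stable([22.0, 25.0], horizon=1))
```

For hourly throughput samples, a daily cycle would correspond to n = 24; that value is an assumption for illustration, as the cycle length is not stated in the text.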

IV. EXPERIMENTAL RESULTS


The goal is to forecast one entire week of the cell average downlink (DL) throughput, with a resolution of one hour. The input data for our model was obtained from a fixed LTE probe deployed in the city of Lisbon (Portugal) in a dense urban area connected to an LTE (4G) mobile network operator. Three weeks of historical collected measurements have been used for training the prediction models and one week was used to compare the observed throughput with the forecast values, as illustrated in Figure 7.

Figure 7: Downlink average throughput: 3 weeks for training and 1 week for forecasting.

For method 1, the comparison between the forecast values and the actual DL throughput is highlighted in Figure 8. We can observe that the confidence interval (uncertainty) widens as the forecast horizon increases.
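The widening of the interval can be illustrated with the simplest special case of equation 1, a first-order autoregressive process (p = 1, q = 0) with known parameters and innovation variance σ². This special case is an assumption chosen for illustration and is not the model fitted in this work.

```latex
\begin{align}
y_t &= c + \phi_1 y_{t-1} + \varepsilon_t, \qquad \operatorname{Var}(\varepsilon_t) = \sigma^2 \\
e_{t+h} &= y_{t+h} - \hat{y}_{t+h} = \sum_{j=0}^{h-1} \phi_1^{\,j}\, \varepsilon_{t+h-j} \\
\operatorname{Var}(e_{t+h}) &= \sigma^2 \sum_{j=0}^{h-1} \phi_1^{\,2j}
\end{align}
```

Each additional step of the horizon h adds a non-negative term to the sum, so the forecast error variance, and with it the width of a Gaussian interval around the forecast, cannot shrink as h grows.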

The ARIMAX model proved to give a very good estimation during the first 20 hours, actually enabling us to predict cell congestion events. A cell congestion event was considered to occur when we measured a drop of at least 75% in the average download speed per user (considering an average of 20 Mbps per user during off-peak hours).
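The congestion criterion above can be applied directly to a forecast series. The following is a minimal sketch, assuming the forecast is a list of hourly per-user throughput values in Mbps; the function and constant names are hypothetical. With the 20 Mbps off-peak baseline and a 75% drop, the resulting threshold is 20 × (1 − 0.75) = 5 Mbps.

```python
from typing import List, Sequence

OFF_PEAK_MBPS = 20.0  # off-peak average per user quoted in the text
DROP_FRACTION = 0.75  # minimum relative drop that counts as congestion


def congestion_hours(
    forecast_mbps: Sequence[float],
    baseline_mbps: float = OFF_PEAK_MBPS,
    drop: float = DROP_FRACTION,
) -> List[int]:
    """Return the indices of the forecast hours that meet the congestion criterion."""
    if not 0.0 < drop < 1.0:
        raise ValueError("drop must be a fraction between 0 and 1")
    # A drop of at least `drop` means the value is at or below this threshold.
    threshold = baseline_mbps * (1.0 - drop)
    return [hour for hour, value in enumerate(forecast_mbps) if value <= threshold]


if __name__ == "__main__":
    # Illustrative values only, not measurements from this work.
    print(congestion_hours([18.0, 9.5, 5.0, 3.2, 21.0]))
```

The returned hour indices are what a self-optimization process would need in order to schedule a re-configuration ahead of the predicted event.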
The prediction capability is very relevant for a SON implementation: if the self-optimization process estimates that a change in the mobile network configuration will compensate for possible trade-offs (e.g., interruption in service), it may schedule an optimization process that performs changes in the cell's parameters (such as the activation of additional LTE carriers) to compensate for a forecasted cell outage event [10].

Figure 8: One week of throughput forecast and associated confidence intervals (Method 1)

Figure 9 depicts the forecasting results obtained from method 2 in comparison with the real (measured) values of DL throughput.

Figure 10 presents the comparison between the two forecast models.

The Mean Squared Error (MSE) computed for the two methods, presented in Table 3, shows that method 2 is more accurate than method 1 for this dataset. Although the Naïve persistence model is quite simple, it is usually quite efficient and accurate in time series forecasting.
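For reference, the MSE between an observed and a forecast series can be computed as sketched below. The function name is hypothetical and the sample values are illustrative only; they are not the measurements behind the reported results. Note that the MSE of throughput values carries squared units, and its square root (RMSE) is expressed in Mbps.

```python
import math
from typing import Sequence


def mean_squared_error(observed: Sequence[float], forecast: Sequence[float]) -> float:
    """Average of the squared differences between observed and forecast values."""
    if len(observed) != len(forecast) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    return sum((o - f) ** 2 for o, f in zip(observed, forecast)) / len(observed)


if __name__ == "__main__":
    # Illustrative hourly throughput values in Mbps.
    observed = [20.0, 10.0, 5.0]
    forecast = [18.0, 12.0, 5.0]
    mse = mean_squared_error(observed, forecast)
    print(f"MSE = {mse:.3f}, RMSE = {math.sqrt(mse):.3f} Mbps")
```

Computing the same metric over the same held-out week for both models is what makes the two figures directly comparable.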
Figure 9: One week of throughput forecast (Method 2) compared to
real measurements
Figure 10: Comparison between the two forecast models

Table 3: Forecasting errors (MSE) expressed in Mbps

Method                               MSE [Mbps]
Method 1: ARIMAX model               7.16
Method 2: Naïve persistence model    6.90

V. CONCLUSIONS
This work describes a data analytics methodology and modelling capable of forecasting the average downlink throughput of an LTE base station by using two forecast models, the ARIMAX and the Naïve persistence models. The obtained results have shown that both models are able to forecast the network behaviour with high accuracy. We are able to estimate a cell congestion event up to 30 hours in advance, which gives SON strategies enough time to react (e.g., by shifting coverage and capacity to areas in need, before subscribers have been impacted by dropped calls or reduced data speeds). As future work, a comparison with other forecasting models, probabilistic and fuzzy, is suggested.

ACKNOWLEDGMENTS
This work is funded by the European Union's Horizon 2020 research and innovation programme under grant agreement No. 644399 (MONROE) through the open call project Affordable LTE Network Benchmarking Based On MONROE, and the European Structural Investment Funds (ESIF), through the Operational Competitiveness and Internationalization Programme (COMPETE 2020) [Project Nr. 17787 (POCI-01-0247-FEDER-MUSCLES)].

REFERENCES
[1] MONROE, "Measuring Mobile Broadband Networks in Europe", H2020-ICT-11-2014 research project. [online]. Available: https://www.monroe-project.eu/
[2] Ö. Alay et al., "Measuring and assessing mobile broadband networks with MONROE," 2016 IEEE 17th International Symposium on A World of Wireless, Mobile and Multimedia Networks (WoWMoM), Coimbra, 2016, pp. 1-3. doi: 10.1109/WoWMoM.2016.7523537
[3] Ö. Alay et al., "MONROE: Measuring Mobile Broadband Networks in Europe", IRTF & ISOC Workshop on Research and Applications of Internet Measurements (RAIM), 2015.
[4] L. Jorguseski, A. Pais, F. Gunnarsson, A. Centonza and C. Willcock, "Self-organizing networks in 3GPP: standardization and future trends," in IEEE Communications Magazine, vol. 52, no. 12, pp. 28-34, December 2014. doi: 10.1109/MCOM.2014.6979983
[5] J. Xu, L. Tang, Q. Chen and L. Yi, "Study on Based Reinforcement Q-Learning for Mobile Load Balancing Techniques in LTE-A HetNets," 2014 IEEE 17th International Conference on Computational Science and Engineering, Chengdu, 2014, pp. 1766-1771.
[6] S. S. Mwanje and A. Mitschele-Thiel, "A Q-Learning strategy for LTE mobility Load Balancing," 2013 IEEE 24th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), London, 2013, pp. 2154-2158.
[7] Dong, Xin, Wentao Fan, and Jun Gu. "Predicting LTE Throughput Using Traffic Time Series." ZTE Communications 4 (2015): 014.
[8] Docker containers. [online]. Available: https://www.docker.com/
[9] G. E. P. Box, G. M. Jenkins and G. C. Reinsel, Time Series Analysis: Forecasting and Control, 3rd ed., Englewood Cliffs, NJ: Prentice Hall, 1994.
[10] X. S. W. L. a. J. W. Jingyu Li, "Self-Optimization of Coverage and Capacity in LTE Networks Based on Central Control and Decentralized Fuzzy Q-Learning," International Journal of Distributed Sensor Networks, vol. 8, 2012.
[11] A. M. L. Leite da Silva, W. S. Sales, L. A. d. F. Manso and R. Billinton, "Long-Term Probabilistic Evaluation of Operating Reserve Requirements With Renewable Sources," in IEEE Transactions on Power Systems, vol. 25, no. 1, pp. 106-116, Feb. 2010.
[12] L. M. Carvalho, J. Teixeira and M. Matos, "Modeling wind power uncertainty in the long-term operational reserve adequacy assessment: A comparative analysis between the Naïve and the ARIMA forecasting models," 2016 International Conference on Probabilistic Methods Applied to Power Systems (PMAPS), Beijing, 2016, pp. 1-6.
[13] M. M. Eljazzar and E. E. Hemayed, "Enhancing electric load forecasting of ARIMA and ANN using adaptive Fourier series," 2017 IEEE 7th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 2017, pp. 1-6.
