
Network-Aware Credit Scoring System For Telecom Subscribers Using Machine Learning and Network Analysis


The current issue and full text archive of this journal is available on Emerald Insight at:

https://www.emerald.com/insight/1355-5855.htm

APJML 34,5

Network-aware credit scoring system for telecom subscribers using machine learning and network analysis

Hongming Gao and Hongwei Liu
School of Management, Guangdong University of Technology, Guangzhou, China
Haiying Ma
School of Internet Finance and Information Engineering, Guangdong University of Finance, Guangzhou, China
Cunjun Ye
School of Management, Guangdong University of Technology, Guangzhou, China, and
Mingjun Zhan
Business School, Foshan University, Foshan, China

Received 11 December 2020
Revised 13 July 2021, 2 August 2021
Accepted 23 August 2021

Abstract
Purpose – A good decision support system for credit scoring enables telecom operators to measure subscribers’ creditworthiness in a fine-grained manner. This paper aims to propose a robust credit scoring system by leveraging latent information embedded in the telecom subscriber relation network based on multi-source data, including telecom inner data, online app usage, and offline consumption footprint.
Design/methodology/approach – Rooted in network science, the relation network model and singular value decomposition are integrated to infer different subscriber subgroups. Employing the results of network inference, the paper proposes a network-aware credit scoring system to predict continuous credit scores by implementing several state-of-the-art techniques, i.e. multivariate linear regression, random forest regression, support vector regression, multilayer perceptron, and a deep learning algorithm. The authors use a data set consisting of 926 users of a major Chinese telecom operator within one month of 2018 to verify the proposed approach.
Findings – The distribution of the telecom subscriber relation network follows a power-law function instead of the Gaussian function previously assumed. This network-aware inference divides the subscriber population into a connected subgroup and a discrete subgroup. Besides, the findings demonstrate that the network-aware decision support system achieves more accurate prediction performance. In particular, the results show that, when stochastic equivalence is considered, the forecasting error of the connected-subgroup model is significantly reduced by 7.89–25.64% as compared to the benchmark. Deep learning performs the best, which might indicate that a non-linear relationship exists between telecom subscribers’ credit scores and their multi-channel behaviours.
Originality/value – This paper contributes to the existing literature on business intelligence analytics and
continuous credit scoring by incorporating latent information of the relation network and external information
from multi-source data (e.g. online app usage and offline consumption footprint). Also, the authors have
proposed a power-law distribution-based network-aware decision support system to reinforce the prediction
performance of individual telecom subscribers’ credit scoring for the telecom marketing domain.
Keywords Credit scoring, Relation network, Stochastic equivalence, Power-law distribution,
Machine learning, Deep learning
Paper type Research paper

This research was supported by the National Natural Science Foundation of China [grant number 71671048]; Guangdong Construction of High-Level Colleges for Postgraduate Study Abroad Project in Guangdong University of Technology [grant number 262515006]; and Top Innovation Graduate Student Cultivation Project Fund.

Asia Pacific Journal of Marketing and Logistics
Vol. 34 No. 5, 2022
pp. 1010-1030
© Emerald Publishing Limited
1355-5855
DOI 10.1108/APJML-12-2020-0872

1. Introduction
To recommend new products (or services), upgrades, or add-ons through up-selling to telecom subscribers, telecom operators need to measure the creditworthiness of subscribers precisely and in a fine-grained manner. Credit scoring is central to how institutions manage and control risk: it conveys information about the creditworthiness of individual users and is widely applied to estimate the likelihood that a user will meet her debt obligations (Hand and Henley, 1997; Luo, 2019; San Pedro et al., 2015). Credit scoring originated in banks’ credit card granting decisions and has since spread to other socio-economic fields. For instance, binary credit-granting decisions exist in traditional financial institutions (San Pedro et al., 2015) and emerging peer-to-peer lending platforms (Ahelegbey et al., 2019; Giudici et al., 2019). Besides, supply chain finance confronts a multi-class credit risk assessment problem for buyer or seller firms (Luo, 2019; Fayyaz et al., 2020). In the context of the telecom industry, a telecom credit rating system poses the challenge of continuous prediction of individual subscribers’ credit scores, instead of the category classification problems in the preceding studies.
Telecom operators conventionally exploit their telecom inner data for customer segmentation, fraud detection, and churn prediction (Han et al., 2012; Hung et al., 2006; Kimura, 2021; Xing and Girolami, 2007), which constitutes a major research domain in telecom marketing and sales. These studies are still limited in that their business models reflect the subscriber persona profile through a low-dimensional, single data source, even though multi-source data have shown complementary predictive capabilities for financial credit scoring (Óskarsdóttir et al., 2019; San Pedro et al., 2015; Yu et al., 2020). The telecom inner data (e.g. mobile phone call-detail records, telecom billing information) have been used as input to explain users’ credit scores in some financial fields, but not as independent variables for the creditworthiness of telecom subscribers in the telecom marketing domain itself.
In addition, with the development of computing power, a variety of state-of-the-art techniques have been exploited to evaluate personal credit scores and the risk of default (San Pedro et al., 2015; Yu et al., 2020; Luo, 2019; Fayyaz et al., 2020; Óskarsdóttir et al., 2019). A set of credit-scoring decision support systems based on machine learning and deep learning have been implemented, using methods such as logistic regression, support vector machines, gradient boosted trees, random forest, multilayer perceptron, and deep neural networks. However, the extant models treat individual users as a static population. All users are assessed by a credit scoring homogeneity model without considering heterogeneity, which might cause a high level of systemic risk in credit scoring and bias prediction performance. A few studies captured heterogeneity in peer-to-peer lending platforms (Giudici et al., 2019; Ahelegbey et al., 2019). Heterogeneity is conceptualized as stochastic equivalence in this research stream: a phenomenon often seen in a group relation network, where the vertices can be divided into different subgroups (or communities) such that members of the same subgroup have analogous patterns (Hoff, 2008). However, Ahelegbey et al. (2019) simply posited that user relationships in the network obey a Gaussian function.
Telecom operators require a high-performance credit scoring model to predict in detail the degree to which subscribers are at risk of default, in order to conduct precision marketing campaigns according to subscriber profiling and credit-based segmentation (Bayer, 2010; Han et al., 2012). To address these research gaps, we utilize multiple data sources, including telecom inner data, online app usage and offline consumption footprint, to train and comprehensively compare different state-of-the-art algorithms for continuous credit score management in the telecom industry. Our study makes several key contributions to the extant literature on credit scoring. To the best of our knowledge, this is the first study to apply online app usage and offline consumption footprint in a continuous credit scoring model for telecom subscribers. We shed light on the information fusion of multi-source data in this continuous credit scoring prediction context, which was ignored by the preceding research. In addition, our proposed model is a network-aware credit scoring model, which captures the stochastic equivalence in the telecom subscriber relation network. This is a heterogeneity model, in contrast to the conventional credit scoring homogeneity model. Our findings show that subscriber relationships in the network follow a power-law distribution, which relaxes the Gaussian function assumption. Last, the proposed model applies the decision support approach to the continuous credit score problem. The findings reveal that our model achieves more accurate forecasting performance and can be useful for extending machine learning, deep learning, and network science to the field of marketing. Managerially, these results help telecom operators to conduct precision marketing campaigns according to the fine-grained level of creditworthiness of subscribers.
The remainder of this study is organized as follows. Section 2 describes the related work on credit scoring and the telecom industry. In Section 3, the proposed research methodology is presented. Section 4 describes the data provided by a leading Chinese telecom operator. The experimental settings and the results of the forecasting error analysis are shown in Section 5. Section 6 provides a discussion of theoretical contributions and managerial implications.

2. Literature review
2.1 Marketing in the telecom industry
There are two major research domains in which telecom operators shape their competitive edge in the telecom industry. One is to improve network traffic performance, such as self-organizing network automation (Gomez-Andrades et al., 2016), real-time troubleshooting and root cause analysis (Imran et al., 2014), and service failure prevention (Wallin and Landen, 2008).
On the other hand, marketing revenue is considered the top priority in the telecom industry. To improve revenue, researchers integrated the characteristics of telecom data and the experience of experts to segment different customer lifecycle values (Bayer, 2010; Han et al., 2012; Phau et al., 2014). Compared with attracting a new subscriber, the cost of retaining existing subscribers is much lower. Because of the revenue losses associated with churn, telecom subscriber churn management is a tool for retaining current subscribers by satisfying their demands (Gao and Bai, 2014; Hung et al., 2006; Ram and Wu, 2016; Verbeke et al., 2012). For example, Hung et al. (2006) utilized telecom subscribers’ demographics, billing information, call detail records, and service changelogs to assign a score for their propensity to churn. Verbeke et al. (2012) found that profit increases significantly in a retention marketing campaign after considering the optimal fraction of telecom subscribers with the highest predicted probabilities to attrite. Simultaneously, telecom fraud detection is a basis for identifying at-risk subscribers who harm other ordinary telecom subscribers (Xing and Girolami, 2007). In a nutshell, extant marketing studies are still limited in that they only considered the traditional inner telecom data. This suggests a research gap: conventional telecom business models reflect the subscriber persona profile through a low-dimensional, single data source.
Multi-source data, by contrast, have shown efficient complementary capabilities for financial credit scoring (Djeundje et al., 2021; Óskarsdóttir et al., 2019; San Pedro et al., 2015; Yu et al., 2020). The telecom inner data (e.g. mobile phone call-detail records, telecom billing information) explain subscribers’ credit scores as input in the financial field, but not the creditworthiness of telecom subscribers in the telecom marketing domain itself (Olowe et al., 2021). A high-performance comprehensive credit scoring model is the core of the subscriber persona profile. Telecom operators aim not only to achieve credit control of defaulting subscribers but also to facilitate intelligent marketing strategies based on the respective credit ratings of existing subscribers, so as to make more profitable sales.
2.2 State-of-the-art techniques for credit scoring
2.2.1 Homogeneity models. A variety of state-of-the-art techniques have been exploited to evaluate personal credit scores and risk of default with multi-source data (San Pedro et al., 2015; Yu et al., 2020; Luo, 2019; Fayyaz et al., 2020; Óskarsdóttir et al., 2019; Zhang and Dai, 2020). San Pedro et al. (2015) conducted a study to model users’ financial risk and credit scores with logistic regression, support vector machines, and gradient boosted trees. Their findings show that the proposed approach, which incorporates consumption, network, and mobility features from mobile phone usage data, is more accurate and comparable to a model without financial history. Yu et al. (2020) used logistic regression to model online social media data from Douban.com (a community like Twitter where users can follow each other and post reviews of movies or books). Their credit evaluation system validated the forecasting capabilities of complex information within online social media. Besides, artificial neural networks have also been applied to the financial credit risk classification problem. The multilayer perceptron algorithm is compared to four machine learning algorithms and reveals its generalization in the credit default swaps market (Luo, 2019). These recent studies have established a connection between multi-source data and state-of-the-art credit scoring techniques. However, the extant literature treated all individual users as a static and homogeneous population. That is, all individual users are evaluated by a homogeneity model without considering heterogeneity. The unobserved heterogeneity needs to be revealed by different customer segments (Le et al., 2019; Zhan et al., 2020).
2.2.2 Heterogeneity models. A few studies recently captured user heterogeneity for credit scoring classification problems in peer-to-peer lending platforms (Ahelegbey et al., 2019; Giudici et al., 2019). In a research stream of network science, user heterogeneity is captured and identified as stochastic equivalence (Ahelegbey et al., 2019; Giudici et al., 2019; Gao et al., 2021). That is, a phenomenon is often seen in a group relation network in which the users (vertices) can be divided into different segments such that members of the same segment have analogous patterns (Hoff, 2008). The distinct interconnectedness within each user subgroup represents its own stochastic equivalence. Specifically, Giudici et al. (2019) leveraged information by extracting topological features of borrower similarity networks and identifying different communities. They found that the network inference enhances credit risk accuracy. Following Giudici et al. (2019), Ahelegbey et al. (2019) used the inner product of borrower-specific eigenvectors to construct a borrower relation network. Then they simply hypothesized that borrowers’ relationships in the network obey a Gaussian function, and classified the population of borrowers into a connected subgroup and an unconnected one. Thanks to the stochastic equivalences of those two subgroups, their heterogeneity models improve the prediction capability when confronting a binary credit-granting decision-making problem.
Multi-source data can reveal a latent relation among telecom subscribers (Hung et al., 2006; San Pedro et al., 2015; Óskarsdóttir et al., 2019). For example, the call detail record reflects the actual relationship between telecom subscribers. Moreover, both their online mobile app usage patterns and their offline consumption disclose the relationship between their behaviour patterns. To the best of our knowledge, most literature still lacks the context of continuous credit score management, in which telecom operators can precisely recommend new products or services, upgrades, or add-ons through up-selling according to the creditworthiness of subscribers in a fine-grained manner. We assume that the stochastic equivalence of different telecom subscriber subgroups can be inferred from the latent relation from the perspective of network science. Following Hoff (2008) and Ahelegbey et al. (2019), we argue that a network-aware credit scoring system integrating the identified stochastic equivalence might mitigate the high level of systemic risk and enhance the prediction performance of continuous credit scoring in the field of telecom marketing.
3. Methodology
This study presents a novel network-aware credit scoring model using multiple sources, e.g. traditional telecom inner data, online app usage data and offline consumption footprint data. The research framework is shown in Figure 1. We first achieve information fusion using singular value decomposition and propose a power-law-based relation network model. Furthermore, several state-of-the-art algorithms are exploited for business intelligence analytics. After that, the proposed models are compared with a benchmark that cannot account for stochastic equivalence. Last, the forecasting error analysis assesses the performance of these techniques.

3.1 Relation network model using latent factors


3.1.1 Singular value decomposition (SVD). SVD is a classic matrix factorization technique that decomposes an arbitrary high-dimensional $n \times m$ matrix $A$. The SVD of $A$ is given by:

$$A = UDV^{T} \quad \text{with} \quad U^{T}U = V^{T}V = I_m \quad \text{and} \quad D = \operatorname{diag}(\sigma_1, \sigma_2, \ldots, \sigma_m) \tag{1}$$

where $U$ and $V$ comprise the orthonormal eigenvectors of $AA^{T}$ and $A^{T}A$, respectively. The diagonal elements of $D = \Lambda^{1/2}$ are the singular values $(\sigma_1, \sigma_2, \ldots, \sigma_m)$, i.e. the square roots of the non-zero eigenvalues of $A^{T}A$, sorted in descending order.
Let $X = (\chi_1, \ldots, \chi_n)^{T}$ be the observed multi-source subscribers’ features. Based on the studies of Hoff (2008) and Ahelegbey et al. (2019), the observed explanatory features $X$ can be introduced in Equation (1) as follows:

$$X = A + E = UDV^{T} + E \tag{2}$$

In Equation (2), $A$ is the expectation of $X$, and $E$ is the matrix of stochastic disturbance terms, which are independent and identically distributed (i.i.d.) $\mathrm{MVN}(0, \Sigma)$. This study posits that stochastic equivalence exists among the telecom subscribers. That is, latent factors embedded in the observed information of subscribers enable us to identify a fine-grained level of subscribers’ creditworthiness, namely the continuous credit scores in this study. Thus, we consider the common factors $f_i = u_i D$ as a lower-dimensional vector (i.e. $s < m$) of latent factor scores to express any $\chi_i$ as:

$$\chi_i = u_i D V^{T} + \varepsilon_i = f_i V^{T} + \varepsilon_i \tag{3}$$

where $f_i = (f_{i,1}, \ldots, f_{i,s})'$ is an $s$-dimensional vector, $V$ is the $m \times s$ matrix of factor loadings associated with $f_i$, $u_i$ is also an $s$-dimensional vector, and $D$ is an $s \times s$ diagonal matrix. Once the optimal number $s$ of latent factors is found, Equation (3) represents a reduced-dimension form of $X$.

Figure 1. The proposed network-aware credit scoring decision support system. Traditional inner data (inner database) and multiple data sources (online apps usage, offline consumption footprint) feed singular value decomposition and network inference, which yield the relation network model; MLR, RF, SVR, MLP and DL are then compared through forecasting error analysis.
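
The dimension reduction in Equations (1) to (3) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `latent_factors`, the synthetic matrix `X`, and the choice `s=2` are assumptions made for illustration, and the truncated SVD of the observed `X` is used as a stand-in for the unobserved expectation `A`.

```python
import numpy as np

def latent_factors(X, s):
    """Return latent factor scores F (n x s, rows f_i = u_i D) and loadings V (m x s).

    Hypothetical helper: the truncated SVD of the observed X stands in for A.
    """
    # Thin SVD: X = U @ diag(sigma) @ Vt, singular values in descending order.
    U, sigma, Vt = np.linalg.svd(X, full_matrices=False)
    F = U[:, :s] * sigma[:s]   # scale each retained left singular vector by its singular value
    V = Vt[:s, :].T            # m x s matrix of factor loadings
    return F, V

# Synthetic stand-in for the n x m subscriber feature matrix (not the paper's data).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))

F, V = latent_factors(X, s=2)
X_hat = F @ V.T                # rank-s reconstruction, row i is f_i V^T
```

Keeping all `m` singular values reproduces `X` exactly, while a smaller `s` gives the reduced-dimension form in Equation (3); how `s` is chosen is left to the paper's own procedure.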
3.1.2 Relation network model. Based on the concept of the eigenvalue-decomposition relation network (Hoff, 2008; Ahelegbey et al., 2019), the common inner-product model is employed. Specifically, let $r_{ij}$ denote the relation between telecom subscribers $i$ and $j$; the adjacency matrix $R$ characterizing the subscriber relation network can be represented as:

$$R = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ r_{21} & r_{22} & \cdots & r_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ r_{n1} & r_{n2} & \cdots & r_{nn} \end{bmatrix} \quad \text{with} \quad r_{ij} = f_i^{T} f_j \tag{4}$$

Ahelegbey et al. (2019) assumed the relationship follows a Gaussian function $\Phi(\theta + r_{ij})$, wherein $\Phi$ is the cumulative distribution function and $\theta = \Phi^{-1}\left(\frac{2}{n-1}\right)$ is a constant. We relax this hypothetical restriction. The data-driven discovery in Subsection 5.2 validates that the subscribers’ relations obey a power-law function instead of a Gaussian distribution. This can be described by a probability density $p(r)$ such that

$$p(r)\,dr = \Pr(r \le r_{ij} < r + dr) = C r^{-\alpha}\,dr \tag{5}$$

In Equation (5), $r_{ij}$ is the observed value of the relation between subscribers $i$ and $j$, and $C$ is a normalization constant. The constant parameter $\alpha$ of the distribution is known as the scaling parameter and is to be estimated. In the discrete case, the probability of the power-law distribution is $P(r) \propto r^{-\alpha}$ (Clauset et al., 2009). It can be defined as the probability $C r_{ij}^{-\alpha}$ of an edge between vertices $i$ and $j$ (i.e. $A_{ij} = A_{ji} = 1$ in an undirected network):

$$P(A_{ij} = 1 \mid r_{ij}, \alpha) = \Pr(r = r_{ij}) = C r_{ij}^{-\alpha} \tag{6}$$

$$A_{ij} = 1, \quad \text{if } C r_{ij}^{-\alpha} > \lambda \tag{7}$$

where $\lambda$ is a parameter reflecting the degree of sparsity of the subscriber relation network. After inferring the latent subscriber relation network, the stochastic equivalence of different telecom subscriber subgroups can be identified by $A_{ij}$.
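
Equations (4), (6) and (7) can be sketched as follows. This is a hedged illustration, not the authors' code: the function names, the toy factor matrix, and the values of `C`, `alpha` and `lam` are hypothetical, and two choices are assumptions of this sketch rather than statements from the paper: non-positive relations `r_ij` are given edge probability zero (since `r ** -alpha` is not defined for them), and the connected and discrete subgroups are separated by node degree.

```python
import numpy as np

def relation_matrix(F):
    """Equation (4): r_ij = f_i^T f_j for every pair of subscribers."""
    F = np.asarray(F, dtype=float)
    return F @ F.T

def power_law_adjacency(R, C, alpha, lam):
    """Equations (6)-(7): set A_ij = 1 where C * r_ij**(-alpha) exceeds lam."""
    R = np.asarray(R, dtype=float)
    P = np.zeros_like(R)
    pos = R > 0                      # assumption: skip non-positive relations
    P[pos] = C * R[pos] ** (-alpha)  # edge probability from the power law
    A = (P > lam).astype(int)
    np.fill_diagonal(A, 0)           # assumption: no self-loops
    return A

# Toy latent factor scores for three subscribers (illustrative values only).
F = np.array([[1.0, 0.0],
              [2.0, 0.0],
              [0.0, 0.5]])
R = relation_matrix(F)
A = power_law_adjacency(R, C=1.0, alpha=2.0, lam=0.2)

degree = A.sum(axis=1)
connected = np.flatnonzero(degree > 0)   # subscribers with at least one edge
discrete = np.flatnonzero(degree == 0)   # isolated subscribers
```

Because `R` is symmetric, the resulting `A` is symmetric as well, which matches the undirected network described in the text.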

3.2 Decision support system for credit scoring


This study aims to investigate the prediction capacities of the latent network and multi-source data for the continuous credit scores of telecom subscribers, instead of categorical credit-granting classification problems. After identifying the stochastic equivalence, we capture the heterogeneity of telecom subscribers. We introduce several state-of-the-art techniques to assess the prediction capability of our proposed approach.
3.2.1 Multivariate linear regression (MLR). MLR is a statistical technique that uses multiple explanatory variables to predict a continuous outcome. MLR is also used for credit scoring, similar to logistic regression (Teles et al., 2020; Luo, 2019; Ahelegbey et al., 2019). Let $y_i$ be the credit score of subscriber $i$, with $n$ subscribers observed. The normal regression model is:

$$y_i \mid \beta, \sigma^{2}, x_i \sim N_n\left(x_i\beta, \sigma^{2} I_n\right) \tag{8}$$

In Equation (8), $x_i$ indicates a vector of $m$ observable features of subscriber $i$, and the parameter vector $\beta = (\beta_1, \beta_2, \ldots, \beta_m)'$ needs to be estimated. $N_n(x_i\beta = \mu, \sigma^{2} I_n = \Sigma)$ denotes the $n$-dimensional normal distribution with mean $\mu$ and covariance matrix $\Sigma$, where $I_n$ is the identity matrix.
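
Under the normal model in Equation (8), the maximum-likelihood estimate of the coefficients coincides with ordinary least squares, which the following sketch computes with NumPy. The function name `fit_mlr`, the synthetic data, and the added intercept column are assumptions of this example; Equation (8) itself does not show an explicit intercept.

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares fit of y on X; returns (beta, sigma2).

    An intercept column is prepended (an assumption of this sketch).
    """
    Xb = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    resid = y - Xb @ beta
    # Unbiased residual variance estimate: RSS / (n - number of coefficients).
    sigma2 = resid @ resid / (len(y) - Xb.shape[1])
    return beta, sigma2

# Synthetic, noise-free example so the recovered coefficients can be checked exactly.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))
y = 600.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]

beta, sigma2 = fit_mlr(X, y)
y_hat = np.column_stack([np.ones(len(X)), X]) @ beta   # fitted credit scores
```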
3.2.2 Random forest regression (RF). RF is an ensemble of decision tree predictors $h(x_i; \theta_k),\ k = 1, \ldots, K$, where $x_i$ is again the vector of $m$ observable features of subscriber $i$, and $\theta_k$ are i.i.d. random vectors (Segal, 2004). In the continuous outcome setting, this study concentrates on the regression form $y_i$. The observed information consists of data $(x_1, y_1), \ldots, (x_n, y_n)$, which are assumed to be drawn independently from the joint distribution of $(X, Y)$.
The prediction can be expressed by the unweighted average over the ensemble:

$$\hat{y}_i = \bar{h}(x_i) = \frac{1}{K}\sum_{k=1}^{K} h(x_i; \theta_k) \tag{9}$$

As $K$ approaches $\infty$, the Law of Large Numbers ensures that

$$E_{X,Y}\left(Y - \bar{h}(X)\right)^{2} \rightarrow E_{X,Y}\left(Y - E_{\theta}\, h(X; \theta)\right)^{2} \tag{10}$$

The forecasting error for RF is the quantity on the right of Equation (10). Its convergence indicates that RF does not over-fit as more trees are added.
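
The unweighted ensemble average in Equation (9) can be sketched with a toy forest of depth-1 trees. This is a deliberately simplified illustration rather than a full random forest or the authors' implementation: `fit_stump` and `predict_forest` are hypothetical helpers, each tree uses one bootstrap sample and one randomly chosen feature as its randomness $\theta_k$, and the data are synthetic.

```python
import numpy as np

def fit_stump(X, y, rng):
    """Fit one randomized depth-1 regression tree h(x; theta_k).

    theta_k is the tree's randomness: a bootstrap sample of rows and one
    randomly chosen feature (an illustrative simplification).
    """
    n, m = X.shape
    rows = rng.integers(0, n, size=n)     # bootstrap sample with replacement
    j = int(rng.integers(0, m))           # random feature index
    xb, yb = X[rows, j], y[rows]
    # Fallback when no split is possible: predict the bootstrap mean on both sides.
    best = (np.inf, float(np.median(xb)), yb.mean(), yb.mean())
    for t in np.unique(xb)[:-1]:          # candidate thresholds, both sides non-empty
        left, right = yb[xb <= t], yb[xb > t]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best[0]:
            best = (sse, float(t), left.mean(), right.mean())
    _, t, lo, hi = best
    return j, t, lo, hi

def predict_forest(stumps, X):
    """Equation (9): unweighted average of the K tree predictions."""
    preds = np.array([np.where(X[:, j] <= t, lo, hi) for j, t, lo, hi in stumps])
    return preds.mean(axis=0)

# Synthetic data (not the paper's): the score depends on the sign of feature 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 600.0 + 40.0 * (X[:, 0] > 0) + rng.normal(scale=5.0, size=200)

stumps = [fit_stump(X, y, rng) for _ in range(50)]   # K = 50 trees
y_hat = predict_forest(stumps, X)
```

In practice a library implementation with deeper trees would be used; the sketch only makes the averaging step explicit.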
3.2.3 Support vector regression (SVR). Support vector machines and SVR are a popular type of machine learning algorithm for dealing with nonlinear problems (Benkedjouh et al., 2015). The purpose is to determine an optimal hyperplane with a maximum margin that acts as the decision boundary. SVR has been applied to evaluate credit scores and has demonstrated superior performance (Baesens et al., 2003; Goh and Lee, 2019).
Based on the observed data $X = (x_i, d_i),\ i = 1, \ldots, n$, where $d_i \in \mathbb{R}$, SVR aims to solve the problem of inferring a function $y_i = f(x_i)$. Training an SVR yields the regression form, evaluated here at a point $x_j$:

$$f(x_j) = \sum_{i=1}^{n} \left(\alpha_i - \alpha_i^{*}\right) k(x_i, x_j) + b \tag{11}$$

where $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n)^{T}$, $\alpha^{*} = (\alpha_1^{*}, \alpha_2^{*}, \ldots, \alpha_n^{*})^{T}$ and $b$ are the parameters, and $k(x_i, x_j)$ denotes a positive definite kernel function. By minimizing the following objective function, we can compute $\alpha_i$ and $\alpha_i^{*}$ $(i = 1, \ldots, n)$:

$$\frac{1}{2}\sum_{i,j=1}^{n} \left(\alpha_i - \alpha_i^{*}\right)\left(\alpha_j - \alpha_j^{*}\right) k(x_i, x_j) + \varepsilon \sum_{i=1}^{n} \left(\alpha_i + \alpha_i^{*}\right) - \sum_{i=1}^{n} d_i \left(\alpha_i - \alpha_i^{*}\right) \tag{12}$$

subject to:

$$\sum_{i=1}^{n} \left(\alpha_i - \alpha_i^{*}\right) = 0 \quad \text{and} \quad \alpha_i, \alpha_i^{*} \in [0, C] \tag{13}$$

where $\varepsilon$ and $C$ are the hyper-parameters used to minimize the learning error. With the notion of support vectors (SV), the output prediction in Equation (11) is:

$$\hat{y}_j = \sum_{x_i \in SV} \left(\alpha_i - \alpha_i^{*}\right) k(x_i, x_j) + b \tag{14}$$

In this study, we use a Gaussian function with kernel width $\sigma$:

$$k(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^{2}}{2\sigma}\right) \tag{15}$$

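
The prediction step in Equations (14) and (15) can be sketched as below, assuming the dual coefficients have already been obtained by solving Equations (12) and (13). The function names, support vectors, coefficients and bias are illustrative values, not results from the paper. The kernel denominator follows Equation (15) as printed ($2\sigma$); other references parameterize the Gaussian kernel with $2\sigma^{2}$, so this choice should be treated as an assumption.

```python
import numpy as np

def gaussian_kernel(xi, xj, sigma):
    """Equation (15) as printed: exp(-||xi - xj||^2 / (2 * sigma))."""
    diff = np.asarray(xi, dtype=float) - np.asarray(xj, dtype=float)
    return np.exp(-np.dot(diff, diff) / (2.0 * sigma))

def svr_predict(x_new, support_vectors, dual_coef, b, sigma):
    """Equation (14): sum over support vectors of (alpha_i - alpha_i*) k(x_i, x_new) + b.

    dual_coef[i] holds the difference alpha_i - alpha_i* for support vector i.
    """
    total = 0.0
    for sv, coef in zip(support_vectors, dual_coef):
        total += coef * gaussian_kernel(sv, x_new, sigma)
    return total + b

# Hypothetical trained quantities, for illustration only.
support_vectors = np.array([[0.0, 0.0], [1.0, 0.0]])
dual_coef = np.array([2.0, -1.0])
b = 600.0

score = svr_predict(np.array([0.0, 0.0]), support_vectors, dual_coef, b, sigma=1.0)
```

Only the support vectors contribute to the sum, which is why the prediction cost depends on their number rather than on the full training set size.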
3.2.4 Multi-layer perceptron (MLP). MLP is one of the artificial neural network techniques, which has a good capability of approximating any finite set of real numbers (Juhos et al., 2009; Chong, 2013) with multiple layers between the input and output layers. MLP is widely used for credit risk assessment (Dahiya et al., 2016; Luo, 2019). The MLP with two hidden layers is deemed to have better performance (Juhos et al., 2009; Chester, 1990), and it non-linearizes several linear regression models through the typical sigmoid activation function:

$$f(x) = \frac{1}{1 + e^{-x}} \tag{16}$$

where both the input and output layers have linear units. The vector $x_i$ of $m$ observable features of subscriber $i$ takes the form $x_{i1}, \ldots, x_{im}$. Equation (17) displays the output for this telecom subscriber produced by a single perceptron, which is the building block of the multi-layer perceptron, and Equation (18) shows the result of propagating Equation (17) through the second hidden layer to the output unit.

$$o_i^{1s} = f\left(\sum_{t=1}^{m} w_t^{1s} x_{it} + w_{bias}^{1s}\right) \tag{17}$$

$$y_i = \sum_{r=1}^{l_2} w_r\, o_i^{2r} = \sum_{r=1}^{l_2} w_r\, f\left(\sum_{s=1}^{l_1} w_s^{2r} o_i^{1s} + w_{bias}^{2r}\right) \tag{18}$$

where $o_i^{1s}$ is the output of the $s$th perceptron in the first hidden layer for subscriber $i$, and $o_i^{2r}$ is the output of the $r$th perceptron in the second hidden layer. $w_t^{1s}$, $w_s^{2r}$ and $w_r$ are the weights in the first layer, the second layer and the output unit, and $l_1$, $l_2$ are the numbers of perceptrons in the first and second layers, respectively. Besides, $w_{bias}^{1s}$ and $w_{bias}^{2r}$ are the biases in the first and second layers. The popular back-propagation method with momentum is used for training the MLP.
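
The forward pass described by Equations (16) to (18) can be sketched in NumPy as follows. Only the forward computation is shown; training by back-propagation with momentum is omitted. The function names, layer sizes and random weights are hypothetical and serve only to make the shapes concrete.

```python
import numpy as np

def sigmoid(a):
    """Equation (16): logistic activation."""
    return 1.0 / (1.0 + np.exp(-a))

def mlp_forward(x, W1, b1, W2, b2, w_out):
    """Forward pass of a two-hidden-layer MLP with a linear output unit.

    W1: (l1, m) first-layer weights,  b1: (l1,) first-layer biases
    W2: (l2, l1) second-layer weights, b2: (l2,) second-layer biases
    w_out: (l2,) output weights (Equation (18) has no output bias)
    """
    o1 = sigmoid(W1 @ x + b1)    # Equation (17): first hidden layer outputs
    o2 = sigmoid(W2 @ o1 + b2)   # inner term of Equation (18): second hidden layer
    return float(w_out @ o2)     # linear combination at the output unit

# Illustrative shapes: m = 4 features, l1 = 3 and l2 = 2 perceptrons.
rng = np.random.default_rng(0)
x = rng.normal(size=4)
W1, b1 = rng.normal(size=(3, 4)), rng.normal(size=3)
W2, b2 = rng.normal(size=(2, 3)), rng.normal(size=2)
w_out = rng.normal(size=2)

y_pred = mlp_forward(x, W1, b1, W2, b2, w_out)
```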
3.2.5 Deep learning algorithm (DL). An alternative deep learning algorithm (DL) for multi-layer neural networks is employed (Candel et al., 2016). As shown in Figure 2 (based on an experimental result of our study), our proposed DL architecture comprises neurons (nodes) that are distributed in hierarchical layers. The neural network has four layers: an input layer for the vector of observable features, two nonlinear hidden layers with five and three neurons respectively, and an output layer matching the regression outcome. The black lines in Figure 2 denote the connections between layers and the corresponding weights on each connection, while the blue lines denote the biases.

Figure 2. The proposed deep neural network architecture. Input nodes: ID1_Age, ID2_BlackList, ID3_CheckName, ID4_Undergrad, ID5_X4G_Unhealthy, ID6_AVR_RecentSixPay, ID7_Ban_Mon, ID8_BanStable, ID9_BanStable_Rel6mon, ID10_CreditQuality, ID11_Debt_Current, ID12_MonLastPay, ID13_Number_call, ID14_PayType, ID15_PayStable, ID16_RecentPay, ID17_Sensity, ID18_SensityRate, ID19_Tenure_net, ID20_Total_Mon, OAS1_AppFinan, OAS2_AppLogistic, OAS3_AppOnlineShop, OAS4_AppPlane, OAS5_AppTrain, OAS6_AppTravel, OAS7_AppVideo, OAS8_NumApp, OCF1_LuxMarket, OCF2_Movie, OCF3_Often_Market, OCF4_Sam, OCF5_Spot, OCF6_Stadium, OCF7_Wanda, OCF8_MarketRec3Mon. Output node: CreditScore. (Fitted weight and bias values shown in the original figure are not reproduced here.)

Mathematically, layer $l$ computes an output vector $z_l$ from the output $z_{l-1}$ of the previous layer, the biases $b_l$, and the weights $w_l$:

$$z_l = \tanh\left(b_l + w_l z_{l-1}\right) \tag{19}$$

In Equation (19), $\tanh(a) = (e^{a} - e^{-a})/(e^{a} + e^{-a})$ is a rescaled and shifted logistic function. The final output layer $z_L$ is used to predict the credit score. Given the continuous nature of the credit score, we specify a Gaussian distribution function for this response variable. The loss function can then be derived as:

$$L(w_l, b_l \mid y) = \frac{1}{2}\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^{2} \tag{20}$$

where $\hat{y}_i$ and $y_i$ represent the predicted and actual credit scores of subscriber $i$, respectively, and $n$ denotes the total number of subscribers. Multi-threaded and distributed parallel computation is used to optimize the loss function of DL (Candel et al., 2016).
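
Equations (19) and (20) can be sketched with a small NumPy forward pass using the layer sizes described in the text (36 input features, hidden layers of five and three neurons, one output). This is an illustration and not the software used in the paper: the function names and random weights are hypothetical, and treating the output layer as linear (no tanh) is an assumption of this sketch made because the response is a continuous score.

```python
import numpy as np

def dl_forward(x, weights, biases):
    """Propagate one feature vector through the network.

    Hidden layers follow Equation (19): z_l = tanh(b_l + w_l z_{l-1}).
    The last layer is kept linear, which is an assumption of this sketch.
    """
    z = np.asarray(x, dtype=float)
    for W, b in zip(weights[:-1], biases[:-1]):
        z = np.tanh(b + W @ z)
    W, b = weights[-1], biases[-1]
    return (b + W @ z).item()

def squared_error_loss(y, y_hat):
    """Equation (20): half the sum of squared prediction errors."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return 0.5 * np.sum((y - y_hat) ** 2)

# Layer sizes from the text: 36 inputs -> 5 -> 3 -> 1 output.
sizes = [36, 5, 3, 1]
rng = np.random.default_rng(0)
weights = [rng.normal(scale=0.1, size=(sizes[k + 1], sizes[k])) for k in range(3)]
biases = [np.zeros(sizes[k + 1]) for k in range(3)]

x = rng.normal(size=36)                 # synthetic feature vector
y_pred = dl_forward(x, weights, biases)
loss = squared_error_loss([616.0], [y_pred])
```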

4. Data description
A leading Chinese telecom operator confronts a challenge: it cannot implement precision marketing campaigns according to the existing subscriber persona profile. The telecom operator provided real-world data, a randomly selected sample of 926 subscribers in one month of 2018, consisting of three sources: a traditional inner database, online apps usage, and offline consumption footprint.
Table 1 shows the description and definition of our data. The dataset is at the individual level, including a dependent variable (DV), 20 inner-database (ID), 8 online-apps-usage (OAS), and 8 offline-consumption-footprint (OCF) features. The DV, the continuous credit score of telecom subscribers, ranges from 465 to 701, as shown in Table 1. Meanwhile, the 20 ID features can be mapped into the demographic characteristics (ID1 to ID5), and the billing information and call detail records (ID6 to ID20) of subscribers.
The OAS characteristics in Table 2 describe mobile online apps usage. We observe that subscribers browsed video apps most frequently, as this feature has the maximum average (1218.02) this month. The standard deviation of each feature is larger than its average. Subscribers behaved heterogeneously from the perspective of mobile online apps usage. This model-free evidence indicates that stochastic equivalence might exist among

Table 1. Summary and definition of traditional inner-database (ID) features

| Abbr | Variable | Definition | Mean | SD | Min | Max |
|---|---|---|---|---|---|---|
| DV | CreditScore | Credit score | 616.11 | 41.10 | 465 | 701 |
| ID1 | Age | Age | 37.98 | 10.58 | 15 | 81 |
| ID2 | BlackList | If in blacklist | 0.05 | 0.22 | 0 | 1 |
| ID3 | CheckName | If passed the real-name verification system | 0.99 | 0.09 | 0 | 1 |
| ID4 | Undergrad | If an undergraduate | 0 | 0.05 | 0 | 1 |
| ID5 | X4G_Unhealth | If a 4G-unhealthy subscriber | 0.07 | 0.25 | 0 | 1 |
| ID6 | AVR_RecentSixPay | Average of bills in the past 6 months (CNY) | 103.48 | 64.65 | 8.46 | 819.41 |
| ID7 | Ban_Mon | Balance of this month (CNY) | 85.7 | 62.29 | 10 | 270 |
| ID8 | BanStable | Stable balance = Recent bill / (1 + Ban_CurrentMon) | 1.98 | 2.39 | 0.07 | 26.56 |
| ID9 | BanStable_Rel6mon | Relatively stable balance = Recent bill / (1 + AVR_RecentSixPay) | 1.02 | 0.30 | 0.19 | 5.81 |
| ID10 | CreditQuality | Quality of credit | 0.95 | 0.22 | 0 | 1 |
| ID11 | Debt_Current | If in arrears this month | 0.06 | 0.25 | 0 | 1 |
| ID12 | MonLastPay | Time from the last payment (months) | 0.99 | 0.07 | 0 | 1 |
| ID13 | Number_call | Num of people communicated this month | 48.52 | 52.91 | 1 | 500 |
| ID14 | PayType | Type of payment | 1.97 | 0.17 | 1 | 2 |
| ID15 | PayStable | Stable payment | 0.83 | 0.69 | 0 | 10.46 |
| ID16 | RecentPay | Recent bill (CNY) | 71.67 | 51.97 | 0.01 | 499 |
| ID17 | Sensity | Subscriber sensitivity of bills | 3.20 | 1.20 | 1 | 5 |
| ID18 | SensityRate | Rate of subscriber sensitivity of bills | 3.81 | 2.02 | 0 | 8 |
| ID19 | Tenuren_net | The period from when registered in the focal telecom operator (months) | 94.10 | 56.45 | 1 | 262 |
| ID20 | Total_Mon | The total cost of this month (CNY) | 105.20 | 66.85 | 8 | 659.77 |

Table 2. Summary and definition of online-apps-usage (OAS) features

| Abbr | Variable | Definition | Mean | SD | Min | Max |
|---|---|---|---|---|---|---|
| OAS1 | AppFinan | Total freq of using financial apps | 607.05 | 862.67 | 0 | 3,964 |
| OAS2 | AppLogistic | Total freq of using logistics apps | 0.56 | 5.61 | 0 | 90 |
| OAS3 | AppOnlineShop | Total freq of using online shopping apps | 604.19 | 1137.37 | 0 | 9,677 |
| OAS4 | AppPlane | Total freq of using air-ticket apps | 0.94 | 24.18 | 0 | 729 |
| OAS5 | AppTrain | Total freq of using train apps | 0.37 | 4.15 | 0 | 92 |
| OAS6 | AppTravel | Total freq of using tourism apps | 10.84 | 57.21 | 0 | 1,068 |
| OAS7 | AppVideo | Total freq of using video apps | 1218.02 | 2056.82 | 0 | 12,296 |
| OAS8 | NumApp | Total freq of using apps | 2441.96 | 3026.14 | 0 | 12,938 |

telecom subscribers. That is, when all users are evaluated by a homogeneous credit scoring
model that ignores the unique stochastic equivalence of different subscriber subgroups, the
model might not address overlapping information between subscribers, resulting in a high
level of systemic risk and prediction bias.
We also observe the offline consumption footprint of telecom subscribers in Table 3.
About 43% of subscribers have been to attractions and 33% to stadiums for sports exercises.
One-third of subscribers often go to markets, and the average frequency of going to
market(s) is 24.48 in the recent three months.
As shown in Tables 1–3, the units of the 36 explanatory variables vary; thus we apply
min-max normalization:

$$\hat{x} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{21}$$

where $x_{\max}$ and $x_{\min}$ are the maximum and minimum of the focal variable, respectively.
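
As an illustration of Equation (21), the following minimal sketch rescales one variable to the unit interval. The helper name `min_max_scale` is hypothetical, the three input values are the minimum, mean, and maximum of Age reported in Table 1, and mapping a constant column to zeros is an assumption made here to avoid division by zero.

```python
def min_max_scale(values):
    """Rescale values to [0, 1] via (x - x_min) / (x_max - x_min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no information, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Minimum, mean, and maximum of Age (ID1) taken from Table 1.
ages = [15, 37.98, 81]
scaled_ages = min_max_scale(ages)
```

After scaling, the minimum maps to 0, the maximum to 1, and every other value falls in between, so features measured in CNY, months, or counts become comparable.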
Multicollinearity might exist when two or more explanatory variables are highly correlated
with each other, resulting in redundancy. To inspect the correlation among explanatory
variables, the heat map is shown in Figure 3. To avoid redundancy, Figure 3 displays only
the upper triangle of the correlation matrix. In terms of the absolute values of the Pearson
correlations, four correlation coefficients exceed 0.65. The strongest correlation, between
OCF1 and OCF7, is 0.92, implying that the Wanda supermarket could be one of the luxury
supermarkets. For ID6 and ID20, the correlation coefficient is 0.91, which indicates that a
majority of subscribers spent a similar amount in the current month as compared to their
average amount over the last 6 months. The high correlation (0.87) between ID17 and ID18
arises because they are the absolute and relative measures of sensitivity to phone bills. It is
also noticeable that OAS7 is correlated with OAS8 (0.85), suggesting that sample subscribers
might use video-type applications most frequently among all apps. In sum, the
multicollinearity problem is not severe in our data.
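
The screening described above can be sketched as follows: compute the Pearson correlation for every pair of features in the upper triangle and flag those whose absolute value exceeds 0.65. The function names and the toy columns are hypothetical and serve only to illustrate the procedure; they are not the study's data.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def high_correlation_pairs(columns, threshold=0.65):
    """Return (name_a, name_b, r) for each upper-triangle pair with |r| > threshold."""
    flagged = []
    for a, b in combinations(columns, 2):
        r = pearson(columns[a], columns[b])
        if abs(r) > threshold:
            flagged.append((a, b, r))
    return flagged

# Hypothetical toy columns for four subscribers.
toy = {
    "ID6": [50.0, 80.0, 120.0, 200.0],
    "ID20": [55.0, 78.0, 130.0, 190.0],
    "OCF2": [1.0, 0.0, 1.0, 0.0],
}
flagged = high_correlation_pairs(toy)
```

Iterating over `combinations` visits each unordered pair once, which corresponds to reading only the upper triangle of the correlation matrix.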

5. Result and analysis


The purpose of this study is to propose a credit scoring system by leveraging latent
information embedded in the subscriber relation network using multi-source data in the
telecom industry. To do so, we first discuss the relation network model and the distribution of
the subscriber relations in this section. Then we analyze the forecasting error and compare
different credit scoring models.

5.1 Information fusion for multi-source data


The observations of telecom subscribers are represented by 36 explanatory features, which in
turn characterize the subscriber persona profile. For constructing the latent subscriber

Table 3. Summary and definition of offline-consumption-footprint (OCF) features

| Abbr | Variable | Definition | Mean | SD | Min | Max |
|---|---|---|---|---|---|---|
| OCF1 | LuxMarket | If been to luxury market(s) | 0.05 | 0.22 | 0 | 1 |
| OCF2 | Movie | If watch movie(s) | 0.25 | 0.43 | 0 | 1 |
| OCF3 | Often_Market | If often go to market(s) | 0.33 | 0.47 | 0 | 1 |
| OCF4 | Sam | If been to Wal-Mart Sam’s Club | 0.01 | 0.12 | 0 | 1 |
| OCF5 | Spot | If been to attraction(s) | 0.43 | 0.5 | 0 | 1 |
| OCF6 | Stadium | If been to stadium | 0.33 | 0.47 | 0 | 1 |
| OCF7 | Wanda | If been to Wanda market | 0.04 | 0.20 | 0 | 1 |
| OCF8 | MarketRec3Mon | Average freq of going to market(s) in recent three months | 24.48 | 31.41 | 0 | 92 |

Figure 3. Correlation heatmap for 36 explanatory features (Pearson correlation, colour scale
from –1.0 to 1.0; both axes list ID1 to ID20, OAS1 to OAS8, and OCF1 to OCF8)

relation network, singular value decomposition (SVD) performs information fusion on the
traditional telecom inner data, online apps usage data, and offline consumption footprint
data. It is worth noting that the dependent variable, the continuous credit score, is not
included. SVD enables more accurate dimensionality reduction with an appropriate number of
latent factors and addresses the overlapping network community detection problem (Sarkar
and Dong, 2011; Ahelegbey et al., 2019). The eigenvalues $(\sigma_1, \sigma_2, \ldots, \sigma_{36})$
of SVD, namely the diagonal elements of $D = \Lambda^{1/2}$ estimated by Equation (2), are shown
in Table 4 and Figure 4. From Table 4, we observe that the first $s = 4$ eigenvalues explain
99.90% of the variance of information within the 36 features. More intuitively, Figure 4 shows
the decay of the eigenvalues, in which a pronounced “elbow” is revealed. The magnitude of the
eigenvalues plummets at first and $s = 4$ is the elbow point, which gives us the optimal number
of latent factors following Sarkar and Dong (2011). The first four common factors,
$f_i = u_i D$ with $f_i = (f_{i,1}, \ldots, f_{i,s})'$ and $s = 4 < 36$, form a lower-dimensional
vector of latent factor scores that accounts for 99.90% of the information in all 36 observed
multi-source subscriber features.
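
The fusion step can be sketched with NumPy as follows. This is an illustration under stated assumptions rather than the authors' implementation: the matrix `X` is random stand-in data with the same shape as the 926 × 36 normalized feature matrix, the share of variance is computed as each squared singular value over the sum of squares (which reproduces the pattern of Table 4), and any centring or scaling implied by Equation (2) is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for the 926 x 36 normalized feature matrix.
X = rng.random((926, 36))

# Thin SVD: X = U diag(sigma) Vt, with sigma sorted in descending order.
U, sigma, Vt = np.linalg.svd(X, full_matrices=False)

# Share of variance per factor and its running total.
explained = sigma**2 / np.sum(sigma**2)
cumulative = np.cumsum(explained)

# Keep the first s factors; row i of F is the factor-score vector f_i = u_i D.
s = 4  # elbow point reported in the text
F = U[:, :s] * sigma[:s]
```

On real data, the value of `s` would be read off the elbow in `sigma` or from the point at which `cumulative` flattens, rather than fixed in advance.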

5.2 Network inference


The inner product between any two vectors measures the similarity relation between the two
corresponding subscribers and can further identify the stochastic equivalence of different
subgroups (or community belongingness) in the relation network (Sarkar and Dong, 2011;
Hoff, 2008). Inferred by the inner product model expressed in Equation (4), we obtained a
926 × 926 adjacency matrix $R$ to represent the subscriber relation network. The value of any
two subscribers’ relation ranges from 0.0013 to 1.8096. The closer the value of two telecom
Table 4. Eigenvalues of SVD

| Factor | Eigenvalue | Variance explained | Cumulative |
|---|---|---|---|
| 1 | 10.8873 | 0.8909 | 0.8909 |
| 2 | 3.118 | 0.0731 | 0.9640 |
| 3 | 1.8274 | 0.0251 | 0.9891 |
| 4 | 1.1477 | 0.0099 | 0.9990 |
| 5 | 0.2168 | 0.0004 | 0.9993 |
| 6 | 0.1485 | 0.0002 | 0.9995 |
| 7 | 0.1394 | 0.0001 | 0.9997 |
| 8 | 0.1246 | 0.0001 | 0.9998 |
| 9 | 0.0978 | 0.0001 | 0.9998 |
| 10 | 0.0926 | 0.0001 | 0.9999 |
| 11 | 0.0716 | 0.0000 | 0.9999 |
| ... | ... | ... | ... |
| 36 | 0 | 0 | 1.00 |

Figure 4.
Eigenvalues of SVD

subscribers’ relation is to 0, the more orthogonal the two subscriber-level vectors, which
indicates the lower the probability that the two subscribers belong to the same network
subgroup.
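
A minimal sketch of the inner-product idea follows. Equation (4) is not reproduced in this section, so the plain dot product used here is an assumption about its form, and the factor-score vectors are hypothetical toy values chosen so that the first two subscribers are exactly orthogonal.

```python
def relation_matrix(factors):
    """Symmetric matrix whose (i, j) entry is the inner product of f_i and f_j."""
    n = len(factors)
    return [
        [sum(a * b for a, b in zip(factors[i], factors[j])) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical factor scores for three subscribers (two latent factors each).
factors = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
R = relation_matrix(factors)
```

Here `R[0][1]` equals 0 because the two vectors are orthogonal, whereas the third subscriber has a positive relation with both of the others. The matrix is symmetric, which is why only one triangle needs to be inspected.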
Because of the undirected characteristic of the adjacency matrix, the upper and lower
triangle elements are equal whereas the diagonal elements represent the relation of individual
subscribers themselves. As illustrated in Figure 5, we remove the upper and diagonal
elements for reducing redundancy. The horizontal axis denotes the value of any two

Figure 5.
Distribution of
subscribers’ relation
subscribers’ relation, while the vertical axis is the density. We observe that the 95th
percentile of subscribers’ relation is 0.2621, represented by the dashed vertical line. In
particular, on the left side of that dashed line, the head of the distribution lies in the
interval (0.0013, 0.2621), which dominates the density but covers only a minority of the value
range. In other words, 95% of subscribers’ relations are nearly “orthogonal”; they are not
closely connected. By contrast, the long tail (0.2621, 1.8096) on the right part of Figure 5
accounts for only 5% of the whole density. This pattern suggests that the subscribers’
relations obey a power-law distribution rather than a Gaussian distribution (Ahelegbey et al.,
2019), because the density of the subscribers’ relation varies as a power of the subscribers’
relation itself (Clauset et al., 2009).
We posit that the value of subscribers’ relation (i.e. the elements of the adjacency matrix $R$)
obeys a power-law function. Following Equations (5) and (6), our proposed power-law network
inference model first needs to estimate the constant $C$ and the slope parameter $\alpha$. We
transform Equation (5) into natural logarithm form:

$$\ln P(A_{ij} = 1 \mid r_{ij}, \alpha) = \ln C - \alpha \ln r_{ij} \tag{22}$$

We then estimate the above equation by least squares; the estimates are $\hat{C} = 0.0008$ and
$\hat{\alpha} = 1.678$ in our case. Figure 6 shows the observed data of the telecom
subscribers’ relations (the black triangles) together with the fitted blue curve, which
represents the power-law function with these estimates. Figure 6 demonstrates that our
estimated function fits the observed data very closely, which suggests that the adjacency
matrix $R$ of the subscriber relation network obeys the power-law distribution. Thus we
transform it back into the power-law function form:

$$\Pr(r = r_{ij}) = 0.0008 \cdot r_{ij}^{-1.678} \tag{23}$$
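
The log-log least-squares step can be sketched as below. The function name `fit_power_law` is hypothetical, and the input points are synthetic values generated from the reported estimates (C = 0.0008, α = 1.678) so that the fit can be checked; they are not the observed densities behind Figure 6.

```python
import math

def fit_power_law(r_values, densities):
    """Ordinary least squares on ln p = ln C - alpha * ln r; returns (C, alpha)."""
    xs = [math.log(r) for r in r_values]
    ys = [math.log(p) for p in densities]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    # The intercept estimates ln C and the slope estimates -alpha.
    return math.exp(intercept), -slope

# Synthetic points lying exactly on the reported power law.
r = [0.01, 0.05, 0.1, 0.5, 1.0, 1.8]
p = [0.0008 * x ** -1.678 for x in r]
C_hat, alpha_hat = fit_power_law(r, p)
```

Because the synthetic points lie exactly on the curve, the fit recovers the two parameters up to floating-point error; with empirical densities the residuals would indicate how well the power law describes the data.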

This is evidence that the distribution of subscribers’ relations goes far beyond the managerial
Pareto principle, in which 80% of outcomes result from 20% of all causes (Craft and Leake,
2002). Our findings show that 95% of the density results from the lowest 14% of the value
range of subscribers’ relations, computed as (0.2621 − 0.0013)/(1.8096 − 0.0013) = 0.1442. For
this sparse network model, the cut-off is set at 0.95 in light of the sparsity of the network.
Derived from the inference of our proposed latent relation network model, a power-law-based
network is revealed in Figure 7.
Two latent subgroups exist among the focal telecom subscribers. In other words, we
observed two unique stochastic equivalences in the network. Figure 7 displays the full
network in a global view, demonstrating that a large majority of subscribers (591 of 926) are
disconnected and distributed discretely in the outer ring, whereas the other subscriber
subgroup in the centre is closely connected. Stochastic equivalence is a phenomenon often
seen in a group relation network, where the vertices can be divided into different subgroups
such that members of the same subgroup have analogous patterns (Hoff, 2008). In a nutshell,
this network-aware inference identifies two unique stochastic equivalences: it divides the
subscriber population into a connected subgroup (centre part) and a discrete subgroup (outer
ring), with a sparse distance between the two subgroups.
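
One simple way to picture the split into a connected and a discrete subgroup is sketched below. It is an assumption-laden illustration, not the inference rule of Equations (5) and (6): an edge is kept when a relation value exceeds a cut-off, subscribers with at least one edge are labelled connected, and the rest are labelled discrete. The function name, the toy matrix, and the use of 0.2621 (the 95th percentile reported above) as the cut-off are all hypothetical choices.

```python
def split_by_connectivity(R, threshold):
    """Split node indices into those with at least one edge and those with none."""
    n = len(R)
    connected, discrete = [], []
    for i in range(n):
        # Ignore the diagonal, which holds each subscriber's relation to itself.
        has_edge = any(R[i][j] > threshold for j in range(n) if j != i)
        (connected if has_edge else discrete).append(i)
    return connected, discrete

# Hypothetical symmetric relation matrix for four subscribers.
R = [
    [1.00, 0.90, 0.10, 0.00],
    [0.90, 1.00, 0.20, 0.10],
    [0.10, 0.20, 1.00, 0.05],
    [0.00, 0.10, 0.05, 1.00],
]
connected, discrete = split_by_connectivity(R, threshold=0.2621)
```

In this toy case the first two subscribers form the connected subgroup and the last two are isolated, mirroring the centre and outer ring of the global view.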
In particular, Figure 8 zooms in on the connected subgroup to probe its stochastic
equivalence. Subscribers who belong to the connected subgroup exhibit a good deal of
interconnection in terms of demographics, telecom billing information, call detail records,
online apps usage, and offline consumption footprint. Note that the source of this
interconnection is the latent information between subscribers, not necessarily their actual
contact. Owing to its stochastic equivalence, the connected subscribers have an analogous
pattern. On the other hand, the subscribers in the discrete subgroup reveal another stochastic
equivalence: their interconnection is rather sparse. The average and standard deviation of
the credit scores for the 335 connected subscribers are 629.98 and 36.27, respectively. After
adding the discrete subgroup, the statistics for all telecom subscribers are 616.11 and 41.10,
respectively. This suggests that the creditworthiness of subscribers appears lower and less
stable when the two unique stochastic equivalences are not distinguished. The nature of
stochastic equivalence merely indicates that the members of any subscriber subgroup behave
similarly; they do not have to have an identical magnitude of creditworthiness. These findings
provide model-free evidence that homogeneous credit scoring models, which do not consider
stochastic equivalence, might cause a high level of systemic risk in credit scores and biased
prediction performance. We argue that such latent information embedded in the subscriber
relation network enhances the predictive capabilities of credit scoring models. Thus, we
propose a power-law-distributed network-aware credit scoring system that considers
stochastic equivalence.

Figure 7.
Full network in a
global view

Figure 8.
The connected
subgroup in a
local view
5.3 Forecasting performance and comparison
In this section, several algorithms are implemented to build credit scoring models and to
compare their forecasting performance, including MLR, RF, SVR, MLP, and DL.
Credit scoring models for each algorithm are constructed with three sample subsets
separately. The full-sample models, which serve as benchmark models, are built on all 926
subscribers. To account for the stochastic equivalence, the network-aware credit scoring
system is proposed for predicting the credit scores of telecom subscribers within the two
subgroups. The connected-subgroup models consider the 335 interconnected subscribers,
whereas the discrete-subgroup models are based on the other 591 subscribers. Note that
the observation of any individual subscriber comprises 36 explanatory features, which come
from their telecom record, online apps usage, and offline consumption footprint as seen in
Tables 1–3. Following managerial practice in prediction, we use 60% of each sample subset to
train the models and the rest as a testing dataset to validate the performances and analyze
forecasting errors.
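
The 60/40 partition of a sample subset can be sketched as follows. The text does not state how the split was drawn, so the random shuffle, the fixed seed, and the helper name `train_test_split_indices` are assumptions for illustration.

```python
import random

def train_test_split_indices(n, train_fraction=0.6, seed=0):
    """Shuffle indices 0..n-1 and split them into training and testing parts."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    cut = int(round(train_fraction * n))
    return indices[:cut], indices[cut:]

# The connected subgroup contains 335 subscribers.
train_idx, test_idx = train_test_split_indices(335)
```

Applied to the 335 connected subscribers, this yields 201 training and 134 testing observations; the same helper can be applied to the discrete subgroup and the full sample.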
Specifically, we compute the predicted credit scores $\hat{y}_i$ of individual subscribers. Mean
absolute error (MAE) and root mean squared error (RMSE) are extensively employed to
evaluate the effectiveness of credit scoring models (Chang and Yeh, 2012; Ince and Aktan,
2009; Zhan et al., 2020):

$$\mathrm{MAE} = \frac{1}{n} \sum_{i \in n} \left| y_i - \hat{y}_i \right| \tag{24}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i \in n} (y_i - \hat{y}_i)^2} \tag{25}$$

where $\hat{y}_i$ and $y_i$ represent the predicted and sample credit scores of subscriber $i$,
respectively, and $n$ denotes the total number of subscribers in the testing dataset of each
data sample. Lower values of MAE and RMSE represent better accuracy.
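
The two error measures translate directly into code. In the sketch below the function names and the three actual and predicted scores are hypothetical values used only to show the computation.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error between actual and predicted scores."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error between actual and predicted scores."""
    return math.sqrt(
        sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)
    )

# Hypothetical actual and predicted credit scores for three subscribers.
y_true = [600, 650, 700]
y_pred = [610, 640, 670]
example_mae = mae(y_true, y_pred)
example_rmse = rmse(y_true, y_pred)
```

RMSE is never smaller than MAE and penalizes the single 30-point miss more heavily, which is why the two measures are reported side by side.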
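Furthermore, in order to compare the relative prediction performance of the network-aware
models with that of the benchmark models, forecasting error reduction (FER) is exploited
(Shang et al., 2020). We let $\mathrm{MAE}_0$ and $\mathrm{RMSE}_0$ be the performance of a
benchmark model using the full sample without network-aware knowledge. Then we denote
$\mathrm{MAE}_\tau$ and $\mathrm{RMSE}_\tau$ as the performance of a model $\tau$ using the
connected-subgroup sample subset or the discrete-subgroup sample subset. Hence, the gain from
considering the latent information of stochastic equivalence can be calculated by FER:

$$\mathrm{FER}_{\mathrm{MAE}} = \frac{\mathrm{MAE}_\tau - \mathrm{MAE}_0}{\mathrm{MAE}_0} \times 100 \tag{26}$$

$$\mathrm{FER}_{\mathrm{RMSE}} = \frac{\mathrm{RMSE}_\tau - \mathrm{RMSE}_0}{\mathrm{RMSE}_0} \times 100 \tag{27}$$

Table 5 reports the magnitude of these percentages, so a larger entry denotes a larger error
reduction relative to the benchmark.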
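
As a worked check of the FER calculation, the sketch below uses the MLR entries of Table 5 (benchmark MAE 18.712 and RMSE 24.441; connected-subgroup MAE 15.170 and RMSE 19.207). The function name is hypothetical, and the sign convention (benchmark minus model, so that a reduction is positive) is an assumption chosen to match the positive percentages reported in the table.

```python
def forecasting_error_reduction(benchmark_error, model_error):
    """Percentage by which model_error falls below benchmark_error."""
    return (benchmark_error - model_error) / benchmark_error * 100

# MLR benchmark versus MLR connected-subgroup values from Table 5.
fer_mae = forecasting_error_reduction(18.712, 15.170)
fer_rmse = forecasting_error_reduction(24.441, 19.207)
```

The results agree with the 18.929 and 21.415 listed for the connected-subgroup MLR model in Table 5 when rounded to three decimals.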

The performance of the different credit scoring models on the full sample (benchmark), the
connected-subgroup sample, and the discrete-subgroup sample is summarized in Table 5.
Compared with the benchmark models without network information, the network-aware credit
scoring models (i.e. connected-subgroup-based and discrete-subgroup-based models) improve
forecasting performance in terms of FER_RMSE and FER_MAE. The MAEs and RMSEs of MLR, RF,
SVR, MLP, and DL decrease after using the latent network-aware information. Specifically, the
results demonstrate that the connected-subgroup models achieve a substantial forecasting error
reduction in the range of 7.89–25.64% over the benchmark models. The discrete-
Table 5. Credit scoring models performance on the testing dataset

| Model | Network-aware | MAE | RMSE | FER_MAE (%) | FER_RMSE (%) |
|---|---|---|---|---|---|
| MLR (benchmark) | No | 18.712 | 24.441 | / | / |
| MLR (connected) | Yes | 15.170 | 19.207 | 18.929 | 21.415 |
| MLR (discrete) | Yes | 18.001 | 23.470 | 3.799 | 3.973 |
| RF (benchmark) | No | 16.030 | 20.990 | / | / |
| RF (connected) | Yes | 14.738 | 19.334 | 8.060 | 7.889 |
| RF (discrete) | Yes | 15.389 | 20.392 | 3.999 | 2.849 |
| SVR (benchmark) | No | 16.729 | 22.865 | / | / |
| SVR (connected) | Yes | 13.894 | 18.845 | 16.947 | 17.581 |
| SVR (discrete) | Yes | 16.107 | 21.830 | 3.718 | 4.527 |
| MLP (benchmark) | No | 14.573 | 18.836 | / | / |
| MLP (connected) | Yes | 11.096 | 14.006 | 23.859 | 25.642 |
| MLP (discrete) | Yes | 14.538 | 18.655 | 0.2402 | 0.961 |
| DL (benchmark) | No | 12.783 | 16.177 | / | / |
| DL (connected) | Yes | 10.844 | 13.720 | 15.169 | 15.188 |
| DL (discrete) | Yes | 12.009 | 15.508 | 6.055 | 4.136 |

subgroup models also outperform the benchmark models with an average forecasting
accuracy improvement of 3.43%. Because our proposed models show improved prediction
performance, we find strong support that the network-aware credit scoring system can
leverage latent information embedded in the subscriber relation network.
In particular, MLR performs the worst in terms of MAE and RMSE. The best credit
scoring model is DL, with the lowest MAE and RMSE. The connected-subgroup MLP model,
however, has the highest forecasting error reductions in both FER_MAE and FER_RMSE, which
suggests that the stochastic equivalence in the connected subgroup has a comparative
advantage in improving the predictive capability of MLP. The connected-subgroup DL model
performs better than the other algorithms. We illustrate its estimated architecture as our
proposed deep neural network architecture in this study, as shown in Figure 2. The finding
indicates that a non-linear relationship exists between telecom subscribers’ credit scores and
their multi-channel behaviours.
Under stochastic equivalence, the predictive capability of the connected-subgroup models is
superior to that of the discrete-subgroup models. This finding implies that, within the
identified connected subgroup, subscribers are more homogeneous to each other when they use
apps online, consume offline, and interact with the focal telecom operator.
A managerial direction for the telecom operator is to achieve an intelligent subscriber persona
profile, leading to more accurate cross-sell precision marketing campaigns. Additionally,
though the discrete-subgroup models outperform the benchmark models, the diversity of
these discrete telecom subscribers deserves further study.

6. Discussion and conclusion


This study presents a network-aware credit scoring system for predicting individual telecom
subscribers’ credit scores. The system includes latent information embedded in the subscriber
relation network and enables telecom operators to leverage their inner database, data on
telecom subscribers’ online apps usage, and offline consumption footprint. The theoretical
contribution and managerial implications are discussed in this section. We also point out the
limitations and the direction of future research.

6.1 Theoretical contribution


In general, this study fills a research gap in the continuous credit scoring context by
conducting information fusion using multi-source data such as the traditional telecom inner
database, subscribers’ online apps usage, and offline consumption footprint. Previous
literature on telecom marketing exploited only the inner telecom data to reflect the subscriber
persona profile (Hung et al., 2006; Verbeke et al., 2012). Besides, prior studies on credit scoring
primarily shed light on categorical credit scoring, that is, classification problems (San Pedro
et al., 2015; Yu et al., 2020; Luo, 2019; Fayyaz et al., 2020; Óskarsdóttir et al., 2019). In
contrast, this study deals with the continuous credit scoring of telecom subscribers, which
denotes a fine-grained level of creditworthiness in the telecom industry. Our findings indicate
that the proposed credit scoring system enhances the prediction performance of continuous
credit scoring from a multi-source perspective.
Besides, this study contributes to the literature by considering stochastic equivalence
from network science in the field of credit scoring (Giudici et al., 2019; Ahelegbey et al., 2019).
Our proposed network-aware credit scoring model is a power-law distribution-based decision
support system, providing another function to fit the user relation network instead of simply
assuming a Gaussian function.
From the methodological point of view, this study adds to the literature by proposing a broad
range of heterogeneous decision support systems for continuous credit scoring, including
network analysis, machine learning, and deep learning. The extant literature either employed
logistic regression to analyze credit scoring (Ahelegbey et al., 2019; Yu et al., 2020) or
implemented multiple machine learning algorithms in a homogeneous credit scoring model (San
Pedro et al., 2015; Yu et al., 2020; Luo, 2019; Fayyaz et al., 2020; Óskarsdóttir et al., 2019).
In contrast, this study comprehensively examines the forecasting performance of our
proposed heterogeneous models as compared to the homogeneous models. In terms of RMSE,
MAE, and forecasting error reduction, the findings demonstrate that the network-aware
decision support system achieves a more accurate forecasting performance. In particular,
our approach with stochastic equivalence, using the connected-subgroup sample, reduces the
forecasting error by 7.89–25.64% as compared to the benchmark. DL outperforms the other
algorithms, which indicates that a non-linear relationship exists between telecom subscribers’
credit scores and their multi-channel behaviours.

6.2 Managerial implications


Our study helps telecom operators to comprehensively understand the creditworthiness of
different groups of telecom subscribers and has three managerial implications. Firstly, the
empirical results show that 95% of the distribution of subscribers’ relations results from 14%
of the value range of subscribers’ relations, beyond the managerial Pareto principle, i.e. the
80/20 principle. This power-law network-aware inference identifies two unique stochastic
equivalences: it divides the subscriber population into a connected subgroup and a discrete
subgroup. It suggests that marketing segmentation strategies can be conducted by telecom
operators to measure a detailed telecom subscriber persona profile. In particular, by employing
network inference and information fusion, telecom operators can identify the unique stochastic
equivalence in different subgroups so that they can conduct precision marketing segmentation.
Secondly, this study has proposed a network-aware decision support system to reinforce
the prediction performance of individual subscribers’ credit scoring. Telecom operators can
forecast the creditworthiness of subscribers comprehensively with the proposed decision
support system, in order to recommend new products or services, upgrades, or add-ons when
up-selling to telecom subscribers according to fine-grained subscriber persona profiling.
Last, the comprehensive credit scoring system allows telecom operators to evaluate
subscriber value, detect fraudulent subscribers, and predict the degree to which subscribers
are at risk of default. Telecom operators can also exploit this decision support system to
better detect default and improve the revenue stability of customer relationship
management.
6.3 Limitations and future direction
There are also several limitations to this study. Even though the discrete-subgroup models
perform better than the benchmark models, exploring the diversity of these discrete
subscribers is a possible direction for achieving a more accurate credit scoring prediction.
Besides, owing to the unavailability of a time-varying panel dataset, the study measures the
creditworthiness of telecom subscribers with cross-sectional data. A dynamic credit scoring
system would be more valuable for telecom operators and needs to be further investigated.
Finally, even though data on the frequency of customers’ app usage are available, engagement
effects such as the dwell time in certain types of apps can be further examined for their
predictive potential.

References
Ahelegbey, D.F., Giudici, P. and Hadji-Misheva, B. (2019), “Latent factor models for credit scoring in
P2P systems”, Physica A: Statistical Mechanics and Its Applications, Vol. 522, pp. 112-121.
Baesens, B., Van Gestel, T., Viaene, S., Stepanova, M., Suykens, J. and Vanthienen, J. (2003),
“Benchmarking state-of-the-art classification algorithms for credit scoring”, Journal of the
Operational Research Society, Vol. 54 No. 6, pp. 627-635.
Bayer, J. (2010), “Customer segmentation in the telecommunications industry”, Journal of Database
Marketing and Customer Strategy Management, Vol. 17 Nos 3-4, pp. 247-256.
Benkedjouh, T., Medjaher, K., Zerhouni, N. and Rechak, S. (2015), “Health assessment and life
prediction of cutting tools based on support vector regression”, Journal of Intelligent
Manufacturing, Vol. 26 No. 2, pp. 213-223.
Candel, A., Parmar, V., LeDell, E. and Arora, A. (2016), Deep Learning with H2O, H2O.ai, Mountain
View, pp. 1-21, available at: https://www.h2o.ai/resources/.
Chang, S.-Y. and Yeh, T.-Y. (2012), “An artificial immune classifier for credit scoring analysis”, Applied
Soft Computing, Vol. 12 No. 2, pp. 611-618.
Chester, D.L. (1990), “Why two hidden layers are better than one”, Proc. IJCNN, Washington, District
of Columbia, Vol. 1, pp. 265-268.
Chong, A.Y.-L. (2013), “Predicting m-commerce adoption determinants: a neural network approach”,
Expert Systems with Applications, Vol. 40 No. 2, pp. 523-530.
Clauset, A., Shalizi, C.R. and Newman, M.E. (2009), “Power-law distributions in empirical data”, SIAM
Review, Vol. 51 No. 4, pp. 661-703.
Craft, R.C. and Leake, C. (2002), “The Pareto principle in organizational decision making”,
Management Decision, Vol. 40 No. 8, pp. 729-733.
Dahiya, S., Handa, S.S. and Singh, N.P. (2016), “A rank aggregation algorithm for ensemble of multiple
feature selection techniques in credit risk evaluation”, International Journal of Advanced
Research in Artificial Intelligence, Vol. 5 No. 9, pp. 1-8.
Djeundje, V.B., Crook, J., Calabrese, R. and Hamid, M. (2021), “Enhancing credit scoring with
alternative data”, Expert Systems with Applications, Vol. 163, 113766, doi: 10.1016/j.eswa.2020.
113766.
Fayyaz, M.R., Rasouli, M.R. and Amiri, B. (2020), “A data-driven and network-aware approach for
credit risk prediction in supply chain finance”, Industrial Management and Data Systems,
Vol. 121 No. 4, pp. 785-808.
Gao, H., Liu, H. and Yi, M. (2021), “Inferring values of recommendation links: analysis of co-purchase
network based on ERGM and product involvement”, 2021 IEEE International Conference on
Consumer Electronics and Computer Engineering, pp. 108-113.
Gao, L. and Bai, X. (2014), “An empirical study on continuance intention of mobile social networking
services”, Asia Pacific Journal of Marketing and Logistics, Vol. 26 No. 2, pp. 168-189.
Giudici, P., Hadji-Misheva, B. and Spelta, A. (2019), “Network based scoring models to improve credit risk
management in peer to peer lending platforms”, Frontiers in Artificial Intelligence, Vol. 2, p. 3.
Goh, R. and Lee, L.S. (2019), “Credit scoring: a review on support vector machines and metaheuristic
approaches”, Advances in Operations Research, Vol. 2019, pp. 1-30.
Gomez-Andrades, A., Barco, R., Munoz, P. and Serrano, I. (2016), “Data analytics for diagnosing the
RF condition in self-organizing networks”, IEEE Transactions on Mobile Computing, Vol. 16
No. 6, pp. 1587-1600.
Han, S.H., Lu, S.X. and Leung, S.C. (2012), “Segmentation of telecom customers based on customer
value by decision tree model”, Expert Systems with Applications, Vol. 39 No. 4, pp. 3964-3973.
Hand, D.J. and Henley, W.E. (1997), “Statistical classification methods in consumer credit scoring: a
review”, Journal of the Royal Statistical Society: Series A (Statistics in Society), Vol. 160 No. 3,
pp. 523-541.
Hoff, P. (2008), “Modeling homophily and stochastic equivalence in symmetric relational data”,
Advances in Neural Information Processing Systems, Vol. 20 No. 6, pp. 657-664.
Hung, S.-Y., Yen, D.C. and Wang, H.-Y. (2006), “Applying data mining to telecom churn management”,
Expert Systems with Applications, Vol. 31 No. 3, pp. 515-524.
Imran, A., Zoha, A. and Abu-Dayya, A. (2014), “Challenges in 5G: how to empower SON with big data
for enabling 5G”, IEEE Network, Vol. 28 No. 6, pp. 27-33.
Ince, H. and Aktan, B. (2009), “A comparison of data mining techniques for credit scoring in banking:
a managerial perspective”, Journal of Business Economics and Management, Vol. 10 No. 3,
pp. 233-240.
Juhos, I., Makra, L. and Toth, B. (2009), “The behaviour of the multi-layer perceptron and the support
vector regression learning methods in the prediction of NO and NO2 concentrations in Szeged,
Hungary”, Neural Computing and Applications, Vol. 18 No. 2, pp. 193-205.
Kimura, M. (2021), “Customer segment transition through the customer loyalty program”, Asia Pacific
Journal of Marketing and Logistics, Vol. ahead-of-print, doi: 10.1108/APJML-09-2020-0630.
Le, A.N.H., Tran, M.D., Nguyen, D.P. and Cheng, J.M.S. (2019), “Heterogeneity in a dual personal
values–dual purchase consequences–green consumption commitment framework”, Asia Pacific
Journal of Marketing and Logistics, Vol. 31 No. 2, pp. 480-498.
Luo, C. (2019), “A comprehensive decision support approach for credit scoring”, Industrial
Management and Data Systems, Vol. 120 No. 2, pp. 280-290.
Olowe, A., Olorundare, J.K. and Phillips, T. (2021), “Using open APIs to drive financial inclusion via
credit scoring built on telecoms data”, International Journal on Data Science and Technology,
Vol. 7 No. 1, pp. 17-22.
Óskarsdóttir, M., Bravo, C., Sarraute, C., Vanthienen, J. and Baesens, B. (2019), “The value of big data
for credit scoring: enhancing financial inclusion using mobile phone data and social network
analytics”, Applied Soft Computing, Vol. 74, pp. 26-39.
Phau, I., Quintal, V. and Shanka, T. (2014), “Examining a consumption values theory approach of
young tourists toward destination choice intentions”, International Journal of Culture, Tourism
and Hospitality Research, Vol. 8 No. 2, pp. 125-139.
Ram, J. and Wu, M.-L. (2016), “A fresh look at the role of switching cost in influencing customer
loyalty”, Asia Pacific Journal of Marketing and Logistics, Vol. 28 No. 4, pp. 616-633.
San Pedro, J., Proserpio, D. and Oliver, N. (2015), “MobiScore: towards universal credit scoring from
mobile phone data”, International Conference on User Modeling, Adaptation, and
Personalization, pp. 195-207.
Sarkar, S. and Dong, A. (2011), “Community detection in graphs using singular value decomposition”,
Physical Review E, Vol. 83 No. 4, 046114.
Segal, M.R. (2004), Machine Learning Benchmarks and Random Forest Regression, Center for
Bioinformatics and Molecular Biostatistics, UC, San Francisco, CA, available at:
https://escholarship.org/uc/item/35x3v9t4.
Shang, G., McKie, E.C., Ferguson, M.E. and Galbreth, M.R. (2020), “Using transactions data to improve
consumer returns forecasting”, Journal of Operations Management, Vol. 66 No. 3, pp. 326-348.
Teles, G., Rodrigues, J.J., Kozlov, S.A., Rabêlo, R.A. and Albuquerque, V.H.C. (2020), “Decision support
system on credit operation using linear and logistic regression”, Expert Systems, Vol. 38 No. 6,
e12578.
Verbeke, W., Dejaeger, K., Martens, D., Hur, J. and Baesens, B. (2012), “New insights into churn
prediction in the telecommunication sector: a profit driven data mining approach”, European
Journal of Operational Research, Vol. 218 No. 1, pp. 211-229.
Wallin, S. and Landen, L. (2008), “Telecom alarm prioritization using neural networks”, 22nd
International Conference on Advanced Information Networking and Applications-Workshops
(AINA Workshops 2008), pp. 1468-1473.
Xing, D. and Girolami, M. (2007), “Employing latent Dirichlet allocation for fraud detection in
telecommunications”, Pattern Recognition Letters, Vol. 28 No. 13, pp. 1727-1734.
Yu, X., Yang, Q., Wang, R., Fang, R. and Deng, M. (2020), “Data cleaning for personal credit scoring by
utilizing social media data: an empirical study”, IEEE Intelligent Systems, Vol. 35 No. 2, pp. 7-15.
Zhan, M., Gao, H., Liu, H., Peng, Y., Lu, D. and Zhu, H. (2020), “Identifying market structure to monitor
product competition using a consumer-behavior-based intelligence model”, Asia Pacific Journal
of Marketing and Logistics, Vol. 33 No. 1, pp. 99-123.
Zhang, Z. and Dai, Y. (2020), “Combination classification method for customer relationship
management”, Asia Pacific Journal of Marketing and Logistics, Vol. 32 No. 5, pp. 1004-1022.

About the authors


Hongming Gao is a PhD candidate at the School of Management, Guangdong University of Technology,
China, majoring in Management Science and Engineering. His research interests include consumer
behavior, data mining, and electronic commerce. He was a visiting scholar at Tilburg University,
Netherlands and the University of Hong Kong. His works have been published in such journals as Asia
Pacific Journal of Marketing and Logistics, Industrial Management and Data Systems, Journal of
Business Economics and Management, Data Analysis and Knowledge Discovery, and Chinese Journal of
Management Science, among others. His works have also appeared at conferences such as the Americas
Conference on Information Systems (AMCIS), the Pacific Asia Conference on Information Systems (PACIS),
and the International Conference on Electronic Commerce (ICEC).
Hongwei Liu is Professor of Information Management at the Department of Management Science
and Engineering, Guangdong University of Technology, China. His research interests include business
intelligence, systems design, and privacy issues of data mining. His publications have appeared in such
journals as Information and Management, Information Systems, Impulsive Dynamical Systems and
Applications, and Wireless Personal Communications, among others. He also holds several projects
funded by the National Natural Science Foundation of China (NSFC).
Haiying Ma is a lecturer at the School of Internet Finance and Information Engineering, Guangdong
University of Finance, where she teaches courses including Python and e-commerce. Her research interests
include consumer behavior, data mining, and electronic commerce. She received a PhD degree from
Guangdong University of Technology and was a visiting scholar at the University of Hong Kong. Her
main works have been published in such journals as Frontiers in Neuroscience, Neuroscience Letters,
and Neuroreport. Haiying Ma is the corresponding author and can be contacted at: [email protected]
Cunjun Ye is a PhD candidate at the School of Management, Guangdong University of Technology,
China, majoring in Management Science and Engineering. His research interests include human resource
management and business intelligence.
Mingjun Zhan is a doctor at the School of Management, Guangdong University of Technology, China,
who majored in Management Science and Engineering. His research interests include consumer behavior,
data mining, and business intelligence.

