
IJSRD - International Journal for Scientific Research & Development | Vol. 2, Issue 08, 2014 | ISSN (online): 2321-0613

Data Mining in Telecommunication Industry


Hiren H Darji¹
¹Assistant Professor

Abstract: Telecommunication companies today operate in a highly competitive and challenging environment. Vast volumes of data are generated by various operational systems and are used to solve many business problems that require urgent handling. These data include call detail data, customer data and network data. Data Mining methods and business intelligence technology are widely used to handle the business problems in this industry. The goal of this paper is to provide a broad review of data mining concepts.
Key words: Data mining, Telecommunications, Fraud Detection, Neural Networks, Churn management

I. INTRODUCTION
The concept of Data Mining has gained broad market acceptance. Telecommunications is one of the most data-intensive industries in the world and was one of the first to adopt Data Mining. Companies in the telecom industry use Data Mining technologies to improve their marketing techniques, to identify customer fraud and to manage their networks better. [1]
Most telecom companies have realized that the vast volume of data they collect and possess could be used effectively to solve their business problems by converting it into information and knowledge. Data Mining can be viewed as a technique for automatically generating this knowledge from the available data. The telecommunications industry was one of the first to benefit from the application of Business Intelligence (BI) and Data Mining technologies. [1]

II. DATA MINING: AN OVERVIEW

A. Definition
“Data mining” is defined as a sophisticated data search capability that uses statistical algorithms to discover patterns and correlations in data. Data mining finds and extracts knowledge (“data pieces”) hidden in corporate data warehouses, or in information that visitors have left on a website, much of which can lead to improvements in the understanding and use of the data. Data mining discovers patterns and relationships hidden in data, and is actually part of a larger process called “knowledge discovery”, which describes the steps that must be taken to ensure meaningful results. Data mining helps business analysts to generate hypotheses, but it does not validate them. [7]

B. The evolution of data mining
Data mining techniques are the result of a long research and product development process. The origin of data mining lies with the first storage of data on computers and continues with improvements in data access, until today technology allows users to navigate through data in real time. In the evolution from business data to useful information, each step is built on the previous ones. [7]

C. Data mining techniques
Data mining tools take data and construct a representation of reality in the form of a model. The resulting model describes patterns and relationships present in the data. From a process orientation, data mining activities fall into three general categories: [7]
1) Discovery:
The process of looking in a database to find hidden patterns without a fixed idea or hypothesis about what the patterns may be.
2) Predictive Modeling:
The process of taking patterns discovered from the database and using them to predict the future.
3) Forensic Analysis:
The process of applying the extracted patterns to find strange or unusual data elements.
Data mining is used to construct six types of models aimed at solving business problems: classification, regression, time series, clustering, association analysis, and sequence discovery. The first two, classification and regression, are used to make predictions, while association and sequence discovery are used to describe behavior. Clustering can be used for either forecasting or description.

III. TYPES OF TELECOMMUNICATION DATA
The different kinds of data used in this industry are mainly grouped into 3 different types.

A. Call detail data
This is the information about a call, which is stored as a call detail record. The number of call detail records generated is huge, since the details of every call placed on the network are stored. A call detail record includes information such as the originating and terminating phone numbers and the date, time and duration of the call. Usually these call detail records are not used directly for Data Mining. Instead, a list of features can be generated from the call detail data, such as: [1]
• Average call duration
• Average number of calls originated per day
• Average number of calls received per day
• Percentage of no-answer calls

All rights reserved by www.ijsrd.com 7


Data Mining in Telecommunication Industry
(IJSRD/Vol. 2/Issue 08/2014/002)



• Percentage of daytime calls
• Percentage of weekday calls

B. Network data
Telecommunication networks contain thousands of interconnected components. These components are capable of generating error and status messages, which leads to a large volume of network data. These network data are used for network management functions such as fault detection. Expert systems have been developed to analyze these messages automatically, since the huge volume of network messages generated cannot be handled by technicians. Hence Data Mining technologies are used to identify network faults by automatically extracting knowledge from network data. Network data is also generated in real time, so it is typically summarized before mining, which can be accomplished by applying a time window to the data. [1]

C. Customer data
Like other large businesses, telecommunication companies have millions of customers. Hence it is essential to have a database for storing information about these customers. Information about the customer will include:
• Name of the customer
• Address details
• Payment history
• Service plan and so on
Customer data is used to supplement call detail data in order to identify fraud. [1]

IV. CHURN PREDICTION - PROBLEM DESCRIPTION
In a business environment, the term customer attrition simply refers to customers leaving one business service for another. Customer churn, or subscriber churn, is similar to attrition: it is the process of customers switching from one service provider to another anonymously. From a machine learning perspective, churn prediction is a supervised problem defined as follows: given a predefined forecast horizon, the goal is to predict the future churners over that horizon, given the data associated with each subscriber in the network. The churn prediction problem represented here involves 3 phases, namely, i) training phase, ii) test phase, iii) prediction phase. The input for this problem includes the data on past calls for each mobile subscriber, together with all personal and business information that is maintained by the service provider. In addition, for the training phase, labels are provided in the form of a list of churners. After the model has been trained to the highest accuracy, it must be able to predict the list of churners from the real dataset, which does not include any churn label. From the viewpoint of the knowledge discovery process, this problem is categorized as predictive mining or predictive modeling.
Churn prediction is used to identify possible churners in advance, before they leave the network. This helps the CRM department to keep subscribers who are likely to churn in future, by applying the required retention policies to attract the likely churners and retain them. The potential loss to the company can thereby be avoided. This study utilizes data mining techniques to identify the churners. [3]

V. METHODOLOGY
KDD (Knowledge Discovery in Databases) is defined as the “nontrivial process of identifying valid, novel, potentially useful and ultimately understandable patterns in data”. The first step in predictive modeling is the acquisition and preparation of data. Having the correct data is as important as having the correct method. [3]

A. Data Acquisition
It is difficult for researchers to acquire actual datasets from telecom companies, because the customers’ private details may be misused. Since churn prediction models require the past history or usage behavior of customers during a specific period of time to predict their behavior in the near future, they cannot be applied directly to the actual dataset. Therefore, it is usual practice to perform some kind of aggregation on the dataset. During aggregation, in addition to the actual variables, new variables are generated which capture the periodic consumption behavior of the customers. These variables carry vital information that the prediction models use to forecast customer behavior in advance. The dataset used here was aggregated over a 6-month period.

B. Data Preparation
In data mining problems, data preparation consumes a considerable amount of time. In the data preparation phase, data is collected, integrated and cleaned. Integration of data may require extracting data from multiple sources. Once the data has been arranged in tabular form, it needs to be fully characterized. Data needs to be cleaned by resolving any ambiguities and errors. Redundant and problematic data items are also removed at this stage.

C. Derived Variables
Derived variables are new variables based on original variables. The most effective derived variables are those that represent something in the real world, such as a description of some underlying customer behavior. There are some general classes of derived variables, like total values, average values, and ratios. Some examples are:
• The average number of calls in the last 6 months
• The average number of late payments in the last 6 months
• The ratio of incoming to outgoing calls
• Average payment amount for the last six months
• Average late count in the last 6 months

D. Variable Extraction
The selected variables are grouped under 4 categories and are described below.
1) Customer Demography:
• Age: It is found that customers in the age group of 45 – 48 have a high probability of churning.
• Line_Tenure: Customers with a tenure period of 25 – 30 months are about to churn.
• Customer_Class: Generally the churn probability of corporate account holders is high. This is due to the fact that their account will be maintained by the company and customers who


quit the company would churn. The Customer Class can be any one of VIP/Individual/Corporate.
• Days_to_Contract_Expiry: Most customers subscribe to a new service with the intention of acquiring a new HAND_SET. These people would leave the network after the contract expires.
2) Bill and Payment:
• Average_Bill_Amount
• Avg_Pay_Amount
• Overdue_Payment_Count
3) Call Detail Record:
• Avg_Min_OB: If the average outbound call time is less than 168 minutes, they will churn.
• Tot_Past_Delink: If the count of total past delinks is greater than 3, they will churn.
• Tot_Dis_Int: Customers who make a larger number of distinct international calls will churn. If the count is greater than 6 they may churn.

VI. CONCLUSION
Data Mining plays a significant role in the telecommunication industry due to the availability of large volumes of data and the rigorous competition in the sector. The primary application areas include marketing and Customer Relationship Management, fraud detection and network management. Recent developments in Data Mining, and the implementation and enhancement of existing techniques and methods, ensure the continuous growth and compatibility of telecommunication companies that make use of them.

REFERENCES
[1] Madhuri V. Joseph, “Data Mining and Business Intelligence Applications in Telecommunication Industry”.
[2] D. Ćamilović, “DATA MINING AND CRM IN TELECOMMUNICATIONS”.
[3] V. Umayaparvathi, K. Iyakutti, “Applications of Data Mining Techniques in Telecom Churn Prediction”.
[4] Khalida binti Oseman, Sunarti binti Mohd Shukor, Norazrina Abu Haris, Faizin bin Abu Bakar, “Data Mining in Churn Analysis Model for Telecommunication Industry”.
[5] N. Kamalraj, Dr. A. Malathi, “Applying Data Mining Techniques in Telecom Churn Prediction”.
[6] Wiktor Daszczuk, Piotr Gawrysiak, Tomasz Gerszberg, “Data Mining for Technical Operation of Telecommunications Companies: a Case Study”.
[7] Anita B. Desai, Dr. Ravindra Deshmukh, “Data mining techniques for Fraud Detection”.
[8] D. Pareek, Business Intelligence for Telecommunications. Auerbach Publications, Taylor & Francis Group LLC (2007).
[9] Yu-Teng Chang, “Applying Data Mining To Telecom Churn Management”, IJRIC, 2009, 67 – 77.
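
As an illustration of the derived variables of Section V.C and the churn indicators of Section V.D, the following Python sketch aggregates six monthly records for one subscriber and then checks the thresholds quoted in the text (age 45 – 48, tenure 25 – 30 months, Avg_Min_OB below 168, Tot_Past_Delink above 3, Tot_Dis_Int above 6). The record layout (MonthlyUsage), the function names and the sample figures are assumptions made for this example only; the paper does not specify a schema, and treating each threshold as an independent yes/no flag is a simplification, not the model used in the cited studies.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class MonthlyUsage:
    """One month of aggregated usage for a subscriber (hypothetical schema)."""
    calls_in: int
    calls_out: int
    minutes_out: float
    paid_amount: float
    paid_late: bool


def derive_variables(months):
    """Build derived variables (averages, a count and a ratio) from monthly records."""
    total_in = sum(m.calls_in for m in months)
    total_out = sum(m.calls_out for m in months)
    return {
        "avg_calls": mean(m.calls_in + m.calls_out for m in months),
        "avg_min_ob": mean(m.minutes_out for m in months),
        "avg_pay_amount": mean(m.paid_amount for m in months),
        "late_count": sum(m.paid_late for m in months),
        # Guard against division by zero for subscribers with no outgoing calls.
        "in_out_ratio": total_in / total_out if total_out else None,
    }


def churn_indicators(age, line_tenure, avg_min_ob, tot_past_delink, tot_dis_int):
    """Return the names of the Section V.D indicators whose threshold is met."""
    rules = {
        "Age": 45 <= age <= 48,
        "Line_Tenure": 25 <= line_tenure <= 30,
        "Avg_Min_OB": avg_min_ob < 168,
        "Tot_Past_Delink": tot_past_delink > 3,
        "Tot_Dis_Int": tot_dis_int > 6,
    }
    return [name for name, fired in rules.items() if fired]


# Six months of made-up usage for one subscriber (illustrative values only).
history = [MonthlyUsage(40, 60, 150.0, 500.0, False) for _ in range(5)]
history.append(MonthlyUsage(20, 30, 90.0, 0.0, True))

derived = derive_variables(history)
flags = churn_indicators(
    age=46,
    line_tenure=12,
    avg_min_ob=derived["avg_min_ob"],
    tot_past_delink=4,
    tot_dis_int=2,
)
print(derived)
print(flags)
```

In practice such flags would more likely serve as input features to a trained classifier than as a decision rule on their own, in line with the training, test and prediction phases described in Section IV.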
