Regression Analytics Using RapidMiner
Objective:-
1) SubscriberDailyUsage
2) SubscriberDailyRecharge
Pre-requisites
An additional table, ‘RapidUsage’, was created in the REAM database to store the daily summary of
total usage/recharge amounts. The same table can also be created at runtime from RapidMiner as part
of the configured regression analytics steps.
RapidUsage schema
Field        Data Type  Null  Key  Default  Extra
DateOfUsage  int(11)    NO    PRI  0
TotalUsage   float      YES        0
This generates the output of the step 1 query function and writes it to the RapidUsage table
(sample below).
DateOfUsage TotalUsage
1 1027.94
2 1368.02
3 1652.46
4 1962.25
5 1630.46
6 1328.01
7 998.27
8 1298.46
10 1670.49
3) DatabaseExampleSource(2):- This function reads the intermediate RapidUsage table for
regression analysis, as configured below. The ‘TotalUsage’ field of the table is configured as the
label attribute.
4) PolynomialRegression:- Applies a polynomial regression model to the data read from the
RapidUsage table, with the maximal degree of the final polynomial configured as 2.
5) ExecuteAnalytic models:- Executes the configured models to generate the polynomial regression
formula, with the result shown below.
6) Follow the same steps to calculate the polynomial regression formula for the
SubscriberDailyRecharge table, using the sum of ‘RechargeTalkTime’ per day in step 1, as below:
[[select substring(DateOfRecharge,9,2) as DateOfUsage, sum(RechargeTalkTime) as
TotalUsage from SubscriberDailyRecharge group by 1 order by 1]]
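The aggregation used in step 6 can be tried in isolation before wiring it into RapidMiner. The
sketch below is only an illustration under several assumptions: an in-memory SQLite database stands
in for the REAM database, DateOfRecharge is assumed to be stored as text in YYYY-MM-DD form, the
input rows are invented, and `substr` is used in place of `substring` for compatibility with older
SQLite versions. The helper name `daily_totals` is hypothetical.

```python
import sqlite3

# Hypothetical rows (DateOfRecharge, RechargeTalkTime); not real subscriber data.
ROWS = [
    ("2012-01-01", 50.0),
    ("2012-01-01", 25.5),
    ("2012-01-02", 100.0),
    ("2012-01-03", 10.0),
    ("2012-01-03", 30.0),
]

# Same shape as the step 6 query: day of month as key, summed talk time as value.
QUERY = """
    SELECT substr(DateOfRecharge, 9, 2) AS DateOfUsage,
           SUM(RechargeTalkTime) AS TotalUsage
    FROM SubscriberDailyRecharge
    GROUP BY 1
    ORDER BY 1
"""


def daily_totals(rows):
    """Aggregate recharge rows per day and return the resulting RapidUsage rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE SubscriberDailyRecharge ("
            "DateOfRecharge TEXT, RechargeTalkTime REAL)"
        )
        conn.executemany(
            "INSERT INTO SubscriberDailyRecharge VALUES (?, ?)", rows
        )
        # Mirror the RapidUsage layout: integer day as primary key, float total.
        conn.execute(
            "CREATE TABLE RapidUsage ("
            "DateOfUsage INTEGER PRIMARY KEY, TotalUsage REAL DEFAULT 0)"
        )
        for day, total in conn.execute(QUERY).fetchall():
            # The query yields the day as text ('01'); store it as an integer.
            conn.execute(
                "INSERT INTO RapidUsage VALUES (?, ?)", (int(day), total)
            )
        return conn.execute(
            "SELECT DateOfUsage, TotalUsage FROM RapidUsage ORDER BY DateOfUsage"
        ).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for day, total in daily_totals(ROWS):
        print(day, total)
```

Note that the query keys on the day of month only, so if the source table spans more than one
month, rows from the same day number in different months would be summed together.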
Conclusion
The regression model was not accurate in the tested examples. Further testing is required to
identify the reason.
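As one possible starting point for that further testing, the degree-2 fit can be reproduced outside
RapidMiner and compared with the formula it reports. The sketch below is an illustration under
stated assumptions: it uses an ordinary least-squares fit via the normal equations, which is not
necessarily the procedure followed by RapidMiner's PolynomialRegression operator, and the helper
names (`solve_linear`, `fit_polynomial`, `predict`, `r_squared`) are hypothetical. The data points
are the sample RapidUsage rows listed earlier.

```python
# Sample rows from the RapidUsage table above: (DateOfUsage, TotalUsage).
SAMPLE = [
    (1, 1027.94), (2, 1368.02), (3, 1652.46), (4, 1962.25), (5, 1630.46),
    (6, 1328.01), (7, 998.27), (8, 1298.46), (10, 1670.49),
]


def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations apply to both sides at once.
    m = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_polynomial(points, degree=2):
    """Return least-squares coefficients [c0, c1, ..., c_degree]."""
    size = degree + 1
    # Normal equations (X^T X) c = X^T y, where X[i][j] = x_i ** j.
    xtx = [[sum(x ** (r + c) for x, _ in points) for c in range(size)]
           for r in range(size)]
    xty = [sum(y * x ** r for x, y in points) for r in range(size)]
    return solve_linear(xtx, xty)


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def r_squared(points, coeffs):
    """Coefficient of determination of the fit on the given points."""
    mean_y = sum(y for _, y in points) / len(points)
    ss_tot = sum((y - mean_y) ** 2 for _, y in points)
    ss_res = sum((y - predict(coeffs, x)) ** 2 for x, y in points)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    coeffs = fit_polynomial(SAMPLE, degree=2)
    print("coefficients (c0, c1, c2):", coeffs)
    print("R^2 on the sample:", r_squared(SAMPLE, coeffs))
```

A low R^2 from this independent fit might suggest that the data itself is poorly described by a
degree-2 polynomial, whereas a large gap between this fit and the formula from RapidMiner might
point toward the operator's configuration instead.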