Regression Analytics Using Rapid Miner

This document outlines steps to perform regression analysis using Rapidminer to predict daily subscriber usage and recharge amounts from REAM database tables. It describes creating an intermediate RapidUsage table, using DatabaseExampleSource and DatabaseExampleSetWriter operators to read from and write to tables, applying polynomial or linear regression models, and concluding that further testing is needed to improve accuracy.

Uploaded by

Joe Emmanuel
Copyright
© Attribution Non-Commercial (BY-NC)

Regression Analytics Using RapidMiner

Objectives:-

1) Predict daily subscriber usage amount
2) Predict daily subscriber recharge amount

REAM Database Tables considered

1) SubscriberDailyUsage
2) SubscriberDailyRecharge

Pre-requisites

An additional table, ‘RapidUsage’, was created in the REAM database to store the daily summary of
total usage/recharge amounts. The same table can also be created at runtime from RapidMiner as part
of the configured regression analytics steps.

RapidUsage schema
Field        Data Type  Null  Key  Default  Extra
DateOfUsage  Int(11)    NO    PRI  0
TotalUsage   float      YES        0
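
As a rough illustration of the pre-requisite above, the sketch below creates a table with the RapidUsage schema and fills it with one summed row per day. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for the REAM database, so the column types are SQLite approximations of the schema. The SubscriberId column, the 'YYYY-MM-DD' date format, and the sample rows are assumptions made only for this example.

```python
import sqlite3

# In-memory SQLite stands in for the REAM database (an assumption for this sketch).
conn = sqlite3.connect(":memory:")

# Hypothetical source table: SubscriberId, the 'YYYY-MM-DD' date format and the
# rows below are illustrative assumptions, not data from the REAM database.
conn.execute(
    "CREATE TABLE SubscriberDailyUsage ("
    "SubscriberId INTEGER, DateOfUsage TEXT, TotalUsage REAL)"
)
conn.executemany(
    "INSERT INTO SubscriberDailyUsage VALUES (?, ?, ?)",
    [
        (1, "2011-03-01", 10.5),
        (2, "2011-03-01", 4.5),
        (1, "2011-03-02", 7.0),
        (2, "2011-03-02", 3.0),
    ],
)

# Summary table following the RapidUsage schema (SQLite approximation).
conn.execute(
    "CREATE TABLE RapidUsage ("
    "DateOfUsage INTEGER NOT NULL DEFAULT 0 PRIMARY KEY, "
    "TotalUsage REAL DEFAULT 0)"
)

# One row per day of the month, holding the summed usage for that day.
conn.execute(
    "INSERT INTO RapidUsage (DateOfUsage, TotalUsage) "
    "SELECT CAST(substr(DateOfUsage, 9, 2) AS INTEGER), SUM(TotalUsage) "
    "FROM SubscriberDailyUsage "
    "GROUP BY substr(DateOfUsage, 9, 2)"
)

rows = conn.execute(
    "SELECT DateOfUsage, TotalUsage FROM RapidUsage ORDER BY DateOfUsage"
).fetchall()
print(rows)
```

Casting the day to an integer matches the Int(11) key in the schema, so days sort numerically rather than as text.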

Regression Analytics steps in RapidMiner for Usage/Recharge prediction

1) DatabaseExampleSource:- This operator reads data from the SubscriberDailyUsage table
using the query [[select substring(DateOfUsage,9,2) as DateOfUsage, sum(TotalUsage) as
TotalUsage from SubscriberDailyUsage group by 1 order by 1]]
The query returns the total usage amount for each day (the sum over all rows sharing that day
value), and this output is passed to the next operator.
2) DatabaseExampleSetWriter:- This operator formats the step 1 output and writes it to the table
configured as ‘RapidUsage’ in the REAM database.

The RapidUsage table then holds the result of the step 1 query (sample below).

DateOfUsage TotalUsage
1 1027.94
2 1368.02
3 1652.46
4 1962.25
5 1630.46
6 1328.01
7 998.27
8 1298.46
10 1670.49
3) DatabaseExampleSource(2):- This operator reads the intermediate RapidUsage table for
regression analysis. The ‘TotalUsage’ field of the table is configured as the label attribute.

4) PolynomialRegression:- Applies a polynomial regression model to the data read from the RapidUsage
table, with the maximal degree of the final polynomial configured as 2.
5) ExecuteAnalytic models:- Executes the configured models to generate the polynomial regression
formula as the result.

6) The same steps should be followed to calculate the polynomial regression formula for the
SubscriberDailyRecharge table, with step 1 computing the daily sum of ‘RechargeTalkTime’
as below: [[select substring(DateOfRecharge,9,2) as DateOfUsage, sum(RechargeTalkTime) as
TotalUsage from SubscriberDailyRecharge group by 1 order by 1]]

7) Enable LinearRegression in place of PolynomialRegression in the above configuration to
calculate a linear regression for patterns showing linear growth or decline.
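
The regression steps above can be approximated outside RapidMiner with an ordinary least-squares fit. The sketch below is pure Python, uses the sample RapidUsage rows listed under step 2, and fits both a degree-2 polynomial and a straight line. The helper names fit_polynomial and predict are hypothetical, and the fitting method is an assumption: RapidMiner's PolynomialRegression and LinearRegression operators may use different algorithms, so their coefficients may differ from these.

```python
# Sample rows from the RapidUsage table (day, total usage); day 9 is absent in the sample.
days = [1, 2, 3, 4, 5, 6, 7, 8, 10]
usage = [1027.94, 1368.02, 1652.46, 1962.25, 1630.46, 1328.01, 998.27, 1298.46, 1670.49]


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns [c0, c1, ...] for c0 + c1*x + c2*x^2 + ..."""
    n = degree + 1
    # Normal equations (X^T X) c = X^T y, built from sums of powers of x.
    a = [[float(sum(x ** (i + j) for x in xs)) for j in range(n)] for i in range(n)]
    b = [float(sum(y * x ** i for x, y in zip(xs, ys))) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


quadratic = fit_polynomial(days, usage, 2)  # counterpart of step 4 (maximal degree 2)
linear = fit_polynomial(days, usage, 1)     # counterpart of step 7
print("quadratic coefficients:", quadratic)
print("linear coefficients:", linear)
print("quadratic prediction for day 9:", predict(quadratic, 9))
```

Swapping the degree argument between 2 and 1 mirrors switching between the PolynomialRegression and LinearRegression operators in the process.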

Conclusion

The regression models were not accurate in the tested examples. Further testing is required to
identify the reason.
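
One way to make "accuracy" measurable during further testing is to compare predictions against observed values with standard error metrics. The sketch below is an illustration only, not a method the original analysis used: the functions rmse and r_squared are hypothetical helpers, and the baseline that always predicts the mean is included as a reference point that a useful regression model should beat.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 is a perfect fit, 0 matches predicting the mean."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Sample TotalUsage values from the RapidUsage sample listed under step 2.
usage = [1027.94, 1368.02, 1652.46, 1962.25, 1630.46, 1328.01, 998.27, 1298.46, 1670.49]

# Baseline: always predict the mean of the observed values.
mean_usage = sum(usage) / len(usage)
baseline = [mean_usage] * len(usage)
print("baseline RMSE:", rmse(usage, baseline))
print("baseline R^2:", r_squared(usage, baseline))
```

Passing a model's predictions in place of the baseline list gives figures that can be compared directly across the polynomial and linear variants.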
