Journal of Data Mining
ABSTRACT
Data mining is the process of collecting, extracting and storing valuable
information, and many enterprises now practise it actively. Predictive analytics is a
branch of advanced analytics that is mainly used to make predictions about unknown
future events. It uses techniques from machine learning, statistics, data mining,
modeling, and artificial intelligence to analyze current data and make predictions
about the future. The two main objectives of predictive analytics are regression and
classification. It comprises various analytical and statistical techniques for
developing models that predict future occurrences, probabilities or events.
Predictive analytics deals with both continuous and discontinuous changes. It
provides a predictive score for each individual (healthcare patient, product SKU,
customer, component, machine, or other organizational unit) in order to determine or
influence organizational processes that span large numbers of individuals, as in
fraud detection, manufacturing, credit risk assessment, marketing, and government
operations including law enforcement.
KEYWORDS
Predictive analytics, Credit history, forecasting, Regression techniques
1. INTRODUCTION:
A large amount of data held in information databases goes to waste until useful
information is extracted from it. Predictive analytics is the part of advanced
analytics that predicts future events. It encompasses data collection, modelling,
statistics and deployment. Figure 1 gives the basis of predictive analytics. The
success of data mining relies on the availability of high-quality data and on
effective sharing.
[Figure 1. Basis of predictive analytics. Recoverable labels: Data Collection,
Statistics, Development, Understand the data, Monitor, Evaluate, Model.]
PREDICTIVE ANALYTICS:
REGRESSION TECHNIQUES
FORECASTING
Forecasting focuses on predicting the future from past and present data [3, 14].
The two terms central to forecasting are risk and uncertainty.
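As an illustration of forecasting from past data, the following minimal Python
sketch fits a straight-line trend by ordinary least squares and extrapolates one
step ahead. The series values, the function names `fit_trend` and `forecast`, and
the use of the residual spread as a rough indicator of uncertainty are assumptions
made for this example; the survey does not prescribe a specific method.

```python
import math


def fit_trend(values):
    """Fit y = a + b*t by ordinary least squares, with t = 0, 1, ..., n-1."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    # Residual standard deviation: a crude measure of how far the data
    # scatter around the fitted trend.
    sse = sum((y - (intercept + slope * t)) ** 2 for t, y in enumerate(values))
    resid_std = math.sqrt(sse / (n - 2))
    return intercept, slope, resid_std


def forecast(values, steps_ahead=1):
    """Extrapolate the fitted trend; return (point forecast, residual spread)."""
    intercept, slope, resid_std = fit_trend(values)
    t_future = len(values) - 1 + steps_ahead
    return intercept + slope * t_future, resid_std


# Hypothetical monthly demand figures (illustrative only).
history = [10.0, 12.0, 14.0, 16.0, 18.0]
point, spread = forecast(history)
print(point, spread)
```

A larger residual spread signals that the trend explains the history less well, so
the point forecast should be treated with correspondingly more caution.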
3. LITERATURE SURVEY
Branko G. Celler et al. [2] examine the telemonitoring of vital signs from the home
for the management of patients with chronic conditions. New measurement modalities
and signal processing techniques are proposed for increasing the quality and value
of vital signs monitoring. An automated risk stratification algorithm is used. The
parameters are QRS height and width (ECG), PR (Pressure Rate), RR (Respiratory
Rate), QT (Abnormal heart rate) and body temperature. The jBoss tool is used. Data
are taken from the Department of Human Services Medicare databases and from
telemonitoring system data. The work identifies the key technical performance
characteristics of at-home telemonitoring systems.
Hao-Tsung Yang et al. [3] present a neural network model to assess the value of
playing-style information in predicting match quality. A mix of Sternberg’s thinking
style theory and individual histories is used to categorize League of Legends (LoL)
players. The data are collected from the LoLBase website. The algorithms used are a
feed-forward/feedback propagation algorithm and a matchmaking algorithm. Win rate,
match duration and group number (ELO) are the parameters for performance
measurement. Python is used. The study finds that the presence of global-liberal
(G-L) style players is positively correlated with match enjoyment.
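The architecture of the network in the reviewed study is not described here, so the
following is only a generic sketch of a feed-forward pass followed by
backpropagation of the error, which is what "feedback propagation" is assumed to
mean. It uses a single hidden layer in plain Python. The class name `TinyNet`, the
layer sizes, the learning rate, and the four training samples are all hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyNet:
    """One hidden layer, sigmoid activations, squared-error loss."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        # Feed-forward pass: input -> hidden layer -> single output.
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, out

    def train_step(self, x, target, lr=0.5):
        # Backpropagation: push the output error back through both layers.
        hidden, out = self.forward(x)
        delta_out = (out - target) * out * (1.0 - out)
        delta_hidden = [delta_out * w * h * (1.0 - h)
                        for w, h in zip(self.w2, hidden)]
        for j, h in enumerate(hidden):
            self.w2[j] -= lr * delta_out * h
        self.b2 -= lr * delta_out
        for j, row in enumerate(self.w1):
            for i, xi in enumerate(x):
                row[i] -= lr * delta_hidden[j] * xi
            self.b1[j] -= lr * delta_hidden[j]
        return 0.5 * (out - target) ** 2


# Hypothetical samples: (win rate, normalised match duration) -> enjoyed (1) or not (0).
samples = [([0.9, 0.2], 1.0), ([0.8, 0.3], 1.0),
           ([0.2, 0.9], 0.0), ([0.1, 0.8], 0.0)]

net = TinyNet(n_in=2, n_hidden=3)
first_loss = sum(net.train_step(x, y) for x, y in samples)
for _ in range(2000):
    last_loss = sum(net.train_step(x, y) for x, y in samples)
print(first_loss, last_loss)
```

Comparing the loss of the first and last epochs is a quick check that the weight
updates are moving in the right direction.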
International Journal of Chaos, Control, Modelling and Simulation (IJCCMS) Vol.5,
No.1/2/3, September 2016
Quanzeng You, Liangliang Cao et al. [5] aim to build a reliable forecasting system
for elections, modelled to capture the interrelationship between image-centric
social multimedia and real-world entities. The Competitive Vector Auto Regression
(CVAR) algorithm is used. Through its competition mechanism, CVAR compares the
popularity of multiple competing candidates. Data are taken from Flickr. The number
of images, the number of users, images uploaded per day (IPD) and users uploading
images per day (UPD) are the parameters for measurement. The OpenCV tool is used. As
a result, CVAR is able to take prior knowledge into account, which leads to better
performance in terms of prediction accuracy.
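CVAR is specific to the reviewed paper, and its competition mechanism is not
reproduced here. As background only, the sketch below shows a forecast from an
ordinary first-order vector autoregression, x_{t+1} = c + A x_t, in which each
series depends on the previous values of all series. The function name
`var1_forecast`, the coefficient matrix, the intercepts, and the two series of
daily image counts are hypothetical.

```python
def var1_forecast(coeffs, intercept, last_obs, steps=1):
    """Iterate x_{t+1} = intercept + coeffs @ x_t for the given number of steps."""
    state = list(last_obs)
    path = []
    for _ in range(steps):
        state = [c + sum(a * x for a, x in zip(row, state))
                 for row, c in zip(coeffs, intercept)]
        path.append(state)
    return path


# Hypothetical coefficients for two candidates' daily image counts.
A = [[0.5, 0.25],
     [0.25, 0.5]]
c = [1.0, 2.0]
latest = [8.0, 4.0]

print(var1_forecast(A, c, latest, steps=2))
```

In practice the coefficients would be estimated from historical data rather than
fixed by hand.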
Abish Malik, Ross Maciejewski et al. [7] present a visual analytics approach that
provides decision makers with a proactive and predictive environment, helping them
make effective resource allocation and deployment decisions. Analysts are provided
with a suite of natural scale templates and methods that enable them to focus and
drill down to appropriate geospatial and temporal resolution levels. A prediction
algorithm is used. Accuracy, average, count and time are the parameters for
performance measurement. The methodology is applied to Criminal, Traffic and Civil
(CTC) incident datasets. The templates support analysis at multiple spatiotemporal
granularity levels.
Nanlin Jin, Peter Flach et al. [10] apply data mining methods to explore remarkable
consumption patterns and their associated descriptive models in smart electricity
meter data. Target concept, target type, double regression, coverage and strategy
type are the parameters for performance measurement. Subgroup discovery algorithms
and S-Transform algorithms are used. The data were collected by the Energy Demand
Research Project (EDRP). The Cortana tool is used. The approach outperforms more
conventional data mining methods in predictive power and classification accuracy,
while consuming similar computational resources.
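Subgroup discovery searches for subsets of the data whose target distribution
differs markedly from that of the whole population. A common quality measure for a
candidate subgroup with a binary target is weighted relative accuracy (WRAcc): the
subgroup's coverage multiplied by the difference between its target rate and the
overall target rate. The sketch below computes WRAcc in Python. The household
records, the field names and the candidate condition are hypothetical and are not
taken from the EDRP data, and the reviewed paper may rely on other quality measures.

```python
def wracc(records, condition, target):
    """Weighted relative accuracy of the subgroup selected by `condition`.

    WRAcc = coverage * (target rate in subgroup - target rate overall).
    """
    total = len(records)
    subgroup = [r for r in records if condition(r)]
    if total == 0 or not subgroup:
        return 0.0
    overall_rate = sum(1 for r in records if target(r)) / total
    subgroup_rate = sum(1 for r in subgroup if target(r)) / len(subgroup)
    return (len(subgroup) / total) * (subgroup_rate - overall_rate)


# Hypothetical smart-meter households (illustrative values only).
households = [
    {"occupants": 4, "electric_heating": True, "high_evening_use": True},
    {"occupants": 5, "electric_heating": True, "high_evening_use": True},
    {"occupants": 1, "electric_heating": False, "high_evening_use": False},
    {"occupants": 2, "electric_heating": False, "high_evening_use": False},
    {"occupants": 3, "electric_heating": True, "high_evening_use": False},
    {"occupants": 2, "electric_heating": False, "high_evening_use": True},
    {"occupants": 1, "electric_heating": False, "high_evening_use": False},
    {"occupants": 4, "electric_heating": False, "high_evening_use": True},
]

score = wracc(
    households,
    condition=lambda r: r["electric_heating"],
    target=lambda r: r["high_evening_use"],
)
print(score)
```

A positive score means the target is over-represented in the subgroup; a search
procedure would evaluate many candidate conditions and keep the highest-scoring ones.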
Title: Thinking Style and Team Competition Game Performance and Enjoyment
Algorithm: Feed forward/feedback propagation algorithm and matchmaking algorithm
Data source: LoLBase website
Parameters: Win rate, match duration, group number (ELO)
Result: Finds that the presence of global-liberal (G-L) style players is positively
correlated with match enjoyment
5. CONCLUSION
Predictive analytics is the future of data mining. This study focuses on predictive
analytics, regression techniques and forecasting in the knowledge discovery domain.
Business intelligence is used in predictive analytics for modelling and forecasting.
Predictive analytics makes the choice of marketing methods more efficient and is
helpful in social media analytics.
REFERENCES
[2] Branko G. Celler and Ross S. Sparks, “Home Telemonitoring of Vital Signs:
Technical Challenges and Future Directions”, IEEE Journal of Biomedical and
Health Informatics, Vol. 19, No. 1, January 2015.
[3] Hao Wang, Hao-Tsung Yang, and Chuen-Tsai Sun, “Thinking Style and Team
Competition Game Performance and Enjoyment”, IEEE Transactions on
Computational Intelligence and AI in Games, Vol. 7, No. 3, September 2015.
[4] Jacob R. Scanlon and Matthew S. Gerber, “Forecasting Violent Extremist Cyber
Recruitment”, IEEE Transactions on Information Forensics and Security, Vol.
10, No. 11, November 2015.
[5] Quanzeng You, Liangliang Cao, Yang Cong, Xianchao Zhang, and Jiebo Luo, “A
Multifaceted Approach to Social Multimedia-Based Prediction of Elections”,
IEEE Transactions on Multimedia, Vol. 17, No. 12, December 2015.
[6] Sean M. Arietta, Alexei A. Efros, Ravi Ramamoorthi, and Maneesh Agrawala, “City
Forensics: Using Visual Elements to Predict Non-Visual City Attributes”, IEEE
Transactions on Visualization and Computer Graphics, Vol. 20, No. 12,
December 2014.
[7] Abish Malik, Ross Maciejewski, Sherry Towers, Sean McCullough, and David
S. Ebert, “Proactive Spatiotemporal Resource Allocation and Predictive Visual
Analytics for Community Policing and Law Enforcement”, IEEE Transactions
on Visualization and Computer Graphics, Vol. 20, No. 12, December 2014.
[8] Ronaldo C. Prati, Gustavo E.A.P.A. Batista, and Maria Carolina Monard, “A
Survey on Graphical Methods for Classification Predictive Performance
Evaluation”, IEEE Transactions on Knowledge and Data Engineering, Vol. 23,
No. 11, November 2011.
[9] Gang Fang, Gaurav Pandey, Wen Wang, Manish Gupta, Michael Steinbach, and Vipin
Kumar, “Mining
[10] Nanlin Jin, Peter Flach, Tom Wilcox, Royston Sellman, Joshua Thumim, and
Arno Knobbe, “Subgroup Discovery in Smart Electricity Meter Data”, IEEE
Transactions on Industrial Informatics, Vol. 10, No. 2, May 2014.
[11] Hao Wang, Hao-Tsung Yang, and Chuen-Tsai Sun, “Thinking Style and Team
Competition Game Performance and Enjoyment”, IEEE Transactions on
Computational Intelligence and AI in Games, Vol. 7, No. 3, September 2015.
[12] Quanzeng You, Liangliang Cao, Yang Cong, Xianchao Zhang, and Jiebo Luo, “A
Multifaceted Approach to Social Multimedia-Based Prediction of Elections”,
IEEE Transactions on Multimedia, Vol. 17, No. 12, December 2015.
[13] Yun Wang and Sudha Ram, “Predicting Location-Based Sequential Purchasing
Events by Using Spatial, Temporal, and Social Patterns”, IEEE Intelligent
Systems, May/June 2015.
[14] Jesse Rio Russell, “Predictive Analytics and Child Protection: Constraints and
Opportunities”, Child Abuse & Neglect 46 (2015) 182–189, Elsevier.
[15] Karel Dejaeger, Wouter Verbeke, David Martens, and Bart Baesens, “Data
Mining Techniques for Software Effort Estimation: A Comparative Study”,
IEEE Transactions on Software Engineering, Vol. 38, No. 2, March/April 2012.
[17] Josep Ll. Berral, Nicolas Poggi, David Carrera, Aaron Call, Rob Reinauer, and
Daron Green, “ALOJA: A Framework for Benchmarking and Predictive
Analytics in Big Data Deployments”, IEEE Transactions on Emerging Topics in
Computing, November 2015.
[18] Minghui Zhou and Audris Mockus, “Who Will Stay in the FLOSS Community?
Modeling Participant’s Initial Behavior”, IEEE Transactions on Software
Engineering, Vol. 41, No. 1, January 2015.
[20] Francisco C. Pereira, Filipe Rodrigues, Evgheni Polisciuc, and Moshe Ben-Akiva,
“Why so many people? Explaining Nonhabitual Transport Overcrowding With Internet
Data”, IEEE Transactions on Intelligent Transportation Systems, Vol. 16, No. 3,
June 2015.