IBM Watson Analytics: Automating Visualization, Descriptive, and Predictive Statistics
Original Paper
Robert Eugene Hoyt1*, MD, FACP; Dallas Snider2*, PhD; Carla Thompson3*, EdD; Sarita Mantravadi1*, MS, MPH,
CPH, CHES, PhD
1 College of Health, Department of Health Sciences and Administration, University of West Florida, Pensacola, FL, United States
2 College of Science and Engineering, Department of Computer Science, University of West Florida, Pensacola, FL, United States
3 College of Education and Professional Studies, Community Outreach Research and Learning (CORAL) Center, University of West Florida, Pensacola, FL, United States
* all authors contributed equally
Corresponding Author:
Robert Eugene Hoyt, MD, FACP
College of Health
Department of Health Sciences and Administration
University of West Florida
11000 University Parkway
Pensacola, FL, 32514
United States
Phone: 1 8503845235
Fax: 1 850 474 2173
Abstract
Background: We live in an era of explosive data generation that will continue to grow and involve all industries. One of the
results of this explosion is the need for newer and more efficient data analytics procedures. Traditionally, data analytics required
a substantial background in statistics and computer science. In 2015, International Business Machines Corporation (IBM) released
the IBM Watson Analytics (IBMWA) software that delivered advanced statistical procedures based on the Statistical Package
for the Social Sciences (SPSS). The latest entry of Watson Analytics into the field of analytical software products provides users
with enhanced functions that are not available in many existing programs. For example, Watson Analytics automatically analyzes
datasets, examines data quality, and determines the optimal statistical approach. Users can request exploratory, predictive, and
visual analytics. Using natural language processing (NLP), users are able to submit additional questions for analyses in a quick
response format. This analytical package is available free to academic institutions (faculty and students) that plan to use the tools
for noncommercial purposes.
Objective: To report the features of IBMWA and discuss how this software subjectively and objectively compares to other data
mining programs.
Methods: The salient features of the IBMWA program were examined and compared with other common analytical platforms,
using validated health datasets.
Results: Using a validated dataset, IBMWA delivered similar predictions compared with several commercial and open source
data mining software applications. The visual analytics generated by IBMWA were similar to results from programs such as
Microsoft Excel and Tableau Software. In addition, assistance with data preprocessing and data exploration was an inherent
component of the IBMWA application. Sensitivity and specificity were not included in the IBMWA predictive analytics results,
nor were odds ratios, confidence intervals, or a confusion matrix.
Conclusions: IBMWA is a new alternative for data analytics software that automates descriptive, predictive, and visual analytics.
This program is very user-friendly but requires data preprocessing, statistical conceptual understanding, and domain expertise.
KEYWORDS
data analysis; data mining; machine learning; statistical data analysis; natural language processing
Figure 4. Predict option for nominal category of less than or greater than 30% obese by county.
Predict Option
The predict option in IBMWA is used for predictive analytics. In our analysis of the 2014 County Health Rankings for the state of Florida, 74 associations were noted at the top of the page. The attribute “children in poverty” was associated with “teen birth rate.” Selecting “statistical details” showed a Pearson correlation of .79 with P<.001 and an effect size of 0.63.
When % obese was selected as the target, predictions were automatically generated. The top predictor for “% obesity” was “% physically inactive” at 69%, but IBMWA recommended the addition of “% African-American,” which increased the predictive ability to 85%. A screenshot of the predict function results is shown in Figure 3.
The “% obesity” column of attributes was then subdivided into counties with less than or more than 30% obesity reported, and the predict function was reexecuted. This second analysis used logistic regression and produced household income as the strongest predictor at 88% predictive strength. A chi-square analysis comparing the categorical variables demonstrated the following: P<.001, effect size = .46 (Figure 4).
Assemble Option
The assemble option contains functionality to create dashboards, infographics, and slide shows. An example of an IBMWA interactive dashboard display using the 2014 County Health Rankings dataset for the state of Florida, reflecting “% obesity” by Florida county, is depicted in Figure 5. The county dropdown list is active, such that once a county is selected, the percentages and the map change correspondingly.
Comparative Study
The results from the comparative study among the software packages are presented in the following subsections. The same heart disease dataset was used as the input to each software package; each package provides differing statistics and measures, which are summarized in Table 7.
IBM Watson Analytics
The IBMWA software conducted a logistic regression for classification purposes. When the target attribute of heart disease (present or absent) was used in IBMWA, it revealed that the thallium test had a predictive strength of 76% (percent correct classification). The thallium test attribute had 3 subcategories:
3 = normal, 6 = fixed defect, and 7 = reversible defect. Based on either normal exam or reversible defect on thallium testing, the chi-square test revealed P<.001 and an effect size of 0.48 for the normal test and 0.63 for the reversible defect.
When a full model (3 variables) analysis is conducted with logistic regression, the software also conducts a likelihood ratio test (chi-square) to determine if the addition of the variables improves the fit of the model. Predictive strength increases to 80% (percent correct classification) and the statistical significance of the target predictor, thallium, is reduced. Interactions between thallium and the number of vessels calcified on fluoroscopy were not significant, P<.09. The likelihood ratio test (χ²₁=11.08, P<.08) was not statistically significant at the 5% significance level, and thus the reduced model of thallium alone and heart disease is the best fit.
IBM Statistical Package for the Social Sciences
Binary logistic regression with heart disease as the dependent variable and thallium as a single predictor was conducted. As confirmed in the IBMWA results, predictive strength and percent correctly classified increase as more variables are included in the regression; however, statistical significance is reduced.
Logistic regression (LR) with 3 predictors (thallium, number of vessels calcified on fluoroscopy, and the interaction effect) was conducted, illustrating that the predictive strength of the model was 78%, and the interaction effect was not significant. The number of vessels calcified by fluoroscopy and the thallium test variables were statistically significant with P=.04 and P<.001, respectively. The LR test compared with the intercept only model was significant, with χ²₃=120.5 and P<.001, indicating that the 3 variable model improved model fit over the intercept only model.
Thereafter, forward selection using the LR test was also conducted for appropriate variable selection, reducing collinearity and demonstrating model fit. By the end of the stepwise forward regression concerning all variables, the LR test indicated that thallium remained a statistically significant predictor, as well as gender, type of chest pain, electrocardiogram results, exercise-related angina, ST wave depression, and number of vessels calcified by fluoroscopy. Percent correctly classified increased to 90%. The variables gender (χ²₁=3.9, P=.049), exercise-related angina (χ²₁=5.7, P=.02), and electrocardiogram results were statistically significant at the 5% level, whereas type of chest pain (χ²₁=13.3, P<.001), ST wave depression (χ²₁=11.7, P=.001), number of vessels calcified by fluoroscopy (χ²₁=19.9, P<.001), and thallium (χ²₁=15.1, P<.001) were statistically significant at the 1% level. Comparing this full model with the intercept only model, it was found that χ²₁=77.8 and P<.001. These results illustrate that the full model had a better model fit than the intercept only model. However, IBMWA uses the LR test to compare models with the reduced fit single predictor model as the default setting, whereas SPSS uses the baseline, intercept only model as the default setting for LR test comparison.
A chi-square analysis was also performed using SPSS with a resulting likelihood ratio of 78% for comparison purposes. Based on the normal exam or reversible defect on thallium testing, the chi-square test revealed a significant relationship (χ²₁=76.1, P<.001) and an effect size of 0.53 for the normal test and 0.67 for the reversible defect test.
Microsoft Excel Analysis ToolPak
The ToolPak software can only conduct linear regression, not logistic regression for classification. Analysis was not performed because a chi-square test would have to be manually run between the target attribute and each column. The expected values would need to be calculated and run against the actual values to arrive at the chi-square result and P value. This is very labor intensive compared with the other platforms tested.
Microsoft SQL Server Analysis Services
Data were analyzed using a decision tree and a neural network to compare classification accuracy. To train the classifier models, 70% of the data was used, whereas the remaining 30% was held out for testing. The decision tree algorithm was chosen because of the ease of understanding the results, while a neural network was selected because of the ability to generally produce better classification results. The decision tree yielded a sensitivity of 0.80 and specificity of 0.78, while neural networks yielded a sensitivity of 0.77 and a specificity of 0.92. Both algorithms have parameters that can be adjusted to improve classification accuracy; however, these parameters need to be adjusted cautiously to avoid “overfitting” the model.
Waikato Environment for Knowledge Analysis
A J48 decision tree was used as the algorithm with 10-fold cross validation. The outcome was correctly classified 78% of the time. The precision for the presence of heart disease was 0.931 and recall (sensitivity) was 0.628. Precision for the absence of heart disease was 0.692 and the recall was 0.947.
Summary
These preliminary informal analyses indicate that the 4 analytical programs provide similar results using the same dataset. WEKA does provide a confusion matrix, Kappa statistic, and receiver operating characteristic curve area statistic, none of which is supplied by IBMWA. WEKA, in contrast to IBMWA, includes more than 50 different algorithms, without any recommendations regarding the optimal choice.
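The manual procedure described for the Excel Analysis ToolPak (calculating expected values and comparing them with the actual values) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the calculation performed by any of the packages compared here: the function names and the 2x2 counts are hypothetical, and the P value shortcut is valid only for 1 degree of freedom.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2x2 table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


def p_value_df1(statistic):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical counts (NOT taken from the heart disease dataset):
# rows = test normal / test abnormal, columns = disease absent / present.
observed = [[40, 10], [20, 30]]
stat = chi_square_2x2(observed)
print(f"chi-square = {stat:.2f}, P = {p_value_df1(stat):.5f}")
```

The same tail function can be used to check a reported 1-degree-of-freedom statistic: `p_value_df1(3.9)` returns a value just under .05, in line with the borderline significance reported for gender.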
SQL Server Analysis Services: Decision tree analysis yielded a sensitivity of 0.80 and specificity of 0.78, while neural networks yielded a sensitivity of 0.77 and a specificity of 0.92
Waikato Environment for Knowledge Analysis: Decision tree precision for presence of heart disease was 0.93 and recall (sensitivity) was 0.63; precision for absence of heart disease was 0.69 and recall was 0.95
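The sensitivity, specificity, precision, and recall values reported for SQL Server Analysis Services and WEKA are all derived from a confusion matrix. The sketch below shows how those measures relate to the four cell counts. The function name and the counts are hypothetical and chosen only for illustration; they are not the counts from this study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Summary measures for a binary classifier from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # recall for the positive class
        "specificity": tn / (tn + fp),  # recall for the negative class
        "precision": tp / (tp + fp),    # positive predictive value
        "accuracy": (tp + tn) / (tp + fp + fn + tn),  # percent correctly classified
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=80, fp=10, fn=20, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Recall for the "absence" class is the same quantity as specificity for the "presence" class, which is how the per-class recall values from WEKA can be read alongside the sensitivity and specificity reported by SQL Server Analysis Services.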
Conflicts of Interest
None declared.
Abbreviations
ANOVA: Analysis of variance
IBM: International Business Machines Corporation
IBMWA: IBM Watson Analytics
LR: Logistic regression
NLP: Natural language processing
SPSS: Statistical Package for the Social Sciences
WEKA: Waikato Environment for Knowledge Analysis
Edited by G Eysenbach; submitted 29.03.16; peer-reviewed by S Bhattacharya, CH Li, C McGregor, A Ramachandran; comments to
author 25.07.16; revised version received 31.08.16; accepted 17.09.16; published 11.10.16
Please cite as:
Hoyt RE, Snider D, Thompson C, Mantravadi S
IBM Watson Analytics: Automating Visualization, Descriptive, and Predictive Statistics
JMIR Public Health Surveill 2016;2(2):e157
URL: http://publichealth.jmir.org/2016/2/e157/
doi: 10.2196/publichealth.5810
PMID: 27729304
©Robert Eugene Hoyt, Dallas Snider, Carla Thompson, Sarita Mantravadi. Originally published in JMIR Public Health and
Surveillance (http://publichealth.jmir.org), 11.10.2016. This is an open-access article distributed under the terms of the Creative
Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and
reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly
cited. The complete bibliographic information, a link to the original publication on http://publichealth.jmir.org, as well as this
copyright and license information must be included.