SPSS Modeler Tutorial 1

This document provides instructions for using SPSS Modeler software to analyze patient drug treatment data and identify relationships between variables. It describes opening an existing drug project in SPSS Modeler, loading patient data containing attributes like age, sex, and drug administered. Various SPSS Modeler nodes are used to visualize the data distribution, plot relationships between variables, derive new fields, and identify a threshold ratio between sodium and potassium levels that distinguishes drug effectiveness. Band selection is applied to separate patients above and below the identified threshold.


The Drug Project

Data Warehousing and Data Mining
March 2014
SPSS Modeler (formerly Clementine) is the SPSS enterprise-strength data mining workbench. It helps organizations to improve
customer and citizen relationships through an in-depth understanding of data. Organizations use the insight gained from SPSS
Modeler to retain profitable customers, identify cross-selling opportunities, attract new customers, detect fraud, reduce risk, and
improve government service delivery. The current version is SPSS Modeler 15.

1 The Drug Project Exercise
Briefing: Imagine that you are a medical researcher compiling data for a study. You have collected data about a set of patients, all of
whom suffered from the same illness. During their course of treatment, each patient responded to one of five medications. Part of your
job is to use data mining to find out which drug might be appropriate for a future patient with the same illness.

1.1 Launch the SPSS Modeler:
Open the SPSS Modeler by going to the Start menu > All Programs > IBM SPSS Modeler 15.0 > IBM SPSS Modeler 15.0. Select
Open an existing project and double-click on More files. In the Open dialog window, go to the path
N:\DWDM\SPSSModeler\Demos and double-click on the drug.cpj file to open it. The SPSS Modeler should open and display as
shown in Figure 1.

Figure 1: The Drug Project (labelled areas: Control Panel, Main Panel, Module Panel, Current Working Space, Project Space)

1.2 Displaying the Properties of the Data
To open a data source, the SPSS Modeler provides many options listed in the Sources tab from the Module Panel.
Here, we will use the Var. File node.
1. Select the Sources tab from the Module Panel
2. Double-click the Var.File node and it will appear in the Main Panel. You can also add a node by single
left-clicking the node in the Module Panel, then single left-clicking the place in the Main Panel where you
want to put it.
3. Double-click the Var.File node in the Main Panel to open its property window (Figure 2), and click the
button next to the File field. In the Open dialog window, select and open the DRUG1n file, which contains
records of drug information. The Var.File node should now have the properties shown in Figure 2. The DRUG1n
file contains records with 7 attributes: Age, Sex, BP, Cholesterol, Na, K, and Drug.
4. Click OK to close the Var.File property window.
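
For readers who want to see what reading such a file amounts to outside the Modeler, here is a minimal Python sketch. It assumes the data is comma-delimited text with a header row, which the tutorial does not state. The sample rows and drug labels are hypothetical and are not taken from DRUG1n; only the seven field names come from the text above.

```python
import csv
import io

# Hypothetical sample rows for illustration only; these values are NOT taken
# from the DRUG1n file. Only the seven field names come from the tutorial.
SAMPLE = """Age,Sex,BP,Cholesterol,Na,K,Drug
23,F,HIGH,HIGH,0.80,0.03,drugY
47,M,LOW,HIGH,0.74,0.06,drugC
61,F,NORMAL,NORMAL,0.56,0.07,drugX
"""

def read_records(text_stream):
    """Read delimited text into a list of dicts, converting numeric fields."""
    records = []
    for row in csv.DictReader(text_stream):
        row["Age"] = int(row["Age"])
        row["Na"] = float(row["Na"])
        row["K"] = float(row["K"])
        records.append(row)
    return records

records = read_records(io.StringIO(SAMPLE))
print(list(records[0].keys()))
print(len(records), "records read")
```

To try this on a real file, pass an open file object instead of the in-memory sample.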

Figure 2: Var.File Property
To display the properties of the data, we use a Distribution node.
1. Select the Distribution node listed in the Graphs tab from the Module Panel, and add it to the Main Panel.
2. Establish a link between the DRUG1n node and the Distribution node by right-clicking on the DRUG1n
node, selecting the Connect option, then left-clicking on the Distribution node (Figure 3).

Figure 3: Link between two nodes
3. Double-click the Distribution node to open its property window.
4. Select Drug for the Field option (Figure 4) to display the distribution of drugs. Click Run.

Figure 4: Distribution Node Property
5. You should see a distribution window for the Drug attribute in the DRUG1n file (Figure 5). This window
illustrates the count of different drugs and their percentages.

Figure 5: Distribution of Drugs
6. Click OK to close the window.
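
The count-and-percentage summary that the Distribution node displays can be sketched in a few lines of Python. This is an illustrative equivalent, not how the Modeler computes it, and the list of drug labels below is hypothetical rather than the real DRUG1n contents.

```python
from collections import Counter

# Hypothetical Drug values for illustration; not the real DRUG1n contents.
drugs = ["drugY", "drugX", "drugY", "drugA", "drugY", "drugX", "drugC", "drugY"]

def distribution(values):
    """Return (value, count, percentage) tuples, most frequent first."""
    counts = Counter(values)
    total = sum(counts.values())
    return [(v, n, 100.0 * n / total) for v, n in counts.most_common()]

for value, count, pct in distribution(drugs):
    print(f"{value:<6} {count:>3} {pct:6.2f}%")
```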
1.3 Finding a Relationship in Numeric Data
To investigate a relationship between sodium (Na) and potassium (K) levels, the most natural way would be to produce a
point plot. To do this, we create a Plot node and connect it to the Var.File node.

1. Select the Plot node listed in the Graphs tab from the Module Panel, and add it to the Main Panel.
2. Establish a link between the DRUG1n node and the Plot node by right-clicking on the DRUG1n
node, selecting the Connect option, then left-clicking on the Plot node (Figure 6).

Figure 6: Link between DRUG1n and Plot
3. Double-click the Plot node to open its property window.
4. Select K (Potassium) for the X Field option and select Na (Sodium) for the Y Field option (Figure 7).

Figure 7: Plot Node Property
5. Click Run. The plot window of the K attribute and Na attribute will be displayed (Figure 8). This appears to be a
random scattering, with no obvious relationship between the Na and K attributes. However, this graph
takes no account of which drug was used in each case. Therefore, we need to modify the properties of the Plot
node in order to display the relationship between Na and K with respect to the different drugs.


Figure 8: Plot of K v. Na
6. Double-click the Plot node to open its property window.
7. Select Drug for the Color option in the Overlay group (Figure 9).

Figure 9: Plot Node Property
8. Click Run. The plot window of the K attribute and Na attribute with respect to different drugs will be displayed
(Figure 10). A clear pattern emerges in the overlaid plot. The threshold lies in neither the Na field nor the
K field alone, but in the ratio between them.
9. Click OK to close the window.


Figure 10: Plot of K v. Na

1.4 Finding the Threshold
We can find the threshold by calculating the ratio and examining its distribution. To do so, we need to create a Derive
node and connect it to the Var.File node.

1. Select the Derive node listed in the Field Ops tab from the Module Panel, and add it to the Main Panel.
2. Establish a link between the DRUG1n node and the Derive node by right-clicking on the DRUG1n node,
selecting the Connect option, then left-clicking on the Derive node (Figure 11).

Figure 11: Link between Var.File and Derive Nodes
3. Double-click the Derive node to open its property window.
4. Type Na_to_K in the Derive field, and the formula Na/K in the Formula area (Figure 12). This will
create a new field named Na_to_K containing the values calculated as Na/K.

5. Click OK to close this property window. The Derive node will be renamed to Na_to_K.

Figure 12: Derive Node Property
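
As a rough illustration of what the Derive step computes, the sketch below adds a Na_to_K field to each record as Na divided by K. The records are hypothetical values chosen for the example, not rows from DRUG1n, and the function name is an assumption made for illustration.

```python
# Hypothetical records for illustration; values are not from DRUG1n.
records = [
    {"Na": 0.80, "K": 0.04, "Drug": "drugY"},
    {"Na": 0.60, "K": 0.06, "Drug": "drugX"},
    {"Na": 0.75, "K": 0.05, "Drug": "drugY"},
]

def derive_na_to_k(rows):
    """Return copies of the rows with a new Na_to_K field equal to Na / K."""
    derived = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows stay unchanged
        new_row["Na_to_K"] = row["Na"] / row["K"]
        derived.append(new_row)
    return derived

for row in derive_na_to_k(records):
    print(row["Drug"], round(row["Na_to_K"], 2))
```

Returning copies mirrors the way a Derive node passes a new field downstream while leaving the source data untouched.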

Next, we need to create a Histogram node to display the output from the Derive node.
1. Select the Histogram node listed in the Graphs tab from the Module Panel, and add it to the Main Panel.
2. Establish a link between the Na_to_K node and the Histogram node by right-clicking on the Na_to_K node,
selecting the Connect option, then left-clicking on the Histogram node (Figure 13).

Figure 13: Link between Na_to_K and Histogram Nodes
3. Double-click the Histogram node to open its property window.
4. Select Na_to_K for the Field option, and Drug for the Color option in the Overlay group (Figure 14).

Figure 14: Histogram Node Property
5. Click Run. The histogram window will be displayed as in Figure 15.

Figure 15: Histogram of Na_to_K
The histogram shows the distribution of the ratio of Na to K. The threshold is clear because, at the critical value, the
bars change from multi-coloured to pure yellow.
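
The idea behind reading the threshold off an overlaid histogram can be sketched as follows: bin the ratio values, count drugs per bin, and look for the point where bins stop being mixed and contain a single drug. The (ratio, drug) pairs and the bin width of 5 are hypothetical choices for illustration, not values from DRUG1n or from the Modeler's own binning.

```python
from collections import Counter, defaultdict
import math

# Hypothetical (Na_to_K, Drug) pairs for illustration; not from DRUG1n.
points = [(7.2, "drugX"), (9.8, "drugA"), (12.1, "drugX"), (13.5, "drugC"),
          (16.4, "drugY"), (18.9, "drugY"), (24.3, "drugY"), (31.0, "drugY")]

def binned_counts(pairs, bin_width):
    """Count drugs per ratio bin; keys are the lower edge of each bin."""
    bins = defaultdict(Counter)
    for ratio, drug in pairs:
        lower_edge = math.floor(ratio / bin_width) * bin_width
        bins[lower_edge][drug] += 1
    return dict(sorted(bins.items()))

for edge, counts in binned_counts(points, bin_width=5).items():
    marker = "single drug" if len(counts) == 1 else "mixed"
    print(f"[{edge:>2}, {edge + 5:>2})  {dict(counts)}  {marker}")
```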
We can now add a band selection line to this histogram to separate the records before and after the threshold.
1. Tick the Interactions option from the View menu (Figure 16).
2. Left-click the Activates band selection option (Figure 17).

3. Place the red line as close as possible to the point at which the bars of the histogram change colour (the
threshold point) (Figure 18).
4. Right-click to the right of the threshold line, and select the Generate Derive Node for Band option (Figure 19).
5. A new Derive node will then be added to the Main Panel. Open its property window, and observe the
selection condition. Rename this node as band.
6. Connect the Na_to_K Derive node to this band node, then add a new Histogram node and connect the band node
to it (Figure 20).
7. Double-click the Histogram node to open its property window.
8. Select Na_to_K for the Field option, and Band for the Color option in the Overlay group.
9. Run the new histogram node and observe the result (Figure 21).
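
Conceptually, the generated band node labels each record according to which side of the threshold line its Na_to_K value falls on. The sketch below assumes a simple true/false Band field and uses a hypothetical threshold of 15.0; the tutorial reads the real threshold off the histogram and does not state its value, and the records here are invented for illustration.

```python
from collections import Counter

# Hypothetical threshold and records for illustration only. The tutorial
# reads the real threshold off the histogram; it is not stated in the text.
THRESHOLD = 15.0

records = [
    {"Na_to_K": 9.8, "Drug": "drugA"},
    {"Na_to_K": 13.5, "Drug": "drugC"},
    {"Na_to_K": 16.4, "Drug": "drugY"},
    {"Na_to_K": 24.3, "Drug": "drugY"},
]

def add_band(rows, threshold):
    """Return copies of rows with a boolean Band field: True above threshold."""
    return [dict(row, Band=row["Na_to_K"] > threshold) for row in rows]

banded = add_band(records, THRESHOLD)
by_band = Counter((row["Band"], row["Drug"]) for row in banded)
for (band, drug), count in sorted(by_band.items()):
    print(band, drug, count)
```

Counting drugs within each band is a quick way to check whether the chosen line really separates the records as the histogram suggests.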

Figure 16: Histogram Interactions

Figure 17: Activates band selections

Figure 18: Threshold Line

Figure 19: Generate Derive Node

Figure 20: The new band node

Figure 21: The new band

End of Tutorial 1
