SSD for R
An R Package for Analyzing Single-Subject Data
Second Edition
Introduction
This introduction provides background information on single-subject
research and its use in the behavioral and health sciences. Here you will
find a brief history of the use of this type of research design to provide a
backdrop for its current use and a discussion of future directions in the use
of these designs. The purpose of this is to provide contextual information
for the introduction of SSD for R, a visual and statistical software package
that is easily accessible and useful in the analysis of single-subject data.
SSD for R is a package written in R, a free and open-source statistical
programming language (The R Project for Statistical Computing, n.d.). In
this Introduction, we also go over the contents of this book, including
updates in the second edition.
Conclusion
Single-subject research has had, and continues to have, an important position
in both research and practice evaluation. With the current trend
toward evidence-based practice, we believe that the demand for and
reliance on this type of research is only going to increase in the future. SSD
for R is a software package available to help researchers and practitioners
analyze single-subject research data.
This text is designed to help readers learn to use both the visual and
statistical functions in SSD for R. Examples, screenshots, and instructions
for when and how to use each of these functions help to make this book a
useful tool in the interpretation of single-subject data analysis in general
and in the use of SSD for R in particular.
1
Getting Your Data Into SSD for R
Introduction
In this chapter you will learn how to measure target behaviors and use Excel
or other software to record and edit client data. You will then be able to
import these data into R and use the SSD for R functions to analyze them.
The first part of this chapter focuses on the types of data you will want to
record and some common issues related to collecting these. While an
overview of this material is covered in this chapter, additional resources that
include these topics in depth are listed in Appendix D. The second part of
this chapter shows you how to use Excel or another spreadsheet program to
quickly and effectively record the data.
What to Measure
Single-subject research designs rely on repeated measures of target
behaviors that are related to client problems. You can measure one or more
target behaviors using Excel or any software package whose files can be
saved as comma-separated values. These data can then be analyzed with SSD
for R. There are several types of measurements you can use to accomplish
this: direct behavioral observations, standardized scales, individualized
rating scales, and logs.
While you could record these data in Excel, you, as the social worker,
realize that during Week 4 the senior center was only open five days instead of
its customary seven due to a holiday schedule. To improve the validity of
your measurement, then, you may want to record the proportion of the week
that Mary goes to the senior center by dividing the number of days she
attends by the number of days she could possibly attend. For example, in
Week 4 Mary attended one of the five days the center was open, so you would
record 1/5 = 0.20. Your data for this time period would then be reported in
Excel differently:
Week    Proportion of Days Mary Went to the Senior Center
1       0
2       0.29
3       0.14
4       0.20
Standardized Scales
Standardized scales are instruments that have typically been developed for
use in either practice or research to measure a particular problem or
phenomenon. Standardized scales usually have a consistent method for
scoring, and scores can be compared across individuals (Bloom et al., 2009;
Orme & Combs-Orme, 2011).
You may choose to measure one or more target behaviors with
standardized scales; however, like other types of measures, there are
advantages and disadvantages to using these. Some advantages include the
ability to measure multiple aspects of a complex construct, the availability of
reliability and validity information in many cases, and ease of use.
Disadvantages may include the appropriateness of a given standardized scale
for a particular practice situation and the length of time the scale may take to
complete.
To more closely understand these advantages and disadvantages, let’s look
at the example of Mary a bit further. You, as the social worker, suspect that
Mary’s social isolation may be a result of depression. There are several ways
to try to assess possible depression in this client, but one way would be to
use the Beck Depression Inventory (BDI) (Beck, Ward, Mendelson, Mock,
& Erbaugh, 1961). This is a 13- or 21-question survey that is considered
both a reliable and a valid assessment of depression in adults and includes
questions about various facets of depression, including mood, pessimism,
and guilt (Thombs, Ziegelstein, Beck, & Pilote, 2008). While this, then,
would seem like an obvious measure of depression to use with Mary, it is
fairly lengthy, and it would not make sense to have her complete it daily or
even weekly. With this in mind, it might make sense clinically to use the
BDI to evaluate her depression and track Mary’s progress over time, but for
the purposes of a single-subject practice evaluation, you would probably
want to use this measure in conjunction with others as you might want more
frequent measures to aid in the analysis of your work with this client.
There are many sources for locating standardized scales, but one place to
start is with the work of Orme and Combs-Orme (2011), who included scales
and references for online and published standardized scales in two
appendices.
During Mary’s assessment period, which has lasted 4 weeks, you collect the
following information:
Week    Comfort at Senior Center    Mood
1       1                           3
2       4                           2
3       2                           2
4       2                           1
Note that Mary could very easily be asked these questions when she meets
with her social worker, and the responses gathered could actually be used
therapeutically during the session in which she reports the information.
Logs
Logs can be referred to as client-annotated records, critical incident
recordings, or self-monitoring, but no matter how they are referred to, logs
can be used to provide a qualitative dimension to quantitative data you
collect using other means. In general, a log is an organized journal of events
related to client problems. The log usually includes the clients’ perceptions
of these and/or the circumstances under which they occur. While we might
ask a client to use a log, they can also be completed by professionals and
significant others to get a deeper, more thorough understanding of the
overall problem we are seeking to address. Logs, then, are personal and
specific to the individual client and can be used to help formulate client-
specific interventions.
The format of a log is up to the preferences of the practitioner and client.
Some logs are very structured and can be in the form of a table that the client
is asked to fill in. Some are very unstructured and can simply be made up of
a notebook, pad of paper, or mobile device that the client uses for the
purposes of recording events and his or her reactions to these.
When using logs with clients, you will want to decide whether you want
them used at preset times or if they are to be open ended. When the client’s
problem occurs regularly, you may want to suggest that logs be used to
record events and reactions at preset times. When this is done, logs can
generate a lot of information due to their frequent use; however, if the
problem is sporadic, regular use of logs can generate large amounts of
useless information. In this case, you may suggest that your client use his or
her log only when the problem occurs.
In order to maximize the reliability of logs, you should consider
discussing the importance of accurate reporting with your clients and assure
them that the use of these falls within the bounds of confidentiality set up in
other aspects of your work. Creating a structured format for logs may help
minimize difficulties in clients using them. You will also want to be sure that
your clients understand what they are to record and how often. If you are
asking your clients to use logs when an event occurs, it is advisable to have
them record the event and their reactions as close in time to the event
occurring as possible. In all cases, you will want to encourage your clients to
be as accurate in their reporting as possible without inadvertently
encouraging inaccurate reports of improvement.
Both Bloom, Fischer, and Orme (2009) and Orme and Combs-Orme
(2011) provide more detailed information about the use of logs in single-
subject designs, and both texts provide templates for logs that can be tailored
to meet your clients’ needs.
Using Spreadsheets to Record Client Data
Once you determine what you are going to measure, how you are going to
measure it, and how often you are going to measure, you will want to collect
and record these data so that it can eventually be imported and analyzed in
SSD for R. In this section, we walk you through the steps necessary in order
to do this accurately.
One of the simplest ways to bring your client data into SSD for R for
analysis is by entering it into Excel, Google Sheets, Numbers, or any other
program that can create “.csv” files. Because Excel is the most commonly
used spreadsheet program, this chapter shows you how to enter data in
Excel. Other programs used for entering data will use a method similar to,
although not exactly the same as, Excel.
In some cases, you may not be able to use Excel or another program to
enter your data. It is possible to enter your data directly into R for analysis
into SSD for R. This is explained in detail in Appendix A, “Entering and
Editing Data Directly in R.”
If you are using Excel or another program, you will need to create a
separate file for each client/group; however, you can track multiple
behaviors for each client/group in a single file.
Creating a file that can successfully be imported into SSD for R has to be
done in a particular manner. To do this, complete the following steps:
1. Create a folder that will be used to store your data and give it a name you will remember, such
as ssddata.
2. Open Excel.
3. On the first row (labeled “1”), enter names for each behavior (i.e., variables) across the
columns, beginning with column A.
HELPFUL HINT: For each behavior you are measuring, you will need to create both a behavior
variable and a corresponding phase variable. The behavior variable will measure the target
behavior, and the phase variable will indicate in which phase, baseline or intervention, the
measurement occurred. In order to do this systematically, we recommend giving the behavior a
meaningful name, and the associated phase variable name should be similar to the behavior
variable name. For example, if you are measuring crying behavior, your behavior variable
could be named “cry” and your phase variable could be named “pcry.”
4. Starting in row 2, begin entering your data for each occurrence the behavior is measured.
IMPORTANT NOTE: When phases change, you will need to enter “NA” (do not enter “N/A” or
“na”) into the row between a change in phases for both the behavior and phase variable.
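To make this concrete, here is a minimal sketch of what a file tracking the crying behavior described above might look like once saved in .csv format (the values are hypothetical; note the NA row marking the change from the baseline phase to the intervention phase):

cry,pcry
3,A
2,A
4,A
NA,NA
1,B
0,B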
To look at a relevant example, let’s return to the example of Mary,
discussed previously in this chapter. You will note that the social worker
ultimately collected three measurements during her assessment: Mary’s
attendance at the senior center, her comfort with going to the senior center,
and her mood. In order to record these data in Excel, we will need to create
three behavior variables and three corresponding phase variables. We will
call the behavior variables “attendance,” “comfort,” and “mood.” We will
call the corresponding phase variables “pattendance,” “pcomfort,” and
“pmood.” In addition, we will also create a variable called “notes” to record
any critical incidents that might have occurred.
In Figure 1.1, note that we have entered baseline, or preintervention, data
for Mary. If we had begun our intervention, we would continue entering that
data on Excel row 7; however, the phase variable would change as “A”
generally denotes baseline data, while “B” indicates data collected during
the intervention.
5. Once your data are entered into Excel, you will need to save it as a “.csv (Comma delimited)”
or “.csv (Comma Separated Values)” file in your ssddata directory. To do this, click SAVE AS
and choose a name for your file. Do NOT click SAVE, but instead select one of the .csv options
from the drop-down menu for SAVE AS TYPE or FORMAT, as shown in Figure 1.2 (see p.
16). After you finish this, you should click SAVE and close Excel. You may receive several
warnings, but you can accept all of these by selecting CONTINUE.
Once you enter data into Excel, you can import it into SSD for R and
begin your analysis. In the next chapter, you will learn how to access R, the
SSD for R software, and RStudio user interface to complete this analysis.
Figure 1.2 Saving a Microsoft Excel file in “.csv” format
Conclusion
In this chapter, you learned about different ways to measure target behaviors
associated with your clients’ identified problems. Four different types of
measurement exist, each of which has its own strengths and weaknesses.
These include direct behavioral observations, standardized scales, individual
rating scales, and logs. When selecting one or more methods of measuring a
target behavior, you will want to consider the specific needs of your client,
identified problem, and practice or research situation.
As you collect your client data, Excel is an outstanding tool for recording
this as Excel files saved in .csv format can be imported directly into R for
analysis using the SSD for R package. This chapter showed you step-by-step
instructions to accurately and easily format and save these Excel files. A
method for entering and editing data directly in R is detailed in Appendix A.
Chapter Exercises
Assignment 1.1—Entering Data
Brenda is a developmentally delayed client with oppositional behavior. She
was in danger of not being accepted into a residential program because she
didn’t get along with others. Let’s examine one aspect of her behavior,
oppositional behavior in school.
This was important to measure because each episode typically lasted 20–
30 minutes and was disruptive to the school day.
For this assignment, you will create an Excel or Google Sheets file with
Brenda’s baseline data. BE SURE TO STORE THIS SPREADSHEET IN A
PLACE YOU CAN ACCESS LATER, as you will use it for future
homework assignments. The data you need for this are in the following table:
1. Create the spreadsheet and name it “Brenda—FirstLast” with FirstLast being YOUR first and
last name. For example, the file would be titled “Brenda—WendyZeitlin.”
2
Overview of SSD for R Functions
Introduction
SSD for R is a software package composed of a set of statistical functions
that are designed for the analysis of single-subject research data using R, a
free, open-source statistical programming language (R Core Team, 2013).
Interest in R has grown rapidly in recent years, and there are numerous
readily available, free resources to help people get started with basic and
advanced use and programming. In this
chapter, you are given step-by-step instructions on how to access the
software necessary to use this package. You also are presented with a brief
overview of the capabilities of the SSD for R package. How and when to use
these functions is expanded on in subsequent chapters.
Getting Started
SSD for R is a set of functions that was designed for single-subject research
in the behavioral and health sciences and is written in R, a programming
language designed for statistical analysis and graphics. R provides an
environment where statistical techniques can be implemented (The R Project
for Statistical Computing, n.d.). Its capabilities can be extended through the
development of functions and packages. Fox and Weisberg stated, “One of
the great strengths of R is that it allows users and experts in particular areas
of statistics to add new capabilities to the software” (2010, p. xiii). SSD for R
is a software package that extends the capacity of R to analyze single-system
research data (Auerbach & Zeitlin, 2021; Auerbach & Zeitlin Schudrich,
2013).
Throughout this book, we demonstrate statistical procedures using the
SSD for R package. In order to get started using SSD for R yourself, you will
need to download three things: R; RStudio, which we use as a user interface
for R; and the SSD for R package (Auerbach & Zeitlin Schudrich, 2013; Free
Software Foundation, Inc., 2012; R Core Team, 2013). R and RStudio can be
downloaded either directly or through links from The Single-System Design
Analysis website (https://www.ssdanalysis.com). Once R and RStudio have
been downloaded, the SSD for R package can be installed using a simple R
command.
Begin by downloading R and RStudio free of charge from the Single-
System Design Analysis website (https://www.ssdanalysis.com or from the
companion site to this text at Oxford University Press). When you click the
links for each of these, you will be taken to external sites. Both R and
RStudio are completely free and are considered safe and stable downloads.
RStudio has developed a cloud version as an alternative to running
RStudio on your computer. All you need to run the cloud version is a web
browser and an RStudio Cloud account. You will also find instructions on
using the RStudio Cloud version on both the Single-System Design Analysis
website and the companion site to this text at Oxford University Press.
Throughout the rest of this book, we use sample datasets that you can
download from https://www.ssdanalysis.com or from the companion site to
this text at Oxford University Press. To download these, click on the
“Datasets” tab from the home page and follow the instructions. These files
include stability.csv, jenny.csv, jennyab.csv, ed.csv, and
social_skill_group.csv. R scripts for each chapter are also provided on the
website.
As you are getting started, you may want to create a single folder on your
computer to hold your R data files. These downloaded files should be stored
there.
Once these are installed, open RStudio. When you open it, your screen
should look like Figure 2.1.
Figure 2.1 RStudio
After the SSD for R package is installed, type the following in the Console:
>require(SSDforR) and press <ENTER>.
Another way to load the SSD for R package is to click the Packages tab in
the lower right pane of RStudio. Click “Install Packages,” and a dialogue box
will appear. Select “Repository (CRAN)” as the location to “Install from:”
Under “Packages,” type “SSDforR.” Select this package and then click
“Install.” That package will automatically be installed, and the Packages tab
will be refreshed. Now, check the box next to SSDforR to make it available
for your use. You will only need to install SSD for R once but will need to
check the box for each new R session.
Your screen should now look like what is displayed in Figure 2.2.
Important Note: The package only needs to be installed once using
install.packages(). This command can also be used to install other R
packages and upgrades. The require(SSDforR) command has to be run once
at the beginning of each R session.
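Putting these steps together, a typical first session looks like this in the Console (the installation step is needed only once per computer):

>install.packages("SSDforR")
>require(SSDforR)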
Figure 2.2 RStudio with SSD for R installed and required
Getting Help
You can obtain a list of the SSD for R functions by typing the following in
the Console:
>SSDforR()
The list of functions will appear in the Console on the left. You can obtain
help for any function in the list by typing a question mark (?) followed by
the function title. For example, if you need more information on using the
ABrf2() function, you would enter the following in the Console:
>?ABrf2
Help for the command will appear in the bottom right window under the
Help tab. Your screen should look as it does in Figure 2.3 (see p. 21).
Figure 2.3 RStudio with functions listed and help file displayed
All help files have the same structure. They begin with a brief description
of what the function does. The “Usage” section shows the required syntax
for the function, and the “Arguments” section describes what is required in
the syntax. The “Authors” section lists the authors of the function, and the
“References” section provides additional information. At the bottom of each
help file is an “Examples” section that can be copied and pasted into the
Console and run. For example, type the following in the Console:
>?ABplot
When you scroll to the bottom of this help file, you will see an example.
Copy the example in the help file and then paste it into the Console. Remove
the number sign (#) and press <ENTER>. Your screen should look like what
is displayed in Figure 2.4 (see p. 22):
Figure 2.4 Using examples in ABplot() help file
You just created your first graph in SSD for R, and you now see in the
bottom of the Console the proper syntax for the ABplot() command.
Alternatively, you can access hyperlinks for all SSDforR functions by
clicking the hyperlink for SSDforR in the Packages tab in the lower right
pane of RStudio.
Note: Most, although not all, function names begin with a capital letter.
When you type a function in the Console, you must type it EXACTLY as
it appears in the help file.
The column labeled “yell” shows the number of times that Jenny yelled
during the school day. The label “pyell” is a phase variable associated with
the variable “yell.” Since there has been no intervention done at this point,
the “A” for each measurement indicates that these are baseline data. The
“cry” variable indicates how many times Jenny cried in school each day, and
“pcry” is the phase variable associated with “cry.” The “callout” indicates
how many times Jenny called out in class without raising her hand, while
“pcallout” is the phase variable associated with “callout.” Note that each
behavior variable has an associated phase variable, which is required in
order to analyze these data using SSD for R. While we have given each
phase variable a prefix of “p,” you could actually name these phase variables
anything meaningful to you.
To understand this, we can see that on the first day of collecting data
Jenny yelled twice, called out of turn three times, and cried once. On the
second day, she yelled twice, cried twice, and called out three times.
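Based on this description, the first two rows of data in the spreadsheet would look something like the following sketch (the column order is illustrative, and the optional "notes" column is omitted):

yell,pyell,cry,pcry,callout,pcallout
2,A,1,A,3,A
2,A,2,A,3,A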
The “notes” column is optional but allows the social worker to put in his
or her comments and note critical incidents that could impact any of the
data. Note that the social worker mentioned that Jenny’s mother was out of
town one day, and, on another day, Jenny got sick while she was in school
and needed to go home early. On the day Jenny went home sick, she cried 10
times, which seems to be a lot, even for her.
This Excel data file can be imported into RStudio so it can be analyzed
using SSD for R. In order to do that, however, remember that the spreadsheet
needs to be saved as a “CSV (Comma delimited)” file.
To begin your analysis, you need to load the SSD for R package. To do
this, type the following in the Console:
>require(SSDforR)
Next, open your data file with the Getcsv() function, which is one of the
functions in SSD for R:
>Getcsv()
Once you hit <ENTER>, you will be prompted to select a file. Select
Jenny.csv and, after opening the file, type the following in the Console:
>attach(ssd)
As displayed in Figure 2.6 (see p. 25), to view your file in the top left pane
of RStudio, follow these steps: click the Environment tab in the top right
pane, and then click ssd in the Global Environment window.
Jenny's data will appear in the top left pane.
Figure 2.6 Displaying the ssd spreadsheet in RStudio
To view the variable names in alphabetical order, enter the following
command in the Console:
>listnames()
Your RStudio window should look like what is displayed in Figure 2.7 (see
p. 25).
Figure 2.7 Using listnames() to view variable names in alphabetical order
Note how all the variables in the Jenny.csv Excel spreadsheet are now
listed in the Console in alphabetical order.
Alternatively, you may want to view the variables you created in the order
in which you created them. To do that, enter the following command in the
Console:
>names(ssd)
Note how the order in which the variables appear in Figure 2.8 is the same as
in your Excel spreadsheet (see Figure 2.5).
Figure 2.8 Using names(ssd) to view variable names in creation order
If you want to see all the values for a given variable, simply type the name
of the variable into the prompt in the Console. For example, to see the values
for “yell,” simply type yell at the prompt, which is displayed in Figure 2.9
(see p. 27).
Figure 2.9 Displaying all values for the variable yell
Note that the values for “yell” are the same regardless of whether you
view them in RStudio or in Excel (see Figure 2.5).
Once you import your data file, you will be able to use the SSD for R
functions to analyze your data visually and statistically. As you continue
going through this book, you will learn more about Jenny’s presenting
problems by analyzing data at different points in her treatment process.
Gloria is a 62-year-old widow described as having a major acute depressive
disorder that had its present onset after her husband's sudden death from a
heart attack. Her symptoms include sadness, tearfulness, lack of interest
in most daily activities, insomnia, and suicidal ideation.
To better understand Gloria’s degree of depression, her worker
administered a depression scale daily. Also measured was the number of
hours Gloria slept daily. These data are shown in Figure 2.10 (see p. 28). The
column labeled “depress” shows Gloria’s daily depression score. The
“pdepress” is a phase variable associated with the variable “depress.” Since
there has been no intervention done at this point, the “A” for each
measurement indicates that this is baseline data. The “sleep” variable
indicates how many hours Gloria slept each day, and “psleep” is the phase
variable associated with “sleep.”
Summary of Functions
SSD for R supports the full range of analysis that can be conducted with
single-subject research data. How and under what conditions each of
these functions is used is discussed in detail in subsequent chapters, but a brief
overview of these functions is provided here. You can also refer to
Appendix B, SSD for R Functions Quick Guide, to see all of these functions
grouped by usage.
• Basic graphing functions—These include a set of commands that will
allow you to make and label a variety of graphs, draw vertical lines
between phases, and note critical incidents. Graphs include, among
other things, simple line graphs (with or without mean or median
lines), standard deviation graphs, and graphs showing interquartile
ranges. These graphs can be exported in a number of different
formats.
• Functions associated with descriptive and basic statistics—These
include basic descriptive statistics, including mean, median, standard
deviations, and quartiles.
• Functions associated with statistical process control (SPC) charts—
These include X-R charts, R charts, p charts, X-mr charts, and c
charts. Again, all charts and graphs can be exported in a variety of
formats.
• Functions associated with autocorrelation and data transformations—
These are appropriate for calculating and dealing with issues of
autocorrelation, which are unique to time series research designs.
• Functions associated with effect sizes—These enable users to
quantify the degree of change observed between phases.
• Functions associated with hypothesis testing—These enable users to
test whether there are statistically significant differences between two
or more phases. Available tests include t-tests, chi-square, analysis of
variance (ANOVA), binomial, and conservative dual criteria.
• Functions appropriate for analyzing group data—These are suitable
for analyzing group data, including variation of individuals’
behaviors within a group.
• Functions appropriate for conducting meta-analysis—These enable
the user to conduct meta-analysis utilizing omnibus, random-effect, and
fixed-effect models.
• Functions appropriate for RMarkdown—These are suitable for
producing MS Word and PDF documents for communication of
results produced by SSDforR.
• Functions suitable for community or system-wide data—These
functions are used to examine large amounts of data across phases
and are typically used in time series research that may go beyond the
scope of one client or client group.
Chapter Summary
In combination, R, RStudio, and SSD for R provide a robust way to analyze
single-system research data. This chapter showed you how to download the
necessary software and provided an overview of the visual and statistical
capability available with SSD for R. In the following chapters, you will learn
how to use these to analyze and interpret single-subject research data.
Chapter Exercises
Assignment 2.1—Opening and Displaying an SSD for R File
1. Load SSDforR and use the Getcsv() function to open the Jenny baseline data, attach them, and
create a line graph of the yelling behavior.
2. Display the spreadsheet in the top left pane of RStudio by double clicking ssd in the global
environment in the top right panel. Take a screenshot of RStudio with the spreadsheet open.
3. You will need to submit a document (Word or Google Doc) with a copy of the graph and the
screenshot.
3
Analyzing Baseline Phase Data
Introduction
In this chapter you will learn about methodological issues in analyzing your
baseline. There may be times, however, when you may want to do additional
analysis and compare, for example, one intervention phase to another. In
these cases, you would also want to consider applying the analyses we
introduce in this chapter to those phases as well.
This chapter is not intended to provide a comprehensive discussion of
how to design a baseline; there are a number of excellent texts listed in
Appendix D that provide in-depth discussion of single-subject research
designs. Rather, in this chapter we focus on assessing baseline data. The
baseline is the phase in single-subject research in which data are collected on
the target behavior prior to any intervention. As a result, the baseline serves
as the basis for comparison with information collected during any
intervention that might take place. This comparison allows you to determine
whether the target behavior is changing and how it is changing. Done well, the
baseline lets you know what would be expected to continue if there were no
intervention at all (Bloom et al., 2009).
There are two different types of baselines, concurrent and reconstructed.
In a concurrent baseline, data are collected while other assessment activities
are being conducted. A reconstructed baseline is an attempt to approximate
the naturally occurring behavior based on the memories or documentation of
the client or others; case records can also be utilized to produce a
reconstructed baseline. The reconstructed baseline can be utilized when a
concurrent one is not possible. In some cases, a combination of
reconstructed and concurrent data is used to form the baseline.
Analyzing Your Baseline
In general, analyzing baseline data is simply a matter of describing it.
Traditionally, single-subject research uses visual analysis to accomplish this,
and descriptive statistics, such as the mean, standard deviation, median, and
quantiles, are used to provide more in-depth information. SSD for R can be
used to accomplish this.
The other issue that is relevant in the analysis of baseline data is that of
autocorrelation, also known as serial dependency. Data in any phase that
have high levels of autocorrelation may need to be treated differently from
data that do not have this problem when comparing phases as most analytical
methods are based on the notion that observations are independent of one
another. Therefore, it is important to assess the degree to which phase data
are autocorrelated in order to better understand how comparisons to other
phases can be made at other times, and this issue is discussed more at the
end of the chapter.
Visual Analysis
Determining the stability of a baseline is an important first step in the
analysis of single-system data. A stable baseline is one in which there are no
wide fluctuations or obvious cycles. Stable baselines provide the ability to
estimate what might happen to the behavior in the future if no intervention
were provided. A lot of variability in the data makes it difficult to determine
how the behavior is affecting the client system and can lead to incorrect
conclusions about the impact of the intervention (Matyas & Greenwood,
1990).
Ideally, the baseline should continue until stability is achieved; however,
there are situations in which this is not possible or practical (Bloom et al.,
2009; Logan, Hickman, Harris, & Heriza, 2008; Portney & Watkins, 2008).
In some cases, it may be necessary to intervene with the client before
stability in the baseline occurs because of pressing needs of the client, and it
may be possible to get a good understanding of the behavior by combining
concurrent and retrospective baseline data. This is often the case when
single-subject research is part of practice evaluation. If it is not possible to
continue collecting baseline data until that phase is stable, it might be
feasible to select another target behavior to assess.
Illustrations of baselines with varying degrees of stability follow.
Figure 3.1 is an illustration of a stable baseline.
Note that there is only a slight upward or downward direction in the data
as all baseline values range between 3 and 4. This is a desirable pattern
because there is a consistent occurrence of the behavior over time. Changes
in the behavior during the intervention phase would be readily apparent.
Figure 3.2 provides an example of an increasing and stable baseline.
Figure 3.2 Example of an increasing stable baseline
In this example, the baseline is stable because we can use the obvious
trend to predict future values; however, the outcome being measured is
consistently increasing. If the goal were to decrease or stabilize the behavior,
a reversing or flattening trend in the intervention phase would indicate that.
If, however, the goal was for the behavior to continue increasing, a
successful intervention may appear to be a change in the rate of the increase.
Figure 3.3 shows an example of a semistable baseline.
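The box plot and descriptive statistics referred to below can be produced with the ABdescrip() function, which is demonstrated again later in this book; a minimal example for Jenny's yelling behavior, assuming her baseline file has been opened with Getcsv() and attached, is:

>ABdescrip(yell, pyell)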
In the Plots pane, you will find the box plot for these data. In the Console,
you will find a host of descriptive statistics, including the sample size (n)
and values for the mean, the 10% trimmed mean, median, sd, range,
interquartile range (iqr), and quantiles. The box plot in Figure 3.5 shows
data with considerable variation, with a minimum value of one and a
maximum of five.
We can also illustrate this with a simple line graph displaying the number
of yelling incidents using the following command, as shown in Figure 3.7
(see p. 37):
>ABplot(yell, pyell, "time", "amount of yelling", "Jenny's yelling")
Figure 3.8 Annotated line graph of Jenny’s yelling during baseline phases
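The annotation shown in Figure 3.8 can be added with the ABstat() and ABtext() functions, which are described next; a minimal sketch for adding mean and median lines to the baseline is:

>ABstat(yell, pyell, "A", "mean")
>ABstat(yell, pyell, "A", "median")

Text labels for these lines can then be placed with ABtext().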
Note that when you enter the ABstat() and ABtext() functions, you will
receive an additional prompt in the Console instructing you to place the
mouse where you want your lines and text to appear. For the ABstat()
command, place the cursor at the beginning left of the phase where the line
should appear to ensure that the statistic line covers all the phase data. Since
we only have baseline data at this point, hover your cursor in the graph
above the 0 on the x-axis. For the ABtext() function, center the mouse over
the area where you want the text. After the text or statistic line appears, if
you are satisfied with its appearance on the graph, simply type “y” and
<ENTER>. If you are not satisfied, type “n” and <ENTER>. In this case,
you will have to reenter the command to continue annotating your graph.
You can easily do this by using your up and down arrow keys to re-create
previous commands.
The addition of a measure of central tendency provides valuable
information. We can now see that the points in the graph vary widely from
the mean and median, and the baseline looks pretty unstable. The sixth point
is the furthest below the mean and median while points 10, 12, and 13 are
the furthest above.
Another way to enhance your visual analysis is to examine how the data
fall around standard deviation bands, which display the mean along with
lines depicting the bounds of either one or two standard deviations around
the mean.
Approximately two thirds of all data in a normal distribution fall between
one standard deviation (sd) above and one standard deviation below the
mean. Bloom et al. (2009) suggested that data falling outside of one
standard deviation can be defined as desirable or undesirable, depending on
the behavior being measured. This
can be illustrated, as shown in Figure 3.9 (see p. 38), using Jenny’s yelling
behavior with the following command:
>sd1bandgraph(yell, pyell, "A", "time", "amount of yelling", "Jenny's yelling (1-SD)")
The actual values for the means and standard deviations are displayed in the
Console:
“SD=” “1.26”
“+1sd=” “4.06”
“mean=” “2.8”
“-1SD=” “1.54”
and we can add this to the band graph by adding a key to your graph in an
unobtrusive place:
>ABtext("KEY: mean=2.8, +1 SD=4.06, -1 SD=1.54")
Figure 3.9 1-sd band graph of Jenny’s yelling during baseline phase
Notice that 11 of the 15 data points in the baseline are within one standard
deviation, which is 73% of the data. As this is above two thirds of the data
expected in a normal distribution, we can assume that variability of this
baseline is not excessively high.
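These band values are simply the phase mean plus and minus one standard deviation, and you can verify them with the base R functions mean() and sd() (a quick check, not an SSD for R command; na.rm = TRUE skips any NA phase-separator rows):

>mean(yell[pyell == "A"], na.rm = TRUE)
>sd(yell[pyell == "A"], na.rm = TRUE)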
Using two standard deviations would be considered even more rigorous
because approximately 95% of all scores in a normal distribution are within
plus or minus two standard deviations of the mean. This is illustrated in
Figure 3.10, which is annotated with the values of each band and can be
demonstrated using the following function:
>sd2bandgraph(yell, pyell, "A", "time", "amount of yelling", "Jenny's yelling (2-SD)")
Figure 3.10 2-sd band graph of Jenny's yelling during baseline phase
As the graph displays, none of the points are above or below two standard
deviations. Therefore, if we were using plus or minus two standard
deviations as the criterion for typical behavior, none of the values would be
considered unusual (i.e., desirable or undesirable). The actual values for the
mean and standard deviations are displayed in the Console and were used in
the annotation.
“SD=” “1.26”
“+2sd=” “5.33”
“mean=” “2.8”
"-2SD=" "0.27"
Trending
As we continue our analysis of the baseline, it is important to assess if the
data have a significant trend and to note the direction of the trend (i.e.,
whether it is increasing or decreasing). The trend can also be completely flat
or irregular, as discussed previously. The goal of an intervention may be to
reverse the trend in a negative behavior or increase it for positive behavior. It
is important to detect a trend early on as a strong trend will impact the type
of data analysis we can do in comparing phases.
As an example, let’s examine Jenny’s yelling behavior to see if we can
detect a trend. Invoke the following command in the Console to display what
is illustrated in Figures 3.11 and 3.12 (see p. 41):
Figure 3.11 Console output of assessment for trending of Jenny’s yelling during baseline phase
Figure 3.12 Visual assessment for trending of Jenny’s yelling during baseline phase
>Aregres(yell, pyell, "A")
The graph displays a slight trend in the data, with the dotted regression line
displaying the slope; however, note that the individual data points are not
clustered tightly around the regression line. In addition to the graph,
statistical output is produced in the Console.
These values show that the degree of change can be quantified by the
estimate for x1 of 0.13214, which is the slope of the regression line. This can
be interpreted as follows: For each unit increase in time, there is an
estimated 0.13214 unit increase in the yelling behavior. The column labeled
“t value” is the calculated value for the statistical significance of the slope
and constant. The last column labeled “Pr(>|t|)” is the probability of making
a Type I error (i.e., testing the hypothesis that the coefficient for the slope is
greater than zero). Because the probability is greater than the commonly
accepted 0.05 threshold (p = .0791), the slope is not considered statistically
significant. Despite an insignificant p value, however, we may still want to
consider the fit of the data around the regression line because significance
may be hard to achieve, particularly with small sample sizes. If this is the
case, we might want to look at the value for Adjusted R-squared, which
explains the proportion of the variance in the outcome, in this case, Jenny’s
yelling, explained by the predictor variable time. In this example, we see the
adjusted R-squared is 0.1581, indicating that approximately 16% of the
variance in yelling is explained by time.
A visual inspection of the graph suggests that the data are not linear and
outliers at points 10, 12, and 13 have an impact on the slope by pulling it
upward.
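In effect, Aregres() fits an ordinary least-squares regression of the behavior on time. If you want to cross-check the slope and adjusted R-squared reported above, a roughly equivalent base R sketch (assuming Jenny's baseline data are attached) is:

>time <- 1:length(yell)
>summary(lm(yell ~ time))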
Autocorrelation
All statistical tests rely on various assumptions. When these assumptions are
not met, incorrect conclusions about calculated values can be made, such as
deciding that observed differences between phases are meaningful when
they, in fact, are not (i.e., Type I error). On the other hand, we may
erroneously not detect a difference between phases when those differences
do exist (i.e., Type II error).
An assumption in the analysis of data in many types of analyses is that
each observation is independent of the others. This means that observations
cannot be predicted from one another. This is often the case when many
research subjects are included in a study, and each individual is observed
separately; however, this is less often the case when subjects may have close
relationships to one another or we measure individuals repeatedly, as is the
case in single-system research designs.
When data lack independence, they are considered correlated, and a
special consideration in single-system research is serial dependency (Bloom
et al., 2009). Serial dependency can be measured by testing for a correlation
known as autocorrelation (Gast & Ledford, 2010). When visual analysis is
used, high levels of autocorrelation combined with variability in the baseline
increase the likelihood of a Type I error (Matyas & Greenwood, 1990).
Measuring autocorrelation using the rf2 (Huitema & McKean, 1994) is
preferable when the number of observations is small, usually considered
fewer than six (Bloom et al., 2009).
Console is needed to test for lag-1 autocorrelation for Jenny’s yelling
behavior:
>ABrf2(yell, pyell, "A")
Figure 3.13 Console output of assessment for autocorrelation of Jenny’s yelling during baseline phase
Figure 3.14 Visual assessment for autocorrelation of Jenny's yelling during baseline phase
You will note in Figure 3.15 that the graph does display a decrease in the
degree of variation for the transformed data, which could indicate a
reduction in the degree of autocorrelation.
Figure 3.15 Moving average plot for Jenny’s yelling during baseline phase
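The moving average plot in Figure 3.15 is produced with the ABma() function; a minimal sketch for Jenny's yelling behavior, assuming the function follows the same argument pattern as the other SSD for R commands in this chapter, is:

>ABma(yell, pyell, "A")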
When you use the ABma() function, you have an option to save the
transformed data for future analysis. Once you save this, you will want to
test it again for autocorrelation. If it appears as if transforming the data has
reduced variation, you will want to use these data, and not your
original, for further analysis.
If the data are not independent and have a significant trend, transformation
using differencing is recommended. Differencing simply calculates the
difference between the value of each data point and the one immediately preceding it.
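As a quick illustration of the idea, base R's diff() function (not an SSD for R command) performs exactly this calculation:

>diff(c(2, 5, 4))
[1]  3 -1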
As an example, begin by examining Jenny’s calling out behavior at school,
shown in Figures 3.16 (see p. 45) and 3.17 (see p. 45), using the following
function:
>Aregres(callout, pcallout, "A")
Figure 3.16 Console output for trending of Jenny’s calling out during baseline phase
Figure 3.17 Visual assessment for trending of Jenny’s calling out during baseline phase
As both the graph and output in the Console indicate, there is a strong
trend in the data. Next we test for autocorrelation of the calling out behavior
using the following function, the results of which are displayed in Figures
3.18 (see p. 46) and 3.19 (see p. 46):
>ABrf2(callout, pcallout, "A")
Figure 3.18 Console output of assessment for autocorrelation of Jenny’s calling out during baseline
phase
Figure 3.19 Visual assessment for autocorrelation of Jenny’s calling out during baseline phase
The data are autocorrelated with rf2 = 1.497 and a significance of rf2 of
less than 0.05. To try to reduce the effects of this autocorrelation, these data
can be transformed using differencing with the following command (a graph
of this function is displayed in Figure 3.20; see p. 47):
>diffchart(callout, pcallout, "A")
Figure 3.20 Differencing plot for Jenny’s calling out during baseline phase
Since differencing seems to smooth out the line, we will save the
transformed data by typing “y” when prompted.
Regardless of how you transform your data (i.e., either through the
moving average or through differencing), after you have saved the
transformed data, use the Getcsv() and attach(ssd) commands to open and
attach the newly created dataset. Once this is accomplished, you will need to
test the transformed data for autocorrelation using the following command
(results are shown in Figures 3.21 and 3.22) (see p. 48):
>ABrf2(diff, phase, "A")
Figure 3.21 Console output of assessment for autocorrelation of differenced calling out during
baseline phase
Figure 3.22 Visual assessment for autocorrelation of differenced calling out during baseline phase
In the case of Jenny’s calling out behavior, the transformation has reduced
the rf2, and it is no longer statistically significant. It should be noted that
transformed data should always be tested for autocorrelation, and all other
phases will need to be transformed in order to compare them.
There are situations in which transforming data does not lower levels of
serial dependency. In these cases, checking for autocorrelation will yield
high values for rf2 and/or significance values less than 0.05. When this
happens, it will likely be best to continue working with the original,
untransformed data as the comparison between the phases will be easier to
interpret than having to account for transformed, highly autocorrelated data.
You can calculate the autocorrelation for any lag using the ABautoacf()
function. Because of small sample size, we strongly recommend the use of
the rf2 method to calculate the lag-1 autocorrelation. Load and attach Jenny’s
baseline data for an example of how to produce a lag-5 autocorrelation
for her yelling behavior. Output for this is shown in
Figures 3.23 (see p. 49) and 3.24 (see p. 49):
>ABautoacf(yell, pyell, "A", 5)
When entering this command, note that the integer “5” is the lag number
requested. You can see from the correlogram in Figure 3.24 that none of the
autocorrelations in lags 1 through 5 are above the upper blue line or below
the lower blue line, indicating nonsignificance. The Box-Ljung test statistic
is 2.9987, and the p value is above .05, further indicating that the
autocorrelations for the phase are not statistically significant.
Figure 3.23 Console output for assessment of lag-5 autocorrelation in Jenny’s yelling during baseline
Figure 3.24 Visual assessment of lag-5 autocorrelation in Jenny’s yelling during baseline
Another Example
Now, let’s consider Anthony’s baseline checking behavior. If you recall,
Anthony is a client who appears to have obsessive-compulsive disorder, and
he has been asked to count the number of times he checks that his appliances
are turned off or unplugged each day. At his next appointment, he reports his
checking behavior for the previous week. To work with these data, use the
Getcsv() and attach(ssd) functions to load Anthony.csv. Then, describe it as
follows:
>ABdescrip(checking, pchecking)
The output displayed in the Console and illustrated in Figure 3.25 indicates
that the mean for Anthony’s checking is four times per day (sd = 0.894),
with the behavior ranging from three to five times per day. The box plot,
displayed in Figure 3.26 (see p. 51), shows the narrow range of this
behavior, with no outliers.
Figure 3.25 Console output of descriptives for Anthony’s checking during baseline phase
With this in mind, you can create a line plot by invoking the following
command:
>ABplot(checking, pchecking, "day", "frequency", "Anthony Checking")
Since there were no obvious outliers in the box plot, you might choose to
annotate this line graph with a labeled mean line, being sure to accept your
annotations:
>ABstat(checking, pchecking, "A", "mean")
>ABtext("mean=4")
From this initial analysis, it looks like the baseline is fairly stable, but we
can check this out with a one standard deviation band graph, which can be
annotated to display the values for the mean and the standard deviation
bands:
>sd1bandgraph(checking, pchecking, "A", "day", "frequency", "Anthony Checking (1-SD)")
>ABtext("KEY: mean=4, +1 SD=4.89, -1 SD=3.11")
Now we see that the majority of the data do not fall within one standard
deviation since only two of six data points are within the bands. By
expanding the graph to two standard deviations, we hope to find
stability:
>sd2bandgraph(checking, pchecking, "A", "day", "frequency", "Anthony Checking (2-SD)")
>ABtext("KEY: mean=4, +2 SD=5.79, -2 SD=2.21")
As illustrated in Figure 3.29 (see p. 53), all baseline data fall within two
standard deviations, so we have less concern about instability in this phase.
Figure 3.29 2-sd band graph of Anthony’s checking during baseline phase
Finally, we can check Anthony's baseline for trending:
>Aregres(checking, pchecking, "A")
The output in the Console, shown in Figure 3.30 (see p. 53), indicates that
whatever trend might be observed (and there does not appear to be one from
looking at any of the line graphs we produced so far) is not statistically
significant, since the significance level for the regression line is 0.2103.
Figure 3.30 Console output for assessment of trending in Anthony’s checking during baseline
Chapter Summary
Early analysis of baseline data in single-subject research entails decision-
making that informs the duration of this phase as well as how other phases
may be compared to the baseline in the future. For example, visual analysis
of the baseline early on may show an unstable or semistable trend that may
indicate the need to collect additional baseline data in the hopes of creating
stability. This analysis is critical for ensuring the integrity of the baseline for
future analysis.
SSD for R enables users to gain a thorough understanding of baseline data
through both visual and statistical inspection. The ability to visualize and
statistically understand how data are clustered around the mean or median of
the baseline (or any other phase for that matter) will give you a greater sense
of the variability of the data and, in a practice evaluation setting, may guide
you in setting concrete goals with clients.
In all cases, however, trending and autocorrelation of phase data need to
be considered since disregard for these issues could lead to erroneous
comparison of phases. This chapter provided guidance on assessment of
trending and autocorrelation in small samples, which are typical of single-
subject research. Additionally, SSD for R has the capability to transform and
save phase data for which serial dependency is an issue.
Chapter Exercises
Assignment 3.1—Annotating a Line Graph
For this assignment you will continue assessing Brenda’s baseline data using
the file you created in Chapter 1.
1. Obtain descriptive statistics for Brenda’s oppositional behaviors. In SENTENCE form, provide
information about the number of observations there are in the baseline, the average number of
oppositional episodes per day, and the standard deviation.
2. Create a line graph (like you did in the previous module) AND annotate it with a line denoting
the mean and text, appropriately placed, to display information about the mean. Be sure to label
your graph with your name in the main label.
Introduction
In this chapter you will learn about methodological issues to consider in
analyzing the success of your intervention and how to conduct your visual
analysis. We begin with a discussion of descriptive statistics that can aid in
the visual analysis of your findings by summarizing patterns of data across
phases. Four common forms of descriptive statistics are explained: central
tendency, variation, trends, and effect size. We then continue to a discussion
of testing for autocorrelation in the intervention phase. We conclude with a
discussion of the goal line, which provides a visual method to quantify
effectiveness of an intervention when a specific decision is required to attain
a specific goal.
Descriptive Statistics
A good way to begin comparing phases is by producing descriptive statistics
for central tendency and variation for all phases. As an example, let’s use the
Jennyab dataset by invoking the Getcsv() function. These data are a
continuation of our study of Jenny, but now include both baseline and
intervention data.
Once the dataset is open, type attach(ssd) in the Console to make the
dataset available for analysis. Once the dataset is attached, you can produce
descriptive statistics and view a box plot for both phases of Jenny’s crying
behavior by entering the following command in the Console:
>ABdescrip(cry,pcry)
The output, shown in Figures 4.1 (see p. 58) and 4.2 (see p. 58), includes
descriptive statistics, and a box plot is displayed:
The statistics displayed in Figure 4.2 are explained as you review this
chapter, but we begin with a discussion of the median. The median (Md) is
the value for which 50% of the observations fall both above and below; it is
the middle value if all the data were listed in increasing value. The median
can be used to express the typical value in a given phase. In the Jenny
example, the median number of daily episodes of crying was five during the
baseline (the “A” phase) compared to one during the intervention (the “B”
phase). The box plot in Figure 4.1 graphically displays the median, the
range, and the interquartile range for both phases. The thick black line in the
boxes represents the median score for the phase. The median is visibly lower
in the intervention phase, which is desirable in this case. The box plot also
shows that there is considerably more variation in the baseline than in the
intervention as noted by the upper and lower bounds of the box plots and the
sizes of the actual boxes.
Now that we have a general sense of how the data are distributed in each
phase, we can get additional information by examining a simple line graph.
Consider Figure 4.3, which was produced in part by entering the following
in the Console:
We can add a vertical line to separate the phases by entering the following in
the Console:
>ABlines(cry)
After you type this command in the Console, click the mouse in the gap
between the phases. You will be queried with a “Y/N” request to accept the
line. If you are satisfied with the placement of the line, enter “Y.” If you
enter “N,” you will then need to reenter ABlines() to place the line in the
correct position.
Looking at this line graph, we note that there is an extreme value in the
baseline, and that there is an associated note with that data point. In order to
make this clear, we can make a notation on the graph. To do this we labeled
the extreme score by entering the following command in the Console:
>ABtext("Goes Home Sick")
This is helpful in explaining why the observation is not typical. Entering the
following in the Console produces an arrow extending from the outlying
data point to your text:
>ABarrow()
Similar to drawing a line on the graph, you will be prompted to place the
ends of the arrow in the desired locations.
Figure 4.4 illustrates that adding lines that represent the median in each of
the phases can enhance the graph, especially if there are extreme, or atypical,
values. To add a median line to the baseline phase, enter the following
command in the Console:
>ABstat(cry, pcry, "A", "median")
Now click the mouse at zero on the x-axis. To add a median line to the
intervention phase, enter the following in the Console:
>ABstat(cry, pcry, "B", "median")
The lines were labeled using the ABtext() function and clicking the mouse
in the appropriate location. The addition of the median lines visually
demonstrates that the typical amount of crying decreased after the
intervention. The median is a good choice to use instead of the mean when
there are observations that are very different from the typical value, or
“outliers.” The last observation in the baseline of 10 episodes could be
considered an outlier.
The mean is another measure of central tendency that is helpful in
describing the typical observation. Figure 4.5 displays a simple line graph,
but this time, the means for each phase of Jenny’s crying behavior are
displayed. Unlike the median, the mean can be strongly impacted by outliers.
This is especially true when there are a small number of observations, which
is common in single-subject research.
Figure 4.6 Jenny’s crying behavior with means and trimmed means displayed
The trimmed mean lines shown in Figure 4.6 were added by entering the following commands in the Console:
>Trimline(cry,pcry,“A”)
>Trimline(cry,pcry,“B”)
The text was added to baseline and intervention with the following
commands:
>ABtext(“tM=4.45”) and ABtext(“tM=1.44”)
As Figure 4.6 illustrates, there is very little difference between the mean and the trimmed mean in this example. The trimmed mean, which is calculated by discarding a fixed percentage of the highest and lowest values before averaging, is generally preferred over the median as a measure of central tendency because it uses more of the information in the data.
Measures of Variation
Whenever a measure of central tendency (e.g., mean or median) is reported
on a set of data, it is necessary to describe the degree to which scores deviate
from it. Variation can be defined as the degree to which scores fluctuate
around a measure of central tendency. As a result, measures of variation are
necessary to fully describe a set of data. Two measures of variation that are
commonly used in single-system data analysis are the range and standard
deviation (SD) (Bloom et al., 2009).
Range
The range is the simplest measure of variation: the difference between the highest and lowest values in a set of observations. The range in
the baseline of Jenny’s crying behavior is nine, and it is five for the
intervention. A more useful form of this measure is the interquartile range
(IQR); the IQR is the difference between the third (75th percentile) and first
(25th percentile) quartiles. These percentiles are displayed in the box plot in
Figure 4.1, with the 25th percentile represented by the bottom of each box
and the 75th percentile represented by the top of each box. For the Jenny
data, the IQR is 3 for both the baseline (6 minus 3) and intervention (3 minus
0) phases. The IQR can be graphed using the following set of functions,
which will produce Figure 4.7:
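While Figure 4.7 itself is drawn with SSD for R’s plotting functions, the quartile values underlying it can be checked directly in base R (a minimal sketch, assuming the Jennyab dataset is attached; the which() subsetting is ordinary R, not an SSD for R function):
>a <- cry[which(pcry == “A”)] # baseline observations
>b <- cry[which(pcry == “B”)] # intervention observations
>quantile(a, c(0.25, 0.75)) # baseline 25th and 75th percentiles (reported as 3 and 6)
>IQR(a) # baseline interquartile range (reported as 3)
>IQR(b) # intervention interquartile range (reported as 3)
Exact percentile values may differ slightly from the box plot depending on the quantile algorithm used.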
Figure 4.7 Jenny’s crying with IQR ranges and medians in baseline and intervention phases
Standard Deviation
The SD is the most widely used measure of variation and is frequently used
in conjunction with the mean. Like the mean, it takes into account the impact
of all observations, thus using all available information. The SD is the average amount by which scores differ from the mean. If the scores are normally distributed, 68.2% of them will fall between 1 SD above and 1 SD below the mean, and roughly 95% will fall between 2 SDs above and below the mean. “Typical” behavior can be defined as values between ±1 SD in the baseline.
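These percentages can be checked directly in R (base R only; no SSD for R functions are needed):
>pnorm(1) - pnorm(-1) # 0.6826895, the proportion within ±1 SD
>pnorm(2) - pnorm(-2) # 0.9544997, the proportion within ±2 SDs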
Lines representing 1 or 2 SDs above and below the mean can be displayed
on a graph, which provides a view of the variability between phases. Figure
4.8 displays a band graph displaying ±1 SD for the baseline going through
both the baseline and intervention. The following commands will produce
the graph, and output in the Console will provide values for the SD bands
and the mean:
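A plausible command set, using the SD1() function that appears again later in this book, with hypothetical axis and title labels, is:
>SD1(cry, pcry, “A”, “days”, “crying episodes”, “Jenny’s Crying”)
>ABlines(cry)
The band and mean values reported in the Console can then be placed on the graph with ABtext(), as in the earlier examples.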
Looking at the graph in Figure 4.8, we see that a larger percentage of the scores fall below the lower 1-SD band during the intervention as compared to the baseline.
Since decreasing values are desirable in this example, we can assume that
the intervention is making a positive difference for Jenny’s crying.
Trend within each phase can be examined with the ABregres() function, which fits a regression line to the data in each phase. For Jenny’s crying behavior, enter the following command in the Console:
>ABregres(cry, pcry,“A”,“B”)
Note that a separate graph is produced for the two phases. The ABregres()
function has the capability to compare any two phases of a design. If, for
example, there was a B1 phase, it could be compared to the baseline with the
following command:
>ABregres(cry,pcry,“A”,“B1”)
Figure 4.9 displays an increasing trend in the baseline (i.e., the number of
crying episodes is generally increasing from one time point to the next)
while decreasing during the intervention phase (i.e., the amount of crying is
generally decreasing from one time point to the next).
In the Console to the left of your graphs, you will find some important measures that quantify what is presented graphically. Here we see that the slope in the baseline is positive (0.4121), while the slope in the intervention is negative (–0.1790). A decreasing trend is desired during the intervention in this case because we hope to see a reduction in the number of crying episodes. In this example, the trend is a modest one. The multiple R-squared values are 0.4229 for the baseline and 0.4783 for the intervention. The closer this value is to 1, the stronger the trend; the closer it is to 0, the weaker the trend. Additionally, the p values for both slopes are statistically significant.
Now consider Figures 4.11 and 4.12 (see p. 68), which examine the trend
of Jenny’s calling out behavior in both the baseline and intervention phases.
To generate these figures, the following command was entered in the
Console:
>ABregres(callout,pcallout,“A”,“B”)
In this example, the R-squared is 0.9685 in the baseline and 0.9388 in the intervention. In this instance, the trend line almost perfectly
fits the observations. This would be considered a strong trend with slopes of
0.90476 and –0.62238 for the baseline and intervention phases respectively.
Figure 4.11 Assessing trend in baseline and intervention for calling out behavior
Figure 4.12 ABregres() statistical output for calling out behavior
Now let’s consider Figures 4.13 (see p. 68) and 4.14 (see p. 69), where we
examine the trend for Jenny’s yelling behavior by entering the following
command in the Console:
>ABregres(yell,pyell,“A”,“B”)
In this example, we see that the baseline has a weak trend, and the
intervention has almost no trend, with R-squared values of 0.2183 in the
baseline and 0.03175 in the intervention. The slopes are 0.13214 and –
0.01961, respectively, which indicate fairly flat regression lines. When a
trend line is flat, it indicates that the behavior is not changing over time.
Figure 4.13 Assessing trend in baseline and intervention for yelling behavior
Figure 4.14 ABregres() statistical output for yelling behavior
To compute the g-index for Jenny’s calling out behavior, enter the following command in the Console:
>Gindex(callout,pcallout,“A”,“B”)
To the left in the Console, you will see the output presented in Figure 4.18 (see p. 74).
The output in Figure 4.18 displays g-index calculations for desired zones below and above the mean, the median, and the regression line. In this case, the desired zone is below the regression line (G Regression line). Because the g-index is a positive
number, 0.375, improvement is noted during the intervention. A positive g-
index indicates improvement, while a negative value denotes deterioration.
Figure 4.18 Statistical output from the Gindex() function for calling out behavior
The absolute value for the g-index can be interpreted as follows: Scores
between 0.1 and 0.3 are considered a small effect, scores between 0.31 and
0.5 are considered a medium effect, and scores 0.51 and higher are
considered a large effect (Cohen, 1988). Therefore, we can interpret the
results to indicate that in this case there was moderate improvement between
phases.
Now let’s look at the use of the g-index with Jenny’s yelling, which you
may recall only contained a weak trend in both the baseline and intervention.
Figures 4.19 (see p. 75) and 4.20 (see p. 75) show results for this, which
were created by entering the following command in the Console:
>Gindex(yell,pyell,“A”,“B”)
Figure 4.19 Gindex() function for yelling behavior
Figure 4.20 Statistical output from the Gindex() function for yelling behavior
Because there is some variation in the baseline, the median can be used to
define the desired zone. Since lower scores are desirable, the desired zone is
below the median. The g-index based on scores below the median is 0.816,
which indicates a large degree of improvement in the behavior.
A word of caution about the use of effect sizes in general. While these
calculations indicate the amount of change between phases, observed
changes cannot be attributed to causality. That is, effect sizes alone, even
large effects, do not prove that the intervention was the cause of the
observed change.
Special Cases of Effect Size: Non-overlap Methods
In this section, we discuss a type of effect size known as non-overlapping
data effect size measures. The methods discussed in this section can be used
in lieu of traditional effect sizes (e.g., ES) in a number of situations. For instance, when data are not normally distributed, when there is a great deal of variation between or within phases, or when there are multiple phases to be compared, non-overlap methods may be more appropriate to use (Bloom et al., 2009; Parker, Vannest, & Davis, 2011; Vannest et al., 2013).
These types of effect size are calculated from the percentage of data points in the intervention/treatment phase that fall above or below a notable point in the baseline phase (Lenz, 2012; Scruggs & Mastropieri, 1998,
2013). Five different forms of non-overlap methods are presented:
Percentage of Non-overlapping Data (PND), Percentage of Data Exceeding
the Median (PEM), Percentage of all Non-overlapping Data (PAND),
Improvement Rate Difference (IRD), and the Non-overlap of All Pairs
(NAP). The strengths and weaknesses of these methods will also be
discussed.
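For Jenny’s yelling behavior, where a decrease is desired, the PND can be computed by entering the following in the Console (a plausible invocation, following the pattern of the other phase-comparison functions in this book):
>PNDbelow(yell,pyell,“A”,“B”)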
Figure 4.22 Statistical output from PNDbelow() for Jenny’s yelling behavior
The PND graphs can be annotated using the ABlines() and ABtext()
commands. The results of the PND indicate that 18% of the data points are
below the lowest baseline point. According to the criteria presented in Figure
4.22, a PND of 0.18 would indicate an ineffective intervention (Scruggs &
Mastropieri, 1998). This finding, however, is not consistent with what we see visually. One of the weaknesses of the PND is that it is based on a single value in the baseline and is, therefore, susceptible to the influence of outliers (Lenz, 2012). The presence of outliers increases the likelihood of finding a small effect, which may not be an accurate representation of actual change.
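The PEM, which uses the baseline median rather than a single extreme value, can be computed the same way (again, a plausible invocation):
>PEMbelow(yell,pyell,“A”,“B”)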
Figure 4.23 PEM values below the median for Jenny’s yelling behavior
Figure 4.24 Statistical output from PEMbelow() for Jenny’s yelling behavior
The PEM graphs can be annotated using the ABlines() and ABtext()
commands. The results of the PEM indicate that 88% of the data points are
below the baseline median of two. According to the criteria presented in
Figure 4.24, a PEM of 0.88 would indicate a moderate effect size and is
more consistent with what we see visually.
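The PAND considers all non-overlapping data rather than a single reference point and can be computed similarly (a plausible invocation):
>PANDbelow(yell,pyell,“A”,“B”)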
Figure 4.26 Statistical output from PANDbelow() for Jenny’s yelling behavior
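The IRD is produced in the same fashion (a plausible invocation):
>IRDbelow(yell,pyell,“A”,“B”)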
Figure 4.27 IRDbelow() graph for Jenny’s yelling behavior with reference line
Figure 4.28 displays the statistical output from the Console. The results
indicate an 81.18% improvement from baseline to intervention. A percentile
chart is presented that indicates the percentile the IRD is in (Parker et al.,
2009). This chart is based on a nonrandom study of data from 166 A-B
single-subject design studies (Parker et al., 2009). An IRD of 81.57% would
be considered moderately high since only about 35% of IRD values obtained
from the sample of studies reported by Parker et al. (2009) would be larger.
Figure 4.28 Statistical output from IRDbelow() for Jenny’s yelling behavior
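The NAP can be computed in the same way (a plausible invocation; we assume a NAPbelow() form parallel to the functions above):
>NAPbelow(yell,pyell,“A”,“B”)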
Notice the prompt in the Console to choose one of the following: (s)ave,
(a)ppend, or (n)either results? (s/a or n), select n. The other two choices are
used to store the NAP results to a database for use in meta-analysis. This
option is discussed more fully in the meta-analysis chapter.
A chart of the degree of effectiveness is presented alongside the computed
effect size. In this case, the calculated value of 0.9411765 displayed below
“Est” would be considered very effective (Parker & Vannest, 2009). Unlike
the previous non-overlap methods discussed, dependable 95% confidence
intervals (CIs) can be calculated. In this example, the CI ranges from
0.7773566 to 0.9850379, which indicates that there is 95% confidence that
the impact of the intervention is at least a moderate one.
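Figure 4.30, discussed next, presents a line graph for Gloria, whose depression scores are tracked across a baseline and a DBT intervention. A sketch of how such a graph could be produced with ABplot(), using hypothetical variable and label names:
>ABplot(depress, pdepress, “weeks”, “depression score”, “Gloria’s Depression”)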
The phases were labeled using the ABtext() function and clicking the mouse
in the appropriate location. Figure 4.30 displays the ABplot in the Plots
pane.
Figure 4.31 displays a box plot in the Plots pane. The thick black line in each
box represents the median for each phase (Mda = 37 and Mdb = 26).
Although an 11-point decrease in the median from the baseline to the intervention is substantial, the box plot can suggest a conclusion different from the one drawn from Figure 4.30, namely that the DBT is not efficacious.
Figure 4.31 Box plot of Gloria’s depression score
Autocorrelation
As stated in the previous chapter, autocorrelation should be tested in each
phase as it can impact the results of tests of significance.
We can, for example, look at autocorrelation in the intervention phase for
Jenny’s crying behavior by typing the following command in the Console:
>ABrf2(cry, pcry,“B”)
Statistical results are displayed in Figures 4.32 and 4.33 (see p. 85).
Figure 4.32 ABrf2() for crying behavior in the intervention phase
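If problematic autocorrelation is found, the intervention data can be transformed using the diffchart() function (a parallel invocation; select Y when prompted to save the transformed data):
>diffchart(cry, pcry, “B”)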
After you have saved the transformed data, use the Getcsv() function to
open it. Once the dataset is open, type attach(ssd). Once the dataset is
attached, you can test it for autocorrelation by entering the following
command:
>ABrf2(diff, phase,“B”)
In the chapter on analyzing the baseline, you used the first difference to
transform your baseline data. Now, both the phases have been transformed,
and the autocorrelation has successfully been removed.
Another option for transforming data is the ABma() function. As stated in Chapter 3, it is an appropriate function for transforming autocorrelated data when the data in a phase are highly variable.
In the next chapter on statistical significance, you will learn how to use
the Append() command in SSD for R to combine transformed datasets in
multiple phases (e.g., baseline and intervention) when testing for statistical
significance.
Conclusion
In this chapter, we discussed how you might use descriptive statistics to
compare two phases in a single-subject study. These include measures of
central tendency, variation, trend, and effect size. Using Jenny as an
example, you learned the commands necessary to calculate and interpret
these statistics in SSD for R. Additionally, these commands generate
supporting graphs, including box plots, SD band graphs, and line charts
showing the mean, median, and the trimmed mean for any phase that you
can use to compare any two phases.
SSD for R also provides three standard methods for computing effect size,
which were discussed in detail. Additionally, five methods of evaluating effect size using non-overlap methods were examined. The chapter
concluded with a discussion of autocorrelation in the intervention phase and
how to consider dealing with this issue in light of the possibility of
autocorrelation in a comparative (e.g., baseline) phase.
Chapter Exercises
Assignment 4.1—Working With Two Phases of Data
For this assignment, you will modify the spreadsheet you created in Chapter
1 with Brenda’s baseline data. You will add data to continue tracking her
behavior into the intervention phase. Add the following data to your
EXISTING file:
Day # of Oppositional Episodes
8 2
9 1
10 1
11 2
12 0
13 1
14 0
15 1
16 1
17 0
18 0
19 0
20 0
21 0
22 0
23 0
Once you add these data to your Google Sheet, do the following:
1. Import it into RStudio with a different file name than you used before.
2. Create a line graph labeling it appropriately. In the main title, include YOUR initials. For
example, my main title could read, “Brenda’s Oppositional Behavior—WZ”
3. Annotate the line graph to put a vertical line between phases.
4. Label the baseline phase “A” and the intervention phase “B”
5. Export the graph to a Word document and add your answer to the following questions: Based
on the line graph alone, do you think that the social worker’s intervention with Brenda is
having the desired effect? Why or why not?
Introduction
In this chapter, we discuss tests that can be used to compare your data across
phases. These are all tests of statistical significance that are used to
determine whether observed outcomes are likely the result of an intervention
or, more likely, the result of chance. Used correctly, statistical tests can help
you make sounder decisions about your data (Bloom et al., 2009).
Usually, the first step in significance testing is to form a hypothesis of no
difference, referred to as the null hypothesis. In single-system design, the
typical null hypothesis, described as H0, is that the target behavior did not
change after the introduction of the intervention. For example, we could
state the null hypothesis that Jenny’s self-esteem did not change after
participation in a social skills group. An alternative hypothesis, denoted as
H1 or HA, is that the target behavior changed after the introduction of the
intervention. This is a nondirectional hypothesis because, stated as such,
change can be positive (i.e., desired) or negative (i.e., undesired). Continuing
the Jenny example, we could test the hypothesis that Jenny’s self-esteem did change after participation in the school’s social skills group. A directional null hypothesis would be that the target behavior was either the same or worse after the intervention, such as Jenny’s self-esteem either not improving or deteriorating after participating in the social skills group. The alternative directional
hypothesis would be that the behavior improved after the intervention, and
we would see an improvement in Jenny’s self-esteem after participating in
the social skills group. We recommend the use of a nondirectional
hypothesis because it allows you to establish if the change is positive or
negative.
The purpose of a statistical test is to determine how likely it is that we are
making an incorrect decision by rejecting the null hypothesis and accepting
the alternative one. In statistics, this is referred to as a Type I error. In more
common parlance, this is a “false positive.” By using probability, tests of
statistical significance quantify the likelihood of making a Type I error.
Typically, in the social sciences, we are willing to accept the alternative
hypothesis if the chance of making a Type I error is 5%, or a probability of
.05, or less. This is typically shown as output in statistical programs as a p
value (e.g., p ≤ .05) or a sig value (e.g., sig ≤ .05).
>Getcsv()
Once the dataset is open, attach it by entering the following command in the
Console:
>attach(ssd)
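The X̄-R chart itself is produced with the XRchart() function (a plausible invocation, using the admits and admitweek variables from the ed.csv file that appear below, with hypothetical labels):
>XRchart(admits, admitweek, 3, “weeks”, “mean of admits”, “Social Admits”)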
Note that once you create a legend, you will no longer be able to alter the
graph. Therefore, we recommend that you add this as a last step in creating
SPC charts. Be sure that your graph looks exactly as you would like it before
adding the legend.
In the Console, you will see the values for the upper band (Uband), mean,
and lower band (Lband). These values were used to label the X̄-R chart.
In this case, the desired zone would be below the lower band (68.56)
because the hospital’s goal is to reduce the number of nonmedical
admissions. The last 3 weeks of the intervention are in the desired zone,
showing that the intervention has some promise. The hospital administrator would probably want to continue measuring this process over time because the results seem to be moving in the desired direction. The previously stable pattern of high nonmedical admissions appears to have been interrupted by the intervention; to increase confidence, continued tracking would be recommended.
R Chart
The R chart is designed to detect changes in variation over time. There are
two forms of this chart available with SSD for R. One version of the R chart,
which is recommended when the sample size is small (i.e., less than 10),
uses the mean range of the samples to track variation. Similar to the X̄-R
chart, using the mean range improves confidence in the measure because it
takes into account all available data. If the sample size is larger than 10, use
of the standard deviation is recommended over the range (Orme & Cox,
2001). Although our hypothetical data have a sample size of 12, for
illustration purposes we have presented an R chart using the range.
Figure 5.2 was created with data from the ed.csv file by entering the
following commands in the Console:
Command Purpose
Rchart(admits, admitweek, 3, “weeks”, “mean range of admits”, “Social Admits”)   Creates SPC chart using three standard deviations
SPCline()   Adds line between phases; click twice to indicate the top and bottom of the line
ABtext(“24.69”)   Enters Uband text
ABtext(“12.83”)   Enters mean text
ABtext(“0.97”)   Enters Lband text
SPClegend()   Creates a legend
Figure 5.2 shows that all sample mean ranges for the 12 weeks are within
the two bands, indicating that the variation for this process remained stable.
The commands in the following table can be used to create an R chart
using the standard deviation as the measure of process:
Command Purpose
Rchartsd(admits, admitweek, 3, “weeks”, “mean SD of admits”, “Social Admits”)   Creates SPC chart using three standard deviations
SPCline()   Adds line between phases; click twice to indicate the top and bottom of the line
ABtext(“8.61”)   Enters Uband text
ABtext(“4.48”)   Enters mean text
ABtext(“0.34”)   Enters Lband text
SPClegend()   Creates a legend
Figure 5.3 shows virtually the same trend displayed in Figure 5.2. More
importantly, all the weekly standard deviations are within the two bands,
demonstrating that the variation for this process remained stable. As we
described previously, extreme variation influences the stability of a measure.
Proportion Chart
Often clinicians need to track a client system to determine the extent to
which a set of tasks has been completed. When the target behavior has a
binary outcome (e.g., completing or not completing a task), the proportion of task completion can be compared over time or between phases.
Attendance (vs. nonattendance) in a group, child welfare workers
completing records on time (vs. not completing on time), workers
completing home visits (vs. not completing home visits), whether a patient is
compliant with taking medication (vs. noncompliant), and if a patient with
hearing loss wears their hearing aid (vs. not wearing the hearing aid) are all
examples of task completion that can be assessed using proportion charts (P
charts). For a more in-depth discussion of this procedure, we suggest you
refer to J. G. Orme and Cox (2001) and Mitra (2008).
As an example, open the Jennyab.csv dataset, attach it, and view the
variables by clicking on the spreadsheet icon to the right of the ssd file in the
Environment pane. In this example, we use the following three variables:
group, wgroup, and pgroup. The group variable is a binary measure of group
attendance, with “0” indicating that Jenny did not attend the daily group and
“1” indicating that she did attend. The variable wgroup is the week of
attendance, and pgroup is the phase variable indicating whether the measure
was obtained during the baseline or intervention. To improve attendance,
after the fifth week an intervention was introduced; children who had perfect
weekly attendance were given an extra 20 minutes of free play on Friday
afternoons.
These data have 15 samples (i.e., weeks) and five observations (i.e., days)
per sample. Entering the following commands will create the p chart in
Figure 5.4.
Figure 5.4 P chart based on 3-sd bands
Command Purpose
Pchart(group, wgroup, 3, “weeks”, “proportion of attendance”, “Jenny’s Group Attendance”)   Creates SPC chart with three standard deviations
ABlines(group)   Adds line between phases
ABtext(“0.32”)   Enters mean text
ABtext(“1.00”)   Enters Uband text
ABtext(“–0.36”)   Enters Lband text
SPClegend()   Creates a legend
Figure 5.4 shows that all the samples during the baseline are within the
upper and lower bands. After the intervention, the proportion of attendance
increases, and, during the last 5 weeks, Jenny attended all group sessions.
The findings from the chart would provide evidence to continue the
intervention.
X-Moving-Range Chart
Like the previous SPC charts, the X-moving-range chart (X-mR chart) can
be used to detect changes within and between phases. This chart should not,
however, be used when there is a trend in the data. It can be used, though,
when you are not using samples of data, but individual data points, such as
what we have seen with the majority of Jenny’s data. As with the previous
charts, large unexpected changes in the undesired zone may indicate the
need to modify the intervention. Additionally, slow or no movement into the
desired zone may also indicate the need to modify the intervention. During
the baseline, abrupt change, specifically undesirable change, may signal a
need to begin the intervention.
Figure 5.5 provides an example of hypothetical data on self-esteem
measured on a scale of 1 to 5, with 1 being the lowest level of self-esteem
and 5 being the highest. Because we want Jenny’s self-esteem to increase,
the desired zone, then, would be above the upper band.
Figure 5.5 X-mR chart based on Jenny’s self-esteem during baseline phase
To create this chart yourself, enter the following using data from the
Jennyab dataset:
Command Purpose
>Xmrchart(esteem, pesteem, “A”, 3, “weeks”, “self-esteem score”, “Jenny’s Self-Esteem”)   Creates SPC chart with three standard deviations
>ABlines(esteem)   Adds line between phases
>ABlines(esteem)   Adds a second line (this design has two phase changes)
>ABtext(“3.12”)   Adds mean text
>ABtext(“5”)   Adds upper band text
>ABtext(“2”)   Adds lower band text
>ABtext(“A”)   Adds text
>ABtext(“B”)   Adds text
>ABtext(expression(A[1]))   Adds text with subscript
>SPClegend()   Adds legend
C Chart
When the outcome is a count of behavior, it is suggested that the C chart be utilized. Examples of behaviors that are counts would be the
number of times a child displays aggressive behavior, the number of times a
child leaves his or her seat, or the number of times a child yells out an
answer.
As an example, you can re-create Figure 5.6 (see p. 100) by entering the
following commands in the Console using the Jennyab dataset:
Figure 5.6 C chart based on Jenny’s yelling during baseline phase
Command Purpose
>Cchart(yell, pyell, “A”, 3, “days”, “Count”, “Count of Yelling”)   Creates SPC chart with three standard deviations
>ABlines(yell)   Adds line between phases
>ABtext(“2.8”)   Enters mean text
>ABtext(“8”)   Enters Uband text
>ABtext(“–2”)   Enters Lband text
>ABtext(“A”)   Enters text for the A phase
>ABtext(“B”)   Enters text for the B phase
>SPClegend()   Creates a legend for the graph
As Figure 5.6 illustrates, all the points are within the upper and lower
bands. Although there are no unusual observations during either the baseline
or the intervention, the yelling was consistently lower after the intervention
was initiated.
Figure 5.7 Line graph of Jenny’s yelling with 1-sd bands based on baseline phase
The graph in Figure 5.7 was created with the following commands:
Command Purpose
SD1(yell, pyell, “A”, “days”, “Count”, “Count of Yelling”)   Creates graph
ABlines(yell)   Adds line between phases
ABtext(“+1sd = 4.06”)   Enters text between quotes
ABtext(“–1sd = 1.54”)   Enters text between quotes
ABtext(“mean = 2.8”)   Enters text between quotes
Figure 5.8 Console output for binomial test of Jenny’s yelling across phases
Another Example
Proportion/frequency can also be used to help track task completion and can
be used as an alternative to SPC p charts. When the target behavior has a
binary outcome (i.e., completing or not completing a task), the proportion of tasks completed can be compared over time or between phases.
One example is a client’s attendance in a group (attendance vs.
nonattendance).
Let’s use Jenny’s group attendance as an example. The behavior variable
group is coded as “1” for attended and “0” for did not attend. We can use the
R table() function to compare the frequency of attendance during the
baseline to the intervention. To do this, enter the following in the Console:
>table(group, pgroup)
From the output in the table in Figure 5.9, we can see that Jenny attended
the group 8 out of 25 (17 + 8) sessions during the baseline compared to 40 of
50 (10 + 40) sessions during the intervention.
The ABbinomial() function can now be utilized to test if the rate of
attendance during the intervention was statistically different from the
baseline. In this example, we would enter the following into the Console
with the output displayed in Figure 5.10:
>ABbinomial(pgroup,“A”,“B”, 8, 40)
>SN(3.934e-12)
The SN() function converts a value reported in scientific notation, such as this p value, into a decimal. The significance value is less than .05, and the null hypothesis can therefore
be rejected. The output shows that there was an increase from a 32% (0.32)
attendance success rate during the baseline to 80% (.8) during the
intervention.
Figure 5.10 Console output for binomial test of Jenny’s group attendance across phases
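For Gloria’s sleep data, begin with a line graph (a sketch using ABplot() with hypothetical label names; the sleep and psleep variables appear in the ABbinomial() call below):
>ABplot(sleep, psleep, “days”, “hours of sleep”, “Gloria’s Sleep”)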
Then create your desired zones with the addition of Gline(), specifying lines
at both 6 and 8 hours of sleep. Follow the prompts in the Console until your
plot looks like the one illustrated in Figure 5.11.
Figure 5.11 Line graph of Gloria’s sleeping with defined goal lines
Notice that by using the Gline() function you are able to customize what
is considered desirable. This function, then, can be used in collaboration
with work with clients to help assess the degree to which they are reaching
goals that are meaningful to them.
We can now count two desirable values in the baseline and six in the
intervention. With this in mind, we can run the ABbinomial() function:
> ABbinomial(psleep, “A”, “B”, 2, 6)
The output found in the Console is illustrated in Figure 5.12 (see p. 106).
Figure 5.12 Console output for binomial test of Gloria’s sleep across phases
The output illustrates that Gloria slept a healthy amount 40% of the time
during the baseline and 6 out of 10 times, or 60% of the time, during the
intervention. While we see an improvement, the p value of .2126 is greater
than .05, so we are unable to reject the null hypothesis.
This is disappointing, but remember that the number of observations in
each phase impacts p values, so do not give up just yet. Perhaps with more
time, statistical significance could be achieved.
Chi-Square
The chi-square (χ2) is a widely used statistic in the helping professions. It is a nonparametric statistic used to test for Type I error when data are categorical. The chi-square assumes that observations are independent; therefore, it should not be used when autocorrelation is problematic. Before deciding to use chi-square or any other test for Type I error, you should test both phases to see if autocorrelation is a problem.
In this example, let’s examine Jenny’s yelling behavior more closely. You
can test two phases using the ABrf2() function to test for autocorrelation of
Jenny’s yelling behavior, by entering the following commands in the
Console, the results of which are displayed in Figure 5.13 (see p. 107):
>ABrf2(yell, pyell, “A”)
>ABrf2(yell, pyell, “B”)
Figure 5.13 Console output assessing autocorrelation of yelling in two phases
The rf2 values in Figure 5.13 display small nonsignificant rf2 values of
0.267 and 0.383, respectively, for the baseline and intervention phases.
Because we met the assumption that the data in both phases are sufficiently
independent, we can proceed with a chi-square test.
Because the chi-square is used for categorical data, a method to create
frequencies is necessary. Creating a desired zone similar to what was done
for the proportion/frequency can be used. Five methods for doing this are
available in SSD for R: the median, the mean, the trimmed mean, the
ordinary least squares (OLS) regression line, and the robust regression line.
You should use whichever method is most appropriate for your situation.
The median or trimmed mean is appropriate when there are outliers and
no trend. If there were a trend, but not severe outliers in the data, it would be
more appropriate to use the OLS regression method. If there were outliers
and a trend, it would be preferable to use the robust regression method.
Finally, if there is no trend and there are no outliers, it would be appropriate
to use the mean. To help you choose which method is most appropriate for
your situation, refer to Appendix C.
To better illustrate the general use of the chi-square methods, we can look
more closely at Jenny’s yelling behavior, the results of which are displayed
in Figures 5.14 (see p. 108) and 5.15 (see p. 109). This example uses the
baseline mean to form the desired zone. We chose this method because there
are no outliers in the data, and there is not a significant trend in either phase.
Because we want the number of yelling episodes to decrease, the desired zone is below the mean line. Enter the following command in the
Console:
>meanbelow(yell, pyell,“A”,“B”)
Figure 5.14 Console output of chi-square test comparing Jenny’s yelling across phases
Figure 5.15 Visual output of chi-square test comparing Jenny’s yelling across phases
The first part of the output in Figure 5.14 displays the frequencies and
percentages. The “[,1]” is the baseline phase, and “[,2]” is the intervention
phase. “FALSE” is above the reference line and represents what is not
desirable, while “TRUE” is below the mean line, or in the desired zone. For
the baseline, eight observations (53.3%) are in the desired zone, and seven
(46.7%) are in the undesired zone. For the intervention, all 17 observations
(100%) are in the desired zone.
The next section shows the frequencies and percentages for all observations across both phases. All of the undesired scores (100%) occurred
in the baseline. Over two thirds (68%) of the observations in the desired
zone occurred in the intervention.
The p value for the chi-square is .001439. Because the p value is less than
.05, the null hypothesis can be rejected. Because the entire table is 2 × 2
(two columns by two rows), Fisher’s exact test is calculated. This test
provides a more precise calculation of significance in small tables with small
cell sizes. In this case, the p value = .001912. Based on this result, we would
conclude that there was a statistically significant higher frequency of scores
in the desired zone during the intervention compared to the baseline.
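These results can be reproduced from the frequencies alone in base R (a minimal sketch; correct = FALSE matches the uncorrected chi-square reported above):
>m <- matrix(c(7, 8, 0, 17), nrow = 2) # rows: FALSE/TRUE; columns: baseline, intervention
>chisq.test(m, correct = FALSE) # chi-square p value = .001439
>fisher.test(m) # Fisher’s exact p value = .001912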
The plot shown in Figure 5.15 (see p. 109) illustrates both baseline and
intervention phases with the horizontal line placed at the mean for the
baseline, denoting the boundary for the desired zone.
If the desired zone were above the mean line, the command would be meanabove(yell,pyell,“A”,“B”). If the median were used to determine the desired zone, the commands would be medbelow(yell,pyell,“A”,“B”) or medabove(yell,pyell,“A”,“B”). If the OLS regression line were used, the commands would be regbelow(yell,pyell,“A”,“B”) or regabove(yell,pyell,“A”,“B”). If the trimmed mean were used, the commands would be trimdbelow(yell,pyell,“A”,“B”) or trimabove(yell,pyell,“A”,“B”). Finally, if robust regression were used, the commands would be robregbelow(yell, pyell, “A”, “B”) or robregabove(yell, pyell, “A”, “B”).
We can use the conservative dual-criteria (CDC) test to determine whether there were significant changes between the baseline and intervention. Since Gloria’s goal was to increase her sleep, we use the CDCabove() function:
>CDCabove(sleep, psleep, “A”, “B”)
As Figure 5.17 (see p. 111) shows, during the intervention, there were six
observations above both the adjusted regression line and the adjusted mean
line. In the Console, you see that noted as the TRUE TRUE condition. The
output in the Console tells you that eight observations are needed in this
desired zone in order to achieve statistical significance, and we are short by
two. While things seem to be going in the right direction, this test reveals
that there are no significant differences between the phases at this point.
Figure 5.17 Output results of CDC test comparing sleep across phases
t-test
The proportion/frequency test, the chi-square, and the CDC are based on
comparing the number of observations in some desired zone in one phase to
another. The t test compares the difference in the means between phases. The null hypothesis is that the difference in means between the phases is zero (i.e., there is no change between phases). If the t test is statistically significant (i.e., p ≤ .05), then we accept the alternative that the means differ. The difference, however, could reflect either improvement or deterioration in the behavior. If higher values are desired,
the difference should be positive, and if lower values are desired, the
difference should be negative.
There are some caveats for the use of the t test. It should not be used if
there is a trend in the data or if either phase has problematic autocorrelation
(Auerbach & Schudrich, 2013).
The t test also assumes equal variation between the phases. If the
variances are unequal, then a t test for unequal variances should be used.
SSD for R produces results for both equal and unequal variances. The
ABttest() function allows you to test the differences in variance between
phases and provides output for both versions of the t test.
As an example, let’s use a simple t test to look at Jenny’s yelling behavior
across phases. Previously, we determined that there was no significant trend
in either phase, and independence of observations can be assumed. Enter the
following command in the Console to get the output that is displayed in
Figure 5.18 (see p. 113) and Figure 5.19 (see p. 114):
>ABttest(yell, pyell,“A”,“B”)
Figure 5.18 Console output results of t-test comparing yelling across phases
There are three tests displayed in the Console panel to the left of the
graphical output: a t test for equal variances (Two Sample t-test), a test for
equality of variances (F test to compare two variances), and a t test for
unequal variances (Welch Two Sample t-test). Because the p value for the test of equality of variances is less than .05 (.002407), we can reject the null hypothesis that the variances are equal and accept the alternative that they are unequal. Therefore, we use the t-test results for unequal variances to determine whether there are statistically significant differences between the baseline and the intervention. Looking at the results in this case, the null hypothesis is rejected because the p value is less than .05 (4.685e-05 or .00004685). Therefore, we accept the alternative that the mean difference between the phases is different from zero. We also observe that the mean went down from 2.8 in the baseline to 0.9411765 in the intervention. We can
therefore conclude that the introduction of the intervention was associated
with an improvement in behavior.
The bar plot depicting this is shown in the Plots pane to the right of the
Console.
Note that any two phases can be compared with t tests. If, for example, there were a second intervention (C), you could compare the mean of that phase against the baseline or against the first intervention (B), but a single t test can compare only two phases at a time. One-way ANOVA is a better choice if you want to compare more than two phases simultaneously.
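In SSD for R, the one-way ANOVA is run with the ABanova() function; a plausible invocation, assuming it takes the behavior and phase variables and using hypothetical names for Gloria’s depression data, would be:
>ABanova(depress, pdepress)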
Figures 5.20 (see p. 115) and 5.21 (see p. 115) display the results of the one-
way ANOVA.
Figure 5.20 Console output results of one-way ANOVA comparing depression across phases
Figure 5.21 Visual comparison of mean depression across phases
The results of the one-way ANOVA indicate that the means between the
three phases are statistically different, and the null hypothesis can be
rejected. Note that the significance value is in scientific notation (2.65e-06).
Next to this value are three asterisks, indicating the p value is less than .001.
Note that the differences in the phase means are reported in alphabetical order. The mean drops 13 points from the baseline to the first intervention phase, drops another 6.8 points when the intervention is intensified, and drops a total of 19.8 points overall.
Because the one-way ANOVA is a single test, it is not known which mean
differences are accounting for the significant finding (B1-A, B2-A, and/or
B2-B1). The Tukey multiple-comparison test accomplishes this. The Tukey
test compensates for the increase in the chance of a Type I error by adjusting
the p values. The results indicate all pairs are significantly different from one
another. By examining the means, we can conclude that Gloria’s depression
improved during the intervention phase and improved further once the
intensity of the intervention increased.
Tests of Type I Error When Autocorrelation Is High
As previously stated, prior to conducting tests of Type I error, we
recommend testing for autocorrelation in both phases. As an example of a
method for dealing with the issue of autocorrelation when comparing phases,
we look more closely at Jenny’s crying behavior.
Begin by using the ABrf2() function to assess for autocorrelation in both
the baseline and intervention phases by entering the following into the
Console after opening and attaching Jenny’s file:
>ABrf2(cry,pcry, “A”)
>ABrf2(cry,pcry, “B”)
The resulting rf2 values are 0.302 (sig = .425) and 1.019 (sig = .00),
respectively. The results indicate that the data for the intervention are highly
autocorrelated, so it would be advisable to see if transforming the data for
this phase reduces or eliminates the issue. Before deciding how to deal with
this, check whether the data trend by entering the following in the Console:
>ABregres(cry, pcry, “A”, “B”)
Since there is also a significant trend in both the baseline and intervention, it
is advisable to use differencing to transform the data.
The steps for accomplishing this are in the following table:
Command Purpose
>diffchart(cry, pcry, “B”)   Runs first difference transformation on intervention data.
Select Y when prompted in the Console.   This saves the transformed data in the place of your choosing.
Enter “tjennyb” as a file name and save it in a location you will remember.   Saves transformed intervention data.
>Getcsv()   Opens transformed data tjennyb.
>attach(ssd)   Attaches data.
>ABrf2(diff, phase, “B”)   Runs autocorrelation on transformed intervention data.
NOTE: When you transform data, the behavior variable is renamed diff, and
the phase variable is renamed phase.
The output from the ABrf2() on the transformed data shows that the rf2 =
0.209 and sig = .434. It appears as if transforming the data was helpful in
decreasing the autocorrelation. In order to proceed we now need to transform
the baseline data using the same method we used for transforming the
intervention data and test the transformed data for autocorrelation. The steps
for accomplishing this are shown in the next table:
Command Purpose
>Getcsv()   Opens jennyab.csv data to transform the baseline.
>attach(ssd)   Attaches data.
>diffchart(cry, pcry, “A”)   Runs first difference transformation on baseline data.
Select Y when prompted in the Console.   This saves the transformed data in the place of your choosing.
Enter “tjennya” as a file name and save it in a location you will remember.   Saves transformed baseline data.
>Getcsv()   Opens transformed data tjennya.
>attach(ssd)   Attaches data.
>ABrf2(diff, phase, “A”)   Runs autocorrelation on transformed data.
The transformation of the baseline data shows that the rf2 = –0.617; however, the autocorrelation that remains is not statistically significant (sig of rf2 = .109).
We now merge the two transformed datasets and use this to examine
whether there are significant differences between baseline and intervention.
To combine the two datasets, enter the commands in the following table:
Command Purpose
>Append()   Opens transformed baseline data first: tjennya. Then open the transformed intervention file tjennyb. Now save the combined file as tjennyab.
>Getcsv()   Opens the tjennyab.csv file.
>attach(ssd)   Attaches data.
NOTE: When you use the Append() function, the behavior variables remain
named diff, and the phase variables remain named phase.
Because there no longer appears to be a significant problem of autocorrelation in either phase of the transformed data, and because we have already determined that the transformed data do not trend in either phase, we can proceed with the t test once the baseline and intervention data are merged.
In order to conduct the t test on the transformed dataset, enter the
following command in the Console:
>ABttest(diff, phase, “A”, “B”)
The results of the t test on the transformed data are presented in Figures 5.22
and 5.23 (see p. 119).
Figure 5.22 Console output results of t-test comparing differenced crying across phases
Figure 5.23 Visual comparison of mean differenced crying across phases
When looking at the test of the equality of variances, we see that the variances are significantly different (p < .05), so we look at the t-test results for unequal variances. When we examine these results, we note that p = .4. As a result, we cannot reject the null hypothesis that the difference in the means between the baseline and intervention is zero. Therefore, we would have to conclude that, statistically, the intervention did not have an impact on Jenny’s crying behavior, although it appears as if change is going in the right direction.
Conclusion
In this chapter, we covered the use of a number of hypothesis tests to test for
Type I error between phases. SPC charts have the advantage of testing for
Type I error while producing graphs that are easily understood. We also
covered other statistical tests, including proportion/frequency, chi-square, the
CDC, the t test, and ANOVA. It is important to consider the conditions under which each of these tests is appropriate to use, and we recommend using the decision trees in Appendix C to help you select the test of Type I error that is most suitable for your data.
In single-subject research, we recommend using hypothesis tests in
conjunction with effect sizes to holistically assess your work with clients.
While statistical tests alone are sometimes used to examine change in all
types of evaluations, using a combination of both will likely lead to better
practical decision-making.
Finally, a method for transforming data and merging datasets was
discussed. Once new datasets are created using the Append() function, they
can be tested for Type I error using the techniques discussed in the chapter.
Chapter Exercise
Answer each of the following questions to determine whether there are
statistically significant differences between Brenda’s baseline and
intervention phase oppositional behavior:
1. Are there issues of autocorrelation in either the baseline or intervention phases? To make your
justification, provide both the rf2 and the significance of the rf2 for each phase.
2. Is there a significant trend that needs to be addressed in either phase? Justify your answer.
3. Conduct an appropriate hypothesis test with justification for why you are using a particular test.
Is there a significant difference in Brenda’s behavior after the introduction of the intervention?
4. What is the effect of the intervention? Choose an appropriate effect size with justification on
why you are utilizing that type of descriptive statistic. Interpret your findings.
5. Assume you are working with Brenda. Based on your findings, what practice decisions would
you make? Justify your rationale with the results of your evaluation.
6
Analyzing Group Data
Introduction
Macgowan (2008, 2012) pointed out the importance of measuring performance in evidence-based group work. A set of functions exists in SSD for R for analyzing measures of group behaviors.
This chapter discusses how to enter data on a group target behavior into
Excel or any other program that can export data to the .csv format. It also
explains how to use the SSD for R functions to analyze group outcomes.
You will need to create several variables in Excel: (a) a group behavior
variable that corresponds to the group behavior that you are measuring; (b) a
phase variable that corresponds to the group behavior variable (the phase
variable will indicate in which phase, baseline or intervention, the
measurement occurred); (c) a time unit variable in which the behavior
occurred (this could represent the group session in which the measurement
was taken, e.g., “pweek”); and (d) a group member variable. This variable
will indicate which member of the group is being measured for each
occurrence.
In order to do this systematically, we recommend giving the group
behavior variable a meaningful name; the phase variable name should be
similar to the behavior variable name. For example, if you are measuring the
degree of group mutual aid, your behavior variable could be named “muaid”
and your phase variable could be named “pmuaid.” In the example presented
in this chapter, we used pweek as our time unit variable to denote that the
groups were conducted weekly. Finally, we used “member” to represent a group participant. Starting in row 2, begin entering your data, one row for each occasion on which the behavior is measured.
IMPORTANT NOTE: When phases change, you will need to enter “NA”
into the row between a change in phases for each variable.
In Figure 6.1, note that we have entered baseline and intervention data for
a mutual aid group. Note that there are eight group members (labeled as
member 1 through 8 in column D) within each week (labeled pweek in
column B) displayed in Figure 6.1. The data are entered in order of the time
unit as this is the order in which data for this group were collected.
Figure 6.1 Example for entering group data
You can also track group members’ individual behaviors in the same
spreadsheet where you are tracking your group behavior variable. These
behaviors are typically not the same as the behavior that is measured for the
entire group, and specific behaviors measured can differ for each group
member. Note in Figure 6.1 the eight behavior variables m1, m2, . . . m8 and
the eight phase variables pm1, pm2, . . . pm8. The column labeled m1 is a
behavior variable for the first group member, and pm1 is a phase variable for
the first group member. All eight members have individual behaviors
recorded. The data were entered using the same steps discussed in Chapter 1.
3. Once your data are entered into Excel, you will need to save it as a “CSV (Comma delimited)”
file or “CSV (Comma Separated Values)” in your ssddata directory. To do this, choose SAVE
AS and choose a name for your file. Do NOT click SAVE, but instead select one of the CSV
options from the drop-down menu for SAVE AS TYPE or FORMAT. After you finish this, you
should click SAVE and close Excel. You may receive several warnings, but you can accept all
of these by selecting CONTINUE.
4. Once you enter your data into Excel, you can import it into SSD for R. First, remember to load
the SSD for R package by typing require(SSDforR) in the Console and pressing the
<RETURN> key. You can now begin your analysis using the Getcsv() function. Once the file is
open, type attach(ssd).
The variable muaid is the group behavior variable, pweek is the time unit
variable, pmuaid is the phase variable denoting whether the measurement
was made during the baseline or intervention, and member is a number
representing the group member being measured. The columns labeled m1
through m8 are the self-esteem measures for each group member. The phase
variable associated with m1 is “pm1,” m2 is associated with “pm2,” and so
on. As previously stated, these represent a target behavior (i.e., self-esteem)
and phase for each of the individual group members. There are 10 weeks of
baseline data and 10 weeks of intervention data.
The first step in analyzing your data is to describe each of the phases. The
typical conventions of single-subject research that use visual analysis and
descriptive statistics (e.g., mean, median, quantiles) are used to accomplish
this; however, because of the nature of group data, some of the functions
designed for individual measures do not adequately describe the data, and
functions especially designed for groups are used in their place.
Descriptive Statistics
It is appropriate to describe the behavior by a time unit variable, which in our example is weeks. Let’s begin by looking at the group behavior by week more closely. A box plot of muaid by pweek can be created using the following function:
>ABdescrip(muaid, pweek)
To make this a bit more descriptive, a line based on the median for the
baseline can be added using the following function:
>Gmedian(muaid,pmuaid,“A”)
Figure 6.3 Boxplot of mutual aid group with baseline median displayed
The values for the mean, median, standard deviation, quantiles, and other
descriptive statistics by week for mutual aid appear in the Console and in
Figure 6.4 (see p. 125).
By examining Figures 6.3 and 6.4, we see data with considerable variation
in both phases, but with more in the baseline. The minimum value is 32, and
the maximum is 79, both of which occurred during the baseline. We also
note that the median value for each week during the intervention is above the
median for the baseline.
Comparing Phases
The ABdescrip() function can also be utilized to compare descriptive
statistics between phases. In our example, we accomplished this by entering
the following command in the Console:
>ABdescrip(muaid, pmuaid)
Figure 6.5 displays a box plot comparing the A (Weeks 1 through 10) and B
(Weeks 11 through 20) phases. The box plot displays a higher median during
the intervention phase. However, consider the results of Figure 6.3, which
shows that the degree of mutual aid had begun to increase during Week 7,
prior to the intervention being introduced during Week 11. This example
shows the usefulness of displaying group data by week, which provides
greater detail than overall phase data, and its use should always be
considered.
Figure 6.5 Boxplot comparing mutual aid between phases
Figure 6.6 shows the output for the phase data displayed on the Console.
From this, we observe that there was a mean increase in mutual aid from
baseline to intervention, and, as displayed by the standard deviations,
variation decreased during the intervention.
Figure 6.6 Descriptive statistics for mutual aid by phase
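To assess autocorrelation and trend in the baseline, the group version of the rf2 function, GABrf2(), is used (a plausible invocation, paralleling the GABttest() call shown later in this chapter):
>GABrf2(muaid, pmuaid, pweek, “A”)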
Note that the time interval variable pweek is included in this group
command. This is because there are multiple measures per week (i.e., the
eight members); therefore, the unit of analysis is the mean of members per
week (in consultation with Huitema, 2013).
Figure 6.7 presented in the graph window to the bottom right displays a
red dot representing the mean for each of the 10 baseline weeks. The results
of the baseline autocorrelation test are shown in Figure 6.8 (see p. 129).
Figure 6.7 Graphical output from GABrf2() function for mutual aid baseline example
Figure 6.8 Test for autocorrelation and trend for mutual aid in baseline
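The intervention phase can be tested in the same way (again, a parallel, plausible invocation):
>GABrf2(muaid, pmuaid, pweek, “B”)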
Figure 6.10 (see p. 131) presents the statistical results of the analysis. The rf2
for the intervention phase is 0.291. Because the sig of rf2 is above 0.05 in our
example (sig = .537), it is not statistically significant. As a result, we can
conclude the intervention data are not autocorrelated. Figure 6.9 displays a
lack of a trend in the data, and the coefficient of –0.1477 [Pr(>|t|) > 0.05]
indicates little change over time. Despite an insignificant p value, however,
we may still want to consider the fit of the data around the regression line
because, again, significance may be hard to achieve, particularly with small
sample sizes.
Figure 6.9 Graphical output from GABrf2() function for mutual aid intervention example
Figure 6.10 Test for autocorrelation and trend for mutual aid in intervention
An effect size comparing mutual aid across the phases can be computed with the Effectsize() function:
>Effectsize(muaid,pmuaid,“A”,“B”)
t-test
The t test compares the differences in the means between phases. The null
hypothesis is that the difference in means between the phases is zero (i.e., there is no change). If the t test indicates statistically significant differences between the phases, that is, p ≤ .05, then we accept the alternative that the means differ. The t test, however, is nondirectional,
meaning that significant p values could indicate either improvement or
deterioration in the observed behavior. If higher values are desired, the
difference should be positive, and if lower values are desired, the difference
should be negative.
As mentioned in Chapter 5, there are some caveats for the use of the t test.
It should not be used if there is a trend in the data or if either phase has
problematic autocorrelation. If the data are autocorrelated, a transformation
of the data can be used to try to remove it. If the problem of autocorrelation
is resolved, the transformed data can then be analyzed using the t test.
The t test also assumes equal variation between the phases. If the
variances are unequal, then a t test for unequal variances should be used. The
group t-test function in SSD for R provides a test for the differences in
variances between phases, which can be interpreted to determine which form
of the t test should be used. Additionally, output for the group t-test function
includes results for both equal and unequal variances, and the appropriate
test results can then be analyzed.
Looking at our example of the mutual aid group, we recall that the
baseline data are not autocorrelated but do have a moderate trend. Although
there is a trend in the baseline, for illustrative purposes, enter the following
command in the Console to get the output that is displayed in Figure 6.12:
>GABttest(muaid, pmuaid, pweek, "A", "B")
Figure 6.12 t-test comparing baseline and intervention for mutual aid group
Note that the time interval variable pweek is included in the command. This
is because there are multiple measures per week (i.e., the eight group
members); therefore, the unit of analysis is the mean of the members for
each week.
There are three tests displayed in the Console panel to the left of the
graphical output: a t test for equal variances at the top of the Console, an F
test to compare the variances in the middle, and a t test for unequal variances
at the bottom. To begin, look at the p value for the F test to compare the
variances. Because the p value for the test of equality of variances in our
example is less than .05 (.006367), we reject the null hypothesis that the
variances are equal.
Therefore, we use the t-test results for unequal variances to determine
whether there are statistically significant differences between the baseline
and intervention phases.
Looking at the results for unequal variances, located at the bottom of the
Console, we reject the null hypothesis because the p value is less than .05
(.0006419). Therefore, we accept the alternative that the mean difference
between the phases is greater than zero. We also observe that the mean went
up from 54.8125 in the baseline to 60.7125 in the intervention.
Proportion/Frequency
If you are using task completion data or any binary outcome, the
ABbinomial() function can be utilized to test for statistical significance
between phases. This command is discussed at length in Chapter 5. No
adjustments are needed to this function for group data.
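As a sketch, assuming the argument pattern described in Chapter 5 (the
phase variable, the two phase letters, and the number of successes observed
in each phase), a hypothetical call might look like the following, where
ptask is a hypothetical phase variable for task completion data and the
counts are illustrative:
>ABbinomial(ptask, "A", "B", 4, 9)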
The following commands were used to create and annotate the X̄-R chart for
the group data:
Command Purpose
>XRchart(muaid, pweek, 2, "week", "Level of Mutual Aid", "Group Mutual Aid") Creates graph using two standard deviations
>ABlines(muaid) Adds line between phases
>ABtext("57.763") Enters mean text
>ABtext("64.605") Enters Uband (upper band) text
>ABtext("50.920") Enters Lband (lower band) text
>SPClegend() Creates a legend
NOTE: Once you create a legend, you will no longer be able to alter the
graph. Therefore, we add this as a last step in creating SPC charts. Be sure
that your graph looks exactly as you would like before adding the legend.
In this example, the desired zone would be above the upper band (64.605)
since the group’s goal is to increase the level of mutual aid. Although 9 of 10
weeks during the intervention are above the mean, none of the intervention
observations is in the desired zone.
In terms of practice decisions, the group facilitator would probably want
to continue measuring this process because it seems that the intervention is
going in the desired direction. To increase confidence, continued tracking
would be recommended.
Using R Charts to Assess Variability of Group Data
As discussed in Chapter 4, R charts can be used to detect changes in
variation over time. As these SPC charts use samples, they are appropriate
for use with group data. In the example of our mutual aid group, it is most
appropriate to use the standard deviation form of the R chart, which is
illustrated in Figure 6.14 (see p. 137). To create this chart with two standard
deviations, the following command was entered into the Console:
>Rchartsd(muaid, pweek, 2, "week", "standard deviation", "Group Mutual Aid")
The graph was further annotated by noting values for the mean, upper, and
lower bands using the ABtext() function and adding a legend in the same
way that the X̄-R chart was annotated, above, with the SPClegend()
function.
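As a sketch, the annotation commands parallel those shown for the X̄-R
chart; the quoted values are placeholders to be read from your own chart:
>ABtext("mean value")
>ABtext("upper band value")
>ABtext("lower band value")
>SPClegend()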
Figure 6.14 R-chart using standard deviations for mutual aid group
In looking at the output, we see that the data became more stable in the
second half of the baseline and throughout the intervention. Combined with
our previous analysis, this could help us conclude that, over time, the
group’s level of mutual aid was moving in the desired direction, and that, as
a whole, the overall level of mutual aid was beginning to become less
variable between group members.
Individual Change
As mentioned, the individual change of group members can also be
measured with SSD for R. In our hypothetical example, the group members
used a self-anchoring scale weekly to rate their self-esteem. The scale ranged
from a low of 1 to a high of 10. The measured score for each group member
is contained in the columns labeled m1 to m8. The corresponding phase
variables for these are contained in columns labeled pm1 to pm8.
Measured scores for these individuals can be presented in multiple line
graphs that are displayed simultaneously. This is helpful since all the
members’ progress can be visualized at once. An example of this type of
graph is presented in Figure 6.15.
Figure 6.15 Individual self-esteem scores for group members
The first step in creating the graph is to set up the graphic environment
using the plotnum() function, which provides parameters for the desired
number of rows and columns. We want to arrange our eight graphs as four
rows of two columns, so we enter the following command into the Console:
>plotnum(4,2)
Now, each graph can be added to the graphic environment using the
ABplotm() commands seen in the next table. Note that this function is
different from the ABplot() function:
NOTE: Once you create an individual graph, you will be able to alter and
annotate the graph until the next graph is added. Be sure that your graph
looks exactly as you would like it before adding another. If the following
error occurs, “figure margins too large,” you will have to increase the size of
the Plots pane. If this does not solve the issue, reset the Plots pane by issuing
the following R command in the Console: dev.off(). If you do this, be aware
that any graphs currently in the pane will be removed.
Command Purpose
>ABplotm(m1, pm1, "weeks", "self esteem", "member 1") Creates graph
>ABlines(m1) Adds line between phases
>ABplotm(m2, pm2, "weeks", "self esteem", "member 2") Creates graph
>ABlines(m2) Adds line between phases
>ABplotm(m3, pm3, "weeks", "self esteem", "member 3") Creates graph
>ABlines(m3) Adds line between phases
>ABplotm(m4, pm4, "weeks", "self esteem", "member 4") Creates graph
>ABlines(m4) Adds line between phases
>ABplotm(m5, pm5, "weeks", "self esteem", "member 5") Creates graph
>ABlines(m5) Adds line between phases
>ABplotm(m6, pm6, "weeks", "self esteem", "member 6") Creates graph
>ABlines(m6) Adds line between phases
>ABplotm(m7, pm7, "weeks", "self esteem", "member 7") Creates graph
>ABlines(m7) Adds line between phases
>ABplotm(m8, pm8, "weeks", "self esteem", "member 8") Creates graph
>ABlines(m8) Adds line between phases
Conclusion
In this chapter, we discussed how you could use SSD for R to assess group
functioning. There are functions to compare change over time or between
phases. Using the group social interaction example data, you learned the
functions necessary to calculate these statistics. These functions generate
supporting graphs, box plots, SPC charts, and individual line graphs. SSD for
R also provides methods for computing group autocorrelation, group effect
size, and group t tests.
Chapter Exercises
Assignment 6.1—Create a Group Spreadsheet and Import It to
RStudio
A social worker serving mental health clients is assigned a group of four
clients having difficulty adapting to transitional housing because of the
quality of their activities of living (ADL) skills, such as cleaning, laundry,
maintaining their residences, shopping for food, preparing meals, paying
bills, and taking public transportation. The worker decides to start a group
and provide an intervention to teach the clients ADL skills. She measures
their ADL skills at each session with a scale ranging from 1 (poor) to 10
(excellent).
For this assignment, you will create a spreadsheet with the group’s
baseline and intervention data. BE SURE TO STORE THIS
SPREADSHEET IN A PLACE YOU CAN ACCESS LATER, as you will
use it for future homework assignments. The data you need for this is in the
following table:
ADL Score Week of Service Phase (A or B) Member ID
4 1 A 1
4 1 A 2
6 1 A 3
3 1 A 4
2 2 A 1
3 2 A 2
4 2 A 3
2 2 A 4
7 3 A 1
3 3 A 2
2 3 A 3
3 3 A 4
3 4 A 1
4 4 A 2
5 4 A 3
3 4 A 4
2 5 A 1
2 5 A 2
4 5 A 3
6 5 A 4
NA NA NA NA
5 6 B 1
7 6 B 2
8 6 B 3
4 6 B 4
6 7 B 1
5 7 B 2
7 7 B 3
6 7 B 4
6 8 B 1
7 8 B 2
6 8 B 3
5 8 B 4
7 9 B 1
6 9 B 2
7 9 B 3
6 9 B 4
6 10 B 1
7 10 B 2
8 10 B 3
6 10 B 4
7 11 B 1
6 11 B 2
5 11 B 3
7 11 B 4
8 12 B 1
6 12 B 2
7 12 B 3
5 12 B 4
6 13 B 1
7 13 B 2
7 13 B 3
7 13 B 4
8 14 B 1
6 14 B 2
7 14 B 3
6 14 B 4
7 15 B 1
7 15 B 2
8 15 B 3
8 15 B 4
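Saved as a .csv file, the first rows of this spreadsheet might look like the
following (the variable names here are illustrative; any valid column names
will do):
adl,week,phase,member
4,1,A,1
4,1,A,2
6,1,A,3
3,1,A,4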
7
Introduction to Meta-Analysis
Throughout this book, we have discussed methods for evaluating the degree
to which change can be observed and interpreted with the introduction of
intervention(s) to help remediate some identified problem(s). The findings
from these evaluations should inform your work with individual client
systems, but sometimes, we might want to know more.
Oftentimes our work involves many client systems with similar presenting
problems, and we might use similar intervention techniques to help address
those problems. We also know that some client systems respond better to
intervention than others. So, we might wonder, given clients with a
particular type of presenting problem (e.g., depression), how much change
could we expect to see if we use a particular intervention (e.g., cognitive
behavioral therapy)? And, does that change more generally make meaningful
differences in the lives of our clients?
Meta-analytic techniques can be used to aggregate evaluation results
across studies. In the case of single-subject research designs, we could
combine findings from evaluations with 5, 10, or 20 clients to determine, on
average, how effective an intervention is. We can combine effect sizes
discussed in Chapter 4 to report the direction and magnitude of change
across all those clients (i.e., individual studies). This is a more complex and
sophisticated way of understanding differences across studies than reporting
those changes qualitatively or simply reporting the individual effect sizes for
each study.
Meta-analysis is, in essence, a study of studies. It is most frequently used
to understand average effects of interventions across research in which group
designs are used, and it is predicated on the assumption that individual
studies are independent of one another. Therefore, it is only appropriate to
use meta-analytic techniques with different clients. That is, do not conduct a
meta-analysis with the same client across different outcomes.
In this chapter, we talk more about why meta-analysis is important to
consider in single-subject research, and then we demonstrate how to do this
using SSD for R functions.
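The meanNAP() function calculates the mean NAP effect size across saved
studies. Assuming it follows the same argument pattern as the meanES()
function shown later in this chapter, with the effect size variable and label
as the first two parameters and a main title (illustrative here) as the third, a
plausible form of the command is:
>meanNAP(napES, Label, "Mean NAP")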
When entering this command, you will always use napES and Label as
the first two parameters you enter unless you manually change these in your
.csv file prior to loading it. Figure 7.1 demonstrates that the mean NAP for
all seven students is 0.79 (standard deviation [sd] = 0.19). Looking at the
key below the computed output provides information on the interpretation of
the calculated values. An average effect of 0.79 is understood to be a
moderately effective degree of change.
Figure 7.2 provides additional information. The output in this graph orders
studies (in this case students) from the lowest effect size at the base of the
y-axis to the highest at the top. We can see here that Students 3 and 5 did not
really improve since their NAP scores fell below the 0.67 mark, which is the
threshold for minimal effectiveness. Their NAP scores were much less than
the remainder of students included in the study, all of whose effect sizes
were greater than 0.8. This is valuable information, and if we were the
teachers in this classroom, we might wonder in what ways Students 3 and 5
might differ from those who saw greater effects. This could provide us the
beginnings of insights on how we might want to change interventions based
on characteristics of more than one study participant.
Using Traditional Effect Size Statistics to Conduct Meta-
Analysis
In Chapter 4, we discussed traditional effect sizes such as ES, d-index, and
Hedges' g. If these effect sizes are available for the studies to be included in
a meta-analysis, it may be desirable to use them. Reasons for this, according to
Shadish and colleagues (2014), include being able to integrate these studies
more easily into comparisons with between-subject designs and making use
of well-known meta-analytic tools, such as forest plots and publication bias
analyses. Additionally, tools provided in SSD for R allow for the
introduction of a moderating variable in this analysis.
To illustrate, consider a child welfare agency that provides services to
parents whose children are at risk for foster care placement due to
allegations of neglect. The agency has recently developed an in-home
intervention for which they want to receive a grant from the state. The state,
however, will only consider funding the program if it can be shown that
participation is effective at keeping families intact. The agency tracked 14
clients who participated in the program using observational measures of
parenting before and during the intervention. Effect sizes have been
computed for all 14 using the techniques described in Chapter 4, and they
have been placed in the file Parenting intervention meta.csv that comes with
this text.
Use the Getcsv() and attach(ssd) functions to load these data. If you click
on the spreadsheet icon, you will see that there have been two additional
columns added to what was developed by saving and appending the effect
size file you learned how to create previously. The first is a variable called
Support, which assesses the level of social support available to the parent on
a scale ranging from a low of 1 (no social supports available to the parent) to
a high of 10 (the parent has many supports available to them). Additionally,
the variable Single denotes whether or not the parent has a live-in partner.
Zero indicates having a partner, and one indicates being single.
Enter the following in the Console to view the mean effect size across all
studies, the results of which are displayed in Figures 7.3 and 7.4 (see p. 148):
>meanES(ES, Label, "Mean Parenting Skills")
Figure 7.3 Console output results of meanES() function
When entering this command, you will always use ES and Label as the
first two parameters you enter unless you manually change these in your .csv
file prior to loading it.
The output shown in these figures looks similar, although not identical, to
what you saw previously. The mean effect calculated is 1.33 (sd = 0.99) and,
according to the key provided underneath this output, demonstrates a
medium effect since this value is between 0.87 and 2.67. Additionally, we
see that, on average, parents saw a 40.77% improvement in parenting skill
scores after the introduction of the intervention.
Figure 7.4 shows the effect sizes from smallest to largest. Here, we see
that Clients 1, 4, and 13 showed (in decreasing order) the most
improvement, while Clients 12, 8, 11, 3, and 14 showed (in decreasing
order) the least. Evidence of overall improvement in parenting skills makes a
compelling case for potential funders who want some assurances that
services they are paying for will actually help clients keep their children at
home.
To continue our analysis, you will want to conduct a more robust
investigation by using the metareg() function as it weights each study based
on the variance of the ES within each study. This function assumes that the
variance between studies is greater than zero, and that studies included in the
meta-analysis are a sample drawn from a larger universe. Therefore, we
assume that differences observed between studies are not due to sampling
error but reflect real differences in the population. The information gleaned
from this function will be more substantial and useful. Begin by entering the
following in the Console:
> metareg(ES, V)
The output for this function is displayed in Figures 7.5 (see p. 149) and
7.6 (see p. 150). Figure 7.5 shows that the summary ES for these 14 studies
is 0.94 and is displayed as the estimate. The interpretation of this estimate is
based on Cohen’s guidelines, with 0.2 indicating a small effect, 0.5
indicating a medium effect, and 0.8 indicating a large effect. A summary ES
of 0.94 indicates nearly a 1-SD improvement from baseline to intervention,
which is considered a large effect. The standard error for the ES, the z
statistic, as well as the upper and lower bounds of the 95% confidence
interval are the additional information provided on this line. The p that is
displayed is the p value for the summary statistic. In this case, p < .05, so it
is statistically significant.
Figure 7.5 Console output results of metareg() function
The output from the metareg() function also gives us information about
the heterogeneity, or the variability, of the studies included in this analysis.
The QE value is the Q statistic, and its associated p value (QEp) tells us
whether the heterogeneity between the studies is statistically significant.
The forest plot displayed in Figure 7.6 (see p. 150) illustrates this
heterogeneity, or variability, across these studies. Here, we can see that some
studies show a very small effect that is not statistically significant (Studies
14 and 3, for instance), while others show much larger effects that are
statistically significant but may have wide confidence intervals (Studies 1
and 9, for example). Notice that some studies’ ESs (such as Studies 3 and
14) are represented by larger squares, while others are represented by
smaller squares. The size of the squares denotes the precision of the ES, with
larger samples having more precision. The diamond at the bottom of the
forest plot denotes the summary effect (0.940), and the length of it represents
the 95% confidence interval, ranging from 0.439 to 1.442.
To examine whether a study characteristic helps explain differences in effect
sizes across studies, a moderator can be added to the model with the
metaregi() function. Here, we add the Support variable as a moderator:
>metaregi(ES, Support, V)
In the output shown in Figure 7.7, notice that there is a new line, mods,
that includes any moderator variables added to the model. The coefficient
for Support in this case, however, is not significant, since p = .353 is greater
than the .05 threshold. We therefore ignore this coefficient, discard the model, and
conclude, in this case, that social support does not improve our
understanding of the ESs of the included studies.
Since heterogeneity has not changed or improved, we can consider the
variable Single, since single parents often have a more difficult time
parenting. The output displayed in Figures 7.9 and 7.10 (see p. 153) is the
result of the following being entered in the Console:
>metaregi(ES, Single, V)
Figure 7.9 Console output results of metaregi() function for single as a moderator
Figure 7.10 Visual output results of metaregi() function for single as a moderator
Conclusion
Being able to combine single-subject research designs is particularly useful
in both practice and research settings. From a practice perspective, it
provides more information about what practitioners might be able to expect
from clients as a result of implementing specific interventions. In the
examples provided, we were able to identify clients who responded to
interventions better than others. We also found that being single negatively
impacted parenting skills across studies. In this way, we might be able to
begin identifying the types of clients who might respond more successfully
to specific ways of working than others. From a research perspective, being
able to conduct meta-analyses of single-subject research designs adds to the
knowledge base of what is known about particular interventions by
providing rigor that is not attainable by analyzing one study at a time.
In this chapter, we covered two methods for analyzing multiple single-
subject research studies. The meanNAP() function is an extension of the
nonparametric NAP method and can be used when data in studies trend or
may have problems of serial dependency. The metareg() and metaregi()
functions make use of traditional effect sizes that are used in group research
designs; however, they cannot be used in cases where serial dependency is
problematic or data in individual studies trend. The advantages to these
functions, though, are that they are more easily understood in the larger
research community, they can be used in larger meta-analyses to build
research evidence around the effectiveness of specific interventions, and
moderator variables can be included.
8
Using RMarkdown to Present Your Findings
Introduction
In this chapter you will learn how to utilize RMarkdown to present SSD for R
findings in a well-ordered and reproducible manner. RMarkdown is a plain
text formatting syntax that makes writing research reports simple. The
language provides a simple syntax that formats text such as headers, lists,
boldface, and so on. This language is popular, and you will find many apps
that are compatible with it. For example, combined with other packages, like
SSD for R, users can easily create tables and graphics to present their
research findings. Another important feature of this markdown language is
that it will make your findings reproducible in that all of your files are
connected. Thus, if there are changes to your data, rerunning the analysis is
simple. As Baumer and Udwin (2015) suggested, an RMarkdown document
links computation, output, and written analysis to enhance transparency,
clarity, and ease to reproduce the research. Furthermore, sharing data is only
a click away (Baumer & Udwin, 2015).
As mentioned, RMarkdown can save and execute SSD for R code and
create high-quality reports from a single file. Doing so provides greater
insight into your results and clearly displays what you did to analyze your
data. Furthermore, RMarkdown makes it possible for other
clinicians/researchers to replicate findings. RMarkdown is a straightforward
language that allows you to create documents with headings, text, images,
margins, and more to enhance the communication of the findings produced
by SSD for R. Documents created in RMarkdown can be saved in different
file formats, such as doc, PDF, or HTML.
Getting Started
If you are using a stand-alone version of RStudio, the installation of a LaTex
package is required to create professional-looking files in PDF format.
LaTeX is already installed on RStudio Cloud.
To create a Word file, the installation of MS Word is required. For a
Windows computer, MiKTeX is recommended and can be installed from the
following URL: http://miktex.org/download. For a Mac, MacTex 2013 + is
recommended and can be installed from the following URL:
https://tug.org/mactex/mactex-download.html. When you create a report
utilizing RMarkdown, it will automatically convert the report to a PDF or
Word file without your direct interaction with the LaTeX software you
installed. Once again, LaTeX is already installed on RStudio Cloud.
The next step is to open RStudio or RStudio Cloud and type the following
commands in the Console one at a time to install the necessary packages to
create a report in PDF or Word format:
>install.packages("rmarkdown")
>install.packages("knitr")
>install.packages("kableExtra")
Once this is done, you are ready to work through an example. The
example will focus mostly on creating an MS Word document. The first step
is to create an RMarkdown file (.Rmd). In RStudio, select File/New File/R
Markdown, and the menu in Figure 8.1 will be displayed. Select Create
Empty Document and a blank untitled RMarkdown document will appear in
the top left pane.
Figure 8.1 Markdown menu
The next step is to create a YAML header, where you can configure your
RMarkdown file. This portion contains the title, author, and the type of
output format you desire. This portion of the file starts and ends with ---. Here is an
example:
---
title: "SSD for R"
author:
- Charles Auerbach
- Wendy Zeitlin
subtitle: An R Package for Analyzing Single-Subject Data
output:
word_document: default
pdf_document: default
---
In this case, there are two authors, and each author's name begins with a
hyphen (-) followed by a space. Also, note that under the output section,
both Word and PDF documents are entered, allowing for substituting
between document types.
Below the YAML is where different code chunks begin, which contain
various R functions from SSD for R. For example, the code chunk below will
open the jennyab.csv file, which can be downloaded from the authors’
website (https://www.ssdanalysis.com) or the publisher’s website. The data
file must be placed in the same directory as the .rmd file.
```{r openfile,include=FALSE}
knitr::opts_chunk$set(comment = NA)
require(SSDforR,warn.conflicts=F, quietly=T)
ssd <- read.csv("Jennyab.csv")
attach(ssd)
```
Each code chunk begins and ends with three backticks (```). Inside the
curly brackets ({}), the r indicates the chunk will include R code. Each chunk
requires a unique name, and in this example, the chunk was given the name
openfile. After the comma (,), options can be added for the chunk. Using the
option include = FALSE will suppress both the code and its output for the
chunk in the output document. Using echo = FALSE will suppress only the
code, and results = "hide" will suppress only the results; however, we didn't
include those options in this example as we want those displayed. The code
require(SSDforR,warn.conflicts = F, quietly = T) loads the SSD for R
package. You can use this syntax in all RMarkdown code that requires the
use of a package by replacing the package name, in this case SSD for R, with
another package name. The knitr::opts_chunk$set(comment = NA) will
improve the look of output by removing comment marks (#). The ssd <-
read.csv("Jennyab.csv") syntax reads the Jennyab.csv file into the data frame
ssd so it can be accessed using the syntax attach(ssd). This code can be altered
to analyze other datasets by simply changing the name of the file within the
quotation marks.
Let’s add a chunk of code to produce an ABplot() of Jenny’s yelling
behavior to the script. Be sure to add four line returns after the ending ```. To
create a page break in your document between the title page and the graph
output, a \pagebreak command is placed in the middle of the white space
between the chunks. Below is the code to create the graph using
RMarkdown. Once again, the best way to obtain the coordinates for the
commands that follow is to create the graph in RStudio and then apply them
to the commands in the chunk.
```{r ABplot,echo=F,results="hide"}
ABplot(yell, pyell, "Days", "Amount", "Jenny")
RMlines(yell, 15.5)
RMtext("A", 5, 6)
RMtext("B", 25, 6)
RMtext("argument", 2.5, 5)
RMarrow(5.9, 5, 9.5, 5.1)
RMstat(yell, pyell, "A", "mean", .1)
```
Once again, the chunk begins with a title and the following options: echo =
F and results = "hide". Setting echo to "F" prevents the code from being
printed in the document, and setting results to "hide" prevents the printed
results from appearing; the ABplot() graph itself will still appear in the
output document. The functions that follow annotate the graph. The
RMlines() function will place a vertical line at the x coordinate 15.5; the
first RMtext() will place an "A" at the x coordinate 5 and the y coordinate 6;
the second RMtext() function will place a "B" at the x coordinate 25 and the
y coordinate 6. The final RMtext() function will place the text "argument" at
the x coordinate 2.5 and the y coordinate 5; the RMarrow() function will
draw an arrow from the text "argument" to the 10th observation. Finally, the
RMstat() function places a mean line in the baseline.
To create the document, we can knit the .rmd file to create the MS Word
document to view the graph. When you knit a file, your RMarkdown
document is automatically saved. In order to do so, Word must be installed
on your computer. As displayed in Figure 8.2, in the top left pane click on
the down arrow next to knit and select MS Word.
Figure 8.2 Knit menu for RMarkdown
The output will be displayed in a Word file. If the file is read only, use
File/Save As to save the file under a new name. If you are running RStudio
Cloud, when the output is created, a message will appear prompting you to
download the file to your local machine. You will then be able to modify the file. Figure
8.3 displays the result that will be produced in a Word document.
Figure 8.3 ABplot() of Jenny's yelling behavior
You can also add text to annotate your findings. For example, the
following can be included by pressing return four times after the ending ```.
**Introduction**
The School-Based Support Team recommended that Jenny receive Cognitive Behavioral
Therapy (CBT) at the Child Help Center to reduce or eliminate maladaptive and
inappropriate behaviors. The primary goals were the development of self-control and
problem-solving strategies. The acquisition and internalization of these skills provide the
means for the child to regulate her behavior.
**Visual Analysis**
For data analysis, visual analysis was used. Results indicated that the trend in the
client's behavior scores was descending in the intervention phase (B), indicating
improvement. Figure 1 illustrates the results from the AB design.
\pagebreak
Note the double asterisks (**) around the subtitles "Introduction" and
"Visual Analysis," which bold them. You add a page break by placing the
\pagebreak command on its own line after the final sentence of the section's
explanation.
Each phase was tested for autocorrelation by inserting the chunks that
follow:
```{r autoA,echo=T}
ABrf2(yell, pyell, "A")
```
```{r autoB,echo=T}
ABrf2(yell, pyell, "B")
```
Notice that once again, the option echo was set to T to display the SSD for R
code in the document. To view the results of the autoA and autoB chunks,
you will need to knit the .rmd file. Figures 8.4 and 8.5 show the resulting
output from the autoA chunk for the baseline autocorrelation. Figures 8.6
(see p. 162) and 8.7 (see p. 162) display the results for the intervention
autocorrelation.
Figure 8.4 Baseline ABrf2() results
Figure 8.5 Baseline ABrf2() graph
Figure 8.6 Intervention ABrf2() results
Figure 8.7 Intervention ABrf2() graph
The next chunk will produce the results of a t test comparing the A and B
phases. Note the title of the chunk is ttest and the option echo = T is added
to include the syntax in the output. The syntax options(scipen = 999) is
added to remove scientific notation from the output. The SSD for R syntax
ABttest(yell,pyell,“A”,“B”) performs the t test.
```{r ttest,echo=T}
options(scipen = 999)
ABttest(yell, pyell, "A", "B")
title(main = "Figure 2: Mean Differences")
```
Once again, knit the .rmd file as displayed in Figure 8.2, and changes
made to the file will be automatically saved. Figures 8.8 (see p. 163) and 8.9
(see p. 164) display the result of the t test. Note that because echo = T was
entered as an option, the syntax is also included in the document. The title()
function is included to add the title inside the quotation marks to the mean
bar plot produced by the ABttest() function. Notice that the "t" in t-test is
placed between asterisks, which will italicize the characters between them.
Figure 8.8 Results from ABttest() function
Figure 8.9 Graph from ABttest() function
The final chunk will add the results of an effect size. Once again, knit the
.rmd file as displayed in Figure 8.2, and changes made to the file will be
automatically saved.
```{r ES,echo=T}
Effectsize(yell,pyell,“A”,“B”)
```
Again, you can also add text to annotate your findings and conclusions. For
example, the following can be included by pressing return four times after the
ending ```.
**Statistical Analysis**
Both phases were tested for autocorrelation. The findings indicate that the degree of
autocorrelation was small enough in each phase (rf2 = 0.267 and rf2 = 0.383, respectively)
to conduct a *t*-test for statistically significant differences between phases. A Welch
correction was used to compensate for unequal variances. The mean number of yelling
episodes per day decreased from 2.8 to .94 from baseline to intervention. Figure 2 shows
these differences, which are statistically significant (*t* = 5.26, *p* < .001). The Cohen's
*d* of 1.95 indicates a moderate degree of change between phases.
**Conclusion**
The School Based support team is proposing removing the intervention to form an A-B-A
design, which is sometimes referred to as an experimental removal of intervention design.
The design is experimental because it consists of testing whether the introduction of an
intervention was likely to have caused changes in the target behavior as a result of the
child’s integration of problem-solving skills to regulate her behavior.
To view the results of the ES chunk along with the additional narrative, knit
the .rmd file once again. The results are shown in Figure 8.10 (see p. 165). If
the MS Word document is read only, it will need to be saved under a new
file name before it can be edited.
Conclusion
This chapter provided an introduction to developing RMarkdown
documents to effectively present your SSD for R findings. A number of
helpful texts and websites on using RMarkdown are listed in Appendix D;
these provide more in-depth explanations of formatting text, tables, and
figures.
Chapter Exercise
Assignment 8.1—Create an RMarkdown Report on Evaluating
Your Practice With Brenda
For this assignment, you will use Brenda’s baseline and intervention data to
create a final report describing your practice evaluation. You will utilize
RMarkdown to present SSD for R findings of the Brenda case example in a
well-ordered and reproducible manner. You can modify the RMarkdown
example included with the text to create your document.
1. Create a YAML header.
2. Create a chunk to load SSDforR, open the Brenda baseline/intervention file, and attach it.
3. Create a chunk to recreate an ABplot() you created in Assignment 4.1.
4. Add an introduction.
5. Create a chunk to display your findings from Assignment 4.3.
6. Create a chunk to display the results on the effect size you utilized in Assignment 4.3.
7. Create a chunk to display the appropriate hypothesis test used in Assignment 5.3. Also add a
description of your statistical analysis.
8. Finally, add a conclusion.
9
Building Support for Practice Research
Introduction
We recognize that practice research does not occur in a vacuum, and those
desiring to do this type of evaluation work in many different types of
settings. Some practitioners work in solo practices, while others may be
employed in large multisite, multiprogram organizations with thousands of
employees and clients at any given time. Therefore, the notion of engaging
in practice research and the ability to do so should be considered within the
context in which one works, as flexibility, resources, and work demands can
vary from organization to organization.
In this chapter, we discuss common issues that arise when organizations,
whether small or large, wish to engage in practice research, and we provide
some suggestions for addressing these.
It has been widely recognized that engaging in practice research is an
essential component of good and responsible practice in many disciplines
(Shaw, 2011). While we discussed the benefits of this type of research
previously, practice research findings can also be used at an administrative
level to improve existing programs and plan for future services. Engaging
in practice-based research can also be used to develop organizational
policies and procedures and aid in decision-making about the allocation of
resources (Preskill & Boyle, 2008).
We believe, then, that the importance of practice research cannot be
overstated. Despite this, many wishing to engage in practice research often
face challenges, including issues related to the practical ability to conduct
research, often referred to as research capacity.
Conclusion
The focus of this book has been on single-subject research design,
methodology, and analysis; however, in order to make this, or any type of
research, accessible to organizations and practitioners, a clear plan for
conducting practice research should be developed.
This chapter was written to give you the opportunity to consider the
context of your organization along with what may be required to incorporate
practice research into your organization’s core activities. This chapter was
not meant, however, to be an exhaustive discussion of building research
capacity; rather, it was designed to provide readers a broad understanding of
factors that should be considered when attempting to implement research in
a practice environment.
Common obstacles to conducting research in practice settings discussed
in this chapter include administrative factors, work demands placed on
practitioners, the availability of research knowledge and skills, and the
research tradition of some professions. To address these barriers,
recommendations have been developed. These involve building support and
demand for practice research by increasing its value to stakeholders,
developing or providing access to research skills, and providing the
infrastructure necessary to conduct practice research.
In this chapter we discussed the importance of including practitioners in
the process of building research capacity. It should be emphasized that, in
order to increase the chances of success, capacity building must be
collaborative. All activities should include representation from all employee
groups that will either participate in the research process or will be
consumers of research. Organizational change of any type needs to consider
the needs of all involved (Choi & Ruona, 2011).
With the publication of this book, we hope that we have begun to address
some of the barriers identified in this chapter.
APPENDIX A
Note that the behavior variable cry is followed by a "<" (less-than sign) and
a "-" (dash), which together are used in R synonymously with an equal sign
(=). Also note the lowercase "c" before the open parenthesis and that a comma
follows each observation entry. Note that for the phase variables, pcry and
pesteem the letters A, B, and B1 are enclosed in quotes. Finally, notice that
the same number of elements are entered for the behavior variables and their
corresponding phase variables. Additionally, the NA, inserted as described in
Chapter 1 to denote a change of phases, is located in the same position in the
behavior variable and its corresponding phase variable. For example, cry is
made up of a total of 22 elements, as is the corresponding phase variable,
pcry. We also note that phases change for both the behavior and phase
variables after six occurrences of the baseline, and six occurrences of the
first intervention phase. After you enter the close parenthesis, press the
<ENTER> key.
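As a sketch, commands consistent with this description might look like the
following; the observation values are hypothetical, but the structure (22
elements, with NAs marking the phase changes after the baseline and after
the first intervention phase) matches the description above:
>cry<-c(3,4,2,5,3,4,NA,2,3,1,2,2,1,NA,1,0,1,0,0,1,0,0)
>pcry<-c("A","A","A","A","A","A",NA,"B","B","B","B","B","B",NA,"B1","B1","B1","B1","B1","B1","B1","B1")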
After the two behavior and two phase variables are entered, the following
command can be utilized to create a "data frame" that can be saved as a
".csv" file for later use:
>ssd<-data.frame(cry,pcry,esteem,pesteem)
To view the data frame, shown in Figure A.1 (see p. 176), type the following
command:
>ssd
Note how the “NA”s line up for each behavior variable and its
corresponding phase variable. Also notice that each element is numbered on
the far left with values ranging from the first (“1”) to the last (“22”).
Now the data frame can be saved as a .csv file by using the following
function:
>Savecsv()
Figure A.1 The ssd data frame with four variables.
Remove the data frame from memory using the following command:
>rm(list = ls())
The data you just saved can be opened using the Getcsv() function and then
attach(ssd). It can also be opened and modified using Excel.
When you do this, the “R Data Editor” will open, which allows you to make
changes to your data in a spreadsheet-type format, as shown in Figure A.2
(see p. 177). After changing any data in the editor, remember to save it using
the Savecsv() function.
Figure A.2 The ssd data displayed in R Data Editor
List variables
Description: Lists all variables in active data frame.
Command: listnames()
Ex: listnames()
Graphing Functions
Create a line graph
Description: This function builds a simple line chart for a given
behavior across all phases. A space separates each phase.
Command: ABplot(behavior, phaseX, ABxlab, ABylab,
ABmain)
behavior: behavior variable
phaseX: phase variable
ABxlab: label for x-axis; written between quotation marks
ABylab: label for y-axis; written between quotation marks
ABmain: main label for graph; written between quotation marks
Ex: ABplot(yell, pyell, "school days", "yelling incidents", "Jenny's Yelling")
Label a graph
Description: This function enables the user to write text on a
graph. Users have three options for doing this: text with
neither subscripts nor superscripts, text with superscripts, and
text with subscripts. Commands for each and examples to
label a graph are displayed below. After the command is
invoked, users will be prompted to place the cursor where the
text is to begin. After text is placed, users will be prompted to
choose whether or not to accept the graph with the text.
Command: (for text with neither subscripts nor superscripts)
ABtext()
Text to be entered on graph must appear between quotation
marks
Ex: ABtext("baseline")
Command: (for text with superscripts)
ABtext(expression(text^superscript))
text: this is the text that is to be displayed on the graph
superscript: this is the actual superscript that is to be displayed
Ex: ABtext(expression(X^2))
Command: (for text with subscripts)
ABtext(expression(text[subscript]))
text: this is the text that is to be displayed on the graph
subscript: this is the actual subscript that is to be displayed
Ex: ABtext(expression(B[1]))
Calculates g-index
Description: Calculates effect size based on scores in the desired
zone.
Command: Gindex(behavior, phaseX, v1, v2)
behavior: behavior variable
phaseX: phase variable on which test is based
v1: phase letter entered between quotes (e.g., "A", "B")
v2: phase letter entered between quotes (e.g., "A", "B")
Ex: Gindex(cry, pcry, "A", "B")
Create a C chart
Description: This function builds a C chart and is used with
individual (i.e., ungrouped) data. A space separates each
phase. For use when the outcome variable is a count (i.e., ratio
level) variable.
Command: Cchart(behavior, phaseX, v1, bandX, ABxlab, ABylab, ABmain)
behavior: behavior variable
phaseX: phase variable on which bands are based
v1: phase letter entered between quotes (e.g., "A", "B")
bandX: number of SDs desired (i.e., 1, 2, 3)
ABxlab: label for x-axis; written between quotation marks
ABylab: label for y-axis; written between quotation marks
ABmain: main label for graph; written between quotation
marks
Ex: Cchart(yell, pyell, "A", 3, "days", "Count", "Count of Yelling")
t test
Description: Computes Student t test between any two phases.
This test should only be used if there is not a trend in either
phase AND there is no problem with autocorrelation in either
phase. Graphical output is a bar chart displaying the mean for
each phase.
Command: ABttest(behavior, phaseX, v1,v2)
behavior: behavior variable
phaseX: phase variable on which test is based
v1: phase letter entered between quotes (e.g., "A", "B")
v2: phase letter entered between quotes (e.g., "A", "B")
Ex: ABttest(yell, pyell, "A", "B")
Scientific notation
Description: Converts scientific notation to five decimal places
Command: SN(value)
value: value to be translated
Ex: SN(2.73e-16)
Meta-analysis Functions
Calculation of mean effect size
Description: This function calculates a mean and SD for
Cohen's d effect sizes. A file containing saved effect sizes
must be opened by Getcsv() and then attached.
Command: meanES(ES, lab, esmain)
ES: effect size variable
lab: label for effect size variable
esmain: main title for graph
Ex: meanES(ES, "cry", "ES For Crying")
Meta Regression
Description: Meta regression for saved effect sizes. A file
containing saved effect sizes must be opened by Getcsv() and
then attached.
Command: metareg(ES, V)
ES: effect size variable
V: Variance of the effect size
Ex: metareg(ES,V)
Decision Trees
Helpful Websites
Campbell Collaboration— http://www.campbellcollaboration.org. Produces freely available
systematic reviews in a number of broad disciplines, including crime and justice, education,
international development, and social welfare. This site contains many free resources, including
training on methodology related to the production of systematic reviews.
Gitlab. https://evoldyn.gitlab.io/evomics-2018/ref-sheets/rmarkdown-cheatsheet-2.0.pdf. Cheat sheet
for RMarkdown.
Outcome Informed Evidence-Based Practice. http://ormebook.com/. This site accompanies Orme and
Combs-Orme’s (2011) text listed in the Texts Covering Single-Subject Research Methodology. In
addition to information relevant to each chapter in the text, this site contains a lot of pertinent
information related to single-subject research in general, including bibliographies, templates for
collecting data and creating graphs, and links to standardized scales.
R for Data Science. https://r4ds.had.co.nz. Contains an online free-to-use copy of the R for Data
Science book.
RMarkdown: The Definitive Guide. https://bookdown.org/yihui/rmarkdown/. Provides an online
free-to-use version of the book.
RStudio RMarkdown. https://rmarkdown.rstudio.com. Contains a host of resources on using
RMarkdown, including cheat sheets and a reference guide.
Single-Case Research. http://www.singlecaseresearch.org/. Information on single case research
design and analysis through the posting of published papers, manuscripts in press, and white
papers. Free calculators are available for the purpose of analysis along with instructional videos on
a variety of analysis topics.
U.S. Department of Education’s Institute of Education Sciences. http://ies.ed.gov/. Contains a host of
resources related to education, including methodological papers and presentations, datasets, and
access to the What Works Clearinghouse.
Additional Resource
Volkov, B. B., & King, J. A. (2007). A checklist for building organizational evaluation capacity.
Retrieved from
https://wmich.edu/sites/default/files/attachments/u350/2014/organiziationevalcapacity.pdf
References
Archer, B., Azios, J. H., Müller, N., & Macatangay, L. (2019). Effect sizes in single-case aphasia
studies: A comparative, autocorrelation-oriented analysis. Journal of Speech, Language, and
Hearing Research, 62(7), 2473–2482.
Arnold, M. E. (2006). Developing evaluation capacity in extension 4-H field faculty. A framework
for success. American Journal of Evaluation, 27(2), 257–269.
Auerbach, C., & Schudrich, W. Z. (2013). SSD for R: A comprehensive statistical package to analyze
single-system data. Research on Social Work Practice, 23(3), 346–353.
doi:10.1177/1049731513477213
Auerbach, C., & Zeitlin, W. (2021). SSD for R (version 1.5.2). Vienna, Austria: R Foundation for
Statistical Computing. Retrieved from http://www.R-project.org/
Baizerman, M., Compton, D. W., & Hueftle Stockdill, S. (2002). New directions for ECB. In D. W.
Compton, M. Baizerman, & S. Stockdill (Eds.), The art, craft, and science of evaluation capacity
building (Vol. 2002, pp. 109–120). San Francisco, CA: Jossey-Bass.
Baumer, B., & Udwin, D. (2015). R markdown. Wiley Interdisciplinary Reviews: Computational
Statistics, 7(3), 167–177.
Beck, A. T., Ward, C. H., Mendelson, M., Mock, J., & Erbaugh, J. (1961). An inventory for
measuring depression. Archives of General Psychiatry, 4, 561–571.
Beddoe, L. (2011). Investing in the future: Social workers talk about research. British Journal of
Social Work, 41(3), 557–575.
Benneyan, J. C., Lloyd, R. C., & Plsek, P. E. (2003). Statistical process control as a tool for research
and healthcare improvement. Quality and Safety in Health Care, 12(6), 458–464.
doi:10.1136/qhc.12.6.458
Berger, R. (2010). EBP practitioners in search of evidence. Journal of Social Work, 10(2), 175–191.
Bloom, M., Fischer, J., & Orme, J. G. (2009). Evaluating practice: Guidelines for the accountable
professional (6th ed.). New York, NY: Pearson.
Borckardt, J. J. (2008). User’s guide: Simulation modeling analysis: Time series analysis program for
short time series data streams: Version 8.3.3. Retrieved from
http://www.clinicalresearcher.org/SMA_Guide.pdf
Brossart, D. F., Parker, R. I., & Castillo, L. G. (2011). Robust regression for single-case data analysis:
How can it help? Behavior Research Methods, 43(3), 710–719. doi:10.3758/s13428-011-0079-7
Buck, J. A. (2011). The looming expansion and transformation of public substance abuse treatment
under the Affordable Care Act. Health Affairs, 30(8), 1402–1410.
Carman, J. G., & Fredericks, K. A. (2010). Evaluation capacity and nonprofit organizations: Is the
glass half-empty or half-full? American Journal of Evaluation, 31(1), 84–104.
Chambless, D. L., Baker, M. J., Baucom, D. H., Beutler, L. E., Calhoun, K. S., Crits-Christoph, P., . .
. Haaga, D. A. (1998). Update on empirically validated therapies, II. Clinical Psychologist, 51(1),
3–16.
Choi, M., & Ruona, W. E. (2011). Individual readiness for organizational change and its implications
for human resource and organization development. Human Resource Development Review, 10(1),
46–73.
Christie, D., & Menter, I. (2009). Research capacity building in teacher education: Scottish
collaborative approaches. Journal of Education for Teaching, 35(4), 337–354.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ:
Erlbaum.
Council on Social Work Education (CSWE). (2015). Educational policy and accreditation standards.
Alexandria, VA: Author.
Edwards, J. R., & Cable, D. M. (2009). The value of value congruence. Journal of Applied
Psychology, 94(3), 654–677.
Epstein, I., & Blumenfield, S. (2012). Clinical data-mining in practice-based research: Social work
in hospital settings. Routledge.
Ferguson, C. J. (2009). An effect size primer: A guide for clinicians and researchers. Professional
Psychology: Research and Practice, 40(5), 532–538.
Fisher, W. W., Kelley, M. E., & Lomas, J. E. (2003). Visual aids and structured criteria for improving
visual inspection and interpretation of single-case designs. Journal of Applied Behavior Analysis,
36(3), 387–406.
Fouché, C., & Lunt, N. (2010). Nested mentoring relationships reflections on a practice project for
mentoring research capacity amongst social work practitioners. Journal of Social Work, 10(4),
391–406.
Free Software Foundation, Inc. (2012). RStudio. Boston, MA: Author.
Gast, D. L., & Ledford, J. (2010). Single-subject research in behavioral sciences. New York, NY:
Routledge.
Glass, G. V., McGaw, B., & Smith, M. L. (1981). Meta-analysis in social research. Thousand Oaks,
CA: Sage.
Higgins, J., & Thomas, J. (Eds.). (2020). Cochrane handbook for systematic reviews of interventions
(Version 6.1.0). Chichester, UK: Wiley. Retrieved from
https://training.cochrane.org/handbook/current
Huitema, B. E., & McKean, J. W. (1994). Two reduced-bias autocorrelation estimators: rF1 and rF2.
Perceptual and Motor Skills, 78(1), 323–330.
Janosky, J. E., Leininger, S. L., Hoerger, M. P., & Libkuman, T. M. (2009). Single subject designs in
biomedicine (2009 ed.). New York, NY: Springer.
Kazdin, A. E. (2011). Single-case research designs: Methods for clinical and applied settings (2nd
ed.). New York, NY: Oxford University Press.
Kratochwill, T. R., & Levin, J. R. (2014). Single-case intervention research: Methodological and
statistical advances. American Psychological Association.
Kratochwill, T. R., Hitchcock, J. H., Horner, R. H., Levin, J. R., Odom, S. L., Rindskopf, D. M., &
Shadish, W. R. (2013). Single-case intervention research design standards. Remedial and Special
Education, 34(1), 26–38. doi:10.1177/0741932512452794
Kratochwill, T. R., Hitchcock, J., Horner, R. H., Levin, J. R., Odom, S. L., Rindskopf, D. M., &
Shadish, W. R. (2010). Single-case designs technical documentation. Washington, DC: US
Department of Education, Institute of Education Sciences. Retrieved from
https://files.eric.ed.gov/fulltext/ED510743.pdf
Krishef, C. H. (1991). Fundamental approaches to single subject design and analysis. Malabar, FL:
Krieger.
Kromrey, J. D., & Foster-Johnson, L. (1996). Determining the efficacy of intervention: The use of
effect sizes for data analysis in single-subject research. Journal of Experimental Education, 65(1),
73–93.
Lenz, A. S. (2012). Calculating effect size in single-case research: A comparison of nonoverlap
methods. Measurement and Evaluation in Counseling and Development, 46(1), 64–73.
doi:10.1177/0748175612456401
Logan, L. R., Hickman, R. R., Harris, S. R., & Heriza, C. B. (2008). Single-subject research design:
Recommendations for levels of evidence and quality rating. Developmental Medicine & Child
Neurology, 50(2), 99–103.
Lunt, N., Fouché, C., & Yates, D. (2008). Growing research in practice (GRIP): An innovative
partnership model. Wellington, New Zealand: Families Commission.
Ma, H.-H. (2009). The effectiveness of intervention on the behavior of individuals with autism. A
meta-analysis using percentage of data points exceeding the median of baseline phase (PEM).
Behavior Modification, 33(3), 339–359.
Macgowan, M. J. (2008). A guide to evidence-based group work. Oxford University Press.
Macgowan, M. J. (2012). A standards-based inventory of foundation competencies in social work
with groups. Research on Social Work Practice, 22(5), 578–589. doi:10.1177/1049731512443288
Manolov, R., Solanas, A., Sierra, V., & Evans, J. J. (2011). Choosing among techniques for
quantifying single-case intervention effectiveness. Behavior Therapy, 42(3), 533–545.
Matyas, T., & Greenwood, K. (1990). Visual analysis of single-case time series: Effects of variability,
serial dependence, and magnitude of intervention effects. Journal of Applied Behavioral Analysis,
23(3), 341–351.
McCrystal, P., & Wilson, G. (2009). Research training and professional social work education:
Developing research-minded practice. Social Work Education, 28(8), 856–872.
Mechanic, D. (2012). Seizing opportunities under the Affordable Care Act for transforming the
mental and behavioral health system. Health Affairs, 31(2), 376–382.
Miller, B. (n.d.). Single-subject research design (SSRD). Vancouver, BC, Canada: University of
British Columbia School of Rehab Sciences.
Mitra, A. (2008). Control charts for attributes. In Fundamentals of quality control and improvement
(3rd ed., pp. 369–414). Hoboken, NJ: Wiley.
Mohammed, M. A., & Worthington, P. (2012). Why traditional statistical process control charts for
attribute data should be viewed alongside an XMR-chart. BMJ Quality & Safety, 22(3), 263–269.
doi:10.1136/bmjqs-2012-001324
Morgan, D. L. (2008). Single-case research methods for the behavioral and health sciences. Sage.
Nagler, E., Rindskopf, D. M., & Shadish, W. R. (2008). Analyzing data from small N designs using
multi-level models: A procedural handbook. New York: Graduate Center, CUNY.
Nathan, P. E., & Gorman, J. M. (Eds.). (2002). A guide to treatments that work (2nd ed.). New York,
NY: Oxford University Press.
National Association of Social Workers. (2017). Code of ethics. Washington, DC.
Nourbakhsh, M. R., & Ottenbacher, K. J. (1994). The statistical analysis of single-subject data: A
comparative examination. Physical Therapy, 74(8), 768–776.
Orme, J., & Powell, J. (2008). Building research capacity in social work: Process and issues. British Journal of Social Work, 38(5), 988–1008.
Orme, J. G. (1991). Statistical conclusion validity for single-system designs. Social Service Review,
65(3), 468–491. doi:10.2307/30012410
Orme, J. G., & Combs-Orme, T. (2011). Outcome-informed evidence-based practice. New York:
Pearson.
Orme, J. G., & Cox, M. E. (2001). Analyzing single-subject design data using statistical process
control charts. Social Work Research, 25(2), 115–127.
Parker, R. I. (2006). Increased reliability for single-case research results: Is the bootstrap the answer?
Behavior Therapy, 37(4), 326–338. doi:10.1016/j.beth.2006.01.007
Parker, R. I., Hagan-Burke, S., & Vannest, K. (2007). Percentage of all non-overlapping data (PAND): An alternative to PND. Journal of Special Education, 40(4), 194–204. doi:10.1177/00224669070400040101
Parker, R. I., Vannest, K. J., & Brown, L. (2009). The improvement rate difference for single-case
research. Exceptional Children, 75(2), 135–150.
Parker, R. I., Vannest, K. J., & Davis, J. L. (2011). Effect size in single-case research: A review of
nine nonoverlap techniques. Behavior Modification, 35(4), 303–322.
Polit, D. F., & Chaboyer, W. (2012). Statistical process control in nursing research. Research in
Nursing & Health, 35(1), 82–93. doi:10.1002/nur.20467
Portney, L. G., & Watkins, M. P. (2008). Foundations of clinical research: Applications to practice
(3rd ed.). Hoboken, NJ: Prentice Hall.
Preskill, H., & Boyle, S. (2008). A multidisciplinary model of evaluation capacity building.
American Journal of Evaluation, 29(4), 443–459.
Pustejovsky, J. E. (2019). Procedural sensitivities of effect sizes for single-case designs with directly observed behavioral outcome measures. Psychological Methods, 24(2), 217–235.
R Core Team. (2013). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Retrieved from https://www.R-project.org/
Rock, B. D., Auerbach, C., Kaminsky, P., & Goldstein, M. (1993). Integration of computer and social work culture: A developmental model. In B. Glastonbury (Ed.), Human welfare and technology: Papers from the Husita 3 Conference on IT and the quality of life and services, Maastricht, Netherlands. Assen, Netherlands: Van Gorcum.
The R Project for Statistical Computing. (n.d.). What is R? Retrieved from http://www.r-project.org/about.html
Schlosser, R., & Wendt, O. (2008). Systematic reviews and meta-analyses of single-subject
experimental designs (SSEDs). National Center for the Dissemination of Disability Research.
https://ktdrr.org/systematicregistry/lib_systematic_search.cgi?location=sr&sel_1=104
Schudrich, W. (2012). Implementing a modified version of Parent Management Training (PMT) with
an intellectually disabled client in a special education setting. Journal of Evidence-Based Social
Work, 9(5), 421–423.
Scruggs, T. E., & Mastropieri, M. A. (1998). Summarizing single-subject research issues and
applications. Behavior Modification, 22(3), 221–242. doi:10.1177/01454455980223001
Scruggs, T. E., & Mastropieri, M. A. (2013). PND at 25: Past, present, and future trends in
summarizing single-subject research. Remedial and Special Education, 34(1), 9–19.
doi:10.1177/0741932512440730
Shaw, I. (2005). Practitioner research: Evidence or critique? British Journal of Social Work, 35(8), 1231–1248.
Shaw, I. (2011). Evaluating in practice (2nd ed.). Burlington, VT: Ashgate.
Smith, I. R., Garlick, B., Gardner, M. A., Brighouse, R. D., Foster, K. A., & Rivers, J. T. (2013). Use
of graphical statistical process control tools to monitor and improve outcomes in cardiac surgery.
Heart, Lung and Circulation, 22(2), 92–99. doi:10.1016/j.hlc.2012.08.060
Smith, J. D. (2012). Single-case experimental designs: A systematic review of published research and current standards. Psychological Methods, 17(4), 510–550.
Stewart, K. K., Carr, J. E., Brandt, C. W., & McHenry, M. M. (2007). An evaluation of the conservative dual-criterion method for teaching university students to visually inspect AB-design graphs. Journal of Applied Behavior Analysis, 40(4), 713–718. doi:10.1901/jaba.2007.713-718
Swoboda, C. M., Kratochwill, T. R., & Levin, J. R. (2010). Conservative dual-criterion method for
single-case research: A guide for visual analysis of AB, ABAB, and multiple-baseline designs.
Wisconsin Center for Education Research Working Paper No. 2010-13. Retrieved from
https://wcer.wisc.edu/docs/working-papers/Working_Paper_No_2010_13.pdf
Tasdemir, A. (2012). Effect of autocorrelation on the process control charts in monitoring of a coal
washing plant. Physicochemical Problems of Mineral Processing, 48(2), 495–512.
Tate, R. L., & Perdices, M. (2019). Single-case experimental designs for clinical research and neurorehabilitation settings: Planning, conduct, analysis and reporting. New York, NY: Routledge.
Thombs, B. D., Ziegelstein, R. C., Beck, C. A., & Pilote, L. (2008). A general factor model for the
Beck Depression Inventory-II: Validation in a sample of patients hospitalized with acute
myocardial infarction. Journal of Psychosomatic Research, 65(2), 115–121.
Thyer, B. A., & Myers, L. L. (2011). The quest for evidence-based practice: A view from the United
States. Journal of Social Work, 11(1), 8–25.
Vannest, K. J., Davis, J. L., & Parker, R. I. (2013). A new approach to single case research. New
York: Routledge.
Verzani, J. (2004). Using R for introductory statistics. Boca Raton, FL: Chapman and Hall/CRC.
Volkov, B. B., & King, J. A. (2007). A checklist for building organizational evaluation capacity.
Retrieved from
https://wmich.edu/sites/default/files/attachments/u350/2014/organiziationevalcapacity.pdf
Weisberg, S., & Fox, J. (2010). An R companion to applied regression (2nd ed.). Thousand Oaks,
CA: Sage.
Wendt, O. (2009, May). Calculating effect sizes for single-subject experimental designs: An overview
and comparison. Presented at the Ninth Annual Campbell Collaboration Colloquium, Oslo,
Norway.
Wendt, O., & Rindskopf, D. (2020). Exploring new directions in statistical analysis of single-case
experimental designs.
Wheeler, D. J. (2004). Advanced topics in statistical process control: The power of Shewhart’s charts
(2nd ed.). Knoxville, TN: SPC Press.
Woodall, W. H. (2006). The use of control charts in health-care and public-health surveillance.
Journal of Quality Technology, 38(2), 89–104.
Index
Symbols
``` (grave accents), 158
{} (curly brackets), 158
data
baseline phase data, 31–56
combining files, 190
comparing baseline and intervention phases, 57–89
entering and editing data directly in R, 175–178
example data, 8
with high levels of autocorrelation, 31–32
saving data files, 179
using Excel to record client data, 14–16
what to measure, 10–14. See also group data
data frames, 175
data transformations
functions associated with, 29, 189–190
using differencing, 44–49, 116
decision trees, 197–199
descriptive statistics
comparing baseline and intervention phases, 57–62, 184
functions associated with, 29, 183
diffchart() function, 189
differencing, transformations using, 44–49, 116
d-index value, 69, 71, 72, 81
direct behavioral observations, 10–11
directional null hypothesis, 90
downloading, 18–20
false negatives, 92
false positives, 90
figure margins too large error, 138
F tests, 112, 135
functions
accessing hyperlinks in RStudio, 21
associated with data transformations, 29, 189–190
associated with hypothesis testing, 30
basic statistical analysis, 29, 183–184
effect size functions, 29, 184–187
graphing functions, 29, 179–183
group data, 30, 193–194
help files, 20–21
listing, 20, 179
listing variables, 179
meta-analysis functions, 30, 195
opening and importing .csv files, 179
related to autocorrelation, 29, 189–190
related to RMarkdown, 30, 194–195
saving data files, 179
SPC charts, 29, 187–189
SSD for R functions, 21–23
suitable for system-wide data, 30
summary of, 28–30
tests of statistical significance, 190–193. See also individual functions
R
benefits of, 18
download and installation, 18
entering and editing data directly in, 175–178
range, 62–63
Rchart() function, 187
R charts, 94–96, 136–137, 187–188
Rchartsd() function, 187
R Data Editor, 176
reconstructed baselines, 31
regabove() function, 108–109, 192
regbelow() function, 108, 192
research capacity, 167. See also practice research
research standards, 4–5
resources, 6–7, 201–202
RMarkdown
additional resources, 202
creating documents, 155–164
functions appropriate for, 30, 194–196
getting started, 155–156
overview of, 155
RMarrow() function, 159, 194, 196
RMGline() function, 194, 196
RMlines() function, 156, 159, 194, 196
RMstat() function, 159, 196
RMtext() function, 159, 195–196
robregabove() function, 109, 192
robregbelow() function, 109, 192
RobustCDCabove() function, 111, 190
RobustCDCbelow() function, 111, 191
robust conservative dual criteria tests, 91
robust regression, 106, 111, 184
RStudio
accessing function hyperlinks, 21
download and installation, 18–20
entering and editing data directly in R, 175–178
importing group data, 140–141
LaTeX installation, 155
v argument, 22
variables, listing, 179
variation, measures of, 62–65
visual analysis
autocorrelation, 42–49, 84–87
baseline example data: Anthony, 49–54
baseline example data: Jenny, 34–40
baseline phase data, 32–34
benefits of SSD for R, 5–6
descriptive statistics, 57–62
goal line graphs, 82–83
improvement difference, 78–80
measures of variation, 62–63
measuring trends, 65–68
non-overlap methods, 76–78
non-overlap of all pairs, 80–82
standard deviation, 64–65
versus statistical analysis, 5–6
trending, 40–42
using effect size to describe change, 69–75