Ebp PPT 1 7 PDF


Evidence-Based Physical Therapist

Practice

Marlouise Pachorro, PTRP, DPT


What will you gain through this course?
● Make sound, independent judgments about the validity of clinical
research
● Improve your knowledge in collecting, processing and implementing research
findings to improve clinical practice, the work environment or patient
outcomes.
● This will be a starting point for focusing on the concepts of EBP with
emphasis on forming answerable clinical questions and effective literature
search strategies
● Identify barriers and bridges to evidence based clinical practice
Required readings
Jewell, D.V. Guide to Evidence Based Physical Therapist Practice. 4th ed.
Burlington, MA: Jones & Bartlett Learning
Related Web Sites:
libraryguides.nau.edu
tripdatabase.com
www.ncbi.nlm.nih.gov
pedro.org.au
casp-uk.net
sralab.org
Principles of Evidence-Based Physical Therapist Practice

What is Evidence-based Medicine?


“the conscientious, explicit, and judicious use of current best evidence in making
decisions about the care of individual patients. The practice of EBM requires
integrating individual clinical expertise with the best available clinical
evidence from systematic research and our patient’s unique values and
circumstances.”
Straus et al., Evidence-Based Medicine, 3rd ed., 2006.
Classroom Policies:

1. Understand that daily attendance and punctuality is mandatory

2. Consistently demonstrate professional conduct in actions and attire (Always be in your complete uniform)

3. Adhere to the program and university policies and guidelines

4. All are required to enroll in the google classroom (SOUL)

5. Using hand held electronic devices (phone, iPads, tablets, etc.) should be done in appropriate context

6. Strictly no eating and drinking inside the classroom and laboratory at all times. Breaks will be provided by the instructor.

7. Be aware of and meet all deadlines for assignments that are required
Evidence-Based Practice
“Care that ‘takes place when decisions that affect the care of patients
are taken with due weight accorded to all valid, relevant information.’”

“A way of providing health care that is guided by a thoughtful integration of the
best available scientific knowledge with clinical expertise. This approach allows the
practitioner to critically assess research data, clinical guidelines, and other
information resources in order to correctly identify the clinical problem, apply the
most high-quality intervention, and re-evaluate the outcome for future
improvement.”
- Medical Subject Headings (MeSH), 2015 (term introduced in 2009)
Integration of:

● Best research evidence
● Clinical expertise
● Patient values


Evidence-Based Physical Therapist Practice
Evidence-based physical therapist practice is “open and thoughtful clinical
decision making” about physical therapist management of a patient or client that
integrates the “best available evidence with clinical judgment” and the patient
or client’s preferences and values, and that further considers the larger social
context in which physical therapy services are provided, to optimize patient or
client outcomes and quality of life.
The process of Evidence-Based PT Practice
Step 1: Forming an answerable question in response to a patient or client’s problem or concern

Step 2: Tracking down the best evidence (research) to answer the question

Step 3: Critically appraising the evidence for its validity, impact, and applicability

Step 4: Integrating the appraisal with the therapist’s clinical expertise, and with the
patient’s unique biology, values and circumstances

Step 5: The therapist and the patient or client collaborate to identify and implement the next
steps in the management process, then evaluate how well steps 1 to 4 were carried out and seek ways to improve them next time
Factors EBP Practice Depends On
● PT’s must have sufficient knowledge about patient’s condition and recognize
what is not known.
● Access to evidence (research studies)
● Knowledge of the evidence appraisal process
● Time to search, appraise, and integrate the evidence into practice
Barriers to EBPT practice
● Lack of time
● Lack of access to evidence
● Lack of skills
● Lack of relevant evidence
Characteristics of Desirable Evidence
Hierarchies of Evidence
Hierarchies: Levels and Grades
LEVELS (from higher, stronger evidence to lower, weaker evidence):
1a, 1b, 1c, 2a, 2b, …, 5

GRADES (from higher, stronger evidence to lower, weaker evidence):
A, B, C, D
Searching the Evidence
Asking the question and Searching the Evidence

● Segment 1: Primary and Secondary sources of Evidence


● Segment 2: Examples of Common Clinical Questions; Background vs
Foreground Questions (PICO)
● Segment 3: Locating Citations and Abstracts
● Segment 4: Locating the full text of Journal Citations
Primary vs Secondary Source
Primary Source:
Reference source that represents the original document by the original author
Scientific research = primary sources present original thinking, report on discoveries or share
new information

Secondary Source:
Reference source that represents a review or report of another’s work
Often lacks the freshness and immediacy of the original material
Formulating Clinical Question
(1) the anatomic, physiologic, or pathophysiologic nature of the problem or issue;
(2) the medical and surgical management options;
(3) the usefulness of diagnostic tests and clinical measures to identify, classify, and/or quantify the
problem;
(4) which factors will predict the patient or client’s future health status;
(5) the benefits and risks of potential interventions;
(6) the utility of clinical prediction rules;
(7) the nature of the outcomes themselves and how to measure them; and/or
(8) the perspectives and experiences of others with similar problems or issues.

Background: questions designed to increase understanding about a situation (numbers 1 and 2)

Foreground: questions used to facilitate clinical decision making (numbers 3 to 8)
Background Questions
● Address normal human physiology and the pathophysiology associated with a
medical condition
● Are the most common types of questions that patients or clients and their families will ask
● Research articles often contain information relevant to background questions in their
introductory paragraphs

Foreground Questions
● Usually concern choices
● Specific to decision making
● Combining the best available evidence with your expertise and knowledge is
essential for effective application of evidence
● The heart of EBPT practice
Elements of an Answerable Foreground Question
P PATIENT POPULATION

I INTERVENTIONS OR EXPOSURES

C COMPARISON INTERVENTION (S)

O OUTCOME(S) WE ARE SEEKING
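
The four PICO elements can be assembled into a foreground question and a starting search string. The sketch below is illustrative only: the `Pico` class, its field names, and the example values are assumptions (the example borrows the spastic diplegia scenario used later in these slides), and each database has its own query syntax.

```python
from dataclasses import dataclass


@dataclass
class Pico:
    """Hypothetical container for the four PICO elements."""
    patient: str
    intervention: str
    comparison: str
    outcome: str

    def question(self) -> str:
        # Phrase the four elements as one answerable foreground question.
        return (
            f"In {self.patient}, is {self.intervention} more effective than "
            f"{self.comparison} for {self.outcome}?"
        )

    def search_string(self) -> str:
        # Quote each element so phrases stay together, then join with AND.
        parts = [self.patient, self.intervention, self.comparison, self.outcome]
        return " AND ".join(f'"{p}"' for p in parts)


example = Pico(
    patient="children with spastic diplegia",
    intervention="task-specific functional training",
    comparison="neurodevelopmental treatment",
    outcome="gait",
)
print(example.question())
print(example.search_string())
```

In practice the search string is only a first draft; terms are usually broadened or narrowed after seeing what the database returns.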


Questions About…
● Diagnostic Tests
● Clinical Measure
● Prognostic Factor
● Intervention
● Clinical Prediction Rule
● Outcomes
● Self-reported Outcomes Measure
● Patient or Client perspective
EXAMPLES OF CLINICAL QUESTIONS
Pathophysiology: How does a given disease or condition manifest itself?

Intervention: Is the given intervention effective? Compared to no intervention or
another intervention?

Diagnosis: What tests rule the condition in or out? Single test or multiple tests?

Prognosis: What is the likelihood that a patient recovers? How long will it take? Will the
patient return to the prior level?
Questions, Theories, and Hypotheses
Research question
Investigator must start with a specific objective toward which his or her research efforts will
be directed:
objective may be expressed in the form of a question, a purpose statement, or a problem statement
Background
Researchers must provide sufficient background information to justify the need for
this study.

● Why the question or purpose statement is important to answer


● How the results obtained will increase understanding of a particular phenomenon or situation

(1) a literature review


(2) citations of publicly available epidemiologic data.
Background
Literature Review

● Focus on previous works that have the closest relationship to the research
question or purpose statement.
● Limitations of previous research
● If no prior evidence is available for the research question, researchers may
review studies from other disciplines or practice settings.
○ May also consider studies that look at diagnoses or clinical problems that have similar characteristics
to the disease or disorder the researchers are interested in.
Background
Citations of Epidemiology Data

● Health statistics from federal agencies, state agencies, and private organizations.

***In rare instances, the need for a study is justified by the identification of a
previously undiscovered clinical phenomenon.
Theories, Concepts, and Constructs
Theory- organized set of relationships among concepts or constructs that is proposed to
systematically describe and/or explain a phenomenon of interest

A successful theory is one that is consistent with empirical observations and that, through
repeated testing under various conditions, is able to predict future behavior or outcomes

“grand theories”
Conceptual Framework
● describe relationships among concepts and constructs from which predictions may be
made, but they are not elaborate enough to explain all of the intricacies of the
phenomenon of interest
Smaller-scale theories often are referred to as conceptual
frameworks
Hypothesis
Null Hypothesis
Researchers anticipate that “no difference” or “no relationship” between groups or variables will be demonstrated by
their study’s results.

“statistical hypothesis” because statistical tests are designed specifically to challenge the “no difference (no
relationship)” statement

The premise behind this approach is that a study’s results may be due to chance rather than due to the experiment
or phenomenon of interest
Alternate Hypothesis
“Research hypothesis”

Predict that a difference or relationship between the groups or variables will be demonstrated
by the study’s results.

may provide directional language, such as “more than,” “less than,” “positive,” or
“negative,” to make their predictions more specific.
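
The idea that a result "may be due to chance" can be made concrete with a permutation test. This is one possible statistical test, chosen here for illustration and not named in the slides; the function name, the number of shuffles, and the example scores are assumptions. Under the null hypothesis the group labels are interchangeable, so the sketch reshuffles them many times and counts how often a difference at least as large as the observed one appears.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_shuffles=5000, seed=0):
    """Two-sided p-value for a difference in means under the null hypothesis."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_shuffles):
        # Reassign subjects to groups at random, as if labels did not matter.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_shuffles


# Hypothetical outcome scores for two groups.
control = [1, 2, 3, 4, 5, 6]
experimental = [101, 102, 103, 104, 105, 106]
print(permutation_p_value(control, experimental))
```

A small p-value suggests the observed difference would rarely arise by chance alone, which is the basis for rejecting the null hypothesis in favor of the research hypothesis. A directional hypothesis would compare signed differences instead of absolute ones.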
Research Design
Research Paradigms
Quantitative Research
assumes that there is an objective truth that can be revealed by investigators who attempt to conduct their inquiries in a
value-free manner

Uses standardized, quantitative measures

Qualitative Research
assumes that knowledge and understanding are contextual and relative to each individual studied.

Data is captured in words rather than numbers

Does not assume one objective truth

An emphasis is placed on descriptive information


Quantitative Research
Experimental Design

Subjects purposely manipulated with the resulting behaviour measured

Use at least 2 groups with random assignments

comparison allows investigators to determine whether there are differences in outcomes between the group(s) that
is (are) manipulated and the group(s) that is (are) not.

Quasi-Experimental Design
purposeful manipulation of the subjects by the investigators, but they lack either a second group for comparison purposes or a
random assignment process, or both.

investigators have difficulty obtaining sufficient numbers of subjects to form groups or when group membership is
predetermined by a subject characteristic
Quantitative Research
Nonexperimental Research

Investigators are observers who collect information about the phenomenon of
interest; there is no purposeful manipulation of subjects by the researchers to produce
a change in behavior or status.

Observational Studies
Qualitative Research
investigate subjects’ thoughts, perceptions, opinions, beliefs, attitudes and/or
experiences
Researchers do not introduce experimental interventions or controls that limit
knowledge or behavior in an artificial context
they attempt to capture how subjects construct meaning about naturally
occurring phenomena and interactions within their environment.
The analysis is focused on identifying patterns or recurring themes in the data
Data are collected through observational methods in which the researcher may be
(1) removed from the daily interactions of the subjects (“nonparticipant observation”) or
(2) immersed in the subjects’ natural environment (“participant observation”).

The attempt is to capture the subjects’ perspective in a naturally occurring environment
Number of Groups
Within-subject designs= repeated measures of an outcome are compared for a
single group of subjects

A subject’s initial baseline is used for comparison throughout

Between-subjects design = outcomes are compared between two or more
independent groups of subjects
Type of Data
Quantitative designs usually use numeric data

“How much or how many”

Qualitative designs focus on collection of subjects’ words or the researchers’
descriptions of subject behaviors in order to identify themes that provide insight
into the meaning of a phenomenon.
Time Elements
Duration:

Cross-sectional study -data collected once during a single point in time or a limited time interval
Longitudinal study- repeated measures over an extended period of time

Direction:

Retrospective Design- use past information for data

Prospective Design-collect their data in real time


Degree of Control
● For quantitative design
● Need to minimize bias in the study
● Experimental designs- most restrictive in terms of the amount of control
imposed on study participants and conditions
● Quasi-experimental designs lack either the control achieved by random
placement of subjects into groups or the use of a comparison group or both
Research Designs for Questions About Diagnostic Tests
and Clinical Measures
● Usually Nonexperimental and cross-sectional in design
● No assignment to groups or manipulation of a variable
● Goal is to determine the usefulness of the test of interest (index test) for
correctly detecting a pathology or an impairment in body functions and
structure.
● Individuals who appear to have a particular disorder or condition are
evaluated with the index test as well as with a second test whose
usefulness already has been established
● Second test is usually the “gold standard” or “reference standard” (this is
compared with the index test)
Research Designs for Questions About Prognostic Factors
● Assess the relationship between a prognostic factor and an outcome
● Non-experimental in nature
● Cohort design= refers to a group of subjects who share a common
characteristic such as age, gender, occupation or a presence of a diagnosis
○ Prospective Cohort
○ Retrospective Cohort
○ Case Control- retrospective approach in which subjects who are known to have the outcome of
interest are compared to a control group known to be free of the outcome
Research Designs for Questions about Interventions
Experimental Studies

RCT= randomized controlled trial

An ideal RCT

Enrolls a large representative group of subjects

Higher numbers of subjects enhance the function of statistical tests and
allow researchers to apply their findings to a large group of recruited subjects
Research Designs for Questions about Interventions
PRETEST-POSTTEST FORMAT

● Pretest-posttest control group design: most common type of RCT of interest
to physical therapists
○ 2 groups: Experimental and Control
○ Performance on the outcome of interest is measured in both groups at the pretest
○ Intervention is applied to the Experimental group
○ Subjects are measured again at posttest to see if a significant change has occurred
Research Designs for Questions about Interventions
● Quasi- Experimental studies
○ Relies on several design options, including a time series design, and a nonequivalent control
group design
○ nonequivalent control group format is similar to the experimental pretest–posttest design
except that random assignment to groups is lacking
○ A simple time series format is one in which repeated measures are collected over time before
and after the experimental intervention is introduced to a single group
Research Designs for Clinical Prediction Rules
● Non experimental
Research Designs for Outcomes Research
● these studies focus on the “end results” experienced by patients or clients following an episode of
care, rather than on the relative efficacy or effectiveness of individual interventions

● commonly is retrospective

● Which patients achieved normal gait patterns after physical therapy following anterior cruciate
ligament reconstruction—those who received resistance training earlier or those who
received it later in the rehabilitation program?
Methodologic Studies About Self-Report Outcomes
Measures
● Outcomes measures refer to standardized self-report (e.g., surveys) or performance-based (e.g.,
balance tests) instruments or procedures that capture person-level end points such as activity
limitations, participation restrictions, and quality of life.
Secondary Analyses
● Collection of previously completed individual studies
● Systematic reviews and Meta Analyses
Qualitative Studies
● Focus on perspectives of patients, family members, care givers, and
providers
● 3 common research methodologies used in PT:
○ Phenomenologic (Interviews)
○ Ethnographic (observations)
○ Grounded Theory (empirically driven theory)
QUIZ!!!
1. Give 1 example each of a primary source and secondary source
2. Difference between background and foreground question
3. Give one research design and explain
1. Which has a higher level of research design? RCT or Cohort?
2. Is the Synopses of Synthesis a primary or secondary source?
3. Process of EBPT practice (5 A’s)
REVIEW: Types of research
● Experimental Research
○ Cause and Effect
○ Control and Experimental Group **
○ Researcher manipulated Variable **
○ Randomization **
● Quasi-Experimental
○ One or more (**) is not met
● Non-Experimental
○ No purposeful manipulation
REVIEW: Research Design
● Synthesized literature
○ Systematic review: Comprehensive analysis of literature
○ Meta-Analysis: Statistically combining the results of several studies
● Qualitative
○ Phenomenology: specific events
○ Ethnography: cultures or groups (observations)
● Quantitative
○ RCT
○ Cohort
○ Cross sectional
○ Longitudinal
RESEARCH SUBJECTS
● Clinical research that is related to patient/client management requires data
obtained from people
● Target population
● Accessible population: smaller subset of target population
● Subjects
● Sample: collection of subjects for the study
Sample
● Primary data: data collected from the subjects in real time
● Secondary data: data that were collected from people as part of routine medical
care or during participation in a previous study
Evidence-based physical therapists must evaluate a study’s design to determine if the
results answer the research question in a useful and believable fashion. Three design steps
pertaining to a study’s subjects are essential ingredients of a successful project:

1. Identification of potential candidates for study;


2. Selection of an appropriate number of individuals from the candidate pool; and
3. Management of the subjects’ roles and activities during the study.
Subject Identification
● Inclusion criteria
○ Primary characteristic of the target population that make them eligible to be selected for the
study

● Exclusion criteria
○ Additional traits or characteristics that would exclude them from participation in the study
Subject selection
Probabilistic sampling:

Simple random sample:

Each potential subject has an equal chance to be selected

Unbiased sampling

Drawn from the accessible population

Paper in a hat/ random number generator


Subject selection
Systematic sampling:
Potential subjects are organized according to an identifier (birthdate, patient
account number) and then selected at a fixed interval (for example, every 10th subject)
Stratified Random Sampling
Capturing subgroups within the population
Subgroups often are based on naturally occurring differences in the proportion
of a particular subject characteristic such as gender, race, age, disease severity, or
functional level within a population.
Can improve a sample’s representativeness and decrease sampling error
Identifies relevant population characteristics, separates members of a
population into homogeneous, non overlapping subsets
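
The three probabilistic approaches above can be sketched in a few lines. This is a minimal illustration, not a recruitment tool: the function names, the fixed seed, the sampling fraction, and the made-up accessible population are all assumptions.

```python
import random


def simple_random_sample(population, n, seed=0):
    # Every subject has an equal chance of selection (the "paper in a hat").
    rng = random.Random(seed)
    return rng.sample(population, n)


def systematic_sample(ordered_population, interval, start=0):
    # Population is already ordered by an identifier; take every interval-th subject.
    return ordered_population[start::interval]


def stratified_sample(population, stratum_of, fraction, seed=0):
    # Split the population into non-overlapping subgroups, then draw the
    # same fraction at random from each subgroup.
    rng = random.Random(seed)
    strata = {}
    for subject in population:
        strata.setdefault(stratum_of(subject), []).append(subject)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical accessible population: (account number, sex), 10 "F" and 30 "M".
population = [(i, "F" if i % 4 == 0 else "M") for i in range(40)]
print(simple_random_sample(population, 5))
print(systematic_sample(population, 10))
print(stratified_sample(population, lambda s: s[1], 0.2))
```

Drawing the same fraction from each stratum keeps the sample's proportions close to the population's, which is how stratification improves representativeness.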
Subject selection
Cluster sampling

“Multi stage sampling”

Advantage: Convenience and efficiency


Nonprobabilistic Sampling
Does not involve a random sampling
Commonly used in clinical research
Convenience Sampling
“Accidental sampling”
Subjects are chosen on the basis of availability
Limitation: self-selection bias, the effect on study outcomes of potential dissimilarities
between the individuals who volunteer and those who do not.
Snowball sampling

investigators start with a few subjects and then recruit more individuals via word of
mouth from the original participants

Purposive sampling

investigators make specific choices about who will serve as subjects in their study

Commonly used in qualitative studies


Subject Management
Random assignment methods
● Random assignment by individual
○ Each subject is assigned, or allocated, to a group according to which group number is
randomly pulled from a hat or which side of a coin lands face up.
○ The problem with this approach is that groups may be unequal in both number and distribution
of characteristics, especially if the overall sample size is small
● Block assignment
○ size of the groups is predetermined to ensure equal numbers.
○ randomly assign the identified number of subjects to the first group, then the second
group, and so forth
Random assignment methods
● Systematic assignment
○ use a list of the subjects and repeatedly count off the group numbers until everyone is
assigned.

● Matched assignment
○ investigators want all groups to have equal numbers of subjects with a specific characteristic.

○ Subjects first are arranged into subgroups according to attributes such as gender, age, or
initial functional status. Then each member of the subgroup is randomly assigned to the study
groups.
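
Block and systematic assignment, as described above, might be sketched as follows. The function names, group labels, and subject identifiers are hypothetical, and the sketch assumes the number of subjects divides evenly among the groups for block assignment.

```python
import random


def block_assignment(subjects, group_names, seed=0):
    """Randomly fill groups of a predetermined, equal size."""
    if len(subjects) % len(group_names) != 0:
        raise ValueError("subjects must divide evenly among the groups")
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    size = len(shuffled) // len(group_names)
    # Fill the first group, then the second, and so forth.
    return {
        name: shuffled[i * size:(i + 1) * size]
        for i, name in enumerate(group_names)
    }


def systematic_assignment(subjects, group_names):
    """Count off group names down the subject list until everyone is assigned."""
    n_groups = len(group_names)
    return {name: subjects[i::n_groups] for i, name in enumerate(group_names)}


subjects = list(range(12))  # hypothetical subject IDs
print(block_assignment(subjects, ["experimental", "control"]))
print(systematic_assignment(subjects, ["A", "B", "C"]))
```

Matched assignment could reuse `block_assignment` by calling it once per subgroup (for example, once for each age band) and merging the results.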
Nonrandom assignment methods
● Commonly used in retrospective studies about prognostic factors
Sample size
● n= “number”
Variables and their
Measurements
● Qualitative Studies:
○ The data obtained are the participants’ and/or the researchers’ words from which
cohesive themes are discerned.
● Quantitative Studies:
○ focus on information that is collected under controlled conditions after operational
definitions for the measurement of each data element have been established
○ tests, diagnoses, treatments, and effects are referred to generically as “variables.”

■ specify what variables will be included and how they will be measured
Variables
● characteristics of individuals, objects, or environmental conditions that may
have more than one value.

INDEPENDENT VARIABLE
● variable that is purposefully manipulated by investigators in an effort to produce a
change in an outcome.

● Intervention studies may have one or more independent variables, a situation that
increases the complexity of the research design because of the potential
interaction between the independent variables at their different levels.
● Factorial design
● Cause; intervention/mxn; predictor variable
Variables
DEPENDENT VARIABLES

● Effect; Outcome of interest in the study


● Target Variable

EXTRANEOUS VARIABLES

● Variable you are not investigating that could potentially affect the outcome of your study
Other terminologies related to variables

● Discrete Variables: Possible values are distinct


○ Dichotomous: only 2 values are possible
● Continuous Variable: variables are on a scale;
theoretically infinite number of measurable
increments between each major unit
Measurement
Nominal level of measurement
● Without mathematical properties of rank
● Classificatory scale
● Each category is equal; none ranks above another
● Gender, blood type, hair color
● any statistical analysis must be performed using the frequencies (i.e., numbers or
percentages) with which these characteristics occur in the subjects in the study.
Ordinal level of measurement

● Rank ordered category


● Scales: pain scale, MMT
Interval level of measurement
● Assigns numeric values to variable
● Values have rank and a known equal distance between them
● No absolute zero (0 is a measured value, not the absence of the quantity)
● Temperature in °C or °F: 0 degrees is an actual temperature, not the absence of temperature
Ratio level of measurement
● Has an absolute zero (0 means none of the quantity), so values cannot be negative
● Rank order
● Height, weight, blood pressure, speed, length, age
Reference Standards in Measurements

● Norm referenced: performance standard is derived from the
scores of previously tested individuals
● Criterion referenced: compare the value obtained to a previously
established absolute standard
MEASUREMENT RELIABILITY
TEST-RETEST RELIABILITY
“REPRODUCIBILITY”
The instrument is used on two separate occasions on the same subjects
Challenge: how much time should pass between two measurements
INTERNAL CONSISTENCY
Relevant to self-report instruments

These surveys usually have several items or questions, groups of which are
designed to measure different concepts or constructs within the instrument
The scores that are obtained on its individual parts or items correlate with one
another
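
One widely used statistic for internal consistency is Cronbach's alpha. The slides do not name a statistic, so treat this as an illustrative choice; the function name and the survey data are made up.

```python
from statistics import variance


def cronbach_alpha(responses):
    """responses: one list of item scores per respondent (same items for all)."""
    k = len(responses[0])              # number of items in the scale
    items = list(zip(*responses))      # regroup scores item by item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical 3-item survey answered by 4 respondents.
survey = [
    [1, 1, 1],
    [2, 2, 2],
    [3, 3, 3],
    [4, 4, 4],
]
print(cronbach_alpha(survey))
```

Values closer to 1 indicate that the items vary together, which is what "scores on individual items correlate with one another" means in practice.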
MEASUREMENT RELIABILITY
PARALLEL FORMS:
Also relevant to self-report instruments

Parallel forms reliability can be established only in cases where two
versions of the instrument exist, both of which measure the same
constructs or concepts.
SPLIT-HALF:
eliminates the need for two test administrations by combining the two
forms of a survey instrument into one longer version
MEASUREMENT RELIABILITY
RATER RELIABILITY:

Stability of measure will also depend on the person/s collecting it

INTERRATER RELIABILITY: 2 or more raters, single test

INTRARATER RELIABILITY: 1 rater, multiple tests
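
Agreement between two raters on a categorical measure is often summarized with Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The slides do not specify a statistic, so this is an assumed example with hypothetical ratings.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters scoring the same subjects."""
    n = len(rater_a)
    # Proportion of subjects on which the two raters gave the same rating.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Agreement expected if each rater assigned categories independently.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    # Note: undefined (division by zero) if both raters use a single category.
    return (observed - expected) / (1 - expected)


rater_1 = ["pos", "pos", "neg", "neg"]
rater_2 = ["pos", "neg", "pos", "neg"]
print(cohen_kappa(rater_1, rater_1))
print(cohen_kappa(rater_1, rater_2))
```

The same function can describe intrarater reliability by passing one rater's scores from two separate occasions.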


MEASUREMENT VALIDITY
Appropriateness or correctness

Ability of a measure to capture what it is intended to capture


MEASUREMENT VALIDITY
FACE VALIDITY

Simplest and most subjective form of measurement validity

A subjective assessment of the degree to which an instrument appears to
measure what it is designed to measure

“Does this instrument appear to be the appropriate choice to measure
this variable?”
MEASUREMENT VALIDITY
CONTENT VALIDITY

Items represent all of the relevant facets of the variable the instrument is intended to
measure

CONSTRUCT VALIDITY

Degree to which the measure reflects the operational definition of the concept
or construct it is said to represent
MEASUREMENT VALIDITY
Convergent Validity
Scores of tests measuring a similar construct will be similar
Discriminant Validity
Scores of tests measuring dissimilar constructs will be dissimilar

CRITERION VALIDITY
Compare scores obtained with the target tool to scores of reference instrument
(gold standard)
MEASUREMENT VALIDITY
Concurrent Validity

evaluating criterion validity that involves administering the test of interest and reference
standard test at nearly the same time

Predictive Validity

evaluating criterion validity that reflects the degree to which the results from the
test of interest can predict a future outcome
RESPONSIVENESS TO CHANGE
The MDC (minimal detectable change) is the amount of change that just exceeds the
standard error of measurement of an instrument
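
The slide defines the MDC only in words. One commonly used convention, assumed here and not taken from the slides, estimates the standard error of measurement (SEM) from a sample standard deviation and a reliability coefficient, then scales it to a 95% confidence level. The function names and example numbers are hypothetical.

```python
import math


def standard_error_of_measurement(sd, reliability):
    # SEM = SD * sqrt(1 - r), where r is a reliability coefficient such as an ICC.
    return sd * math.sqrt(1 - reliability)


def minimal_detectable_change(sem, z=1.96):
    # MDC95 = z * sqrt(2) * SEM; sqrt(2) accounts for error in both measurements.
    return z * math.sqrt(2) * sem


sem = standard_error_of_measurement(sd=10.0, reliability=0.91)
print(sem)
print(minimal_detectable_change(sem))
```

A change score smaller than the MDC cannot be distinguished from measurement error on that instrument.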

FLOOR EFFECTS:
When the lowest possible score that can be achieved on the test does not capture
gradations of the lowest level of performance

CEILING EFFECTS:
When the highest possible score that can be achieved on a test does not capture the full
range of improvement an individual may make
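
A quick way to screen for floor and ceiling effects is to count how many subjects score at the lowest or highest possible value. The sketch below assumes a hypothetical 0 to 10 scale and made-up scores; the proportion that counts as a problem is a judgment call the slides do not specify.

```python
def floor_ceiling_proportions(scores, min_score, max_score):
    """Return the share of subjects at the lowest and highest possible scores."""
    n = len(scores)
    at_floor = sum(1 for s in scores if s == min_score) / n
    at_ceiling = sum(1 for s in scores if s == max_score) / n
    return at_floor, at_ceiling


scores = [0, 0, 5, 10, 10, 10, 7, 3, 10, 2]
print(floor_ceiling_proportions(scores, min_score=0, max_score=10))
```

A large share of subjects at either extreme suggests the tool does not match the expected range of performance in that population.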
FOR SEARCH DATABASE

**include reliability or validity in the search terms

https://www.sralab.org/rehabilitation-measures
QUIZZZZZZ
1. Give 2 examples for interval level of measurement
2. Difference between dependent and independent variable
3. Identification: “Multi stage sampling”
1. Give 2 examples for ratio level of measurement
2. Difference between interrater and intrarater reliability
3. Describe Content Validity
Validity and Reliability

● A good measurement tool is:


○ Reliable
○ Valid
○ Responsive
● Sources of Measurement Error
○ Instrument itself
○ Human error (rater)
○ Innate variability in the characteristic being measured
CONTENT VALIDITY

Items represent all of the relevant facets of the variable the instrument is intended to
measure

CONSTRUCT VALIDITY

Degree to which the measure reflects the operational definition of the concept
or construct it is said to represent

Convergent

Discriminant
CRITERION VALIDITY
Compare scores obtained with the target tool to scores of
reference instrument (gold standard)
CONCURRENT
Scores taken at relatively the same time and examine
correlation between them
PREDICTIVE TESTING
FLOOR EFFECTS:
When the lowest possible score that can be achieved on the test does not capture
gradations of the lowest level of performance

The ideal tool matches expected range of performance of the subject population being
investigated

CEILING EFFECTS:
When the highest possible score that can be achieved on a test does not capture the full
range of improvement an individual may make

Researchers try to choose tools that assure no ceiling effects


VALIDITY IN RESEARCH
DESIGNS
“Believability”
=Internal Validity
Threats to research Validity

For evidence-based physical therapists the hope is that these threats
have been anticipated by the investigator and addressed in such a way as
to minimize their impact (keeping in mind of course that there is no such
thing as a perfect research design!).
Threats to Research Validity of Intervention
Studies
Example scenario: a study of school-based physical therapy for children with spastic diplegia that is
investigating the comparative effectiveness of task-specific functional training
versus a neurodevelopmental treatment (NDT) approach
Threats related to Subjects
SELECTION

(1) access to the population of interest is limited and/or

(2) the method used to choose participants results in a sample that is not
representative of the population of interest
SELECTION
■ The Problem:
● The need to wait until individuals first seek medical attention for a condition being
investigated
● Implementation of nonprobabilistic sampling approaches
■ Solution:
● Research designs in which more than one location serves as the
study site
● use a randomized allocation technique to put the subjects into groups.
Allocation (Assignment)

The process of placing subjects into groups results in differences in baseline characteristics between (or
among) the groups at the outset of a study


Allocation (Assignment)
Problem: it is possible to envision that children might be different in their age, gender, number and effect of other
disease processes, current motor control abilities, and severity of spasticity. If the group allocation process is successful,
then these characteristics will be distributed equally between the functional training group and the NDT group before either
intervention is applied.

Solution: use a randomized allocation technique to put the subjects into groups. When the
random assignment process is unsuccessful, or when randomization to groups is not possible
logistically (as in nonexperimental designs), then statistical adjustment for baseline differences
should be implemented.
Threats related to Subjects
ATTRITION
DROP OUT/ MORTALITY
Loss of subjects during the course of the study

When subjects withdraw from a study they reduce the sample size and, in the case
of multiple groups, introduce the possibility of group inequality in terms of relevant
characteristics.
Reduction in sample size has implication for statistical analysis

power of statistical tests to detect a difference among the groups may be lost
ATTRITION
Problem: children may withdraw because of competing school demands, boredom or
frustration, a change in a parent’s availability to bring them to the treatment session, and
so on.

Solution:
In well-funded studies, researchers may replace the lost subjects
If not, researchers must document the characteristics of the subjects who withdrew and their reasons for withdrawal
Investigators should not arbitrarily remove subjects to equalize the numbers between groups
because this step introduces additional bias into the study

An “intention-to-treat” analysis may be performed in cases when it is possible to collect outcomes
data on study dropouts
Threats related to Subjects

MATURATION
● refers to changes over time that are internal to the subjects
● A person’s physical, psychological, emotional, and spiritual status progresses and
declines as a result of, or in spite of, the events in the “outside world.”
● age, growth, increased experience and familiarity with a particular topic or skill, healing,
development of new interests and different motivations, and so on.
MATURATION

Problem: The children with spastic diplegia certainly will be changing as a result of their
natural growth process. How that growth affects their muscle length and tone, their
coordination, their understanding of the interventions applied to them, and so on may
influence their ability and desire to improve their gait patterns. If the investigators cannot
account for the reality of the children’s growth, how can they conclude which, if either,
intervention facilitated the subjects’ normalization of gait?

Solution: Take repeated baseline measurements before the intervention begins. Baseline measures that are similar to one another
would suggest that maturation is not occurring.
Compensatory Rivalry or Resentful Demoralization

If communication among study participants is not tightly controlled, then subjects may acquire knowledge about the different groups’ activities
Control group subjects may become resentful of not receiving the same treatment as the experimental group, leading to:
1. Compensatory rivalry, or the “we’ll show them” attitude; or
2. Resentful demoralization, or the “we’re getting the short end of the stick so why bother” attitude.
Compensatory Rivalry or Resentful Demoralization
Problem: The children may go to the same school and may find out the
inequality of the intervention given

Solution: Keep the subjects separate (enroll subjects in the study who are from
different schools).

Mask/blind the subjects so that they do not know which group they
belong to.
Threats related to Subjects
Diffusion or Imitation of Treatment
subjects in different groups have contact with one another during the study

Either purposefully or unintentionally, these individuals may share aspects of the treatment in their
group that prompts changes in behaviors by members in a different group
Diffusion or Imitation of Treatment

Problem: If the children in the functional training group describe the tasks they perform as
part of their treatment, then children in the NDT group might start performing these same
tasks at home in an effort to copy or try out these alternative activities. If this situation
occurs, it will be difficult to attribute any changes to the NDT approach

Solution: the same as those applied to compensatory rivalry and resentful demoralization.
Threats related to Subjects
Statistical Regression to the Mean
subjects enter a study with an extreme value for their baseline measure of the outcome of interest.
Statistical Regression to the Mean

Problem:
a few children in the gait study may have a crouched gait pattern composed of hip and knee flexion in excess of 100°
when compared to the majority of children in the sample whose flexion at the same joints exceeds normal by only 10° to
25°. As a result, the children with the crouched gait pattern may show improvement in their kinematics simply because a
second measure of their hip and knee position is likely to show decreased flexion. This situation will cloud the true effects, if
any, of the interventions

Solution:
● eliminate outliers from the baseline scores so the sample is limited to a distribution that is closer to
the mean
● take repeated baseline measures and average them to reduce the extremes through the aggregation
process
Threats Related to Investigators

Compensatory Equalization of Treatments


occurs when the individuals providing the interventions in the study purposefully or inadvertently supplement
the activities of subjects in the control (comparison) group to “make up for” what subjects in the experimental
group are receiving.
Compensatory Equalization of Treatments

Problem: children in the functional training group may walk just as well as children in the NDT group because the
physical therapists provided additional training opportunities to the children who were performing task-specific skills.

Solution: mask the investigator(s) or the therapist(s) such that they do not know which group
is receiving which intervention.

● A second step is to provide a clear and explicit protocol for intervention administration, including a script for
instructions if indicated.
● A third step is to ensure that communication about the interventions between investigators or therapists is minimized
or eliminated
Threats Related to Study Logistics

History

● events that occur outside of an intervention study that are out of the investigators’ control.
● the problem is due to concurrent events, not past incidents.
● The opportunity for a history effect to occur grows as the length of time between measures of the outcome
increases.
History
Problem: The children may have changes in their physical education activities over the course of a school year that may enhance or
diminish their ambulatory abilities during the study

Solution:
● they might use a control or comparison group in addition to their treatment group and then randomly
assign subjects to each
● researchers might contact the schools attended by their subjects and inquire as to the nature and timing of their
physical education activities. The information provided by the schools may allow the investigators to organize the
intervention and data collection timetable in such a way as to miss the changes in physical education activities that
are planned over the course of the school year.
Threats Related to Study Logistics

Instrumentation
Examples: selection of the wrong measurement approach or device, inherent limitations in the measurement,
malfunction of the device, and inaccurate application of the device.
Instrumentation:
Problem: gait analysis of children with spastic diplegia might be performed through visual inspection by the
physical therapist (less accurate), or it might be performed by progressively sophisticated technologic means, including a
video and computer (more accurate). In the former case, the potential for inaccuracy rests in the normal variability that is
part of all human activity. What one physical therapist sees in terms of a child’s ability may not be what another therapist
sees.

Solution:
○ investigators should consider carefully what it is they want to measure and the techniques available to do so
○ Measurement reliability, measurement validity, and responsiveness should be evaluated with an effort toward
obtaining the most useful device or technique known.
○ the conditions under which the measurements are taken (e.g., temperature, humidity) should be maintained at
a constant level throughout the study, if possible.
○ the authors should describe an orientation and training process during which individuals collecting data for the
study learned and demonstrated the proper use of the measurement device and/or technique.
Testing:

subjects may appear to demonstrate improvement based on their growing familiarity with the testing
procedure or based on different instructions and cues provided by the person administering the test
Testing
Problem: The children may demonstrate improvement because of practice with the gait assessment process rather than because of the
functional training or NDT. Similarly, investigators who encourage some children during their test but not others (e.g., “Come on, you
can walk a little farther”) may introduce the potential for performance differences due to extraneous influences rather than the
interventions.

Solution:
● investigators may give the subjects several practice sessions with a particular test or measure before collecting actual data on
the assumption that the subjects’ skill level will plateau
● average the scores of multiple measures from one testing session to reduce the effect of changing skill level through a
mathematical aggregation technique.
● investigators should describe a clearly articulated protocol for administering the test, including a script for instructions or
coaching if indicated.
● competence performing the test to specification also should be verified in all test administrators prior to the start of the actual
data collection.
Threats to Research Validity in Quantitative Studies About Other Components of the Patient/Client Management Model
Additional Solutions to Research Validity Threats

1. they can compensate statistically for the threats through the use of control variables in their analyses.

“Intention-to-treat analysis”: an alternative statistical method to address unbalanced groups

2. the investigators can simply acknowledge that threats to research validity were present and need to be recognized as a
limitation to the study
The Role of Investigator Bias

Selection

Selection criteria that are defined too narrowly relative to the question of interest are an example of possible
investigator bias that is concerning for all studies

Another issue is the purposeful selection or exclusion of some subjects rather than others based on characteristics
not included in the criteria. In both situations, the resulting lack of representativeness in the sample means the question is
answered incompletely and may provide a skewed representation of the phenomenon of interest.
Testing

Individuals responsible for the application of tests and measures may produce inaccurate results due to
their knowledge of subjects’ group classification (if relevant to the research design) or previous test results, or both

the strategy to minimize these threats is to conceal the information from the study personnel so they are not
influenced by, or tempted to respond to, knowledge about the subjects.
Assignment

individuals responsible for enrolling subjects respond to additional information by changing the group to
which a subject has been randomly assigned
Threats to Construct Validity
An evidence-based physical therapist assesses the integrity of construct validity by comparing the
variable(s) with their measures to determine if the latter truly represent the former

construct underrepresentation
lack of sufficient definition of the variable

Hawthorne effect

Tendency to perform better because they are being observed


Placebo Effect

Tendency to report better due to a sham treatment

Carryover effect

Due to multiple treatment interaction


Additional threats to construct validity occur when there are interactions
between multiple treatments or when testing itself becomes a treatment
“Believability”: Qualitative Studies

Triangulation
a method to confirm a concept or perspective generated through the qualitative research
process
1. may use multiple sources of data, such as patients and caregivers, to describe a
phenomenon.
2. may use data collection methods such as interviews and direct observations that focus on the
same issues or phenomena.
3. they may involve multiple researchers who, through discussion of the data they have
collected, provide confirmatory or contradictory information.
Reflexivity
● suggests researchers should reflect on and continually test their assumptions in each phase of the
study
● Intention is not to eliminate bias
● helps investigators place their biases in purposeful relationship to the context of their study and its
participants.
Study Relevance
term used to describe the usefulness of a study with respect to the “real
world.”
-External Validity (quantitative)

-Transferability (qualitative)
Threats to External Validity or Transferability
● Inadequate sample selection—subjects are different than, or they comprise
only a narrowly defined subset of, the population they are said to represent;
● Setting differences—the environment requires elements of the study to be
conducted in a manner different than what would happen in another clinical
setting
● Time—the study is conducted during a period with circumstances
considerably different from the present.
Statistics
Raw form data= compilation of numbers based on observations from a group of
individuals

For them to be useful, they must be:

Organized

Summarized

Analyzed
Statistics
Descriptive
● An evidence-based physical therapist’s job is to consider whether the investigators selected
the right statistical tools for their research question and applied the tools appropriately.
● what information the statistics have provided (i.e., the results of the study)
● what he or she thinks about that information (i.e., whether the results provided are important and
useful)


Descriptive statistics
● Describe the data collected by researchers
● Researchers use descriptive statistics:
○ When the sole purpose of their study is to summarize
numerically details about a phenomenon of interest
○ In studies about relationships and differences to determine
whether their data are ready for statistical testing.
○ In studies about relationships or differences, to provide information about
relevant subject and/or environmental characteristics



Descriptive statistics

The type of statistics used depends on the scale of measurement:
Scale of Measurement
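Ratio: equal intervals and a true zero point. All math functions are meaningful
Distance, age, time, height, weight
Interval: equal intervals but no true zero point. Addition and subtraction are meaningful; ratios are not
Calendar years, temperature (°C or °F), pH
Ordinal: categories are rank ordered. Cannot perform higher math
MMT, education level, income bracket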
Nominal: lowest level of measurement. Can only count the number in each
category
Gender, Blood type, Diagnosis
Most likely category of descriptive statistic for each scale of measurement

Scale of measurement    Central Tendency    Variability
Ratio/Interval          Mean                Standard Deviation
Ordinal                 Median              Frequency, Percentiles
Nominal                 Mode                Percentage in each category
Distribution of data
Measures of Central tendency: tell you where most of your points lie
1. Mean
2. Median
3. Mode
Mean
traditionally is calculated with ratio or interval level data

statisticians discourage calculating means with ordinal level data unless they
were mathematically transformed first;


Median
Ratio

Interval

Ordinal
Mode
Ratio

Interval

Ordinal

Nominal
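As a minimal sketch of how the three measures of central tendency line up with the scales of measurement, the example below uses Python's standard `statistics` module. All of the data values are hypothetical and were made up for illustration; they do not come from any study.

```python
from statistics import mean, median, multimode

# Hypothetical knee flexion range of motion in degrees (ratio data):
# mean, median, and mode are all permissible.
rom_degrees = [110, 115, 120, 120, 125, 130, 155]
rom_mean = mean(rom_degrees)        # arithmetic average
rom_median = median(rom_degrees)    # middle score once the data are sorted
rom_modes = multimode(rom_degrees)  # most frequent score(s), returned as a list

# Hypothetical manual muscle test grades (ordinal data):
# the median is the usual choice; a mean would assume equal spacing between grades.
mmt_grades = [3, 4, 4, 5, 2]
mmt_median = median(mmt_grades)

# Hypothetical blood types (nominal data): only the mode is meaningful.
blood_types = ["O", "A", "O", "B", "O", "AB"]
blood_type_mode = multimode(blood_types)

print("ROM mean/median/mode:", rom_mean, rom_median, rom_modes)
print("MMT median:", mmt_median)
print("Blood type mode:", blood_type_mode)
```

Note that the single high score (155) pulls the mean above the median, which is the pattern expected in positively skewed data.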
Measures of Variations
= how far the data points are spread out from one another and from the center of the distribution

Range: difference between the highest and lowest scores

Standard deviation (SD): typical spread of the scores around the mean

Variance: square of the standard deviation


Interpercentile range: A measure of the spread from one percentile division point to the next;
may be used to indicate the variability around the median

Coefficient of variation (CV): used for comparing the variability among different measures of
the same phenomenon or among the same measure from different samples

- The coefficient of variation divides the standard deviation by the mean to create a measure of relative
variability expressed as a percentage


Standard Error of Measurement (SEM)- indicates by how much a measurement will vary from the
original value each time it is repeated.
Low levels of SEM= high accuracy
High levels of SEM=low accuracy

Standard error of the mean (SEM)- provides an assessment of the variation in errors that occurs
when repeated samples of a population are drawn.
How far your sample mean is likely to be from the true mean of the population
Measures the precision of the data
Lower SEM= more likely that the calculated mean is close to the true population mean
Standard error of the estimate (SEE)- The standard deviation of the distance between each data point and the
regression line
The measure of the average deviation of the errors

Smaller values for the SEE suggest more accurate predictions.
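The measures of variation listed above can be sketched in a few lines of Python. The scores and the reliability coefficient are hypothetical. The formulas used for the standard error of measurement (SD × √(1 − reliability)) and the minimal detectable change at 95% confidence (1.96 × √2 × standard error of measurement) are the commonly taught versions and are assumptions here, since the slides do not state them.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Hypothetical range-of-motion scores in degrees.
scores = [110, 115, 120, 120, 125, 130, 155]
n = len(scores)

score_range = max(scores) - min(scores)  # highest minus lowest score
sample_variance = variance(scores)       # sample variance (n - 1 denominator)
sd = stdev(scores)                       # square root of the sample variance
cv_percent = sd / mean(scores) * 100     # coefficient of variation, as a percentage
se_mean = sd / sqrt(n)                   # standard error of the mean

# Standard error of measurement needs a reliability coefficient (e.g., an ICC).
reliability = 0.91                       # hypothetical value
se_measurement = sd * sqrt(1 - reliability)
mdc95 = 1.96 * sqrt(2) * se_measurement  # minimal detectable change, 95% level

print(f"Range: {score_range}")
print(f"Variance: {sample_variance:.2f}  SD: {sd:.2f}  CV: {cv_percent:.2f}%")
print(f"Standard error of the mean: {se_mean:.2f}")
print(f"Standard error of measurement: {se_measurement:.2f}  MDC95: {mdc95:.2f}")
```

Keeping the two "SEM" quantities in separately named variables avoids the confusion that the shared abbreviation invites.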


Goniometry
Standard error of measurement : 2.4 to 4.9 degrees

Standard error of the mean (SEM):


Interval or Ratio data

Positively Skewed = Scores that pull the end (or “tail”) of the curve farther out
to the right

Negatively Skewed= Scores that pull the end of the curve farther out to the left


Subject Characteristics
essential to the evidence-based physical therapist to determine how closely the subjects
resemble the individual patient or client about whom the therapist has a question

Frequencies may be stated as the actual number of subjects or as a proportion of the total number
of subjects who have the characteristic.

Effect Size
1. 0.20 minimal effect
2. 0.50 moderate effect
3. 0.80 large effect
Effect sizes that exceed 1 are even more impressive using these standards.

Effect size tells you how meaningful the relationship between variables or the difference
between groups is

A large effect size means that a research finding has practical significance, while a small
effect size indicates limited practical applications
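One common effect size for a difference between two groups is the standardized mean difference (Cohen's d). The sketch below assumes that the 0.20 / 0.50 / 0.80 benchmarks on this slide refer to that statistic, computed with a pooled standard deviation. The group scores and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_1, group_2):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(group_1), len(group_2)
    pooled_var = ((n1 - 1) * variance(group_1) + (n2 - 1) * variance(group_2)) / (n1 + n2 - 2)
    return (mean(group_1) - mean(group_2)) / sqrt(pooled_var)


def label_effect(d):
    """Apply the benchmarks from the slide to the absolute value of d."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "moderate"
    if size >= 0.20:
        return "minimal"
    return "below the minimal benchmark"


# Hypothetical outcome scores for two intervention groups.
functional_training = [10, 12, 14, 16, 18]
ndt = [7, 9, 11, 13, 15]

d = cohens_d(functional_training, ndt)
print(f"d = {d:.2f} ({label_effect(d)})")
```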

LAB ACTIVITY
Research on the formula for each of the following:
Variance:
Standard Error of Measurement (SEM)
Standard error of the mean (SEM)
Standard error of the estimate (SEE)
Standard deviation (SD)
Minimal Detectable Change (MDC)
Effect size


QUIZZZZ
1. Describe the measures of central tendency and measures of variation

2. Illustrate:

Negative Skew


TYPES OF ERRORS
● TYPE 1: False positive
● TYPE 2: False negative

                 H0 is true              H0 is false
Reject H0        Type 1 error            ☺ (correct decision)
Retain H0        ☺ (correct decision)    Type 2 error

● Decision vs. truth

True negative: patient does not have the condition and the test is negative

True positive: patient has the condition and the test is positive



              Has disease    Does not have disease
Test (+)      TP             FP
Test (-)      FN             TN
Sensitivity vs Specificity

● Sensitivity
○ Ability of a test to correctly identify those who have the disease (true positive rate)

● Specificity
○ Ability of a test to correctly identify those who do not have the disease (true negative rate)
● SnNout:
○ When the sensitivity (Sn) of a test is high and the test is negative, rule out the
condition

● SpPin:
○ When the specificity (Sp) of a test is high and the test is positive, rule in the
condition

○ >0.75= high

○ **mixed won’t work


Relative risk Reduction
● Relative value that reflects the decrease in risk associated with the
intervention
● Disadvantage of RRR is that a relative value does not tell us anything about the
absolute size of the effect
Absolute risk reduction
● Indicates the actual difference in risk between groups
● On its own, does not state how many patients would need to be treated before a benefit is
observed; the number needed to treat is derived from it
Number needed to treat (NNT)
● A statistic that provides the information about the effectiveness in terms of
subject’s numbers
● The number of subjects that would need to be treated to prevent one adverse
outcome or to achieve one beneficial outcome in a given time period
● If results are not statistically significant---”no treatment effect” stop there,
there is no need to go further
● If statistically significant—you need to determine if the effects are clinically
important
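The relationship among relative risk reduction, absolute risk reduction, and number needed to treat can be sketched from two event rates. The counts below are hypothetical, the function name is illustrative, and the sketch assumes the treatment lowers the risk of an adverse outcome. NNT is taken as the reciprocal of ARR, rounded up to a whole patient.

```python
from math import ceil


def risk_measures(control_events, control_n, treated_events, treated_n):
    """Return CER, EER, ARR, RRR, and NNT for an adverse outcome."""
    cer = control_events / control_n   # control event rate
    eer = treated_events / treated_n   # experimental event rate
    arr = cer - eer                    # absolute risk reduction
    if arr <= 0:
        raise ValueError("No risk reduction: NNT is not defined for this sketch")
    rrr = arr / cer                    # relative risk reduction
    # Round before taking the ceiling so float noise cannot add an extra patient.
    nnt = ceil(round(1 / arr, 9))
    return {"CER": cer, "EER": eer, "ARR": arr, "RRR": rrr, "NNT": nnt}


# Hypothetical trial: 20 of 100 control patients and 10 of 100 treated patients
# experienced the adverse outcome.
result = risk_measures(20, 100, 10, 100)
print(result)
```

With these numbers the RRR is 50% while the ARR is only 10 percentage points, which illustrates why a relative value alone can overstate the size of an effect.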
MCID: Minimal Clinically Important Difference
● Smallest change score associated with a patient’s perception of a change in
health status
● Depends on attributes of outcome scale, clinical context, clinical judgement
Statistics: Inference
Inferential statistics permit estimates of population characteristics based on data
from the subjects available for study.
Parametric Statistics: ratio or interval level measures
Data are normally distributed
tests of relationships and tests of differences.



Two important concepts of statistical reasoning:

1. Probability= given all the possible outcomes, the likelihood that any one event
will occur
2. Sampling error= the tendency for sample values to differ from population
values
Parametric test of relationships
Used when an investigator wants to determine whether two or more
variables are associated with one another.
○ may be used in methodologic studies to establish the reliability and
validity of measurement instruments.
○ investigators perform these tests to examine the intrarater or interrater
reliability of data collectors prior to the start of a study.
○ used to evaluate whether all variables under consideration need to remain in
the study
○ used to model and predict a future outcome as is required in studies of prognostic
factors.

Multicollinearity: The degree to which independent variables in a study are related to one another


Parametric
● Equal distribution
● Randomization
● Bell-shaped curve: N distribution
● Quantitative data
● More powerful
Risk (uncertainty)

● Statistical significance is judged by comparing the obtained p value with the
alpha (α) level, also called the significance level
● p value= probability of obtaining a result at least as extreme as the one observed if
the null hypothesis were true
○ often summarized as the probability that a study’s findings occurred due to chance
● p ≤ .05= most common maximum amount of risk of being
wrong most researchers are willing to accept (5 chances
out of 100)
○ “statistically significant finding”
alpha (α) level, or significance level:
term used to indicate the threshold the investigators selected to detect
statistical significance when they designed their study, the traditional value of
which is 0.05
p values lower than the α = 0.05 threshold indicate even lower probabilities of the
role of chance.

The lower the alpha level (and resulting obtained p value), the lower the opportunity for
a type I error.





Confidence Interval
An alternative to the p value for reporting differences among groups

range of scores within which the true score for a variable is estimated to lie with a specified probability

CI: range within which the population parameter is likely to be
68% likely; 90% likely; 95% likely
95% CIs are most common (analogous to α=0.05)
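A confidence interval for a mean can be sketched as the sample mean plus or minus a critical value times the standard error of the mean. The example below uses the normal (z) critical value, which is a simplifying assumption that suits large samples; small samples would normally use a t critical value instead. The summary statistics and the function name are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sample_sd, n, level=0.95):
    """Normal-approximation confidence interval for a population mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level is 0.95
    margin = z * sample_sd / sqrt(n)           # z times the standard error of the mean
    return sample_mean - margin, sample_mean + margin


# Hypothetical summary statistics: mean 50, SD 10, 100 subjects.
for level in (0.68, 0.90, 0.95):
    low, high = mean_confidence_interval(50, 10, 100, level)
    print(f"{level:.0%} CI: {low:.2f} to {high:.2f}")
```

The interval widens as the confidence level rises, and it narrows as the sample size grows because the standard error shrinks.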
Power
● probability that a statistical test will detect, if present, a relationship between
two or more variables or a difference between two or more groups
● Failure to achieve adequate power may result in a type II error, a
situation in which the null hypothesis is retained incorrectly (a false
negative).

Parametric Test of Relationship
Correlation statistics
● Relationships between 2 variables
● Pearson r (parametric) or Spearman rho (nonparametric counterpart for ranked data)
○ Strength of relationship: positive values from 0 to 1.00 (direct relation);
negative values from –1.00 to 0 (inverse relation)

• 0.00–0.25 “little or no relationship”
• 0.26–0.50 “fair degree of relationship”
• 0.51–0.75 “moderate to good relationship”
• 0.76–1.00 “good to excellent relationship”

Correlation coefficient (r)
DIRECTION:
(–) or (+) sign in front of the correlation coefficient
(–) indicates a negative or inverse relationship,

(+) indicates a positive relationship



● Regression
○ Purpose is to generate an equation that relates X to Y, such that, given a value of X, the value of Y can
be predicted
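Correlation and simple linear regression can be sketched by hand from sums of squares. The paired values are hypothetical (for example, X could be weeks of training and Y a gait score). The standard error of the estimate uses n − 2 in the denominator, which is the usual convention for one predictor and is an assumption here.

```python
from math import sqrt
from statistics import mean

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x, mean_y = mean(x), mean(y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)

r = sxy / sqrt(sxx * syy)           # Pearson correlation coefficient
slope = sxy / sxx                   # regression slope (b)
intercept = mean_y - slope * mean_x # regression intercept (a)


def predict(new_x):
    """Predict Y from X using the fitted line Y = a + bX."""
    return intercept + slope * new_x


# Standard error of the estimate: spread of the data points around the line.
residuals = [yi - predict(xi) for xi, yi in zip(x, y)]
see = sqrt(sum(e ** 2 for e in residuals) / (len(x) - 2))

print(f"r = {r:.2f}")
print(f"Y = {intercept:.1f} + {slope:.1f}X, predicted Y at X = 6: {predict(6):.1f}")
print(f"SEE = {see:.2f}")
```

The sign of r gives the direction of the relationship and its absolute value gives the strength, which can then be read against the descriptive ranges listed above.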
Parametric Tests of Differences
used when an investigator wants to compare mean scores from two or more groups of subjects or repeated
scores from the same subjects

“between-group” tests= Statistical comparisons made using distinct groups of subjects
“within group”= comparisons made using repeated measures on the same subjects


Parametric Test of Difference

T-test

=interval or ratio

=comparing 2 groups
1. Independent Sample= compare difference between 2 independent
samples/groups
2. Paired T-test=compares difference between 2 matched samples
● 1 tailed
○ Directional hypothesis, 1 end of the distribution
○ Either positive or negative
● 2 tailed
○ Non-directional hypothesis, 2 ends of the distribution
○ Both positive and negative
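The two forms of the t-test can be sketched by computing the t statistic and degrees of freedom directly. The scores are hypothetical, the function names are illustrative, and the independent-samples version assumes equal variances (pooled variance). The sketch stops at the t statistic; the p value would come from a t table or statistical software.

```python
from math import sqrt
from statistics import mean, stdev, variance


def independent_t(group_1, group_2):
    """t statistic and df for two independent groups (pooled variance)."""
    n1, n2 = len(group_1), len(group_2)
    pooled_var = ((n1 - 1) * variance(group_1) + (n2 - 1) * variance(group_2)) / (n1 + n2 - 2)
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group_1) - mean(group_2)) / se, n1 + n2 - 2


def paired_t(before, after):
    """t statistic and df for repeated measures on the same subjects."""
    diffs = [a - b for b, a in zip(before, after)]
    se = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / se, len(diffs) - 1


# Between-group comparison: two hypothetical groups of different children.
t_ind, df_ind = independent_t([10, 12, 14, 16, 18], [7, 9, 11, 13, 15])

# Within-group comparison: the same hypothetical children before and after.
t_pair, df_pair = paired_t([10, 12, 14, 16, 18], [12, 13, 17, 18, 20])

# Two-tailed critical values at alpha = 0.05 from a standard t table:
# 2.306 for df = 8 and 2.776 for df = 4.
print(f"Independent: t = {t_ind:.2f}, df = {df_ind}")
print(f"Paired:      t = {t_pair:.2f}, df = {df_pair}")
```

In this made-up example the paired comparison exceeds its critical value while the independent comparison does not, even though the scores are of similar size, because pairing removes between-subject variability.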

ANOVA
● One-way ANOVA
○ 2 or more independent groups compared on 1 independent variable (e.g., intervention)
● Two-way ANOVA
○ 2 or more independent groups compared on 2 independent variables
● Repeated measures ANOVA
○ The same individuals measured over time
ANCOVA
● Compare 2 or more treatment groups while controlling for the effects of other variables
(covariates)
Nonparametric Statistics
Nominal
Ordinal

● Unequal distribution
● No randomization
● Skewed curve
● Qualitative
● Less powerful than parametric

Non-Parametric Test
● Chi-squared test:
○ Use of nominal data to find difference between groups

● Mann-Whitney U
○ Designed to test the null hypothesis that 2 independent samples come from the same population
■ Cannot draw a line; continuation of the population
● Kruskal Wallis
○ 3 or more groups compared (unequal distribution)
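As one example of a nonparametric test, the chi-squared statistic for nominal data can be sketched by comparing observed counts with the counts expected if group and outcome were unrelated. The table below is hypothetical, and the function name is illustrative.

```python
def chi_square(observed):
    """Chi-squared statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (count - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df


# Hypothetical counts: rows are two intervention groups,
# columns are "improved" and "not improved".
table = [[30, 10],
         [20, 40]]

chi2, df = chi_square(table)
# Critical value at alpha = 0.05 with df = 1 is 3.841 (standard chi-squared table).
print(f"chi-squared = {chi2:.2f}, df = {df}")
```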



QUIZZZ
● 1. Difference between ANCOVA and ANOVA

LAB week 5
● Choose one study from your list during the Week 1 activity (Lit search)
● Indicate the Study question, variables and measures
● Indicate the category of statistical analysis used to answer the actual question
● Indicate the specific statistical analysis used
APPRAISING EVIDENCE ABOUT
DIAGNOSTIC TESTS AND
CLINICAL MEASURES
• Diagnosis
• Differential diagnosis

• In both cases, the conclusions reached are informed by results from the patient’s history, clinical examination, and associated diagnostic
tests
What is Diagnosis?
• Medical Diagnosis:
• Herniated Disc L4-5
• R CVD
• Physical Therapy Diagnosis:
• Right Sided radiculopathy centralizing with repeated extension
• Left-sided hemiplegia with all movements in synergy with marked
increased muscle tone
• Diagnostic tests have three potential purposes in physical therapist
practice:
• (1) to help focus the examination on a particular body region or
system,
• (2) to identify potential problems that require referral to a physician or
other health care provider,
• (3) to assist in the classification process
• For a given patient, there is a baseline probability of a certain condition before testing
• “Pretest” probability

• Application of a clinical diagnostic test alters the baseline probability
• “Posttest” probability
Approaches that use evidence from research
to guide pretest probability estimates
1. Patients with the same clinical problem undergo thorough
diagnostic evaluation, which yields a set of frequencies of the
underlying diagnosis made, which clinicians can use to estimate the
pretest probability
2. Clinical decision rules or prediction rules are generated in which
patients with a defined clinical problem undergo diagnostic
evaluation. Investigators use statistical methods to identify clinical
and diagnostic test features, separating the patients into
subgroups with different probabilities of the target disorder
Diagnostic Process
• Generate possibilities and their relative likelihood or probabilities
• Gather new information to clarify your initial diagnostic possibilities
• Revise pretest and posttest probabilities
Diagnostic Probabilities
• Leading hypothesis or Working Diagnosis
• Active Alternative Diagnoses
• Remote Possibilities
Process of EBP
1. A-Translation of uncertainty to an answerable question
2. A-Systematic retrieval of best available evidence
3. A-Critical appraisal of evidence for validity, clinical relevance, and
applicability
4. A-Application of results in practice
5. A-Evaluation of performance
PICO becomes PEcO
• P: patient/problem
• E: exposure to the diagnostic test
• c: comparison (rarely) of multiple diagnostic tests (can be a study of
clinical prediction rule)
• O:Outcome

• Are (is) _________ (I) more accurate in diagnosing ________ (P) compared with ______ (C) for _______ (O)?
• For a 30 year old male with radiating low back pain, what is the
diagnostic accuracy of the crossed straight leg raise test?
• How does the performance of a new diagnostic test compare against
a gold standard?

• Index test or measure: The diagnostic test or clinical measure of interest, the utility of which is being evaluated through comparison to
a gold (or reference) standard test or measure.
• Gold standard (also referred to as reference standard) test or
measure: A diagnostic test or clinical measure that provides a
definitive diagnosis or measurement. A “best in class” criterion test or
measure
Process of EBP
1. A-Translation of uncertainty to an answerable question
2. A-Systematic retrieval of best available evidence
3. A-Critical appraisal of evidence for validity, clinical relevance, and
applicability
4. A-Application of results in practice
5. A-Evaluation of performance
• PubMed Clinical Queries (Diagnosis, Systematic reviews, Clinical
prediction guides)
• Cochrane
• National Guideline Clearinghouse
Inside the Study
• Design
• Research question/purpose statement/Aim
• Subjects
• Target Diagnostic test (index test)
• Gold standard (reference test) comparison
• Sman, A. D., Hiller, C. E., Rae, K., Linklater, J., Black, D. A.,
Nicholson, L. L., Burns, J., & Refshauge, K. M. (2015).
Diagnostic accuracy of clinical tests for ankle syndesmosis
injury. British Journal of Sports Medicine, 49(5), 323–329.
https://doi.org/10.1136/bjsports-2013-092787
Diagnostic accuracy of clinical tests for
ankle syndesmosis injury
• Design: Cross-sectional diagnostic accuracy
Research question/Purpose statement/Aim
• Usually found at the end of the introduction
• ”Aim” or “Purpose”
• Clarifies the condition and diagnostic test being examined
Sample Article

ABSTRACT:
• Our aim was to investigate the diagnostic accuracy of the clinical presentation of ankle
syndesmosis injury and four common clinical diagnostic tests.

At the end of the introduction:
• The aim of the present study was to investigate the accuracy of four common clinical
diagnostic tests for identification of ankle syndesmosis injury. In addition, we investigated
whether the clinical presentation has diagnostic value.
Subjects
• INCLUSION CRITERIA:
• To be included, participants had to be aged between 16 and 60 years and
present to any of the participating podiatry, physiotherapy or sports
medicine centers in two Australian cities with an ankle sprain injury within
1 week of the injuring incident. In addition, they had to have a positive
result on one or more of the investigated clinical tests, and suspicion of
potential ankle syndesmosis involvement.
• EXCLUSION CRITERIA
• (1) suspicion of lower limb fracture, (2) or an isolated anterior talofibular
ligament sprain, because tests for ankle syndesmosis and MRI would not
normally be prescribed, and the optimal method for determining test
validity is to examine the tests in the clinical situation in which they are
used,33 (3) the inability to obtain an MRI scan within 2 weeks of injury.
• A reader should be able to decide from this whether a particular
patient is sufficiently similar to these study subjects to justify
generalization of results
Did the investigators include subjects with all
levels or stages of the condition being
evaluated by the index test?
• Clinicians must face diagnostic uncertainty
• An appropriate array of severity (mild, moderate, severe) and some subjects
without the target condition

• Beware of studies where asymptomatic healthy “controls” are compared to patients with obvious disease or severe manifestations of target disorder
Review of Validity (Quality)Factors
• Did the patients present to clinicians a genuine diagnostic dilemma?
• Did investigators compare the index test to an appropriate,
independent reference standard test?
• Were those interpreting the reference standard test blind to results of
the index test (and vice versa)?
• Did all the patients receive the reference standard test regardless of
results on the index test?

• The study should provide evidence that the reference is a gold standard
• Do you consider the gold standard test to be the right one?
• Independent: Index test must not be a component of the gold
standard test
• Target Disorder
• Ankle Syndesmosis Injury
• Gold Standard or Reference Test
• MRI Results
• Arthroscopy is the gold standard for diagnosis of ankle syndesmosis injury.4
27 28 However, arthroscopy is invasive, carries serious risks and is expensive.
MRI, however, has diagnostic accuracy similar to arthroscopy while being
relatively noninvasive, less costly and having few associated risks.28 30–32
MRI was therefore selected for use in our study as the reference standard
Blinding
• If investigators are not blinded, this introduces bias

• Independent: Person determining target test score is different from person determining the reference test score
• FOR THE SAMPLE ARTICLE:
• Blinded: the testers are blinded to the score of the comparison test
• FOR THE SAMPLE ARTICLE:
Was the diagnostic test evaluated in an
appropriate spectrum of patients?
• Across full range of presenting signs and symptoms
• Including conditions typically confused with the target dx
• Answer must be yes to get a score of 1 for validity

• Spectrum bias, lack of sufficient heterogeneity, is typically thought to occur when diagnostic test performance varies across patient
subgroups and a study of that test's performance does not
adequately represent all subgroups.
Did all patients receive the reference standard
test regardless of results on the index test?

• All patients should receive the gold standard test
• If not, the study will distort the properties of the
proposed diagnostic test:
• Work-up bias or verification bias
Did the researchers repeat the study with a
new set of subjects?
• Most typically not completed
Statistical Analysis
• Goal is to apply findings from this sample to the target population
• Still using inferential statistics but not null hypothesis testing of
between group differences
• Proportion of subjects in which target and reference scores are in
agreement
What are the results?
• Sn (with 95%CI)
• Sp (with 95%CI)
• Positive Likelihood ratio (with 95%CI)
• PLR (OR LR+)
• Negative likelihood ratio (with 95%CI)
• NLR (OR LR-)
2X2 TABLE: DIAGNOSTIC TEST STUDY

                     | GOLD STANDARD TEST RESULT
INDEX TEST RESULTS   | CONDITION PRESENT (+) | CONDITION ABSENT (-)
(+) CLINICAL TEST    | TRUE POSITIVE (a)     | FALSE POSITIVE (b)
(-) CLINICAL TEST    | FALSE NEGATIVE (c)    | TRUE NEGATIVE (d)
SENSITIVITY
• True positive Rate
• Proportion of patients with the condition who have a positive test
result
• Tests with high sensitivity have few false negatives
• SnNout: with a highly Sensitive test, a Negative result rules the condition out
SPECIFICITY
• True Negative Rate
• Proportion of patients without the condition who have a negative test
result
• Tests with high specificity have few false positives
• SpPin: with a highly Specific test, a Positive result rules the condition in
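The two definitions above can be sketched in a few lines of Python using the cell labels a, b, c, d from the 2x2 table. The counts below are hypothetical and chosen only for illustration; they do not come from the sample article.

```python
def sensitivity(a: int, c: int) -> float:
    """True positive rate: true positives / everyone WITH the condition."""
    return a / (a + c)


def specificity(b: int, d: int) -> float:
    """True negative rate: true negatives / everyone WITHOUT the condition."""
    return d / (b + d)


# Hypothetical 2x2 counts (not from any study):
# a = true positives, b = false positives,
# c = false negatives, d = true negatives
a, b, c, d = 45, 10, 5, 90

sn = sensitivity(a, c)
sp = specificity(b, d)
print(f"Sn = {sn:.2f}, Sp = {sp:.2f}")
```

Note that sensitivity uses only the "condition present" column (a and c) and specificity uses only the "condition absent" column (b and d).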
MRI
• Sn= 100%
• Sp=93.1%
• For detection of ankle syndesmosis injury

• Since there were variations, the researchers used:
• Sn= 95%
• Sp=90%
EXAMPLE: Crossed SLR for dx of lumbar
radiculopathy
• Sn= 0.28 (95%CI 0.22 to 0.35)
• Sp= 0.90 (95%CI 0.87 to 0.95)
• Can you trust a positive test result?

• Can you trust a negative test result?
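One way to approach these two questions is to convert the Crossed SLR point estimates into likelihood ratios. The sketch below uses only the Sn and Sp values quoted above and ignores the confidence intervals, so treat the output as a rough guide rather than a full appraisal.

```python
def positive_lr(sn: float, sp: float) -> float:
    """LR+ = sensitivity / (1 - specificity)."""
    return sn / (1 - sp)


def negative_lr(sn: float, sp: float) -> float:
    """LR- = (1 - sensitivity) / specificity."""
    return (1 - sn) / sp


# Point estimates from the Crossed SLR example
sn, sp = 0.28, 0.90

plr = positive_lr(sn, sp)
nlr = negative_lr(sn, sp)
print(f"LR+ = {plr:.2f}, LR- = {nlr:.2f}")
```

With these point estimates the LR+ is well above 1.0 while the LR- stays close to 1.0, which suggests that a positive result is more informative than a negative one for this test.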


Positive Predictive Value
• ability of a diagnostic test to correctly determine the proportion of
patients with the disease from all of the patients with positive test
results
Negative Predictive Value
• Describes the ability of a diagnostic test to correctly determine the
proportion of patients without the disease from all of the patients with negative test
results
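In terms of the 2x2 table cells, the predictive values read along the rows rather than the columns. A minimal sketch, assuming hypothetical counts that are not taken from any study:

```python
def positive_predictive_value(a: int, b: int) -> float:
    """Of all positive index tests (a + b), the share that truly have the condition."""
    return a / (a + b)


def negative_predictive_value(c: int, d: int) -> float:
    """Of all negative index tests (c + d), the share that truly do not have it."""
    return d / (c + d)


# Hypothetical 2x2 counts:
# a = true positives, b = false positives,
# c = false negatives, d = true negatives
a, b, c, d = 45, 10, 5, 90

ppv = positive_predictive_value(a, b)
npv = negative_predictive_value(c, d)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Because the row totals depend on how many subjects in the sample have the condition, predictive values change with prevalence, whereas sensitivity and specificity are calculated within each column.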
Likelihood Ratio (LR)
• Very useful in estimating the magnitude of the impact of sensitivity
and specificity scores, considered together
• Increase or decrease in the odds of having the target condition, given
positive or negative target test
• If not given in the study, calculate from sensitivity and specificity
scores
• Used to reduce uncertainty about a patient’s likelihood of having the
target condition
• Pre-test probability of having the condition combined with LR yields a
post-test probability
• LR nomogram= a tool for computation
• If diagnostic test is positive=use PLR
• Useful PLR always >1.0
• If diagnostic test is negative=use NLR
• Useful NLR always <1.0

• If 95% CI for either LR includes 1.0 (null), the LR is not statistically significant
• Positive likelihood ratio (LR+): The likelihood that a positive test
result will be obtained in an individual with the condition of interest as
compared to an individual without the condition of interest
• Negative likelihood ratio (LR–): The likelihood that a negative test
result will be obtained in an individual with the condition of interest as
compared to an individual without the condition of interest.
• Posttest probability: The odds (probability) that an individual has a
condition based on the result of a diagnostic test.
• Pretest probability: The odds (probability) that an individual has a
condition based on clinical presentation before a diagnostic test is
conducted.
Positive Likelihood Ratio
• Sensitivity=
• Specificity=

• Sensitivity / (1 - specificity)
• A positive target test result is _____ times more likely to occur in a
patient who has the condition, as compared to the patient who does
not have the condition
Negative Likelihood Ratio
• Sensitivity=
• Specificity=
• (1 - sensitivity) / specificity
• The likelihood that a negative target test result was found in a person
who has, as compared to does not have, the disease or condition.
• Example: with an NLR of 0.14, a negative target test result is 0.14 times
as likely in a person who has the target condition as in a person who does not
Nomograms
• The nomogram helps a clinician determine whether performing the
test will provide enough additional information to be worth the risk
and expense of conducting the procedure on a specific individual. Use
of a nomogram involves the following steps:
• Determine the patient’s or client’s pretest probability of the condition
• Identify the likelihood ratio for the test
• Connect the dots
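The nomogram steps above are a graphical shortcut for a short calculation: convert the pretest probability to odds, multiply by the likelihood ratio, and convert back to a probability. The sketch below assumes a hypothetical pretest probability of 30%; the likelihood ratios of 2.8 and 0.8 are illustrative values that correspond to the Crossed SLR estimates (Sn 0.28, Sp 0.90) given earlier.

```python
def post_test_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Pretest probability -> odds -> multiply by LR -> back to probability."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


pretest = 0.30  # hypothetical clinical estimate before testing

after_positive = post_test_probability(pretest, 2.8)  # use LR+ when the test is positive
after_negative = post_test_probability(pretest, 0.8)  # use LR- when the test is negative

print(f"Post-test probability after a positive test: {after_positive:.1%}")
print(f"Post-test probability after a negative test: {after_negative:.1%}")
```

A likelihood ratio of exactly 1.0 leaves the probability unchanged, which is why a useful PLR is above 1.0 and a useful NLR is below 1.0.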
Results:
• Goal: State and interpret the statistical findings of the study
• Calculate LR if not provided
• Interpret accordingly
• Assess pre test and post test probabilities using a nomogram analysis
Results for sample test?
Is the diagnostic test adequate?
• Available
• Affordable
• Accurate
• Precise

• For use in your practice setting?


Is the study up to date?
• Has anything changed in the practice environment since the evidence
was gathered that would substantively affect this test?
Is it likely that patients will comply with testing
procedures?
• Is the test:
• TOO PAINFUL
• TOO TIME CONSUMING
• TOO COSTLY
• TOO INCONVENIENT
QUIZ!
• Difference between gold standard and index test
APPRAISING EVIDENCE ABOUT PROGNOSTIC (RISK) FACTORS

Begnoche DM, et al. Predictors of independent walking in young children with
cerebral palsy. Phys Ther. 2016;96(2):183-192.
What is Prognosis?
• The process of predicting the future course of a patient’s or client’s
condition.
• Physical therapists develop prognoses about
• (1) the risk of developing a future problem;
• (2) the ultimate outcome of an impairment in body structures or
functions, an activity limitation, or a participation restriction
• (3) the results of physical therapy interventions
• Medical Prognosis
• How long will I live?
• What if I choose not to have chemotherapy?
• Physical Therapy Prognosis
• What are the odds of returning to basketball this season?
• Can I expect my back pain to simply go away over time?
EBP Approach
• Foreground question
• Get the best available evidence
• Assess the validity of the evidence
• Determine expected prognosis
• Determine modifying prognostic factors
• Decide whether the evidence applies to my patient
• If appropriate, provide evidence-based prognosis to my patient
• Post-treatment, determine whether patient outcome was consistent
with prognosis
• For a 16-year-old male patient with a recent traumatic brain injury,
what is the probability of independent ambulation without an
assistive device upon discharge from an inpatient rehabilitation
facility?
Overall prognosis
• How likely is the outcome?
• Usually expressed as a percent with a 95% confidence interval
• This is the overall probability of the outcome
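As a rough illustration of "a percent with a 95% confidence interval", the sketch below computes a simple normal-approximation (Wald) interval around a proportion. The counts are hypothetical, and the Wald method is an assumption chosen for brevity; a published study may use a different interval method.

```python
import math


def proportion_with_ci(events: int, n: int, z: float = 1.96):
    """Proportion with a normal-approximation (Wald) 95% CI, clipped to [0, 1]."""
    p = events / n
    standard_error = math.sqrt(p * (1 - p) / n)
    low = max(0.0, p - z * standard_error)
    high = min(1.0, p + z * standard_error)
    return p, low, high


# Hypothetical cohort: 60 of 100 patients reached the outcome of interest
events, n = 60, 100
p, low, high = proportion_with_ci(events, n)
print(f"Overall prognosis: {p:.0%} (95% CI {low:.1%} to {high:.1%})")
```

A larger sample narrows the interval, which is what "precision" of the overall prognosis refers to.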
Elements of a Prognosis
• (1) the outcome (or outcomes) that are possible
• (2) the likelihood that the outcome (or outcomes) will occur
• (3) the time frame required for their achievement
• Prognostic factors: general term that describes characteristics
predictive of any type of future outcome
• Risk factors: predictors of future adverse events
Prognosis study
• Typically broad outcomes
• Death, Cure, Return to work, Return to home, Return to sport
• Different purpose and design than an intervention or a diagnosis
study
• Not manipulating the situation for cause and effect relationships
• Not assessing a tool for diagnostic capability
• Are following the course of a condition over time
• Exploratory; Does the presence or absence of certain factors help predict an
outcome
• Could have descriptive components
The inner workings of the study
• Study Design
• Research question and/or hypothesis
• Subjects
• Independent Variables (predictor variables/factors)
• Dependent Variable (Predicted variable/outcome of interest)
• Time frame
Common Prognostic Study Design
• Cohort
• Prospective
• Follow prospectively over time individuals who share certain risk factors to predict who
gets outcome of interest
• More preferred design
• One cohort= all with risk factors
• Two cohorts= one with risk factor (smoking) another without the risk factor (non-smoker)
• Retrospective
• Practical approach
• Weaker design
• Case-control
• Retrospective
• Practical-avoids longitudinal study
• Has a comparison group
• BUT still using historical data
SAMPLE ARTICLE
• Study Design
• Research Question/Purpose Statement/Aim of prognosis study
• Subjects (Demographics, Inclusion/Exclusion)
• Independent Variables (Predictor)
• Dependent Variable/s (Outcome)
Independent Variables (Predictor)
• Demographic factors
• Disease specific factors
• Health characteristics
• Behavioral Characteristics
Dependent Variable (Outcome)
• The outcome the study is trying to predict
• Outcome of interest
• Time frame to achieve or assess the outcome of interest is often
stated as part of the dependent variable
RISK OF BIAS
• 1. Was the sample of patients sufficiently representative?
• Sample must represent population
• Beware of studies wherein patients are highly filtered
• 2. Did all study patients have reasonably similar prognostic risk?
• Beware of studies with great heterogeneity
• Look for adjusted estimates for prognostic factors
• Unadjusted ORs (univariate analyses)
• Adjusted ORs (multivariate analyses)
• 3. Was there sufficiently complete follow-up?
• Study dropouts pose a greater validity threat in studies with low incidence
rates
• The validity threat is more severe if there is reason to believe that those lost
to follow-up had a worse prognosis than those completing the study
• 4. Did the authors apply objective and unbiased outcome criteria?
• Outcomes that are completely subjective may pose a significant validity threat
• 5. If subgroups with different prognoses are identified (#3) and
adjusted for, was there validation in an independent group of “test
set” patients?
Did the researchers provide sufficient information
to describe the sample in their study?
• Must give you both inclusion and exclusion criteria with a reasonable
operational definition for each criterion
Are subjects representative of the target
population with this condition?
• Admission criteria vs description of the subjects who actually entered
the study
• EXAMPLE:
• Traumatic brain injury (TBI) is the leading cause of disability and
death in children and adolescents in the U.S. According to the Centers
for Disease Control and Prevention, the two age groups at greatest
risk for TBI are age 0-4 and 15-19.
• For this study, the mean age at time of admission to inpatient rehab
facility was 10.5 years. Ages ranged from 2.1 to 18.9 years.
Were subjects assembled at a common point
in the course of their disease?
• Using subjects who are similar in terms of their point in the course of
a disease process removes a confounding area of variability
• Easier to see factors most important in predicting an outcome
Were subjects followed for a long enough
timeframe to capture the outcome of interest?
• Does the study provide support that the length of the study is
sufficient to capture the outcome?
• Inadequate length of follow-up may incorrectly classify a subject as
not experiencing the outcome of interest when, in fact, they did, just
at a later date.
• Does it represent an important length of time?
Was the target outcome clearly defined (good
operational definitions)?
Training of Assessors
• Must be evidence that the assessors are able to reliably measure the
outcome of interest.
• Must be evidence that the researchers can reliably identify and
measure each prognostic factor
Are the outcome assessors blinded to the
subject’s prognostic factors?
• The person assessing subject outcome should NOT know subject’s
scores on predictor factors.
Did at least 80% of the subjects finish the
study?
Were prognostic findings validated in an
independent group of patients?
• The study could be organized such that
• It is repeated in a completely new group of subjects;
OR
• If large subject pool is available, 1/2 the subjects (randomly selected) were
analyzed; then repeated using the remaining subjects.
Study Results
• Descriptive Portion
• usually are reported as proportions, such as the percentage of subjects with a
particular risk factor who developed a pressure ulcer or the percentage of
subjects with a certain prognostic factor who returned to work
• Survival curves- plotting events over time
• Survival curves are calculated most commonly when investigators are
evaluating the development of adverse events such as death, side effects from
treatment, and loss of function.
Statistical Tests
• Association among variables (correlational)
• Predicting an outcome
• Linear regression
• Logistic regression

• Regression analyses are used to predict the value of the outcome, as
well as to determine the relative contribution of each prognostic (risk)
factor to this result.
• Correlational analysis
• Is there a relationship between the variables?
• Regression analyses
• Predictive: Do the independent variables predict the outcome?
• Which independent variables are most helpful in predicting an outcome?

• REMEMBER: Prognosis studies are not experimental


Results
• How likely is the outcome of interest?
• Overall prognosis
• How precise is the estimate of overall prognosis?
• Width of 95%CI
• What are the modifying prognostic factors?
• Look for independent factors with associated ORs
• How precise are estimates of influence for prognostic factors?
• Widths of 95% CIs around ORs (or RRs)
How precise are the estimates of likelihood?
• Risk of adverse outcomes are reported with their associated 95%
confidence intervals (CIs)
• In a valid study, the 95% CI defines the range of risks within which it is
highly likely that the true risk lies.
• An odds ratio, relative risk, or hazard ratio is meaningful if it lies
within the confidence interval
• Remember that if a confidence interval around an odds ratio, relative
risk, or hazard ratio contains the value “1,” then the odds, risks, or
hazards are no better than chance, and the predictor is not useful
ODDS RATIO
• reflect the odds that an individual with a prognostic (risk) factor had
an outcome of interest, as compared to the odds for an individual
without the prognostic (risk) factor.

• OR = [a/b]/[c/d] or ad/bc
OUTCOME
                      | Outcome Present | Outcome Absent
Risk Factor Present   | 18 (a)          | 46 (b)
Risk Factor Absent    | 29 (c)          | 36 (d)
Interpreting the ODDS RATIO
• The odds for having the outcome among those with the prognostic
factor are only about 49% as high as the odds for having the outcome
among those without that attribute
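The 49% figure can be checked directly from the table counts (a = 18, b = 46, c = 29, d = 36). The sketch below also adds a 95% confidence interval using the common log-based (Woolf) approximation; that interval method is an assumption added here for illustration and is not stated in the slides.

```python
import math


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """OR = [a/b] / [c/d], which equals ad/bc."""
    return (a / b) / (c / d)


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Approximate 95% CI on the log scale (Woolf method, an assumption here)."""
    log_or = math.log(odds_ratio(a, b, c, d))
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * standard_error), math.exp(log_or + z * standard_error)


# Counts from the outcome table
a, b, c, d = 18, 46, 29, 36

or_value = odds_ratio(a, b, c, d)
ci_low, ci_high = odds_ratio_ci(a, b, c, d)
print(f"OR = {or_value:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

Under this approximation the interval for these counts runs from roughly 0.23 to just above 1.0, so it narrowly includes 1; by the rule stated earlier, the point estimate of 0.49 alone would not be enough to call this factor a useful predictor.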
Relative Risk (RR)
• the ratio of the risk of developing an outcome in patients with a
prognostic (risk) factor compared to the risk in patients without the
prognostic (risk) factor

• Indicates the likelihood that someone who has been exposed to a risk
factor will develop the disease, as compared with one who has not been
exposed.

• Ratio of incidence of disease among exposed subjects to the incidence
of disease among the unexposed
Interpreting the RR <1.0
• Those with the prognostic factor are only about 63% as likely to have
the outcome as those without that attribute.
Interpreting the RR >1.0
• If 2.0 is the RR:
• Those with the prognostic factor are about twice as likely to have the
outcome as those without that attribute.
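Relative risk uses risks (events divided by everyone in the row) rather than odds. A minimal sketch using the same counts as the odds ratio table (a = 18, b = 46, c = 29, d = 36), which reproduces a value of about 0.63:

```python
def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """RR = risk with the factor / risk without the factor."""
    risk_with_factor = a / (a + b)
    risk_without_factor = c / (c + d)
    return risk_with_factor / risk_without_factor


# a, b = outcome present/absent WITH the risk factor
# c, d = outcome present/absent WITHOUT the risk factor
a, b, c, d = 18, 46, 29, 36

rr = relative_risk(a, b, c, d)
print(f"RR = {rr:.2f}")
```

For the same table the RR (about 0.63) sits closer to 1.0 than the OR (about 0.49); the two measures only approximate each other when the outcome is uncommon.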
Linear regressions
• Used when you want to predict a specific score using continuous
data.
• How much of the variance can you predict?
• The process of determining a regression equation to predict values of
Y based on a linear relationship with values of X.
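As a sketch of what "determining a regression equation" involves, the code below fits a straight line by ordinary least squares and reports R squared, the share of variance in Y explained by X. The data are hypothetical (for example, X could be weeks of therapy and Y a functional score); nothing here comes from the sample article.

```python
def fit_line(xs, ys):
    """Ordinary least squares for Y = intercept + slope * X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Proportion of the variance in Y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_residual / ss_total


# Hypothetical continuous data
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
r2 = r_squared(xs, ys, slope, intercept)
print(f"Y = {intercept:.2f} + {slope:.2f} * X, R^2 = {r2:.3f}")
```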
Logistic regression
• Predicts whether something is True or False
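A logistic regression model works on the log-odds scale, so its output is a probability of the outcome being "true", and each coefficient exponentiates to an adjusted odds ratio. The intercept and coefficient below are hypothetical values invented for illustration, not fitted to any data.

```python
import math


def predicted_probability(intercept: float, coefficients, values) -> float:
    """Logistic model: probability = 1 / (1 + exp(-(b0 + b1*x1 + ...)))."""
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical model with one dichotomous prognostic factor (1 = present, 0 = absent)
intercept = -1.0
coefficients = [0.8]

p_with_factor = predicted_probability(intercept, coefficients, [1])
p_without_factor = predicted_probability(intercept, coefficients, [0])
adjusted_or = math.exp(coefficients[0])  # odds ratio for the factor

print(f"P(outcome | factor present) = {p_with_factor:.2f}")
print(f"P(outcome | factor absent)  = {p_without_factor:.2f}")
print(f"Odds ratio for the factor   = {adjusted_or:.2f}")
```

To classify an individual as true or false, the predicted probability is compared against a cutoff; 0.5 is a common choice, though the appropriate threshold depends on the clinical context.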
Applying the Results
• Were study patients similar to my patients?
• Was the management of patients similar to my practice?
• Was there adequate follow-up?
• Can I use these results to manage my patients?
