Reliability of a Movement Quality Assessment Tool to Guide Exercise Prescription (MovementSCREEN)
ABSTRACT
Background/Purpose: Movement quality is commonly assessed to identify movement limitations and guide exercise prescription. Rapid growth in the movement assessment landscape has led to the development and utilization of various movement quality assessments, many without reliability estimates. MovementSCREEN is a novel, tablet-based, video-recorded movement assessment tool, currently without published reliability information. Therefore, the purpose of this study was to determine the intra- and inter-rater reliability of the MovementSCREEN, including the impact of rater experience, and provide estimates of measurement error and minimal detectable change.
Study Design: Cross-sectional design; reliability study.
Methods: Thirty healthy young adults (14M:16F, mean age 28.4 yrs, SD 9.1) were video recorded completing the nine MovementSCREEN assessment items on two occasions, two weeks apart. Each individual movement was assessed against objective scoring criteria (component items: yes/no) and using a 100-point sliding scale. To create an overall score for each movement, the scale score is weighted against the objective items to provide a score out of 100. At the completion of all nine individual movements, a mean composite score of movement quality is also established (0-100). The first recording was scored twice by two expert and two novice assessors to investigate inter- and intra-rater reliability. The second recording was scored by one expert assessor to investigate within-subject error. Inter- and intra-rater reliability was calculated using intraclass correlation coefficients (ICCs) and Kappa statistics. The standard error of measurement (SEM) and minimal detectable change (MDC95) for the overall score for each movement, and for the composite score of movement quality, were calculated.
Results: Intra-rater reliability for the component items ranged from κ = 0.619 – 1.000 (substantial to almost perfect agreement) and 0.233 – 1.000 (fair to almost perfect agreement) for expert and novice assessors, respectively. The ICCs for the overall movement quality scores for each individual movement ranged from 0.707 – 0.952 (fair to high) in expert and 0.502 – 0.958 (poor to high) in novice assessors. Inter-rater agreement for the component items between expert assessors ranged from κ = 0.242 – 1.000 (fair to almost perfect agreement), while between novice assessors it ranged from 0.103 – 1.000 (slight to almost perfect agreement). ICCs for the overall scores for each individual movement from expert and novice assessors ranged from 0.294 – 0.851 (poor to good) and 0.249 – 0.775 (poor to fair), respectively. The SEM for the composite score was 2 points, while the MDC95 was 6 points, with an ICC of 0.901.
Conclusions: The MovementSCREEN can assess movement quality with fair to high reliability on a test-retest basis when used by experienced assessors, although reliability scores decrease in novice assessors. Comparisons between assessors involve greater error. Therefore, the training of inexperienced assessors is recommended to improve reliability.
Level of Evidence: 2b
Keywords: functional movement screening, movement dysfunction, movement quality, movement system
1
Alliance for Research in Exercise, Nutrition and Activity,
Sansom Institute for Health Research, University of South
Australia, Adelaide, Australia.
Conflict of Interest: Professor Kevin Norton and Max Martin are directors of Movement Screen Pty Ltd, the company that developed the MovementSCREEN movement assessment tool. Neither was involved in any part of the data management or analysis. Mr Hunter Bennett, Mr Scott Wood, Dr Kade Davison, and Dr John Arnold declare no conflicts of interest.

CORRESPONDING AUTHOR
Hunter Bennett
University of South Australia, GPO Box 2471, Adelaide, SA, 5001
E-mail: [email protected]
+61 433 377 222
The International Journal of Sports Physical Therapy | Volume 14, Number 3 | June 2019 | Page 424
DOI: 10.26603/ijspt20190424
INTRODUCTION
The assessment of movement quality has become commonplace in both sport and recreational fitness settings, often as a tool to predict injury risk.1,2 While the association between movement quality and injury risk has been inconclusive,3,4 these assessments can provide coaches, trainers, and rehabilitation practitioners with valuable information regarding areas of muscular weakness, tightness, and movement dysfunction.1 This information can therefore play an important role in guiding exercise prescription to meet the individual needs of an athlete or client.1,5

Increasing interest in assessing movement quality has led to the development and widespread utilization of several movement screening tools in both research and practical settings.1,2 These tools have largely been developed for use in specific populations with relatively specific objectives, although their overarching goal is to assess movement quality through the appraisal of an individual’s capacity to perform fundamental movements.1,5,6,7 The inability to complete these movements may be indicative of movement dysfunction, while their successful completion demonstrates a higher level of movement quality.5,8

MovementSCREEN is a new electronic-based, video-recorded movement quality assessment tool that assists in gathering information necessary to guide individualized exercise interventions, providing a clear starting point from which an athlete or individual can commence gym-based resistance exercise. MovementSCREEN evaluates the performance of nine fundamental movements. Each individual movement is scored against objective criteria in combination with an overall movement quality indicator to provide an indication of global movement quality. This tool is further stated to help coaches and practitioners track changes in movement quality that occur in response to individualised exercise interventions. Although its assessment items share many similarities with other movement assessment tools, MovementSCREEN provides a novel, tablet-based method of assessing movement quality that offers simple usability in exercise prescription settings. Additionally, it uses a 100-point scale to provide a composite movement measure, thereby attempting to improve upon a criticism of existing movement quality assessments in their use of basic Likert-type scoring systems (i.e. 0-3), in which a lack of sensitivity has been observed.9 While the use of these Likert-type scoring systems may have the potential to increase the tools’ ease of use, the lack of sensitivity could limit the depth of information gathered, mask potential associations with physical performance and injury risk, and inhibit the ability to track training-induced changes in movement quality.9

To allow the confident assessment of changes in an individual’s movement quality, the tool needs to be reliable. As assessor experience has also been shown to influence the reliability of movement quality assessments,10 it is important to include raters with different levels of experience when determining reliability. This has implications for increasing the utility of the tool in the field and establishing relevant training of inexperienced assessors. Moreover, while assessing the same movement performance on two separate occasions (via video capture) provides reliability information pertaining to the technical error associated with use of the tool itself, it does not provide any information about within-subject error. Within-subject error is likely to be introduced when an individual performs the movement assessment at two different time points, where small variations in movement may be observed.7 Subsequently, reliability measures that have relevance to clinical practice, such as minimal detectable change (MDC95) and standard error of measurement (SEM), should be established accounting for within-subject error.

The first aim of this study was to determine the intra- and inter-rater reliability of a novel, tablet-based movement quality assessment tool (MovementSCREEN), including estimates of typical measurement error and minimal detectable change. The second aim was to determine the impact that assessor experience has on reliability estimates.

METHODS
Participants
Participants qualified as apparently healthy in accordance with the Exercise and Sport Science Australia (ESSA) pre-exercise screening tool,11 were free of musculoskeletal and neurological disease, and were
physically able to perform the nine movements
within the assessment protocol. A sample size calcu-
lation was performed and indicated that with each
subject measured two times, a target ICC of 0.9, an
ICC of 0.75 or higher to be minimally acceptable,
α=0.05 and 80% power, 26 subjects were required.
Ethical Approval
This study was approved by the University of
South Australia human research ethics committee
(0000036268). All participants were informed of the
risks and benefits of the investigation prior to sign-
ing an institutionally approved consent document to
participate in the study. Reporting for this study was
conducted in accordance to COSMIN checklist for
reliability studies.12
Figure 2. Side view of start and finish position of the 1) Single leg squat, 2) the Overhead reach, and the 3) Thoracic rotation.
Figure 3. Side view of start and finish position of the 1) 4-Point with opposite arm/leg lift, 2) the push up, and the 3) active straight leg raise.
(Table 1). The component items relate to important aspects of each movement and are based on elements of control required for safe and effective movement. The quality of each individual movement is also scored using a 100-point sliding scale with associated cues (with a score of 100 being indicative of perfect movement quality). To create a final movement quality score for each movement, the subjective score is weighted against the sum of the component items to provide an overall score out of 100 (100 being the highest achievable score). For unilateral movements, each side is scored separately, and a mean score of the two sides is provided for the overall movement quality score for that specific movement. At the completion of all nine movements, a mean composite score is calculated from the overall scores of each individual movement to provide a global measure of movement quality (0-100).

Table 1. MovementSCREEN assessment protocol.

Protocol
Participants were video recorded completing the MovementSCREEN assessment protocol twice, two weeks apart. The participants performed the MovementSCREEN protocol in the sequential order outlined in Table 1 and illustrated in Figures 1-3. Participants were given specific instructions on how to perform each movement, while also observing a filmed demonstration of the movements performed with optimal technique. A short warmup was performed prior to the assessment, which included a five-minute bout of jogging followed by some bodyweight exercises (walking lunges with arms overhead, leg swings, and overhead reaches). Feedback during the movement tasks was prohibited. The entire assessment, including the warmup, took approximately 30 minutes per participant.

Each testing session was video recorded using two Apple iPads (30 frames per second, 1080p) mounted on tripods and positioned four meters from the participant. Cameras were positioned orthogonal to each other, where one camera recorded the sagittal plane of movement and one the frontal plane from the anterior aspect. Each participant performed this
protocol on two individual occasions separated by 14 days.

Two expert and two novice assessors assessed the first video recording twice, 14 days apart, to investigate intra- and inter-rater reliability. The expert assessors were university-qualified exercise science graduates, each with over five years of experience working clinically as exercise physiologists and strength and conditioning specialists. The novice assessors were current clinical exercise physiology students in their final year of study, with knowledge of exercise prescription practices, but relatively little practical experience assessing and prescribing exercise. The video footage of the second session was also assessed by one of the expert assessors to investigate the reliability estimates for within-subject error. In this manner, the assessment of the two time points accounted for both the technical error of assessment and the performer’s variability in movement from one test to another, which cannot be considered when only assessing video footage from a single time point. All assessments for all assessors were performed under the same conditions, with the filmed video footage being re-watched on an 18-inch computer monitor independently.

Statistical Analysis
Descriptive statistics were calculated for both the overall and component scores for each tester and session. All descriptive data are presented as mean and standard deviations, where appropriate. Paired t-tests were performed on both the movement and composite scores for one expert and one novice assessor to assess for systematic error. The intra- and inter-rater reliability of each individual assessment item within each movement (binary data: Yes/No) was determined with the Kappa statistic. Intraclass correlation coefficients (ICC) were used to measure the agreement between the two groups for the overall scores. Intra- and inter-rater reliability was calculated with a two-way mixed ICC for the overall scores between expert and novice raters separately. Within-subject error was established using ICC to measure the agreement between testing sessions one and two, as scored by an expert assessor.

Response stability of the overall scores of each individual movement, and the composite scores, was calculated using the standard error of the measurement (SEM) at the 95% level of confidence.16 The minimal detectable change (MDC95) values at the 95% level of confidence were calculated to determine the lowest level of change that can be considered ‘true’ change and not likely due to measurement error.19 ICCs were interpreted according to the following criteria: high (0.90–0.99); good (0.80–0.89); fair (0.70–0.79) and poor (0.00–0.69).17 Kappa statistics were interpreted according to Landis and Koch:18 slight agreement (0.01-0.20), fair agreement (0.21-0.40), moderate agreement (0.41-0.60), substantial agreement (0.61-0.80), and almost perfect agreement (0.81-1.00).

Data were analysed using the statistical package SPSS version 24.0 for Windows, PC (IBM, Chicago, IL). Alpha was set at the 0.05 level.

RESULTS
Thirty apparently healthy adults (m = 14, f = 16; mean age 28.4 years, SD 9.1; mean height 171.3 cm, SD 9.4; mean weight 70.5 kg, SD 12.7) participated in this study. All 30 participants recruited into the study that met the inclusion criteria completed data collection, with no dropouts.

Table 2 provides the mean scores for both novice and expert assessors for each movement, and demonstrates the intra-rater reliability for both the component items and movement quality scores. Kappa scores for the component items within the expert raters ranged from 0.619 – 1.000, suggesting substantial to almost perfect agreement, while the novice raters demonstrated fair to almost perfect agreement (0.233 – 1.000). ICCs for the final movement quality score demonstrated fair to high intra-rater reliability in expert assessors (0.707 – 0.952), and poor to high intra-rater reliability (0.502 – 0.958) in novice assessors. Paired t-tests across all movement and composite scores demonstrated non-uniform differences across both expert (p = .022 - .729) and novice assessors (p = .039 - .998), indicating no systematic difference in scores between the first and second assessments.

Table 3 outlines inter-rater agreement between expert and novice assessors. Component item comparisons between expert assessors showed Kappa ranged from 0.242 – 1.000, suggesting fair to almost
Table 2. MovementSCREEN movement scores (mean and standard deviation), and intra-rater reliability for component
items and movement quality scores, for novice and expert assessors.
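The agreement statistic for the yes/no component items and the interpretation bands quoted in the Statistical Analysis can be illustrated with a minimal Python sketch. It assumes the unweighted two-rater form of Cohen's kappa (the source says only "Kappa statistic") and uses the band cut-offs quoted in the text. The function names `cohen_kappa`, `interpret_kappa`, and `interpret_icc` are hypothetical and are not part of MovementSCREEN or SPSS.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equally long lists of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Proportion of items on which the two ratings agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rating set's category frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both rating sets use one identical category throughout
    return (observed - expected) / (1.0 - expected)


def interpret_kappa(kappa):
    """Landis and Koch bands as quoted in the text."""
    if kappa < 0.01:
        return "below the listed bands"
    if kappa <= 0.20:
        return "slight agreement"
    if kappa <= 0.40:
        return "fair agreement"
    if kappa <= 0.60:
        return "moderate agreement"
    if kappa <= 0.80:
        return "substantial agreement"
    return "almost perfect agreement"


def interpret_icc(icc):
    """ICC bands as quoted in the text: high, good, fair, poor."""
    if icc >= 0.90:
        return "high"
    if icc >= 0.80:
        return "good"
    if icc >= 0.70:
        return "fair"
    return "poor"


if __name__ == "__main__":
    # Hypothetical ratings of one component item for four participants.
    first_pass = ["yes", "yes", "no", "no"]
    second_pass = ["yes", "no", "no", "no"]
    kappa = cohen_kappa(first_pass, second_pass)
    print(kappa, interpret_kappa(kappa))
```

The same function applies to intra-rater comparisons (one assessor, two scoring passes) and inter-rater comparisons (two assessors, one pass); only the pair of rating lists changes.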
Table 3. Inter-rater reliability for component items and movement quality scores in novice
and expert assessors.
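The relationship between the SEM and MDC95 named in the Statistical Analysis can be sketched as follows. This is a minimal Python sketch assuming the conventional test-retest formulas, SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM; the source does not state which formulas were used, and the function names here are hypothetical.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from a score SD and an ICC (assumed formula)."""
    if sd < 0 or not 0.0 <= icc <= 1.0:
        raise ValueError("sd must be non-negative and icc must lie in [0, 1]")
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem):
    """Minimal detectable change at 95% confidence (assumed formula).

    The sqrt(2) term reflects that a change score combines the error
    of two separate measurements.
    """
    return 1.96 * math.sqrt(2.0) * sem


def exceeds_mdc(score_before, score_after, mdc):
    """True when the absolute change is larger than the MDC."""
    return abs(score_after - score_before) > mdc


if __name__ == "__main__":
    reported_sem = 2.0  # composite-score SEM reported in the abstract
    print(round(mdc95(reported_sem), 2))
```

Under these assumed formulas, the reported composite SEM of 2 points gives an MDC95 of roughly 5.5 points, which is in line with the reported value of 6 points.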
measurement error and minimal detectable change, and to determine the impact of assessor experience on those reliability estimates. Data collected as part of this study demonstrated that MovementSCREEN can assess movement quality with fair to high reliability on a test-retest basis when used by experienced assessors, although reliability scores decrease in novice assessors. Subsequently, standardized training appears necessary to improve reliability in inexperienced assessors. Moreover, the reliability estimates provided can determine whether ‘true’ changes in movement quality have occurred, and are essential to inform the interpretation of assessment results in the field and future research studies.

Kappa statistics demonstrated substantial to near-perfect agreement on a test-retest basis for the component items within expert assessors, while ICCs for the movement quality scores were fair to high in the same group. Agreement was generally lower in novice assessors for nearly all movements (Table 3). This information alone suggests there is likely to be a learning effect associated with the assessment of movement quality, with experience assessing movement leading to more consistent scoring.10 Interestingly, the range of scores allocated for each movement was greater in expert assessors than their novice counterparts when viewing the same video. This suggests that with assessment
Table 4. Within-subject error of the movement quality assessment, as assessed by a
single expert assessor.
experience, there is likely to be an increased exposure to large variations in movement quality. This exposure may result in a greater learned ‘spectrum’ of movement quality, causing greater discrimination during observation and explaining the increased confidence in expert assessors to score an individual either lower or higher than novice assessors.

The reliability estimates for the MovementSCREEN are comparable to other pre-existing movement quality assessment tools that have been evaluated in the literature.1 The current investigation demonstrated that the inter-rater reliability for the individual component items showed fair to almost perfect agreement in expert assessors, and poor to almost perfect agreement in novice assessors. Comparatively, the movement quality scores demonstrated poor inter-rater reliability in both expert and novice assessors. This suggests that although the 100-point movement quality score may offer a way to capture smaller variations in movement quality, it might be too subjective for the system to be used interchangeably between assessors without appropriate assessment standardization and education, irrespective of their experience. It is possible that coordinated and standardized training may assist in this regard to improve the utility of the system among assessors.19 The objectivity associated with the component items appears to improve reliability between both novice and expert assessors. As the information gathered from these specific assessment items is arguably more valuable in terms of exercise prescription guidance, this will likely have positive implications in practical settings.

Although ICCs and Kappa statistics provide a good indication of the tool’s reliability, SEM and MDC95 may be more useful in practical settings.20 Additionally, as the SEM and MDC95 scores within this study have been established from two separate testing occasions, they also account for both within-subject error and technical error. The MDC95 of the overall movement quality scores of each individual movement varied between 14 and 26 points (Table 4). While this variability in some cases may appear quite large, at its highest it represents a 26% change in the movement quality score to confidently suggest that a ‘true’ change has occurred within an individual movement. This is comparable to observing a one-point change in a given movement within the FMS™, which utilises a four-point ordinal scoring scale.5 The MDC95 for the composite score was substantially lower at six points, suggesting any changes beyond this would represent a ‘true’ change in global, whole-body movement quality. It is important to note that while the MDC95 value for the composite score was markedly less than those established for the individual movements, there is an explanation for this. As the composite score is derived from a mean of the individual movement scores, it is likely to smooth out the variances seen between those scores, resulting in tighter reliability measures. Therefore, taking into consideration both the composite score and overall score for each individual movement is integral when assessing and tracking changes in movement quality.

The study has several identified strengths. Reliability measures were established in both expert and novice assessors to determine how rater experience affects assessment reliability. Both the SEM and MDC95 were established using two individual
testing sessions to account for both the technical error of assessment and within-subject error. This better represents how the tool is used in practical settings, and may provide useful information surrounding the interpretation of any changes in movement quality observed from one test to another.

There are also limitations that should be considered when interpreting the results of this study. The participant group consisted of apparently healthy young adults. While variations in movement quality were observed, these were likely to be less than those seen in clinical practice. As such, the reliability results described may not be equivalent for higher level athletic or clinical populations, with further reliability studies required in this area. Nonetheless, it was first necessary to determine reliability in a healthy population before progressing to more specific groups. It is also important to note that while assessing video recorded footage of the MovementSCREEN offers a convenient means to establish inter- and intra-rater reliability, it does slightly restrict the overall visual information provided to the assessor when compared to real time assessment scenarios. In real time assessment, the assessor has the capacity to move around the subject as they perform the movement to gather further information if required. Therefore, the reliability measures described may not necessarily depict those obtained in real time assessments. Finally, while the participants did receive thorough instructions and demonstrations surrounding the correct performance of the individual movement assessments, they did not receive any feedback during the movement. As both feedback and knowledge of the grading criteria have been shown to influence movement assessment outcomes,21 this study implemented the protocol according to intended use in the field in individuals naïve to the criteria. The impact that different levels of cueing and prior knowledge have on reliability and movement quality with this tool would require further investigation.

CONCLUSIONS
The MovementSCREEN was developed to meet the needs of coaches, exercise, and rehabilitation professionals working within gym-based exercise prescription environments. The results from this analysis suggest that the tool can assess movement quality with fair to high reliability on a test-retest basis when administered by experienced practitioners. Fair to almost perfect inter-rater agreement was observed for the component items of the assessment tool, although the inter-rater reliability for the subjective movement quality score was poor. This suggests that the scores from the tool may not be reliable enough for confident application between different practitioners without standardized training surrounding its use. The MDC95 for the composite score of global movement quality is approximately six points, while it varied between 14 and 26 points for the overall scores of the individual movements. This information is integral to any future research examining the capacity of MovementSCREEN to identify changes in movement quality that occur over time or after interventions. Further research is required to validate its capacity to guide exercise prescription and track associated changes in movement quality.

REFERENCES
1. Bennett H, Davison K, Arnold J, et al. Multicomponent musculoskeletal movement assessment tools: A systematic review and critical appraisal of their development and applicability to professional practice. J Strength Cond Res. 2017;31:2903-2919.
2. McCunn R, aus der Fünten K, Fullagar H, et al. Reliability and association with injury of movement screens: A critical review. Sports Med. 2016;46:763-781.
3. Moran R, Schneiders A, Mason J, et al. Do functional movement screen (FMS) composite scores predict subsequent injury? A systematic review with meta-analysis. Br J Sports Med. 2017;51:1661-1669.
4. Newton F, McCall A, Ryan D, et al. Functional movement screen (FMS™) score does not predict injury in English Premier League youth academy football players. Sci Med Football. 2017;1:1-5.
5. Cook G, Burton L, Hoogenboom B, et al. Functional movement screening: the use of fundamental movements as an assessment of function-part 1. Int J Sports Phys Ther. 2014;9:396-409.
6. Kritz M, Cronin J, Hume P. The bodyweight squat: A movement screen for the squat pattern. Strength Cond J. 2009;31(1):76-85.
7. McKeown I, Taylor-McKeown K, Woods C, et al. Athletic ability assessment: a movement assessment protocol for athletes. Int J Sports Phys Ther. 2014;9:862-873.
8. Kritz M. Development, reliability and effectiveness of the movement competency screen (MCS). Doctoral dissertation, Auckland University of Technology. 2012.
9. Frost D, Beach T, Campbell T, et al. An appraisal of the functional movement screen™ grading criteria: Is the composite score sensitive to risky movement behavior? Phys Ther Sport. 2015;16:324-330.
10. Norton K, Norton L. Pre-exercise screening, guide to the Australian adult pre-exercise screening system. Exercise and Sports Science Australia. 2011.
11. Gulgin H, Hoogenboom B. The functional movement screening (FMS)™: An inter-rater reliability study between raters of varied experience. Int J Sports Phys Ther. 2014;9:14-20.
12. Mokkink L, Terwee C, Patrick D, et al. The COSMIN checklist for assessing the methodological quality of studies on measurement properties of health status measurement instruments: an international Delphi study. Qual Life Res. 2010;19:539-549.
13. Cook G. Movement: Functional movement systems: Screening, assessment, corrective strategies. On Target Publications. 2010.
14. Haff G, Triplett N. Essentials of strength training and conditioning 4th edition. Human Kinetics. 2015.
15. American College of Sports Medicine (ACSM). ACSM’s guidelines for exercise testing and prescription 9th Ed. Lippincott Williams & Wilkins. 2013.
16. Weir J. Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. J Strength Cond Res. 2005;19:231-240.
17. Onate J, Dewey T, Kollock R, et al. Real-time intersession and interrater reliability of the functional movement screen. J Strength Cond Res. 2012;26:408-415.
18. Landis J, Koch G. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159-174.
19. Starkes J, Ericsson K. Expert performance in sports: Advances in research on sport expertise. Human Kinetics. 2003.
20. Haley S, Fragala-Pinkham M. Interpreting change scores of tests and measures used in physical therapy. Phys Ther. 2006;86:735-743.
21. Frost D, Beach T, Callaghan J, et al. FMS scores change with performers’ knowledge of the grading criteria: are general whole-body movement screens capturing “dysfunction”? J Strength Cond Res. 2015;29:3037-3044.