Criterion-Related Validity of Sit-And-Reach Tests for Estimating Hamstring and Lumbar Extensibility


©Journal of Sports Science and Medicine (2014) 13, 01-14

http://www.jssm.org

Review article

Criterion-Related Validity of Sit-And-Reach Tests for Estimating Hamstring and Lumbar Extensibility: A Meta-Analysis

Daniel Mayorga-Vega 1, Rafael Merino-Marban 2 and Jesús Viciana 1

1 Department of Physical Education and Sport, University of Granada, Spain; 2 Department of Didactics of Musical, Plastic and Corporal Expression, University of Malaga, Spain

Abstract
The main purpose of the present meta-analysis was to examine the scientific literature on the criterion-related validity of sit-and-reach tests for estimating hamstring and lumbar extensibility. For this purpose, relevant studies were searched in seven electronic databases dated up through December 2012. Primary outcomes of criterion-related validity were Pearson's zero-order correlation coefficients (r) between sit-and-reach tests and hamstring and/or lumbar extensibility criterion measures. Then, from the included studies, the Hunter-Schmidt psychometric meta-analysis approach was conducted to estimate the population criterion-related validity of sit-and-reach tests. Firstly, the corrected correlation mean (rp), unaffected by statistical artefacts (i.e., sampling error and measurement error), was calculated separately for each sit-and-reach test. Subsequently, the three potential moderator variables (sex of participants, age of participants, and level of hamstring extensibility) were examined by a partially hierarchical analysis. Of the 34 studies included in the present meta-analysis, 99 correlation values across eight sit-and-reach tests and 51 across seven sit-and-reach tests were retrieved for hamstring and lumbar extensibility, respectively. The overall results showed that all sit-and-reach tests had a moderate mean criterion-related validity for estimating hamstring extensibility (rp = 0.46-0.67), but a low mean validity for estimating lumbar extensibility (rp = 0.16-0.35). Generally, females, adults and participants with high levels of hamstring extensibility tended to have greater mean values of criterion-related validity for estimating hamstring extensibility. When the use of angular tests is limited, such as in a school setting or in large scale studies, scientists and practitioners could use the sit-and-reach tests as a useful alternative for hamstring extensibility estimation, but not for estimating lumbar extensibility.

Key words: Concurrent validity, range of motion, flexibility, field test, lineal test, systematic review.

Introduction

A lack of hamstring muscle extensibility leads to a decrease in pelvic mobility (Kendall et al., 2005). This invariably leads to biomechanical changes in the pressure distribution of the spine and consequent spinal disorders (da Silva Días and Gómez-Conesa, 2008). Therefore, poor hamstring extensibility has been associated with thoracic hyperkyphosis (Fisk et al., 1984), spondylolysis (Standaert and Herring, 2000), disc herniation (Harvey and Tanner, 1991), changes in lumbopelvic rhythm (Esola et al., 1996; López-Miñarro and Alacid, 2009) and low back pain (Biering-Sorensen, 1984; Mierau et al., 1989). Additionally, individuals with shortened hamstring muscles present gait limitations, increased risk of falls, and susceptibility to musculoskeletal injuries (Erkula et al., 2002; Jones et al., 1998).

Nowadays different kinds of tests are used to assess hamstring extensibility. Flexibility is typically characterized by the maximum range of motion in a joint or series of joints (McHugh et al., 1998). Thus, angular tests that specifically measure hip flexion with the knee extended (straight leg raise test), or the range of knee extension with the hip flexed to 90 degrees (knee extension or popliteal angle test), have been widely considered the criterion measures of hamstring extensibility (e.g., Ayala et al., 2011; Hartman and Looney, 2003; López-Miñarro and Rodríguez-García, 2010c). Nevertheless, due to the necessity of sophisticated instruments, qualified technicians, and time constraints, the use of these angular tests seems to be limited in several settings such as in a school context or large scale studies (Castro-Piñero et al., 2009b).

Unlike the angular tests, lineal tests have a simple procedure, are easy to administer, require minimal skills training for their application, and the equipment necessary to perform them is very affordable (Castro-Piñero et al., 2009b; López Miñarro et al., 2008c). Sit-and-reach (SR) tests in which a fingertips-to-tangent feet distance is measured are probably the most widely used lineal measures of flexibility (Holt et al., 1999; Castro-Piñero et al., 2009a). However, as the SR is a test which involves the movement of the whole body, it has been suggested that the position of the fingertips does not give valid information about hamstring extensibility (Hoeger et al., 1990). The main factors that seem to affect the validity of SR tests to estimate hamstring extensibility are the differences in length proportion between the upper and lower limbs (Hoeger et al., 1990), the position of the head (Smith and Miller, 1985) and the position of the ankles (Kawano et al., 2010; Liemohn et al., 1997). In addition, recent studies have also found that the levels of hamstring extensibility influence the criterion-related validity of SR tests (López-Miñarro et al., 2011; López-Miñarro and Rodríguez-García, 2010c).

The choice of a flexibility test must be based on its functionality and validity (López-Miñarro, 2010). Although the angular tests have the advantage of being the criterion measure to assess flexibility, due to several practical reasons they have the disadvantage of having a limited use in several settings (Castro-Piñero et al., 2009b). In these settings, as the SR tests have the advantage of allowing for an evaluation in a short amount of time with

Received: 21 June 2013 / Accepted: 22 August 2013 / First Available (online): 08 October 2013 / Published (online): 20 January 2014

minimal skills and instruments, potentially they could be a useful alternative to estimate flexibility. Nevertheless, as in the application of any fitness field test, the SR tests' results are a simple estimation and, therefore, the evaluators must be aware of validity coefficients in order to interpret the scores of these tests correctly. Unfortunately, the studies examining criterion-related validity of SR tests for estimating hamstring and lumbar extensibility have shown inconclusive results (Baltaci et al., 2003; Hui and Yuen, 2000; Hui et al., 1999; Jones et al., 1998).

Each primary study that is published about criterion-related validity of the SR tests constitutes only a single piece of a constantly growing body of evidence (Cooper et al., 2009). For example, in some studies the correlation coefficient is statistically significant, while in others a statistically significant association is not found. In some cases the strength of the association is quite high, while in others it is low. To make sense of the often conflicting results found in the scientific literature, researchers have to conduct meta-analyses (Cooper et al., 2009; Hunter and Schmidt, 2004; Lipsey and Wilson, 2001). Hence, meta-analyses remain a useful tool for the evaluation of evidence (Flather et al., 1997), forming a critical process for theory development in science (Hunter and Schmidt, 2004).

Unfortunately, to our knowledge there are no meta-analyses addressing the criterion-related validity of SR tests. Beyond the simple but important function of describing and summarizing the scientific findings of a research area, the main contribution of a meta-analysis is to estimate as accurately as possible the population parameters (Hunter and Schmidt, 2004). Therefore, the results of a meta-analysis let us generalize the research findings, as well as test hypotheses that may have never been tested in primary studies. Finally, meta-analyses permit us to examine today's lack of knowledge in a specific area and to guide scientists in future research (Cooper et al., 2009).

Consequently, the main purpose of the present meta-analysis was to examine the scientific literature on criterion-related validity of SR tests for estimating hamstring and lumbar extensibility in apparently healthy individuals. More specifically, the objectives of this study were: (a) to describe and summarize the up-to-date scientific findings of criterion-related validity of SR tests for estimating hamstring and lumbar extensibility; (b) to estimate and compare the overall population mean of the criterion-related validity coefficients of each SR test for estimating hamstring and lumbar extensibility; and (c) to examine the influence of some study features (sex of participants, age of participants, and level of hamstring extensibility) on criterion-related validity coefficients of SR tests.

Methods

Search strategy
The following seven electronic databases were searched from their inception through December 2012: SportDiscus, Scopus, Medline, Pubmed, Web of Science, ERIC, and Dissertations & Theses Database. The search terms used were based on two concepts. Concept one included terms for the SR test (sit and reach) and concept two included terms for validity (validity, related, relationship, correlation, comparison, hip, hamstring, flexibility, ROM, range of motion, range of movement, straight leg raise, knee extension, popliteal angle, lumbar, back, Macrae and Wright, Macrae & Wright, Schober, radiography, goniometer, and inclinometer). The terms of the same concept were combined together with the Boolean operator “OR” and then the two concepts were combined using the Boolean operator “AND” (Benito Peinado et al., 2007). The keywords that consisted of more than one word were enclosed in quotes. In addition, the reference lists of all included papers were manually searched.

Selection criteria
The selection criteria to identify studies that examined the criterion-related validity of SR tests for estimating hamstring and/or lumbar extensibility were: (a) studies with apparently healthy participants who did not present any injury, physical and/or mental disabilities; (b) studies with SR tests that yielded the values of the maximum reach of the fingertips; and (c) studies in which the hamstring and/or lumbar extensibility criterion measurements used are widely accepted in the scientific literature (e.g., straight leg raise or knee extension tests for hamstring extensibility and Macrae & Wright or inclinometer methods for lumbar extensibility). In addition to papers, master/doctoral dissertations and conference proceedings were also accepted. No language or publication date restrictions were imposed.

Coding studies
For this meta-analysis, data were collected from studies that reported relationships between SR tests and hamstring and/or lumbar extensibility criterion measures with apparently healthy participants of any age. From each selected study the following data were coded: Study identity number, sample size (n), sex of participants (1 = males, 2 = females), age of participants (1 = children, < 18 years; 2 = adults, ≥ 18 years), SR test protocol (1 = Classic SR, 2 = Modified SR, 3 = Back-saver SR, 4 = Modified back-saver SR, 5 = V SR, 6 = Modified V SR, 7 = Unilateral SR, 8 = Chair SR), criterion-related validity value (Pearson's r correlation coefficient), reliability of SR test (intraclass correlation coefficient), reliability of hamstring and/or lumbar extensibility criterion measures (intraclass correlation coefficient), and the average score of hamstring extensibility criterion measure. Because the identification of study features is usually explicitly stated in each of the primary articles, the use of more than one rater was deemed unnecessary.

In addition, although various protocols for evaluating quality of single studies have been described, there is no widespread agreement on the validity of this type of evaluation approach. Thus, rejecting certain single studies and accepting others for inclusion in a meta-analysis on the basis of a quality score remains a controversial procedure (Flather et al., 1997). Hence, according to Flather et al. (1997), our approach has been to ensure that the design has not been flawed (e.g., conducted by scientifically

evidenced criterion measures), and that there has been a complete reporting of relevant outcomes. For a study to be included in this meta-analysis, sample size, SR test protocol, hamstring and/or lumbar criterion measures and Pearson's r were considered to be critical. In the event that the authors mixed subgroups of a study feature (e.g., males mixed with females), failed to identify a study feature (e.g., criterion measure or reliability scores) or were ambiguous (e.g., hamstring extensibility scores around 80º shown graphically) the data were omitted. When in the same study data for males and females were expressed both separately and together, only the separate data were coded. When in the same study data were expressed for both legs separately or for two different days from the same sample (i.e., such as in Mier, 2011), the average value of the coefficients was coded.

Finally, in the event that included studies used multiple validity coefficients for hamstring and/or lumbar extensibility, only the data relative to one criterion measure of each muscle group were coded. Regarding hamstring extensibility, all studies reported correlation values with the straight leg raise test, while only in a few articles the values with the knee extension test were also stated (Davis et al., 2008; García, 1995; Hartman and Looney, 2003). Therefore, in order to avoid moderator effects issues by criterion measure test, only the correlation values of the straight leg raise test were coded. As regards lumbar extensibility, only Hartman and Looney (2003) performed more than one criterion measure test (Single inclinometer and Macrae & Wright methods). Due to the fact that the Macrae & Wright method has been used the most widely, the results with this test were coded.

Data analyses
In the present study, Pearson's zero-order correlation coefficient (r) was considered the unit of measure as an indication of criterion-related validity of SR tests, which represents the strength of associations between the estimates of SR tests and the criterion measures. Because several studies reported criterion-related validity results of different SR test protocols from the same sample, r values were extracted separately for each SR test to avoid dependency issues in the meta-analysis (Cooper et al., 2009). Similarly, criterion-related validity values were extracted separately for hamstring and lumbar extensibility. However, if a single study reported more than one r value within the same SR test protocol, but from different subsamples (e.g., males and females), we assumed each r value from different subsamples to be independent from each other and included them in a single meta-analysis (Lipsey and Wilson, 2001).

Publication bias: In addition to the followed search strategy and selection criteria to avoid availability bias, an examination of the selected studies was carried out to avoid a potential duplication of information retrieved. Since some selected studies had full or partial duplicated information, these particular r correlation values were not analyzed in the meta-analyses. Furthermore, before computing correlations, several exploratory analyses were also conducted to detect the presence of publication bias. Firstly, a file drawer analysis based on effect size was performed to estimate the number of unlocated studies averaging null results (r = 0) that would have to exist to bring the mean effect size (rp) down to the small mean r value (Rosenthal, 1979). Depending on the results of the file drawer analysis, we had to conclude if it was likely that there would be this particular number of “lost” studies to reduce the actual r to a small value. According to Cohen's guidelines (1992), the correlation coefficient was interpreted as small when r < 0.30.

Secondly, according to Light and Pillemer (1984), the scatter plots of correlation coefficients against sample size for each SR test protocol related to both hamstring and lumbar extensibility were analyzed. According to this graphic method, in the absence of publication bias, the resulting figure should take the form of an inverted funnel. However, based on the statistical significance of the studies, if there is publication bias the small-sample studies reporting small r values will be disproportionately absent because they are the studies that will fail to attain statistical significance. Finally, with the objective of quantifying the outcomes of the scatter plots, as suggested by Begg and Mazumdar (1994), a Spearman's rank order correlation between r values and sample size was calculated. In the presence of publication bias, this correlation should be statistically significant and negative due to the absence of small-sample studies in the lower left hand corner.

Computation of correlations: The Hunter-Schmidt psychometric meta-analysis approach was conducted to obtain the population estimates of the criterion-related validity of SR tests (Hunter and Schmidt, 2004). This approach estimates the population correlation by individually correcting the observed correlations for various artefacts such as sampling error and measurement error. First, the “bare-bone” mean r (rc), corrected for only sampling error, was calculated by weighting each r with the respective sample size when aggregating them into rc. Then, we calculated the corrected mean r at the population level (rp) that was unaffected by both sampling error and measurement error. The resulting mean correlation corrected for sampling error and measurement error is offered as the best estimate of the population parameter. In order to correct the measurement errors, the reliability coefficients (intraclass correlation coefficients) of each individual SR and criterion measure test were used. Because the reliability coefficients were not available for all of the included studies, the unknown reliability values were previously estimated for each test. The median of all the reported reliability coefficients for each SR test protocol and criterion measure test was used. Finally, the 95% confidence intervals of rp (95% CI) were calculated.

Moderator analysis: In the present meta-analysis, due to the low number of r values found, partially hierarchical analyses of moderator variables were carried out. According to Hunter and Schmidt (2004), to determine the presence of moderator effects which may affect overall criterion-related validity of SR tests (rp), three different criteria were simultaneously examined: (a) the percentage of variance accounted for by statistical artefacts is less than 75% of the observed variance in rp; (b) the Q homogeneity statistic is statistically significant (p <

Search results (n = 2,432):
• SportDiscus (n = 407)
• Scopus (n = 596)
• Medline (n = 295)
• Pubmed (n = 302)
• Web of Science (n = 377)
• ERIC (n = 34)
• Dissertations & Theses (n = 421)

Potentially relevant articles identified and retrieved for more detailed evaluation (n = 90)

Studies excluded (n = 52):
• Not relevant to apparently healthy participants
• Not relevant to fingertips score
• Not relevant to criterion-related validity

Studies met selection criteria (n = 38)

Studies excluded (n = 4):
• Full duplicated information

Studies included in the meta-analysis (n = 34)

Figure 1. Flow chart of studies selection process.
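
As a quick consistency check on the counts in Figure 1, the short Python sketch below re-adds the per-database hits and follows the two exclusion steps. The function and variable names are illustrative and do not come from the article; the numbers are those printed in the flow chart.

```python
# Counts copied from Figure 1; all names here are illustrative only.
database_hits = {
    "SportDiscus": 407,
    "Scopus": 596,
    "Medline": 295,
    "Pubmed": 302,
    "Web of Science": 377,
    "ERIC": 34,
    "Dissertations & Theses": 421,
}


def selection_flow(hits, retrieved, excluded_on_criteria, excluded_duplicates):
    """Return the study counts at each stage of the selection process."""
    search_results = sum(hits.values())
    met_criteria = retrieved - excluded_on_criteria
    included = met_criteria - excluded_duplicates
    return {
        "search_results": search_results,
        "met_criteria": met_criteria,
        "included": included,
    }


flow = selection_flow(
    database_hits,
    retrieved=90,
    excluded_on_criteria=52,
    excluded_duplicates=4,
)
print(flow)
```

The three computed totals can be compared directly with the boxes of the flow chart (search results, studies meeting the selection criteria, and studies included).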

0.05); and (c) the 95% credibility interval (95% CV) is relatively large or includes the value zero. If at least one of the three criteria was met, we concluded that the results could be affected by moderator effects. In case of the presence of moderator effects, criterion-related validity values of each SR test were analyzed separately by: (a) sex of participants (i.e., male and female); (b) age of participants (i.e., children and adults); and (c) level of hamstring extensibility (i.e., low average level, < 80º, and high average level, ≥ 80º) (Kendall et al., 2005).

Results

Study description
Figure 1 shows a flow chart of the study selection process. Of the 2,432 literature search results, 90 potentially relevant publications were identified and retrieved for a more detailed evaluation. Finally, due to duplication issues, of the 38 studies that met the inclusion criteria, only 34 studies were included in the present meta-analysis. Apart from a few studies retrieved which were carried out with apparently non-healthy participants or lineal tests that did not yield the values of the maximum reach of the fingertips, other studies (or r values) were not included in the present meta-analysis either because they examined the relationship between the SR test and the pelvic tilt scores (e.g., Davis et al., 2008; Kawano et al., 2010; López-Miñarro, 2010; Rodríguez-García et al., 2008). The pelvic tilt is measured by the inclination angle of the sacrum with regard to the horizontal line at the point of maximal forward reach on the SR test. Therefore, although the pelvis position is influenced by the hamstring extensibility, its measure must be considered as an estimation of hamstring extensibility (indirect measure), and not as a criterion measure to determine it (direct measure) such as the straight leg raise or knee extension tests (Santonja Medina et al., 1995). However, nowadays some studies have suggested that the criterion measures of hamstring extensibility must be reexamined and readjusted (Cardoso et al., 2007; Hartman and Looney, 2003) (see strengths and limitations section).

Table 1 presents a summary of studies of criterion-related validity of SR tests for estimating hamstring and lumbar extensibility. Regarding the criterion-related validity for estimating hamstring extensibility, a total of 99 r values across eight SR test protocols were retrieved, ranging from three values in the Chair SR and Modified V SR tests to 47 values in the Classic SR test. Total sample sizes for each SR test ranged from 182 in the Chair SR test to 3,481 in the Classic SR test. The individual criterion-related validity correlation coefficients of SR tests for estimating hamstring extensibility ranged from 0.19 to 0.93. Regarding criterion-related validity for estimating lumbar extensibility, a total of 51 r values across seven SR test protocols were retrieved, ranging from two values in the Unilateral SR test to 21 values in the Classic SR test. Studies examining the criterion-related validity of the Chair SR test for estimating lumbar extensibility were not found. Total sample sizes for each SR test ranged from 158 in the Unilateral SR test to 1,762 in the Classic SR test. The individual criterion-related validity correlation coefficients of SR tests for estimating lumbar extensibility ranged from 0.00 to 0.60.
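
The r values summarized here are the input to the psychometric meta-analysis described under Data analyses. The article does not print its formulas, so the following Python sketch is only an illustration under stated assumptions: it uses the standard Hunter-Schmidt bare-bones estimators (sample-size-weighted mean r and expected sampling-error variance), corrects each r for attenuation with A = sqrt(rxx * ryy) and weights it by n * A^2, and applies the 75% rule as one moderator criterion. The function names and the example numbers are invented for illustration and are not taken from Table 1.

```python
from math import sqrt


def bare_bones_mean(rs, ns):
    """Sample-size-weighted mean r (corrects for sampling error only)."""
    return sum(n * r for r, n in zip(rs, ns)) / sum(ns)


def corrected_mean(rs, ns, rel_x, rel_y):
    """Mean r after correcting each r for unreliability in both measures.

    Each r is divided by its attenuation factor A = sqrt(rxx * ryy) and
    weighted by n * A**2 (assumed individual-correction weighting).
    """
    att = [sqrt(x * y) for x, y in zip(rel_x, rel_y)]
    weights = [n * a * a for n, a in zip(ns, att)]
    corrected = [r / a for r, a in zip(rs, att)]
    return sum(w * rc for w, rc in zip(weights, corrected)) / sum(weights)


def percent_variance_from_sampling_error(rs, ns):
    """Share (%) of the observed variance in r expected from sampling error."""
    r_bar = bare_bones_mean(rs, ns)
    total_n = sum(ns)
    var_obs = sum(n * (r - r_bar) ** 2 for r, n in zip(rs, ns)) / total_n
    mean_n = total_n / len(ns)
    var_err = (1 - r_bar ** 2) ** 2 / (mean_n - 1)
    return 100 * var_err / var_obs


# Invented illustration data (NOT values from Table 1).
rs = [0.60, 0.70, 0.50]        # observed validity coefficients
ns = [50, 100, 50]             # sample sizes
icc_sr = [0.95, 0.95, 0.95]    # assumed reliability of the SR test
icc_crit = [0.90, 0.90, 0.90]  # assumed reliability of the criterion

rc = bare_bones_mean(rs, ns)
rp = corrected_mean(rs, ns, icc_sr, icc_crit)
pct = percent_variance_from_sampling_error(rs, ns)
moderators_suspected = pct < 75  # 75% rule, criterion (a) only

print(round(rc, 3), round(rp, 3), round(pct, 1), moderators_suspected)
```

Because the corrected mean is always at least as large as the bare-bones mean when reliabilities are below 1, rp exceeds rc in this toy example. The sketch covers only criterion (a) of the moderator analysis; the Q statistic and the credibility interval are left out.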

Table 1. Summary of studies of criterion-related validity of sit-and-reach tests for estimating hamstring and lumbar extensibility.
Hamstring extensibility Lumbar extensibility
Reference Sample n Age (yrs) Test
Criter ♂ (r) ♀ (r) Criter ♂ (r) ♀ (r)
Ayala et al. Professional futsal ♂=55 26.0 (4.5) CSR PSLR .62* .93*
(2011) players ♀=48 23.0 (5.3) MSR PSLR .76* .73*
BSSR PSLR .47 .91*
Ayala et al. Recreationally ♂=156 21.3 (2.5) CSR PSLR .79*
(2012) active university ♀=87 20.7 (1.6)
students
Baker (1985) High and Middle ♀=100 14.1 (.8) CSR PSLR .64* MWM .28*
school students
Baltaci et al. University students ♀=102 22.0 (1.0) CSR PSLR .63*-.53*
(2003) BSSR PSLR .37*-.44*
CHSR PSLR .22*-.16
Book (1989) Public and private ♂=203 9-18 CSR PSLR .65* .81* MWM .33* .30*
schools students ♀=255
Bozic et al. Physically active ♂=84 21.3 (2.6) CSR PSLR .63*
(2010) sport and PE white
students
Castro-Piñero Caucasian children ♂=29 6-12 CSR PSLR .38*
et al. (2009b) and adolescents ♀=27 MSR PSLR .34*
♂=16 13-17 CSR PSLR .38*
♀=15 MSR PSLR .26
Chung and University students ♂=52 20.7 (1.3) CSR PSLR .77* MWM .23
Yuen (1999) MSR PSLR .71* MWM .24
Davis et al. University students ♂=42 23.6 (4.1) CSR PSLR .65*
(2008) ♀=39 24.1 (4.3) CSR PKE .57*
García (1995) High school stu- ♂=54 15-18 CSR PSLR .64* .67* MWM .32* .28*
dents ♀=55 CSR AKE .63* .61*
Hartman and Schoolchildren ♂=85 6-12 CSR PSLR .67*-.66* .47*-.49* MWM .05 .07
Looney (2003) ♀=88 CSR AKE .40*-.40* .52*-.54* SIM .29* .16
BSSR PSLR .69*-.67* .42*-.48* MWM .00-.07 .06-.06
BSSR AKE .50*-.47* .54*-.57* SIM .26*-.28* .10-.10
Hui et al. University students ♂=62 21.1 (3.1) CSR PSLR .48*-.47* .46*-.53* MWM .27* .24*
(1999) ♀=96 20.6 (2.1) MSR PSLR .45*-.45* .41*-.47* MWM .24 .22*
BSSR PSLR .46*-.44* .39*-.50* MWM .24-.27* .18-.15
MBSSR PSLR .44*-.45* .35*-.47* MWM .17-.20 .22*-.22*
VSR PSLR .58*-.63* .44*-.52* MWM .42* .24*
MVSR PSLR .57*-.62* .46*-.51* MWM .38* .28*
Hui and Yuen University students ♂=62 21.1 (3.1) CSR PSLR .48*-.47* .46*-.53* MWM .27* .24*
(2000) ♀=96 20.6 (2.1) BSSR PSLR .46*-.44* .39*-.50* MWM .24-.27* .18-.15
VSR PSLR .58*-.63* .44*-.52* MWM .42* .24*
USR PSLR .61*-.67* .50*-.54* MWM .47*-.47* .26*-.23*
Jackson and School PE students ♀=100 14.1 (.8) CSR PSLR .64* MWM .28*
Baker (1986)
Jackson and ? ♂=52 20-45 CSR PSLR .89* .70* MWM .59* .12
Langford (1989) ♀=52
Jones et al. Exercise classes at a ♂=32 74.5 (5.7) CSR PSLR .74* .71*
(1998) retirement commu- ♀=48 74.0 (6.7) BSSR PSLR .70* .71*
nity CHSR PSLR .76* .81*
Kanbur et al. Non-prepubertal/ ♂=69 13-14 CSR PSLR .64*-.65*
(2005) non-regularly exer-
cised boys
Langford (1987) ? ♂=52 20-45 CSR PSLR .89* .70* MWM .60* .12
♀=52
Lemmink et al. Independently living ♂=49 67.7 (7.5) CSR PSLR .74* .57* AAOS .13 .31*
(2003) people over 55 ♀=71 65.6 (8.6) MSR PSLR .54* .57* AAOS .21 .26*
Liemohn et al. University students ♂=20 24.0 (4.6) CSR PSLR .72* .70* SIM .29 .40
(1994) ♀=20 25.1 (6.3) BSSR PSLR .76* .70* SIM .32 .38

This table includes all studies that met selection criteria, however, full or partial information was not included in the meta-analysis (in bold) due to
duplication issues; ♂, males; ♀, females; ?, information unavailable; Criter: Criterion, CSR, Classic sit-and-reach test; MSR, Modified sit-and-reach
test; BSSR, Back-saver sit-and-reach test; MBSSR, Modified back-saver sit-and-reach test; VSR, V sit-and-reach test; MVSR, Modified v sit-and-
reach test; USR, Unilateral sit-and-reach test; CHSR, Chair sit-and-reach test; PSLR, Passive straight leg raise test; ASLR, Active straight leg raise
test; PKE, Passive knee extension test; AKE, Active knee extension test; SMM, Spinal Mouse method; SIM, Single Inclinometer method; MWM,
Macrae & Wright method; AAOSM, American Academy of Orthopedic Surgeons method;, Pearson´s r for the left and right leg, respectively.
* Pearson´s r statistically significant at p < 0.05

Table 1. Continued.
Hamstring extensibility Lumbar extensibility
Reference Sample n Age (yrs) Test
Criter ♂ (r) ♀ (r) Criter ♂ (r) ♀ (r)
López-Miñarro University students ♂=102 22.9 (3.2) CSR PSLR .56*-.59* .72*-.74* SIM .32* .14
et al. (2008) ♀=96 23.2 (4.5) VSR PSLR .53*-.55* .63*-.65* SIM .33* .29*
López Miñarro Canoeists ♂=44 13.3 (.6) CSR PSLR .77*-.73* .74*-.81*
et al. (2008a) ♀=22
López Miñarro Canoeists ?=66 13.3 (.6) CSR PSLR .70*-.68*
et al. (2008b)
López Miñarro ? ♂=120 22.8 (3.1) CSR PSLR .56*-.59* .72*-.74*
et al. (2008c) ♀=100 23.1 (4.6) BSSR PSLR .51*-.50* .68*-.68*
USR PSLR .54*-.58* .73*-.75*
López-Miñarro University students ♂=76 23.5 (4.0) CSR PSLR .56*-.59* .75*-.73*
et al. (2009) ♀=67 23.9 (5.4) BSSR PSLR .53*-.51* .70*-.66*
López-Miñarro ? ♂=73 23.0 (3.5) CSR PSLR .44*-.48* .75*-.73*
et al. (2010a) ♀=71 23.1 (4.3) MSR PSLR .28*-.32* .63*-.64*
López-Miñarro University students ♂=130 22.9 (3.2) CSR PSLR .56*-.59* .72*-.74*
et al. (2010b) ♀=110 23.2 (4.5) MSR PSLR .41*-.45* .62*-.63*
BSSR PSLR .51*-.49* .68*-.68*
VSR PSLR .53*-.55* .63*-.65*
López-Miñarro Recreationally ♂=120 22.9 (3.6) CSR PSLR .31*-.41*
and Rodríguez- active university ♂=120 CSR PSLR .61*-.55*
García (2010c) students: Low and
normal flexibility
López-Miñarro Older women: Low, ♀=36 65.3 (9.1) CSR PSLR .43*-.41*
et al. (2011) moderate and high ♀=35 .54*-.57*
flexibility ♀=35 .73*-.70*
López-Miñarro Canoeists ♂=51 17.5 (6.3) CSR PSLR .67*-.66*
et al. (2012) Kayakers ♂=60 .59*-.59*
Mier (2011) Physically active ♂=30 25.0 (9.3) CSR PSLR .64*/.66* .79*/.81*
adults ♀=30 23.7 (7.9) CSR PSLR
Minkler and Regular PE activity ♂=48 24.3 (4.7) MSR PSLR .75* .66* MWM .40* .25
Patterson (1994) classes practitioners ♀=51 21.5 (3.8)
(mainly Caucasians)
Miyazaki et al. Community- ♂=42 72.6 (6.9) CSR PSLR .60* SMM .18
(2010) dwelling elderly ♀=119
Orloff (1988) Gymnasium practi- ♂=47 19-54 CSR ASLR .52*
tioners ♀=28
Patterson et al. Middle school ♂=40 13.0 (.9) BSSR PSLR .72*-.68* .51*-.52* MWM .15-.10 .17-.25
(1996) students (various ♀=44 12.7 (.8)
ethnic origins)
Rodríguez- Fit sports activities ♂=125 22.9 (3.2) CSR PSLR .56*-.59* .72*-.74* SIM .32* .14
García et al. practitioners ♀=118 23.2 (4.5)
(2008)
Simoneau Physically active ♀=34 20.3 (.9) CSR PSLR .78* MWM .26
(1998) university students
Yuen and Hui University students ♂=19 ? CSR PSLR .52*-.57* MWM .18
(1998) ♀=36 MSR PSLR .49*-.55* MWM .27
BSSR PSLR .46*-.44* MWM .16-.12
MBSSR PSLR .39*-.52* MWM .21-.17
VSR PSLR .54*-.60* MWM .19
MVSR PSLR .58*-.61* MWM .25
Note. This table includes all studies that met selection criteria, however, full or partial information was not included in the meta-analysis (in bold) due
to duplication issues; ♂, males; ♀, females; ?, information unavailable; Criter, Criterion, CSR, Classic sit-and-reach test; MSR, Modified sit-and-
reach test; BSSR, Back-saver sit-and-reach test; MBSSR, Modified back-saver sit-and-reach test; VSR, V sit-and-reach test; MVSR, Modified v sit-
and-reach test; USR, Unilateral sit-and-reach test; CHSR, Chair sit-and-reach test; PSLR, Passive straight leg raise test; ASLR, Active straight leg
raise test; PKE, Passive knee extension test; AKE, Active knee extension test; SMM, Spinal Mouse method; SIM, Single Inclinometer method;
MWM, Macrae & Wright method; AAOSM, American Academy of Orthopedic Surgeons method; Pearson´s r for the left and right leg, respectively.
* Pearson´s r statistically significant at p < 0.05

Publication bias
Because some studies had fully or partially duplicated information, these r coefficient values were not analyzed in the present meta-analyses despite the fact that these studies met the selection criteria. For example, Baker's (1985) and Langford's (1987) doctoral dissertations were not included because the data were published later in a journal (although in Langford's works there was a small difference in one r value, it was simply considered a typo because the other data were equal) (Jackson and Baker, 1986; Jackson and Langford, 1989). The information from the López Miñarro et al. (2008b) study (males mixed with females) was not computed because the same data were also published with males and females separately (López Miñarro et al., 2008a). Additionally, full or partial information from a few studies with the same authors, sample characteristics, and correlation results was not included either due to duplication issues (Hui and Yuen, 2000; López Miñarro et al., 2008c; López-Miñarro et al., 2010b; Rodríguez-García et al., 2008). Pearson's r correlation values of selected studies that were excluded from the meta-analysis are indicated (in bold) in Table 1.

unlikely (67-133%). Hence, we concluded that it was unlikely that there would be this particular number of “lost” studies for each SR test protocol. On the other hand, regarding the lumbar extensibility, the file drawer analyses were not calculated because the actual r values were small.

Figure 2. Scatter plots of sample size and criterion-related Figure 3. Scatter plots of sample size and criterion-related
validity coefficients (r) for estimating hamstring extensibil- validity coefficients (r) for estimating lumbar extensibility:
ity: (a) Classic sit-and-reach; (b) Modified sit-and-reach; (a) Classic sit-and-reach; (b) Modified sit-and-reach; and (c)
and (c) Back-saver sit-and-reach. Dashed line represents Back-saver sit-and-reach. Dashed line represents median
median values of validity coefficients. values of validity coefficients.

Afterward, several exploratory analyses were con- Figures 2 and 3 show the scatter plots of sample
ducted to detect the presence of publication bias. Regard- size against criterion-related validity coefficients for esti-
ing hamstring extensibility, the results of the file drawer mating hamstring and lumbar extensibility, respectively.
analyses are based on effect size for estimating the num- Due to the low number of r values for the most SR test
ber of unlocated studies averaging null results (r = 0) that protocols (2-5 r values), only the scatter plots for the
would have to exist to bring the mean rp down to 0.29. Classic SR, Modified SR, and Back-saver SR tests were
These results are shown in the following lines (in paren- examined. According to this graphic method, the figures
thesis the unlocated/located percentage): 63 for the Clas- suggested that there was an absence of publication bias
sic SR (134%), 15 for the Modified SR (94%), 19 for the for the Classic SR and Modified SR tests. However, the
Back-saver SR (106%), 2 for the Modified back-saver SR two scatter plots of the Back-saver SR test suggested the
(67%), 6 for the V SR (120%), 4 for the Modified V SR presence of publication bias, because of the absence of r
(133%), 5 for the Unilateral SR (125%), and 3 for the values in the lower left hand corner of the inverted funnel
Chair SR (100%). Although we are aware that there is not plot. In this line, for the Back-saver SR test, the results of
a large number of “lost” studies for the Modified back- Spearman´s rank order correlation between r values and
saver SR, V SR, Modified V SR, Unilateral SR, and Chair sample size showed a statistically significant negative
SR, the percentage of unlocated/located studies was correlation for estimating hamstring extensibility (r = -
8 Criterion-related validity of sit-and-reach tests

0.66, p = 0.003) and marginally significant for lumbar that all SR test protocols had a low mean correlation coef-
extensibility (r = -0.61, p = 0.081). Nevertheless, for the ficient of criterion-related validity for estimating lumbar
Classic SR and Modified SR tests the results did not show extensibility (rp range = 0.16-0.35) in which, the 95% CI
a statistically significant correlation for either estimating of the Back-saver SR and the Modified back-saver SR
hamstring (Classic SR, r = -0.29, p = 0.050; Modified SR, tests included the value zero. Furthermore, studies ad-
r = -0.33, p = 0.207) or lumbar extensibility (Classic SR, r dressing the criterion-related validity of the Chair SR test
= -0.02, p = 0.935; Modified SR, r = -0.22, p = 0.608). for estimating lumbar extensibility were not found. Fi-
Finally, although we aware that the results for the Classic nally, since none of the three criteria were met in the
SR test for estimating hamstring extensibility were mar- seven SR test protocols, moderator analyses were not
ginally significant, the r value was considerably lower conducted for lumbar extensibility.
than for the Back-saver SR test.
Moderator analyses
Criterion-related validity Table 3 reports the results of moderator analyses to exam-
Table 2 reports the number of studies (K), the cumulative ine the effects of the sex of the participants (i.e., male and
number of r values (n), the total sample size accumulated female), the age of participants (i.e., children and adults),
(N), the overall weighted mean of r corrected for sam- and the level of hamstring extensibility (i.e., low average
pling error only (rc), the overall weighted mean of r cor- level, < 80º, and high average level, ≥ 80º) on overall
rected for both sampling error and measurement error (rp), criterion-related validity correlation coefficients for esti-
as well as the 95% CI for overall criterion-related validity mating hamstring extensibility for each SR test protocol
correlation coefficients (rp) separately for estimating potentially affected by moderator effects (i.e., the Classic
hamstring and lumbar extensibility across each SR test SR, Modified SR, Back-saver SR, Unilateral SR, and
protocol. In addition, to detect the presence of moderator Chair SR). Collectively, slight differences in rp values
effects which may affect overall criterion-related validity were detected in different categories of included modera-
of SR tests, the 95% CV, the percentage of variance ac- tors across the analyzed SR tests.
counted for by statistical artefacts, and the Q homogeneity Gender of participants: The results showed that all
statistic were calculated. SR test protocols had a moderate-to-high mean correla-
Hamstring extensibility: The overall results tion coefficient of criterion-related validity for estimating
showed that all SR test protocols had a moderate mean hamstring extensibility for males (rp range = 0.55-0.83)
correlation coefficient of criterion-related validity for and moderate for females (rp range = 0.41-0.70) in which
estimating hamstring extensibility (rp range = 0.46-0.67) all 95% CI did not include the value zero. There was a
in which all 95% CI did not include the value zero. For tendency of the mean correlation coefficient being
five of the eight SR test protocols, the percentage of vari- slightly greater for females than for males on each SR
ance accounted for by statistical artefacts was less than test, except for the Chair SR test where the opposite re-
75%, the Q homogeneity statistic was statistically signifi- sults were found. However, we have to be aware that,
cant (p < 0.05), and the 95% CV was relatively large. except for the Chair SR test, all the 95% CI of mean cor-
Therefore, follow-up moderator analyses were conducted relation coefficients were overlapped. Moreover, we
using predefined moderators as it was hypothesized in the should also be cautious because the low numbers of r
present study. values over the analyses were supported. Additionally,
Lumbar extensibility: The overall results showed according to moderator analysis criteria, at least one of

Table 2. Results of meta-analyses for overall criterion-related validity correlation coefficients across sit-and-reach test protocols.
Sit-and-reach test K n N rc rp 95% CIa 95% CVb % of variancec Q statistic
Hamstring extensibility
Classic sit-and-reach 28 47 3,481 .65 .67 .55, .80 .44, .91 22.55 208.39*
Modified sit-and-reach 9 16 1,058 .54 .56 .39, .73 .32, .80 33.14 48.28*
Back-saver sit-and-reach 10 18 1,158 .57 .59 .43, .75 .38, .80 36.65 49.12*
Modified back-saver sit-and-reach 2 3 213 .44 .46 .28, .65 .46, .46 100.00 .18
V sit-and-reach 3 5 411 .56 .60 .46, .74 .60, .60 100.00 3.23
Modified V sit-and-reach 2 3 213 .55 .59 .44, .74 .59, .59 100.00 1.14
Unilateral sit-and-reach 2 4 378 .61 .64 .52, .76 .51, .76 47.43 8.43*
Chair sit-and-reach 2 3 182 .45 .49 .29, .68 -.11, 1.00 9.50 31.57*
Lumbar extensibility
Classic sit-and-reach 13 21 1,762 .25 .26 .05, .46 .19, .32 91.52 22.95
Modified sit-and-reach 5 8 484 .26 .26 .03, .50 .26, .26 100.00 1.39
Back-saver sit-and-reach 5 9 510 .15 .16 -.10, .41 .16, .16 100.00 4.72
Modified back-saver sit-and-reach 2 3 213 .20 .21 -.01, .44 .21, .21 100.00 .07
V sit-and-reach 3 5 411 .30 .31 .11, .51 .31, .31 100.00 2.42
Modified V sit-and-reach 2 3 213 .30 .32 .11, .53 .32, .32 100.00 .63
Unilateral sit-and-reach 1 2 158 .34 .35 .15, .54 .26, .43 83.73 2.39
Chair sit-and-reach - - - - - - - - -
Note. K, number of studies; n, number of rs; N, total sample size; rc, overall weighted mean of r corrected for sampling error only; rp, overall weighted mean of r corrected for sampling error and measurement error; a 95% confidence interval; b 95% credibility interval; c percentage of variance accounted for by statistical artefacts, including sampling error and measurement error of sit-and-reach tests. * p < 0.05
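The file drawer counts reported under Publication bias can be approximately reproduced from the n and rp columns of the hamstring rows above. The sketch below assumes the effect-size file drawer formula x = k(mean r / critical r - 1), as described by Hunter and Schmidt (2004), together with the critical value of 0.29 stated in the text. Rounding up to the next whole study and the function name `file_drawer_count` are assumptions made for illustration, not details given in the article. With the two-decimal rp values of Table 2, the sketch matches the reported counts for seven of the eight protocols; for the Classic SR it gives 62 rather than the reported 63, presumably because rp is rounded in the table.

```python
import math

# (n located r values, mean rp) copied from the hamstring rows of Table 2
TABLE2_HAMSTRING = {
    "Classic SR": (47, 0.67),
    "Modified SR": (16, 0.56),
    "Back-saver SR": (18, 0.59),
    "Modified back-saver SR": (3, 0.46),
    "V SR": (5, 0.60),
    "Modified V SR": (3, 0.59),
    "Unilateral SR": (4, 0.64),
    "Chair SR": (3, 0.49),
}

CRITICAL_R = 0.29  # critical mean rp stated in the Publication bias section


def file_drawer_count(n_located, mean_r, critical_r=CRITICAL_R):
    """Number of null-result (r = 0) studies needed to pull mean_r down to critical_r.

    Assumed formula: x = k * (mean_r / critical_r - 1), rounded up (assumption).
    """
    return math.ceil(n_located * (mean_r / critical_r - 1))


for test, (n, rp) in TABLE2_HAMSTRING.items():
    x = file_drawer_count(n, rp)
    # Percentage of unlocated studies relative to the located r values
    print(f"{test}: {x} unlocated ({100 * x / n:.0f}% of located)")
```

The percentage in parentheses is simply the unlocated count divided by the number of located r values, which is why small n values produce large percentages even when only a handful of "lost" studies are required.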

Table 3. Results of moderator analyses for criterion-related validity correlation coefficients for estimating hamstring extensibility across all sit-and-reach test protocols potentially affected by moderator effects†
Moderator Effect K n N rc rp 95% CIa 95% CVb % of variancec Q statistic
Gender of participants
Classic sit-and-reach Males 19 21 1,493 .62 .64 .50, .78 .46, .82 38.28 54.86*
Females 18 20 1,361 .68 .70 .58, .82 .49, .91 25.06 79.80*
Modified sit-and-reach Males 7 7 469 .53 .55 .38, .71 .26, .83 26.80 59.70*
Females 6 6 447 .60 .62 .48, .76 .53, .72 68.72 23.28*
Back-saver sit-and-reach Males 8 8 490 .57 .59 .42, .75 .48, .69 71.70 11.16
Females 9 9 613 .58 .60 .45, .75 .33, .87 24.05 37.42*
Unilateral sit-and-reach Males 2 2 182 .59 .61 .48, .74 .61, .61 100.00 .72
Females 2 2 196 .63 .66 .55, .77 .47, .85 25.66 7.80*
Chair sit-and-reach Males 1 1 32 .76 .83 .71, .94 .83, .83 100.00 .00
Females 2 2 150 .39 .41 .22, .60 -.16, .99 9.79 20.42
Age of participants
Classic sit-and-reach Children 8 14 1,173 .66 .67 .55, .79 .47, .87 26.28 53.27*
Adults 20 33 2,308 .64 .68 .55, .80 .43, .92 21.22 155.50*
Modified sit-and-reach Children 1 2 87 .31 .32 .05, .59 .32, .32 100.00 1.34
Adults 8 14 971 .56 .58 .42, .74 .37, .79 35.13 45.55*
Back-saver sit-and-reach Children 2 4 257 .58 .59 .43, .75 .45, .74 55.88 7.16
Adults 8 14 901 .56 .59 .43, .75 .36, .82 33.38 41.94*
Unilateral sit-and-reach Children - - - - - - - - -
Adults 2 4 378 .61 .64 .52, .76 .51, .76 47.43 8.43*
Chair sit-and-reach Children - - - - - - - - -
Adults 2 3 182 .45 .49 .29, .68 -.11, 1.00 9.50 31.57*
Level of hamstring extensibility
Classic sit-and-reach Low 15 16 1,129 .60 .63 .48, .77 .41, .84 30.79 51.97*
High 19 25 1,984 .67 .70 .59, .81 .46, .94 18.39 135.91*
Modified sit-and-reach Low 4 5 355 .51 .53 .36, .70 .24, .82 24.87 64.35*
High 7 10 648 .55 .58 .41, .74 .36, .79 37.42 42.75*
Back-saver sit-and-reach Low 5 6 433 .54 .57 .41, .72 .31, .82 27.37 21.92*
High 7 11 670 .59 .61 .45, .77 .44, .79 45.34 24.26*
Unilateral sit-and-reach Low 1 1 120 .56 .58 .47, .70 .58, .58 100.00 .00
High 2 3 258 .63 .66 .55, .78 .51, .81 38.22 7.85*
Chair sit-and-reach Low 2 2 134 .33 .35 .14, .56 -.13, .83 16.42 12.18*
High 1 1 48 .81 .86 .79, .94 .86, .86 100.00 .00
Note. K, number of studies; n, number of rs; N, total sample size; rc, overall weighted mean of r corrected for sampling error only; rp, overall weighted mean of r corrected for sampling error and measurement error; a 95% confidence interval; b 95% credibility interval; c percentage of variance accounted for by statistical artefacts, including sampling error and measurement error of sit-and-reach tests. † Because some studies mixed sexes or did not report hamstring extensibility levels, the overall n for these categories is lower for some sit-and-reach tests. * p < 0.05
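The columns defined in the note (rc, the 95% CV, and the percentage of variance accounted for by artefacts) follow from a few sample-size-weighted sums. The sketch below is a bare-bones Hunter-Schmidt calculation that corrects for sampling error only; it does not apply the measurement-error correction behind rp, and it omits the Q statistic. The input correlations and sample sizes are made-up illustrative values, not data from the included studies, and the function name and dictionary keys are hypothetical.

```python
import math


def bare_bones_meta(rs, ns):
    """Bare-bones Hunter-Schmidt meta-analysis of correlations (sampling error only)."""
    total_n = sum(ns)
    k = len(rs)
    # Sample-size-weighted mean correlation (analogous to rc)
    r_bar = sum(n * r for r, n in zip(rs, ns)) / total_n
    # Sample-size-weighted observed variance of r across studies
    var_obs = sum(n * (r - r_bar) ** 2 for r, n in zip(rs, ns)) / total_n
    # Expected sampling-error variance, based on the average sample size
    var_err = (1 - r_bar ** 2) ** 2 / (total_n / k - 1)
    # Residual variance attributed to real differences between studies
    var_res = max(var_obs - var_err, 0.0)
    # Percentage of observed variance explained by sampling error (capped at 100)
    pct_variance = 100.0 if var_obs <= var_err else 100.0 * var_err / var_obs
    sd_res = math.sqrt(var_res)
    return {
        "r_bar": r_bar,
        "pct_variance": pct_variance,
        # 95% credibility interval around the mean correlation
        "cv95": (r_bar - 1.96 * sd_res, r_bar + 1.96 * sd_res),
        # 75% rule: moderators are suspected when artefacts explain less than 75%
        "moderators_suspected": pct_variance < 75.0,
    }


# Hypothetical r values and sample sizes, for illustration only
result = bare_bones_meta(rs=[0.50, 0.70, 0.62, 0.35], ns=[100, 80, 120, 60])
print(result)
```

When sampling error accounts for all of the observed variance, the residual standard deviation is zero and the credibility interval collapses to a single value, which is the pattern seen in the rows of Tables 2 and 3 that report 100.00% of variance.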

the three criteria was met in the SR test protocols (except for the Unilateral SR and Chair SR for males, because logically these had only two and one r values, respectively), indicating that the criterion-related validity of these SR tests separately for sex was still heterogeneous. Finally, because some studies grouped males and females together, the overall n for sex of participants in Table 3 is lower for some SR test protocols.

Age of participants: The results showed that all SR test protocols had a low-to-moderate mean correlation coefficient of criterion-related validity for estimating hamstring extensibility for children (rp range = 0.32-0.67) and a moderate one for adults (rp range = 0.49-0.68), and none of the 95% CIs included zero. Of all the examined SR test protocols, only studies of the Classic SR, Modified SR, and Back-saver SR tests were found for both children and adults. The mean correlation coefficient tended to be greater for adults than for children in the Classic SR and Modified SR, but not in the Back-saver SR test, where the average r values were equal. However, in every case the 95% CIs of the mean correlation coefficients overlapped. Furthermore, we should also be cautious because of the low number of r values on which the analyses were based. Finally, according to moderator analysis criteria, at least one of the three criteria was met in most SR test protocols (except for the Modified SR for children, because logically these had only two r values), indicating that the criterion-related validity of these SR tests separately for age was still heterogeneous.

Level of hamstring extensibility: The results showed that all SR test protocols had a low-to-moderate mean correlation coefficient of criterion-related validity for participants with a low level of hamstring extensibility (< 80° in the average score of the straight leg raise test) (rp range = 0.35-0.63) and a moderate-to-high one for participants with a high level of hamstring extensibility (≥ 80° in the average score of the straight leg raise test) (rp range = 0.58-0.86), and none of the 95% CIs included zero. For all examined SR test protocols, the mean correlation coefficient tended to be greater for participants with high levels of hamstring extensibility than for those with low levels.

However, we have to be aware that, except for the Chair SR test, all the 95% CIs of the mean correlation coefficients overlapped, and that the analyses were based on a low number of r values. Additionally,
according to moderator analysis criteria, at least one of the three criteria was met in all SR test protocols (except for the Unilateral SR for low levels and the Chair SR for high levels, because logically these had only one r value), indicating that the criterion-related validity of these SR tests separately for level of hamstring extensibility was still heterogeneous. Finally, because several studies failed to identify the level of hamstring extensibility or were ambiguous (i.e., hamstring extensibility scores around 80° shown graphically), the overall n for level of hamstring extensibility in Table 3 is lower for some SR tests.

Discussion

Since its conception, the Classic SR test has been subjected to numerous modifications, often with the aim of improving its validity. However, according to the results of the present meta-analysis, and although we are aware that all the 95% CIs of the mean correlation coefficients overlapped, the Classic SR test showed a greater average criterion-related validity coefficient. Hence, if the purpose is to assess hamstring extensibility, the use of a modification of the classic protocol does not seem justified.

Specifically, it has been suggested for several years that the Classic SR test does not consider limb length differences (Hoeger et al., 1990). To solve this methodological "problem", Hoeger et al. (1990) proposed the Modified SR, which incorporates a finger-to-box distance to account for proportional differences between legs and arms. These authors found that adolescents with longer legs relative to arms had poorer performance on the Classic SR test, and that the Modified SR negated the concern about disproportionate limb length bias by establishing a relative zero point for each person. Unfortunately, this study failed to address the very important issue of criterion-related validity.

The present meta-analysis showed a greater overall mean criterion-related validity for the Classic SR than for the Modified SR. In addition, for other modifications of SR tests that incorporated the fingers-to-box distance (i.e., the Modified back-saver SR and Modified V SR), the average criterion-related validity coefficients were higher for the end-score version than for the modified one. Consistent with this, in most primary studies in which the criterion-related validity of end and difference scores of SR tests was studied in the same sample, coefficient values were slightly greater for traditional protocols (e.g., Ayala et al., 2011; Castro-Piñero et al., 2009b; Lemmink et al., 2003; López-Miñarro et al., 2010a; López-Miñarro et al., 2010b).

Regarding the criterion-related validity for estimating lumbar extensibility, in addition to the low correlation coefficient found, we have to be aware that Pearson's zero-order correlation coefficient was considered; therefore, because of the variance shared by hamstring and lumbar extensibility, the "real" criterion-related validity values for estimating lumbar extensibility could be even lower.

Finally, in line with the results of the present meta-analysis, previous primary studies carried out with young adults (López-Miñarro and Rodríguez-García, 2010c) and elderly women (López-Miñarro et al., 2011) found that the level of hamstring extensibility influenced the criterion-related validity of the Classic SR and Toe touch tests. However, because in the present meta-analysis the n was classified based on the average scores of the straight leg raise test, we are aware that several participants with low hamstring extensibility could have been classified as having high flexibility and vice versa. This could drastically reduce the difference reported in the results of the present meta-analysis.

Strengths and limitations

Meta-analysis is a useful tool for assessing scientific evidence, but an understanding of its strengths and limitations is needed for the most appropriate use of this method (Flather et al., 1997). Overall, the main strength of a meta-analysis is that it yields more reliable population estimates than those of the constituent studies. Therefore, the results of a meta-analysis let us generalize research findings, as well as test hypotheses that may never have been tested in primary studies. Likewise, meta-analysis represents the best up-to-date approach to describe and summarize the scientific findings of a research area (Hunter and Schmidt, 2004). Lastly, meta-analysis methods can advance an entire discipline by addressing more general questions in the area (Cooper et al., 2009).

Regarding the strengths of the present meta-analysis, we took several measures to avoid (or at least reduce) publication bias. Many research studies fail to be published at all, while others are published only in abstract form, as conference proceedings, or as dissertations, but not as scientific articles. Furthermore, research studies with favorable results are far more likely to be published than those with inconclusive results. Likewise, identification of relevant studies may also be difficult because of their publication in less accessible journals. Thus, performing a meta-analysis when a proportion of the relevant data is missing can provide misleading results, and publication bias may spuriously support a hypothesis by continuously selecting favorable results and rejecting unfavorable ones (Flather et al., 1997).

Therefore, to avoid availability bias, we conducted a wide literature search. The potential inclusion of all relevant single studies in the present meta-analysis (i.e., published and unpublished, or English and non-English language) through extensive and careful searching might help reduce the impact of publication bias. Hence, the inclusion of unpublished and non-English language studies in the literature search is an important strength of the present meta-analysis. Multiple publication bias also exists when the same researchers, responsible for multiple publications, report the same validity coefficients derived from the same participants under the same experimental conditions. Thus, in the present meta-analysis all studies by the same authors were thoroughly cross-referenced with each other. Since some selected studies had fully or partially duplicated information, these particular correlation values were not analyzed in the meta-analyses. Lastly, several exploratory analyses were also conducted to detect the presence of publication bias.

Finally, the Hunter-Schmidt psychometric meta-analysis approach (Hunter and Schmidt, 2004) was used in the present study to obtain the population estimates of criterion-related validity of SR tests. Because sample sizes are never infinite and measures are never perfectly reliable, sampling error and measurement error are always present in real data. The psychometric meta-analysis approach corrects the observed correlations for both sampling error and measurement error. Thus, this method is probably one of the best approaches to estimate population correlation coefficients.

On the other hand, there were some limitations that should be considered when examining the results of the present study. The main limitations of the present meta-analysis were related to the small number of criterion-related validity coefficients found. Firstly, estimating population parameters based on small samples is simply less accurate than in a large-sized meta-analysis. Secondly, a partially hierarchical breakdown had to be used. The main problem with this kind of analysis is that it might produce quite misleading results due to confounding and interaction effects. We are aware that a fully hierarchical moderator analysis approach may be a more appropriate method to resolve this problem. However, more correlation coefficients would be needed for each level of the moderators. For these reasons, the results of the present study should be considered with caution, especially for those SR test protocols for which only a few studies were retrieved. Firmer conclusions should await the accumulation of a larger number of studies (Hunter and Schmidt, 2004).

Another limitation of the present meta-analysis is related to the criterion measures used in the included studies. Joint range of motion measured through radiography seems to be the best criterion measurement to assess flexibility (Gajdosik and Bohannon, 1987), but for several practical reasons, such as high cost and the need for sophisticated instruments, qualified technicians, and time, the use of this method is limited (Castro-Piñero et al., 2009b). On the other hand, goniometers are relatively easy to obtain, valid, and highly accurate instruments to measure joint range of motion; therefore, joint range of motion measured through goniometers has been widely considered a valid and suitable criterion measure of hamstring extensibility (e.g., Ayala et al., 2011; Hartman and Looney, 2003; López-Miñarro and Rodríguez-García, 2010c). Accordingly, all the previous studies found considered angular tests measured by goniometers as the criterion measures. However, some recent studies have suggested that the criterion measures of hamstring extensibility must be reexamined and readjusted (Cardoso et al., 2007; Hartman and Looney, 2003). Similarly, although none of the previous studies used radiography as the criterion measure of lumbar extensibility, they administered tests with demonstrated high reliability and validity (Macrae and Wright, 1969; Williams et al., 1993).

Another area of concern is that the moderator analyses showed that there was still a large amount of unexplained variance after controlling for artefacts and predefined moderators. Studies included in a meta-analysis are expected to vary in a number of ways. Thus, beyond sampling error and other statistical artefacts, differences between studies (e.g., sample, study design, or test procedure) undoubtedly affect these results. For example, the straight leg raise test can be measured with different kinds of movements (i.e., active or passive), instruments (e.g., radiography, goniometer, or inclinometer), numbers of researchers, numbers of repetitions, rest times between repetitions, and criteria of maximum extensibility. Additionally, in the present meta-analysis different criterion measures were used to estimate lumbar extensibility. This statistical heterogeneity can be quantified, but there is usually uncertainty about how important the differences really are. Thus, quantifying and accounting for differences between component studies in a meta-analysis remains a substantial methodological problem and a continuing source of debate (Flather et al., 1997).

Finally, coding some study features was problematic for different reasons. The moderator analysis had missing data in the sex categories because some authors mixed males with females in their studies. Hamstring extensibility also had missing data because several authors failed to identify it or it was ambiguous. In addition, because in the present meta-analysis hamstring extensibility was classified based on the average scores, we are aware that several participants with low hamstring extensibility could have been classified as having high flexibility and vice versa. Lastly, although participant characteristics such as physical activity levels or sports practice were potentially moderating features, coding for them was not possible because most studies did not identify them.

Conclusion

Overall, the SR tests have a moderate mean correlation coefficient of criterion-related validity for estimating hamstring extensibility, but a low mean criterion-related validity for estimating lumbar extensibility. The Classic SR test shows the greatest average criterion-related validity for estimating hamstring extensibility. The results of the present meta-analysis suggest that the end scores of the classic versions of the SR tests (e.g., the Classic SR) are a better indicator of hamstring extensibility than the modifications that incorporate the fingers-to-box distance (e.g., the Modified SR). Regarding the three potential moderators examined (sex of participants, age of participants, and level of hamstring extensibility), females, adults, and participants with high levels of hamstring extensibility generally tended to have greater mean values of criterion-related validity for estimating hamstring extensibility. However, due to the low number of r values found, the fact that almost all the 95% CIs of the mean correlation coefficients overlapped, and the fact that the criterion-related validity of SR tests within each category was still heterogeneous, we should be cautious with the results of the present meta-analysis.

Therefore, when angular tests such as the straight leg raise or knee extension tests cannot be used, the SR
tests seem to be a useful alternative to estimate hamstring extensibility; however, to assess lumbar extensibility, other widely used tests such as the Macrae & Wright or Single/Double inclinometer methods should be used. Nevertheless, as in the application of any field fitness test, evaluators must be aware that the results of SR tests are simply an estimation and, therefore, not a direct measure of hamstring extensibility. On the other hand, when a larger number of studies has accumulated, a large-sized meta-analysis with a fully hierarchical analysis approach should be carried out. Future research should further study the criterion-related validity of SR tests, especially in modifications of SR tests such as the SR with plantar flexion and among populations such as children or athletes, and should examine in more depth other related aspects such as the level of hamstring extensibility.

Acknowledgments
We thank Anna Szczesniak and Aliisa Hatten for the English revision. The first author is supported by a research grant from the Spanish Ministry of Education (AP2010-5905).

References

Ayala, F., Sainz de Baranda, P., De Ste Croix, M. and Santonja, F. (2011) Criterion-related validity of four clinical tests used to measure hamstring flexibility in professional futsal players. Physical Therapy in Sport 12, 175-181.
Ayala, F., Sainz de Baranda, P., De Ste Croix, M. and Santonja, F. (2012) Reproducibility and criterion-related validity of the sit and reach test and toe touch test for estimating hamstring flexibility in recreationally active young adults. Physical Therapy in Sport 13, 219-226.
Baker, A.A. (1985) The relative contribution of flexibility of the back and hamstring muscles in the performance of the sit and reach component of the AAHPERD health related fitness test in girls thirteen to fifteen years of age. Doctoral thesis, University of North Texas, United States.
Baltaci, G., Tunay, V., Besler, A. and Gerçeker, S. (2003) Comparison of three different sit and reach tests for measurement of hamstring flexibility in female university students. British Journal of Sports Medicine 37, 59-61.
Begg, C.B. and Mazumdar, M. (1994) Operating characteristics of a rank order correlation for publication bias. Biometrics 50, 1088-1101.
Benito Peinado, P.J., Díaz Molina, V., Calderón Montero, F.J., Peinado Lozano, A.B., Martín Caro, C., Álvarez Sánchez, M. and Pérez Tejero, J. (2007) Literature review in exercise physiology: Practical recommendations. Revista Internacional de Ciencias del Deporte 6, 1-11.
Biering-Sorensen, F. (1984) Physical measurements as risk indicator for low-back trouble over a one year period. Spine 9, 106-119.
Book, C.B. (1989) Validation of the sit-and-reach test. Doctoral thesis, University of Minnesota, United States.
Bozic, P.R., Pazin, N., Berjan, B.B., Planic, N.M. and Cuk, I.D. (2010) Evaluation of the field tests of flexibility of the lower extremity: Reliability, and the concurrent and factorial validity. Journal of Strength and Conditioning Research 24, 2523-2531.
Chung, P.K. and Yuen, C.K. (1999) Criterion-related validity of sit-and-reach test in university men in Hong Kong. Perceptual and Motor Skills 88, 304-316.
Cohen, J.A. (1992) Power primer. Psychological Bulletin 112, 155-159.
Cooper, H., Hedges, L.V. and Valentine, J.C. (2009) The handbook of research synthesis and meta-analysis. 2nd edition. Sage, New York.
Da Silva Díaz, R. and Gómez-Conesa, A. (2008) Shortened hamstring syndrome. Fisioterapia 30, 186-193.
Davis, D.S., Quinn, R.O., Whiteman, C.T., Williams, J.D. and Young, C.R. (2008) Concurrent validity of four clinical tests used to measure hamstring flexibility. Journal of Strength and Conditioning Research 22, 583-588.
Erkula, G., Demirkan, F., Kilic, B.A. and Kiter, E. (2002) Hamstring shortening in healthy adults. Journal of Back and Musculoskeletal Rehabilitation 16, 77-81.
Esola, M.A., McClure, P.W., Fitzgerald, G.K. and Siegler, S. (1996) Analysis of lumbar spine and hip motion during forward bending in subjects with and without a history of low back pain. Spine 21, 71-78.
Fisk, J.W., Baigent, M.L. and Hill, P.D. (1984) Scheuermann's disease. Clinical and radiological survey of 17 and 18 year olds. American Journal of Physical Medicine 63, 18-30.
Flather, M.D., Farkouh, M.E., Pogue, J.M. and Yusuf, S. (1997) Strengths and limitations of meta-analysis: Larger studies may be more reliable. Controlled Clinical Trials 18, 568-579.
Gajdosik, R.L. and Bohannon, R.W. (1987) Clinical measurement of range of motion: Review of goniometry emphasizing reliability and validity. Physical Therapy 67, 1867-1872.
García, S.C. (1995) Validity of the sit-and-reach test for male and female adolescents. Doctoral thesis, University of Oregon, United States.
Hartman, J.G. and Looney, M. (2003) Norm-referenced and criterion-referenced reliability and validity of the back-saver sit-and-reach. Measurement in Physical Education and Exercise Science 7, 71-87.
Harvey, J. and Tanner, S. (1991) Low back pain in young athletes: A practical approach. Sports Medicine 12, 394-406.
Hoeger, W.W., Hopkins, D.R., Button, S. and Palmer, T.A. (1990) Comparing the sit and reach with the modified sit and reach in measuring flexibility in adolescents. Pediatric Exercise Science 2, 156-162.
Holt, L.E., Pelma, T.W. and Burke, D.G. (1999) Modifications to the standard sit-and-reach flexibility protocol. Journal of Athletic Training 34, 43-47.
Hui, S.S. and Yuen, P.Y. (2000) Validity of the modified back-saver sit-and-reach test: A comparison with other protocols. Medicine & Science in Sports & Exercise 32, 1655-1659.
Hui, S.C., Yuen, P.Y., Morrow, J.R. and Jackson, A.W. (1999) Comparison of the criterion-related validity of sit-and-reach tests with and without limb length adjustment in Asian adults. Research Quarterly for Exercise and Sport 70, 401-406.
Hunter, J.E. and Schmidt, F.L. (2004) Methods of meta-analysis: Correcting error and bias in research findings. 2nd edition. Sage, Newbury Park.
Jackson, A.W. and Baker, A.A. (1986) The relationship of the sit and reach test to criterion measures of hamstring and back flexibility in young females. Research Quarterly for Exercise and Sport 57, 183-186.
Jackson, A. and Langford, N.J. (1989) The criterion-related validity of the sit and reach test: Replication and extension of previous findings. Research Quarterly for Exercise and Sport 60, 384-387.
Cardoso, J.R., Azevedo, N.C.T., Cassano, C.S., Kawano, M.M. and Jones, C. J., Rikli, R. E., Max, J. and Noffal, G. (1998) The reliability
Âmbar G. (2007) Intra and interobserver reliability of angular and validity of a chair sit-and-reach test as a measure of ham-
kinematic analysis of the hip joint during the sit-and-reach test string flexibility in older adult. Research Quarterly for Exercise
to measure hamstring length in university students. Revista Bra- and Sport 69, 338-343.
sileña de Fisioterapia 11, 133-138. Kanbur, N. O., Düzgün, I., Derman, O. and Baltaci, G. (2005) Do sexual
Castro-Piñero, J., Artero, E.G., España-Romero, V., Ortega, F.B., maturation stages affect flexibility in adolescent boys aged 14
Sjöström, M. and Ruiz, J.R. (2009a) Criterion-related validity of years? The Journal of Sports Medicine and Physical Fitness 45,
field-based fitness tests in youth: A systematic review. British 53-57.
Journal of Sports Medicine 44, 934-943. Kawano, M.M., Ambar, G., Oliveira, B.I.R., Boer, M.C., Cardoso, A.P.
Castro-Piñero, J., Chillón, P., Ortega, F.B., Montesinos, J.L., Sjöström, R.G. and Cardoso, J.R. (2010) Influence of the gastrocnemios
M. and Ruiz, J.R. (2009b) Criterion-related validity of sit-and- muscle on the sit-and-reach test assessed by angular kinematic
reach and modified sit-and-reach test for estimating hamstring analysis. Revista Brasileña de Fisioterapia 14, 10-15.
flexibility in children and adolescents aged 6-17 years. Interna- Kendall, F.P., McCreary, E.K., Provance, P.G., Rodgers, M.M. and
tional Journal of Sports Medicine 30, 658-662. Romani, W.A. (2005) Muscles: Testing and function with pos-
Mayorga-Vega et al. 13

ture and pain. 5th edition. Lippincott, Williams, & Wilkins, ing hamstring flexibility and validity of the sit-and-reach test.
Baltimore. Research Quarterly for Exercise and Sport 82, 617-623.
Langford, N.J. (1987) The relationship of the sit and reach test to crite- Mierau, D., Cassidy, J.D. and Yong-Hing, K. (1989) Low-back pain and
rion measures of hamstring and back flexibility in adult males straight in children and adolescents. Spine 14, 526-528.
and females. Doctoral thesis, North Texas State University, Minkler, S. and Patterson, P. (1994) The validity of the modified sit-
United States. and-reach test in college-age students. Research Quarterly for
Lemmink, K.A.P.M., Kemper, H.C.G., de Greef, M.H.G., Rispens, P. Exercise and Sport 65, 189-192.
and Stevens, M. (2003) The validity of the sit-and-reach test and Miyazaki, J., Murata, S., Horie, J. and Suzuki, S. (2010) Relationship
the modified sit-and-reach test in middle-aged to older men and between the sit-and-reach distance and spinal mobility and
women. Research Quarterly for Exercise and Sport 74, 331- straight leg raising range. Rigakuryoho Kagaku 25, 683-686.
336. Orloff, H. A. (1988) Standardization, reliability and validation of flexi-
Liemohn, W., Martin, S.B. and Pariser, G.L. (1997) The effect of ankle bility field testing protocols. Doctoral thesis, University of Kan-
posture on sit-and-reach test performance. Journal of Strength sas, United States.
and Conditioning Research 11, 239-241. Patterson, P., Wiksten, D. L., Ray, L., Flanders, C. and Sanphy, D.
Liemohn, W., Sharpe, G. L. and Wasserman, J. F. (1994) Criterion (1996) The validity and reliability of the back saver sit-and-
related validity of the sit-and-reach test. Journal of Strength and reach test in middle school girls and boys. Research Quarterly
Conditioning Research 8, 91-94. for Exercise and Sport 67, 448-451.
Light, R.J. and Pillemer, D.B. (1984) Summing up: The science of Rodríguez-García, P.L., López-Miñarro, P.A., Yuste, J.L. and Sáinz de
reviewing research. Harvard University Press, Cambridge. Baranda, P. (2008) Comparison of hamstring criterion-related
Lipsey, M.W. and Wilson, D.B. (2001) Practical meta-analysis. Sage, validity, sagittal spinal curvatures, pelvic tilt and score between
Newbury Park. sit-and-reach and toe-touch tests in athletes. Medicina dello
López Miñarro, P.A., Ferragut Fiol, C., Alacid Cárceles, F., Yuste Sport 61, 11-20.
Lucas, J.L. and García Ibarra, A. (2008a) Validity of sit-and- Rosenthal, R. (1979) The “file drawer problem” and tolerance for null
reach and toe-touch tests in the evaluation of hamstring muscle results. Psychological Bulletin 33, 1005-1008.
length in young paddlers. Apunts 43, 24-29. Santonja Medina, F., Ferrer López, V. and González-Moro, I.M. (1995)
López Miñarro, P.A., Rodríguez García, P.L., Yuste, J.L., Alacid, F., Clinical examination of hasmtring tightness. Selección 4, 81-91.
Ferragut, C. and García Ibarra, A. (2008b) Validity of the Simoneau, G. G. (1998) The impact of various anthropometric and
lumbo-sacral position in bending as measure of hamstring mus- flexibility measurements on the sit-and-reach test. Journal of
cle extensibility on young athletes. Archivos de Medicina del Strength and Conditioning Research 12, 232-237.
Deporte 25, 103-110. Smith, J.F. and Miller, C.V. (1985) The effect of head position on sit-
López Miñarro, P.A., Sainz de Baranda Andújar, P., Yuste Lucas, J.L. and-reach performance. Research Quarterly for Exercise and
and Rodríguez García, P.L. (2008c) Validity of the unilateral Sport 56, 84-85.
sit-and-reach test as measure of hamstring muscle extensibility. Standaert, C. J. and Herring, S. A. (2000) Spondylolysis: A critical
Comparison with other protocols. Cultura, Ciencia y Deporte 3, review. British Journal of Sports Medicine 34, 415-422.
87-92. Williams, R., Binkley, J., Bloch, R., Goldsmith, C.H. and Minuk, T.
López-Miñarro, P.A. (2010) Criterion-related validity of the lumbo- (1993) Reliability of the Modified-modified Schöber and Dou-
horizontal angle in flexion as a measure of hamstring muscle ble inclinometer methods for measuring lumbar flexion and ex-
extensibility in young adults. Cultura, Ciencia y Deporte 5, 25- tension. Physical Therapy 73, 26-37.
31. Yuen, P. Y., and Hui, S. C. (1998) Are difference scores better predictor
López-Miñarro, P.A. and Alacid, F. (2009) Influence of hamstring of flexibility than end scores in sit-and-reach tests? Medicine &
muscle extensibility on spinal curvatures in young athletes. Sci- Science in Sports & Exercise 30, S125.
ence & Sports 25, 188-193.
Key points

• Overall, sit-and-reach tests have a moderate mean criterion-related validity for estimating hamstring extensibility, but they have a low mean validity for estimating lumbar extensibility.
• Among all the sit-and-reach test protocols, the Classic sit-and-reach test seems to be the best option to estimate hamstring extensibility.
• End scores (e.g., the Classic sit-and-reach test) are a better indicator of hamstring extensibility than the modifications that incorporate fingers-to-box distance (e.g., the Modified sit-and-reach test).
• When angular tests such as straight leg raise or knee extension tests cannot be used, sit-and-reach tests seem to be a useful field test alternative to estimate hamstring extensibility, but not to estimate lumbar extensibility.

AUTHOR BIOGRAPHIES
Daniel MAYORGA-VEGA
Employment
Research fellow, Department of Physical Education and Sport, University of Granada, Spain.
Degree
PhD student
Research interests
Measurement and evaluation, health-related physical fitness, physical education-based interventions, motivation toward physical activity.
E-mail: [email protected]
Rafael MERINO-MARBAN
Employment
Tenured Professor, Department of Didactics of Musical, Plastic and Corporal Expression, University of Malaga, Spain.
Degree
PhD
Research interests
Health-related physical fitness, physical education-based interventions, kinesio tape.
E-mail: [email protected]
Jesús VICIANA
Employment
Tenured Professor, Department of Physical Education and Sport, University of Granada, Spain.
Degree
PhD
Research interests
Planning in physical education, physical education-based interventions, motivation toward physical activity.
E-mail: [email protected]

Daniel Mayorga-Vega
Department of Physical Education and Sport, University of Granada, Spain
