Estill Voice Training and Voice Quality Control in Contemporary Commercial Singing: An Exploratory Study
To cite this article: Marco Fantini, Franco Fussi, Erika Crosetti & Giovanni Succo (2016): Estill
Voice Training and voice quality control in contemporary commercial singing: an exploratory
study, Logopedics Phoniatrics Vocology, DOI: 10.1080/14015439.2016.1237543
LOGOPEDICS PHONIATRICS VOCOLOGY, 2016
http://dx.doi.org/10.1080/14015439.2016.1237543
ORIGINAL ARTICLE
CONTACT Marco Fantini, ENT Department, San Luigi Gonzaga Hospital, Regione Gonzole 10, 10043 Orbassano, Turin, Italy
© 2016 Informa UK Limited, trading as Taylor & Francis Group
Table 1. Estill figures for voice.
              Figures                      Positions
Larynx        TVF: Onset/Offset control    Glottal, Aspirate, Smooth
              FVF control                  Constrict, Mid, Retract
              TVF: Body Cover control      Slack, Thick, Thin, Stiff
              Thyroid Cartilage control    Vertical, Tilt
              Cricoid Cartilage control    Vertical, Tilt
              AES control                  Wide, Narrow
Vocal tract   Larynx control               Low, Mid, High
              Tongue control               Low, Mid, High, Compress
              Velum control                Low, Mid, High
              Jaw control                  Forward, Mid, Back, Drop
              Lips control                 Protrude, Mid, Spread
Support       Head and Neck control        Relax, Anchor
              Torso control                Relax, Anchor

Certified Master Teacher (CMT) and Certified Course Instructor (CCI), which correspond to different levels of competence and different professional tasks in the field of EVT (14).

Concerning voice pedagogy, several authors have investigated the effects of specific voice training on the acoustic quality of the professional voice (15–20). Nevertheless, little research about the efficacy of the modern singing methods in the acquisition of vocal function awareness has been carried out so far (20,21).

The acoustic quality of the singing voice can be assessed through a variety of acoustic measures, such as perturbation parameters like Jitter, Shimmer and Noise to Harmonic Ratio; sound pressure level; fundamental frequency (F0); tempo and vibrato (22). Long-Term Average Spectrum (LTAS) is another useful analysis which provides information about the spectral energy distribution of a sound. It reflects both the voice source and the vocal tract resonance characteristics (23,24). Several parameters can be used in order to assess the spectral energy balance from a LTAS, such as the Singing Power Ratio (SPR) (25), the energy ratio (26), the alpha coefficient (27) and the difference in energy between different frequency bands. In the present study, the acoustic analysis was performed with a dynamical approach: the aim was not to investigate the absolute acoustic quality of the vocal signal but to assess the ability of the singers to control and modulate the quality of the voice according to a specific task. A group of contemporary commercial singers who had attained a CFP was compared to a control group. We hypothesized that EVT might contribute to the acquisition of a greater ability in conscious voice quality control.

Methods
This retrospective small scale exploratory study was carried out according to the Declaration of Helsinki. All subjects enrolled in the study gave their informed consent.

Participants
In the present study, 35 contemporary commercial music singers (pop and rock singers) with no vocal complaints were recruited. All the singers enrolled in the current study were professional or semi-professional (earning a living as singers or about to start their career as professional singers). Twenty singers studied EVT and had a CFP; fifteen other singers had a control role: they studied in Italian contemporary popular music institutions; they were trained in some contemporary commercial styles (pop and rock) but were not familiar with EVT.

The experimental group was composed of 15 female subjects and 5 male subjects, with a mean age of 33 ± 6.49 years and a mean singing experience of 10.55 ± 5.85 years; 9 singers were from northern Italy, 11 from southern Italy.

The control group was composed of 11 female subjects and 4 male subjects, with a mean age of 32 ± 9.26 years and a mean singing experience of 10.16 ± 6.04 years; 8 singers were from northern Italy, 7 from southern Italy.

Procedures
Spectral energy distribution control and simultaneous fundamental frequency and sound perturbation control were assessed through a specific vocal exercise. Each singer was asked to perform three sustained sung /a/ at a comfortable pitch and intensity level, with chest voice (M1 vibratory mechanism according to Roubeau et al. (28)). The accomplishment of M1 register by the singers was assessed through a perceptual evaluation by an expert vocologist during the execution of the vocal task. The same fundamental frequency and perturbation features were to be maintained during the three sung /a/ tones. Besides, each singer was asked to alter the spectral energy distribution so as the first /a/ was to be harmonically “neutral” (a speech-like sound); the second /a/ was to be a “ringing” sound (enhancement of the middle-high harmonic partials of the spectrum); the third /a/ was to be a “dark” sound (neutralization of the middle-high harmonic partials and enhancement of the low harmonic partials of the spectrum). The instructions were the same for each singer and no specific EVT terminology was used. Before the recording session, each singer was asked if she/he had clearly understood the task. The
described vocal exercise aimed at assessing both the ability to control pitch and sound perturbation (which consisted of maintaining a constant sound in terms of fundamental frequency and perturbation during the production of the three sung /a/) and, at the same time, the ability to control the spectral energy distribution (consisting of a conscious selective harmonic enhancement for each sung /a/). Each singer’s voice was recorded with a microphone Samson Meteor Mic (Samson Technologies, Hauppauge, NY) connected via USB to a MacBook Pro computer (Apple, Cupertino, CA) running the Apple Soundtrack Pro software version 3.0.1 (Apple, Cupertino, CA). The audio signals were digitized on 16 bit at a sampling frequency of 50 kHz. Voice recording was performed in standard conditions, with a mouth-to-microphone distance of 30 cm, quiet environment (<40 dB) and constant gain.

Acoustic analysis
Acoustic analysis was carried out with PRAAT software (version 5.3.57 for Mac, Boersma & Weenink, University of Amsterdam, Amsterdam, The Netherlands (29)). The acoustic parameters used for sound perturbation control analysis were: Jitter (Jitt%), Shimmer (Shimm%) and NHR; mean F0 (mF0) was calculated to assess fundamental frequency variations. Jitter and Shimmer are measures of cycle-to-cycle variations of fundamental frequency and amplitude, respectively, while NHR represents the average ratio of the disharmonic spectral energy components (noise) to the harmonic spectral energy components (30). The mean of each parameter was calculated for each of the three sustained sung /a/ and the maximum variation between the obtained means was calculated by subtracting the lowest mean value from the highest mean value for each parameter (ΔJitt%, ΔShimm%, ΔNHR, ΔmF0).

SPR was calculated to assess the ability to control the spectral energy distribution of the sound. SPR was introduced by Omori et al. (25) and is calculated by subtracting the amplitude of the strongest partial between 2 and 4 kHz from the level of the strongest partial between 0 and 2 kHz in the power spectrum. It is expressed in dB, it reflects the “ring” of the voice, and it relates to the resonant quality of the singing voice (31–33). In the present study, LTAS were computed for each of the three sung /a/ with a bandwidth of 100 Hz and a frequency range of 0–24.99 kHz. SPR was then calculated for each of the sung /a/; its variation (ΔSPR) has been considered as an indicator of the ability to modify the spectral energy distribution according to the vocal task. An example of SPR variation in three sung /a/ with same F0 but different spectral features (neutral, ringing and dark sound) is shown in Figure 1.

Figure 1. Example of SPR variation in the power spectrum of three sung /a/ with same F0 but different spectral energy distribution.

Perceptual analysis
A forced choice recognition task was performed by two trained blinded auditors, who independently listened to each vocal exercise. The three sung /a/ of each singer were presented in random order. The listeners had to define – according to their perceptual evaluation – what was the neutral, the dark and the ringing /a/. Once listening was accomplished, the ratings (60 for the experimental group and 45 for the control group) were compared. In case of disagreement between the auditors, they jointly reassessed the rating until a consensus was reached. The percentages of correct answers for the experimental and the control group were then compared.
Statistical analysis
Means and standard deviations (SDs) for all analyses were calculated. The Kolmogorov–Smirnov test was used to assess the normality of distributions. Fisher exact tests were used to analyse differences in gender and geographical distribution data between the experimental group and the control group, while unpaired t tests were used to analyse differences in age and singing experience data between the experimental and the control group. A repeated measures ANOVA with Tukey post-hoc test was performed to analyse intra-group SPR variations between the three sung /a/. Unpaired t tests and Mann–Whitney tests were used to detect significant differences in the acoustic measurements between the experimental group and the control group, as appropriate. A Fisher exact test was used to compare the amount of correct perceptual ratings between the experimental and the control group. An alpha of 0.05 was considered for the statistical procedures. Statistical analysis was carried out with GraphPad INSTAT software (Version 3.06 for Windows (San Diego, CA)).

Results
No statistically significant differences were found regarding age, gender, singing experience and geographical distribution between the two groups of singers.

Acoustic analysis
Maximum variations for the perturbation parameters and mF0 are shown in Table 2. Their values were found higher for the control group. Significant statistical differences between the experimental and the control group were found for ΔNHR.

In the experimental group SPR mean values were 16.81 ± 4.29 dB for the neutral /a/; 8.99 ± 5.99 dB for the ringing /a/ and 21.33 ± 4.32 dB for the dark /a/. Significant differences were found between the neutral /a/ and the ringing /a/ (q = 9.1; p < .001), between the neutral /a/ and the dark /a/ (q = 5.2; p < .01) and between the ringing /a/ and the dark /a/ (q = 14.36; p < .001). In the control group SPR mean values were 17.18 ± 5.68 dB; 13.09 ± 4.85 dB and 17.90 ± 4.60 dB respectively. Significant differences were found between the neutral /a/ and the ringing /a/ (q = 4.92; p < .01) and between the dark /a/ and the ringing /a/ (q = 5.8; p < .001); the neutral /a/ and the dark /a/ were not significantly different. Significant differences of SPR mean values between the experimental and the control group were found both for the dark and the ringing /a/ (Figure 2).

Figure 2. Mean SPR values of the three sung /a/ in the two groups of singers. Significance (unpaired t-test).

In the experimental group mean SPR variation between the neutral /a/ and the ringing /a/ was 7.82 ± 5.37 dB, mean SPR variation between the neutral /a/ and the dark /a/ was 4.52 ± 5.59 dB and mean SPR variation between the dark /a/ and the ringing /a/ was 12.33 ± 5.32 dB. The corresponding SPR variations for the control group were 4.09 ± 5.15 dB; 0.72 ± 4.08 dB and 4.82 ± 4.34 dB, respectively. As shown in Figure 3, SPR variations of ringing-to-neutral /a/ (t = 2.06; p = .047); neutral-to-dark /a/ (t = 2.21; p = .034) and ringing-to-dark /a/ (t = 4.46; p < .001) were significantly higher for the experimental group.

Figure 3. Mean SPR variations (ΔSPR) and standard deviations in the two groups of singers. Significance (unpaired t-test).
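The between-group comparisons of SPR variation can be approximately re-derived from the reported means, SDs and group sizes (20 and 15 singers). The Python sketch below assumes the equal-variance (pooled) form of the unpaired t test; the article does not state which variant was used, and the function name `pooled_t` is hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unpaired t statistic with pooled variance, computed from summary statistics.

    Assumption: equal-variance Student t test (not confirmed by the article).
    Returns the t statistic and its degrees of freedom.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error, df


N_EXP, N_CTRL = 20, 15  # group sizes reported under Participants

# (experimental mean, experimental SD, control mean, control SD), in dB,
# copied from the reported SPR variations.
comparisons = {
    "ringing-to-neutral": (7.82, 5.37, 4.09, 5.15),
    "neutral-to-dark": (4.52, 5.59, 0.72, 4.08),
    "ringing-to-dark": (12.33, 5.32, 4.82, 4.34),
}

for label, (m_exp, sd_exp, m_ctrl, sd_ctrl) in comparisons.items():
    t_value, df = pooled_t(m_exp, sd_exp, N_EXP, m_ctrl, sd_ctrl, N_CTRL)
    print(f"{label}: t = {t_value:.2f}, df = {df}")
```

Under this assumption the recomputed statistics fall within about 0.02 of the reported t values (2.06, 2.21 and 4.46); the small gaps are consistent with rounding of the published means and SDs.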
Perceptual analysis
In the experimental group a correspondence of 100% was found between the recognition task answers and the sound qualities that the singers tried to produce (neutral, ringing and dark). In the control group the amount of correspondence was 80%. The difference was found significant (p < .001).

Discussion
Voice quality control was studied through acoustic and perceptual analysis in 35 contemporary commercial singers. Twenty singers studied EVT and had a CFP, fifteen singers were not familiar with EVT. The two groups of singers had similar gender and geographical distributions, similar mean age and singing experience but had followed a different learning process. At present, no data about the efficacy of the EVT system as a programme aiming at improving vocal function control exist in literature. Several authors have prospectively studied and reported the effects of specific vocal training on the acoustic quality of the professional voice (15–20). A systematic review of Hazlett et al. (18) about the impact of voice training on the vocal quality of professional voice users found no conclusive evidence about the effectiveness of voice training on vocal quality, mainly because of the methodological limitations of the included studies. While some studies reported no or little effect, many others showed that voice training may improve the knowledge, awareness and quality of voice, resulting beneficial for professional voice users. It could be expected that voice training, aside from improving the quality of voice, may help students to acquire specific vocal abilities. Nevertheless, little research about the acquisition of vocal function control in the field of singing has been carried out so far. In the current study, a specific vocal function ability has been investigated; acoustic and perceptual analysis showed significant differences between the two groups of singers regarding both sound perturbation control and spectral energy distribution control ability. Sound perturbation and pitch control were studied with acoustic perturbation parameters (Jitt%, Shimm% and NHR) and the mean fundamental frequency (mF0), respectively; significant differences between the groups were found in ΔNHR during the execution of the vocal task, with higher variation for the control group. NHR includes contributions from both perturbations of amplitude and frequency and correlates with the overall perception of noisiness or roughness in the signal (30). The higher NHR variation in the control group reflects less sound noisiness control while performing the three sung /a/. It could be assumed that this is due to less source control ability by the singers of the control group, resulting in a higher sound noisiness variation while performing the three sung /a/. The other acoustic parameters (Jitt%, Shimm% and mF0) showed mean higher variations (albeit not significant) in the control group, suggesting a higher sound perturbation and pitch control for the experimental group. Regarding spectral energy distribution control, even if a correct task interpretation was found for the control group too, SPR variations resulted significantly higher for the singers who studied EVT, confirming a greater control ability for the experimental group. Spectral energy distribution (and thus SPR) can be influenced both by source and vocal tract activity (24,25,31,34,35). A recent study by Mainka et al. (36) analysed 3D vocal tract models of singers from magnetic resonance imaging and compared them with LTAS analysis of audio recordings. The authors concluded that lower vocal tract morphologic adjustments are relevant for voice timbre in singing. Sundberg et al. (37) had previously described clear differences in terms of subglottal pressure, source and vocal tract activity (in particular, regarding laryngeal height and pharyngeal width) in belting and opera styles, which are two of the six voice qualities described in the EVT system. Regarding the current study, since the vocal task presupposed a constant glottal activity (in terms of pitch, register and wave regularity), it can be hypothesized that the vocal tract had a preponderant role in determining the required selective harmonic enhancements. Nevertheless, since vocal loudness variation plays a role in affecting LTAS, it must be considered that source activity may have had a certain influence too (34). The perceptual recognition task confirmed the results of the SPR analysis, showing a significantly higher correspondence between the listeners’ ratings and the voice qualities that the singers of the experimental group tried to produce according to the task. It can be assumed that the higher control ability of the experimental group determined a larger amount of correct recognition ratings by the listeners. EVT system provides the “tools” (the figures for voice) to obtain different voice qualities. The results of the acoustic and the perceptual analysis could reflect the ability of the singers who attained a CFP to connect various figures in order to obtain the requested voice qualities (neutral, ringing and dark sounds). According to the EVT system – the source activity being equal – a narrowing of the aryepiglottic sphincter, a high laryngeal position and spread lips would determine a ringing voice quality, while a low larynx position, a tilted thyroid cartilage and lips protrusion would determine a dark sound quality (1–3,9,10). Anyhow, it must be remarked that source activity and vocal tract postures of the singers enrolled in the present study were not directly measured. The causal relation between the correct mixture of various figures for voice and the requested different voice qualities is thereby hypothetical.

The main limitations of the current study were the small number of recruited singers, the absence of direct measurements of vocal tract postures and source activity and the lack of a prospective approach. Future research should include prospective studies on larger number of singers with different levels of competence in the field of EVT (CFP, CMT and CCI). Multidimensional voice evaluations (e.g. through acoustic analysis, subjective assessment, endoscopy and imaging) at different times during an EVT learning programme could suggest when and how certain vocal abilities are acquired by the students.
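The perceptual comparison (100% of 60 ratings correct in the experimental group against 80% of 45 in the control group) can be checked with a small Fisher exact test written in plain Python. The counts 60/0 and 36/9 are derived here from the reported percentages and rating totals rather than quoted from the article, and the two-sided rule used (summing every table no more probable than the observed one) is an assumption about how the test was run.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    whose probability does not exceed that of the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denominator = comb(total, col1)

    def probability(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = probability(a)
    probabilities = [probability(x) for x in range(lowest, highest + 1)]
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in probabilities if p <= observed * (1 + 1e-9))


# Correct vs incorrect ratings: experimental 60/0, control 36/9
# (36 = 80% of 45, derived from the reported figures).
p_perceptual = fisher_exact_two_sided(60, 0, 36, 9)
print(f"perceptual ratings: p = {p_perceptual:.6f}")

# Gender distribution reported under Participants: 15 F / 5 M vs 11 F / 4 M.
p_gender = fisher_exact_two_sided(15, 5, 11, 4)
print(f"gender distribution: p = {p_gender:.3f}")
```

Under these assumptions the perceptual p-value falls below .001 and the gender comparison is far from significant, in line with what the article reports.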
untrained talented and nontalented singers. J Voice 2006;20:82–8.
32. Lundy D, Roy S, Casiano R, et al. Acoustic analysis of the singing and speaking voice in singing students. J Voice 2000;14:490–3.
33. Cesari U, Iengo M, Apisa P. Qualitative and quantitative measurement of the singing voice. Folia Phoniatr Logop 2012;64:304–9.
34. Nordenberg M, Sundberg J. Effect on LTAS of vocal loudness variation. Logoped Phoniatr Vocol 2004;29:183–91.
35. White P. The effect of vocal intensity variation on children’s voices using long-term average spectrum (LTAS) analysis. Logoped Phoniatr Vocol 1998;23:111–20.
36. Mainka A, Poznyakovskiy A, Platzek I, et al. Lower vocal tract morphologic adjustments are relevant for voice timbre in singing. PLoS ONE 2015;10:e0132241. doi: 10.1371/journal.pone.0132241.
37. Sundberg J, Gramming P, Lovetri J. Comparisons of pharynx, source, formant, and pressure characteristics in operatic and musical theatre singing. J Voice 1993;7:301–10.