User guide for VocalTractLab3D

Rémi Blandin
August 31, 2022

Contents
1 Introduction
1.1 What are 3D acoustic simulations?
1.2 What can VocalTractLab3D do?
1.3 What can VocalTractLab3D not do?
1.4 Download, installation and requirements
1.5 How to cite VocalTractLab3D?
1.6 Troubleshooting

2 Interface
2.1 Overview
2.2 Vocal tract geometry
2.2.1 Defining a vocal tract geometry
2.2.2 Visualizing the vocal tract geometry
2.3 Transverse modes
2.3.1 Computation
2.3.2 Visualization
2.4 Transfer functions
2.4.1 Computation
2.4.2 Visualization
2.5 Acoustic field
2.5.1 Computation
2.5.2 Visualization
2.6 Default parameters
2.7 Phoneme synthesis

3 Log file

4 Acknowledgements

1 Introduction
VocalTractLab3D is a special version of the articulatory synthesizer VocalTractLab 2.3 [1] (www.vocaltractlab.de)
which integrates a module that performs 3D acoustic simulations. The other modules are, apart from a few minor
differences, the same as in the original VocalTractLab 2.3, and the reader is referred to the manual of VocalTractLab 2.3
to learn how to use them. The 3D acoustic simulations are performed with a frequency domain multimodal method
which has been designed to be particularly fast and accurate. The details of this simulation method are provided
in Blandin et al. [4].

1.1 What are 3D acoustic simulations?


What are vocal tract acoustic simulations?
They describe how acoustic waves travel inside the vocal tract volume, are reflected at discontinuities
such as changes of cross-section or the mouth opening, and create resonances. This allows one to compute transfer
functions, for example between the acoustic volume flow created at the vocal folds and the acoustic pressure
radiated in front of the lips. They can also be used to compute the acoustic field, which describes the variations of the
acoustic pressure and the particle velocity over space.

Figure 1 (panels a, b and c): Transfer function and acoustic fields computed for the vowel /u/. The transfer function,
plotted as magnitude (dB) against frequency (0 to 20 kHz), has been computed with a 1D and a 3D simulation. The
acoustic fields have been computed at frequencies shown on the transfer function with arrows. They are shown in the
sagittal plane and some selected transverse planes indicated with dashed lines.

What is specific to vocal tract acoustics?
The vocal tract has an elongated shape in which the acoustic waves are guided to travel mainly along its length.
From the point of view of wave propagation, the vocal tract can be called a waveguide. This property of the vocal
tract makes it easy to approximate the propagation of acoustic waves using a single value of the acoustic pressure
varying along its length, thus neglecting transverse variations of the acoustic field. This has led to 1D simulation
methods and electrical analogies which are very widely used to simulate vocal tract acoustics [9].
What is 3D vocal tract acoustics?
Even though not very important below 4-5 kHz, the acoustic field has transverse variations, and thus, varies in all
the three dimensions of space. At low frequency these variations appear as a curvature of the acoustic field related
to variations of cross-sectional dimensions (see Fig. 1a). At higher frequency, the 3D nature of the acoustic field is
more obvious as transverse resonances can be observed (see Fig. 1b).
What does accounting for the 3D acoustics change in comparison to using a 1D simplifying
assumption?
The impact of accounting for the 3D nature of the acoustic field inside the vocal tract is rather limited up to about
3 kHz. It consists mainly of small changes in the resonance properties (frequency, amplitude and bandwidth). From
3 kHz on, the changes in the resonance properties can be more substantial. Above 4-5 kHz the transverse resonances
can induce zeros and additional peaks in the transfer function (see Fig. 1c).
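As a rough order-of-magnitude check of the 4-5 kHz figure, one can consider an idealized straight duct with rigid walls whose largest transverse dimension is $w$. This duct, the sound speed $c = 350$ m/s and the width $w = 4$ cm are illustrative assumptions, not values taken from VocalTractLab3D. In such a duct the first transverse mode can only propagate above its cutoff frequency:

```latex
f_c = \frac{c}{2w}
\qquad\Rightarrow\qquad
f_c = \frac{350~\text{m/s}}{2 \times 0.04~\text{m}} \approx 4.4~\text{kHz}.
```

Below this frequency only the plane mode propagates, which is consistent with the 1D assumption being adequate at low frequency.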
What is the multimodal method?
It is a simulation method which relies on the projection of the acoustic field onto a basis of eigenfunctions. Such an
approach is very efficient at reducing computation times and memory requirements. In the case of the vocal tract, the
3D geometry is cut into multiple segments in which the local transverse eigenmodes are computed and on which the
acoustic field is decomposed. The method implemented in VocalTractLab3D is described in detail in Blandin et al. [4].

1.2 What can VocalTractLab3D do?


• Compute transfer functions between the volume velocity at the location of the vocal folds and one or several
points inside or outside the vocal tract.
• Compute transfer functions between the acoustic pressure on a transverse plane anywhere inside the vocal
tract and one or several points inside or outside the vocal tract. This is useful to emulate noise generation by
aeroacoustic sound sources.
• Compute the input impedance of the plane mode at the location of the vocal folds.
• Compute and visualize the transverse modes.
• Compute the acoustic field at a specific frequency in the sagittal plane and in transverse planes anywhere
inside the vocal tract.
• Compute the quantities listed above for vocal tract geometries generated with the articulatory model
implemented in VocalTractLab3D or vocal tract geometries imported from an external .csv file encoded in a specific
format. Note that in this last case any waveguide geometry can be imported, not only vocal tract geometries
(e.g. airways of other animals, wind instruments ...). However, the parameters of VocalTractLab3D are
optimized for human vocal tracts and may not be optimal for other applications.
• Synthesize vowel and fricative sounds.

1.3 What can VocalTractLab3D not do?


The simulation method implemented in VocalTractLab3D has been designed primarily to be efficient, and this
comes with some limitations regarding the geometries which can be simulated. However, it is to be noted that some
of the geometrical simplifications are also used with other simulation methods, such as the finite element method
(FEM), in order to reduce the computation time, or simplify the task of describing the geometry.
VocalTractLab3D cannot:
• Simulate continuous cross-sectional shape variations within the segments of the vocal tract geometry. The
cross-sectional shape can be scaled to account for area variation, but the shape remains constant.
• Simulate lip shapes. The mouth opening is contained in a plane, and thus, the 3D lip shape cannot be
simulated.
• Simulate branches such as piriform sinus or the nasal cavity.
• Simulate the diffraction by the head and the torso outside of the vocal tract.
• Perform time domain simulation.
Some of the limitations listed above may be overcome with future developments of the simulation method used
(3D lip shape, branches and diffraction by the head and torso), and some are inherent to the simulation method
used (constant cross-sectional shape in the segments, and frequency domain simulations).

1.4 Download, installation and requirements


VocalTractLab3D is distributed as free and open source software under the GNU General Public License (GPL).
It is written in C++ and developed for both Windows and Linux platforms. Adaptation to macOS should be
easy as cross-platform libraries are used. The software is free of charge and available for download as a ZIP
file from www.vocaltractlab.de. It needs no special installation: just unzip the archive and run the executable
contained in the folder ("VocalTractLab.exe" for Windows, "VocalTractLab" for Linux). VocalTractLab3D
was tested on Windows 10 and Ubuntu 20.04.4 LTS, but can probably run on other versions of Windows and
Linux. A fast computer and a high screen resolution are strongly recommended. Tablet computers and netbooks
are generally not suited to work with VTL. On some Windows systems it may be necessary to install
OpenGL explicitly. Without OpenGL, the 3D model of the vocal tract will not be displayed properly.
Since the simulation method requires solving complex problems, it was necessary to rely on external libraries
for specific tasks:
• wxWidgets 3.1.5 (https://www.wxwidgets.org) is used for the graphical interface.
• boost 1.71.0 (https://www.boost.org/) is used for Bessel functions and Gauss integration.
• The Computational Geometry Algorithms Library (CGAL 5.0) (https://www.cgal.org) is used for the
generation of meshes and other geometry problems.
• Eigen 3.3.9 (http://eigen.tuxfamily.org) is used for solving linear algebra problems, in particular eigenvalue
decomposition for the computation of the transverse modes.
Note that VocalTractLab3D comes with no warranty of any kind.

1.5 How to cite VocalTractLab3D?


R. Blandin et al. “Efficient 3D acoustic simulation of the vocal tract by combining the multimodal method and finite
elements”. In: IEEE Access (2022), pp. 69922–69938. doi: 10.1109/ACCESS.2022.3187424

1.6 Troubleshooting
For any issue, bug report or question related to VocalTractLab3D, please write to [email protected] or
[email protected].

2 Interface
2.1 Overview

Figure 2: The different panels of the "3D acoustic simulation" page.

When VocalTractLab3D is started, it shows by default two windows:


• The main window showing the "3d acoustic simulation" page.
• A small window showing the 3D geometry corresponding to the articulatory model (called the vocal tract
dialog). The user can interact with the articulatory model using the control points to move the articulators.
When closed, this window can be shown again by clicking on the button "Show vocal tract".

The "3d acoustic simulation" page is divided into 4 panels (see Fig. 2):
1. The left panel contains buttons to manage the geometries, launch the simulations and synthesize phonemes.
2. The middle panel shows a sagittal cut of the geometry simulated.
3. The right panel shows a transverse cut of the geometry corresponding to a specific segment.
4. The bottom panel shows the transfer functions and input impedance computed.

2.2 Vocal tract geometry


2.2.1 Defining a vocal tract geometry
When VocalTractLab3D is started, the default geometry of the articulatory model is already loaded and ready to
use for simulations. However, the geometry can be defined or modified in several ways:
• It can be simply defined by moving the control points in the vocal tract dialog (shown by clicking on "Show
vocal tract" in the left panel). When doing so, one can see the updates of the segmented geometry in the
central panel.
• If a speaker file containing predefined geometries is loaded, a geometry can be selected using the dialog shown
by clicking on the button "Vocal tract shapes" of the left panel.
• An external geometry can be loaded as a .csv file formatted in a specific way detailed hereafter.
Csv format for externally defined geometries:
The file describes a list of segments by specifying
• a centerline point,
• a normal,
• input and output scaling factors,
• and a contour.

Centerline x   Normal x   Input scaling    Contour point 1 y   Contour point 2 y   ...   Contour point N y
Centerline y   Normal y   Output scaling   Contour point 1 z   Contour point 2 z   ...   Contour point N z

Table 1: Csv file format to encode segmented vocal tract geometries.

One segment is defined on two lines: the first one gives the first coordinates (x or y) and the input scaling
factor, the second one the second coordinates (y or z) and the output scaling factor. This is summarized in
Tab. 1. The columns must be separated by semicolons ";". An example of such a .csv file encoding a simple
waveguide geometry is provided in Tab. 2. Note that in this example the normal of the second segment is not
normalized. In this case the normalization is done when the file is imported; otherwise this would be equivalent to
applying a scaling factor. The length and curvature of a segment are defined by its centerline point and normal
and the centerline point and normal of the following segment. Thus, a minimum of two segments must
be provided. The second-to-last and last segments are defined by computing an intermediate centerline point and
normal between the last and second-to-last centerline points and normals provided.

-2.; -1.; 0.5; -1.; -1.; 1.; 1.;
0.; 0.; 1.; -1.; 1.; 1.; -1.;
-1.4142; -0.5; 1.; -1.; -1.; 1.; 1.;
1.4142; 0.5; 1.5; -1.; 1.; 1.; -1.;

Table 2: Example of .csv file which can be imported to generate a waveguide geometry.
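
To make the layout of Tab. 1 concrete, here is a minimal Python sketch that reads the example of Tab. 2. It is not the importer used by VocalTractLab3D: the function name, the dictionary keys and the handling of the trailing ";" are assumptions made for illustration. It only applies the rules stated above (two lines per segment, semicolon-separated columns, normals normalized on import, at least two segments).

```python
import math

def parse_geometry_csv(text):
    """Parse the two-lines-per-segment format of Tab. 1 (illustrative sketch)."""
    rows = []
    for line in text.splitlines():
        # Split on ";" and drop empty fields, e.g. the one after a trailing ";".
        fields = [f.strip() for f in line.split(";") if f.strip()]
        if fields:
            rows.append([float(f) for f in fields])
    if len(rows) % 2 != 0:
        raise ValueError("each segment must be described by exactly two lines")

    segments = []
    for first, second in zip(rows[0::2], rows[1::2]):
        if len(first) != len(second) or len(first) < 4:
            raise ValueError("both lines of a segment need the same columns")
        nx, ny = first[1], second[1]
        norm = math.hypot(nx, ny)
        if norm == 0.0:
            raise ValueError("the normal must not be the null vector")
        segments.append({
            "centerline": (first[0], second[0]),          # (x, y)
            "normal": (nx / norm, ny / norm),             # normalized on import
            "scaling_in": first[2],
            "scaling_out": second[2],
            "contour": list(zip(first[3:], second[3:])),  # (y, z) points
        })
    if len(segments) < 2:
        raise ValueError("at least two segments are required")
    return segments

# The example of Tab. 2.
EXAMPLE = """-2.; -1.; 0.5; -1.; -1.; 1.; 1.;
0.; 0.; 1.; -1.; 1.; 1.; -1.;
-1.4142; -0.5; 1.; -1.; -1.; 1.; 1.;
1.4142; 0.5; 1.5; -1.; 1.; 1.; -1.;"""

segments = parse_geometry_csv(EXAMPLE)
```

With this example, both segments share the same square contour, and the normal of the second segment, given as (-0.5, 0.5), is rescaled to unit length.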

Examples of such files generated for vowel geometries measured on magnetic resonance images (MRI) are provided
in the archive in the folder "geometries_from_MRI". They have been generated with VocalTractTransferFunction
from MRI provided in the Dresden Vocal Tract Dataset [2]. The software VocalTractTransferFunction can be
downloaded at www.vocaltractlab.de and used to load a surface mesh and save it in the .csv format specified
above (not yet available, though).
The geometry can also be exported in the same format through the context menu which appears with a right
click on the central panel, by selecting "Export geometry in a csv file".
Geometry options:
When a geometry is loaded, one can choose whether or not to take into account the curvature and the area variations in
the segments. This is done using the simulation parameters dialog displayed by clicking on the button "Simulation
parameters" in the left panel. These options are found in the "Geometry options" section. When "Varying area" is
checked, the variation of area is taken into account through the scaling factor, which is set to vary linearly from the
entrance to the exit of the segments. The entrance and exit scaling factors are displayed in the information text of
the right panel. The scaling factors can be computed in two different ways, or directly provided by
the user when the geometry is loaded as a .csv file. These options can also be selected in the "Geometry options"
section. One scaling factor computation method, "Area", simply interpolates the cross-sectional
area linearly. The other one, "Bounding box", interpolates the largest dimension of the bounding box of the contour of the
segments, provided that the resulting scaled contour does not exceed the area of the following contour; otherwise
it falls back to interpolating the area. Finally, one can specify that the scaling factors provided in the input .csv file
must be used by selecting "From file".
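
The effect of a linearly varying scaling factor can be sketched as follows. This is only an illustration under stated assumptions: the function names and numerical values are hypothetical, and the way the entrance and exit factors are derived by the "Area" and "Bounding box" methods is not reproduced. The only facts used are that the scaling factor varies linearly along the segment and that scaling a contour by a factor s multiplies its area by s squared.

```python
def scaling_along_segment(s_in, s_out, t):
    """Scaling factor at relative position t (0 = entrance, 1 = exit)."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie between 0 and 1")
    return s_in + (s_out - s_in) * t

def scaled_area(reference_area, s_in, s_out, t):
    """Area of the reference contour once scaled at position t."""
    s = scaling_along_segment(s_in, s_out, t)
    # Scaling lengths by s scales areas by s squared.
    return reference_area * s * s

# Hypothetical segment: reference contour of area 4.0, scaled from 0.5 to 1.5.
areas = [scaled_area(4.0, 0.5, 1.5, t) for t in (0.0, 0.5, 1.0)]
```

For these values the area goes from 1.0 at the entrance to 4.0 in the middle and 9.0 at the exit, which shows that a linear variation of the scaling factor gives a quadratic variation of the area.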

2.2.2 Visualizing the vocal tract geometry


The geometry is visualized both in the central and the right panels. The central panel shows a sagittal cut of the
bounding box of the segments. This bounding box is a rectangle whose dimensions are the maximal and minimal
dimensions of the cross-sectional contour. The center of the segment is shown as a red line.
This sagittal cut of the segments can be exported as a list of coordinates in a text file using the entry "Export
segment picture" of the context menu of the central panel. This can be used to plot the same picture with other
software such as Matlab, for instance with this kind of Matlab code:

    % Load the exported coordinates; this creates the variable segment_picture
    load("segment_picture.txt");
    % Plot the second column against the first one
    plot(segment_picture(:, 1), segment_picture(:, 2));

When clicking on a segment, its outline becomes red to confirm that it has been selected, and the contour and
some information regarding the segment are displayed in the right panel. It is also possible to move to the previous
and next segments using the buttons "<" and ">" at the bottom of the central panel. Note that this is the only
way to visualize the junction segments which have a zero length. When the geometry is provided by the articulatory
model, it is possible to identify the types of surface which constitute the edges of the contour (tongue, teeth, lips,
openings on the sides of the lips, uvula, epiglottis and other walls). The color code for these different surfaces is
given in Fig. 3. By clicking on the arrows "<" and ">" at the bottom of the right panel, one can display the
contour with its original dimensions ("Mode computation size") and scaled at the entrance and exit of the segment.

Figure 3: Correspondence between the colors of the contour and the anatomical parts to which the walls correspond.
Legend entries: epiglottis, non-specified walls, openings on the side of the lips, teeth, uvula, tongue, lips.

The coordinates of the points of the contour can also be exported into a text file through the context menu of
the right panel, obtained with a right click, by selecting "Export contour in text file". Note that the exported contour
has the scaling with which it is displayed: if the scaling at the entrance is 0.5 and the entrance contour is displayed, the
coordinates of the exported contour will be those of the original contour multiplied by 0.5. The exported contour
can easily be plotted using other software in the same way as the segment picture.

2.3 Transverse modes
2.3.1 Computation
The transverse modes can be computed by clicking the button "Compute modes" of the left panel. This can be
useful if one is interested in analyzing the transverse modes without computing the acoustic field or the transfer
functions.
The computation of the transverse modes is parametrized by:
• the density of the mesh which is used to solve with 2D FEM the eigenvalue problem giving the transverse
modes and their associated cutoff frequencies. This is related to the average side length of the elements
through the relationship $\text{average side length} = \sqrt{\text{cross-sectional area}} / \text{mesh density}$.
Thus, the mesh density is an estimate of the number of elements per characteristic length.
• The maximal cutoff frequency. It is an upper limit to the cutoff frequency of the transverse modes included
in the simulations: for a given segment, only the modes having a cutoff frequency lower than this value are
kept. Thus, segments having a small cross-section have fewer transverse modes than the ones having a bigger
one. This is done to increase the efficiency of the simulations.
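
Both parameters can be illustrated with a short Python sketch. It does not reproduce the 2D FEM computation of VocalTractLab3D: the cutoff frequencies are those of an idealized rectangular cross-section with rigid walls, for which a closed-form expression exists, and all function names and numerical values (dimensions, sound speed of 350 m/s, maximal cutoff frequency of 20 kHz) are assumptions chosen for illustration.

```python
import math

def average_side_length(area, mesh_density):
    """Average element side length from the relationship given above."""
    return math.sqrt(area) / mesh_density

def rectangular_cutoff_frequencies(width, height, sound_speed, max_cutoff):
    """Cutoff frequencies (Hz) below max_cutoff for a rigid-walled rectangle.

    Mode (m, n) has the cutoff frequency (c / 2) * sqrt((m / a)^2 + (n / b)^2).
    """
    cutoffs = []
    m = 0
    while 0.5 * sound_speed * m / width < max_cutoff:
        n = 0
        while True:
            f = 0.5 * sound_speed * math.hypot(m / width, n / height)
            if f >= max_cutoff:
                break
            cutoffs.append(f)
            n += 1
        m += 1
    return sorted(cutoffs)

# Hypothetical cross-sections (in metres): a wide one and a narrow one.
large = rectangular_cutoff_frequencies(0.04, 0.02, 350.0, 20000.0)
small = rectangular_cutoff_frequencies(0.01, 0.005, 350.0, 20000.0)
```

With these values the wide cross-section keeps 13 modes and the narrow one only 2 (the plane mode, whose cutoff frequency is 0 Hz, is always kept), which illustrates why narrow segments are cheaper to simulate.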
The cutoff frequency is related to the sound speed, which itself is related to the temperature. Both the sound
speed and the temperature can be set in the "Physical constants" section of the "Simulation parameters" dialog.
Since both quantities are related, they cannot be modified independently: changing the temperature will change
the sound speed and conversely.
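
The coupling between the two quantities can be sketched with the usual ideal-gas approximation for dry air. The formula actually used by VocalTractLab3D is not stated in this guide, so the constants and function names below are assumptions for illustration only.

```python
import math

# Assumed reference values for dry air (not taken from VocalTractLab3D).
SOUND_SPEED_AT_0_CELSIUS = 331.3  # m/s
ZERO_CELSIUS_IN_KELVIN = 273.15

def sound_speed_from_temperature(temp_celsius):
    """Sound speed (m/s) of an ideal gas, proportional to sqrt(absolute temperature)."""
    return SOUND_SPEED_AT_0_CELSIUS * math.sqrt(1.0 + temp_celsius / ZERO_CELSIUS_IN_KELVIN)

def temperature_from_sound_speed(speed):
    """Inverse relation: temperature (degrees Celsius) giving the requested sound speed."""
    return ZERO_CELSIUS_IN_KELVIN * ((speed / SOUND_SPEED_AT_0_CELSIUS) ** 2 - 1.0)

speed_at_body_temperature = sound_speed_from_temperature(37.0)
```

Since each quantity determines the other, a dialog exposing both fields has to update one whenever the other is edited. The cutoff frequencies scale in proportion to the sound speed.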

2.3.2 Visualization
The mesh used to compute the transverse modes can be visualized in the right panel by selecting "Mesh" in the
bottom. The transverse modes can be visualized by selecting "Modes" at the bottom. One can browse the different
modes using the arrows "<" and ">". The amplitude variation of the modes is displayed as a color scale, and their
cutoff frequency is given in the text information.

2.4 Transfer functions


2.4.1 Computation
The transfer functions can be computed by clicking on the button "Compute transfer functions" in the left panel.
The computation of the transfer function requires solving the wave problem. For this purpose, the simulation
of wave propagation can be achieved either with an analytical solution [3] or a numerical Magnus-Möbius scheme
[8]. This can be selected in the section "Numerical scheme options" of the "Simulation parameters" dialog. The
analytical solution is enabled by selecting "Straight". However, this solution has strong limitations: it cannot take
into account the curvature, cross-sectional area variations and wall losses. The numerical scheme is enabled by
selecting "Magnus". In this case a number of integration steps is given and can be modified if necessary. This
number is the same for each segment. Note that since the segmentation generally used for vocal tract geometries
gives segments having approximately the same length, this parameter is expected to affect the accuracy in a similar
way for each segment. However, if the segment length is made inhomogeneous (e.g. in an imported geometry), this
parameter needs to be considered more carefully.
The boundary conditions can be set in the "Boundary conditions options" section of the "Simulation parameters"
dialog. They include the mouth boundary condition, which can be either a radiation boundary condition
computed following Blandin et al. [5] or a zero pressure condition. Several types of wall losses can be selected:
• "Visco-thermal losses" includes frequency dependent visco-thermal losses implemented according to Bruneau
et al. [6].
• "Soft Walls" includes frequency dependent losses corresponding to soft walls using the same model as the one
implemented in VocalTractLab 2.3.

• "Constant wall admittance" includes a frequency independent wall admittance whose real and imaginary parts
can be set by the user.

The index of the segment in which the noise source is integrated can be set either in the "Transfer functions
options" section of the "Simulation parameters" dialog, or through the context menu of the central panel by selecting
"Define current segment as noise source location". The noise source segment is highlighted in blue (or green when
selected) in the central panel. When the segment has a non-zero length, the noise source is implemented at the end
of the segment which is closer to the mouth exit. The noise source implemented is uniform over the cross-sectional
surface, which is equivalent to exciting the vocal tract with a plane wave at the specified location. The transfer
function computed for the noise source is a pressure-pressure transfer function, unlike the glottal transfer
function, which is a velocity-pressure transfer function.
The upper frequency limit for the transfer function computation can be set in the "Transfer functions options"
section of the "Simulation parameters" dialog. The frequency step size can be selected in a list. The proposed
values correspond to divisions of the sampling frequency by powers of 2, which makes the subsequent synthesis
faster.
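
The kind of values offered in this list can be sketched as below. The sampling frequency of 44100 Hz and the range of exponents are assumptions made for illustration; neither is stated in this guide. A likely reason for the speed-up is that a step of fs / 2^k corresponds to a spectrum of 2^k points, a length well suited to fast Fourier transform algorithms.

```python
def frequency_steps(sampling_frequency, min_exponent, max_exponent):
    """Frequency steps (Hz) obtained by dividing the sampling frequency by 2**k."""
    return [sampling_frequency / 2 ** k
            for k in range(min_exponent, max_exponent + 1)]

def spectrum_length(sampling_frequency, step):
    """Number of spectral points spanning the full sampling frequency."""
    return round(sampling_frequency / step)

# Assumed sampling frequency and exponents, for illustration only.
steps = frequency_steps(44100.0, 9, 12)
lengths = [spectrum_length(44100.0, step) for step in steps]
```

For these assumed values the steps are roughly 86.1, 43.1, 21.5 and 10.8 Hz, and the corresponding spectrum lengths are 512, 1024, 2048 and 4096 points.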
The coordinates of the reception point of the transfer functions can also be set in the "Transfer functions options"
section of the "Simulation parameters" dialog. One can either use a single point whose coordinates are
set directly, or use several points whose coordinates are loaded from a .csv file. The origin of the coordinate system
in which the coordinates of the reception points are expressed is the center of the mouth exit. The ny unit vector
of this coordinate system is the normal to the centerline at the mouth exit. The reception points can be placed anywhere.
However, if a point is located in the half-space behind the mouth exit and not inside the vocal tract, the returned value
will be "nan".
When point coordinates are loaded from a .csv file, they must be given in 3 columns corresponding to the x,
y and z coordinates. This functionality can be useful to compute the directivity patterns of the radiated sound, or
the acoustic field at multiple frequencies.
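
As an example of such a file, the following Python sketch generates reception points on an arc in front of the mouth exit, as one might do for a directivity pattern. Several points are assumptions: the semicolon separator (chosen by analogy with the geometry files), the x axis pointing out of the mouth, the unit of the radius, and the function names. None of them is confirmed by this guide, so the output should be checked against an actual import.

```python
import math

def arc_points(radius, n_points, max_angle_deg=80.0):
    """Points on an arc centred on the mouth exit, in the z = 0 plane.

    The angle is measured from the x axis, assumed to point out of the mouth,
    so every point has x > 0 and lies in front of the mouth exit.
    """
    if n_points < 2:
        raise ValueError("at least two points are needed")
    max_angle = math.radians(max_angle_deg)
    points = []
    for i in range(n_points):
        angle = -max_angle + 2.0 * max_angle * i / (n_points - 1)
        points.append((radius * math.cos(angle), radius * math.sin(angle), 0.0))
    return points

def points_to_csv(points, separator=";"):
    """One point per line, with the x, y and z coordinates in three columns."""
    return "\n".join(separator.join(f"{v:.6f}" for v in p) for p in points)

# Hypothetical arc of 9 points at a distance of 10 length units.
points = arc_points(10.0, 9)
csv_text = points_to_csv(points)
```

Keeping the arc within plus or minus 80 degrees avoids placing points in the half-space behind the mouth exit, where the returned value would be "nan".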

2.4.2 Visualization
The transfer function points are visualized as "+" on the middle panel. It is possible to hide them by unchecking
"Show TF points" on the bottom of the panel. This can be useful if many points are used and their visualization
disturbs the visualization of the other elements. Note that if the point is located outside of the area of the sagittal
cut displayed, it will not be visible.
The transfer function and the input impedance computed are displayed in the bottom panel. The glottal transfer
function, the noise source transfer function and the input impedance are plotted in black, blue and green respectively.
It is possible to show or hide each of them by checking or unchecking "Glottal transfer function", "Noise transfer
function" or "Input impedance" on the right of the bottom panel.
In case several points are used, the transfer functions corresponding to the different reception points can be
visualized by clicking on the "<" and ">" buttons on the right of the bottom panel. The coordinates of the point
corresponding to the transfer function plotted appear above these buttons, and the corresponding point is displayed
as a red "+" in the middle panel. Note that the input impedance does not depend on a reception point location,
and hence it will be the same for each point.
The transfer functions and the input impedance can be exported using the context menu which is displayed by
a right click on the bottom panel. They are saved in a text file in which the first column gives the frequency, the
second and third the magnitude and phase of the first point, and the following columns the magnitude and phase
of the other points, if other points have been included. Such text files can easily be loaded in other software such
as Matlab to plot and analyse the data, for instance with the following Matlab code:

    % Load the exported file; this creates the variable transfer_function
    load transfer_function.txt
    figure
    % Magnitude of the first point (column 2), in dB
    subplot 211
    plot(transfer_function(:, 1), 20*log10(transfer_function(:, 2)))
    xlabel("f (Hz)")
    ylabel("Magnitude (dB)")
    % Phase of the first point (column 3)
    subplot 212
    plot(transfer_function(:, 1), transfer_function(:, 3))
    xlabel("f (Hz)")
    ylabel("Phase (rad)")


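When several reception points have been exported, the column layout described above (frequency, then one magnitude and phase pair per point) can be unpacked as in the following Python sketch. It assumes whitespace-separated columns, a linear magnitude and a phase in radians, as suggested by the Matlab code above; the function name is hypothetical and the sample values are made up for illustration.

```python
import cmath

def parse_transfer_functions(text):
    """Return (frequencies, tfs), where tfs[k] is the complex response at point k."""
    frequencies, tfs = [], []
    for line in text.splitlines():
        values = [float(v) for v in line.split()]
        if not values:
            continue  # skip empty lines
        if len(values) < 3 or len(values) % 2 == 0:
            raise ValueError("expected a frequency followed by (magnitude, phase) pairs")
        n_points = (len(values) - 1) // 2
        if not tfs:
            tfs = [[] for _ in range(n_points)]
        elif n_points != len(tfs):
            raise ValueError("all lines must describe the same number of points")
        frequencies.append(values[0])
        for k in range(n_points):
            magnitude, phase = values[1 + 2 * k], values[2 + 2 * k]
            # Convert polar form (magnitude, phase) to a complex number.
            tfs[k].append(cmath.rect(magnitude, phase))
    return frequencies, tfs

# Made-up sample with two frequencies and two reception points.
SAMPLE = """100 2.0 0.0 1.0 3.141592653589793
200 0.5 1.5707963267948966 4.0 0.0"""

frequencies, tfs = parse_transfer_functions(SAMPLE)
```

Working with complex values makes it straightforward to compare points, for example by taking the ratio of two transfer functions at each frequency.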
2.5 Acoustic field
2.5.1 Computation
The acoustic pressure field can be computed in the sagittal plane and the transverse planes by clicking the button
"Compute acoustic field". It corresponds to a sound source located at the glottis. The frequency at which it is
computed can be set by moving the vertical dashed line in the transfer function plot in the bottom panel. A precise
frequency can also be set in the section "Acoustic field options" of the "Simulation parameters" dialog.
In the sagittal plane the acoustic field is computed in a rectangular area displayed as a gray rectangle in the
middle panel. By default, this rectangle is the bounding box of the geometry outline. The dimensions of this
rectangle can be modified by clicking "Define bounding box lower corner" or "Define bounding box upper corner"
in the context menu which is displayed by right clicking on the middle panel. In this case, the location of the
right click is attributed to the lower left corner or the upper right corner of the rectangular area respectively. This
functionality is useful for looking in more detail at a specific area. Alternatively, the dimensions of this rectangular
area can be manually set in the "Acoustic field options" of the "Simulation parameters" dialog. This can be useful
if one wants to visualize the radiated field as well. In this case, the maximal value of x can be increased to extend
the area to the radiated field. The original dimension of the acoustic field area can be restored by double clicking
on the middle panel, or clicking "Reset bounding box" in the context menu of the middle panel.
The resolution of the grid of points used to compute the sagittal plane acoustic field can be set in the "Acoustic
field options". In the transverse plane the resolution of the field corresponds to the resolution of the image displayed
on the screen.
The computation of the radiated field takes a bit more time than the internal field, so it is possible to avoid
computing the radiated field by unchecking the option "Compute radiated field" in the "Acoustic field options".

2.5.2 Visualization
Once computed, the acoustic field is displayed as a logarithmic color scale in the middle and right panels. For
a better visualization, the segments and/or the transfer function points can be hidden by unchecking the options
"Show segments" and "Show TF points" in the middle panel. Alternatively, the acoustic field can also be hidden
by unchecking the option "Show field". This can be useful if the acoustic field disturbs the visualization of the
segments and/or transfer function points. In the transverse plane the acoustic field corresponds to the exit plane
of the segments.
The acoustic field can be exported in text files and easily loaded in other software such as Matlab for further
analysis or different visualization. This can be done by clicking "Export acoustic field as text file" in the context
menu shown by right clicking on the middle and right panels. The acoustic field can be easily loaded and plotted
with Matlab with the following code:

    % Load the exported field; this creates the variable acoustic_field
    load("acoustic_field.txt")
    figure
    % Display the field amplitude in dB
    imagesc(20*log10(acoustic_field));
    % Put the origin at the bottom left and use the same scale on both axes
    axis xy
    axis equal


2.6 Default parameters


Default simulation parameters can be set by clicking the buttons "Default (fast)" and "Default (accurate)" at the
bottom of the "Simulation parameters" dialog. The fast default parameters are set when VocalTractLab3D is
started. They ensure fast simulations, with computation times of the order of a few minutes only, but
give inaccurate results that should not be considered reliable. They are useful to test the software
or to quickly try a simulation before running a more accurate one. The accurate default parameters have been
optimized to offer a good compromise between accuracy and computation time (see Blandin et al. [4]). With these
parameters, the computation time for a transfer function with 1000 frequencies is of the order of one hour. However,
these computation times depend on the geometry simulated and on the computer used.
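As a rough planning aid, the order of magnitude quoted above can be scaled to another number of frequencies. The sketch below assumes that the computation time grows linearly with the number of frequencies, which the guide does not state and which ignores any fixed cost. The function name estimate_minutes is hypothetical.

```python
def estimate_minutes(n_frequencies, minutes_per_1000=60.0):
    """Linearly scale the 'about one hour per 1000 frequencies' figure.

    The linear scaling is an assumption made for illustration only.
    """
    return n_frequencies * minutes_per_1000 / 1000.0


# 233 is the number of simulated frequencies in the example log of Section 3.
print(estimate_minutes(233))
```

With the accurate parameters, this gives about 14 minutes for 233 frequencies. It is only an order of magnitude, since the actual time depends on the geometry and on the computer.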

2.7 Phoneme synthesis


Once the transfer functions have been computed, it is possible to synthesize static phonemes with the buttons "Play
vowel" and "Play noise source" on the left panel. The button "Play vowel" synthesizes a vowel sound by convolving
a synthetic glottal pulse signal with the impulse response computed from the glottal transfer function displayed in
the bottom panel. The glottal pulses are generated with a Liljencrants-Fant model [7] whose parameters can be
defined using the dialog shown by clicking on the button "LF glottal flow pulse" in the left panel.
The noise synthesis, started with the button "Play noise source", can be useful to synthesize fricative consonants.
A synthetic noise signal, consisting of white noise filtered with a first-order low-pass filter having a cutoff
frequency of 5 kHz, is convolved with the impulse response of the noise source transfer function displayed in the
bottom panel.
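The noise synthesis described above can be illustrated with a short Python sketch. It is not the VocalTractLab3D implementation. The sampling rate of 44100 Hz, the one-pole form of the first-order low-pass filter (whose cutoff is only approximately 5 kHz at this sampling rate), the placeholder impulse response and the function names are all assumptions made for illustration.

```python
import math
import random


def lowpass_noise(n_samples, fs=44100.0, cutoff=5000.0, seed=0):
    """White Gaussian noise passed through a one-pole (first-order) low-pass filter."""
    rng = random.Random(seed)
    a = math.exp(-2.0 * math.pi * cutoff / fs)  # pole of the one-pole filter
    out, y = [], 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, 1.0)
        y = (1.0 - a) * x + a * y  # y[n] = (1 - a) x[n] + a y[n-1]
        out.append(y)
    return out


def convolve(signal, impulse_response):
    """Direct linear convolution; the output has len(signal) + len(h) - 1 samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


# Placeholder impulse response (a decaying 1 kHz resonance), NOT a simulation result.
fs = 44100.0
h = [math.exp(-n / 200.0) * math.sin(2.0 * math.pi * 1000.0 * n / fs) for n in range(256)]
fricative = convolve(lowpass_noise(2048, fs), h)
```

In practice the impulse response would come from the computed noise source transfer function. The direct convolution is kept for readability. An FFT-based convolution would be faster for long signals.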
The synthetic sound generated corresponds to the point at which the displayed transfer functions have been
computed. Thus, it is possible to listen to the synthetic vowel generated at various locations, which can be useful
to study directivity effects. Note, however, that phenomena important for directivity, such as head and torso
diffraction, are not simulated. The directivity effects which can be studied are therefore only due to the dimensions
of the mouth opening and to the influence of the vocal tract on the acoustic field at the mouth exit.
The synthesized sounds can be visualized and analyzed in the "Signal" page. It is also possible to export them
as .wav files by clicking "Save WAV" or "Save WAV as TXT" in the "File" menu of the main window.

3 Log file
The parameters used for each simulation and information regarding the progress of the simulation process are
given in a log file. An example of such a file is given below:
 
Wed Jul 20 15:06:09 2022

Geometry is from VocalTractLab

PHYSICAL PARAMETERS:
Temperature 31.4266 C
Volumic mass: 0.00115771 g/cm^3
Sound speed: 35000 cm/s

BOUNDARY CONDITIONS:
Percentage losses 100 %
Visco-thermal losses included
viscous boundary specific admittance (2.02984e-05,2.02984e-05) g.cm^-2.s^-1
thermal boundary specific admittance (4.84832e-05,4.84832e-05) g.cm^-2.s^-1
Wall losse included
glottis boundary condition: IFINITE_WAVGUIDE
mouth boundary condition: ZERO_PRESSURE

MODE COMPUTATION PARAMETERS:
Mesh density: 5
Max cut-on frequency: 20000 Hz
Compute modes and junction matrices: NO

INTEGRATION SCHEME PARAMETERS:
Propagation mmethod: MAGNUS order 2
Number of integration steps: 3
Take into account curvature
Area variation within segments taken into account
scaling factor computation method: AREA

TRANSFER FUNCTION COMPUTATION PARAMETERS:
Index of noise source section: 25
Maximal computed frequency: 10000 Hz
Spectrum exponent 10
Frequency steps: 43.0664 Hz
Number of simulated frequencies: 233
Transfer function point (cm):
3 0 0

ACOUSTIC FIELD COMPUTATION PARAMETERS:
Acoustic field computation at 1206.9 Hz with 30 points per cm
Spatial resolution for field picture: 30 points per cm
Bounding box:
min x -3.34796
max x 5.94298
min y -7.95
max y 1.57243
Compute radiated field YES
 

This file, named log.txt, is generated and modified automatically in the working directory of VocalTractLab3D.
It can be useful to check which parameters have been used for a specific simulation, or to follow the simulation
process in more detail. During a simulation, one can follow the updates of the log file in real time with
appropriate software. This can be done with Notepad++ by selecting the option "Monitoring" in "View". In Linux
this can be done in the command line with
 
tail -f log.txt
 
A copy of the log file can be saved to keep track of the parameters used for a specific simulation.
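When several saved copies of the log have to be compared, the parameters can be read programmatically. The Python sketch below is not part of VocalTractLab3D. It assumes the layout of the example above (upper-case section headers ending with a colon, followed by "name: value" lines), ignores lines without a colon, and the function name parse_log is hypothetical.

```python
def parse_log(text):
    """Group 'name: value' lines of a log under their upper-case section headers."""
    sections, current = {}, None
    for raw in text.splitlines():
        key, sep, value = raw.strip().partition(":")
        key, value = key.strip(), value.strip()
        if sep and not value and key.isupper():
            current = sections.setdefault(key, {})  # new section header
        elif sep and value and current is not None:
            current[key] = value  # parameter inside the current section
    return sections


sample = """\
Wed Jul 20 15:06:09 2022

TRANSFER FUNCTION COMPUTATION PARAMETERS:
Maximal computed frequency: 10000 Hz
Spectrum exponent 10
Frequency steps: 43.0664 Hz
Number of simulated frequencies: 233
"""
params = parse_log(sample)["TRANSFER FUNCTION COMPUTATION PARAMETERS"]
step = float(params["Frequency steps"].split()[0])
f_max = float(params["Maximal computed frequency"].split()[0])
# One reading consistent with the example: frequencies 0, step, 2*step, ... up to f_max.
n_expected = int(f_max // step) + 1
```

For the example log, n_expected matches the reported number of simulated frequencies. This relation is inferred from the example values only and is not stated in the guide.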

4 Acknowledgements
The development of VocalTractLab3D was supported by the German Research Foundation (DFG) under Grant BI
1639/7-1.
I am very grateful for the support of Peter Birkholz for the development of this special version of VocalTractLab.
Without his work on articulatory synthesis, which led to the development of VocalTractLab, this software could
not exist. Throughout this project he was very helpful and supportive through insightful discussions which helped
in designing the software and solving problems.
We acknowledge the contribution of Jingyan Geng for the creation of the geometry files from MRI data. We
thank all the members of the Chair of Speech Technologies and Cognitive Systems of the TU-Dresden and Mario
Fleischer for helping with testing VTL3D and spotting bugs.

References
[1] P Birkholz. “Modeling consonant-vowel coarticulation for articulatory speech synthesis”. In: PloS one 8.4 (2013),
e60603. doi: 10.1371/journal.pone.0060603.
[2] P Birkholz et al. “Printable 3D vocal tract shapes from MRI data and their acoustic and aerodynamic prop-
erties”. In: Scientific data 7.1 (2020), pp. 1–16.
[3] R Blandin et al. “Effects of higher order propagation modes in vocal tract like geometries”. In: J. Acoust. Soc.
Am. 137.2 (2015), pp. 832–843. doi: 10.1121/1.4906166.
[4] R Blandin et al. “Efficient 3D acoustic simulation of the vocal tract by combining the multimodal method and
finite elements”. In: IEEE Access (2022), pp. 69922–69938. doi: 10.1109/ACCESS.2022.3187424.
[5] R Blandin et al. “Multimodal radiation impedance of a waveguide with arbitrary cross-sectional shape termi-
nated in an infinite baffle”. In: J. Acoust. Soc. Am. 145.4 (2019), pp. 2561–2564. doi: 10.1121/1.5099262.
[6] AM Bruneau et al. “Boundary layer attenuation of higher order modes in waveguides”. In: J. Sound Vib. 119.1
(1987), pp. 15–27. doi: 10.1016/0022-460X(87)90186-6.
[7] G Fant, J Liljencrants, QG Lin, et al. “A four-parameter model of glottal flow”. In: STL-QPSR 4.1985 (1985),
pp. 1–13.
[8] V. Pagneux. “Multimodal admittance method in waveguides and singularity behavior at high frequencies”.
In: J. Comput. Appl. Math. 234.6 (2010). Eighth International Conference on Mathematical and Numerical
Aspects of Waves (Waves 2007), pp. 1834–1841. issn: 0377-0427. doi: 10.1016/j.cam.2009.08.034.
[9] M Sondhi and J Schroeter. “A hybrid time-frequency domain articulatory speech synthesizer”. In: IEEE Trans.
Audio Speech Lang. Process. 35.7 (1987), pp. 955–967. doi: 10.1109/TASSP.1987.1165240.
