|
One of VTI's main features is the ability to go from the acoustic space (defined by the
formant frequencies) to the articulatory space (defined by the DRM control parameters [4]).
This process is called acoustic-to-articulatory inversion [5]. Three different inversion methods
are implemented:
- Inversion by a connectionist network [2][3][6][7]: a connectionist network can be trained to
provide as output the control parameters of a vocal tract model when the formant frequencies
are given as input. In VTI, three formant frequencies are used.
- Inversion by table lookup [1][2][5]: a table contains pairs of acoustic and articulatory
data. The table is searched for the entry whose acoustic data best match the target data,
and the corresponding articulatory controls are output. In VTI, three
formant frequencies are used as acoustic data, and the controls of the DRM model plus the
total length of the tract are used as articulatory data.
- Optimization [2][5]: gradient descent can be used to modify the control parameters of the
vocal tract in order to decrease the error between the target data and the formant frequencies
obtained with the current configuration of the vocal tract.
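The table lookup method in the list above can be sketched as a nearest-neighbour search in the
acoustic space. This is a minimal illustration, not VTI's code: the table entries, the number of
controls (three plus the tract length) and the names TABLE, acoustic_distance and invert_by_lookup
are all hypothetical, and the Euclidean distance on formants in Hz is an assumed error measure.

```python
import math

# Hypothetical table of (formants, controls) pairs.
# Formants are (F1, F2, F3) in Hz; controls are three made-up model
# parameters followed by the total tract length in cm.
# The numbers are placeholders for illustration, not VTI data.
TABLE = [
    ((700.0, 1200.0, 2500.0), (0.8, 0.2, 0.5, 17.0)),
    ((300.0, 2300.0, 3000.0), (0.1, 0.9, 0.4, 16.5)),
    ((350.0, 800.0, 2300.0), (0.2, 0.1, 0.7, 18.0)),
]


def acoustic_distance(a, b):
    """Euclidean distance between two formant triples (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def invert_by_lookup(target, table=TABLE):
    """Return the controls of the entry whose formants best match target."""
    _, controls = min(
        table, key=lambda entry: acoustic_distance(entry[0], target)
    )
    return controls


if __name__ == "__main__":
    print(invert_by_lookup((320.0, 2250.0, 2900.0)))
```

A linear scan is enough for a small table. A large table would normally call for an indexed
search, but the best-match principle stays the same.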
VTI allows the use of optimization either from a starting vocal tract configuration obtained
manually by direct control of the vocal tract parameters, or after one of the other inversion
methods, i.e. inversion by connectionist network or inversion by table lookup.
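The optimization step can be sketched as a plain gradient descent that starts from a given
configuration and reduces the formant error. This is only a sketch under stated assumptions:
toy_forward_model is a made-up stand-in for the vocal tract model (it is not the DRM model), the
error is an assumed sum of squared relative formant differences, the gradient is estimated by
finite differences, and the step size and iteration count are arbitrary choices.

```python
def toy_forward_model(controls):
    """Hypothetical stand-in for the vocal tract model.

    Maps two made-up control parameters to (F1, F2, F3) in Hz.
    It is NOT the DRM model; it only gives the optimizer something smooth.
    """
    a, b = controls
    return (
        500.0 + 300.0 * a,
        1500.0 + 800.0 * b - 200.0 * a,
        2500.0 + 200.0 * a * b,
    )


def formant_error(controls, target):
    """Sum of squared relative differences between model and target formants."""
    formants = toy_forward_model(controls)
    return sum(((f - t) / t) ** 2 for f, t in zip(formants, target))


def numerical_gradient(func, x, h=1e-5):
    """Central finite-difference estimate of the gradient of func at x."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grad.append((func(up) - func(down)) / (2.0 * h))
    return grad


def refine(start, target, rate=0.5, steps=200):
    """Gradient descent on the formant error, starting from start."""
    controls = list(start)
    for _ in range(steps):
        grad = numerical_gradient(lambda c: formant_error(c, target), controls)
        controls = [c - rate * g for c, g in zip(controls, grad)]
    return controls


if __name__ == "__main__":
    target = toy_forward_model([0.5, -0.25])
    result = refine([0.0, 0.0], target)
    print(result, formant_error(result, target))
```

The start argument plays the role of the starting configuration: it can be set by hand, or it can
be the output of one of the other two inversion methods.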
VTI files
Installation of VTI is simple. Just copy the folder "VTI f" to your hard disk. Do not move
any file out of this folder, or you may encounter problems when launching the application.
The folder "VTI f" contains the following:
The "Tables" folder contains the tables available for inversion by table lookup, and the "Networks"
folder contains the networks available for inversion by connectionist network.
Using VTI
VTI implements balloon help. Don't hesitate to work with balloon help turned on in order to obtain
direct information about the program's user interface.
Each VTI document consists of five windows. The main window displays information about the selected
frame and provides ways to select a different frame. The four remaining windows display, respectively,
the spectral envelope of the current frame, the area function of the vocal tract of the current frame
(if any), the formant frequencies and bandwidths for each frame, and the synthesized signal.
- Main window:
Any control can be modified by typing the new value or by moving the corresponding slider (if any).
- Acoustic window:
The acoustic window displays the spectral envelope computed from the formant frequencies and bandwidths
in blue, and the spectral envelope computed from the vocal tract area function (if any) in red.
- Articulatory window:
The articulatory window displays the area function (area in cm² vs. distance from the glottis in cm).
- Formants window:
The formants window displays formant frequencies and bandwidths for all frames. Values obtained
with the vocal tract (if any) are shown in red, and values obtained from the formant description in blue.
- Sound window:
The sound window displays the synthesized signal.
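As an illustration of the blue curve in the acoustic window, a spectral envelope can be derived from
formant frequencies and bandwidths alone. The text does not say how VTI computes it. The sketch
below assumes a cascade of second-order resonators, each normalized to unit gain at 0 Hz, and
ignores source, radiation and higher-pole corrections. The function name envelope_db and the
example values are hypothetical.

```python
import math


def envelope_db(freq_hz, formants_hz, bandwidths_hz):
    """Level in dB, at freq_hz, of a cascade of formant resonators.

    Each formant (F, B) contributes a conjugate pole pair at
    s = -pi*B +/- j*2*pi*F, scaled so that its gain at 0 Hz is 1.
    This all-pole cascade is an assumed model, not VTI's documented method.
    """
    s = complex(0.0, 2.0 * math.pi * freq_hz)
    level = 0.0
    for f_k, b_k in zip(formants_hz, bandwidths_hz):
        pole = complex(-math.pi * b_k, 2.0 * math.pi * f_k)
        gain = (pole * pole.conjugate()).real  # squared modulus of the pole
        response = gain / ((s - pole) * (s - pole.conjugate()))
        level += 20.0 * math.log10(abs(response))
    return level


if __name__ == "__main__":
    # Example values chosen for illustration only.
    formants = (500.0, 1500.0, 2500.0)
    bandwidths = (60.0, 90.0, 120.0)
    for f in range(0, 4001, 500):
        print(f, round(envelope_db(float(f), formants, bandwidths), 1))
```

With three formants, as used by VTI for inversion, the curve shows one peak near each formant
frequency, and a narrower bandwidth gives a sharper peak.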
What you need to use VTI
To use VTI, you need these pieces of hardware and software:
- a Power Macintosh computer
- system software version 7.5 or later
A color screen is highly recommended to improve the readability of the graphics (256 colors is a good choice).
Feedback
This beta version of VTI has been sent to several laboratories. Any comments or bug reports are
welcome. Please send them to me by E-mail, and I will try to reply to these comments and fix bugs.
|