Fast speaker adaptation using "a priori" knowledge

Kuhn, Roland; Nguyen, Patrick; Junqua, Jean-Claude; Boman, R; Niedzielski, Nancy; Fincke, S; Field, K; Contolini, M
ICASSP 1999, 24th IEEE International Conference on Acoustics, Speech and Signal Processing, March 15-19, 1999, Phoenix, USA

Recently, we presented a radically new class of fast adaptation techniques for speech recognition, based on prior knowledge of speaker variation. To obtain this prior knowledge, one applies a dimensionality reduction technique to T vectors of dimension D derived from T speaker-dependent (SD) models. This offline step yields T basis vectors, the eigenvoices. We constrain the model for a new speaker S to lie in the space spanned by the first K eigenvoices. Speaker adaptation then involves estimating K eigenvoice coefficients for the new speaker; typically, K is very small compared to the original dimension D. Here, we review how to find the eigenvoices, give a maximum-likelihood estimator for the new speaker's eigenvoice coefficients, and summarize mean adaptation experiments carried out on the Isolet database. We present new results that assess the impact on performance of changes in the training of the SD models. Finally, we interpret the first few eigenvoices obtained.
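
The two stages described in the abstract (an offline basis computation, then a per-speaker coefficient estimate) can be illustrated with a minimal NumPy sketch. It rests on several assumptions that are not taken from the paper: the dimensionality reduction is taken to be PCA computed via SVD, the coefficients are found by an ordinary least-squares projection standing in for the paper's maximum-likelihood estimator, the data are synthetic, and the function names `train_eigenvoices` and `adapt` are hypothetical.

```python
import numpy as np


def train_eigenvoices(supervectors, k):
    """Offline step (assumed here to be PCA): supervectors is a (T, D) array,
    one row per speaker-dependent model. Returns the mean vector (D,) and the
    first k basis vectors (k, D)."""
    mean = supervectors.mean(axis=0)
    centered = supervectors - mean
    # Rows of vt are orthonormal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]


def adapt(mean, eigenvoices, observed):
    """Estimate k coefficients for a new speaker by least squares.
    NOTE: this is a simplification, not the maximum-likelihood estimator
    given in the paper; it assumes a full D-dimensional vector is observed."""
    coeffs, *_ = np.linalg.lstsq(eigenvoices.T, observed - mean, rcond=None)
    return coeffs, mean + coeffs @ eigenvoices


# Synthetic data (illustrative only): T speakers whose D-dimensional vectors
# lie in a K-dimensional affine subspace.
rng = np.random.default_rng(0)
T, D, K = 20, 50, 3
true_basis = rng.normal(size=(K, D))
supervectors = rng.normal(size=(T, K)) @ true_basis + 5.0

mean, voices = train_eigenvoices(supervectors, K)

# A new speaker drawn from the same subspace is described by only K numbers.
new_speaker = rng.normal(size=K) @ true_basis + 5.0
coeffs, adapted = adapt(mean, voices, new_speaker)
```

The point of the sketch is the parameter count: adaptation estimates K values rather than D, which is why very little data from the new speaker may suffice when K is much smaller than D.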

Digital Security
© 1999 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.