SynVox2: Towards a privacy-friendly VoxCeleb2 dataset

Miao, Xiaoxiao; Wang, Xin; Cooper, Erica; Yamagishi, Junichi; Evans, Nicholas; Todisco, Massimiliano; Bonastre, Jean-François; Rouvier, Mickael

ICASSP 2024, IEEE International Conference on Acoustics, Speech and Signal Processing, 14-19 April 2024, Seoul, Korea

The success of deep learning in speaker recognition relies heavily on the use of large datasets. However, the data-hungry nature of deep learning methods has already been questioned on account of the ethical, privacy, and legal concerns that arise when using large-scale datasets of natural speech collected from real human speakers. For example, the widely used VoxCeleb2 dataset for speaker recognition is no longer accessible from the official website. To mitigate these concerns, this work presents an initiative to generate a privacy-friendly synthetic VoxCeleb2 dataset that ensures the quality of the generated speech in terms of privacy, utility, and fairness. We also discuss the challenges of using synthetic data for the downstream task of speaker verification.

Type: Conference

City: Seoul

Date: 2024-04-14

Department: Digital Security

Eurecom Ref: 7435

© 2024 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.