Dual-stream temporal convolutional neural network for voice presentation attack detection

Gonzalez-Soler, Lazaro J; Gomez-Barrero, Marta; Kamble, Madhu; Todisco, Massimiliano; Busch, Christoph
IWBF 2022, International Workshop on Biometrics and Forensics, 20-21 April 2022, Salzburg, Austria

Improving the robustness of biometric systems to external attacks is of the utmost importance for the research community. In particular, Automatic Speaker Verification (ASV) systems can be easily bypassed either by launching attack presentations at the capture device, i.e., the microphone (physical access attacks), or by replacing the input sample in the channel between the capture device and the signal processor (logical access attacks). To address these security threats, the ASVspoof challenges have evaluated the generalisation ability of several Presentation Attack Detection (PAD) approaches over the last decade. Those algorithms have reported remarkable performance in detecting physical and logical access attacks when combined with the decision provided by the ASV system. However, they fundamentally depend upon the complementary information of ASV systems for reliable detection and are therefore not interoperable across different systems. In this work, we propose an interoperable dual-stream PAD method which leverages temporal information from image-based voice spectrograms to enhance generalisation in PAD. Experimental results on the publicly available ASVspoof 2019 and 2021 databases show the feasibility of our approach for detecting both physical and logical access attacks that were not seen during training.


Type: Conference
City: Salzburg
Date: 2022-04-20
Department: Digital Security
Eurecom Ref: 6930
Copyright: © 2022 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Permalink: https://www.eurecom.fr/publication/6930