Integrated deepfake detection and automatic speaker verification

Position

PhD Student

Department

Digital Security

Date

07-2024

Reference

PhD Position – Thesis offer M/F (Reference: SN/NE/PhD/PEPR1/072024)

Description

Research to improve the robustness of voice biometric systems to spoofing attacks is now relatively mature. Typical solutions use an auxiliary binary classifier to detect and filter deepfake or spoofed samples. The role of the speaker verification system is then only to determine whether or not enrolment and test utterances correspond to the same speaker. Combining separate sub-systems in this fashion may be sub-optimal because errors propagate: an error made by one sub-system cannot be corrected by the other. Recent work has shown the potential of joint optimisation, whereby the two sub-systems are trained together to solve the single task of reliable automatic speaker verification.

The performance of jointly optimised and single, integrated systems is limited by the use of training data collected from too few speakers. The ASVspoof 5 database [1] contains data collected from a substantial number of speakers and will allow not only the further exploration of joint optimisation but also more closely integrated systems, specifically single classifiers which perform simultaneous detection and recognition. This thesis will investigate such single, integrated solutions based either upon the fine-tuning of pre-trained binary classifiers or upon multi-task learning techniques. The goal will be to learn the subspaces containing speaker-related and spoofing-related artefacts and then to perform reliable automatic speaker verification with a single classifier. Since there is potential for degraded generalisation when the roles traditionally fulfilled by two separate sub-classifiers are performed by a single classifier, we will explore the use of data augmentation to help improve robustness to unseen attacks and speakers.

The successful candidate will join the Audio Security and Privacy Group within EURECOM’s Digital Security Department and will work under the supervision of Profs. Nicholas Evans and Massimiliano Todisco, and with Prof. Driss Matrouf at the Laboratoire d'Informatique Avignon (LIA). There will be opportunities for international collaboration, e.g. with members of the ASVspoof organising committee. The position is funded by the French National Research Agency (ANR) Cybersecurity Priority Research and Equipment Programme (PEPR).

[1] Hector Delgado, Nicholas Evans, Jee-weon Jung, Tomi Kinnunen, Ivan Kukanov, Kong Aik Lee, Xuechen Liu, Hye-jin Shim, Md Sahidullah, Hemlata Tak, Massimiliano Todisco, Xin Wang, Junichi Yamagishi, “ASVspoof 5 Evaluation Plan”, ASVspoof consortium, 2024. https://www.asvspoof.org/file/ASVspoof5___Evaluation_Plan_Phase2.pdf

Requirements

Education Level / Degree: Master’s degree
Field / specialty: Computer Science, Artificial Intelligence, Speech Processing, Deepfake Detection
Technologies / languages / systems: machine learning, deep learning, Python and PyTorch
Other skills / specialties: strong mathematical, analytical, problem-solving, communication and writing skills
Other important elements: an excellent academic track record, proficiency in English.

Application

The application must include:

Detailed curriculum vitae,
Two-page motivation letter that also presents your research and education perspectives,
Names and addresses of three referees.

Applications should be submitted by e-mail to secretariat@eurecom.fr with the reference: SN/NE/PhD/PEPR1/072024

Start date: Sept./Oct. 2024
Duration: Duration of the thesis

More info

SN_NE_PhD_PEPR1_072024_US.pdf (155.34 KB)