Detection of adversarial audio deepfakes

Position: PhD Position – Thesis offer M/F
Department: Digital Security
Date: 07-2024
Reference: SN/NE/PhD/PEPR2/072024

Description

The design of reliable, robust solutions for the detection of deepfakes is a continuous arms race with fraudsters. Once a new deepfake attack is identified and the detection solution is updated accordingly, the fraudster can design new attacks to overcome the new detector. Attacks are naturally becoming more sophisticated and adversarial in nature. Those seen thus far are typically designed to overcome a single classifier, perhaps an automatic speaker verification sub-system or a deepfake detection sub-system. We aim to explore and improve the resilience of spoofing-robust speaker verification systems to attacks designed to manipulate both sub-systems.

This thesis will study a new generation of adversarial attacks designed to fool both a voice biometric system and a deepfake spoofing detector. Starting with a selection of state-of-the-art detectors, we will design post-processing techniques to suppress the distortions and artefacts in speech signals generated using text-to-speech (TTS) and voice conversion (VC) algorithms. This work will establish the vulnerability of existing solutions to more adversarial attacks. Building on the concepts of adversarial training, the second stage will be to design alternative detection approaches that target speech attributes which even state-of-the-art TTS and VC approaches do not model well and which cannot be removed or attenuated through adversarial post-processing. Since adversarial training is typically highly demanding in terms of computation and may result in even more complex models than those used currently, we are also interested in studying knowledge distillation and other model-complexity-reduction techniques, e.g. in the form of teacher-student networks or gradient-flow-preservation pruning, to reduce complexity and to help learn efficient models suited to practical applications.
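To make the teacher-student idea concrete, the sketch below shows the classic temperature-scaled distillation loss (Hinton-style): a compact student model is trained to match the softened output distribution of a larger teacher detector. This is an illustrative, self-contained example in plain Python, not code from the project; function names and the temperature value are assumptions.

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over a list of logits.

    Higher T softens the distribution, exposing the teacher's
    'dark knowledge' about relative class similarities."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between the teacher's soft targets and the
    student's predictions, scaled by T^2 (the usual gradient
    rescaling in knowledge distillation)."""
    p = softmax(teacher_logits, T)  # teacher soft targets
    q = softmax(student_logits, T)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return T * T * kl

# A student that exactly matches the teacher incurs zero loss:
print(distillation_loss([2.0, 0.5], [2.0, 0.5]))  # → 0.0
```

In practice this term is combined with the ordinary cross-entropy on hard labels, so the student learns both the ground truth and the teacher's inter-class structure at a fraction of the teacher's size.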

The successful candidate will join the Audio Security and Privacy Group within EURECOM’s Digital Security Department. You will work under the supervision of Profs. Nicholas Evans and Massimiliano Todisco and with Prof. Anthony Larcher at the Laboratoire d'Informatique de l'Université du Mans (LIUM), and there will be opportunities for international collaboration, e.g. with members of the ASVspoof organising committee. The position is funded by the French National Research Agency (ANR) Cybersecurity Priority Research and Equipment Programme (PEPR).


Requirements

  • Education level / degree: Master's degree
  • Field / specialty: Computer Science, Artificial Intelligence, Speech Processing, Deepfake Detection
  • Technologies / languages / systems: machine learning, deep learning, Python and PyTorch
  • Other skills / specialties: strong mathematics, analytical, problem-solving, communication and writing skills
  • Other important elements: an excellent academic track record, proficiency in English

Application

The application must include:

  • A detailed curriculum vitae (CV),
  • A two-page motivation letter, also presenting your research and teaching perspectives,
  • The names and contact details of three references.

Applications should be submitted by e-mail to secretariat@eurecom.fr with the reference: SN/NE/PhD/PEPR2/072024

Start date: Sept./Oct. 2024
Duration: the full duration of the thesis
