Spoof diarization: “What spoofed when” in partially spoofed audio

Zhang, Lin; Wang, Xin; Cooper, Erica; Diez, Mireia; Landini, Federico; Evans, Nicholas; Yamagishi, Junichi

INTERSPEECH 2024, 25th Conference of the International Speech Communication Association, 1-5 September 2024, Kos Island, Greece / Also on arXiv

This paper defines Spoof Diarization as a novel task in the Partial Spoof (PS) scenario. The task aims to determine what spoofed when, which involves not only locating spoofed regions but also clustering them according to the spoofing method used. As a pioneering study in spoof diarization, we focus on defining the task, establishing evaluation metrics, and proposing a benchmark model, namely the Countermeasure-Condition Clustering (3C) model. Using this model, we first explore how to effectively train countermeasures to support spoof diarization using three labeling schemes. We then use spoof localization predictions to enhance diarization performance. This first study reveals the high complexity of the task, even in restricted scenarios where only a single speaker per audio file and an oracle number of spoofing methods are considered. Our code is available at https://github.com/nii-yamagishilab/PartialSpoof.
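To make the "what spoofed when" output concrete, the following is a minimal sketch of how per-frame decisions could be merged into labeled time segments. It is not the authors' 3C implementation: the function name `diarize`, the label strings, and the 20 ms frame shift are all assumptions chosen for illustration. It assumes that a clustering step has already assigned a cluster index to every frame and that a separate localization step has marked each frame as spoofed or not.

```python
from itertools import groupby

BONAFIDE = "bonafide"  # hypothetical label name, not taken from the paper


def diarize(cluster_ids, spoof_mask, frame_shift=0.02):
    """Merge per-frame decisions into (start_sec, end_sec, label) segments.

    cluster_ids: per-frame cluster index from some clustering step (assumed).
    spoof_mask:  per-frame booleans from a spoof localization step (assumed).
    frame_shift: seconds per frame; 0.02 is an illustrative value only.
    """
    if len(cluster_ids) != len(spoof_mask):
        raise ValueError("inputs must have one entry per frame")

    # Frames that localization marks as bona fide override the cluster index.
    labels = [
        f"spoof_{c}" if is_spoof else BONAFIDE
        for c, is_spoof in zip(cluster_ids, spoof_mask)
    ]

    # Collapse runs of identical labels into time-stamped segments.
    segments, start = [], 0
    for label, run in groupby(labels):
        n = len(list(run))
        segments.append(
            (round(start * frame_shift, 3), round((start + n) * frame_shift, 3), label)
        )
        start += n
    return segments


if __name__ == "__main__":
    ids = [0, 0, 1, 1, 1, 2]
    mask = [False, False, True, True, True, True]
    for seg in diarize(ids, mask):
        print(seg)
```

Each returned tuple gives a start time, an end time, and either the bona fide label or a spoof cluster label, which is the kind of segment-level output a diarization metric would score.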

Detail

Type: Conference

City: Kos Island

Date: 2024-09-01

Department: Digital Security

Eurecom Ref: 7770

© ISCA. Personal use of this material is permitted. The definitive version of this paper was published in INTERSPEECH 2024, 25th Conference of the International Speech Communication Association, 1-5 September 2024, Kos Island, Greece / Also on arXiv and is available at: