This work describes the microphone network currently used at ITC-irst for multi-microphone data collection and prototype development, with the specific aim of supporting research within the CHIL European Project. The project defines a generic multi-sensor system with two main components: a distributed multi-camera system for visual room observation, comprising several calibrated cameras, and a multi-microphone system for acoustic scene analysis. The latter consists of microphone arrays, microphone clusters, table-top microphones and close-talking microphones, and allows detection of multiple acoustic events, voice activity detection, automatic speech recognition (ASR), and speaker localization and tracking [1]. The target scenario comprises seminars and meetings. The entire audio acquisition system uses a common sampling rate of 44.1 kHz and a sample resolution of 24 bits. As with the cameras, the acoustic sensors require a detailed characterization process and a calibration step, so that the geometry of the audio and video sensors is described in a jointly consistent way. In the CHIL room at ITC-irst (see Figure 1), seven T-shaped microphone arrays, each consisting of four omnidirectional microphones, were installed to obtain optimal coverage of the environment for speaker localization and tracking. In addition, a NIST-MarkIII array [2] of 64 microphones was installed on the wall facing the seminar speaker. Its primary purpose is far-field ASR; however, the MarkIII signals are also expected to benefit speech activity detection and speaker localization.
On calibration and coherence signal analysis of the CHIL microphone network at IRST
HSCMA 2005, Workshop on Hands-Free Speech Communication and Microphone Arrays, March 17-18, 2005, Piscataway, USA
Type:
Conference
City:
Piscataway
Date:
2005-03-17
Department:
Digital Security
Eurecom Ref:
1612
See also:
PERMALINK: https://www.eurecom.fr/publication/1612