Latent semantic analysis for an effective region-based video shot retrieval system

Souvannavong, Fabrice; Mérialdo, Bernard; Huet, Benoit
MIR 2004, 6th ACM SIGMM International Workshop on Multimedia Information Retrieval, October 15-16, 2004, New York, USA

We present a complete and efficient framework for video shot indexing and retrieval. Video shots are described by their key-frames, which are in turn described by their regions. Region-based approaches suffer from the complexity of the segmentation and comparison tasks. A compact region-based shot representation is usually obtained with a vector quantization method. We therefore introduce latent semantic analysis (LSA) to reduce the noise inherent in the segmentation and quantization processes. Then, to better capture the content of video shots, we propose two original methods. The first takes advantage of a multi-scale segmentation of frames, while the second uses multiple frames to represent a shot. Both approaches require more computation time during pre-processing but not for the indexing and comparison tasks, because the extra information is included in the original signatures of the shots. Finally, we introduce a relevance feedback loop to optimize the search and propose a new method to optimize the effect of LSA. In the experimental section, we evaluate latent semantic analysis and the proposed approaches on two problems: object retrieval and semantic content estimation.
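
The core idea of the abstract, applying LSA to vector-quantized region signatures, can be illustrated with a minimal Python sketch. It assumes that each shot has already been segmented and that every region has been mapped to a cluster index by some vector quantizer; the function names (build_signatures, fit_lsa, project, cosine, rank_shots), the toy data, and the choice of cosine similarity are illustrative assumptions and are not taken from the paper. LSA is realized here in its standard form, a truncated singular value decomposition of the term-by-shot count matrix.

```python
import numpy as np


def build_signatures(region_labels_per_shot, vocab_size):
    """Count how often each cluster index ("visual term") occurs in each shot.

    Returns a (vocab_size x n_shots) matrix; column j is the signature of shot j.
    """
    counts = np.zeros((vocab_size, len(region_labels_per_shot)))
    for shot_idx, labels in enumerate(region_labels_per_shot):
        for label in labels:
            counts[label, shot_idx] += 1
    return counts


def fit_lsa(counts, k):
    """Truncated SVD: keep the k left singular vectors with the largest singular values."""
    u, _, _ = np.linalg.svd(counts, full_matrices=False)
    return u[:, :k]


def project(basis, signature):
    """Map a raw signature into the k-dimensional latent space."""
    return basis.T @ signature


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is zero."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def rank_shots(basis, counts, query_signature):
    """Return shot indices sorted by decreasing latent-space similarity to the query."""
    q = project(basis, query_signature)
    sims = [cosine(q, project(basis, counts[:, j])) for j in range(counts.shape[1])]
    return sorted(range(len(sims)), key=lambda j: sims[j], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy data: four shots, each a list of quantized region labels.
    labels = [[0, 0, 1], [0, 1, 1], [2, 3, 3], [2, 2, 3]]
    counts = build_signatures(labels, vocab_size=4)
    basis = fit_lsa(counts, k=2)
    print(rank_shots(basis, counts, counts[:, 0]))
```

In this toy setting, shots that share the same visual terms in different proportions end up close together in the latent space, which is the noise-reduction effect the abstract attributes to LSA. The multi-scale and multi-frame variants would only change how the region labels of a shot are collected before build_signatures is called, under the same assumptions.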

New York
Data Science
Eurecom Ref:
© ACM, 2004. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in MIR 2004, 6th ACM SIGMM International Workshop on Multimedia Information Retrieval, October 15-16, 2004, New York, USA