This year EURECOM and ECNU participated together in the TRECVID Semantic Indexing
task. We built four different systems for the light (10-concept) submission. Three of our runs
are functionally similar to the system used by EURECOM for last year's High Level Feature
Extraction task (see [6] for further details).
We keep as a baseline run (Fusebase) the best-performing system from 2009, testing how that
system performs on the new dataset; we then improve the EURECOM Fusebase by adding
a global descriptor, originally designed for scene recognition, which proved effective in the
TRECVID context for spatially-independent concepts such as "Nighttime". We then experiment
with a multi-modal analysis, combining the visual features with the textual metadata
provided with the 2010 video database. As a final run, we try a new system based on Hamming
Embedding and weighted visual words.
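To illustrate the general idea behind that last run (a minimal sketch, not the exact system described in this paper): in Hamming Embedding, each local descriptor is quantized to a visual word and additionally carries a short binary signature; only descriptors falling in the same visual word are compared, and pairs whose signatures lie within a Hamming-distance threshold cast a vote, here weighted by a Gaussian of the distance. The function names, threshold `ht`, and bandwidth `sigma` below are illustrative assumptions.

```python
import numpy as np

def hamming_distance(sig_a, sig_b):
    # Number of differing bits between two binary signatures.
    return int(np.count_nonzero(sig_a != sig_b))

def he_score(query_descs, db_descs, ht=24, sigma=16.0):
    """Score a database image against a query (illustrative sketch).

    Each descriptor is a (visual_word_id, binary_signature) pair.
    Matches within Hamming distance `ht` vote with a Gaussian weight
    that decays as the signatures diverge.
    """
    score = 0.0
    for word_q, sig_q in query_descs:
        for word_d, sig_d in db_descs:
            if word_q != word_d:
                continue  # compare only within the same visual word
            h = hamming_distance(sig_q, sig_d)
            if h <= ht:  # threshold filters out unrelated descriptors
                score += np.exp(-h**2 / sigma**2)
    return score
```

A weighting of this kind makes the bag-of-words vote softer than a plain hard match: two descriptors quantized to the same word contribute more when their binary signatures agree closely.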