Zero-shot classification of events for character-centric video summarization