This paper gives an overview of approaches to video representation targeting semantic analysis for content-based indexing and retrieval. It highlights the major achievements of the existing methodologies and sheds new light to the challenges that are still unsolved. The problem of adaptive representation of digital multimedia is critically assessed and some novel ideas are presented. In addition, the concept of video multimodality is reevaluated and redefined in order to introduce the modalities like editing technique. An extensive literature survey on the topics involved is given.