Video Segmentation and Indexing using Motion Estimation

Sarah Porter, Video Segmentation and Indexing using Motion Estimation. PhD thesis. University of Bristol. February 2004. PDF, 2537 Kbytes.


Video indexing is central to efficient content-based retrieval and browsing of visual information stored in large multimedia databases. This thesis presents work towards a unified framework for automated video indexing. To create an efficient index, a set of representative key frames is selected to capture the entire video content. This is done by first segmenting the video into its constituent shots and then selecting an optimal number of frames between the identified shot boundaries. The segmentation algorithm is designed to detect both abrupt shot transitions, or cuts, and gradual transitions, such as dissolves and fades. It does so by means of a two-component frame differencing metric that takes both image structure and colour distributions into account. The application of hierarchical block-based normalised correlation and local colour histogram differences leads to a method which is both accurate and robust.
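As an illustration only, a two-component frame difference of this general kind might be sketched as follows. This is not the thesis's algorithm. It assumes frames are 2D lists of integer grey levels in 0..255 rather than colour images, uses a single block size with no hierarchy and no motion search, and all function names and parameters (`blocks`, `normalised_correlation`, `histogram_difference`, `frame_difference`, `size`, `bins`) are hypothetical.

```python
from math import sqrt


def blocks(frame, size):
    """Yield size x size blocks (as flat lists) from a 2D list of pixel values."""
    h, w = len(frame), len(frame[0])
    for y in range(0, h - size + 1, size):
        for x in range(0, w - size + 1, size):
            yield [frame[y + j][x + i] for j in range(size) for i in range(size)]


def normalised_correlation(a, b):
    """Zero-mean normalised correlation of two equal-length blocks, in [-1, 1]."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da = [v - ma for v in a]
    db = [v - mb for v in b]
    denom = sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0:
        # Assumption: a flat block only matches a block identical to it.
        return 1.0 if a == b else 0.0
    return sum(p * q for p, q in zip(da, db)) / denom


def histogram_difference(a, b, bins=8, max_value=256):
    """Normalised L1 distance between the two block histograms, in [0, 1]."""
    ha, hb = [0] * bins, [0] * bins
    for v in a:
        ha[v * bins // max_value] += 1
    for v in b:
        hb[v * bins // max_value] += 1
    return sum(abs(p - q) for p, q in zip(ha, hb)) / (2 * len(a))


def frame_difference(f1, f2, size=4):
    """Return (structure, colour) differences, each averaged over all blocks."""
    pairs = list(zip(blocks(f1, size), blocks(f2, size)))
    # Negative correlation is clamped to zero, so structure lies in [0, 1].
    structure = sum(
        1 - max(0.0, normalised_correlation(a, b)) for a, b in pairs
    ) / len(pairs)
    colour = sum(histogram_difference(a, b) for a, b in pairs) / len(pairs)
    return structure, colour
```

In a sketch like this, a cut would show up as a single large jump in both components between consecutive frames, while a dissolve or fade would appear as a sustained run of moderate differences. How the two components are combined and thresholded is left open here.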

After the segmentation step, the key frames are selected to minimise representational redundancy whilst still portraying the content in each shot. This is achieved by employing a graph-based representation of each shot, where nodes represent frames and connection weights represent the amount of content shared between the frames corresponding to the connected nodes. The key frames are then selected as those corresponding to nodes on the least-weight path through the graph. As a final step, the camera motion is characterised to provide an additional layer of video annotation which may prove useful for indexing.
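A minimal sketch of least-weight path selection over such a graph is given below, under assumptions the abstract does not state: the path runs from the first to the last frame of the shot, edges only point forward in time, and an edge exists only when two frames share at least `min_overlap` of their content (so consecutive key frames still overlap). The function name `select_key_frames`, the `shared` matrix and the `min_overlap` parameter are hypothetical.

```python
def select_key_frames(shared, min_overlap=0.1):
    """Return frame indices on the least-weight path from first to last frame.

    shared[i][j] is the (hypothetical) amount of content shared by frames
    i and j, in [0, 1]. Edges go forward in time only, so a single pass of
    dynamic programming in frame order finds the least-weight path.
    """
    n = len(shared)
    inf = float("inf")
    cost = [inf] * n
    prev = [None] * n
    cost[0] = 0.0
    for j in range(1, n):
        for i in range(j):
            w = shared[i][j]
            # Assumption: frames with too little shared content are not linked.
            if w >= min_overlap and cost[i] + w < cost[j]:
                cost[j] = cost[i] + w
                prev[j] = i
    if cost[-1] == inf:
        raise ValueError("last frame is not reachable from the first frame")
    # Walk back from the last frame to recover the path.
    path = []
    node = n - 1
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]
```

Under these assumptions, minimising the summed shared content favours a small number of key frames that overlap as little as the linking threshold allows, which matches the stated aim of reducing representational redundancy.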
