A new video segmentation scheme for MPEG-4 applications: algorithm, architecture and implementation

By Durga Kishore

Abstract

Rapid advances in computer hardware, software and communication systems have opened the door to the cutting-edge technology of multimedia communications. MPEG-4 is the new coding standard for multimedia communications. For MPEG-4 video, the most important functionalities are object-based interactivity, high coding efficiency, and improved error resilience. Content-based visual object representation is the key to interactivity and the other content-based functionalities of MPEG-4. However, to take advantage of these features, video sequences must first be decomposed into semantically meaningful objects. Partitioning a video sequence into video object planes (VOPs) by means of automatic or semi-automatic segmentation is a very difficult task, and comparatively little research has been undertaken in this field.

The objectives of this work are to propose a novel algorithm for video object segmentation for MPEG-4 applications, and to develop a distributed memory architecture that implements the spatial segmentation stage of the algorithm. Towards this end, a new spatio-temporal VOP segmentation algorithm is proposed that combines mathematical morphology with an object tracking system. Temporal segmentation is based on a morphological change detection mask (MCDM). Spatial segmentation applies morphological filters to simplify the image, followed by gradient detection. Morphological region filling then combines the results of the MCDM and gradient detection to obtain accurate objects. The main idea is that the object boundaries coincide with the boundaries found by gradient detection. Therefore, the MCDM results are used to locate the foreground regions in the gradient image, which are then recovered by region filling. The object-based tracking system tracks the objects in consecutive frames. This approach yields foreground video objects with perfect preservation of contours. In addition, the algorithm avoids the watershed flooding and region merging steps, which are the most complex and time-consuming modules of the previously proposed Kim et al. algorithm. The proposed algorithm and the Kim et al. algorithm are coded in C and simulated on a Pentium II processor running at 200 MHz. The new algorithm runs twice as fast as the Kim et al. algorithm.
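
The spatial and temporal steps described above can be illustrated with a minimal C sketch. It is not the implementation from this work: the 3x3 flat structuring element, the thresholded frame difference followed by a morphological opening as a stand-in for the MCDM, the 4-connected fill rule, and all function names (morph_gradient, change_mask, region_fill) are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* 3x3 flat erosion (dilate = 0) or dilation (dilate = 1) on an 8-bit image.
   Border pixels use only their in-bounds neighbours (an assumed convention). */
static void morph3x3(const unsigned char *src, unsigned char *dst,
                     int w, int h, int dilate)
{
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            unsigned char v = src[y * w + x];
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++) {
                    int yy = y + dy, xx = x + dx;
                    if (yy < 0 || yy >= h || xx < 0 || xx >= w) continue;
                    unsigned char p = src[yy * w + xx];
                    if (dilate ? p > v : p < v) v = p;
                }
            dst[y * w + x] = v;
        }
}

/* Morphological gradient: dilation minus erosion, large at object contours. */
void morph_gradient(const unsigned char *src, unsigned char *dst, int w, int h)
{
    size_t n = (size_t)w * h;
    unsigned char *d = malloc(n), *e = malloc(n);
    assert(d && e);
    morph3x3(src, d, w, h, 1);
    morph3x3(src, e, w, h, 0);
    for (size_t i = 0; i < n; i++)
        dst[i] = (unsigned char)(d[i] - e[i]);
    free(d);
    free(e);
}

/* Assumed stand-in for the MCDM: threshold the absolute frame difference,
   then apply a morphological opening to remove isolated noise pixels. */
void change_mask(const unsigned char *prev, const unsigned char *cur,
                 unsigned char *mask, int w, int h, int thresh)
{
    size_t n = (size_t)w * h;
    unsigned char *tmp = malloc(n);
    assert(tmp);
    for (size_t i = 0; i < n; i++)
        mask[i] = abs((int)cur[i] - (int)prev[i]) > thresh ? 255 : 0;
    morph3x3(mask, tmp, w, h, 0);   /* erosion */
    morph3x3(tmp, mask, w, h, 1);   /* dilation */
    free(tmp);
}

/* Grow the foreground from mask seed pixels through 4-connected neighbours.
   Pixels with a strong gradient are kept as contour but not grown past
   (an assumed fill rule, not the one used in this work). */
void region_fill(const unsigned char *mask, const unsigned char *grad,
                 unsigned char *obj, int w, int h, int edge_thresh)
{
    int *stack = malloc(sizeof *stack * (size_t)w * h);
    int top = 0;
    assert(stack);
    memset(obj, 0, (size_t)w * h);
    for (int i = 0; i < w * h; i++)
        if (mask[i]) { obj[i] = 255; stack[top++] = i; }
    while (top > 0) {
        int i = stack[--top], x = i % w, y = i / w;
        if (grad[i] >= edge_thresh) continue;   /* contour pixel: stop here */
        const int nx[4] = { x - 1, x + 1, x, x };
        const int ny[4] = { y, y, y - 1, y + 1 };
        for (int k = 0; k < 4; k++) {
            if (nx[k] < 0 || nx[k] >= w || ny[k] < 0 || ny[k] >= h) continue;
            int j = ny[k] * w + nx[k];
            if (!obj[j]) { obj[j] = 255; stack[top++] = j; }
        }
    }
    free(stack);
}
```

In this sketch a frame would be processed by calling change_mask on two consecutive frames, morph_gradient on the current frame, and region_fill on the two results. Because each pixel is marked before it is pushed, the stack never holds more than w*h entries. The fill floods any flat region that contains a seed, so it relies on the mask seeds lying inside the object contours.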

Further, an efficient architecture is proposed to implement the spatial segmentation process, which is the most time-consuming part of the new algorithm. It exploits the pipelining inherent in the morphological operations and the parallelism inherent in the incoming data. It is a distributed memory architecture with four processing elements operating under the control of a master processor. The architecture is functionally simulated and tested using Verilog HDL and synthesized with Synopsys CAD tools.
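
One way a master processor might divide an image among four processing elements with local memories is sketched below in C. The abstract does not state how the data is partitioned, so the horizontal strip layout, the halo rows that give each element the neighbouring rows a 3x3 operator needs, and the names Strip and partition_rows are all assumptions made for illustration.

```c
#include <assert.h>

#define NUM_PE 4   /* four processing elements, as stated in the abstract */

typedef struct {
    int first_row;   /* first image row this PE produces output for */
    int num_rows;    /* number of output rows assigned to this PE */
    int load_first;  /* first row copied into local memory (with halo) */
    int load_rows;   /* number of rows copied into local memory */
} Strip;

/* Split `height` rows into NUM_PE nearly equal strips. Each strip also
   loads `halo` extra rows above and below, clipped to the image, so that
   neighbourhood operators need no access to another PE's memory. */
void partition_rows(int height, int halo, Strip strips[NUM_PE])
{
    assert(height >= NUM_PE && halo >= 0);
    int base = height / NUM_PE, extra = height % NUM_PE, row = 0;
    for (int pe = 0; pe < NUM_PE; pe++) {
        int rows = base + (pe < extra ? 1 : 0);
        int lo = row - halo < 0 ? 0 : row - halo;
        int hi = row + rows + halo > height ? height : row + rows + halo;
        strips[pe] = (Strip){ row, rows, lo, hi - lo };
        row += rows;
    }
}
```

A halo of one row is enough for a single 3x3 operation; a chain of k such operations applied without exchanging data between elements would need a halo of k rows.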