Partners and International Organizations
(English)
|
ALU, TUT, US, AUT, UT, CMM, EPFL
|
Abstract
(English)
|
During the third year, EPFL concentrated on the development of nonlinear video segmentation schemes. The method was designed as a flexible tool capable of addressing a wide range of applications. The system rests on the distinction between regions and semantic objects: the algorithm automatically extracts regions that are homogeneous in colour and/or motion, and these regions are then grouped into semantic objects.
The segmentation method follows a two-level, multi-feature approach. First, a low-level segmentation of the image into homogeneous regions is obtained by multidimensional analysis of several image features, using a spatially constrained Fuzzy C-Means algorithm. The features characterize both spatial and temporal information, and their relative weights are adapted to the local image characteristics. Second, the resulting regions are grouped into objects, either interactively or automatically, depending on the application. Interactive multimedia applications require a high level of flexibility; in this case the user may interact with the process to obtain a higher-level segmentation, resulting in the extraction of semantically meaningful objects. When the specific application is known a priori, the grouping can be automatic: in a surveillance application, for example, it can be driven by a change detection mask. The change detection technique, implemented at EPFL, performs an efficient motion analysis based on statistical testing and noise modeling.
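As an illustration of the low-level step, the following is a minimal sketch of a spatially constrained Fuzzy C-Means clustering in Python with NumPy. The abstract does not say how the spatial constraint or the adaptive feature weighting is formulated, so this sketch makes its own assumptions: the constraint is modelled by multiplying each pixel's memberships by their 3x3 neighbourhood average, and the feature weights are fixed and supplied by the caller. The function name `spatial_fcm` and all of its parameters are hypothetical and are not taken from the project.

```python
import numpy as np


def spatial_fcm(features, n_clusters=2, m=2.0, n_iter=50, weights=None, seed=0):
    """Cluster an (H, W, D) feature image; return (labels, memberships).

    Assumption: the spatial constraint is a 3x3 membership smoothing,
    which is one common variant and not necessarily the project's own.
    """
    h, w, d = features.shape
    x = features.reshape(-1, d).astype(float)
    if weights is not None:
        # Fixed per-feature weights (the project adapts them locally).
        x = x * np.asarray(weights, dtype=float)

    rng = np.random.default_rng(seed)
    u = rng.random((h * w, n_clusters))
    u /= u.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # Standard FCM centre update with fuzzifier m.
        um = u ** m
        centers = (um.T @ x) / um.sum(axis=0)[:, None]

        # Standard FCM membership update from distances to the centres.
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2) + 1e-9
        u = dist ** (-2.0 / (m - 1.0))
        u /= u.sum(axis=1, keepdims=True)

        # Assumed spatial constraint: favour the clusters that dominate
        # the 3x3 neighbourhood of each pixel.
        u_img = u.reshape(h, w, n_clusters)
        padded = np.pad(u_img, ((1, 1), (1, 1), (0, 0)), mode="edge")
        smooth = sum(padded[i:i + h, j:j + w]
                     for i in range(3) for j in range(3)) / 9.0
        u = (u_img * smooth).reshape(-1, n_clusters)
        u /= u.sum(axis=1, keepdims=True)

    labels = u.argmax(axis=1).reshape(h, w)
    return labels, u.reshape(h, w, n_clusters)
```

In practice the feature image would stack colour channels with motion or other temporal features per pixel, so that a single clustering pass yields regions homogeneous in colour and/or motion.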
In particular, this tool allows objects to be extracted and tracked in video sequences of a generic nature. It enables object-based coding (e.g. MPEG-4) as well as content-based representation (e.g. MPEG-7), and can be used in applications within their scope, such as video editing, interactive video and video database retrieval. Besides these applications, the tool developed in this project can be used in smart cameras, intelligent vision and media conversion problems.
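The automatic grouping mode described in the abstract can be sketched in the same spirit. The abstract states only that the change detection relies on statistical testing and noise modeling, so the code below assumes the simplest such scheme: independent Gaussian camera noise with a known standard deviation and a per-pixel two-sided significance test on the frame difference. The names `change_mask` and `regions_in_object`, the `alpha` level and the `min_fraction` rule are all hypothetical.

```python
import numpy as np
from statistics import NormalDist


def change_mask(prev_frame, curr_frame, noise_sigma, alpha=0.01):
    """Flag pixels whose frame difference is unlikely under the noise model.

    Assumption: each frame carries independent zero-mean Gaussian noise
    of standard deviation noise_sigma, so an unchanged pixel has a
    difference distributed as N(0, 2 * noise_sigma**2).
    """
    diff = curr_frame.astype(float) - prev_frame.astype(float)
    diff_sigma = np.sqrt(2.0) * noise_sigma
    # Two-sided critical value for significance level alpha.
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return np.abs(diff) > z_crit * diff_sigma


def regions_in_object(labels, mask, min_fraction=0.5):
    """Return the region labels whose changed-pixel fraction is high enough."""
    return [int(r) for r in np.unique(labels)
            if mask[labels == r].mean() >= min_fraction]
```

A region map from the low-level segmentation and a mask from `change_mask` are the only inputs needed, so the grouping rule can be swapped for user interaction without touching the segmentation itself.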
|