Research unit
EU RFP
Project number
99.0562
Project title
ASSAVID: Automatic segmentation and semantic annotation of sports videos

Inserted texts


Category / Text
Key words
(English)
Sports video annotation; speech recognition; text detection; text recognition.
Alternative project number
(English)
EU project number: IST-1999-13082
Research programs
(English)
EU programme: 5th Framework Research Programme - 1.2.4 Essential technologies and infrastructures
Short description
(English)
See abstract
Further information
(English)
Full name of research-institution/enterprise:
Institut dalle Molle d'intelligence artificielle perceptive IDIAP

Partners and International Organizations
(English)
Coordinator: University of Surrey (UK)
Abstract
(English)
The usefulness of archived audiovisual material is strongly dependent on the quality of the accompanying annotation. Currently this is a labour-intensive process, which is therefore limited in the amount of detail that can be stored. In particular, in real-time applications (such as live broadcast events) it is unrealistic to add much manual annotation.
The proposed information management system will automatically extract descriptive features, using MPEG-7 descriptors where relevant, and associate these features with a small thesaurus relevant to the subject matter. In this project, the subject matter will be sports events. The features will include video, text, speech and other audio. These features will be associated with the thesaurus by means of a training process. In this way the user will be able to make text-based queries on the audiovisual material, using only the automatically extracted annotation.
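The association between extracted features and thesaurus terms, and the text-based query it enables, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the cue words, the shot identifiers, the function names and the simple co-occurrence count used as a stand-in for the training process, which the abstract does not specify.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Optional, Tuple


def train(examples: Iterable[Tuple[List[str], str]]) -> Dict[str, Counter]:
    """Count how often each extracted cue co-occurs with each thesaurus term."""
    model: Dict[str, Counter] = defaultdict(Counter)
    for cues, term in examples:
        model[term].update(cues)
    return model


def annotate(cues: List[str], model: Dict[str, Counter]) -> Optional[str]:
    """Return the thesaurus term whose training cues best match the shot's cues."""
    scores = {term: sum(counts[c] for c in cues) for term, counts in model.items()}
    best = max(scores, key=scores.get, default=None)
    # No overlap with any term: leave the shot unannotated.
    if best is None or scores[best] == 0:
        return None
    return best


def query(term: str, annotations: Dict[str, Optional[str]]) -> List[str]:
    """Text-based query: list the shots annotated with the given term."""
    return sorted(shot for shot, label in annotations.items() if label == term)


# Hypothetical training examples: (cues recognised in a shot, thesaurus term).
training = [
    (["racket", "serve", "net"], "tennis"),
    (["goal", "penalty", "referee"], "football"),
    (["serve", "ace"], "tennis"),
]

# Hypothetical archive: cues extracted automatically from unlabelled shots.
archive = {
    "shot_001": ["ace", "serve"],
    "shot_002": ["penalty", "goal"],
    "shot_003": ["net", "racket"],
}

model = train(training)
annotations = {shot: annotate(cues, model) for shot, cues in archive.items()}
print(query("tennis", annotations))  # ['shot_001', 'shot_003']
```

A real system would combine speech, text, audio and video cues and use a properly trained classifier; the counting scheme above only shows the shape of the data flow from features to thesaurus terms to queries.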
The aim of the project is to develop techniques for automatic segmentation and semantic annotation of sports videos. Such material typically originates 'live', thus making detailed manual annotation impractical. The level of annotation should be sufficient to enable simple text-based queries. The target will be to segment the material into shots, and to group and classify the shots into semantic categories (type of sport). To do this, the system will extract information from each shot, based on speech and text recognition, and identify the highlights from the audio track and from visual audience reactions. A training system will then be developed that associates these features with a small thesaurus relevant to the subject matter.
The project is based on the construction of several software modules, each extracting a different mode of information (speech, other audio cues, encapsulated text, other video cues). Each module functions fairly independently of the others, so that the development of the modules can proceed in parallel. The video module can be roughly subdivided further into shot detection, spatiotemporal object extraction and mosaicing, and object characterisation and recognition.
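As an illustration of the shot detection step, the sketch below marks a shot boundary wherever the grey-level histograms of two consecutive frames differ strongly. Histogram differencing is a common textbook technique chosen here as an assumption; the abstract does not say which method the project uses. The frame representation (a non-empty flat list of 0-255 pixel values), the bin count, the threshold and the function names are likewise hypothetical.

```python
from typing import List, Sequence


def histogram(frame: Sequence[int], bins: int = 8) -> List[float]:
    """Normalised grey-level histogram of a non-empty frame of 0-255 pixel values."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    total = len(frame)
    return [c / total for c in counts]


def detect_shot_boundaries(
    frames: Sequence[Sequence[int]], threshold: float = 0.5, bins: int = 8
) -> List[int]:
    """Return every frame index at which a new shot is assumed to start."""
    boundaries: List[int] = []
    previous = None
    for index, frame in enumerate(frames):
        current = histogram(frame, bins)
        if previous is not None:
            # L1 distance between consecutive histograms, in the range 0..2.
            distance = sum(abs(a - b) for a, b in zip(previous, current))
            if distance > threshold:
                boundaries.append(index)
        previous = current
    return boundaries


# Toy sequence: three dark frames, two bright frames, one dark frame.
dark = [10] * 16
bright = [240] * 16
frames = [dark, dark, dark, bright, bright, dark]
print(detect_shot_boundaries(frames))  # [3, 5]
```

A fixed threshold catches hard cuts only; gradual transitions such as fades and dissolves would need a different test.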
The output of these modules is fed into a contextual annotation module, which is able to associate the automatically extracted features with an internal lexicon. The output is a text-based summary of the video material. The associative mechanism is trained for the specific application - in this case sports material. A user interface will be developed which will enable the user to browse through the video database, using the automatically extracted text summary to link to the original video material. Special attention will be paid to the flexibility and user-friendliness of this interface. As the software components are integrated into a complete system, attention will be paid to the development of testing methodology, and to an evaluation of the prototype. The direction of this work will be guided strongly by input from the user partners. Recent work on video annotation has highlighted the difficulty of locating sufficient material with adequate ground truth information. Thus the compilation, cataloguing and assessment of suitable test material forms an important component of this part of the work.
References in databases
(English)
Swiss Database: Euro-DB of the
State Secretariat for Education and Research
Hallwylstrasse 4
CH-3003 Berne, Switzerland
Tel. +41 31 322 74 82
Swiss Project-Number: 99.0562