

Research institution
COST
Project number
C01.0042
Project title
Robust Spoken and Multi-Modal Communication
Project title (English)
Robust Spoken and Multi-Modal Communication


Recorded texts


Category / Text
Keywords
(English)
Speech recognition; multi-channel and multi-modal processing
Research programmes
(English)
COST-Action 278 - Spoken language interaction in telecommunication
Short description
(English)
See abstract
Further notes and details
(English)
Full name of research institution/enterprise: Institut dalle Molle d'intelligence artificielle perceptive (IDIAP)
Partners and international organisations
(English)
AT, BE, CY, CZ, DK, FI, FR, DE, EL, HU, IT, LT, NL, NO, PT, SK, SI, ES, SE, CH, TR, UK
Abstract
(English)
This project concerns the development of new approaches to multimodal communication systems involving speech and visual processing. In the framework of COST 278, IDIAP mainly investigated new approaches to the processing and combination of non-stationary and non-synchronous streams of data, typically resulting from the joint use of audio and visual information. More specifically, fusion algorithms based on entropy minimization have been further developed (see, e.g., (1)). New forms of hidden Markov models able to deal with correlated asynchronous information sequences have also been developed and were successfully tested on audio-visual speech recognition problems (see, e.g., (2)). Building upon these developments, new forms of truly multimodal user tracking algorithms have been developed, using both visual tracking (based on particle filtering) and audio localization (using microphone arrays, and used to initialize the visual tracker); see, e.g., (3). Finally, some of the resulting algorithms were also used to model multimodal human interaction (involving audio and video features) in meetings. In 2005, in addition to the above, we mainly focused on new approaches to the extraction of information from multimedia meeting collections and on hierarchical multi-channel processing, which further expands the possibilities for advanced multimodal communication (6).
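The audio-initialized visual tracking described above can be illustrated with a minimal bootstrap particle filter. This is a generic sketch, not the project's actual implementation: `init_pos` stands in for a position estimate from microphone-array audio localization, `observations` for per-frame visual position measurements, and all function and parameter names are illustrative assumptions.

```python
import math
import random

def particle_filter_track(observations, init_pos, n_particles=500,
                          motion_std=1.0, obs_std=2.0):
    """Minimal bootstrap particle filter for 1-D position tracking.

    init_pos plays the role of the audio-derived initialization;
    observations are the subsequent visual measurements.
    """
    # Initialize particles around the audio-derived position estimate.
    particles = [random.gauss(init_pos, obs_std) for _ in range(n_particles)]
    estimates = []
    for z in observations:
        # Predict: propagate each particle with a random-walk motion model.
        particles = [p + random.gauss(0.0, motion_std) for p in particles]
        # Weight: Gaussian likelihood of the visual observation z.
        weights = [math.exp(-0.5 * ((z - p) / obs_std) ** 2)
                   for p in particles]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Estimate: posterior mean over the weighted particle set.
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # Resample: draw a new particle set proportional to the weights.
        particles = random.choices(particles, weights=weights, k=n_particles)
    return estimates
```

In the multimodal setting, the key design point is the division of labor: audio localization supplies a coarse global position to seed the particles, after which the visual tracker refines the estimate frame by frame.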
Database references
(English)
Swiss database: COST-DB of the State Secretariat for Education and Research, Hallwylstrasse 4, CH-3003 Berne, Switzerland, Tel. +41 31 322 74 82. Swiss project number: C01.0042