Partners and International Organisations
(English)
|
AT, BE, CY, CZ, DK, FI, FR, DE, EL, HU, IT, LT, NL, NO, PT, SK, SI, ES, SE, CH, TR, UK
|
Abstract
(English)
|
This project concerns the development of new approaches to multimodal communication systems involving speech and visual processing. In the framework of COST 278, IDIAP mainly investigated new approaches to the processing and combination of non-stationary and non-synchronous data streams, typically resulting from the joint use of audio and visual information. More specifically, fusion algorithms based on entropy minimization have been further developed (see, e.g., (1)). New forms of hidden Markov models able to deal with correlated asynchronous information sequences have also been developed and were successfully tested on audio-visual speech recognition problems (see, e.g., (2)). Building upon these developments, new forms of truly multimodal user tracking algorithms have been developed, combining visual tracking (based on particle filtering) with audio localization (using microphone arrays, and used to initialize the visual tracker); see, e.g., (3). Finally, some of the resulting algorithms were also used to model multimodal human interaction (involving audio and video features) in meetings. In 2005, in addition to the above, we mainly focused on new approaches to the extraction of information from multimedia meeting collections and on hierarchical multi-channel processing, which further expands the possibilities for advanced multimodal communication (6).
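To give a flavour of the entropy-based fusion idea mentioned above, the following is a minimal illustrative sketch, not IDIAP's actual algorithm: each stream's class posterior is weighted inversely to its Shannon entropy, so a confident (low-entropy) stream dominates the combined decision. The function names and the example posteriors are hypothetical.

```python
import numpy as np

def entropy(p):
    # Shannon entropy (natural log) of a posterior distribution.
    p = np.clip(p, 1e-12, 1.0)
    return -np.sum(p * np.log(p))

def entropy_weighted_fusion(audio_post, video_post):
    # Weight each stream inversely to its posterior entropy:
    # the more confident stream contributes more to the fused posterior.
    w_a = 1.0 / (entropy(audio_post) + 1e-12)
    w_v = 1.0 / (entropy(video_post) + 1e-12)
    fused = w_a * audio_post + w_v * video_post
    return fused / fused.sum()

# Hypothetical example: a confident audio stream, a noisy video stream.
audio = np.array([0.8, 0.1, 0.1])
video = np.array([0.4, 0.3, 0.3])
fused = entropy_weighted_fusion(audio, video)
```

In this toy case the fused posterior leans towards the audio stream's decision, since its lower entropy earns it a larger weight.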
|