Abstract
(English)
The key objective of the CARROUSO project is to provide a new technology that makes it possible to transfer a sound field, generated in a real or virtual space, to another, usually remote, space. This includes full interactive control of the relevant temporal, spatial, and perceptual properties of the sound field, especially in combination with visual objects. CARROUSO is based on the synergy of two new and powerful technologies:

- the flexible transmission standard MPEG-4, offering object-oriented coding and interactive manipulation of 3D audio;
- the wave field synthesis (WFS) rendering technique, which can produce a true sonic space rather than its stereophonic representation.

During the first year, three main objectives were achieved or concretely outlined: the functional architecture of the CARROUSO system was defined, several MPEG-4 audio decoders were integrated into an existing MPEG-4 systems player, and guidelines for dissemination and exploitation were set. During the second year the CARROUSO consortium focused on two main technical objectives. First, user interaction was added to the MPEG-4 audio/systems player, so that graphical user interfaces can be included in the same bitstream as the sounds and can control them in a standardized, compliant way. Furthermore, network capabilities were added to the same player and an encoding utility was implemented; together these provide a complete multimedia chain from content coding to media playback, which was demonstrated for the first time at the IEEE International Conference on Multimedia and Expo (ICME 2002) in Lausanne. Second, research and implementation continued on audio recording with source tracking, on room modeling, and on wave field synthesis rendering technology.
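The core idea of WFS rendering mentioned above is to drive a loudspeaker array so that it reconstructs the wavefront of a virtual source. As a rough illustration only (the names, geometry, and the simple delay-and-attenuate model below are assumptions, not the project's actual driving functions), per-loudspeaker delays and gains for a virtual point source can be sketched like this:

```python
import math

def wfs_delays_and_gains(source_pos, speaker_positions, c=343.0):
    """Toy stand-in for a WFS driving function: each loudspeaker gets
    a propagation delay and a 1/r amplitude weight based on its
    distance to the virtual point source. (Illustrative sketch only;
    real WFS driving functions include additional filtering terms.)"""
    delays, gains = [], []
    for sx, sy in speaker_positions:
        r = math.hypot(sx - source_pos[0], sy - source_pos[1])
        delays.append(r / c)               # propagation delay in seconds
        gains.append(1.0 / max(r, 1e-6))   # spherical spreading loss
    return delays, gains

# Hypothetical setup: 8-speaker linear array with 0.2 m spacing,
# virtual source 1 m behind the left end of the array.
speakers = [(i * 0.2, 0.0) for i in range(8)]
delays, gains = wfs_delays_and_gains((0.0, -1.0), speakers)
```

Speakers farther from the virtual source receive longer delays and smaller gains, which is what lets the array approximate the curved wavefront of the source.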
Dry sources can be recorded with close microphones and composed with structured, physical, or perceptual room models, while the same sources can be tracked by arrays of microphones. On the reproduction side, the sounds are convolved with impulse responses derived from the room models and source positions, in order to recreate an immersive sound field with several degrees of interaction exposed to the end user. A prototype flat-panel WFS system was demonstrated at the 112th Audio Engineering Society (AES) convention in Munich. The next, and almost final, steps will be to integrate the object-oriented recording toolset with the encoding utility, and the MPEG-4 player with the rendering devices. Several papers were published in 2002, and two special sessions at AES and IEEE conferences were organized.
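The convolution step described above, in which a dry source signal is combined with a room impulse response to produce the reproduced sound, can be sketched minimally as follows. This is a generic discrete convolution, not the project's implementation, and the example impulse response values are invented for illustration:

```python
def auralize(dry_signal, impulse_response):
    """Convolve a dry (close-miked) source signal with a room impulse
    response, e.g. one derived from a room model and a source position.
    Plain direct-form convolution; real systems would use fast
    (FFT-based) partitioned convolution for long responses."""
    out = [0.0] * (len(dry_signal) + len(impulse_response) - 1)
    for i, x in enumerate(dry_signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += x * h
    return out

# Toy example: a unit impulse as the dry source, and a hypothetical
# impulse response with a direct path plus one attenuated echo.
dry = [1.0, 0.0, 0.0, 0.0]
ir = [1.0, 0.0, 0.5]
wet = auralize(dry, ir)  # -> [1.0, 0.0, 0.5, 0.0, 0.0, 0.0]
```

Feeding a unit impulse through the chain simply reproduces the impulse response itself (padded to the output length), which is a convenient sanity check for any convolution-based auralization stage.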