Research unit
EU RFP
Project number
01.0500
Project title
CLEF: Cross-language evaluation forum

Inserted texts


Category / Text
Key words
(English)
Information; Media; Information Processing; Information Systems; Social Aspects
Alternative project number
(English)
EU project number: IST-2000-31002
Research programs
(English)
EU programme: 5th Framework Programme for Research - 1.2.3 Multimedia content and tools
Short description
(English)
See abstract
Further information
(English)
Full name of research institution/enterprise:
Eurospider Information Technology AG

Abstract
(English)
The project aims to support multilingual information access to European digital libraries by providing a platform for the evaluation of monolingual and cross-language information retrieval systems. The technical infrastructure will include an evaluation protocol and metrics and a testbed of multilingual training and testing data. Two evaluation campaigns will be organised, and the results will be analysed and discussed at annual workshops. Collaborative links will be established with similar system evaluation initiatives in the US and Asia that work on other sets of languages. The end-product will be test-suites of multilingual data that can be used by system developers for benchmarking purposes. The goal is to assist European cross-language system development in order to guarantee its competitiveness on the global market.

Objectives:
The objective of the CLEF proposal is to support global digital library applications by:
(i) developing an infrastructure for the evaluation, testing and tuning of information retrieval systems operating on European languages in both monolingual and cross-language contexts, and
(ii) creating test-suites of reusable data which can be employed by system developers for benchmarking purposes.

Through the organisation of system evaluation campaigns, the aim is to create a community of researchers and developers studying the same problems and to facilitate future collaborative initiatives between groups with similar interests. CLEF will also establish strong links, exchanging ideas and sharing results, with similar cross-language evaluation initiatives in the US and Asia, working on other sets of languages. The final goal is to assist and stimulate the development of European cross-language retrieval systems in order to guarantee their competitiveness on the global marketplace.

Work description:
The CLEF work schedule is planned in three stages: creation of the evaluation framework; organisation of two successive annual evaluation campaigns; formulation of exploitation and exit plans, and recommendations for future evaluation actions. The activity will begin with a survey of the needs of system developers and end users. The results will provide input for the final definition of the evaluation framework and the specific tasks to be offered by the CLEF campaigns.

The tasks offered will include:
monolingual (non-English) information retrieval system evaluation;
cross-language text retrieval evaluation;
cross-language domain-specific evaluation.

A second survey will be performed at the end of the first campaign to identify new and emerging requirements.

A technical infrastructure will be implemented supporting:
an evaluation protocol and metrics;
a testbed of multilingual training and testing data;
rules to create multiple language sets of queries;
procedures to assess the runs submitted by participating systems;
procedures to generate precision and recall measures and produce comparable results;
a discussion forum.
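
As an illustration of the precision and recall measures mentioned in the list above, the following is a minimal sketch in Python. It is not the CLEF evaluation software: the function names, the document identifiers and the sample relevance judgements are all hypothetical, and the sketch assumes a run is a ranked list of unique document identifiers for a single query.

```python
from typing import Sequence, Set, Tuple


def precision_recall(ranked: Sequence[str], relevant: Set[str]) -> Tuple[float, float]:
    """Set-based precision and recall for one query.

    Assumes `ranked` holds unique document identifiers (hypothetical format).
    """
    retrieved = set(ranked)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Non-interpolated average precision for one query.

    Sums the precision at each rank where a relevant document appears,
    then divides by the total number of relevant documents.
    """
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


# Hypothetical run and relevance judgements for a single query.
run = ["d3", "d7", "d1", "d9"]
qrels = {"d3", "d1", "d5"}

p, r = precision_recall(run, qrels)
ap = average_precision(run, qrels)
print(f"precision={p:.3f} recall={r:.3f} average_precision={ap:.3f}")
```

Averaging such per-query values over all queries in a test-suite is one common way to obtain a single figure per system, which is what makes results from different participants comparable.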

The core set of languages in the multilingual collection will be English, French, German, Italian and Spanish; criteria will be defined for the addition of other European languages during the project lifetime. Two evaluation campaigns will be organised with the aim of testing different types of mono- and cross-language system issues. The results will be discussed at annual workshops. The end-product will be reusable electronic resources in the form of test-suites that can be used by system developers for future benchmarking activities. Distribution agreements will be negotiated with the data providers and an exploitation plan will be studied to make the test-suites generally available to the interested R&D community. An exit plan proposing mechanisms that could render future evaluation activities self-sustaining will be formulated.

Milestones:
User needs reports;
An evaluation infrastructure for mono- and cross-language information retrieval systems operating on European languages;
Multilingual comparable corpora;
Testing and training data;
Two system evaluation campaigns;
Annual workshops with Proceedings;
Test-suites for system developers;
Exploitation and exit plans.
References in databases
(English)
Swiss Database: Euro-DB of the
State Secretariat for Education and Research
Hallwylstrasse 4
CH-3003 Berne, Switzerland
Tel. +41 31 322 74 82
Swiss Project-Number: 01.0500