Research unit
EU Framework Programme (PCRD EU)
Project number
97.0495
Project title
EUROSEARCH: Multilingual European federated search service
Project title (English)
EUROSEARCH: Multilingual European federated search service
Entered texts
Category
Text
Keywords
(English)
Cross-language information retrieval; federated search engine; automatic categorization; multilingual web search
Other project number
(English)
EU project number: -LE-8303
Research programme
(English)
EU programme: 4th Framework Research Programme - 1.1 Information technologies
Short description
(English)
See abstract
Other information
(English)
Full name of research-institution/enterprise:
Eurospider Information Technology AG
Partners and international organisations
(English)
Italia Online (I) (coordinator), CNR (I), CINET (S), Universität Dortmund (D), Gruner + Jahr EMS (D) (admission applied for)
Summary of results (Abstract)
(English)
The Eurosearch project focused on two main areas: the linguistic area, covering 'cross-language (multilingual) web search' among federated search engines, and the automatic categorization area for the Italian and German web document domains. The consortium produced a number of working prototypes and also created new online web services for automatic categorization on the Arianna (IOL) and Fireball (EMS) search engines as a direct exploitation of the project results.
In the linguistic area, three sub-prototypes for Italian-English, Italian-Spanish (Lexicon-based) and Italian-German (Similarity Thesaurus based) query translations have been implemented and integrated in the 'Final Prototype of integrated Multilingual Services'.
The prototypes made it possible to realise the concept of federated search engines. They were implemented and tested on the Arianna and Eurospider search engines for the Italian and German web document domains and allow bi-directional query translation. An open cross-language architecture was developed and successfully implemented for Eurosearch. This architecture is flexible enough to interact with multiple types and structures of search engines (Arianna, Eurospider, Altavista and Trovator) and to operate on different domains and with different indexing methods and query syntaxes. Different technologies were integrated, developed and tested using the cross-language web search prototypes, such as lexicon-based, corpus-based and similarity-thesaurus-based translation.
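The idea of one architecture talking to engines with different query syntaxes can be sketched roughly as follows. This is only an illustration under stated assumptions: the `EngineAdapter` class, the two engines, their query syntaxes and their canned results are all invented here and do not describe the actual Eurosearch architecture or any real search engine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EngineAdapter:
    """Hypothetical wrapper around one search engine."""
    name: str
    format_query: Callable[[List[str]], str]  # engine-specific query syntax
    search: Callable[[str], List[str]]        # returns document identifiers


def federated_search(terms: List[str], adapters: List[EngineAdapter]) -> Dict[str, List[str]]:
    """Send the same terms to every engine, each in its own syntax."""
    results: Dict[str, List[str]] = {}
    for adapter in adapters:
        query = adapter.format_query(terms)
        results[adapter.name] = adapter.search(query)
    return results


# Two made-up engines with different query syntaxes and canned results.
engine_a = EngineAdapter("engine_a", lambda t: " AND ".join(t), lambda q: [f"a:{q}"])
engine_b = EngineAdapter("engine_b", lambda t: "+" + " +".join(t), lambda q: [f"b:{q}"])

hits = federated_search(["web", "suche"], [engine_a, engine_b])
```

Keeping the syntax conversion inside each adapter means the translation step only has to produce a list of terms, whatever engine receives them.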
The Similarity Thesaurus-based technology uses a data structure containing lists of terms based on their similarity. For use in Eurosearch, where the query has to be translated, multilingual similarity thesauri are employed. The multilingual variant connects words in the source language to similar terms in the target language. It has been extensively tested using TREC-style methods with encouraging results.
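A minimal sketch of query translation with a multilingual similarity thesaurus might look like this. The thesaurus entries, the similarity scores and the function name `translate_query` are assumptions made for illustration; they are not taken from the project, and the way scores are accumulated is only one plausible choice.

```python
from collections import defaultdict

# Hypothetical Italian -> German similarity thesaurus: each source term maps
# to target-language terms with a similarity score (all values are made up).
THESAURUS = {
    "ricerca": [("suche", 0.82), ("forschung", 0.74), ("recherche", 0.31)],
    "rete": [("netz", 0.79), ("netzwerk", 0.66)],
}


def translate_query(terms, thesaurus, top_k=2):
    """Return target-language terms ranked by accumulated similarity."""
    scores = defaultdict(float)
    for term in terms:
        # Keep only the top_k most similar target terms per source term.
        candidates = sorted(thesaurus.get(term, []), key=lambda p: p[1], reverse=True)
        for target, sim in candidates[:top_k]:
            scores[target] += sim
    # Sort by descending score, then alphabetically for a stable order.
    return sorted(scores.items(), key=lambda p: (-p[1], p[0]))


translated = translate_query(["ricerca", "rete"], THESAURUS)
```

Source terms missing from the thesaurus simply contribute nothing, so an unknown query yields an empty translation rather than an error.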
Implementing the Lexicon-based technology required setting up a multilingual lexicon consisting of a set of bilingual Italian/English and Spanish/English dictionaries, with procedures that map between the English datasets in the different dictionaries, since English is used as an intermediate language to go from Italian to Spanish.
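The use of English as an intermediate language can be illustrated with a small sketch. The dictionary entries and the helper names `invert` and `pivot_translate` are hypothetical; the real lexicon and its mapping procedures are not described in enough detail here to reproduce them.

```python
# Hypothetical bilingual dictionaries; entries are illustrative only.
IT_EN = {"casa": ["house", "home"], "libro": ["book"]}
ES_EN = {"casa": ["house"], "hogar": ["home"], "libro": ["book"]}


def invert(dictionary):
    """Build an English -> source-language index from a source -> English dictionary."""
    inverted = {}
    for word, glosses in dictionary.items():
        for gloss in glosses:
            inverted.setdefault(gloss, []).append(word)
    return inverted


def pivot_translate(word, source_to_en, en_to_target):
    """Translate via English: source word -> English glosses -> target words."""
    results = []
    for gloss in source_to_en.get(word, []):
        for target in en_to_target.get(gloss, []):
            if target not in results:  # preserve order, drop duplicates
                results.append(target)
    return results


EN_ES = invert(ES_EN)
spanish = pivot_translate("casa", IT_EN, EN_ES)
```

In this toy data the Italian word "casa" reaches both "casa" and "hogar" in Spanish, which shows how pivoting through English can widen the set of candidate translations.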
Corpus-based enhancements to the lexicon-based technology have been introduced by developing an experimental prototype that uses data extracted from document archives consisting of comparable corpora to expand queries with a vocabulary of related terms.
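Query expansion with related terms drawn from a document archive can be sketched as below. This is a simplified stand-in that assumes plain co-occurrence counting over a tiny monolingual document list; the documents, the function name `expand_query` and the ranking rule are invented, and the experimental prototype may have worked quite differently.

```python
from collections import Counter

# Toy document archive; a stand-in for the comparable corpora.
DOCS = [
    "search engine index web",
    "web search query translation",
    "query translation lexicon",
]


def expand_query(query_terms, docs, max_new=2):
    """Add the terms that most often co-occur with the query terms."""
    query = set(query_terms)
    counts = Counter()
    for doc in docs:
        tokens = set(doc.split())
        if tokens & query:  # document mentions at least one query term
            counts.update(tokens - query)
    # Rank by co-occurrence count, ties broken alphabetically.
    ranked = sorted(counts.items(), key=lambda p: (-p[1], p[0]))
    return list(query_terms) + [term for term, _ in ranked[:max_new]]


expanded = expand_query(["translation"], DOCS)
```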
In the area of automatic categorization of web documents (Catalogue Generator) two prototypes capable of identifying relevant web sites and extracting a summary of their content have been implemented for the Italian and German web document domains. The automatic categorization of web documents is based on a probabilistic description-oriented representation of web documents. The automatic categorization prototype has been integrated into Arianna as an online service and later on ported to the German web domain through Fireball by EMS.
Database references
(English)
Swiss Database: Euro-DB of the
State Secretariat for Education and Research
Hallwylstrasse 4
CH-3003 Berne, Switzerland
Tel. +41 31 322 74 82
Swiss Project-Number: 97.0495