THIS PROJECT IS A CONTINUATION OF TWO PREVIOUS PROJECTS, KNOWN AS THE EAGLES PROJECTS, AND CONTRIBUTES TO THE DEVELOPMENT OF STANDARDS FOR LANGUAGE ENGINEERING INITIATED BY THEM. UNLIKE THE TWO PRECEDING PROJECTS, HOWEVER, THE CURRENT PROJECT IS A JOINT EFFORT BETWEEN EUROPEAN AND AMERICAN PARTNERS.
THE WORK IS CARRIED OUT BY THREE SUB-GROUPS, EACH CONCENTRATING ON A SPECIFIC TOPIC:
· COMPUTATIONAL LEXICONS
· NATURAL INTERACTION AND MULTIMODALITY
· EVALUATION
EACH WORKING GROUP IS CO-CHAIRED BY AN EU AND A US MEMBER, AND IS MADE UP OF MEMBERS FROM BOTH SIDES OF THE ATLANTIC.
SWISS PARTICIPATION CONCERNS ALL THREE GROUPS, AND THE SWISS PARTNER IS ONE OF THE CO-CHAIRS OF THE EVALUATION GROUP. THE WORK CARRIED OUT BY EACH OF THE SUB-GROUPS IN THE FIRST YEAR OF THE PROJECT IS BRIEFLY SUMMARIZED BELOW.
COMPUTATIONAL LEXICONS.
THE COMPUTATIONAL LEXICONS WORKING GROUP CARRIED OUT PREPARATORY WORK WITH A VIEW TO A JOINT MEETING IN MARCH OF 2001. THE TASK OF THE ISSCO GROUP DURING THIS PHASE WAS THE ANALYSIS OF SENSE INDICATORS IN BILINGUAL DICTIONARY ENTRIES AND THEIR CLASSIFICATION ACCORDING TO LEXICOGRAPHICAL REFERENCE TO THE TARGET HEADWORD. THE FOLLOWING SUB-TASKS CONTRIBUTED TO THIS OVERALL AIM:
· EXTRACTION OF THE SENSE INDICATOR WITH THE SOURCE AND TARGET HEADWORDS FROM THE COLLINS GEM DICTIONARY, FOLLOWING THE MODEL BELOW, WHERE IND IS THE SENSE INDICATOR AND SWORD/TWORD ARE THE SOURCE AND TARGET HEADWORDS:
IND=[OF ANIMALS] SWORD=TEAM [N,] TWORD=ATTELAGE
IND=[OF PRICE] SWORD=REDUCTION [N,] TWORD=BAISSE
…
· CLASSIFICATION ON THE BASIS OF THE SYNTAX OF THE SENSE INDICATOR (OF NOUN AGAINST NOUN, NOUN, ADJECTIVE ETC.).
· ANALYSIS OF THE SEMANTIC INFORMATION CONVEYED BY EACH TYPE OF SENSE INDICATOR: SYNTACTIC POSITION OF THE SENSE INDICATOR (FOR EXAMPLE, THE SENSE INDICATOR IS THE COMPLEMENT OF THE HEADWORD), HIERARCHICAL RELATIONSHIPS (MERONYMY, HYPONYMY ETC.), ATTRIBUTES (SUCH AS DOMAIN OR LANGUAGE VARIETY), ADJUNCTIVE RELATIONSHIPS (FOR EXAMPLE, THE SENSE INDICATOR EXPRESSES THE LOCATION, MANNER ETC. OF THE HEADWORD), AND OTHERS (INDICATOR IS ITEMS COLLECTED BY COLLECTIVE NM, INDICATOR IS STATE PRIOR TO EVENT/ACTION ETC.).
· CLASSIFICATION OF THE SENSE INDICATORS ACCORDING TO THE SEMANTIC INFORMATION. FOR EXAMPLE, IN THE TWO RECORDS ABOVE, 'OF ANIMALS' IS CLASSIFIED AS THE ITEMS COLLECTED BY THE COLLECTIVE NOUN AND 'OF PRICE' AS THE COMPLEMENT OF THE NOUN HEADWORD.
· IMPLEMENTATION OF A DATABASE FOR ACCESSING THIS INFORMATION IN DIFFERENT WAYS. (THIS DATABASE WILL SOON BE ACCESSIBLE ON THE WWW.)
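THE EXTRACTION AND SYNTACTIC CLASSIFICATION SUB-TASKS ABOVE CAN BE ILLUSTRATED WITH A MINIMAL SKETCH. IT ASSUMES THAT EACH EXTRACTED RECORD IS ONE LINE IN THE IND/SWORD/TWORD LAYOUT SHOWN ABOVE. THE NAMES SenseRecord, parse_record AND syntactic_class, AS WELL AS THE CLASS LABELS RETURNED, ARE HYPOTHETICAL AND ARE NOT TAKEN FROM THE PROJECT'S OWN TOOLS; THE CLASSIFICATION RULE IS A DELIBERATELY CRUDE SURFACE TEST, NOT THE SCHEME ACTUALLY USED.

```python
import re
from typing import NamedTuple, Optional


class SenseRecord(NamedTuple):
    """One extracted record (hypothetical structure)."""
    indicator: str   # sense indicator, e.g. "OF ANIMALS"
    source: str      # source headword, e.g. "TEAM"
    pos: str         # part-of-speech tag found between the headwords
    target: str      # target headword, e.g. "ATTELAGE"


# Optional spaces around "=" are tolerated, since the sample records vary.
RECORD_RE = re.compile(
    r"IND\s*=\s*\[(?P<ind>[^\]]*)\]\s*"
    r"SWORD\s*=\s*(?P<sword>.+?)\s*"
    r"\[(?P<pos>[^\]]*)\]\s*"
    r"TWORD\s*=\s*(?P<tword>.+)$"
)


def parse_record(line: str) -> Optional[SenseRecord]:
    """Parse one record line; return None if it does not fit the layout."""
    m = RECORD_RE.match(line.strip())
    if m is None:
        return None
    return SenseRecord(
        indicator=m.group("ind").strip(),
        source=m.group("sword").strip(),
        pos=m.group("pos").strip(" ,"),
        target=m.group("tword").strip(),
    )


def syntactic_class(indicator: str) -> str:
    """Assumed, very rough classification by the surface form of the indicator."""
    words = indicator.split()
    if len(words) > 1 and words[0].upper() == "OF":
        return "OF + NOUN"
    if "," in indicator:
        return "LIST"
    return "OTHER"


if __name__ == "__main__":
    for raw in (
        "IND = [OF ANIMALS] SWORD=TEAM [N,] TWORD = ATTELAGE",
        "IND=[OF PRICE] SWORD=REDUCTION [N,] TWORD=BAISSE",
    ):
        record = parse_record(raw)
        print(record, syntactic_class(record.indicator))
```

THE SEMANTIC CLASSIFICATION DESCRIBED IN THE LAST SUB-TASKS REQUIRES LEXICOGRAPHICAL JUDGEMENT AND IS NOT ATTEMPTED IN THIS SKETCH.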
THIS WORK IS ESSENTIAL PREPARATION FOR WP3, DEFINITION OF THE MULTILINGUAL ISLE LEXICON ENTRY, SINCE IT SHOWS WHAT KIND OF INFORMATION LEXICOGRAPHERS USE WHEN DESIGNING MULTILINGUAL DICTIONARIES. ISSUES CONCERNING THE DEFINITION WERE DISCUSSED AT A MEETING IN PISA IN MARCH 2001, AND WILL BE FURTHER DISCUSSED AT A MEETING PLANNED FOR AUGUST 2001 IN PHILADELPHIA.
NATURAL INTERACTION AND MULTIMODALITY.
THE NATURAL INTERACTION AND MULTIMODALITY (NIMM) WORKING GROUP AIMS AT DEVELOPING GUIDELINES AND STANDARDS FOR THE CREATION, ANNOTATION, PUBLICATION AND RETRIEVAL OF MULTIMODAL DATA RESOURCES. WITHIN THIS DOMAIN, THE ISLE PROJECT ALSO INCLUDES CHALLENGING COLLABORATIONS WITH PARTNERS FROM THE USA. THE ISSCO GROUP IS MORE SPECIFICALLY INVOLVED IN THE WORKPACKAGE THAT FOCUSES ON THE DESCRIPTION OF LINGUISTIC AND MULTIMODAL RESOURCES USING A STANDARDIZED DESCRIPTION LANGUAGE (EXAMPLES OF RESOURCES ARE WRITTEN, SPOKEN AND MULTIMODAL CORPORA, LEXICONS, GRAMMARS, ETC.). AT THIS LEVEL, THE MAIN PROBLEM IS TO DEFINE A META-DATA STANDARD THAT IS INFORMATIVE ENOUGH FOR AUTOMATIC SEARCH ENGINES AND TRACTABLE ENOUGH FOR THE USERS WHO TYPE OR EDIT META-DATA FOR THEIR RESOURCES.
A WORKSHOP WAS ORGANIZED INVOLVING PARTICIPANTS IN THE WORKPACKAGE AND EXTERNAL GUESTS IN ORDER TO AGREE ON A META-DATA STANDARD, WHICH IS NOW ALMOST COMPLETED.
THE ISSCO GROUP IS INVOLVED IN THE DEVELOPMENT OF A DATABASE MODEL FOR META-DATA, TOGETHER WITH AN ARCHITECTURE THAT ALLOWS REMOTE QUERYING OF THE META-DATA USING THE DEFINED STANDARD. THE ADAPTATION OF AN OPEN SOURCE SYSTEM TO META-DATA FOR LANGUAGE RESOURCES WAS TESTED, AS WELL AS ITS ABILITY TO ANSWER VARIOUS QUERY PROTOCOLS. THE CONVERSION OF EXISTING META-DATA REPOSITORIES TO THE DESIRED FORMAT IS CURRENTLY UNDER STUDY. ONE OF THE MAIN OBJECTIVES HERE IS THE CAPACITY TO SERVE META-DATA UNDER VARIOUS COMPATIBLE FORMATS, SINCE A MINIMALIST META-DATA STRUCTURE, LESS EXPRESSIVE THAN THE CURRENT ISLE ONE, HAS BEEN ADVOCATED BY SOME PARTICIPANTS.
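THE OBJECTIVE OF SERVING THE SAME META-DATA IN BOTH A RICHER AND A MINIMALIST FORMAT CAN BE SKETCHED AS FOLLOWS. THIS IS ONLY AN ILLUSTRATION UNDER STATED ASSUMPTIONS: THE FIELD NAMES, THE CHOICE OF MINIMAL FIELDS AND THE FUNCTIONS to_rich, to_minimal AND search ARE ALL HYPOTHETICAL AND DO NOT REPRODUCE THE ISLE META-DATA STANDARD OR THE OPEN SOURCE SYSTEM THAT WAS TESTED.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List


@dataclass
class ResourceMetadata:
    """Hypothetical meta-data record for one language resource."""
    title: str
    resource_type: str            # e.g. "CORPUS", "LEXICON", "GRAMMAR"
    languages: List[str]
    modality: str                 # e.g. "WRITTEN", "SPOKEN", "MULTIMODAL"
    annotation_levels: List[str]


# Assumed subset of fields kept in the minimalist format.
MINIMAL_FIELDS = ("title", "resource_type", "languages")


def to_rich(record: ResourceMetadata) -> Dict[str, Any]:
    """Serve every field of the record."""
    return asdict(record)


def to_minimal(record: ResourceMetadata) -> Dict[str, Any]:
    """Serve only the minimal fields, derived from the same stored record."""
    full = asdict(record)
    return {name: full[name] for name in MINIMAL_FIELDS}


def _matches(record: ResourceMetadata, name: str, wanted: Any) -> bool:
    value = getattr(record, name)
    # A list-valued field matches if it contains the wanted value.
    if isinstance(value, list):
        return wanted in value
    return value == wanted


def search(records: List[ResourceMetadata], **criteria: Any) -> List[ResourceMetadata]:
    """Return the records that satisfy every field=value criterion."""
    return [
        r for r in records
        if all(_matches(r, name, wanted) for name, wanted in criteria.items())
    ]


if __name__ == "__main__":
    repository = [
        ResourceMetadata("EXAMPLE CORPUS", "CORPUS", ["FR", "EN"], "SPOKEN", ["TRANSCRIPTION"]),
        ResourceMetadata("EXAMPLE LEXICON", "LEXICON", ["EN"], "WRITTEN", []),
    ]
    for hit in search(repository, languages="FR"):
        print(to_minimal(hit))
```

STORING THE RICHER RECORD AND DERIVING THE MINIMAL ONE FROM IT KEEPS THE TWO FORMATS COMPATIBLE, SINCE THE MINIMAL VIEW CAN NEVER CONTRADICT THE FULL ONE.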
INTEGRATION WITH THE OTHER WORKPACKAGES IN THE NIMM WORKING GROUP, AS WELL AS DISSEMINATION OF THE RESULTS, IS ENSURED THROUGH FREQUENT MEETINGS WITH THE OTHER PARTICIPANTS AND PARTICIPATION IN WORKSHOPS.
EVALUATION.
THE WORK OF THE EVALUATION GROUP IN THE ISLE PROJECT IS CONCERNED WITH DEFINING QUALITY MODELS FOR MACHINE TRANSLATION SYSTEMS. THE QUALITY MODEL TAKES THE FORM OF A STRUCTURED TAXONOMY, WHERE ONE PART OF THE TAXONOMY REFLECTS POTENTIAL USER NEEDS AND A SECOND PART REFLECTS SYSTEM CHARACTERISTICS. METRICS ARE ASSOCIATED WITH SPECIFIC CHARACTERISTICS.
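THE STRUCTURE JUST DESCRIBED, A TAXONOMY IN TWO PARTS WITH METRICS ATTACHED TO SPECIFIC CHARACTERISTICS, CAN BE PICTURED AS A SIMPLE TREE. THE SKETCH BELOW IS AN ASSUMPTION ABOUT ONE POSSIBLE REPRESENTATION, NOT THE PROJECT'S OWN IMPLEMENTATION; THE CLASS Characteristic, THE FUNCTION collect_metrics AND ALL NODE AND METRIC NAMES ARE PLACEHOLDERS INVENTED FOR ILLUSTRATION.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Characteristic:
    """A node of the taxonomy; metrics may be attached to any node."""
    name: str
    metrics: List[str] = field(default_factory=list)
    children: List["Characteristic"] = field(default_factory=list)


def collect_metrics(node: Characteristic, path: Tuple[str, ...] = ()) -> Dict[str, List[str]]:
    """Map the full path of every node that carries metrics to those metrics."""
    here = path + (node.name,)
    found: Dict[str, List[str]] = {}
    if node.metrics:
        found[" / ".join(here)] = list(node.metrics)
    for child in node.children:
        found.update(collect_metrics(child, here))
    return found


# Placeholder content only: the real taxonomy is not reproduced here.
quality_model = Characteristic("QUALITY MODEL", children=[
    Characteristic("USER NEEDS", children=[
        Characteristic("NEED A"),
    ]),
    Characteristic("SYSTEM CHARACTERISTICS", children=[
        Characteristic("CHARACTERISTIC X", metrics=["METRIC 1", "METRIC 2"]),
        Characteristic("CHARACTERISTIC Y", metrics=["METRIC 3"]),
    ]),
])

if __name__ == "__main__":
    for path, metrics in collect_metrics(quality_model).items():
        print(path, "->", ", ".join(metrics))
```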
THE WORK IS PLANNED AROUND A SERIES OF WORKSHOPS, EACH DEVOTED TO HANDS-ON EXPERIMENTATION WITH A DRAFT VERSION OF THE QUALITY MODEL AND ASSOCIATED METRICS.
SOME PRELIMINARY CONCERTATION WORK WAS DONE DURING THE LREC (LANGUAGE RESOURCES AND EVALUATION) CONFERENCE HELD IN ATHENS IN MAY 2000. A FIRST DRAFT OF THE QUALITY MODEL WAS THEN PREPARED BY THE AMERICAN PARTNERS IN TIME FOR THE FIRST PROJECT WORKSHOP, WHICH WAS HELD IN CUERNAVACA, MEXICO, IN OCTOBER OF 2000.
FEEDBACK FROM THE FIRST WORKSHOP LED TO SIGNIFICANT RE-STRUCTURING OF THE QUALITY MODEL, AND TO A RE-IMPLEMENTATION IN XML AS PREPARATION FOR THE SECOND PROJECT WORKSHOP, HELD IN GENEVA IN APRIL OF 2001.
IT IS ANTICIPATED THAT FEEDBACK FROM THAT WORKSHOP WILL FEED INTO A THIRD PROJECT WORKSHOP IN CONJUNCTION WITH THE MACHINE TRANSLATION SUMMIT TO BE HELD IN SPAIN IN SEPTEMBER OF 2001.
A VERSION OF THE QUALITY MODEL CAN BE CONSULTED AT
HTTP://www.ISSCO.UNIGE.CH/PROJECTS/ISLE/EWG.HTML .