University of Twente

Outline of TWLT14 poster presentations


Monday, December 7, 15.00 - 16.00

Evaluation of an automatic abstracting system

Michael P. Oakes and Chris D. Paice
Computing Department, Lancaster University, Lancaster LA1 1YR, England

The Concept-Based Abstracting system reduces the full text of an academic article to a list of domain-specific roles and suitable fillers. In the domain of agriculture, a role might be SPECIES and its fillers maize and soybean. In this paper we use the measures of strict and permissive recall and precision to compare the machine-generated list of roles and fillers with the corresponding human-selected lists of ideal roles and fillers.
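
To make the evaluation measures concrete, here is a minimal sketch (not the authors' implementation) that treats each output as a set of (role, filler) pairs. Strict scoring requires exact matches; the permissive variant shown here assumes, for illustration, that a filler also counts when it contains or is contained in an ideal filler.

```python
def strict_scores(machine, ideal):
    # machine, ideal: sets of (role, filler) pairs
    hits = machine & ideal
    recall = len(hits) / len(ideal) if ideal else 0.0
    precision = len(hits) / len(machine) if machine else 0.0
    return recall, precision

def permissive_scores(machine, ideal):
    # Assumed interpretation of "permissive": roles must match exactly,
    # but fillers count when one is a substring of the other.
    def match(m, i):
        return m[0] == i[0] and (m[1] in i[1] or i[1] in m[1])
    rec_hits = sum(1 for i in ideal if any(match(m, i) for m in machine))
    prec_hits = sum(1 for m in machine if any(match(m, i) for i in ideal))
    recall = rec_hits / len(ideal) if ideal else 0.0
    precision = prec_hits / len(machine) if machine else 0.0
    return recall, precision

# Invented example data in the spirit of the agriculture domain above.
machine = {("SPECIES", "maize"), ("SPECIES", "wild soybean"), ("AGENT", "drought")}
ideal = {("SPECIES", "maize"), ("SPECIES", "soybean"), ("LOCATION", "Kenya")}
```

On this toy data, strict recall and precision are both 1/3, while the permissive variant credits "wild soybean" against "soybean" and yields 2/3 for each.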

Towards Automatic Indexing and Retrieval of Video Content: the VICAR system

Marten den Uyl, Sentient Machine Research, The Netherlands
Ed S. Tan, Word and Image Studies, Vrije Universiteit Amsterdam, The Netherlands
Heimo Müller and Peter Uray, Joanneum Research, Austria

The VICAR system addresses the cataloguing, annotation, and retrieval of video content. Four functionalities are discussed: indexing, interpretation, interrogation and instruction. A brief glimpse is offered at the system architecture, and some of its components are reviewed, including image processing, feature abstraction and indexing. Examples of the system's main use in the context of television archives are given. It is concluded that VICAR's technology may be fruitfully combined with language technology in order to enhance its performance, while the system as it is may be of help in handling mixed-media databases, such as ones containing conversation and speech in video.

Sumatra: A system for Automatic Summary Generation

Danny Lie, Carp Technologies, The Netherlands

This paper describes Sumatra, a system for automatic summary generation. It differs from other systems in being domain independent and in using a Natural Language Processing approach, involving parsing, semantic analysis and text generation, instead of relying on statistical techniques. The system has been evaluated on final-exam summarization texts from Dutch grammar schools. The main conclusion is that Sumatra is adequately capable of extracting the important information elements from a text.

Access, Exploration and Visualization of Interest Communities: The VMC Case Study (in Progress)

Anton Nijholt, Centre for Telematics and Information Technology, University of Twente, The Netherlands

This paper discusses a virtual world for representing information and natural interactions about performances in an existing theatre. Apart from mouse and keyboard input, interactions take place using speech and language. It is shown how this virtual environment can be considered an interest community, and what further research and development is required to obtain an environment where visitors can retrieve information about artists, authors and performances, discuss performances with others, and be provided with information and contacts in accordance with their preferences.

MULINEX: Multilingual Web Search and Navigation

Joanne Capstick, Abdel Kader Diagne, Gregor Erbach and Hans Uszkoreit, German Research Center for Artificial Intelligence - Language Technology Lab, Saarbrücken, Germany

MULINEX is a fully implemented multilingual search and navigation system for the WWW. The system allows users to search and navigate multilingual document collections using only their native language to formulate, expand and disambiguate queries, navigate the result set and read the retrieved documents. This multilingual functionality is achieved by the use of dictionary-based query translation, multilingual document categorisation and automatic translation of summaries and documents. The system has been installed in the online services of two Internet content and service provider companies.
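
As an illustration of dictionary-based query translation in general (the dictionary entries and function below are invented, not MULINEX's actual resources), each source-language query term can be expanded to all its candidate translations, with ambiguous terms yielding alternatives for the user to disambiguate:

```python
# Toy bilingual dictionary, German -> English; entries may be ambiguous.
BILINGUAL_DICT = {
    "bank": ["bank", "bench"],
    "gericht": ["court", "dish"],
    "suchen": ["search", "seek"],
}

def translate_query(terms):
    """Expand each query term to its candidate translations.

    Unknown terms (e.g. proper names) are passed through unchanged, so
    the user can still search for them verbatim.
    """
    return {term: BILINGUAL_DICT.get(term.lower(), [term]) for term in terms}

print(translate_query(["Gericht", "suchen"]))
# → {'Gericht': ['court', 'dish'], 'suchen': ['search', 'seek']}
```

In a full system the alternatives would feed query expansion and disambiguation dialogues rather than being shown raw, but the lookup step is essentially this.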

OLIVE: speech-based video retrieval

Klaus Netter and Franciska de Jong, Language Technology Lab, German Research Center for Artificial Intelligence (DFKI GmbH), and University of Twente

This paper describes the OLIVE project, which aims to meet the needs of video archives by supporting the automated indexing of video material on the basis of human language processing. OLIVE develops speech recognition to automatically derive transcripts of the sound track, thus generating time-coded linguistic elements which form the basis for text-based retrieval functionality.
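
The idea of time-coded transcripts as a retrieval index can be sketched as follows (the data model and function names are assumptions for illustration, not OLIVE's actual format): a text query over the transcript segments returns the time codes at which matching speech occurs, which can then be used to jump into the video.

```python
# A speech transcript as (start, end, text) segments, times in seconds.
transcript = [
    (0.0, 4.2, "good evening and welcome to the news"),
    (4.2, 9.8, "floods in the south of the country"),
    (9.8, 15.0, "the floods forced thousands to evacuate"),
]

def find_segments(query, segments):
    """Return the (start, end) time codes of segments mentioning the query."""
    q = query.lower()
    return [(start, end) for start, end, text in segments if q in text.lower()]

print(find_segments("floods", transcript))  # → [(4.2, 9.8), (9.8, 15.0)]
```

Real systems would add tokenization, stemming and ranking on top, but the time codes are what turn a text hit into a playable video fragment.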

Twenty-One: a baseline for multilingual multimedia retrieval

Franciska de Jong, University of Twente

In this paper we give a short overview of the ideas underpinning the demonstrator developed within the EU-funded project Twenty-One; this system provides for the disclosure of information in a heterogeneous document environment that includes documents of different types and languages. As part of the off-line document processing integrated in the system, noun phrases are extracted to build a phrase-based index. These phrases are the starting point for the generation of both a fuzzy phrase index and the translation step needed to realize cross-language retrieval functionality.
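
A toy sketch of the two steps named above (assumed for illustration, not the Twenty-One implementation): extract noun phrases as maximal adjective/noun runs from POS-tagged text, then build a character-trigram "fuzzy" index over the phrases so that near-matches of a query phrase can still be found.

```python
def extract_nps(tagged):
    """Collect maximal runs of adjectives/nouns that end in a noun."""
    nps, run = [], []
    for word, tag in tagged + [("", "END")]:
        if tag in ("ADJ", "NOUN"):
            run.append((word, tag))
        else:
            while run and run[-1][1] != "NOUN":
                run.pop()          # drop a trailing adjective with no head noun
            if run:
                nps.append(" ".join(w for w, _ in run))
            run = []
    return nps

def trigrams(phrase):
    s = f"  {phrase.lower()} "      # pad so word boundaries yield trigrams
    return {s[i:i + 3] for i in range(len(s) - 2)}

def fuzzy_match(query, index_phrases, threshold=0.5):
    """Rank index phrases by Dice overlap of character trigrams."""
    q = trigrams(query)
    hits = []
    for p in index_phrases:
        t = trigrams(p)
        score = 2 * len(q & t) / (len(q) + len(t))
        if score >= threshold:
            hits.append((p, round(score, 2)))
    return sorted(hits, key=lambda x: -x[1])

tagged = [("multilingual", "ADJ"), ("retrieval", "NOUN"), ("of", "PREP"),
          ("heterogeneous", "ADJ"), ("documents", "NOUN")]
nps = extract_nps(tagged)  # ['multilingual retrieval', 'heterogeneous documents']
```

The trigram index is what makes the phrase index "fuzzy": a misspelled query such as "heterogenous documents" still retrieves the correctly spelled index phrase.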

CONDORCET: Combining Linguistic and Knowledge-based Engineering for Information Retrieval and Information Extraction

Paul van der Vet and Bas van Bakel, University of Twente

This poster presents Condorcet, a domain-specific prototype indexing system for tens of thousands of documents covering two scientific domains: engineering ceramics and epilepsy. The development corpus consists of 400 documents taken from one-year volumes of two scientific journals.

Condorcet takes a controlled-term approach to indexing: the title and abstract of a document are mapped to concepts and relations defined in modern versions of classical thesauri, i.e. structured ontologies. The indexing process makes intensive use of linguistic knowledge. The poster focuses on the linguistic principles that form the conceptual basis of the indexing process.
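
In the simplest form, controlled-term indexing maps surface phrases in the text to thesaurus concepts; the minimal sketch below uses invented entries in the engineering-ceramics domain (they are not Condorcet's actual resources, and Condorcet's linguistic analysis goes well beyond string matching):

```python
# Invented thesaurus fragment: surface term -> concept identifier.
THESAURUS = {
    "silicon nitride": "CERAMIC_MATERIAL/Si3N4",
    "fracture toughness": "MECHANICAL_PROPERTY/toughness",
    "sintering": "PROCESS/sintering",
}

def index_text(text):
    """Return the controlled concepts whose terms occur in the text."""
    t = text.lower()
    return sorted(concept for term, concept in THESAURUS.items() if term in t)

abstract = "Sintering conditions and fracture toughness of silicon nitride."
```

The point of the controlled vocabulary is that documents are retrieved by concept identifier rather than by surface wording, so variant phrasings map to the same index entry.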

Pop-Eye: language technology for video retrieval

Wim van Bruxvoort, VDA informatiebeheersing, The Netherlands

The Pop-Eye project is building a demonstrator of a multilingual film and video indexing system. Pop-Eye uses natural language processing to index and partially translate text captions that were used to subtitle audio-visual programmes. These multilingual indexes can then be used within broadcasting companies, across the internet, or via an intranet, to help producers to locate and retrieve video fragments to use in new television productions.

Pop-Eye is funded by the European Commission Telematics Applications Programme, Sector Language Engineering.
