next top
The Computation of
Movement
Sandiway Fong
NEC Research Institute, Inc.
4 Independence Way, Princeton NJ 08540.
USA
sandiway@research.nj.nec.com
Abstract
A central goal of parsing is to recover
linguistic structure for interpretation. One property of language
that seems to be prevalent is the so-called displacement
property. That is, syntactic items commonly appear in places
other than where we would normally expect for interpretation.
Some examples of phenomena involving displacement include
Wh-movement, raising, passivization, scrambling, topicalization
and focus. As Chomsky (1995) points out, displacement is an
irreducible fact about human language that every contemporary
theory of language has to address. In the
principles-and-parameters framework, it is customary to posit a
general movement operation, Move-a , that in concert with
conditions on its application serve to link displaced elements
with their base positions. In terms of parsing, the task is to
decode or unravel the effects of Move-a from the surface order.
More specifically, for each element, we have to determine whether
that element has been displaced or not, and, if so, determine the
original position it was displaced from and reconstruct the path
it took - including any intermediate positions or landing sites.
In general, each displaced element is said to head a
(non-trivial) chain with one or more empty categories known as
traces occupying the positions that it passed through. Note that
in such theories, empty categories are not just simple
placeholders, but elements with much of the same type and range
of syntactic properties displayed by their overt counterparts.
For example, empty categories in argument positions, like
anaphors and pronouns, participate in binding theory and theta
role discharge. Hence, the well-formedness of a given sentence
will depend, in general. in recovering both the visible and
non-visible parts of syntactic structure.
In this talk, we will describe how PAPPI, a
multi-lingual parser for theories in the
principles-and-parameters frameworks, deals with the computation
of movement chains and empty categories in general. Drawing from
implemented examples across a variety of languages, we will
discuss the mechanism used to handle standard cases of phrasal
movement commonly discussed in the literature such as
Wh-movement, passivization. raising and verb second (V2)
phemomena. We will also describe how this mechanism is adapted to
handle instances of argument scrambling in languages like Korean
and Japanese. We will also focus our attention on head movement.
Here, following Pollock (1989), we will discuss the mechanism
used to handle the surface differences in the behaviour of verbal
inflection in English and French. Following Pesetsky (1995), we
will also discuss the implementation of a theory of double object
constructions involving the incorporation of both overt and
non-overt prepositions into verbal heads.
Finally, we will describe two recent additions
to the movement mechanism in the PAPPI system. Moving towards a
theory of goal-driven movement - as opposed to the free movement
system implied bv Move-a , we will discuss an implementation of Case-driven
movement within the VP-shell to handle examples involving focus,
backgrounding and topicalization in Turkish. Finally, using
examples from English and Turkish, we will discuss the necessity
of a mechanism of reconstruction that optionally
"undoes" or reverses the effects of movement to handle
facts involving binding and scope.
previous next top
Parsing Technology and RNA
Folding:
a Promising Start
Fabrice Lefebvre
LIX, Ecole Polytechnique91128
Palaiseau Cedex, FRANCE
lefebvre@lix.polytechnique.fr
Abstract
The determination of the secondary structure of
RNAs is a problem which has been tackled bv distantly related
methods ranging from comparative analysis to thermodynamic energy
optimization or stochastic context-free grammars (SCFGs). Because
of its very nature (properly nested pairs of bases of a single
stranded sequence) the secondary structure of RNAs is well
modeled by context-free grammars (CFGs). This fact has been
recognized several years ago by people who used context-free
grammars as a tool to discover some combinatorial properties of
secondary structures. More recently SCFGs were used bv several
teams (esp. David Haussler's team at UC Santa Cruz) as an
effective tool to fold RNAs through Cocke-Younger-Kasami-Iike
parsers. Until 1996, and in the context of RNA folding, CFGs and
their derivatives where still considered theoretical tools,
barely usable outside the computer scientist lab. The exception
of SCFGs seemed promising, with all the hype around Hidden Markov
Models and other stochastic methods, but it remained to be
confirmed for RNAs longer than 200 bases.
The main obstacle to the use of context-free
grammars and parsing technology for RNA folding and other closely
related problems is the following : suitable grammars are
exponentially, ambiguous, and sentences to parse (i.e. RNA or DNA
sequences) typically have more than 200 words. and sometimes more
than 4000 words. These figures are rather unusual for ordinary
parsers or parser generators, because they are mostly, used in
the context of natural language parsing., and thus do not have to
face the same computation problems. Fact is, most people dealing
with RNA folding problems were manually writing dynamic
programming based tools. This was the case for folding models
popularized by Michael Zuker, and based on free
energy-mininimization. This was also the case for folding models
based on SCFGs. This was in effect the case for just about every
computer method available to fold or align sequenees. Parsing
sequences was not an issue because it simply seemed too slow, too
memory hungry and even unrelated.
In 1995, I showed that S-attribute grammars
were perfectly able to handle both the thermodynamic model and
the stochastic model of RNA folding. I then introduced a parser
generator which was able, given a proper S-attribute grammar, to
automatically write an efficient parser based on suitable
optimizations of Earley's parsing algorithm. All generated
parsers turned out to be faster and less memory hungry than other
available parsers for the same exponentially ambiguous grammars
and the same sequences. More surprisingly, these parsers also
turned out to be faster than hand-written programs based on
dynamic programming equations. This was the first proof that
improvements in parsing technology may certainly be put to good
use in biocomputing problems, and that they shall lead to better
algorithms and tools.
While trying to overcome some limitations of
SCFGs, I generalized S-attribute grammars to multi-tape
S-attribute grammars (MTSAGs). The automata theory counterpart of
a MTSAG would be a non-deterministic push-down automaton with
several one-way reading heads, instead of a single one-way
reading head as it is the case for CFG. Given these MTSAGs, a
generalization of the previous single-tape parser generator was
the obvious way forward.
Thanks to this new parser generator, I was able
to show that most biocomputing models previously based on dynamic
programming equations were unified bv MTSAGs, and that they were
better handled by automatically generated parsers than by
handwritten programs. lt did not matter whether these models were
trying to align sequences, fold RNAS, align folded RNAS, align
folded and unfolded RNAS, simultaneously align and fold RNAS,
etc. It also turned out that the way SCFGs and HMMs are currently
used rnay be better pictured, thanks to 2-tape MTSAGs, as the
simultaneous alignrnent and folding of a first special tape,
representing the target model against a second tape, containing
the actual sequence. This representation may lead to algorithms
which will efficiently learn SCFGs frorn initially unaligned
sequences.
While the current parser generator for MTSAGs
is a usable proof of concept, which nevertheless required several
rnonths of work, 1 arn quite convinced that there should be
better ways than the current algorithm to parse several tapes.
There should also exist other generalizations of CFGs which rnay
reveal themselves fruitful. Current results are only promising
starting points.
The irony of the story is that HMMs and SCFGs
were borrowed by biocomputing people from other fields such as
signal or speech analysis. lt may very well be the time for these
fields to retrofit their own rnodels with current advances in
biocomputing such as MTSAGS.
previous next top
Intelligent Multimedia
Information Access
Mark T. Maybury
Advanced Information Systems
Center
The MITRE Corporation
202 Burlington Road
Bedford, MA 01730, USA
maybury@mitre.org
http://www.mitre.org/resources/centers/advanced_info/
Abstract
The expansion of the information
highway has generated requirements for more effective access to
global and corporatic information repositories. These
repositories are increasingly multimedia, including text, audio
(e.g., spoken language, music), graphics, imagery, and video. The
advent of large, multimedia digital libraries has turned
attention toward the problem of processing and managing multiple
and heterogeneous media in a principled manner, including their
creation, storage, indexing, browsing, search, visualization, and
summarization.
Intelligent multimedia information access is a
multidisciplinary area that lies at the intersection of
artificial intelligence, information retrieval, human computer
interaction, and multimedia computing. Intelligent multimedia
information access includes those systems which go beyond
traditional hypermedia or hypertext environments and analyze
media, generate media, or support intelligent interaction with or
via multiple media using knowledge of the user, discourse,
domain, world, or the media itself.
Providing machines with the ability to
interpret, generate, and support interaction with multimedia
artifacts (e.g., documents, broadcasts, hypermedia) will be a
valuable facility for a number of key applications such as
videoteleconference archiving, custom on-line news, and briefing
assistants. These media facilities, in turn, may support a
variety of tasks ranging from training to information analysis to
decision support.
In this talk I will describe our group's
efforts to provide content based access to broadeast news
sources, including our use of corpus-based processing techniques
to the problems of video indexing, segmentation, and
summarization. In addition to better access to content, we also
need to concern ourselves with enabling more effective, efficient
and natural human computer or computer mediated human-human
interaction. This will require automated understanding and
generation of multimedia and demand explicit representation of
and reasoning about the user, discourse, task and context
(Maybury 1993). To this end, I will describe our work in progress
that aims to fully instrument the interface and build
(automatically and semi-automatically) annotated corpora of
human-machine interaction. We believe this will yield deeper and
more comprehensive models of interaction which should ultimately
enable more principled interface design.
previous next top
Making Use of Intonation
in
Interactive Dialogue
Translation
Mark Steedman
University of Pennsylvania, USA
steedman@linc.cis.upenn.edu
Abstract
Intonational information is frequently
discarded in speech recognition, and assigned by default
heuristies in text-to-speech generation. However, in many
applications involving dialogue and interactive discourse,
intonation conveys significant information, and we ignore it at
our peril. Translating telephones and personal assistants are an
interesting test case, in which the salience of rapidly shifting
discourse topics and the fact that sentences are
machine-generated, rather than written by humans, combine to make
the application particularly vulnerable to our poor theoretical
grasp of intonation and its functions. I will discuss a number of
approaches to the problem for such applications, ranging from
cheap tricks to a combinatory grammar-based theory of the
semantics involved and a syntax-phonology interface for building
and generating from interpretations.
previous next top
Encoding Frequency Information
In
Lexicalized Grammars
John Carroll David Weir
School of Cognitive and
Computing Sciences
University of Sussex
Faliller, Brighton, BN1 9QH,
UK
{Jolinca,davidw}@cogs.susx.ac.uk
Abstract
We address the issue of how to
associate frequency information with lexicalized grammar
formalisms, using Lexicalized Tree Adjoining
Grammar as a representative framework. We consider systematically
a number of alternative probabilistic frameworks,
evaluating their adequacy from both a theoretical and empirical
perspective using data from existing large treebanks. We
also propose three orthogonal approaches for backing off
probability estimates to cope with the large number of
parameters involved.
previous next top
Towards a Reduced Commitment,
D-Theory Style TAG
Parser
John Chen*
K. Vijay-Shanker
Department of Computer and
lnformation Sciences, University of Delaware
Newark, DE 19716, USA
{jchen,vijay}@cis.udel.edu
Abstract
Many traditional TAG parsers handle ambiguity
by considering all of the possible choices as they unfold during
parsing. In contrast, D-theory parsers cope with ambiguity by
using underspecified descriptions of trees. This paper introduces
a novel approach to parsing TAG, namely one that explores how
D-theoretic notions may be applied toTAG parsing.
Combining the D-tlieoretic approach to TAG
parsing as we do here raises new issues and problems. D-theoretic
underspecifition is used as a novel approach in the context of
TAG parsing for delaying attachment decisions. Conversely, the
use of TAG reveals the need for additional types of
underspecification that have not been considered so far in the
D-theoretic framework. These include combining sets of trees into
their underspecified equivalents as well as underspecifying
combination of trees.
ln this paper, we examine various issues that
arise in this new approach to TAG parsing and present solutions
to some of the problems. We, also describe other issues which
need to be resolved for this method of parsing to be implemented.
previous next top
Controlling Bottom-Up Chart
Parsers
through Text Chunking
Fabio Ciravegna, Alberto
Lavelli
Istituto per la Ricerca
Scientifica e Technologica
Loc. Pante' di Povo, I-38050
Trento, Italy
{cirave|lavelli}@irst.itc.it
Abstract
In this paper we propose to use text chucking
for controlling a bottom-up parsing. As it is well known, during
analysis such parsers produce many constituents not contributing
to the final solution(s). Most of these constituents are
introduced due to the parser inability of checking the input
context around them. Preliminary text chunking allows to focus
directly on the constituents that seem more likely and to prune
the search space in the case some satisfactory solutions are
found. Preliminary experiments show that a CYK-like parser.
controlled through chunking is definitely more efficient than a
traditional parser without significantly losing in correctness.
Moreover the quality of possible partial results produced by the
controlled parser is high. The strategy is particularly suited
for tasks like Information Extraction from text (IE) where
sentences are often long and complex and it is very difficult to
have a complete coverage. Hence, there is a strong necessity of
focusing on the most likely solutions; furthermore, in IE the
quality of partial results is important.
previous next top
Pruning Search Space For Parsing
Free Coordination In Categorial Grammar
Crit Cremers
Maarten Hijzelendoorn
Department of General Linguistics, Leiden
University, The Netherlands
{cremers,hijzelendrn}@rullet.leidenuniv.nl
Abstract
The standard resource sensitive invariants of categorial
grammar are not suited to prune search space in the presence of
coordination. We propose a weaker variant of count invariance in
order to prune the search space for parsing coordinated sentences
at a stage prior to proper parsing. This Coordinative Count
Invariant is argued to be the strongest possible instrument to
prune search space for parsing coordination in categorial
grammar. Its mode of operation is explained, and its effect at
pruning search space is exemplified.
previous next top
Bilexical Grammars And
a
Cubic-Time Probabilistic
Parser
Jason Eisner
Dept. of Computer and
Information
Science, Univ. of Pennsylvania
200 South 33rd St.,
Philadelphia, PA 19103-6389 USA
jeisner@linc.cis.upenn.edu
Introduction
Computational linguistics has a long tradition of lexicalized
grammars, in which each grammatical rule is specialized for some
individual word. The earliest lexicalized rules were
word-specific subeategoriza,tion frames. It is now common to find
fully lexicalized versions of many grammatical formalisms, such
as context-ftee and tree-adjoining grammars [Schabes et al. 1988].
Other formalisms, such as dependency grammar [Mel'cuk 1988] and
head-driven phrase-structure grammar [Pollard & Sag 1994],
are explicitly lexical from the start.
Lexicalized grammars have two well-known advantages. Where
syntactic acceptability is sensitive to the quirks of individual
words, lexicalized rules are necessary for linguistic
description. Lexicalized rules are also computationally cheap for
parsing written text: a parser may ignore those rules that do not
mention any input words.
More recently, a third advantage of lexicalized grammars has
emerged. Even when syntactic acceptability is not
sensitive to the particular words chosen, syntactic distribution
may be [Resnik 1993]. Certain words may be able but highly
unlikely to modify certain other words. Such facts can be
captured by a probabilistic lexicalized grammar, where
they may be rises to resolve ambiguity in favor of the most
probable analysis, and also to speed parsing by avoiding
("pruning") unlikely search paths. Accuracy and
efficiency can therefore both benefit.