Abstract Beek

Alpino is a wide-coverage computational analyzer of Dutch which aims at accurate, full parsing of unrestricted text. The grammar produces dependency structures, thus providing a reasonably abstract and theory-neutral level of linguistic representation. The annotation format is taken from the Corpus of Spoken Dutch. For development and evaluation purposes, we have started to annotate the cdbl (newspaper) part of the Eindhoven corpus.

The annotation process starts by parsing a sentence with the Alpino grammar. This produces an (often large) number of possible analyses, from which the annotator picks the correct parse. To make the annotation process more efficient, two tools have been developed that narrow down the number of possible analyses (interactive POS-tagging and constituent marking), and the Hdrug environment has been extended with a graphical tool for parse selection.

In the talk we will present our annotation method, explain why we use dependency structures, and discuss the tools that have been developed to facilitate the annotation process.

Last modified 2001/10/04 13:39:43 by Parlevink Webmaster