Abstract

Veronique Hoste

We show that a memory-based learning approach to lexical sample word-sense disambiguation, as developed in the context of the SENSEVAL-1 competition (Veenstra et al. 2000), scales well to disambiguating the meaning of all words in free text (the "all-words" task). This is shown using the all-words task data from SENSEVAL-2 for Dutch and English. Because of its efficiency, its relative robustness to noisy and sparse data, and its ability to integrate diverse sources of information (in this case local syntactic and lexical context and more global predictive keywords for particular word-sense combinations), memory-based learning seems to be well suited to all-words sense tagging.

The two systems for Dutch and English are composed of "word expert" modules; a word expert is a memory-based learner trained to disambiguate between the senses of one particular word. Both systems thus involve hundreds to thousands of word experts, and all of the parameters of each of these word experts are individually optimized by cross-validation on the learning material. Experimental results indicate that both individual optimization and global system parameters have a strong and significantly positive effect on disambiguation accuracy.

Last modified $Date: 2001/10/04 13:39:44 $ by Parlevink Webmaster