
Semantic Lexicon Acquisition for Learning Natural Language Interfaces
Cynthia Ann Thompson.
PhD thesis, Department of Computer Sciences, University of Texas, Austin, TX,
December 1998.
Also appears as Artificial Intelligence Laboratory Technical Report AI 99-278
(see http://www.cs.utexas.edu/users/ai-lab)
Abstract
A long-standing goal for the field of artificial intelligence is to enable
computer understanding of human languages. A core requirement in reaching
this goal is the ability to transform individual sentences into a form better
suited for computer manipulation. This ability, called semantic parsing,
requires several knowledge sources, such as a grammar, lexicon, and parsing
mechanism. Building natural language parsing systems by hand is a tedious,
error-prone undertaking. We build on previous research in automating the
construction of such systems using machine learning techniques. The result is
a combined system that learns semantic lexicons and semantic parsers from one
common set of training examples. The input required is a corpus of
sentence/representation pairs, where the representations are in the output
format desired. A new system, Wolfie, learns semantic lexicons to be used as
background knowledge by a previously developed parser acquisition system,
Chill. The combined system is tested on a real-world domain of answering
database queries. We also compare this combination to a combination of Chill
with a previously developed lexicon learner, demonstrating superior
performance with our system. In addition, we show the ability of the system
to learn to process natural languages other than English. Finally, we test
the system on an alternate sentence representation, and on a set of large,
artificial corpora with varying levels of ambiguity and synonymy. One
difficulty in using machine learning methods for building natural language
interfaces is building the required annotated corpus. Therefore, we also
address this issue by using active learning to reduce the number of training
examples required by both Wolfie and Chill. Experimental results show that
the number of examples needed to reach a given level of performance can be
significantly reduced with this method.