Computer Science Colloquium, 2003-2004

Shalom Lappin
Department of Computer Science, King's College London
May 18th, 2004

A Machine Learning Approach to Classifying Ellipsis in Dialogue

We are concerned with the problem of identifying the interpretation type of non-sentential question fragments in dialogue, where this task is part of a system for dialogue parsing and understanding. We present a machine learning approach to the disambiguation of bare sluices in dialogue. We extracted a set of heuristic principles from a corpus-based sample and formulated them as probabilistic Horn clauses. We then used the predicates of these clauses to create a set of domain-independent features for annotating an input dataset. We ran two different machine learning systems: SLIPPER, a rule-based learning algorithm, and TiMBL, a memory-based procedure. Both algorithms performed well, yielding similar success rates of approximately 90%. These results indicate that the features in terms of which we formulated our heuristic principles have significant predictive power, and that rules closely resembling our Horn clauses can be learned automatically from these features.

Joint work with Raquel Fernandez and Jonathan Ginzburg

Shuly Wintner
