
Randomized rule selection in transformation-based learning: a comparative study

Published online by Cambridge University Press: 25 July 2001

SANDRA CARBERRY
Affiliation:
Department of Computer Science, University of Delaware, Newark, Delaware 19716, USA; e-mail: carberry@cis.udel.edu
K. VIJAY-SHANKER
Affiliation:
Department of Computer Science, University of Delaware, Newark, Delaware 19716, USA; e-mail: vijay@cis.udel.edu
ANDREW WILSON
Affiliation:
Department of Computer Science, University of Delaware, Newark, Delaware 19716, USA; e-mail: awilson@cis.udel.edu
KEN SAMUEL
Affiliation:
The Mitre Corporation, Reston, VA 22090, USA; e-mail: samuel@mitre.org

Abstract

Transformation-Based Learning (TBL) is a relatively new machine learning method that has achieved notable success on language problems. This paper presents a variant of TBL, called Randomized TBL, that overcomes the training time problems of standard TBL without sacrificing accuracy. It reports a set of experiments on part-of-speech tagging in which the sizes of the corpus and of the template set are varied. The results show that Randomized TBL can address problems for which the training time of standard TBL is intractable. In addition, for language problems such as dialogue act tagging, where the most effective features have not been identified through linguistic studies, Randomized TBL allows the researcher to experiment with a large set of templates capturing many potentially useful features and feature interactions.
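
The abstract does not say how the randomization is carried out, so the following is only a minimal sketch of the general idea suggested by the title: a greedy TBL loop in which, at each iteration, a random subset of the candidate rules is scored instead of all of them. Everything in it is an assumption made for illustration, not the authors' method: the two templates (current word, previous tag), the `sample_size` parameter, the function names, and the toy data.

```python
import random

def apply_rule(rule, words, tags):
    """Return a new tag list after applying one rule wherever it matches."""
    from_tag, to_tag, feature, value = rule
    new_tags = list(tags)
    for i, tag in enumerate(tags):
        if tag != from_tag:
            continue
        # Conditions are read from the tags as they were before the rule fired.
        if feature == "word":
            observed = words[i]
        else:  # "prev_tag"
            observed = tags[i - 1] if i > 0 else "<s>"
        if observed == value:
            new_tags[i] = to_tag
    return new_tags

def candidate_rules(words, tags, gold):
    """Instantiate both (hypothetical) templates at every current error."""
    rules = set()
    for i, (tag, target) in enumerate(zip(tags, gold)):
        if tag != target:
            rules.add((tag, target, "word", words[i]))
            rules.add((tag, target, "prev_tag", tags[i - 1] if i > 0 else "<s>"))
    return sorted(rules)  # sorted so that runs are reproducible

def errors(tags, gold):
    """Count positions where the current tag differs from the gold tag."""
    return sum(t != g for t, g in zip(tags, gold))

def train(words, tags, gold, sample_size=None, seed=0):
    """Greedy TBL; if sample_size is set, score only a random subset of rules."""
    rng = random.Random(seed)
    tags = list(tags)
    learned = []
    while True:
        candidates = candidate_rules(words, tags, gold)
        if not candidates:
            break
        if sample_size is not None and len(candidates) > sample_size:
            candidates = rng.sample(candidates, sample_size)
        best = min(candidates, key=lambda r: errors(apply_rule(r, words, tags), gold))
        new_tags = apply_rule(best, words, tags)
        if errors(new_tags, gold) >= errors(tags, gold):
            break  # no sampled rule improves the tagging
        learned.append(best)
        tags = new_tags
    return learned, tags

# Toy data (invented for this sketch).
words = ["the", "dog", "can", "run", "the", "can"]
gold = ["DT", "NN", "MD", "VB", "DT", "NN"]
initial = ["DT", "NN", "NN", "NN", "DT", "NN"]

full_rules, full_tags = train(words, initial, gold)
rand_rules, rand_tags = train(words, initial, gold, sample_size=2, seed=1)
```

With `sample_size=None` the loop scores every candidate, as in standard greedy TBL; with a small `sample_size` the cost per iteration is bounded by the sample rather than by the number of rules the templates can generate. In this sketch, that bound is what would make a large template set affordable.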

Type
Research Article
Copyright
© 2001 Cambridge University Press
