
LearningPinocchio: adaptive information extraction for real world applications

Published online by Cambridge University Press:  13 May 2004

F. CIRAVEGNA
Affiliation:
Department of Computer Science, University of Sheffield, Regent Court, 211 Portobello Street, S1 4DP Sheffield, UK e-mail: F.Ciravegna@dcs.shef.ac.uk
A. LAVELLI
Affiliation:
ITC-irst Centro per la Ricerca Scientifica e Tecnologica, via Sommarive 18, 38050 Povo (TN), Italy e-mail: lavelli@itc.it

Abstract

The new frontier of research on Information Extraction from texts is portability without any knowledge of Natural Language Processing. The market potential is very large in principle, provided that a suitable, easy-to-use and effective methodology is available. In this paper we describe LearningPinocchio, a system for adaptive Information Extraction from texts that is having good commercial and scientific success. Real world applications have been built and evaluation licenses have been released to external companies for application development. We outline the basic algorithm behind the system and present a number of applications developed with LearningPinocchio. We then report on an evaluation performed by an independent company. Finally, we discuss the general suitability of this IE technology for real world applications and draw some conclusions.

Type
Papers
Copyright
© 2004 Cambridge University Press
