
A First-order Representation for Knowledge Discovery and Bayesian
Classification on Relational Data
Nicolas Lachiche and Peter Flach.
In Pavel Brazdil and Alipio Jorge, editors, PKDD2000 workshop on Data
Mining, Decision Support, Meta-learning and ILP : Forum for Practical Problem
Presentation and Prospective Solutions, pages 49--60. 4th International
Conference on Principles of Data Mining and Knowledge Discovery, September
2000.
Abstract
In this paper we consider different representations for relational learning
problems, with the aim of making ILP methods more applicable to real-world
problems. In the past, ILP tended to concentrate on the term representation,
with the flattened Datalog representation as a 'poor man's version'. There
has been relatively little emphasis on database-oriented representations,
using e.g. the relational data model or the Entity-Relationship model. On the
other hand, much of the available data is stored in multi-relational
databases. Even if we don't actually interface our ILP systems with a DBMS,
we need to understand the database representation sufficiently in order to
convert it to an ILP representation. Such conversions and relations between
different representations are the subject of this paper. We consider four
different representations: the Entity-Relationship model, the relational
model, a flattened individual-centred representation based on the so-called ISP
declarations that we use in our ILP systems Tertius and 1BC, and the term-based
representation. We argue that the term-based representation does not have all
the flexibility and expressiveness provided by the other representations. For
instance, there is no way to deal with graphs without partly flattening the
data (i.e., introducing identifiers). Furthermore, there is no easy way of
switching to another individual without converting the data, let alone
learning with different individual types. The flattened representation has
clear advantages in these respects.
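
The contrast between the flattened and the term-based representation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy molecule data, the relation names (molecule2atom, atom_element, bond) and the helper functions are hypothetical and are not taken from the paper or from the Tertius and 1BC systems. It only aims to show that a cyclic bond graph needs identifiers, and that with identifiers the same facts can be regrouped around a different individual type.

```python
# Flattened, individual-centred facts: every object carries an identifier.
# Hypothetical toy data, not taken from the paper.
molecule2atom = [("m1", "a1"), ("m1", "a2"), ("m1", "a3")]
atom_element = {"a1": "c", "a2": "c", "a3": "o"}
bond = [("a1", "a2"), ("a2", "a3"), ("a3", "a1")]  # a cycle, i.e. a graph


def molecule_term(mol):
    """Term-like view: one nested structure per molecule.

    Without identifiers, only the list of elements survives; the cyclic
    bond structure cannot be expressed in the nested term.
    """
    atoms = [a for m, a in molecule2atom if m == mol]
    return (mol, [atom_element[a] for a in atoms])


def atom_view(atom):
    """Same facts, regrouped with the atom as the individual."""
    neighbours = sorted(
        {b for a, b in bond if a == atom} | {a for a, b in bond if b == atom}
    )
    molecule = next(m for m, a in molecule2atom if a == atom)
    return {
        "element": atom_element[atom],
        "neighbours": neighbours,
        "molecule": molecule,
    }


if __name__ == "__main__":
    print(molecule_term("m1"))
    print(atom_view("a2"))
```

In this sketch, switching from molecules to atoms as the individual requires no conversion of the stored facts, only a different grouping function, whereas the nested term would have to be rebuilt.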
Nicolas Lachiche,
lachiche@cs.bris.ac.uk,
P A Flach,
Peter.Flach@bristol.ac.uk.