
Feature construction with Inductive Logic Programming: a study of
quantitative predictions of biological activity aided by structural
attributes
A. Srinivasan and R.D. King.
Data Mining and Knowledge Discovery, 3(1):37-57, 1999.
Abstract
Recently, computer programs developed within the field of Inductive Logic
Programming (ILP) have received some attention for their ability to construct
restricted first-order logic solutions using problem-specific background
knowledge. Prominent applications of such programs have been concerned with
determining "structure-activity" relationships in the areas of molecular
biology and chemistry. Typically the task here is to predict the "activity"
of a compound (for example, toxicity), from its chemical structure. A summary
of the research in the area is: (a) ILP programs have largely been restricted
to qualitative predictions of activity ("high", "low", etc.); (b) When
appropriate attributes are available, ILP programs have equivalent
predictivity to standard quantitative analysis techniques like linear
regression. However, ILP programs usually perform better when such attributes
are unavailable; and (c) By using structural information as background
knowledge, an ILP program can provide comprehensible explanations for
biological activity. This paper examines the use of ILP programs as a method
of "discovering" new attributes. These attributes could then be used by
methods like linear regression, thus allowing for quantitative predictions
while retaining the ability to use structural information as background
knowledge. Using structure-activity tasks as a test-bed, the utility of ILP
programs in constructing new features was evaluated by examining the
prediction of biological activity using linear regression, with and without
the aid of ILP-learnt logical attributes. In three out of the five data sets
examined, the addition of ILP attributes produced statistically better
results. In addition, six important structural features that had escaped the
attention of the expert chemists were discovered. The method used here to
construct new attributes is not specific to the problem of predicting
biological activity, and the results obtained suggest a wider role for ILP
programs in aiding the process of scientific discovery.
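
The comparison described in the abstract (linear regression with and without extra logical attributes) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the six toy compounds, the numeric attribute name `logp`, the group labels, and the hand-written predicate `has_nitro_on_ring`, which only stands in for a rule that an ILP program might learn. The sketch compares residual sums of squares on the training data; it does not reproduce the statistical testing reported in the paper.

```python
# Toy compounds (invented): one numeric attribute, a set of structural
# groups, and an activity value to be predicted.
compounds = [
    {"logp": 0.5, "groups": {"nitro", "aromatic_ring"}, "activity": 5.0},
    {"logp": 1.0, "groups": {"aromatic_ring"}, "activity": 3.0},
    {"logp": 1.5, "groups": {"nitro", "aromatic_ring"}, "activity": 7.0},
    {"logp": 2.0, "groups": {"methyl"}, "activity": 5.0},
    {"logp": 2.5, "groups": {"nitro", "aromatic_ring"}, "activity": 9.0},
    {"logp": 3.0, "groups": {"aromatic_ring"}, "activity": 7.0},
]


def has_nitro_on_ring(compound):
    """Hypothetical stand-in for an ILP-learnt structural rule (0/1 valued)."""
    return 1.0 if {"nitro", "aromatic_ring"} <= compound["groups"] else 0.0


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X^T X) w = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def rss(rows, y, w):
    """Residual sum of squares of the linear model w on (rows, y)."""
    return sum(
        (t - sum(wi * xi for wi, xi in zip(w, r))) ** 2 for r, t in zip(rows, y)
    )


y = [c["activity"] for c in compounds]

# Baseline design matrix: intercept plus the numeric attribute only.
base_rows = [[1.0, c["logp"]] for c in compounds]
# Extended design matrix: the same columns plus the logical attribute.
ext_rows = [[1.0, c["logp"], has_nitro_on_ring(c)] for c in compounds]

w_base = fit_ols(base_rows, y)
w_ext = fit_ols(ext_rows, y)
rss_without = rss(base_rows, y, w_base)
rss_with = rss(ext_rows, y, w_ext)

print("RSS without logical attribute:", rss_without)
print("RSS with logical attribute:   ", rss_with)
```

The logical attribute enters the regression as an ordinary 0/1 column, so the prediction stays quantitative while the column itself encodes structural information. The toy activities were constructed to depend on that column, which is why the extended model fits them better; real data would need held-out evaluation and a significance test.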
A Srinivasan,
Ashwin.Srinivasan@comlab.ox.ac.uk. Last modified on Wednesday 9 April 2003 at 18:31. © 2003 ILPnet2