In this paper we presented a first-order Bayesian classifier. While many propositional learners have been upgraded to first-order logic in ILP, the case of the Bayesian classifier poses a problem that has not been satisfactorily solved before, namely how to distinguish between elementary and non-elementary first-order features. Treating each Prolog literal as a feature, as is done in LINUS [9], is not a solution, because many literals do not contain a reference to an individual, and thus the relative frequency associated with such a literal cannot be attributed to an individual. Our approach gives a clear picture of how an individual-based first-order representation upgrades attribute-value learning, namely by allowing relational and non-determinate features. In this respect, the paper extends previous work on the relationship between propositional and first-order learning [1,6,10,15,17].
One can argue that any individual-based first-order learning problem, as defined in this paper, can be transformed into an attribute-value learning problem by introducing an attribute for each first-order feature in the hypothesis language. However, one should distinguish between transformation of the hypothesis language and transformation of the data (propositionalisation). 1BC does the first but not the second. Furthermore, the transformation of the hypothesis language is mostly conceptual: it is perfectly possible to explain how 1BC operates on a multiple-instance problem by decomposing a probability distribution on sets. Thus, 1BC clearly extends the propositional naive Bayesian engine. Note also that 1BC can consider the same features as a purely propositional Bayesian classifier and is guaranteed to perform at least as well, as was demonstrated in the KRK-illegal domain. This is a desirable property that is currently shared by only a few ILP systems.
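To make the idea of decomposing a probability distribution on sets more concrete, the following is a minimal sketch of a naive Bayesian classifier for a multiple-instance problem. It assumes that each individual is described by a single set (here a list) of element values and that the elements are conditionally independent given the class, so that the probability of a set factorises into a product over its elements. The function names, the Laplace smoothing, and the toy data are illustrative assumptions; this is not necessarily the estimator used in 1BC.

```python
from collections import Counter, defaultdict
from math import log


def train(examples):
    """Collect class counts and per-class element counts.

    examples: list of (bag, label) pairs, where bag is a list of element values.
    """
    class_counts = Counter()
    elem_counts = defaultdict(Counter)
    vocab = set()
    for bag, label in examples:
        class_counts[label] += 1
        for value in bag:
            elem_counts[label][value] += 1
            vocab.add(value)
    return class_counts, elem_counts, vocab


def log_score(bag, label, model):
    """log P(label) + sum of log P(element | label) over the bag.

    Assumption: elements are independent given the class, so the
    distribution on the set decomposes into a product over elements.
    """
    class_counts, elem_counts, vocab = model
    score = log(class_counts[label] / sum(class_counts.values()))
    n_elems = sum(elem_counts[label].values())
    for value in bag:
        # Laplace-smoothed relative frequency of this element value.
        score += log((elem_counts[label][value] + 1) / (n_elems + len(vocab)))
    return score


def classify(bag, model):
    """Return the class with the highest naive Bayesian score."""
    return max(model[0], key=lambda label: log_score(bag, label, model))


# Hypothetical toy data: each individual is a bag of bond types.
examples = [
    (["aromatic", "aromatic", "single"], "active"),
    (["aromatic", "aromatic"], "active"),
    (["single", "single"], "inactive"),
    (["single", "double"], "inactive"),
]
model = train(examples)
```

With this toy model, `classify(["aromatic", "aromatic"], model)` selects `"active"` and `classify(["single", "double"], model)` selects `"inactive"`. The data are never flattened into a fixed set of attributes: the bag is scored directly, which mirrors the distinction between transforming the hypothesis language and propositionalising the data.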
Pompe and Kononenko also describe an application of naive Bayesian classifiers in a first-order context [13]. However, in their approach the naive Bayesian formula is used in a post-processing step to combine the predictions of several independently learned first-order rules. As far as we are aware, the present paper is the first to describe a first-order naive Bayesian learner.
Future work includes refining the declarative bias specification of Tertius and 1BC. We would also like to handle other non-determinate types such as lists. Finally, we would like to extend 1BC to an interactive truth-value predictor, which could answer ground Prolog queries from an extensional database.