Database dependency discovery: a machine learning approachPeter A. Flach, Iztok Savnik, Database dependency discovery: a machine learning approach. AI Communications, 12 (3). ISSN 0921-7126, pp. 139–160. November 1999. No electronic version available. External information
Database dependencies, such as functional and multivalued dependencies, express the presence of structure in database relations, that can be utilised in the database design process. The discovery of database dependencies can be viewed as an induction problem, in which general rules (dependencies) are obtained from specific facts (the relation). This viewpoint has the advantage of abstracting away as much as possible from the particulars of the dependencies. The algorithms in this paper are designed such that they can easily be generalised to other kinds of dependencies. Like in current approaches to computational induction such as inductive logic programming, we distinguish between top-down algorithms and bottom-up algorithms. In a top-down approach, hypotheses are generated in a systematic way and then tested against the given relation. In a bottom-up approach, the relation is inspected in order to see what dependencies it may satisfy or violate. We give a simple (but inefficient) top-down algorithm, a bi-directional algorithm, and a bottom-up algorithm. In the case of functional dependencies, these algorithms have been implemented in the FDEP system and evaluated experimentally. The bottom-up algorithm is the most efficient of the three, and also outperforms other algorithms from the literature.