
Multi-class Subgroup Discovery: Heuristics, Algorithms and Predictiveness

Tarek Abudawood, Multi-class Subgroup Discovery: Heuristics, Algorithms and Predictiveness. PhD thesis. University of Bristol, Department of Computer Science, Faculty of Engineering. April 2011. PDF, 1541 Kbytes.

Abstract

We investigate the multi-class subgroup discovery task within the rule learning framework, in both propositional and first-order logic. The discovered subgroups describe interesting patterns that apply to subsets of examples across multiple classes. A subgroup is interesting if it is sufficiently large and statistically unusual with respect to the distribution of the different classes. We study suitable heuristics for searching for, evaluating and selecting multi-class subgroups. While subgroup discovery is an interesting task in its own right, we also demonstrate the usefulness of the discovered subgroups by exploiting their predictive power. We developed several systems for learning multi-class subgroups in both propositional and first-order logic. Multi-class subgroups are learned faster than conventional two-class subgroups and classification rules. In addition, the learned multi-class subgroup discovery theories are more compact, individual subgroups are simpler, and, when used for making predictions, their predictive performance is no worse than that of conventional two-class subgroup discovery and classification rule learning methods. Our investigation and empirical evaluations suggest that multi-class subgroups can capture the most important patterns that exist within a given domain. A large part of the thesis is concerned with handling subgroups or rules learned over multi-class domains in propositional and first-order rule learning contexts. Throughout the thesis, we promote and illustrate the use of tree-based theories for representing and building accurate multi-class predictive theories.
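As a rough illustration of "sufficiently large and statistically unusual", the sketch below scores a subgroup by summing, over all classes, the absolute difference between the joint probability of subgroup and class and the product of their marginals. This is one common multi-class extension of weighted relative accuracy; the abstract does not state which heuristics the thesis actually uses, so the function name `multiclass_wracc`, its inputs and the choice of measure are assumptions made for illustration only.

```python
from collections import Counter


def multiclass_wracc(labels, covered):
    """Illustrative multi-class quality score (an assumption, not the thesis's definition).

    labels:  class label of each example
    covered: booleans, True where the subgroup covers the example
    Returns the sum over classes of |P(subgroup, class) - P(subgroup) * P(class)|.
    """
    n = len(labels)
    if n == 0 or len(covered) != n:
        raise ValueError("labels and covered must be non-empty and of equal length")

    class_counts = Counter(labels)
    covered_counts = Counter(lab for lab, cov in zip(labels, covered) if cov)
    p_subgroup = sum(covered_counts.values()) / n  # relative size of the subgroup

    score = 0.0
    for cls, count in class_counts.items():
        p_class = count / n
        p_joint = covered_counts.get(cls, 0) / n
        # Deviation from what independence of subgroup and class would predict.
        score += abs(p_joint - p_subgroup * p_class)
    return score


if __name__ == "__main__":
    labels = ["a"] * 4 + ["b"] * 4 + ["c"] * 4
    # Hypothetical subgroup covering three "a" examples and one "b" example.
    covered = [True, True, True, False,
               True, False, False, False,
               False, False, False, False]
    print(multiclass_wracc(labels, covered))
```

Under this measure, a subgroup whose class distribution mirrors that of the whole data set scores zero regardless of its size, and an empty subgroup also scores zero, so a high score requires both coverage and an unusual class distribution.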

