Decision tree-based characterization for meta-learning
Y. Peng, P. A. Flach, P. Brazdil, C. Soares, Decision tree-based characterization for meta-learning. ECML/PKDD'02 workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning. M. Bohanec, B. Kasek, N. Lavrac, D. Mladenic (eds.), pp. 111–122. August 2002. No electronic version available.
Appropriate selection of learning algorithms is essential to the success of data mining. Meta-learning is one approach to achieving this objective: it tries to identify a mapping from data characteristics to algorithm performance. Appropriate data characterization is thus of vital importance for meta-learning. To this end, a variety of data characterization techniques have been developed, based on three strategies: simple measures, statistical measures and information theory-based measures. However, their quality still needs to be improved. This paper presents new measures to characterise datasets for meta-learning, based on the idea of capturing the characteristics from the structural shape and size of the decision tree induced from the dataset. A total of 15 measures are proposed to describe the structure of a decision tree. Their effectiveness is illustrated through extensive experiments, by comparison with the results obtained by classical data characterization techniques, including DCT, which is the most widely used technique in meta-learning, and Landmarking, which is the most recently developed method and has produced better performance than DCT.
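To make the idea of describing a dataset by the shape and size of its induced decision tree more concrete, here is a minimal Python sketch. It is not the paper's method: the abstract does not list the 15 measures, so the six descriptors computed here (node count, leaf count, depth, maximum width, mean leaf depth, number of distinct attributes) are illustrative assumptions. The tree is also hand-written and hypothetical rather than induced from data.

```python
from collections import Counter

# Hypothetical toy tree (an assumption, not taken from the paper):
# an internal node is (attribute_name, [children]); a leaf is a class label string.
TREE = ("outlook", [
    ("humidity", ["no", "yes"]),
    "yes",
    ("wind", ["yes", "no"]),
])


def is_leaf(node):
    """Leaves are plain labels; internal nodes are tuples."""
    return not isinstance(node, tuple)


def tree_measures(tree):
    """Compute a few illustrative structural descriptors of a decision tree."""
    nodes_per_level = Counter()  # depth -> number of nodes at that depth
    attribute_use = Counter()    # attribute -> number of nodes testing it
    leaf_depths = []             # depth of every leaf

    def walk(node, depth):
        nodes_per_level[depth] += 1
        if is_leaf(node):
            leaf_depths.append(depth)
            return
        attribute, children = node
        attribute_use[attribute] += 1
        for child in children:
            walk(child, depth + 1)

    walk(tree, 0)
    return {
        "n_nodes": sum(nodes_per_level.values()),
        "n_leaves": len(leaf_depths),
        "depth": max(leaf_depths),
        "max_width": max(nodes_per_level.values()),
        "mean_leaf_depth": sum(leaf_depths) / len(leaf_depths),
        "n_distinct_attributes": len(attribute_use),
    }


measures = tree_measures(TREE)
print(measures)
```

In a meta-learning setting, a vector of descriptors like this would be computed for each dataset and used as the input side of the mapping from data characteristics to algorithm performance.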