Improved data set characterisation for meta-learning

Y. Peng, P.A. Flach, C. Soares, P. Brazdil, Improved data set characterisation for meta-learning. Proc. 5th International Conference on Discovery Science (DS-02). ISBN 3-540-00188-3, pp. 141–152. January 2002.
This paper presents new measures, based on the induced decision tree, for characterising datasets in meta-learning, with the aim of selecting appropriate learning algorithms. The main idea is to capture the characteristics of a dataset from the structural shape and size of the decision tree induced from it. In total, 15 measures are proposed to describe the structure of a decision tree. Their effectiveness is illustrated through extensive experiments that compare the results with those obtained by existing data characterisation techniques, including the data characteristics tool (DCT), the most widely used technique in meta-learning, and Landmarking, the most recently developed method.
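
To make the idea of characterising a dataset by the shape and size of its induced tree more concrete, here is a minimal Python sketch. It is an illustration only: the abstract does not list the 15 measures, so the quantities computed here (node count, leaf count, number of levels, maximum width, and leaf-depth statistics) are assumed examples of structural measures, not necessarily those used in the paper. The `Node` class, the `tree_shape_measures` function, and the toy tree are all hypothetical, and the tree is written by hand rather than induced by a learner.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical tree node: an attribute test with children, or a leaf."""
    label: str
    children: List["Node"] = field(default_factory=list)


def tree_shape_measures(root: Node) -> Dict[str, float]:
    """Compute a few illustrative structural measures of a decision tree.

    These are assumed examples of shape/size descriptors, not the
    paper's actual set of 15 measures.
    """
    level_widths: List[int] = []   # number of nodes at each depth
    leaf_depths: List[int] = []    # depth (in edges) of every leaf
    n_nodes = 0

    # Iterative depth-first traversal; the root is at depth 0.
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        n_nodes += 1
        while len(level_widths) <= depth:
            level_widths.append(0)
        level_widths[depth] += 1
        if node.children:
            for child in node.children:
                stack.append((child, depth + 1))
        else:
            leaf_depths.append(depth)

    return {
        "n_nodes": n_nodes,
        "n_leaves": len(leaf_depths),
        "n_levels": len(level_widths),
        "max_width": max(level_widths),
        "min_leaf_depth": min(leaf_depths),
        "max_leaf_depth": max(leaf_depths),
        "mean_leaf_depth": sum(leaf_depths) / len(leaf_depths),
    }


# Hand-written toy tree (hypothetical, not induced from real data).
example = Node("outlook", [
    Node("humidity", [Node("no"), Node("yes")]),
    Node("yes"),
    Node("windy", [Node("no"), Node("yes")]),
])

measures = tree_shape_measures(example)
for name, value in measures.items():
    print(f"{name}: {value}")
```

In a meta-learning setting, a vector of such values computed per dataset could serve as the meta-features from which an algorithm recommendation is learned. How the paper defines, normalises, and combines its measures is not stated in the abstract.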