Improving accuracy and cost of two-class and multi-class probabilistic classifiers using ROC curves

N. Lachiche, P.A. Flach, Improving accuracy and cost of two-class and multi-class probabilistic classifiers using ROC curves. Proc. 20th International Conference on Machine Learning (ICML'03). ISBN 1-57735-189-4, pp. 416–423. January 2003.

The probability estimates of a naive Bayes classifier are inaccurate if some of its underlying independence assumptions are violated. The decision criterion for using these estimates for classification therefore has to be learned from the data. This paper proposes the use of ROC curves for this purpose. For two classes, the algorithm is a simple adaptation of the algorithm for tracing a ROC curve by sorting the instances according to their predicted probability of being positive. As there is no obvious way to upgrade this algorithm to the multi-class case, we propose a hill-climbing approach which adjusts the weights for each class in a pre-defined order. Experiments on a wide range of datasets show that the proposed method leads to significant improvements in accuracy over the naive Bayes classifier. Finally, we discuss a method to find the global optimum, and show how its computational complexity would make it intractable.
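
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`best_threshold`, `accuracy`, `hill_climb_weights`), the 0/1 label encoding, and the `candidates` grid are assumptions made for illustration. The two-class function sorts instances by predicted probability of being positive and sweeps the cut point, as when tracing a ROC curve. The multi-class function is a simplified stand-in for the hill-climbing step: it fixes the first class weight at 1 and tunes each remaining class weight once, in index order, over a small hypothetical grid rather than by an exact ROC-based sweep.

```python
def best_threshold(scores, labels):
    """Two-class case: return (threshold, accuracy) that maximises accuracy
    when predicting positive iff score >= threshold. Labels are 0 or 1."""
    n = len(scores)
    # Sort instance indices by decreasing predicted probability of being positive.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    # Start with the threshold above every score: everything predicted negative.
    correct = sum(1 for y in labels if y == 0)
    best_correct, best_thr = correct, float("inf")
    for rank, i in enumerate(order):
        # Lowering the threshold past instance i flips it to "positive".
        correct += 1 if labels[i] == 1 else -1
        is_last = rank == n - 1
        # Only cut between distinct scores, so tied instances move together.
        if is_last or scores[order[rank + 1]] != scores[i]:
            if correct > best_correct:
                best_correct = correct
                if is_last:
                    best_thr = scores[i]
                else:
                    best_thr = (scores[i] + scores[order[rank + 1]]) / 2
    return best_thr, best_correct / n


def accuracy(probs, labels, weights):
    """Accuracy when predicting argmax over classes of weight * probability."""
    hits = 0
    for p, y in zip(probs, labels):
        pred = max(range(len(weights)), key=lambda c: weights[c] * p[c])
        hits += pred == y
    return hits / len(labels)


def hill_climb_weights(probs, labels, candidates=(0.25, 0.5, 1.0, 2.0, 4.0)):
    """Multi-class case (simplified): tune one class weight at a time in a
    fixed order, keeping a new weight only if it strictly improves accuracy.
    The candidate grid is an assumption of this sketch."""
    k = len(probs[0])
    weights = [1.0] * k  # the first class keeps weight 1 as a reference
    for c in range(1, k):
        best_w, best_acc = weights[c], accuracy(probs, labels, weights)
        for w in candidates:
            weights[c] = w
            acc = accuracy(probs, labels, weights)
            if acc > best_acc:
                best_w, best_acc = w, acc
        weights[c] = best_w
    return weights
```

Because each weight is changed only when accuracy strictly improves, the training accuracy of the weighted classifier in this sketch is never lower than that of the unweighted one. The result depends on the class order, which is why this kind of search finds a local rather than a global optimum.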