ABSTRACT
Many supervised learning methods have been introduced in the last decade. Unfortunately, the last comprehensive empirical evaluation of supervised learning was the Statlog Project in the early 1990s. We present a large-scale empirical comparison of ten supervised learning methods: SVMs, neural nets, logistic regression, naive Bayes, memory-based learning, random forests, decision trees, bagged trees, boosted trees, and boosted stumps. We also examine the effect that calibrating the models via Platt Scaling and Isotonic Regression has on their performance. An important aspect of our study is the use of a variety of performance criteria to evaluate the learning methods.
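The abstract names two post-hoc calibration methods but does not show them; as a rough illustration of what they do, the sketch below implements both under simplifying assumptions (binary labels, a held-out set of raw classifier scores, plain gradient descent rather than the regularized fit described in Platt, 1999). Platt Scaling fits a two-parameter sigmoid mapping scores to probabilities; Isotonic Regression fits a monotone step function via the pool-adjacent-violators algorithm (Ayer et al., 1955). Function names such as `platt_scale` and `isotonic_calibrate` are illustrative, not from the paper.

```python
import numpy as np

def platt_scale(scores, labels, n_iter=2000, lr=0.5):
    """Fit sigmoid parameters (A, B) so that 1/(1+exp(A*s+B)) maps raw
    classifier scores s to probabilities, by gradient descent on log loss."""
    s = np.asarray(scores, dtype=float)
    y = np.asarray(labels, dtype=float)
    A = B = 0.0
    n = len(s)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(A * s + B))
        A -= lr * np.dot(y - p, s) / n   # d(log loss)/dA = (y - p) * s
        B -= lr * np.sum(y - p) / n      # d(log loss)/dB = (y - p)
    return A, B

def platt_predict(score, A, B):
    """Map a raw score to a calibrated probability with fitted (A, B)."""
    return 1.0 / (1.0 + np.exp(A * score + B))

def isotonic_calibrate(scores, labels):
    """Pool-adjacent-violators: returns (scores sorted ascending,
    nondecreasing fitted probabilities) defining a monotone step-function
    calibrator from scores to probabilities."""
    order = np.argsort(scores)
    y = np.asarray(labels, dtype=float)[order]
    blocks = []                        # each block: [mean label, size]
    for v in y:
        blocks.append([v, 1.0])
        # Merge any adjacent blocks that violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] >= blocks[-1][0]:
            m2, w2 = blocks.pop()
            m1, w1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2])
    fitted = np.concatenate([np.full(int(w), m) for m, w in blocks])
    return np.asarray(scores, dtype=float)[order], fitted
```

Both calibrators are fit on data held out from training, then applied to test-set scores; the sigmoid is the lower-variance choice when calibration data is scarce, while isotonic regression is more flexible but can overfit small calibration sets.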
References
- Ayer, M., Brunk, H., Ewing, G., Reid, W., & Silverman, E. (1955). An empirical distribution function for sampling with incomplete information. Annals of Mathematical Statistics, 26, 641--647.
- Bauer, E., & Kohavi, R. (1999). An empirical comparison of voting classification algorithms: Bagging, boosting, and variants. Machine Learning, 36.
- Blake, C., & Merz, C. (1998). UCI repository of machine learning databases.
- Breiman, L. (1996). Bagging predictors. Machine Learning, 24, 123--140.
- Breiman, L. (2001). Random forests. Machine Learning, 45, 5--32.
- Buntine, W., & Caruana, R. (1991). Introduction to IND and recursive partitioning (Technical Report FIA-91-28). NASA Ames Research Center.
- Caruana, R., & Niculescu-Mizil, A. (2004). Data mining in metric space: An empirical analysis of supervised learning performance criteria. Knowledge Discovery and Data Mining (KDD'04).
- Cooper, G. F., Aliferis, C. F., Ambrosino, R., Aronis, J., Buchanan, B. G., Caruana, R., Fine, M. J., Glymour, C., Gordon, G., Hanusa, B. H., Janosky, J. E., Meek, C., Mitchell, T., Richardson, T., & Spirtes, P. (1997). An evaluation of machine learning methods for predicting pneumonia mortality. Artificial Intelligence in Medicine, 9.
- Giudici, P. (2003). Applied data mining. New York: John Wiley and Sons.
- Gualtieri, A., Chettri, S. R., Cromp, R., & Johnson, L. (1999). Support vector machine classifiers as applied to AVIRIS data. Proc. Eighth JPL Airborne Geoscience Workshop.
- Joachims, T. (1999). Making large-scale SVM learning practical. Advances in Kernel Methods.
- King, R., Feng, C., & Sutherland, A. (1995). Statlog: comparison of classification algorithms on large real-world problems. Applied Artificial Intelligence, 9.
- LeCun, Y., Jackel, L. D., Bottou, L., Brunot, A., Cortes, C., Denker, J. S., Drucker, H., Guyon, I., Muller, U. A., Sackinger, E., Simard, P., & Vapnik, V. (1995). Comparison of learning algorithms for handwritten digit recognition. International Conference on Artificial Neural Networks (pp. 53--60). Paris: EC2 & Cie.
- Lim, T.-S., Loh, W.-Y., & Shih, Y.-S. (2000). A comparison of prediction accuracy, complexity, and training time of thirty-three old and new classification algorithms. Machine Learning, 40, 203--228.
- Niculescu-Mizil, A., & Caruana, R. (2005). Predicting good probabilities with supervised learning. Proc. 22nd International Conference on Machine Learning (ICML'05).
- Perlich, C., Provost, F., & Simonoff, J. S. (2003). Tree induction vs. logistic regression: a learning-curve analysis. J. Mach. Learn. Res., 4, 211--255.
- Platt, J. (1999). Probabilistic outputs for support vector machines and comparison to regularized likelihood methods. Adv. in Large Margin Classifiers.
- Provost, F., & Domingos, P. (2003). Tree induction for probability-based rankings. Machine Learning.
- Provost, F. J., & Fawcett, T. (1997). Analysis and visualization of classifier performance: Comparison under imprecise class and cost distributions. Knowledge Discovery and Data Mining (pp. 43--48).
- Robertson, T., Wright, F., & Dykstra, R. (1988). Order restricted statistical inference. New York: John Wiley and Sons.
- Schapire, R. (2001). The boosting approach to machine learning: An overview. In MSRI Workshop on Nonlinear Estimation and Classification.
- Vapnik, V. (1998). Statistical learning theory. New York: John Wiley and Sons.
- Witten, I. H., & Frank, E. (2005). Data mining: Practical machine learning tools and techniques. San Francisco: Morgan Kaufmann. Second edition.
- Zadrozny, B., & Elkan, C. (2001). Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers. ICML.
- Zadrozny, B., & Elkan, C. (2002). Transforming classifier scores into accurate multiclass probability estimates. KDD.