
Ensemble Methods in Machine Learning

Published: 21 June 2000

Abstract

Ensemble methods are learning algorithms that construct a set of classifiers and then classify new data points by taking a (weighted) vote of their predictions. The original ensemble method is Bayesian averaging, but more recent algorithms include error-correcting output coding, Bagging, and boosting. This paper reviews these methods and explains why ensembles can often perform better than any single classifier. Some previous studies comparing ensemble methods are reviewed, and some new experiments are presented to uncover the reasons that Adaboost does not overfit rapidly.
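As a quick illustration of the (weighted) voting idea described in the abstract, the following sketch is not taken from the paper: it assumes scikit-learn is available, trains three heterogeneous base classifiers, and weights each one by its training accuracy, which is just one of many possible weighting schemes (boosting and Bayesian model averaging derive their weights differently).

# Minimal weighted-vote ensemble sketch (illustrative only, not the paper's method).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

base_learners = [
    DecisionTreeClassifier(max_depth=3, random_state=0),
    LogisticRegression(max_iter=1000),
    GaussianNB(),
]
for clf in base_learners:
    clf.fit(X_train, y_train)

# One simple weighting choice: each classifier's training accuracy.
weights = np.array([clf.score(X_train, y_train) for clf in base_learners])
n_classes = len(np.unique(y_train))

def weighted_vote(X):
    """Sum each classifier's weight onto its predicted class, then take the argmax."""
    preds = np.stack([clf.predict(X) for clf in base_learners])  # shape: (n_classifiers, n_samples)
    votes = np.zeros((X.shape[0], n_classes))
    for w, p in zip(weights, preds):
        votes[np.arange(X.shape[0]), p] += w
    return votes.argmax(axis=1)

print("ensemble accuracy:", np.mean(weighted_vote(X_test) == y_test))
for clf in base_learners:
    print(type(clf).__name__, "accuracy:", clf.score(X_test, y_test))

On a toy problem like this the ensemble typically matches or slightly beats its best member; the paper's point is that the gain comes from combining accurate but diverse classifiers.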


Published In

MCS '00: Proceedings of the First International Workshop on Multiple Classifier Systems
June 2000, 402 pages
ISBN: 3540677046
Publisher: Springer-Verlag, Berlin, Heidelberg
