Construction of High-accuracy Ensemble of Classifiers


Author(s)

1. Department of Computer Science, School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran

2. Department of Computer Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran

* Corresponding author.

Received: 21 Jun. 2013 / Revised: 17 Oct. 2013 / Accepted: 10 Jan. 2014 / Published: 8 Apr. 2014

Index Terms

Classifier Ensembles, Bagging, Boosting, Decision Tree

Abstract

Several methods have been developed for constructing ensembles of classifiers. Some of them, such as Bagging and Boosting, are meta-learners, i.e., they can be applied to any base classifier. The methods to be combined should be chosen so that the classifiers cover each other's weaknesses. In an ensemble, combining the outputs of several classifiers is useful only when they disagree on some inputs. The degree of disagreement is called the diversity of the ensemble. Another factor that plays a significant role in the performance of an ensemble is the accuracy of the base classifiers. All procedures for constructing ensembles can be said to seek a balance between these two factors, and successful methods achieve a better balance. The diversity of the members of an ensemble is known to be an important factor in determining its generalization error. In this paper, we present a new approach for generating ensembles. The proposed approach uses Bagging and Boosting as the generators of base classifiers. The classifiers are then partitioned by means of a clustering algorithm. We introduce a selection phase for constructing the final ensemble and propose three different selection methods for this phase. The first method selects a classifier at random from each cluster. The second selects the most accurate classifier from each cluster, and the third selects the classifier nearest to the center of each cluster. The results of experiments on well-known datasets demonstrate the strength of the proposed approach, especially when the most accurate classifier is selected from each cluster and Bagging is used as the generator.
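The selection phase described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each base classifier is represented by a 0/1 correctness vector on a validation set and that cluster assignments are already available; the abstract specifies neither the representation nor the clustering algorithm. The names `select_from_clusters` and `disagreement`, and the toy data, are hypothetical.

```python
import random

def accuracy(correct):
    """Fraction of validation samples a classifier labels correctly."""
    return sum(correct) / len(correct)

def disagreement(a, b):
    """Fraction of samples on which two classifiers differ in correctness
    (one simple pairwise diversity measure; an assumption of this sketch)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def centroid(vectors):
    """Component-wise mean of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def select_from_clusters(outputs, clusters, method, rng=None):
    """Pick one classifier index per cluster.

    outputs  : list of 0/1 correctness vectors, one per base classifier
    clusters : list of cluster ids, one per base classifier
    method   : "random", "most_accurate", or "nearest_to_center"
    """
    rng = rng or random.Random(0)
    groups = {}
    for idx, cid in enumerate(clusters):
        groups.setdefault(cid, []).append(idx)

    chosen = []
    for cid in sorted(groups):
        members = groups[cid]
        if method == "random":
            pick = rng.choice(members)
        elif method == "most_accurate":
            pick = max(members, key=lambda i: accuracy(outputs[i]))
        elif method == "nearest_to_center":
            center = centroid([outputs[i] for i in members])
            pick = min(members, key=lambda i: sq_dist(outputs[i], center))
        else:
            raise ValueError("unknown selection method: " + method)
        chosen.append(pick)
    return chosen

# Toy data (hypothetical): six classifiers, six validation samples.
outputs = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 1, 1, 1, 1, 1],
]
clusters = [0, 0, 0, 1, 1, 1]

by_accuracy = select_from_clusters(outputs, clusters, "most_accurate")
by_center = select_from_clusters(outputs, clusters, "nearest_to_center")
by_random = select_from_clusters(outputs, clusters, "random")
```

In this sketch, classifiers in different clusters err on different samples, so taking one member per cluster keeps the final ensemble diverse, while the choice of selection method determines how much individual accuracy is retained.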

Cite This Paper

Hedieh Sajedi, Elham Masoumi, "Construction of High-accuracy Ensemble of Classifiers", International Journal of Information Technology and Computer Science (IJITCS), vol. 6, no. 5, pp. 1-10, 2014. DOI: 10.5815/ijitcs.2014.05.01
