An Enhanced Approach for Solving Class Imbalance Problem in Automatic Image Annotation




T. Sumadhi 1,*, M. Hemalatha 1

1. Department of Software Systems, Karpagam University, Coimbatore, Tamilnadu, India

* Corresponding author.


Received: 18 Oct. 2012 / Revised: 23 Nov. 2012 / Accepted: 29 Dec. 2012 / Published: 8 Feb. 2013

Index Terms

Automatic image annotation, Gentle AdaBoost, Improvised FSMOTE, Synthetic minority over-sampling technique, JEC, SVM


Classifying the objects captured in an image helps in understanding its contents, but automatically annotating the image with accurate tags remains an open problem. Real-world data sets are highly imbalanced, which degrades the performance of automatic image annotation and object detection. To overcome this drawback, we propose a new system for pattern matching and annotation that fuses principles from the fractal transform and the Gentle AdaBoost algorithm. The paper also addresses the performance deterioration caused by imbalanced data sets and by variations in orientation and scale, by using an over-sampling method when training the classifier. The proposed IFSMOTE classifier is first trained with a threshold value that helps identify objects correctly, and a fractal-based over-sampling technique is used to handle the imbalanced data set. Experimental results on the Flickr image data set show superior precision, recall and F-measure. The paper also compares the proposed system with other approaches, namely SVM, SMOTE and FSMOTE.
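As a point of reference for the over-sampling idea mentioned above, the following is a minimal sketch of plain SMOTE-style interpolation, one of the baselines named in the abstract. It is not the proposed IFSMOTE method and contains no fractal component. The function name `smote_sketch`, its parameters and the toy data in the usage note are assumptions made for illustration only.

```python
import numpy as np

def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples (hypothetical helper).

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(minority, dtype=float)
    n = len(X)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)

    # Pairwise Euclidean distances within the minority class.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]

    synthetic = np.empty((n_new, X.shape[1]))
    for i in range(n_new):
        base = rng.integers(n)                   # pick a minority sample
        nb = neighbours[base, rng.integers(k)]   # pick one of its neighbours
        gap = rng.random()                       # position on the segment, in [0, 1)
        synthetic[i] = X[base] + gap * (X[nb] - X[base])
    return synthetic
```

For example, `smote_sketch(np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]), n_new=5)` returns five new 2-D points that lie on segments joining the given minority samples. These would be appended to the minority class before a classifier is trained.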

Cite This Paper

T. Sumadhi, M. Hemalatha, "An Enhanced Approach for Solving Class Imbalance Problem in Automatic Image Annotation", IJIGSP, vol. 5, no. 2, pp. 9-16, 2013. DOI: 10.5815/ijigsp.2013.02.02

