God Class Refactoring Recommendation and Extraction Using Context based Grouping




Tahmim Jeba 1,* Tarek Mahmud 2 Pritom S. Akash 1 Nadia Nahar 1

1. Institute of Information Technology, University of Dhaka, Dhaka, Bangladesh

2. Department of Computer Science, Texas State University, San Marcos, Texas, USA

* Corresponding author.

DOI: https://doi.org/10.5815/ijitcs.2020.05.02

Received: 18 Feb. 2020 / Revised: 11 Mar. 2020 / Accepted: 16 Mar. 2020 / Published: 8 Oct. 2020

Index Terms

Code Smell, God Class, Extract Class Refactoring, Hierarchical Clustering, Cluster Composition, Automatic Refactoring


Code smells are indicators of flaws introduced in the design and development phases that decrease the maintainability and reusability of a system. God Class, one of the most hazardous code smells, produces a system in which responsibilities are unevenly distributed among the classes. To address this issue, an extract class refactoring technique is proposed that incorporates both the cohesion and the contextual aspects of a class. In this work, greater emphasis is placed on the code documentation so that the extracted classes have higher contextual similarity. Firstly, the source code is analyzed to generate a set of clusters of the methods of the God Class. Secondly, another set of clusters is generated by analyzing the code documentation. Then, these two sets are merged into a final cluster set that is used to extract classes from the God Class. Finally, an automatic refactoring approach is followed to build the newly identified classes. A comparative analysis using two different metrics shows that the cohesion among the classes increases when context is added to the refactoring process. Moreover, a manual inspection is conducted to ensure that the methods of the refactored classes are contextually organized. This recommendation for God Class extraction can significantly help developers by reducing the burden of refactoring on their own and of maintaining software systems.
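The pipeline summarized above (cluster by source code, cluster by documentation, then merge) can be illustrated with a small Python sketch. It is not the authors' implementation. It assumes that each method is described by the set of fields it uses and by the set of words in its documentation, that similarity is plain Jaccard similarity, and that the two cluster sets are merged by keeping two methods together only when both sets agree. That merge rule is a simple stand-in for the cluster composition step, which the abstract does not specify. All names, data, and the 0.3 threshold are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def agglomerate(features, threshold):
    """Average-linkage agglomerative clustering over method names.

    Repeatedly merges the two most similar clusters until no pair of
    clusters has an average similarity of at least `threshold`.
    """
    clusters = [frozenset([name]) for name in sorted(features)]

    def linkage(c1, c2):
        pairs = [(m, n) for m in c1 for n in c2]
        total = sum(jaccard(features[m], features[n]) for m, n in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        best = max(combinations(clusters, 2), key=lambda p: linkage(*p))
        if linkage(*best) < threshold:
            break
        clusters = [c for c in clusters if c not in best] + [best[0] | best[1]]
    return set(clusters)


def intersect_partitions(first, second):
    """Stand-in merge (assumption): methods stay together only if both
    cluster sets place them in the same cluster."""
    merged = {a & b for a in first for b in second}
    return {c for c in merged if c}


# Hypothetical God Class: fields used by each method (structural view).
fields_used = {
    "openFile": {"path", "handle"},
    "closeFile": {"handle"},
    "readLine": {"handle", "buffer"},
    "drawChart": {"canvas", "series"},
    "setColor": {"canvas"},
    "logRead": {"buffer"},
}
# Hypothetical words taken from each method's documentation (contextual view).
doc_words = {
    "openFile": {"open", "file", "disk"},
    "closeFile": {"close", "file", "disk"},
    "readLine": {"read", "line", "file"},
    "drawChart": {"draw", "chart", "screen"},
    "setColor": {"color", "chart", "screen"},
    "logRead": {"log", "audit", "message"},
}

structural = agglomerate(fields_used, threshold=0.3)
contextual = agglomerate(doc_words, threshold=0.3)
final = intersect_partitions(structural, contextual)
print(sorted(sorted(c) for c in final))
```

In this toy input, `readLine` and `logRead` share a field and are grouped by the structural view, but their documentation has no words in common, so the merged result keeps them apart. Each cluster in `final` is a candidate for an extracted class.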

Cite This Paper

Tahmim Jeba, Tarek Mahmud, Pritom S. Akash, Nadia Nahar, "God Class Refactoring Recommendation and Extraction Using Context based Grouping", International Journal of Information Technology and Computer Science (IJITCS), Vol.12, No.5, pp.14-37, 2020. DOI: 10.5815/ijitcs.2020.05.02

