Similar Words Identification Using Naive and TF-IDF Method

Pages: 42-47



Divya K.S. 1,*, R. Subha 1, S. Palaniswami 2

1. Department of CSE, Sri Krishna College of Technology, Coimbatore, India

2. Government College of Engineering, Bodinayakanur, India

* Corresponding author.


Received: 28 Nov. 2013 / Revised: 22 Mar. 2014 / Accepted: 9 May 2014 / Published: 8 Oct. 2014

Index Terms

Requirements Document, Design Documents, Requirement Satisfaction, Porter’s Stemming Algorithm, Term Frequency, Inverse Document Frequency


Requirement satisfaction is one of the most important factors in the success of software. Every requirement specified by the customer should be satisfied in each phase of software development. Satisfaction assessment determines whether each component of a requirement has been addressed in the design document. The objective of this paper is to implement two methods for identifying the satisfied requirements in the design document. Satisfied requirements are identified by finding similar words in the two documents. Two methods, Naive satisfaction assessment and TF-IDF satisfaction assessment, are used to find the similar words present in the requirements document and the design documents. Porter’s stemming algorithm is used to perform the stemming. The two methods are evaluated on the basis of their precision and recall values. The final result gives an accurate picture of requirement satisfaction, so that defects can be detected at an early stage of software development. Because the defects are detected early, the cost of correcting them is low.
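The abstract does not give the exact formulas, so the following is only a minimal sketch of how the two assessments and their evaluation might look, not the authors' implementation. It assumes that the naive method scores a requirement by the fraction of its distinct terms found in a design element, that the TF-IDF method compares term vectors by cosine similarity with idf = log(N / df), and that precision and recall are computed over sets of requirement-design links. All function names are hypothetical, and the tokenizer omits Porter stemming, which a real pipeline would apply at that step.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphabetic tokens.

    Assumption: no stemming is done here; a real pipeline would
    pass each token through a Porter stemmer at this point.
    """
    return re.findall(r"[a-z]+", text.lower())


def naive_score(requirement, design):
    """Fraction of distinct requirement terms that also occur in the design text."""
    req_terms = set(tokenize(requirement))
    design_terms = set(tokenize(design))
    if not req_terms:
        return 0.0
    return len(req_terms & design_terms) / len(req_terms)


def tfidf_vectors(documents):
    """Return one {term: tf * idf} dict per document, with idf = log(N / df)."""
    token_lists = [tokenize(doc) for doc in documents]
    n_docs = len(token_lists)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def precision_recall(predicted, actual):
    """Precision and recall of predicted links against the true links."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall
```

Without stemming, "encrypt" and "encrypted" count as different terms, so `naive_score("The system shall encrypt passwords", "Passwords are encrypted with AES")` matches only "passwords". This is the kind of miss that stemming is meant to reduce. The TF-IDF weighting additionally discounts terms that occur in most documents, so common words contribute little to the similarity.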

Cite This Paper

Divya K.S., R. Subha, S. Palaniswami, "Similar Words Identification Using Naive and TF-IDF Method", International Journal of Information Technology and Computer Science (IJITCS), vol. 6, no. 11, pp. 42-47, 2014. DOI: 10.5815/ijitcs.2014.11.06

