Sridhar Mandapati; Raveendra Babu Bhogapathi; Ratna Babu Chekka

A Hybrid Algorithm for Privacy Preserving in Data Mining

Full Text (PDF, 462 KB), pp. 47-53

Author(s)

Sridhar Mandapati 1,*, Raveendra Babu Bhogapathi 2, Ratna Babu Chekka 3

1. Dept. of Computer Applications, R.V.R & J.C College of Engineering, Guntur, India

2. Dept. of Computer Science and Engineering, VNR VJIET, Hyderabad, India

3. Dept. of Computer Science and Engineering, R.V.R & J.C College of Engineering, Guntur, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijisa.2013.08.06

Received: 7 Oct. 2012 / Revised: 11 Feb. 2013 / Accepted: 4 May 2013 / Published: 8 Jul. 2013

Index Terms

Privacy-Preserving Data Mining (PPDM), Evolutionary Algorithms (EAs), Swarm Intelligence, Genetic Algorithm (GA), Particle Swarm Optimization (PSO), Adult Dataset

Abstract

With the proliferation of information available on the internet and in databases, privacy-preserving data mining is extensively used to maintain the privacy of the underlying data. Various state-of-the-art methods for privacy preservation are available in the literature. Evolutionary Algorithms (EAs) provide effective solutions for various real-world optimization problems and are efficiently employed in business practice. In the privacy-preserving domain, existing EA solutions are restricted to specific problems such as cost function evaluation. This work proposes a Hybrid Evolutionary Algorithm that combines a Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). In the proposed system, both GA and PSO work with the same population. In the proposed framework, k-anonymity is accomplished by generalization of the original dataset, and the hybrid optimization is used to search for the optimal generalized feature set.
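
As a rough illustration of what achieving k-anonymity by generalization means, the following minimal sketch coarsens two quasi-identifiers and checks whether every resulting combination occurs at least k times. It is not the authors' implementation: the attribute choice (age and a ZIP-like code), the sample records, and the function names `generalize_age`, `generalize`, and `is_k_anonymous` are all assumptions made for this example. A search procedure such as the hybrid GA/PSO described in the abstract could, under these assumptions, explore settings like `age_width` and `keep_zip_digits`.

```python
from collections import Counter

def generalize_age(age, width):
    """Map an exact age to a range label of the given width, e.g. 37 -> '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def generalize(records, age_width, keep_zip_digits):
    """Generalize the (hypothetical) quasi-identifiers age and zip of each record."""
    out = []
    for age, zip_code in records:
        # Keep the leading digits and mask the rest with '*'.
        masked = zip_code[:keep_zip_digits] + "*" * (len(zip_code) - keep_zip_digits)
        out.append((generalize_age(age, age_width), masked))
    return out

def is_k_anonymous(rows, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    return all(count >= k for count in Counter(rows).values())

# Illustrative records only; these values are not taken from the paper.
records = [(23, "52201"), (27, "52204"), (34, "52310"), (38, "52315")]

raw = generalize(records, age_width=1, keep_zip_digits=5)      # no real generalization
coarse = generalize(records, age_width=10, keep_zip_digits=3)  # decade ranges, 3-digit prefix

print(is_k_anonymous(raw, 2))
print(is_k_anonymous(coarse, 2))
```

In this sketch the ungeneralized records are all distinct, so they fail the check for k = 2, while the coarser version groups them into two equivalence classes of two records each. Coarser generalization improves anonymity at the cost of information, which is why the choice of generalization is treated as an optimization problem.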

Cite This Paper

Sridhar Mandapati, Raveendra Babu Bhogapathi, Ratna Babu Chekka, "A Hybrid Algorithm for Privacy Preserving in Data Mining", International Journal of Intelligent Systems and Applications (IJISA), vol. 5, no. 8, pp. 47-53, 2013. DOI: 10.5815/ijisa.2013.08.06

International Journal of Intelligent Systems and Applications (IJISA)