Performance of Medical Image Processing Algorithms Implemented in CUDA running on GPU based Machine




T. Kalaiselvi 1,*, P. Sriramakrishnan 1, K. Somasundaram 1

1. Department of Computer Science and Applications, The Gandhigram Rural Institute - Deemed University, Tamilnadu, India

* Corresponding author.


Received: 10 May 2017 / Revised: 15 Jun. 2017 / Accepted: 6 Jul. 2017 / Published: 8 Jan. 2018

Index Terms

Medical images, image processing, GPU, CUDA, parallel processing


This paper presents the design and performance evaluation of several algorithms for analysing medical image volumes on a massively parallel graphics processing unit (GPU) using the compute unified device architecture (CUDA). The algorithms are selected from a general framework devised for a computer aided diagnostic (CAD) system. A CAD system for analysing large medical image datasets is usually a processing pipeline that includes a variety of image processing operations. An MRI scanner captures the 3D human head as a series of 2D images, and considerable time is spent in pre- and post-processing of these images. Noise filtering, segmentation, image diffusion and enhancement are a few such methods. The algorithms chosen for study require either local information, available in a few pixels, or global information, available in the entire image. These problems are good candidates for GPU implementation, since the parallelism is naturally provided by the proposed Per-Pixel Threading (PPT) or Per-Slice Threading (PST) operations. In this paper we implement algorithms for adaptive filtering, anisotropic diffusion, bilateral filtering, non-local means (NLM) filtering, K-Means segmentation and feature extraction on a 1536-core NVIDIA GPU and estimate the speedup gained. Our experiments show that the GPU-based implementation achieved speedups in the range of 3-338 times over a conventional central processing unit (CPU) in the PPT model and up to 30 times in the PST model.
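The difference between the two threading models can be illustrated with a small CPU-side emulation. This is only a sketch under stated assumptions, not the authors' CUDA code: the function names (`ppt_thread`, `pst_thread`, `run_ppt`, `run_pst`, `speedup`), the 3x3 mean filter used as the per-pixel task, and the slice mean used as the per-slice task are all hypothetical choices made for illustration. Each call to a `*_thread` function stands in for the work that one GPU thread would perform, and the speedup is assumed to be the usual ratio of CPU time to GPU time.

```python
def ppt_thread(volume, tid):
    """Work of one logical thread under per-pixel threading (assumed task:
    3x3 mean of the pixel's in-slice neighbourhood, i.e. local information)."""
    height, width = len(volume[0]), len(volume[0][0])
    # Map the flat thread id to (slice, row, column), as a CUDA kernel would
    # derive coordinates from its block and thread indices.
    z, rem = divmod(tid, height * width)
    y, x = divmod(rem, width)
    total, count = 0.0, 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:  # skip out-of-image neighbours
                total += volume[z][ny][nx]
                count += 1
    return z, y, x, total / count


def pst_thread(volume, tid):
    """Work of one logical thread under per-slice threading (assumed task:
    mean intensity of slice `tid`, i.e. information from the entire image)."""
    pixels = [value for row in volume[tid] for value in row]
    return sum(pixels) / len(pixels)


def run_ppt(volume):
    """Emulate a PPT launch: one logical thread per pixel of the volume."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in range(depth)]
    for tid in range(depth * height * width):  # concurrent on a real GPU
        z, y, x, value = ppt_thread(volume, tid)
        out[z][y][x] = value
    return out


def run_pst(volume):
    """Emulate a PST launch: one logical thread per 2D slice."""
    return [pst_thread(volume, tid) for tid in range(len(volume))]


def speedup(cpu_seconds, gpu_seconds):
    """Speedup assumed to be CPU time divided by GPU time."""
    return cpu_seconds / gpu_seconds


# Toy volume: 2 slices of 3x4 pixels, slice z filled with the value z + 1.
volume = [[[float(z + 1)] * 4 for _ in range(3)] for z in range(2)]
smoothed = run_ppt(volume)     # 24 logical threads, one per pixel
slice_means = run_pst(volume)  # 2 logical threads, one per slice
```

In this emulation the PPT launch uses as many logical threads as there are pixels, while the PST launch uses only as many as there are slices. That difference in available parallelism is one plausible reason for the larger speedups reported for PPT.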

Cite This Paper

T. Kalaiselvi, P. Sriramakrishnan, K. Somasundaram, "Performance of Medical Image Processing Algorithms Implemented in CUDA running on GPU based Machine", International Journal of Intelligent Systems and Applications (IJISA), Vol.10, No.1, pp.58-68, 2018. DOI:10.5815/ijisa.2018.01.07

