IJEM Vol. 15, No. 4, 8 Aug. 2025
pp. 29-38
Keywords: Breast Cancer, K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Random Forest
Breast cancer is a leading cause of mortality among women worldwide, particularly in developing countries, and accurate early diagnosis is crucial to improving patient outcomes. This study compares three supervised machine learning algorithms, Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Random Forest (RF), for classifying breast cancer cases in the Breast Cancer Wisconsin dataset. The dataset consists of 569 instances with 33 features, each labeled as malignant or benign. Each method was evaluated on its classification accuracy. Random Forest achieved the highest accuracy at 94.07%, ahead of SVM at 90.06% and KNN at 90.00%. These findings suggest that Random Forest gives the most reliable performance for breast cancer classification within the scope of this dataset. The study highlights the importance of selecting an appropriate algorithm to improve diagnostic precision and recommends Random Forest as an effective method for similar classification tasks.
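The comparison described in the abstract can be illustrated with a minimal sketch in Python using scikit-learn. This is not the authors' pipeline. The train/test split, the feature scaling, and every hyperparameter below are assumptions chosen for illustration, and the function name `compare_classifiers` is hypothetical. The copy of the Breast Cancer Wisconsin (Diagnostic) data bundled with scikit-learn exposes 569 instances with 30 numeric features, so the accuracies it yields should not be expected to match the figures reported above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_classifiers(test_size=0.2, seed=0):
    """Fit SVM, KNN, and Random Forest on one split and return test accuracies.

    Hypothetical helper: the split ratio, seed, and model settings are
    assumptions, not values taken from the paper.
    """
    # scikit-learn's bundled copy: 569 rows, 30 features, binary labels.
    X, y = load_breast_cancer(return_X_y=True)

    # Stratify so both classes keep their proportions in train and test.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )

    models = {
        # SVM and KNN are distance-based, so features are standardized first.
        "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
        # Tree ensembles do not need scaling.
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }

    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        results[name] = accuracy_score(y_test, model.predict(X_test))
    return results


if __name__ == "__main__":
    for name, acc in compare_classifiers().items():
        print(f"{name}: {acc:.2%}")
```

A single hold-out split gives a noisy estimate on a dataset of this size, so rankings between the three models can change with the seed. Cross-validation would be a more stable basis for comparison.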
Adithya Kusuma Whardana, Ruth Kristian Putri, "Comparative Evaluation of Supervised Learning Algorithms for Breast Cancer Classification", International Journal of Engineering and Manufacturing (IJEM), Vol.15, No.4, pp. 29-38, 2025. DOI:10.5815/ijem.2025.04.03