Evaluation of Recursive Feature Elimination for Breast Cancer Classification Using Various Machine Learning Algorithms

Syarifah Yusnaini Putri; Sayuti Rahman; Nia Ramadani; Novalia Aprianti Ginting; Layla Syalsyadilla; Dedi Agustriaman Zebua

Authors

Syarifah Yusnaini Putri, Universitas Medan Area
Sayuti Rahman, Universitas Medan Area
Nia Ramadani, Universitas Medan Area
Novalia Aprianti Ginting, Universitas Medan Area
Layla Syalsyadilla, Universitas Medan Area
Dedi Agustriaman Zebua, Universitas Medan Area

Keywords:

breast cancer, machine learning, recursive feature elimination (RFE), classification, feature optimization

Abstract

Early detection of breast cancer requires classification models that are not only accurate but also efficient and interpretable. This study evaluates the effect of Recursive Feature Elimination (RFE) on the performance of several machine learning algorithms for breast cancer classification. The dataset is the Wisconsin Diagnostic Breast Cancer (WDBC) dataset from the UCI Machine Learning Repository, which consists of 569 samples and 30 numerical features. The research stages are data preprocessing, removal of non-informative attributes, feature standardization using StandardScaler, train-test splitting with an 80:20 ratio, feature selection using Logistic Regression-based RFE, and training and testing of 11 classification algorithms. Model performance was evaluated using accuracy, precision, recall, F1-score, the confusion matrix, and the Receiver Operating Characteristic (ROC) curve. Before feature selection, Support Vector Machine, Logistic Regression, and Voting Classifier achieved the highest accuracy, 98.25%. After RFE reduced the number of features from 30 to 15, the accuracy of these three models decreased slightly to 97.37%. Several algorithms, including Nearest Centroid, Naïve Bayes, and AdaBoost, showed improved accuracy after RFE. These findings indicate that RFE does not always improve the accuracy of the best-performing models, but it can produce a more compact, efficient, and interpretable classification model.
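
The stages named in the abstract can be illustrated with a minimal scikit-learn sketch. It is not the authors' code. It assumes the copy of WDBC bundled with scikit-learn (`load_breast_cancer`), a stratified split with a fixed `random_state`, a scaler fitted on the training portion only, and a default `SVC` standing in for one of the 11 classifiers; none of these settings are stated in the abstract, and the accuracies it prints need not match the reported ones.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# WDBC as bundled with scikit-learn: 569 samples, 30 numeric features.
X, y = load_breast_cancer(return_X_y=True)

# 80:20 split. stratify and random_state are assumptions, not from the study.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Standardize with statistics from the training portion only (assumption).
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)

# Logistic Regression-based RFE that keeps 15 of the 30 features.
selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=15)
selector.fit(X_train_std, y_train)
X_train_rfe = selector.transform(X_train_std)
X_test_rfe = selector.transform(X_test_std)


def holdout_accuracy(train_features, test_features):
    """Fit a default SVM on the training features and score it on the test set."""
    model = SVC().fit(train_features, y_train)
    return accuracy_score(y_test, model.predict(test_features))


acc_all = holdout_accuracy(X_train_std, X_test_std)
acc_rfe = holdout_accuracy(X_train_rfe, X_test_rfe)
print(f"all 30 features: {acc_all:.4f}")
print(f"15 RFE features: {acc_rfe:.4f}")
```

With scikit-learn's rounding, an 80:20 split of 569 samples leaves 114 test cases, so one misclassified case changes accuracy by about 0.88 percentage points. If the study's test set has the same size, the reported drop from 98.25% (112/114) to 97.37% (111/114) corresponds to a single additional error.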
