Machine Learning Based Cervical Cancer Risk Prediction with SHAP-Driven Feature Interpretation

Authors

  • Fachrizal Ardiansyah, Universitas Informatika dan Bisnis Indonesia
  • Raka Deny Abdi Putra, Universitas Informatika dan Bisnis Indonesia
  • Budiman, Universitas Informatika dan Bisnis Indonesia

        DOI:

        https://doi.org/10.65780/bima.v1i3.16

        Keywords:

        Cervical cancer, machine learning, risk prediction, explainable AI (XAI), CatBoost

        Abstract

        Cervical cancer remains a critical public health problem, particularly in developing countries where early detection is often limited. This study presents a machine learning–based approach for cervical cancer risk prediction that emphasizes both predictive accuracy and interpretability. Several supervised algorithms, namely K-Nearest Neighbors, Random Forest, XGBoost, and CatBoost, were evaluated using the Cervical Cancer (Risk Factors) dataset from the UCI Machine Learning Repository following comprehensive data preprocessing and systematic hyperparameter optimization. The experimental results show that CatBoost achieved the best overall performance, with an optimized accuracy of 97.01% and improved sensitivity in detecting high-risk cases, supported by stable k-fold cross-validation results. To enhance clinical transparency, explainable artificial intelligence was incorporated via SHAP, revealing that key predictors such as the Schiller test, age, and reproductive factors played dominant roles in the model’s decisions. These findings demonstrate that the proposed framework is not only accurate and stable but also interpretable and clinically relevant, making it well-suited to support early detection and decision-making in cervical cancer screening, especially in resource-limited healthcare settings.
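The SHAP attributions mentioned in the abstract are built on Shapley values. As a minimal sketch of that idea, the block below computes exact Shapley values by brute force for a single instance. Everything in it is an assumption made for illustration: the `toy_risk` function, its coefficients, the three feature names, and the single-baseline definition of a "missing" feature are hypothetical and are not the paper's CatBoost model or the SHAP library's tree-based algorithm.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x relative to a baseline.

    Features outside a coalition are set to their baseline value
    (an illustrative simplification of how SHAP handles missing features).
    """
    n = len(x)

    def value(coalition):
        # Model output when only the coalition's features take their real values.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(z):
    """Hypothetical risk score; coefficients are made up for the example."""
    schiller, age, pregnancies = z
    return 0.5 * schiller + 0.01 * age + 0.02 * pregnancies + 0.1 * schiller * (age > 40)


x = [1, 45, 3]          # hypothetical patient: positive Schiller test, age 45, 3 pregnancies
baseline = [0, 30, 1]   # hypothetical reference patient
phi = shapley_values(toy_risk, x, baseline)

for name, contribution in zip(["Schiller", "Age", "Pregnancies"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the prediction for `x` and the prediction for `baseline`, which is the property that lets per-feature attributions be read as shares of a single prediction. Brute-force enumeration grows exponentially with the number of features, so it is only practical for toy cases like this one.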

        Published

        2026-03-31

        How to Cite

        Machine Learning Based Cervical Cancer Risk Prediction with SHAP-Driven Feature Interpretation. (2026). Bulletin of Intelligent Machines and Algorithms, 1(3), 79-87. https://doi.org/10.65780/bima.v1i3.16