Explainable Machine Learning For Early HIV Detection Using Extra Trees and SHAP Algorithms
DOI:
https://doi.org/10.65780/bima.v1i2.8
Keywords:
HIV; early detection; Extra Trees; Explainable Machine Learning; SHAP
Abstract
Human Immunodeficiency Virus (HIV) remains a global health challenge that requires accurate and reliable early detection. Machine learning offers potential for classifying HIV status from clinical, demographic, and behavioral data. However, the limited interpretability of black-box models is an obstacle to clinical application. This study proposes an Explainable Machine Learning approach for early HIV detection that integrates the Extra Trees algorithm with the SHapley Additive exPlanations (SHAP) method. The model was developed using an HIV dataset obtained from the Kaggle platform and processed through standard data preprocessing stages without class balancing. Performance was evaluated using classification metrics, confusion matrices, and learning curves to assess accuracy and learning stability. Experimental results show that the Extra Trees model achieved 88% accuracy with strong generalization. SHAP and mean absolute SHAP analyses identified the dominant features contributing to the prediction of HIV status, consistently at both the global and local levels. These findings show that integrating Extra Trees and SHAP produces an HIV early-detection model that is not only competitive in performance but also transparent and clinically relevant, potentially supporting the development of reliable artificial intelligence-based medical decision support systems.
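
The pipeline summarized in the abstract (an Extra Trees classifier, evaluation with accuracy and a confusion matrix, then local and mean absolute Shapley attributions) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the data are synthetic and generated with scikit-learn rather than taken from the Kaggle HIV dataset, the four features are unnamed, and the hyperparameters are arbitrary. Instead of the SHAP library used in the study, the sketch computes exact Shapley values by enumerating feature coalitions against a small background sample, which is only practical for a handful of features.

```python
import itertools
import math

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the Kaggle HIV dataset): 4 features,
# imbalanced classes, and no class balancing applied.
X, y = make_classification(
    n_samples=600, n_features=4, n_informative=3, n_redundant=1,
    weights=[0.7, 0.3], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Extra Trees classifier with illustrative (assumed) hyperparameters.
model = ExtraTreesClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)

# Background sample used to marginalize out "absent" features.
background = X_train[:50]


def coalition_value(x, subset):
    """Mean P(class 1) with features in `subset` fixed to x, rest from background."""
    samples = background.copy()
    idx = list(subset)
    if idx:
        samples[:, idx] = x[idx]
    return model.predict_proba(samples)[:, 1].mean()


def exact_shapley(x):
    """Exact Shapley values for one row by enumerating all feature coalitions."""
    n = x.shape[0]
    values = {
        frozenset(s): coalition_value(x, s)
        for r in range(n + 1)
        for s in itertools.combinations(range(n), r)
    }
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(n):
            # Shapley weight for a coalition of size r that excludes feature i.
            weight = math.factorial(r) * math.factorial(n - r - 1) / math.factorial(n)
            for s in itertools.combinations(others, r):
                s = frozenset(s)
                phi[i] += weight * (values[s | {i}] - values[s])
    return phi


# Local explanations for a few test rows, then the global mean absolute value.
shap_matrix = np.array([exact_shapley(row) for row in X_test[:10]])
global_importance = np.abs(shap_matrix).mean(axis=0)

# Quantities for the local additivity check on the first test row.
base_value = coalition_value(X_test[0], ())
prediction = model.predict_proba(X_test[:1])[0, 1]

print("accuracy:", round(accuracy, 3))
print("confusion matrix:\n", cm)
print("local attributions (row 0):", shap_matrix[0])
print("mean |Shapley| per feature:", global_importance)
```

For each row, the attributions sum to the difference between the model's predicted probability and the background average, which is the additivity property that makes local explanations comparable to the global ranking. With realistic feature counts, exhaustive enumeration becomes infeasible, and a tree-specific explainer from the `shap` package would be the more usual choice.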














