International Journal of Innovative Research in Computer and Communication Engineering

ISSN Approved Journal | Impact factor: 8.771 | ESTD: 2013 | Follows UGC CARE Journal Norms and Guidelines



TITLE An Efficient Hybrid Ensemble Framework for Cardiovascular Disease Prediction with Explainable AI
ABSTRACT This paper presents an improved method for predicting heart disease that combines systematic data preprocessing with a stacking ensemble model. The dataset was thoroughly cleaned, and class imbalance was addressed with the Synthetic Minority Oversampling Technique (SMOTE), applied solely to the training set so that no information leaks into model testing. Feature selection used the ANOVA F-test with the SelectKBest method, which identified eight clinically significant features: age, sex, chest pain type, maximum heart rate, exercise-induced angina, ST depression, ST slope, and number of major vessels. To improve predictive performance, a stacking ensemble was constructed with Random Forest and Extra Trees as base learners and Logistic Regression as the meta-learner that combines their outputs. The classification threshold was set to 0.6 to reflect the cost-sensitive nature of medical diagnosis. The proposed model achieved an accuracy of 97.88%, high precision, recall, and F1-score, and a ROC-AUC of 99.76%, outperforming traditional methods. SHAP (SHapley Additive exPlanations) with the KernelExplainer method was used to provide model transparency, and the high-dimensional SHAP output was collapsed into a two-dimensional form for effective visualization. The most significant predictors identified in the analysis are sex, exercise-induced angina, age, number of vessels, and ST slope, which is consistent with established clinical knowledge. Overall, the findings indicate that systematic preprocessing, ensemble learning, and explainable AI can together achieve high prediction reliability and practical applicability in cardiovascular healthcare. Future work will focus on further optimization of the pipeline and integration of real-time clinical data to maintain and improve predictive performance.
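The pipeline described in the abstract can be illustrated with a minimal scikit-learn sketch. It is not the authors' implementation: the data are synthetic (make_classification), the helper smote_like is a hypothetical stand-in that interpolates between minority-class neighbours in the spirit of SMOTE (the imbalanced-learn package provides a full implementation), all hyperparameters are assumed, and the SHAP step is omitted. It shows only the order of operations: split, oversample the training set alone, select eight features with the ANOVA F-test, fit the stacking ensemble, and apply a 0.6 decision threshold.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (ExtraTreesClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.pipeline import make_pipeline


def smote_like(X, y, k=5, seed=0):
    """Hypothetical SMOTE-style oversampler (illustration only).

    New minority samples are drawn on the segment between a minority
    sample and one of its k nearest minority neighbours.
    """
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    n_new = counts.max() - counts.min()
    if n_new == 0:
        return X, y
    X_min = X[y == minority]
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    # Column 0 is the sample itself, so keep only the k true neighbours.
    neighbours = nn.kneighbors(X_min, return_distance=False)[:, 1:]
    base = rng.integers(0, len(X_min), size=n_new)
    partner = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    X_new = X_min[base] + gap * (X_min[partner] - X_min[base])
    return np.vstack([X, X_new]), np.concatenate([y, np.full(n_new, minority)])


# Synthetic, imbalanced stand-in for a 13-feature heart-disease table.
X, y = make_classification(n_samples=600, n_features=13, n_informative=6,
                           weights=[0.7, 0.3], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Oversample the training split only; the test split is left untouched.
X_bal, y_bal = smote_like(X_train, y_train)

# Random Forest and Extra Trees as base learners, Logistic Regression on top.
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
                ("et", ExtraTreesClassifier(n_estimators=50, random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000),
)
# ANOVA F-test keeps the eight highest-scoring features before stacking.
model = make_pipeline(SelectKBest(f_classif, k=8), stack)
model.fit(X_bal, y_bal)

THRESHOLD = 0.6  # decision threshold quoted in the abstract
proba = model.predict_proba(X_test)[:, 1]
y_pred = (proba >= THRESHOLD).astype(int)

print("accuracy:", accuracy_score(y_test, y_pred))
print("ROC-AUC :", roc_auc_score(y_test, proba))
```

Because the oversampler sees only X_train, no synthetic point is derived from a test sample. On synthetic data the printed scores say nothing about the figures reported in the paper.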
AUTHOR POTHAMSETTY HANU PHANI SRI SAI SATHWIK RAM (M. Tech Scholar, Department of CSE, Sir C R Reddy College of Engineering, Eluru, India); DR. DEEPAK NEDUNURI (Professor, Department of CSE, Sir C R Reddy College of Engineering, Eluru, India)
VOLUME 183
DOI 10.15680/IJIRCCE.2026.1404143
PDF pdf/143_An Efficient Hybrid Ensemble Framework for Cardiovascular Disease Prediction with Explainable AI.pdf
KEYWORDS
Copyright © IJIRCCE 2020. All rights reserved