Performance Comparison of Machine Learning Algorithms for Early Diagnosis of Heart Failure
DOI: https://doi.org/10.5281/zenodo.8238065
Keywords: Heart Failure, Machine Learning, Prediction, Classification
Abstract
Heart failure is a condition in which the heart cannot pump an adequate amount of blood; left untreated, it can lead to serious health problems. Early diagnosis can slow the progression of the disease and improve quality of life. This article evaluates the performance of different machine learning algorithms in the early detection of heart failure. The data set, obtained from Kaggle, comprises 11 independent variables recorded for patients with heart failure and for healthy individuals. Eight algorithms were used in the study: Classification and Regression Tree (CART), K-Nearest Neighbors (KNN), Logistic Regression, Random Forest (RF), AdaBoost, XGBoost, LightGBM, and CatBoost. The results show that these algorithms can be used effectively for the early diagnosis of heart failure: they achieve high accuracy and low error values, although performance differs from one algorithm to another. Random Forest was the best estimator in the study, with an F1 score of 0.98(1), a ROC AUC of 0.999, and an accuracy of 0.99. These findings emphasize the potential of machine learning algorithms for the early diagnosis of heart failure.
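The three metrics reported in the abstract (accuracy, F1 score, and ROC AUC) can be illustrated with a minimal sketch. The labels and scores below are invented toy values, not the study's data, and the function names are hypothetical helpers written for this illustration; the study's own evaluation code is not given in the source.

```python
# Toy illustration of accuracy, F1 score, and ROC AUC for a binary classifier.
# All data here are made up for demonstration; they are NOT from the study.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical true labels (1 = heart failure) and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]
# Assumed decision threshold of 0.5 for turning scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy:", accuracy(y_true, y_pred))
print("F1 score:", f1_score(y_true, y_pred))
print("ROC AUC: ", roc_auc(y_true, scores))
```

Accuracy and F1 depend on the chosen threshold, whereas ROC AUC is computed from the raw scores and summarizes ranking quality across all thresholds, which is one reason the three are commonly reported together.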
License
Copyright (c) 2023 Euroasia Journal of Mathematics, Engineering, Natural & Medical Sciences
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.