La Trobe

Mortality Prediction of Patients with Cardiovascular Disease using Medical Claims Data under Artificial Intelligence Architectures: Validation Study

journal contribution
posted on 2021-12-13, 06:01, authored by L Tran, Lianhua Chi, A Bonti, M Abdelrazek, Yi-Ping Phoebe Chen

https://medinform.jmir.org/2021/4/e25000

Citation:

Tran L, Chi L, Bonti A, Abdelrazek M, Chen YP. Mortality Prediction of Patients With Cardiovascular Disease Using Medical Claims Data Under Artificial Intelligence Architectures: Validation Study.
JMIR Med Inform 2021;9(4):e25000

Background: Cardiovascular disease (CVD) is the greatest health problem in Australia: it kills more people than any other disease and incurs enormous costs for the health care system. In this study, we present a benchmark comparison of various artificial intelligence (AI) architectures for predicting the mortality rate of patients with CVD using structured medical claims data. Compared with other research in the clinical literature, our models are more efficient because we use a smaller number of features, and this study could help health professionals accurately choose AI models to predict mortality among patients with CVD using only claims data before a clinic visit.

Objective: This study aims to support health clinicians in accurately predicting mortality among patients with CVD using only claims data before a clinic visit.

Methods: The data set was obtained from the Medicare Benefits Scheme and Pharmaceutical Benefits Scheme service information in the period between 2004 and 2014, released by the Department of Health Australia in 2016. It included 346,201 records, corresponding to 346,201 patients. A total of five AI algorithms were developed and compared in this study: four classical machine learning algorithms (logistic regression [LR], random forest [RF], extra trees [ET], and gradient boosting trees [GBT]) and one deep learning algorithm, a densely connected neural network (DNN). In addition, because deceased patients were a minority in the data set, a separate experiment using the Synthetic Minority Oversampling Technique (SMOTE) was conducted to enrich the data.

Results: In terms of discrimination, GBT and RF were the models with the highest area under the receiver operating characteristic curve (97.8% and 97.7%, respectively), followed by ET (96.8%) and LR (96.4%), whereas DNN was the least discriminative (95.3%). In terms of reliability, LR predictions were the least calibrated compared with the other four algorithms. Despite increasing the training time, SMOTE further improved the model performance of LR, whereas the other algorithms, especially GBT and DNN, worked well with class-imbalanced data.

Conclusions: Compared with other research in the clinical literature involving AI models using claims data to predict patient health outcomes, our models are more efficient because we use a smaller number of features but still achieve high performance. This study could help health professionals accurately choose AI models to predict mortality among patients with CVD using only claims data before a clinic visit.
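
The abstract relies on two techniques that are easy to state but easy to misread: the area under the receiver operating characteristic curve (AUROC) as the discrimination metric, and SMOTE as the oversampling step for the minority (deceased) class. The following is a minimal Python sketch of both ideas, using only the standard library. It is not the authors' implementation: the function names `auroc` and `smote`, the parameters `k` and `seed`, and the toy values are assumptions made for illustration, and a real experiment would more likely use an established library.

```python
import random
from typing import List, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via the rank-sum (Mann-Whitney U) identity; labels are 0 or 1."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Tied scores share the average of their 1-based ranks.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def smote(minority: List[List[float]], n_new: int, k: int = 3,
          seed: int = 0) -> List[List[float]]:
    """Create n_new synthetic minority rows by interpolating between a
    randomly chosen minority row and one of its k nearest minority rows."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority rows")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance.
        others = [row for row in minority if row is not base]
        others.sort(key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy usage with made-up values (not data from the study).
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
print(len(smote([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]], n_new=5)))  # 5
```

In a setup like the one the abstract describes, oversampling would normally be applied to the training split only, so that the AUROC is measured on untouched held-out records.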

History

Publication Date

2021-04-01

Journal

JMIR Medical Informatics

Volume

9

Issue

4

Article Number

PMID 33792549

Pagination

21p.

Publisher

JMIR Publications

ISSN

2291-9694

Rights Statement

©Linh Tran, Lianhua Chi, Alessio Bonti, Mohamed Abdelrazek, Yi-Ping Phoebe Chen. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 01.04.2021. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
