An artificial intelligence system for predicting the deterioration of COVID-19 patients in the emergency department

A multi-modal AI system that predicts the risk of mortality, intubation, or admission to the intensive care unit among COVID-19 patients using chest X-ray images and clinical variables

Published in Healthcare & Nursing

May 12, 2021

Farah Shamout, Farah Shamout & Krzysztof Geras

Motivation

The spread of the coronavirus disease 2019 (COVID-19) has led to a surge in patients presenting to the emergency department with respiratory illness. This overload, on an already strained healthcare system, emphasizes the need for automated triage systems that can support decision-making by predicting the risk of patient deterioration.

Given the promise of digital health and the motivation to contribute to combating the global pandemic, we developed a prognostic system using artificial intelligence (AI) as presented in our recent paper, “An artificial intelligence system for predicting the deterioration of COVID-19 patients in the emergency department”. An overview of the system is shown in Figure 1.

**Figure 1.** Overview of the AI system that assesses the patient’s risk of deterioration every time a chest X-ray image is collected in the ED. The system produces three types of outputs: (i) overall risk of deterioration within 24, 48, 72, and 96 hours using the multimodal average prediction of a deep neural network (COVID-GMIC) and a gradient boosting model (COVID-GBM), (ii) saliency maps for interpretability using COVID-GMIC, and (iii) deterioration risk curves (DRC) using a modified version of the deep learning network (COVID-GMIC-DRC).

Multidisciplinary collaboration with clinical experts

To ensure that our proposed work is clinically meaningful, we collaborated closely with radiologists and front-line physicians at NYU Langone Health to define realistic tasks for our prognostic system, extract and curate the data from the hospital’s complex medical records, and specify meaningful inclusion and exclusion criteria to preprocess the dataset. Based on the constraints of the available data and the day-to-day experience of the clinicians, we defined the risk of intubation, admission to the intensive care unit, or mortality as the prognostic system’s predicted outputs at the time of patient assessment.

In the emergency department, chest X-ray imaging is used as a first-line triage tool for patients who test positive for COVID-19. Compared to other imaging modalities, it is cheap and easy to obtain without incurring the risk of contaminating imaging suites. Other clinical variables, such as vital signs, laboratory test results, and patient demographics, are also recorded. To learn from the diverse types of data in a multimodal manner, we used chest X-ray imaging along with the clinical variables that were recorded closest to the time of image acquisition as input data to our prognostic system. We developed this system rapidly as the data was being collected at NYU Langone Health between March 3, 2020 and May 13, 2020. To develop the system and perform hyperparameter tuning, we used a training set consisting of 5,617 chest X-ray images collected from 2,943 patients. To evaluate the performance of the system retrospectively, we used a test set consisting of 832 images collected from 718 patients.
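As a rough illustration of pairing each image with the clinical variables recorded closest to its acquisition time, the sketch below picks the nearest-in-time record from a list. The function name, the record layout, and the toy values are hypothetical and are not taken from the authors' code; the real pipeline may apply additional rules (for example, a maximum allowed time gap) that are not described here.

```python
from datetime import datetime


def closest_record(image_time, records):
    """Return the (timestamp, values) record nearest in time to image_time.

    `records` is a list of (timestamp, values) tuples. This layout is an
    assumption made for the example only.
    """
    if not records:
        return None
    return min(records, key=lambda rec: abs((rec[0] - image_time).total_seconds()))


# Toy, made-up measurements around a single chest X-ray acquisition.
xray_time = datetime(2020, 4, 1, 10, 30)
vitals = [
    (datetime(2020, 4, 1, 8, 0), {"heart_rate": 92, "spo2": 95}),
    (datetime(2020, 4, 1, 10, 45), {"heart_rate": 104, "spo2": 91}),
    (datetime(2020, 4, 1, 14, 0), {"heart_rate": 110, "spo2": 89}),
]

matched = closest_record(xray_time, vitals)
print(matched[0], matched[1])
```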

Multi-modal AI system using chest X-ray images and clinical variables

We processed the chest X-ray images using the Globally Aware Multiple Instance Classifier (GMIC) neural network architecture [1]. COVID-GMIC predicts the overall risk of deterioration within 24, 48, 72, and 96 hours, and computes saliency maps that highlight the regions of the image that most informed its predictions. As shown in Figure 2, COVID-GMIC utilizes the global network to generate four saliency maps that highlight the regions on the X-ray image that are predictive of the onset of adverse events within the four time windows. COVID-GMIC then applies a local network to extract fine-grained visual details from these regions. Finally, it employs a fusion module that aggregates information from both the global context and local details to make a holistic diagnosis. The predictions of COVID-GMIC are combined with predictions of a gradient boosting model [2] that learns from routinely collected clinical variables, referred to as COVID-GBM. The optimal weights assigned to the COVID-GMIC prediction in the COVID-GMIC and COVID-GBM ensemble were derived through optimizing the performance on the validation set (obtained from the folds of the Monte Carlo cross validation iterations).
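The ensembling step can be sketched as a weighted average of the two models' predicted risks, with the weight chosen to maximize a validation metric. The code below is a minimal sketch under assumptions: it uses AUC as the selection metric, a simple grid over the weight, and made-up toy predictions. The function names and the grid search are illustrative and may differ from the procedure used in the paper.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else (0.5 if p == n else 0.0) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ensemble(image_preds, clinical_preds, w):
    """Weighted average: w on the image model, (1 - w) on the clinical model."""
    return [w * a + (1 - w) * b for a, b in zip(image_preds, clinical_preds)]


def best_weight(labels, image_preds, clinical_preds, steps=20):
    """Grid-search the weight in [0, 1] that maximizes validation AUC."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda w: auc(labels, ensemble(image_preds, clinical_preds, w)))


# Toy, made-up validation data (1 = deterioration within the time window).
labels = [1, 0, 1, 0, 1, 0]
image_preds = [0.9, 0.6, 0.4, 0.2, 0.7, 0.5]
clinical_preds = [0.5, 0.4, 0.8, 0.3, 0.6, 0.7]

w = best_weight(labels, image_preds, clinical_preds)
print("weight on image model:", w)
print("ensemble AUC:", auc(labels, ensemble(image_preds, clinical_preds, w)))
```

Because the grid includes w = 0 and w = 1, the selected ensemble is never worse on the validation data than either model alone; whether that carries over to the test set is an empirical question.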

**Figure 2.** Architecture of COVID-GMIC.

Performance results on the test set and reader study

Table 1 summarizes the key performance results. The multi-modal model ensemble of COVID-GMIC and COVID-GBM, denoted as ‘COVID-GMIC + COVID-GBM’, achieved the best performance across all time windows in terms of the area under the receiver operating characteristic curve (AUC) and the area under the precision-recall curve (PR AUC), except for the PR AUC in the 96-hour task.

**Table 1:** Performance of the outcome classification task on the held-out test set, and on the subset of the test set used in the reader study. We include 95% confidence intervals estimated by 1,000 iterations through bootstrapping.
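To make the metrics and the confidence intervals concrete, here is a minimal sketch of how AUC, PR AUC, and a percentile bootstrap interval could be computed. It rests on several assumptions: PR AUC is estimated with average precision (one common estimator), the interval is a plain percentile bootstrap over resampled examples, and the data are made up. The exact estimators and resampling unit used in the paper may differ.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else (0.5 if p == n else 0.0) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of precision at the rank of each positive.

    Tied scores are kept in input order; this sketch does not treat them specially.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


def bootstrap_ci(labels, scores, metric, n_iter=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for `metric`."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_iter:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # both classes are needed for AUC / PR AUC; redraw
        stats.append(metric(ys, [scores[i] for i in idx]))
    stats.sort()
    lo_idx = int(n_iter * alpha / 2)
    return stats[lo_idx], stats[n_iter - lo_idx - 1]


# Toy, made-up predictions (1 = deterioration within the time window).
y_true = [1, 0, 1, 0, 1, 0, 0, 1, 0, 0]
y_score = [0.8, 0.3, 0.6, 0.7, 0.9, 0.2, 0.4, 0.5, 0.1, 0.6]

print("AUC:", auc(y_true, y_score), bootstrap_ci(y_true, y_score, auc))
print("PR AUC:", average_precision(y_true, y_score),
      bootstrap_ci(y_true, y_score, average_precision))
```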

In a reader study of 200 images, our main finding is that COVID-GMIC outperforms two radiologists (A and B, with 3 and 17 years of experience, respectively) for time windows longer than 24 hours. Note that since the radiologists did not have access to clinical variables, their performance is not directly comparable to the COVID-GBM model; we include it only for reference.

Interpretability to establish trust with clinicians

We also qualitatively evaluated the saliency maps computed by COVID-GMIC. Two examples are shown in Figure 3. Both patients were admitted to the intensive care unit and were intubated within 48 hours. In the first example, there are diffuse airspace opacities, though the saliency maps primarily highlight the medial right basilar and peripheral left basilar opacities. Similarly, the two region-of-interest (ROI) patches (1 and 2) on the basilar region demonstrate comparable attention values, 0.49 and 0.46 respectively. In the second example, the extensive left mid to upper-lung abnormality (ROI patch 1) is highlighted, which correlates with the most extensive area of parenchymal consolidation.

**Figure 3:** From left to right: the original X-ray image, saliency maps for clinical deterioration within 24, 48, 72, and 96 hours, locations of region-of-interest (ROI) patches, and ROI patches with their associated attention scores.
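To give an intuition for what a per-patch attention score represents, the sketch below pools patch features with softmax attention weights: the weights are non-negative, sum to one, and indicate how much each patch contributes to the pooled representation. This is a deliberately simplified illustration with hypothetical names and made-up numbers; it is not the attention mechanism of COVID-GMIC, whose details are given in [1].

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(patch_features, patch_scores):
    """Return (attention weights, attention-weighted sum of patch features)."""
    weights = softmax(patch_scores)
    dim = len(patch_features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, patch_features)) for d in range(dim)]
    return weights, pooled


# Toy, made-up features for three ROI patches with two feature dimensions each.
features = [[0.2, 1.0], [0.3, 0.9], [0.9, 0.1]]
raw_scores = [2.0, 1.9, -1.0]

weights, pooled = attention_pool(features, raw_scores)
print("attention weights:", weights)
print("pooled feature:", pooled)
```

In this toy case the first two patches receive similar, large weights and the third receives a small one, which mirrors the situation where two ROI patches show comparable attention values.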

Deterioration risk curves

We designed a second model to compute deterioration risk curves, inspired by survival analysis. The second model, COVID-GMIC-DRC, predicts how the patient’s risk of deterioration evolves over time in the form of deterioration risk curves. The DRCs generated by the COVID-GMIC-DRC in the test set and the reliability plot are shown in Figure 4. The mean DRC for patients with adverse events (red dashed line) is higher than the DRC for patients without adverse events (blue dashed line) at all times. In the reliability plot, perfect calibration is indicated by the diagonal black dashed line. The figure shows that the model is well-calibrated.
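One way to picture a deterioration risk curve is through a discrete-time survival formulation: if a model outputs, for each time interval, the conditional probability (hazard) of an adverse event given none has occurred so far, the cumulative risk by the end of interval k is 1 minus the product of (1 - hazard) over intervals up to k. The sketch below assumes this formulation and uses made-up hazards; the post only states that the model is inspired by survival analysis, so the parameterization in COVID-GMIC-DRC may differ.

```python
def risk_curve(hazards):
    """Cumulative risk after each interval: 1 - prod_{j <= k} (1 - h_j)."""
    survival = 1.0
    curve = []
    for h in hazards:
        survival *= 1.0 - h  # probability of no event through this interval
        curve.append(1.0 - survival)
    return curve


# Toy, made-up per-interval hazards for two hypothetical patients.
higher_risk = [0.10, 0.15, 0.20, 0.10]
lower_risk = [0.01, 0.02, 0.02, 0.01]

print("higher-risk curve:", risk_curve(higher_risk))
print("lower-risk curve:", risk_curve(lower_risk))
```

By construction such a curve never decreases over time, which matches the reading of a DRC as the accumulated probability that deterioration has occurred by a given time.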

Implications for clinical practice

Overall, we developed and evaluated an AI system that is able to predict deterioration of COVID-19 patients presenting to the ED, where deterioration is defined as the composite outcome of mortality, intubation, or ICU admission. The system aims to provide clinicians with a quantitative estimate of the risk of deterioration, and how it is expected to evolve over time, in order to enable efficient triage and prioritization of patients at high risk of deterioration. The tool may be of particular interest for pandemic hotspots where triage at admission is critical to allocate limited resources such as hospital beds.

To allow for reproducibility and share our work with the research community, we made our code and parameters of trained models publicly available at https://github.com/nyukat/COVID-19_prognosis.

References

[1] Shen, Yiqiu, et al. "An interpretable classifier for high-resolution breast cancer screening images utilizing weakly supervised localization." Medical Image Analysis 68 (2021): 101908.

[2] Ke, Guolin, et al. "LightGBM: A highly efficient gradient boosting decision tree." Advances in Neural Information Processing Systems 30 (2017): 3146-3154.

npj Digital Medicine

An online open-access journal dedicated to publishing research in all aspects of digital medicine, including the clinical application and implementation of digital and mobile technologies, virtual healthcare, and novel applications of artificial intelligence and informatics.
