Since the outbreak of coronavirus disease 2019 (COVID-19), many research groups have worked on artificial intelligence (AI) models, especially machine learning models, for computer-aided diagnosis of the disease and prognostic analysis of patients. The main goal of prognostic analysis is to predict whether a confirmed patient will become severe or be at high risk of death after admission, or how long the patient will stay in the hospital. This is clinically important because it helps clinicians choose a suitable personalized treatment for the patient. Most prior studies focused on predicting the severity or mortality of COVID-19 patients with classification techniques. However, effective models for recovery-time prediction are still lacking.
In this study, we collected multimodal clinical information (covariates) from multiple hospitals in China and aimed to develop a model that simultaneously predicts a patient's recovery time and death risk. Developing such a model is challenging because of three key scientific problems:
- How to model a time-variant, non-linear relationship? A patient's disease status changes over time as a dynamic process of mutual influence between treatments and patient covariates, and the relationship between the clinical information and the event to predict is complicated and non-linear.
- How to simultaneously use data from patients with different outcomes (i.e., death, recovery, and censoring) to model the mutually exclusive events (i.e., recovery and death)? This requires a careful design of the model and the corresponding loss function.
- How to interpret the non-linear prediction of the model? In clinical practice, clinicians want to know which clinical factors are highly related to the prediction result, not just the result itself. Therefore, an interpretable mechanism is required to explain the non-linear prediction.
To tackle the above-mentioned problems, we developed a deep learning model named iCOVID for simultaneous prediction of a patient's death risk and recovery time. The proposed model has three main advantages:
- Considering more clinical information: iCOVID takes the treatment scheme and a large number of heterogeneous multimodal clinical predictors (i.e., clinical features collected within 48 hours after admission) as input to estimate the outcomes. The features include CT images, demographics, biomarkers, symptoms, and comorbidities. In particular, treatment information is treated as an important factor in our work.
- Clinical interpretability and ease of use: iCOVID contains an interpretable mechanism named “FSR” (Feature Significance Ranking), which automatically selects important clinical predictors for the prediction in an end-to-end manner. This mechanism provides individual interpretability for the prediction of each patient. More importantly, it helped us find the statistically important clinical features (e.g., Albumin, Hemoglobin) that are highly related to the predicted outcomes (i.e., recovery time or death risk). Thanks to this mechanism, the model can also achieve promising performance when fed with only the top-ranked predictors, which makes the model easier to use.
- Time-dependent prognostic prediction: iCOVID is a time-dependent regression model rather than a classification model. The output of the model is a recovery probability distribution over a time range (day 3 to day 32 since admission, where the 32nd day indicates the death risk). In most prior studies, the models were implemented only as classification tasks, e.g., classification of severity level, mortality risk, or length of hospital stay. In contrast, the proposed time-dependent prediction is closer to clinical practice.
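
To make the time-dependent output more concrete, the following is a minimal sketch of how such a distribution could be read and trained against the three outcome types (death, recovery, censoring). It is not the iCOVID implementation: the 30-bin layout (one bin per day from day 3 to day 32), the softmax output, and the helper names `softmax`, `summarize`, and `negative_log_likelihood` are all assumptions made for illustration, and the loss shown is a generic discrete-time likelihood rather than the loss used in the paper.

```python
import math

# Assumption: one output bin per day from day 3 to day 32 since admission.
# Bins for days 3..31 hold recovery probabilities; the day-32 bin is read as death risk.
FIRST_DAY, LAST_DAY = 3, 32
DAYS = list(range(FIRST_DAY, LAST_DAY + 1))


def softmax(logits):
    """Turn raw model scores into a probability distribution over the day bins."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def summarize(probs):
    """Return (death risk, expected recovery day given that the patient recovers)."""
    death_risk = probs[-1]
    recovery = probs[:-1]
    recovery_mass = sum(recovery)
    expected_day = sum(d * p for d, p in zip(DAYS[:-1], recovery)) / recovery_mass
    return death_risk, expected_day


def negative_log_likelihood(probs, outcome, day=None):
    """Hypothetical per-patient loss for the three outcome types.

    death   : -log of the mass on the day-32 bin
    recover : -log of the mass on the observed recovery day
    censor  : -log of the total mass strictly after the last observed day
    """
    eps = 1e-12  # avoids log(0)
    if outcome == "death":
        p = probs[-1]
    elif outcome in ("recover", "censor"):
        if day is None or not FIRST_DAY <= day < LAST_DAY:
            raise ValueError("day must be between 3 and 31")
        idx = day - FIRST_DAY
        p = probs[idx] if outcome == "recover" else sum(probs[idx + 1:])
    else:
        raise ValueError(f"unknown outcome: {outcome}")
    return -math.log(p + eps)


if __name__ == "__main__":
    # Placeholder logits standing in for a real model output.
    probs = softmax([0.1 * i for i in range(len(DAYS))])
    risk, day = summarize(probs)
    print(f"death risk: {risk:.3f}, expected recovery day: {day:.1f}")
    print(f"loss if censored on day 10: {negative_log_likelihood(probs, 'censor', 10):.3f}")
```

In this sketch a censored patient contributes only the information that no event occurred up to the last observed day, which is one common way to let all three outcome groups share a single output distribution.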
For more details, please refer to: https://www.nature.com/articles/s41746-021-00496-3.