Chest X-ray (CXR) and computed tomography (CT) are both important tools in the clinical workflow for diagnosing and prognosticating respiratory diseases, including lung cancer, tuberculosis (TB), and COVID-19 pneumonia. However, two factors still limit the clinical value of both imaging tools. On the one hand, the interpretation of screening results is subjective, so readings are inconsistent among clinicians with varying levels of expertise. On the other hand, manual assessment is time-consuming and labor-intensive. Scans pile up awaiting interpretation while healthcare capacity remains limited. If the “golden hours” for diagnosis and treatment are missed, diseases may worsen, making clinical management more difficult.
Critical gaps between health-care resources and urgent demands in the respiratory field
China faces a serious shortage of medical resources per capita, in terms of both human resources and access to medical facilities, compounded by inequitable allocation of those resources. Expert doctors in tertiary hospitals are severely overloaded, with large numbers of patients waiting for proper management, while patients in remote mountainous areas are more likely to be misdiagnosed by less experienced doctors. The "Healthy China 2030" blueprint outlines precision medicine as a key development direction. At the same time, technological advances are playing an increasingly prominent role in promoting health, and AI is emerging as a highly promising assistant for reforming the medical workflow. In cities, AI-powered tools could relieve the pressure on exhausted medical workers and significantly improve the efficiency of the clinical workflow. In rural areas, inexperienced doctors could benefit considerably from automatic analyses built on expert knowledge.
Establishing a modular CT/CXR-based pipeline: DeepMRDTR
In the chest, a wide range of imaging abnormalities can occur concurrently and be scattered across various locations within the same imaging modality, hindering accurate diagnosis and therapy for respiratory diseases. Previous studies, however, did not fully reflect actual clinical scenarios; they focused instead on binary diagnosis of a single disease or blurred the boundaries between abnormalities and diseases. Making better use of bulk radiological images has therefore become an important and valuable undertaking.
After an intensive data collection process, we reached the largest sample size so far: 434,735 real-world patients, with 1,294,475 EHRs, 228,563 CT volumes, and 129,319 CXR images. The DeepMRDTR system was trained, optimized, and validated (Figure 1) on “gold standards” automatically derived from high-quality radiological reports and multi-modal discharge diagnosis records using natural language processing (NLP) techniques. We covered 20 radiological abnormalities and eight common respiratory diseases, selected according to prevalence: bronchiectasis, chronic obstructive pulmonary disease (COPD), interstitial lung disease (ILD), lung cancer, pleural effusion, pneumonia, pneumothorax, and tuberculosis (TB).
The DeepMRDTR system comprises three models (NLP, CT-Net, and CXR-Net) and is designed to provide the final disease prediction as well as quantitative probabilities of lesion characteristics. In addition, class activation map (CAM) heatmaps are provided to enhance model interpretability. Our study could serve as a good example of leveraging NLP to make full use of large-scale EMR data. Furthermore, the broad inclusion of abnormalities and diseases makes the model applicable to complicated respiratory situations.
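To make the idea of a CAM heatmap concrete, here is a minimal sketch of the classic class activation map computation: a weighted sum of the last convolutional feature maps, using the classifier weights of one class. It is an illustration under stated assumptions, not the DeepMRDTR implementation: the function name, the tiny 2x2 feature maps, and the clipping and rescaling steps (a common display convention) are all hypothetical, and the post does not specify which CAM variant was used.

```python
from typing import List

Grid = List[List[float]]


def class_activation_map(feature_maps: List[Grid], class_weights: List[float]) -> Grid:
    """Weighted sum of feature maps for one class, clipped at 0 and scaled to [0, 1].

    feature_maps: one 2D grid per channel of the last convolutional layer (assumed input).
    class_weights: the classifier weight of each channel for the class of interest.
    """
    if not feature_maps or len(feature_maps) != len(class_weights):
        raise ValueError("need one weight per feature map")
    height, width = len(feature_maps[0]), len(feature_maps[0][0])

    # Accumulate weight * activation at every spatial location.
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for r in range(height):
            for c in range(width):
                cam[r][c] += weight * fmap[r][c]

    # Display convention (an assumption here): keep positive evidence only,
    # then rescale so the strongest location equals 1.
    cam = [[max(value, 0.0) for value in row] for row in cam]
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


# Toy example: two 2x2 channels, the first weighted more strongly.
toy_maps = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 2.0]]]
toy_heatmap = class_activation_map(toy_maps, [2.0, 0.5])
print(toy_heatmap)
```

In practice the resulting grid would be upsampled to the image size and overlaid on the scan, so that a reader can see which regions contributed most to a prediction.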
Behind the paper, our study had to overcome many difficulties. Data collection was time-consuming and labor-intensive, and communication with other hospitals was initially inefficient. In response, we established a systematic online dataset for data collection. The DeepMRDTR system was trained on data from West China Hospital of Sichuan University, one of the top Grade A tertiary hospitals, and validated at another independent institution. Drawing on this experience, we hope to see more and deeper collaboration in the respiratory research community in the future.
Robust diagnostic performance and actual clinical value in remote areas
In such a difficult multi-class diagnosis task, our algorithms achieved highly promising results. In identifying about 20 chest abnormalities, the CT-Net model achieved an average multi-way AUC of 0.856, and the CXR-Net model achieved a slightly lower AUC of 0.841. The abnormality detection model performed especially well on emphysema, pneumoperitoneum, and pneumothorax, all of which are acute clinical conditions requiring prompt treatment. Our disease diagnostic algorithms achieved an AUC of 0.900 on CT images and 0.866 on CXR data. For lung cancer, for which the traditional gold standard is the detection of cancer cells on pathology slides, our model achieved an AUC of 0.952. This result points to a direction that might change lung cancer clinical practice, maximizing diagnostic efficiency and allowing patients to receive relatively timely treatment before the pathology results, which require invasive procedures and take a long time, become available.
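For readers less familiar with the metric, the sketch below shows one common way an average AUC over several abnormalities can be computed: a one-vs-rest AUC per abnormality, followed by an unweighted (macro) average. This is an assumption about what "average multi-way AUC" means, not a description of the exact evaluation code in the paper; the function names and toy scores are hypothetical.

```python
from typing import Dict, Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def macro_auc(labels_by_class: Dict[str, Sequence[int]],
              scores_by_class: Dict[str, Sequence[float]]) -> float:
    """Unweighted mean of the per-abnormality one-vs-rest AUCs."""
    aucs = [binary_auc(labels_by_class[name], scores_by_class[name])
            for name in labels_by_class]
    return sum(aucs) / len(aucs)


# Toy data for four scans and two abnormalities (made-up numbers).
toy_labels = {"pneumothorax": [1, 0, 1, 0], "emphysema": [0, 1, 0, 1]}
toy_scores = {"pneumothorax": [0.9, 0.2, 0.6, 0.7], "emphysema": [0.1, 0.8, 0.3, 0.9]}
print(macro_auc(toy_labels, toy_scores))  # 0.875 = (0.75 + 1.0) / 2
```

The pairwise loop is quadratic and meant only for clarity; on real datasets a rank-based implementation from an established library would be the natural choice.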
Deploying our system in clinical practice could improve the performance of junior radiologists to a level comparable to that of their seniors. To achieve a clinically actionable diagnosis, we implemented a preliminary version of DeepMRDTR in the clinical workflow, where it performed on par with senior experts in disease diagnosis, with an AUC of 0.890 and a Cohen's kappa of 0.746–0.877, within an acceptable timescale. These findings demonstrate the potential of our model to serve as a triage tool that facilitates early diagnosis of respiratory diseases and supports better clinical diagnosis and decision-making.
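Cohen's kappa measures how much two sets of diagnoses agree beyond what chance alone would produce. The sketch below implements the standard formula, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name and the toy diagnoses are hypothetical, and the post does not say which pairs of readers the reported range refers to.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Chance-corrected agreement between two raters labelling the same cases."""
    if not rater_a or len(rater_a) != len(rater_b):
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)

    # Observed agreement: fraction of cases given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of the two raters' label frequencies, summed over labels.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label for every case.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy example: a model and an expert disagree on one of four cases.
model_reads = ["TB", "TB", "COPD", "COPD"]
expert_reads = ["TB", "TB", "COPD", "TB"]
print(cohen_kappa(model_reads, expert_reads))  # 0.5
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance, which is why it is often reported alongside AUC when comparing a model with human readers.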
Because it can provide reliable advice without limits of time and space, our model could not only help young physicians work more effectively and grow more quickly into experienced experts, but also prove useful in remote areas. First, we defined the breadth of usage of our model by comparing its relative performance on CT and CXR. We determined the specific diseases for which the diagnostic accuracy of humans plus AI using CXR images can reach that of humans alone using CT images, and gained clues about which diseases are not suitable for CXR screening even with an AI assistant. Further, to make DeepMRDTR more accessible in areas where fast internet is less available, we designed our system so that it can be installed offline in hospitals to build an AI-assisted workflow. Several challenges may remain, including incompatibility of such systems with local medical equipment and additional patient waiting time due to model inference. To address these concerns, we have developed easy-to-obtain dockers that connect the DeepMRDTR system to PACS or image scanners and make the software available. With continued efforts to integrate DeepMRDTR into the clinical workflow, the time our system needs to reach an ‘actionable’ diagnosis will be substantially shortened.
Foreseeable endeavors ahead
This study offers one example of how leveraging the power of AI could transform the actual clinical workflow, and it is just one sparkling drop in the unstoppable current of China’s AI-driven healthcare revolution. To enhance efficiency and accuracy in abnormality detection and disease classification, clinical workers are calling for a multidimensional, multimodal, and multiparametric automatic diagnostic network built on comprehensive, realistic medical scenarios. This will require efforts from all walks of life, from researchers and clinicians to the general public.