Pan-cancer diagnostic consensus through searching archival histopathology images using artificial intelligence

In this study, we report the results from searching the largest public repository of whole-slide images, The Cancer Genome Atlas (TCGA). We indexed and searched almost 30,000 high-resolution digitized slides. The proposed method achieved accuracy values above 90% for many cancer types.

A system that combines artificial intelligence (AI) with human knowledge promises faster and more accurate cancer diagnosis. The paper describes the validation of a new type of image search engine, called Yottixel, developed by a team led by researchers at Kimia Lab (University of Waterloo) and Huron Digital Pathology. Yottixel uses digital images of tissue samples to match new cases of suspected cancer with previously diagnosed cases in a database within a fraction of a second. The search engine was validated on the largest publicly available archive in the world, comprising about 30,000 digitized slides from almost 11,000 patients. The technology achieved high accuracy values (in some cases in the high 90s) for 32 forms of cancer in 25 organs and body parts.

The Search Engine

The major novelty of the “Yottixel Image Search Engine” is the way it represents whole-slide images (WSIs). Each WSI is reduced to a set of representative patches, and each patch is converted into a barcode using deep models. Together, these barcodes form a Bunch of Barcodes (BoB), a compact characterization of the WSI. Compared to other approaches, the BoB index requires less computation and storage when searching large archives of histopathology slides. For example, a WSI of 200-300 MB can be converted to a BoB index of ~10 KB, a size reduction of more than 99.9%.
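To make the idea of a barcode index concrete, here is a minimal Python sketch. It is not the Yottixel implementation: the binarization rule (one bit per rise or fall between adjacent feature values), the Hamming distance, the median-of-minimum comparison between two BoBs, and all function names are assumptions chosen for illustration. The deep model that would produce the patch features is omitted.

```python
from statistics import median
from typing import Dict, List, Sequence, Tuple


def to_barcode(features: Sequence[float]) -> int:
    """Binarize a feature vector into an integer barcode (assumed rule):
    one bit per adjacent pair, set to 1 when the value rises."""
    bits = 0
    for prev, curr in zip(features, features[1:]):
        bits = (bits << 1) | int(curr > prev)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two barcodes of equal length."""
    return bin(a ^ b).count("1")


def bob_distance(query_bob: List[int], archived_bob: List[int]) -> float:
    """Distance between two bunches of barcodes (assumed rule): for each
    query barcode take its closest archived barcode, then the median."""
    return median(
        min(hamming(q, c) for c in archived_bob) for q in query_bob
    )


def search(
    query_bob: List[int], archive: Dict[str, List[int]], top_n: int = 3
) -> List[Tuple[str, float]]:
    """Return the top_n archived slides closest to the query, best first."""
    scored = [(sid, bob_distance(query_bob, bob)) for sid, bob in archive.items()]
    return sorted(scored, key=lambda pair: pair[1])[:top_n]
```

Because each barcode is just a short bit string, comparing two slides reduces to a handful of XOR and bit-count operations, which is one plausible reason a barcode index can be searched quickly and stored cheaply.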

Computational Consensus

The accelerated adoption of digital pathology coincides with, and is probably partly attributable to, recent progress in AI applications in pathology. This disruption offers a historic chance to find novel solutions for major challenges in diagnostic histopathology and adjacent fields, including biodiscovery. The main motivation behind our study was to answer one question: can one build a computational consensus to potentially remedy the high intra- and inter-observer variability in diagnosing certain tumors? To answer this question, we first performed a horizontal search (across the entire archive) to verify the basic recognition capabilities of the image search engine. We then performed leave-one-patient-out vertical searches (among slides from the same primary site) to examine the accuracy of the top-n search results for establishing a diagnostic consensus on cancer subtypes through majority voting.
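The consensus step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used in the study: the `Slide` record, the `consensus_subtype` function, the pluggable `distance` callable, and the strict-majority rule (more than half of the retrieved slides must agree) are all hypothetical choices.

```python
from collections import Counter
from typing import Any, Callable, List, NamedTuple, Optional


class Slide(NamedTuple):
    patient_id: str
    subtype: str
    index: Any  # any searchable representation of the slide


def consensus_subtype(
    query: Slide,
    archive: List[Slide],
    distance: Callable[[Any, Any], float],
    top_n: int = 5,
) -> Optional[str]:
    """Suggest a subtype for the query slide by majority vote over its
    top_n nearest archived slides, or None if no strict majority exists."""
    # Leave-one-patient-out: never match a slide against the same patient.
    candidates = [s for s in archive if s.patient_id != query.patient_id]
    ranked = sorted(candidates, key=lambda s: distance(query.index, s.index))
    retrieved = ranked[:top_n]
    if not retrieved:
        return None
    label, count = Counter(s.subtype for s in retrieved).most_common(1)[0]
    # Assumed "conservative" rule: require more than half of the votes.
    return label if count > len(retrieved) / 2 else None
```

Excluding every slide from the query's own patient matters because slides from the same patient can look nearly identical, which would otherwise inflate the measured accuracy.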

We found that AI can help us tap into our medical wisdom, which at the moment is just stored in archives in the form of evidently diagnosed cases. The results were assessed with conservative "majority voting" to build consensus for subtype diagnosis. They showed high accuracy values for both frozen section slides (e.g., bladder urothelial carcinoma 93%, kidney renal clear cell carcinoma 97%, and ovarian serous cystadenocarcinoma 99%) and permanent histopathology slides (e.g., prostate adenocarcinoma 98%, skin cutaneous melanoma 99%, and thymoma 100%). The key finding of this validation study was that computational consensus appears to be possible for rendering diagnoses if a sufficiently large number of searchable and evidently diagnosed cases is available for each cancer subtype. We also analyzed the search results using an interactive chord diagram, which shows how different cancer subtypes are interconnected through morphological and visual similarities in their WSIs.

Figure 1: Chord diagram showing interconnections between different cancer subtypes. An interactive version is available online.

Future of Search

Based on the known and verified findings for the majority of similar images, the Yottixel Image Search Engine could recommend a diagnosis for a new case. The study shows that highly accurate results are possible when a large archive is available. More work is needed to analyze the findings and refine the search engine, but the results so far demonstrate that Yottixel has potential as a screening tool to both speed up and improve the accuracy of cancer diagnoses by pathologists. It could also save lives worldwide by enabling remote access to inexpensive diagnosis in developing regions. This technology could be a blessing in places where there simply aren’t enough specialists. One could just send an image attached to an email and get a search-based report back.
