A system that combines artificial intelligence (AI) with human knowledge promises faster and more accurate cancer diagnosis. The paper describes the validation of a new type of powerful image search engine, called Yottixel, developed by a team led by researchers at Kimia Lab (University of Waterloo) and Huron Digital Pathology. Yottixel uses digital images of tissue samples to match new cases of suspected cancer with previously diagnosed cases in a database within a fraction of a second. The search engine was validated using the largest publicly available archive in the world, comprising about 30,000 digitized slides from almost 11,000 patients. The technology achieved high accuracy values (in some cases in the high 90s) for 32 forms of cancer in 25 organs and body parts.
The Search Engine
The major novelty of the “Yottixel Image Search Engine” is the approach it uses for representing whole slide images (WSIs). Each WSI is reduced to a set of representative patches, and each patch is converted into a barcode using deep models. The resulting set of barcodes, called a Bunch of Barcodes (BoB), is a compact characterization of a WSI. Compared to other approaches, the BoB index requires less computation and storage for searching large archives of histopathology slides. For example, a WSI of 200-300 MB can be converted to a BoB index of ~10 KB, offering up to a 99.9% reduction in size.
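To make the idea of barcode-based indexing concrete, here is a minimal sketch in Python. It is not the Yottixel implementation: the binarization rule (one bit per pair of adjacent feature values), the use of Hamming distance, and the median-of-minimums comparison between two BoBs are all assumptions chosen for illustration, and the function names `to_barcode`, `hamming`, and `bob_distance` are hypothetical.

```python
import statistics
from typing import List

# A barcode is a bit string; here it is packed into a Python int.
Barcode = int


def to_barcode(features: List[float]) -> Barcode:
    """Binarize a deep feature vector (assumed rule, for illustration only):
    emit 1 where the next feature value is larger than the current one."""
    bits = 0
    for current, following in zip(features, features[1:]):
        bits = (bits << 1) | (1 if following > current else 0)
    return bits


def hamming(x: Barcode, y: Barcode) -> int:
    """Number of bit positions in which two barcodes differ."""
    return bin(x ^ y).count("1")


def bob_distance(query: List[Barcode], candidate: List[Barcode]) -> float:
    """Compare two bunches of barcodes (assumed scheme): match every query
    barcode to its closest candidate barcode, then take the median."""
    best_matches = [min(hamming(q, c) for c in candidate) for q in query]
    return statistics.median(best_matches)
```

Because each patch collapses to a short bit string and comparisons are XOR-and-count operations, an index of this kind can stay small and fast to search, which is the property the paragraph above attributes to the BoB index.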
The accelerated adoption of digital pathology coincides with, and is probably partly attributable to, recent progress in AI applications in the field of pathology. This disruption offers a historic chance to find novel solutions for major challenges in diagnostic histopathology and adjacent fields, including biodiscovery. The main motivation behind our study was to answer one question: can one build a computational consensus to potentially remedy the high intra- and inter-observer variability in diagnosing certain tumors? To answer this question, we performed a horizontal search to verify the basic recognition capabilities of the image search engine. Furthermore, we performed leave-one-patient-out vertical searches to examine the accuracy of top-n search results for establishing a diagnostic consensus through majority voting for cancer subtypes.
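The leave-one-patient-out evaluation with top-n majority voting can be sketched as follows. This is an illustrative outline under stated assumptions, not the study's code: the `Slide` record, the pluggable `distance` function, the default `n=5`, and the strict-majority rule (more than half of the top-n results must agree) are all hypothetical choices.

```python
from collections import Counter
from typing import Callable, List, NamedTuple, Optional


class Slide(NamedTuple):
    slide_id: str
    patient_id: str
    subtype: str        # confirmed diagnosis of the archived case
    descriptor: tuple   # hypothetical stand-in for the slide's search index


def majority_vote(labels: List[str]) -> Optional[str]:
    """Return the label held by a strict majority, or None if there is none."""
    if not labels:
        return None
    top, count = Counter(labels).most_common(1)[0]
    return top if count > len(labels) / 2 else None


def leave_one_patient_out_accuracy(
    slides: List[Slide],
    distance: Callable[[tuple, tuple], float],
    n: int = 5,
) -> float:
    """Query with every slide while hiding all slides of the same patient,
    then check whether the top-n majority vote matches the true subtype."""
    correct = 0
    for query in slides:
        pool = [s for s in slides if s.patient_id != query.patient_id]
        pool.sort(key=lambda s: distance(query.descriptor, s.descriptor))
        predicted = majority_vote([s.subtype for s in pool[:n]])
        if predicted == query.subtype:
            correct += 1
    return correct / len(slides) if slides else 0.0
```

Excluding every slide from the query's own patient keeps near-duplicate tissue from the same person out of the results, so a correct vote has to come from other patients' cases.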
We found that AI can help us tap into our medical wisdom, which at the moment is simply stored in archives in the form of evidently diagnosed cases. The results were assessed with conservative "majority voting" to build consensus for subtype diagnosis. They demonstrated high accuracy values for both frozen section slides (e.g., bladder urothelial carcinoma 93%, kidney renal clear cell carcinoma 97%, and ovarian serous cystadenocarcinoma 99%) and permanent histopathology slides (e.g., prostate adenocarcinoma 98%, skin cutaneous melanoma 99%, and thymoma 100%). The key finding of this validation study was that computational consensus appears to be possible for rendering diagnoses if a sufficiently large number of searchable and evidently diagnosed cases are available for each cancer subtype. We analyzed the search results using an interactive chord diagram, which shows how different cancer subtypes are interconnected due to morphological and visual similarities in their WSIs.
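As a rough illustration of the data behind such a chord diagram, the sketch below counts how often a query of one subtype retrieves slides of a different subtype. The input format and the function name `subtype_links` are assumptions made for this example, and the subtype labels in the usage note are placeholders rather than results from the study.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def subtype_links(
    search_results: Iterable[Tuple[str, List[str]]],
) -> Dict[Tuple[str, str], int]:
    """Count cross-subtype retrievals.

    Each item pairs the query's subtype with the subtypes of its retrieved
    slides. Links are undirected, so (A, B) and (B, A) share one counter.
    """
    links: Counter = Counter()
    for query_subtype, retrieved in search_results:
        for hit in retrieved:
            if hit != query_subtype:
                pair = tuple(sorted((query_subtype, hit)))
                links[pair] += 1
    return dict(links)
```

For example, `subtype_links([("A", ["A", "B", "B"]), ("B", ["A", "B"])])` yields a single link between the placeholder subtypes "A" and "B" with weight 3; each such weight would set the thickness of one chord.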
Figure 1: Chord diagram showing interconnections between different cancer subtypes. An interactive version is available online.
Future of Search
Based on the known and verified findings of the majority of similar images, the Yottixel Image Search Engine could recommend a diagnosis for a new case. The study shows that highly accurate results are possible given access to a large archive. More work is needed to analyze the findings and refine the search engine, but the results so far demonstrate that Yottixel has potential as a screening tool to both speed up and improve the accuracy of cancer diagnoses by pathologists. It could also save lives worldwide by enabling remote access to inexpensive diagnosis in developing regions. This technology could be a blessing in places where there simply aren’t enough specialists. One could just send an image attached to an email and get a search-based report back.