Quantitative Video Analysis for Behavioral Studies at Home: An Active Sensing Paradigm

In spite of significant recent advances in molecular genetics and neuroscience, behavioral ratings based on clinical visual observations are still the gold standard for screening, diagnosing, and assessing outcomes in mental health disorders.


The paper in npj Digital Medicine is here: https://go.nature.com/2kF615X

Such behavioral ratings are subjective, require significant clinician expertise and training, typically do not capture data from participants in their natural environments, and are not scalable for large population screening, low-resource settings, or longitudinal monitoring, all of which are critical for outcome evaluation in multi-site studies and for understanding and evaluating symptoms in the general population. The development of innovative and efficient approaches to behavioral screening, diagnosis, and monitoring is thus a significant unmet need in healthcare. It is critical to develop validated (and, if so required, FDA-approved), low-cost, and scalable tools for the behavioral analysis of mental health conditions. In the paper “Automatic emotion and attention analysis of young children at home: a ResearchKit autism feasibility study,” we report our work on autism spectrum disorder (ASD), though this feasibility study has implications beyond that.

We have developed, tested, refined, and begun to validate automatic, objective, and quantitative measurements of ASD symptom-related behaviors, based on visual attention and affective facial expressions, that can serve as sensitive measures for ASD screening and assessment of outcomes. Our goal is to reliably capture quantitative, objective behavioral data from participants across a wide age range and functioning level and in diverse settings (from the clinic to their natural environments), without the need for rater training or costly equipment. Our interdisciplinary team has been developing tools based on computer vision analysis (CVA) that automatically code a child’s behavior while he or she watches stimuli (on an iPad, iPhone, laptop, or desktop) designed to elicit behaviors relevant to the disease or developmental symptoms. The stimuli, video recording, and automatic analysis are all integrated in these ubiquitous devices, producing a software-only, objective solution with no need for specialized hardware and allowing us to assess participants in their natural environment. This approach both validates known biomarkers and discovers new ones, thanks to the volume of data and the accuracy of the sensing and algorithms. It offers a scalable solution to the challenges of reliable, quantitative behavioral assessment and phenotyping, and will move us from the standard, moment-in-time, single-visit assessment paradigm toward continuous-observation assessment.

The work we are pursuing, partially reported in this paper, is based on active sensing, in which we integrate stimulus design, the ubiquitous availability of hardware, and computer vision and machine learning. Science-based, child-friendly short movies are designed to elicit behaviors, the design both exploiting and being constrained by the capabilities of the available hardware and algorithms. Close interaction between domain experts and algorithm experts is critical here: domain experts have their “ideal stimuli,” while algorithm experts know what is measurable and analyzable. The two teams interact to converge on stimuli that achieve the same behavioral signal the ideal stimuli would have achieved, but without the need for any additional hardware. In this fashion, a very valuable signal is sensed and automatically encoded in just a few minutes of observation, as demonstrated in this paper. This is in sharp contrast with passive sensing, where the patient or participant carries a device such as a watch and is constantly monitored, and where, due to uncontrolled scenarios, the signal-to-noise ratio is often very low. For behavioral analysis, it is clear that passive measures (e.g., monitoring movement, activity, and location) should be combined with carefully designed active ones such as those in this paper.
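To make the active-sensing idea concrete, here is a minimal illustrative sketch (not the pipeline from the paper) of how per-frame computer-vision codings, recorded while a child watches a stimulus movie, might be aggregated into session-level attention and emotion metrics. The `FrameCoding` record and `summarize_session` function are hypothetical names invented for this example; a real system would obtain the per-frame labels from face and gaze detectors running on the device.

```python
from dataclasses import dataclass
from collections import Counter

@dataclass
class FrameCoding:
    """Hypothetical per-frame output of a computer-vision analysis pipeline."""
    t: float          # seconds from stimulus onset
    attending: bool   # gaze directed at the screen (detector output, assumed)
    emotion: str      # coarse facial-affect label, e.g. "positive"/"neutral"

def summarize_session(frames):
    """Aggregate per-frame codings into simple session-level metrics."""
    if not frames:
        return {"attention_fraction": 0.0, "dominant_emotion": None}
    attention_fraction = sum(f.attending for f in frames) / len(frames)
    emotions = Counter(f.emotion for f in frames)
    dominant_emotion, _ = emotions.most_common(1)[0]
    return {"attention_fraction": attention_fraction,
            "dominant_emotion": dominant_emotion}

# Toy session: four frames sampled at roughly 30 fps
session = [
    FrameCoding(0.00, True, "positive"),
    FrameCoding(0.03, True, "positive"),
    FrameCoding(0.06, False, "neutral"),
    FrameCoding(0.09, True, "positive"),
]
print(summarize_session(session))
# → {'attention_fraction': 0.75, 'dominant_emotion': 'positive'}
```

The point of the stimulus design described above is precisely to make such simple summaries meaningful: because the movie is engineered to elicit specific attentional and affective responses at known moments, even coarse per-frame labels carry a usable behavioral signal.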

The work reported in this paper is just the beginning and one representative of the remarkable activity in the field (as clearly demonstrated, for example, by the broad participation in the recent NIMH Brain Behavior Quantification Workshop). In ASD, efforts such as those carried out by SPARK (https://sparkforautism.org) and the NIH Autism Center of Excellence at Duke University (G. Dawson, PI) and at other institutions will produce multimodal data, including behavioral data, that will revolutionize the field. Other challenges, from eating disorders to ADHD to depression, have similar needs, and computer vision and machine learning tools for behavioral analysis in natural environments are starting to be applied there as well. Automatic behavioral coding in natural environments, where patients and participants spend their day, is imperative. Our work demonstrates not only that the sensing devices and computational tools are starting to be mature enough for this, but also that stakeholders, from healthcare providers to participants, are more than ready for this next generation of mental and developmental health.

Last words: Please visit the website for this project to learn about the incredible interdisciplinary team that participated in this adventure, https://autismandbeyond.researchkit.duke.edu/our-team. This not only humbles engineers like me, privileged to work with such worldwide clinical leaders (Prof. Egger and Prof. Dawson, for me personally), but also illustrates the new era of medicine.

Acknowledgments: The most important contributors to this work are the participants, and we thank them for their time and help to advance mental and developmental health. 

Guillermo Sapiro

Professor, Duke University