Using voice as a biomarker for diagnosis

In 5 seconds

Doctors could soon have a new tool to detect disorders and illnesses such as pneumonia, Alzheimer’s and autism, thanks to a Canada-U.S. database of human voices to which UdeM is contributing.

Artificial intelligence may soon help doctors diagnose and treat diseases, including cancer and depression, based on the sound of a patient’s voice, as 12 leading research institutions - including Université de Montréal - work to establish voice as a biomarker to be used in clinical care.

With US$14 million in funding over four years from the U.S. National Institutes of Health, the project is led by Dr. Yael Bensoussan of the University of South Florida and Olivier Elemento of Weill Cornell Medicine in New York City, along with six other institutions in the U.S. and four in Canada.

This is one of several projects funded by a new NIH Common Fund program called Bridge to Artificial Intelligence (Bridge2AI). French-American AI biotech startup Owkin is supplying the technology for the database.

Called Voice as a Biomarker of Health, the project aims to ethically collect hundreds of thousands of human voices while ensuring diversity and patients’ privacy. Machine learning models will then be trained to spot diseases by detecting changes in the human voice, at low cost.

We asked the project’s sole Quebec principal investigator, Vardit Ravitsky, a bioethics professor in UdeM’s School of Public Health and a senior lecturer on Global Health and Social Medicine at Harvard Medical School, to tell us more.

First of all, what kinds of diseases could this database be used to detect?

Research performed on this database could allow us to better detect voice disorders, such as laryngeal cancers or vocal fold paralysis; respiratory disorders, such as pneumonia or chronic lung diseases; neurological and neurodegenerative disorders, such as Alzheimer’s, Parkinson’s, stroke, or ALS; mood and psychiatric disorders, such as depression, schizophrenia, or bipolar disorder; and pediatric voice and speech disorders, such as autism or speech and language delays. Each of these diseases has been studied individually, and there is already scientific evidence that changes in voice can be related to them. But we need additional research, assisted by AI, to know more.

How will the voices be collected?

At first, data collection will be done by clinicians at expert medical centres through app-based software. However, during the third and fourth years of the project we plan to also collect data in remote and underserved communities and through crowdsourcing. We will use a technology called federated learning, which allows machine learning models to be trained on these voice samples without the samples ever leaving their original location. This will show that AI-based research can be deployed across multiple research centres while preserving the privacy and security of sensitive voice data.
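The idea behind federated learning can be illustrated with a minimal sketch of federated averaging: each site fits a model on its own private data and shares only the resulting parameters, which a coordinator averages. Everything here (the one-parameter model, the hypothetical clinic data, the function names) is illustrative, not the project's actual stack.

```python
# Minimal sketch of federated averaging (FedAvg). Each "clinic" trains a
# simple one-parameter model (y ~ w * x) locally; only the trained weights
# are shared and averaged. Raw samples never leave their site.

def local_train(weight, samples, lr=0.1, epochs=20):
    """Fit y ~ weight * x by gradient descent on one site's private data."""
    for _ in range(epochs):
        grad = sum(2 * (weight * x - y) * x for x, y in samples) / len(samples)
        weight -= lr * grad
    return weight

def federated_round(global_weight, sites):
    """Each site trains locally; only the updated weights are averaged."""
    local_weights = [local_train(global_weight, s) for s in sites]
    return sum(local_weights) / len(local_weights)

# Three hypothetical clinics, each holding (feature, label) pairs locally.
clinics = [
    [(1.0, 2.1), (2.0, 3.9)],   # site A
    [(1.5, 3.0), (3.0, 6.2)],   # site B
    [(0.5, 1.0), (2.5, 5.1)],   # site C
]

w = 0.0
for _ in range(10):
    w = federated_round(w, clinics)
print(round(w, 2))  # converges near 2.0, the slope shared across sites
```

In a real deployment the shared updates would themselves be protected (e.g. with secure aggregation), but the core property shown here is the one described above: the model travels, the data does not.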

Will AI be able to spot variations in how a person speaks and send up red flags?

The primary objective of this project is to create the database of voice samples in such a way that it is ready for analysis by AI. Eventually, based on this research, we foresee a future where AI will be able to notice differences in voice that are characteristic of certain diseases. For example, in Parkinson's disease the voice becomes monotonous, with lower amplitude. Trained voice specialists would usually suspect Parkinson's right away when they hear these variations, but not all patients have access to such specialists. The goal is to make AI algorithms as good as trained voice specialists, so that this diagnostic tool is also available in lower-resource settings. This would improve outcomes for patients and promote equity in healthcare.

How will this project be different from existing use of voice data?

Although preliminary work with voice data has been promising, previous research had serious limitations, and it has therefore been challenging to integrate voice as a biomarker into clinical practice. For example, previous studies used only small datasets, whereas for clinical use we need robust evidence based on very large numbers. Additionally, past research raised ethical concerns such as who owns the voice data, how to protect patient privacy, and a lack of diversity: if all the samples come from one ethnic or age group, or one gender, the diagnostic tools that are developed may be less effective for others. To solve these issues, the Voice as a Biomarker of Health project is creating a large, high-quality, multi-institutional, and diverse voice database. Our data will be de-identified or identity-protected, which means that a voice sample will not be directly linked to identifying data about the patient, such as demographics, medical imaging and genomics.

So what are the ethical issues involved?

I’m glad you asked, since the Canadian team based at UdeM and Simon Fraser University – led by ethicist Jean-Christophe Bélisle-Pipon, an adjunct professor at UdeM and assistant professor at SFU – will focus on the ethical, legal, and social issues this raises. Our unique contribution to this vast effort will be to develop the governance necessary to ensure all this is done responsibly. Some of the issues are novel because voice is not yet a part of clinical care: there are currently no adequate best practices, regulations, or safeguards for its use, and our team will be the one developing them.

For example, it is almost paradoxical that, because voice is so easy to collect, privacy issues are actually exacerbated by this technology. We need to ensure patient data is protected and not used for anything except medical purposes. We need to build this database with samples collected from diverse populations, to ensure equity in the future clinical use of this biomarker. We need to develop consent mechanisms so that patients understand for what purposes they are being recorded and what can be done with their voice in the future. We also need to clarify issues related to voice data ownership: can your voice be shared or sold? Who can benefit from it? Can data owned by health care systems, clinicians, and patients be shared with commercial entities for the development of AI models? These are just some examples of the complex ethical issues we will be addressing.

In sum, how revolutionary could this use of new technology be?

Very! If the infrastructure is well developed, this could represent the start of an international collaborative mission, akin to the Human Genome Project, where voice data would be used by thousands of researchers and then – based on that research – by clinicians worldwide. It could allow new and important discoveries and enhance what precision medicine has to offer patients. Voice is unique to individuals and can be collected easily in low-resource settings, in cost-effective and non-invasive ways. We hope to create the infrastructure to collect voice data in an accessible and ethical way, so that people feel safe sharing their voice and associated medical data. Imagine a world in which you can record yourself through a specialized app on your mobile device and send the recording to your doctor as an additional powerful diagnostic tool! Think about how much easier that is than giving a blood sample or going for an imaging test such as a CT scan! This is what we hope will be the eventual outcome of this big research effort.

Another ethical challenge: mapping human cells using AI

Professors Ravitsky and Bélisle-Pipon are principal investigators on a second $20-million U.S. project funded under the same NIH program, Bridge2AI, titled Cell Maps for Artificial Intelligence.

This project seeks to map the architecture of human cells and use these maps to allow a better understanding of the relationship between genotype (a patient’s DNA) and phenotype (the conditions the patient is suffering from).

In genomics, machine-learning models are often “black boxes”: they use genomic information to predict the phenotype without explaining the mechanisms that underlie this translation. To address this gap, the project will use three complementary mapping approaches to better understand the relationship between cellular structure and function.

The project will stimulate research and development in “visible” machine-learning systems that allow researchers and clinicians to understand the relationships AI reveals. Ravitsky and Bélisle-Pipon will explore ways to develop these systems in an ethical and trustworthy way.

“Voice-data and human cell maps research are both emerging research fields, and at present, there is little to no guidance with respect to the ethical, legal and social implications of this work,” said Bélisle-Pipon.

“Within the Bridge2AI collaboration, there will be an opportunity for professor Ravitsky and myself to identify, anticipate, address, and provide guidance to other researchers creating datasets that will be compiled for use in AI applications. Our work will anticipate and address ethical challenges such as inclusion, diversity, privacy, consent, data ownership and sharing, AI transparency, and potential bias.” 

Added Ravitsky: “We intend to use the approach of ethics inquiry through a continuum, starting from data generation and AI research and development, continuing into clinical adoption of the datasets, and extending to downstream patient health decisions and outcomes.”

About the funding

Vardit Ravitsky's work will be supported in part by awards OT2OD032720 and OT2OD032742 from the U.S. National Institutes of Health Common Fund.
