Publications

2025

Blatter, Alden, Hortense Gallois, Emily Evangelista, Yael Bensoussan, Bridge2AI-Voice Consortium, Jean-Christophe Bélisle-Pipon, Yael E. Bensoussan, et al. 2025. “‘Voice Is the New Blood’: A Discourse Analysis of Voice AI Health-Tech Start-up Websites”. Frontiers in Digital Health 7. https://doi.org/10.3389/fdgth.2025.1568159.
Introduction: Voice as a biomarker has emerged as a transformative field in health technology, providing non-invasive, accessible, and cost-effective methods for detecting, diagnosing, and monitoring various conditions. Start-ups are at the forefront of this innovative field, developing and marketing clinical voice AI solutions to a range of healthcare actors and shaping the field’s early development. However, there is limited understanding of how start-ups in this field frame their innovations and address—or overlook—critical socio-ethical, technical, and regulatory challenges in the rapidly evolving field of digital health.

Methods: This study uses discourse analysis to examine the language on the public websites of 25 voice AI health-tech start-ups. Grounded in constitutive discourse analysis, which holds that discourse both reflects and shapes realities, the study identifies patterns in how these companies describe their identities, technologies, and datasets.

Results: The analysis shows that start-ups consistently highlight the efficacy, reliability, and safety of their technologies, positioning them as transformative healthcare solutions. However, descriptions of the voice datasets used to train algorithms vary widely and are often absent, reflecting broader gaps in acoustic and ethical standards for voice data collection and insufficient incentives for start-ups to disclose key data details.

Discussion: Start-ups play a crucial role in the research, development, and marketization of voice AI health-tech, prefacing the integration of this new technology into healthcare systems. By publicizing discourse around voice AI technologies at this early stage, start-ups are shaping public perceptions, setting expectations for end-users, and ultimately influencing the implementation of voice AI technologies in healthcare.
Their discourse appears to strategically present voice AI health-tech as legitimate by using promissory language typical of the digital health field while showcasing their distinctiveness from competitors. This analysis highlights how this double impetus often drives narratives that prioritize innovation over transparency. We conclude that the lack of incentive to share key information about datasets stems from contextual factors beyond start-ups’ control, chiefly the absence of clear standards and regulatory guidelines for voice data collection. Addressing these complexities is essential to building trust and ensuring the responsible integration of voice AI into healthcare systems.
Anibal, James, Rebecca Doctor, Micah Boyer, Karlee Newberry, Iris De Santiago, Shaheen Awan, Yassmeen Abdel-Aty, et al. 2025. “Transformers for Rapid Detection of Airway Stenosis and Stridor”. Scientific Reports 15 (1): 15394. https://doi.org/10.1038/s41598-025-99369-y.
Upper airway stenosis is a potentially life-threatening condition involving the narrowing of the airway. In more severe cases, airway stenosis may be accompanied by stridor, a type of disordered breathing caused by turbulent airflow. Patients with airway stenosis have a higher risk of airway failure and additional precautions must be taken before medical interventions like intubation. However, stenosis and stridor are often misdiagnosed as other respiratory conditions like asthma/wheezing, worsening outcomes. This report presents a unified dataset containing recorded breathing tasks from patients with stridor and airway stenosis. Customized transformer-based models were also trained to perform stenosis and stridor detection tasks using low-cost data from multiple acoustic prompts recorded on common devices. These methods achieved AUC scores of 0.875 for stenosis detection and 0.864 for stridor detection, demonstrating the potential to add value as screening tools in real-world clinical workflows, particularly in high-volume settings like emergency departments.
Reed, Nicholas S., Jinyu Chen, Alison R. Huang, James R. Pike, Michelle Arnold, Sheila Burgard, Ziheng Chen, et al. 2025. “Hearing Intervention, Social Isolation, and Loneliness: A Secondary Analysis of the ACHIEVE Randomized Clinical Trial”. JAMA Internal Medicine 185 (7): 797-806. https://doi.org/10.1001/jamainternmed.2025.1140.
Importance: Promoting social connection among older adults is a public health priority. Addressing hearing loss may reduce social isolation and loneliness among older adults.

Objective: To describe the effect of a best-practice hearing intervention vs health education control on social isolation and loneliness over a 3-year period in the Aging and Cognitive Health Evaluation in Elders (ACHIEVE) study.

Design, Setting, and Participants: This secondary analysis of a multicenter randomized controlled trial with 3-year follow-up was completed in 2022 and conducted at 4 field sites in the US (Forsyth County, North Carolina; Jackson, Mississippi; Minneapolis, Minnesota; Washington County, Maryland). Data were analyzed in 2024. Participants included 977 adults (aged 70-84 years who had untreated hearing loss without substantial cognitive impairment) recruited from the Atherosclerosis Risk in Communities study (238 [24.4%]) and newly recruited (de novo; 739 [75.6%]). Participants were randomized (1:1) to hearing intervention or health education control and followed up every 6 months.

Interventions: Hearing intervention (4 sessions with a certified study audiologist, hearing aids, counseling, and education) and health education control (4 sessions with a certified health educator on chronic disease and disability prevention).

Main Outcomes and Measures: Social isolation (Cohen Social Network Index score) and loneliness (UCLA Loneliness Scale score) were exploratory outcomes measured at baseline and at 6 months and 1, 2, and 3 years postintervention. The intervention effect was estimated using a 2-level linear mixed-effects model under the intention-to-treat principle.

Results: Among the 977 participants, the mean (SD) age was 76.3 (4.0) years; 523 (53.5%) were female, 112 (11.5%) were Black, 858 (87.8%) were White, and 521 (53.4%) had a bachelor’s degree or higher. The mean (SD) better-ear pure-tone average was 39.4 (6.9) dB.
Over 3 years, mean (SD) social network size (people contacted over 2 weeks) decreased from 22.6 (11.1) to 21.3 (11.0) in the hearing intervention arm and from 22.3 (10.2) to 19.8 (10.2) in the health education control arm. In fully adjusted models, hearing intervention (vs health education control) reduced social isolation (social network size difference, 1.05 [95% CI, 0.01-2.09]; diversity difference, 0.19 [95% CI, 0.02-0.36]; embeddedness difference, 0.27 [95% CI, 0.09-0.44]) and reduced loneliness (difference, −0.94 [95% CI, −1.78 to −0.11]) over 3 years. Results were substantively unchanged in sensitivity analyses that stratified by recruitment source, analyzed per protocol and complier average causal effect, or varied covariate adjustment.

Conclusions and Relevance: This secondary analysis of a randomized clinical trial indicated that older adults with hearing loss retained 1 additional person in their social network relative to a health education control over 3 years. While statistically significant, it is unknown whether the observed changes in social network size are clinically meaningful, and changes in the loneliness measure do not represent clinically meaningful changes. Hearing intervention is a low-risk strategy that may help promote social connection among older adults.

Trial Registration: ClinicalTrials.gov Identifier: NCT03243422

2024

Evangelista, Emily, Rohan Kale, Desiree McCutcheon, Anais Rameau, Alexander Gelbard, Maria Powell, Michael Johns, et al. 2024. “Current Practices in Voice Data Collection and Limitations to Voice AI Research: A National Survey”. The Laryngoscope 134 (3): 1333-39. https://doi.org/10.1002/lary.31052.
Introduction: The accuracy and validity of voice AI algorithms rely on substantial amounts of quality voice data. Although considerable amounts of voice data are captured daily in voice centers across North America, there is no standardized protocol for acoustic data management, which limits the usability of these datasets for voice artificial intelligence (AI) research.

Objective: The aim was to capture current practices of voice data collection, storage, and analysis, and perceived limitations to collaborative voice research.

Methods: A 30-question online survey was developed with expert guidance from the voicecollab.ai members, an international collaborative of voice AI researchers. The survey was disseminated via REDCap to an estimated 200 practitioners at North American voice centers. Survey questions assessed respondents’ current practices in terms of acoustic data collection, storage, and retrieval, as well as limitations to collaborative voice research.

Results: Seventy-two respondents completed the survey, of whom 81.7% were laryngologists and 18.3% were speech-language pathologists (SLPs). Eighteen percent of respondents reported seeing 40-60 patients with voice disorders weekly, and 55% reported seeing >60 (a conservative estimate of over 4,000 patients/week in total). Only 28% of respondents reported utilizing standardized protocols for collection and storage of acoustic data. Although 87% of respondents conduct voice research, only 38% report doing so on a multi-institutional level. Perceived limitations to conducting collaborative voice research include lack of standardized methodology for collection (30%) and lack of human resources to prepare and label voice data adequately (55%).

Conclusion: To conduct large-scale multi-institutional voice research with AI, there is a pertinent need for standardization of acoustic data management, as well as an infrastructure for secure and efficient data sharing.

Level of Evidence: 5.
Evangelista, Emily G., Jean-Christophe Bélisle-Pipon, Matthew R. Naunheim, Maria Powell, Hortense Gallois, Bridge2AI-Voice Consortium, and Yael Bensoussan. 2024. “Voice As a Biomarker in Health-Tech: Mapping the Evolving Landscape of Voice Biomarkers in the Start-Up World”. Otolaryngology–Head and Neck Surgery 171 (2): 340-52. https://doi.org/10.1002/ohn.830.
Objective: The vocal biomarkers market was worth $1.9B in 2021 and is projected to exceed $5.1B by 2028, a compound annual growth rate of 15.15%. This investment growth demonstrates a blossoming interest in voice and artificial intelligence (AI) as they relate to human health. The objective of this study was to map the current landscape of start-ups utilizing voice as a biomarker in health-tech.

Data Sources: A comprehensive search for start-ups was conducted using Google, LinkedIn, Twitter, and Facebook. A review of the research was performed using company websites, PubMed, and Google Scholar.

Review Methods: A 3-pronged approach was taken to thoroughly map the landscape. First, an internet search was conducted to identify current start-ups focusing on products relating to voice as a biomarker of health. Second, Crunchbase was utilized to collect financial and organizational information. Third, a review of the literature was conducted to analyze publications associated with the identified start-ups.

Results: A total of 27 start-ups with a focus on the utilization of AI for developing biomarkers of health from the human voice were identified. Twenty-four of these start-ups garnered $178,808,039 in investments. The 27 start-ups published 194 publications combined, 128 (66%) of which were peer reviewed.

Conclusion: There is growing enthusiasm surrounding voice as a biomarker in health-tech. Academic drive may complement commercialization to best achieve progress in this arena. More research is needed to accurately capture the entirety of the field, including larger industry players, academic institutions, and non-English content.
Bensoussan, Yaël, Olivier Elemento, and Anaïs Rameau. 2024. “Voice As an AI Biomarker of Health—Introducing Audiomics”. JAMA Otolaryngology–Head & Neck Surgery 150 (4): 283-84. https://doi.org/10.1001/jamaoto.2023.4807.
Voice, speech, and respiratory sounds provide important clinical insights into patients’ health status. In the age of artificial intelligence (AI), patients’ audio recordings are being investigated as digital biomarkers for early detection of a broad range of conditions, including laryngeal pathology, neurological and psychological disorders, head and neck cancers, and diabetes. Besides neurologists, speech language pathologists, and internists, otolaryngologists also have unique perspectives and expertise on voice, speech, and respiratory sounds that can fuel this line of research and innovation.
Bensoussan, Yael E., Emily G. Evangelista, Rebecca J. Doctor, Begum A. Mathyk, Kate L. Bevec, Jamie A. Toghranegar, and Rupal Patel. 2024. “Menopause and the Voice: A Narrative Review of Physiological Changes, Hormone Therapy Effects, and Treatment Options”. Menopause, 10.1097/GME.0000000000002636. https://doi.org/10.1097/GME.0000000000002636.
Importance and Objective: Voice changes during menopause affect patients’ communication and quality of life. This narrative review aims to provide a comprehensive exploration of voice changes during menopause. It presents objective and subjective/symptomatic changes as well as treatment options for this population. Lastly, it identifies areas of research and future directions needed to serve this population through collaboration between voice experts and gynecologists.

Methods: To inform this narrative review, a literature review was conducted using the PubMed database, encompassing publications from January 2005 to January 2025. The review synthesized research on hormonal influences, acoustic analyses, laryngeal imaging, and patient-reported outcomes, with a focus on understanding the physiological mechanisms underlying menopausal voice alterations.

Results: The review reveals a complex narrative of vocal transformation during menopause. Hormonal decline—characterized by reduced estrogen and progesterone levels—precipitates significant laryngeal changes. Up to 46% of menopausal women experience perceptible vocal modifications, including decreased fundamental frequency (by 0.94 semitones), increased vocal instability, and reduced phonation capabilities. Particularly vulnerable are professional voice users, who face unique challenges in maintaining vocal performance. Hormone therapy demonstrates potential protective effects, though findings remain inconsistent.

Discussion and Conclusion: Menopause-related voice disorders represent a nuanced and underexplored medical phenomenon. This review underscores the critical need for interdisciplinary research that integrates gynecology, otolaryngology, endocrinology, and speech pathology. Future investigations could focus on developing AI-driven voice biomarkers, conducting longitudinal studies, and creating targeted interventions that recognize the voice and respiratory transitions women experience during menopause.
Awan, Shaheen N., Ruth Bahr, Stephanie Watts, Micah Boyer, Robert Budinsky, Bridge2AI Voice Consortium, and Yael Bensoussan. 2024. “Validity of Acoustic Measures Obtained Using Various Recording Methods Including Smartphones With and Without Headset Microphones”. Journal of Speech, Language, and Hearing Research 67 (6): 1712-30. https://doi.org/10.1044/2024_JSLHR-23-00759.
Purpose: The goal of this study was to assess various recording methods, including combinations of high- versus low-cost microphones, recording interfaces, and smartphones, in terms of their ability to produce commonly used time- and spectral-based voice measurements.

Method: Twenty-four vowel samples representing a diversity of voice quality deviations and severities from a wide age range of male and female speakers were played via a head-and-thorax model and recorded using a high-cost, research-standard GRAS 40AF (GRAS Sound & Vibration) microphone and amplification system. Additional recordings were made using various combinations of headset microphones (AKG C555 L [AKG Acoustics GmbH], Shure SM35-XLR [Shure Incorporated], AVID AE-36 [AVID Products, Inc.]) and audio interfaces (Focusrite Scarlett 2i2 [Focusrite Audio Engineering Ltd.] and PC, Focusrite and smartphone, smartphone via a TRRS adapter), as well as smartphones direct (Apple iPhone 13 Pro, Google Pixel 6) using their built-in microphones. The effect of background noise from four different room conditions was also evaluated. Vowel samples were analyzed for measures of fundamental frequency, perturbation, cepstral peak prominence, and spectral tilt (low vs. high spectral ratio).

Results: Results show that a wide variety of recording methods, including smartphones with and without a low-cost headset microphone, can effectively track the wide range of acoustic characteristics in a diverse set of typical and disordered voice samples. Although significant differences in acoustic measures of voice may be observed, the presence of extremely strong correlations (rs > .90) with the recording standard implies a strong linear relationship between the results of different methods that may be used to predict and adjust any observed differences in measurement results.
Conclusion: Because handheld smartphone distance and positioning may be highly variable when used in actual clinical recording situations, smartphone + a low-cost headset microphone is recommended as an affordable recording method that controls mouth-to-microphone distance and positioning and allows both hands to be available for manipulation of the smartphone device.
Awan, Shaheen N., Ruth Bahr, Stephanie Watts, Micah Boyer, Robert Budinsky, and Yael Bensoussan. 2024. “Evidence-Based Recommendations for Tablet Recordings From the Bridge2AI-Voice Acoustic Experiments”. Journal of Voice. https://doi.org/10.1016/j.jvoice.2024.08.029.
Background: As part of a larger goal to create best practices for voice data collection to fuel voice artificial intelligence (AI) research, the objective of this study was to investigate the ability of readily available iOS and Android tablets, with and without low-cost headset microphones, to produce recordings and subsequent acoustic measures of voice comparable to “research quality” instrumentation.

Methods: Recordings of 24 sustained vowel samples representing a wide range of typical and disordered voices were played via a head-and-torso model and recorded using a research-quality standard microphone/preamplifier/audio interface. Acoustic measurements from the standard were compared with two popular tablets using their built-in microphones and with low-cost headset microphones at different distances from the mouth.

Results: Voice measurements obtained via tablets + headset microphones close to the mouth (2.5 and 5 cm) strongly correlated (r’s > 0.90) with the research standard and resulted in no significant differences for measures of vocal frequency and perturbation. In contrast, voice measurements obtained using the tablets’ built-in microphones at typical reading distances (30 and 45 cm) tended to show substantial variability in measurement, greater mean differences in voice measurements, and relatively poorer correlations vs the standard.

Conclusion: Findings from this study support preliminary recommendations from the Bridge2AI-Voice Consortium recommending the use of smartphones paired with low-cost headset microphones as adequate methods of recording for large-scale voice data collection in a variety of clinical and nonclinical settings. Compared with recording using a tablet directly, a headset microphone controls for recording distance and reduces the effects of background noise, resulting in decreased variability in recording quality.
Data Availability: Data supporting the results reported in this article may be obtained upon request from the contact author.