Assessing the Utility of Language and Voice Biomarkers to Predict Cognitive Impairment in the Framingham Heart Study Cognitive Aging Cohort Data

Authors

Jason A. Thomas, Hannah A. Burkhardt, Safina Chaudhry, Anthony D. Ngo, Saransh Sharma, Larry Zhang, Rhoda Au, and Reza Hosseini Ghomi


Abstract

Background: There is a need for fast, accessible, low-cost, and accurate diagnostic methods for early detection of cognitive decline. Dementia diagnoses are usually made years after symptom onset, missing a window of opportunity for early intervention.

Objective: To evaluate the use of recorded voice features as proxies for cognitive function by using neuropsychological test measures and existing dementia diagnoses.

Methods: This study analyzed 170 audio recordings, transcripts, and paired neuropsychological test results from participants selected from the Framingham Heart Study (FHS), which includes 97 recordings of cognitively normal and 73 recordings of cognitively impaired participants. Acoustic and linguistic features of the voice samples were correlated with cognitive performance measures to verify their association.

Results: Language and voice features, when combined with demographic variables, performed with an AUC of 0.942 (95% CI 0.929–0.983) in predicting cognitive status. Features with good predictive power included the acoustic features mean spectral slope in the 500–1500 Hz band, variation in the F2 bandwidth, and variation in the Mel-Frequency Cepstral Coefficient (MFCC) 1; the demographic features employment, education, and age; and the text features of number of words per sentence, number of compound words, number of unique nouns, and number of proper names.

Conclusion: Several linguistic and acoustic biomarkers show correlations and predictive power with neuropsychological testing results and cognitive impairment diagnoses, including dementia. This initial study paves the way for a follow-up comprehensive study incorporating the entire FHS cohort.

Key Findings

High Predictive Accuracy: The combined model achieved an AUC of 0.942 (95% CI 0.929–0.983) for predicting cognitive impairment status
Key Acoustic Features: Mean spectral slope in the 500–1500 Hz band, variation in the F2 bandwidth, and variation in MFCC 1 were among the features with good predictive power
Important Linguistic Features: Number of words per sentence, compound words, unique nouns, and proper names correlated with cognitive performance
Demographic Factors: Age, education, and employment were the demographic features with good predictive power
Feature Selection: The optimized model selected only 21 features while maintaining high performance, balancing accuracy with interpretability
Early Detection Potential: Voice changes can be detected at least one year prior to formal dementia diagnosis
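
To make the text features above concrete, here is a minimal sketch of how counts such as words per sentence could be computed from a transcript. It is an illustration only, not the study's pipeline: the function name `text_features`, the regex-based tokenization, and the sample transcript are all assumptions. The study's counts of unique nouns and proper names would require part-of-speech tagging, so the sketch uses a unique-word count as a simpler stand-in.

```python
import re

def text_features(transcript: str) -> dict:
    """Compute simple transcript features (illustrative, hypothetical helper)."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", transcript) if s.strip()]
    # Lowercased word tokens made of letters and apostrophes.
    words = re.findall(r"[a-z']+", transcript.lower())
    n_sentences = len(sentences)
    return {
        "n_sentences": n_sentences,
        "n_words": len(words),
        "words_per_sentence": len(words) / n_sentences if n_sentences else 0.0,
        # Stand-in for "unique nouns": unique words, no part-of-speech tagging.
        "n_unique_words": len(set(words)),
    }

# Made-up sample transcript, not study data.
sample = "Anna went to the market. She bought bread and milk. Then she went home."
features = text_features(sample)
print(features)
```

A shorter mean sentence length or a smaller unique-word count in a recall task is the kind of signal such features are meant to capture; real transcripts would also need handling for fillers, false starts, and transcription conventions.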

Methodology

The study used 170 audio recordings, with transcripts and paired neuropsychological test results, from participants in the Framingham Heart Study Cognitive Aging Cohort. The recordings came from neuropsychological testing sessions. The methodology included:

Audio Processing: Acoustic features were extracted from recordings of Logical Memory (Delayed Recall) tests using the GeMAPS (Geneva Minimalistic Acoustic Parameter Set) feature set
Feature Categories: Three types of features were used: acoustic (voice characteristics), linguistic (text analysis), and demographic variables
Machine Learning: Elastic Net and Lasso Lars regression models were employed with leave-one-out cross-validation
Feature Selection: Iterative feature selection was used to identify the most predictive variables while avoiding overfitting
Validation: Model performance was evaluated using area under the receiver operating characteristic curve (AUC) with 95% confidence intervals
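
The evaluation steps above (leave-one-out cross-validation, then AUC with a 95% confidence interval) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the data are synthetic, a nearest-class-mean scorer stands in for the Elastic Net and Lasso Lars models, and the percentile bootstrap is one common way to obtain a confidence interval (the summary does not say which method the authors used). The function names `auc`, `loo_scores`, and `bootstrap_ci` are hypothetical.

```python
import random

def auc(labels, scores):
    """AUC: chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def loo_scores(X, y):
    """Leave-one-out: score each sample with a model fit on all other samples."""
    scores = []
    for i in range(len(X)):
        train = [(x, t) for j, (x, t) in enumerate(zip(X, y)) if j != i]
        means = {}
        for c in (0, 1):
            rows = [x for x, t in train if t == c]
            means[c] = [sum(col) / len(rows) for col in zip(*rows)]
        # Stand-in model (assumption): squared distance to each class mean.
        d0 = sum((a - b) ** 2 for a, b in zip(X[i], means[0]))
        d1 = sum((a - b) ** 2 for a, b in zip(X[i], means[1]))
        scores.append(d0 - d1)  # larger means closer to the class-1 mean
    return scores

def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC, resampling (label, score) pairs."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined unless both classes are present
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Synthetic two-feature data: 20 "normal" (label 0) and 20 "impaired" (label 1).
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(20)]
X += [[rng.gauss(1.5, 1), rng.gauss(1.5, 1)] for _ in range(20)]
y = [0] * 20 + [1] * 20

scores = loo_scores(X, y)
point = auc(y, scores)
lo, hi = bootstrap_ci(y, scores)
print(f"AUC = {point:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

Because every score comes from a model that never saw the scored sample, the resulting AUC is an out-of-sample estimate, which matters when the number of candidate features is large relative to the number of recordings.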

Impact

This research establishes the feasibility of using voice analysis as a screening tool for cognitive decline, with several important implications:

Clinical Applications: Provides a foundation for developing accessible, low-cost screening tools that could be deployed in primary care settings
Early Intervention: The ability to detect cognitive changes before formal diagnosis could enable earlier therapeutic interventions
Scalable Technology: Voice-based assessments could be implemented through smartphones or other widely available devices
Research Foundation: The identified biomarkers provide targets for future larger-scale validation studies across the entire FHS cohort of 9,000+ participants
Healthcare Accessibility: Could reduce barriers to cognitive screening, particularly for underserved populations who lack access to specialized neuropsychological testing
Future Integration: The methodology could be combined with other biomarkers (brain imaging, blood markers) for more comprehensive assessment tools