Authors
Larry Zhang, Jacek Kolacz, Albert Rizzo, Stefan Scherer, Mohammad Soleymani
Abstract
Automatic detection of psychological disorders has gained significant attention in recent years due to the rise in their prevalence. However, the majority of studies have overlooked the complexity of disorders in favor of a “present/not present” dichotomy in representing disorders. Recent psychological research favors transdiagnostic approaches, moving beyond general disorder classifications to symptom-level analysis, as symptoms are often not exclusive to individual disorder classes. In our study, we investigated the link between speech signals and psychological distress symptoms in a corpus of 333 screening interviews from the Distress Analysis Interview Corpus (DAIC). Given the semi-structured organization of interviews, we aggregated speech utterances from responses to shared questions across interviews. We employed deterministic sample selection in classification to rank salient questions for eliciting symptom-specific behaviors in order to predict symptom presence. Examples of such questions include “Do you find therapy helpful?” and “When was the last time you felt happy?”. The prediction results align closely with the factor structure of psychological distress symptoms, linking speech behaviors primarily to somatic and affective alterations in both depression and PTSD. This lends support to the transdiagnostic validity of speech markers for detecting such symptoms. We did not find a strong link between speech markers and cognitive or psychomotor alterations, which is surprising given the complexity of motor and cognitive actions required in speech production. The results of our analysis highlight the importance of aligning affective computing research with psychological research to investigate the use of automatic behavioral sensing to assess psychiatric risk.
Key Findings
- Symptom-level analysis reveals complexity: Analysis of depressed participants showed 17 unique symptom profiles and 96 unique expressions of symptom severity, highlighting the inadequacy of binary disorder classifications
- Speech markers align with somatic symptoms: Best prediction performance achieved for somatic-related depression symptoms (PHQ Sleep, Tired, Appetite) and PTSD negative symptoms
- Affective symptoms also predicted well: Strong performance on affective-related depression symptoms (NoInterest, Depressed, Failure feelings)
- Context matters for prediction: Specific interview questions showed higher information gain than entire interview context for symptom prediction
- Transdiagnostic validity: Speech markers effectively predicted symptoms across both depression and PTSD, supporting cross-disorder applicability
- Limited cognitive/psychomotor prediction: Speech markers were not strong predictors of cognitive concentration or psychomotor alterations, contrary to expectations
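The information-gain comparison in the findings above can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single binarized speech feature per participant, and the toy values, variable names, and the binarization itself are hypothetical. It only shows what it means for a question-specific feature to carry more information about a symptom label than a feature pooled over the whole interview.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((c / total) * math.log2(c / total) for c in counts)


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: 1 = symptom present, 0 = symptom absent.
symptom_present = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical binarized feature taken from responses to one question.
question_feature = [1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical binarized feature pooled over the entire interview.
interview_feature = [1, 0, 1, 0, 1, 0, 1, 0]

ig_question = information_gain(question_feature, symptom_present)
ig_interview = information_gain(interview_feature, symptom_present)
```

In this toy setup the question-level feature separates the two groups better than the pooled feature, so `ig_question` exceeds `ig_interview`. With real acoustic or linguistic features, which are continuous, a discretization or a different estimator would be needed.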
Methodology
The study analyzed speech data from the Distress Analysis Interview Corpus with 333 participants:
- Data Collection: Semi-structured interviews conducted by virtual human agent “Ellie” with both civilian and veteran populations
- Symptom Assessment: PHQ-8 for depression symptoms and PCL-C for PTSD symptoms, with binary classification of symptom presence/absence
- Speech Feature Extraction:
  - Linguistic features: LIWC word frequency categories capturing psychological processes
  - Acoustic features: COVAREP vocal descriptors including prosody, voice quality, and spectral energy
- Deterministic Sample Selection: Novel method to identify optimal question combinations for predicting each symptom
- Information Gain Analysis: Evaluated discriminative value of specific interview questions versus entire interview context
- Classification: Support Vector Machine with soft-majority voting across selected questions, validated using 10-fold cross-validation
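The voting step in the classification stage can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes each selected question has its own classifier that outputs a probability of symptom presence, and that soft-majority voting means averaging those probabilities before thresholding. The function names, the dictionary keys, the 0.5 threshold, and the probability values are all hypothetical.

```python
def soft_majority_vote(question_probs, threshold=0.5):
    """Average per-question probabilities, then threshold the mean.

    question_probs maps a question identifier to the probability of
    symptom presence predicted from responses to that question.
    """
    if not question_probs:
        raise ValueError("need at least one per-question probability")
    mean_prob = sum(question_probs.values()) / len(question_probs)
    return mean_prob, mean_prob >= threshold


def hard_majority_vote(question_probs, threshold=0.5):
    """Threshold each question first, then count votes (for contrast)."""
    if not question_probs:
        raise ValueError("need at least one per-question probability")
    votes = sum(1 for p in question_probs.values() if p >= threshold)
    return votes > len(question_probs) / 2


# Hypothetical per-question outputs for one participant and one symptom.
probs = {
    "do_you_find_therapy_helpful": 0.90,
    "last_time_you_felt_happy": 0.45,
    "hypothetical_third_question": 0.40,
}

mean_prob, soft_decision = soft_majority_vote(probs)
hard_decision = hard_majority_vote(probs)
```

With these made-up values the soft vote predicts the symptom as present while the hard vote does not, because one confident question outweighs two weakly negative ones. That sensitivity to classifier confidence is the usual reason to prefer soft voting when per-question evidence varies in strength.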
Impact
This research advances understanding of speech-based mental health assessment:
- Clinical assessment: Provides framework for symptom-specific rather than disorder-level psychological assessment using speech
- Transdiagnostic approach: Demonstrates speech markers can identify symptoms across multiple disorders, supporting personalized treatment approaches
- Automated screening: Enables development of more precise automated tools for detecting specific psychological distress symptoms
- Research methodology: Establishes importance of aligning computational approaches with psychological research on symptom factor structures
- Context-aware analysis: Shows that specific interview questions are more informative than general conversation for symptom detection
- Mental health monitoring: Supports development of continuous, non-invasive monitoring systems focused on somatic and affective symptom domains