• Abstract

    Vocal Pathology Recognition Using Acoustic Features

    Publication date: 09/06/2026

    Voice disorders impact communication, employability, and quality of life; however, gold-standard assessments based on laryngoscopy remain invasive and resource-intensive. This work examines how vowel choice, pitch condition, speaker sex, and feature design affect the performance of machine-learning models for automatic voice pathology detection. Using the Saarbrücken Voice Database (SVD), we build a benchmark on sustained vowels by deriving 120 binary classification tasks from the factorial combination of vowel (/a/, /i/, /u/, all), pitch condition (low, high, normal, low-high-low, all), and speaker sex (male, female, both). We compare a baseline acoustic representation inspired by recent work with an extended feature set. Four off-the-shelf classifiers are evaluated using AUC as the main performance metric. Results show that XGBoost and SVM consistently achieve the best ranks, with median AUC values around 0.80 and a maximum of 0.866 for the configuration combining the novel features, all pitch conditions, male speakers, and vowel /a/. Sex-specific models consistently surpass mixed-sex models, and feature-importance analysis highlights spectral bandwidth, jitter, and shimmer as key descriptors. The proposed feature set outperforms the baseline, and using only sustained /a/ is as effective as using all vowels, which simplifies acquisition. Future work may improve the pre-processing step, expand the feature set, and employ deep features.
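
    The factorial task grid described above can be sketched as follows. This is an illustration only, not the authors' code: the names `VOWELS`, `PITCHES`, `SEXES`, `FEATURE_SETS`, and `enumerate_tasks` are hypothetical. The three listed factors alone yield 4 × 5 × 3 = 60 combinations; the sketch assumes that the reported 120 tasks arise from additionally crossing them with the two feature sets (baseline and extended), which the abstract does not state explicitly.

```python
from itertools import product

# Factor levels as listed in the abstract.
VOWELS = ["a", "i", "u", "all"]
PITCHES = ["low", "high", "normal", "low-high-low", "all"]
SEXES = ["male", "female", "both"]

# Assumption (not stated in the abstract): crossing with the two
# feature sets is what brings the task count from 60 to 120.
FEATURE_SETS = ["baseline", "extended"]


def enumerate_tasks(include_feature_sets=False):
    """Return one dict per task configuration in the factorial grid."""
    names = ["vowel", "pitch", "sex"]
    factors = [VOWELS, PITCHES, SEXES]
    if include_feature_sets:
        names.append("features")
        factors.append(FEATURE_SETS)
    # product() yields every combination of one level per factor.
    return [dict(zip(names, combo)) for combo in product(*factors)]


if __name__ == "__main__":
    print(len(enumerate_tasks()))                           # vowel x pitch x sex
    print(len(enumerate_tasks(include_feature_sets=True)))  # with feature sets
```

    Under this reading, the best-performing configuration reported in the abstract corresponds to the entry with vowel "a", pitch "all", sex "male", and the extended feature set.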

Anais do Computer on the Beach

Computer on the Beach is a technical-scientific event that brings together professionals, researchers, and academics in the field of Computing to discuss research and market trends in computing across its many areas.
