TY - JOUR
T1 - Machine learning techniques for semantic analysis of dysarthric speech
T2 - An experimental study
AU - Despotovic, Vladimir
AU - Walter, Oliver
AU - Haeb-Umbach, Reinhold
N1 - Publisher Copyright: © 2018 Elsevier B.V.
PY - 2018/5
Y1 - 2018/5
N2 - We present an experimental comparison of seven state-of-the-art machine learning algorithms for the task of semantic analysis of spoken input, with a special emphasis on applications for dysarthric speech. Dysarthria is a motor speech disorder, which is characterized by poor articulation of phonemes. In order to cater for these non-canonical phoneme realizations, we employed an unsupervised learning approach to estimate the acoustic models for speech recognition, which does not require a literal transcription of the training data. Even for the subsequent task of semantic analysis, only weak supervision is employed, whereby the training utterance is accompanied by a semantic label only, rather than a literal transcription. Results on two databases, one of them containing dysarthric speech, are presented showing that Markov logic networks and conditional random fields substantially outperform other machine learning approaches. Markov logic networks have proved to be especially robust to recognition errors, which are caused by imprecise articulation in dysarthric speech.
KW - Acoustic units
KW - Dysarthric speech
KW - Machine learning
KW - Semantic analysis
KW - Spoken language understanding
UR - http://www.scopus.com/inward/record.url?scp=85046031303&partnerID=8YFLogxK
U2 - 10.1016/j.specom.2018.04.005
DO - 10.1016/j.specom.2018.04.005
M3 - Article
AN - SCOPUS:85046031303
SN - 0167-6393
VL - 99
SP - 242
EP - 251
JO - Speech Communication
JF - Speech Communication
ER -