TY - GEN
T1 - The Impact of Speaker Diarization on DNN-based Autism Severity Estimation
AU - Eni, Marina
AU - Gorodetski, Alex
AU - Dinstein, Ilan
AU - Zigel, Yaniv
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022/1/1
Y1 - 2022/1/1
AB - This paper presents a speech-based system for autism severity estimation combined with automatic speaker diarization. Speaker diarization was performed by two different methods. The first used acoustic features, which included Mel-Frequency Cepstral Coefficients (MFCC) and pitch, and the second used x-vectors, which are embeddings extracted from Deep Neural Networks (DNN). In both methods, the speaker diarization was trained using a Fully Connected Deep Neural Network (FCDNN). We then trained a Convolutional Neural Network (CNN) to estimate the severity of autism based on 48 acoustic and prosodic features of speech. One hundred thirty-two young children were recorded in the Autism Diagnostic Observation Schedule (ADOS) examination room using a distant microphone. Of the two diarization methods, the one based on MFCC and pitch achieved the better Diarization Error Rate (DER), 26.91%. Using this diarization method, the severity estimation system achieved a Pearson correlation of 0.606 between the predicted and the actual autism severity scores (i.e., ADOS scores).
UR - http://www.scopus.com/inward/record.url?scp=85138127421&partnerID=8YFLogxK
U2 - 10.1109/EMBC48229.2022.9871523
DO - 10.1109/EMBC48229.2022.9871523
M3 - Conference contribution
C2 - 36086547
AN - SCOPUS:85138127421
T3 - Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS
SP - 3414
EP - 3417
BT - 44th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2022
PB - Institute of Electrical and Electronics Engineers
T2 - 44th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2022
Y2 - 11 July 2022 through 15 July 2022
ER -