TY - GEN
T1 - Restoring Eye Contact to the Virtual Classroom with Machine Learning
AU - Greer, Ross
AU - Dubnov, Shlomo
N1 - Publisher Copyright: Copyright © 2021 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved.
PY - 2021/1/1
Y1 - 2021/1/1
AB - Nonverbal communication, in particular eye contact, is a critical element of the music classroom, shown to keep students on task, coordinate musical flow, and communicate improvisational ideas. Unfortunately, this nonverbal aspect of performance and pedagogy is lost in the virtual classroom. In this paper, we propose a machine learning system that uses single-instance, single-camera image frames as input to estimate the gaze target of a user seated in front of their computer, augmenting the user’s video feed with a display of the estimated gaze target and thereby restoring nonverbal communication of directed gaze. The proposed estimation system consists of modular machine learning blocks, leading to a target-oriented (rather than coordinate-oriented) gaze prediction. We instantiate one example of the complete system to run a pilot study in a virtual music classroom over Zoom. Inference time and accuracy meet benchmarks for videoconferencing applications, and quantitative and qualitative results of the pilot experiments include improved success of cue interpretation and student-reported formation of collaborative, communicative relationships between conductor and musician.
KW - Convolutional Neural Networks
KW - Distance Learning
KW - Gaze Estimation
KW - Human-Computer Interaction
KW - Music Education
KW - Telematic Performance
UR - http://www.scopus.com/inward/record.url?scp=85137955040&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85137955040
T3 - International Conference on Computer Supported Education, CSEDU - Proceedings
SP - 698
EP - 708
BT - CSEDU 2021 - Proceedings of the 13th International Conference on Computer Supported Education
A2 - Csapo, Beno
A2 - Uhomoibhi, James
PB - Science and Technology Publications, Lda
T2 - 13th International Conference on Computer Supported Education, CSEDU 2021
Y2 - 23 April 2021 through 25 April 2021
ER -