Duarte A. Cross-modal Neural Sign Language Translation. In: Torres J, Giró-i-Nieto X. Proceedings of the 27th ACM International Conference on Multimedia - Doctoral Symposium. Nice, France: ACM; 2019.  (392.69 KB)


Sign Language is the primary means of communication for the majority of the Deaf and hard-of-hearing communities. Current computational approaches in this general research area have focused specifically on sign language recognition and the translation of sign language to text. However, the reverse problem of translating from spoken to sign language has so far not been widely explored.

The goal of this doctoral research is to explore sign language translation in this generalized setting, i.e. translating from spoken language to sign language and vice versa. Towards that end, we propose a concrete methodology for tackling the problem of speech to sign language translation and introduce How2Sign, the first public, continuous American Sign Language dataset that enables such research. With a parallel corpus of almost 60 hours of sign language videos (collected with both RGB and depth sensor data) and the corresponding speech transcripts for over 2500 instructional videos, How2Sign is a public dataset of unprecedented scale that can be used to advance not only sign language translation, but also a wide range of sign language understanding tasks.

Poster at ACM Multimedia 2019


Xavier Giro and Amanda Duarte in ACM Multimedia 2019.