Speech2Signs: Spoken to Sign Language Translation using Neural Networks

Type: Other
Start: Nov 2017
End: Nov 2018
Responsible: Xavier Giro-i-Nieto
URL: Caffe2 Research Awards 2017


Although recent advancements like the Internet, smartphones and social networks have enabled people to instantly communicate and share knowledge at a global scale, the Deaf community still has very limited access to large parts of the digital world. According to the World Health Organization, hearing impairment is one of the most common disorders, affecting more than 360 million people worldwide, and for many of these individuals, Sign Language is their primary means of communication.

For many deaf individuals, watching online videos is a challenging task. While some streaming and broadcast services provide accessibility options such as captions, these are available for only part of the catalog and often in a limited number of languages. When they are not available, volunteers or relatives may generate them and distribute them through third-party platforms. Moreover, a large portion of online videos do not come from streaming or broadcast services but are generated by amateur users. As reported by the company's statistics, an average of 400 hours of video is uploaded every day to a video-sharing website. These users do not typically create any metadata for accessibility: their content is informal, addressed to a small audience and produced in a very short time. The huge and growing amount of such online videos requires automatic methods capable of adapting these contents across modalities to make them more accessible to everybody.

Speech2Signs aims to remove these difficulties and communication barriers by making the audio track of online videos accessible to deaf and hard-of-hearing people through an automatically generated, video-based speech to sign language translation.
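Conceptually, such a system chains several stages: speech recognition on the audio track, translation of the transcript into a sign-language representation (e.g. glosses), and rendering of that representation as video. The sketch below illustrates this staged structure only; every function here is a hypothetical stub, not the project's actual implementation.

```python
# Hypothetical sketch of a speech-to-sign-language pipeline.
# All stage functions are illustrative stubs (assumptions), not the
# Speech2Signs implementation.

def recognize_speech(audio: bytes) -> str:
    # Stage 1 (ASR): audio track -> spoken-language transcript.
    # A real system would run a speech recognition model here.
    return "hello world"

def translate_to_gloss(text: str) -> list[str]:
    # Stage 2: transcript -> sequence of sign-language glosses.
    # Glosses are conventionally written in uppercase.
    return text.upper().split()

def render_signs(glosses: list[str]) -> list[str]:
    # Stage 3: gloss sequence -> generated sign video segments
    # (represented here as placeholder strings).
    return [f"<sign:{g}>" for g in glosses]

def speech2signs(audio: bytes) -> list[str]:
    # End-to-end pipeline: compose the three stages.
    return render_signs(translate_to_gloss(recognize_speech(audio)))

print(speech2signs(b"raw-audio-bytes"))  # -> ['<sign:HELLO>', '<sign:WORLD>']
```

In practice each stage would be a learned neural model, and the interface between them (text, glosses, pose keypoints, or video frames) is a central design choice of any speech-to-sign system.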

This project was awarded one of the five Caffe2 Research Awards 2017 granted by Facebook.