“If voice is the future of computing, what about those who cannot speak or hear?” That is the question posed by developer Abhishek Singh, the creator of an app that allows Amazon Alexa to respond to sign language. Mr Singh’s project uses a camera-based system to identify gestures and interpret them as text and speech. Future home devices should be designed to be inclusive for deaf users, the developer says. The past few years have seen a rise in popularity of voice assistants run by Amazon, Google and Apple. And a study by the Smart Audio Report suggests adoption of smart speakers has outstripped that of smartphones and tablets in the US.
But for the deaf community, a future where devices are increasingly controlled by voice poses problems. Speech recognition is rarely able to pick up the rhythms of deaf users. And a lack of hearing presents a clear challenge to communicating with voice-based assistants. Mr Singh’s project offers one potential solution: rigging Amazon’s Alexa to respond in text to American Sign Language (ASL). “If these devices are to become a central way we interact with our homes or perform tasks, then some thought needs to be given to those who cannot hear or speak,” he says. “Seamless design needs to be inclusive in nature.”
The developer trained an AI using the machine-learning platform TensorFlow, which involved repeatedly gesturing in front of a webcam to teach the system the basics of sign language. Once the system was able to respond to his hand movements, he connected it to Google’s text-to-speech software to read the corresponding words aloud. The Amazon Echo reacts and its vocal response is automatically transcribed by the computer into text, which is read by the user. As a solution, it is a workaround, with the laptop as an interpreter between the user and Alexa. But Mr Singh says: “There’s no reason that Amazon Show, or any of the camera and screen based voice assistants, couldn’t build this functionality right in.”
“To me that’s probably the ultimate use case of what this prototype shows.” There have been a number of previous attempts to use AI and image recognition to translate sign language. Microsoft, for example, has trialled the use of its motion-sensing Kinect cameras for the purpose, a project fated to dwindle once the Kinect was discontinued in 2017. Nvidia has also explored ways artificial intelligence could be used to automatically caption videos of sign language users, as has the translation software company KinTrans. A comprehensive way to automatically translate sign language into text or speech, and vice versa, has remained elusive, however.
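Mr Singh’s laptop relay covers only a narrow slice of that problem, and its overall shape can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the developer’s code: a simple nearest-example lookup stands in for the trained TensorFlow model, and the names `nearest_sign`, `relay`, `speak` and `listen` are hypothetical placeholders for the webcam classifier, the text-to-speech step and the transcription step.

```python
from typing import Callable, List, Sequence, Tuple

# One stored training example: (feature vector for a gesture, phrase it means).
# Assumption: each webcam frame has already been reduced to a list of numbers.
Example = Tuple[Sequence[float], str]


def nearest_sign(features: Sequence[float], examples: List[Example]) -> str:
    """Return the phrase of the stored example closest to `features`.

    A stand-in for the trained model: plain squared Euclidean distance.
    """
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    best = min(examples, key=lambda ex: sq_dist(features, ex[0]))
    return best[1]


def relay(
    features: Sequence[float],
    examples: List[Example],
    speak: Callable[[str], None],
    listen: Callable[[], str],
) -> str:
    """Interpret one gesture, voice it to the assistant, return the reply as text.

    `speak` is a hypothetical text-to-speech hook (the laptop talking to the
    Echo); `listen` is a hypothetical speech-to-text hook that transcribes the
    assistant's spoken answer for the user to read.
    """
    phrase = nearest_sign(features, examples)
    speak(phrase)
    return listen()
```

In use, `speak` and `listen` would wrap whatever speech services are available; passing them in as plain functions keeps the sketch free of any particular vendor’s API and makes the laptop’s role as go-between explicit.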
Jeffrey Bigham, an expert in human-computer interaction from Carnegie Mellon University, says Mr Singh’s project is “a great proof of concept” but a system fully capable of recognising sign language would be hard to design “as it requires both computer vision and language understanding that we don’t yet have”. “Alexa doesn’t really understand English either, of course,” he adds, noting that voice assistants understand only a relatively small set of template phrases. Aine Jackson, of the British Deaf Association, says that, with the increase in voice-assisted technologies, many developments are leaving deaf sign language users behind.
“Many of these technologies are shaping the world we live in and with exciting new capabilities there is now the scope for some really imaginative solutions to language access for deaf people.” She notes a number of similar projects, from sign language reading gloves to signing avatars, but also the difficulties in communicating the grammar of signed languages, which is conveyed not just by the hands but by body position and facial movements. “We would encourage companies to take steps to make their technologies accessible for all, and congratulate individuals such as Abhishek Singh who are turning their minds to the matter,” she adds.