Can Automated Speech Recognition improve the lives of people with communication disabilities in Ghana?

Catherine Holloway, Giulia Barbareschi
Jan. 22, 2025
Ghana

In Ghana, people who have difficulties with speech articulation face significant barriers in interacting with others, often leading to social isolation and reduced participation in everyday activities. As part of the AT2030 project, funded by FCDO, Global Disability Innovation Hub (GDI Hub) set out to evaluate the first freely available Android-based application for automated speech recognition (ASR), Google Project Relate, in Ghana.

The Challenge of Communication Disabilities

Communication disabilities affect millions of people worldwide. In Ghana, these challenges are compounded by a lack of Speech and Language Therapy (SLT) services and limited availability of assistive technologies. High rates of stigma surrounding disability further exacerbate the difficulties people face and limit access to life opportunities, including education, healthcare and employment.

The Promise of Technology

The increasing penetration of mobile phones in Ghana presents a unique opportunity to leverage technology for communication support. Project Relate allows people who have speech disabilities to create personalised speech recognition models. These models can then be used to support real-time communication through live captioning or easy-to-understand speech repetition.

How might Google Project Relate help?

Google Project Relate is an English-based ASR app designed for mobile phones with Android 8 and above. It targets individuals with non-standard speech, those for whom off-the-shelf speech-to-text applications do not work well. Users record 500 speech samples, using set phrases as well as sentences they create themselves, to build a personalised speech model. The app offers several functions, including live speech transcription, speech repetition using a synthesized voice, interaction with Google Assistant, and voice typing for SMS and other applications. We conducted a user study involving 10 Speech and Language Therapists and 20 people with communication difficulties to evaluate the practical feasibility of Google Project Relate in Ghana. We wanted to understand the factors that influence the use and effectiveness of the app in a context different from where it was developed.

Potential and challenges of Google Relate

Our study revealed several key findings that highlight the potential and challenges for Ghanaian users of Project Relate:

  1. Access: Access to the app was influenced by the availability of suitable smartphones, internet connectivity, and language. Many participants did not own smartphones with the required specifications, and internet connectivity was often unreliable. Additionally, the application’s reliance on English limited its usability in a multilingual context.
  2. Capacity: The ability to benefit from the app depended on participants’ literacy, digital skills, and support from family and Speech and Language Therapists. Creating custom cards to improve the application’s recognition of local names and terms was crucial but required significant effort and understanding of the technology.
  3. Motivation: Participants’ motivation to use the app varied based on their life circumstances, perceived benefits, and the attitudes of their conversational partners. Successful interactions often depended on the willingness of others to adapt to the new communication method.

Differential Access

The most basic factor determining access to Google Project Relate was the availability of a smartphone that met the minimum specifications required. Although all 20 participants had access to a mobile phone before joining the study, many had older models that lacked the required specifications, or shared the phone with another family member, which would have limited their ability to use the application. The study provided appropriate smartphones to participants, but many mentioned they would have struggled to afford one otherwise.

Internet connectivity also played a significant role in access. While mobile internet penetration in Ghana is increasing, reliability remains an issue, especially outside Accra. Data is also expensive. Participants reported concerns over the amount of data consumed by Google Relate and devised strategies to mitigate this.

Language was another critical factor. The app is only available in English, which limited how much people could use it with conversational partners who did not speak English. The app also often misrecognised or mispronounced local names.

Differential Capacity

Training Google Relate to build a customized voice model required participants to read aloud at least 500 English sentences. This relies on a certain degree of literacy, and some participants faced difficulties with unfamiliar words or sentences. There is an option to have sentences read aloud, but the American accent and pronunciation used by the synthesized voice posed challenges for some participants.

Digital literacy was also an important factor. Participants' digital literacy and their understanding of how Google Relate works influenced the importance they placed on creating custom cards. Those who created more custom cards reported greater satisfaction with the app, as it better recognized words important to them. Support from Speech and Language Therapists, friends, and family members was crucial in helping participants complete the necessary recordings and create custom cards.

Differential Motivation

Participants' motivation to use Google Relate varied based on their specific circumstances. For example, small business owners found the app useful for communicating with customers, while university students used it to assist with writing essays.

The willingness to take risks with the application also varied. Some participants aimed to use Google Relate in specific professional situations, such as arguing cases in court or delivering speeches at conferences. Others were more cautious, using the application in less critical contexts.

The motivation of conversational partners also played a role. Successful interactions depended on others being willing and able to adapt to the new communication method. In some cases, a lack of understanding or acceptance from conversational partners meant the app was not helpful in a particular situation.

Recommendations for Future Development

ASR apps such as Google Relate can improve the lives of people with non-standard speech by enabling them to communicate more easily. But it isn’t as simple as just providing people with an app. This study has enabled us to make several recommendations to improve the development and deployment of Automated Speech Recognition (ASR) technologies in Ghana and similar contexts:

  1. Adapting Language to the Context: ASR applications should consider the linguistic diversity of the regions they are deployed in. Incorporating local languages and dialects can significantly enhance the usability and effectiveness of these technologies.
  2. Building Capacity and Awareness: Training and support for both users and their conversational partners are essential for successful adoption. Building capacity among SLTs, teachers, and community leaders can promote scalability and social acceptance of ASR technologies. Awareness campaigns can help reduce stigma and encourage broader acceptance of technology-mediated communication.
  3. Recognizing the Role of Technology as a Tool: ASR applications should be seen as tools that can support existing communication strategies rather than replacing them. Understanding the limitations and appropriate use cases for these technologies is crucial for maximizing their benefits.

What next?

Our evaluation of Google Project Relate in Ghana highlights the potential of ASR technology to enhance communication equity for individuals with speech difficulties. However, it also underscores the importance of addressing current shortcomings to enable ASR technologies to play a significant role in the lives of people with communication disabilities in Ghana and beyond.

Building on the success of this study, GDI Hub is now collaborating with the University of Ghana, Keio Graduate School of Media Design, and Google Research Africa on "tɛkyerɛma pa" ("Good Tongue"), another AI-based initiative aimed at improving communication for individuals with non-standard speech patterns, also funded by Google and FCDO as part of AT2030. This project focuses on improving AI-powered speech recognition technology and aims to expand the reach of ASR to five major Ghanaian languages: Akan, Ewe, Ikposo, Dagbani, and Dagaare. The project will create the first open-source dataset of non-standard speech in Ghanaian languages. It will not only benefit individuals in Ghana but will also lay the foundation for future language collection and AI modelling in Africa, and provide valuable insights for global AI development. Local speech therapists will gather diverse speech samples from individuals across Ghana. The project team will then train and refine ASR models tailored to the nuances of non-standard speech in Ghanaian languages. Looking ahead, the team aims to expand this technology to other languages and regions, helping to ensure the benefits of AI are shared widely and equitably.