Enhancing Communication Equity: Evaluation of an Automated Speech Recognition Application in Ghana

Catherine Holloway, Giulia Barbareschi, Richard Cave, Gifty Ayoka
May 11, 2024
Ghana
Academic Research Publications

Introduction

Although the global prevalence of communication disabilities is unknown, studies have estimated that as many as 28-49% of people with disabilities worldwide experience difficulties with communication at some point in their lives [39]. Impaired speech is often associated with severe stigma, and people with communication difficulties are amongst the most marginalised groups in society due to existing barriers and discriminatory attitudes [505393]. To avoid failed exchanges and misunderstandings, a person with impaired speech may choose to interact only when sharing essential information, choose only to speak with a familiar communication partner, or let others speak on their behalf [23]. A failed social exchange can carry the message of inferiority [30], resulting in reduced participation [79]. Talking remains the preferred medium of communication for many despite the difficulties they might face, as it represents a powerful medium of identity [20], which people can leverage to communicate mood, humour, geographical, social and educational background, health status and gender, as well as the content of the message [61]. If a person's speech becomes difficult to understand, individual and social identity can be negatively affected, increasing the risk of social withdrawal [102].
 
In Ghana and other West African countries, these barriers are often more pronounced as a result of compounding factors that range from the lack of Speech and Language Therapy (SLT) services and poor availability of assistive technologies to support communication [233859101], to the stigmatising cultural beliefs that label disability as a curse [98101]. Within this context, the expanding technological infrastructure and the increasing penetration rate of mobile phones across all population segments offer a two-fold opportunity. First, they offer a viable way to expand the training and long-term professional development of Speech and Language Therapists (SLTs) in Ghana [38]. Second, mobile phones represent a vital asset to support people who experience difficulties with communication in their everyday lives [12133360677681].

Recent developments in the creation of bespoke language models have expanded the possibility of using Automatic Speech Recognition (ASR) software for people with dysarthria, the collective term for a group of neurologic speech disorders linked to muscular dysfunctions, or other conditions that affect the ability to articulate speech [727386103]. These technologies do not alter how the disabled person speaks but can help listeners better understand what is said by repeating or transcribing words and sentences in real time to facilitate communication [161758]. An example is Project Relate, an English-based ASR application freely available for mobile phones running Android 8 and above. It specifically targets individuals with ‘non-standard’ speech, described as speech that differs from the accepted and recognisable speech of adults in a particular language [1]. The application requires a minimum of 500 samples of an individual’s speech, collected by recording several pre-set phrases, to build a customised speech model. This model can then be used to produce automated real-time transcription, facilitate interaction with Google Assistant, enable voice typing for SMS and other functions, and repeat speech using a synthesised, easy-to-understand voice [2].
 
The Technology Amplification Theory [87] holds that, without adequate support and infrastructure, technology does not deliver additive or transformative benefits but instead amplifies current trends and social inequalities. The mechanisms of amplification revolve around three dimensions: access, capacity, and motivation. Disregarding these differential aspects means that innovative technologies such as Project Relate may only positively impact a small minority of potential users with communication difficulties, while leaving the majority in even more marginalised positions.
 
Although Project Relate officially became available in Ghana in 2022 [1], we noticed very limited awareness of it within Disabled People’s Organisations (DPOs) and SLT organisations in Ghana. Furthermore, as the application was developed in the US and is available only in English, it was unclear to what extent the different context could affect its use. Finally, as is the case for most digital products and assistive technologies, users are likely to require specific resources, competencies and support to integrate Project Relate into their lives and benefit from its use. However, to date, there is limited guidance available for potential users of Project Relate, such as training, technical support, clear discussion of limitations, and advice about navigating barriers or troubleshooting issues. It was unclear whether any of these factors affected the use and usefulness of Project Relate for Ghanaian users.
 
To identify the mechanisms that determine differences in access, capacity and motivation among users of Project Relate in Ghana, we conducted a 6-week study involving 10 SLTs and 20 adults with communication difficulties. In line with other HCI studies examining the use of mobile phones amongst marginalised populations, we leveraged the lens of the Technology Amplification Theory [41879196] to analyse data collected during training observations, semi-structured interviews and 4 weeks of self-reported accounts from participants using a photovoice approach.
 
Our results show that differential access to the application and its features is determined by a variety of factors, including the severity of the dysarthria and the presence of other functional impairments, the type of phone, the availability of data, and the language of the user as well as that of their conversation partner. Differential capacity is affected by the person’s literacy, by their ability to create and record customised sentences (which strengthen the language model and make Project Relate more effective in everyday life), and by the stigma that prevents people from interacting and communicating with others beyond their immediate circle. Finally, differential motivation depends on individuals’ specific life circumstances, which determine various use-cases. As Project Relate is meant to be used in conversational settings with another person, we highlight how the user’s motivation alone is insufficient; their conversational partners must also accept this new form of interaction.
 
Based on these results, we provide three key recommendations for the development and deployment of assistive and accessible technologies for communication in Ghana and potentially other contexts in the Global South: 1) Understanding the contextual nature of language, not only in relation to different national languages, but also to vocabulary and expressions at an individual and social level; 2) Considering stakeholders beyond the users, including their support structure within and beyond the family, SLTs, and strangers, to help normalise perceptions of technology-mediated communication; 3) Acknowledging the strengths and limitations of ASR technologies to understand in which situations they can be beneficial and when other strategies are preferable.
 
In summary, our study makes the following contributions to HCI:
 
• A novel study examining the experiences of people with communication difficulties in Ghana with mobile phone-based ASR technologies for non-standard speech
• An analysis of the factors determining differential access, capacity, and motivation in the use of mobile phone-based ASR technology for non-standard speech, which can exacerbate existing social inequalities
• A series of recommendations for HCI researchers and developers of ASR and other communication technologies specifically targeting users with communication difficulties living in the Global South