Speech technology: so much more than a voice assistant
‘Hey Google! Tell me a joke.’ Speech technology is increasingly becoming part of our everyday life. For many people, it is no more than a fun gimmick built into a telephone or a virtual assistant. But the underlying technology can be used for very serious applications, for example in the medical world or in preserving languages and cultures, explains Matt Coler.
Text: Bart Talens/Industry Relations, UG
Matt Coler is an associate professor in Language and Technology at the University of Groningen Campus Fryslân. Together with his research group, he works in the field of culture, language and technology, studying various aspects of speech technology. Speech technology is the use of artificial intelligence to automatically convert speech to text or written text to speech, or to automatically recognize specific speech or voice features. Coler is supervising PhD students Vass Verkhodanova and Phat Do, who are conducting research on this topic.
Recognize Parkinson’s disease based on speech
Verkhodanova investigates how neurologists can recognize neurodegenerative diseases, such as Parkinson’s disease, based on people’s speech and voice. For example, some doctors are very good at recognizing dysarthria, an articulatory disorder. Without running any tests, they can recognize the disorder immediately when they hear a patient speak. Coler compares this to standing in a busy metro station in a Spanish-speaking country. If you hear someone speak Dutch in such a setting, your brain will easily single it out and the sound will reach you, even though it is probably not very salient acoustically. This is because the brain picks up certain signals in the sound and places extra emphasis on that information.
Monitoring a disease via telephone conversations
Verkhodanova investigates how such signals in voice and speech use develop in different disease profiles, and how you can use artificial intelligence to automate recognition. In the next research phase, doctors could, for instance, monitor the development of a disease in a patient via telephone conversations. This would make care easier and more accessible for many people. Someone with a higher risk of developing a specific neurodegenerative disease could then be automatically screened via a short telephone conversation.
Computer voice for Frisian
Phat Do, Coler’s other PhD student, is investigating speech synthesis and is working on creating a computer voice for the Frisian language. To develop a computer voice, the program has to integrate large amounts of speech data from the relevant language. Some smaller languages only have a few small speech corpora, and it is not always possible to collect more data. This is why, in training his Frisian computer voice, Do uses both a Frisian data set and additional speech data from other languages, with a focus on ‘imitating’ a natural melody and sound. Thanks to this exceptional technology, Do is developing an artificial ‘voice’ that not only speaks Frisian, but also sounds very natural.
No gadgets
Campus Fryslân researchers take an interdisciplinary approach, in which collaboration between researchers, the corporate sector and societal partners plays a central role. Coler’s research group, with its application-driven focus, is certainly no exception. Coler: ‘If you wonder about the usefulness of these kinds of applications, it is good to realize that we are not studying gadgets here. This technology has very important real-life applications: think of patients with throat cancer, who can no longer speak in a healthy way or who suffer from language production disorders. A natural-sounding artificial voice can greatly improve these patients’ quality of life.’
Societal responsibility
Coler explains that big tech companies also work a lot with artificial speech. Think of Apple’s Siri or Amazon’s Alexa. ‘These companies work on the basis of a revenue model. They are not likely to include dialects or minority languages in their technology, simply because there is no market for it. However, this does not mean that there is no need for it.’ He continues: ‘This is where universities have a responsibility to work between the market and society, to continue to develop these technologies and prepare them for various uses. Instead of competing with big companies on basic speech technology applications, our research group tries to diversify the field by focusing on applications that improve quality of life.’
Great potential
The development of speech technology therefore has great potential for societal impact and relevance, a potential that is not yet fully realized. Speech technology applications can even play a role in court cases. For example, the main suspect in the trial surrounding the death of American teenager Trayvon Martin was acquitted due, among other things, to his claim that he could be heard shouting for help in a recorded call to the emergency services. Many people doubted whether the voice in question really belonged to the suspect. In such cases, voice recognition could play a crucial and decisive role in the administration of justice. In addition, the medical world has a great need for applications for patients. And speech synthesis can contribute to protecting and preserving endangered languages.
Last modified: 01 June 2021 08.51 a.m.