Speech Dating: Finding Love With A Computer Voice

December 11, 2012

Saying: “I love you,” should sound quite different from saying: “I hate you,” but in Lee Ridley’s case they both sound exactly like anything else he would say.

Lee is Britain’s only stand-up comedian to use a solely computer-generated voice, as he is unable to speak.

His new film sketch, Voice by Choice, follows three people who use speech-synthesis technology as they meet at a speed-dating event.

The film was put together in collaboration with the Creative Speech Technology (Crest) Network to show how valuable speech synthesisers can be and to illustrate the difficulties of living with some of the current technology.

As the three romantic hopefuls make their introductions, they notice they all have the same voice – even though one is a woman and two are men.

There are some awkward pauses as they give each other time to get their words in order on their machines, and comic mishaps with predictive text – all based on real-life experiences.

The charity Communication Matters says at least 30,000 children and adults in the UK could benefit from speech-synthesis technology.

And it predicts this number will increase as the population ages and more people with complex needs survive.

People need these devices for a variety of reasons. Some are born without the ability to learn the process of speech, due to conditions affecting the brain and the muscles involved in speech, for example in some cases of cerebral palsy.

Other people develop conditions that lead to a deterioration in their speech in later years – for example motor neurone disease, in which the muscles used to speak may weaken, or strokes affecting certain areas of the brain.

“There is an urgent need for this type of technology to be more widely available and for it to be more reliable and personal,” says Dr Alistair Edwards, co-principal investigator of the Crest Network, based at the University of York.

‘More identity’

Lee says he would be lost without his machine.

“I don’t need to rely on other people to get my message across any more,” he adds.

“It has made me a lot more independent and a lot more confident.”

But he says it is still really hard to show how he feels.

“It’s pretty disappointing when you want to express how you feel and it just doesn’t come out right.”

If Lee could choose any voice, he would like to try one with a Geordie accent, so he could have “a bit more of an identity”.

Nicola Bush, the actress in the sketch, received her first speech-synthesis device at the age of 15.

She says: “I felt dead, but when I got my first voice it opened important doors for me.”

Nicola says her device allows her to have a closer relationship with the people around her, but she hopes more children will be given these devices at a younger age.

‘Emotions are difficult’

David Niemeijer, founder of AssistiveWare, one of the companies involved in this technology, says there are a number of reasons the devices have been slow to change.

“It is a complex process. It is very costly and people just accepted there were no children’s voices, for example,” he says.

“As everyone accepted that, there was little incentive to change.”

His organisation has worked with the company Acapela to make Britain’s first speech synthesisers that use children’s voices.

Until their system was launched earlier this year, children – who are the most frequent users of this technology – had to use machines with adult voices or an adult voice processed to sound more like a child’s.

“To build the voice, we record a real person for 15-18 hours and the data gets cut up into little pieces and stitched back together so the voice can say anything – even things the person never said,” says Chris Pidcock, chief voice engineer at CereProc, a company that develops speech-synthesis technology.

CereProc has recently built a voice with a Brummie accent.

‘Designer voices’

“Because making these voices takes a lot of effort and expense, most people in the past focused on neutral-sounding voices. People were quite cautious, but this is changing,” Mr Pidcock says.

“Emotions are more difficult because the voice does not know what your intentions are. It can’t know what emotion you want,” says Mr Niemeijer.

Another significant problem with current systems is the inability to chat in real time, says David Mason, of Toby Churchill, the company that makes Lee’s machine.

This means there can be lengthy pauses as people put their words together, but the company is working with academics to improve this and make the voices more natural.

“Text-to-speech is getting better. But it can never replace human speech. People are spectacular in terms of all the nuances they can offer,” says Mr Niemeijer.

Dr Chris Newell, co-principal investigator of the Crest Network, says that as technology improves, there might even be an opportunity to go further.

“Perhaps we can create voices that are more special than regular human voices or even designer voices – maybe one day you could choose a voice so you sound sexy or sound like a film star.”
