Inconsistencies with differe...

May 2, 2011

Have you ever experimented with different TTS (Text-to-Speech) engines (AT&T Natural Voices, Cepstral, Nuance Realspeak) and found that each one behaves differently with your IVR (Interactive Voice Response) code?

Well, that’s because each engine DOES behave differently!

Let’s take a look at a specific IVR example using a Spanish TTS voice for each of the different TTS engines.

For AT&T Natural Voices:

<?xml version="1.0" encoding="latin-1"?>
<vxml version="2.0">
  <form>
    <block>
      <prompt>
        <voice name="rosa">
          Hola. Su sombrero es muy grande.
        </voice>
      </prompt>
    </block>
  </form>
</vxml>

For Cepstral:

<?xml version="1.0" encoding="latin-1"?>
<vxml version="2.0">
  <form>
    <block>
      <prompt>
        <voice name="Marta">
          Hola. Su sombrero es muy grande.
        </voice>
      </prompt>
    </block>
  </form>
</vxml>

For Nuance Realspeak:

<?xml version="1.0" encoding="latin-1"?>
<vxml version="2.0">
  <form>
    <block>
      <prompt>
        <speak xml:lang="es-MX"><voice name="Paulina" gender="female">
          Hola. Su sombrero es muy grande.
        </voice></speak>
      </prompt>
    </block>
  </form>
</vxml>

One of these things is DEFINITELY not like the others. Nuance Realspeak needs the voice wrapped in a <speak> element with xml:lang="es-MX", plus a gender attribute on the <voice> tag, in order to get consistent behavior for this prompt. AT&T Natural Voices and Cepstral do not need these extra tags; a <voice> tag with the voice name is enough.
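If you generate your VoiceXML from application code, one way to keep these differences in one place is a small lookup table keyed by engine. The following is only a rough sketch in Python: the voice names and the extra Realspeak markup come from the three examples above, while the engine keys ("att", "cepstral", "realspeak"), the ENGINE_VOICES table, and the build_prompt function are hypothetical names made up for illustration, not part of any engine's API.

```python
from xml.sax.saxutils import escape, quoteattr

# Voice names and extra attributes are taken from the three examples above.
# The engine keys and this table layout are hypothetical, for illustration only.
ENGINE_VOICES = {
    "att": {"voice": "rosa"},
    "cepstral": {"voice": "Marta"},
    "realspeak": {"voice": "Paulina", "gender": "female", "lang": "es-MX"},
}


def build_prompt(engine: str, text: str) -> str:
    """Return a <prompt> element using the markup shown for the given engine."""
    cfg = ENGINE_VOICES[engine]

    # Every engine in the examples uses a <voice> tag with a name attribute.
    attrs = f"name={quoteattr(cfg['voice'])}"
    if "gender" in cfg:
        attrs += f" gender={quoteattr(cfg['gender'])}"
    body = f"<voice {attrs}>{escape(text)}</voice>"

    # Only the Realspeak example wraps the voice in <speak xml:lang="...">.
    if "lang" in cfg:
        body = f"<speak xml:lang={quoteattr(cfg['lang'])}>{body}</speak>"

    return f"<prompt>{body}</prompt>"


if __name__ == "__main__":
    for engine in ENGINE_VOICES:
        print(engine, build_prompt(engine, "Hola. Su sombrero es muy grande."))
```

Switching engines then means changing one key rather than editing every prompt by hand. Whether other voices or languages need the same wrapper is something to verify against your own platform.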

So, the next time you’re implementing your IVR code with a foreign-language TTS voice, be mindful of the inconsistencies that exist among the different TTS engines.
