Plum Voice Platform v. 3.0
© 2008 Plum Group, Inc. All rights reserved.
<gender>:
AT&T Natural Voices, Cepstral Engine:
This attribute works fine for these speech engines.
RealSpeak Engine:
The gender attribute should not be used if the name attribute is already being used for the <voice> tag.
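As a minimal sketch of the two cases above (the prompt text is illustrative, and which voice an engine picks for a given gender is not specified here):

```xml
<!-- AT&T Natural Voices or Cepstral Engine: select a voice by gender -->
<voice gender="female">Thank you for calling.</voice>

<!-- RealSpeak Engine: the voice is already chosen by name, so gender is left out -->
<voice name="Jennifer">Thank you for calling.</voice>
```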
<age>:
AT&T Natural Voices:
This attribute is not supported.
Cepstral Engine:
This attribute looks for an exact match, instead of looking for the closest match. For example, <voice age="10"> will only select a ten-year-old voice, or fall back to the default voice if one is not found.
RealSpeak Engine:
This attribute is not supported.
<name>:
If you have an onsite system, please contact your sales account manager to find out which of these voices are installed on your server.
The following names are supported by their respective engines:
AT&T Natural Voices:
| Language | Name | Gender | US | UK |
|---|---|---|---|---|
| American English (en_us) | Mel | male | x | |
| American English (en_us) | Mike | male | x | x |
| American English (en_us) | Ray | male | x | x |
| American English (en_us) | Rich | male | x | |
| American English (en_us) | Claire | female | x | |
| American English (en_us) | Crystal | female | x | x |
| American English (en_us) | Julia | female | x | |
| American English (en_us) | Lauren | female | x | x |
| Spanish (es_us) | Alberto | male | x | |
| Spanish (es_us) | Rosa | female | x | |
| British English (en_uk) | Charles | male | x | x |
| British English (en_uk) | Anjali | female | x | |
| British English (en_uk) | Audrey | female | x | x |
| French (fr_fr) | Alain | male | x | |
| French (fr_fr) | Juliette | female | x | x |
| German (de_de) | Reiner | male | x | x |
| German (de_de) | Klara | female | x | x |
If no name is specified, Mike is the default voice for the US AT&T Natural Voices, and Charles is the default voice for the UK AT&T Natural Voices.
Cepstral Engine (case-sensitive):
| Language | Name | Gender | US | UK |
|---|---|---|---|---|
| American English (en_us) | David | male | x | x |
| American English (en_us) | William | male | x | x |
| American English (en_us) | Diane | female | x | x |
| Spanish (es_us) | Miguel | male | x | x |
| Spanish (es_us) | Marta | female | x | x |
| British English (en_uk) | Lawrence | male | x | x |
| British English (en_uk) | Millie | female | x | x |
| French (fr_fr) | Jean-Pierre | male | x | x |
| French (fr_fr) | Isabelle | female | x | x |
| German (de_de) | Matthias | male | x | x |
| German (de_de) | Katrin | female | x | x |
| Italian (it_it) | Vittoria | female | x | x |
If no name is specified, Diane is the default voice for the US Cepstral Engine, and Millie is the default voice for the UK Cepstral Engine.
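Because Cepstral names are case-sensitive, the spelling in the name attribute has to match the table exactly. A small sketch follows; the prompt text is illustrative, and what the engine does with a non-matching name (for example, falling back to the default voice) is an assumption rather than something stated above:

```xml
<!-- Matches the table entry exactly -->
<voice name="William">Welcome back.</voice>

<!-- Lower-case spelling: not expected to select William on a case-sensitive engine -->
<voice name="william">Welcome back.</voice>
```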
RealSpeak Engine (case-sensitive):
| Language | Name | Gender | US | UK |
|---|---|---|---|---|
| American English (en-US) | Tom | male | x | |
| American English (en-US) | Jennifer | female | x | |
| American English (en-US) | Jill | female | x | |
| American English (en-US) | Samantha | female | x | |
| Mexican Spanish (es-MX) | Javier | male | x | |
| Mexican Spanish (es-MX) | Paulina | female | x | |
| British English (en-GB) | Daniel | male | x | x |
| British English (en-GB) | Emily | female | x | x |
| Australian English (en-AU) | Lee | male | x | |
| Australian English (en-AU) | Karen | female | x | |
| Canadian French (fr-CA) | Felix | male | x | |
| Canadian French (fr-CA) | Julie | female | x | |
| Portuguese (pt-PT) | Madalena | female | x | |
| Brazilian Portuguese (pt-BR) | Raquel | female | x | |
| German (de-DE) | Yannick | male | x | |
| German (de-DE) | Steffi | female | x | x |
| Spanish (es-ES) | Diego | male | x | |
| Spanish (es-ES) | Isabel | female | x | |
| French (fr-FR) | Sebastien | male | x | |
| French (fr-FR) | Virginie | female | x | |
| Italian (it-IT) | Silvia | female | x | x |
| Dutch (nl-NL) | Claire | female | x | x |
| Belgian Dutch (nl-BE) | Ellen | female | x | |
| Mandarin Chinese (zh-CN) | Mei-Ling | female | x | |
If no name is specified, Jill is the default voice for the US RealSpeak Engine, and Emily is the default voice for the UK RealSpeak Engine.
Please contact your account manager if you want any of the following RealSpeak voices:
| Language | Name | Gender |
|---|---|---|
| Danish (da-DK) | Nanna | female |
| Italian (it-IT) | Paolo | male |
| Indian English (en-IN) | Sangeeta | female |
| Spanish (es-ES) | Monica | female |
| Basque (eu-ES) | Arantxa | female |
| Japanese (ja-JP) | Kyoko | female |
| Korean (ko-KR) | Narae | female |
| Korean (kr-KR) | Narae | female |
| Norwegian (no-NO) | Nora | female |
| Polish (pl-PL) | Agata | female |
| Russian (ru-RU) | Katerina | female |
| Swedish (sv-SE) | Ingrid | female |
| Hong Kong Cantonese (zh-HK) | Sin-ji | female |
For the RealSpeak Engine, this attribute MUST be used along with its corresponding xml:lang attribute if the language is not en-US (American English). For example, to hear the Mexican Spanish voice "Javier", one must type the following:
<speak xml:lang="es-MX"><voice name="Javier"> ¿Le gustan los huevos? </voice></speak>
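The same pairing applies to the other RealSpeak voices. As a sketch using names and codes from the table above (the prompt text is illustrative):

```xml
<!-- en-US voices need no xml:lang -->
<speak><voice name="Tom">Welcome to the main menu.</voice></speak>

<!-- Any other language pairs the voice name with its xml:lang code -->
<speak xml:lang="fr-CA"><voice name="Julie">Bienvenue au menu principal.</voice></speak>
<speak xml:lang="en-GB"><voice name="Emily">Welcome to the main menu.</voice></speak>
```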
NOTE: For US hosted speech recognition, we currently offer only American English, Spanish, and French-Canadian. If you are interested in any other speech recognition languages, please contact your sales representative.
NOTE: For UK hosted speech recognition, we currently offer only American English and British English. If you are interested in any other speech recognition languages, please contact your sales representative.
<xml:lang>:
If you have an onsite system, please contact your sales account manager to find out which of these languages are installed on your server.
The following languages are supported by their respective engines:
AT&T Natural Voices:
| Language | Code Value | US | UK |
|---|---|---|---|
| German | de_de | x | x |
| British English | en_uk | x | x |
| American English | en_us | x | x |
| Spanish | es_us | x | x |
| French | fr_fr | x | x |
Cepstral Engine:
| Language | Code Value | US | UK |
|---|---|---|---|
| American English | en_us | x | x |
RealSpeak Engine:
| Language | Code Value | US | UK |
|---|---|---|---|
| American English | en-US | x | |
| Mexican Spanish | es-MX | x | |
| Canadian French | fr-CA | x | |
| German | de-DE | x | |
| British English | en-GB | x | |
| French | fr-FR | x | |
| Spanish | es-ES | x | |
| Belgian Dutch | nl-BE | x | |
| Dutch | nl-NL | x | |
Please contact your account manager if you want any of the following RealSpeak languages:
| Language | Code Value |
|---|---|
| Danish | da-DK |
| Swiss German | de-CH |
| Australian English | en-AU |
| Indian English | en-IN |
| Basque | eu-ES |
| Belgian French | fr-BE |
| Swiss French | fr-CH |
| Swiss Italian | it-CH |
| Italian | it-IT |
| Japanese | ja-JP |
| Korean | ko-KR |
| Korean | kr-KR |
| Norwegian | no-NO |
| Polish | pl-PL |
| Brazilian Portuguese | pt-BR |
| Portuguese | pt-PT |
| Russian | ru-RU |
| Swedish | sv-SE |
| Mandarin Chinese | zh-CN |
| Hong Kong Cantonese | zh-HK |
Note that the RealSpeak Engine uses a different syntax for the xml:lang attribute: a hyphen and an upper-case region code, where the other engines use an underscore and lower case. For example, you would type <voice xml:lang="fr-FR"> to hear a French speaker with the RealSpeak Engine, but <voice xml:lang="en_us"> to hear an American speaker with the AT&T Natural Voices Engine or the Cepstral Engine.
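To make the difference concrete, here is the same language requested on two engines, using code values from the tables above (the prompt text is illustrative):

```xml
<!-- RealSpeak Engine: hyphen, upper-case region code -->
<voice xml:lang="fr-FR">Bonjour.</voice>

<!-- AT&T Natural Voices: underscore, lower case -->
<voice xml:lang="fr_fr">Bonjour.</voice>
```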
An "x" marks that the Child Tag is supported by the speech engine. An asterisk (*) means that there are notes to explain the difference between the speech engines.
| Child Tag | AT&T Natural Voices | Cepstral Engine | RealSpeak Engine |
|---|---|---|---|
| <break>* | x | x | x |
| <emphasis> | | | |
| <enumerate> | | | |
| <mark> | | | |
| <paragraph>* | x | x | x |
| <phoneme>* | x | x | |
| <prosody>* | x | x | x |
| <say-as>* | x | x | x |
| <sentence>* | x | x | x |
| <speak> | x | x | x |
| <sub> | x | x | x |
| <value> | x | x | x |
<break>:
AT&T Natural Voices:
The break element works when the voice is an en_us (American English) speaker or when the language is set to en-us (American English). For the other languages (de_de (German), fr_fr (French), en_uk (British English), and es_us (Spanish)), the "size" attribute does not work.
Cepstral Engine:
The "size" attribute of the break element does not work for the Cepstral Engine.
RealSpeak Engine:
The break element works fine for the RealSpeak Engine.
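A short sketch of the element in use follows. The size value "medium" and the time attribute are assumptions taken from general SSML drafts, not from the notes above, and whether time is honored by each engine is not stated here:

```xml
<!-- size: expected to work on RealSpeak and on AT&T en_us voices -->
Please hold. <break size="medium"/> Connecting you now.

<!-- time: an explicit duration, shown as a possible alternative where size is ignored -->
Please hold. <break time="500ms"/> Connecting you now.
```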
<paragraph>:
Cepstral Engine:
The "xml:lang" attribute does not work with the paragraph element.
<phoneme>:
AT&T Natural Voices and Cepstral Engine:
The phoneme element works fine using the Phoneme Sets shown below.
RealSpeak Engine:
This element is not supported.
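As a hedged sketch, the transcriptions from the phoneme tables below could be supplied like this. It assumes the transcription string is passed verbatim in the standard SSML ph attribute; any alphabet attribute an engine may require is omitted because it is not documented here:

```xml
<!-- AT&T Natural Voices, US English: "Bob" -->
<phoneme ph="b aa b 1">Bob</phoneme>

<!-- Cepstral Engine, US English: "father" -->
<phoneme ph="f aa1 dh er0">father</phoneme>
```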
Phoneme Set for AT&T Natural Voices:
US English:
| Phoneme | Example | Transcription |
|---|---|---|
| aa | Bob | b aa b 1 |
| ae | bat | b ae t 1 |
| ah | but | b ah t 1 |
| ao | bought | b ao t 1 |
| aw | down | d aw n 1 |
| ax | about | ax 0 b aw t 1 |
| ay | bite | b ay t 1 |
| b | bet | b eh t 1 |
| ch | church | ch er ch 1 |
| d | dig | d ih g 1 |
| dh | that | dh ae t 1 |
| dx | butter | b ah 1 dx er 0 |
| eh | bet | b eh t 1 |
| em | Chatham | ch ae 1 dx em 0 |
| en | satin | s ae 1 q en 0 |
| er | bird | b er d 1 |
| ey | bait | b ey t 1 |
| f | fog | f ao g 1 |
| g | got | g aa t 1 |
| hh | hot | hh aa t 1 |
| ih | bit | b ih t 1 |
| iy | beat | b iy t 1 |
| jh | jump | jh ah m p 1 |
| k | cat | k ae t 1 |
| l | lot | l aa t 1 |
| m | Mom | m aa m 1 |
| n | nod | n aa d 1 |
| ng | sing | s ih ng 1 |
| ow | boat | b ow t 1 |
| oy | boy | b oy 1 |
| p | pot | p aa t 1 |
| q | button | b ah 1 q en 0 |
| r | rat | r ae t 1 |
| s | sit | s ih t 1 |
| sh | shut | sh ah t 1 |
| t | top | t aa p 1 |
| th | thick | th ih k 1 |
| uh | book | b uh k 1 |
| uw | boot | b uw t 1 |
| v | vat | v ae t 1 |
| w | won | w ah n 1 |
| y | you | y uw 1 |
| z | zoo | z uw 1 |
| zh | measure | m eh 1 zh er 0 |
0 Unstressed
1 Primary stress
2 Secondary stress
& Word boundary
UK English:
| Phoneme | Example | Transcription |
|---|---|---|
| p | point | p OI n t 1 |
| b | big | bIg1 |
| t | team | t i: m 1 |
| d | dare | de@1 |
| k | case | k eI s 1 |
| g | good | gUd1 |
| dZ | ginger | dZ I n 1 dZ @ 0 |
| tS | check | tS e k 1 |
| f | fool | f u: l 1 |
| v | vest | vest1 |
| D | this | DIs1 |
| T | thick | TIk1 |
| s | sell | sel1 |
| z | zeal | z i: l 1 |
| S | shoot | S u: t 1 |
| Z | measure | me1Z@0 |
| h | house | h aU s 1 |
| m | main | m eI n 1 |
| n | name | n eI m 1 |
| N | sing | sIN1 |
| l | life | l aI f 1 |
| @l | bottle | b Q 1 t @l 0 |
| r | right | r aI t 1 |
| j | yes | jes1 |
| w | wood | wUd1 |
| i: | beat | b i: t 1 |
| I | bit | bIt1 |
| eI | bait | b eI t 1 |
| e | bet | bet1 |
| A: | father | f A: 1 D @ 0 |
| { | bat | b{t1 |
| @U | boat | b @U t 1 |
| O: | bought | b O: t 1 |
| Q | boss | bQs1 |
| u: | boot | b u: t 1 |
| U | book | bUk1 |
| V | but | bVt1 |
| 3: | bird | b 3: d 1 |
| aU | bout | b aU t 1 |
| OI | boy | b OI 1 |
| aI | bite | b aI t 1 |
| @ | scallop | sk{1l@p0 |
| I | believe | b I 0 l i: v 1 |
0 Unstressed
1 Primary stress
2 Secondary stress
& Word boundary
Phoneme Set for Cepstral Engine:
US English:
| Phoneme | Example | Transcription |
|---|---|---|
| aa | father | f aa1 dh er0 |
| ae | cat | k ae1 t |
| ah | about | ah0 b aw1 t |
| ao | bought | b ao1 t |
| aw | cow | k aw1 |
| ay | buy | b ay1 |
| b | book | b uh1 k |
| ch | catch | k eh1 ch |
| d | bad | b ae1 d |
| dh | then | dh eh1 n |
| eh | get | g eh1 t |
| er | earth | er1 th |
| ey | ate | ey1 t |
| f | fat | f ae1 t |
| g | good | g uh1 d |
| h | hello | h eh0 l ow1 |
| i | sheep | sh i1 p |
| ih | ship | sh ih1 p |
| j | yes | j eh1 s |
| jh | digit | d ih1 jh ih0 t |
| k | camera | k ae1 m r ah0 |
| l | late | l ey1 t |
| m | man | m ae1 n |
| n | new | n uw1 |
| ng | bang | b ae1 ng |
| ow | float | f l ow1 t |
| oy | boy | b oy1 |
| p | camper | k ae1 m p er0 |
| r | car | k aa1 r |
| s | sit | s ih1 t |
| sh | ship | sh ih1 p |
| t | tap | t ae1 p |
| th | thin | th ih1 n |
| uh | full | f uh1 l |
| uw | moon | m uw1 n |
| v | have | h ae1 v |
| w | water | w ao1 t er0 |
| z | zero | z i1 r ow0 |
| zh | vision | v ih1 zh ah0 n |
0 Unstressed
1 Primary stress
2 Secondary stress
& Word boundary
UK English:
| Phoneme | Example | Transcription |
|---|---|---|
| t | tap | t ae1 p |
| p | pat | p ae1 t |
| b | book | b uh1 k |
| d | done | d ah1 n |
| k | camera | k ae1 m r ah0 |
| g | good | g uh1 d |
| ch | chart | ch a1 t |
| jh | jack | jh ae1 k |
| f | fat | f ae1 t |
| v | various | v e@1 r i0 ih0 s |
| th | thin | th ih1 n |
| dh | then | dh eh1 n |
| s | sit | s ih1 t |
| z | zero | z i1 r ow0 |
| sh | clash | k l ae1 sh |
| zh | vision | v ih1 zh ah0 n |
| h | hello | h eh1 l ow0 |
| m | man | m ae1 n |
| n | new | n j uw1 |
| ng | sitting | s ih1 t ih0 ng |
| r | reason | r i1 z ah0 n |
| l | late | l ey1 t |
| w | water | w ao1 t er0 |
| j | yellow | j eh1 l ow0 |
| i | sheep | sh i1 p |
| ih | image | ih1 m ih0 jh |
| eh | end | eh1 n d |
| ae | bank | b ae1 ng k |
| er | earth | er1 th |
| ah | about | ah0 b aw1 t |
| a | father | f a1 dh er0 |
| oa | on | oa1 n |
| ao | bought | b ao1 t |
| uh | could | k uh1 d |
| uw | moon | m uw1 n |
| ay | buy | b ay1 |
| aw | cow | k aw1 |
| oy | oyster | oy1 s t er0 |
| ow | float | f l ow1 t |
| ey | bacon | b ey1 k ah0 n |
| e@ | fairly | f e@1 l i0 |
| i@ | weary | w i@1 r i0 |
0 Unstressed
1 Primary stress
2 Secondary stress
& Word boundary
<prosody>:
AT&T Natural Voices:
The prosody element works with this engine. You can specify a preset rate ("fast", "medium", "slow", or "default"), but presets are not recommended because they make the voice either too slow or too fast. The "rate" attribute can also be set to a numeric value such as "100.0" or "50.0"; a normal voice rate is around "150.0". These values do not follow the SSML spec, where rates are specified relative to 1. You can also adjust the rate with percentages: "+50%" makes the voice 50% faster, and "-50%" makes it 50% slower.
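The rate values described above would look like this in a prompt (the spoken text is illustrative):

```xml
<!-- Numeric rate; around 150.0 is described as a normal speed -->
<prosody rate="150.0">Your balance is forty dollars.</prosody>

<!-- Relative changes -->
<prosody rate="+50%">This sentence is spoken faster.</prosody>
<prosody rate="-50%">This sentence is spoken slower.</prosody>
```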
Cepstral Engine:
The prosody element works with the Cepstral Engine. The "pitch" attribute works only with the Cepstral Engine.
RealSpeak Engine:
When using a RealSpeak TTS voice, the speaking rate does not return to normal after a <prosody> tag has been used. To restore it, use the <prosody> tag again with the "volume" attribute set to "100.0" and the "rate" attribute set to "default".
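One way to read the reset described above is sketched below. The "fast" preset and the choice to wrap the following text in the second <prosody> tag are assumptions made for illustration:

```xml
<voice name="Jill">
  <prosody rate="fast">This sentence is spoken quickly.</prosody>
  <!-- Reset: without this, later text is reported to keep the changed speed -->
  <prosody volume="100.0" rate="default">This sentence is back to normal speed.</prosody>
</voice>
```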
<say-as>:
The table below shows the <say-as> tag types and the speech engines that support them. An "x" marks that the <say-as> tag is supported by the speech engine.
| Say-as Tag Types | AT&T Natural Voices | Cepstral Engine | RealSpeak Engine |
|---|---|---|---|
| acronym* | x | x | |
| address | x | x | x |
| number | x | x | x |
| number:cardinal | x | x | x |
| number:ordinal | x | x | x |
| number:digits | x | x | |
| number:decimal | x | x | x |
| number:fraction | x | x | x |
| number:telephone | x | x | x |
| date | x | x | x |
| date:dmy* | x | x | x |
| date:mdy* | x | x | x |
| date:ymd* | x | x | |
| date:ym* | x | x | |
| date:my* | x | x | x |
| date:md* | x | x | x |
| date:dm* | x | x | x |
| date:y* | x | x | x |
| date:m | x | x | x |
| date:d | x | x | |
| date:day | x | | |
| digits | x | | |
| duration | x | | |
| duration:h | x | | |
| duration:hm | x | | |
| duration:m | x | | |
| duration:ms | x | | |
| duration:s | x | | |
| measure* | x | x | x |
| name | x | | |
| net:email | x | x | x |
| net:uri | x | x | x |
| time* | x | x | x |
| time:h | x | x | x |
| time:hm | x | x | x |
| time:hms | x | x | |
| spell | x | | |
| telephone* | x | x | x |
| currency* | x | x | x |
acronym: The acronym tag type works in the US but not in the UK. To spell out words or read back digits in the UK with AT&T Natural Voices, separate the characters with commas, as in "a, c, r, o, n, y, m" or "1, 2, 3, 4, 5".
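A sketch of both approaches follows. It assumes the tag type is supplied through a type attribute on <say-as>, and the sample text is illustrative:

```xml
<!-- US: the acronym tag type -->
<say-as type="acronym">IVR</say-as>

<!-- UK with AT&T Natural Voices: comma-separated characters instead of say-as -->
<voice name="Audrey">I, V, R</voice>
<voice name="Audrey">1, 2, 3, 4, 5</voice>
```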
date:mdy: The preferred format of this tag is "month abbreviation day, year". For example, to return "December 25, 2001", you would type "Dec 25, 2001". You can also use the "month/day/year" format such as "12/25/01" for the US, but this format will not work in the UK.
date:dmy: The preferred format of this tag is "day month abbreviation, year". For example, to return "December 25, 2001", you would type "25 Dec, 2001".
date:ymd: The preferred format for this tag is "year month abbreviation day". For example, to return "December 25, 2001", you would type "2001, Dec 25".
date:my: The format of this tag should be "month abbreviation, year". For example, to return "December, 2001", you would type "Dec, 2001".
date:md: The preferred format for this tag is "month abbreviation day". For example, to return "December 25", you would type "Dec 25". You can also use the "month/day" format such as "12/25" for the US, but this format will not work in the UK.
date:dm: The preferred format for this tag is "day month abbreviation". For example, to return "December 25", you would type "25 Dec".
date:ym: The preferred format for this tag is "year/month". For example, to return "December 2001", you would type "2001/12".
date:y: The date:y tag type works fine in the US, but does not work in the UK.
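The preferred input formats from the date notes above can be collected in one place. This sketch assumes the tag type is supplied through a type attribute on <say-as>; the date strings are the ones given in the notes:

```xml
<say-as type="date:mdy">Dec 25, 2001</say-as>
<say-as type="date:dmy">25 Dec, 2001</say-as>
<say-as type="date:ymd">2001, Dec 25</say-as>
<say-as type="date:my">Dec, 2001</say-as>
<say-as type="date:md">Dec 25</say-as>
<say-as type="date:dm">25 Dec</say-as>
<say-as type="date:ym">2001/12</say-as>
```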
measure: For AT&T Natural Voices, you can use a format such as 5'4" or 5m (5 meters). For Cepstral, the preferred format is one such as 5m. For RealSpeak, the preferred format is one such as 5'4".
time: The time tag type works fine in the US, but does not work in the UK.
telephone: The telephone tag type works fine in the US, but does not work in the UK.
The format for telephone numbers is: 123-456-7890
The format for telephone extensions is: 123-456-7890 ext1234
NOTE: For extensions, AT&T Natural Voices and RealSpeak read the number back digit by digit. In the example above, they say, "one two three four five six seven eight nine zero, extension one two three four." Cepstral, however, says, "one two three four five six seven eight nine zero, extension twelve thirty-four." To work around this, insert commas between the extension digits: 123-456-7890 ext1,2,3,4.
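Written as markup, the telephone formats above might look like this, assuming the tag type is supplied through a type attribute on <say-as>:

```xml
<say-as type="telephone">123-456-7890</say-as>
<say-as type="telephone">123-456-7890 ext1234</say-as>

<!-- Cepstral Engine: commas so the extension is read digit by digit -->
<say-as type="telephone">123-456-7890 ext1,2,3,4</say-as>
```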
currency: When using the currency say-as type for AT&T Natural Voices with a Spanish TTS voice, format the amount as $<dollar amount>,<cents amount>. The amount will not be pronounced correctly if it is formatted as $<dollar amount>.<cents amount>.
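A minimal sketch of the two separators follows. The amounts are illustrative, the type attribute is assumed as the way to supply the tag type, and Rosa is used only as an example of an AT&T Spanish voice:

```xml
<!-- English voice: period between dollars and cents -->
<say-as type="currency">$12.50</say-as>

<!-- AT&T Natural Voices with a Spanish voice: comma between dollars and cents -->
<voice name="Rosa"><say-as type="currency">$12,50</say-as></voice>
```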
<sentence>:
Cepstral Engine:
The xml:lang attribute does not work with the sentence element.