Near-Real-Time Translation

Interactive voice response, or IVR, technology relies heavily on speech recognition software. IVR systems need to be able to understand callers if they’re going to help them.

Getting accurate, timely translation has been a work in progress for the last couple of decades. But more powerful processors have greatly improved speech recognition over the last couple of years. And Siri, the iPhone’s speech-recognition personal assistant, has brought the concept to the public’s attention.

Now we’re ready for the next step, for both IVRs and personal mobile devices. And Microsoft Research may have just unveiled that step with new software that can translate speech in near real time.

According to Microsoft Chief Research Officer Rick Rashid, Microsoft Research and the University of Toronto made a breakthrough a couple of years ago using a technique called Deep Neural Networks (DNN), “which is patterned after human brain behavior,” to “train more discriminative and better speech recognizers than previous methods.”

Rashid wrote in a guest blog post on TechNet that the advances have reduced “the word error rate for speech by over 30% compared to previous methods.” More recently, they enabled the Microsoft Research team to build a translator that recognizes speech using DNNs and then quickly translates it into another language.
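
For readers unfamiliar with the metric, word error rate is usually computed as the number of word substitutions, deletions and insertions needed to turn the recognizer's output into the correct transcript, divided by the number of words in the transcript. The Python sketch below illustrates that standard definition. It is not Microsoft's scoring code, the function names are made up for this example, the sample phrases and rates are hypothetical, and it assumes the quoted 30% figure is a relative reduction.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(old_rate: float, new_rate: float) -> float:
    """Fraction by which new_rate improves on old_rate."""
    return (old_rate - new_rate) / old_rate


# Hypothetical IVR utterance: one wrong word out of five.
wer = word_error_rate("please transfer me to billing",
                      "please transfer me to building")
print(wer)  # 0.2

# Hypothetical rates: falling from 25% to 17.5% is a 30% relative reduction.
print(round(relative_reduction(0.25, 0.175), 2))  # 0.3
```

Read this way, a 30% reduction means the recognizer makes roughly a third fewer mistakes than before, not that its error rate drops by 30 percentage points.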

“…We take the text that represents my speech and run it through translation—in this case, turning my English into Chinese in two steps,” Rashid wrote. “The first takes my words and finds the Chinese equivalents, and while non-trivial, this is the easy part. The second reorders the words to be appropriate for Chinese, an important step for correct translation between languages.”
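
The two steps Rashid describes can be pictured with a toy Python sketch. This is only an illustration under heavy assumptions: the four-word dictionary, the single hand-written reordering rule and all function names are invented for this example, whereas a production system would rely on models learned from large amounts of data rather than a lookup table.

```python
# Step 1 data: a tiny, hypothetical English-to-Chinese dictionary.
LEXICON = {"i": "我", "eat": "吃饭", "at": "在", "home": "家"}

# Step 2 data: toy grammar facts used by the reordering rule.
VERBS = {"吃饭"}
LOCATIVE = "在"  # marks a location phrase such as "at home"


def lookup(words):
    """Step 1: replace each English word with its Chinese equivalent."""
    return [LEXICON[word.lower()] for word in words]


def reorder(tokens):
    """Step 2: move a location phrase in front of the verb.

    English says "I eat at home"; Chinese puts the place first,
    as in "I at home eat".
    """
    if LOCATIVE not in tokens:
        return list(tokens)
    start = tokens.index(LOCATIVE)
    phrase = tokens[start:start + 2]  # the marker plus the place word
    rest = tokens[:start] + tokens[start + 2:]
    for position, token in enumerate(rest):
        if token in VERBS:
            return rest[:position] + phrase + rest[position:]
    return list(tokens)  # no verb found: leave the order unchanged


def translate(sentence):
    """Run both steps on a whitespace-separated English sentence."""
    return reorder(lookup(sentence.split()))


print(lookup("I eat at home".split()))  # ['我', '吃饭', '在', '家']
print(translate("I eat at home"))       # ['我', '在', '家', '吃饭']
```

The first printed line shows the word-for-word result, which keeps English word order. The second shows the same words after reordering, which is why Rashid calls that step important for a correct translation.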

It’s another step toward speech recognition and IVR software that is more natural and easier to use.
