Yesterday’s post focused on the wide range of possibilities natural language processing offers interactive voice response applications. NLP has a long history and has been integrated into a wide variety of software applications over the last half-century. However, many in the tech and IVR industries would say that Siri is the most groundbreaking, significant development in the field of natural language processing to date.
Whether you own an iPhone 4S or have merely seen the commercials in passing, you are probably familiar with the phone’s most popular new feature, Siri. For those of us in the IVR and tech communities, Siri is truly one of the most revolutionary developments in the field of speech technology.
Designed to act as an intelligent software assistant on iOS, Siri is an application that uses natural language processing to fulfill a variety of tasks, including answering questions, offering directions, and interfacing with web services to perform general tasks and activities. Siri is innovative for myriad reasons, including its ability to adapt to a user’s dialect, vocabulary, and preferences.
Initially, Siri was launched as an iPhone application that integrated with other iPhone applications like OpenTable, Google Maps, and Movie Tickets. Siri allowed users to operate these applications using speech recognition technology powered by Nuance (the same speech rec engines Plum uses to power its IVR applications).
Siri was then integrated into iOS 5, the newest iPhone operating system, and was the primary feature upgrade of the iPhone 4S upon its release in October 2011. Siri supports English, German, and French, and works with many of the preloaded applications on the iPhone, making it the first phone that is nearly 100% controllable via speech.
While many iPhone 4S users may take Siri for granted, the research and development behind the project spanned years and required the input of scholars, academics, engineers, and even philosophers from many distinguished institutions.
Per Wikipedia, “with Siri, Apple is using the results of over 40 years of research funded by Defense Advanced Research Projects Agency.” In addition, research teams from Carnegie Mellon, the University of Massachusetts, the University of Rochester, Oregon State University, and Stanford University contributed work in areas including machine learning, probabilistic reasoning, ontology, and natural language understanding.
Siri is truly one of a kind, and the technology that powers the application is pioneering in every sense of the word. Virtually any other company in the world attempting to integrate speech recognition will struggle to match Siri, because developing this type of software application is complex and expensive, and it requires the contribution of many top-notch academics from world-class institutions.
Basically, there are only a handful of companies in the entire world besides Apple that would have the capital and influence to develop and integrate this type of system into their applications.
Siri has set the bar for voice recognition interfaces extremely high, and it represents the possible future of speech recognition for all types of applications. However, Siri-like software that companies across the speech rec field can bring to the wider market is probably still a long way off.
Like any new technology that is the first of its kind, the technology behind Siri will have to go through many rounds of adaptation and integration before it reaches a price point that makes integration feasible for companies beyond the world’s highest revenue generators.
However, it does give all of us in the voice recognition industry something to look forward to, and a potential model to base our current and future research and development upon.
