I have blogged before about how Web 2.0, which claims to be the incarnation of the personal, social, human, interactive web, lacks the most fundamental human protocol: voice. I'm not talking about voice as an application (Skype or Gtalk), but about voice as a means of communication with people, humanoids and applications alike.
Well, Yahoo!, so it seems, is going this way (and Google is undoubtedly going there just the same). Jeff Bonforte, Senior Director of Voice Product Management at Yahoo!, described Yahoo!'s take on Voice 1, 2 and 3:
Voice 1 – the hundred-year-old, boring yet highly successful (business-wise) dial tone.
Voice 2 – Voice as an application: Skype, Messenger. Voice is Data.
Voice 3 – Voice as an invocation protocol – an interface to Yahoo!'s vast amount of content.
There remains the never-ending problem of voice recognition. Here Bonforte describes a brilliant bypass: instead of trying to figure out what the speaker really said, Yahoo! sends the transcription to… their search engine. The search engine returns a "Did you mean: ___" suggestion, which is the outcome of machine learning based on what users search for next (when users mistype a search query, the next thing they usually do is retype the correct string). So just as the search engine "learns" over time what is a real search string and what is most probably a typo, the voice recognition engine will learn what is a meaningful interaction and what is not.
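To make the idea concrete, here is a toy sketch in Python of the "learn from the retyped query" trick. It is not Yahoo!'s actual system, which the talk does not detail: the function names, the sample session logs, the use of `difflib` for string similarity and both thresholds are assumptions made up for illustration.

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher


def learn_corrections(sessions, min_similarity=0.6):
    """Count how often a query is immediately followed by a similar query."""
    rewrites = defaultdict(Counter)
    for queries in sessions:
        for first, second in zip(queries, queries[1:]):
            if first == second:
                continue
            # Treat only near-identical strings as a retyped query,
            # not as a jump to a new topic (threshold is an assumption).
            if SequenceMatcher(None, first, second).ratio() >= min_similarity:
                rewrites[first][second] += 1
    return rewrites


def did_you_mean(rewrites, query, min_count=2):
    """Return the most frequent follow-up for `query`, or the query itself."""
    candidates = rewrites.get(query)
    if not candidates:
        return query
    best, count = candidates.most_common(1)[0]
    return best if count >= min_count else query


# Hypothetical session logs: each inner list is one user's consecutive queries.
sessions = [
    ["whether in paris", "weather in paris"],
    ["whether in paris", "weather in paris", "paris hotels"],
    ["eiffel tower"],
]
rewrites = learn_corrections(sessions)

# Pretend this string came out of a speech recognizer.
transcription = "whether in paris"
print(did_you_mean(rewrites, transcription))
```

The recognizer never has to be right on its own: a plausible but wrong transcription is handled exactly like a typo, and the accumulated behaviour of earlier users supplies the fix.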
These voice advancements are highly symbolic. They represent a change in the current browser paradigm, and they are the first steps toward a physical integration of human beings into the World Wide Web, an inevitable outcome of the technology that shapes our lives.
Yahoo! and Emerging Telephony 00:19:38, 9 MB, Jan 25th, 2006
Original Podlink: Yahoo! and Emerging Telephony