On the latest episode of John Gruber’s The Talk Show, guest Ben Thompson tries to identify the ways in which Amazon’s Alexa speech recognition is better than Apple’s Siri.
One of his key points is that Alexa, by being theoretically less capable than Siri, manages to avoid the heightened expectations and subsequent disappointment that users feel when Siri fails to listen as well as it promises to. It may be less competent overall, but what it does do, it does predictably and well.
A comparison that came immediately to mind was Apple’s mid-1990s failure with the Newton handheld computer. The ambitious handwriting recognition was pure magic when it worked, but it failed to work a significant amount of the time. Meanwhile, Palm took a more pragmatic approach with Graffiti, an overtly limited interpretation of the Roman alphabet, Arabic numerals, and a few other widely used symbols. By dramatically diminishing the magic of its handwriting recognition technology, Palm dramatically increased its reliability. Users seemed to appreciate this compromise: Newton sputtered, and Palm Pilots went on to define the whole genre of handheld digital assistants.
As much as I like this comparison, I don’t think Siri is doomed in the same way Newton was. Handwriting recognition was a primary interface on Newton, while on iOS devices Siri is usually considered an augmenting interface. You can, and many people do, get plenty of use out of an iPhone without ever relying upon Siri. Siri is also nowhere near as unreliable, in my opinion, as Newton handwriting recognition was. I use Siri on a daily basis and, perhaps because I’ve rarely tried anything better, I still find it an overall boon to productivity.
I think we are still in the early days of speech recognition, which feels funny to write because back in the mid-1990s, when Apple was failing to perfect handwriting recognition, it was doing the very same thing with speech recognition. But as John and Ben said on the show, none of the existing technologies, whether from Apple, Amazon, Microsoft, Google, Nuance, or others, is even close to perfect. There is so much interest in the technology now that it’s possible to at least imagine extremely fast, reliable, predictable speech recognition becoming the norm. Whether the standard ends up being a shorthand-based approach such as the one Amazon is taking with Alexa, or a more ambitious artificial intelligence, may depend on which company can close the gap faster using one approach or the other.