Siri Today and in the Future

Yesterday, Wired magazine published an article about the most recent improvements to Siri. Several prominent Apple executives participated, including Alex Acero, the Siri lead, and Greg Joswiak.

The focus of the article was the improvement to Siri’s voice with iOS 11. Having used the beta now for several months, I can tell you that Siri is most certainly more expressive than in prior versions. The technology behind it, as explained in the article, is quite fascinating. Rather than using recorded words, they are using phonemes, the individual sound components of words, which Siri assembles on the fly to be as expressive as possible.

One issue I would take with the article is that it almost feels as if they are implying Apple is only working on making Siri more expressive and not generally smarter. I’m pretty sure Apple can walk and chew gum at the same time, and from my own experience with Siri, it has continually improved since it was first released.

An example of this is calendar appointments. Up until about a year ago, scheduling calendar appointments was a syntax-heavy task with Siri. That’s not true anymore. Now there are several ways that you can naturally ask Siri to schedule an appointment, and she usually gets it right. The “usually” in that sentence is the problem. “Usually” needs to become “Always” or “Almost Always”. For Siri, the make-or-break moment is the first time a user tries to do something new. If you try to set a calendar appointment and Siri crashes and burns, you probably won’t try it again. To get more users to buy in, Apple needs to continue to improve that first experience, so users are encouraged to dig deeper.

The Wired article also addresses the different philosophies of Apple versus Amazon in the development of intelligent assistants. Amazon, with the Echo, is opening things up for third-party developers, making the device work with more services but also requiring users to learn the specific syntax needed to use those newly acquired skills. Apple, on the other hand, wants things to become more natural-language based, where users don’t have to use a specific syntax to get work done.

For non-nerd users, natural language seems the only approach. I can’t imagine convincing my wife to memorize the appropriate speaking syntax for every service she wants to use through Siri or Alexa.

I think in the short term, the Amazon approach is easier and moves the ball forward faster. In the long term, I think the Apple approach could be right if properly executed. If Siri does incorporate machine learning and artificial intelligence the way Apple wants it to, it could ultimately end up leapfrogging the syntax-driven approach of its competitors.