This week, we got the news that the $4.99-per-month Apple Music Voice Plan has been shuttered. I’m not surprised. When this was announced, it seemed off-brand for Apple. I don’t think many people want a voice-only music system. Combined with Siri’s spotty performance, the plan only put a spotlight on Siri’s shortcomings.
The Information has an article by Wayne Ma reporting Apple is spending “millions of dollars a day” on Artificial Intelligence initiatives. The article is paywalled, but The Verge summarizes it nicely.
Apple has multiple teams working on different AI initiatives throughout the company, including Large Language Models (LLMs), image generation, and multi-modal AI, which can recognize and produce “images or video as well as text”.
The Information article reports Apple’s Ajax GPT was trained using more than 200 billion parameters and is more potent than GPT-3.5.
I have a few points on this.
First, this should be no surprise.
I’m sure folks will start writing about how Apple is now desperately playing catch-up. However, I’ve seen no evidence that Apple got caught with its pants down on AI. They’ve been working on Artificial Intelligence for years. Apple’s head of AI, John Giannandrea, came from Google, and he’s been with Apple for years. You’d think people would know by now that just because Apple doesn’t talk about something doesn’t mean it isn’t working on it.
Second, this should dovetail into Siri and Apple Automation.
If I were driving at Apple, I’d make the Siri, Shortcuts, and AI teams all share the same workspace in Apple Park. Thus far, AI has been smoke and mirrors for most people. If Apple could implement it in a way that directly impacts our lives, people would notice.
Shortcuts, with its Actions, gives Apple an easy way to pull this off. Example: You leave 20 minutes late for work. When you connect to CarPlay, Siri asks, “I see you are running late for work. Do you want me to text Tom?” That seems doable with AI and Shortcuts. The trick would be for it to self-generate. It shouldn’t require me to already have an “I’m running late” shortcut; it should make one dynamically as needed. As reported by 9to5Mac, Apple wants to incorporate language models to generate automated tasks.
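To make the idea concrete, here is a minimal sketch of how such a context-triggered suggestion could work. This is not Apple’s implementation — the `Context` fields, the 15-minute threshold, and “Tom” are all invented for illustration; a real system would draw these signals from the OS.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical context signals the system might already have on hand.
@dataclass
class Context:
    usual_departure: datetime
    actual_departure: datetime
    connected_to_carplay: bool

def suggest_action(ctx: Context) -> Optional[str]:
    """Return a dynamically generated prompt if the user seems to be
    running late -- no pre-built "I'm running late" shortcut required."""
    delay = ctx.actual_departure - ctx.usual_departure
    if ctx.connected_to_carplay and delay >= timedelta(minutes=15):
        minutes = int(delay.total_seconds() // 60)
        return (f"I see you are running {minutes} minutes late for work. "
                "Do you want me to text Tom?")
    return None  # nothing unusual, so no suggestion

ctx = Context(
    usual_departure=datetime(2023, 9, 8, 8, 0),
    actual_departure=datetime(2023, 9, 8, 8, 20),
    connected_to_carplay=True,
)
print(suggest_action(ctx))
```

The point of the sketch is the shape of the logic: the automation is generated from context at the moment it’s needed, rather than authored ahead of time by the user.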
Similarly, this technology could result in a massive improvement to Siri if done right. Back in reality, however, Siri still fumbles simple requests routinely. There hasn’t been the kind of improvement that users (myself included) want. Could it be that all this behind-the-scenes AI research is Apple’s ultimate answer on improving Siri? I sure hope so.
I’ve spent too much time complaining about Siri lately. That’s not because Siri doesn’t deserve criticism (it does!) but because my job is to help you get the most out of this stuff, even when the underlying tech could be better.
There’s a lot of news lately about Apple staffing up Siri. First, we heard that they are adding something like 100 additional engineers to the product. Now the New York Times is reporting Apple hired Google’s former artificial intelligence chief, John Giannandrea, to oversee Apple’s machine learning and artificial intelligence efforts. Reportedly, Giannandrea will report directly to Tim Cook.
Speaking at John Gruber’s Daring Fireball party a few years ago, Apple’s Craig Federighi and Phil Schiller both explained that Apple can still make Siri smart without looking at all of its users’ data the way Google does. I don’t remember the exact example, but they said something like they don’t need to look at your pictures of mountains to teach a computer what a mountain looks like. Nevertheless, Siri does lag behind competing virtual assistants. I found their confidence uplifting because I want both to protect my privacy and for Siri to get smarter.
It looks like Apple is going to try to make Siri better by increasing engineering while maintaining its position on user privacy. I hope this makes a difference because Google and Amazon certainly aren’t standing still.
Regardless, don’t expect results immediately. I think Siri improvements will be gradual, similar to the way Apple has improved its cloud services. iCloud has come a long way over the past few years, but that would be easy to miss if you weren’t paying attention.
I have been using the Siri watch face with watchOS 4 as my primary watch face since iOS 11 shipped. Ordinarily, I am not a digital watch face guy. I grew up looking at analog watches and I’ve been primarily using those on the Apple Watch since it first arrived. Nevertheless, I like the idea of a smart watch face on the Apple Watch giving me more timely information, so I went in with the Siri watch face. Also, I spend a lot of time at the sharp end of the stick when it comes to Siri, so I had to give it a try.
The idea behind the Siri watch face is to contextually give users the information most relevant to them at the time. The face itself is the time with a few complications and a scrolling list of information boxes below that you can move through using the Digital Crown. Tapping on any of these boxes brings you into the source application. Tap on an event, for instance, and you go to the calendar app.
There are a lot of Apple applications acting as data sources for the Siri watch face. Using it, you can see when the sun will rise, the weather forecast, and upcoming appointments. It runs much deeper than that, however. Data sources can also include reminders, alarms, timers, the stopwatch, your wallet, workouts, meditation/breathing reminders, HomeKit notifications, what’s now playing on your media device, photos, and even news.
For the two complications, I use the one on the right to display the current date and the left one for OmniFocus.
There are a lot of applications feeding data into the Siri watch face. One of the first things I did was customize that. If you go into the Apple Watch settings application on your iPhone and tap on your Siri watch face, you get a screen that gives you several options to turn these data sources on or off. I left most of them on but turned off photos, because pictures on that tiny screen don’t make sense to me, and news, which I found to be too much of a distraction.
I have had a few pleasant surprises using the Siri watch face. I like the way it displays upcoming appointments. They are easy to read, and they disappear automatically as the day goes on. Rotating the Digital Crown up shows future Siri-chosen events, and spinning the opposite direction brings up prior entries and, if you’ve played audio recently, the last playing audio. This gives you an easy way to restart a podcast or music from your wrist.
I’ve often been tempted to add the timer and alarm complications to my analog faces, but that complication space is so valuable. With the Siri face, timers, the stopwatch, and alarms only appear when in use, so I get them when I need them and only then. Finally, the now-playing entries are great for getting back into whatever audio you played last.
Overall, the convenience of the Siri watch face is enough to get me to stick with it despite my preference for analog faces. I’m going to keep using it for the foreseeable future. If you are going to use it, take the time to go into the settings application and customize the data sources to your preference.
My biggest wish for the Siri watch face is to see third-party applications get on that data source list. For instance, why can’t I get upcoming OmniFocus deadlines or Carrot Weather reports? Hopefully, that comes with future iterations.
Yesterday, Wired magazine published an article about the most recent improvements to Siri. Several prominent Apple executives participated, including Alex Acero, the Siri lead, and Greg Joswiak.
The focus of the article was the improvement to Siri’s voice with iOS 11. Having used the beta now for several months, I can tell you that Siri is most certainly more expressive than in prior versions. The technology behind it, as explained in the article, is quite fascinating. Rather than using recorded words, Siri assembles phonemes, the individual sound components of words, on the fly to be as expressive as possible.
One issue I would take with the article is that it almost feels as if they are implying Apple is only working on making Siri more expressive and not generally smarter. I’m pretty sure Apple can walk and chew gum, and from my own experience with Siri, it has continually improved since first released.
An example of this is calendar appointments. Up until about a year ago, scheduling calendar appointments was a syntax-heavy task with Siri. That’s not true anymore. Now there are several ways that you can naturally ask Siri to schedule an appointment, and she usually gets it right. The “usually” in that sentence is the problem. “Usually” needs to become “Always” or “Almost Always”. For Siri, the make-or-break moment is the first time a user tries to do something new. If you try to set a calendar appointment and Siri crashes and burns, you probably won’t try it again. To get more users to buy in, Apple needs to continue to improve that first experience, so users are encouraged to dig deeper.
The Wired article also addresses the different philosophies of Apple versus Amazon in the development of intelligent assistants. Amazon, with the Echo, is opening things up to third-party developers, making the device work with more services but also requiring users to learn the specific syntax needed to use those newly acquired skills. Apple, on the other hand, wants things to be more natural-language-based, where users don’t have to use a specific syntax to get work done.
For non-nerd users, natural language seems the only approach. I can’t imagine convincing my wife to memorize the appropriate speaking syntax for every service she wants to use through Siri or Alexa.
I think in the short term, the Amazon approach is easier and moves the ball forward faster. In the long term, I think the Apple approach could be right if properly executed. If Siri does incorporate machine learning and artificial intelligence the way Apple wants it to, it could ultimately end up leapfrogging the syntax-driven approach of its competitors.
Walt Mossberg wrote an article over at Recode, “Why does Siri seem so dumb?” In it, Walt points out several failings.
I understand that Apple has fixed several of these issues since the article posted, but that’s actually part of the problem. Why does it take an article by a popular journalist to get these things fixed? I feel as if Siri needs more attention. I don’t think the underlying technology is as bad as most people think, but it is these little failures that cause everyone to lose faith. Siri is a cloud-based service and needs to be upgraded and improved every day. While things are better, the rate of improvement needs to accelerate.
In our recent Mac Power Users episode on macOS Sierra, both Katie and I bemoaned the fact that you can’t verbally trigger Siri on the Mac. It seems a no-brainer to me as someone with an iMac on my desk sitting there waiting to work 24/7. Several listeners wrote in to explain that you can trigger Siri on the Mac with your voice using the Mac’s accessibility features. Lifehacker has an article showing you every step to enable voice-activated Siri on your Mac. Click on the link to set it up, but I will tell you that this actually involves making two separate voice commands: “Hello” and “Computer.” Having used it now for a few days, I find it works best if you leave a slight delay between the two words. Have fun.
Siri’s original developers, Dag Kittlaus and Adam Cheyer, left Apple in 2011 and took a bunch of their team with them. Since then, they’ve been working on a new artificial intelligence system, Viv, that is going to get its first public demonstration Monday.
Right now, there is a lot going on in the intelligent digital assistant world. While Apple was early to this game, Microsoft and Google are right behind, and it’s clear a lot of resources from a lot of big companies are being thrown at this problem.
Most surprising to me has been the utility of the Amazon Echo. I have been using Siri for years but nobody else in my family does. I think it has something to do with the slight delay that exists between activating Siri and stating a command combined with the sometimes indecipherable syntax you need to use in order to make it work. There is also that thing where Siri will perform a complicated instruction perfectly only to botch things up entirely when you ask it to tell you the weather five minutes later. All of this has improved over the years but there still is enough resistance that my non-nerd family members are not interested.
The Amazon Echo on the other hand has no such resistance. I frequently witness my family turning on the lights, checking the weather, and otherwise interacting with Alexa. To me this is the closest glimpse we’ve had yet to a future with reliable intelligent digital assistants.
I’ve been thinking quite a bit about why the Amazon Echo is “stickier” for the non-techies in my house than Siri. One argument is because the Echo is always on and listening (which is kind of creepy). You don’t need to push a button to get it started. I’m sure that’s part of it, but I believe the real reason is because it’s both easier to talk to and more responsive.
Amazon’s Echo does a better job of parsing the question and giving you useful information. Too often, Siri gets confused because you don’t ask the question just right. Also, the Amazon Echo has never done that thing where it seems to understand me perfectly only to report it can’t answer my question because of some mysterious problem out there on the Internet … somewhere. Either way, in the Sparks household the Amazon Echo has been a clear winner for my wife and children.
So, getting back to where I started with all of this, we’re getting our first demonstration of Viv on Monday. Your guess is as good as mine as to what the long game is for Viv’s developers. Maybe they want to wow us so some big company throws large sums of money at them. However, they already did that with Siri.
I suspect they are more interested in making something they can develop without the limitations that come with tying their wagon to a large corporation. Keeping Viv independent lets them make deals with third parties more easily, so it’s easier to add functionality. It also lets the developers be, generally, more nimble. The downside is that it’s going to be harder to activate. One of the big attractions of Siri is that it is everywhere on iOS. If I have to go open an application to get a digital assistant working for me, I’m much less likely to use it. (I downloaded Microsoft’s Cortana app, and I still only launch it for the purpose of testing Cortana.) I think members of my family would be even less likely to launch an app for a digital assistant.
Either way, I hope that Viv is a smashing success. I want there to be a lot of competition in this space and I want these big companies to duke it out. It feels like we are on the cusp of having useful digital assistants in our lives and the sooner that comes, the better.
This week, John Gruber wrote a post about how Siri is becoming more useful to him. I think John’s right: Siri is a lot better, although I am definitely looking at this from the vantage point of a Siri fanboy. I liked it from the beginning. I dictate often and like to think I’m pretty good at it. I’m actually dictating these words right now in Drafts on my iPhone. I’ve watched Siri grow up to a certain extent. Those people who gave up on it at the beginning are missing out. You should understand that, like some other Apple terms (iCloud, for instance), Siri has several components.
Siri Dictation
This is the easiest bit. You speak, and Siri returns your words as text. With iOS 8, Siri dictation gained the ability to return your words as (or very shortly after) you say them. This was the single biggest improvement to Siri yet. With pre-iOS 8 Siri, you’d dictate your words and, only after you finished, would the recording grind through Apple’s servers and return words, at least theoretically. Sometimes it would just blink and silently mock you. Even if you have no interest in asking Siri how to bury a dead body, tapping the little microphone button and speaking to your iOS device (or Mac) to make words appear can be liberating. I recommend trying it for three days. It’s a game changer.
Siri the Intelligent Assistant
Then there is the entirely separate Siri that you ask to do something, like set a timer. This version of Siri has two jobs: 1) figure out what you just said, and 2) figure out what you want it to do. Even if Siri gets all the words right in step 1, it still has a new and unique opportunity to fail in step 2. I think Siri has improved at step 2 as much as it has improved at step 1. I also think this improvement could only have come from Apple shipping Siri and letting millions of people bang up against it. While Siri is hardly perfect, it is damn useful.
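The two-step pipeline is worth making concrete. Here is a minimal sketch where step 1 (transcription) is assumed to have already produced text, and step 2 is a tiny rule-based intent parser. Real assistants use statistical models, not regexes — the intent names and patterns below are invented purely for illustration.

```python
import re
from typing import Optional, Tuple

# Step 2 of the pipeline: map transcribed text to an intent.
# A miss here is the "unique opportunity to fail" even when the
# transcription in step 1 was perfect.
INTENTS = [
    ("set_timer", re.compile(r"set a timer for (\d+) minutes?")),
    ("get_weather", re.compile(r"what.?s the weather")),
]

def parse_intent(text: str) -> Optional[Tuple[str, tuple]]:
    """Return (intent_name, captured_arguments), or None on failure."""
    text = text.lower()
    for name, pattern in INTENTS:
        match = pattern.search(text)
        if match:
            return name, match.groups()
    return None  # step 2 failed even though step 1 succeeded

print(parse_intent("Set a timer for 5 minutes"))
print(parse_intent("Play something jazzy"))
```

Even this toy shows why millions of real users matter: every unanticipated phrasing is a pattern the system hasn’t learned yet.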
Siri on the Mac
Apple has put dictation on the Mac in recent versions of the Mac OS. It is, in some ways, superior to Siri dictation in iOS, in that it no longer requires an Internet connection, assuming you enable enhanced dictation. However, Apple has yet to bring the Intelligent Assistant to the Mac. Dan Moren thinks they should, and I agree. As someone who spends a lot of time behind a Mac, I’d love to be able to ask about the weather, set a timer, or do a simple web search with my voice. As HomeKit gets legs, it’d push more than a few of my buttons to also be able to turn down the lights or start a playlist with my voice. I particularly agree with Dan’s desire for a hypothetical Mac Siri to let users set a custom trigger phrase. Using “Hey Siri” leads to way too many hijinks as it is. If instead I could set my own unique phrase, I could make it something less likely to go off unexpectedly. For those of us who give our Macs names, it would also let us have a little fun. “Good day, Thelonious, what’s the weather going to be tomorrow?”
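The custom-trigger idea above can be sketched in a few lines. A real wake-word system matches audio features, not text, so this is only an analogy — the function names and phrases are mine, not any shipping API.

```python
# Toy text-level analogy for a user-configurable trigger phrase.
# The intuition: a rarer, user-chosen phrase fires by accident far
# less often than a common stock phrase like "Hey Siri".
def make_trigger(phrase: str):
    normalized = phrase.lower().strip()
    def is_triggered(heard: str) -> bool:
        return heard.lower().strip().startswith(normalized)
    return is_triggered

trigger = make_trigger("Good day Thelonious")
print(trigger("good day thelonious, what's the weather tomorrow?"))
print(trigger("hey siri, what's the weather?"))
```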