Siri Will Soon Understand You a Whole Lot Better

Photo: Alex Washburn/WIRED

It all started at a small academic get-together in Whistler, British Columbia.

The topic was speech recognition, and whether a new and unproven approach to machine intelligence, something called deep learning, could help computers more effectively identify the spoken word. Microsoft funded the mini-conference, held just before Christmas 2009, and two of its researchers invited the world's preeminent deep learning expert, the University of Toronto's Geoff Hinton, to give a speech.

Hinton's idea was that machine learning models could work a lot like neurons in the human brain. He wanted to build "neural networks" that could gradually assemble an understanding of spoken words as more and more of them arrived. Neural networks were hot in the 1980s, but by 2009, they hadn't lived up to their potential.
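To make the idea concrete: a neural network is just layers of simple arithmetic units, each turning the numbers from the layer below into slightly more abstract numbers, with a final layer scoring candidate sounds or words. The Python sketch below is purely illustrative, not any vendor's system: the layer sizes, the random weights, and the phoneme labels are all invented, and a real recognizer would learn its weights from thousands of hours of transcribed audio rather than drawing them at random.

```python
# A toy "neural network" pass over one slice of audio, for illustration only.
# Everything here (sizes, weights, labels) is made up; real systems learn
# their weights from huge amounts of transcribed speech.
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for 13 acoustic features extracted from ~25 ms of audio.
features = rng.standard_normal(13)

# One hidden layer of 32 units; "deep" learning stacks many such layers.
W1, b1 = 0.1 * rng.standard_normal((32, 13)), np.zeros(32)
W2, b2 = 0.1 * rng.standard_normal((4, 32)), np.zeros(4)

hidden = np.maximum(0.0, W1 @ features + b1)   # ReLU: keep positive signals
logits = W2 @ hidden + b2                      # raw scores for each label

# Softmax turns raw scores into probabilities over candidate sounds.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

phonemes = ["ah", "ee", "ss", "t"]             # hypothetical labels
print({p: round(float(v), 3) for p, v in zip(phonemes, probs)})
```

A speech recognizer runs a much larger, trained network like this over every successive slice of audio, then stitches the per-slice phoneme probabilities into words; with random weights, the output above is meaningless beyond showing the mechanics.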

At Whistler, the gathered speech researchers were polite about the idea, "but not that interested," says Peter Lee, the head of Microsoft's research arm. These researchers had already settled on their own algorithms. But Microsoft's team felt that deep learning was worth a shot, so the company had a couple of engineers work with Hinton's researchers and run some experiments with real data. The results were "stunning," Lee remembers: a more than 25 percent improvement in accuracy. This, in a field where a 5 percent improvement is game-changing. "We published those results, then the world changed," he says.

Now, nearly five years later, neural network algorithms are hitting the mainstream, making computers smarter in new and exciting ways. Google has used them to beef up Android's voice recognition. IBM uses them. And, most remarkably, Microsoft uses neural networks as part of the Star Trek-like Skype Translate, which translates what you say into another language almost instantly. People "were very skeptical at first," Hinton says, "but our approach has now taken over."

One big-name company, however, hasn't made the jump: Apple, whose Siri software is due for an upgrade. Though Apple is famously secretive about its internal operations, and did not provide comment for this article, it seems the company has so far licensed its voice recognition technology from Nuance, perhaps the best-known speech recognition vendor. But those in the tight-knit community of artificial intelligence researchers believe this is about to change. It's clear, they say, that Apple has formed its own speech recognition team and that a neural-net-boosted Siri is on the way.

Lee would know. Apple hired one of his top managers, Alex Acero, last year. Acero, now a senior director in Apple's Siri group, had put in nearly 20 years at Microsoft researching speech technology. There he oversaw Li Deng and Dong Yu, the two Microsoft researchers who invited Geoff Hinton to that conference in British Columbia. Apple has also poached speech researchers from Nuance, including Siri manager Gunnar Evermann. Another speech research hire: Arnab Ghoshal, a researcher from the University of Edinburgh.

"Apple is not hiring only in the managerial level, but hiring also people on the team-leading level and the researcher level," says Abdel-rahman Mohamed, a postdoctoral researcher at the University of Toronto, who was courted by Apple. "They're building a very strong team for speech recognition research."

Ron Brachman, who oversees research at Yahoo and helped launch the project that originally gave rise to Siri, points out that Apple's digital assistant depends on a lot more than just speech recognition. But Microsoft's Peter Lee gives Apple six months to catch up to Microsoft and Google and start using neural nets, and he thinks this will significantly boost Siri's talents. "All of the major players have switched over except for Apple Siri," he says. "I think it's just a matter of time."

Cade Metz contributed reporting to this story.