Revealed: The Secret Gear Connecting Google's Online Empire

For a decade, Google has secretly been building the networking equipment that runs its online empire. Today, it's raising the curtain.
Google Fellow Amin Vahdat, the former University of California San Diego professor who oversees the company's rather unusual computer networking tech.

Zach Gross for WIRED

Three and a half years ago, a strange computing device appeared at an office building in the tiny farmland town of Shelby, Iowa.

It was wide and thin and flat, kind of like a pizza box. On one side, there were long rows of holes where you could plug in dozens of cables. On the other, a label read "Pluto Switch." But no one was quite sure what it was. The cable connectors looked a little strange. The writing on the back was in Finnish.

Someone shipped the thing to the IT guys at company headquarters in Wisconsin, and at first, they couldn't figure it out either. But after chatting with some other tech types via a little-known Internet discussion forum---and fiddling with the software loaded on the device---they realized what it was. It was a networking switch, a way of moving digital data across the massive computing centers that underpin the Internet. And it belonged to Google.

Google runs a data center not far from Shelby, and apparently, someone had sent the switch to the wrong place. After putting two and two together, those IT guys shipped it back to Google and promptly vanished from the 'net. But the information they posted to that online discussion forum, including several photos of the switch, opened a small window into an operation with enormous implications for the Internet as a whole---an operation Google had never discussed in public. For several years, rather than buying traditional equipment from the likes of Cisco, Ericsson, Dell, and HP, Google had designed specialized networking gear for the engine room of its rapidly expanding online empire. Photos of the mysterious Pluto Switch provided a glimpse of the company's handiwork.

Seeing such technology as a competitive advantage, Google continued to keep its wider operation under wraps. But it did reveal how it handled the networking links between its data centers, and now, as part of a larger effort to share its fundamental technologies with the world at large, it's lifting the curtain inside its data centers as well. This morning, at a conference in Silicon Valley, Amin Vahdat, the former academic who oversees Google's networking technologies, will detail the company's first five generations of networking gear---hardware and software that spans nearly a decade of work. That may seem like inside baseball in the extreme, but his talk serves as a signpost for the near future of the Internet. Google's network shows where everyone else is headed.

According to Vahdat, Google started designing its own gear in 2004, under the aegis of a project called Firehose, and by 2005 or 2006, it had deployed a version of this hardware in at least a handful of data centers. The company not only designed "top-of-rack switches" along the lines of the Pluto Switch that turned up in Iowa. It created massive "cluster switches" that tied the wider network together. It built specialized "controller" software for running all this hardware. It even built its own routing protocol, dubbed Firepath, for efficiently moving data across the network. "We couldn't buy the hardware we needed to build a network of the size and speed we needed to build," Vahdat says. "It just didn't exist."

Google’s “Pluto Switch” sits at the top of a rack of computer servers, connecting each to the wider network.

Google
An Unusual Empire

The aim, Vahdat says, was twofold. A decade ago, the company's network had grown so large, spanning so many machines, that it needed a more efficient way of shuttling data between them all. Traditional gear wasn't up to the task. But it also needed a way of cutting costs. Traditional gear was too expensive. So, rather than construct massively complex switches from scratch, it strung together enormous numbers of cheap commodity chips.

Google's online empire is unusual. It is likely the largest on earth. But as the rest of the Internet expands, others are facing similar problems. Facebook has designed a similar breed of networking hardware and software. And so many other online operations are moving in the same direction, including Amazon and Microsoft. AT&T, one of the world's largest Internet providers, is now rebuilding its network in similar ways. "We're not talking about it," says Scott Mair, senior vice president of technology planning and engineering at AT&T. "We're doing it."

Unlike Google and Facebook, the average online company isn't likely to build its own hardware and software. But so many startups are now offering commercial technology that mimics The Google Way. Basically, they're fashioning software that lets companies build complex networks atop cheap "bare metal" switches, moving the complexity out of the hardware and into the software. People call this software-defined networking, or SDN, and it provides a more nimble way of building, expanding, and reshaping computer networks.

"It gives you agility, and it gives you scale," says Mark Russinovich, who has helped build similar software at Microsoft. "If you don't have this, you're down to programming individual devices---rather than letting a smart controller do it for you."

It's a movement that's overturning the business models of traditional network vendors such as Cisco, Dell, and HP. Vahdat says that Google now designs 100 percent of the networking hardware used inside its data centers, using contract manufacturers in Asia and other locations to build the actual equipment. That means it's not buying from Cisco, traditionally the world's largest networking vendor. But for the Ciscos of the world, the bigger threat is that so many others are moving down the same road as Google.

"It means that Cisco is restricted to a smaller piece of the pie," says JR Rivers, who helped design Google's first networking switches, previously built hardware at Cisco, and now runs a networking software company called Cumulus.

The Mainframes of Networking

A decade ago, Google built its networks like everyone else did. It bought enormous "cluster switches" from companies like Cisco. Inside each data center, these served as the backbone of the network. "They were essentially mainframes," Vahdat says.

Each of these switches cost the company "hundreds of thousands to millions" of dollars. And they could accommodate only so many of Google's computers. Cables from these cluster switches would connect to only so many switches at the top of the company's server racks, and these top-of-rack switches could connect to only so many servers. "You would just call up your vendor," Vahdat says, "and ask them to give you the biggest thing they had."

So, in 2004, a small team of Google engineers started building a new breed of gear. At the time, Vahdat was still a professor at the University of California, San Diego, but unknowingly, he mirrored the company's work with a seminal academic paper published in 2009, a year before he joined the company. It was only natural he would make the move to Google. "In academia," he says, "you can only build systems at a certain scale."

In the simplest terms, Google used commodity networking chips to build hardware devices that could run whatever software it wanted. In the past, the company had to buy what the Ciscos of the world were selling---both hardware and software. But it moved to a model where it could mold the hardware and the software to suit its needs.

Buying chips from companies such as Broadcom, Google built its own top-of-rack switches. And using the same chips, it pieced together larger cluster switches that could serve as a network backbone. Today, according to Vahdat, one of the company's "Jupiter" cluster switches provides about 40 terabits per second of bandwidth---or the equivalent of 40 million home Internet connections.

You can think of the top-of-rack switches as building blocks for the cluster switches. "Rather than buy high-end specialized components---the mainframes, if you will, of switching and routing---you can buy commodity, off-the-shelf, merchant silicon," Vahdat says. "You can build an infinitely large network from small building blocks."
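
To make the building-block idea concrete: fabrics assembled this way are typically Clos, or leaf-spine, topologies, where every box is the same commodity switch and capacity grows by adding more of them. The sketch below is purely illustrative, with made-up port counts and link speeds rather than Google's actual figures; it shows how the size of a simple two-tier fabric falls out of the port count of a single switch.

```python
# Illustrative only: sizing a two-tier leaf-spine (folded Clos) fabric built
# from one kind of commodity switch. Port counts and link speeds here are
# hypothetical assumptions, not Google's real numbers.

def leaf_spine_capacity(ports_per_switch: int, gbps_per_port: int) -> dict:
    """Size a non-blocking two-tier fabric made entirely of identical switches."""
    # Each leaf (top-of-rack) switch splits its ports evenly: half face servers,
    # half face the spine, which keeps the fabric non-blocking.
    down = ports_per_switch // 2              # ports toward servers
    up = ports_per_switch - down              # ports toward spine switches

    spine_switches = up                       # each leaf has one uplink per spine
    leaf_switches = ports_per_switch          # each spine port feeds one leaf

    servers = leaf_switches * down
    # Full bisection: any half of the servers can talk to the other half at line rate.
    bisection_tbps = servers * gbps_per_port / 2 / 1000

    return {
        "spine_switches": spine_switches,
        "leaf_switches": leaf_switches,
        "servers": servers,
        "bisection_tbps": bisection_tbps,
    }

if __name__ == "__main__":
    # Example: 64-port switches with 40 Gb/s links (made-up figures)
    print(leaf_spine_capacity(64, 40))
    # -> 32 spines, 64 leaves, 2,048 servers, ~41 Tb/s of bisection bandwidth
```

The scaling property is the attraction: doubling the port count of the commodity chip roughly quadruples the number of servers the same two-tier design can host, and adding a third tier pushes the numbers far higher still.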

This mirrored similar changes in the worlds of computer servers and data storage hardware. In the past, when companies built large online operations, they bought enormously expensive and complex machines that ran proprietary software. Now, they buy lots of little machines that they can string together into a larger whole, running all sorts of software that allows these machines to work in concert. "You scale out rather than scale up," Vahdat says, echoing what has become standard lingo in the world of computing.

The Google Protocol

Vahdat says relatively little about Google's networking software. But he hints that the hardware runs an operating system based on Linux, the open source OS that also runs the company's computer servers. And he says the company went so far as to design its own data routing protocol, Firepath, used to dictate how information moves from machine to machine. This is somewhat surprising. Typically, businesses run their private data center networks using standard Internet protocols such as BGP and OSPF. But Microsoft's Russinovich says this kind of thing is something Microsoft "is always exploring" in an effort to improve efficiency, and when Google was building its protocol, standard technologies weren't as efficient as they are today. "We decided this would be simpler, faster, and more scalable," Vahdat says. "We wanted to build the biggest network ever."
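
Vahdat gives few details about how Firepath actually works, so the following is only a hedged sketch of the general idea behind a centralized routing protocol: one process with a global view of the fabric computes every switch's next hops, rather than each box discovering routes on its own the way BGP or OSPF would. The topology and names here are invented for illustration and are not Google's.

```python
# A hedged sketch of centralized route computation: one process with a global
# view of the fabric computes next hops for every switch. The toy topology,
# names, and data structures are hypothetical, not Google's Firepath.
from collections import deque

# Adjacency list: switch -> directly connected switches.
TOPOLOGY = {
    "tor1": ["spine1", "spine2"],
    "tor2": ["spine1", "spine2"],
    "spine1": ["tor1", "tor2"],
    "spine2": ["tor1", "tor2"],
}

def next_hops(topology, destination):
    """For every switch, pick the neighbor on a shortest path to `destination`."""
    # Breadth-first search outward from the destination gives each switch's
    # hop count to it; the neighbor with the smallest count is a valid next hop.
    dist = {destination: 0}
    queue = deque([destination])
    while queue:
        node = queue.popleft()
        for neighbor in topology[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return {
        switch: min(topology[switch], key=lambda n: dist[n])
        for switch in topology if switch != destination
    }

# The controller would push these tables down to the switches.
print(next_hops(TOPOLOGY, "tor2"))
# -> {'tor1': 'spine1', 'spine1': 'tor2', 'spine2': 'tor2'}
```

In a real fabric, the controlling process would recompute and redistribute these tables whenever a link or switch failed, which is where most of the engineering difficulty lies.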

Google’s Jupiter "cluster switches" provide 40 terabits per second of bandwidth---about as much as 40 million home Internet connections.

Google

The details of this setup are enormously complex. But the long and the short of it is that the company can now use a central controller to oversee its network and even change how the network operates. That may seem like a simple requirement. But traditionally, computer networks have not worked this way, forcing technicians to make individual changes to individual devices if they wanted to revamp or expand their networks.
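
What that means in practice is easier to see in code. The sketch below is a hedged illustration with entirely hypothetical class and method names (a real controller would speak a protocol such as OpenFlow or gRPC to the hardware); the point is simply that the desired state of the whole network lives in one place and is pushed to every device programmatically, instead of being typed into each box.

```python
# Hypothetical sketch of the controller-driven operational model. None of
# these names come from Google; they only illustrate the pattern.

class Switch:
    """Stand-in for a programmable switch exposing a tiny API."""
    def __init__(self, name):
        self.name = name
        self.routes = {}

    def install_routes(self, table):
        self.routes = dict(table)
        print(f"{self.name}: installed {len(table)} routes")


class FabricController:
    """Keeps the fabric-wide desired state and reconciles every switch to it."""
    def __init__(self, switches):
        self.switches = {s.name: s for s in switches}

    def apply(self, forwarding_tables):
        # One sweep reconfigures the whole fabric; no box-by-box CLI sessions.
        for name, table in forwarding_tables.items():
            self.switches[name].install_routes(table)


controller = FabricController([Switch("tor1"), Switch("tor2"), Switch("spine1")])
controller.apply({
    "tor1": {"10.0.2.0/24": "spine1"},
    "tor2": {"10.0.1.0/24": "spine1"},
    "spine1": {"10.0.1.0/24": "tor1", "10.0.2.0/24": "tor2"},
})
```

Reshaping the network then becomes a single call against the controller rather than dozens of per-device sessions.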

As Google was fashioning this kind of "software-defined networking," researchers at Stanford University were developing similar methods, giving rise to an open source protocol called OpenFlow. Vahdat says that Google did not use OpenFlow with its Firehose network, but it did adopt the protocol with its latest generation of networking hardware, dubbed Jupiter, using it alongside other protocols.
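
Because OpenFlow is an open protocol, it's easy to show what controller-driven networking looks like in practice. The sketch below uses the open source Ryu controller, which has no connection to Google's own software; it installs a lowest-priority "table-miss" rule on each switch that connects, so packets matching nothing else are sent up to the controller, which can then decide how to program the switch.

```python
# A minimal OpenFlow controller app using the open source Ryu framework
# (not Google's software). When a switch connects, install a lowest-priority
# "table-miss" rule that forwards unmatched packets to the controller.
# Run with: ryu-manager this_file.py
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3


class MinimalController(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def on_switch_connect(self, ev):
        datapath = ev.msg.datapath
        ofproto = datapath.ofproto
        parser = datapath.ofproto_parser

        # Match everything, at the lowest priority, and punt to the controller.
        match = parser.OFPMatch()
        actions = [parser.OFPActionOutput(ofproto.OFPP_CONTROLLER,
                                          ofproto.OFPCML_NO_BUFFER)]
        instructions = [parser.OFPInstructionActions(
            ofproto.OFPIT_APPLY_ACTIONS, actions)]
        datapath.send_msg(parser.OFPFlowMod(
            datapath=datapath, priority=0, match=match,
            instructions=instructions))
```

Vahdat does not say which parts of Jupiter rely on OpenFlow, so this is meant only as a flavor of the protocol, not a description of Google's deployment.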

Trickle-Down Networking

In the years since Google overhauled its network, a wide range of companies have sought to commercialize Google's approach, including startups such as Nicira (now owned by VMware), Big Switch Networks, and JR Rivers' Cumulus Networks. As a result, the paradigm has trickled across the industry.

Nicira and Cumulus have long said their tech is used by big-name Internet companies---apparently, software based on Nicira's tech was once in use at Google---and they've worked with various ISPs and well-known Wall Street firms. Meanwhile, Facebook has pushed the movement forward still further by open sourcing its software and hardware designs, freely sharing them with the world at large.

Until now, Google has kept quiet about its work. And to a certain extent, it continues to do so. Vahdat will not discuss the particulars of Google's latest technologies, and the company is not open sourcing any of its work, something Microsoft has done, according to Russinovich. But more than anyone else, the company shows where the world of networking is moving.

After rolling out this new technology inside its data centers, Google built something similar for the "wide area networks" that connect its data centers to one another and plug them into the wider Internet. And according to Vahdat, the company now sends more information between its data centers than it trades with the Internet as a whole. "Computer-to-computer interaction is growing faster than person-to-computer interaction," he says.

This massive flow is driven by the sweeping "Big Data" systems that store, juggle, and analyze information across tens of thousands of Google's servers---systems like MapReduce, Bigtable, Dremel, and Spanner. In order to serve its users, Google's machines must do far more work than ever before. And this phenomenon isn't limited to Google. Software like MapReduce and Dremel has spawned open source projects used across the Internet. To accommodate these tools, the biggest online companies are building networks that were unimaginable just a few years ago. And that requires a new breed of hardware and software.