Out in the Open: Build Your Own Netflix-Style Suggestion Machine for Free

Netflix has spent years building and improving its recommendation engine, and even sponsored a $1 million contest to improve its algorithm. But not every company has the time or money to build a system like that. Using a new open source offering from Mortar Data, one engineer should be able to get a custom recommendation engine up and running in about a week's time.
Image: Mortar Data

Netflix has spent years building and improving its recommendation engine, and even sponsored a $1 million contest to improve its algorithm. But now anyone can download and tinker with this kind of software, thanks to a new open source project.

When streaming video company Shelby.tv built a new app for discovering online video last year, it outsourced the job to Mortar Data, a New York-based company that builds and hosts custom big data applications. "We wanted to build fast," says Shelby.tv CEO Reece Pacheco. "We were impressed with the product and the team [Mortar Data] had built."

The company also wanted the freedom to build its own recommendation engine in the future. Because Mortar Data was built on standard open source tools like Hadoop, it was easy for the Shelby.tv team to move their data in and out of the system in a format that they could later use themselves.

But now Mortar Data has gone a step further. Earlier this month it open sourced its recommendation engine platform, so that anyone can build their own system and run it in their own data center.

Recommendations for the Masses

Mortar Data co-founder and CEO K Young.

Photo: Mortar Data

Recommendation systems have become one of the main ways that companies cash in on the huge amounts of data they collect. Retailers use them to suggest products, music services like Pandora and Last.fm use them to find music, and publications like Wired use them to suggest the next article you might want to read.

Companies that want such a recommendation system generally have two choices: build it themselves, or use off-the-shelf technology. Building your own is risky. It is expensive, and a recommendation engine that isn't very good can be even worse than not having one at all, Pacheco says.

That provides a strong incentive to buy an existing product. But Mortar Data CEO K Young says many companies are hesitant to rely too heavily on another company to run a core part of their business. That's a big part of why Mortar Data has open sourced its frameworks, Young explains.

There are other open source recommendation engines. Overstock.com, for example, built its own system using a collection of open source algorithms from the Apache Mahout project. But it's harder to get started with Mahout. Overstock.com has a team of about six engineers and a project manager working on its recommendation engine. As Ted Dunning, a contributor to the Mahout project who works for big data company MapR, told us in 2012: "It’s not a product. It’s not a package. It’s not a service. Batteries are not included."


Mortar Data hopes to make it much easier to get started. According to its documentation, just one engineer should be able to get a custom recommendation engine up and running in about a week's time.

But Mortar Data isn't giving away everything for free. The company makes money by building and hosting custom big data solutions, and it has built a few tools that make that job easier, such as a system that lets you deploy your application to a large cluster of servers with a single click. Those tools for deploying and scaling applications aren't open source. You can still run your Mortar Data apps in your own data center, but you'll have to do the work of deploying them to a cluster and managing that cluster yourself. Since the core software is open source, though, someone else could eventually build a tool for easily deploying Mortar Data apps to other infrastructures.

In that sense, the open source tools serve as marketing for the company -- and an assurance that the customers have an exit strategy if they ever choose to leave. "We hope that the open source tools will provide enough value that users will consider hosting with us," Young says.

The strategy seems to be working. In addition to small startups like Shelby.tv, Mortar Data has attracted a few big-name companies that will soon be using the system for public-facing projects. For example, online ticket ordering company StubHub will use it to recommend other events you might want to attend, and MTV.com is testing its own video recommendation system based on the product.

Young hopes that eventually Mortar Data can be useful for more than just helping companies sell more products. "Data is a model of the world as we understand it, and data science allows us to understand the world and make more intelligent decisions," he says. "We as humanity have a lot of challenges coming up, and the better we can be at making intelligent decisions that are thoughtful and informed and are not just guesses, the better we'll be at tackling them."

"This is my way of helping that all happen," he says. "I know that's grandiose, but that is, I think, why Mortar matters."