Netflix never used its $1 million algorithm due to engineering costs (arstechnica.com)
90 points by evo_9 on April 13, 2012 | 23 comments



That doesn't quite match what the article actually says - it's more, "We used the two algorithms from the first Progress Prize that gave us most of the benefit, and the 107 blended algorithms required to get the next 1.7% improvement weren't worth it. Oh, and here's how we had to reengineer them." (Numbers from memory, not to scale.)

The title makes it sound like the prize ended up being pointless. The article says otherwise.


Exactly.

I work on the Cinematch team at Netflix. The several hundred blended algorithms that were the output of the grand prize winning team are very... impractical... for Netflix's needs. We cherry-picked (and then modified) the best of the bunch.


Netflix's blog post mentions they used two algorithms from one of the Progress Prize ($50,000) winners.

The article goes on to say "...you might be wondering what happened with the final Grand Prize ensemble that won the $1M two years later...We evaluated some of the new methods offline but the additional accuracy gains that we measured did not seem to justify the engineering effort needed to bring them into a production environment."

The title is completely accurate.


> The title is completely accurate.

Accurate, but a half-truth. The prize was for a 10% improvement, but before that solution was produced they had already improved by 8.4%. The headline makes it sound like the improvement from zero to 10% was not worth the engineering cost, but really it was the improvement from 8.4% to 10% that cost too much.


This is why you should award prizes for the top K performers, as opposed to the first one to cross a benchmark.


And when the teams merge into K groups and then stop trying? Netflix paid for performance, and got performance.


In the Netflix prize, did all teams merge into one group and stop trying? No - and we wouldn't expect that with K groups either, especially with exponentially decreasing prize amounts. If this is even slightly a risk, you can compensate by making the maximum number of prizes awarded a function of the number of participating teams. For example, you can award 10 prizes if there are >100 teams, and otherwise award floor(num_teams/10) prizes, with the prize pool redistributed among the top 10%.
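
A rough sketch of that allocation rule, just to make it concrete (the 100-team cutoff, the 10-prize cap, and the halving payout schedule are only the illustrative numbers from above, not a tested contest design):

    def prize_allocation(num_teams, pool):
        # Number of prizes: 10 if there are >100 teams, else floor(num_teams / 10).
        n_prizes = 10 if num_teams > 100 else num_teams // 10
        if n_prizes == 0:
            return []
        # Exponentially decreasing amounts (halving per rank), scaled to sum to the pool.
        weights = [0.5 ** rank for rank in range(n_prizes)]
        scale = pool / sum(weights)
        return [round(w * scale, 2) for w in weights]

    print(prize_allocation(40, 1_000_000))   # 40 teams -> 4 prizes splitting $1M
    print(prize_allocation(500, 1_000_000))  # >100 teams -> capped at 10 prizes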

It was pure luck that the threshold for the million-dollar prize was crossed. If it had been arbitrarily set at 11% (as opposed to 10%), then there's a good chance the million dollar prize would have never been paid out.

The advantage of paying out K top prizes is that other teams that don't win outright may have developed additional useful models or insights (or used more computationally efficient algorithms), and you may gain access to these in this manner.


Yes, the engineering costs are too hard to justify... now that DVDs by mail are no longer their primary focus. It's not clear to me that that would still be the case if Netflix's business were still operating under 2007's constraints.


Did the entry ranked 5th also include the two algorithms? If so, the title makes sense. A contest is a good way to motivate the base, but it can also cause lots of effort that turns out to be unnecessary in the end.


As Eliezer said - they implemented the two most important algorithms from the contest, which advanced the state of the art in recommendation systems & gave the majority of the benefits, and didn't implement the long tail of algorithms that each only gave a very slight marginal benefit & would have been costly to re-train and maintain.

This is one of the advantages of running shorter competitions: normally it takes 1-3 months to approximately hit the asymptotic level of performance on a dataset, given the inherent noise in it and the state of the art in machine learning. The shorter competitions are focused on finding the low-hanging fruit that generates large improvements (such as SVD & RBMs in Netflix's case) and exploring the space of possible model structures, as opposed to optimally ensembling across a large number of models to eke out the last 0.01% of performance.

Exploring the space of useful features & possible models enables you to trade off computational efficiency & maintainability vs. model performance in production as well. The $1 million Netflix put toward the prize leveraged >> $1 million in human effort to explore the possible models, from which they found and applied the two best suited to their production implementation.
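
To make that trade-off concrete, here's a toy sketch - nothing to do with Netflix's actual systems, and the ratings matrix, factor count, and blend weight are all made-up numbers - of a single SGD-trained SVD-style latent-factor model next to a crude two-model blend, the kind of thing the big ensembles did with ~100 models at much larger scale:

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy user x movie ratings; 0 means "not rated".
    R = np.array([[5, 3, 0, 1],
                  [4, 0, 0, 1],
                  [1, 1, 0, 5],
                  [0, 1, 5, 4]], dtype=float)
    mask = R > 0

    def factorize(R, mask, k=2, steps=500, lr=0.02, reg=0.05):
        # Plain SGD matrix factorization: approximate R with U @ V.T (k latent factors).
        U = 0.1 * rng.standard_normal((R.shape[0], k))
        V = 0.1 * rng.standard_normal((R.shape[1], k))
        for _ in range(steps):
            for u, i in zip(*np.nonzero(mask)):
                err = R[u, i] - U[u] @ V[i]
                U[u], V[i] = (U[u] + lr * (err * V[i] - reg * U[u]),
                              V[i] + lr * (err * U[u] - reg * V[i]))
        return U @ V.T

    def rmse(pred):
        return float(np.sqrt(np.mean((pred[mask] - R[mask]) ** 2)))

    # Cheap baseline model: predict each movie's mean rating.
    item_mean = R.sum(0) / np.maximum(mask.sum(0), 1)
    baseline = np.tile(item_mean, (R.shape[0], 1))
    svd_pred = factorize(R, mask)              # the "low-hanging fruit" model
    blend = 0.9 * svd_pred + 0.1 * baseline    # a two-model blend ("ensemble")

    for name, pred in [("baseline", baseline), ("svd", svd_pred), ("blend", blend)]:
        print(f"{name:8s} RMSE on observed ratings: {rmse(pred):.3f}")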

(disclaimer - I work with Kaggle)


Even if they didn't take advantage of the algorithm, I bet the $1 million paid for itself due to all the press and hoopla surrounding the prize.


Exactly. A $1M investment by a company like Netflix to stay in our faces for - how long - '06 to '09 (and beyond!) was well worth it to them in terms of branding and new subscribers.


And bought multiple millions of dollars worth of engineering labor.


It also gave them a pretty solid idea of where they stood, and how much (not) to invest in developing that area further. I'm not sure how much that is worth.


Can't emphasize this enough - with only a handful of employees working on the problem, Netflix could never have been completely confident that they were maximizing the predictive value of the data. The competition format quite clearly shows just how good the models can get. So even if they only implement the model with an ~8% gain, it's now an informed decision with known trade-offs.


Well, there's still the question of which error function to use. They decided on sum of squared errors, which can and should be improved on - but it's not clear how. If they ever adopt a different one, the best-performing models may well differ substantially from what they got.
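
For what it's worth, a toy illustration of the point (made-up numbers, nothing to do with Netflix's data): two sets of predictions can swap rankings depending on whether you score them by squared error or by, say, mean absolute error, so the choice of error function really does shape which models come out looking best:

    import numpy as np

    actual = np.array([1, 2, 5, 4, 3], dtype=float)
    pred_a = np.array([1, 2, 2, 4, 3], dtype=float)  # one big miss
    pred_b = np.array([2, 3, 4, 3, 2], dtype=float)  # many small misses

    def rmse(p):
        # Root mean squared error - the metric the Netflix prize used.
        return float(np.sqrt(np.mean((p - actual) ** 2)))

    def mae(p):
        # Mean absolute error - one possible alternative.
        return float(np.mean(np.abs(p - actual)))

    for name, p in [("A (one big miss)", pred_a), ("B (small misses)", pred_b)]:
        print(name, "RMSE:", round(rmse(p), 3), "MAE:", round(mae(p), 3))
    # A is worse by RMSE but better by MAE - the two metrics disagree on the ranking.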


And increased visibility in the ML community.



When do you know you lurk too much on Hacker News? When you think: "Hey, the original Netflix blog post was on the front page just last week. Why is it being reported again!?"


"Netflix Watch Instantly is about the here and now, and Netflix is priming to respond to that time frame."

Did this come out of the press release? :)


As part of the prize payment, does Netflix exclusively own the implementation that won the prize? Have they patented it? Just asking, since there are few details of the grand prize winner's algorithm referenced in the article, and it seems it could be of great economic benefit - if not to use it, then to prevent others from using it - while also owning the proprietary second, third, fourth, etc. rating systems. It could be a good way to block competitors from getting a leg up if you end up owning the technology within each contest entry, and are even able to block the winners from selling it to someone else.


No - Netflix has a non-exclusive license to the algorithm and source code from the winner: BellKor's Pragmatic Chaos (AT&T et al.). They can't sublicense and don't own the patents related to the winning algorithm.


TL;DR: they get better data now from the streaming system, because they can tell exactly how much of each video people watch.

So it's another example of what the data mining people always say: getting more data is a quicker way to better results than improving your algorithm.



