"We can take data sets with millions and millions of data points and figure out what’s related to a given item in a few milliseconds. Most recommendations engines pre-compute stuff rather than generating the recommendations in real-time like we do"
I've been looking into recommendation algorithms recently (started with the excellent book "Programming Collective Intelligence"), and this sounds light-years ahead of the way we currently do things. I suppose you could take one of the algorithms that requires pre-computing and throw resources at it, but it seems like they are talking about something new.
Since I'm just getting started, I'd like to find some academic (or blog-faux-academic) articles on whatever recent advances make recommendations possible without the need for precomputing. Anyone know where to look?
There is no magic. I don't know what Directed Edge are doing, but simpler Amazon-style recommendations (people who bought X also bought...) don't need to precompute anything if you choose your data structures properly: get the list of things people bought along with X, count them (or actually store them counted), optionally normalize for general popularity, sort in decreasing order, and show the top results.
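The steps above fit in a few lines. This is a minimal sketch with made-up purchase data (the `purchases` dict and the `also_bought` name are illustrative, not anyone's actual API):

```python
from collections import Counter

# Hypothetical purchase data: item -> set of users who bought it.
purchases = {
    "X": {"alice", "bob", "carol"},
    "A": {"alice", "bob"},
    "B": {"carol", "dave"},
    "C": {"alice", "bob", "carol", "dave"},
}

def also_bought(item, data, top=10):
    """People who bought `item` also bought..., ranked by co-purchase count."""
    buyers = data[item]
    counts = Counter()
    for other, other_buyers in data.items():
        if other != item:
            counts[other] = len(buyers & other_buyers)
    # To normalize for general popularity, you could divide each count
    # by len(other_buyers) before sorting.
    return [i for i, c in counts.most_common(top) if c > 0]

also_bought("X", purchases)  # → ['C', 'A', 'B']
```

With a counted structure stored up front, serving a recommendation is just a sorted lookup, which is why this style doesn't need a heavyweight precompute step.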
Amazon's algorithms are simpler than our own (at least as far as I can tell) and most recommendation engines use some sort of embedding to reduce the dimensionality of the problem.
Amazon's related products do in fact seem to be, or very near to, a simple counting structure. Our ranking algorithm builds a large subgraph around an item and then does a few passes with a couple different ranking schemes to try to figure out the hot items within that subgraph, prunes "noisy" connections (i.e. links that are "hot", but don't actually pack much semantic meaning) and then tries to scale things so that the results returned aren't simply those with the largest overlap, but those that are most relevant within that subgraph relative to the larger graph. In that sense, it has some similarities to web-search algorithms.
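A rough, hypothetical sketch of that general shape — emphatically not Directed Edge's actual code; the graph layout, the `rank_subgraph` name, and all the constants are invented for illustration. It builds a local subgraph, prunes high-degree "noisy" hubs, runs a few PageRank-style passes, then rescales by global degree so sheer overlap doesn't win:

```python
def rank_subgraph(graph, seed, iterations=10, hub_cutoff=0.7, damping=0.85):
    """graph: adjacency dict {node: set(neighbors)}; seed: the query item."""
    # 1. Take the 2-hop neighborhood around the seed.
    sub = {seed} | graph[seed]
    for n in list(graph[seed]):
        sub |= graph[n]
    # 2. Prune hubs: nodes linked to a large fraction of the whole graph
    #    carry little semantic signal (the "Jazz" next to "Miles Davis").
    total = len(graph)
    sub = {n for n in sub if len(graph[n]) / total < hub_cutoff or n == seed}
    # 3. A few PageRank-style passes restricted to the subgraph.
    score = {n: 1.0 / len(sub) for n in sub}
    for _ in range(iterations):
        nxt = {}
        for n in sub:
            nbrs = graph[n] & sub
            share = score[n] / len(nbrs) if nbrs else 0.0
            for m in nbrs:
                nxt[m] = nxt.get(m, 0.0) + share
        score = {n: (1 - damping) / len(sub) + damping * nxt.get(n, 0.0)
                 for n in sub}
    # 4. Rescale: relevance within the subgraph relative to global degree,
    #    so the biggest-overlap node doesn't automatically top the list.
    return sorted(
        ((s / (1 + len(graph[n])), n) for n, s in score.items() if n != seed),
        reverse=True,
    )
```

On a toy graph where "jazz" links to everything, the hub pruning drops it before ranking, which is exactly the "tell me something I don't know" behavior described here.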
In user-visible terms, that means that our results are often less obvious than Amazon's recommendations -- for a long time we called that the "tell me something I don't know" problem. It's no good to do a search for "Miles Davis" and have "Jazz" come back as a related item. If you know about Miles Davis, you already know about Jazz.
If you mean metrics, the only one that I find really meaningful is a feedback loop to see what users are in fact interacting with, and we'll have something in place for that shortly. Synthetic metrics on recommendation quality don't really impress me because they ignore that recommendation algorithms are solving a human-computer interaction problem as much as they're solving a k-nearest-neighbors problem. I've got another article in the pipe on some of the interesting problems of ranking on real data, but it keeps getting pushed back since there's, you know, a lot to do at the moment. :-)
"Directed Edge truly believes that we’re about to see a shift on the web away from search and towards recommendations."
The difference is somewhat arbitrary when you think about it. When I Google, I'm asking it to recommend me stuff related to what I'm looking for. Google is nothing but the world's best recommendation engine.
There are about 1,000 sites that could use good recommendation technology to enhance their profits though, so I like this company's monetization chances. Easy elevator pitch too: recommendations as a service.
It's a continuum, as hinted below, but the results are pretty different in the polar cases. I'll fall back to my Miles Davis example.
Searching for "Miles Davis" returns 10 pages about Miles Davis. Hitting our engine with Wikipedia data for Miles Davis gives you John Coltrane, Herbie Hancock, Wayne Shorter, Thelonious Monk, Sonny Rollins, Cannonball Adderley, and so on.
The search end of the spectrum is about finding something you're looking for -- recommendations are about discovering things you didn't know about.
The two meet in the middle with "personalized search". If I type "python" into a search engine, do I want snakes or code? You can probably figure that out based on what I've done in the past.
Yeah, and that's awesome, and I think will totally converge with the type of search we have today. Imagine you type "Miles Davis" in Bing (which I'm using because they'd be more likely to experiment with altering the dominant paradigm than Google) and they show you a column of pages on the left about Miles and the stuff you're showing on the right. That'd be bad ass.
You are right, there is a continuum of possibilities between pure search and pure recommendation systems.
The difference is in the input data that the search/recommendation engine considers: the input in classic search is only the few words in the query, while a recommendation engine considers some history of your interactions with a site or the internet (but nothing that indicates your intent in this particular instance).
Google, of course, keeps search history for people logged in. But it doesn't seem to affect their relevancy algorithm much, unlike, say, your location and language.
I'd speculate Google Checkout is mostly about gathering data that would allow for meaningful recommendations.
I was about to post this too. Surely all the possible nodes include all possible search terms, as well as all web pages. Then search and relations are synonymous. Any thoughts on how your system would be different from a search engine in that sense, Wheels?
Greg Linden, who worked on Amazon.com's recommendation engine, has referred to what he calls the "harry potter problem". To quote from his blog:
'...this calculation would seem to suffer from what we used to call the "Harry Potter problem", so-called because everyone who buys any book, even books like Applied Cryptography, probably also has bought Harry Potter. Not compensating for that issue almost certainly would reduce the effectiveness of the recommendations, especially since the recommendations from the two clustering methods likely also would have a tendency toward popular items.'
How did you compensate for this problem? Do you simply ignore vertices in the graph that have a large degree?
Or, are you using non-linear weighting functions, such as a perceptron's sigmoid function?
With regard to Wikipedia, almost everyone who has edited an article has also edited the article on Bill Clinton. So, if you are using the edit-history metadata to compute recommendations, you would have to compensate for the "Bill Clinton problem".
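One standard way to compensate is to normalize the overlap by each item's overall popularity, e.g. cosine similarity instead of a raw count. This toy example (the user sets and names are made up) shows the ranking flip once a ubiquitous item is damped:

```python
from math import sqrt

# Hypothetical purchase sets; "harry_potter" is bought by nearly everyone.
bought = {
    "applied_crypto": {"u1", "u2", "u3"},
    "practical_crypto": {"u1", "u2"},
    "harry_potter": {"u1", "u2", "u3", "u4", "u5", "u6", "u7", "u8"},
}

def raw_overlap(a, b):
    # Plain co-purchase count: ubiquitous items always score high.
    return len(bought[a] & bought[b])

def cosine(a, b):
    # Dividing by each item's popularity damps high-degree vertices.
    return raw_overlap(a, b) / sqrt(len(bought[a]) * len(bought[b]))

# Raw counts put Harry Potter first (3 vs 2); cosine reverses that:
cosine("applied_crypto", "practical_crypto")  # ≈ 0.816
cosine("applied_crypto", "harry_potter")      # ≈ 0.612
```

Simply dropping the highest-degree vertices, as suggested above, is the blunter version of the same idea; normalization keeps them in the graph but lets them rank only where they genuinely co-occur more than popularity alone predicts.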
I have been following the "recommendation algos as a service" space for about 2 years now. This definitely seems interesting, but a side opportunity could be to do an aggregator/optimizer of recco algos for merchants/publishers, similar to what Rubicon Project/PubMatic does in aggregating/optimizing ad networks.
The recco-algo aggregator would take all the various recco algo services, such as DirectedEdge, Aggregate Knowledge, Loomia, Minekey, Persai (now dead) and many others, and keep running tests (similar to the Netflix Prize); whichever performs better is given more airtime suggesting related products/pages for retailers/publishers.
The compensation model would work on a percentage of revenue for additional clicks/purchases on the suggestions.
"....we’ve gone from having a graph-store to having a proper graph database..."
A graph database and a "triple store" in semantic technologies are essentially the same thing. This company makes some very aggressive claims of the sort that AllegroGraph, Jena, Oracle (with Spatial), Sesame and others (including the Korean arm of my current company) have also made. Typically, such claims fail to live up to the marketing. I wonder how this solution compares to those traditional triple stores?
There is more than one, depending on the problem and what exactly you want to improve. For example, RMSE is the measure used for the Netflix Prize.
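For concreteness, RMSE is just the square root of the mean squared difference between predicted and actual ratings; the example ratings below are made up:

```python
from math import sqrt

def rmse(predicted, actual):
    """Root mean squared error, the Netflix Prize metric."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

# Three predicted star ratings vs. what the users actually rated:
rmse([3.5, 4.0, 2.0], [4, 4, 1])  # → sqrt(1.25 / 3) ≈ 0.645
```

The Netflix Prize targeted a 10% RMSE improvement over Cinematch on exactly this kind of held-out rating data.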
You can actually search in the little bar there for any Wikipedia article to show related stuff, and as kind of a little easter egg, you can specify a starting page via the query string, e.g.
We tried a few combinations -- and may try some yet. If it was further right it threw off the balance of the layout. If it was up at the top it drew too much attention to itself. We tried messing with the z-index on hover too, but that looked funky.