
"We can take data sets with millions and millions of data points and figure out what’s related to a given item in a few milliseconds. Most recommendations engines pre-compute stuff rather than generating the recommendations in real-time like we do"

I've been looking into recommendation algorithms recently (started with the excellent book "Programming Collective Intelligence"), and this sounds lightyears ahead of the way we currently do things. I suppose you could take one of the algorithms that requires pre-computing and throw resources at it, but it seems like they are talking about something new.

Since I'm just getting started, I'd like to find some academic (or blog-faux-academic) articles on whatever recent advances are behind recommendations without the need for precomputing. Anyone know where to look?




There is no magic. I don't know what Directed Edge are doing, but simpler Amazon-style recommendations (people who bought X also bought...) don't need to precompute anything if you choose your data structures properly: get a list of things people bought along with X, count them (or actually store them counted), optionally normalize for general popularity, sort in decreasing order, show top results.
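A minimal sketch of that counting approach, assuming purchases arrive as a list of baskets. The names `build_index` and `also_bought`, the sample data, and the choice to normalize by dividing by the other item's total purchase count are all illustrative assumptions, not anything Amazon or Directed Edge has published:

```python
from collections import Counter, defaultdict

def build_index(orders):
    """Store co-purchase counts per item, plus each item's overall popularity."""
    co_counts = defaultdict(Counter)
    popularity = Counter()
    for basket in orders:
        items = set(basket)  # ignore duplicates within one order
        for item in items:
            popularity[item] += 1
            for other in items:
                if other != item:
                    co_counts[item][other] += 1
    return co_counts, popularity

def also_bought(item, co_counts, popularity, top_n=3, normalize=False):
    """Rank items bought together with `item`, highest score first."""
    scores = {}
    for other, count in co_counts.get(item, {}).items():
        # Assumed normalization: divide by how often `other` sells overall,
        # so globally popular items do not dominate every list.
        scores[other] = count / popularity[other] if normalize else count
    return sorted(scores, key=lambda o: (-scores[o], o))[:top_n]

# Hypothetical sample data.
orders = [
    ["kind_of_blue", "a_love_supreme", "headphones"],
    ["kind_of_blue", "a_love_supreme"],
    ["kind_of_blue", "headphones"],
    ["kind_of_blue", "headphones"],
    ["headphones", "phone_case"],
    ["headphones", "usb_cable"],
]
co_counts, popularity = build_index(orders)
print(also_bought("kind_of_blue", co_counts, popularity))
print(also_bought("kind_of_blue", co_counts, popularity, normalize=True))
```

Since the counts are stored already counted, a lookup only touches the items that were ever bought with X, which is why nothing needs to be recomputed per request. In this sample the raw counts put the headphones first, while the normalized scores put the other album first.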

I'd love to hear something from wheels though.


Amazon's algorithms are simpler than our own (at least as far as I can tell) and most recommendation engines use some sort of embedding to reduce the dimensionality of the problem.

Amazon's related products do in fact seem to be, or very near to, a simple counting structure. Our ranking algorithm builds a large subgraph around an item and then does a few passes with a couple different ranking schemes to try to figure out the hot items within that subgraph, prunes "noisy" connections (i.e. links that are "hot", but don't actually pack much semantic meaning) and then tries to scale things so that the results returned aren't simply those with the largest overlap, but those that are most relevant within that subgraph relative to the larger graph. In that sense, it has some similarities to web-search algorithms.
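To make the "relevant within the subgraph relative to the larger graph" idea concrete, here is a toy sketch. It is not Directed Edge's algorithm: the two-hop subgraph, the `min_paths` pruning threshold, the divide-by-global-degree scaling, the function names, and the sample edges are all assumptions chosen only to illustrate the general shape of the idea.

```python
from collections import Counter, defaultdict

def build_graph(edges):
    """Undirected graph as a dict: node -> set of neighbours."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def related(graph, start, top_n=3, min_paths=2):
    # Local subgraph: count two-hop paths from `start` to each node.
    local = Counter()
    for n1 in graph[start]:
        for n2 in graph[n1]:
            if n2 != start:
                local[n2] += 1
    scores = {}
    for node, paths in local.items():
        if paths < min_paths:
            continue  # assumed stand-in for pruning weak, noisy connections
        # Assumed scaling: local overlap relative to the node's global degree,
        # so hubs that link to everything score lower.
        scores[node] = paths / len(graph[node])
    return sorted(scores, key=lambda n: (-scores[n], n))[:top_n]

# Hypothetical link data.
edges = [
    ("miles_davis", "kind_of_blue"), ("miles_davis", "jazz"),
    ("miles_davis", "john_coltrane"), ("john_coltrane", "kind_of_blue"),
    ("john_coltrane", "jazz"), ("kind_of_blue", "jazz"),
    ("jazz", "louis_armstrong"), ("jazz", "ella_fitzgerald"),
    ("jazz", "duke_ellington"), ("jazz", "new_orleans"),
]
graph = build_graph(edges)
print(related(graph, "miles_davis"))
```

In this toy data "jazz" overlaps with "miles_davis" as much as the more specific items do, but it ranks last after scaling because it is connected to so much of the rest of the graph.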

In user-visible terms, that means that our results are often less obvious than Amazon's recommendations -- for a long time we called that the "tell me something I don't know" problem. It's no good to do a search for "Miles Davis" and have "Jazz" come back as a related item. If you know about Miles Davis, you already know about Jazz.


Do you have any kind of results that demonstrate how accurate your recommendations are? I checked your site but couldn't find anything.


You mean examples or metrics?

If you mean examples, you can see our algorithm applied to link structure analysis on the related pages here:

http://pedia.directededge.com/

If you mean metrics, the only one that I find really meaningful is a feedback loop to see what users are in fact interacting with, and we'll have something in place for that shortly. Synthetic metrics on recommendation quality don't really impress me because they ignore that recommendation algorithms are solving a human-computer interaction problem as much as they're solving a k-nearest-neighbors problem. I've got another article in the pipe on some of the interesting problems of ranking on real data, but it keeps getting pushed back since there's, you know, a lot to do at the moment. :-)


How would you compare your offering, both in terms of technology and business application, to someone like CleverSet (now ATG Recommendations)?


So, you presume doing all this does not take any time for millions of objects?


All I'm saying is that it's possible with the right data structures. It's not more than what a typical search engine does.


They have a couple of articles on their site: http://directededge.com/tech.html


Thanks, you replied right before I was going to edit in the answer from http://developer.directededge.com/article/Introduction_to_Re...:

"Graph-based recommendation systems".

Something new to learn!

edit: the linked page mentions the book I'm reading! Looks like the answer was a few pages further in :)



