Popularity Algorithms
21 points by Anon84 on June 1, 2008 | 12 comments
I'm working on a site with a "digg-reddit-dzone-like" feature. How should I select the front page stories? Are there any standard references for this type of algorithm? Best practices? Caveats? Any Hacker suggestions?



There's always the source to news.yc: http://ycombinator.com/arc/arc2.tar


Can I recommend the book "Programming Collective Intelligence" (http://www.amazon.com/Programming-Collective-Intelligence-Bu...)? It doesn't actually describe a Reddit-like algorithm, but it does describe lots of recommendation algorithms along with practical Python code. Should get you thinking in the right direction.


the algorithms in this book might get you thinking in generally the right direction, but you might miss out on a couple of critical ingredients -- namely, time-based decay functions that make things float and sink (and float again -- don't forget that!), weighting things based on a user's "input worth" (how much you value someone's vote), and so on.

i see these three components as being the basic broth for a good "organic ranking" site:

  1. time since the article was submitted, probably represented logarithmically.

  2. how many votes came in, measured by how quickly they came in from the article's submission and how far apart each vote is.

  3. the "weight" of the users voting on the article.
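One way to picture how those three components could combine is the rough sketch below. It is not any site's actual formula: the function name `organic_score`, the `half_life_hours` parameter, the `log2` age penalty and the per-vote exponential decay are all assumptions chosen for illustration. Each vote is weighted by its voter's worth and by how recent it is, so an old story can float again if fresh votes arrive.

```python
import math

def organic_score(vote_events, submitted_at, now, half_life_hours=12.0):
    """Hypothetical ranking score.

    vote_events: list of (timestamp_seconds, voter_weight) tuples.
    submitted_at, now: timestamps in seconds.
    """
    age_hours = max(0.0, (now - submitted_at) / 3600.0)
    # Component 1: age of the article, represented logarithmically.
    # The +2 keeps the penalty at 1.0 or above, even for a brand-new story.
    age_penalty = math.log2(age_hours + 2.0)

    total = 0.0
    for timestamp, voter_weight in vote_events:
        # Component 2: vote timing. Recent votes count more than old ones,
        # so a burst of new votes lifts a story that had already sunk.
        vote_age_hours = max(0.0, (now - timestamp) / 3600.0)
        recency = 0.5 ** (vote_age_hours / half_life_hours)
        # Component 3: the weight of the user who cast the vote.
        total += voter_weight * recency

    return total / age_penalty
```

With this shape, two stories with the same vote count rank differently depending on when the votes arrived and who cast them; the half-life is the knob to tune first.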


4. The amount of comment activity.

5. Votes vs views.


Second that... That book is a great starting point... to get you thinking


Umm...not sure on a specific "algorithm", but....

I would say just do something that takes into account how quickly items are getting voted on and how many comments vs. votes each one has. Also take into account that sometimes there will be more people on the site than at others; if you get a swarm of people around the same time (e.g. at launch), some items submitted around that time will get tons more votes than things submitted an hour earlier. Those earlier stories may not have been exactly "less popular"; they just had fewer users around to vote.
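A minimal sketch of that traffic correction, assuming you log votes per hour for each story and for the site as a whole (the function name and both dictionaries are hypothetical): instead of summing raw votes, sum the story's share of all votes cast in each hour.

```python
def activity_adjusted_votes(story_votes_by_hour, site_votes_by_hour):
    """Sum a story's share of site-wide votes, hour by hour.

    Both arguments map an hour bucket (any hashable key) to a vote count.
    """
    share = 0.0
    for hour, votes in story_votes_by_hour.items():
        site_total = site_votes_by_hour.get(hour, 0)
        if site_total > 0:  # skip hours with no recorded site activity
            share += votes / site_total
    return share
```

Under this measure, 10 votes during a quiet hour with 100 site-wide votes outweigh 50 votes during a launch-day hour with 1000.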


Right. Instead of counting time you might count the number of votes.
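One possible reading of this suggestion, sketched under that assumption: age a story by the number of votes cast site-wide since it was submitted rather than by the clock, so stories sink slowly when the site is quiet. The function name, the `gravity` exponent and the `+ 10` offset are made up for illustration.

```python
def vote_clock_score(story_votes, site_votes_at_submission, site_votes_now,
                     gravity=1.5):
    """Hypothetical score that measures a story's age in site-wide votes."""
    elapsed_votes = max(0, site_votes_now - site_votes_at_submission)
    # The +10 offset avoids dividing by zero for a brand-new story.
    return story_votes / (elapsed_votes + 10) ** gravity
```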


Remember, whatever you start with will change. So make sure it's easy to change.


You can weight user importance with a ranking algorithm ( http://en.wikipedia.org/wiki/PageRank ).
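The comment does not say which graph to run PageRank on. As a sketch, assume a voter "links to" every user whose submission they upvoted; a plain power iteration then gives each user a weight. The function name, the graph definition and the default parameters are assumptions, not something the thread specifies.

```python
def user_pagerank(endorsements, damping=0.85, iterations=50):
    """Power-iteration PageRank over a hypothetical user endorsement graph.

    endorsements: dict mapping a voter to the list of users whose
    submissions that voter upvoted.
    """
    users = set(endorsements)
    for targets in endorsements.values():
        users.update(targets)
    users = sorted(users)
    n = len(users)
    if n == 0:
        return {}

    rank = {user: 1.0 / n for user in users}
    for _ in range(iterations):
        new_rank = {user: (1.0 - damping) / n for user in users}
        for user in users:
            targets = endorsements.get(user, [])
            if targets:
                # Split this user's rank among the users they endorsed.
                share = damping * rank[user] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A user who endorsed nobody spreads rank evenly over everyone.
                share = damping * rank[user] / n
                for target in users:
                    new_rank[target] += share
        rank = new_rank
    return rank
```

The resulting ranks sum to 1 and could serve as per-user vote weights; in practice you would probably rescale them and recompute periodically rather than on every vote.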


do you mean that you could rank user importance by the "quality" (highest "worth") of the content they view or vote on w/ a pagerank implementation?


Why not try something novel: initially place articles with a random number generator. It will ensure that the site's front page has varied content.


(votes / total number of votes) * (views / total number of views) * time decay function
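Written out as code, that product might look like the sketch below. The comment does not say which decay function to use, so the exponential half-life here is an assumption, as are the function and parameter names.

```python
def popularity(votes, total_votes, views, total_views, age_hours,
               half_life_hours=24.0):
    """Vote share times view share times an assumed exponential time decay."""
    if total_votes <= 0 or total_views <= 0:
        return 0.0  # nothing to normalise against yet
    decay = 0.5 ** (age_hours / half_life_hours)
    return (votes / total_votes) * (views / total_views) * decay
```

Note that multiplying by the view share rewards heavily viewed stories, which is a different signal from the votes-per-view ratio suggested earlier in the thread.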



