
The stories that he's wrong about are really important here, and would determine if this is news or not.

If his site has a higher quality of content, good for him! It's like the Netflix Prize, except no cash.

If it's worse than Digg's page, then he hasn't improved anything.

Also, I'd be curious to see how "63% accuracy" is defined. In an ecosystem where 1% of stories get through, whether this number is based on false positives or false negatives will make a big difference. (He could be underselling himself!)
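To put rough numbers on that, here is a minimal Python sketch. Everything in it is assumed for illustration: 10,000 submissions, exactly 1% promoted, and two made-up predictors that could each be described as "63%". None of these figures come from the site being discussed.

```python
def accuracy(tp, fp, fn, tn):
    """Share of all predictions (yes and no) that turned out right."""
    return (tp + tn) / (tp + fp + fn + tn)

TOTAL = 10_000            # hypothetical number of submissions
PROMOTED = TOTAL // 100   # assumed 1% reach the front page (100 stories)

# Baseline: say "no" to everything. Never right about a hit, rarely wrong.
always_no = accuracy(tp=0, fp=0, fn=PROMOTED, tn=TOTAL - PROMOTED)

# Reading A (hypothetical): he picks 100 stories and 63 of them get promoted,
# so "63%" is about false positives among his picks.
reading_a = accuracy(tp=63, fp=37, fn=PROMOTED - 63, tn=TOTAL - PROMOTED - 37)

# Reading B (hypothetical): he picks 1,000 stories and catches 63 of the 100
# promoted ones, so "63%" is about false negatives among front-page stories.
reading_b = accuracy(tp=63, fp=937, fn=PROMOTED - 63, tn=TOTAL - PROMOTED - 937)

print(f"always no: {always_no:.4f}")
print(f"reading A: {reading_a:.4f}")
print(f"reading B: {reading_b:.4f}")
```

Under these assumptions all three overall accuracies land above 90%, so "63%" can't plausibly mean overall accuracy; it has to be one of the narrower readings, and A is a far more useful predictor than B.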




There are usually two numbers used to measure the 'accuracy' of a test (be it "This link will reach the Digg front page" or "This person has HIV", etc.). Those numbers are the false positive rate (you said they'd get to the front page, and they didn't), and the false negative rate (you said they wouldn't get to the front page, and they did).

It's common for these numbers to be related: decrease one and the other goes up. E.g. you could get a 0% false negative rate by just saying "Yes, this link will get to the front page" for every link; however, your false positive rate would be massive, about 99.999999% (since you're predicting that every link gets to the front page, and almost none do). This test would be useless because of the high false positive rate. A breakthrough occurs when you are able to have both a low false positive rate and a low false negative rate. The holy grail of any test is one with a 0% false negative rate and a 0% false positive rate.
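To make the "say yes to everything" case concrete, here is a minimal Python sketch. The 1,000 links and the 1% promotion rate are made-up numbers, and `confusion_counts` is a hypothetical helper written for this example. The shares follow the informal definitions above (wrong calls as a fraction of all predictions).

```python
def confusion_counts(predicted, actual):
    """Count the four outcomes for paired yes/no predictions and results."""
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)          # said yes, got promoted
    fp = sum(p and not a for p, a in pairs)      # said yes, never promoted
    fn = sum(not p and a for p, a in pairs)      # said no, got promoted
    tn = sum(not p and not a for p, a in pairs)  # said no, never promoted
    return tp, fp, fn, tn

# Assumed data: 1,000 links, of which the first 10 (1%) reach the front page.
actual = [i < 10 for i in range(1000)]

# The degenerate predictor: "yes" for every link.
always_yes = [True] * len(actual)

tp, fp, fn, tn = confusion_counts(always_yes, actual)
false_positive_share = fp / len(actual)
false_negative_share = fn / len(actual)

print(tp, fp, fn, tn)
print(false_positive_share, false_negative_share)
```

Because the predictor never says "no", it can never produce a false negative; every one of its mistakes is a false positive.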

Usually it's a trade-off between false positives and false negatives. Most western justice systems would rather have a high false negative rate than a high false positive rate: "Better 10 guilty men walk free, than 1 innocent man goes to jail".
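One way to see the trade-off is to imagine the predictor giving each link a score and calling it a "yes" above some cutoff. This is only a sketch: the scores, outcomes, and the `errors_at` helper below are invented for illustration and say nothing about how the site in question actually works.

```python
# Hypothetical (score, reached_front_page) pairs for ten links.
DATA = [
    (0.95, True), (0.90, True), (0.80, False), (0.70, True), (0.60, False),
    (0.40, False), (0.30, True), (0.20, False), (0.10, False), (0.05, False),
]

def errors_at(threshold):
    """Return (false_positives, false_negatives) when score >= threshold means 'yes'."""
    false_positives = sum(1 for score, hit in DATA if score >= threshold and not hit)
    false_negatives = sum(1 for score, hit in DATA if score < threshold and hit)
    return false_positives, false_negatives

# A lenient, a middling, and a strict cutoff.
for threshold in (0.25, 0.50, 0.85):
    fp, fn = errors_at(threshold)
    print(f"threshold={threshold:.2f}  false positives={fp}  false negatives={fn}")
```

With this toy data, raising the cutoff removes false positives one by one but lets more genuine front-page links slip through as false negatives, which is the "10 guilty men" choice in miniature.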

His statement about "63% accuracy" is ambiguous. What is he referring to? What are the false positive and false negative rates?


I think he means 37% false positives. This guess is based on the archive section, where he lists hits and misses. http://digginthefuture.com/archive


Based on my experience with Digg's upcoming engine, it doesn't seem anywhere near 63%.



