Curation is the New Search is the New Curation (kedrosky.com)
46 points by atularora on Jan 12, 2011 | 15 comments



Chrome just needs to put a 'mark as spam' button right in the browser, which Google could then use to augment its current ranking system.

Isn't that pretty much how spam filtering for email works? If enough people say something is garbage, it goes in the toilet. The user is still free to look in the toilet, but usually they don't need to.
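Something like this, roughly. A toy sketch of that kind of crowd filtering; the threshold, the minimum-impressions cutoff and the dedup-by-user are numbers and choices I made up, not anything Google has described:

    from collections import defaultdict

    spam_reports = defaultdict(set)   # url -> set of user ids who flagged it
    impressions = defaultdict(int)    # url -> how many times it was shown in results

    SPAM_THRESHOLD = 0.02             # assumed: demote once 2% of viewers flag it

    def record_impression(url):
        impressions[url] += 1

    def report_spam(url, user_id):
        spam_reports[url].add(user_id)   # a set, so one user only counts once

    def is_probably_spam(url, min_impressions=1000):
        seen = impressions[url]
        if seen < min_impressions:       # not enough data to judge yet
            return False
        return len(spam_reports[url]) / seen >= SPAM_THRESHOLD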


This isn't necessarily false, but I think it's too simplistic to work well. How long before it too is being gamed?

The current PageRank system essentially lets the creators of web content vote for what ranks highly through what they link to, with the strength of each vote based on the strength of the votes coming into the voter. Seeing that this difficult-to-compromise system is falling to pieces, I worry that an easy-to-game system such as you suggest would be laid to waste within days.
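For reference, the link-as-vote idea boils down to something like the textbook power-iteration form of PageRank. This is only a toy version; the real system obviously uses far more signals:

    # Toy PageRank: each page splits its score among the pages it links to.
    def pagerank(links, damping=0.85, iterations=50):
        """links: dict mapping each page to the list of pages it links to."""
        pages = set(links) | {p for targets in links.values() for p in targets}
        n = len(pages)
        rank = {p: 1.0 / n for p in pages}
        for _ in range(iterations):
            new_rank = {p: (1 - damping) / n for p in pages}
            for page, targets in links.items():
                if not targets:
                    continue              # dangling page: its score just leaks in this toy
                share = damping * rank[page] / len(targets)
                for target in targets:    # each outlink is a vote, weighted by the
                    new_rank[target] += share   # voter's own rank
            rank = new_rank
        return rank

    print(pagerank({"a": ["b"], "b": ["a", "c"], "c": ["a"]}))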

Email filtering is a bit different, since you can only vote down the things that you've been sent. It's difficult for you to mark your competitors' correspondence as spam, since you never have a chance to vote on it. With search, by contrast, you can vote on anything you choose to search for.

There's also the question of whether it's really in Google's best interest to remove the spam. It's a plausible argument that they have no easy way out --- if they remove all the links to spam sites hosting Google ads, it's entirely possible that their revenue would fall precipitously. To argue that they need to remove spam, one first needs to conclude that they benefit from doing so.

I have more hope that a 3rd party might be able to solve this. Offer a series of browser plugins for Firefox/IE/Chrome that allow you to blacklist sites from search results? Everything you mark is blacklisted for you forever, and once there is a consensus (likely with human oversight) it's globally blacklisted as well? AdBlock essentially works on this model, so it should be possible.
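A rough sketch of how that personal-plus-consensus blacklist could behave; the 50-vote consensus threshold and the human-review step are my own guesses at how you'd wire it:

    personal_blacklists = {}   # user_id -> set of domains that user marked
    global_votes = {}          # domain -> set of user_ids who marked it
    CONSENSUS_VOTES = 50       # assumed: enough distinct users to trigger review

    def blacklist(user_id, domain):
        personal_blacklists.setdefault(user_id, set()).add(domain)
        global_votes.setdefault(domain, set()).add(user_id)

    def needs_human_review(domain):
        return len(global_votes.get(domain, set())) >= CONSENSUS_VOTES

    def hidden_for(user_id, domain, reviewed_global_blacklist):
        # Hidden if the user marked it themselves, or reviewers confirmed consensus.
        return (domain in personal_blacklists.get(user_id, set())
                or domain in reviewed_global_blacklist)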

P.S. Google already offers an official Chrome extension that does much of what you suggest: https://chrome.google.com/extensions/detail/efinmbicabejjhja.... It's somewhat ironic that many of the 'user reviews' for it on that page are actually spam.


It amazes me that Yahoo doesn't want to keep/use delicious. What a remarkable database of user-voted content.


Yes, but it's only a matter of time until it gets gamed. Google could index it to find less spammy content for a while.

Ultimately you need to know who is who and who is posting what. Is that what their internet ID lobbying is about?


That's the real problem. The stakes are huge, so eventually most systems get gamed by spammers.

Maybe they could have meta-moderators (crowd-sourced?) like Slashdot has (had? I haven't been there in a while): people who review what others have reported as spam, with only the top 5% most trusted and consistent accounts actually influencing live rankings.

For a spammer to get there, he would have to do lots of good work for a while, and risk losing that top spot as soon as he starts trying to do spammy stuff.

Maybe it would work... Any obvious way to game such a system?
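For what it's worth, the trust gate could look something like this; the scores, the 5% cut-off handling and everything else here are illustrative only:

    # Only reports from the top 5% most trusted accounts affect live rankings.
    def trusted_accounts(trust_scores, top_fraction=0.05):
        """trust_scores: account -> score accumulated from meta-moderation."""
        cutoff = max(1, int(len(trust_scores) * top_fraction))
        ranked = sorted(trust_scores, key=trust_scores.get, reverse=True)
        return set(ranked[:cutoff])

    def effective_spam_reports(reports, trust_scores):
        """reports: list of (account, url) pairs; only trusted accounts count."""
        trusted = trusted_accounts(trust_scores)
        counts = {}
        for account, url in reports:
            if account in trusted:
                counts[url] = counts.get(url, 0) + 1
        return counts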


/.'s system works because the users who want to have a discussion outnumber the people who want to shout obscenities in all caps. That is only true because there is no money in shouting obscenities in all caps on /. --- but there is plenty of money in upvoting spam on Google. If they switched to crowdsourced curation today, you'd have a spammer-to-user ratio of 10:1 before February 1st.


Very good points. Thanks for the reply!


I think anything will be gamed until gaming it costs more than it gains. There is plenty of cheap labour right now willing to game the system - I wonder whether there is a correlation between the decline in search quality and the growth of internet access in countries where people are willing to work for very low wages (I could not find precise enough data on Google for the evolution of internet usage in third-world countries over the last few years).


That's kind of the point of my system. It takes a lot of work to become "trusted", and it's trivially easy for a trusted meta-mod to remove that status from you.

If Wikipedia can work, this might just work. But Google would no doubt prefer an algorithmic way to solve the problem. Maybe some form of narrow AI is close enough in the pipeline...


For one thing, it is difficult to scale trust: you would get people who keep flagging everything as spam until the "spam" committee you are suggesting (if I understand you correctly) is overloaded. Many people also aren't very good at telling spam from non-spam (is efreedom spam or not? What if you are not a programmer?).

I also suspect that just taking the top 5% will produce a lot of false rejections, which is a real issue.

But when I say there is a cost/revenue issue, it cuts both ways: everybody thinks about raising the cost of spamming, but lowering the income from spamming would fight spam just as well. Doing that without also lowering Google's revenue may be challenging.


It's harder to game delicious because you can narrow your search to sites that have been bookmarked by people you have chosen to be in your network. You are essentially searching across curated lists from several trusted curators.
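In toy form, that model is basically "only search what my chosen curators have bookmarked". This is just an illustration, not delicious's actual API:

    # Toy model: only search bookmarks saved by curators you explicitly trust.
    def network_search(query, my_network, bookmarks_by_user):
        """bookmarks_by_user: user -> list of (url, tags) they have bookmarked."""
        results = []
        for curator in my_network:
            for url, tags in bookmarks_by_user.get(curator, []):
                if query.lower() in " ".join(tags).lower():
                    results.append((url, curator))
        return results

    hits = network_search("python", {"alice", "bob"}, {
        "alice": [("http://example.com/py-tips", ["python", "tips"])],
        "spammer": [("http://spam.example", ["python", "cheap", "pills"])],
    })
    # The spammer's bookmark never shows up because they are not in my network.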


It makes me wonder: how true is it that the web is too large to curate by hand? It's remarkable how fast a human can determine that a site is spam, and, with the exception of a few large user-created sites, once you've determined that a few pages on a domain are spam you don't have to look further.

Kedrosky uses the number (unsure from where) of 234 million sites. At $1/site you could have a number of quite well-paid people checking off whether a site is 'real' or 'spam', at a price Google could easily afford --- if it would improve their bottom line.
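Back-of-envelope, taking the 234 million figure at face value and assuming (my numbers, not Kedrosky's) that a reviewer can judge a site in about 30 seconds:

    SITES = 234_000_000
    COST_PER_SITE = 1.00                # dollars, as suggested above
    SECONDS_PER_SITE = 30               # assumed reviewer throughput
    WORK_YEAR_SECONDS = 2000 * 3600     # roughly 2000 working hours per year

    total_cost = SITES * COST_PER_SITE
    reviewer_years = SITES * SECONDS_PER_SITE / WORK_YEAR_SECONDS
    print(f"${total_cost:,.0f} one-off, about {reviewer_years:,.0f} reviewer-years")
    # -> $234,000,000 one-off, about 975 reviewer-years of labour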

Even if you went page by page, you could figure out a way to build an index of spam-free pages. But would this really provide sufficient economic benefit? I think the answer is no, otherwise it would already have been done. What can be done to change this?


I wonder if you would even need to pay people --- someone (Google would seem like an obvious candidate) could write a browser extension with a "total crap" button. I know I'm sufficiently annoyed when I land on some spammy garbage that I'd like to punish them with a consequential downvote.

Hopefully you could get enough legitimate uptake that paying people to vote fraudulently wouldn't be worth it.


Suppose the 4chan hivemind decides to mass-spam your startup with hundreds of thousands of downvotes over the next few weeks. How would you have any recourse in situations like that?


You'd probably have to have a trusted human in the loop, i.e., if a site gets enough downvotes, it trips some internal procedure to review the site.

I also think that if you didn't generate public feedback (i.e., didn't list the most downvoted sites, for example) you'd avoid the kind of Bieber-to-DPRK stunt that 4chan would be interested in.



