
I actually have a theory, based on the last episode of the 2.5 Admins podcast. Try spinning up a MediaWiki site. I have a feeling that wiki installations are being targeted to a much higher degree. You could also do a Git repo of some sort. Either of the two could give the impression that content changes frequently.



Yep, I'm running a pretty sizeable game wiki, and it's being scraped to hell with very specific URLs that pretty much guarantee cache busting (usually revision IDs and diffs).
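
To see whether that pattern is hitting your own wiki, you can scan the access log for oldid/diff query strings and see which clients the hits cluster on. A rough Python sketch (the log path and combined-log layout are assumptions, adjust for your setup):

  import re
  from collections import Counter

  # MediaWiki revision/diff URLs look like /index.php?title=X&oldid=123 or &diff=456
  CACHE_BUSTER = re.compile(r'[?&](oldid|diff)=\d')

  hits = Counter()
  with open('/var/log/nginx/access.log') as log:
      for line in log:
          # Combined log format: client IP is the first field
          ip = line.split(' ', 1)[0]
          if CACHE_BUSTER.search(line):
              hits[ip] += 1

  # Clients hammering revision/diff URLs the hardest
  for ip, count in hits.most_common(20):
      print(count, ip)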


I could believe that. Plus, because both of those are more dynamic, the server has to do more work per request anyway, so the effects of scraping are exacerbated.



