> In the end, the conclusion is that there’s no real way for us to fix this that would stop “attacks” against small consumer-grade sites without also significantly degrading the overall functionality.
Nonsense. Every web crawler should have some form of rate limiting. That's just good etiquette. I can control the number of requests that the Google search indexer sends to my site via webmaster tools. I don't see a good reason why Facebook can't be a good net citizen and do the same.
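For what it's worth, the crawler-side politeness being asked for here isn't complicated. Here's a minimal sketch in Python, assuming a simple one-request-per-second-per-host policy (the interval is just an illustrative number, not anything Google or Facebook documents):

```python
import time
import urllib.request
from urllib.parse import urlparse

# Hypothetical per-host politeness delay; a real crawler would honor
# robots.txt Crawl-delay or operator-configured limits instead.
MIN_INTERVAL_SECONDS = 1.0

_last_fetch = {}  # host -> timestamp of the most recent request to it

def polite_fetch(url):
    """Fetch a URL, sleeping first if we hit the same host too recently."""
    host = urlparse(url).netloc
    now = time.monotonic()
    wait = MIN_INTERVAL_SECONDS - (now - _last_fetch.get(host, 0.0))
    if wait > 0:
        time.sleep(wait)
    _last_fetch[host] = time.monotonic()
    with urllib.request.urlopen(url) as resp:
        return resp.read()
```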
I think in this instance we've passed the line between "not being nice" and "being a dick", or in this case being an accessory to a dickish move.
If the issue were used to DoS a site Zuckerberg cares about (not that I would encourage such action, as it would itself be a dick move), I'm sure some form of rate limiting would be implemented in short order...
I'm kind of wondering why they need so many different servers to fetch the file from the remote host.
It would be smart to at least implement rate limiting, and also to delegate the fetch to a specific server close to the host and then sync the image/file across their own network to whatever server needs it.
It is just a huge waste of bandwidth for 100+ servers to each fetch the file separately, instead of Facebook fetching it once and absorbing the distribution cost itself.
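A toy sketch of that fetch-once-then-share idea, with an in-process dict standing in for whatever internal cache or replication layer they actually have (purely an assumption for illustration):

```python
import urllib.request

# Assumed internal shared cache; in practice this would be something
# like a distributed object store replicated across their network.
_internal_cache = {}

def fetch_via_edge(url):
    """Fetch an external resource once, then serve every internal
    consumer from the shared cache instead of re-hitting the origin."""
    if url not in _internal_cache:
        with urllib.request.urlopen(url) as resp:
            _internal_cache[url] = resp.read()
    return _internal_cache[url]
```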
Well, I would assume that when you add a link to a note, the link gets thrown into a queue, and a cluster of servers pops items off that queue to fetch and store the result. Also remember FB sees each link as a different link, so it can't fetch it once and share it.
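A rough sketch of that queue-plus-workers model, with a deliberately naive canonicalization step bolted on to show how trivially-different links could be collapsed into one fetch. The normalization rules and the store() hook are made up for illustration, not anything FB is known to do:

```python
import queue
import threading
import urllib.request
from urllib.parse import urlsplit, urlunsplit

work_queue = queue.Queue()
seen = set()               # canonical URLs already enqueued or fetched
seen_lock = threading.Lock()

def canonicalize(url):
    """Naive normalization so trivially-different links collapse to one
    fetch: lowercase the host and drop the query string and fragment.
    (Real rules would have to be far more careful than this.)"""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, "", ""))

def enqueue(url):
    canon = canonicalize(url)
    with seen_lock:
        if canon in seen:
            return          # already queued or fetched; don't refetch
        seen.add(canon)
    work_queue.put(canon)

def store(url, body):
    pass  # placeholder for whatever persistence layer sits behind the queue

def worker():
    while True:
        url = work_queue.get()
        try:
            with urllib.request.urlopen(url) as resp:
                store(url, resp.read())
        finally:
            work_queue.task_done()

for _ in range(4):          # small worker pool standing in for "100+ servers"
    threading.Thread(target=worker, daemon=True).start()
```

Even a dumb dedup layer like this would cut the duplicate fetches dramatically; the hard part is deciding which URL variations really point at the same resource.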