
If your feed reader is refreshing every 20 minutes for a blog that is updated daily, nearly 99% of the data sent is identical. It looks like Rachel's blog is updated (roughly) weekly, so that jumps to 99.8%. It's not the least efficient thing in the world of computers, but it is definitely incurring unnecessary costs.
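To make the arithmetic concrete (quick Python; polling every 20 minutes is 72 fetches a day):

    # Fraction of fetches that return bytes identical to the last fetch,
    # given how often the feed is polled vs. how often it actually updates.
    def wasted_fraction(polls_per_day, updates_per_day):
        return 1 - updates_per_day / polls_per_day

    print(f"{wasted_fraction(72, 1):.1%}")      # daily posts:  98.6% ("nearly 99%")
    print(f"{wasted_fraction(72, 1 / 7):.1%}")  # weekly posts: 99.8%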



I opened the XML file she provides on the blog and it seemed very long, but okay. Then I decided it was a good blog to subscribe to, so I went and tried to add it to my self-hosted FreshRSS instance (same IP, obviously) and I couldn't, because I got blocked/rate-limited. So yes, it is aggressive, for different reasons.


This post (before I even read your comment) actually made me look into my own FreshRSS setup.

NixOS defaults to a refresh frequency of every 5 minutes [0] (0_0).

I had noticed some blogs blackholing me before, but never quite made the connection.

So now it is configured to fetch every 12 hours. I believe that is fair.

[0] https://github.com/NixOS/nixpkgs/blob/d70bd19e0a38ad4790d391...
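For the curious, a 12-hour cadence with a conditional GET is roughly this, sketched as a standalone Python loop (placeholder URL; FreshRSS does the equivalent internally, this is just the shape of a polite poller):

    import time
    import urllib.error
    import urllib.request

    FEED_URL = "https://example.com/feed.xml"  # placeholder, not a real feed
    POLL_INTERVAL = 12 * 60 * 60  # 12 hours, in seconds

    last_modified = None
    while True:
        req = urllib.request.Request(FEED_URL)
        if last_modified:
            # Conditional GET: the server may answer 304 with no body at all
            req.add_header("If-Modified-Since", last_modified)
        try:
            with urllib.request.urlopen(req) as resp:
                last_modified = resp.headers.get("Last-Modified", last_modified)
                feed_xml = resp.read()  # feed changed: hand it to the parser here
        except urllib.error.HTTPError as e:
            if e.code != 304:  # 304 Not Modified is the happy path
                raise
        time.sleep(POLL_INTERVAL)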


Same, I made 3 requests in total and got blocked.


I know she's mentioned this particular problem on her blog before. I don't remember where to find it offhand, but my vague recollection is that, because browsers have largely removed the ability to view RSS feeds directly, she doesn't consider this a significant issue anymore.

Why did you view the XML file directly?


> Why did you view the XML file directly?

There are several reasons why I personally do this:

1- To check that the link actually loads and works!

2- To see how much content it has: does it include only the last n posts or the full feed history by default?

3- To see whether the feed gives summaries or the full content of posts.

4- Out of curiosity: in this case, I wanted to see the feed that prompted a blog post that reached the HN front page.

It is usually a superposition of those reasons. But this is why I call it an aggressive limit. I know it's her server, her rules, but it wasn't a pleasant experience for me as an end user; I was just sharing that experience.


I feel like a reasonable way to deal with this situation might be to look at the user agent: if two requests come from the same IP but different user agents, then it's likely either two completely different people (behind a NAT) or the situation the GP described.

That's certainly a bit more effort to implement, though, and the author might not think it's worth the time.
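For what it's worth, the same-IP/different-UA heuristic is only a few lines. A minimal sketch (all names and limits here are hypothetical, not the blog's actual code):

    import time
    from collections import defaultdict

    WINDOW = 3600     # seconds
    MAX_REQUESTS = 4  # per (IP, UA) pair per window

    _hits = defaultdict(list)  # (ip, user_agent) -> recent request timestamps

    def allowed(ip, user_agent):
        # Count requests per (IP, User-Agent) pair instead of per bare IP,
        # so a browser and a feed reader behind the same NAT don't share
        # one bucket.
        key = (ip, user_agent)
        now = time.time()
        _hits[key] = [t for t in _hits[key] if now - t < WINDOW]
        if len(_hits[key]) >= MAX_REQUESTS:
            return False  # this client is over budget; others are untouched
        _hits[key].append(now)
        return True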


Weird. Those should have had different user-agents, and I would guess it can't be purely based on IP.


Yeah, that's insane. Pretty much telling me not to subscribe to your blog at that point. Like sites that have an RSS feed yet put Cloudflare protection in front of it...

The correct thing to do here is put a caching layer in front so that every feed reader isn't simultaneously hitting the origin for the same content. IP banning is the wrong approach. (Even if it's only a temporary block, that's going to cause my reader to show an error and is entirely unnecessary.)
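A minimal sketch of that caching layer (hypothetical server; render_feed() stands in for hitting the origin): the body is regenerated at most once per TTL, and a client sending a matching ETag gets a free 304:

    import hashlib
    import time
    from http.server import BaseHTTPRequestHandler, HTTPServer

    CACHE_TTL = 300  # seconds
    _cache = {"body": b"", "etag": "", "expires": 0.0}

    def render_feed():  # stand-in for regenerating the feed at the origin
        return b"<rss>...</rss>"

    class FeedHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            now = time.time()
            if now >= _cache["expires"]:  # refresh per TTL, not per request
                _cache["body"] = render_feed()
                _cache["etag"] = '"%s"' % hashlib.sha256(_cache["body"]).hexdigest()[:16]
                _cache["expires"] = now + CACHE_TTL
            if self.headers.get("If-None-Match") == _cache["etag"]:
                self.send_response(304)  # unchanged: no body re-sent
                self.end_headers()
                return
            self.send_response(200)
            self.send_header("Content-Type", "application/rss+xml")
            self.send_header("ETag", _cache["etag"])
            self.end_headers()
            self.wfile.write(_cache["body"])

    HTTPServer(("", 8080), FeedHandler).serve_forever()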


It should be a timeboxed block, if anything. Most RSS users are actual readers, and expecting them to spend lots of time figuring out why clicking "refresh" twice in their RSS app got them blocked is totally unreasonable. I've got my feeds set up to refresh every hour. Considering the small number of people still using RSS and how lightweight it is, it's not bad enough to freak out over. At some point, all of Rachel's complaining and investigating will be more work than simply contacting the makers of the readers that cause the most traffic.
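A timeboxed block is barely any code, too. A hypothetical sketch:

    import time

    BLOCK_SECONDS = 15 * 60
    _blocked_until = {}  # ip -> unix time when the block expires

    def block(ip):
        # Refuse this IP for a fixed window, then quietly forgive it,
        # instead of banning it until a human notices.
        _blocked_until[ip] = time.time() + BLOCK_SECONDS

    def is_blocked(ip):
        expiry = _blocked_until.get(ip)
        if expiry is None:
            return False
        if time.time() >= expiry:
            del _blocked_until[ip]  # block has aged out
            return False
        return True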



