Hacker News
Show HN: Free tool to find RSS feeds, even if not linked on the page
152 points by domysee 60 days ago | 50 comments
I developed a small tool to find RSS feeds for websites. You can try it out here: https://lighthouseapp.io/tools/feed-finder

In more than 90% of cases, the standard way of checking meta tags is enough to find the feeds. But my goal for this tool is that it finds feeds regardless of whether they're linked anywhere. Ideally, if this feed finder doesn't find a feed, no feed exists.

It's a big goal and admittedly not there yet, but it does a few things that are a step in that direction.

* Checks meta tags of parent pages (sometimes the article itself doesn't have the meta tag, but the main blog page does)

* Checks common suffixes like /rss, /index.xml and many others (sometimes the feed exists but isn't linked)

* Checks the sitemap

* Checks all links on the page

* Checks 3rd party feeds (OpenRSS for now, when I find more such repositories I'll add them too)
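
A minimal sketch of how the parent-page and common-suffix checks in the list above might generate candidate URLs. The function names and the short suffix list are illustrative assumptions, not the tool's actual implementation, and every candidate would still need to be fetched and verified as a real feed.

```python
from urllib.parse import urlsplit, urlunsplit

# Illustrative subset only (assumption); the post says the tool checks "many others".
COMMON_SUFFIXES = ["/rss", "/index.xml", "/feed", "/atom.xml", "/rss.xml"]


def parent_pages(url):
    """Return the URL followed by each of its parent paths, ending at the site root."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    pages = []
    # Walk from the full path up to the root ("/"), dropping query and fragment.
    for depth in range(len(segments), -1, -1):
        path = "/" + "/".join(segments[:depth])
        pages.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return pages


def candidate_feed_urls(url, suffixes=COMMON_SUFFIXES):
    """Combine every parent page with every suffix, skipping duplicates."""
    candidates = []
    for page in parent_pages(url):
        base = page.rstrip("/")
        for suffix in suffixes:
            candidate = base + suffix
            if candidate not in candidates:
                candidates.append(candidate)
    return candidates
```

For an article URL two levels deep and five suffixes this already yields 15 candidates, which hints at why the request count grows quickly.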

There are a couple of additional ideas I have, like checking search engines and crawling the entire domain (highly inefficient, but possible).

I'd love it if you could try it, and even more if you posted sites where it doesn't work.




Quick rant about websites that go to all the trouble of having an RSS feed but don't link to it in the <head>... I don't want to go hunting for the cute orange button, I want to copy and paste "https://example.com" into my feed reader and let the computer handle the work.

If you maintain any website with a news feed, go right now and check that you have this in your <head>:

    <link rel="alternate" type="application/rss+xml" href="/rss.xml" title="News feed" />
                                                           ^^^^^^^^ change! ^^^^^^^^^
(Also note whether and where you need to use application/rss+xml, application/atom+xml, or application/json.)
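
A rough sketch of what a feed reader might do with that tag, using only the Python standard library. The class and function names are made up for illustration, and the set of accepted types is an assumption: it takes the three types mentioned above plus application/feed+json, the type recommended by JSON Feed 1.1.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed set of feed MIME types; adjust to what your reader supports.
FEED_TYPES = {
    "application/rss+xml",
    "application/atom+xml",
    "application/json",
    "application/feed+json",
}


class FeedLinkParser(HTMLParser):
    """Collects href values of <link rel="alternate"> tags with a feed type."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values can be None for valueless attributes.
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        is_feed = attr.get("type", "").strip().lower() in FEED_TYPES
        if "alternate" in rels and is_feed and attr.get("href"):
            # Resolve relative hrefs such as "/rss.xml" against the page URL.
            self.feeds.append(urljoin(self.base_url, attr["href"]))


def find_feed_links(html, base_url):
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds
```

Calling find_feed_links(page_html, "https://example.com/") on a page containing the tag above would return the absolute feed URL, which is what lets a reader accept a bare site address.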


Thanks for this comment, it encouraged me to go and add this to the <head> of my blog.


Subscribed ;)


This is great, it's hard to believe sites can have RSS feeds but make it so difficult to find.

I suspect some sites are just running some framework that enables it and don't even realize they have one.

I have used this site in the past to find feeds: https://www.rsssearchhub.com/

In the past I was looking for a feed for https://ra.co, but could not find it, though I had seen old posts referencing an RSS feed.

I ended up emailing them and, to my delight, they let me know they still have an unsupported RSS feed here:

https://ra.co/xml/rss_news.xml

Just for feedback, this tool doesn't find the feed, though it doesn't look like a standard URL to me.


Definitely not a standard path, but good to know for testing, thank you!


If I can't find an RSS link directly, I generally copy the root URL into archive.org and search for all URLs matching "xml", which includes content type, not just URL names.
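
For a scripted variant of this, one option is the Wayback Machine's CDX API instead of the web interface. This is a swapped-in technique, not necessarily what the comment above does: the sketch below only builds the query URL, and the helper name and the chosen parameter values are assumptions.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def wayback_xml_query(domain, limit=50):
    """Build a CDX query for archived URLs on a domain whose MIME type mentions xml."""
    params = {
        "url": domain,
        "matchType": "domain",         # include subdomains and all paths
        "filter": "mimetype:.*xml.*",  # regex on the recorded content type
        "collapse": "urlkey",          # one row per distinct URL
        "fl": "original,mimetype",     # only return these two fields
        "output": "json",
        "limit": str(limit),
    }
    return CDX_ENDPOINT + "?" + urlencode(params)
```

The resulting URL can be opened in a browser or fetched with urllib.request.urlopen; results depend on what the archive has captured, so an empty response does not prove there is no feed.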


This is 100% a feature that should be in the browser, not a third-party tool. I still use a very old version of Firefox for this. Too bad Mozilla decided auto-discovery wasn't necessary in 2016 and removed it. Then two years later they claimed no one was aware of RSS/Atom feeds and didn't use them (I wonder why?!?). All so they could try to replace it with their profit/adware that is Pocket, and we all know how that went.

>Mozilla is working on alternatives such as Pocket or Reader Mode, and on improving WebExtensions which could provide features related to RSS/Atom feeds without the toll on maintenance. (ref: https://www.ghacks.net/2018/07/25/mozilla-plans-to-remove-rs...)


Interesting. Recently I was trying to subscribe to some blogs, and they didn't have an RSS button on their pages, so I had to inspect the page to find the feed URL. Not sure why anyone would keep an RSS feed but hide it from visitors. It could be that the site expected the feed reader to identify it, but since I was using Thunderbird it did not.


Most feed readers at least find feeds that are linked with a link tag in the <head>, such as <link rel="alternate" type="application/rss+xml" ... />

They're probably expecting people to just paste the website URL into the feed reader and have it identify the feed. But it would be nice to see the RSS URL linked somewhere.


Some of these cases are sites that are built on a CMS that exposes RSS by default, but people don’t consider showing a link/button/whatever in their design.


> Application error: a client-side exception has occurred (see the browser console for more information).

Ok then.

Also, this would make more sense as a browser extension. Especially if it brought back the RSS icon in the address bar to indicate when a feed is available (although maybe you don't want it to do all of the checks until prompted).


Which URL did you try?

Yeah, the checks are quite expansive; depending on the URL it might make more than a hundred requests.

A browser extension would make sense. Guess I have another project :D


100!? I have a tool to find feeds from sites - checks like 4 things.


Well, it must miss many then. My own list is only a start (it omits a few variations, e.g. with 'atom') and already includes:

  .../rss , .../rss.xml , .../.rss , .../rss_full.xml , .../feed , .../rss-feed , .../feed/all/ , .../MySection.xml , .../MySection.atom , feedserver.example.com/section/index


Great idea. I tried it with my personal site (https://matthew.science) and it didn't find any feed. The site admittedly doesn't have any meta tags, but the feed is linked in the footer: https://matthew.science/atom.xml. That was the default feed URL for my SSG. I'd recommend adding it to the common suffix list.


This I must check, it looks standard enough that the tool should've found it. Thanks for the feedback!


Tried the hacker news front page (https://news.ycombinator.com/news) and when clicking on OpenRSS I get this error:

TypeError: URL constructor: is not a valid URL. [NextJS] (5603-cb6f1c5a9761f9d0.js:14:5466)

Browser is Firefox 130.0 on Windows.

Would be really nice to see this working really well since I search for RSS feeds a lot for a bunch of different things. Whether the RSS feed is good is always another question.


I don't get the error on my machine, but there probably is a timing issue somewhere. Thanks for letting me know!


I've been using an NPM package called rss-url-finder [1] in my blog search engine project to find the RSS link. It works relatively well, but still fails sometimes. For now I end up manually searching the HTML source of the page for a .xml or similar link.

[1] https://www.npmjs.com/package/rss-url-finder


FYI it's only finding one (Atom) feed at earth.org.uk, even though there are several feeds, Atom and RSS.

Your method described above should have found at least two feeds I think.


Interesting, I'll check that, thanks for letting me know!


I am very grateful for this actually. I still read RSS and when I find a good news site I tend to spend 15 minutes or more looking for their feed.


Are you opposed to this being used programmatically? I've been working on a site [0] that replays feeds, but the initial step is to first find the feed given a website, and it's not always able to find it. I'd be interested in using your service to try to find the feed when I'm unable to do so.

[0] https://refeed.to


Can you explain what the purpose of replaying a feed is?


My initial use case was for reading content from blogs that had been published before I'd subscribed to their feed. I could visit their site and read their previous posts, but I much prefer the slow drip of an RSS feed. So I created refeed.to to be able to add 1 post per day from the blog to my feed starting from their first post.

Since creating it I also use it to inject a few extra cartoons into my feed (xkcd every day!) and have also had fun with tech flashbacks from trustedreviews.com. So it's just a way to add a little variation to my feed.
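
To make the "slow drip" idea concrete, here is a small sketch of how a replay schedule could be computed. It is only an illustration under stated assumptions, not how refeed.to is implemented; the function name, the tuple layout, and the per_day parameter are all hypothetical.

```python
from datetime import date


def replayed_posts(posts, replay_start, today, per_day=1):
    """Return the posts released so far, oldest first.

    posts is a list of (published_date, title) tuples. Replay begins on
    replay_start with the oldest post and releases per_day posts each day.
    """
    ordered = sorted(posts, key=lambda post: post[0])
    days_elapsed = (today - replay_start).days
    if days_elapsed < 0:
        return []  # replay has not started yet
    released = (days_elapsed + 1) * per_day
    return ordered[:released]
```

With three archived posts and a replay that started yesterday, the two oldest posts would be in the replayed feed today and the third would appear tomorrow.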


Sure, email me at dominik at lighthouseapp.io



This is great, thank you!


Great work! I've stopped using Twitter but I managed to taper from it by following things using RSS feeds drawn from Nitter. Don't know if that still works but could be an idea?


Twitter feeds would definitely be great to have, will check Nitter to see how I can get them. Thanks for the suggestion!


Also add .feed to the common suffixes, for example: https://wiadomosci.onet.pl/.feed


thank you!



Great tool.

I always use RSSHub Radar, and your tool supports more websites than it does.

Detection of /feed could be added; most WordPress-powered sites have this suffix.


Cool. I wrote a script to search google and find sites with rss feeds so I can create a collection on a particular topic.


That's awesome. Is there any specific search text you used to find the feeds? I know Bing has a command to do that but don't know about Google.


Don't forget DDG and Kagi, they might have some tools too.


I tried it on my website, ebookany.com, but it didn't find anything. So sad :(( But your idea is quite interesting.


That's good to know, thank you, it helps me with debugging.


I bet this finds some feeds that sites don't know or have forgotten they even have.


The tool misses reddit rss feeds.


Thanks for the hint, will fix that!


It can't find the Lex Fridman podcast's feed: https://lexfridman.com/


My suggestion: add a way for users of the extension to suggest a feed URL if it doesn't find one.


Cool. I'm a big fan of RSS feeds.

Wondering if it's necessary to continue with the other checks if you find a feed in the meta tags?


Probably not, but I'm trying to find all feeds.

I guess the best option is to show results as soon as they are found, without waiting for everything to complete.
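
One way that "show results as soon as they are found" idea could look, sketched with Python's concurrent.futures. The function name and the shape of a check (a callable that takes the page URL and returns a list of feed URLs) are assumptions for illustration, not the tool's real code.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def stream_feed_results(url, checks, max_workers=8):
    """Run all checks concurrently and yield each new feed URL as soon as
    the check that found it finishes, instead of waiting for every check."""
    seen = set()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(check, url) for check in checks]
        for future in as_completed(futures):
            try:
                found = future.result()
            except Exception:
                # One failing check should not hide results from the others.
                continue
            for feed_url in found:
                if feed_url not in seen:
                    seen.add(feed_url)
                    yield feed_url
```

A caller can loop over stream_feed_results(url, checks) and render each feed immediately; fast checks such as meta tags would typically show up first while slower ones like sitemap parsing are still running.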


[deleted]


That's super interesting, will definitely try it, thank you!


RIP Google Reader




