Hacker News

Prometheus has become ubiquitous for a reason. Exporting metrics on a basic HTTP endpoint for scraping is about as simple as it gets.
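To illustrate just how simple that model is, here is a sketch of a scrape endpoint using nothing but the Python standard library (the metric name `app_requests_total` and the port are made up for illustration; in practice you'd use an official client library):

```python
# Minimal Prometheus-style /metrics endpoint, stdlib only.
# Metric names and the port are illustrative.
from http.server import BaseHTTPRequestHandler, HTTPServer
import threading

REQUESTS_TOTAL = 0  # incremented by the real application


def render_metrics() -> str:
    # Prometheus text exposition format: HELP/TYPE comments, then samples.
    return (
        "# HELP app_requests_total Total requests handled.\n"
        "# TYPE app_requests_total counter\n"
        f"app_requests_total {REQUESTS_TOTAL}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> HTTPServer:
    # Bind the socket, then answer scrapes on a background thread.
    server = HTTPServer(("", port), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Point a `scrape_configs` job at that port and Prometheus does the rest.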

Service discovery adds some complexity, but if you’re operating at any scale that involves dynamically scaling machines, it’s also the simplest model available so far.
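For example, file-based service discovery lets whatever provisions your machines rewrite a targets file, and Prometheus picks up the changes on its own (paths and job names below are illustrative):

```yaml
# prometheus.yml fragment — targets come from files an orchestrator
# (or even a cron job) rewrites as machines come and go.
scrape_configs:
  - job_name: app
    file_sd_configs:
      - files:
          - /etc/prometheus/targets/*.json
        refresh_interval: 1m
```

where a targets file is just `[{"targets": ["10.0.0.5:8000"], "labels": {"env": "prod"}}]`.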

What about it doesn’t work for you?

Edit: I didn’t touch on logging because the post is about metrics. Personally I’ve enjoyed Loki more than ELK/EFK, but it does have tradeoffs. I’d still be interested to hear why it doesn’t work for you, so I can keep that in mind when recommending solutions in the future.




The last time I tried Prometheus was years ago, so I don't know how much might have changed. I gave it a good month or two of effort trying to get the stack to do what I needed and never really succeeded.

Just my opinion, but I honestly don't think the scraping model makes much sense. It requires you to expose extra ports and paths on your servers that the push model doesn't, and I'm not a fan of the extra effort required to keep those ports and paths secure.

Beyond that, PromQL is an extra learning curve that I didn't like. I still ran into disk space issues even when I used a proper data backend (TimescaleDB). Configuring all the scrapers was overly complicated, as was making sure all the collectors and their configuration actually got deployed.
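To give a sense of what that learning curve looks like, these are the kinds of queries PromQL involves (metric names are illustrative):

```promql
# Per-instance request rate over the last 5 minutes
rate(app_requests_total[5m])

# 95th percentile latency from a histogram metric
histogram_quantile(0.95,
  sum by (le) (rate(http_request_duration_seconds_bucket[5m])))
```

Counters only ever go up, so almost everything useful starts with wrapping them in `rate()`, which is not obvious coming from SQL.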

In comparison, deploying Filebeat and Metricbeat is super simple: just template the YAML file with something like Ansible and you're done. Elastic Agent is annoying in that you can't do that when using Fleet, or at least I have yet to figure out how to automate it. But it's still way easier than the Prometheus stack.
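For comparison, a Metricbeat config of the sort that's easy to template out with Ansible looks roughly like this (hosts and module choices are illustrative):

```yaml
# metricbeat.yml — one file, pushed to every host by config management.
metricbeat.modules:
  - module: system
    metricsets: ["cpu", "memory", "filesystem"]
    period: 10s

output.elasticsearch:
  hosts: ["https://es.example.internal:9200"]
```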

I've tried to get Loki to work two or three times and never really succeeded. I think I was able to browse a few log lines during one attempt; I don't think I even got that far in the others. The impression I came away with was that it was designed to be run by people who already have lots of experience with it. Either that, or it just wasn't ready to be used by anyone not actively developing it.

So, yeah, while I figure a lot of people do well with the Prometheus/Grafana/Loki stack, it just isn't for me.


The most basic setup, and the one typically used until you need something more advanced, is using Prometheus itself for scraping and as the TSDB backend. If you ever decide to revisit Prometheus, you'll likely have better luck starting with this approach rather than implementing your own scraping or involving TimescaleDB at all (at least until you have a working monitoring stack).

There used to be a connector called Promscale for sending metrics from Prometheus to TimescaleDB (via Prometheus’ remote_write), but it was deprecated earlier this year.
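On the Prometheus side that wiring was just a remote_write entry; the hostname below is illustrative (Promscale listened for remote writes on port 9201, at a `/write` path, before its deprecation):

```yaml
# prometheus.yml fragment — fork every scraped sample to an external backend.
remote_write:
  - url: "http://promscale.example.internal:9201/write"
```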


Also important to add: using Prometheus itself as the TSDB is good for short-term retention (on the order of days to months). For longer retention you can offload data elsewhere via remote_write, e.g. to another Prometheus-based backend or something SQL-based.
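In that setup the local TSDB just acts as a buffer; you cap it with the retention flag and let remote_write handle the long tail (the 15d value is only an example):

```shell
# Keep ~two weeks locally; everything is also shipped upstream
# by the remote_write section of the config file.
prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.retention.time=15d
```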



