I found the software in this stack to be very bloated and difficult to maintain....

robotmay · on July 21, 2020

I have a bit of a love/hate relationship with Prometheus. At home I really like it; it was simple to set up for my needs, and most of my configuration lives on my server, which then scrapes the other machines for data. However, I find it quite frustrating at scale for work, both in its concepts (it's hard to describe, but it's sort of...backwards?) and in its query performance, although that might be a side effect of using it with Grafana and of me attempting to misuse it. By contrast, I think the concepts of something like TimescaleDB are easier to understand when it comes to scaling and optimising that service.

In my previous job I had a very clear use case for not using Prometheus (it involved devices sending data from behind firewalls across many sites), and for a while I used InfluxDB instead. I found it pretty expensive to scale, and it fell over when it ran out of storage, which feels like something that should have been handled automatically considering it was a PaaS offering.

ddevault · on July 21, 2020

One point of note for SourceHut's Prometheus use is that we generally don't make dashboards. I don't really like Grafana. I will sometimes use gnuplot with styx to plot graphs on an as-needed basis:

https://github.com/go-pluto/styx

This is how I made the plots in that blog post.

robotmay · on July 21, 2020

I have a similar relationship with Grafana as I do with Prometheus: I love it at home, where I've got some very useful graphs for my home network, but it's almost unusable for work because it slows down the moment you start adding more graphs. Again, it's probably due to my lack of knowledge of the Prometheus functions for reducing the amount of data returned, but it would be nice if it could handle some of that automatically rather than just grinding to a halt.
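
One way to reduce the amount of data returned is to cap the resolution of each range query through the `step` parameter of the Prometheus HTTP API (`/api/v1/query_range`). The following is only a rough sketch of that idea: the helper names, the 500-point budget, the 15-second floor, and the example host are all assumptions for illustration, not anything taken from this thread.

```python
import math
from urllib.parse import urlencode


def capped_step(start: float, end: float, max_points: int = 500, min_step: int = 15) -> int:
    """Return a step in whole seconds so a range query yields roughly max_points samples per series.

    max_points and min_step are illustrative defaults (assumptions), not Prometheus settings.
    """
    if end <= start:
        raise ValueError("end must be after start")
    if max_points < 1:
        raise ValueError("max_points must be at least 1")
    # Never go below min_step: a step finer than the scrape interval adds no information.
    return max(min_step, math.ceil((end - start) / max_points))


def query_range_url(base_url: str, query: str, start: int, end: int, max_points: int = 500) -> str:
    """Build a /api/v1/query_range URL whose step keeps the result size bounded."""
    params = urlencode({
        "query": query,
        "start": start,  # Unix timestamp in seconds
        "end": end,      # Unix timestamp in seconds
        "step": capped_step(start, end, max_points),
    })
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


if __name__ == "__main__":
    # prometheus.example is a placeholder host.
    print(query_range_url("http://prometheus.example:9090", "rate(http_requests_total[5m])", 0, 86400))
```

With a fixed point budget, widening the time range makes the step coarser instead of returning more samples, which is the behaviour a dashboard with many panels usually wants.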

dewey · on July 21, 2020

Can't you generate the same kind of graphs you have there with the normal Prometheus query explorer / web ui?

ddevault · on July 21, 2020

On a basic level, yes, but I often just use it as a starting point for more complex gnuplot graphs, or different kinds of visualizations - box plots, histograms, etc.

dewey · on July 21, 2020

I guess the https://github.com/prometheus/pushgateway could help with that? As for the query performance, there are a lot of things you can do with recording rules, which might help a lot with speeding up dashboards or queries.
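
For the devices-behind-firewalls case, pushing to a Pushgateway could look roughly like the sketch below. It uses only the Python standard library and the Pushgateway's documented `/metrics/job/<job>/instance/<instance>` path with a body in the Prometheus text format. The host name, metric names, and helper functions are hypothetical, and the sketch assumes label values that contain no `/`.

```python
import urllib.request
from typing import Mapping
from urllib.parse import quote


def pushgateway_url(base_url: str, job: str, instance: str) -> str:
    """Build the grouping-key URL for one device (assumes no '/' in the label values)."""
    for value in (job, instance):
        if "/" in value:
            raise ValueError("label values containing '/' are not handled in this sketch")
    return (f"{base_url.rstrip('/')}/metrics/job/{quote(job, safe='')}"
            f"/instance/{quote(instance, safe='')}")


def render_metrics(metrics: Mapping[str, float]) -> bytes:
    """Render unlabelled samples in the Prometheus text format, one 'name value' per line."""
    lines = [f"{name} {value}" for name, value in sorted(metrics.items())]
    # The text format requires a trailing newline after the last sample.
    return ("\n".join(lines) + "\n").encode("utf-8")


def push(base_url: str, job: str, instance: str, metrics: Mapping[str, float],
         timeout: float = 5.0) -> int:
    """PUT the samples, replacing everything previously pushed for this job/instance group."""
    request = urllib.request.Request(
        pushgateway_url(base_url, job, instance),
        data=render_metrics(metrics),
        method="PUT",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # pushgateway.example and the metric names are placeholders.
    push("http://pushgateway.example:9091", "site_agent", "device-42",
         {"device_temperature_celsius": 21.5, "device_up": 1})
```

Prometheus then scrapes the Pushgateway like any other target, so the device only needs outbound HTTP access.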

robotmay · on July 21, 2020

Yeah, the pushgateway was the alternative to using InfluxDB. In the end we actually used Datadog for it, despite the cost, as it was just easier to scale on it (we had hundreds of devices per site). The pushgateway route with Prometheus just ended up feeling like there were too many things relying on each other, i.e. Prometheus -> Push Gateway <- Multiple agents on each device is inherently more complex than just connecting directly to a DB/service from the device.

valyala · on July 26, 2020

Try VictoriaMetrics next time. It supports data push via multiple popular data ingestion protocols [1], and it provides a Prometheus-compatible API for Grafana [2].

[1] https://victoriametrics.github.io/#how-to-import-time-series...

[2] https://victoriametrics.github.io/#grafana-setup
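
As an illustration of the push route, here is a small sketch that formats one sample in InfluxDB line protocol and posts it to a `/write` endpoint, which is one of the ingestion paths a single-node VictoriaMetrics is documented to accept. The host, port, measurement, tag, and field names are assumptions made up for the example, and only float fields are handled.

```python
import urllib.request
from typing import Mapping


def _escape(text: str, chars: str) -> str:
    """Backslash-escape the characters that line protocol treats as separators."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def line_protocol(measurement: str, tags: Mapping[str, str], fields: Mapping[str, float]) -> str:
    """Format one point as 'measurement,tag=value field=value' without a timestamp."""
    if not fields:
        raise ValueError("at least one field is required")
    head = _escape(measurement, ", ")
    for key, value in sorted(tags.items()):
        head += f",{_escape(key, ',= ')}={_escape(value, ',= ')}"
    body = ",".join(f"{_escape(key, ',= ')}={float(value)}" for key, value in sorted(fields.items()))
    return f"{head} {body}"


def push_line(base_url: str, line: str, timeout: float = 5.0) -> int:
    """POST one line to <base_url>/write; the server is expected to assign the timestamp."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/write",
        data=(line + "\n").encode("utf-8"),
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # victoriametrics.example:8428 is a placeholder for a single-node instance.
    point = line_protocol("device", {"site": "berlin 1", "id": "42"}, {"temperature": 21.5})
    push_line("http://victoriametrics.example:8428", point)
```

Each device writes directly to the database endpoint here, which avoids the extra gateway hop discussed above.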