I have to use an Icinga instance from time to time (Icinga is a Nagios fork). I really can't see the value, beyond seeing if a service is up or down.
I'm surprised no one has named Zabbix. Zabbix is way better. I haven't had the chance to use Zabbix past 4.something, but it's worth it.
I've been using Prometheus/Grafana, and frankly the value I see is its out-of-the-box adaptability at capturing a mutating data source (example: metrics about ephemeral pods in a Kubernetes cluster).
> I really can't see the value, beyond seeing if a service is up or down.
This is an extreme oversimplification. The value is not in "seeing" if something is "up or down". The value is in the modularity of what a "service" can mean in the first place (anything you can script -- and the ecosystem of plugins is huge), the fact that you don't have to "see" it (because notifications are extremely modular), the fact that issues can be escalated automatically if they are not resolved, and the fact that event handlers can in many cases resolve the issue automatically without an alert ever having to be raised.
Nagios is a monitoring tool built with the UNIX philosophy in mind, and it's ingenious in its simplicity: decide state based on script or binary exit codes, relate dependencies between objects to avoid unnecessary troubleshooting, notify if necessary (again, with scripts/binaries) and/or try to resolve if configured. It hooks into a server frame of mind very well if you're a sysadmin.
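To make the exit-code part concrete, here is a rough sketch of what a check can look like. The only real convention in it is the exit codes (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN) plus a one-line status on stdout. Everything else (the `evaluate` and `check_disk` names, the 80/90 thresholds, checking disk usage at all) is made up for illustration and is not taken from any shipped plugin.

```python
#!/usr/bin/env python3
"""Illustrative Nagios-style check: disk usage of a path (hypothetical example)."""
import shutil
import sys

# Conventional plugin exit codes; the monitoring core derives state from these.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(used_pct, warn=80.0, crit=90.0):
    """Map a usage percentage to (exit_code, status_line).

    The warn/crit defaults are assumptions chosen for this sketch.
    """
    if used_pct >= crit:
        return CRITICAL, f"DISK CRITICAL - {used_pct:.1f}% used"
    if used_pct >= warn:
        return WARNING, f"DISK WARNING - {used_pct:.1f}% used"
    return OK, f"DISK OK - {used_pct:.1f}% used"


def check_disk(path="/"):
    """Measure disk usage for `path`; report UNKNOWN if it cannot be read."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    return evaluate(100.0 * usage.used / usage.total)


if __name__ == "__main__":
    code, message = check_disk(sys.argv[1] if len(sys.argv) > 1 else "/")
    print(message)   # one-line status shown in the UI and in notifications
    sys.exit(code)   # the exit code is what decides the state
```

Anything that can print a line and exit with one of those four codes can be a "service", which is why the same mechanism covers disks, certificates, queue depths, or whatever else you can script.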
Sure, if your main use case is "mutating data sources" and collecting metrics, no Nagios flavor will be for you, because that's not what Nagios is made to do. There's a reason it's extremely popular in large enterprises: it was created for them. No monitoring solution is for everyone or solves every problem.