I have to use an icinga instance from time to time (icinga is a nagios fork). I ...

hnarn · on July 22, 2020

> I really can't see the value, beyond seeing if a service is up or down.

This is an extreme oversimplification. The value is not in "seeing" whether something is "up or down". The value is in how modular the definition of a "service" is in the first place (anything you can script -- and the ecosystem of plugins is huge), the fact that you don't have to "see" it (because notifications are extremely modular), the fact that unresolved issues can be escalated automatically, and the fact that event handlers can in many cases resolve the issue automatically without an alert ever being raised.

Nagios is a monitoring tool built with the UNIX philosophy in mind, and it's ingenious in its simplicity: decide state based on script or binary exit codes, relate dependencies between objects to avoid unnecessary troubleshooting, notify if necessary (again, with scripts/binaries) and/or try to resolve if configured. It hooks into a server frame of mind very well if you're a sysadmin.
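To make the exit-code part concrete, here is a minimal sketch of what a check plugin can look like. The exit codes (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN) are the standard plugin convention; everything else, including the names `classify` and `check_disk` and the 80/90 percent thresholds, is made up for illustration and not taken from any real plugin.

```python
import shutil
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify(used_pct, warn=80.0, crit=90.0):
    """Map a disk usage percentage to (exit code, status line).

    The thresholds are illustrative defaults, not Nagios defaults.
    """
    if used_pct >= crit:
        return CRITICAL, f"DISK CRITICAL - {used_pct:.1f}% used"
    if used_pct >= warn:
        return WARNING, f"DISK WARNING - {used_pct:.1f}% used"
    return OK, f"DISK OK - {used_pct:.1f}% used"


def check_disk(path="/"):
    """Check disk usage for `path`; report UNKNOWN if it cannot be measured."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    if usage.total == 0:
        return UNKNOWN, "DISK UNKNOWN - filesystem reports zero size"
    return classify(100.0 * usage.used / usage.total)


def main():
    """Print one status line and return the exit code for the monitor to read."""
    status, message = check_disk()
    print(message)
    return status

# As a standalone plugin script, the file would end with:
#     sys.exit(main())
```

The monitoring side only cares about the exit code and the first line of output, which is why "anything you can script" can be a service check.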

Sure, if your main use case is "mutating data sources" and collecting metrics, no Nagios flavor will be for you, because that's not what Nagios is made to do. It's extremely popular in large enterprises for a reason: it was created for them. No monitoring solution is for everyone or solves every problem.