I like the idea of asserting "here is what my environment should look like, and how it should behave."
However, what is the advantage of this over a configuration management tool (Puppet, Salt, Chef, Ansible, etc.)? You can make similar operational state assertions with a configuration management system and, better, it will fix any problems it encounters when running.
It doesn't always, though. With Chef, Puppet, et al., you can write the code to stand up X (Apache with SSL, etc.), but unless you use a testing facility, there is no way to know for sure that it worked (other than checking manually, obviously).
You can dictate that a service is running, and it will ensure that the service is running. You can also set up more advanced HTML checks in at least some frameworks[1] to ensure that the contents are being served correctly.
Writing code/recipes/etc. for Chef/Ansible/Puppet is no different from writing code for an application. Just because you think you have dictated (in code) that things should be a certain way does not mean that is what will actually be executed or run.
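To make that concrete, here is a minimal sketch of the kind of independent check being argued for: it asks the running service what it actually serves instead of trusting what the recipe declared. It is written in Python using only the standard library; the function names `evaluate` and `check_url`, and the URL in the usage note, are hypothetical and not part of any tool mentioned in this thread.

```python
import urllib.request


def evaluate(status, body, expected_text):
    """Return a list of failure messages; an empty list means the check passed."""
    failures = []
    if status != 200:
        failures.append(f"expected HTTP 200, got {status}")
    if expected_text not in body:
        failures.append(f"response body does not contain {expected_text!r}")
    return failures


def check_url(url, expected_text, timeout=5):
    """Fetch url and evaluate what the server really returned.

    Connection errors, TLS errors and HTTP error statuses all surface
    from urlopen as OSError subclasses, so they are reported as failures
    instead of crashing the check.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return evaluate(resp.status, body, expected_text)
    except OSError as exc:
        return [f"request failed: {exc}"]
```

Something like `check_url("https://www.example.com/", "Welcome")` (a placeholder URL and marker string) could run right after a configuration management run, with a non-empty result treated as a failed deployment.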
Brilliant post and brilliantly timed for me as I've been thinking about ways to expand my own operations test suite into the realm of web service behavior.
To the folks who say "just use Nagios": I say use the right tool for the job and the right tool for your existing processes. For some environments and tasks Nagios is a great choice. For others a scripted RSpec-style solution is more appropriate.
And for still others (like mine) a combination of a traditional monitoring system with some scripted TDD-style tests (which can also tie into the central monitoring agent, btw) is the right solution.
I'm a little skeptical of it; it seems very complex. In contrast, I'm leveraging the Ruby+RSpec infrastructure which I've already learned. And I interact with it via programming, instead of via configuration.
Also, by using just one language, Ruby, I can easily have full test coverage of the framework code itself. There are a lot of Nagios plugins and scripts out there, but they don't usually seem to include tests of themselves.
But Nagios is very intriguing - I'll check out what a "simple" install would be like.
Nagios is awful :-) Active checks are ultimately a dead end.
Heartbeat style stuff is where it's at.
You should definitely check out Sensu and Riemann (or, heaven forbid, if you really like node.js everything, Godot).
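For anyone unfamiliar with the distinction, a rough sketch of the heartbeat idea: instead of a central poller actively probing each service, services report in on their own schedule and the monitor only flags the ones that have gone quiet. This is an illustration in Python, not how Sensu, Riemann or Godot implement it; the `HeartbeatMonitor` class and its method names are made up for the example.

```python
import time


class HeartbeatMonitor:
    """Track the last time each service reported in and flag silent ones."""

    def __init__(self, max_age_seconds, clock=time.monotonic):
        self.max_age = max_age_seconds
        self.clock = clock      # injectable so the logic can be tested without sleeping
        self.last_seen = {}     # service name -> time of last heartbeat

    def beat(self, service):
        """Called (directly or via some transport) when a service checks in."""
        self.last_seen[service] = self.clock()

    def stale(self):
        """Return the sorted names of services silent for longer than max_age."""
        now = self.clock()
        return sorted(
            name for name, seen in self.last_seen.items()
            if now - seen > self.max_age
        )
```

In this sketch a service would call `beat("web")` every few seconds, and an alert fires only when `stale()` returns a non-empty list, so the monitor never has to reach into the service to probe it.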