I once caused an API server outage with a monitoring tool, though in my case it was a monstrosity of a 20,000-line script. We quickly realized what we had done and turned it off. I have seen that in very large clusters with 1000+ nodes you need to be especially careful about how much API server load your monitoring generates, depending on precisely what you are doing. I'm surprised they hadn't learned this lesson yet, given the likely scale of their workloads.