More likely it's the crash looping of so many VMs overloading some system that has insufficient back pressure, possibly combined with unfortunate cluster-management scheduler behavior at this scale of crash looping (e.g. being too eager to retry scheduling instances, maybe even on new hosts, which causes even more infrastructure load).
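
To make "too eager to retry" concrete, here is a rough sketch of the usual mitigation: capped exponential backoff with jitter on restarts. This is only an illustration; the function names and default values are hypothetical and not taken from any real cluster manager.

```python
import random

def backoff_ceiling(attempt, base=1.0, cap=300.0):
    """Upper bound in seconds for the wait before restart `attempt` (0-based).

    `base` and `cap` are illustrative defaults, not values from a real scheduler.
    """
    # Double the bound on every consecutive crash, but never exceed `cap`.
    return min(cap, base * (2 ** attempt))

def restart_delay(attempt, base=1.0, cap=300.0, rng=random):
    """Pick a randomized delay in [0, ceiling] ("full jitter").

    The randomness spreads out instances that crashed at the same moment,
    so they do not all hit the scheduler again in lockstep.
    """
    return rng.uniform(0.0, backoff_ceiling(attempt, base, cap))

if __name__ == "__main__":
    for attempt in range(6):
        print(attempt, backoff_ceiling(attempt), round(restart_delay(attempt), 2))
```

A scheduler that retries immediately, or with a fixed short delay, keeps offering the same load to whatever is already struggling. Growing and randomizing the delay is one simple way to get some back pressure into the restart path.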