I forgot they said that (I read this when it was posted a couple days ago), thanks for noting that.

But I’d still encourage even a 1s “clean” shutdown. You don’t need to wait for any of this fancy cleanup, but it’s really nice to finish your writes.
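
A minimal Go sketch of what I mean, with flushPendingWrites as a hypothetical stand-in for whatever buffered state your service holds:

    package main

    import (
        "context"
        "log"
        "os"
        "os/signal"
        "syscall"
        "time"
    )

    // flushPendingWrites is hypothetical: fsync open files, commit the
    // WAL, whatever "finish your writes" means for your service.
    func flushPendingWrites(ctx context.Context) error {
        return nil
    }

    func main() {
        sigs := make(chan os.Signal, 1)
        signal.Notify(sigs, syscall.SIGTERM, syscall.SIGINT)

        <-sigs // the power-off warning arrived

        // Hard 1s budget, so the flush can never block the shutdown itself.
        ctx, cancel := context.WithTimeout(context.Background(), time.Second)
        defer cancel()

        if err := flushPendingWrites(ctx); err != nil {
            log.Printf("flush failed: %v", err)
        }
        os.Exit(0)
    }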

Fun story: for Preemptible VMs we started (in Alpha / EAP) with no soft shutdown at all, just an immediate power off, to see whether people could handle it. Turns out that if your box is running apt-get upgrade or anything like that at the time, you can easily corrupt your boot disk. So we went back and forth between “a few seconds” (5, 15) and “about 30, which is how long a GCE instance takes to boot”. That’s how we ended up with 30: we wouldn’t more than double regular instance creation times at the tail. Nowadays we boot to ssh in 15 seconds!
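
If you want to start that soft shutdown as early as you can, you can also block on the metadata server’s preempted flag instead of waiting for the OS shutdown signal. A rough Go sketch (a hanging GET against the documented instance/preempted key; error handling trimmed):

    package main

    import (
        "io"
        "log"
        "net/http"
    )

    func main() {
        // Hangs until the value changes; returns "TRUE" on preemption.
        url := "http://metadata.google.internal/computeMetadata/v1/instance/preempted?wait_for_change=true"
        req, err := http.NewRequest("GET", url, nil)
        if err != nil {
            log.Fatal(err)
        }
        req.Header.Set("Metadata-Flavor", "Google") // required by the metadata server

        resp, err := http.DefaultClient.Do(req)
        if err != nil {
            log.Fatal(err)
        }
        defer resp.Body.Close()

        body, _ := io.ReadAll(resp.Body)
        if string(body) == "TRUE" {
            log.Print("preemption notice received; start the clean shutdown")
        }
    }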

If you don’t reuse your state, none of this matters. But I’d guess that even just getting to the point of RST’ing the connections is valuable (so that clients know to take action, rather than waiting a while).
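
In Go, for instance, RST-on-close is just SO_LINGER with a zero timeout. A small sketch (abortConn is a made-up helper name):

    package main

    import (
        "log"
        "net"
    )

    // abortConn resets a TCP connection: SetLinger(0) makes Close()
    // discard unsent data and send a RST instead of the normal FIN
    // handshake, so the peer errors out immediately.
    func abortConn(conn net.Conn) {
        if tcp, ok := conn.(*net.TCPConn); ok {
            if err := tcp.SetLinger(0); err != nil {
                log.Printf("SetLinger: %v", err)
            }
        }
        conn.Close()
    }

    func main() {
        // Demo: dial somewhere, then immediately reset the connection.
        conn, err := net.Dial("tcp", "example.com:80")
        if err != nil {
            log.Fatal(err)
        }
        abortConn(conn)
    }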


We had shutdown hooks on our preemptible VMs, but we often had cases (at least weekly) where it looked like they failed to run (the node failed to unregister from the cluster). Any explanation?

Do you mean on GKE or directly on GCE? It sounds like you mean GKE (“failed to unregister from cluster”).

We’re looking to fix up the GKE graceful node shutdown, because it’s currently “racy” and doesn’t actually respect the grace period properly (system pods / processes can be shut down without waiting for user pods, causing you to lose logging or, say, the kubelet).

Yes, GKE containers, sorry about the confusion... sometimes it looks like a node has disappeared without doing much shutdown work.