We hit this at a past gig too. One of the big services had a memory leak, but it was deployed every 24 hours, which hid the problem. When the holiday deploy freeze hit, the pods lived much longer than normal and caused an OOM storm.
At first I thought maybe we should add a "hack" to cycle all the pods over 24 hours old, but then I wondered if making holiday freezes behave like normal weeks was really a hack at all or just reasonable predictability.
In the end, though, folks managed to fix the leak and we never resolved the philosophical question.