> So you’re in the on-call rotation, now what? Make pages approach zero. You can do it. Trust me, you can make pages trend towards zero. Many have done it on teams at the highest-scale companies in existence.
How about people own up to their mistakes instead of relying on whichever random Joe happens to be on call to fix them? I have been burned too many times by write-only code written by some coworker who wants nothing to do with it. This is not just an on-call issue, btw.
> Apologizing to angry customers
This is a terrible idea. Just because a customer is angry doesn't mean you should apologize. SaaS customers are angry all the time because some third-party integration is returning a 5xx error.
How about people own up to their mistakes instead of relying on whichever random Joe happens to be on call to fix them? I have been burned too many times by write-only code written by some coworker who wants nothing to do with it. This is not just an on-call issue, btw.
Pardon my French, but
Jesus fucking Christ, thank you.
I was being woken up constantly at my last job due to a bug in the application, because I was the DevOps guy, and this is exactly how I was treated. I spent weeks finding contributing causes of the frequent failures. I gathered all the evidence I could find and begged for a fix. I scheduled calls with the product team and pored over documentation and diagrams with them: “this is the problem, this is inherent to the way these requests are being made, until it’s fixed we have no choice but to throttle the app”. I recorded a ticket each time I got paged and linked it back to the team who owned the feature.
What was infuriating in the end was finally being told that they knew about the problem long before I brought it up and even HOW to fix it; it was just consistently being deprioritized by people who had the means to influence the decision on “fix this” or “write new shit”. Eventually I just suppressed the application in PagerDuty and stopped waking up whenever it would fail.
The SLA slipped, customers complained, and I found myself in a meeting being asked very aggressively “what’s being done about this and why are you ignoring PagerDuty?”
I said I wasn’t ignoring PagerDuty and presented a log of 20-something tickets I had created, each linking back to the original “Please fix this” request. I told my leadership “I’m not the one ignoring this problem, I have been trying to get this addressed and prioritized for months”.
I was laid off a month later. It’s a growing part of why I want out of Ops and never want to return to it: companies hire Ops people and treat us like the kitchen sink for work the company is too lazy to actually address and properly prioritize through staffing, planning or both.