We are dogmatic and emotional, but the temptation to base your opinions on the "deeper theory" is large.
Pragmatically, restart the service periodically and spend your time on more pressing matters.
On the other hand, we fully understand the reason for the fault, but we don't know exactly where the fault is. And it is, our fault. It takes a certain kind of discipline to say "there are many things I understand but don't have the time to master now, let's leave it."
"certain kind" of discipline, indeed... not the good kind. and while your comment goes to great pains to highlight how that particular God is dead (and i agree, for the record), the God of Quality (the one that Pirsig goes to great lengths to not really define) toward which the engineer's heart of heart prays that lives within us all is... unimpressed, to say the least.
Sure, you worship the God of Quality until you realize that the memory leak is being caused by a third-party library (extra annoying when you could have solved it yourself) or a quirky stdlib implementation.
Then you realize it's a paper idol and the best you can do is suck less than the average.
> "certain kind" of discipline, indeed... not the good kind.
Not OP, but isn't this a fairly normal case of making a tradeoff? They aren't able to repair it at the moment (or rather don't want to, or can't, allocate the time for it) and instead trade their resource usage for stability and technical debt.
> Pragmatically, restart the service periodically and spend your time on more pressing matters.
> On the other hand, we fully understand the reason for the fault, but we don't know exactly where the fault is. And it is, our fault. It takes a certain kind of discipline to say "there are many things I understand but don't have the time to master now, let's leave it."
It's, mostly, embarrassing.