The real key here is to understand Netflix’s business, and also many social media companies too.
These companies have achieved vast scale because correctness doesn’t matter that much so long as it is “good enough” for a large enough statistical population, and their Devops practices and coding practices have evolved with this as a key factor.
It is not uncommon at all for Netflix or Hulu or Facebook or Instagram to throw an error or do something bone headed. When it happens you shrug and try again.
Now imagine if this was applied to credit card payments systems, or your ATM network, or similar. The reality of course is that some financial systems do operate this way, but it’s recognized as a problem and usually gets on people’s radar to fix as failed transaction rates creep up and it starts costing money directly or clients.
“Just randomly kill shit” is perfectly fine in the Netflix world. In other domains, not so much (but again it can and will be used as an emergency measure!).
These companies have achieved vast scale because correctness doesn’t matter that much so long as it is “good enough” for a large enough statistical population, and their Devops practices and coding practices have evolved with this as a key factor.
It is not uncommon at all for Netflix or Hulu or Facebook or Instagram to throw an error or do something bone headed. When it happens you shrug and try again.
Now imagine if this was applied to credit card payments systems, or your ATM network, or similar. The reality of course is that some financial systems do operate this way, but it’s recognized as a problem and usually gets on people’s radar to fix as failed transaction rates creep up and it starts costing money directly or clients.
“Just randomly kill shit” is perfectly fine in the Netflix world. In other domains, not so much (but again it can and will be used as an emergency measure!).