
Another commenter said that this change was a malformed configuration that crashed the application. If that's the case, you wouldn't need days to see the problem manifest, only a few minutes. If they had rolled it out to 1% of their customers and waited a couple of hours before releasing it everywhere, they probably would have caught it.
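The gating logic itself is not complicated. Here's a minimal sketch of the idea (purely hypothetical, not CrowdStrike's actual pipeline): push to a small cohort, wait out a bake period, check crash telemetry, then widen. push_update() and crash_rate() are illustrative placeholders, not any real vendor API.

    import time

    STAGES = [0.01, 0.10, 0.50, 1.00]   # fraction of the fleet per stage
    BAKE_SECONDS = 2 * 60 * 60          # "a couple of hours" per stage
    MAX_CRASH_RATE = 0.001              # abort if more than 0.1% of hosts crash

    def push_update(update_id, fraction):
        """Placeholder: deliver the update to `fraction` of the fleet."""
        print(f"pushing {update_id} to {fraction:.0%} of hosts")

    def crash_rate(update_id):
        """Placeholder: query crash/boot-loop telemetry for hosts on this update."""
        return 0.0

    def staged_rollout(update_id):
        for fraction in STAGES:
            push_update(update_id, fraction)
            time.sleep(BAKE_SECONDS)        # give problems time to surface
            rate = crash_rate(update_id)
            if rate > MAX_CRASH_RATE:
                print(f"halting rollout of {update_id}: crash rate {rate:.2%}")
                return False                # nobody past this stage gets the update
        return True

A crash-on-load bug like this one would trip the gate at the first stage, since the failure is immediate; even a much shorter bake time would have caught it.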


A couple of hours is a long time in the world of automated attacks


It only takes a couple of minutes if you first push the update to your own on-site set of LIVE systems, kept running precisely to catch problems like this.

If a problem is encountered, don't send the update out to everyone else.


A couple of hours is absolutely nothing compared to the massive worldwide effort that so many people now have to put in to clean up after one company's shitty product and release practices.

This is inexcusable, point blank. “A couple of hours is a long time” is not a valid excuse when the alternative, as clearly evidenced, is millions of computers and critical systems simultaneously failing hard.

This might have been different if it had affected only a small subset of computers, but this clearly could have been caught in minutes with any sort of sensible testing or canary rollout practice.


I'm guessing they didn't expect content updates to cause such an impact; they've been doing this for 15 years, and failures like this are that uncommon. A couple of hours in their world is a long time because their concern is protecting customers as soon as possible. I'm sure they'll do all kinds of tests going forward and be transparent about it. Keep in mind how easy it is for you or me to jump to conclusions without knowing the context they operate in; maybe it will become clearer soon enough.


Then they should make their testing pipelines even faster, and make sure they can go from detecting a new threat to a tested definition file as quickly as possible. You genuinely cannot skimp on testing in this case; it's inherent to the update. Both threat protection and not breaking their customers' systems should be non-negotiable for a release. That means testing before deploying. If they can't do that fast enough, their product is broken.
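A gate like that doesn't have to be slow. A minimal sketch, assuming a hypothetical harness binary ("sensor_harness") that loads a content file through the same parsing path as the production agent:

    import subprocess
    import sys

    # Hypothetical pre-publish smoke test: a malformed definition file that
    # crashes or hangs the harness fails the gate in seconds, before it is
    # ever pushed to customer machines. "sensor_harness" is an illustrative
    # placeholder, not a real tool.

    def smoke_test(definition_path, timeout_s=60):
        try:
            result = subprocess.run(
                ["sensor_harness", "--load-content", definition_path],
                capture_output=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            print("harness hung while loading the file")
            return False
        if result.returncode != 0:
            print(f"harness crashed or rejected the file: exit code {result.returncode}")
            return False
        return True

    if __name__ == "__main__":
        if not smoke_test(sys.argv[1]):
            sys.exit(1)  # block the release pipeline

A minute or two of this per definition file is entirely compatible with "protect customers as soon as possible"; pushing an untested file to the whole fleet is not.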


An automated attack would struggle to reach the level of destruction this failure caused, given the scale of CrowdStrike's deployment, the direct update vector, and the kernel-mode failure. Even with the most critical type of remote vulnerability it would be difficult to achieve anything approaching this level of damage, and for all we know (and in all probability) this update was addressing a much less severe threat.


Not as long as the weeks it's going to take to undo this.



