Because an essential enterprise security application was /able/ to bring down an entire OS like this. The issue is that Microsoft doesn't provide an interface that lets an application run in user space and still get the functionality it requires.
Linux has eBPF which can provide most of the capability that Crowdstrike needs, by using an "in-kernel verifier which performs static code analysis and rejects programs which crash, hang or otherwise interfere with the kernel negatively". If MS had this functionality, it is likely this incident would not have happened.
That said, in my personal experience on Linux, it's been an extremely long time since a bad kernel module rendered a system entirely FUBAR.
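To make the quoted "static code analysis" idea concrete, here is a toy sketch of a load-time check. It is not how the real eBPF verifier works; the three-instruction language, `BUF_SIZE`, and `verify` are all made up for illustration. It only shows the principle: inspect the program before running it, and refuse anything that could read out of bounds or loop forever.

```python
from typing import List, Tuple

# Hypothetical instruction forms: ("load", offset), ("jmp", target), ("exit",)
Instr = Tuple

BUF_SIZE = 16  # assumed size of the only memory region a program may read


def verify(program: List[Instr]) -> List[str]:
    """Return the reasons to reject a program; an empty list means accept."""
    errors = []
    if not program or program[-1][0] != "exit":
        errors.append("program must end with exit")
    for pc, instr in enumerate(program):
        op = instr[0]
        if op == "load":
            offset = instr[1]
            # Reject reads outside the permitted buffer (would "crash").
            if not 0 <= offset < BUF_SIZE:
                errors.append(f"pc {pc}: load offset {offset} out of bounds")
        elif op == "jmp":
            target = instr[1]
            # Only forward jumps are allowed, so every program terminates.
            if target <= pc:
                errors.append(f"pc {pc}: backward jump to {target} (possible hang)")
            elif target >= len(program):
                errors.append(f"pc {pc}: jump target {target} outside program")
        elif op != "exit":
            errors.append(f"pc {pc}: unknown instruction {op!r}")
    return errors


good = [("load", 0), ("jmp", 2), ("exit",)]
hang = [("load", 0), ("jmp", 0), ("exit",)]   # jumps back to itself forever
crash = [("load", 64), ("exit",)]             # reads past the buffer

for name, prog in [("good", good), ("hang", hang), ("crash", crash)]:
    print(name, verify(prog) or "accepted")
```

The point of the design is that rejection happens before anything executes, so a bad program costs you a failed load rather than a dead machine.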
> Linux has eBPF which can provide most of the capability that Crowdstrike needs, by using an "in-kernel verifier which performs static code analysis and rejects programs which crash, hang or otherwise interfere with the kernel negatively". If MS had this functionality, it is likely this incident would not have happened.
It didn't stop Linux machines from going down, so it is clearly not as easy as you put it. The reality is that writing software is hard, yet devs often trivialise it to their own detriment.
The issue I am raising is /design/, not /development/. The current model of an unconstrained, unforgiving, highly privileged execution space is a bad design, and that is what eBPF tries to address.
It is a different issue[0]. The Linux issue from April was a Linux kernel bug[1] that CS Falcon happened to trigger. The design of using eBPF is sound, but the implementation on the kernel side had a bug.
Also, CS Falcon didn't support RHEL 9.4 (only up to 9.3), so for this specific bug you highlighted, CS should not be held accountable for regression testing, because it was a platform they did not support.
With Windows, the current design is poor because it offers no way to run this kind of code safely. Most recently, it appears MS is blaming the EU for forcing them to create an interface for services such as CS to run[2]. Rather than lean into the problem and create a good design, they didn't create security boundaries, which puts the entire system at risk.
Bugs happen, and Linux will continue to harden and become more resilient - but unless MS focuses on secure design in this area, things like this will continue to happen (same as they have with AV before).
>Not sure what questions Microsoft have to answer.
The only thing I could think of is that if it was a driver update, the driver has to be "WHQL" signed. WHQL stands for "Windows Hardware Quality Labs" -- what quality are they ensuring? (spoiler alert from my time at Microsoft: it's not terribly robust :p )
It's not realistic for Microsoft to test drivers in a manner that represents real-world usage, but perhaps they need to start doing some basic "it works with whatever integrated agent/etc is required" testing as a requirement for signing a driver.
If it was a user-mode update? Yeah, no real fault on Microsoft here.
From what I heard, Crowdstrike just updated their DB file, which means the bug was already there, waiting for someone to trigger it with a "low risk" quick rollout.
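A rough sketch of how a data-only update can trip a bug that shipped long ago. Everything here is hypothetical (the comma-separated record layout, `EXPECTED_FIELDS`, both function names) and says nothing about the real file format or parser. It only shows the pattern: code that trusts the shape of its input works fine until a new data file breaks that assumption.

```python
EXPECTED_FIELDS = 3  # hypothetical: what the already-shipped parser assumes


def parse_unchecked(line: str) -> str:
    """Trusts the data file: fine until a record has too few fields."""
    fields = line.split(",")
    return fields[EXPECTED_FIELDS - 1]  # raises IndexError on a short record


def parse_checked(line: str):
    """Validates the record shape first and rejects bad input."""
    fields = line.split(",")
    if len(fields) != EXPECTED_FIELDS:
        return None  # skip the record instead of faulting
    return fields[EXPECTED_FIELDS - 1]


old_record = "a,b,c"   # every earlier data file looked like this
new_record = "a,b"     # the "low risk" update changes the shape

print(parse_unchecked(old_record))   # the latent bug stays hidden
print(parse_checked(new_record))     # rejected safely
try:
    parse_unchecked(new_record)
except IndexError as exc:
    print("latent bug triggered:", exc)
```

In Python the failure is a catchable exception. In native kernel-mode code the analogous mistake is an invalid memory read, which is not something the rest of the system can shrug off.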
You're confusing the Crowdstrike issue with Azure being down. Microsoft is ultimately responsible for anything regarding Azure, even if it was a vendor that did something wrong, because they choose their vendors.
I guess the only question they could answer is why they don't provide a framework like Apple do with Endpoint Security for third-party vendors to use.