- Delta Air Lines CEO Ed Bastian said the massive IT outage earlier this month that stranded thousands of customers will cost it $500 million.
- The airline canceled more than 4,000 flights in the wake of the outage, which was caused by a botched CrowdStrike software update and took thousands of Microsoft systems around the world offline.
- Bastian, speaking from Paris, told CNBC’s “Squawk Box” on Wednesday that the carrier would seek damages from the disruptions, adding, “We have no choice.”
It was a CrowdStrike-triggered issue that only affected Microsoft Windows machines. CrowdStrike on Linux didn’t have issues, and Windows without CrowdStrike didn’t have issues. It’s appropriate to refer to it as a Microsoft-CrowdStrike outage.
I guess Microsoft-CrowdStrike is fair, since the OS doesn’t have any kind of protection against a shitty antivirus destroying it.
I keep seeing articles that just say “Microsoft outage”, even on major outlets like CNN.
To be clear, an operating system in an enterprise environment should have mechanisms to access and modify core system functions. Guard-railing anything that could cause an outage like this would make Microsoft a monopoly provider in any service category that requires this kind of access to work (antivirus, auditing, etc.). That is arguably worse than incompetent IT departments hiring incompetent vendors to install malware across their fleets, resulting in mass downtime.
The key takeaway here isn’t that Microsoft should change Windows to prevent this; it’s that Delta could have spent any number smaller than $500,000,000 on competent IT staffing and prevented this at a lower cost than letting it happen.
I guarantee someone in their IT department raised the point of not just downloading updates. I can guarantee they advised testing them first, because any borderline competent IT professional knows this stuff. I can also guarantee they were ignored.
Also, part of the issue is that the update rolled out in a way that bypassed the auto-update settings, so even deployments with auto-updates disabled received it.
You did not have the ability to disable this type of update or control how it rolled out.
https://www.crowdstrike.com/blog/falcon-content-update-preliminary-post-incident-report/
Their fix for the issue includes “slow rolling their updates”, “monitoring the updates”, “letting customers decide if they want to receive updates”, and “telling customers about the updates”.
Delta could have done everything by the book regarding staggered updates and testing before deployment, and it wouldn’t have made any difference at all. (They’re an airline, so they probably didn’t, but it wouldn’t have helped if they had.)