Is This the Biggest IT Outage in History?


Many of us were looking forward to a relaxed Friday on July 19, 2024, but a major incident struck that day. About 8.5 million Windows systems were affected worldwide, displaying the infamous ‘Blue Screen of Death.’ Reports indicate that the disruption was caused by a routine update from CrowdStrike, a cybersecurity software company.

From airlines and banking systems to IT operations and government offices, the outage on Windows machines caused chaos for companies around the world. However, some reports stated that home PCs were largely unaffected, as CrowdStrike’s products are used mainly by large corporations that need strong protection against cyberattacks.

 

Who is CrowdStrike?

CrowdStrike is a global security company that offers a leading cloud detection and response (CDR) product for stopping breaches, and it releases new security configurations multiple times a day. The company delivers cloud solutions to many organizations and has a minimum licensing requirement of 1,000 users, which means it is not generally used by individual home users.

The incident, which hit global airlines, banks, and other major companies, shows that however far technology advances, we still have much to learn. CrowdStrike operates at a deep level within Windows. Interestingly, Mac and Linux systems were not affected, for reasons explained later in the article.

 

What Led to the Outage?

CrowdStrike’s Falcon platform is designed to safeguard against cyberattacks. Unfortunately, the company released a faulty update that disrupted millions of Windows PCs and servers for hours at a stretch. Systems running Falcon sensor for Windows version 7.11 and above that were online and downloaded the updated configuration between 04:09 UTC and 05:27 UTC on Friday, July 19, 2024, became susceptible to a system crash.
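The exposure criteria above can be expressed as a small check. This is only a sketch: the function and constant names are hypothetical, the window is assumed to be inclusive at both ends, and it assumes you already know when a host pulled its configuration, which in practice would come from your own inventory or logging data.

```python
from datetime import datetime, timezone

# Impact window and minimum sensor version as stated in the article.
WINDOW_START = datetime(2024, 7, 19, 4, 9, tzinfo=timezone.utc)
WINDOW_END = datetime(2024, 7, 19, 5, 27, tzinfo=timezone.utc)
MIN_AFFECTED_VERSION = (7, 11)


def parse_version(text):
    """Turn a version string such as '7.11.18110' into (major, minor)."""
    return tuple(int(part) for part in text.split(".")[:2])


def was_exposed(sensor_version, config_downloaded_at):
    """Return True if a Windows host matches the stated exposure criteria.

    Both names are hypothetical. The timestamp must be timezone-aware so
    it can be compared against the UTC window.
    """
    if config_downloaded_at.tzinfo is None:
        raise ValueError("config_downloaded_at must be timezone-aware")
    in_window = WINDOW_START <= config_downloaded_at <= WINDOW_END
    return in_window and parse_version(sensor_version) >= MIN_AFFECTED_VERSION
```

A host on sensor 7.11 that downloaded the configuration at 04:30 UTC would be flagged, while the same host downloading at 06:00 UTC, or an older sensor inside the window, would not.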

The problem was not directly a Microsoft error but a flaw in CrowdStrike Falcon, specifically in a sensor configuration update. These configuration updates (the equivalent of virus signature files in antivirus software terms) are released regularly, often multiple times a day, to keep threat protection current. The faulty update was contained in what CrowdStrike calls a “channel file,” which delivers configuration updates for behavioral protections. “Channel File 291” triggered a logic error that crashed Windows. The issue has since been identified, and a corrected update has been released.

 

What are the Recovery Steps?

Because the outage was so widespread, affected systems are taking time to recover. In some cases, BitLocker recovery keys have to be looked up one by one, and on domain controllers the DSRM (Directory Services Restore Mode) password must be reset before administrators can log in locally and fix the issue, all of which takes time.

While CrowdStrike has issued an update to fix the software that led to millions of Blue Screen of Death errors, not all machines can receive that fix automatically. Some IT admins have reported that rebooting a PC multiple times eventually lets it pull down the necessary update, but for others the only route is to boot manually into Safe Mode and delete the problematic CrowdStrike update file.

Microsoft released a new recovery tool that simplifies this process. The tool boots into its Windows PE environment via USB, accesses the disk of the affected machine, and automatically deletes the problematic CrowdStrike file to allow the machine to boot properly. 
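The "find and delete the problematic file" step can be sketched as follows. This is not Microsoft's tool or CrowdStrike's script. The directory and file name pattern are assumptions taken from CrowdStrike's public remediation guidance rather than from this article, the function names are hypothetical, and the drive letter may differ when a disk is mounted from a recovery environment. The sketch defaults to a dry run that only lists matches.

```python
from pathlib import Path

# Assumed location and pattern (from public remediation guidance, not this
# article). Verify both against current vendor instructions before use.
DEFAULT_DRIVER_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")
CHANNEL_FILE_PATTERN = "C-00000291*.sys"


def find_channel_291_files(driver_dir=DEFAULT_DRIVER_DIR):
    """Return the Channel File 291 candidates found in driver_dir."""
    driver_dir = Path(driver_dir)
    if not driver_dir.is_dir():
        return []
    return sorted(driver_dir.glob(CHANNEL_FILE_PATTERN))


def remove_channel_291_files(driver_dir=DEFAULT_DRIVER_DIR, dry_run=True):
    """List matching files, deleting them only when dry_run is False."""
    matches = find_channel_291_files(driver_dir)
    for path in matches:
        if not dry_run:
            path.unlink()
    return matches
```

Calling remove_channel_291_files() with no arguments only reports what would be removed; passing dry_run=False performs the deletion. Keeping the destructive step opt-in is a deliberate choice for anything run against system directories.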

This is similar to recovering from a power blackout, where multiple choke points and knock-on issues keep everything from returning to normal at once. Full recovery is taking days.

 

Why Apple and Linux Kept Running Smoothly

CrowdStrike software runs not just on Microsoft Windows but also on Apple’s macOS and on Linux. However, the outage only hit Microsoft Windows.

 

Why is that?

The disruption was caused by a faulty sensor configuration update. Windows allows third-party security software such as Falcon to run with kernel-level access, whereas Apple restricts that kind of access as part of its “walled garden” approach. The Channel File 291 update did not roll out to macOS or Linux systems because it specifically deals with named pipe execution on Microsoft Windows. In addition, the Falcon sensor integrates with Windows differently than it does with macOS and Linux, which use different integration points designed to lower risk.

 

A New Hacking Trick Emerges Amid CrowdStrike Glitch

A file posing as a ‘CrowdStrike fix’ is reportedly circulating online. It is loaded with malware that gives hackers full remote access to a target’s computer, allowing them to steal personal information and data. Few details are available yet, but it is being treated as a serious threat.

To tackle the outage, CrowdStrike said they’re actively working on the issue, which only hit Windows users. “This isn’t a security incident or cyberattack. The issue has been identified, isolated, and a fix has been deployed. Our team is fully mobilized to ensure the stability and security of CrowdStrike customers,” said the company CEO.

It’ll be interesting to see how this plays out in the coming weeks and months. We might even see a House or Senate hearing on this soon.