Passengers gather and wait at Düsseldorf Airport on July 19, 2024, in Düsseldorf, Germany, during the global outage caused by an update from CrowdStrike, a cybersecurity company whose software runs on Microsoft Windows systems.
Security experts believe that the routine update of CrowdStrike’s widely used cybersecurity software, which caused a global system crash on Friday, likely did not undergo adequate quality checks before it was deployed. The latest version of the Falcon Sensor software was intended to make CrowdStrike clients’ systems more secure by updating their threat defenses. Instead, it caused one of the most extensive tech outages in recent history for companies using Microsoft’s Windows operating system.
Extensive Disruptions
The outage disrupted operations at global banks, airlines, hospitals, and government offices. CrowdStrike quickly released instructions for fixing the affected systems, but experts warned that recovery would take time because the flawed code has to be removed manually.
Steve Cobb, Chief Security Officer at Security Scorecard, commented, “What it looks like is, potentially, the vetting or the sandboxing they do when they look at code, maybe somehow this file was not included in that or slipped through.” Security Scorecard was itself affected by the issue.
Immediate Fallout
Problems surfaced soon after the update rolled out on Friday, with users posting images on social media of their computers displaying the “blue screen of death” error. Patrick Wardle, a security researcher specializing in operating system threats, analyzed the code and identified the source of the outage.
“The update’s problem was in a file that contains either configuration information or signatures,” Wardle explained. These signatures are used to detect specific types of malicious code or malware.
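To illustrate what a signature is in this sense, here is a minimal sketch of byte-pattern matching. It is not CrowdStrike’s file format or detection logic, which the experts quoted here do not describe; the signature names, the byte patterns, and the scan function are all hypothetical.

```python
from typing import Dict, List

# Hypothetical signature set: a name mapped to a byte pattern that marks
# a sample as suspicious. Real products use far richer rule formats.
SIGNATURES: Dict[str, bytes] = {
    "Example.TestPattern.A": b"\xde\xad\xbe\xef",
    "Example.TestPattern.B": b"EVIL_MARKER",
}


def scan(data: bytes, signatures: Dict[str, bytes]) -> List[str]:
    """Return the sorted names of all signatures whose pattern occurs in data."""
    matches = []
    for name, pattern in signatures.items():
        # Skip empty patterns: an empty byte string is "found" in any input
        # and would flag every file.
        if pattern and pattern in data:
            matches.append(name)
    return sorted(matches)
```

Because new malware appears constantly, a table like SIGNATURES has to be refreshed often, which is why signature files ship far more frequently than the software that reads them.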
Frequent Updates and Potential Oversights
Wardle noted that the frequency of these updates, which are intended to keep clients protected from the latest threats, might explain why this one was not tested more thoroughly. “It’s very common that security products update their signatures, like once a day… because they’re continually monitoring for new malware and want to make sure their customers are protected,” he said.
John Hammond, Principal Security Researcher at Huntress Labs, emphasized the importance of gradual rollouts. “Ideally, this would have been rolled out to a limited pool first,” he said, suggesting that such an approach could have prevented the widespread disruption.
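The kind of limited-pool rollout Hammond describes can be sketched as follows. This is an illustration under stated assumptions, not a description of how CrowdStrike or any other vendor ships updates: the function names, the stage percentages, and the crash-rate threshold are all hypothetical.

```python
import hashlib
from typing import Sequence


def rollout_bucket(host_id: str, buckets: int = 100) -> int:
    """Map a host ID to a stable bucket in the range [0, buckets)."""
    digest = hashlib.sha256(host_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def should_receive_update(host_id: str, percent: int) -> bool:
    """True if this host falls inside the first `percent` percent of hosts."""
    return rollout_bucket(host_id) < percent


def next_stage(
    current_percent: int,
    crash_rate: float,
    max_crash_rate: float = 0.001,            # hypothetical threshold
    stages: Sequence[int] = (1, 10, 50, 100),  # hypothetical stage sizes
) -> int:
    """Return the next rollout percentage, or 0 to halt the rollout."""
    if crash_rate > max_crash_rate:
        return 0  # too many crashes in the current pool: stop shipping
    for stage in stages:
        if stage > current_percent:
            return stage
    return stages[-1]  # already fully rolled out
```

Hashing the host ID keeps each machine in the same bucket from one stage to the next, so a host that received the update at 1 percent is still included at 10 percent. A faulty file that crashes the first pool would stop the rollout before it reached the remaining machines.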
Historical Context and Broader Impact
Similar incidents have occurred in the past. For instance, a buggy antivirus update from McAfee in 2010 stalled hundreds of thousands of computers. The global reach of the CrowdStrike outage, however, reflects the company’s market dominance. Over half of Fortune 500 companies and various government bodies, including the Cybersecurity and Infrastructure Security Agency (CISA), rely on CrowdStrike’s software.
The incident serves as a stark reminder of the critical importance of rigorous quality checks in cybersecurity updates to prevent large-scale disruptions. As businesses and government agencies work to recover, the focus will likely shift to ensuring such oversights do not happen again.