How to improve technology outage preparedness for your business.
Last Friday, a widespread technology outage impacted thousands of businesses in different industries around the world. This event highlights the critical importance of business technology outage preparedness. Airlines, hospitals, train networks, and TV stations that use Microsoft operating systems were all impacted.
That left healthcare facilities closed, retail stores unable to accept payments, travelers stranded in airports, and workers unable to log in to affected systems. Many 911 emergency lines in the United States were also impacted, leaving dispatchers to manually assign calls.
Kaiser Permanente, a healthcare system with 12.6 million members across the U.S., reported that all of its hospitals’ systems were impacted. Financial institutions like JPMorgan Chase couldn’t process transactions since bankers were prevented from logging into systems. And TD Bank reported thousands of complaints from consumers who couldn’t access their online accounts.
Some problems were resolved quickly on Friday, while outages in certain industries extended into the weekend. CrowdStrike, the software company responsible for the outage, said it could be days or even weeks until other businesses recover.
How did the outage happen?
Disruptions began rippling across the globe in the early morning hours after a flawed software update from CrowdStrike. Faulty code was automatically pushed to PCs running one of the cybersecurity firm’s programs, which are used by more than half of all Fortune 500 companies.
The faulty code caused affected machines to get stuck in an endless loop of reboots, knocking them completely offline and displaying only the dreaded “blue screen of death.” Many computers couldn’t even run their basic Windows operating systems, making the issue difficult to fix remotely. Mac and Linux systems were mostly unaffected.
CrowdStrike took responsibility for the software bug and insisted it was not the result of a cyberattack. For its part, Microsoft kept mostly quiet, pointing to CrowdStrike’s statements on the outage and offering technical guidance for backing up operating systems.
When will the outage be fixed?
In many cases, affected machines had to be fixed individually by a person manually removing the faulty code from CrowdStrike’s software update. Until the company finds a way to automate that time-consuming, resource-intensive process, cybersecurity experts worry that this particular outage won’t end easily.
Many cybersecurity experts said lackluster digital resiliency and insufficient recovery protocols made the outage exponentially worse. As The New York Times reported, the outage was caused by “purely human error — a few bad keystrokes that demonstrated the fragility of a vast set of interconnected networks in which one mistake can cause a cascade of unintended consequences.”
What can businesses do to be better prepared?
A hybrid approach is critical to mitigate the damage of a global outage like last week’s. Many big businesses use both digital and analog elements to ensure continuity and resilience when problems occur. In the healthcare industry, Johns Hopkins and Cleveland Clinic both maintain electronic health records (EHRs) and paper backups for critical patient information.
Many airlines integrate manual procedures alongside digital systems for check-in and security screenings—and those that had experience reverting to paper-based backups were able to keep passengers moving last week when computer systems went down. In the retail sector, Walmart and Target both use hybrid systems for inventory management and point-of-sale operations so that revenue can continue to come in, even in the absence of electricity and Internet connectivity.
Other guardrails can help enhance digital resiliency. The World Economic Forum’s 2024 Global Risks Report recommends developing robust policies and procedures that can stress test critical systems. This kind of proactive approach can help businesses better prepare for and mitigate the impacts of tech outages.
Here are a few strategies that CMIT Solutions recommends to protect information, strengthen cybersecurity protocols, and stay safe in an increasingly unstable digital age:
- Train employees to respond to different types of cyber risks. Many of the most common cybersecurity problems occur due to human error: inadvertently clicking on a malicious website link, accidentally opening an infected attachment, or supplying private information to a scammer posing as a coworker or professional peer. But last week’s tech outage demonstrates that it’s equally important to educate employees on what to do if systems go down. Simulations of tech outages or ransomware attacks can empower employees to keep up their good work, even in the most stressful environments.
- Execute regular, remote, and redundant data backups. This IT service addresses one of the most significant digital risks: losing access to your data. Many of the worst cyberattacks are successful simply because companies don’t have access to extra copies of relevant data and feel they have to do whatever possible to retrieve stolen or encrypted information. With reliable backups executed regularly and stored remotely, your company can survive malicious attacks, hardware failures, natural disasters, and even broad tech outages. The investment is worth it, too, as even one or two days of significant data downtime can affect employee productivity and your company’s bottom line.
- Practice restoration and virtualization procedures. Many business owners think that if an outage strikes, they’ll only be affected for a few hours. But without virtualization and restoration strategies deployed in advance, your business could be knocked offline for days or even weeks. Virtualization takes the data you have backed up remotely and rebuilds it on existing or secondary equipment in case of disaster. You should test your solution thoroughly to see how quickly it can retrieve information and get you back up and running. Best-in-class offerings from trusted IT providers should be able to perform a full restore in less than 48 hours—and, if needed, a quick restore in less than 24 hours.
- Protect every device used by every employee. Remote work is now the norm, so laptops, tablets, smartphones, routers, and hard drives are constantly in transit and often left unprotected. Hackers will often target a company’s least-protected machine and try to infiltrate that first, exploiting obvious vulnerabilities. As a business owner or manager, you’re responsible for securing the devices used by remote and hybrid employees, protecting company data everywhere it lives, and constantly looking for vulnerabilities that can expose a company’s entire network.
- Work with someone who understands your business. An experienced IT provider like CMIT Solutions can recommend the right steps to solve today’s disruptions while planning for tomorrow’s success. We can deploy software updates behind the scenes and be ready to step up with critical support if something goes wrong. We can help prevent sophisticated ransomware attacks and streamline legacy machines so that employees aren’t bogged down by tech hiccups. We can deliver IT support that’s affordable and reliable in the short term while enhancing digital resilience in the future.
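For teams that want to see what the backup and restoration advice above could look like in practice, here is a minimal Python sketch of a restore drill. It is an illustration under stated assumptions, not a production tool: the function names, the sample file, and the use of local temporary folders in place of real remote storage are all hypothetical, and the time targets simply reuse the 48-hour and 24-hour figures mentioned in the list.

```python
import hashlib
import shutil
import tempfile
import time
from pathlib import Path

# Assumed targets, reusing the figures quoted in the list above.
FULL_RESTORE_TARGET_S = 48 * 3600
QUICK_RESTORE_TARGET_S = 24 * 3600


def checksum_tree(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            digests[path.relative_to(root).as_posix()] = digest
    return digests


def back_up(source: Path, destinations: list) -> None:
    """Copy the source folder to every destination so redundant copies exist."""
    for dest in destinations:
        shutil.copytree(source, dest, dirs_exist_ok=True)


def restore_drill(backup: Path, restore_to: Path, expected: dict,
                  target_seconds: float) -> dict:
    """Restore one backup copy, time it, and verify the restored files."""
    started = time.monotonic()
    shutil.copytree(backup, restore_to, dirs_exist_ok=True)
    elapsed = time.monotonic() - started
    return {
        "elapsed_seconds": elapsed,
        "intact": checksum_tree(restore_to) == expected,
        "within_target": elapsed <= target_seconds,
    }


def run_demo() -> dict:
    """Back up a tiny sample folder twice, then restore from the second copy."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        source = base / "live"
        source.mkdir()
        # Hypothetical sample data standing in for real business files.
        (source / "customers.csv").write_text("id,name\n1,Example\n")
        expected = checksum_tree(source)

        # Local folders stand in for remote, redundant backup locations.
        copies = [base / "backup_a", base / "backup_b"]
        back_up(source, copies)

        return restore_drill(copies[1], base / "restored", expected,
                             QUICK_RESTORE_TARGET_S)


if __name__ == "__main__":
    print(run_demo())
```

The drill reports three things: how long the restore took, whether every restored file matches the original checksums, and whether the restore finished inside the chosen target. Running a check like this on a schedule, against real backup locations and realistic data volumes, is one way to find out how long recovery actually takes before an outage forces the question.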
In short, CMIT Solutions can do it all. We can implement cybersecurity guidelines that ensure your company is in compliance with industry standards. We can enhance overall digital resilience to protect you from tech outages. We can adapt the right plan for your business, applying the latest best practices while monitoring digital threats.
That’s the benefit of working with a trusted IT provider: you get industry expertise and elite customer service wrapped in a reliable, affordable package. Need help bouncing back from an outage or preparing for the next IT challenge? Contact CMIT Solutions today for expert guidance. We defend your data, empower your employees, and protect your systems from digital risk.