A glitch in Microsoft Windows systems, triggered by an update from its security partner CrowdStrike, recently brought down computer systems around the world. While we noticed airports and banks going down, many businesses, big and small, were impacted and incurred substantial losses. It was a short-lived digital version of what we saw during the Covid pandemic.
Analysis-paralysis dominated the screens that were still working. Some blamed negligence on the part of the company, while others worked out a conspiracy theory in a jiffy.
This could have been a simple glitch, either intentional or unintentional. Most probably it was unintentional. You see, it is difficult to predict software behaviour during upgrades and updates, especially when many independent pieces are interacting with each other.
Software engineers do try to test for as many scenarios as possible, but being exhaustive is close to impossible. At the same time, if a bug is to be inserted intentionally, it can easily be done by tweaking a few things. This is how the whole virus industry was born, which in turn gave birth to the anti-virus industry.
Conspiracy theorists called it sabotage. I am yet to understand by whom and against whom. Is someone trying to hurt the businesses impacted or the one providing the services? Competition can play spoilsport, but I do not see that as the case here. The only conspiracy I can think of is an attempt to sell some new cybersecurity product by occasionally creating an alarming situation like this.
This, to my mind, is the biggest possibility given that many industries thrive on a bit of fear. Among the ‘fear players’, the insurance industry comes to my mind right on top.
It could well be the insurance industry trying to build a new vertical in cybersecurity insurance. Well, it already exists for big corporations, but they are still not selling it to common users like you and me. We may well be on their radar. Remember, Berkshire Hathaway and Warren Buffett have started talking about cybersecurity risks at their annual investor meets.
I would call this breakdown a ‘digital accident’. As we ride digital highways, some of these accidents are bound to happen. Some would be due to either negligence by a naïve user or a lack of diligence at the technology provider’s end.
Many such digital accidents would take place simply due to situations no one could foresee. Most systems fail at such exception points. Yes, there are blind spots on digital e-ways too, which are not so easily visible or predictable.
At a personal level, we all have stories of files or precious pictures lost to a crashed hard disk or a phone that decided to reset itself. At a business level, these stories mostly stay within the organisation, as they expose both business risks and weak links within its systems.
These risks have not been much spoken about since the Y2K days when the focus on technology risk was at a peak. The Indian IT industry owes its widespread reach to this bug that never really materialised. Those were the days when manual alternatives were still available, which is not the case now.
There are questions the CrowdStrike incident raises. Should we have a few big companies owning a piece of every other business via technology? Should we not have many small companies so that the risks are distributed and losses are limited? To an extent, it also raises the question of how much of your technology should be outsourced.
Going overboard with technology just because it is available is another syndrome that we need to deal with. Forcing a QR code when a simple paper menu can work, or an automated entry pass when a security person is standing there doing nothing, are things that we should let go of.
We need to move from overuse to optimal use of technology. For example, the check-in process that used to be a single-step act before digitisation is now a five-to-six-step process: web check-in at home, printing boarding passes at the airport, printing luggage tags, struggling at various automated entry points when QR code readers fail, standing in a queue to drop the luggage, and wasting time while people struggle to find their boarding passes on their phones. One simple queue would have kept things simple for everyone. A clear case of technology overuse.
Employees not trained in manual processes are a bigger risk in such scenarios. No one has any clue what to do if the system stops working. How many times have we heard ‘Server not working’ as an excuse for transactions not going through? There should be a system to keep the business running, potentially in a skeleton mode, even if all the systems come to a standstill.
This would demand a high level of integrity and accountability at the human resources level, which is questionable with the constant churn organisations face these days. Trusting technology more than people is also a trend that feeds this risk, as people simply shrug their shoulders and say, “It’s not me, it’s the system.”
Someone once asked: if a magnetic tsunami erased all data from the digital world, could the world still function? While most of us may think of it as an impossibility, pauses are very much a reality that we need to live with, as the CrowdStrike incident has shown us. Organisations and users need to be prepared for these pauses caused by digital accidents. We probably need some technology-independent solutions as redundancy or alternatives.
Think about it while taking your backup.
Anuradha Goyal
Author and founder of IndiTales.com
Follow her on X @anuradhagoyal