Software Outages and How to Avoid Them

Christian Brink Frederiksen

CEO of Leapwork

We’ve seen the implications of huge software failures in the news. Companies are racing to digitize in order to unlock customer value and gain a competitive advantage. Under that pressure, more of them will struggle with failures and outages in the years to come.

Software outages happen when a component of a system fails. The result is poor performance, lost revenue, and damage to brand reputation. Outages and bugs are also fueling a blame game in QA, where both CEOs and testers fear losing their jobs.

 


High-profile software outages

High-profile software outages have brought huge companies to their knees; Facebook’s October 2021 outage is a prime example.

It resulted in a 5% drop in share price, wiping billions off the company’s market capitalization. According to Facebook, a bug in a tool that was supposed to identify and block commands that could accidentally take systems offline allowed one such command to be issued.

This major outage has had a ripple effect on how companies perceive risk and on the actions they take to avoid it. For one, it has made CEOs more aware of the need for adequate testing.


The sheer cost of poor-quality software should be enough to alert CEOs and QA teams to the need for action: it continues to cost US organizations over $2 trillion annually. This is partly due to a culture of patching software after release instead of testing it thoroughly beforehand.

 


It isn’t only Facebook that has suffered major software outages in recent years. In 2017, British Airways experienced an almost total system failure when a power supply was switched off and critical servers were damaged by a power surge. Hundreds of flights were canceled, and around £80 million was lost.

In another example, Fastly, one of the largest content delivery networks on the internet, experienced an outage that affected major companies using its services. Amazon, which relies on this critical internet infrastructure for its traffic, lost $32 million in sales during the outage. The outage began when a customer made a routine configuration change that triggered a bug, causing 85% of the network to return errors.

When it comes to software testing, companies are cutting corners and taking unnecessary risks. What’s worse, there is a disconnect between upper management and those tasked with carrying out the work.

Why is this happening? 

While the C-suite might be aware of the consequences of releasing poorly tested software, Leapwork’s 2022 Risk Radar Report showed how much poorly tested software is still being released.

In the dilemma of speed versus risk, businesses don’t believe there is a realistic, cost-effective option. This mindset leads to underinvestment in test automation and a reliance on manual testing, which is itself a cause of faulty software and outages. Manual testing is repetitive, time-consuming, and prone to human error.


An important factor in releasing new software is having an adequate testing system, one that catches errors before the software is released to market. To get there, companies need to move from manual testing towards automation. As things stand, businesses are struggling both to test increasingly complex software and to scale their solutions.

The problem is that scripted test automation (automation that requires coding) is extremely difficult to maintain. It also requires developer resources, which are both expensive and often hard to source in today’s market. All this means that scripted automation is hard to adopt and scale successfully.
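To make "scripted test automation" concrete, here is a minimal sketch in Python of what a hand-coded regression check can look like. Everything in it is hypothetical and chosen only for illustration: the `order_total` business rule, its parameters, and the test cases are assumptions, not taken from any real product or from the report cited above.

```python
# Hypothetical business rule, invented for this illustration only.
def order_total(unit_price: float, quantity: int, discount_rate: float = 0.0) -> float:
    """Return the order total after applying a fractional discount."""
    if quantity < 0:
        raise ValueError("quantity must not be negative")
    if not 0.0 <= discount_rate <= 1.0:
        raise ValueError("discount_rate must be between 0 and 1")
    return round(unit_price * quantity * (1.0 - discount_rate), 2)


def run_regression_checks() -> list:
    """Run every scripted check and return a description of each failure."""
    # Each case: (name, arguments for order_total, expected result).
    # These expectations are written and maintained by hand.
    cases = [
        ("no discount", (10.0, 3, 0.0), 30.0),
        ("ten percent discount", (10.0, 3, 0.1), 27.0),
        ("zero quantity", (10.0, 0, 0.0), 0.0),
    ]
    failures = []
    for name, args, expected in cases:
        actual = order_total(*args)
        if actual != expected:
            failures.append(f"{name}: expected {expected}, got {actual}")
    return failures


if __name__ == "__main__":
    failures = run_regression_checks()
    print("all checks passed" if not failures else "\n".join(failures))
```

Even in a sketch this small, every change to the pricing rule means someone who can code has to update the cases by hand, which hints at the maintenance burden described above.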

 


A combination of a lack of qualified resources and testing bottlenecks means that there is less time for QA, which in turn results in insufficiently tested software. With scripted testing, those who know your business best will remain separated from the testing process (unless they learn how to code). 

In other words, their expertise is a resource that isn’t being used to its full potential.

Preventing bugs and outages 

Reducing the risk that bugs and outages bring requires a mindset shift. Companies need to look at things differently and adopt an automation strategy that doesn’t require coding resources.

A visual, no-code approach to test automation brings business people into the QA process and allows them to be involved in testing. This narrows the skills gap in QA and frees teams up to focus on high-value tasks.

If coders aren’t stuck automating their regression tests by writing code, they can spend time on exploratory testing. In other words, they can start planning ahead and reducing the risk of broken software. A thorough approach with more exploratory testing has one significant outcome: less risk. This is why no-code test automation can help to reduce risk.

It takes a new mindset to cause real change: first, extending the execution of test automation to the many instead of the few, and second, enabling those without programming skills to use their domain knowledge and expertise to build automation. Only then will your organization be able to unlock the full potential of test automation, helping you reduce risk, cut costs, accelerate time to market, and safeguard your brand reputation.

 
