Are digital systems fit for purpose?


The world is becoming increasingly dependent on digital services. In this connected world, we ask: are these services fit for purpose? In this blog we give evidence that they are not. We use a metaphor that we introduced in Radix on 23rd November 2023, Fractured Backbones, to help understand the underlying causes, and we suggest some ways to mitigate the lack of resilience.

Evidence

Capita provides software and IT services to governments and to private companies. In March 2023 the details of more than half a million members of the UK’s private sector pension schemes were hacked. Separately, in May, details of local council benefit payments were exposed¹.

Between April and July 2023, all three major cloud providers suffered regional outages. The largest AWS region (us-east-1) was severely degraded for 3 hours, affecting 104 services; Fortnite matchmaking and McDonald’s and Burger King food orders all stopped working. A Google Cloud region (europe-west9) went offline for about a day. Azure’s West Europe region partially went down for about 8 hours due to a major storm in the Netherlands.

Blackbaud specialises in financial, fundraising and admin software for educational institutions and non-profits. In 2020 the data of over 20 UK universities and non-profits, including the National Trust, was hacked. Data on Labour Party donors was also taken. In 2023 the company reached an agreement to pay $49.5m to resolve claims that it had violated state and federal laws. The Information Commissioner’s Office in the UK also reprimanded it.

Meta’s Facebook and Instagram services were down on 5th March 2024. The outage lasted more than two hours and affected hundreds of thousands of users globally. It was probably caused by an issue with a backend service such as authentication: at the time it was suggested that backup data had been corrupted, which made it essential to close the platform completely in order to restart it.

The cyber-attack on the British Library on 31st October 2023 led to a leak of employee data and left the library’s website down until January 2024, making it impossible for library readers globally to locate or order materials. The Rhysida ransomware group, which claimed to be behind the attack, shared an image on the web showing documents that appear to be employment contracts and passports.

Something is clearly broken. After this level of disruption in physical services, accident reports are published and follow-up actions are introduced to avoid repeat occurrences. Why does this not happen for digital services? What can organisations do to protect themselves?

Fractured Backbones

We (the authors) use the concept of Fractured Backbones to explore how to improve the resilience of digital services, and to provide thoughts on what organisations can do in the meantime.

Mitigation

The UK’s National Preparedness Commission’s report highlighted the growing importance of software in the economy and society. Further, awareness of software fragility, and of the impact on the economy and society of losing digital services, is not yet widely shared. Increasing this awareness is a step towards restoring Fractured Backbones.

Wider awareness could lead to measuring the impact of outages of digital services. A BCS and BCI RoundTable suggested that governments could take a lead by sharing data on the impact of service outages and data breaches in public sector services. The US Underwriters Lab supports the AI Incident Database, a non-profit organisation and website that tracks the different ways AI technology goes wrong. The website has catalogued over 600 unique automation and AI-related incidents so far.

Measurement of the impact of service outages and their effect on productivity could underpin targeted investment.

What could governments and organisations do to mend the Fractured Backbone threatening their delivery of digital services?

The BCS and BCI RoundTable examined this question and suggested that an approach initially defined for financial services provides a roadmap to improve resilience. It requires organisations to define their most important business services, to test those services under a wide range of potential failure conditions, and to resolve all potential sources of outages. This approach builds shared accountability across the organisation.

In the meantime, governments and organisations can adapt to the knowledge that digital services are subject to failure by planning for manual backup services that can cope with minimal disruption to the customer. The connected world includes people – their role in resilience is easy to underestimate.

¹ Financial Times “Capita shares slide 22% after higher than expected loss”, 7th March 2024.

A version of this first appeared on 29 April 2024 on the Long Finance Pamphleteers blog page


