Both of us – the co-authors – spent many years designing, coding and testing software. We have a visceral understanding of what can go wrong, even before the human element – the user – is added to the mix. And then we have the nightmare of finding that the data used by the system we relied on was unavailable, faulty, or incomplete.
Any system depends on having an algorithm which is appropriate and complete; an accurate implementation; and reliable data.
An example is a system for making soup. The algorithm is the recipe: hopefully, it has been tested. The implementation could be by an experienced cook, or, if a robot is to make the soup, a much more detailed set of instructions is needed.
The soup will also depend on the data – the ingredients: all of them must be available and of good quality.
Software is the same. An algorithm is designed to fulfil the system’s purpose. The algorithm may be imperfect because it fails to cover some conditions or does not reflect the real world. It is executed in code, which may be an imperfect implementation. The outputs depend on the data fed to the software. Problems with the quality of data are so widespread that the term “GIGO” – garbage in, garbage out – is well known.
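To make this three-way dependency concrete, here is a minimal Python sketch. The scenario, function and numbers are ours, purely for illustration: the algorithm is sound and the code is correct, yet a single faulty data record turns the output into garbage.

```python
def average_daily_cases(daily_counts):
    """A correct algorithm, correctly implemented:
    the arithmetic mean of daily case counts."""
    return sum(daily_counts) / len(daily_counts)

# Reliable data gives a sensible answer.
good_data = [3, 5, 4, 6, 2]
print(average_daily_cases(good_data))   # 4.0

# One faulty record (a data-entry error) and the same correct
# code produces nonsense: garbage in, garbage out.
bad_data = [3, 5, 4, 600, 2]            # 600 was meant to be 6
print(average_daily_cases(bad_data))    # 122.8
```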
Might this illuminate the utility of the software model used to influence government policy during the early stages of the covid-19 pandemic? Its algorithms were based on assumptions about the rate and type of infection, and about the threat the virus posed to human life. These assumptions turned out to be false.
The implementation in code was suspect, according to a number of observers.[2] The available input data was incomplete and misleading: during the early stages of the pandemic, people with suspected symptoms were told not to call their GPs or the NHS helpline unless they needed an ambulance.
As a result, for instance, the official statistics showed three cases in West Berkshire while one of us was personally aware of eight people with all the symptoms. Hence, government scientists and politicians were relying on misleading projections based on inadequate algorithms, questionable code, and data that did not reflect the real world.
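The effect of such under-reporting on projections is easy to illustrate. The sketch below is ours alone – a toy exponential-growth calculation with invented parameters, not the model the government actually used – but it shows how starting from a count of three rather than eight changes the projected picture.

```python
def project_cases(initial_cases, doubling_time_days, horizon_days):
    """Project case numbers assuming unchecked exponential growth."""
    return initial_cases * 2 ** (horizon_days / doubling_time_days)

official_count = 3   # cases in the official statistics
observed_count = 8   # symptomatic people one of us was personally aware of

for label, start in [("official", official_count), ("observed", observed_count)]:
    projected = project_cases(start, doubling_time_days=3, horizon_days=30)
    print(f"{label}: {start} cases today -> about {projected:.0f} in 30 days")
# official: 3 cases today -> about 3072 in 30 days
# observed: 8 cases today -> about 8192 in 30 days
```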
As our dependence on IT increases – accelerated by covid-19 – so does the number of stories of software failures harming people and costing lives.
Countless examples from airlines, banks, Facebook, and across the commercial and government spectrum suggest that software engineering standards[3] for development (design and coding) and testing are widely ignored where they exist. Software is a new discipline, and as the British Computer Society notes, published standards cover some parts of the software life cycle, but not all.[4]
It could be argued that the lack of application of software engineering standards will be remedied over time through market forces, such as insurance rates that rise for organisations with a poor track record of software failure.
Many AI-assisted systems are currently subject to human verification – for instance, the auto-correct feature in text messages, teaching assistants, customer service robots, and autonomous vehicles.
The next generation of software systems is likely to outstrip the ability of humans to check the results in detail. AI systems are increasingly being used in circumstances where there is no human ability to check the logic or to query the outcomes.
One of the fundamental challenges of machine learning is that the models depend on data supplied by humans. Unfortunately, that data is likely to have been selected according to biases; and the algorithms are developed and implemented by people, who have in-built biases of their own. So, systems for:[5]
- Evaluation of CVs as part of a recruitment process
- Criteria for getting credit or a mortgage
- Court sentences[6]
are problematic. Any built-in errors in these systems cannot readily be checked, as the toy sketch below illustrates.[5]
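The following is a deliberately tiny, invented example – not any real recruitment product – of how bias baked into historical decisions is simply automated by a model that learns from them.

```python
# Historical CV decisions: (years_experience, attended_university_x, hired).
# Past recruiters favoured graduates of a particular university, so the
# training data is biased from the outset.
history = [
    (5, True,  True),
    (2, True,  True),
    (7, False, False),
    (9, False, False),
    (4, True,  True),
    (6, False, False),
]

def learn_rule(examples):
    """'Learn' the historical hiring rate for each group of candidates."""
    rates = {}
    for group in (True, False):
        outcomes = [hired for _, uni_x, hired in examples if uni_x == group]
        rates[group] = sum(outcomes) / len(outcomes)
    return rates

def screen_cv(years_experience, attended_university_x, rates):
    """Recommend an interview if the learned hiring rate for the group is high.
    Note that experience is ignored entirely: group membership dominates."""
    return rates[attended_university_x] >= 0.5

rates = learn_rule(history)
# A highly experienced candidate from the 'wrong' university is screened out:
print(screen_cv(10, False, rates))  # False - the historical bias is now automated
print(screen_cv(1, True, rates))    # True
```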
***
So, yes, we do think that there is a fly in the soup.[1] In fact, it is a huge and dangerous insect. We think that software is a problem flying just under the radar, ready to fall into the soup, leaving devastation in its wake. It could crash our planet.
As we continue to depend on systems that are faulty, that we do not “understand”, and that are based on incomplete or faulty data and assumptions, the danger increases.
[1] A translation of a Dutch phrase meaning the fly in the ointment
[2] https://www.nationalreview.com/corner/professor-lockdown-modeler-resigns-in-disgrace/
[3] https://www.cio.com/article/2437864/process-improvement-capability-maturity-model-integration-cmmi-definition-and-solutions.html
[4] https://www.bcs.org/membership/member-communities/software-testing-specialist-group/the-tester/newsletter-archive/standards-for-software-testing/
[5] https://builtin.com/artificial-intelligence/examples-ai-in-industry
[6] https://mindmatters.ai/2020/01/ai-in-the-courtroom-will-a-robot-sentence-you/
[7] https://www.distilnfo.com/lifesciences/2019/03/26/how-ai-will-go-out-of-control-according-to-52-experts/