Making AI stand the test of time

Opinion All the worries and fears about AI boil down to one: how do we know how well it’s working?

The kind of benchmark that IT normally worries about isn’t without importance: how fast a particular data set is learned, how quickly prompts can be processed, what resources are required, and how it all scales. If you’re creating an AI system as part of your business, you’d better get those things right, or at least understand their bounds.

They don’t much matter otherwise, although you can be sure marketing and pundits will disagree. The fact they don’t matter is a good thing: benchmarks too often become targets that distort function, and they should be kept firmly in their kennels.

The most important benchmark for AI is how truthful it is or, more usefully, how little it is misrepresented by those who sell or use it. As Pontius Pilate was said to have said 2,000 years ago, what is truth? There is no benchmark. Despite the intervening millennia and countless claims to have fixed this, there still isn’t. The most egregious of liars can command the support of nations in the midst of what should be a golden age of reason. If nobody’s prepared or able to stop them, what chance have we got of keeping AI on the side of the angels?

The one mechanism that’s in with a chance is that curious synthesis of regulatory bodies and judicial systems which exists – in theory – outside politics but inside democratic control. Regulators set standards; the courts act as a backstop to those powers and as adjudicators of disputes.

If you look at the debate about AI within the judicial system, you’ll find a lot of nuance. Like all professions, law is made up of humans who’d like to keep their jobs. They’d also like to do those jobs better. This means reducing the gap, of which they are acutely aware, between those who need justice and those who can access it. The prospect of AI here is seen as potentially immensely beneficial – provided it is transparent, accessible, trustworthy, and appealable.

This is true for AI elsewhere, of course, and as the legal profession has spent more time worrying about truth in human affairs than anyone else, it behoves us all to keep a close eye on what they think. After all, they had code a long time before COBOL.

Which takes us to the regulators. It should be that the more technical and measurable the field being regulated, the easier the regulator’s job is. If you’re managing the radio spectrum or the railways, something going wrong shows up quickly in the numbers. Financial regulators, operating in the miasma of capital economics and corporate misdirection, go through a cycle of being weakened in the name of strong growth until everything falls apart and a grand reset follows. Wince and repeat. Yet very technical regulators can go wrong too, as with the FAA and Boeing’s 737 MAX. Regulatory capture, by their industries or by politicians, is a constant threat. And sometimes we just can’t tell – GDPR has been with us for five years. Is it working?

It is, alas, impossible to create a market that is regulated in two different ways and compare the results. The EU has had years of GDPR: there is no EU of the same period that hasn’t. Control groups, one of the foundations of empiricism, aren’t easily found in regulation, and scarcely more so in the legal system. The jury system tests a case through the presence of multiple independent minds, while the final backstop of the supreme courts sees the hardest cases likewise presented to a panel. There’s no parallel, different system. How could there be?

It is here that the nature of AI may hint at a regulatory foothold in responsibly integrating machines with the affairs of humanity. There is not just one AI, there are multitudes of models, of hardware platforms, of approaches, of experiments. They are machines, and we can make as many as we need. Ultimate truth is ultimately unknowable, but a workable consensus is achievable – or even a workable majority.

If you have a critical task where an AI is involved and there’s no way to immediately spot a blooper, get another one in parallel. And another. Compare answers. If you can’t find enough independent AI for a particular problem, don’t use AI until you can.
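As a sketch of that discipline – and it is only a sketch, with every name in it assumed for illustration – here is how a consensus check over several independently built models might look in Python. Each entry in models is taken to be a callable wrapping a genuinely independent system and returning a comparable answer string, and min_agreement is a placeholder you would set for the task at hand.

```python
from collections import Counter

def consensus_answer(models, prompt, min_agreement):
    """Put the same prompt to several independent models and accept
    an answer only when at least `min_agreement` of them concur.

    `models` is assumed to be a list of callables, each wrapping a
    genuinely independent system and returning a normalised answer
    string. All names here are illustrative, not a real API.
    """
    answers = [model(prompt) for model in models]
    best, votes = Counter(answers).most_common(1)[0]
    if votes >= min_agreement:
        return best
    # No workable majority: refuse rather than guess, and surface
    # the disagreement for a human to adjudicate.
    raise ValueError(f"No consensus among {len(models)} models: {dict(Counter(answers))}")
```

The interesting part is the failure path: when the panel cannot produce a workable majority, the function refuses to answer rather than guessing, much as a jury that cannot reach a verdict declines to convict.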

Redundancy is a powerful weapon against error. Apollo got to the Moon not because the systems were perfect, but because they had redundancy in place in the expectation of failure. The Soviet manned lunar effort eschewed that in favor of what looked like expediency, but ended in ignominy.

We don’t have to trust AI, which is just as well – what is truth? It knows no more than we do. But we have fashioned a workable society around systems that trust and verify, and we see this working in judges as well as jet planes. The philosophy, potential, and pitfalls of our increasingly thoughtful companions will work out over time; we should just set the odds in our favor while we can. Whatever we do, we won’t be able to wash our hands of it. ®
