Why Generative AI’s Lack Of Modularity Means It Can’t Be Meaningfully Open, Is Unreliable, And Is A Technological Dead End

Techdirt. 2024-12-04

One of the most important shifts in computing over the last few decades has been the increasing use of open source software on nearly every platform, from cloud computing to smartphones (well, I would say that). For the distributed development methodology pioneered by Linus Torvalds with Linux to work, modularity is key. It allows coders anywhere in the world, connected by the Internet, to work independently on self-contained elements that can be easily upgraded or even replaced, without a major redesign of the overall architecture. Modularity brings with it many other important benefits, including these noted by Eerke Boiten, Professor of Cyber Security at De Montfort University Leicester, in an article published on the British Computer Society Web site:

parts can be engineered (and verified) separately and hence in parallel, and reused in the form of modules, libraries and the like in a ‘black box’ way, with re-users being able to rely on any verification outcomes of the component and only needing to know their interfaces and their behaviour at an abstract level. Reuse of components not only provides increased confidence through multiple and diverse use, but also saves costs.
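To make that concrete, here is a minimal sketch of the black-box reuse Boiten describes, using conventional software (the component and its names are my own illustration, not taken from his article): a small, self-contained function with an explicit interface that can be verified on its own and then upgraded or replaced without touching anything else.

```python
# Illustrative only: a self-contained component with an explicit interface.

def normalise_email(address: str) -> str:
    """Return a canonical form of an email address (trimmed, domain lower-cased)."""
    if "@" not in address:
        raise ValueError("not an email address")
    local, domain = address.rsplit("@", 1)
    return f"{local.strip()}@{domain.strip().lower()}"


# A re-user only needs the interface above; the component can be verified in
# isolation, independently of any larger system that will eventually use it.
def test_normalise_email():
    assert normalise_email("  Alice@Example.COM ") == "Alice@example.com"
    assert normalise_email("bob@techdirt.com") == "bob@techdirt.com"


if __name__ == "__main__":
    test_normalise_email()
    print("component verified on its own")
```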

Unfortunately, today’s hot generative AI systems enjoy none of those advantages:

Current AI systems have no internal structure that relates meaningfully to their functionality. They cannot be developed, or reused, as components. There can be no separation of concerns or piecewise development. A related issue is that most current AI systems do not create explicit models of knowledge — in fact, many of these systems developed from techniques in image analysis, where humans have been notably unable to create knowledge models for computers to use, and all learning is by example (‘I know it when I see it’). This has multiple consequences for development and verification.

Current generative AI systems are not modular, which is one reason why today’s “open” AI tools are nothing of the kind, as a recent article in Nature explores in detail. Moreover, their monolithic nature leads to some serious problems when it comes to testing them, as Boiten explains:

The only verification that is possible is of the system in its entirety; if there are no handles for generating confidence in the system during its development, we have to put all our eggs in the basket of post-hoc verification. Unfortunately, that is severely hampered, following from the issues listed above:

Current AI systems have input and state spaces too large for exhaustive testing.

A correct output on a test of a stochastic system only evidences that the system has the capability to respond correctly to this input, but not that it will do this always or frequently enough.

Lacking components, current AI systems do not allow verification by parts (unit testing, integration testing, etc).

As the entire system is involved in every computation, there are no meaningful notions of coverage to gain confidence from non-exhaustive whole system testing.

Today’s leading AI systems are inherently unreliable because of their “stochastic” design, meaning they can produce different outputs for the same input. Worse, they can’t even be tested in a useful way to establish exactly how unreliable they are. For Boiten there is only one conclusion to be drawn: “all this puts even state-of-the-art current AI systems in a position where professional responsibility dictates the avoidance of them in any serious application.” Moreover, he says: “current generative AI systems represent a dead end, where exponential increases of training data and effort will give us modest increases in impressive plausibility but no foundational increase in reliability.”
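Boiten’s point about stochastic outputs is easy to demonstrate with a toy example. In the sketch below, a random stub stands in for a generative model (it has nothing to do with any real system): a single correct answer only shows that the system can get this input right, and estimating how often it does takes many samples, which still says nothing about any other input.

```python
import random

def stochastic_system(prompt: str) -> str:
    """Stand-in for a generative model: the same input can yield different outputs."""
    # Purely illustrative: this stub is right about 70% of the time.
    return "Paris" if random.random() < 0.7 else "Lyon"

prompt = "What is the capital of France?"

# A single passing test only shows the system *can* answer this input correctly...
print("one-off test passed:", stochastic_system(prompt) == "Paris")

# ...while estimating how *often* it does takes many samples, and even that
# pass rate applies only to this one input, not to the system as a whole.
trials = 10_000
passes = sum(stochastic_system(prompt) == "Paris" for _ in range(trials))
print(f"estimated pass rate on this single input: {passes / trials:.1%}")
```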

He does offer a glimmer of hope that “hybrids between symbolic and intuition-based AI should be possible — systems that do generate some explicit knowledge models or confidence levels, or that are coupled with more traditional data retrieval or theorem proving” (a crude sketch of that pattern appears below). The problem is that in the current investment climate, neither existing generative AI companies nor the venture capitalists funding new ones seem the slightest bit interested in tackling this hard and possibly impossible challenge. They’d both rather let the AI hype rip in the hope they can cash in before some of the bubbles start bursting.
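For what it’s worth, the hybrid shape Boiten gestures at is easy to caricature in a few lines of code. The sketch below is entirely my own illustration (again with a random stub in place of a real model): a generative component proposes an answer, and a symbolic layer, here nothing more than an explicit fact table, either verifies the proposal or declines to pass it on.

```python
import random

# An explicit knowledge model, which is exactly what current generative systems lack.
FACTS = {
    "capital of France": "Paris",
    "capital of Japan": "Tokyo",
}

def generative_propose(question: str) -> str:
    """Stub for a generative model: fluent and plausible, but not reliable."""
    return random.choice(["Paris", "Lyon", "Tokyo"])

def hybrid_answer(question: str) -> str:
    proposal = generative_propose(question)
    # Symbolic check: only proposals confirmed by the explicit knowledge model get through.
    if FACTS.get(question) == proposal:
        return proposal
    return "no verified answer"

print(hybrid_answer("capital of France"))
```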

Follow me @glynmoody on Bluesky and on Mastodon.