1. AI: Always InYourFace

Can we just ban AI?

Na. Next question.

Where is AI going?

Well Generative AI (modern LLMs) are inherently probabilistic, NOT [DJT mannerism] deterministic. Although thinking about it that's not really the issue, because you can actually make an LLM deterministic if you want (set the temperature to 0 and it gives you the same answer every time) and it would STILL be a terrible idea to let it run a Bank. The real issue is that unlike rules based AI, it has no precomprehension of following strict rules for things like bank transfers. It just gives you a plausible looking answer, with no guarantee it's actually correct.

However, "Agentic Workflows" and "Chain of Thought" where LLMs split complicated problems into smaller parts and work through them sequentially are showing more promise (Recursive LLMs are cool: arxiv.org/abs/2512.24601). BUT breaking a problem into steps doesn't make the final answer any more guaranteed, at the core of these LLM systems it's still the same problem.

So the future will likely consist of a couple of options. A) Human in the Loop. A human asks AI to solve X, LLM breaks problem down, tries to solve. Then we use our deterministic reasoning (while trying not to f*** it up ourselves) to confirm the AI has done it correctly. This is probably close to how most professionals use AI today for Excel Formulas, Pieces of Code, Emails etc. With this setup we will never really get rid of humans completely, the skill just shifts from everything to: creation, editing and verification. And that last one sounds boring but it's actually where the skill goes now, once a machine can spit out an answer, working out what even counts as a correct answer is the hard bit. Option B: Formal Verification & Neurosymbolic AI. We now have startups like Axiom and Projects like DeepMind's AlphaProof, that are combining AI with Formal Verification using the Curry Howard Isomorphism, where types become propositions and programs act as the evidence of the proof. This sort of verified computation relies on a Kernel - and the trick is that the Kernel is deliberately TINY, a small core that gets scrutinised half to death, so that everything else in the system can be sloppy as long as that one little piece is provably right. Axiom (startup) uses a programming language called Lean as the deterministic spine, while using AI as the probabilistic hands (or maybe Drawing Hands - Art Reference). To keep going with the Hand Analogy we can think of the propositions within Lean as OJs Hand and the proof as the glove, whilst AI is the one selecting the glove - only the right glove will fit. This option is a GREAT solution hence the f****** huge valuation (Axiom - Startup). But what I think is often overlooked when people manage to wrap their heads around these complicated biological analogies... What does this ACTUALLY solve? The Kernel only ever checks that the glove fits the hand you gave it. It says NOTHING about whether that was the right hand. It proves the proof matches the spec, not that the spec matches reality. The AI can hand you a perfect, fully checked proof of completely the wrong thing, a missed edge case, some rule nobody bothered to write down. The hard part was never whether the proof is valid, it's whether we asked for the right thing to begin with. Running with our banking Analogy: The majority of our global financial systems have been written in COBOL (COmmon Business-Oriented Language - lol) for the last ~40 years. And the real barrier isn't rewriting all that into Lean, nobody actually runs a bank on a proof assistant, that's not what Lean is for. The barrier is that you can't verify code against a spec that was never written down. 40 years of behaviour piled up and nobody ever wrote down what the thing is really meant to do, so there's nothing solid to check it against.

And even if you nailed the spec perfectly, you STILL haven't escaped, you've just moved the trust somewhere quieter. That machine checked proof was verified by the Kernel, the Kernel was built by a compiler, that compiler was built by another compiler, and the whole lot runs on silicon you also never checked. Back in 1984 Ken Thompson's "Reflections on Trusting Trust" showed the nightmare version of this: you can rig a compiler to slip a backdoor into everything it builds AND into its own future self, so the source code you're reading is spotless and the malice never shows up anywhere you can actually look. So the honest framing was never "guarantees vs trust". It's that verification SHRINKS trust, it never deletes it. You keep pushing the floor down, smaller kernels, formally verified compilers like CompCert, verified microkernels like seL4, but you never hit bedrock, it bottoms out at the hardware, the proof checker, and eventually the axioms themselves (Gödel: a system can't even prove its own consistency from the inside). Guarantees all the way down turns out to be trust all the way down, just with the trusted part made as small, and as stared at, as we can manage.

Please email if you'd like any discourse on this subject.

In the next blog: Who actually wants to talk to an AI?!

← Back to writing