Transcript: ‘The AI Model Built for What LLMs Can't Do’

The transcript of AI & I with Eve Bodnia is below. Watch on X or YouTube, or listen on Spotify or Apple Podcasts.

Timestamps

Introduction: 00:00:51
Why correctness and verifiability matter in AI: 00:02:09
What an energy-based model is: 00:09:33
How EBMs construct energy landscapes to understand data: 00:14:21
Why modeling intelligence through language alone is a flawed approach: 00:19:00
What it means for a model to “understand” data: 00:26:54
How EBMs solve the vibe coding problem and enable formally verified code: 00:37:21
Why LLM progress is plateauing: 00:43:21
Mission-critical industries haven’t adopted LLMs, and why EBMs can fill that gap: 00:49:54

Transcript

Dan

Eve, welcome to the show.

Eve

Hi. Thanks for having me.

Dan

Great to have you on. For people who don’t know, you are the founder and CEO of Logical Intelligence. Tell us what Logical Intelligence does.

Eve

Logical Intelligence does a few things. First of all, we see ourselves as a foundational AI company. We work with both EBMs and LLMs. Everything we’ve built in-house we prototyped on LLMs initially, and we’re building EBMs at the same time—those get plugged in over the long term.

We’re focused on correctness of software and hardware as a product, because I believe there are a lot of issues with AI being placed in mission-critical systems today. Can we do code generation? Can we do chip design? The answer is yes—people use LLMs today. But very few are actually questioning whether the results are correct, whether what’s produced actually makes sense. There’s a big gap in the market around deterministic, verifiable AI, and we’re trying to fill that gap.

Dan

Where my brain goes first is: why does correctness, or whether something makes sense, actually matter if it works?

Eve

Let me ask you a question back. Imagine there’s AI driving a car and you’re in that car, and the car is running on an LLM, and someone tells you that 20% of the time it’s going to hallucinate and you might end up in the wrong place. How would you feel about that?

Dan

In my case, I’d be like, wow, that’s kind of interesting—I’m curious where it takes me.

Eve

Okay, let me give you another example. What about a plane? You’re flying from San Francisco to New York and someone says 20% of the time the next word isn’t going to match and the plane is going to go down. How would you feel?

Dan

My feeling is that planes are currently run very well by deterministic systems, so I’m not sure why I’d need an AI for that.

Eve

I feel like we just can’t avoid AI anywhere over the next 10 years. People are going to try to place AI everywhere, automate systems with it. Technically, you might not need it—we survived without AI up to this point—but it’s the next step in an evolution that people want. For banking, you didn’t need AI initially, but we learned it’s really helpful to automate certain processes and decision-making. It saves a lot of time and creates space to be creative instead of constantly debugging and fixing things. I just feel like it’s an unavoidable future.

Dan

What I’m getting at is—it seems like if you want a guarantee of certainty, the only way to achieve that is to use something you can express in code or logic.

Eve

That’s part of it. For us, certainty comes from both internal and external verifiers. If you take an LLM, the architecture doesn’t allow for internal verifiers—it’s a black box. You don’t have access to what’s happening inside until everything is fully processed. You only have access to the output.

Many companies take an LLM, train it for certain tasks, and if it requires logic, they attach external verifiers—languages like Lean 4, which is a machine-verifiable proof language that lets you check output using mathematical frameworks. But that doesn’t solve the cost problem, because the architecture is still playing a guessing game. Even with an external verifier, even if you fine-tune the LLM for a specific task, you’re still not solving the problem of tokens being expensive. It takes compute to play that guessing game.

EBMs solve that problem differently. EBMs don’t have tokens. There’s no guessing game of that kind. You can essentially oversee all possible scenarios.

Dan

Can you define EBM for us?

Eve

I’ll define it in a second. For now, just think of it as something that doesn’t play a guessing game—something whose architecture allows it to self-align as it processes information. It’s no longer a black box. As it’s performing, you can open it at any time during training and see what’s happening inside. You can’t do that with LLMs. The nature of the architecture is fundamentally different.

So for verification tasks, you have this notion of self-alignment because of the EBM architecture, and the absence of tokens makes it cheaper. And then you also have an external verifier on top of that. Verification on both sides—inside and outside.

Dan

Let me play that back to you. We’re living in a world with LLMs where we can generate a lot of output, and that output is useful for a wide range of things. But to tell if that output is right, the best we can do is guess and check—we generate output and then, if it’s code, we go check it with integration tests or manual tests or whatever. That totally works, but it’s expensive and time-consuming. And one of the core problems is that it’s very hard to know how the LLM arrived at its answer. We can’t look inside it.

Eve

Exactly.

Dan

And what you’re saying is there are other types of models that are more inspectable—ones that give us a sense, before we even run the output, of whether it works. We can look at the model’s internals and understand: how good does the model think this solution is? It’s like being able to ask someone, “Are you sure about this?”—before you go check their work. A language model can answer that question, but at a different level than an EBM does. The answers from EBMs are more likely to be actually correct.

Eve

Yes. With EBMs, you always have the opportunity to see what’s inside. You control the training—it’s no longer a black box. You can do that in real time. With LLMs, you need to wait until training is done before you go look inside. And you can attach the same external verifiers that work for LLMs on top of EBMs, so you get double verification.

You asked me what an EBM is. I want to give a little historical context, because there are so many terms being thrown around today without being defined.

EBM simply means energy-based model. The concept of “energy-based” comes from physics. In theoretical physics, if you’re writing Lagrangians—which correspond to the energy terms in a system, like kinetic energy and potential energy—you’re trying to derive the equations of motion by minimizing that energy. That’s essentially how all of theoretical physics works: you start with energy terms, minimize the energy, and derive equations of motion. Those equations of motion give you conservation laws, so you know exactly what the rules of your system are.

This principle is fundamental—everything around us wants to minimize energy. We’re sitting in chairs talking instead of jumping around because that’s the natural state: minimized energy. We’re using that minimization principle for how AI processes information.

(00:10:00)

Our model is formally called the Energy Based Reasoning Model with Latent Variables, though we call it Kona—we like coffee culture and Kona is a favorite. Let me walk through exactly what those words mean.

Dan

Before you do, I want to make sure people understand what energy minimization actually means. Is this a good concrete example: I’m going to go lie on the couch behind me. My body is uneven, the couch is uneven, and I’m trying to understand how my body is going to end up settling into it, given the laws of gravity. I’ll end up settling in a way that minimizes energy—a good fit between my body and the couch, rather than being all jerky with lots of gaps. Is that the kind of energy minimization you’re talking about?

Eve

Yes. It’s all about your body finding the most comfortable configuration—the one that corresponds to the lowest potential of your body. I’d go even higher level. Imagine you’re tired, Dan—you’ve done thousands of podcasts and you’ve just come home. Let’s say Dan is a variable. We’re trying to figure out his equations of motion around the house: where is he most likely to end up?

You’re probably going to end up on the couch with a nice show and maybe a drink.

Dan