Midjourney/Every illustration.

You’re the Manager Now

Plus: Why small models can't match Mythos, an AI workflow confidence check, Claude Code token tracking, our agent-muting plugin, the AI philosopher draft, and a mini-Vibe Check on Dia


Was this newsletter forwarded to you? Sign up to get it in your inbox.


Now, next, nixed

Developer UI

Now: Anthropic gave Claude Code’s desktop app a redesign, adding a sidebar for managing sessions, drag-and-drop panes, and an integrated terminal and file editor. Together, these changes make it easier to work on multiple projects in parallel. Cora general manager Kieran Klaassen was thrilled: this was already his preferred setup.

Kieran’s existing work setup in Cursor looks a lot like the new Claude Code. (Image courtesy of X/Kieran Klaassen.)


Next: Claude Code’s refreshed look is not exactly original, says Monologue general manager Naveen Naidu. Cursor offers a similar experience, and both companies “just copied Codex’s design,” he says.

But it confirms where dev work is headed: overseeing agents, not writing code.

Nixed: The idea that the command-line interface (CLI) will eat the user interface (UI). With a CLI-first workflow, you mostly supervise through text: commands, logs, git state, diffs, and terminal output. Now that agents are doing the coding, that’s not a good primary interface.

Instead, the future coding UI is centered on managing parallel work, staying aware of git/task context, and—most importantly, Kieran says—having access to a preview of what you’re building.


Permission to skip

Smaller models can’t do what Claude Mythos does

A researcher at a cybersecurity company made waves online when he reported that smaller models, when pointed to the relevant code, could find the same security vulnerabilities as Mythos, Anthropic’s new model that is so powerful it isn’t being made public.

You have permission to skip this discourse—or better yet, reframe it.

Because this is a framing issue, says Dan Shipper, Every’s CEO. Mythos and smaller models are operating within completely different frames. Yes, you can point a smaller model to a codebase and tell it to find a bug when you already know that capability is possible, but you cannot ask it to find serious vulnerabilities in critical software across every major operating system and browser, autonomously, the way Mythos did.

Older models finding the same security bugs as Mythos is not an apples-to-apples comparison. (Image courtesy of X/Dan Shipper.)


As models get better, they automatically handle smaller, concrete problems, allowing you to demand more from them.

Say you have a bug in your code. A lower-level frame, which requires you to describe the problem in detail, would be to explain what’s going wrong and propose possible solutions. A higher-level frame allows you to get abstract: “There seems to be a problem, can you fix it?”

As you climb the frame hierarchy, your role is less about communicating the mechanics of a problem and more about defining what the most important problem even is. In the coding example, the higher frame is powerful because it allows for expansiveness. (“There seems to be a problem, can you fix it?” might surface the same bug as the lower-frame prompt, or it may find that bug and identify a far more significant architectural issue.)

The higher the frame, the more possible solutions unfold before you—and the more room to consider what constitutes a solution in the first place.

Better models open up a dizzying number of approaches to solving a problem. (Image courtesy of Slack/Dan Shipper.)



Documentation for the AI era

AI-era documentation can be a nightmare. Your user is querying an LLM to figure out how to use your product, and the LLM is pulling from all manner of outdated sources. GitBook fixes this. Its connected knowledge system combines your docs, external sources like YouTube tutorials and GitHub Discussions, and an embeddable AI assistant that can be inserted inside your product. Answers are grounded in actual knowledge and link back to real sources—and you get data that helps you tell exactly where users get stuck. Used by teams at Nvidia, Zoom, and n8n.


Steal this workflow

The confidence check

Before he lets Claude Code ship anything, Austin Tedesco, Every’s head of growth, asks it one question: How confident are you in this, on a scale of 1–100? If the answer is under 90, he sends it back to find improvements. Austin doesn’t have an engineering background, and this single question has changed the quality of everything from growth experiments to product PRs.

Austin asks Claude Code to confirm its confidence score before creating a pull request. (Image courtesy of Austin Tedesco.)


The workflow:
  1. Finish the task, then ask for a confidence score. Once Claude Code has a working solution, type: “How confident are you in this, 1–100?” If it comes back at 90 or above, move on. If not, go to step 2.
  2. Send it back. Tell it: “Find improvements and get to 90+.” Claude will catch edge cases, tighten logic, or flag assumptions it glossed over the first time. Repeat until it crosses the threshold.
  3. Ship at 90. Don’t chase 100—that’s where you burn tokens on diminishing returns. At 90, it’s checked its own work and flagged what it wasn’t sure about.—Katie Parrott
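
The three steps above can be sketched as a small loop. This is a minimal illustration, not Austin's actual setup: `ask_agent` is a hypothetical stand-in for however you send a prompt to Claude Code and read back its reply, `MAX_ROUNDS` is an assumed safety cap, and the score parsing is deliberately naive.

```python
import re
from typing import Callable, Optional

THRESHOLD = 90   # "Ship at 90" from the workflow above
MAX_ROUNDS = 5   # assumption: a cap so the loop cannot run forever


def parse_score(reply: str) -> Optional[int]:
    """Return the first whole number from 1 to 100 found in the reply, or None.

    Naive on purpose: a reply like "on a 1-100 scale, 92" would be misread.
    """
    for match in re.findall(r"\b\d{1,3}\b", reply):
        value = int(match)
        if 1 <= value <= 100:
            return value
    return None


def confidence_check(ask_agent: Callable[[str], str]) -> bool:
    """Ask for a score, send the work back if it is low, and repeat.

    `ask_agent` is hypothetical: any function that takes a prompt string
    and returns the agent's reply as text.
    Returns True once the reported score reaches THRESHOLD.
    """
    for _ in range(MAX_ROUNDS):
        score = parse_score(ask_agent("How confident are you in this, 1–100?"))
        if score is not None and score >= THRESHOLD:
            return True
        ask_agent("Find improvements and get to 90+.")
    return False


if __name__ == "__main__":
    # Scripted replies standing in for a real agent session.
    replies = iter(["About 72.", "Tightened the edge cases.", "Now 93."])
    print(confidence_check(lambda prompt: next(replies)))
```

The cap on rounds is the code version of "don't chase 100": if the score never reaches the threshold, the loop stops and hands the decision back to you.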


Inside Every

A plugin for getting agents to shut up

Every is half agent now, which has made Slack a noisy place. OpenClaws are constantly popping up in threads trying to be helpful, whether they’ve been mentioned or not.

The bots, god love them, cannot read the room.

Agent Rocky butts in. (Image courtesy of Every’s Slack.)


To stop Claudie, the consulting team’s AI manager, from inserting herself in discussions, Every engineer Nityesh Agarwal updated her instructions so she could only respond in the consulting team channel. “She’ll deny every other request,” he says.

Hard rules help, but they’re “like telling someone they can never use a certain word in conversation—sometimes that word might actually be the right one,” says Willie Williams, Every’s head of platform. On occasion, agents have something to contribute even when they’re not explicitly tagged.

Enter Tact, an OpenClaw plugin “that will keep your Plus Ones, our hosted OpenClaw agents, from responding in Slack unless they should,” per Dan. The classifier is built using real examples of bots speaking up in Slack, with each instance labeled as appropriate—or not. It’s a way to program social norms, “like giving a human a little recorder with a light: If the light is green, you can respond; if it’s red, don’t,” says Willie.

Tact gives agents the context to read the room.
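
How Tact works internally isn't described here, so the following is only a toy illustration of the general idea of learning from labeled examples instead of a hard rule. Everything in it is an assumption: the example messages, the `light_is_green` name, and the word-overlap similarity are invented for illustration and are not Tact's implementation.

```python
from typing import List, Set, Tuple

# Hypothetical labeled examples: (message in a thread, was a bot reply appropriate?)
EXAMPLES: List[Tuple[str, bool]] = [
    ("@bot can you pull the latest signup numbers", True),
    ("does anyone know where the brand guidelines doc lives", True),
    ("lol that meeting ran long", False),
    ("happy birthday to the whole team", False),
]


def tokens(text: str) -> Set[str]:
    """Lowercased set of whitespace-separated words."""
    return set(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Jaccard overlap between the word sets of two messages."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def light_is_green(message: str) -> bool:
    """Copy the label of the most similar labeled example (nearest neighbor)."""
    _, label = max(EXAMPLES, key=lambda ex: similarity(message, ex[0]))
    return label


if __name__ == "__main__":
    print(light_is_green("does anyone know where the onboarding doc lives"))
    print(light_is_green("lol that standup ran long"))
```

Unlike a channel allowlist, a gate like this can say yes to an untagged question and no to banter in the same channel, which is the flexibility Willie describes.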

Data point

2.2 million

That’s the number of Claude Code tokens Every’s head of tech consulting Mike Taylor used in March. He’d expect similar figures for most data and product management roles. Engineers running agentic workflows or subagents will burn significantly more, but Mike says it’s rare for a coder to exceed a Claude Max plan, which gets you upwards of 30 million tokens a month.

To check your own Claude Code token usage, run this command in the terminal, Claude Code desktop app, or any agent where you can run shell commands:

npx ccusage@latest monthly

Mike’s Claude Code usage by month. (Screenshot courtesy of Mike Taylor.)


April draft, philosopher edition

Philosophy is back thanks to AI. Google DeepMind just hired a philosopher.

Anthropic already has two.

So naturally we ran a draft of which philosopher each major AI lab would select if they could pick anyone from history.

xAI: Friedrich Nietzsche. “What is alignment but the morality of those too weak to endure the answer?”

Anthropic: Jeremy Bentham. “The question is not, can it reason? Nor, can it speak? But, can it minimize the greatest expected harm across all sentient beings?”

OpenAI: Plato. “The many call it appetite for compute; I call it the turning of the machine’s soul toward the Good.”

Google: Gottfried Leibniz. “The best of all possible worlds is one in which every application contains its own small reasoner. Our small reasoner.”

Meta: Seneca. “I’m just here for the nine-figure retention package.”—Dan Shipper


Mini-Vibe Check

The Dia browser

The Dia Good Morning tab always features art at the top. (Image courtesy of Eleanor Warnock.)


I’ve been using the Dia browser for the last few months, and one of their most recent features has become part of my daily routine: a gorgeously designed Good Morning tab that pops up when I start my workday, pulling in to-dos from Slack, Notion, and email alongside my schedule. There’s a “Prep me” button in the schedule section that opens a chat about how to prepare for whatever’s next on my calendar.

It doesn’t capture everything, and I still track most of my to-dos with my Plus One. But the Good Morning tab is beautiful. It gives me a small moment of aesthetic orientation at the start of the day that is more important to me than completeness.

This is The Browser Company doing what they’ve always done well: making software that feels crafted. Dan has talked with cofounders Josh Miller and Hursh Agrawal about how they killed Arc to build Dia, and the bet was that design and feeling still matter in AI products. In a world where every AI tool is racing to be the most capable, Dia is betting that the most pleasant one wins your morning. I think they’re on to something.—Eleanor Warnock


Laura Entis is a staff writer at Every. You can follow her on LinkedIn.

To read more essays like this, subscribe to Every, and follow us on X at @every and on LinkedIn.

For sponsorship opportunities, reach out to [email protected].
