
How Every Is Harnessing the World-changing Shift of Opus 4.5

Five patterns from our Opus 4.5 Claude Code Camp you can apply today


Our last coding camp of the year is Codex Camp—a live workshop about building with OpenAI’s coding agent, open to all Every subscribers on Friday, December 12 at 12 p.m. ET. Learn more and reserve your spot.—Kate Lee

Was this newsletter forwarded to you? Sign up to get it in your inbox.


Every’s take on Anthropic’s latest model, Opus 4.5, is that it’s a world-changing shift in AI-powered coding.

When Every CEO Dan Shipper started using Opus 4.5, his previous frustrations with AI coding—errors, hallucinations—disappeared. After the model was released in late November, he built his dream reading app between meetings in one week, work that would have taken him six months before.

Meanwhile, resident Claude Code wizard Kieran Klaassen was so enamored with the model that he started 10 projects at once and kept sneaking away from Thanksgiving dinner to prompt more. “This is by far the most exciting model, maybe ever,” he said.

We knew we needed to do more than write about what we were seeing. We needed to show it. Cue our latest Claude Code Camp.

Over the course of the hour, Dan and Kieran walked more than 400 Every subscribers through what they’ve discovered working with Opus 4.5. They demoed real workflows, shared emerging patterns in how to think about code, and answered questions about everything from how to take advantage of the model’s new-and-improved computer use capabilities to what this means for the compound engineering playbook.

Here’s what we learned—and what you can apply immediately.

Key takeaways

  1. It finishes what it starts. Previous models spiraled into errors three or four steps in. Opus keeps going—from idea to working app.
  2. It can fix problems in code you didn’t write. Opus can modify the pre-written code your app uses (like calendar displays or payment systems—built by other developers so you don’t have to code them from scratch) and trace bugs through all the code, frameworks, and dependencies that make your app work.
  3. It can test your app like a human would. Opus can test features end-to-end, find bugs, and generate before/after screenshots on its own.
  4. You can build apps by simply describing what you want. Your app calls an AI and tells it what to do—no traditional feature code required.
  5. Your brain is the bottleneck. The question is no longer “Can the AI do this?” but “Which of these 10 things should I build?”

What makes Opus 4.5 an infinite coding machine?

Dan calls Opus 4.5 the infinite coding machine. Previous models were great for demos but would peter out when you tried to build something real—three steps in, they’d start hallucinating fixes and making the same errors over and over again. Opus doesn’t hit that wall.

The shift has implications on two fronts: how you think about software (coding philosophy) and how you structure that software (code architecture).

Coding philosophy: Depth and delegation

Dan started building his reading app with other frontier models before Opus 4.5 arrived. He’d make progress, then hit the error wall. When he switched to Opus, the pattern changed. “It kept going,” he said during the camp. “I kept adding features and adding features.”

One of the reasons Opus 4.5 can do this is that it can fix problems in code you didn’t even write. Usually, if something breaks outside your own code, you’d have to become an expert in the offending tool to fix it. For example, if your app uses React, a framework for building web apps, you also depend on the specific code React uses to display content on screen, which in turn depends on the interface that lets code communicate with a browser like Chrome or Safari. Now, Opus 4.5 can trace a bug through all three layers and fix it.

Opus 4.5 also lets you work on multiple things in parallel, by delegating multiple tasks to the model at once. This increases the speed at which you can build.

Code architecture: Agent-native apps

The second shift is architectural. Opus changes how apps can be structured.

Traditionally, you code each feature as a step-by-step recipe. Every action is precisely defined: if user clicks button → fetch data → format results → display.

With Opus, you can build what Dan calls “agent-native” apps. Instead of coding recipes, you build a general-purpose agent (an AI that can use tools and follow instructions, similar to a chef that can follow recipes and cook) with access to tools, then give it prompts describing outcomes. The agent figures out the steps.

Dan’s reading app includes a feature that determines your “reading profile” from books and articles you have screenshotted on your phone’s camera roll. To code this traditionally, you would have had to write an algorithm to efficiently scan through a gigantic photo library, identify photos that are likely to contain text, convert them from images to text, collect them, and summarize them into a coherent, interesting synthesis—a complex process. Now, when you use the reading profile feature, the app calls the agent with a prompt. The agent scans your camera roll, identifies books, analyzes your taste, and writes it up—no feature code required.
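The shape of an agent-native feature can be sketched in a few lines. This is a minimal illustration, not Every's actual code: `call_agent()` here is a stub standing in for a real model API call, and the tool names and sample data are invented. The point is the structure—the feature is a prompt plus tools, not a recipe.

```python
def scan_camera_roll():
    """Tool: return descriptions of photos (stubbed with sample data here;
    a real implementation would read the device's photo library)."""
    return [
        "screenshot: cover of 'The Idea Factory'",
        "photo: beach sunset",
        "screenshot: article 'Why Tacit Knowledge Matters'",
    ]

def call_agent(prompt, tools):
    """Stub standing in for a real model call. A real agent would decide
    which tools to invoke and loop until done; this hard-codes one pass
    to show the shape of the pattern."""
    photos = tools["scan_camera_roll"]()
    reading_items = [p for p in photos if "screenshot" in p]
    return f"Reading profile based on {len(reading_items)} saved items."

# The entire feature is a prompt describing the outcome, not step-by-step code.
READING_PROFILE_PROMPT = (
    "Scan the camera roll, find books and articles the user saved, "
    "and write up their reading taste."
)

def reading_profile_feature():
    return call_agent(
        READING_PROFILE_PROMPT,
        {"scan_camera_roll": scan_camera_roll},
    )
```

To change the feature, you edit `READING_PROFILE_PROMPT`, not the control flow—which is where the iteration speed comes from.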

Features are faster to build because you’re writing prompts instead of code, and they’re faster to iterate because you edit the prompt instead of hunting down the exact line of code that broke.

The tradeoffs are significant. Agent-native apps are more expensive—every feature costs more money to run because they consume costly compute resources, instead of just executing pre-written code—and slower, because the agent has to think. But model costs are dropping, and you don’t have to go all in. Use agent-native architecture for exploration, then write common workflows into traditional code as patterns emerge.

What a senior engineer can do with it

The philosophy and architecture shifts unlock workflows that weren’t possible before. Kieran demonstrated three patterns that show what Opus 4.5 looks like in daily use.

Computer use for testing

Previous models could technically control browsers, but they’d get confused, click the wrong things, or lose track. Opus stays coherent through complex multi-step tasks, which makes computer use viable for tasks like regression testing (checking that new code doesn’t break existing features) and user interface validation that would previously require manually clicking through your entire app.

Kieran uses Playwright for end-to-end testing (checking that features work from start to finish in a real browser). During the demo, he showed how he would test a new feature using Opus 4.5. Instead of manually clicking through each link and button in the app to make sure nothing was broken, he told Opus: “Use Playwright to test this feature. Click through everything and make sure nothing is broken.”

The model booted up Chrome, navigated the interface, found a bug, went back to the code, fixed it, and continued testing. All on its own.

He’s also automated before/after screenshots for pull requests (drafts of code changes submitted for review). Every time he uses the /work command (which tells the AI to start building), Playwright captures screenshots of the user interface before and after the change, then adds them to the pull request. Code reviewers can see exactly what changed visually without running the code themselves.
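The before/after screenshot idea can be sketched with Playwright's Python API. This is a hedged sketch, not Kieran's actual setup: the URL, the `shots/` directory, and the helper names are assumptions. The Playwright import lives inside `capture()` so the path helper can be used without a browser installed.

```python
from pathlib import Path

def screenshot_pair(feature: str, stage: str) -> Path:
    """Build a predictable path like shots/dark-mode-before.png so a
    pull request can reference before/after images by convention."""
    return Path("shots") / f"{feature}-{stage}.png"

def capture(url: str, feature: str, stage: str) -> Path:
    """Open the page in headless Chromium and save a screenshot.
    Playwright is imported here, not at module level, so the path
    helper above stays usable without Playwright installed."""
    from playwright.sync_api import sync_playwright

    path = screenshot_pair(feature, stage)
    path.parent.mkdir(parents=True, exist_ok=True)
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)
        page.screenshot(path=str(path))
        browser.close()
    return path

# Usage (URL is an assumption): capture before the change, apply it, capture after.
# capture("http://localhost:3000", "dark-mode", "before")
# capture("http://localhost:3000", "dark-mode", "after")
```

A command like /work would run both captures around the code change and attach the resulting files to the pull request.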

AI controlling your computer used to be a flashy trick. Now, it’s something you can rely on for real work.

Parallel delegation

Within days of getting Opus 4.5 access, Kieran was building 10 things at the same time. He confessed to being “agent addicted”—sneaking away from Thanksgiving dinner to start new prompts, shutting down his devices at night to force himself to stop.

Previous models would bleed context between tasks—they’d take information or instructions you used in one project and apply it in another, or lose track of details. Opus holds the thread on all of them, so the amount of work you can take on is limited only by how much you can hold in your head at once.

One of Kieran’s projects was finishing an abandoned feature for Cora that would detect warning signs that a user is about to stop using the product. He’d spent days trying to build it with previous models. They’d write half the code and get stuck, or they’d stop to ask clarifying questions about every small decision—should the button be on the left or right, what color should the error message be, how many seconds before timing out. Opus 4.5 finished it in 30 minutes. When it hit a small decision, it made a reasonable choice and kept building instead of stopping to ask.

This pattern works when you’re launching several features at once, testing multiple approaches at the same time, or doing maintenance work across different parts of a codebase.

Meta-level work

Kieran was working on Every Board, a feedback tool the team is building internally, when he realized the compound engineering plugin needed a fourth research agent for design work. Usually, that would require him to stop what he was doing, open up the plugin’s code somewhere else, make changes, and return to his original project—by which time he would have lost his place. Instead, he created a command called /modify-plugin that let the AI work on both things at once: the Every Board project and changes to the compound engineering framework—the set of instructions Kieran has built to define the four steps of his AI-powered coding workflow.

“This is 4.5,” Kieran said. “This kind of work would have never been possible before without it becoming a complete mess.”

The model maintained what it knew about the design problems Kieran was hitting with the Every Board project, then switched to the codebase for the compound engineering workflow, which contains all of the rules and behaviors that make compound engineering work. It analyzed how the research agents were defined there and added a new agent that fit the pattern.

With Opus, you can work on your project and, at the same time, modify the tools you’re using to build it. This is particularly useful when you’re building tools, plugins, or systems that all need to work together.

The Q&A

Over the second half of the camp, Dan and Kieran fielded questions about their workflows. Here are the highlights.

Planning and compound engineering

Q: Is planning still necessary with Opus?

Kieran: For exploratory work or personal projects, Opus is good enough that you can just start building. But if you’re working on a team or building something complex, planning is more valuable than ever because Opus can execute much deeper, more sophisticated plans.

Q: When do you use the compound engineering workflow versus just vibing?

Kieran: If I already know exactly what I want to build and it’s production work, I use the full workflow—plan, delegate, assess, codify. For personal projects or exploration, I skip it.

Computer use and testing

Q: How do I get Playwright working with Claude Code?

Kieran: Install the compound engineering plugin—it’s two lines in your terminal and Playwright is bundled. Then just ask Claude to test something.

Q: Can Opus replace manual testing entirely?

Kieran: Not yet. It’s way easier to add unit tests now, and end-to-end testing with Playwright works well for web apps. But some manual testing is still needed.

Parallel work and delegation

Q: How many projects can you realistically work on at once?

Kieran: However many your brain can handle. That’s the actual bottleneck now. The AI can manage more parallel work than you can keep track of.

Dan: Three to five substantial things feels manageable for me. Beyond that, I start losing track.

Working style and fatigue

Q: Do you hit a fatigue wall working with multiple AI agents?

Kieran: I use the Pomodoro method and bunch certain types of work together. Code reviews for an hour, then research for an hour. The fatigue isn’t from the volume of work—the AI handles that. It’s from switching between different ways of thinking about what you’re asking the AI to do.

Agent-native architecture

Q: How expensive is the reading app to run?

Dan: Very expensive right now. Every feature call costs money. But model costs are dropping fast, and you don’t have to keep it agent-native forever. You can harden common patterns into regular code once you figure out what users actually do.

Q: Is this architecture suitable for production?

Dan: It’s production right now. I use this app every day. Kieran’s kids use the story generator. The tradeoff is cost and speed, but if those work for your use case, it’s completely viable.

Skills and context management

Q: What are skills and how do you use them?

Kieran: Skills are bundles of knowledge you can load to perform dedicated tasks. Instead of cramming everything into your Claude.md file, you create skills for specific technologies that trigger only when you need them. When you’re doing design work, load the design skill. It keeps your context window clean and gives Claude deeper knowledge exactly when it needs it.
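As a rough sketch of what a skill looks like on disk—assuming the Agent Skills convention of a folder containing a SKILL.md file with YAML frontmatter—here is an invented design skill. The name, file paths, and checklist items are all hypothetical:

```markdown
---
name: design-review
description: Guidelines to load when doing design work on the app's UI.
---

# Design review skill

- Follow the spacing and color tokens defined in `design/tokens.css` (hypothetical path).
- Screenshot any UI change and compare it against the mockups before finishing.
- Flag any text whose contrast ratio falls below WCAG AA.
```

Because the `description` tells Claude when the skill applies, it loads only during design work, keeping the rest of your context window free.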

What we’re learning

Dan wrote about reaching the edge of the known world with Opus 4.5. The question was whether we’d find dragons or a new horizon.

We’re pretty sure it’s the horizon.

What struck us most was watching the (metaphorical) room come to the same realization: We’re delegating to AI now, not debugging it. Instead of asking “How do I fix this error loop?” builders can ask “Which of these five features should I prioritize?” The latter is a better problem to have.

Opus 4.5 has been out for three weeks. Already engineers are shipping prompt-native apps and parallel work is becoming the default. In another three weeks, we’ll probably discover workflows that make today’s look primitive.

The map is still being drawn—and we’re drawing it together.


We run camps on Claude Code and other frontier models and tools every month for paid Every subscribers. Our team demos their latest workflows, answers questions live, and shares what’s working (and what isn’t) as the systems evolve. Subscribe to Every to get invited to the next one.


Katie Parrott is a staff writer and AI editorial lead at Every. You can read more of her work in her newsletter.

To read more essays like this, subscribe to Every, and follow us on X at @every and on LinkedIn.

We build AI tools for readers like you. Write brilliantly with Spiral. Organize files automatically with Sparkle. Deliver yourself from email with Cora. Dictate effortlessly with Monologue.

We also do AI training, adoption, and innovation for companies. Work with us to bring AI into your organization.

Get paid for sharing Every with your friends. Join our referral program.

For sponsorship opportunities, reach out to [email protected].
