Every
We Tested Claude Sonnet 4.5 for Writing and Editing
Midjourney/Every illustration.

Five tests across blind comparisons, editorial standards, and deadlines—here's what changed our setup

Oct 23, 2025 | Updated Jun 22, 2026

Since GPT-5 came out three months ago, my writing workflow has been straddling LLM providers: ChatGPT for drafting, Claude for editing. The setup works, but the back-and-forth is tedious: Copy a draft from one window, paste it into another, wait for feedback, then hop back to revise. I’ve been starting to feel a bit like a glorified traffic conductor.

Then Anthropic dropped Sonnet 4.5, and within 48 hours my workflow collapsed from two chat interfaces into one.

Our Vibe Check on Sonnet 4.5 focused on coding. The model shone in Claude Code, impressing us with its speed and handling long agentic tasks and multi-file reasoning without getting lost. Anthropic followed Sonnet 4.5 closely with Haiku 4.5, a smaller, cheaper model that got our engineers excited about its implications for building.

But as much as code and writing have in common (they're both arranging letters and symbols in rows to achieve specific tasks, after all), code has an objective standard: "Does it run?" Writing is different. Good prose has no equivalent of "Does it compile?", the clear signal in programming that tells you whether the code works. Writing is subjective, taste-driven, and full of edge cases where two editors will disagree about what "better" even means.

We spend a lot of time working with AI in writing contexts at Every, whether it's Spiral general manager Danny Aziz training models to produce stellar copy inside the app, or me yapping at my computer to hammer out first drafts of my essays about work and technology. A byproduct is that we've developed a set of benchmarks by assessing how well each new model works within our systems. They aren't objective measures, but they're what we use when we're deciding which model to reach for.

So how do we decide whether a model is worth the switch? We run five tests based on our own workflows and what we need the model to do. As a result, they matter more to us than any benchmarks. The tests fall into two categories:

Output (Can it write?): Tests that tell Danny whether he can trust Spiral to produce great copy, and tell me whether I can trust my Working Overtime project to sound "like me" while keeping "AI smell" to a minimum.

Judgment (Can it recognize good writing?): Tests to see if the model has the taste to make existing writing better, again for Spiral as well as our internal editorial needs.

If you've ever wondered how a company built on words and AI tests how AI does with words, here's what happened when we put Sonnet 4.5 to the test...


Become a paid subscriber to Every to unlock this piece and learn about:

  1. The five writing tests we ran on Sonnet 4.5, GPT-5, and Opus 4.1
  2. How Sonnet 4.5 stacked up in interviewing, editing, short-form writing, and sounding human
  3. The Reach Test: Which model do Katie and Danny turn to for writing first?

Thanks to our Sponsor: Notion