
We Tested Claude Sonnet 4.5 for Writing and Editing
Five tests across blind comparisons, editorial standards, and deadlines—here's what changed our setup
Oct 23, 2025 (updated Jun 22, 2026)
Since GPT-5 came out three months ago, my writing workflow has been straddling LLM providers: ChatGPT for drafting, Claude for editing. The setup works, but the back-and-forth is tedious: Copy a draft from one window, paste it into another, wait for feedback, then hop back to revise. I’ve been starting to feel a bit like a glorified traffic conductor.
Then Anthropic dropped Sonnet 4.5, and within 48 hours my workflow collapsed from two chat interfaces into one.
Our Vibe Check on Sonnet 4.5 focused on coding. The model shone in Claude Code, impressing us with its speed and handling long agentic tasks and multi-file reasoning without getting lost. Anthropic followed Sonnet 4.5 closely with Haiku 4.5, a smaller, cheaper model that got our engineers excited about what they could build with it.
But as much as code and writing have in common—they’re both arranging letters and symbols in rows to achieve specific tasks, after all—code has some objective standards, namely, “Does it run?” Writing is different. There's no "Does it compile?"—the clear signal in programming that tells you if the code works or not—for good prose. Writing is subjective, taste-driven, and full of edge cases where two editors will disagree about what "better" even means.
We spend a lot of time working with AI in writing contexts at Every, whether it’s Spiral general manager Danny Aziz training models to produce stellar copy inside the app, or me yapping at my computer to hammer out first drafts of my essays about work and technology. A byproduct is that we’ve developed a set of benchmarks by assessing how well the new model works within our systems. They aren't objective measures, but they're what we use when we're deciding which model to reach for.
So how do we decide whether a model is worth the switch? We run five tests based on our own workflows and what we need the model to do. As a result, they matter more to us than any benchmarks. The tests fall into two categories:
Output (Can it write?): Tests that tell Danny whether he can trust Spiral to produce great copy, and tell me whether I can trust my Working Overtime project to sound “like me” while keeping “AI smell” to a minimum.
Judgment (Can it recognize good writing?): Tests to see if the model has the taste to make existing writing better, again for Spiral as well as our internal editorial needs.
If you've ever wondered how a company built on words and AI tests how AI does with words, here's what happened when we put Sonnet 4.5 to the test...
Become a paid subscriber to Every to unlock this piece and learn about:
- The five writing tests we ran on Sonnet 4.5, GPT-5, and Opus 4.1
- How Sonnet 4.5 stacked up in interviewing, editing, short-form writing, and sounding human
- The Reach Test: Which model do Katie and Danny turn to for writing first?
Thanks to our Sponsor: Notion