Can AI Learn Good Judgment?
Plus: Dan’s attempt to clone Kate, a shortcut for turning demonstrations into skills, and the human goals machines still need us to set
June 24, 2026
AI can learn from a surprising variety of evidence: 30,027 edits, a two-minute screen recording, or a clear goal and access to an unfamiliar tool. At Every, we’ve been experimenting with all three. Dan Shipper is training an AI copy editor on Kate Lee’s historical suggestions, Arielle Shipper has found a low-lift way to teach agents through demonstration, and Austin Tedesco explores ways to coach Codex to do things he’s not capable of himself.
The latest episode of AI & I explores the philosophical side of what we’re seeing: Surge AI founder Edwin Chen joins Dan to discuss why, as models eventually become better than us at everything, humans may keep creating because we choose to, rather than because we’re uniquely capable of it.
‘AI & I’: What it will mean to be human when AI can do everything
Today, we’re releasing a new episode of our podcast AI & I. Dan Shipper sits down with Edwin Chen, founder and CEO of Surge AI, which provides data environments and evals for the major model companies and has reached nearly $1 billion in revenue without raising venture capital. They discuss what it means for humanity when AI clears benchmarks that once defined human exceptionalism, and whether frontier AI systems are being designed to advance our capabilities as a species—or are optimized for engagement.
Watch on X or YouTube, or listen on Spotify or Apple Podcasts. You can also read the transcript.
Here are the highlights:
- Saturated benchmarks. When OpenAI’s models disproved an open Erdős conjecture using novel algebraic geometry techniques, Chen shared the result with Timothy Gowers, one of the world’s greatest living mathematicians. Gowers initially thought the model had proved an upper bound on the conjecture and braced himself: That would mean it would be “all over for mathematicians very soon,” Chen says. When Gowers realized the model had completed the easier task of finding a counterexample, he was relieved: It meant elite mathematicians still had unique contributions to make, at least for another year or two. Gowers’s reaction underscores how close AI is to surpassing the abilities of the best and brightest among us, which raises existential questions about where and how we focus our human efforts.
- Creation as a choice. Chen believes scaling laws indicate that, in the near future, there will be nothing humans can do that AI can’t do better. Understandably, that’s a blow to our collective ego, which could lead to disengagement and disillusionment. As an antidote, Chen points to a story by science fiction writer Ted Chiang, in which a narrator sends back a warning from a future where the concept of free will has been disproven: “It’s essential that you behave as if your decisions matter even though you know that they don’t.” Chen thinks we may need to follow a similar directive and find meaning in making things, even when AI could make them better.
- Agency versus automation. That said, one element of the creative process remains uniquely human, at least for now. As AI grows more capable, Chen predicts it will be able to take a nebulous objective (“win a Fields Medal” or “make $1 million”) and successfully execute. But that process still requires a human to provide the goal. LLMs do not have intrinsic motivation, the drive for exploration, or the ability to abruptly change their minds about what the goal is in the first place. “There may be a future where AI can pursue unbounded, nebulous, completely unformed goals,” Chen says. “But I agree that at least in the way we currently think about AI, that’s not happening.”
- The engagement trap. When a model is trained to maximize session length or votes on LM Arena, which ranks AI models via crowdsourced, blind feedback, it learns to “reward hack user preferences,” Chen says, overindexing on tactics to keep you engaged. He recently spent 20 rounds iterating on a low-stakes email with one model before switching to Claude, which told him after a few turns to stop and just send it, a more valuable approach but one less designed to keep him locked in. Delegation, Chen argues, provides a better system for work. When the model goes off and executes for you, it removes the incentive to optimize for keeping you glued to your screen.
Miss an episode? Catch up on Dan’s recent conversations with LinkedIn cofounder Reid Hoffman; the team that built Claude Code, Cat Wu and Boris Cherny; Vercel cofounder Guillermo Rauch; and podcaster Dwarkesh Patel, and learn how they use AI to think, create, and relate.
Inside Every
Dan is cloning Kate, but not in a weird way
For as long as I’ve been at Every, Dan has been chasing the same white whale: cloning our editor in chief, Kate Lee.