Midjourney/Every illustration.

Can AI Learn Good Judgment?

Plus: Dan’s attempt to clone Kate, a shortcut for turning demonstrations into skills, and the human goals machines still need us to set


AI can learn from a surprising variety of evidence: 30,027 edits, a two-minute screen recording, or a clear goal and access to an unfamiliar tool. At Every, we’ve been experimenting with all three. Dan Shipper is training an AI copy editor on Kate Lee’s historical suggestions, Arielle Shipper has found a low-lift way to teach agents through demonstration, and Austin Tedesco is exploring ways to coach Codex to do things he can’t do himself.

The latest episode of AI & I explores the philosophical side of what we’re seeing: Surge AI founder Edwin Chen joins Dan to discuss why, as models eventually become better than us at everything, humans may keep creating because we choose to, rather than because we’re uniquely capable of it.



‘AI & I’: What it will mean to be human when AI can do everything

Today, we’re releasing a new episode of our podcast AI & I. Dan Shipper sits down with Edwin Chen, founder and CEO of Surge AI, which provides data environments and evals for the major model companies and has reached nearly $1 billion in revenue without raising venture capital. They discuss what it means for humanity when AI clears benchmarks that once defined human exceptionalism, and whether frontier AI systems are being designed to advance our capabilities as a species—or are optimized for engagement.

Watch on X or YouTube, or listen on Spotify or Apple Podcasts. You can also read the transcript.

Here are the highlights:

  1. Saturated benchmarks. When OpenAI’s models disproved an open Erdős conjecture using novel algebraic geometry techniques, Chen shared the result with Timothy Gowers, one of the world’s greatest living mathematicians. Gowers initially thought the model had proved an upper bound on the conjecture and braced himself: That would mean it would be “all over for mathematicians very soon,” Chen says. When Gowers realized the model had completed the easier task of finding a counterexample, he was relieved: It meant elite mathematicians still had unique contributions to make, at least for another year or two. Gowers’s reaction underscores how close AI is to surpassing the abilities of the best and brightest among us, which raises existential questions about where and how we focus our human efforts.
  2. Creation as a choice. Chen believes scaling laws indicate that, in the near future, there will be nothing humans can do that AI can’t do better. Understandably, that’s a blow to our collective ego, and it could lead to disengagement and disillusionment. As an antidote, Chen points to a story by science fiction writer Ted Chiang, in which a narrator sends back a warning from a future where the concept of free will has been disproven: “It’s essential that you behave as if your decisions matter even though you know that they don’t.” Chen thinks we may need to follow a similar directive and find meaning in making things, even when AI could make them better.
  3. Agency versus automation. That said, one element of the creative process remains uniquely human, at least for now. As AI grows more capable, Chen predicts it will be able to take a nebulous objective (“win a Fields Medal,” or “make $1 million”) and successfully execute. But that process still requires a human to provide the goal. LLMs do not have intrinsic motivation, the drive for exploration, or the ability to abruptly change their minds about what their goal is in the first place. “There may be a future where AI can pursue unbounded, nebulous, completely unformed goals,” Chen says. “But I agree that at least in the way we currently think about AI, that’s not happening.”
  4. The engagement trap. When a model is trained to maximize session length or votes on LM Arena, which ranks AI models via crowdsourced, blind feedback, it learns to “reward hack user preferences,” Chen says, overindexing on tactics to keep you engaged. He recently spent 20 rounds iterating on a low-stakes email with one model before switching to Claude, which told him after a few turns to stop and just send it. That approach was more valuable, though less designed to keep him locked in. Delegation, Chen argues, provides a better system for work. When the model goes off and executes for you, it removes the incentive to optimize for keeping you glued to your screen.

Miss an episode? Catch up on Dan’s recent conversations with LinkedIn cofounder Reid Hoffman; the team that built Claude Code, Cat Wu and Boris Cherny; Vercel cofounder Guillermo Rauch; and podcaster Dwarkesh Patel, and learn how they use AI to think, create, and relate.


Inside Every

Dan is cloning Kate, but not in a weird way

For as long as I’ve been at Every, Dan has been chasing the same white whale: cloning our editor in chief, Kate Lee.
